alyssa changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://oftc.irclog.whitequark.org/panfrost - <macc24> i have been here before it was popular
vstehle has quit [Ping timeout: 480 seconds]
atler is now known as Guest530
atler has joined #panfrost
Guest530 has quit [Ping timeout: 480 seconds]
<Dylanger>
Has anyone here worked on or with Google's gfxstream?
<Dylanger>
Looks to be a virgl replacement
vstehle has joined #panfrost
wwilly has joined #panfrost
wwilly_ has joined #panfrost
wwilly has quit [Ping timeout: 480 seconds]
Rathann has joined #panfrost
wwilly__ has joined #panfrost
wwilly_ has quit [Ping timeout: 480 seconds]
stepri01 has joined #panfrost
<daniels>
gfxstream also includes Vulkan, which is parallel to Venus
pendingchaos has quit [Read error: Connection reset by peer]
pendingchaos has joined #panfrost
wicast has joined #panfrost
wicast has quit []
Fat-Zer has joined #panfrost
wicast has joined #panfrost
wicast has quit []
wicast has joined #panfrost
wicast has quit []
wicast has joined #panfrost
wicast has quit []
wicast has joined #panfrost
wicast has quit []
wicast has joined #panfrost
wicast has quit []
wicast has joined #panfrost
wicast has quit []
wicast has joined #panfrost
alpernebbi has joined #panfrost
wicast has quit []
wicast has joined #panfrost
wicast has quit []
wicast has joined #panfrost
wicast has quit []
wicast has joined #panfrost
wicast has quit []
wicast has joined #panfrost
wicast has quit []
wicast has joined #panfrost
wicast has quit []
wicast has joined #panfrost
wicast has quit []
wicast has joined #panfrost
wicast has quit []
wicast has joined #panfrost
wwilly_ has joined #panfrost
wwilly__ has quit [Ping timeout: 480 seconds]
Rathann|pine has joined #panfrost
Rathann is now known as Guest578
Rathann|pine is now known as Rathann
Rathann is now known as Rathann|pine
Rathann|pine is now known as Rathann
Rathann is now known as Rathann|pine
wwilly_ has quit [Ping timeout: 480 seconds]
<icecream95>
alyssa: When the source of a 16-bit instruction is spilled on Midgard, lcra_restrict_range can set the modulus wrong because len > bound
pendingchaos has quit []
pendingchaos has joined #panfrost
wwilly_ has joined #panfrost
Rathann|pine has quit [Quit: Leaving]
<icecream95>
ignore, this was caused by some code I added to debug another issue
pjakobsson has joined #panfrost
Guest578 has left #panfrost [#panfrost]
Rathann has joined #panfrost
Fat-Zer has quit [Ping timeout: 480 seconds]
Fat-Zer has joined #panfrost
Danct12 has quit [Quit: Quitting]
Danct12 has joined #panfrost
<narmstrong>
jernej: you should get some dumps of valid afbc buffers and try to decode them, I had the chance to have an AFBC enabled Android libMali, and I enabled different configurations and implemented support based on that
<narmstrong>
jernej: seems YTR mandates SPARSE (if you can't control it, then it's mandatory), so you either have AFBC_FORMAT_MOD_BLOCK_SIZE_16x16 or AFBC_FORMAT_MOD_BLOCK_SIZE_16x16|AFBC_FORMAT_MOD_SPARSE|AFBC_FORMAT_MOD_YTR as valid modifiers
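For reference, a minimal sketch of how the two modifier combinations named above turn into 64-bit DRM format modifiers. The flag values are written from memory of the kernel's drm_fourcc.h and should be checked against the header; the helper name drm_format_mod_arm_afbc is hypothetical and only mirrors the DRM_FORMAT_MOD_ARM_AFBC() macro.

```python
# Assumed values, as recalled from the kernel's drm_fourcc.h; verify before relying on them.
DRM_FORMAT_MOD_VENDOR_ARM = 0x08

AFBC_FORMAT_MOD_BLOCK_SIZE_16x16 = 1
AFBC_FORMAT_MOD_BLOCK_SIZE_32x8 = 2
AFBC_FORMAT_MOD_YTR = 1 << 4
AFBC_FORMAT_MOD_SPLIT = 1 << 5
AFBC_FORMAT_MOD_SPARSE = 1 << 6


def drm_format_mod_arm_afbc(mode):
    """Hypothetical helper mirroring DRM_FORMAT_MOD_ARM_AFBC(mode):
    vendor ID in the top 8 bits, AFBC mode flags in the low 56 bits."""
    return (DRM_FORMAT_MOD_VENDOR_ARM << 56) | (mode & 0x00FFFFFFFFFFFFFF)


# The two combinations mentioned in the message above.
basic = drm_format_mod_arm_afbc(AFBC_FORMAT_MOD_BLOCK_SIZE_16x16)
sparse_ytr = drm_format_mod_arm_afbc(
    AFBC_FORMAT_MOD_BLOCK_SIZE_16x16
    | AFBC_FORMAT_MOD_SPARSE
    | AFBC_FORMAT_MOD_YTR
)

print(hex(basic))
print(hex(sparse_ytr))
```

Other combinations are built the same way by OR-ing in AFBC_FORMAT_MOD_BLOCK_SIZE_32x8 or AFBC_FORMAT_MOD_SPLIT; which of them a given display controller accepts is hardware specific.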
<narmstrong>
jernej: one way to do it is to add support for modifiers, don't add anything else, display AFBC buffers and poke the AFBC reg bits directly until you have something correctly decoded...
<alyssa>
narmstrong: Thank you for providing the documentation that Arm didn't...
<narmstrong>
"basic" must be 16x16 or 16x16/SPARSE/YTR, I don't remember
<alyssa>
jernej: Worth noting architecturally, Bifrost can't render into non-SPARSE
<narmstrong>
ooooooook, this is why the Bifrost Amlogic SoC with the ARM AFBC decoder module doesn't have a SPARSE control bit
<narmstrong>
ok so the "basic" afbc dump must be 16x16/SPARSE/YTR
<alyssa>
narmstrong: also because non-SPARSE sucks for hardware
<alyssa>
narmstrong: 16x16/SPARSE/YTR is what's used for internal RGBA framebuffer objects on both midgard and bifrost, it's somehow the most generally optimal choice
<alyssa>
and any deviation from it has to be justified with a tradeoff
<alyssa>
e.g. SPLIT mode may be *worse* for the GPU but is better for display controllers
<alyssa>
likely 32x8 blocks
<alyssa>
non-SPARSE mode consumes less memory but prevents all parallelism (Midgard has incredible hacks here)
<narmstrong>
yes, I had to switch 32x8/SPLIT/SPARSE/YTR to have smooth 4k display on S905X3
<narmstrong>
would be great if documented anyway :-p
<alyssa>
narmstrong: Thanks for offering to write the documentation :-)
<alyssa>
;-P
<narmstrong>
Ha ha
camus has joined #panfrost
camus1 has quit [Read error: Connection reset by peer]
Danct12 has quit [Quit: Quitting]
Danct12 has joined #panfrost
Fat-Zer has quit [Ping timeout: 480 seconds]
Fat-Zer has joined #panfrost
<macc24>
ummm
<macc24>
firefox started flickering on midgard(t860) on sway, can anyone reproduce?
<macc24>
i attached the psu back onto my 3d printer and now it (mostly) won't break itself when you pick it up :D
<macc24>
alyssa: not much
<macc24>
pinging robclark as i see same issue on adreno 618
<macc24>
like literally the same symptoms
<alyssa>
Oof
<jernej>
alyssa, narmstrong: thanks for pointers
<robclark>
macc24: maybe it is a ffox bug ;-)
<macc24>
robclark: i don't see anything like that on my x86 desktop, neither on my duet
<jernej>
just one more question, do you have any numbers, or at least a feeling, for how much AFBC is able to compress images?
<robclark>
macc24: doesn't mean it is not a ffox bug.. but I think I'm still waiting for an apitrace or similar
<alyssa>
jernej: AFBC isn't about reducing memory usage (unlike PNG/JPEG type compression), it's about memory bandwidth
<alyssa>
actually it increases memory usage
<alyssa>
but on average content, IIRC Arm ballparks a 50% reduction in memory bandwidth
<macc24>
robclark: sure just let me finish doing 101 other things
<macc24>
THE FACTORY MUST GROW
<jernej>
interesting, I didn't know it's possible to reduce bandwidth while increasing memory usage
<jernej>
I guess alignment plays a big role?
<narmstrong>
yes
<alyssa>
Mostly that -- in good circumstances -- the hardware doesn't actually access most of the AFBC data
<robclark>
jernej: the increased memory usage is about addition of metadata to allow hw to avoid accessing the real per-pixel data
<alyssa>
The "slightly more than the uncompressed size" is the worst case, when AFBC fails to compress anything whatsoever
<urja>
well, since the compression ratio is content dependent, the full buffer (plus extra because compression) space must exist anyways...
<narmstrong>
and it reduces DDR page changes and loads the maximum amount of data in a single AXI request
<alyssa>
jernej: As a super simple example, consider a framebuffer compression scheme that does the following:
<alyssa>
- Divides the framebuffer into 16x16 tiles
<alyssa>
- For each tile, allocates a bit in a separate header
<alyssa>
- If the bit is set, the tile is assumed to be all 0's
<alyssa>
- If the bit is clear, the tile is not all 0's so the contents should be read from the uncompressed image
<alyssa>
This expands the memory allocation, since now you need to store the extra header which is (width/16)*(height/16)/8 bytes
<alyssa>
But on average it reduces bandwidth -- assuming there are lots of regions in your images that are just all 0's -- since the hardware can skip the lookup in that case, reading only 1 bit instead of 16*16*32 bits
<alyssa>
(At the cost of an extra 1-bit access for every non-all-zeroes tile)
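The toy scheme described above can be sketched in a few lines. This is only an illustration of that simplified one-bit-per-tile header, not of real AFBC; the function names are hypothetical, and partial tiles at the edges are assumed to be padded up to a full 16x16 tile.

```python
import math

TILE = 16            # tile edge in pixels, as in the example
BITS_PER_PIXEL = 32  # matches the 16*16*32 figure above


def tile_count(width, height):
    """Number of 16x16 tiles, padding partial tiles at the edges."""
    return math.ceil(width / TILE) * math.ceil(height / TILE)


def header_bytes(width, height):
    """Extra allocation: one bit per tile, rounded up to whole bytes."""
    return math.ceil(tile_count(width, height) / 8)


def bits_read(width, height, zero_tiles):
    """Bits fetched to read the whole framebuffer once.

    Every tile costs one header bit; only tiles whose bit is clear
    (not all zeroes) also cost a full uncompressed tile read.
    """
    tiles = tile_count(width, height)
    nonzero_tiles = tiles - zero_tiles
    return tiles + nonzero_tiles * TILE * TILE * BITS_PER_PIXEL


w, h = 1920, 1080
tiles = tile_count(w, h)
print("header bytes:", header_bytes(w, h))
print("bits read, no zero tiles:", bits_read(w, h, 0))
print("bits read, half zero tiles:", bits_read(w, h, tiles // 2))
```

With no all-zero tiles the scheme reads slightly more than the uncompressed image (the extra header bits); with half the tiles all zero it reads roughly half as much, while the allocation is always larger by the header size.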
<jernej>
thanks, this explains a lot
<alyssa>
This is a bad scheme but it illustrates the idea. You can imagine extending the header to have arbitrary solid colours, AFBC does this with solid colour blocks, good with 2D UI
<jernej>
yeah, but I still wonder why it would help with VPU, since it doesn't have blocks of solid color very often
<macc24>
alyssa: so a solid color wallpapers are a good idea on mali machines?
<macc24>
for a typical desktop usage
<alyssa>
jernej: Solid colour was just a simple example
<alyssa>
Actual AFBC is a lot more complicated
<alyssa>
macc24: Sure ;P
<macc24>
alyssa: *noted*
<alyssa>
~~or tile aligned checkerboards~~
<macc24>
still gotta put a logo somewhere. what if users forget what distro theyre using?
<alyssa>
PrawnOS, right?
<alyssa>
:p
<macc24>
xD
<jernej>
understood, I was a bit misled by "compression" in the name, I thought it relates to memory usage
<alyssa>
nod
<jernej>
(size)
<alyssa>
It's still compression, just compressing... on the time domain, not space? :p
<robclark>
at least adreno calls it something that isn't quite so misleading... "Universal BandWidth Compression"
<alyssa>
hah
<alyssa>
Arm FrameBuffer Compressio
<alyssa>
No idea what Apple calls theirs but there's one there too
<macc24>
Apple FrameBuffer Compression
<macc24>
AFBC in short
<macc24>
;D
<narmstrong>
same for Amlogic FrameBuffer Compression :-p
<alyssa>
macc24: srsly :p
<alyssa>
no Apple has a lot of things that could be so more specifically use uh
<alyssa>
AGX FrameBuffer Compression
<alyssa>
AFBC in short
<alyssa>
:p
<robmur01>
did you put your wizard cape on to say "Compressio!"? :D
<macc24>
there should be 'perf' debug variable for panfrost to complain when programs do dumb stuff. like freedreno does
<narmstrong>
-EDUMBSTUFF
Fat-Zer has quit [Ping timeout: 480 seconds]
warpme_ has quit [Quit: Connection closed for inactivity]
<robclark>
alyssa: feel free to copy/pasta perf_debug/perf_time and friends
<alyssa>
i'll copy/pasta as long as the code isn't spaghetti
<robclark>
well, there is a bit of macro magic in perf_time but nothing too bad.. it looks like clang-format made some interesting choices there..
<alyssa>
don't remind me I still want to reindent ;p
Fat-Zer has quit [Ping timeout: 480 seconds]
Fat-Zer has joined #panfrost
jernej_ is now known as jernej
wwilly has joined #panfrost
somy has joined #panfrost
alpernebbi has quit [Quit: alpernebbi]
<icecream95>
macc24: AFBC also supports making the bottom quarter/half/three quarters of a 16x16 tile a second colour for only ~20% extra space if you want a more interesting desktop background
<icecream95>
jernej: It is possible to remove the empty space from an AFBC resource to save ~50% of space, though making it non-renderable, and deduplicating tiles could be done to compress it even more
<alyssa>
Technically still renderable on Midgard only
<icecream95>
alyssa: On midgard, spilling can occasionally spill a value that is used in the same bundle it is written, so the load from TLS is before the store
* alyssa
groans
<alyssa>
If you post a shader_test I can take a look... old Bifrost RA hit a similar issue.
<icecream95>
I only hit this by forcing spilling to run more times than needed
<alyssa>
Ah.
<alyssa>
Currently trying to get GLES3.1 conformant, otherwise I'd be down to rewrite the Midgard backend compiler :-p
<icecream95>
I also found that the code setting min_bound for srcs needs to be moved above the continue
<icecream95>
(All this was found trying to fix a bug that seems to be caused by spilling >256 bytes)