ChanServ changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://oftc.irclog.whitequark.org/panfrost - <macc24> i have been here before it was popular
JulianGro has joined #panfrost
JulianGro has quit []
JulianGro has joined #panfrost
rasterman has quit [Quit: Gettin' stinky!]
<tlwoerner>
alyssa: (psst XDC is still on-going) ;-)
* tlwoerner
hasn't noticed any change in alyssa's accent, other than pronouncing "Tronno" like a local (which is a good thing)
<tlwoerner>
the only canadians who say "aboot" are a very small group of people on the far east coast who are mostly of Scottish descent, who have all been saying "aboot" over in Scotland for centuries
<tlwoerner>
:-D
nlhowell is now known as Guest111
nlhowell has joined #panfrost
Guest111 has quit [Remote host closed the connection]
<alyssa>
tlwoerner: *day after I speak at XDC
<alyssa>
"Trono" yep :-p
<alyssa>
also, all the accent shaming made me listen for local pronunciations of "about" today
<alyssa>
there are.. at least 3 distinct ones.
<tlwoerner>
in a city like toronto? there will be 100's :-)
<alyssa>
tlwoerner: ^that I hit going about my normal day today, filtering only for native Canadian English speakers
nlhowell has quit [Ping timeout: 480 seconds]
<jambalaya>
not sure if that's off-topic. i'm using a pinebook-pro with manjaro, i.e. linux on an aarch64-based laptop. naturally with mesa/panfrost. and i would like to use a more nifty (read: opengl) terminal emulator.
<jambalaya>
both kitty and alacritty, however, apparently expect opengl/glsl 3.3. which is not available ofc. any suggestion as to whether there's an accelerated terminal emulator available?
<HdkR>
Alacritty specifically needs dual source blending. If you do a version override then you might be fine anyway
<jambalaya>
version override? as in telling it to just use the features available and skip the rest?
<HdkR>
You can set the environment variable `MESA_GL_VERSION_OVERRIDE=3.3` and the driver will change its reported GL version. Since Panfrost supports dual source blending, it may just work as long as other features aren't used that aren't supported
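A minimal sketch of launching a program with that override set, assuming the only thing needed is the `MESA_GL_VERSION_OVERRIDE` variable HdkR names. The helper names `env_with_gl_override` and `launch` are hypothetical, and whether a given application then renders correctly is not guaranteed.

```python
import os
import subprocess


def env_with_gl_override(version="3.3", base_env=None):
    """Return a copy of the environment with Mesa's GL version override set."""
    env = dict(os.environ if base_env is None else base_env)
    env["MESA_GL_VERSION_OVERRIDE"] = version
    return env


def launch(command, version="3.3"):
    """Start `command` (a list of arguments) with the override applied."""
    return subprocess.Popen(command, env=env_with_gl_override(version))


# Example (not run here): launch(["alacritty"])
```

The override only changes the version the driver reports; features the driver does not implement stay unimplemented.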
<jambalaya>
:D
<jambalaya>
alright. let's see what's asploding now
<icecream95>
I'm also getting other corruption, both linear and AFBC-shaped, where pages of memory seem to be filled with garbage that is (in some cases) persistent when the content of the window "under it" changes, but moves as the window is moved
<icecream95>
[looking at some corruption] ..is that block of pixels an index buffer?
<icecream95>
..more likely it's output from vertex or tiler jobs. Yet another tiler heap related bug?
<robmur01>
the long-outstanding kernel fixes have only just landed in mainline this week - several more weeks until 5.15 is released and maybe *then* it'll all be put to bed...
<icecream95>
robmur01: I've run -rc1 kernels before, nothing bad happened then, so there's obviously no danger in trying it again
<tomeu>
alyssa: if you refer to those rk3288 artifacts, they don't look like what I'm seeing in chromeos
<alyssa>
ack
<icecream95>
Hmm, Chrome OS on Panfrost? I wonder if I could test that myself by chrooting into /dev/disk/by-label/ROOT-A from my Void system and swapping out lib{GLES,EGL} for Mesa
<alyssa>
tomeu: i think is rebuilding all of chromiumos
<alyssa>
hotswapping sounds like less work, dunno how broken it would be :-p
<icecream95>
First I have to work out what command-line to start Chrome with, it's complaining that it "Can't find icudtl.dat"
<icecream95>
tomeu: Does the chromebook in the photo you took of the artefacts have a 1920x1080 display?
<icecream95>
(if so, then the artefacts seem to match suspiciously well to a 16x16 pixel grid. Possibly it's another CRC bug..)
<alyssa>
tomeu: PAN_MESA_DEBUG=nocrc for u
<alyssa>
PAN_MESA_DEBUG=nocrc,noafbc for good measure
camus has quit [Ping timeout: 480 seconds]
* icecream95
goes back to debugging Extreme Tux Racer being broken under Musl
<icecream95>
(with XDC livestream in the background, of course)
pendingchaos has quit [Ping timeout: 480 seconds]
pendingchaos has joined #panfrost
<tomeu>
icecream95: too much stuff links to libmali, one needs to rebuild
<tomeu>
icecream95: it's 1366 width, I think the artifacts match the 256x256 tiles that the browser has for content
* tomeu
I measured with a ruler :p
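The two hypotheses above (a 16x16 pixel grid versus 256x256 content tiles) can be told apart from measured edge positions. A small sketch, assuming the artifact edges are available as pixel coordinates; the function name and the sample measurements are made up for illustration.

```python
def aligned_grid_sizes(edges, candidates=(16, 256)):
    """Return the candidate grid sizes that every edge coordinate is a multiple of."""
    return [size for size in candidates if all(e % size == 0 for e in edges)]


# Hypothetical measurements (in pixels) of artifact edges, for illustration only.
tile_like = [0, 256, 512, 768]
block_like = [16, 48, 272]

print(aligned_grid_sizes(tile_like))   # [16, 256]
print(aligned_grid_sizes(block_like))  # [16]
```

Edges on a 256-pixel grid are also on a 16-pixel grid, so only edges that are multiples of 16 but not of 256 rule out the content-tile explanation.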
<tomeu>
robclark: guess there's no way of running chrome's gpu process under apitrace? something seems to be eating my LD_PRELOAD
<tomeu>
apparently minijail does that, but I don't know if the gpu process is run under it
camus has joined #panfrost
camus has quit [Ping timeout: 480 seconds]
camus has joined #panfrost
<robclark>
tomeu: chrome does its own gpu sandbox.. but there is a cmdline arg to disable it.. I don't think I've tried this on CrOS but I've managed to apitrace chromium under linux.. https://github.com/apitrace/apitrace/issues/128
<tomeu>
yeah, but I cannot reproduce the problem under linux :(
<robclark>
hmm
<robclark>
I'd *guess* a fencing issue.. but I'd also try disabling partial swap since that might also be a thing that is different compared to chromium on linux
<tomeu>
have tried with --no-sandbox and --disable-gpu-sandbox, but lsof shows that the gpu process doesn't have the apitrace .so opened
<tomeu>
yeah, I suspected partial swap from the start, but disabling it didn't make any difference
<tomeu>
I have seen that textures such as those that contain the content tiles (256x256) are exported and imported on creation, and we were wondering if that was confusing our resource (not BO)-based dependency tracking
<robclark>
so, I'm not sure if this is still the case, but it is possible that boards with mali (at least ones on older 4.19 kernel) are using implicit sync to work around bugs?
<robclark>
I can check that later
<tomeu>
ah, I have indeed seen that there's no calls to fencesync
<tomeu>
will see as well if I can enable explicit sync
<robclark>
fwiw, krh did have mali working on something (kevin) at one point.. he may remember something?
<tomeu>
ah, disable_explicit_dma_fences
<alyssa>
womp womp.
<krh>
I remember being traumatized by trying to find the right kernel option to enable to get the right voltage controller compiled in
<tomeu>
actually, maybe don't spend any more time on this
<tomeu>
I found that the file in /etc with the use flags was corrupted, and the artifacts weren't present any more, with the webgl aquarium at 40 fps
<tomeu>
so I guess there was a value in that file that got things working on libmali, but broke panfrost
<tomeu>
I'm reflashing to recheck that
<tomeu>
it was maybe disable_explicit_dma_fences?
<alyssa>
tomeu: Womp womp!
<alyssa>
tomeu: Can you uh
<alyssa>
Can you still send the fixes for the other 2 things we found? 😋
<tomeu>
hehe
<tomeu>
sure
<alyssa>
(or not? idk)
<alyssa>
need to think through all the flushing rules
<tomeu>
we are going to have to hack a lot on scheduling, etc so we can beat libmali on performance
<alyssa>
(for those following along at home --- freedreno implements flush_resource but panfrost/v3d does not, also like fd we do pipe_resource level tracking which might be wrong if you export/import the same BO)
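A toy model of why resource-level tracking can miss a dependency when one BO is exported and re-imported. This is not Mesa code: `Resource`, `Tracker`, and the batch names are hypothetical, and the sketch assumes the import produces a second resource object backed by the same buffer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    name: str       # distinct per resource object
    bo_handle: int  # underlying buffer object; shared after export/import


class Tracker:
    """Remembers which batch last wrote each key, where `key_of` picks the key."""

    def __init__(self, key_of):
        self.key_of = key_of
        self.last_writer = {}

    def write(self, resource, batch):
        self.last_writer[self.key_of(resource)] = batch

    def must_flush_before_read(self, resource):
        return self.last_writer.get(self.key_of(resource))


original = Resource("tile", bo_handle=7)
imported = Resource("tile-imported", bo_handle=7)  # same memory, new resource

by_resource = Tracker(key_of=lambda r: r.name)
by_bo = Tracker(key_of=lambda r: r.bo_handle)
for tracker in (by_resource, by_bo):
    tracker.write(original, batch="batch-1")

print(by_resource.must_flush_before_read(imported))  # None: dependency missed
print(by_bo.must_flush_before_read(imported))        # batch-1
```

In the resource-keyed tracker, a read through the imported resource sees no pending writer, so nothing would force the earlier batch to be flushed first.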
<tomeu>
that's probably the weakest part of the opengl driver?
<alyssa>
🤷
<alyssa>
tomeu: I have a hammer (mesa) so every performance problem looks like a nail
<alyssa>
I don't know if anyone has put any real effort into panfrost.ko optimization
<tomeu>
oh, I was meaning how we send stuff to the kernel
<alyssa>
yeah, that's covered
<alyssa>
(Er. covered by kernel work.)
<tomeu>
which is related to the kernel exposing the second slot
<alyssa>
I've made sure the cmdstream and shaders are sane and the CPU usage in mesa is minimal
<alyssa>
what happens after ioctl() i've mostly ignored.
<alyssa>
thanks for volunteering to chase that!
<alyssa>
😉
<tomeu>
well, if all that remains to be done regarding job scheduling is in the kernel, I won't mind doing it (rebasing patches from robmur01 ?)
<tomeu>
but, are we really submitting the optimal cmdstream to the kernel?
<alyssa>
optimal? no
<tomeu>
I was expecting we could reduce flushes, etc
<alyssa>
PAN_MESA_DEBUG=perf will help there (logs implicit flushes)
<robmur01>
we've just landed Steve's patch to use the hardware queueing registers properly, is there much else the kernel can do?
<robmur01>
I thought most of the fiddly bits were in the job descriptor, aka userspace's problem :)
<alyssa>
G72 shouldn't be missing any big ticket optimizations, but.. 🤷
jambalaya has quit [Remote host closed the connection]
jambalaya has joined #panfrost
<robmur01>
not sure whether stuff like core affinity has a major effect this side of T628
<tomeu>
krh: robclark: native_gpu_memory_buffers seems to be what was breaking panfrost
<alyssa>
what's that do
<tomeu>
trying to figure that out :)
<robclark>
hmm, I think you don't want to disable that..
* robclark
trying to check what it does
<robclark>
I think disabling it nerfs gpu accel somehow.. krh may know better what it does
<tomeu>
yeah, wonder if we are papering over a real issue
<robclark>
right
<alyssa>
robmur01: as for optimizations i'd offer to help but apparently today is "do 4 classes worth of homework due next week that I ignored because XDC" day
<alyssa>
(that national holiday traditionally comes after "International Make Fun of How Alyssa Talks Day")
* robclark
never said anything aboot alyssa's accent :-P
<alyssa>
robclark: ❤️
<tomeu>
ok, I think I got it now
<alyssa>
🎉
<tomeu>
so what happens is that hw acceleration for 2d-rendered content was disabled because of a blocklist
<alyssa>
that's.. bad?
<tomeu>
and there must be a race when the cpu-rendered tiles are sampled when compositing
<tomeu>
but freedreno isn't in the blocklist
<tomeu>
so I override the blocklist so panfrost also has gpu-rendered tiles
<alyssa>
ah, so we have two problems
<tomeu>
and it works :)
<tomeu>
I think we have many problems, alyssa :p
<tomeu>
I see some artifacts, but those will be solved once we pass skqp ;)
<alyssa>
robclark: what does android-cts test of the gfx driver that the gles31 cts doesn't?
<robclark>
a lot.. it covers deqp-egl/gles2/gles3/gles31, but also skqp, bunch of video tests that involve gpu, etc, etc
<robclark>
passing deqp is really only the starting point
<robclark>
cts-tradefed thing runs tests single threaded.. so fire it off before you go home for the weekend or at some point where you aren't waiting for it to finish
<tomeu>
robclark: if I disable gpu rasterization on a lazor, I see the exact same artifacts as I was seeing with panfrost :D
<tomeu>
so this is one more thing that gets fixed by changing the GL_RENDERER string :p
<robclark>
yeah, I don't think sw rast is likely to work
<robclark>
there might be cmdline arg to disable blocklist? That would be probably easier than recompiling chrome
<tomeu>
ah, it was enough to recompile mesa
<robclark>
there is driconf to override GL_RENDERER/GL_VENDOR
<tomeu>
so functionally panfrost doesn't seem to be that bad from the limited testing I have done
<tomeu>
ah true, I heard about it in your talk
<tomeu>
I will move to performance next, now that we have perfetto in there
<robclark>
there is a file in /etc where you can set cmdline flags and env vars
<tomeu>
yep, /etc/chrome_dev.conf
<robclark>
that sounds right
<alyssa>
tomeu: so does that mean all mesa drivers are broken, or chromium itself is broken?
<tomeu>
don't ask me, but I think chromium should be doing something so GL drivers know that a CPU tile has been updated
<robclark>
alyssa: I know there are issues w/ llvmpipe because it doesn't really support fences (IIRC).. but non-gpu rast isn't really a thing that ever gets tested, so probably chromium "bug" (or maybe, call it an unsupported configuration)
<robclark>
tomeu: did you manage to replace the android gl drivers? I guess you'd need that to run android-cts.. but it also gives you a lot of interesting game workloads to test/debug ;-)
<robclark>
otherwise, I guess you are kinda limited to shadertoy and playcanv.as
<tomeu>
well, we are doing pretty bad with webgl aquarium, so I will start with that
<tomeu>
I'm sure we can learn quite some interesting things just by that
<robclark>
aquarium is pure uniform-upload benchmark ;-)
<robclark>
it basically does `for (i = 0 .. num_fish) { upload_uniforms(); draw(); }`
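The loop robclark describes can be sketched as a call counter, assuming one uniform upload and one draw per fish as stated. `render_frame` is a hypothetical stand-in and does not call any GL API.

```python
def render_frame(num_fish):
    """Count the calls made by a frame that issues one draw per fish."""
    calls = {"upload_uniforms": 0, "draw": 0}
    for _ in range(num_fish):
        calls["upload_uniforms"] += 1  # per-fish uniforms uploaded before each draw
        calls["draw"] += 1
    return calls


print(render_frame(1000))  # {'upload_uniforms': 1000, 'draw': 1000}
```

Both counts grow linearly with the number of fish, which is why per-upload and per-draw overhead dominates this workload.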
<tomeu>
well, I suspect our batching is hurting us there
<tomeu>
afterwards, I will look at minetest in Crostini, as I can outsource that to my children :p
<robclark>
it isn't doing anything fancy w/ fbo's or anything like that.. it is using MSAA and msrtt so you want to have msrtt working
<robclark>
heh, you can find how bad virgl is (crostini)
<tomeu>
yeah, actually, minetest on virgl is a bit slow on the lazor
<tomeu>
compared with the original pixel
<tomeu>
guess there hasn't been much virgl on arm love yet
<robclark>
one of these days I should see if I can get crouton running so I can compare to native drivers..
<tomeu>
robclark: I just boot mainline on a sd card and nfsroot
<tomeu>
so I use the same rootfs as on my other panfrost boards
<robclark>
for native perf comparisons, I have fedora installed on one of my lazors.. it was what I recorded my presentation on.. but someone else was complaining about minecraft perf in crostini and I figured crouton setup would be easier setup for them to test (and I didn't actually want to bother to buy the game myself)
AreaScout_ has joined #panfrost
bluebugs has quit [Read error: No route to host]
cedric has joined #panfrost
cedric is now known as bluebugs
<krh>
tomeu: native_gpu_memory_buffers controls whether chrome uses textures for sharing buffers between processes (eg content and webgl) or minigbm allocated buffers
<krh>
with native disabled, we can't scan out webgl
<krh>
but the choice is there because before modifiers, a texture could have a more efficient internal representation (ie afbc) whereas minigbm allocated buffers were always linear