<alyssa>
1. Even with ASAHI_MESA_DEBUG=sync, there are serious sync glitches with Firefox, at least when WebRender is used (as is the case on agx/next, didn't test without). This is a WSI issue: it happens in GNOME but not in Sway. It might also have to do with Xwayland vs native wayland.
<alyssa>
At least with (explicit sync + Firefox + WebRender + GNOME), the rendered content is sometimes from a previous frame instead of a current frame, or at least partially so
<alyssa>
The most obvious symptom is typing into some input fields and not seeing the most recently typed character
<alyssa>
This is especially noticeable when typing into chat in discord.com
<alyssa>
There are other visual artefacts with Firefox in GNOME, but I assume they have the same root cause
<alyssa>
(and the glitching when typing is especially noticeable so probably easier to work with)
<alyssa>
I'm unsure if {Plasma, Firefox without WebRender} are affected but IMO this regression is blocking even if it's not ... and unless this is a Firefox bug it seems likely it can be hit with things other than Firefox, this is just what I noticed on day #1 of testing
<alyssa>
2. For some reason one of the apitraces I have runs fine (and much faster I think!) in debugoptimized, but faults in release. I have not yet root caused this one. This is not blocking.
<alyssa>
Once the Firefox regression is sorted, assuming no other regressions come up this week, I'll cut a new asahi/main release on top of the explicit sync UAPI and that should be good to go then
hightower2 has quit [Ping timeout: 480 seconds]
<alyssa>
hm wait the apitrace might be from my sysval rewrite being broken
<alyssa>
scratch #2
nela has quit [Ping timeout: 480 seconds]
<alyssa>
can't reproduce now
<alyssa>
maybe I screwed something up
<alyssa>
oh well
<alyssa>
Also, I am not convinced that it's necessary to sync in flush_resource
<alyssa>
freedreno + explicit sync doesn't seem to do anything for flush resource
<alyssa>
this is worth another 6% on manhattan
<alyssa>
i think
<alyssa>
oh but holy broken
<alyssa>
so what other fence is freedreno using that we're not
* alyssa
doesn't trust any of the fencing code
nela has joined #asahi-gpu
<alyssa>
I'm especially suspicious of agx_fence_create
<alyssa>
and all this ctx->syncobj stuff doesn't make much sense to me in an explicit sync world
<alyssa>
maybe if I fix that, firefox will fix itself
<alyssa>
this stuff is way too complicated
<alyssa>
lina: Potential race, what happens if we flush (but do not sync) a batch writing resource A?
<alyssa>
before that batch is completed, there is a second batch that reads resource A that gets flushed
<alyssa>
Is there a possibility that the second batch reads the wrong contents of A?
<alyssa>
It's submitted in the proper order but I don't see any fence/barrier between the two batches (i.e. in agx_flush_writer for batches that are submitted and not active)
<alyssa>
honestly I don't understand how any of agx_flush/agx_fence_create work
<alyssa>
and every driver I look at does something different and the docs are pretty vague
<alyssa>
freedreno is probably closest to what we want, but
akspecs_ has joined #asahi-gpu
tertu2 has joined #asahi-gpu
nepeat_ has joined #asahi-gpu
codingkoopa3 has joined #asahi-gpu
Lightsword_ has joined #asahi-gpu
Z750 has quit [resistance.oftc.net larich.oftc.net]
djorz has quit [resistance.oftc.net larich.oftc.net]
tertu has quit [resistance.oftc.net larich.oftc.net]
akspecs has quit [resistance.oftc.net larich.oftc.net]
nepeat has quit [resistance.oftc.net larich.oftc.net]
wicastC has quit [resistance.oftc.net larich.oftc.net]
alyssa has quit [resistance.oftc.net larich.oftc.net]
hxliew has quit [resistance.oftc.net larich.oftc.net]
Lightsword has quit [resistance.oftc.net larich.oftc.net]
codingkoopa has quit [resistance.oftc.net larich.oftc.net]
hxliew has joined #asahi-gpu
wicastC has joined #asahi-gpu
Z750 has joined #asahi-gpu
alyssa has joined #asahi-gpu
<lina>
alyssa: All batches are submitted with BARRIER_RENDER | BARRIER_COMPUTE, which tells the backend code to insert a fence on the last render and compute commands submitted (if any - the kernel will elide that if it already got a completion notification and it knows the GPU queue is idle, since at that point it doesn't even have event IDs attached so there is nothing to fence on)
<lina>
On the other hand, we could elide that when we know we do *not* have such a dependency, which I was thinking is the case whenever we submit batches back to back (e.g. whenever we flush_all and there is more than one batch to flush, I think we can avoid that barrier for all but the first, since they're guaranteed not to have any dependencies, otherwise they would have been flushed already), but right now
<lina>
we don't
<lina>
This is the "currently vertex and fragment are serialized" issue (if you remove those barriers, vertex for batch+1 can run concurrent with fragment for batch)
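The serialization described above can be illustrated with a toy timeline model. This is only a sketch under stated assumptions: `Batch`, `schedule`, and the stage durations are hypothetical names and numbers, and the model assumes one in-order vertex queue and one in-order fragment queue. It does not describe what the firmware or kernel actually does.

```python
from dataclasses import dataclass


@dataclass
class Batch:
    vertex_time: int    # hypothetical duration of the vertex stage
    fragment_time: int  # hypothetical duration of the fragment stage


def schedule(batches, barrier):
    """Return (vertex_start, fragment_start, fragment_end) for each batch.

    With barrier=True, a batch's vertex work waits for all previously
    submitted fragment work, which models fencing on the last render command.
    """
    vertex_free = 0    # time at which the vertex queue is next idle
    fragment_free = 0  # time at which the fragment queue is next idle
    timeline = []
    for batch in batches:
        v_start = vertex_free
        if barrier:
            v_start = max(v_start, fragment_free)
        v_end = v_start + batch.vertex_time
        # Fragment work always waits for its own batch's vertex work.
        f_start = max(v_end, fragment_free)
        f_end = f_start + batch.fragment_time
        vertex_free, fragment_free = v_end, f_end
        timeline.append((v_start, f_start, f_end))
    return timeline


two = [Batch(2, 5), Batch(2, 5)]
print(schedule(two, barrier=True))
print(schedule(two, barrier=False))
```

In this model, the second batch's vertex stage starts at time 7 with the barrier and at time 2 without it, so it overlaps the first batch's fragment stage. That overlap is the window in which a read of resource A would be a hazard if nothing else orders the two batches.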
<lina>
Of course this is all cache issues aside. If we run into cache issues between back to back submissions themselves (like the CPU->GPU one I ran into) we probably need to set more bits in that VDM Barrier command...
<lina>
I still don't know exactly what caches the firmware flushes, when, and how to control any of that. We still have at least 3 mystery bits of freedom in the submission commands.
<lina>
As for the WSI, my understanding is that that ctx->syncobj is supposed to just track the last submitted batch, so that when WSI wants a barrier on "all past work" we just clone that.
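A minimal sketch of the "track the last submitted batch" idea follows. The `Fence` and `Context` classes and their methods are hypothetical stand-ins, not the driver's code, and the model assumes batches retire in submission order, which is what makes the last fence cover all earlier work.

```python
class Fence:
    """Stand-in for a sync object that signals when its batch completes."""

    def __init__(self, batch_id):
        self.batch_id = batch_id
        self.signalled = False


class Context:
    def __init__(self):
        self.pending = []       # submitted, not yet retired, in order
        self.submitted = 0
        self.last_fence = None  # plays the role of ctx->syncobj here

    def submit_batch(self):
        self.submitted += 1
        fence = Fence(self.submitted)
        self.pending.append(fence)
        self.last_fence = fence
        return fence

    def fence_for_all_past_work(self):
        # "Cloning" is modelled as handing out a reference to the fence of
        # the most recent batch; later submissions do not affect it.
        return self.last_fence

    def retire_one(self):
        # Assumption: batches complete strictly in submission order.
        self.pending.pop(0).signalled = True


ctx = Context()
ctx.submit_batch()
ctx.submit_batch()
snapshot = ctx.fence_for_all_past_work()
ctx.submit_batch()   # later work, not covered by the snapshot
ctx.retire_one()
ctx.retire_one()
print(snapshot.signalled, ctx.last_fence.signalled)
```

After two batches retire, the snapshot taken before the third submission is signalled while the context's current last fence is not. If batches could complete out of order, a single last fence would no longer be enough.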
<lina>
But there is the whole attaching fences to dma-bufs story that I haven't looked at at all... I don't know how the mesa hooks for that work, if it isn't all taken care of by gallium already based on the existing fence support...
<alyssa>
right.. ok
<alyssa>
I'm not worried about optimizing those barriers (yet)
<alyssa>
I am worried about the WSI story
<alyssa>
and the firefox wonkiness suggests that not all is right in oz
<lina>
Let me look for the DMA-BUF stuff...
<lina>
Fingers crossed it's a magic cap we set and mesa does it for us ^^
<alyssa>
hahahaha :(
<lina>
I get the feeling what we're doing now is *sufficient* for correct WSI but may not be what is actually expected of us...
<lina>
alyssa: Good news: it's in common WSI code! Bad news: it's in common *Vulkan* WSI code. ^^;;
<alyssa>
Yeah, that's what I expected
<lina>
Okay, now where do I plug this in for us...
kesslerd_ has joined #asahi-gpu
djorz has joined #asahi-gpu
<alyssa>
We might be the first 100% explicit sync gallium driver
<alyssa>
that's kinda neat :p
<lina>
So do, like, other pure explicit sync drivers not exist? Because that is the only usage of this API anywhere in mesa...
<alyssa>
No, I don't think they do
<lina>
Oof...
<alyssa>
I mean
<alyssa>
It hasn't been possible to have a pure explicit sync Linux (not Android) driver before the ioctls
<alyssa>
and those landed this year
<alyssa>
so any driver older than that necessarily has an implicit sync path for WSI
<alyssa>
I see DMA_BUF_IOCTL_EXPORT_SYNC_FILE and DMA_BUF_IOCTL_IMPORT_SYNC_FILE in there for their gallium driver
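For reference, a rough sketch of how the two ioctls named above could be invoked from user space. The request numbers and the argument layout (`__u32 flags; __s32 fd`) are written from memory of `linux/dma-buf.h` and assume the common `_IOC` encoding used on x86 and arm64, so they should be checked against the kernel header. The helper names `export_sync_file` and `import_sync_file` are hypothetical.

```python
import fcntl
import struct

# _IOC direction bits on the common encoding (assumption: x86/arm64 layout).
_IOC_WRITE, _IOC_READ = 1, 2


def _ioc(direction, type_char, nr, size):
    """Build an ioctl request number: dir | size | type | nr."""
    return (direction << 30) | (size << 16) | (ord(type_char) << 8) | nr


# Argument struct shared by both ioctls: __u32 flags; __s32 fd.
SYNC_FILE_ARG = struct.Struct("=Ii")

DMA_BUF_SYNC_READ = 1 << 0
DMA_BUF_SYNC_WRITE = 2 << 0

DMA_BUF_IOCTL_EXPORT_SYNC_FILE = _ioc(
    _IOC_READ | _IOC_WRITE, "b", 2, SYNC_FILE_ARG.size)
DMA_BUF_IOCTL_IMPORT_SYNC_FILE = _ioc(
    _IOC_WRITE, "b", 3, SYNC_FILE_ARG.size)


def export_sync_file(dmabuf_fd, flags):
    """Ask the kernel for a sync file covering the dma-buf's fences."""
    arg = bytearray(SYNC_FILE_ARG.pack(flags, -1))
    fcntl.ioctl(dmabuf_fd, DMA_BUF_IOCTL_EXPORT_SYNC_FILE, arg)
    _, sync_fd = SYNC_FILE_ARG.unpack(arg)
    return sync_fd


def import_sync_file(dmabuf_fd, sync_fd, flags):
    """Attach an existing sync file's fence to the dma-buf."""
    arg = SYNC_FILE_ARG.pack(flags, sync_fd)
    fcntl.ioctl(dmabuf_fd, DMA_BUF_IOCTL_IMPORT_SYNC_FILE, arg)


print(hex(DMA_BUF_IOCTL_EXPORT_SYNC_FILE), hex(DMA_BUF_IOCTL_IMPORT_SYNC_FILE))
```

Under this reading, a pure explicit sync driver would export a sync file from the shared buffer before rendering to it and import its own completion fence afterwards, so that implicit sync consumers still see the right fences. How that maps onto the gallium hooks is exactly the open question in the discussion.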
<lina>
Yay! ^^
cylm has joined #asahi-gpu
possiblemeatball has quit [Quit: Quit]
kesslerd has quit [Remote host closed the connection]
kesslerd_ has quit [Remote host closed the connection]
kesslerd has joined #asahi-gpu
kesslerd has quit [Remote host closed the connection]
<lina>
alyssa: Fixed the Firefox issue, it turns out we need to make resources shareable in flush_resource... so now it has to blit if not already shareable.
<lina>
Also the ail stuff for multisampling compression is wrong, I threw in a random guess but that needs tests. That fixed WebGL Aquarium.
<lina>
I also added some nice debugging stuff along the way and more asserts ^^
<lina>
Wooooah I don't know if this is me removing the sync in flush_resource or something you did, but glmark2 just jumped up a ton. Like, from 4500 to 6600 (!!)
<lina>
Wait and that's a debug build
<lina>
Now hitting over 9000 FPS on some tests...
<lina>
That's on GBM, a bit less on Wayland but the output looks legit... I don't think it's fast because it's broken...
<lina>
^^
<lina>
Release build GBM score: 7252 ^^
bisko has joined #asahi-gpu
wixde has joined #asahi-gpu
bisko has quit [Quit: My Mac has gone to sleep. ZZZzzz…]
nyilas has joined #asahi-gpu
chipxxx has joined #asahi-gpu
chip_x has quit [Ping timeout: 480 seconds]
chip_x has joined #asahi-gpu
chip__ has joined #asahi-gpu
chip_x has quit [Remote host closed the connection]
chipxxx has quit [Ping timeout: 480 seconds]
hightower2 has joined #asahi-gpu
wixde has quit [Ping timeout: 480 seconds]
ChaosPrincess has quit [Quit: WeeChat 3.8]
ChaosPrincess has joined #asahi-gpu
chip_x has joined #asahi-gpu
chip__ has quit [Ping timeout: 480 seconds]
possiblemeatball has joined #asahi-gpu
kesslerd has joined #asahi-gpu
kesslerd has quit []
chipxxx has joined #asahi-gpu
maria6 has joined #asahi-gpu
maria has quit [Ping timeout: 480 seconds]
maria6 is now known as maria
chip_x has quit [Ping timeout: 480 seconds]
chipxxx has quit [Remote host closed the connection]
chipxxx has joined #asahi-gpu
wixde has joined #asahi-gpu
Cromulent has joined #asahi-gpu
<lina>
I still get some weird hangs though... I think something is still broken, probably in WSI ^^;;;
<lina>
But at least xonotic now regularly reaches the default 250fps cap!
le0n has quit [Remote host closed the connection]
le0n has joined #asahi-gpu
bisko has joined #asahi-gpu
bisko has quit [Quit: My Mac has gone to sleep. ZZZzzz…]
cy8aer has quit [Remote host closed the connection]
bisko has quit [Ping timeout: 480 seconds]
bluetail has joined #asahi-gpu
cylm has quit [Ping timeout: 480 seconds]
<alyssa>
lina: Sounds like good progress :)
<alyssa>
afaict there are a lot of multisampling issues
<alyssa>
and I'm still not convinced there's not a register or two we're not yet exposing in the uapi so I was going to leave it to you to work through the deqps when you had some time
<alyssa>
and glmark hangs my system now
<alyssa>
so gonna say something regressed
<alyssa>
but yknow
<alyssa>
2 steps fwd 1 step back
<alyssa>
what the heck
<alyssa>
no faults so what's up. an OOM maybe? idk
<alyssa>
nothing in dmesg at the time of the hang, bizarre
<alyssa>
only happening in gnome maybe
<alyssa>
trex 225fps now
<alyssa>
mh 85fps
<alyssa>
so that's definitely better
<alyssa>
worth another few fps on stk
<alyssa>
btw the flush in agx_fence_create is redundant with the one in its only caller
<alyssa>
that only interaction still makes me confused tbh
<alyssa>
fixing MAX_BATCHES brings mh up to 125fps
<alyssa>
ooh and now t-rex is faulting that's neat
<alyssa>
or hanging the system
<alyssa>
yeah i'm going to say this isn't quite ready yet :)
<alyssa>
hopefully the same root cause as your weird hangs
<alyssa>
few more fps on stk
<alyssa>
getting a lot of random pink rectangles.
Cromulent has quit [Quit: Connection closed for inactivity]
possiblemeatball has quit [Quit: Quit]
bisko has joined #asahi-gpu
possiblemeatball has joined #asahi-gpu
bisko has quit [Ping timeout: 480 seconds]
Dementor has quit [Read error: Connection reset by peer]
Dementor has joined #asahi-gpu
bisko has joined #asahi-gpu
bisko has quit [Ping timeout: 480 seconds]
bisko has joined #asahi-gpu
bisko has quit [Ping timeout: 480 seconds]
nyilas has quit [Remote host closed the connection]
bisko has joined #asahi-gpu
wixde has quit [Read error: Connection reset by peer]