ChanServ changed the topic of #zink to: official development channel for the mesa3d zink driver || https://docs.mesa3d.org/drivers/zink.html
eukara_ has quit [Remote host closed the connection]
eukara_ has joined #zink
cheako has quit [Quit: Connection closed for inactivity]
<zmike> ajax: I'm looking at the glxgears thing again and I'm not sure exactly how we want to approach this; basically the issue is that glxgears never checks any error states, therefore it would be up to mesa to abort/exit/whatever in this scenario
<zmike> and we don't
<zmike> I'm not sure there's any mechanism in mesa to trigger a shutdown like that at all?
<ajax> zmike: the default xlib error handler is printf + exit, so if a glx call hits a condition described as "GLXBadWhatever is generated" then that's where glxgears would normally crash out
<ajax> something like: we bubble enough of the error back up through SwapBuffers and teach libGL to turn that into a synthetic protocol error
<zmike> ajax: yeah, I'm assuming nv wsi has a handler that changes that behavior
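[editor's note] The handler behaviour discussed above can be illustrated with a small model. This is a sketch only, not Xlib code: every `toy_*` name is hypothetical, and the real mechanism is Xlib's `XSetErrorHandler`. It shows why a client that never checks errors still exits under a "print and exit" default handler, and why it keeps running once something installs a permissive handler, which is what zmike guesses is happening here.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for an X protocol error; not Xlib's XErrorEvent. */
struct toy_error {
    int code;            /* e.g. 3, the core-protocol code for BadWindow */
    const char *name;
};

typedef int (*toy_error_handler)(const struct toy_error *err);

/* Models the default policy described in the chat: report, then exit. */
static int toy_default_handler(const struct toy_error *err)
{
    fprintf(stderr, "X Error of failed request: %s\n", err->name);
    exit(1);
}

static toy_error_handler current_handler = toy_default_handler;
static int ignored_errors;

/* Install a handler and return the previous one; NULL restores the default. */
toy_error_handler toy_set_error_handler(toy_error_handler h)
{
    toy_error_handler old = current_handler;
    current_handler = h ? h : toy_default_handler;
    return old;
}

/* What a library would call when it detects (or synthesizes) an error. */
int toy_raise_error(const struct toy_error *err)
{
    assert(err != NULL);
    return current_handler(err);
}

/* A permissive handler: count the error and carry on. */
int toy_ignore_handler(const struct toy_error *err)
{
    (void)err;
    ignored_errors++;
    return 0;
}

int toy_ignored_count(void)
{
    return ignored_errors;
}
```

With the default handler in place, `toy_raise_error` never returns. With `toy_ignore_handler` installed, the caller continues as if nothing happened, so a synthetic error alone would not be enough to stop such a client.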
<zmike> ugh
<ajax> i can take that on if you want, i started down that path late last week
<zmike> https://pastebin.com/j2k6c59w is the trace
<zmike> I'm working on figuring out texture from pixmap now
<zmike> can check this once I'm done
<zmike> really need your focus on the xcb ordering issue and swapinterval handling
<ajax> k
<zmike> ugh this is terrible
cheako has joined #zink
<zmike> hmm so I'm looking at two options for tfp, and they both seem terrible:
<zmike> 1) implement tfp in kopper
<zmike> 2) reuse dmabuf importing from dri3
<zmike> 1 is terrible because obviously it is
<zmike> 2 is terrible because none of the dri3 stuff is at all reusable if you aren't actually dri3, which means probably lots of awful hacky subclassing or related tricks to try and make it work
<zmike> if I'm not missing something, then I guess when I get back from running a couple errands I'm gonna flip a coin and implement whichever result I get
<ajax> ngh yeah
<zmike> as much as I really don't want to reinvent the wheel I'm thinking that might be the best choice here
<ajax> i guess i was hoping to not think about tfp until i could rely on the server also being zink
<ajax> because then it's just EXT_external_memory_fd hooked up to dri3, i think
<zmike> yeah that's more or less what it'll end up being except I gotta punt it through gallium interfaces
<zmike> so I think either way this work has to happen
<zmike> (dri3 still won't work with zink no matter what without huge refactoring since it takes all dri3 struct types)
LexSfX has quit []
LexSfX has joined #zink
LexSfX has quit [Ping timeout: 480 seconds]
LexSfX has joined #zink
<zmike> $ MESA_LOADER_DRIVER_OVERRIDE=zink bin/glx-tfp -auto
<zmike> PIGLIT: {"result": "pass" }
<zmike> ajax: re: glxgears closing, what if when it's detected that the swapchain is dead in kopper (e.g., your patch) we just trigger a more severe xerror from kopper
<ajax> sounds like the right plan to me
<zmike> seems like that would be easier than trying to pipe errors back up to a handler that doesn't exist yet
<zmike> any ideas for errors that would be catastrophic enough that even the most braindead wsi would exit?
<ajax> BadWindow, you'd think
<zmike> hm
<ajax> so the problem i'm having is
<ajax> i'm trying to get swapinterval working, and a side effect of that is the default changes to actually vsync'd
<zmike> sounds normal enough
<ajax> which means wsi takes a different path, the one where you're using a thread to manage the present events
<zmike> oh no
<ajax> and _only_ with that path, can i reproduce this class of hang-at-exit that i'm currently looking at
<zmike> threads and wsi, name a more awful combo
<zmike> maybe the solution is to disable present thread with zink?
<zmike> (for now)
<ajax> when you say present thread, do you mean the one in wsi for the swapchain, or do you mean one in zink for async gl command dispatch?
<zmike> wsi
<ajax> i don't think the non-present-thread code path would behave properly for FIFO queues
<zmike> awkward
<ajax> if i could make it work without a thread believe me i'd love to
<zmike> I was imagining it might be a good enough bandaid to slap on for now since we're holding up 22.1 release
<ajax> scheisse
<ajax> okay so
<zmike> I guess
<ajax> it's stopped at exactly the same point. xcb_wait_for_special_event is a terrible api
<zmike> yeah
<zmike> I just found more fuckups with resizing if you do it fast enough
<zmike> so there goes my week
<zmike> but I think tfp is done at least
<ajax> nice!
<ajax> i need to figure out how to repro whatever was going wrong with my previous attempt, all i remember is someone bisected it guilty and it got reverted
<zmike> yeah I think it was fps drops in some game or something like that?
<ajax> i _think_ the way i wrote it this time is immune to the race michel pointed out
<zmike> "this time"
<ajax> look
<zmike> ptsd intensifies
<zmike> did you put up a MR yet or just branch?
<ajax> current state isn't pushed yet, give me a few
<zmike> no rush, I'm trying to figure out how to do dynamically resizable depth buffers
<zmike> oooh this is really fucking gross but I nailed it
<zmike> try this out
<ajax> ew
<zmike> yes
<zmike> it's basically the non-wsi version of the existing swapchain buffer code
<zmike> I'm wondering if I should just remove the special casing in kopper now and let zink fully manage the depth buffer with this handler
<zmike> since now it'll be kopper recreating depth buffers on window resize but then also zink internally recreating depth buffers at the Truly Correct Size and sneaking them in behind kopper's back anyway
<ajax> yeah i feel like you can just go ahead and resize zs when you resize front/back?
<zmike> that's basically what this is doing
<zmike> so I think I'm gonna test out dropping the special casing since potentially now kopper might alloc a depth buffer that never gets used
<ajax> ack
<zmike> looking at the remaining ci regressions from kopper and I don't see these failing locally?
<zmike> 🤔
<ajax> hm hm
<ajax> zmike: !15800 has current bits for swapinterval. zero regressions for me with piglit quick_gl and radv.
<zmike> I'm on it
<zmike> ajax: looks good overall, did you ever figure out what happens on resize?
<zmike> wonder if the interval should just be included in the loader info so zink_kopper can just apply it then
<ajax> resizing being the sequence number abort? not yet but apparently it's still easy to hit
<zmike> also does it fix the piglit regressions listed in the ci baseline for swap control?
<zmike> no, not that
<zmike> resize+swapinterval
<zmike> hm
<zmike> actually, I guess it wouldn't matter would it since zink handles the resizing and thus the interval should be preserved there
<zmike> DISREGARD
<ajax> oh that yeah. yes, look at the extra vtable nonsense i had to do in the egl code to keep it away from the dri2 protocol
<zmike> yeah gross
<zmike> but expected
<zmike> it's not dri without a vtable
<zmike> or 5
<ajax> that tfp code looks... plausible.
<ajax> that work on nvidia?
<ajax> i kind of imagine no
<zmike> it does not
<zmike> though that test fails the same way on nv native?
<zmike> 🤔
<ajax> heh
<ajax> yeah it's not really "good"
<zmike> X Error of failed request: BadMatch (invalid parameter attributes)
<zmike> Major opcode of failed request: 152 (GLX)
<zmike> Minor opcode of failed request: 22 (X_GLXCreatePixmap)
<zmike> yeah but it's the only "simple" test I have for it
<zmike> does it work for you on radv?
<ajax> firefox's o-o-p rendering uses it i think?
<zmike> I only have nv and intel up right now
<zmike> uhhhhhhh
<zmike> I don't think I've ever tested running ff on zink
<zmike> not sure this is the time to start haha
<ajax> buk buk buk buk braawwwwwk
<zmike> oh no you didn't
<zmike> fine
<zmike> FINE
<zmike> well ff works
<zmike> I'm slamming webgl aquarium on it
<ajax> yep, works with radv
<zmike> cool
<ajax> and with anv
<zmike> okay, so the only remaining release blockers (https://gitlab.freedesktop.org/mesa/mesa/-/issues/6267) are the disconnects and the auto-loading fallback
<zmike> I can tackle the fallback tomorrow
<zmike> and maybe try to throw a BadWindow into glxgears somehow
<ajax> i still don't understand how you're having that problem
<zmike> yeah it's baffling
<zmike> I guess nv wsi is just insanely permissive
<zmike> that special_event patch in your MR is pretty gnarly
<ajax> didn't say i liked it
<ajax> i could add like a usleep(100) or something to make it less busy-waity but i have trouble caring
<zmike> haha
<zmike> yeah
<ajax> you shouldn't get to that spot, and if you do this is not why you're slow
<ajax> you have fewer images than threads wanting to work on them. resize your shiz.
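[editor's note] The trade-off ajax describes, a pure busy-wait versus a short sleep between polls, can be sketched as below. This is not the code from the MR: `poll_with_sleep`, `ready_fn` and `countdown_ready` are hypothetical names, and `nanosleep` stands in for the `usleep(100)` mentioned in the chat.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical predicate type: returns true once the awaited condition holds. */
typedef bool (*ready_fn)(void *ctx);

/*
 * Poll `ready` until it returns true, sleeping `sleep_us` microseconds
 * after each failed attempt. Returns the number of polls made, or -1 if
 * the condition was still false after `max_polls` attempts.
 */
int poll_with_sleep(ready_fn ready, void *ctx, long sleep_us, int max_polls)
{
    struct timespec ts;
    int i;

    assert(ready != NULL);
    ts.tv_sec = sleep_us / 1000000L;
    ts.tv_nsec = (sleep_us % 1000000L) * 1000L;

    for (i = 1; i <= max_polls; i++) {
        if (ready(ctx))
            return i;
        nanosleep(&ts, NULL);  /* yield the CPU instead of spinning */
    }
    return -1;
}

/* Example predicate: false while the counter is positive, then true. */
bool countdown_ready(void *ctx)
{
    int *remaining = (int *)ctx;
    if (*remaining > 0) {
        (*remaining)--;
        return false;
    }
    return true;
}
```

Passing a `sleep_us` of 0 makes the loop close to the busy-wait variant. As ajax notes, the sleep only matters if the caller reaches this path at all.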
<zmike> hopefully this doesn't cause more mysterious regressions
<zmike> whew there's a light at the end of the tunnel finally
<ajax> i couldn't run quick_gl to completion without it so i'm pretty hopeful
<zmike> awesome
<zmike> I'm queuing up some runs
eukara has joined #zink
eukara_ has quit [Ping timeout: 480 seconds]
eukara has quit [Ping timeout: 480 seconds]
eukara has joined #zink
eukara_ has joined #zink
eukara has quit [Ping timeout: 480 seconds]
eukara__ has joined #zink
eukara_ has quit [Ping timeout: 480 seconds]
eukara__ has quit [Read error: Connection reset by peer]
<ajax> ngh
<ajax> so i can pretty reliably make kopper glxgears crash just by rapidly resizing
<ajax> and if i forcibly call XInitThreads() from libGL's constructor, it doesn't crash
<ajax> so fundamentally the issue here is it's not possible to use xcb safely without knowing whether it's the backend for some xlib display
<ajax> which there's no API for
<ajax> i could scrape /proc/pid/maps for my own heap and try to find my own xcb_connection_t* somewhere aligned. it's 64 bits, it's not going to false-positive.
<ajax> but then the other problem is i don't think XInitThreads can help you if a display was already created before you called it
<ajax> which means if libGL was dlopen'd you might just be out of luck.
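[editor's note] The heap-scraping idea floated above has two mechanical parts: finding the `[heap]` range in `/proc/<pid>/maps`, and scanning that range at pointer alignment for a known pointer value. A minimal sketch of both follows. The function names are hypothetical, and the sketch works on a caller-supplied line and buffer instead of touching the live heap. It illustrates the technique only and is not a recommendation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one line of /proc/<pid>/maps. Returns 1 and fills lo/hi if the
 * line describes the "[heap]" mapping, 0 otherwise.
 */
int parse_heap_line(const char *line, uintptr_t *lo, uintptr_t *hi)
{
    unsigned long long start, end;

    assert(line != NULL && lo != NULL && hi != NULL);
    if (strstr(line, "[heap]") == NULL)
        return 0;
    if (sscanf(line, "%llx-%llx", &start, &end) != 2)
        return 0;
    *lo = (uintptr_t)start;
    *hi = (uintptr_t)end;
    return 1;
}

/*
 * Scan [lo, hi) in pointer-sized steps, starting at lo, for a word equal
 * to `needle`. Returns the address of the first match, or NULL.
 * memcpy is used so arbitrary memory can be read as an integer safely.
 */
const void *find_aligned_word(const void *lo, const void *hi, uintptr_t needle)
{
    const unsigned char *p = (const unsigned char *)lo;
    const unsigned char *end = (const unsigned char *)hi;

    if (p > end)
        return NULL;
    while ((size_t)(end - p) >= sizeof(uintptr_t)) {
        uintptr_t word;
        memcpy(&word, p, sizeof word);
        if (word == needle)
            return p;
        p += sizeof(uintptr_t);
    }
    return NULL;
}
```

A real implementation would read `/proc/self/maps` line by line, feed each line to the parser, and scan the reported range for the value of its own `xcb_connection_t *`. Even then, a match only proves that something stored that pointer, and it does not solve the ordering problem with `XInitThreads` raised above.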
<zmike> starting to feel like we're getting into the territory of the truly insane
<daniels> ajax: just ram through the patch to make every display threadsafe?
<daniels> I posted it a while ago but decided I didn’t care when it became about nop chicken on various platforms
<ajax> daniels: i mean. yeah. i kind of hate saying everyone gets to upgrade their libX11...
<ajax> my other insane plan was to just import the _Display struct ABI into libxcb and use that for xcb_connection_t's actual storage
<ajax> no question of request numbers getting out of sync if they're only stored in one place!
<ajax> but there too you're forcing everyone to update libxcb just to make this one driver work
<ajax> afk for the evening. branch updated with some minor cleanups, but ci choked on the last version and i didn't fix anything about those fails yet
<daniels> tbf we are talking about an unreleased version of Mesa here … ?
<daniels> so upgrades are fairly implied
eukara has joined #zink
eukara_ has joined #zink
eukara has quit [Read error: No route to host]
eukara_ has quit [Ping timeout: 480 seconds]