pribas has quit [Read error: Connection reset by peer]
neobrain[m] has joined #zink
Soroush has joined #zink
nebadon has joined #zink
<nebadon>
ok well I updated to the latest version on git master as of right now, and there's some improvement: Unigine Heaven runs perfectly, but Tomb Raider crashes my system hard and I get logged out of KDE
<nebadon>
never even see the menu of the game
<nebadon>
just black screen, then logged out about a minute later
xroumegue has quit [Ping timeout: 480 seconds]
xroumegue has joined #zink
fahien has joined #zink
fahien has quit [Ping timeout: 480 seconds]
<zmike>
hm
fahien has joined #zink
<zmike>
nebadon: I can't repro
<zmike>
it works fine for me on latest git
<zmike>
I am seeing a weird crash though
<zmike>
kusma: really need to keep all this code out of the draw path
cheako has quit [Quit: Connection closed for inactivity]
<anholt>
actual game, even. asphalt9. with your branch, still 594 bos and 110MB of descriptors.
<zmike>
wasn't it at like 800mb before?
<anholt>
that was total memory consumption.
<zmike>
ah
<zmike>
well I think this is probably the best I can do with the current tools available to me
<anholt>
yeah, this branch seems to be no change on this test.
<zmike>
alright, I guess maybe just tomb raider and related games
<anholt>
I have tried to read zink's descriptor code so many times and I'm utterly lost. definitely interested in debug here -- descriptors are something that zink is profligate with compared to angle.
<anholt>
hmm. so it doesn't look like the issue for asphalt is descriptor type variants, they're all VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER and VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER. we're just apparently using >3500 descriptors in a batch?
<anholt>
set descriptor counts vary a little bit, but each descriptor set is pretty small.
<zmike>
yeah if there's like 1000 draws per batch and each draw changes descriptors then that could happen pretty easily
<anholt>
150-250 at the high points it looks like.
<zmike>
still not too hard to believe if it's all different layouts
<zmike>
I've been meaning to do some sort of "aligning" of the set layouts such that e.g., a set using 1 descriptor would be aligned to the pool that allocates 4 descriptors
<zmike>
so some sets would be overallocated a bit, but there would be fewer pools
<zmike>
I tried it once a long time ago iirc and I didn't have time to remember how the rules for that worked so that it was legal
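The "aligning" idea above can be illustrated with a toy model. This is only a sketch, not zink's code: it assumes a pool is keyed by a single descriptor count (real layouts also carry descriptor types and bindings), and the names `align_count` and `pools_needed` and the granularity of 4 are hypothetical, taken from the "1 descriptor aligned to the pool that allocates 4" example.

```python
def align_count(count, granularity=4):
    """Round a descriptor count up to the next multiple of `granularity`."""
    return ((count + granularity - 1) // granularity) * granularity


def pools_needed(set_sizes, granularity=None):
    """Count distinct pools, assuming one pool per distinct (aligned) set size."""
    if granularity is None:
        keys = set(set_sizes)
    else:
        keys = {align_count(n, granularity) for n in set_sizes}
    return len(keys)


# Hypothetical per-set descriptor counts seen in one batch.
sizes = [1, 2, 3, 4, 5, 7, 8]

exact_pools = pools_needed(sizes)          # one pool per exact size
aligned_pools = pools_needed(sizes, 4)     # sizes collapse into buckets 4 and 8
wasted = sum(align_count(n, 4) - n for n in sizes)  # descriptors overallocated
```

In this toy input, alignment trades a handful of overallocated descriptors for far fewer distinct pools, which is the trade-off described above.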
<anholt>
a frame makes 10-15 descriptor pools. how do we have 600 of them live?
<zmike>
pools are per-batch to avoid locking and usage tracking
<anholt>
I guess that was just early in the trace while things were ramping up
<zmike>
so for each batch there are enough pools to render the frame typically
<zmike>
which means 10-15 pools per batch
<zmike>
but then also each pool has N sub-pools for overflow handling
<zmike>
i.e., when the pool is full it gets punted for later reuse and a new pool is allocated
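The overflow handling described above could be modeled roughly as follows. This is a simplified sketch under stated assumptions, not zink's implementation: the class `BatchPool`, its fields, and the fixed `capacity` are all hypothetical, and a "pool" here is just a counter rather than a VkDescriptorPool.

```python
class BatchPool:
    """Toy model of one per-batch descriptor pool with overflow sub-pools."""

    def __init__(self, capacity):
        self.capacity = capacity   # sets each sub-pool can hold (assumed fixed)
        self.used = 0              # sets allocated from the current sub-pool
        self.full_pools = []       # sub-pools punted because they filled up
        self.free_pools = 0        # recycled sub-pools available for reuse
        self.pools_created = 1     # total sub-pools ever allocated

    def alloc_set(self):
        if self.used == self.capacity:
            # Current sub-pool is full: punt it and switch to another one.
            self.full_pools.append(self.used)
            if self.free_pools:
                self.free_pools -= 1       # reuse a recycled sub-pool
            else:
                self.pools_created += 1    # nothing to reuse, allocate a new one
            self.used = 0
        self.used += 1

    def reset(self):
        """Batch has finished: punted sub-pools become reusable."""
        self.free_pools += len(self.full_pools)
        self.full_pools.clear()
        self.used = 0


pool = BatchPool(capacity=4)
for _ in range(10):
    pool.alloc_set()
created_first_pass = pool.pools_created
pool.reset()
for _ in range(10):
    pool.alloc_set()
created_second_pass = pool.pools_created
```

Because each batch owns its own `BatchPool` in this model, nothing is shared between batches, which matches the stated goal of avoiding locking and usage tracking at the cost of more live pools.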
<anholt>
5 QueueSubmits per frame, so it sounds like 5ish batches per frame. last_finished seems to be chasing current batch_id pretty closely.
<zmike>
there's multiple submits from swapchain handling too
<zmike>
as well as semaphore waits
<zmike>
to get multiple (zink) batches per frame the app would have to explicitly be flushing mid-frame
<anholt>
that was logging at zink's batch's queuesubmit
<zmike>
huh
<anholt>
still trying to work out how we get to 600 live.
<zmike>
so I guess the app is doing flushes? bizarre
<zmike>
600 seems reasonable to me if there's 10-15 per frame
<anholt>
it is apparently a huge fan of glFlush
<zmike>
that'll do it
<anholt>
ah, apparently not necessarily the app, the glflushes look like possibly part of angle's serialization of multithreaded rendering.
<zmike>
ergh
<zmike>
yeah mid-frame flushing is super bad for zink
<zmike>
in any case though, if there's mid-frame flushing like this then you're going to see the descriptor count effectively multiplied by num_flushes
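The multiplication mentioned above can be written out as simple arithmetic. This is a sketch only: `pools_per_frame` is a hypothetical helper, and the inputs are the rough figures quoted in this conversation (10-15 pools per batch, about 5 batches per frame), not measurements.

```python
def pools_per_frame(pools_per_batch, batches_per_frame):
    """Pools touched in one frame when every flush starts a new batch."""
    return pools_per_batch * batches_per_frame


# One batch per frame versus a frame split into 5 batches by flushes,
# assuming roughly 12 pools are needed per batch.
single_batch = pools_per_frame(12, 1)
flushed = pools_per_frame(12, 5)
```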
<anholt>
glFlush looks like actual app work -- it's preceded by glFenceSync() in that context, and there's some cycling through doing glWaitSync() on them in later frames in the other context.
<zmike>
unfortunate
<anholt>
well, it's not mid-frame work at least. you're intentionally flushing the stuff.
<zmike>
I am?
<anholt>
they are.
<zmike>
ah
<zmike>
one mystery solved-ish
<anholt>
bah. the wsi images have VK_IMAGE_USAGE_SAMPLED_BIT. I was hoping I could make asphalt fast by just checking if implicit sync bos were included in framebuffers at beginrendering time. :/
<zmike>
:/
<anholt>
(makes sense, need it for u_blitter or angle's equivalent)
<zmike>
yea
<zmike>
is sort of a shame that it can't be dynamically enabled though
<anholt>
zmike: in zink_set_color_write_enables(), shouldn't the dummy cb path be for *not* having EXT_cwe?
<zmike>
ughhhhh
<zmike>
I recently removed the driver workaround for this
<zmike>
and forgot to invert the cases
<anholt>
use_dummy_attachments too
<zmike>
yeah it was all just sed
<zmike>
all the cases need to be inverted
<zmike>
rb if you wanna just throw a MR in
<anholt>
I don't think it's all cases, but I can mr