#zink on 2022-03-17 — irc logs at oftc.irclog.whitequark.org

2021-07-26 22:56 ChanServ changed the topic of #zink to: official development channel for the mesa3d zink driver || https://docs.mesa3d.org/drivers/zink.html

06:40 eukara_ has quit []

09:02 anholt has quit [Ping timeout: 480 seconds]

09:08 anholt has joined #zink

10:24 <kusma> zmike: I've bisected it down to 3df9d8ed807a6693d5fc8cbda4faec28af081ff3 (gallium/u_threaded: implement pipelined partial buffer uploads using CPU storage)

10:38 <kusma> valgrind still complains before that commit though, so I suspect we're just getting unlucky at that point.

10:49 <kusma> zmike: It doesn't seem like zink_reset_batch() NULLs out batch->state... What did you mean when you said "that happens during batch reset"?

13:19 <zmike> kusma: hm my cts build doesn't have dEQP-EGL.functional.color_clears.single_context.gles2.rgb

13:19 <zmike> do I need to configure with something special?

13:20 <zmike> I can repro in glmark2 at least

13:20 <kusma> I don't think so... Are you using the glcts, and not whatever is merged into one of the other branches?

13:20 <zmike> yes

13:20 <zmike> also can you explain your idea regarding d3d12's cube compiler pass?

13:22 <kusma> zmike: No need to write new math, we already have the needed math?

13:22 <kusma> I'm using the opengl-cts-4.6.2.0 tag, BTW

13:24 <zmike> I get the idea of reusing code, but, at least from my initial read, it seems like reusing create_array_tex_from_cube_tex would mean that I would need to be always binding cube textures as 2darray

13:25 <kusma> Let's discuss on the MR instead

13:25 <zmike> ?

13:29 <kusma> The discussion started on the MR, let's keep it there so others can follow it.

13:30 <kusma> I added a comment there as well.

13:39 <zmike> hm I figured out the bo crash, now I just have to find where it's happening...

13:39 <zmike> this is an annoying one

13:41 <zmike> tl;dr there is a case where batch usage is being set without the bo being added to the batch

13:44 <zmike> not being replaced or invalidated....

14:14 <zmike> oof okay this one sucks

14:14 <zmike> but it can only happen on context destroy at least

15:05 <zmike> kusma: I don't know if you're circling back to this, but I still don't understand how it would be possible to use the 2DARRAY pass without always binding cubes as 2DARRAY

15:46 <zmike> crash fixed.

15:46 <zmike> just need to run it through local ci

15:46 cheako is now known as cheakoirccloud

15:48 cheakoirccloud has quit []

15:48 cheakoirccloud has joined #zink

15:49 cheakoirccloud has left #zink [#zink]

15:52 <kusma> @zmike: The pass also updates the samplers to be of the right type, and the driver binds based on the type in the shader, I think.

15:53 cheako has joined #zink

15:56 <zmike> kusma: I'm not sure how that avoids cubes needing to either be bound as 2darray 100% of the time or juggling multiple samplerviews and rebinding?

16:03 <zmike> it sounds to me like that means the texture gets bound as a CUBE samplerview and then the driver does some magic to instead bind it as a 2DARRAY

16:03 <zmike> but then that seems like it has the additional problem of needing separate shader variants based on which textures are seamless vs nonseamless

16:04 <kusma> You'd need to choose which textures you need to lower and not. And then the pass lowers all relevant ops for that texture.

16:05 <kusma> And yeah, the driver would have to consider how the texture is used by the shader rather than just the view-type. IIRC, in D3D12, this is trivial.

16:05 <zmike> it's less trivial in vulkan, and this still appears to require every cube be converted to 2darray

16:07 <kusma> Ah, no. We just do the conversion based on the view state.

16:07 <zmike> right, that's what I'm getting at

16:08 <kusma> That's where the seamless flag is, so it shouldn't be a problem.

16:08 <kusma> We do it for integer textures instead, but should be the same.

16:08 <zmike> but again, that would require different shader variants for each possible combination of seamless/non-seamless

16:08 <zmike> so if I have 32 cube textures

16:08 <zmike> any combination of them may or may not be seamless

16:08 <zmike> thus I would need a variant for every possible combination

16:09 <zmike> unless I always bind them as 2darray
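
The variant explosion zmike describes can be made concrete with a small sketch. It assumes a hypothetical shader key that stores one seamless/non-seamless bit per bound cube texture; `seamless_key` and `variant_count` are illustrative names, not Mesa functions.

```python
from itertools import product


def seamless_key(flags):
    """Pack per-texture seamless flags into a bitmask (bit i = texture i).

    Hypothetical key layout, for illustration only.
    """
    key = 0
    for i, seamless in enumerate(flags):
        if seamless:
            key |= 1 << i
    return key


def variant_count(num_cube_textures):
    """Upper bound on distinct keys when every texture can differ."""
    return 1 << num_cube_textures


# Enumerate every key for 3 cube textures: 2**3 distinct shader variants.
keys_for_three = {seamless_key(f) for f in product([False, True], repeat=3)}
```

With 32 cube textures the same bound is 2**32 possible keys, which is why a per-texture shader key is unattractive unless the lowering is applied unconditionally.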

16:09 <kusma> First of all, do ARB_seamless_cubemap, not ARB_seamless_cubemap_per_texture.

16:09 <zmike> I'm doing both

16:09 <kusma> ARB_seamless_cubemap_per_texture has no usecases.

16:10 <zmike> that seems like a dubious claim given that the extension exists

16:10 <kusma> In fact, apart from conformance, ARB_seamless_cubemap has practically speaking no usecases. I see some people talking about DX9, but DX9 on DX10 HW does seamless always.

16:11 <kusma> Nobody wants non-seamless, really.

16:11 <kusma> But, the CTS is what it is. So supporting ARB_seamless_cubemap has some merit.

16:13 eukara has joined #zink

16:14 <zmike> hm

16:15 <zmike> okay, well let's put aside the per-texture issue then for now

16:15 <zmike> the samplerview juggling still seems not ideal

16:35 <zmike> I also don't like the idea of not being able to support something for the sake of reusing some code that isn't exactly what I needed

16:36 <airlied> port d3d12 to your code :-p

16:37 <zmike> I think the d3d12 one has a slightly different goal in mind?

16:39 <kusma> D3D12 is going to need this eventually also.

16:39 <kusma> But yeah, the D3D12 code as it is now, does this because integer textures suck on DX

16:40 <zmike> oof

16:42 <kusma> Basically, you can sample them, but only while converting to float!

16:42 <zmike> sounds ideal!

16:43 <kusma> You need to do texelFetch to get integers!

16:43 <zmike> very ideal

16:43 <kusma> so we first lower integer cube-maps to layered textures, and then lower integer-samples to texel-fetches.

16:43 <kusma> Very, very ideal.

16:44 <kusma> The whole industry is much impressed.

16:44 <kusma> Poor Gert who had to write this code.

16:44 <zmike> I can hear them oohing and ahhing
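
The cube-to-layered-texture lowering mentioned above relies on the standard cube map face selection from the OpenGL specification. The following is a minimal sketch of that math only, not the d3d12 or zink pass itself; the function name, the layer order (+X, -X, +Y, -Y, +Z, -Z as layers 0 to 5) and the tie-breaking order for equal axes are assumptions made for illustration.

```python
def cube_to_layer_uv(rx, ry, rz):
    """Map a cube direction to (layer, s, t) of a 6-layer 2D array.

    Follows the major-axis selection table from the OpenGL spec.
    The direction must not be the zero vector.
    """
    ax, ay, az = abs(rx), abs(ry), abs(rz)
    if ax >= ay and ax >= az:
        # X is the major axis.
        ma = ax
        layer, sc, tc = (0, -rz, -ry) if rx > 0 else (1, rz, -ry)
    elif ay >= az:
        # Y is the major axis.
        ma = ay
        layer, sc, tc = (2, rx, rz) if ry > 0 else (3, rx, -rz)
    else:
        # Z is the major axis.
        ma = az
        layer, sc, tc = (4, rx, -ry) if rz > 0 else (5, -rx, -ry)
    # Remap from [-1, 1] to [0, 1] texture coordinates.
    s = 0.5 * (sc / ma + 1.0)
    t = 0.5 * (tc / ma + 1.0)
    return layer, s, t
```

Once a lookup is reduced to a single layer like this, filtering cannot cross face edges, which is the non-seamless behaviour under discussion.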

16:44 <kusma> BTW, I think maybe r600 has some... similar things going on.

16:45 <kusma> But I think they have some neat HW instructions that lets them split up the sampling.

16:45 <zmike> truly living the dream

16:46 <kusma> grep for `array_is_lowered_cube`...

16:46 <zmike> will check when I get home

16:46 <zmike> it's still not fully clear to me that the d3d12 pass is a better option; from what I've gathered: I'd have to do work on that pass to make it do what I need, I'd have to juggle sampler views in gallium, and I'd never be able to support per-texture

16:46 <kusma> Wait what... why is this in core NIR?

16:47 <kusma> You could do per-texture, but you'd need a big shader-key

16:49 <zmike> yes, and I definitely don't want a big shader key for this

16:49 <kusma> It's in core NIR because... r600 does a half-assed lowering :(

16:49 <zmike> brutal

16:50 <kusma> Yeah.. I think it's time to clean this up a bit.

16:50 <kusma> Anyway, I'm off.

16:50 <zmike> it seems like the most ideal option here would be fixing up the pass I wrote and then stripping non-seamless handling out of d3d12 so that driver only does the int conversion

16:50 <zmike> that way d3d12 gets per-texture for free

16:51 <kusma> I think the layer stuff is better for D3D12 TBH.

16:51 <zmike> I suppose if it's trivial to swizzle the view type then yeah maybe

16:51 <kusma> I think it's better if this stays in the drivers, but using shared helpers for the lowering.

16:52 <kusma> Because drivers can do more targeted solutions. This isn't as clean and universally useful as it seems at first sight, sadly :(

16:53 <zmike> there's other things that aren't universally useful in gallium, and this would generate noticeable overhead if it was in zink instead

16:53 <zmike> so I'm strongly opposed to having it in zink

16:53 <kusma> Maybe we need to do variable-length shader-keys at some point.

16:54 <zmike> ?

16:54 <kusma> I mean, we could encode special cases for all-zeroes and all-ones for the shader key, and have only applications that use multiple types pay the cost.
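
A rough sketch of the variable-length key kusma suggests, assuming a hypothetical one-byte tag followed by an optional per-texture bitmask; the tag values and function names are invented for illustration and do not correspond to any existing Mesa structure.

```python
def encode_key(mask, num_textures):
    """Encode a per-texture bitmask, short-circuiting the uniform cases."""
    full = (1 << num_textures) - 1
    if mask == 0:
        return b"\x00"  # all textures use the default: 1 byte
    if mask == full:
        return b"\x01"  # all textures need the lowering: 1 byte
    nbytes = (num_textures + 7) // 8
    # Mixed case: only these applications pay for the full mask.
    return b"\x02" + mask.to_bytes(nbytes, "little")


def decode_key(data, num_textures):
    """Inverse of encode_key."""
    tag = data[0]
    if tag == 0:
        return 0
    if tag == 1:
        return (1 << num_textures) - 1
    return int.from_bytes(data[1:], "little")
```

Applications that treat all cube textures the same way would hash and compare a single byte, while mixed usage falls back to the full bitmask.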

16:55 <zmike> it's a single bool value in gallium padding as it is now

16:55 <zmike> there's no overhead

16:55 <kusma> Then I'm not sure what you meant before with "noticeable overhead"

16:55 <kusma> I honestly don't think any other driver than Zink is going to use this lowering, so I'm not sure it's a good fit for lowering in the state-tracker.

16:56 <kusma> Anyway, I need to leave now. Later!

16:57 <zmike> no other driver uses pointsize lowering, but that uses identical mechanisms

16:57 <kusma> The plan was for other drivers to use it.

16:57 <zmike> and by putting it in gallium it can reuse ubo0 instead of needing an additional buffer

16:57 <kusma> In fact, some other drivers did. But they ended up reverting it.

16:58 <zmike> not to mention reducing overhead in driver thread by punting to main thread

16:59 <airlied> is non seamless in other apis?

16:59 <airlied> or does nobody care

17:00 <zmike> d3d9

17:00 <airlied> so you can pass d3d9 conformance with seamless

17:00 <airlied> i wonder should we lobby to relax GL CTS

17:00 <zmike> no, d3d9 doesn't have seamless

17:01 <zmike> dxvk fails in some games because of this

17:01 <airlied> but in a layer situation if you impl it on top of seamless does anyone care?

17:02 <zmike> yes, games do not work

17:02 <zmike> there are tickets open for this

17:04 <zmike> https://github.com/doitsujin/dxvk/issues/2239

17:04 <zmike> https://github.com/doitsujin/dxvk/issues/2361

17:04 <zmike> it's still necessary

17:20 airlied has quit [Remote host closed the connection]

17:21 airlied has joined #zink

17:40 <zmike> ajax: I fixed the glmark2 thing, but you'll have to pull in a small fixup for kopper once it lands

17:40 <zmike> unless you're going to just push to the kopper MR branch and then I can add it myself

17:51 <ajax> zmike: updated the mr branch

17:52 <ajax> feel free to fix up as desired

17:54 <zmike> ajax: fixedup

17:54 <zmike> glmark2 should work fine now

17:57 <kusma> Shadow cube maps, yeah... IIRC there's a ticket for that. I think ANV is behaving strange there...

18:00 <kusma> IIRC, they are not following the vk spec.

18:10 <ajax> [function] fragment-complexity=low:fragment-steps=5:glmark2: ../src/gallium/drivers/zink/zink_fence.c:128: zink_vkfence_wait: Assertion `fence->batch_id' failed.

18:10 <daniels> imagine what high complexity is like

18:11 <ajax> zmike: ^ this at the separator commit

18:11 <zmike> 🤔

18:11 * zmike runs it for longer

18:11 <ajax> let's see if it repros under gdb

18:12 <zmike> did you just let it run until it got there?

18:12 <ajax> yeah

18:12 <zmike> k, I'll leave it on in the background then

18:41 <ajax> hmph, gdb perturbs things enough that it doesn't happen

18:48 <zmike> ajax: not seeing it here

18:48 <zmike> let me try just on that branch...

19:04 <ajax> not crashing in the same place every time, unsurprisingly, but not ever crashing under gdb

19:04 <zmike> coredumpctl?

19:05 <ajax> https://paste.centos.org/view/raw/ff1768c6

19:05 <ajax> that looks suspicious

19:05 <zmike> hmmm

19:06 <zmike> ngl I'm not super inclined to look at this if it gets fixed by kopper?

19:06 <zmike> but yes, ideally dri3 shouldn't be trying to wait on a deleted fence

19:07 <zmike> I'd think that destroying the context would cause all the flushes dri has picked up to be discarded

19:07 <zmike> errr all the fences*

19:10 <ajax> did you drop the reset status query patch on purpose?

19:10 <zmike> yeah, it only works after kopper

19:10 <zmike> so probably not having it doesn't regress anything

19:10 <ajax> except

19:10 <ajax> glmark2: ../src/gallium/frontends/dri/kopper.c:148: kopper_init_screen: Assertion `pscreen->get_param(pscreen, PIPE_CAP_DEVICE_RESET_STATUS_QUERY)' failed.

19:10 <zmike> argh

19:11 <zmike> okay okay, maybe it needs to be squashed?

19:11 <zmike> grimace

19:11 <ajax> yeah squish it

19:12 <zmike> squished and pushed

19:15 <ajax> [build] use-vbo=false: FPS: 6040 FrameTime: 0.166 ms

19:15 <ajax> X connection to :0 broken (explicit kill or server shutdown).

19:16 <zmike> probably drisw xlib again

19:16 <zmike> same as glxgears in xwl

19:17 <ajax> glmark2 --reuse-context doesn't crash

19:18 <ajax> this should be entertaining

19:22 <ajax> excuse me, doesn't crash until exit

19:23 <zmike> haha