<kusma>
zmike: I've bisected it down to 3df9d8ed807a6693d5fc8cbda4faec28af081ff3 (gallium/u_threaded: implement pipelined partial buffer uploads using CPU storage)
<kusma>
valgrind still complains before that commit thought, so I suspect we're just getting unlucky at that point.
<kusma>
zmike: It doesn't seem like zink_reset_batch() NULL out batch->state... What did you mean when you said "that happens during batch reset"?
<zmike>
kusma: hm my cts build doesn't have dEQP-EGL.functional.color_clears.single_context.gles2.rgb
<zmike>
do I need to configure with somethingi special?
<zmike>
I can repro in glmark2 at least
<kusma>
I don't think so... Are you using the glcts, and not whatever is merged into one of the other branches?
<zmike>
yes
<zmike>
also can you explain your idea regarding d3d12's cube compiler pass?
<kusma>
zmike: No need to write new math, we already have the needed math?
<kusma>
I'm using the opengl-cts-4.6.2.0 tag, BTW
<zmike>
I get the idea of reusing code, but, at least from my initial read, it seems like reusing create_array_tex_from_cube_tex would mean that I would need to be always binding cube textures as 2darray
<kusma>
Let's discuss on the MR instead
<zmike>
?
<kusma>
The discussion started on the MR, let's keep it there so others can follow it.
<kusma>
I added a comment there as well.
<zmike>
hm I figured out the bo crash, now I just have to find where it's happening...
<zmike>
this is an annoying one
<zmike>
tl;dr there is a case where batch usage is being set without the bo being added to the batch
<zmike>
not being replaced or invalidated....
<zmike>
oof okay this one sucks
<zmike>
but it can only happen on context destroy at least
<zmike>
kusma: I don't know if you're circling back to this, but I still don't understand how it would be possible to use the 2DARRAY pass without always binding cubes as 2DARRAY
<zmike>
crash fixed.
<zmike>
just need to run it through local ci
cheako is now known as cheakoirccloud
cheakoirccloud has quit []
cheakoirccloud has joined #zink
cheakoirccloud has left #zink [#zink]
<kusma>
@zmike: The pass also updates the samplers to be of the right type, and the driver binds based on the type in the shader, I think.
cheako has joined #zink
<zmike>
kusma: I'm not sure how that avoids cubes needing to either be bound as 2darray 100% of the time or juggling multiple samplerviews and rebinding?
<zmike>
it sounds to me like that means the texture gets bound as a CUBE samplerview and then the driver does some magic to instead bind it as a 2DARRAY
<zmike>
but then that seems like it has the additional problem of needing separate shader variants based on which textures are seamless vs nonseamless
<kusma>
You'd need to choose which textures you need to lower and not. And then the pass lowers all relevant ops for that texture.
<kusma>
And yeah, the driver would have to consider how the texture is used by the shader rather than just the view-type. IIRC, in D3D12, this is trivial.
<zmike>
it's less trivial in vulkan, and this still appears to require every cube be converted to 2darray
<kusma>
Ah, no. We just do the conversion based on the view state.
<zmike>
right, that's what I'm getting at
<kusma>
That's where the seamless flag is, so it shouldn't be a problem.
<kusma>
We do it for integer textures instead, but should be the same.
<zmike>
but again, that would require different shader variants for each possible combination of seamless/non-seamless
<zmike>
so if I have 32 cube textures
<zmike>
any combination of them may or may not be seamless
<zmike>
thus I would need a variant for every possible combination
<zmike>
unless I always bind them as 2darray
<kusma>
First of all, do ARB_seamless_cubemap, not ARB_seamless_cubemap_per_texture.
<zmike>
I'm doing both
<kusma>
ARB_seamless_cubemap_per_texture has no usecases.
<zmike>
that seems like a dubious claim given that the extension exists
<kusma>
In fact, apart from conformance, ARB_seamless_cubemap has practically speaking no usecases. I see some people talking about DX9, but DX9 on DX10 HW does seamless always.
<kusma>
Nobody wants non-seamless, really.
<kusma>
But, the CTS is what it is. So supporting ARB_seamless_cubemap has some merit.
eukara has joined #zink
<zmike>
hm
<zmike>
okay, well let's put aside the per-texture issue then for now
<zmike>
the samplerview juggling still seems not ideal
<zmike>
I also don't like the idea of not being able to support something for the sake of reusing some code that isn't exactly what I needed
<airlied>
port d3d12 to your code :-p
<zmike>
I think the d3d12 one has a slightly different goal in mind?
<kusma>
D3D12 is going to need this eventually also.
<kusma>
But yeah, the D3D12 code as it is now, does this because integer textures suck on DX
<zmike>
oof
<kusma>
Basically, you can sample them, but only while converting to float!
<zmike>
sounds ideal!
<kusma>
You need to do texelFetch to get integers!
<zmike>
very ideal
<kusma>
so we first lower integer cube-maps to layered textures, and then lower integer-samples to texel-fetches.
<kusma>
Very, very ideal.
<kusma>
The whole industry is much impressed.
<kusma>
Poor Gert who had to write this code.
<zmike>
I can hear them oohing and ahhing
<kusma>
BTW, I think maybe r600 has some... similar things going on.
<kusma>
But I think they have some neat HW instructions that lets them split up the sampling.
<zmike>
truly living the dream
<kusma>
grep for `array_is_lowered_cube`...
<zmike>
will check when I get home
<zmike>
it's still not fully clear to me that the d3d12 pass is a better option; from what I've gathered: I'd have to do work on that pass to make it do what I need, I'd have to juggle sampler views in gallium, and I'd never be able to support per-texture
<kusma>
Wait what... why is this in core NIR?
<kusma>
You could do per-texture, but you'd need a big shader-key
<zmike>
yes, and I definitely don't want a big shader key for this
<kusma>
It's in core NIR because... r600 does a half-assed lowering :(
<zmike>
brutal
<kusma>
Yeah.. I think it's time to clean this up a bit.
<kusma>
Anyway, I'm off.
<zmike>
it seems like the most ideal option here would be fixing up the pass I wrote and then stripping non-seamless handling out of d3d12 so that driver only does the int conversion
<zmike>
that way d3d12 gets per-texture for free
<kusma>
I think the layer stuff is better for D3D12 TBH.
<zmike>
I suppose if it's trivial to swizzle the view type then yeah maybe
<kusma>
I think it's better if this stays in the drivers, but using shared helpers for the lowering.
<kusma>
Because drivers can do more targeted solutions. This isn't as clean and universally useful as it seems at first sight, sadly :(
<zmike>
there's other things that aren't universally useful in gallium, and this would generate noticeable overhead if it was in zink instead
<zmike>
so I'm strongly opposed to having it in zink
<kusma>
Maybe we need to do variable-length shader-keys at some point.
<zmike>
?
<kusma>
I mean, we could encode special cases for all-zeroes and all-ones for the shader key, and have only applications that use multiple types pay the cost.
<zmike>
it's a single bool value in gallium padding as it is now
<zmike>
there's no overhead
<kusma>
Then I'm not sure what you meant before with "noticable overhead"
<kusma>
I honestly don't think any other driver than Zink is going to use this lowering, so I'm not sure it's a good fit for lowering in the state-tracker.
<kusma>
Anyway, I need to leave now. Later!
<zmike>
no other driver uses pointsize lowering, but that uses identical mechanisms
<kusma>
The plan was for other drivers to use it.
<zmike>
and by putting it in gallium it can reuse ubo0 instead of needing an additional buffer
<kusma>
In fact, some other drivers did. But they ended up reverting it.
<zmike>
not to mention reducing overhead in driver thread by punting to main thread
<airlied>
is non seamless in other apis?
<airlied>
or does nobody care
<zmike>
d3d9
<airlied>
so you can pass d3d9 conform eith seamles
<airlied>
i wonder should we lobby to relax GL CTS
<zmike>
no, d3d9 doesn't have seamless
<zmike>
dxvk fails in some games because of this
<airlied>
but in a layer situation if you impl it on top of seamless does anyome care?