graphitemaster has joined #zink
<graphitemaster>
I'm trying to build a stand-alone zink that will leverage nvidia provided vulkan and I'm running into some issues, mesa keeps trying to load swrast_dri
<graphitemaster>
My build command is meson --prefix=/tmp/zink -Dgallium-drivers=zink -Dvulkan-drivers= -Ddri-drivers= build-zink; ninja -C build-zink/ install
<graphitemaster>
Testing it with LD_LIBRARY_PATH=/tmp/zink/lib MESA_LOADER_DRIVER_OVERRIDE=zink glxinfo
<graphitemaster>
libGL error: failed to load driver: swrast
kusma has joined #zink
<kusma>
I think you need to use the GALLIUM_DRIVER=zink approach on NVIDIA until Penny/Copper has landed... ajax_?
<graphitemaster>
That works. Looks like mesa + zink doesn't do mesa_glthread=true ?
<graphitemaster>
"dri_create_context: requested glthread but driver is missing backgroundCallable V2 extension"
<kusma>
graphitemaster: That might be because you're down a swrast codepath now :-/
<kusma>
And we don't really have a better option for NVIDIA yet. Penny/Copper will fix that, but we're not there yet.
<graphitemaster>
I can't imagine the WSI blit is that SLOW
<graphitemaster>
Anyways, I'm just surprised it works XD
<kusma>
Penny/Copper is ajax's patches to hook up to the Vulkan WSI code instead
<kusma>
graphitemaster: The good news is that most of your perf is bound to blitting the frontbuffer, so you can probably do much more heavy rendering at not much less perf! ;)
<graphitemaster>
Was going to say, gsync is totally broken too
<kusma>
BTW, just checked, mesa_glthread=true seems to work fine on Intel with DRI
<kusma>
Because our winsys stuff should kinda be on par with llvmpipe (modulo some details that probably don't matter)
<graphitemaster>
Could try llvmpipe there, I dunno how gsync would work with software rendering though
<graphitemaster>
Unless you're using hw to present to swapchain
<kusma>
Yeah, so maybe that's the reason. And again, I guess Penny/Copper is the fix ;)
<kusma>
It's becoming a bit of a meme; Penny/Copper will fix EVERYTHING ;)
<kusma>
I'm sure Ajax is working out the interactions it has with flying cars and jetpacks right now.
<graphitemaster>
Apparently in llvmpipe all my glTextureSubImage3D calls are straight up INVALID_VALUE / INVALID_OPERATION, but also glthread does not appear to work with GALLIUM_DRIVER=llvmpipe either
<kusma>
graphitemaster: Yeah, that's kinda what I expected... Try MESA_DEBUG to figure out what more precisely is wrong with the glTextureSubImage3D calls...
<kusma>
(env var, set it to something like 1)
<graphitemaster>
Mesa: User error: GL_INVALID_VALUE in glTexStorage3D(invalid width, height or depth)
<graphitemaster>
Time to print what values I'm passing there.
<graphitemaster>
It's weird it's printing that specific function because I don't use it, I use glTextureStorage3D here.
<kusma>
Yeah, that could be a reporting-bug
<graphitemaster>
Mesa debug output: GL_INVALID_VALUE in glTexStorage3D(invalid width, height or depth)
<graphitemaster>
w=32,h=32,d=4096
<graphitemaster>
The values I'm passing to it.
<kusma>
that d=4096 sounds like a LOT
<graphitemaster>
That is correct, 128 packed 32x32x32 3D textures :P
<kusma>
LLVMpipe has a max 3D texture size of 2k
<kusma>
(per axis)
<kusma>
So... that's your problem :-)
<graphitemaster>
Well that is not min-spec conformant XD
<graphitemaster>
4096 is min-spec XD
<kusma>
for which spec version?
<graphitemaster>
> The value must be at least 1024
<graphitemaster>
Oh my god
<graphitemaster>
OpenGL is ridiculous sometimes.
<graphitemaster>
The min-spec is even worse than 2k
<kusma>
Sounds like your application either needs to reject drivers with too low limits, or change the texture-packing ;)
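To make that limit check concrete, here is a minimal sketch assuming a current desktop GL context; atlas_fits and the sizes are illustrative names taken from this conversation, not engine code:

    #include <GL/gl.h>   /* assumes a current GL context; link with -lGL */

    /* Check the packed-3D-texture layout against the driver limit before
     * calling glTextureStorage3D().  llvmpipe reports 2048 here, so
     * w=32, h=32, d=4096 fails with GL_INVALID_VALUE. */
    static int atlas_fits(GLsizei w, GLsizei h, GLsizei d)
    {
        GLint max3d = 0;
        glGetIntegerv(GL_MAX_3D_TEXTURE_SIZE, &max3d);
        return w <= max3d && h <= max3d && d <= max3d;
    }

If the check fails, the packing can be split across multiple shallower atlases (or a 2D array texture) instead of rejecting the driver outright.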
<graphitemaster>
Yeah fixed, ez.
<kusma>
...or just not care about LLVMpipe, which is a totally acceptable solution for some applications ;)
<kusma>
Cool :)
<graphitemaster>
I'm impressed with how well llvmpipe runs until I look at top and I see that my ThreadRipper is dying.
<graphitemaster>
zink fails my lookdev tests btw
<graphitemaster>
that's where I compare different rendered frames of the same scene to see how close the frames are for different renderers
<graphitemaster>
I guess rasterization / fill rules might be different
<graphitemaster>
or the multisample pattern is different
<graphitemaster>
hummm
<kusma>
What kind of primitives? If it's lines or points, then... uh yeah ;)
<kusma>
Triangles should be identical.
<graphitemaster>
Looks more like texture filtering
<kusma>
Hmm, that should be the same...
<graphitemaster>
Oh
<kusma>
Maybe we're not exposing the same levels of anisotropic filtering or something?
<graphitemaster>
What do dFdx, dFdy map to in zink, Coarse or Fine
<kusma>
Coarse by default, I suspect.
<kusma>
Ah, I don't think mesa supports the glHint for this...
<graphitemaster>
Yep, looks like you ignore GL_FRAGMENT_SHADER_DERIVATIVE_HINT
<graphitemaster>
That's the issue, changing shaders fixed it :P
<kusma>
Yeah, I think that would be a very welcome fix :)
<kusma>
Shouldn't be too hard to fix, I think.
<graphitemaster>
Beautiful.
<kusma>
OK, seems i965 supports the hint, but not any Gallium drivers.
<graphitemaster>
It seems simple to support in theory but you can set it before a draw call and that has to patch the shader :|
<kusma>
graphitemaster: Yeah, but we have stuff to handle these kinds of things
<graphitemaster>
*nod*
<graphitemaster>
Might end up learning mesa...
<kusma>
I think st_update_fp needs to check the state and lower the instructions to either the fine or coarse versions.
<kusma>
So that would probably end up as a bit in st_fp_variant_key or something like that...
<kusma>
Or... maybe that's a bit heavy handed... That won't let us deduplicate the variants when uses_fddx_fddy is false...
<kusma>
Ah, maybe it does let us...
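For reference, the two sides of this: the GL hint (which i965 honored but Gallium drivers ignored at the time) and the shader-side workaround graphitemaster applied; the GLSL line is an illustrative fragment, not the engine's shader:

    /* App side: ask for accurate (fine) derivatives.  At the time of this
     * log, Gallium drivers ignored GL_FRAGMENT_SHADER_DERIVATIVE_HINT. */
    glHint(GL_FRAGMENT_SHADER_DERIVATIVE_HINT, GL_NICEST);

    /* Shader-side workaround (GLSL 4.50 / ARB_derivative_control): bypass
     * the hint by calling the explicit variants directly:
     *     vec2 grad = vec2(dFdxFine(v), dFdyFine(v));
     */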
<graphitemaster>
I'm ignorant so just going to nod along. Really curious about the penny/copper thing too.
<graphitemaster>
Found a legit bug, passing 16 to GL_UNPACK_ALIGNMENT
<graphitemaster>
Dunno who to thank, Mesa for being strict or NV for allowing that >_>
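For context, glPixelStorei only accepts 1, 2, 4 or 8 for the pack/unpack alignments, so Mesa is right to reject 16:

    /* GL_UNPACK_ALIGNMENT must be 1, 2, 4 or 8; 16 is GL_INVALID_VALUE
     * per spec, even though NVIDIA's GL driver lets it slide. */
    glPixelStorei(GL_UNPACK_ALIGNMENT, 8);   /* largest legal value */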
<graphitemaster>
kusma, building now, should I also test switching the hint before a draw call just to make extra sure?
<graphitemaster>
Well they must be working because lookdev tests passed for GL Collabora Ltd 4.6 (Core Profile) Mesa 21.3.0-devel (git-e0b45bf2ff) zink (NVIDIA GeForce RTX 2070)
<graphitemaster>
But that is just set once globally, I dunno if it works for switching the hint before a draw call yet.
<kusma>
Yeah, good point. I guess I should verify that we're invalidating the right state here.
<kusma>
I suspect we are, because i965 doesn't do anything magical here as far as I can tell, but let's find out!
<kusma>
Hmm, no. Doesn't look right to me.
<graphitemaster>
Something else is wrong lookdev wise too in another scene, something wonky with texture lod
<graphitemaster>
Should map textureLod(u_prefilter, r, roughness * 5.0).rgb, but it doesn't appear to have the same value
<kusma>
Fixed the state-update bug in the MR, BTW
<graphitemaster>
u_prefilter is a cubemap sampler.
<kusma>
does that cubemap care about seamless vs non-seamless?
<graphitemaster>
seamless cubemap filtering is required, enabled globally at context init
<graphitemaster>
this just looks like it's using a lower lod level though
<graphitemaster>
lower than the one it should be using
<kusma>
We only support seamless cubemaps for Zink, because Vulkan. In theory we could implement non-seamless per texture by lowering to some ALU code and a layered 2D texture lookup instead, but meh.
<kusma>
OK, so that shouldn't be a problem.
<kusma>
graphitemaster: Do you specify any lod-bias anywhere?
<kusma>
Could be that we're missing one... I remember having a problem like that in the very similar D3D12 driver...
<kusma>
IIRC, there was one LOD bias in the... texture objects(?) that we didn't have an obvious place to account for, so we needed some shader-lowering...
<graphitemaster>
No lod bias here.
<kusma>
OK, then that's probably not it either...
<graphitemaster>
min lod is -1000 and max lod is 1000 (gl defaults)
<graphitemaster>
lod bias = 0.0
<kusma>
So one obvious thing is that cubemaps have an input coordinate per face of -1 to 1 instead of 0 to 1...
<graphitemaster>
Was going to say, cubemap uvw's are implicitly supposed to be normalized by the textureLod call
<kusma>
So if the lod is calculated without taking the cubemapness into account, you'd get an off-by-one in the LOD...
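A sketch of that off-by-one, assuming the standard screen-space LOD formula: for an N-texel cube face, the per-face coordinate spans [-1, 1] (length 2) rather than [0, 1], so a LOD computed naively from coordinate-space derivatives is shifted by exactly one level:

    \lambda = \log_2(\rho N), \qquad
    \lambda_{\text{naive}} = \log_2\bigl((2\rho) N\bigr) = \lambda + 1

where \rho is the per-pixel footprint in normalized face coordinates and N is the face size in texels.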
<graphitemaster>
The normalize is not being thrown out is it?
<kusma>
The normalize happens below us, we're just emitting SPIR-V opcodes to sample.
<kusma>
So... this sounds like it could be a bug in the NVIDIA Vulkan driver, perhaps?
<kusma>
There are some cases where Mesa drivers lower texturing stuff like that, but I don't think we do for Zink...
<kusma>
You can use ZINK_DEBUG=spirv to have Zink dump the SPIR-V modules
<graphitemaster>
Walking back up, can see the OpFMul %float %394 %395
<graphitemaster>
Which is the "roughness * 5.0" presumably
<graphitemaster>
I assume uintBitsToFloat(1084227584) is 5.0
<kusma>
Yeah, and 1084227584 = 0x40a00000 = 5.0
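A quick standalone check of that decoding; the memcpy is the C equivalent of GLSL's uintBitsToFloat:

    #include <stdio.h>
    #include <string.h>
    #include <stdint.h>

    int main(void)
    {
        uint32_t bits = 1084227584u;   /* 0x40A00000 */
        float f;
        memcpy(&f, &bits, sizeof f);   /* reinterpret, like uintBitsToFloat() */
        printf("0x%08X -> %f\n", (unsigned)bits, f);   /* 0x40A00000 -> 5.000000 */
        return 0;
    }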
<graphitemaster>
roughness is pulled out of the b channel of the normal texture, which is what I assume %393 = OpCompositeExtract %uint %59 2 is doing, the 2 = b
<graphitemaster>
So the spirv code is fine
<kusma>
Hmm, but I think I see a smoking gun: "%58 = OpImageSampleImplicitLod %v4float %57 %56 None"
<kusma>
Do you really want to sample your gbuffer (I assume that's what this is?) with LODs?
<kusma>
I guess if you have none, then it might be fine... But otherwise...?
<kusma>
That's where your "roughness" seems to be coming from...
<graphitemaster>
No mips or lods, all nearest filter, I don't specify Lod in that case
<kusma>
OK, then that should be fine.
<kusma>
So, yeah. Then I wonder if this is a problem in the Vulkan driver...
<graphitemaster>
I highly doubt that
<kusma>
Having worked with NVIDIAs Vulkan driver, I'm not so sure I doubt it ;)
<kusma>
I had a lot of issues along these lines in the past with it.
<kusma>
In this case, it would be an OpImageSampleExplicitLod + cubemap special case. I guess it's worth checking if that's covered in the CTS.
<graphitemaster>
I know there is some whacky crap in OpenGL with min/max lod, it doesn't just map directly
<graphitemaster>
directly being TIC here on the Maxwell NV GPU
<graphitemaster>
There is a min/max field for lod but the GL values from the NV driver do not actually produce the equivalent values in the TIC.
<kusma>
Seems the VK CTS does test the combination of textureLod and cubemaps.
<kusma>
So.. are you using nearest filtering of the mipmaps?
<kusma>
I mean, I think this is doing the same in Vulkan and GL, so I don't think that's it either. Just trying to figure out if there's more to dig into here...
<graphitemaster>
looks like bilinear + trilinear + mipmaps
<kusma>
right, but that ceil stuff is inside a "TEXTURE_MIN_FILTER is NEAREST_MIPMAP_NEAREST or LINEAR_MIPMAP_NEAREST" conditional, and neither of those is trilinear...
<kusma>
So, Vulkan specifies the same here, `ceil(d' + 0.5) - 1`, but also *allows* for dropping the -1...
<graphitemaster>
I also force aniso 16x on it...
<graphitemaster>
I dunno why I do that
<graphitemaster>
Huh
<graphitemaster>
I'm looking at UE4 source code because I'm confused
<kusma>
Hmm...
<graphitemaster>
They set minLod to 0.0 and maxLod to 1000 in Vulkan, but -1000.0 and 1000.0 in OpenGL
<kusma>
Aniso is another candidate for lod bias issues, indeed...
<kusma>
Mesa ends up clamping the min/max lod range to 0...num_miplevels anyway
<graphitemaster>
UE4 universally sets mipLodBias to clamp(user_lod_bias, -maxSamplerLodBias, maxSamplerLodBias)
<graphitemaster>
So just clamps it to the device limits I guess
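In GL sampler-object terms, the UE4-style setup described above would look roughly like this; sampler and user_lod_bias are placeholders, and it needs <math.h> plus a GL 3.3+ sampler object:

    /* Floor the LOD range at 0 and clamp the user bias to the device
     * limit, mirroring what UE4 does on its Vulkan backend. */
    GLfloat max_bias = 0.0f;
    glGetFloatv(GL_MAX_TEXTURE_LOD_BIAS, &max_bias);
    GLfloat bias = fminf(fmaxf(user_lod_bias, -max_bias), max_bias);
    glSamplerParameterf(sampler, GL_TEXTURE_MIN_LOD, 0.0f);
    glSamplerParameterf(sampler, GL_TEXTURE_MAX_LOD, 1000.0f);
    glSamplerParameterf(sampler, GL_TEXTURE_LOD_BIAS, bias);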
<graphitemaster>
I want to test something, I'm going to set minLod to 0 in my GL code, just to rule out that this clamping behavior is the culprit
<kusma>
Good idea
<graphitemaster>
Not that :|
<graphitemaster>
All this time we've been thinking it's Lod related, what if it's the generation of the texture itself too.
<graphitemaster>
Haven't ruled that out yet
<kusma>
Yeah, that could be!
<kusma>
I'm far from sure Zink always accesses the right 2D image when combining things like miplevels, texture-arrays and cubemaps... As in, blits to the right subresource etc.
<kusma>
IIRC there were recently some fixes there, but with a comment from zmike that there probably was more similar stuff...
<kusma>
I think that was about cubemap-arrays, but I'm not sure.
<graphitemaster>
Okay just to rule that out, it does not happen with 2D textures
<graphitemaster>
This is specifically a problem with cubemap samplers.
<graphitemaster>
So now I'm believing it could be a NV Vulkan driver bug more
<graphitemaster>
At first I was like no way this is a NV driver bug because that would've been found immediately ... for 2D textures :P
<kusma>
Yeah... Both GL and Vulkan defines the LOD as being calculated in pixel-space, and this seems to be the kind of problem you get if you calculate it from coordinate-space...
<kusma>
Yeah. But the VK CTS does test textureLod + cubemap, so I'm not 100% sure...
<kusma>
Seems like something NV should be aware of then...
<kusma>
We don't do dithering at all, as it's completely undefined and not really supported in Vulkan. In theory we could create some bayer-matrix and apply a bias at the end of the fragment shader and kinda-sorta get the right behavior... but meh.
<kusma>
Ah, right. Yeah, so that's you, not GL :)
<graphitemaster>
Nice catch though
<kusma>
Then I guess the noise here is mostly due to the lod issue?
<graphitemaster>
Having visual tests (RMSE, MAE, etc.) for comparing different rendering backends was one of those "sounds really good on paper" ideas: I can ensure consistent visual results across all our platforms, and artists love it. But it's also been an absolute nightmare, because it's found several driver bugs, API quirks, and just bullshit that I really don't feel like fixing.
<kusma>
Oh I absolutely hate image based testing. It's something that has haunted me my entire career, and in every case we've ended up abandoning it because of all the issues it comes with.
<kusma>
Theory: you notice when rendering changes. Practice: you notice months later that something regressed, because a trivial change pushed the diff over the error threshold and nobody inspected the results in between, or blindly accepted new results because validating the new ones is hard.
<graphitemaster>
Oh I broke zink hard
<kusma>
And no amount of making fancy UIs to inspect the changes or clever comparison algorithm etc seems to help with that.
<zmike>
yeah so it looks like your BAR is getting blown out
<zmike>
128mb I'd guess
<graphitemaster>
Why are the resources not being released when I literally shutdown the context..
<zmike>
resources are allocated by the screen, not the context
<graphitemaster>
The screen in this case is what?
<graphitemaster>
I can launch the engine 3 times fine without running out of memory
<graphitemaster>
But if I relaunch the same instance 3 times I run out of memory
<zmike>
overall gl creation
<kusma>
I think there's some confusion here. @zmike is talking about pipe_context... I have a feeling graphitemaster is referring to the GL context, which isn't quite the same.
<kusma>
Deleting the GL context should indeed lead to all resources allocated in that GL context to be deleted. Question is do we actually do that?
<graphitemaster>
If I just launch the engine multiple times from the command line I can run several instances simultaneously without running out of memory, but simply relaunching the same instance twice (only one instance running) is enough to OOM.
<graphitemaster>
I have 39 instances of it running right now, no OOM.
<graphitemaster>
Relaunch just one of them, OOM.
<kusma>
My guess would be yeah, but who knows? :)
<zmike>
like I said, I'd guess it's not destroying the screen object between
<zmike>
zink_destroy_screen
<zmike>
should be easy to verify
<kusma>
zmike: Yeah, but that's not something the application controls... I think the state-tracker should delete the resources...
<zmike>
seems improbable or else cts would've been exploding
<kusma>
Fair point.
<zmike>
🤔
<kusma>
Maybe it's a... different kind of leak?
<kusma>
Like, something that doesn't just happen to all applications, but some sort of corner-case that the CTS doesn't trigger?
<kusma>
This is a buffer object... A fairly large one... 16 MB...
<kusma>
Nah, that doesn't make much sense.
<graphitemaster>
Simple gdb script shows there are more calls to vkAllocateMemory than vkFreeMemory
<kusma>
That's bad
<zmike>
na that's expected
<kusma>
Maybe we can tag these calls with some valgrind magic to track leaks?
<zmike>
the memory is cached
<kusma>
@zmike: not across screen deletes, is it?
<zmike>
well no, is the script counting that?
<graphitemaster>
I mean you can test this yourself: b vkAllocateMemory; commands; silent; continue; end; (do this for vkFreeMemory too), then just `info break n` and n+1 to see the number of times the bp was hit
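Written out as a gdb script, the one-liner above becomes (repeat the block for vkFreeMemory):

    break vkAllocateMemory
    commands
      silent
      continue
    end

`info breakpoints` then shows "breakpoint already hit N times" for each, and the two counts should match once the screen is destroyed.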
<graphitemaster>
They do not match when the screen is destroyed
<kusma>
That sounds like a smoking gun to me...
<graphitemaster>
And yes, it's 16 MiB, the engine only has one unified vertex buffer
<graphitemaster>
It will resize it if it gets too small
<graphitemaster>
So there should only be one whole buffer in this whole thing
<graphitemaster>
I guess it doesn't like that XD
<kusma>
I mean, there's no logical reason why that shouldn't work... This sounds like a bug to me ;)
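A minimal in-process repro sketch of the pattern described here, assuming GLX, Mesa's directly exported GL 1.5 entrypoints, and that "relaunch" means tearing down and recreating the GL context inside one process; the 16 MiB buffer stands in for the engine's unified vertex buffer:

    #define GL_GLEXT_PROTOTYPES 1
    #include <X11/Xlib.h>
    #include <GL/gl.h>
    #include <GL/glx.h>

    int main(void)   /* compile with: cc repro.c -lGL -lX11 */
    {
        Display *dpy = XOpenDisplay(NULL);
        int attribs[] = { GLX_RGBA, GLX_DOUBLEBUFFER, None };
        XVisualInfo *vi = glXChooseVisual(dpy, DefaultScreen(dpy), attribs);
        Window win = XCreateSimpleWindow(dpy, DefaultRootWindow(dpy),
                                         0, 0, 64, 64, 0, 0, 0);

        for (int i = 0; i < 64; i++) {
            GLXContext ctx = glXCreateContext(dpy, vi, NULL, True);
            glXMakeCurrent(dpy, win, ctx);

            GLuint buf;
            glGenBuffers(1, &buf);
            glBindBuffer(GL_ARRAY_BUFFER, buf);
            glBufferData(GL_ARRAY_BUFFER, 16 << 20, NULL, GL_STATIC_DRAW);
            glDeleteBuffers(1, &buf);

            /* Destroying the context should release the backing memory;
             * compare vkAllocateMemory/vkFreeMemory counts per iteration. */
            glXMakeCurrent(dpy, None, NULL);
            glXDestroyContext(dpy, ctx);
        }
        return 0;
    }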
<graphitemaster>
I really feel like an ass for running into bugs and just wasting your time with them XD
<graphitemaster>
yak shaving now, I was supposed to figure out what was wrong with lod