ChanServ changed the topic of #zink to: official development channel for the mesa3d zink driver || https://docs.mesa3d.org/drivers/zink.html
omegatron has quit [Quit: What happened? You quit!]
graphitemaster has joined #zink
<graphitemaster> I'm trying to build a stand-alone zink that will leverage nvidia provided vulkan and I'm running into some issues, mesa keeps trying to load swrast_dri
<graphitemaster> My build command is meson --prefix=/tmp/zink -Dgallium-drivers=zink -Dvulkan-drivers= -Ddri-drivers= build-zink; ninja -C build-zink/ install
<graphitemaster> Testing it with LD_LIBRARY_PATH=/tmp/zink/lib MESA_LOADER_DRIVER_OVERRIDE=zink glxinfo
<graphitemaster> libGL error: failed to load driver: swrast
kusma has joined #zink
<kusma> I think you need to use the GALLIUM_DRIVER=zink approach on NVIDIA until Penny/Copper has landed... ajax_?
hch12907 has joined #zink
<graphitemaster> That works. Looks like mesa + zink doesn't do mesa_glthread=true ?
<graphitemaster> "dri_create_context: requested glthread but driver is missing backgroundCallable V2 extension"
<kusma> graphitemaster: That might be because you're down a swrast codepath now :-/
<graphitemaster> [2021-07-28 04:34:58] [render/gl4/info] | GL Collabora Ltd 4.6 (Core Profile) Mesa 21.2.0-rc2 zink (NVIDIA GeForce RTX 2070)
<kusma> Let me test on intel
<graphitemaster> It does appear to be HW accelerated /w native NVIDIA
<kusma> NVIDIA and the lack of DRI support is... not ideal
<kusma> No, I meant swrast-codepath as in the WSI stuff.
<graphitemaster> If I omit GALLIUM_DRIVER=zink, I get llvmpipe which is totally swrast
<graphitemaster> Ah, I see.
<kusma> We end up doing a nasty CPU-blit to a sw-winsys implementation when there's no DRI
<graphitemaster> I must say, the performance is actually worse than what I was expecting
<kusma> (which is what GALLIUM_DRIVER=zink is all about, kinda)
<kusma> Well, yeah. You're down the swrast winsys codepath ;)
<graphitemaster> native: 6700fps, zink: 70fps, llvmpipe: 40fps
<kusma> And we don't really have a better option for NVIDIA yet. Penny/Copper will fix that, but we're not there yet.
<graphitemaster> I can't imagine the WSI blit is that SLOW
<graphitemaster> Anyways, I'm just surprised it works XD
<kusma> Penny/Copper is ajax's patches to hook up to the Vulkan WSI code instead
<kusma> graphitemaster: The good news is that most of your perf is bound to blitting the frontbuffer, so you can probably do much more heavy rendering at not much less perf! ;)
<graphitemaster> Was going to say, gsync is totally broken too
<kusma> BTW, just checked, mesa_glthread=true seems to work fine on Intel with DRI
<kusma> Hmm, is llvmpipe working with gsync?
<graphitemaster> I mean via zink
<graphitemaster> char ZINK[] = "GALLIUM_DRIVER=zink";
<graphitemaster> putenv(ZINK);
<graphitemaster> char MESA[] = "mesa_glthread=true";
<graphitemaster> putenv(MESA);
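The snippet above works because putenv keeps a pointer to strings with static storage; a setenv-based variant (hypothetical helper name, not from the engine) avoids the storage-lifetime concern and is equivalent, provided it runs before libGL is loaded:

```c
#include <stdlib.h>

/* Hypothetical helper: select zink and glthread via the environment.
 * Must run before libGL is loaded (before the first GL call or the
 * dlopen of the driver), or the Mesa loader won't see the values. */
static void force_zink_env(void)
{
    setenv("GALLIUM_DRIVER", "zink", 1);  /* route GL through zink */
    setenv("mesa_glthread", "true", 1);   /* opt in to glthread */
}
```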
<kusma> Because, our winsys stuff should kinda be on par with llvmpipe (modulo some details that probably don't matter)
<graphitemaster> Could try llvmpipe there, I dunno how gsync would work with software rendering though
<graphitemaster> Unless you're using hw to present to swapchain
<kusma> Yeah, so maybe that's the reason. And again, I guess Penny/Copper is the fix ;)
<kusma> It's becoming a bit of a meme; Penny/Copper will fix EVERYTHING ;)
<kusma> I'm sure Ajax is working out the interactions it has with flying cars and jetpacks right now.
<graphitemaster> Apparently in llvmpipe all my glTextureSubImage3D calls are straight up INVALID_VALUE / INVALID_OPERATION, but also glthread does not appear to work with GALLIUM_DRIVER=llvmpipe either
<kusma> graphitemaster: Yeah, that's kinda what I expected... Try MESA_DEBUG to figure out what more precisely is wrong with the glTextureSubImage3D calls...
<kusma> (env var, set it to something like 1)
<graphitemaster> Mesa: User error: GL_INVALID_VALUE in glTexStorage3D(invalid width, height or depth)
<graphitemaster> Time to print what values I'm passing there.
<graphitemaster> It's weird it's printing that specific function because I don't use it, I use glTextureStorage3D here.
<kusma> Yeah, that could be a reporting-bug
<graphitemaster> Mesa debug output: GL_INVALID_VALUE in glTexStorage3D(invalid width, height or depth)
<graphitemaster> w=32,h=32,d=4096
<graphitemaster> The values I'm passing to it.
<kusma> that d=4096 sounds like a LOT
<graphitemaster> That is correct, 128 packed 32x32x32 3D textures :P
<kusma> LLVMpipe has a max 3D texture size of 2k
<kusma> (per axis)
<kusma> So... that's your problem :-)
<graphitemaster> Well that is not min-spec conformant XD
<graphitemaster> 4096 is min-spec XD
<kusma> for which spec version?
<graphitemaster> > The value must be at least 1024
<graphitemaster> Oh my god
<graphitemaster> OpenGL is ridiculous sometimes.
<graphitemaster> The min-spec is even worse than 2k
<kusma> Sounds like your application either needs to reject drivers with too low limits, or change the texture-packing ;)
<graphitemaster> Yeah fixed, ez.
<kusma> ...or just not care about LLVMpipe, which is a totally acceptable solution for some applications ;)
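kusma's texture-packing suggestion can be sketched as a small helper (hypothetical, not engine code) that spreads the 128 32³ tiles across two axes instead of stacking them all along depth, so no axis exceeds the driver limit:

```c
/* Hypothetical repacking helper: instead of stacking `count` tiles of
 * edge `tile` along one axis (32 * 128 = 4096 deep), spread them over a
 * grid so every axis stays within `max_axis` (llvmpipe reports 2048).
 * Returns tiles per row, or 0 if no layout fits. */
static int tiles_per_row(int tile, int count, int max_axis)
{
    for (int per_row = 1; tile * per_row <= max_axis; per_row++) {
        int rows = (count + per_row - 1) / per_row; /* ceil division */
        if (tile * rows <= max_axis)
            return per_row;
    }
    return 0;
}
```

With a 2048-per-axis limit this picks 2 tiles per row, i.e. a 64-wide, 2048-deep layout instead of the failing 32×32×4096 one.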
<kusma> Cool :)
<graphitemaster> I'm impressed with how well llvmpipe runs until I look at top and I see that my ThreadRipper is dying.
<graphitemaster> zink fails my lookdev tests btw
<graphitemaster> that's where I compare different rendered frames of the same scene to see how close the frames are for different renderers
<graphitemaster> I guess rasterization / fill rules might be different
<graphitemaster> or the multisample pattern is different
<graphitemaster> hummm
<kusma> What kind of primitives? If it's lines or points, then... uh yeah ;)
<kusma> Triangles should be identical.
<graphitemaster> Looks more like texture filtering
<kusma> Hmm, that should be the same...
<graphitemaster> Oh
<kusma> Maybe we're not exposing the same levels of anisotropic filtering or something?
<graphitemaster> What do dFdx, dFdy map to in zink, Coarse or Fine
<kusma> Coarse by default, I suspect.
<kusma> Ah, I don't think mesa supports the glHint for this...
<graphitemaster> Yep, looks like you ignore GL_FRAGMENT_SHADER_DERIVATIVE_HINT
<graphitemaster> That's the issue, changing shaders fixed it :P
<kusma> Yeah, I think that would be a very welcome fix :)
<kusma> Shouldn't be too hard to fix, I think.
<graphitemaster> Beautiful.
<kusma> OK, seems i965 supports the hint, but not any Gallium drivers.
<graphitemaster> It seems simple to support in theory but you can set it before a draw call and that has to patch the shader :|
<kusma> graphitemaster: Yeah, but we have stuff to handle these kinds of things
<graphitemaster> *nod*
<graphitemaster> Might end up learning mesa...
<kusma> I think st_update_fp needs to check the state and lower the instructions to either the fine or coarse versions.
<kusma> So that would probably end up as a bit in st_fp_variant_key or something like that...
<kusma> Or... maybe that's a bit heavy handed... That won't let us deduplicate the variants when uses_fddx_fddy is false...
<kusma> Ah, maybe it does let us...
<graphitemaster> I'm ignorant so just going to nod along. Really curious about the penny/copper thing too.
<graphitemaster> Found a legit bug, passing 16 to GL_UNPACK_ALIGNMENT
<graphitemaster> Dunno who to thank, Mesa for being strict or NV for allowing that >_>
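For reference, the GL spec only allows 1, 2, 4 and 8 for GL_UNPACK_ALIGNMENT, which is why strict implementations raise GL_INVALID_VALUE for 16; a trivial app-side guard (hypothetical helper) could catch this before calling glPixelStorei:

```c
#include <stdbool.h>

/* GL_UNPACK_ALIGNMENT (and GL_PACK_ALIGNMENT) accept only 1, 2, 4 or 8;
 * anything else is GL_INVALID_VALUE on a strict implementation. */
static bool valid_pixel_store_alignment(int a)
{
    return a == 1 || a == 2 || a == 4 || a == 8;
}
```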
<kusma> graphitemaster: Maybe you could test this branch? https://gitlab.freedesktop.org/kusma/mesa/-/commits/st-lower-fddx-fddy-precision
<kusma> This should hopefully fix the GL_FRAGMENT_SHADER_DERIVATIVE_HINT-issue...
<graphitemaster> I'll give it a shot after a coffee and I fix my texture alignment bugs, thanks.
<kusma> Awesome :)
<graphitemaster> GL_UNPACK_IMAGE_HEIGHT is completely broken on AMD Windows drivers *shrug*
omegatron has joined #zink
<kusma> graphitemaster: I posted the patches here: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12097
<graphitemaster> kusma, building now, should I also test switching hint before a draw call just to make extra sure
<graphitemaster> Well they must be working because lookdev tests passed for GL Collabora Ltd 4.6 (Core Profile) Mesa 21.3.0-devel (git-e0b45bf2ff) zink (NVIDIA GeForce RTX 2070)
<graphitemaster> But that is just set once globally, I dunno if it works for switching the hint before a draw call yet.
<kusma> Yeah, good point. I guess I should verify that we're invalidating the right state here.
<kusma> I suspect we are, because i965 doesn't do anything magical here as far as I can tell, but let's find out!
<kusma> Hmm, no. Doesn't look right to me.
<graphitemaster> Something else is wrong lookdev wise too in another scene, something wonky with texture lod
<graphitemaster> This: f32x3 prefiltered_color = rx_textureCMLod(u_prefilter, r, roughness * 5.0).rgb;
<graphitemaster> Should map to textureLod(u_prefilter, r, roughness * 5.0).rgb, but it doesn't appear to have the same value
<kusma> Fixed the state-update bug in the MR, BTW
<graphitemaster> u_prefilter is a cubemap sampler.
<kusma> does that cubemap care about seamless vs non-seamless?
<graphitemaster> seamless cubemap filtering is required, enabled globally at context init
<graphitemaster> this just looks like it's using a lower lod level though
<graphitemaster> lower than the one it should be using
<kusma> We only support seamless cubemaps for Zink, because Vulkan. In theory we could implement non-seamless per texture by lowering to some ALU code and a layered 2D texture lookup instead, but meh.
<kusma> OK, so that shouldn't be a problem.
<kusma> graphitemaster: Do you specify any lod-bias anywhere?
<kusma> Could be that we're missing one... I remember having a problem like that in the very similar D3D12 driver...
<kusma> IIRC, there was one LOD bias in the... texture objects(?) that we didn't have an obvious place to account for, so we needed some shader-lowering...
<graphitemaster> No lod bias here.
<kusma> OK, then that's probably not it either...
<graphitemaster> min lod is -1000 and max lod is 1000 (gl defaults)
<graphitemaster> lod bias = 0.0
<kusma> So one obvious thing is that cubemaps have an input coordinate per face of -1 to 1 instead of 0 to 1...
<graphitemaster> Was going to say, cubemap uvw's are implicitly supposed to be normalized by the textureLod call
<kusma> So if the lod is calculated without taking the cubemapness into account, you'd get an off-by-one in the LOD...
<graphitemaster> The normalize is not being thrown out is it?
<kusma> The normalize happens below us, we're just emitting SPIR-V opcodes to sample.
<kusma> So... this sounds like it could be a bug in the NVIDIA Vulkan driver, perhaps?
<kusma> There are some cases where Mesa drivers lower texturing stuff like that, but I don't think we do for Zink...
<kusma> You can use $ZINK_DEBUG=spirv to have Zink dump the SPIR-V modules
<graphitemaster> picking the absolute lowest lod level in the prefiltered envmap
<graphitemaster> It's like the lod parameter is reversed
<graphitemaster> I think zink is mapping min and max lod incorrectly when mipmaps are involved.
<kusma> I don't think so, TBH...
<kusma> This looks correct to me...
<kusma> I think checking the generated SPIR-V code should tell us if something funky is happening to the sampling...
<graphitemaster> yeah need to find which shader it is though
<kusma> Yeah, that's not the easiest :)
<kusma> We're printing the name of each shader as we dump it, but we don't compile the SPIR-V until the shader is used...
<kusma> So it's not like glCompileShader or glLinkShader is going to correspond here...
<graphitemaster> [root@graphite rex]# spirv-dis dump11.spv | grep "u_prefilter"
<graphitemaster> OpName %u_prefilter "u_prefilter"
<graphitemaster> OpDecorate %u_prefilter DescriptorSet 2
<graphitemaster> OpEntryPoint Fragment %main "main" %vs_coordinate %fs_color %ubo_slot_0 %u_scale_bias %u_prefilter %u_irradiance %u_depth %u_emission %u_normal %u_albedo
<graphitemaster> OpDecorate %u_prefilter Binding 133
<graphitemaster> %u_prefilter = OpVariable %_ptr_UniformConstant_23 UniformConstant
<graphitemaster> %400 = OpLoad %23 %u_prefilter
<graphitemaster> Found it
<kusma> Hmm, that's just reading a uniform...
<kusma> OK, that's using OpImageSampleExplicitLod
<kusma> %401 = OpImageSampleExplicitLod %v4float %400 %398 Lod %399
<graphitemaster> Walking back up, can see the OpFMul %float %394 %395
<graphitemaster> Which is the "roughness * 5.0" presumably
<graphitemaster> I assume uintBitsToFloat(1084227584) is 5.0
<kusma> Yeah, and 1084227584 = 0x40a00000 = 5.0
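The bit-pattern check they are doing by eye is easy to reproduce: GLSL's uintBitsToFloat is just a bit reinterpretation, which in C is a memcpy (type-punning through a pointer cast would violate strict aliasing):

```c
#include <stdint.h>
#include <string.h>

/* C equivalent of GLSL uintBitsToFloat: reinterpret the 32-bit pattern
 * as an IEEE-754 single without converting the value. */
static float uint_bits_to_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}
```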
<graphitemaster> roughness is pulled out of the b channel of the normal texture, which is what I assume %393 = OpCompositeExtract %uint %59 2 is doing, the 2 = b
<graphitemaster> So the spirv code is fine
<kusma> Hmm, but I think I see a smoking gun: "%58 = OpImageSampleImplicitLod %v4float %57 %56 None"
<kusma> Do you really want to sample your gbuffer (I assume that's what this is?) with LODs?
<kusma> I guess if you have none, then it might be fine... But otherwise...?
<kusma> That's where your "roughness" seems to be coming from...
<graphitemaster> No mips or lods, all nearest filter, I don't specify Lod in that case
<kusma> OK, then that should be fine.
<kusma> So, yeah. Then I wonder if this is a problem in the Vulkan driver...
<graphitemaster> I highly doubt that
<kusma> Having worked with NVIDIAs Vulkan driver, I'm not so sure I doubt it ;)
<kusma> I had a lot of issues along these lines in the past with it.
<kusma> In this case, it would be an OpImageSampleExplicitLod + cubemap special case. I guess it's worth checking if that's covered in the CTS.
<graphitemaster> I know there is some whacky crap in OpenGL with min/max lod, it doesn't just map directly
<graphitemaster> directly being TIC here on the Maxwell NV GPU
<graphitemaster> There is a min/max field for lod but the GL values from the NV driver do not actually produce the equivalent values in the TIC.
<kusma> Seems the VK CTS does test the combination of textureLod and cubemaps.
<graphitemaster> Right
<graphitemaster> ceil(min_lod + 0.5) - 1
<graphitemaster> ceil(max_lod + 0.5) - 1
<graphitemaster> ComputeAccessedLod
<kusma> Oh joy, that stuff...
<kusma> So.. are you using nearest filtering of the mipmaps?
<kusma> I mean, I think this is doing the same in Vulkan and GL, so I don't think that's it either. Just trying to figure out if there's more to dig into here...
<graphitemaster> looks like bilinear + trilinear + mipmaps
<kusma> right, but that ceil stuff is inside a "TEXTURE_MIN_FILTER is NEAREST_MIPMAP_NEAREST or LINEAR_MIPMAP_NEAREST" conditional, neither is trilinear...
<graphitemaster> min = GL_LINEAR_MIPMAP_LINEAR, mag = GL_LINEAR, min_lod = -1000, max_lod = 1000, lod_bias = 0.0
<kusma> Yeah, so that's not really what's going on here...
<graphitemaster> Just needed to verify things.
<kusma> So, Vulkan specifies the same here, `ceil(d' + 0.5) - 1`, but also *allows* for dropping the -1...
<graphitemaster> I also force aniso 16x on it...
<graphitemaster> I dunno why I do that
<graphitemaster> Huh
<graphitemaster> I'm looking at UE4 source code because I'm confused
<kusma> Hmm...
<graphitemaster> They set minLod to 0.0 and maxLod to 1000 in Vulkan, but -1000.0 and 1000.0 in OpenGL
<kusma> Aniso is another candidate for lod bias issues, indeed...
<kusma> Mesa ends up clamping the min/max lod range to 0...num_miplevels anyway
<graphitemaster> UE4 universally sets mipLodBias to clamp(user_lod_bias, -maxSamplerLodBias, maxSamplerLodBias)
<graphitemaster> So just clamps it to the device limits I guess
<graphitemaster> I want to test something, I'm going to set minLod to 0 in my GL code, just to weed out that this clamping behavior is not the culprit
<kusma> Good idea
<graphitemaster> Not that :|
<graphitemaster> All this time we've been thinking it's Lod related, what if it's the generation of the texture itself too.
<graphitemaster> Haven't ruled that out yet
<kusma> Yeah, that could be!
<kusma> I'm far from sure Zink always access the right 2D image when combining things like miplevels, texture-arrays and cubemaps... As in blits to the right subresource etc.
<kusma> IIRC there were recently some fixes there, but with a comment from zmike that there probably was more similar stuff...
<kusma> I think that was about cubemap-arrays, but I'm not sure.
<graphitemaster> This comparison here
<graphitemaster> In void main()
<graphitemaster> If the compiler did integrate for roughness of 0 it could cause that issue
<graphitemaster> Oh my god I pass 0.0 there to sample_envmap
<kusma> Is that bad? :)
<graphitemaster> I'm getting tired
<graphitemaster> 0.0 is base level, yeah
<kusma> You mean in the "u_roughness == 0.0"-case, right? I would assume that a lod of 0.0 would be what you want?
<graphitemaster> Correct
<kusma> Oh... I think I get what you're saying...
<kusma> you don't have base level of 0, do you?
<graphitemaster> I do.
<kusma> Oh, OK. Then this looks correct to me.
<graphitemaster> I wish I could renderdoc the Vulkan calls of Zink, but renderdoc goes and replaces the dynamic linker that loads the GL driver
<graphitemaster> So it's intercepting it as GL
<kusma> Hmm, @zmike got it working with some patches
<graphitemaster> Okay just to rule that out, it does not happen with 2D textures
<graphitemaster> This is specifically a problem with cubemap samplers.
<graphitemaster> So now I'm believing it could be a NV Vulkan driver bug more
<graphitemaster> At first I was like no way this is a NV driver bug because that would've been found immediately ... for 2D textures :P
<kusma> Yeah... Both GL and Vulkan define the LOD as being calculated in pixel-space, and this seems to be the kind of problem you get if you calculate it from coordinate-space...
<kusma> Yeah. But the VK CTS does test textureLod + cubemap, so I'm not 100% sure...
<kusma> Seems like something NV should be aware of then...
<zmike> there's a ticket open about cube sampling
<kusma> Interesting
<graphitemaster> Have a link?
<graphitemaster> Yeah I assume seamless filtering globally.
<graphitemaster> Wouldn't cause the wrong mip selection.
<kusma> graphitemaster: Yeah, so you shouldn't be affected.
<kusma> This is also about shadow samplers, not color samplers
<graphitemaster> So it looks like textureLod with a CM behaves such that 0.0 is the base level and 1.0 is the last mip level
<graphitemaster> That's what I'm observing with some tests
<kusma> Uh... that sounds very wrong :-P
<kusma> If that's what the driver is doing, it sounds completely busted
<kusma> Maybe the CTS tests only uses two levels?
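If the driver really were treating the explicit LOD as normalized (pure speculation from the observations above, confirmed nowhere), the two conventions would differ by a scale factor of the last level index; a toy mapping just to illustrate the magnitude of the discrepancy:

```c
/* Hypothetical conversion between the absolute mip LOD that GL/Vulkan
 * specify and a 0..1-normalized LOD (the behavior speculated above).
 * Only illustrates how large the discrepancy would be. */
static float normalized_to_absolute_lod(float norm, int last_level)
{
    return norm * (float)last_level;
}
```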
<graphitemaster> Lines also fail the test, but this is as close as I can get zink to match GL
<kusma> Lines in GL (and Vulkan) are... meh.
<graphitemaster> Channel distortion: MAE red: 336.877 (0.00514041) green: 322.364 (0.00491896) blue: 331.043 (0.00505139) all: 330.095 (0.00503692)
<graphitemaster> cubemap filtering is just way off
<kusma> Heh, seems there's some ordered dithering going here. Does glDisable(GL_DITHER) on the NVIDIA reference help? :)
<graphitemaster> indirect_specular += bayer_16x16(vs_coordinate * u_resolution) / 255.0; XD
<kusma> We don't do dithering at all, as it's completely undefined and not really supported in Vulkan. In theory we could create some bayer-matrix and apply a bias at the end of the fragment shader and kinda-sorta get the right behavior... but meh.
<kusma> Ah, right. Yeah, so that's you, not GL :)
<graphitemaster> Nice catch though
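For context, the bayer_16x16 in that line is the engine's own ordered-dither matrix (its implementation isn't shown in the log); the classic recursive construction, where each quadrant of B(2n) is 4·B(n) plus a fixed offset, can be sketched as:

```c
/* Recursive Bayer ordered-dither matrix, evaluated per texel. The 2x2
 * base pattern is [0 2; 3 1]; each extra level refines it. Values range
 * over 0 .. 4^log2size - 1 and are normally divided by that count to get
 * a threshold. (Generic sketch; the engine's bayer_16x16 is assumed to
 * follow the same construction.) */
static int bayer(unsigned x, unsigned y, int log2size)
{
    int v = 0;
    for (int i = 0; i < log2size; i++) {
        unsigned qx = (x >> i) & 1u, qy = (y >> i) & 1u;
        v = v * 4 + (int)(((qx ^ qy) << 1) | qy);
    }
    return v;
}
```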
<kusma> Then I guess the noise here is mostly due to the lod issue?
<graphitemaster> Having visual tests (RMSE, MAE, etc) for comparing different rendering backends was one of those "sounds really good on paper" ideas, because I can ensure consistent visual results across all our platforms and artists love it, but it's also been an absolute nightmare because it's found several driver bugs, api quirks, and just bullshit that I really don't feel like fixing.
<kusma> Oh I absolutely hate image based testing. It's something that has haunted me my entire career, and in every case we've ended up abandoning it because of all the issues it comes with.
<kusma> Theory: you notice when rendering changes. Practice: You notice months later that something regressed, because a trivial change pushed the diff over the error-threshold, and nobody inspected the results in between, or blindly accepted new results because validating the new ones is hard.
<graphitemaster> Oh I broke zink hard
<kusma> And no amount of making fancy UIs to inspect the changes or clever comparison algorithm etc seems to help with that.
<kusma> graphitemaster: Welcome to the club!
<graphitemaster> rex: ../src/gallium/drivers/zink/zink_descriptors.c:809: zink_descriptor_set_get: Assertion `pool->num_sets_allocated < ZINK_DEFAULT_MAX_DESCS' failed.
<zmike> oh cool you hit my new assert
<kusma> Good assert.
<zmike> try with ZINK_DESCRIPTORS=lazy
<graphitemaster> Now I just get Mesa: User error: GL_INVALID_VALUE in glNamedBufferSubData(offset 0 + size 8224 > buffer size 0)
<graphitemaster> Except I'm totally not passing those values to that function
<graphitemaster> Ah
<graphitemaster> Mesa: User error: GL_OUT_OF_MEMORY in glNamedBufferData
<graphitemaster> Mesa: 1 similar GL_OUT_OF_MEMORY errors
<graphitemaster> That makes more sense
<graphitemaster> I've only allocated 128 MiB from mesa total though, so something is wonked
<graphitemaster> It looks like zink has problems with being restarted.
<graphitemaster> i.e cleanup all GL resources, destroy the GL context, unload the library, load the library, make new GL context, repeat.
<graphitemaster> Which is how the testsuite runs
<zmike> what do you mean "unload the library"
<graphitemaster> dlclose
<kusma> zmike: dlclose, I suppose?
<zmike> what library tho
<zmike> zink?
<graphitemaster> libGL.so in this case
<zmike> might be that you're creating a new screen object for each test and never destroying it
<graphitemaster> I'd assume SDL_GL_DeleteContext and SDL_DestroyWindow would handle that
<zmike> should be easy enough to check; if you monitor your available vram as your tests progress and it keeps going down then that's the problem
<graphitemaster> 21% vram usage and getting an oom error
<zmike> huh
<kusma> Could it be that we're hitting the max allocation limit on NV?
<graphitemaster> sorry, 3 restarts
<zmike> would be helpful if you could figure out exactly which vk call is getting the oom error
<graphitemaster> Is there a super verbose logging mode that emits every Vulkan call
<graphitemaster> I imagine the validation layers would report oom
<graphitemaster> Going to try ZINK_DEBUG=validation
<graphitemaster> Nothing
<zmike> I would guess you should be able to gdb in buffer_transfer_map and find the error from there
<graphitemaster> The issue is in resource_object_create, vkAllocateMemory
<zmike> what's the callstack for it?
<zmike> can you tell me what templ->usage is?
<graphitemaster> 2 lol
<zmike> hmm
<graphitemaster> (gdb) print *templ
<graphitemaster> $2 = {reference = {count = 0}, width0 = 16777216, height0 = 1, depth0 = 1, array_size = 1, format = PIPE_FORMAT_R8_UNORM, target = PIPE_BUFFER, last_level = 0, nr_samples = 0, nr_storage_samples = 0, usage = 2, bind = 0, flags = 0, next = 0x0,
<graphitemaster> screen = 0x0}
<zmike> yeah so it looks like your BAR is getting blown out
<zmike> 128mb I'd guess
<graphitemaster> Why are the resources not being released when I literally shutdown the context..
<zmike> resources are allocated by the screen, not the context
<graphitemaster> The screen in this case is what?
<graphitemaster> I can launch the engine 3 times fine without running out of memory
<graphitemaster> But if I relaunch the same instance 3 times I run out of memory
<zmike> overall gl creation
<kusma> I think there's some confusion here. @zmike is talking about pipe_context... I have a feeling graphitemaster is referring to the GL context, which isn't quite the same.
<kusma> Deleting the GL context should indeed lead to all resources allocated in that GL context to be deleted. Question is do we actually do that?
<graphitemaster> If I just launch the engine multiple times from the command line I can run several instances simultaneously without running out of memory, but simply relaunching the same instance twice (only one instance running) is enough to OOM.
<graphitemaster> I have 39 instances of it running right now, no OOM.
<graphitemaster> Relaunch just one of them, OOM.
<kusma> My guess would be yeah, but who knows? :)
<zmike> like I said, I'd guess it's not destroying the screen object between
<zmike> zink_destroy_screen
<zmike> should be easy to verify
<kusma> zmike: Yeah, but that's not something the application controls... I think the state-tracker should delete the resources...
<graphitemaster> hits it on each restart
<zmike> hm so no issue there either
<zmike> very odd
<kusma> OK, sounds good then.
<kusma> Maybe we have a leak?
<zmike> seems improbable or else cts would've been exploding
<kusma> Fair point.
<zmike> 🤔
<kusma> Maybe it's a... different kind of leak?
<kusma> Like, something that doesn't just happen to all applications, but some sort of corner-case that the CTS doesn't trigger?
<kusma> This is a buffer object... A fairly large one... 16 MB...
<kusma> Nah, that doesn't make much sense.
<graphitemaster> Simple gdb script shows there are more calls to vkAllocateMemory than vkFreeMemory
<kusma> That's bad
<zmike> na that's expected
<kusma> Maybe we can tag these calls with some valgrind magic to track leaks?
<zmike> the memory is cached
<kusma> @zmike: not across screen deletes, is it?
<zmike> well no, is the script counting that?
<graphitemaster> I mean you can test this yourself: b vkAllocateMemory; commands; silent; continue; end; (do this for vkFreeMemory too), then just `info break n` and n+1 to see the number of times the bp was hit
<graphitemaster> They do not match when the screen is destroyed
<kusma> That sounds like a smoking gun to me...
<graphitemaster> And yes, it's 16 MiB, the engine only has one unified vertex buffer
<graphitemaster> It will resize it if it gets too small
<graphitemaster> So there should only be one whole buffer in this whole thing
<graphitemaster> I guess it doesn't like that XD
<kusma> I mean, there's no logical reason why that shouldn't work... This sounds like a bug to me ;)
<graphitemaster> I really feel like an ass for running into bugs and just wasting your time with them XD
<graphitemaster> yak shaving now, I was supposed to figure out what was wrong with lod
<kusma> Yeah, how could you! ;)
<graphitemaster> The proof
<graphitemaster> 135 vkAllocateMemory, 96 vkFreeMemory
<zmike> hm I just did a brief check and the only one that's missing is from some resource that is created in...
<zmike> hm looks like a fb texture
<zmike> seems like some part of the frontend isn't deleting fb attachments
<zmike> dunno if that's the entire issue though
<graphitemaster> Can someone explain the sizeof(void*) == 4 thing in cache_or_free_mem
<graphitemaster> It only unmaps memory on 32-bit systems?
<graphitemaster> Or is that because it creates a temporary mapping on 32-bit only
<zmike> the former
<zmike> I'm not sure how deep I want to get into this considering all of this code is getting deleted in the next week or so
<graphitemaster> I'd ignore it, it's not a big problem.
<graphitemaster> Lod on the other hand I'm going to keep poking around to see what is going on here
<graphitemaster> O_o
<graphitemaster> Well that's fantastic, the thing looks correct in RenderDoc through Zink
<zmike> have you tried running with mangohud to rule out wsi issues?
<graphitemaster> mangohud?
<zmike> MANGOHUD_DLSYM=1 mangohud $command
<graphitemaster> The issue goes away with MANGOHUD overlay
<graphitemaster> That is really confusing
<zmike> so then it's probably wsi-related
<zmike> and not worth spending more time on now
<graphitemaster> It's gamma related somehow
<graphitemaster> My sense is glBlitFramebuffer is not applying the correct sRGB conversion
<zmike> hm
<graphitemaster> I have an RGBA_U8 and I'm blitting this to sRGB window backbuffer
<graphitemaster> My sense is this is not working correctly.
<zmike> ajax: do you know if there are any piglit tests for this?^
<zmike> huh
<graphitemaster> I know, fucking confusing eh
<zmike> dunno, to me it just looks like a normal type of wsi issue
<zmike> where something's not quite showing up on screen in sync with the pixels in the image
<graphitemaster> it's a dynamic range issue
<graphitemaster> The contents are being clipped under blit framebuffer
<graphitemaster> So it's losing all its detail
<graphitemaster> Oh wait
<graphitemaster> mangohud somehow is preventing zink from being used, nvm
<graphitemaster> false alarm
<graphitemaster> Yeah no, the issue is still the same, nvm.
<zmike> okay, so it's not a wsi issue at least
<graphitemaster> Have to run as LD_PRELOAD=/tmp/zink/lib/libGL.so GALLIUM_DRIVER=zink MANGOHUD_DLSYM=1 mangohud ./rex
<graphitemaster> To force it to actually use zink
<graphitemaster> Time to sleep. Will look at this tomorrow.
graphitemaster has quit [Ping timeout: 480 seconds]
graphitemaster has joined #zink
Simonx22 has quit [Ping timeout: 480 seconds]
hch12907 has quit [Remote host closed the connection]
hch12907 has joined #zink
xlei has joined #zink