ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
JohnnyonFlame has joined #dri-devel
<jenatali> Pretty sure you can do anything with a specialization constant
nchery_ has joined #dri-devel
<zmike> there are some values that must be actual constants
<jenatali> Just means you need to run the specialization pass from SPIRV-Tools before sending it to the Vulkan driver
mbrost__ has quit [Ping timeout: 480 seconds]
<zmike> 😬
<zmike> at that point may as well just do normal pipeline variants
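For readers following along: specialization constants are supplied at Vulkan pipeline creation time, which is why "run the specialization pass first" and "just do pipeline variants" land in the same ballpark. A minimal sketch of feeding one 32-bit value into a compute pipeline; constant ID 0 and the "main" entry point are placeholders, not anything zink actually does:

```c
#include <vulkan/vulkan.h>

/* Sketch: bake one 32-bit value into a compute pipeline via a specialization
 * constant instead of compiling a new SPIR-V module per value. */
static VkPipeline
create_specialized_pipeline(VkDevice dev, VkPipelineLayout layout,
                            VkShaderModule module, uint32_t value)
{
   const VkSpecializationMapEntry entry = {
      .constantID = 0,
      .offset = 0,
      .size = sizeof(value),
   };
   const VkSpecializationInfo spec = {
      .mapEntryCount = 1,
      .pMapEntries = &entry,
      .dataSize = sizeof(value),
      .pData = &value,
   };
   const VkComputePipelineCreateInfo info = {
      .sType = VK_STRUCTURE_TYPE_COMPUTE_PIPELINE_CREATE_INFO,
      .stage = {
         .sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO,
         .stage = VK_SHADER_STAGE_COMPUTE_BIT,
         .module = module,
         .pName = "main",
         .pSpecializationInfo = &spec,
      },
      .layout = layout,
   };

   VkPipeline pipeline = VK_NULL_HANDLE;
   vkCreateComputePipelines(dev, VK_NULL_HANDLE, 1, &info, NULL, &pipeline);
   return pipeline;
}
```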
mbrost__ has joined #dri-devel
<karolherbst> yeah... shared mem is terrible.. isn't there a vulkan extension or something?
<zmike> could be
<zmike> check the registry
nchery has quit [Ping timeout: 480 seconds]
<karolherbst> doesn't seem to exist
<karolherbst> heh.. but shared mem isn't why luxmark kills the GPU anyway
<zmike> for now maybe just detect whether you're on zink and create a different compute shader for each variable size, setting the static size to the total size
mbrost_ has joined #dri-devel
<karolherbst> uhhhhhh
<zmike> unless you want to plug in spec constant / pipeline variants
<karolherbst> you know that this value is completely variable?
<zmike> yep
<karolherbst> I'm not creating 65000 shaders
<zmike> well if you want to do other stuff this weekend...
<zmike> I'm saying I can do the variants next week
<zmike> but this would work as a temporary solution
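A rough sketch of the temporary workaround zmike describes above, assuming a rusticl-side cache keyed on the variable size; variant_cache_* and struct kernel are hypothetical helpers, while pipe_compute_state / create_compute_state (and its req_local_mem field) are the gallium API as of this discussion:

```c
#include "pipe/p_context.h"
#include "pipe/p_state.h"

/* Hypothetical frontend-side cache mapping a shared-mem size to a CSO. */
struct variant_cache;
void *variant_cache_lookup(struct variant_cache *c, unsigned key);
void variant_cache_insert(struct variant_cache *c, unsigned key, void *cso);

struct kernel {
   const void *nir;             /* compiled kernel IR */
   unsigned static_local_mem;   /* shared mem known at compile time */
   struct variant_cache *variants;
};

/* Sketch of the stop-gap: one compute CSO per observed variable shared-mem
 * size, with the total size baked in as the static requirement. */
static void *
get_cso_for_size(struct pipe_context *ctx, struct kernel *k,
                 unsigned variable_local_mem)
{
   void *cso = variant_cache_lookup(k->variants, variable_local_mem);
   if (cso)
      return cso;

   struct pipe_compute_state state = {
      .ir_type = PIPE_SHADER_IR_NIR,
      .prog = k->nir,
      /* bake the total (static + variable) size into this variant */
      .req_local_mem = k->static_local_mem + variable_local_mem,
   };
   cso = ctx->create_compute_state(ctx, &state);
   variant_cache_insert(k->variants, variable_local_mem, cso);
   return cso;
}
```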
<karolherbst> I think I'll try to fix why zink trashes the GPU context instead :)
<zmike> 🤷
<karolherbst> uhm.. figure out
mbrost__ has quit [Ping timeout: 480 seconds]
<karolherbst> still works on anv though
<zmike> are you using VVL at all?
<karolherbst> VVL?
<karolherbst> ahh... right.. I used to use them, but it was fine.. maybe it's not with radv
<zmike> if you have it installed you can use ZINK_DEBUG=validation
<karolherbst> but maybe I should just in case
mbrost_ has quit [Ping timeout: 480 seconds]
<karolherbst> mhh.. it's not printing anything, must be perfect code then
<karolherbst> ehh.. I'll just run the cts on anv and see what's the difference
co1umbarius has joined #dri-devel
columbarius has quit [Ping timeout: 480 seconds]
<jenatali> karolherbst: fwiw we'd need variants for shared memory size too, so if you wanted to do it in rusticl, drivers that need it wouldn't have to do it themselves
<karolherbst> jenatali: I kind of plan on splitting it out so backends can deal with it however they want to
<karolherbst> create_compute_state takes the static amount and launch_grid gets a variable_shared_mem parameter
<jenatali> Fair. I'd use that for a variant then
<karolherbst> yeah, that's more or less the idea then
<karolherbst> but I wanted the CSO to be like that so I can create it early and all kernel launches reuse the same one
<karolherbst> which I need for workgroupinfo as well
<jenatali> Makes sense
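A sketch of the split being agreed on here: the static shared-mem amount stays in the CSO, and the per-launch variable amount travels with the grid info. The variable_shared_mem field is the proposal from the discussion, not an existing pipe_grid_info member at this point:

```c
#include "pipe/p_context.h"
#include "pipe/p_state.h"

/* Sketch: one CSO created up front (static shared mem baked in via
 * create_compute_state), reused for every launch; the variable amount is
 * passed per launch through a proposed pipe_grid_info field. */
static void
launch_with_variable_shared_mem(struct pipe_context *ctx, void *cso,
                                const struct pipe_grid_info *base,
                                unsigned variable_shared_mem)
{
   struct pipe_grid_info info = *base;   /* block[], grid[], input, ... */

   /* proposed new field; does not exist in p_state.h yet */
   info.variable_shared_mem = variable_shared_mem;

   ctx->bind_compute_state(ctx, cso);    /* same CSO for every launch */
   ctx->launch_grid(ctx, &info);
}
```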
<karolherbst> ehh.. running with anv ooms my system, but it passes a lot more tests: "Pass 1613 Fails 41 Crashes 355 Timeouts 0: 82%|"
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
ybogdano has quit [Ping timeout: 480 seconds]
jolan has quit [Remote host closed the connection]
jolan has joined #dri-devel
<karolherbst> uhhh.. it's the kernel running OOM.. strange
jolan has quit []
<karolherbst> or rather... no, I think it's file memory
<karolherbst> probably some huge spirv being generated
jolan has joined #dri-devel
<karolherbst> found it
<karolherbst> "./build/test_conformance/multiple_device_context/test_multiples" :(
Emantor has quit [Quit: ZNC - http://znc.in]
Emantor has joined #dri-devel
<jenatali> Ooh yeah that one is fun
<karolherbst> I suspect zink is just allocating tons of stuff
<jenatali> karolherbst: can you ever have more than one device enumerated at a time?
lemonzest has quit [Quit: WeeChat 3.6]
<karolherbst> at least my 32GB machine isn't enough
<karolherbst> jenatali: what do you mean?
<jenatali> I assume rusticl is a platform, which can then enumerate devices?
<karolherbst> sure
<karolherbst> and multiple devices just work
<karolherbst> but that's not the point
<karolherbst> there is just one device...
<jenatali> And then when you create a context, you can pick one or more devices to use
<karolherbst> sure
<karolherbst> that all works
<jenatali> Ok I just wasn't sure if you handled the multi-device with a single context case
<jenatali> Cause that was a huge pain for me
<karolherbst> it works if it's not zink at least
alyssa has joined #dri-devel
<alyssa> vec4 32 ssa_4 = intrinsic load_interpolated_input (ssa_3, ssa_2) (base=1, component=0, dest_type=invalid /*0*/, io location=33 slots=1 /*161*/)/* v_colorBase */
<karolherbst> but maybe the test is doing something super silly? like using the same device multiple times?
<alyssa> type=invalid? thinking
<zmike> karolherbst: massif?
<karolherbst> zmike: mhh?
<zmike> to see where the allocation is
<karolherbst> not in userspace
<zmike> ah
<karolherbst> the process is using 1.0% of mem :(
<karolherbst> probably some GPU buffers or something? I dunno
<zmike> if you're on amd you can use radeontop or nvtop to see vram utilization in realtime
<karolherbst> it's anv sadly
<karolherbst> heh on radv that test just works
<zmike> find a 🍺 then
<karolherbst> uses like 100MB of memory
<alyssa> one line fix nvm
<karolherbst> "Successfully created all 200 contexts." uhhhhh
<zmike> man-blinking.gif
<HdkR> Does it actually run out of memory or does it run in to the 65k VMA region limit? :D
<karolherbst> I guess if a context creates more than 200MB of stuff I can see it running out of memory :D
<karolherbst> HdkR: actual physical memory
<karolherbst> but on the GPU side I think
<HdkR> Very fancy
<karolherbst> ehhh... but what are those 200 contexts anyway?....
<karolherbst> I am sure I only create one screen...
<karolherbst> and a context is per queue...
<karolherbst> *pipe_context
<karolherbst> ohh.. it actually creates queues and kernels and programs per cl_context
<karolherbst> zmike: what's the vk type a pipe_context maps to? VkDevice? VkQueue?
<zmike> there isn't one
<karolherbst> anyway.. sounds like an anv bug to me :)
<zmike> does indeed
lyudess has joined #dri-devel
<jenatali> karolherbst: When you run on zink, which / how many devices do you show?
<karolherbst> just one
<karolherbst> whatever is the active/first vulkan device
Lyude has quit [Read error: Connection reset by peer]
<karolherbst> zink can't be loaded multiple times yet afaik
<jenatali> I see
<karolherbst> I'm sure it's not because of zink, just because there is no "iterate all render nodes" thing
<karolherbst> anyway.. on radv it works
<karolherbst> on anv it OOMs
<zmike> you could probably do it by changing the icd env between loads
<karolherbst> heh...
<karolherbst> what a dirty hack, but I like it
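A sketch of the hack being joked about, assuming the loader's VK_ICD_FILENAMES override and a hypothetical create_zink_screen_for_icd() entry point; the real pipe-loader/zink plumbing would look different:

```c
#include <stdlib.h>

struct pipe_screen;                                   /* opaque gallium screen */
struct pipe_screen *create_zink_screen_for_icd(void); /* hypothetical helper */

/* Sketch: create one zink screen per Vulkan ICD by pointing the loader at a
 * single driver manifest before each screen creation.  VK_ICD_FILENAMES is a
 * real Vulkan loader override; everything else here is made up. */
static void
probe_all_vulkan_devices(const char *const *icd_manifests, unsigned count,
                         struct pipe_screen **screens)
{
   for (unsigned i = 0; i < count; i++) {
      setenv("VK_ICD_FILENAMES", icd_manifests[i], 1 /* overwrite */);
      screens[i] = create_zink_screen_for_icd();
   }
}
```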
kem has quit [Ping timeout: 480 seconds]
khfeng has joined #dri-devel
khfeng has quit []
khfeng has joined #dri-devel
kem has joined #dri-devel
<illwieckz> dirty hacks need love too
<karolherbst> yo, but that is seriously dirty as I'd have to parse env vars and locations and all of that
<zmike> I guess maybe I should try adding fallback in zink to try progressively iterating through devices if the first one fails? 🤔
<karolherbst> zmike: I need something like pipe_loader_probe just for vulkan
<zmike> what is the end goal here
<karolherbst> being able to load multiple devices through zink
<zmike> yeah but do you care what you're loading, or are you okay with just loading anything
<karolherbst> I have to load _all_ devices
<zmike> yeah so probably just adding fallback handling in zink would be fine
<karolherbst> though at some point I want to either use the native gallium driver or the zink one
<karolherbst> so maybe we need a pipe_loader_probe which falls back to zink for the specific render node instead
<karolherbst> or something
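Very roughly, the probe being sketched in words above could look like this; every function except snprintf is a hypothetical stand-in for pipe-loader/zink entry points, and the renderD128 numbering is just the usual DRM render-node convention:

```c
#include <stdio.h>

struct pipe_screen;  /* opaque gallium screen */

/* Hypothetical stand-ins for pipe-loader / zink entry points. */
struct pipe_screen *try_native_screen_for_node(const char *path);
struct pipe_screen *try_zink_screen_for_node(const char *path);

/* Sketch: enumerate render nodes, prefer the native gallium driver, and fall
 * back to zink per node, so every device ends up with exactly one screen. */
static unsigned
probe_all_devices(struct pipe_screen **screens, unsigned max)
{
   unsigned n = 0;
   for (unsigned minor = 128; minor < 128 + 64 && n < max; minor++) {
      char path[64];
      snprintf(path, sizeof(path), "/dev/dri/renderD%u", minor);

      struct pipe_screen *s = try_native_screen_for_node(path);
      if (!s)
         s = try_zink_screen_for_node(path);
      if (s)
         screens[n++] = s;
   }
   return n;
}
```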
heat has quit [Ping timeout: 480 seconds]
<zmike> hm
<jenatali> That's how we handle our D3D layered impls
<jenatali> Would be nice if we could do that for the CL/GL/VK layered impls too, but for now we just enumerate all devices and put them after the native impls
alyssa has left #dri-devel [#dri-devel]
YuGiOhJCJ has joined #dri-devel
YuGiOhJCJ has quit [Quit: YuGiOhJCJ]
jagan_ has quit [Remote host closed the connection]
sumits has quit [Quit: ZNC - http://znc.in]
andrey-konovalov has quit []
sumits has joined #dri-devel
andrey-konovalov has joined #dri-devel
ella-0_ has joined #dri-devel
ella-0 has quit [Read error: Connection reset by peer]
Leopold_ has quit [Remote host closed the connection]
srslypascal is now known as Guest3138
srslypascal has joined #dri-devel
Guest3138 has quit [Ping timeout: 480 seconds]
Lyude has joined #dri-devel
lyudess has quit [Ping timeout: 480 seconds]
fab has joined #dri-devel
cef has quit [Quit: Zoom!]
fxkamd has quit []
samueldr_ has left #dri-devel [#dri-devel]
sdutt has quit [Read error: Connection reset by peer]
lygstate has quit [Remote host closed the connection]
kts has joined #dri-devel
rasterman has joined #dri-devel
Duke`` has joined #dri-devel
danvet has joined #dri-devel
djbw has quit [Read error: Connection reset by peer]
srslypascal is now known as Guest3143
srslypascal has joined #dri-devel
Guest3143 has quit [Ping timeout: 480 seconds]
kts has quit [Quit: Leaving]
kts has joined #dri-devel
cef has joined #dri-devel
jernej_ is now known as jernej
kem has quit [Ping timeout: 480 seconds]
kem has joined #dri-devel
JohnnyonFlame has quit [Ping timeout: 480 seconds]
ella-0_ is now known as ella-0
ella-0 has left #dri-devel [#dri-devel]
ella-0 has joined #dri-devel
chipxxx has quit [Remote host closed the connection]
chipxxx has joined #dri-devel
co1umbarius has quit [Read error: No route to host]
TMM has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
TMM has joined #dri-devel
srslypascal is now known as Guest3150
srslypascal has joined #dri-devel
lemonzest has joined #dri-devel
Guest3150 has quit [Ping timeout: 480 seconds]
srslypascal has quit [Quit: Leaving]
srslypascal has joined #dri-devel
Thymo_ has quit [Ping timeout: 480 seconds]
JohnnyonFlame has joined #dri-devel
i-garrison has quit []
morphis has quit [Remote host closed the connection]
morphis has joined #dri-devel
gouchi has joined #dri-devel
fab has quit [Quit: fab]
fab_ has joined #dri-devel
fab_ is now known as Guest3153
morphis has quit [Remote host closed the connection]
morphis has joined #dri-devel
morphis has quit [Remote host closed the connection]
carbonfiber has joined #dri-devel
i-garrison has joined #dri-devel
morphis has joined #dri-devel
Duke`` has quit []
Duke`` has joined #dri-devel
kts has quit [Quit: Leaving]
morphis has quit [Ping timeout: 480 seconds]
JohnnyonFlame has quit [Ping timeout: 480 seconds]
kts has joined #dri-devel
Jeremy_Rand_Talos_ has quit [Remote host closed the connection]
Jeremy_Rand_Talos_ has joined #dri-devel
morphis has joined #dri-devel
Thymo has joined #dri-devel
gouchi has quit [Remote host closed the connection]
illwieckz has quit [Remote host closed the connection]
pcercuei has joined #dri-devel
morphis has quit [Remote host closed the connection]
illwieckz has joined #dri-devel
TMM has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
TMM has joined #dri-devel
Leopold has joined #dri-devel
srslypascal is now known as Guest3157
srslypascal has joined #dri-devel
Guest3157 has quit [Ping timeout: 480 seconds]
srslypascal is now known as Guest3158
srslypascal has joined #dri-devel
Haaninjo has joined #dri-devel
srslypascal has quit [Remote host closed the connection]
srslypascal has joined #dri-devel
srslypascal is now known as Guest3159
srslypascal has joined #dri-devel
Guest3158 has quit [Ping timeout: 480 seconds]
lunarequest has joined #dri-devel
Guest3159 has quit [Ping timeout: 480 seconds]
srslypascal is now known as Guest3160
srslypascal has joined #dri-devel
heat has joined #dri-devel
srslypascal is now known as Guest3161
srslypascal has joined #dri-devel
lunarequest has left #dri-devel [#dri-devel]
srslypascal is now known as Guest3162
srslypascal has joined #dri-devel
Guest3160 has quit [Ping timeout: 480 seconds]
lunarequest has joined #dri-devel
Guest3161 has quit [Ping timeout: 480 seconds]
srslypascal has quit [Remote host closed the connection]
srslypascal has joined #dri-devel
Guest3162 has quit [Ping timeout: 480 seconds]
srslypascal has quit [Remote host closed the connection]
kts has quit [Quit: Leaving]
nullrequest has joined #dri-devel
srslypascal has joined #dri-devel
lunarequest has quit [Ping timeout: 480 seconds]
srslypascal has quit [Remote host closed the connection]
JoshuaAshton has quit [Ping timeout: 480 seconds]
gawin has joined #dri-devel
kts has joined #dri-devel
nullrequest has quit [Ping timeout: 480 seconds]
srslypascal has joined #dri-devel
srslypascal is now known as Guest3165
srslypascal has joined #dri-devel
Guest3165 has quit [Ping timeout: 480 seconds]
nullrequest has joined #dri-devel
nullrequest has left #dri-devel [#dri-devel]
srslypascal is now known as Guest3167
srslypascal has joined #dri-devel
Guest3167 has quit [Ping timeout: 480 seconds]
srslypascal is now known as Guest3168
srslypascal has joined #dri-devel
gouchi has joined #dri-devel
srslypascal has quit [Remote host closed the connection]
srslypascal has joined #dri-devel
Guest3168 has quit [Ping timeout: 480 seconds]
Company has quit [Quit: Leaving]
gawin has quit [Ping timeout: 480 seconds]
kem has quit [Ping timeout: 480 seconds]
kem has joined #dri-devel
sarnex has quit [Read error: Connection reset by peer]
sarnex has joined #dri-devel
chipxxx has quit [Remote host closed the connection]
MajorBiscuit has joined #dri-devel
srslypascal is now known as Guest3173
srslypascal has joined #dri-devel
Guest3173 has quit [Ping timeout: 480 seconds]
chipxxx has joined #dri-devel
Compy has joined #dri-devel
djbw has joined #dri-devel
srslypascal is now known as Guest3175
srslypascal has joined #dri-devel
Compy has left #dri-devel [#dri-devel]
Compy_ has joined #dri-devel
<Compy_> Good morning all. I'm working on a baseline image targeting an Allwinner H2, MALI GPU (buildroot, linux 5.10.47). I've installed mesa3d (21.1.8) and the gallium lima driver, opengl EGL/ES and libdrm (2.4.107). Linux modules for KMSDRM are compiled in, and I see kmsg saying that lima has loaded/been detected successfully, however I can't get any KMSDRM devices in things like SDL2. From what I can tell, the
<Compy_> KMSDRM_drmModeGetResources call is failing (the underlying IOCTL call). Any idea where I can start to look at this?
Guest3175 has quit [Ping timeout: 480 seconds]
Lucretia has quit []
Lucretia has joined #dri-devel
srslypascal is now known as Guest3177
srslypascal has joined #dri-devel
Guest3177 has quit [Ping timeout: 480 seconds]
kts has quit [Quit: Leaving]
MajorBiscuit has quit [Quit: WeeChat 3.5]
<pinchartl> Compy_: my usual approach to this kind of problem is to trace it in the kernel. there are debugging options you can enable in the DRM/KMS core to output debug messages to the kernel log, tracing calls from userspace. if that's not enough, I usually add printk() statements to the code paths to locate the source of the error. this debugging strategy may be too influenced by a lifetime of kernel
<pinchartl> development though :-)
<Compy_> Hahah, I'm doing the same thing with printk() statements. I'll see if I can up that debug verbosity in the DRM/KMS core. Thanks a ton for the response to my semi-vague situation. Always appreciated :)
srslypascal is now known as Guest3178
srslypascal has joined #dri-devel
<pinchartl> I'll have to do something similar to get GL-accelerated composition on an i.MX8MP (with etnaviv, not lima), so I sympathize :-)
JohnnyonFlame has joined #dri-devel
Guest3178 has quit [Ping timeout: 480 seconds]
srslypascal is now known as Guest3183
srslypascal has joined #dri-devel
Guest3183 has quit [Ping timeout: 480 seconds]
srslypascal is now known as Guest3185
srslypascal has joined #dri-devel
Guest3185 has quit [Ping timeout: 480 seconds]
heat has quit [Ping timeout: 480 seconds]
JoshuaAshton has joined #dri-devel
columbarius has joined #dri-devel
alyssa has joined #dri-devel
<alyssa> I'm seeing incorrect rendering of the in-game stars in Neverball on upstream Mesa.
<alyssa> I've bisected it to a change in the Mesa frontend. I would like to know if other drivers are affected.
<alyssa> To reproduce, open any level in neverball and collect a coin.
<alyssa> I would appreciate if someone who uses Mesa on something other than Mali could test and report results :-)
<alyssa> (And if affected, that reverting the guilty commit fixes it for you too.)
<alyssa> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19090 -- this one line insertion fixes the issue for me on Mali-T860
<alyssa> I expect all drivers are affected and all drivers need that fix
<alyssa> But apparently I play more Neverball than you guys ;-p
Namarrgon has quit [Quit: WeeChat 3.6]
Namarrgon has joined #dri-devel
Namarrgon has quit []
Namarrgon has joined #dri-devel
<karolherbst> alyssa: mind checking if "./build/test_conformance/buffers/test_buffers buffer_map_read_uint" fails randomly without the sync hacks you were talking about? I see it on zink and was wondering if it's the same thing you were seeing, because I can also fix it by hard syncing stuff
<alyssa> I am not in an OpenCL branch right now so I can't test easily
<alyssa> but ... everything was failing randomly, even early on in test_basic
<karolherbst> ohh, I see
<karolherbst> yeah, that's something else then
<karolherbst> though that might be fixed with my MRs merged now
<karolherbst> mapping of memory was bonkers and literally only worked on iris perfectly
<alyssa> Delightful
<karolherbst> yeah... but the new code should be much better.. at least it works on radeonsi and nouveau without issues
<karolherbst> but still something is wrong with mapping memory on zink... :/
sdutt has joined #dri-devel
<alyssa> nod
<alyssa> I have some advice for anyone considering writing a compiler for a VLIW vec4 GPU
<alyssa> Don't
<karolherbst> +1
<karolherbst> the apple stuff is like that, no?
<alyssa> Not at all!
<alyssa> AGX is pure scalar, dynamic scheduling, basically like a CPU except for control flow
<karolherbst> ohh.. then I confused it with something else
<karolherbst> or was it just that the encoding is variable in size?
<alyssa> Yeah
<alyssa> that's just encoding though
<karolherbst> right... then I mistook that for something else
<alyssa> (and means that AGX programs are much smaller than any Mali on avg)
<karolherbst> I am actually curious if that is a benefit or drawback overall
pcercuei has quit [Read error: Connection reset by peer]
<alyssa> which?
pcercuei has joined #dri-devel
<karolherbst> variable length
<alyssa> Oh.. \shrug/
<alyssa> Definitely good for icache use
<alyssa> Makes no difference to sw
<alyssa> No idea how much complexity it adds to the decoder though
<karolherbst> yeah.. you probably need smaller caches, but nvidia was like: let's just go with 128b fixed length and yolo it
<karolherbst> and it's very very wasteful overall
<karolherbst> but maybe they just put bigger caches
<RSpliet> reckon the icache has partially-decoded ops or like micro-ops inside?
<RSpliet> Not that you'd be able to tell from the outside :-)
dv_ has quit [Ping timeout: 480 seconds]
gouchi has quit [Remote host closed the connection]
<karolherbst> zmike: what's the thing I have to do for zink to make sure an operation on one pipe_context is actually visible in another one... e.g. a buffer_subdata
dv_ has joined #dri-devel
<zmike> karolherbst: depends on the type of buffer? if you actually get a direct mapping for it then it'll be visible immediately
<zmike> but if it's vram then you'll get a staging slice, so you have to unmap/flush
<zmike> after that the buffer should automatically synchronize other usage
<karolherbst> mhhh, yeah so it seems like the only case where it's always doing the correct thing is if I use a userptr/hostptr
<karolherbst> mhh
<karolherbst> something is very broken with all the memory consistency stuff
<zmike> are you flushing the transfer map?
<karolherbst> nope
<karolherbst> just unmapping them
<karolherbst> which I guess should be good enough, no?
<zmike> that should be sufficient, though if you're using cross-context like that you also have to flush the context
<karolherbst> yeah.. I always flush the helper context I use for data uploads
<zmike> I'd expect that to be sufficient
<zmike> can you describe to me your usage in terms of gallium calls
<karolherbst> yeah.. it's weird
<zmike> are you actually doing subdata or are you manually map/unmap
<zmike> and what is the other usage for the buffer
<karolherbst> subdata for initial data uploads, then a resource_copy + unsynchronized maps/unmaps or shadow resources... I think there are multiple bugs/issues
<karolherbst> the shadow stuff seems to work reliably though
<karolherbst> so I suspect something being odd with unsynchronized maps
<karolherbst> and the tests cases with initial uploads are also flaky
<zmike> it sounds like you probably want to be using the threaded context replace buffer hook?
<karolherbst> mhh.. how would that look like?
<zmike> create buffer A, subdata, bind as descriptor or whatever, dispatch compute, create buffer B with same size, subdata, replace storage (src=B, dst=A), dispatch compute, ...
<zmike> ah, the issue might be with rebinds actually
<karolherbst> not doing any compute launches though
<karolherbst> that's all just plain copies
<zmike> step through zink_buffer_map and see if it's discarding
<zmike> that might be altering the behavior you're expecting
<zmike> though it should be the same behavior as radeonsi
<karolherbst> yeah... I'll check
<zmike> I realize now that I don't think I hooked up anything for rebinding a buffer that's used as a global bind
<zmike> so if that happens everything's fucked
<karolherbst> it's not getting bound
<zmike> just saying
<karolherbst> there is really no kernel involved here
<karolherbst> k
<zmike> you still haven't told me what you're actually doing so I'm just saying things as I think of them
<karolherbst> " subdata for initial data uploads, then a resource_copy + unsynchronized maps/unmaps or shadow reasoures" that's really all
<zmike> I mean in terms of the exact command stream
<zmike> GALLIUM_TRACE would've been great here if it worked
<karolherbst> what would I need to do to hook it up?
<zmike> there's some debug_wrap thing
<zmike> inline_debug_helper.h
<zmike> debug_screen_wrap
<zmike> with that working you should be able to do GALLIUM_TRACE=t.xml <exe> and have it dump a huge xml thing that you can then use like `src/gallium/tools/trace/dump.py -N -p t.xml > dump`
<karolherbst> it seems to be doing that automatically, but it's not setting less callbacks
<karolherbst> ohh wait.. I guess I have to unwrap it
<zmike> ?
<zmike> the point is that you use the wrapped screen+context from rusticl
<zmike> and it'll trace everything you do
<karolherbst> sure, but that trace context is already created by something else
<karolherbst> mhh, let me check the code, maybe I miss something here
<karolherbst> ahhh...
<karolherbst> it's set_global_binding
<karolherbst> doesn't have a debug variant
<zmike> yea so probably add that to tr_context.c
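A sketch of what the missing hook might look like, written from memory of the tr_context.c pattern; the exact trace_dump_* calls and the way hooks are installed in trace_context_create() may differ:

```c
/* Sketch for src/gallium/auxiliary/driver_trace/tr_context.c: wrap
 * set_global_binding like the other hooks so GALLIUM_TRACE records it. */
static void
trace_context_set_global_binding(struct pipe_context *_pipe,
                                 unsigned first, unsigned count,
                                 struct pipe_resource **resources,
                                 uint32_t **handles)
{
   struct trace_context *tr_ctx = trace_context(_pipe);
   struct pipe_context *pipe = tr_ctx->pipe;

   trace_dump_call_begin("pipe_context", "set_global_binding");
   trace_dump_arg(ptr, pipe);
   trace_dump_arg(uint, first);
   trace_dump_arg(uint, count);
   trace_dump_call_end();

   pipe->set_global_binding(pipe, first, count, resources, handles);
}

/* ...plus hooking it up in trace_context_create(), the same way the other
 * pipe_context methods are installed there. */
```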
Duke`` has quit [Ping timeout: 481 seconds]
JoshuaAshton has quit [Ping timeout: 480 seconds]
<zmike> ergh I gotta paste you my filter command sometime
<karolherbst> heh.. seems like the test indeed launches kernels.. strange and annoying
<zmike> sed -i '/set_constant_buffer/d;/bind_sampler/d;/get_param/d;/get_shader_param/d;/is_format_supported/d;/get_compute_param/d;/get_compiler_options/d;/get_disk_shader_cache/d;/get_name/d;/get_vendor/d;/resource_bind_backing/d;/allocate_memory/d;/free_memory/d;/set_prediction_mode/d;/fence_reference/d;/delete_sampler_state/d' dump
<zmike> yea I mean I have it locally
<zmike> just sending so you have it and can use it
<karolherbst> ... figures
<zmike> so where's the issue happening
<zmike> lot of resource_copy_region calls
<karolherbst> yeah...
<zmike> hm in the pruned version I see L215 for example you're mapping resource_12 on pipe_1
<karolherbst> hard to say, the application basically just maps memory and checks if it has the correct values, but it does map asynchronously which is a huge pain
<zmike> then doing resource_copy_region on pipe_2 with resource_12 as dst
<zmike> or no misread
<karolherbst> yeah.. pipe_1 is the helper context dealing with random stuff I don't have a cl_queue for
<zmike> the copy is a little later, not immediate
<karolherbst> the thing is...
<karolherbst> the first buffer_map you see is already returning the pointer the application checks
<karolherbst> and it checks that it contains the stuff that happens after the map
<karolherbst> until the flush+fence_finish stuff
<karolherbst> so everything between resource_creates is more or less one test case
<karolherbst> I do some shadow resources, so sometimes it doesn't add up
<zmike> can you just prune it down to one failing case?
<zmike> it would be easier to know exactly what's happening that way
<karolherbst> ohh.. I could let it crash on the first failing test
<karolherbst> that should work
<zmike> (i.e., edit the test to run only one case that fails)
<karolherbst> one passing and one failing test
<karolherbst> should be obvious where the second one starts
<karolherbst> (after the flushing)
Jeremy_Rand_Talos__ has joined #dri-devel
Jeremy_Rand_Talos_ has quit [Remote host closed the connection]
<zmike> so...it's creating a resource, subdata, flush, map, set_global_binding, and then it fails?
<zmike> I assume the kernel is writing to it?
<karolherbst> yeah.. I think so
<karolherbst> yep
<zmike> ok and can you verify whether this is a BAR allocation that zink is doing?
<zmike> ie. when it maps is it directly mapping
<zmike> because you're creating it with usage=0, which means it should be attempting BAR
<zmike> which should be host-visible
Jeremy_Rand_Talos__ has quit [Remote host closed the connection]
Jeremy_Rand_Talos__ has joined #dri-devel
<zmike> ahhhh
<zmike> you're not creating it as persistent!
<karolherbst> because it isn't
<zmike> the mapping?
<zmike> it sure seems to be
<karolherbst> it's DIRECTLY | UNSYNCHRONIZED
<zmike> that's not enough if you're expecting to read data out of it across a dispatch
<karolherbst> I don't, I want the current data, that's all
<zmike> but you're running a kernel that writes to it
<karolherbst> the application is responsible for syncing
<karolherbst> the application checks after the flushes happen
<karolherbst> I just have to map very very early
<zmike> I don't see anything that would cause this to sync?
<zmike> there's vulkan calls that should be happening here
<karolherbst> the flush and fence_finish?
<zmike> yeah but I think you still need to flush/invalidate the mapped memory range
<karolherbst> mhhh.. that might be
<zmike> which is why you need persistent
<zmike> or at least transfer_flush_region
<karolherbst> but I don't want persistent
<karolherbst> yeah.. I guess I have to use transfer_flush_region
<karolherbst> but that's all only for WRITE access and not READ :/
<karolherbst> it's very annoying
<karolherbst> but it might not matter
<karolherbst> so my understanding of transfer_flush_region is that when a resource is mapped with FLUSH_EXPLICIT, cached data is only _written_ back on transfer_flush_region
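For reference, the FLUSH_EXPLICIT flow described above as a gallium-level sketch, assuming a plain buffer write; offsets and sizes are placeholders, and the flushed box is given relative to the mapped range:

```c
#include <string.h>
#include "pipe/p_context.h"
#include "util/u_box.h"
#include "util/u_inlines.h"

/* Sketch: with PIPE_MAP_FLUSH_EXPLICIT the driver only has to write back the
 * ranges that are explicitly flushed before unmap. */
static void
write_with_explicit_flush(struct pipe_context *ctx, struct pipe_resource *buf,
                          const void *data, unsigned offset, unsigned size)
{
   struct pipe_transfer *xfer;
   void *map = pipe_buffer_map_range(ctx, buf, offset, size,
                                     PIPE_MAP_WRITE | PIPE_MAP_FLUSH_EXPLICIT,
                                     &xfer);
   memcpy(map, data, size);

   /* only this range is guaranteed to land in the buffer */
   struct pipe_box box;
   u_box_1d(0, size, &box);
   ctx->transfer_flush_region(ctx, xfer, &box);

   pipe_buffer_unmap(ctx, xfer);
}
```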
<zmike> well you can see if this fixes it by just setting persistent on the map
<karolherbst> anyway.. I'm not using "DIRECTLY | UNSYNCHRONIZED" on non-UMA systems, so RADV should get its shadow buffer and that should all work fine regardless
<zmike> and nothing else
<karolherbst> sure, but I absolutely don't want to set persistent
<zmike> nothing else extra*
<zmike> yeah yeah but for testing
<karolherbst> just the map or also the resource?
<zmike> map
<karolherbst> still fails
<zmike> hm
<zmike> what's the memory_barrier call there for?
<karolherbst> just something clover was doing after launch_grid and I copied it over
<karolherbst> might not be needed or might be.. dunno
<zmike> you only need that if you're synchronizing between gpu operations
<zmike> which this doesn't appear to be doing
<zmike> not that it's harmful
<karolherbst> yeah.. perf opts are for later and stuff
<zmike> hm
<zmike> not really a perf opt, zink will no-op it anyway
<karolherbst> ahh
<zmike> definitely looks like something weird happening
<karolherbst> so the only real requirement I have is that after flush + fence_finish the current data is visible on the mapped ptrs
<karolherbst> and normally I only try to map directly when I know it's safe
<karolherbst> like on UMA systems
<zmike> yeah the only thing I can think of is that you're not getting a real map of the buffer somehow
<karolherbst> e.g. I don't do it on discrete GPUs
<karolherbst> might be
<karolherbst> but zink doesn't seem to shadow it either
<zmike> but if you've stepped through zink_buffer_map you can check that pretty easily
<zmike> have you tried mapping with just PIPE_MAP_READ and not also PIPE_MAP_WRITE
<karolherbst> yeah.. it calls map_resource on the res directly
<karolherbst> huh.. I could try that, but it does look like it gives me the real deal directly
<zmike> alright
<karolherbst> so
<karolherbst> I know a way of fixing it
<karolherbst> if I just stall the pipeline the tests are passing
<zmike> not sure then, I'd probably have to look at it
<zmike> stall the pipeline?
<karolherbst> like sleeping in my worker thread on the second context
mhenning has joined #dri-devel
<karolherbst> helper context stuff happens on the application thread, actual context stuff (cl_queue) happens inside a worker thread
<karolherbst> so probably just a sync issue between the two contexts or something?
<zmike> that's incredibly bizarre
<karolherbst> I could sleep and make another trace
<karolherbst> maybe it looks different enough to spot something
<zmike> oh hold up
<zmike> 539 pipe_context::flush(pipe = pipe_2, flags = 0) = ret_5 // time 225
<zmike> 540 pipe_context::flush(pipe = pipe_2, flags = 0) = fence_5 // time 2
<zmike> 543 pipe_screen::fence_finish(screen = screen_3, ctx = NULL, fence = fence_5, timeout = 18446744073709551615) = 1 // time 3
<zmike> 541 pipe_screen::fence_finish(screen = screen_3, ctx = NULL, fence = fence_4, timeout = 18446744073709551615) = 1 // time 2
<zmike> ret_5 never gets waited on
<zmike> why are you flushing twice anyway
<karolherbst> silly reasons
<zmike> try waiting on that fence too
<karolherbst> do I have to wait on all fences or just the last one?
<karolherbst> okay...
<zmike> dunno, just brainstorming
<zmike> could try waiting on every fence after flushing and see what happens
<zmike> maximum sync
<karolherbst> heh.. why am I not waiting on the ret_5 fence.. that's like odd...
<karolherbst> zmike: okay.. so that fixes it indeed
<karolherbst> I am wondering if something overwrites a fence or something stupid
<karolherbst> the trace looks odd
<zmike> the variable names get reused sometimes
<karolherbst> yeah.. that would be weird here.. I am sure it's something terribly stupid
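The "maximum sync" variant that ends up making the test pass, as a sketch: flush every pipe_context involved and wait on each returned fence before the application is allowed to read the mapping. The helper name and loop are illustrative only:

```c
#include <stdint.h>
#include "pipe/p_context.h"
#include "pipe/p_screen.h"

/* Sketch: flush all contexts and wait on every returned fence (the trace's
 * timeout of 18446744073709551615 is simply UINT64_MAX, i.e. wait forever). */
static void
sync_all_contexts(struct pipe_screen *screen,
                  struct pipe_context **ctxs, unsigned count)
{
   for (unsigned i = 0; i < count; i++) {
      struct pipe_fence_handle *fence = NULL;
      ctxs[i]->flush(ctxs[i], &fence, 0);
      if (fence) {
         screen->fence_finish(screen, NULL, fence, UINT64_MAX);
         screen->fence_reference(screen, &fence, NULL);
      }
   }
}
```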
Surkow|laptop has quit [Ping timeout: 480 seconds]
rasterman has quit [Quit: Gettin' stinky!]
danvet has quit [Ping timeout: 480 seconds]
Surkow|laptop has joined #dri-devel
sarnex has quit [Quit: Quit]
<karolherbst> ohh seems like the process crashed too fast, it actually waited on the fence
<karolherbst> just in a different order, which shouldn't matter I guess
carbonfiber has quit [Quit: Connection closed for inactivity]
lemonzest has quit [Quit: WeeChat 3.6]
<karolherbst> mhhh.. one trace with broken and working test: https://gist.github.com/karolherbst/808d0d000df3bc6880f89427bf9ce2ae
<karolherbst> the difference isn't all that huge
<karolherbst> what really stands out is the wait on fence_4
<karolherbst> still don't see why that would be my bug...
<zmike> so what was it that made the test pass? or did you just try with sleep
<karolherbst> I call fence_finish immediately after flush
<zmike> did you try incrementally removing them to see which fence was the problem?
<karolherbst> it can only be one
<zmike> ?
<karolherbst> the trace is almost identical, just that one wait moves quite a bit to the bottom
sarnex has joined #dri-devel
<zmike> so...the 4th flush?
<karolherbst> yeah
<zmike> and if you fence_finish immediately after then it passes?
<zmike> but like this it fails?
<karolherbst> yes
<zmike> hm
<karolherbst> yeah...
<karolherbst> zink does some map/unmap dance in buffer_subdata though
<zmike> yeah
<zmike> sounds to me like the subdata call isn't synchronizing right
<karolherbst> probably
<zmike> did you check to make sure it isn't discarding?
<alyssa> zmike: shouldn't you be ~~enjoying the weekend~~ working out or something
<alyssa> no?
<karolherbst> I didn't check what buffer_subdata is doing
<zmike> I was at the gym like 12 hours ago
<alyssa> great I have an MR for you then thanks
<karolherbst> :D
<karolherbst> will look deeper into buffer_subdata
<alyssa> glcts-nvi10-valve isn't running so I assume the code works perfectly with zero bugs
<karolherbst> with that you could drop NIR_SERIALIZED. will probably create the MR later or something
<alyssa> karolherbst: nice
<karolherbst> just requires RUSTICL_ENABLE=panfrost to use CL then, but I guess that's fair :)
<karolherbst> and all drivers use the same thing then, which is nice
<karolherbst> (TODO: need to add it to the doc)
<alyssa> +1
<alyssa> what's the : and ; syntax for
<karolherbst> RUSTICL_ENABLE=radeonsi:0;1,nouveau:0,iris
<karolherbst> to select devices
<karolherbst> without : it enables all
<alyssa> uhhh okay
<karolherbst> with just : it disables all
<karolherbst> yeah.. somebody asked for that as a separate feature, but I thought I could just combine it into one env var
<karolherbst> should also help with zink and native not claiming the same device
<karolherbst> sooo.. zink is using PIPE_MAP_DISCARD_RANGE.. and also sets UNSYNCHRONIZED, DISCARD_WHOLE_RESOURCE.. but it's still ending up with a direct map_resource thing
djbw has quit [Read error: Connection reset by peer]
<zmike> hm
<zmike> I don't have a great explanation then
<karolherbst> maybe MAP_ONCE changes things...
<zmike> subdata should be fine on its own
<karolherbst> yeah... dunno
<karolherbst> doesn't seem to be though
<zmike> I can only speculate that the flush+fence is forcing mapped memory invalidation/flush
<karolherbst> mhhh... yeah, let me debug a bit further
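For context on the speculation above: on non-coherent Vulkan memory the CPU has to invalidate a mapped range before reading data the GPU wrote, which is the kind of thing a flush+fence path can end up forcing. A minimal sketch of that Vulkan-level step (handles, offsets, and alignment handling are placeholders):

```c
#include <vulkan/vulkan.h>

/* Sketch: before the CPU reads GPU-written data through a mapping of
 * non-coherent memory, the range has to be invalidated.  In practice offset
 * and size must respect nonCoherentAtomSize. */
static void
invalidate_before_cpu_read(VkDevice dev, VkDeviceMemory mem,
                           VkDeviceSize offset, VkDeviceSize size)
{
   const VkMappedMemoryRange range = {
      .sType = VK_STRUCTURE_TYPE_MAPPED_MEMORY_RANGE,
      .memory = mem,
      .offset = offset,
      .size = size,   /* or VK_WHOLE_SIZE */
   };
   vkInvalidateMappedMemoryRanges(dev, 1, &range);
}
```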
djbw has joined #dri-devel
pcercuei has quit [Quit: dodo]
<alyssa> lowering away seamless cubemap gathers sounds intense ...
* zmike closes irc for the weekend
<alyssa> hf
MajorBiscuit has joined #dri-devel
<karolherbst> ahhhhhhhh
<karolherbst> :(
<karolherbst> that's the second subdata I was seeing and I started to think I am crazy because gdb never actually hit a second one....
<karolherbst> yo, how evil is that
Haaninjo has quit [Quit: Ex-Chat]
sarnex has quit [Quit: Quit]
DemiMarieObenour[m] is now known as DemiMarie
MajorBiscuit has quit [Ping timeout: 480 seconds]
sarnex has joined #dri-devel
oneforall2 has quit [Read error: Connection reset by peer]