#dri-devel on 2022-10-15 — irc logs at oftc.irclog.whitequark.org

2022-08-14 19:45 ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar

00:01 JohnnyonFlame has joined #dri-devel

00:01 <jenatali> Pretty sure you can do anything with a specialization constant

00:02 nchery_ has joined #dri-devel

00:02 <zmike> there are some values that must be actual constants

00:03 <jenatali> Just means you need to run the specialization pass from SPIRV-Tools before sending it to the Vulkan driver

00:05 mbrost__ has quit [Ping timeout: 480 seconds]

00:05 <zmike> 😬

00:05 <zmike> at that point may as well just do normal pipeline variants

00:06 mbrost__ has joined #dri-devel

00:06 <karolherbst> yeah... shared mem is terrible.. isn't there a vulkan extension or something?

00:06 <zmike> could be

00:06 <zmike> check the registry

00:08 nchery has quit [Ping timeout: 480 seconds]

00:08 <karolherbst> doesn't seem to exist

00:11 <karolherbst> heh.. but it's not shared mem why luxmark kills the GPU anyway

00:11 <zmike> for now maybe just detect whether you're on zink and create different compute shaders for each variable size by setting static size as the total size

00:12 mbrost_ has joined #dri-devel

00:12 <karolherbst> uhhhhhh

00:12 <zmike> unless you want to plug in spec constant / pipeline variants

00:12 <karolherbst> you know that this value is completely variable?

00:12 <zmike> yep

00:12 <karolherbst> not creating 65000 shaders

00:12 <zmike> well if you want to do other stuff this weekend...

00:12 <zmike> I'm saying I can do the variants next week

00:13 <zmike> but this would work as a temporary solution

00:13 <karolherbst> I think I try to fix why zink trashes the GPU context instead :)

00:13 <zmike> 🤷

00:13 <karolherbst> uhm.. figure out

00:14 mbrost__ has quit [Ping timeout: 480 seconds]

00:15 <karolherbst> still works on anv though

00:16 <zmike> are you using VVL at all?

00:16 <karolherbst> VVL?

00:16 <zmike> https://github.com/KhronosGroup/Vulkan-ValidationLayers

00:16 <karolherbst> ahh... right.. I used to use them, but it was fine.. maybe it's not with radv

00:16 <zmike> if you have it installed you can use ZINK_DEBUG=validation

00:16 <karolherbst> but maybe I should just in case

00:20 mbrost_ has quit [Ping timeout: 480 seconds]

00:20 <karolherbst> mhh.. it's not printing anything, must be perfect code then

00:24 <karolherbst> ehh.. I'll just run the cts on anv and see what's the difference

00:26 co1umbarius has joined #dri-devel

00:28 columbarius has quit [Ping timeout: 480 seconds]

00:45 <jenatali> karolherbst: fwiw we'd need variants with shared memory size too, if you wanted to do it in rusticl so any drivers that need it don't have to do it

00:46 <karolherbst> jenatali: I kind of plan on splitting it out so backends can deal with it however they wanted to

00:46 <karolherbst> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18581

00:46 <karolherbst> create_compute_state takes the static amount and launch_grid gets a variable_shared_mem parameter

00:47 <jenatali> Fair. I'd use that for a variant then

00:47 <karolherbst> yeah, that's more or less the idea then

00:47 <karolherbst> but I wanted the CSO to be like that so I can create it early and all kernel launches reuse the same one

00:47 <karolherbst> which I need for workgroupinfo as well

00:48 <jenatali> Makes sense

00:50 <karolherbst> ehh.. running with anv ooms my system, but it passes a lot more tests: "Pass 1613 Fails 41 Crashes 355 Timeouts 0: 82%|"

00:57 alanc has quit [Remote host closed the connection]

00:57 alanc has joined #dri-devel

00:57 ybogdano has quit [Ping timeout: 480 seconds]

00:58 jolan has quit [Remote host closed the connection]

00:59 jolan has joined #dri-devel

00:59 <karolherbst> uhhh.. it's the kernel running OOM.. strange

01:00 jolan has quit []

01:00 <karolherbst> or rather... no, I think it's file memory

01:00 <karolherbst> probably some huge spirv being generated

01:01 jolan has joined #dri-devel

01:17 <karolherbst> found it

01:17 <karolherbst> "./build/test_conformance/multiple_device_context/test_multiples" :(

01:20 Emantor has quit [Quit: ZNC - http://znc.in]

01:20 Emantor has joined #dri-devel

01:21 <jenatali> Ooh yeah that one is fun

01:21 <karolherbst> I suspect zink is just allocating tons of stuff

01:21 <jenatali> karolherbst: can you ever have more than one device enumerated at a time?

01:21 lemonzest has quit [Quit: WeeChat 3.6]

01:21 <karolherbst> at least my 32GB machine isn't enough

01:21 <karolherbst> jenatali: what do you mean?

01:22 <jenatali> I assume rusticl is a platform, which can then enumerate devices?

01:22 <karolherbst> sure

01:22 <karolherbst> and multiple devices just work

01:22 <karolherbst> but that's not hte point

01:22 <karolherbst> there is just one device...

01:22 <jenatali> And then when you create a context, you can pick one or more devices to use

01:22 <karolherbst> sure

01:22 <karolherbst> that all works

01:22 <jenatali> Ok I just wasn't sure if you handled the multi-device with a single context case

01:22 <jenatali> Cause that was a huge pain for me

01:23 <karolherbst> it works if it's not zink at least

01:23 alyssa has joined #dri-devel

01:23 <alyssa> vec4 32 ssa_4 = intrinsic load_interpolated_input (ssa_3, ssa_2) (base=1, component=0, dest_type=invalid /*0*/, io location=33 slots=1 /*161*/)/* v_colorBase */

01:23 <karolherbst> but maybe the test is doing something super silly? like using the same device multiple times?

01:23 <alyssa> type=invalid? thinking

01:25 <zmike> karolherbst: massif?

01:26 <karolherbst> zmike: mhh?

01:26 <zmike> to see where the allocation is

01:26 <karolherbst> not in userspace

01:27 <zmike> ah

01:27 <karolherbst> the process is using 1.0% of mem :(

01:27 <karolherbst> probably some GPU buffers or something? I dunno

01:28 <zmike> if you're on amd you can use radeontop or nvtop to see vram utilization in realtime

01:28 <karolherbst> it's anv sadly

01:28 <karolherbst> heh on radv that test just works

01:28 <zmike> find a 🍺 then

01:28 <karolherbst> uses like 100MB of memory

01:28 <alyssa> one line fix nvm

01:28 <karolherbst> "Successfully created all 200 contexts." uhhhhh

01:29 <zmike> man-blinking.gif

01:30 <HdkR> Does it actually run out of memory or does it run in to the 65k VMA region limit? :D

01:30 <karolherbst> I guess if a context creates more then 200MB of stuff I can see it running out of memory :D

01:30 <karolherbst> HdkR: actual physical memory

01:30 <karolherbst> but on the GPU side I think

01:30 <HdkR> Very fancy

01:32 <karolherbst> ehhh... but what are those 200 contexts anyway?....

01:32 <karolherbst> I am sure I only create one screen...

01:32 <karolherbst> and a context is per queue...

01:33 <karolherbst> *pipe_context

01:34 <karolherbst> ohh.. it actually creates queues and kernels and programs per cl_context

01:34 <karolherbst> zmike: what's the vk type a pipe_context maps to? vkDevice? vkQueue?

01:34 <zmike> there isn't one

01:34 <karolherbst> anyway.. sounds like an anv bug to me :)

01:34 <zmike> does indeed

01:36 lyudess has joined #dri-devel

01:37 <jenatali> karolherbst: When you run on zink, which / how many devices do you show?

01:37 <karolherbst> just one

01:37 <karolherbst> whatever is the active/first vulkan device

01:37 Lyude has quit [Read error: Connection reset by peer]

01:37 <karolherbst> zink can't be loaded multiple times yet afaik

01:38 <jenatali> I see

01:38 <karolherbst> I am sure not because of zink, just because there is no "iterate all render nodes" thing

01:38 <karolherbst> anyway.. on radv it works

01:38 <karolherbst> on anv it OOMs

01:39 <zmike> you could probably do it by changing the icd env between loads

01:39 <karolherbst> heh...

01:39 <karolherbst> what a dirty hack, but I like it

01:43 kem has quit [Ping timeout: 480 seconds]

01:44 khfeng has joined #dri-devel

01:48 khfeng has quit []

01:48 khfeng has joined #dri-devel

01:53 kem has joined #dri-devel

02:02 <illwieckz> dirty hacks need love too

02:03 <karolherbst> yo, but that is seriously dirty as I'd have to parse env vars and locations and all of that

02:06 <zmike> I guess maybe I should try adding fallback in zink to try progressively iterating through devices if the first one fails? 🤔

02:07 <karolherbst> zmike: I need something like pipe_loader_probe just for vulkan

02:08 <zmike> what is the end goal here

02:08 <karolherbst> being able to load multiple devices through zink

02:08 <zmike> yeah but do you care what you're loading or you're okay with just loading anything

02:09 <karolherbst> I have to load _all_ devices

02:09 <zmike> yeah so probably just adding fallback handling in zink would be fine

02:09 <karolherbst> though at some point I want to either use the gallium native or the zink one though

02:09 <karolherbst> so maybe we need a pipe_loader_probe which falls back to zink for the specific render node instead

02:09 <karolherbst> or something

02:10 heat has quit [Ping timeout: 480 seconds]

02:11 <zmike> hm

02:14 <jenatali> That's how we handle our D3D layered impls

02:16 <jenatali> Would be nice if we could do that for the CL/GL/VK layered impls too, but for now we just put all devices and just put them after the native impls

02:19 alyssa has left #dri-devel [#dri-devel]

02:33 YuGiOhJCJ has joined #dri-devel

02:52 YuGiOhJCJ has quit [Quit: YuGiOhJCJ]

02:58 jagan_ has quit [Remote host closed the connection]

03:03 sumits has quit [Quit: ZNC - http://znc.in]

03:03 andrey-konovalov has quit []

03:11 sumits has joined #dri-devel

03:14 andrey-konovalov has joined #dri-devel

03:30 ella-0_ has joined #dri-devel

03:33 ella-0 has quit [Read error: Connection reset by peer]

03:46 Leopold_ has quit [Remote host closed the connection]

04:08 srslypascal is now known as Guest3138

04:08 srslypascal has joined #dri-devel

04:15 Guest3138 has quit [Ping timeout: 480 seconds]

04:29 Lyude has joined #dri-devel

04:30 lyudess has quit [Ping timeout: 480 seconds]

05:20 fab has joined #dri-devel

05:27 cef has quit [Quit: Zoom!]

05:43 fxkamd has quit []

05:46 samueldr_ has left #dri-devel [#dri-devel]

06:00 sdutt has quit [Read error: Connection reset by peer]

06:08 lygstate has quit [Remote host closed the connection]

06:23 kts has joined #dri-devel

06:42 rasterman has joined #dri-devel

07:02 Duke`` has joined #dri-devel

07:09 danvet has joined #dri-devel

07:23 djbw has quit [Read error: Connection reset by peer]

07:34 srslypascal is now known as Guest3143

07:34 srslypascal has joined #dri-devel

07:36 Guest3143 has quit [Ping timeout: 480 seconds]

07:40 kts has quit [Quit: Leaving]

07:53 kts has joined #dri-devel

08:13 cef has joined #dri-devel

08:22 jernej_ is now known as jernej

08:27 kem has quit [Ping timeout: 480 seconds]

08:37 kem has joined #dri-devel

08:51 JohnnyonFlame has quit [Ping timeout: 480 seconds]

09:02 ella-0_ is now known as ella-0

09:04 ella-0 has left #dri-devel [#dri-devel]

09:06 ella-0 has joined #dri-devel

09:09 chipxxx has quit [Remote host closed the connection]

09:10 chipxxx has joined #dri-devel

09:45 co1umbarius has quit [Read error: No route to host]

09:50 TMM has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]

09:50 TMM has joined #dri-devel

09:53 srslypascal is now known as Guest3150

09:53 srslypascal has joined #dri-devel

09:53 lemonzest has joined #dri-devel

09:58 Guest3150 has quit [Ping timeout: 480 seconds]

09:59 srslypascal has quit [Quit: Leaving]

10:08 srslypascal has joined #dri-devel

10:12 Thymo_ has quit [Ping timeout: 480 seconds]

10:21 JohnnyonFlame has joined #dri-devel

10:38 i-garrison has quit []

10:40 morphis has quit [Remote host closed the connection]

10:40 morphis has joined #dri-devel

10:44 gouchi has joined #dri-devel

10:46 fab has quit [Quit: fab]

10:46 fab_ has joined #dri-devel

10:47 fab_ is now known as Guest3153

10:52 morphis has quit [Remote host closed the connection]

10:53 morphis has joined #dri-devel

10:53 morphis has quit [Remote host closed the connection]

10:55 carbonfiber has joined #dri-devel

10:57 i-garrison has joined #dri-devel

10:58 morphis has joined #dri-devel

10:59 Duke`` has quit []

11:01 Duke`` has joined #dri-devel

11:06 kts has quit [Quit: Leaving]

11:06 morphis has quit [Ping timeout: 480 seconds]

11:09 JohnnyonFlame has quit [Ping timeout: 480 seconds]

11:12 kts has joined #dri-devel

11:13 Jeremy_Rand_Talos_ has quit [Remote host closed the connection]

11:14 Jeremy_Rand_Talos_ has joined #dri-devel

11:18 morphis has joined #dri-devel

11:21 Thymo has joined #dri-devel

11:38 gouchi has quit [Remote host closed the connection]

11:39 illwieckz has quit [Remote host closed the connection]

11:39 pcercuei has joined #dri-devel

11:45 morphis has quit [Remote host closed the connection]

11:48 illwieckz has joined #dri-devel

12:08 TMM has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]

12:08 TMM has joined #dri-devel

12:20 Leopold has joined #dri-devel

12:48 srslypascal is now known as Guest3157

12:48 srslypascal has joined #dri-devel

12:53 Guest3157 has quit [Ping timeout: 480 seconds]

13:06 srslypascal is now known as Guest3158

13:06 srslypascal has joined #dri-devel

13:06 Haaninjo has joined #dri-devel

13:08 srslypascal has quit [Remote host closed the connection]

13:09 srslypascal has joined #dri-devel

13:11 srslypascal is now known as Guest3159

13:11 srslypascal has joined #dri-devel

13:12 Guest3158 has quit [Ping timeout: 480 seconds]

13:14 lunarequest has joined #dri-devel

13:17 Guest3159 has quit [Ping timeout: 480 seconds]

13:19 srslypascal is now known as Guest3160

13:19 srslypascal has joined #dri-devel

13:20 heat has joined #dri-devel

13:23 srslypascal is now known as Guest3161

13:23 srslypascal has joined #dri-devel

13:24 lunarequest has left #dri-devel [#dri-devel]

13:25 srslypascal is now known as Guest3162

13:25 srslypascal has joined #dri-devel

13:25 Guest3160 has quit [Ping timeout: 480 seconds]

13:27 lunarequest has joined #dri-devel

13:29 Guest3161 has quit [Ping timeout: 480 seconds]

13:30 srslypascal has quit [Remote host closed the connection]

13:30 srslypascal has joined #dri-devel

13:31 Guest3162 has quit [Ping timeout: 480 seconds]

13:31 srslypascal has quit [Remote host closed the connection]

13:34 kts has quit [Quit: Leaving]

13:35 nullrequest has joined #dri-devel

13:36 srslypascal has joined #dri-devel

13:39 lunarequest has quit [Ping timeout: 480 seconds]

13:42 srslypascal has quit [Remote host closed the connection]

13:45 JoshuaAshton has quit [Ping timeout: 480 seconds]

13:47 gawin has joined #dri-devel

13:54 kts has joined #dri-devel

13:55 nullrequest has quit [Ping timeout: 480 seconds]

13:56 srslypascal has joined #dri-devel

14:05 srslypascal is now known as Guest3165

14:05 srslypascal has joined #dri-devel

14:11 Guest3165 has quit [Ping timeout: 480 seconds]

14:22 nullrequest has joined #dri-devel

14:24 nullrequest has left #dri-devel [#dri-devel]

14:26 srslypascal is now known as Guest3167

14:26 srslypascal has joined #dri-devel

14:32 Guest3167 has quit [Ping timeout: 480 seconds]

14:37 srslypascal is now known as Guest3168

14:37 srslypascal has joined #dri-devel

14:41 gouchi has joined #dri-devel

14:41 srslypascal has quit [Remote host closed the connection]

14:42 srslypascal has joined #dri-devel

14:44 Guest3168 has quit [Ping timeout: 480 seconds]

14:48 Company has quit [Quit: Leaving]

14:49 gawin has quit [Ping timeout: 480 seconds]

14:49 kem has quit [Ping timeout: 480 seconds]

14:58 kem has joined #dri-devel

15:01 sarnex has quit [Read error: Connection reset by peer]

15:09 sarnex has joined #dri-devel

15:32 chipxxx has quit [Remote host closed the connection]

15:43 MajorBiscuit has joined #dri-devel

15:50 srslypascal is now known as Guest3173

15:50 srslypascal has joined #dri-devel

15:54 Guest3173 has quit [Ping timeout: 480 seconds]

16:00 chipxxx has joined #dri-devel

16:10 Compy has joined #dri-devel

16:14 djbw has joined #dri-devel

16:20 srslypascal is now known as Guest3175

16:20 srslypascal has joined #dri-devel

16:21 Compy has left #dri-devel [#dri-devel]

16:22 Compy_ has joined #dri-devel

16:22 <Compy_> Good morning all. I'm working on a baseline image targeting an Allwinner H2, MALI GPU (buildroot, linux 5.10.47). I've installed mesa3d (21.1.8) and the gallium lima driver, opengl EGL/ES and libdrm (2.4.107). Linux modules for KMSDRM are compiled in, and I see kmsg saying that lima has loaded/been detected successfully, however I can't get any KMSDRM devices in things like SDL2. From what I can tell, the

16:22 <Compy_> KMSDRM_drmModeGetResources call is failing (the underlying IOCTL call). Any idea where I can start to look at this?

16:24 Guest3175 has quit [Ping timeout: 480 seconds]

16:31 Lucretia has quit []

16:36 Lucretia has joined #dri-devel

16:37 srslypascal is now known as Guest3177

16:37 srslypascal has joined #dri-devel

16:42 Guest3177 has quit [Ping timeout: 480 seconds]

16:43 kts has quit [Quit: Leaving]

16:47 MajorBiscuit has quit [Quit: WeeChat 3.5]

16:52 <pinchartl> Compy_: my usual approach to this kind of problem is to trace it in the kernel. there are debugging options you can enable in the DRM/KMS core to output debug messages to the kernel log, tracing calls from userspace. if that's not enough, I usually add printk() statements to the code paths to locate the source of the error. this debugging strategy may be too influenced by a lifetime of kernel

16:52 <pinchartl> development though :-)

16:53 <Compy_> Hahah, I'm doing the same thing with printk() statements. I'll see if I can up that debug verbosity in the DRM/KMS core. Thanks a ton for the response to my semi-vague situation. Always appreciated :)

16:57 srslypascal is now known as Guest3178

16:57 srslypascal has joined #dri-devel

16:57 <pinchartl> I'll have to do something similar to get GL-accelerated compositio on an i.MX8MP (with etnaviv, not lima), so I sympathize :-)

16:58 JohnnyonFlame has joined #dri-devel

17:02 Guest3178 has quit [Ping timeout: 480 seconds]

17:18 srslypascal is now known as Guest3183

17:18 srslypascal has joined #dri-devel

17:22 Guest3183 has quit [Ping timeout: 480 seconds]

17:23 srslypascal is now known as Guest3185

17:23 srslypascal has joined #dri-devel

17:27 Guest3185 has quit [Ping timeout: 480 seconds]

17:44 heat has quit [Ping timeout: 480 seconds]

18:09 JoshuaAshton has joined #dri-devel

18:12 columbarius has joined #dri-devel

18:36 alyssa has joined #dri-devel

18:36 <alyssa> I'm seeing incorrect rendering of the in-game stars in Neverball on upstream Mesa.

18:36 <alyssa> (upstream issue https://gitlab.freedesktop.org/mesa/mesa/-/issues/7502 )

18:36 <alyssa> I've bisected it to a change in the Mesa frontend. I would like to know if other drivers are affected.

18:37 <alyssa> To reproduce, open any level in neverball and collect a coin.

18:38 <alyssa> I would appreciate if someone who uses Mesa on something other than Mali could test and report results :-)

18:38 <alyssa> (And if affected, that reverting the guilty commit fixes it for you too.)

18:45 <alyssa> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19090 -- this one line insertion fixes the issue for me on Mali-T860

18:45 <alyssa> I expect all drivers are affected and all drivers need that fix

18:45 <alyssa> But apparently I play more Neverball than you guys ;-p

19:06 Namarrgon has quit [Quit: WeeChat 3.6]

19:06 Namarrgon has joined #dri-devel

19:07 Namarrgon has quit []

19:09 Namarrgon has joined #dri-devel

19:25 <karolherbst> alyssa: mind checking if "./build/test_conformance/buffers/test_buffers buffer_map_read_uint" fails randomly without the sync hacks you were talking about? I see it on zink and was thinking if it's the same thing you were seeing, because I can also fix it by hard syncing stuff

19:28 <alyssa> I am not in an OpenCL branch right now so I can

19:28 <alyssa> 't test easily

19:28 <alyssa> but ... everything was failing randomly, even early on in test_basic

19:29 <karolherbst> ohh, I see

19:29 <karolherbst> yeah,that's something else then

19:29 <karolherbst> though that might be fixed with my MRs merged now

19:30 <karolherbst> mapping of memory was bonkers and literally only worked on iris perfectly

19:36 <alyssa> Delightful

19:49 <karolherbst> yeah... but the new code should be much better.. at least it works on radeonsi and nouveau without issues

19:49 <karolherbst> but still something is wrong with mapping memory on zink... :/

19:50 sdutt has joined #dri-devel

19:59 <alyssa> nod

19:59 <alyssa> I have some advice for anyone considering writing a compiler for a VLIW vec4 GPU

19:59 <alyssa> Don't

20:00 <karolherbst> +1

20:00 <karolherbst> the apple stuff is like that, no?

20:00 <alyssa> Not at all!

20:01 <alyssa> AGX is pure scalar, dynamic scheduling, basically like a CPU except for control flow

20:01 <karolherbst> ohh.. then I confused it with something else

20:01 <karolherbst> or was it just that encoding is variable it size?

20:02 <alyssa> Yeah

20:02 <alyssa> that's just encoding though

20:02 <karolherbst> right... that I mistook that for something else

20:02 <karolherbst> *Then

20:02 <alyssa> (and means that AGX programs are much smaller than any Mali on avg)

20:03 <karolherbst> I am actually curious if that is a benefit or drawback overall

20:03 pcercuei has quit [Read error: Connection reset by peer]

20:03 <alyssa> which?

20:03 pcercuei has joined #dri-devel

20:04 <karolherbst> variable length

20:04 <alyssa> Oh.. \shrug/

20:04 <alyssa> Definitely good for icache use

20:04 <alyssa> Makes no difference to sw

20:04 <alyssa> No idea how much complexity it adds to the decoder though

20:04 <karolherbst> yeah.. you probably need smaller caches, but nvidia was like: let's just go with 128b fixed length and yolo it

20:05 <karolherbst> and it's very very wasteful overall

20:05 <karolherbst> but maybe they just put bigger caches

20:12 <RSpliet> reckon the icache has partially-decoded ops or like micro-ops inside?

20:13 <RSpliet> Not that you'd be able to tell from the outside :-)

20:31 dv_ has quit [Ping timeout: 480 seconds]

20:31 gouchi has quit [Remote host closed the connection]

20:34 <karolherbst> zmike: what's the thing I have to do for zink to make sure an operation on one pipe_context is actually visibile in another one... e.g. a buffer_subdata

20:34 dv_ has joined #dri-devel

20:36 <zmike> karolherbst: depends on the type of buffer? if you actually get a direct mapping for it then it'll be visible immediately

20:37 <zmike> but if it's vram then you'll get a staging slice, so you have to unmap/flush

20:37 <zmike> after that the buffer should automatically synchronize other usage

20:37 <karolherbst> mhhh, yeah so it's seems like the only case where it's always doing the correct thing is if I use a userptr/hostptr

20:37 <karolherbst> mhh

20:37 <karolherbst> something is very broken with all the memory consistency stuff

20:37 <zmike> are you flushing the transfer map?

20:38 <karolherbst> nope

20:38 <karolherbst> just unmapping them

20:38 <karolherbst> which I guess should be good enough, no?

20:38 <zmike> that should be sufficient, though if you're using cross-context like that you also have to flush the context

20:38 <karolherbst> yeah.. I always flush the helper context I use for data uploads

20:39 <zmike> I'd expect that to be sufficient

20:40 <zmike> can you describe to me your usage in terms of gallium calls

20:40 <karolherbst> yeah.. it's weird

20:40 <zmike> are you actually doing subdata or are you manually map/unmap

20:40 <zmike> and what is the other usage for the buffer

20:41 <karolherbst> subdata for initial data uploads, then a resource_copy + unsynchronized maps/unmaps or shadow reasoures... I think there are multiple bugs/issues

20:42 <karolherbst> the shadow stuff seems to work reliably though

20:42 <karolherbst> so I suspect something being odd with unsynchronized maps

20:42 <karolherbst> and the tests cases with initial uploads are also flaky

20:42 <zmike> it sounds like you probably want to be using the threaded context replace buffer hook?

20:43 <karolherbst> mhh.. how would that look like?

20:44 <zmike> create buffer A, subdata, bind as descriptor or whatever, dispatch compute, create buffer B with same size, subdata, replace storage (src=B, dst=A), dispatch compute, ...

20:44 <zmike> ah, the issue might be with rebinds actually

20:44 <karolherbst> not doing any compute launches though

20:45 <karolherbst> that's all just plain copies

20:45 <zmike> step through zink_buffer_map and see if it's discarding

20:45 <zmike> that might be altering the behavior you're expecting

20:45 <zmike> though it should be the same behavior as radeonsi

20:45 <karolherbst> yeah... I'll check

20:46 <zmike> I realize now that I don't think I hooked up anything for rebinding a buffer that's used as a global bind

20:46 <zmike> so if that happens everything's fucked

20:46 <karolherbst> it's not getting bound

20:46 <zmike> just saying

20:46 <karolherbst> there is really no kernel involved here

20:46 <karolherbst> k

20:47 <zmike> you still haven't told me what you're actually doing so I'm just saying things as I think of them

20:47 <karolherbst> " subdata for initial data uploads, then a resource_copy + unsynchronized maps/unmaps or shadow reasoures" that's really all

20:48 <zmike> I mean in terms of the exact command stream

20:48 <zmike> GALLIUM_TRACE would've been great here if it worked

20:48 <karolherbst> what would I need to do to hook it up?

20:48 <zmike> there's some debug_wrap thing

20:50 <zmike> inline_debug_helper.h

20:50 <zmike> debug_screen_wrap

20:52 <zmike> with that working you should be able to do GALLIUM_TRACE=t.xml <exe> and have it dump a huge xml thing that you can then use like `src/gallium/tools/trace/dump.py -N -p t.xml > dump`

20:52 <karolherbst> it seems to be doing that automatically, but it's not setting less callbacks

20:52 <karolherbst> ohh wait.. I guess I have to unwrap it

20:52 <zmike> ?

20:52 <zmike> the point is that you use the wrapped screen+context from rusticl

20:53 <zmike> and it'll trace everything you do

20:53 <karolherbst> sure, but that trace context is already created by something else

20:53 <karolherbst> mhh, let me check the code, maybe I miss something here

20:55 <karolherbst> ahhh...

20:55 <karolherbst> it's set_global_binding

20:55 <karolherbst> doesn't have a debug variant

20:55 <zmike> yea so probably add that to tr_context.c

21:03 Duke`` has quit [Ping timeout: 481 seconds]

21:05 JoshuaAshton has quit [Ping timeout: 480 seconds]

21:06 <karolherbst> zmike: https://gist.githubusercontent.com/karolherbst/b70edc189f4401aaff8485ddebcd830d/raw/98f465a0d6ab3a1851cfa7c3e10c8c5d67d624a5/gistfile1.txt

21:06 <zmike> ergh I gotta paste you my filter command sometime

21:07 <karolherbst> heh.. seems like the test indeed launches kernels.. strange and annoying

21:08 <zmike> sed -i '/set_constant_buffer/d;/bind_sampler/d;/get_param/d;/get_shader_param/d;/is_format_supported/d;/get_compute_param/d;/get_compiler_options/d;/get_disk_shader_cache/d;/get_name/d;/get_vendor/d;/resource_bind_backing/d;/allocate_memory/d;/free_memory/d;/set_prediction_mode/d;/fence_reference/d;/delete_sampler_state/d' dump

21:08 <karolherbst> https://gist.githubusercontent.com/karolherbst/b70edc189f4401aaff8485ddebcd830d/raw/52df4aeffc628bc8733793e9e42bbf73e74ccd62/gistfile1.txt

21:09 <zmike> yea I mean I have it locally

21:09 <zmike> just sending so you have it and can use it

21:09 <karolherbst> ... figures

21:09 <zmike> so where's the issue happening

21:09 <zmike> lot of resource_copy_region calls

21:09 <karolherbst> yeah...

21:10 <zmike> hm in the pruned version I see L215 for example you're mapping resource_12 on pipe_1

21:10 <karolherbst> hard to say, the application basically just maps memory and checks if it has the correct values, but it does map asynchronous which is a huge pain

21:10 <zmike> then doing resource_copy_region on pipe_2 with resource_12 as dst

21:11 <zmike> or no misread

21:11 <karolherbst> yeah.. pipe_1 is the helper context dealing with random stuff I don't have a cl_queue for

21:11 <zmike> the copy is a little later, not immediate

21:11 <karolherbst> the thing is...

21:11 <karolherbst> the first buffer_map you see is already returning the pointer the application checks

21:12 <karolherbst> and it checks it's containing the stuff happening after the map

21:12 <karolherbst> until the flush+fence_finish stuff

21:12 <karolherbst> so everything between resource_creates is more or less one test case

21:13 <karolherbst> I do some shadow resources, so sometimes it doens't add up

21:13 <zmike> can you just prune it down to one failing case?

21:13 <zmike> it would be easier to know exactly what's happening that way

21:13 <karolherbst> ohh.. I could let it crash on the first failing test

21:13 <karolherbst> that should work

21:13 <zmike> (i.e., edit the test to run only one case that fails)

21:14 <karolherbst> https://gist.githubusercontent.com/karolherbst/b70edc189f4401aaff8485ddebcd830d/raw/ad1c07635a064baf11c122ac136c58e41ec859dc/gistfile1.txt

21:14 <karolherbst> one passing and one failing test

21:14 <karolherbst> should be obvious where the second one starts

21:14 <karolherbst> (after the flushing)

21:15 Jeremy_Rand_Talos__ has joined #dri-devel

21:16 Jeremy_Rand_Talos_ has quit [Remote host closed the connection]

21:16 <zmike> so...it's creating a resource, subdata, flush, map, set_global_binding, and then it fails?

21:16 <zmike> I assume the kernel is writing to it?

21:16 <karolherbst> yeah.. I think so

21:16 <karolherbst> yep

21:17 <zmike> ok and can you verify whether this is a BAR allocation that zink is doing?

21:17 <zmike> ie. when it maps is it directly mapping

21:18 <zmike> because you're creating it with usage=0, which means it should be attempting BAR

21:18 <zmike> which should be host-visible

21:19 Jeremy_Rand_Talos__ has quit [Remote host closed the connection]

21:19 Jeremy_Rand_Talos__ has joined #dri-devel

21:20 <zmike> ahhhh

21:20 <zmike> you're not creating it as persistent!

21:20 <karolherbst> because it isn't

21:20 <zmike> the mapping?

21:20 <zmike> it sure seems to be

21:21 <karolherbst> it's DIRECTLY | UNSYNCHRONIZED

21:21 <zmike> that's not enough if you're expecting to read data out of it across a dispatch

21:21 <karolherbst> I don't, I want the current data, that's all

21:21 <zmike> but you're running a kernel that writes to it

21:21 <karolherbst> the application is responsible for syncing

21:21 <karolherbst> the application checks after the flushes happen

21:21 <karolherbst> I just have to map very very early

21:22 <zmike> I don't see anything that would cause this to sync?

21:22 <zmike> there's vulkan calls that should be happening here

21:22 <karolherbst> the flush and fence_finish?

21:22 <zmike> yeah but I think you still need to flush/invalidate teh mapped memory range

21:22 <karolherbst> mhhh.. that might be

21:22 <zmike> which is why you need persistent

21:22 <zmike> or at least transfer_flush_region

21:22 <karolherbst> but I don't want persistent

21:23 <karolherbst> yeah.. I guess I have to use transfer_flush_region

21:23 <karolherbst> but that's all only for WRITE access and not READ :/

21:23 <karolherbst> it's very annoying

21:23 <karolherbst> but it might not matter

21:24 <karolherbst> so how I understand transfer_flush_region is that when a resource is created with FLUSH_EXPLICIT that cached data is only _written_ back on transfer_flush_region

21:25 <zmike> well you can see if this fixes it by just setting persistent on the map

21:25 <karolherbst> anyway.. not using "DIRECTLY | UNSYNCHRONIZED" on non UMA systems, so RADV should get its shadow buffer and that should work all fine regardless

21:25 <zmike> and nothing else

21:25 <karolherbst> sure, but I absolutely don't want to set persistent

21:25 <zmike> nothing else extra*

21:25 <zmike> yeah yeah but for testing

21:25 <karolherbst> just the map or also the resource?

21:25 <zmike> map

21:26 <karolherbst> still fails

21:26 <zmike> hm

21:27 <zmike> what's the memory_barrier call there for?

21:27 <karolherbst> just something clover was doing after launch_grid and I copied it over

21:27 <karolherbst> might not be needed or might be.. dunno

21:28 <zmike> you only need that if you're synchronizing between gpu operations

21:28 <zmike> which this doesn't appear to be doing

21:28 <zmike> not that it's harmful

21:28 <karolherbst> yeah.. perf opts are for later and stuff

21:28 <zmike> hm

21:28 <zmike> not really a perf opt, zink will no-op it anyway

21:28 <karolherbst> ahh

21:29 <zmike> definitely looks like something weird happening

21:29 <karolherbst> so the only real requierement I want to have is that after flush + fence_finish I want the current data to be visible on the mapped ptrs

21:29 <karolherbst> and normally I only try to map directly when I know it's safe

21:30 <karolherbst> like on UMA systems

21:30 <zmike> yeah the only thing I can think of is that you're not getting a real map of the buffer somehow

21:30 <karolherbst> don't do it on discrete GPUs e.g.

21:30 <karolherbst> might be

21:30 <karolherbst> but zink doesn't seem to shadow it either

21:30 <zmike> but if you've stepped through zink_buffer_map you can check that pretty easily

21:31 <zmike> have you tried mapping with just PIPE_MAP_READ and not also PIPE_MAP_WRITE

21:31 <karolherbst> yeah.. it calls map_resource on the res directly

21:31 <karolherbst> huh.. I could try that, but it does look like it gives me the real deal directly

21:31 <zmike> alright

21:31 <karolherbst> so

21:32 <karolherbst> I know a way of fixing

21:32 <karolherbst> it

21:32 <karolherbst> if I just stall the pipeline the tests are passing

21:32 <zmike> not sure then, I'd probably have to look at it

21:32 <zmike> stall the pipeline?

21:32 <karolherbst> like sleeping in my worker thread on the second context

21:32 mhenning has joined #dri-devel

21:32 <karolherbst> helper context stuff happens on the application thread, actual context stuff (cl_queue) happens inside a worker thread

21:33 <karolherbst> so probably just a sync issue between the two contexts or something?

21:33 <zmike> that's incredibly bizarre

21:33 <karolherbst> I could sleep and make another trace

21:33 <karolherbst> maybe it looks differente nough to spot something

21:34 <zmike> oh hold up

21:34 <zmike> 539 pipe_context::flush(pipe = pipe_2, flags = 0) = ret_5 // time 225

21:34 <zmike> 540 pipe_context::flush(pipe = pipe_2, flags = 0) = fence_5 // time 2

21:34 <zmike> 543 pipe_screen::fence_finish(screen = screen_3, ctx = NULL, fence = fence_5, timeout = 18446744073709551615) = 1 // time 3

21:34 <zmike> 541 pipe_screen::fence_finish(screen = screen_3, ctx = NULL, fence = fence_4, timeout = 18446744073709551615) = 1 // time 2

21:34 <zmike> ret_5 never gets waited on

21:34 <zmike> why are you flushing twice anyway

21:34 <karolherbst> silly reasons

21:35 <zmike> try waiting on that fence too

21:35 <karolherbst> do I have to wait on all fences or just the last one?

21:35 <karolherbst> okay...

21:35 <zmike> dunno, just brainstorming

21:35 <zmike> could try waiting on every fence after flushing and see what happens

21:35 <zmike> maximum sync

21:37 <karolherbst> heh.. why am I not waiting on the ret_5 fence.. that's like odd...

21:49 <karolherbst> zmike: okay.. so that fixes it indeed

21:49 <karolherbst> I am wondering if something overwrites a fence or something stupid

21:49 <karolherbst> the trace looks odd

21:50 <zmike> the variable names get reused sometimes

21:50 <karolherbst> yeah.. that would be weird here.. I am sure it's something terribly stupid

21:51 Surkow|laptop has quit [Ping timeout: 480 seconds]

21:55 rasterman has quit [Quit: Gettin' stinky!]

21:57 danvet has quit [Ping timeout: 480 seconds]

21:59 Surkow|laptop has joined #dri-devel

22:02 sarnex has quit [Quit: Quit]

22:03 <karolherbst> ohh seems like the process crashed to fast, it actually waited on the fence

22:03 <karolherbst> just in a different order, which shouldn't matter I guess

22:04 carbonfiber has quit [Quit: Connection closed for inactivity]

22:07 lemonzest has quit [Quit: WeeChat 3.6]

22:08 <karolherbst> mhhh.. one trace with broken and working test: https://gist.github.com/karolherbst/808d0d000df3bc6880f89427bf9ce2ae

22:08 <karolherbst> the difference isn't all that huge

22:09 <karolherbst> which really stands out is the wait on fence_4

22:10 <karolherbst> still don't see why that would be my bug...

22:12 <zmike> so what was it that made the test pass? or you just tried with sleep

22:13 <karolherbst> I call fence_finish immediately after flush

22:13 <zmike> did you try incrementally removing them to see which fence was the problem?

22:13 <karolherbst> it can only be one

22:13 <zmike> ?

22:14 <karolherbst> the trace is almost identical, just that one wait moves quite a bit to the bottom

22:14 sarnex has joined #dri-devel

22:14 <zmike> so...the 4th flush?

22:14 <karolherbst> yeah

22:14 <zmike> and if you fence_finish immediately after then it passes?

22:14 <zmike> but like this it fails?

22:14 <karolherbst> yes

22:14 <zmike> hm

22:14 <karolherbst> yeah...

22:15 <karolherbst> zink does some map/unmap dance in buffer_subdata though

22:15 <zmike> yeah

22:15 <zmike> sounds to me like the subdata call isn't synchronizing right

22:15 <karolherbst> probably

22:15 <zmike> did you check to make sure it isn't discarding?

22:15 <alyssa> zmike: shouldn't you be ~~enjoying the weekend~~ working out or something

22:15 <alyssa> no?

22:15 <karolherbst> I didn't check what buffer_Subdata is doing

22:15 <zmike> I was at the gym like 12 hours ago

22:15 <alyssa> great I have an MR for you then thanks

22:16 <karolherbst> :D

22:16 <karolherbst> will look more deeper into buffer_subdata

22:16 <alyssa> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18651

22:16 <karolherbst> ehh.. deeper

22:17 <karolherbst> alyssa: btw, I have something for you: https://gitlab.freedesktop.org/karolherbst/mesa/-/commit/18ef7406840cc467f12407e014f4953d99e4bb26

22:17 <alyssa> glcts-nvi10-valve isn't running so I assume the code works perfectly with zero bugs

22:17 <karolherbst> with that you could drop NIR_SERIALIZED. will probably create the MR later or something

22:18 <alyssa> karolherbst: nice

22:18 <karolherbst> just requires a RUSTICL_ENABLE=panfrost to use CL though then, but I guess that's fair :)

22:18 <karolherbst> and all drivers use the same thing then, which is nice

22:18 <karolherbst> (TODO: need to add it to the doc)

22:19 <alyssa> +1

22:19 <alyssa> what's the : and ; syntax for

22:19 <karolherbst> RUSTICL_ENABLE=radeonsi:0;1,nouveau:0,iris

22:19 <karolherbst> to select devices

22:20 <karolherbst> without : it enables all

22:20 <alyssa> uhhh okay

22:20 <karolherbst> with just : it disables all

22:20 <karolherbst> yeah.. somebody asked for that as a separate feature, but I thought I can just combine it to one env var

22:21 <karolherbst> should also help with zink and native not claiming the same device

22:25 <karolherbst> sooo.. zink is using PIPE_MAP_DISCARD_RANGE.. and also sets UNSYNCHRONIZED, DISCARD_WHOLE_RESOURCE.. but it's still ending up with a direct map_resource thing

22:25 djbw has quit [Read error: Connection reset by peer]

22:26 <zmike> hm

22:26 <zmike> I don't have a great explanation then

22:26 <karolherbst> maybe MAP_ONCE changes things...

22:27 <zmike> subdata should be fine on its own

22:27 <karolherbst> yeah... dunno

22:27 <karolherbst> doesn't seem to be though

22:27 <zmike> I can only speculate that the flush+fence is forcing mapped memory invalidation/flush

22:29 <karolherbst> mhhh... yeah, let me debug a bit further

22:30 djbw has joined #dri-devel

22:37 pcercuei has quit [Quit: dodo]

22:44 <alyssa> lowering away seamless cubemap gathers sounds intense ...

22:46 * zmike closes irc for the weekend

22:47 <alyssa> hf

22:48 MajorBiscuit has joined #dri-devel

22:59 <karolherbst> ahhhhhhhh

23:00 <karolherbst> what is that?!? https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/gallium/auxiliary/driver_trace/tr_context.c#L1806

23:00 <karolherbst> :(

23:00 <karolherbst> that's the second subdata I was seeing and I started to think I am crazy because gdb never actually hit a second one....

23:01 <karolherbst> yo, how evil is that

23:09 Haaninjo has quit [Quit: Ex-Chat]

23:23 sarnex has quit [Quit: Quit]

23:31 DemiMarieObenour[m] is now known as DemiMarie

23:36 MajorBiscuit has quit [Ping timeout: 480 seconds]

23:41 sarnex has joined #dri-devel

23:50 oneforall2 has quit [Read error: Connection reset by peer]