#dri-devel on 2024-02-14 — irc logs at oftc.irclog.whitequark.org

2022-12-21 00:45 ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar

00:33 pcercuei has quit [Quit: dodo]

00:53 iive has quit [Quit: They came for me...]

01:13 Haaninjo has quit [Quit: Ex-Chat]

01:29 alanc has quit [Remote host closed the connection]

01:30 alanc has joined #dri-devel

01:34 co1umbarius has joined #dri-devel

01:35 columbarius has quit [Read error: Connection reset by peer]

01:44 Waterr has joined #dri-devel

01:47 Waterr has quit [Killed (MoranServ (Possible spambot -- mail support@oftc.net with questions.))]

01:49 YuGiOhJCJ has joined #dri-devel

02:04 apinheiro has quit [Quit: Leaving]

02:15 pixelcluster_ has joined #dri-devel

02:19 pixelcluster has quit [Ping timeout: 480 seconds]

02:27 tertl8 has joined #dri-devel

02:54 Company has quit [Quit: Leaving]

02:55 Jeremy_Rand_Talos has quit [Remote host closed the connection]

02:56 Jeremy_Rand_Talos has joined #dri-devel

03:18 edt_ has joined #dri-devel

03:45 kts has joined #dri-devel

03:53 kts has quit [Ping timeout: 480 seconds]

03:53 aravind has joined #dri-devel

03:58 davispuh has quit [Ping timeout: 480 seconds]

04:01 Mangix has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]

04:01 kts has joined #dri-devel

04:01 Mangix has joined #dri-devel

04:18 heat is now known as Guest2714

04:18 Guest2714 has quit [Read error: Connection reset by peer]

04:18 heat has joined #dri-devel

04:29 bmodem has joined #dri-devel

04:30 heat has quit [Ping timeout: 480 seconds]

04:31 kts has quit [Ping timeout: 480 seconds]

04:45 aravind has quit [Ping timeout: 480 seconds]

04:46 TMM has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]

04:46 TMM has joined #dri-devel

05:00 anujp has quit [Ping timeout: 480 seconds]

05:31 aravind has joined #dri-devel

05:35 <DemiMarie> Is it possible to identify who is to blame for a GPU hang? In native contexts it would be useful to be able to determine which VM is at fault and ban it from using the GPU until the user says otherwise.

05:38 junaid has joined #dri-devel

05:46 ity has quit [Remote host closed the connection]

05:46 ity has joined #dri-devel

06:03 fab has joined #dri-devel

06:11 itoral has joined #dri-devel

06:25 junaid has quit [Remote host closed the connection]

06:29 macromorgan has quit [Read error: Connection reset by peer]

06:31 macromorgan has joined #dri-devel

06:33 lemonzest has quit [Quit: WeeChat 4.2.1]

06:42 lemonzest has joined #dri-devel

06:44 YuGiOhJCJ has quit [Remote host closed the connection]

06:44 YuGiOhJCJ has joined #dri-devel

06:50 dorcaslitunyaVM has joined #dri-devel

07:01 kts has joined #dri-devel

07:02 fab has quit [Quit: fab]

07:05 itoral_ has joined #dri-devel

07:12 itoral has quit [Ping timeout: 480 seconds]

07:13 illwieckz has quit [Quit: I'll be back!]

07:14 kts has quit [Ping timeout: 480 seconds]

07:19 loki_val has joined #dri-devel

07:21 kzd has quit [Ping timeout: 480 seconds]

07:22 crabbedhaloablut has quit [Ping timeout: 480 seconds]

07:25 kts has joined #dri-devel

07:28 kts has quit [Read error: Connection reset by peer]

07:34 sima has joined #dri-devel

07:35 illwieckz has joined #dri-devel

07:37 kts has joined #dri-devel

07:42 rgallaispou has joined #dri-devel

07:44 tomba_ has joined #dri-devel

07:50 fab has joined #dri-devel

07:58 tzimmermann has joined #dri-devel

08:00 sghuge has quit [Remote host closed the connection]

08:00 sghuge has joined #dri-devel

08:04 glennk has joined #dri-devel

08:05 mvlad has joined #dri-devel

08:07 tursulin has joined #dri-devel

08:09 crabbedhaloablut has joined #dri-devel

08:09 frieder has joined #dri-devel

08:11 loki_val has quit [Ping timeout: 480 seconds]

08:14 kts has quit [Ping timeout: 480 seconds]

08:14 pjakobsson has joined #dri-devel

08:17 sukrutb has joined #dri-devel

08:23 sukrutb has quit [Remote host closed the connection]

08:23 sukrutb has joined #dri-devel

08:31 sukrutb has quit [Ping timeout: 480 seconds]

08:32 hansg has joined #dri-devel

08:55 lynxeye has joined #dri-devel

08:57 Haaninjo has joined #dri-devel

08:57 simondnnsn has quit [Read error: Connection reset by peer]

08:57 simondnnsn has joined #dri-devel

08:58 vliaskov has joined #dri-devel

09:01 bolson_ has quit [Ping timeout: 480 seconds]

09:01 <tzimmermann> jani, thanks for reviewing my fbdev header cleanup. may i ask you to give an ack to the additional patch in v2: https://patchwork.freedesktop.org/patch/578025/?series=129765&rev=2

09:02 <jani> tzimmermann: ack; already looked at it but got distracted and forgot to reply

09:02 <tzimmermann> thanks

09:02 itoral_ has quit [Ping timeout: 480 seconds]

09:05 itoral_ has joined #dri-devel

09:18 simon-perretta-img has quit [Ping timeout: 480 seconds]

09:20 simon-perretta-img has joined #dri-devel

09:29 rasterman has joined #dri-devel

10:00 pcercuei has joined #dri-devel

10:01 CME has quit [Ping timeout: 480 seconds]

10:05 Haaninjo has quit [Quit: Ex-Chat]

10:12 cmichael has joined #dri-devel

10:25 glennk has quit [Ping timeout: 480 seconds]

10:41 glennk has joined #dri-devel

10:46 aravind has quit [Ping timeout: 480 seconds]

10:51 simondnnsn has quit [Read error: Connection reset by peer]

10:52 simondnnsn has joined #dri-devel

11:01 xq has joined #dri-devel

11:17 <samuelig> cmarcelo, go ahead

11:17 shoragan has quit [Read error: Network is unreachable]

11:17 CME has joined #dri-devel

11:19 shoragan has joined #dri-devel

11:19 <samuelig> cmarcelo, would you like me or somebody else to review them?

11:24 shoragan has quit [Read error: Network is unreachable]

11:24 shoragan has joined #dri-devel

11:29 CME_ has joined #dri-devel

11:34 CME has quit [Ping timeout: 480 seconds]

11:57 kts has joined #dri-devel

12:05 DodoGTA has quit [Remote host closed the connection]

12:07 Duke`` has joined #dri-devel

12:10 bmodem has quit [Ping timeout: 480 seconds]

12:10 pH5 has quit [Read error: Network is unreachable]

12:10 pH5 has joined #dri-devel

12:16 pH5 has quit [Read error: Connection reset by peer]

12:16 pH5 has joined #dri-devel

12:17 DodoGTA has joined #dri-devel

12:31 <karolherbst> DemiMarie: it might potentially be possible for some drivers and some faults

12:33 <karolherbst> mhh maybe "some drivers" is too optimistic, let's say "some GPUs"

12:37 kts_ has joined #dri-devel

12:41 dviola has joined #dri-devel

12:42 kts has quit [Ping timeout: 480 seconds]

13:06 YuGiOhJCJ has quit [Quit: YuGiOhJCJ]

13:08 kts_ has quit [Ping timeout: 480 seconds]

13:11 fireburn has joined #dri-devel

13:12 psykose has joined #dri-devel

13:12 bmodem has joined #dri-devel

13:13 itoral_ has quit [Remote host closed the connection]

13:15 pixelcluster_ has quit []

13:15 pixelcluster has joined #dri-devel

13:17 Calandracas has quit [Remote host closed the connection]

13:21 simon-perretta-img has quit [Ping timeout: 480 seconds]

13:21 simon-perretta-img has joined #dri-devel

13:23 Leopold___ has joined #dri-devel

13:26 dorcaslitunyaVM has quit [Remote host closed the connection]

13:29 simon-perretta-img has quit [Ping timeout: 480 seconds]

13:29 simon-perretta-img has joined #dri-devel

13:30 Leopold____ has joined #dri-devel

13:30 Leopold_ has quit [Ping timeout: 480 seconds]

13:32 Leopold___ has quit [Ping timeout: 480 seconds]

13:41 thaytan has quit [Ping timeout: 480 seconds]

14:02 sknebel has quit [Read error: Connection reset by peer]

14:02 simondnnsn has quit [Read error: Connection reset by peer]

14:02 Company has joined #dri-devel

14:03 sknebel has joined #dri-devel

14:04 simondnnsn has joined #dri-devel

14:08 <zmike> alyssa kusma: I assume at least one of you will be on call today for https://gitlab.freedesktop.org/mesa/mesa/-/issues/10550 ?

14:09 <zmike> / https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27521

14:14 <alyssa> zmike: wasn't planning on it but I can be if you want

14:15 <zmike> I figured since it was so contentious that at least someone would show up, but maybe I didn't skim hard enough to see through the argumentation

14:17 krumelmonster has quit [Ping timeout: 480 seconds]

14:25 <alyssa> I just don't think we should be breaking piles of apps

14:25 krumelmonster has joined #dri-devel

14:31 thaytan has joined #dri-devel

14:41 yrlf has quit [Quit: Ping timeout (120 seconds)]

14:41 yrlf has joined #dri-devel

14:43 heat has joined #dri-devel

14:46 tertl8 has quit [Quit: Connection closed for inactivity]

14:49 edt_ has quit []

14:49 Calandracas has joined #dri-devel

14:53 fab has quit [Quit: fab]

15:09 Calandracas has quit [Remote host closed the connection]

15:13 <mareko> is the mesa-commit list discontinued?

15:15 <mareko> zmike: I think I can file cts tickets

15:15 <mareko> just haven't done it

15:15 <daniels> mareko: yep

15:16 OftenTimeConsuming has quit [Remote host closed the connection]

15:17 OftenTimeConsuming has joined #dri-devel

15:23 bolson has joined #dri-devel

15:24 <mareko> daniels: somebody asked me about it, I didn't even know that it existed :)

15:25 kzd has joined #dri-devel

15:27 <DemiMarie> karolherbst: which GPUs?

15:28 <karolherbst> unknown

15:29 <DemiMarie> Any guesses?

15:29 <karolherbst> nope

15:29 * DemiMarie wishes GPUs had full preemption

15:29 <karolherbst> that's not the problem :)

15:29 <karolherbst> even if they had, how would the kernel know what crashed the GPU if the GPU gets suddenly into a weird state and returns nonsense?

15:30 <karolherbst> or the firmware returning nonsense

15:30 <karolherbst> or not responding

15:30 tobiasjakobi has joined #dri-devel

15:30 <DemiMarie> The GPU should never be able to be put in such a state unless the kernel driver is buggy.

15:30 <karolherbst> well..

15:31 <karolherbst> it's all software in the end

15:31 <karolherbst> and software has bugs

15:31 <karolherbst> GPU not getting into a weird state is like saying "computers shouldn't be able to get into a weird state"

15:31 tobiasjakobi has quit []

15:32 <DemiMarie> karolherbst: suppose we ignore KMD and FW bugs for now

15:32 <karolherbst> still

15:33 <karolherbst> you ask for bugfree computers

15:33 <DemiMarie> No

15:33 <karolherbst> a GPU in itself can be considered a full computer.. maybe not a personal one, but definitely on the embedded level

15:33 <DemiMarie> Or my question is bad

15:35 <DemiMarie> So on the CPU, if there is a fault (such as accessing junk memory), the hardware gets control and tells the kernel enough information for the kernel to know what the fault was and what process did it.

15:35 <karolherbst> we all have to accept that an embedded system can and will crash in a way that you can only power cycle it to recover

15:35 <karolherbst> you can't compare GPUs to CPUs

15:35 Calandracas has joined #dri-devel

15:35 <DemiMarie> Why?

15:35 <karolherbst> because GPUs are way more complex

15:35 <DemiMarie> Why?

15:36 <karolherbst> because GPUs are more like embedded devices

15:36 <DemiMarie> Wh

15:36 <DemiMarie> y?

15:36 <karolherbst> I got tired :) have fun

15:36 <sima> gpu is essentially a distributed network, there's a pile of things that send messages around, and eventually they reach a network node that talks to the memory subsystem

15:36 <sima> that's the point you get the hw fault

15:37 <DemiMarie> Am I asking questions that only the HW vendor can answer?

15:37 <sima> and the kernel pretty much has to be able to preempt, or things will go sideways due to priority/locking inversion issues

15:37 <sima> so unlike a cpu, where you can preempt a single node, for a gpu you have to preempt that entire distributed network

15:37 <sima> including all the in-flight messages

15:37 <sima> DemiMarie, ^^ and save/restore of what essentially is a cluster is just too hard

15:37 <mattst88> I think since GPUs are expected to be programmed by drivers, their designers are able to not ensure that they're impossible to wedge (unlike a CPU, where that would be basically unforgivable)

15:37 <DemiMarie> sima: I thought CPUs are also distributed internally. Do they just put more work into hiding it?

15:38 <sima> DemiMarie, they're a lot less distributed, and they have enormous amounts of magic to hide their distributed nature from the application

15:38 <sima> because the ISA isn't distributed

15:38 <sima> so you can reuse that for preempting the entire thing

15:38 <sima> with gpu you actually send around these messages explicitly in shaders

15:38 <sima> (well most of them at least)

15:43 <DemiMarie> mattst88: hopefully it will be less forgivable over time, given the rising use of GPUs in secure situations.

15:43 rz_ has quit [Ping timeout: 480 seconds]

15:43 rz has joined #dri-devel

15:45 <DemiMarie> sima: Ah, so that is why one needs memory barriers in so many situations where a CPU would never need them.

15:45 <sima> yeah that's one aspect

15:46 <DemiMarie> How does preemption work for compute then?

15:46 <sima> but the overall one that makes preempt/page fault so hard really is that it's a network of nodes having a big chat, and memory i/o is just one of them - kinda like you have storage servers in a cluster

15:46 <cmarcelo> samuelig: thanks. just this ACK here is good for me

15:46 <sima> DemiMarie, badly :-)

15:47 <sima> no idea how it works on others, but on intel, where it does work, preempt essentially sends an interrupt to all the compute cores

15:47 <sima> and they run a special shader which essentially stores the entire thread state into memory somewhere

15:47 <sima> but that means this special preempt shader and your compute kernel need to cooperate

15:48 <sima> and preemption is kinda voluntary

15:48 <sima> and it only works for nodes which support it, and because things get messy for 3d fixed function they just dont

15:48 <DemiMarie> sima: why do they need to cooperate, beyond the need for a memory buffer to save the state?

15:49 <sima> the hw cannot actually save/restore itself, it's kinda like the kernel asking userspace to please store all its state, so that it can nuke the process

15:49 <sima> and then on restart it asks userspace again to recreate everything

15:50 <sima> it's a gigantic mess afaiui

15:50 <DemiMarie> Could the KMD provide the preempt shader?

15:50 <sima> afaiui it's tied to how you compile the compute shader

15:50 <DemiMarie> Or just say, “you have X amount of time before I blow you away?”

15:50 <sima> like if you yolo too much in-flight stuff you can't restore that on the other side

15:50 <sima> yeah it's a timeout

15:51 <sima> but the timeout is huge because register state is like a few mb

15:51 <DemiMarie> What is the timeout?

15:51 <sima> more than a frame iirc

15:51 <DemiMarie> Ugh

15:51 <sima> so forget anything remotely realtime

15:51 <DemiMarie> Hopefully future GPUs will support full preemption of everything.

15:51 <sima> it's getting less

15:52 <DemiMarie> That’s good

15:52 <sima> like both intel and amd are switching to the model where any pending page fault prevents preemption

15:52 <DemiMarie> What do you mean?

15:52 <sima> because the thing that hits the page fault is a few network hops away from the compute cores that can save/restore

15:52 <sima> so you cannot preempt while any fault is pending

15:52 <DemiMarie> So that just means no page faults allowed.

15:52 <DemiMarie> Pin all memory.

15:52 <sima> nah it just means kmd engineers are crying a lot

15:53 <DemiMarie> Why?

15:53 <DemiMarie> Because pre-pinned memory management is terrible?

15:53 <sima> yeah

15:53 <sima> so you get to pick essentially between "you get page faults, no pinning yay" and "you get preempt, no gpu hogging, yay"

15:54 <sima> but not at the same time

15:54 kts has joined #dri-devel

15:54 <sima> which is a full on inversion of the linux kernel virtual memory handling

15:54 <DemiMarie> For Qubes OS we will pick the latter every time

15:54 <DemiMarie> So make guests pin all their buffers.

15:55 <sima> so you don't necessarily need to pin everything, you /just/ need to guarantee there's enough memory around to fix up any faults while you try to preempt

15:55 <DemiMarie> Lots of room for bugs 🙂.

15:55 <DemiMarie> In the Qubes case, the memory will be granted by a Xen guest, so it is already pinned.

15:55 <sima> yeah it's one of these "great for tech demo, so much pain to ship" things

15:56 <DemiMarie> sima: a sysctl or module opt to just force pinning would be nice

15:56 <sima> plan for linux is that you get cgroups to make sure there's not memory thrashing that hurts

15:56 <DemiMarie> That way those who do not need page faults can avoid the attack surface.

15:56 <sima> but we had that for like a decade plus as a plan by now :-/

15:57 bmodem has quit [Ping timeout: 480 seconds]

15:57 <DemiMarie> sima: why is pre-pinned memory management so bad?

15:57 <sima> people love memory overcommit

15:58 <sima> with pinning everything you need enough memory for everything for its worst case

15:58 <DemiMarie> Make that userspace’s job

15:58 <DemiMarie> It can pin and unpin buffers at will

15:58 <sima> which is a lot more than just enough for the average case and let the kernel balance usage

15:59 <sima> but in the end, if you want real-time then "pin it all" really is the only option

15:59 <sima> irrespective of gpu or not

16:00 <DemiMarie> Hence why PipeWire has an option for mlockall()
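
For reference, mlockall() is the POSIX call a process uses to pin all of its own pages, which is the CPU-side version of the "pin it all" approach discussed above. A minimal sketch follows; the helper name pin_all_memory is made up for illustration, and the call often fails with EPERM or ENOMEM unless the process's RLIMIT_MEMLOCK (or its privileges) allows it.

```c
#include <assert.h>
#include <errno.h>
#include <sys/mman.h>

/* Hypothetical helper: pin every page the process has mapped now
 * (MCL_CURRENT) and every page it maps later (MCL_FUTURE).
 * Returns 0 on success, or the errno value on failure. */
int pin_all_memory(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) == 0)
        return 0;
    assert(errno != 0); /* mlockall() sets errno when it fails */
    return errno;
}
```

A caller would typically run this once at startup, before any real-time work begins, and treat a nonzero result as "running unpinned" rather than as a fatal error.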

16:01 fab has joined #dri-devel

16:01 <DemiMarie> sima: is the kernel command-line option to disable overcommit a reasonable idea?

16:01 <DemiMarie> Or could it be emulated in the native context code?

16:06 <DemiMarie> This also explains Apple’s decision to make AR fully declarative: it means they can guarantee real-time behavior, because there are no app-provided shaders in the code that updates what the user sees in response to the real-world changing.

16:09 hansg has quit [Quit: Leaving]

16:10 <DemiMarie> sima: thanks for taking some time to explain all of this!!! It means a lot to me.

16:19 Leopold____ has quit [Remote host closed the connection]

16:23 Leopold_ has joined #dri-devel

16:24 davispuh has joined #dri-devel

16:33 macromorgan has quit [Read error: Connection reset by peer]

16:34 macromorgan has joined #dri-devel

16:34 macromorgan has quit []

16:37 macromorgan has joined #dri-devel

16:40 anujp has joined #dri-devel

16:55 Leopold_ has quit [Remote host closed the connection]

16:56 davispuh has quit [Quit: http://quassel-irc.org - Chat comfortably. Anywhere.]

17:00 Leopold has joined #dri-devel

17:00 larunbe has joined #dri-devel

17:01 davispuh has joined #dri-devel

17:04 alarumbe has quit [Ping timeout: 480 seconds]

17:05 sukrutb has joined #dri-devel

17:07 Aura has joined #dri-devel

17:11 Leopold has quit [Remote host closed the connection]

17:12 Dark-Show has quit [Quit: Leaving]

17:12 Leopold has joined #dri-devel

17:14 cmichael has quit [Quit: Leaving]

17:16 kts has quit [Ping timeout: 480 seconds]

17:22 <abhinav__> vsyrjala jani gentle reminder on https://patchwork.kernel.org/project/dri-devel/patch/20240212173355.1857757-1-quic_abhinavk@quicinc.com/ pls

17:23 <DemiMarie> So for my use-case, I care less about ensuring that the GPU doesn’t crash (so long as the crash is non-exploitable) as I do about ensuring that there is someone to blame.

17:24 <DemiMarie> I want the userspace VMM to be able to distinguish between `GUILTY_CONTEXT` and `INNOCENT_CONTEXT`.

17:35 simondnnsn has quit [Read error: Connection reset by peer]

17:38 <alyssa> karolherbst: dj-death: forgot about the "designated initializers cause spilling because nir_opt_memcpy chokes" issue

17:38 <alyssa> gfxstrand: karolherbst and I talked about it back in October or so, and then I think we forgot, or at least I did

17:38 <karolherbst> same

17:38 sukrutb has quit [Ping timeout: 480 seconds]

17:39 <karolherbst> but did faith's MR help with that? I think there was more to do, like... something with opt_memcpy or something

17:40 <alyssa> I don't think it did but I might not have tested

17:40 <karolherbst> it also kinda depends when you run it

17:41 <karolherbst> ohh right.. there was `copy_deref` doing something similar

17:41 <karolherbst> and memcpy lowering could translate memcpy_deref to copy_deref or something

17:42 <karolherbst> in any case...

17:42 <jenatali> Right, you have to lower vars to explicit types, and then opt_memcpy should be able to turn it into a copy

17:42 <karolherbst> if the copy between derefs is huge, you can end up with tons of live values

17:42 <karolherbst> mhh yeah.. I might have to check again as I did reorder things again

17:43 <jenatali> (But then you want to erase scratch size and lower vars to explicit types again to recompute how much scratch is actually needed after that optimization...

17:43 <jenatali> )

17:43 <karolherbst> but yeah.. we need to lower those copies to loops, not unroll them directly

17:43 <alyssa> iirc there were a bunch of related memcpy issues I hit

17:47 <DemiMarie> karolherbst: sorry for tiring you with the repeated “why”.

17:48 <karolherbst> don't worry, I was also literally tired anyway and had to finish other things

17:50 jkrzyszt has joined #dri-devel

18:13 dv_ has quit [Read error: Connection reset by peer]

18:13 tzimmermann has quit [Quit: Leaving]

18:14 dv_ has joined #dri-devel

18:15 rasterman has quit [Quit: Gettin' stinky!]

18:17 Dark-Show has joined #dri-devel

18:20 <DemiMarie> It now makes more sense why graphics is not preemptable: It must be so fast (to avoid user complaints) that it is faster to simply wipe out the state and force applications to recompute it.

18:21 <DemiMarie> Reset can be done by a broadcast signal in hardware that forces everything to a known state, irrespective of what state it had been in. This is much simpler and cheaper (both in time and in transistors) than trying to save that state for later restoration.

18:23 <zmike> mareko pepp: it looks like radeonsi doesn't do any kind of checking with sample counts for e.g., rgb9e5? so si_is_format_supported will return true for sample_count==8

18:24 <zmike> is this somehow intended? I'm skeptical that you really support this

18:25 <mareko> zmike: it's always supported by shader images, it's supported by render targets only if the format is supported

18:26 <zmike> you support multisampled rgb9e5 in shader images?

18:26 <mareko> yes

18:26 <zmike> huh

18:26 <zmike> and textures?

18:26 <mareko> texelFetch only

18:27 <zmike> hm

18:27 <mareko> it's the same for all MSAA

18:28 <mareko> it's just 32bpp memory layout with a 32bpp format, then the texture and render hw just needs to handle the format conversion to/from FP32 and FP16

18:29 <sima> DemiMarie, afaik that's also what the big compute cluster people do, with checkpoints to permanent storage thrown in often enough that you don't have to throw away too much computation time when something, anything really, goes wrong

18:29 <sima> so also reset and recover from known-good application state, including anything gpus do

18:32 <DemiMarie> sima: that indeed makes sense. IIUC it is expected to have to change at some point, because failures will be too frequent, but I’m not sure if that point has been reached yet.

18:32 <DemiMarie> sima: interestingly AGX can do resets so quickly that one can reset every frame and still have a usable desktop.

18:33 <DemiMarie> That’s what Asahi did before Lina figured out TLB flushing.

18:33 <sima> yeah reset tends to be fairly quick, the slow part is waiting long enough to not piss off users too much about the fallout

18:34 jeeeun841351908 has quit []

18:34 <DemiMarie> Waiting for what?

18:34 <DemiMarie> Also, is there any information available as to what caused the reset?

18:35 <DemiMarie> I want to throw up a dialog to the user saying, “VM X crashed the GPU.”

18:35 <sima> arb_robustness tells you why you died (i.e. guilty or innocent collateral damage)

18:35 <sima> but it's very driver specific, and you need to be the userspace that created the gpu ctx

18:36 <sima> plus I think aside from amdgpu and intel no one even implements that, you just get a "you died"

18:36 <DemiMarie> Does Vulkan have something similar, and is this information something that the native context implementation could collect?

18:36 <sima> if that
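
The guilty/innocent distinction sima describes is exposed by ARB_robustness (and OpenGL 4.5) through the value returned by glGetGraphicsResetStatus. The sketch below only shows how a consumer might classify that value; the enum, the function names, and the "ban only when guilty" policy are hypothetical, and the GL tokens are defined locally so the example builds without GL headers. As noted above, how reliably a driver reports blame is very driver specific.

```c
#include <assert.h>
#include <stdbool.h>

/* Reset status tokens from ARB_robustness / OpenGL 4.5, defined locally
 * (guarded, in case real GL headers are also included). */
#ifndef GL_NO_ERROR
#define GL_NO_ERROR 0
#endif
#ifndef GL_GUILTY_CONTEXT_RESET
#define GL_GUILTY_CONTEXT_RESET   0x8253
#define GL_INNOCENT_CONTEXT_RESET 0x8254
#define GL_UNKNOWN_CONTEXT_RESET  0x8255
#endif

/* Hypothetical verdicts for a VMM; not part of any real API. */
enum reset_verdict {
    VERDICT_NO_RESET,
    VERDICT_GUILTY,   /* this context caused the reset */
    VERDICT_INNOCENT, /* collateral damage from another context */
    VERDICT_UNKNOWN,  /* the driver could not assign blame */
};

/* Map a glGetGraphicsResetStatus() return value to a verdict. */
enum reset_verdict classify_reset(unsigned int status)
{
    switch (status) {
    case GL_NO_ERROR:               return VERDICT_NO_RESET;
    case GL_GUILTY_CONTEXT_RESET:   return VERDICT_GUILTY;
    case GL_INNOCENT_CONTEXT_RESET: return VERDICT_INNOCENT;
    default:                        return VERDICT_UNKNOWN;
    }
}

/* Human-readable name, e.g. for a "VM X crashed the GPU" dialog. */
const char *verdict_name(enum reset_verdict v)
{
    static const char *const names[] = {
        "no reset", "guilty", "innocent", "unknown",
    };
    assert((unsigned int)v <= VERDICT_UNKNOWN);
    return names[v];
}

/* Hypothetical policy: ban a guest only when the driver blames it. */
bool should_ban_guest(unsigned int status)
{
    return classify_reset(status) == VERDICT_GUILTY;
}
```

Treating "unknown" the same as "innocent" is a design choice in this sketch: it avoids banning a guest on a driver that cannot attribute the fault, at the cost of letting a misbehaving guest retry.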

18:37 <DemiMarie> sima: thankfully AMDGPU and Intel are the ones I care about the most by far

18:38 jeeeun841351908 has joined #dri-devel

18:38 <DemiMarie> sima: is this a hardware or driver limitation?

18:41 <sima> https://registry.khronos.org/vulkan/specs/1.3-extensions/man/html/VK_EXT_device_fault.html only thing I've found with a bit of googling, but that doesn't give you info (at least at a glance) about issues caused by someone else

18:41 <sima> DemiMarie, driver limitation generally, it's a lot of tricky corner cases

18:42 <sima> also if you're asking about why there's innocent ctx getting nuked, that's usually a mix of hw and driver limitations

18:42 <dj-death> jenatali: have you been doing much with LLVM 17?

18:42 <jenatali> dj-death: Nothing at all

18:43 <dj-death> jenatali: just hitting some interesting casts from the translation

18:43 <dj-death> jenatali: whereas the LLVM 16 was doing more deref_struct

18:43 <jenatali> At this point my goal is to stay on LLVM 15 as long as possible and let you all shake out all the problems with newer LLVM versions :)

18:43 <dj-death> jenatali: that's preventing a bunch of optimizations

18:43 <dj-death> ahaha :)

18:43 <dj-death> nice

18:44 <dj-death> I guess we could put an upper bound on the LLVM version

18:44 <jenatali> One of the nice benefits of being on Windows and being forced to statically link LLVM is I get to control when we update

18:44 <jenatali> And it takes hours to build so I'm not inclined to do it often, especially if it risks issues like this

18:44 <dj-death> yeah

18:44 <dj-death> I love linux distros

18:45 <Calandracas> packaging llvm is pain

18:46 <alyssa> jenatali: :clown:

18:46 tobiasjakobi has joined #dri-devel

18:46 tobiasjakobi has quit []

18:46 <Calandracas> especially when some things like zig==15, while chromium>=17 and rust>=17

18:48 <Calandracas> I wish there was a canonical way to support parallel installations

18:57 sukrutb has joined #dri-devel

19:06 ptrc has quit [Remote host closed the connection]

19:07 ptrc has joined #dri-devel

19:14 <karolherbst> I mean.. it kinda works, you are just (sometimes) screwed if multiple versions end up in the same program

19:16 konstantin_ has joined #dri-devel

19:16 konstantin is now known as Guest2795

19:16 konstantin_ is now known as konstantin

19:16 <zmike> jenatali: good news, I think I'm accidentally fixing all those xfails I added a couple weeks ago

19:17 <jenatali> \o/

19:17 <jenatali> I'll be happy to review once you've got something for me to look at

19:19 <zmike> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27621

19:19 <zmike> I tried adding better samplecount checking to zink and ended up failing the same tests as you

19:20 Guest2795 has quit [Ping timeout: 480 seconds]

19:20 <jenatali> Hah, awesome

19:20 <zmike> the classic https://www.youtube.com/watch?v=AbSehcT19u0

19:21 <karolherbst> opaque pointers were a mistake, don't @ me

19:22 fab has quit [Quit: fab]

19:22 <HdkR> void* is as opaque as we need to be

19:22 frieder has quit [Remote host closed the connection]

19:27 tomba_ has quit [Ping timeout: 480 seconds]

19:43 <airlied> dj-death, karolherbst : I assume the changes in derefs is just opaque ptrs getting us different patterns?

19:43 <karolherbst> yes

19:43 <karolherbst> more or less

19:43 <karolherbst> and LLVM doing unholy optimizations due to that

19:44 <karolherbst> I really hope the SPIR-V backend doesn't give us the same issues...

19:44 <karolherbst> well.. with llvm-19 that is

19:45 <airlied> well it's just different patterns, apps could give us them we just have to keep up

19:45 <karolherbst> nah, in this case it's something LLVM does

19:45 <karolherbst> for apps that would be super unholy to do as well

19:46 <karolherbst> like.. if your first member of a struct is an int, nobody does this: "(int *)some_struct" instead of just "&some_struct->first_field"

19:46 <karolherbst> but that's what LLVM-17 is now giving us

19:46 <karolherbst> so yeah. in _theory_ apps could, but none actually would
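
To make the two spellings concrete, here is a small C sketch; the struct layout and function names are invented for illustration. Both functions return the same address, because C guarantees there is no padding before the first member. The difference is that the cast form no longer names the member, which is roughly the information that goes missing when the translation produces a cast instead of a deref_struct.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative layout only; any struct whose first member is an int works. */
struct some_struct {
    int first_field;
    float second_field;
};

/* What application code normally writes: name the member. */
int *first_via_member(struct some_struct *s)
{
    return &s->first_field;
}

/* The pattern described above: cast the struct pointer straight to the
 * first member's type. The result is the same address, but the access
 * no longer says which member it refers to. */
int *first_via_cast(struct some_struct *s)
{
    assert(offsetof(struct some_struct, first_field) == 0);
    return (int *)s;
}
```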

19:47 <airlied> I admire your confidence

19:47 <karolherbst> I know I'm wrong, but I still want to believe

19:48 <karolherbst> anyway.. I think I fixed it with rusticl and maybe the same fix helps intel

19:48 <alyssa> i feel like i've done that in user code

19:48 <karolherbst> I _think_ I wrote this to fix it in rusticl: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27068/diffs?commit_id=f8966598940ad46fb1ff2cbd9c23013289ef0736

19:48 <karolherbst> but not 100%

19:48 <karolherbst> it was to fix some llvm-17 issue though

19:52 tertl8 has joined #dri-devel

19:55 heat is now known as Guest2800

19:55 Guest2800 has quit [Read error: Connection reset by peer]

19:55 heat has joined #dri-devel

19:56 <DemiMarie> Calandracas: statically link LLVM in all of its callers and ensure its symbols are never exported?

20:02 <Calandracas> that isn't really ideal. What we ended up doing is splitting all of the shared libs into their own packages for each version (libclang15 package provides libclang.so.15, libclang17 package provides libclang.so.17) etc.

20:02 <karolherbst> Calandracas: have you tried to see if it also works if all of them get loaded into the same process?

20:03 <karolherbst> because apparently that doesn't work in all distributions

20:03 <karolherbst> and it's actually a cursed use case

20:03 <karolherbst> but a real one

20:04 <Calandracas> The ugly part is needing to have a full toolchain for each llvm version. What alpine does is install each version in /usr/lib/llvm$VERSION/

20:04 <karolherbst> oh yeah, that as well

20:04 <Calandracas> Fedora installs the latest release to /usr but "compat" packages go to /usr/lib/llvm$VERSION

20:04 <karolherbst> and making sure that things like "CLANG_RESOURCE_DIR" stay consistent with the installation directory relative to the so file

20:05 <Calandracas> it then turns into a mess of symlinking things from /usr/lib/llvm$VERSION to /usr

20:05 <karolherbst> though I only got reports of fedora messing that up and only for some packages? kinda works if you compile mesa from git

20:05 <karolherbst> yeah....

20:05 <tnt> karolherbst: and then there are some applications doing stuff with RTLD_DEEPBIND that don't help either.

20:06 vliaskov has quit []

20:06 <karolherbst> Calandracas: in mesa we need to know where the resource path of clang is.. I think we finally have something that works with that MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25568

20:07 <karolherbst> the "realpath" part was needed for debian having weird symlinks :D

20:07 <Calandracas> for context, I'm talking about void linux, which just merged llvm17 last week

20:07 <airlied> in theory if you build all the llvm versions with proper sym versioning it should work, in practice it's all screwed

20:07 <karolherbst> yeah...

20:08 <karolherbst> though at least on fedora I had three CL impls using a different llvm version (15/16/17) and it was fine....

20:08 <karolherbst> which surprised me tbh

20:09 <Calandracas> well it's fine when llvm is only used as a library, and applications can link whatever soname they want

20:09 <Calandracas> but having conflicting versions of llvm-config, cmake files, etc. is what causes issues

20:10 <karolherbst> yeah.... that's also painful

20:11 <Calandracas> + musl, armv6l, and armv7l patches

20:11 <Calandracas> but thats a different issue altogether

20:14 lynxeye has quit [Quit: Leaving.]

20:15 <dj-death> airlied: yes

20:17 <dj-death> karolherbst: I guess I could try to reproduce the order in which you run NIR passes

20:17 <dj-death> karolherbst: hopefully that works

20:17 <dj-death> I have doubts

20:18 <karolherbst> the idea behind my changes was that I need to run nir_lower_memcpy after explicit_types and some deref opt loop

20:19 tursulin has quit [Ping timeout: 480 seconds]

20:19 <karolherbst> and the deref opt loop does more opts if explicit type information exists

20:19 <karolherbst> s/loop//

20:22 Haaninjo has joined #dri-devel

20:36 krushia has quit [Quit: Konversation terminated!]

20:39 heat has quit [Remote host closed the connection]

20:40 heat has joined #dri-devel

20:52 <dj-death> karolherbst: I kind of end up with the same code if I pass the arguments by value instead of by pointer :)

20:52 <dj-death> stuff spills everywhere

20:56 Duke`` has quit [Ping timeout: 480 seconds]

21:08 simon-perretta-img has quit [Read error: Connection reset by peer]

21:09 simon-perretta-img has joined #dri-devel

21:13 <DemiMarie> airlied: considering that proprietary games might not use symbol versioning I am not surprised that there are problems there.

21:14 <jenatali> It's not really about symbol versioning, and if it was, it'd be about LLVM using symbol versioning. The problem is the lack of symbol namespacing

21:14 <karolherbst> dj-death: yeah.. that's LLVM being LLVM

21:15 <jenatali> You'd want component A in a process to use LLVM A, and component B to use LLVM B, but since the symbols are global, you can get the components using the wrong version

21:17 <karolherbst> it's kinda sad that LLVM IR stopped being useful as a middle-end IR :'(

21:17 <DemiMarie> jenatali: I thought symbol versioning solved that by including the versions (which are different) in the symbol that the dynamic linker looked up.

21:17 <karolherbst> now it's just a backend one

21:17 <jenatali> Demi: "Solved"

21:17 <jenatali> Still requires LLVM to put versions on their symbols

21:17 <DemiMarie> and they don’t?

21:17 <jenatali> Not AFAIK

21:17 <DemiMarie> could this be forced during compilation?

21:18 <DemiMarie> but yeah, until that is dealt with static linking seems like the safest option

21:19 <jenatali> 🤷‍♂️ My knowledge in this space is really just based on how things are different from Windows

21:19 <dj-death> karolherbst: so I suppose if I try to compile the same code with rusticl, I'll end up with spills everywhere

21:19 <DemiMarie> and any distro that doesn’t like it can deal with the bug reports

21:20 <vsyrjala> llvm even leaks its compile time options into the abi. i once tried to turn off the nvptx (or whatever) thing in my system llvm. everything linked against llvm stopped working because of some missing global symbol

21:21 <karolherbst> dj-death: maybe? Wouldn't surprise me, but it also kinda depends how the codegen goes

21:21 <karolherbst> anyway.. do you have a dump of the code being compiled?

21:21 <karolherbst> like.. the thing passed into llvm

21:22 <karolherbst> In hindsight I should have added `dump_clc` to `CLC_DEBUG`, but uhh.. that's kinda not useful in the CL context

21:22 <dj-death> karolherbst: yeah : https://pastebin.com/5tdiJ7i5

21:22 <karolherbst> yeah..

21:22 <karolherbst> "__attribute__((unused)) const struct GFX125_3DSTATE_VERTEX_BUFFERS * restrict values" that's your generic pointer

21:22 <karolherbst> if you pass in private memory, specify it as "private const struct ...*"

21:22 <dj-death> I know

21:23 <karolherbst> though that just gets rid of the generic modes in the cast

21:23 <karolherbst> not the cast itself

21:23 <karolherbst> but should make it easier to fix the compiler pass order

21:26 <dj-death> yeah

21:26 <dj-death> will see tomorrow, thanks

21:26 <karolherbst> mhh.. we have this CL runner in piglit.. let's see if I can make it do something

21:27 <dj-death> yeah private didn't help indeed :)

21:28 <karolherbst> yo.. how do I make this code run.. this will be header file hell :D

21:32 <karolherbst> mhh it didn't crash at least

21:32 <karolherbst> ehh wait.. that's still on my box with llvm-16 I think? mhh

21:33 <karolherbst> rusticl on llvm-16 compiles that to this, which looks very sane: https://gist.githubusercontent.com/karolherbst/fc5bcbfdcc6d1a633d1e63046a3a3fb0/raw/6188dab8d21ed059a563c01c64dc0c5d5bdd9cf2/gistfile1.txt

21:33 <karolherbst> finishing compiling llvm-17 and testing it there shortly

21:34 paulk has quit [Ping timeout: 480 seconds]

21:34 <karolherbst> but turning it into a kernel can also lead to a lot of things happening or not happening...

21:36 paulk has joined #dri-devel

21:48 <airlied> jenatali: pretty sure there is an llvm option to turn on symbol versions across the api

21:48 <jenatali> Oh cool

22:00 jkrzyszt has quit [Ping timeout: 480 seconds]

22:04 jsa has joined #dri-devel

22:05 sima has quit [Ping timeout: 480 seconds]

22:07 simon-perretta-img has quit [Ping timeout: 480 seconds]

22:07 simon-perretta-img has joined #dri-devel

22:13 <dj-death> karolherbst: any luck with llvm-17?

22:13 <karolherbst> recompiling mesa atm

22:14 <dj-death> I can install both versions on debian

22:14 <karolherbst> dj-death: mhh.. with llvm-17 I indeed run into scratch being used...

22:14 <dj-death> it's not too bad

22:14 <karolherbst> but at least I have a fairly trivial reproducer now

22:14 <dj-death> karolherbst: ah so it's more complicated than just ordering :(

22:15 <karolherbst> I'll try to take a look and see if I can fix it for rusticl, because that's a real issue regardless...

22:15 <karolherbst> I wonder....

22:15 <dj-death> yeah, well let me know :)

22:16 <karolherbst> let me diff it...

22:19 <karolherbst> yeah....

22:19 <karolherbst> it's `nir_opt_deref` not kicking in

22:19 <karolherbst> well..

22:19 <karolherbst> somewhat

22:20 <karolherbst> mhhh.. interesting

22:36 <dj-death> there is an additional

22:36 <dj-death> 64 %160 = deref_cast (struct.GFX125_3DSTATE_VERTEX_BUFFERS *)%6 (constant struct.GFX125_3DSTATE_VERTEX_BUFFERS) (ptr_stride=20, align_mul=0, align_offset=0)

22:36 <karolherbst> yeah...

22:36 <karolherbst> so the thing is, that vars_to_ssa is smart and matches the struct field access back to the original source

22:36 <karolherbst> but that doesn't happen with the deref_struct thing missing

22:37 <karolherbst> https://gist.github.com/karolherbst/67bc4a4e2bd62714b92899c2ae4e8cd7

22:38 <karolherbst> this "32 %26 = mov %21" makes it all optimized away in the llvm-16 case

22:38 <karolherbst> but I don't know yet what I think would be a good solution to this issue

22:39 <karolherbst> this casting to the struct base instead of creating a deref of the first field is really a nonsense thing llvm is doing 🙃 I don't understand why

22:40 glennk has quit [Ping timeout: 480 seconds]

22:41 <karolherbst> I think we can detect this pattern with explicit type information

22:41 <jenatali> Seems like we could detect that in nir_opt_deref?

22:41 <karolherbst> and just... workaround it in a dirty way

22:41 <karolherbst> yeah

22:41 <karolherbst> if you cast a struct to the type of its first member... just do a deref_struct on the first field or something

22:41 <jenatali> Right

22:41 <jenatali> As long as the cast doesn't add alignment/stride info

22:41 <karolherbst> yeah...

22:42 <karolherbst> but that sounds like the most pragmatic solution here

22:42 <jenatali> It should probably be recursive too...

22:42 <karolherbst> I wonder what would be the _in a perfect world with perfect code_ solution tho

22:42 <karolherbst> mhhhh

22:42 <karolherbst> let's see...

22:42 <jenatali> struct outer { struct middle { struct inner { int a; }; }; };

22:42 <karolherbst> should be easy to try

22:42 <jenatali> Should be able to cast outer* to int*

22:42 <karolherbst> but yeah...

22:43 <karolherbst> we can follow inner structs until we hit a non struct/array thing

22:43 <karolherbst> I suspect LLVM might do the same on arrays 🙃 why stop at structs?

22:43 <jenatali> Right, array element 0
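[editor's note] The pattern discussed above (a pointer to an outer struct, or to an array, cast straight to the type of its first leaf member) can be sketched in plain C. This is only an illustration of why the cast is equivalent to following first members; the member names `mid` and `in` and all helper names are hypothetical, added so the nesting from the chat is valid C. It says nothing about how `nir_opt_deref` is or should be implemented.

```c
#include <assert.h>
#include <stddef.h>

/* Same nesting as in the chat, with hypothetical member names added. */
struct inner  { int a; };
struct middle { struct inner in; };
struct outer  { struct middle mid; };

/* Byte offset of the innermost int when following first members only. */
static size_t first_leaf_offset(void)
{
    return offsetof(struct outer, mid)
         + offsetof(struct middle, in)
         + offsetof(struct inner, a);
}

/* Read the leaf through a direct cast of the outer pointer. This is
 * equivalent to o->mid.in.a because every step is a first member at
 * offset 0. */
static int read_via_cast(struct outer *o)
{
    assert(first_leaf_offset() == 0);
    return *(int *)o;
}

/* The array case: a pointer to the whole array, cast to the element
 * type, addresses element 0. */
static int read_first_elem(int (*arr)[4])
{
    return *(int *)arr;
}
```

A pass that recognises this shape could, under these assumptions, replace the cast with a chain of first-field (or element 0) accesses, stopping once it reaches a non-struct, non-array type.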

22:44 <jenatali> Seems like a worthy optimization even if it was an app that was doing it explicitly TBH

22:44 <karolherbst> that's the beauty of LLVM: everything is possible

22:44 <dj-death> I'm checking why this doesn't get removed by opt_replace_struct_wrapper_cast :

22:44 <dj-death> 64 %20 = deref_cast (struct.GFX125_3DSTATE_VERTEX_BUFFERS *)%12 (function_temp struct.GFX125_3DSTATE_VERTEX_BUFFERS) (ptr_stride=0, align_mul=0, align_offset=0)

22:44 <karolherbst> the stride probably?

22:45 <dj-death> nope :

22:45 <dj-death> glsl_get_struct_field_offset(parent->type, 0) != 0

22:45 <dj-death> WTF

22:45 <karolherbst> ahh yeah

22:45 <karolherbst> do you have explicit types at that point?

22:45 <karolherbst> ohh also...

22:45 <karolherbst> OHHHH

22:45 <karolherbst> I remember

22:45 <karolherbst> I was debugging that function

22:46 <karolherbst> but I forgot why it didn't work

22:46 <karolherbst> lemme debug this :D

22:47 heat has quit [Remote host closed the connection]

22:51 <karolherbst> dj-death: okay.. so

22:51 <karolherbst> in rusticl I pass this check

22:51 <karolherbst> but it fails later

22:51 <karolherbst> `if (cast->cast.ptr_stride != glsl_get_explicit_stride(field_type))`

22:51 <karolherbst> ptr_stride of the cast is 4

22:52 <karolherbst> field_type has a stride of 0 :')

22:52 jsa has quit []

22:54 <karolherbst> dj-death: https://gist.github.com/karolherbst/a271bf08f8d3da54dc587e1a9e14e676

22:55 <jenatali> karolherbst: No, that's not the right check

22:55 <karolherbst> my change? I know.. I just forced to make it work

22:56 <jenatali> If explicit stride is 0 you'd want to compute an implicit stride

22:56 <karolherbst> I see...

22:56 <karolherbst> in any case, the struct fields have no explicit_stride and therefore we don't do this opt

22:56 <dj-death> karolherbst: yeah that works here too

22:57 <karolherbst> requires explicit_types before opt_deref tho :)

22:57 <jenatali> Right, a uint in a struct has an implicit stride of 4, but it gets cast to a pointer with an explicit stride of... 4
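[editor's note] The stride comparison jenatali describes can be sketched as follows. This is a toy model, not Mesa code: `struct toy_type` and both helpers are hypothetical stand-ins for the real type queries, and the only assumption carried over from the chat is that an explicit stride of 0 means "none set", in which case an implicit stride derived from the element size should be used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a field type; not the real glsl_type. */
struct toy_type {
    unsigned explicit_stride; /* 0 means no explicit stride was set */
    unsigned size_bytes;      /* natural size of one element */
};

/* Fall back to an implicit stride when no explicit one exists. */
static unsigned effective_stride(const struct toy_type *t)
{
    assert(t != NULL);
    return t->explicit_stride ? t->explicit_stride : t->size_bytes;
}

/* The cast is compatible with the field when its pointer stride equals
 * the field's effective stride, not just its explicit stride. */
static bool cast_stride_matches(unsigned cast_ptr_stride,
                                const struct toy_type *field)
{
    return cast_ptr_stride == effective_stride(field);
}
```

With a 4-byte uint field that carries no explicit stride, a cast with `ptr_stride=4` matches under this model, whereas comparing against the raw explicit stride (0) would reject it, which is the failure described above.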

22:57 <karolherbst> but yeah

22:57 <dj-death> yeah I can work with that

22:57 <dj-death> especially for temp variables

22:57 <karolherbst> so yeah.. I think my changes were motivated by this.. but it didn't fix the issue because I forgot to deal with that opt not kicking in

22:59 <karolherbst> yeah... I just took the opportunity and reworked my pipeline so that I only have to call explicit types exactly once for each var type, so I won't have to reset the scratch_size at all :)

22:59 <jenatali> karolherbst: I don't see how you can do that

22:59 <karolherbst> well.. I did

22:59 <karolherbst> why shouldn't it be possible?

22:59 <jenatali> If you need explicit types before opt_deref/opt_memcpy, but then after those optimizations temp variables disappear

23:00 <jenatali> If you don't reset scratch size, you'll overallocate, since you still reserve space for variables that got deleted

23:00 <karolherbst> so here is the thing

23:00 <karolherbst> I run those opts once with and once without explicit types

23:00 <karolherbst> or multiple times even

23:00 <jenatali> Sure

23:00 <karolherbst> so in the end it all works out

23:00 TMM has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]

23:00 <jenatali> But when you lower to explicit types it sets the scratch size

23:00 TMM has joined #dri-devel

23:00 <karolherbst> yeah

23:00 <karolherbst> but I do that kinda late

23:00 <jenatali> And those opts won't actually result in variables getting deleted until after lowering to explicit types

23:01 <jenatali> So you still overallocate. If you reset the scratch size and rerun the explicit types pass, you'll get a (much) smaller value

23:01 Haaninjo has quit [Quit: Ex-Chat]

23:01 <karolherbst> mhh? I don't think I actually overallocate scratch size, because it is able to figure out to dce some/most of the vars? but maybe I actually did miss something here

23:01 <jenatali> Which is a design flaw in using that pass to set scratch size, FWIW

23:01 <karolherbst> but I didn't encounter anything strange

23:02 <jenatali> What do you do based on the scratch size set in the shader info?

23:02 <karolherbst> nothing?

23:03 <jenatali> I ended up failing to compile shaders because the validator that we run downstream on the DXIL sees that we request an alloca, but then never use it, because all of the temp variables go away between setting the scratch size and emitting code

23:03 <karolherbst> mhhh

23:03 <karolherbst> maybe we should add a nir validation for this?

23:04 <karolherbst> scratch size set, but not scratch ops found?

23:04 <karolherbst> *no

23:04 <dj-death> yeah

23:04 <jenatali> It'd need to be explicitly run. Scratch size is set after lower_to_explicit_types, but you need lower_explicit_io afterwards to actually create the scratch ops

23:04 <karolherbst> I mean.. or a deref on temp memory

23:05 <dj-death> would need to be moved to io?

23:05 <jenatali> Then vars_to_ssa or copy_prop would start failing validation

23:05 <karolherbst> we could check for scratch ops or derefs on temporaries

23:05 <karolherbst> shouldn't be too hard

23:05 <jenatali> Nothing decrements the scratch size once it's set

23:05 <jenatali> Because they just leave holes instead. The scratch offsets are baked into the variables after lowering to explicit types

23:05 <karolherbst> then those passes nuking all those ops, should set scratch size to 0

23:06 <karolherbst> that's kinda the point here in validating it, no?

23:06 <jenatali> But they don't necessarily nuke everything

23:06 <jenatali> And scratch that looks like { empty space, var } is still bad

23:06 <karolherbst> fair

23:06 <karolherbst> or we just add a "nir_shrink_memory" pass

23:07 <karolherbst> and use it on shared as well

23:07 <jenatali> Aka "set scratch size to 0 and re-run lower_to_explicit_types" :)
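[editor's note] The over-allocation argument can be made concrete with a toy layout model. Everything here is hypothetical (`toy_var`, `toy_assign_offsets`, `toy_relayout`); it only mimics the behaviour described in the chat, where offsets are baked in once, nothing decrements the size when variables are removed, and resetting the size to 0 and laying out again gives the tight value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical temp variable: size in bytes, baked-in offset, liveness. */
struct toy_var {
    unsigned size;
    unsigned offset;
    bool live;
};

/* Append-only layout: each live variable gets the next free offset and
 * the running size only ever grows. */
static void toy_assign_offsets(struct toy_var *vars, size_t n,
                               unsigned *scratch_size)
{
    assert(scratch_size != NULL);
    for (size_t i = 0; i < n; i++) {
        if (!vars[i].live)
            continue;
        vars[i].offset = *scratch_size;
        *scratch_size += vars[i].size;
    }
}

/* "Set scratch size to 0 and re-run": lay out the surviving variables
 * from scratch and return the new, tight size. */
static unsigned toy_relayout(struct toy_var *vars, size_t n)
{
    unsigned scratch_size = 0;
    toy_assign_offsets(vars, n, &scratch_size);
    return scratch_size;
}
```

In this model, three variables of 20, 4 and 8 bytes give a size of 32. If the first and last are later optimised away, the recorded size stays at 32 and the survivor still sits at offset 20, leaving a hole in front of it; only a fresh layout brings the size down to 4 with the survivor at offset 0.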

23:07 <karolherbst> mhhh

23:07 <karolherbst> sounds like pain :D

23:07 <jenatali> And yeah, same problem with shared, but I don't know if as much stuff can remove shared variables

23:07 <karolherbst> _but_...

23:07 <karolherbst> maybe I should validate on it in debug builds

23:07 <karolherbst> do explicit types again and see if it changes the sizes

23:07 <jenatali> Hint: It will ;)

23:08 <karolherbst> we'll see about that 🙃 though I think what I am doing now gets me pretty close at least

23:08 <jenatali> Look at the link you pasted above

23:08 <jenatali> scratch: 20

23:08 <jenatali> I don't see any load_scratch or store_scratch

23:08 <karolherbst> yo.. pain

23:09 <jenatali> Yeah

23:09 <karolherbst> *sigh*... guess I'll just add it back then

23:10 qyliss has quit [Quit: bye]

23:10 <karolherbst> so anyway...

23:10 <dj-death> karolherbst: still a bit confused by your current fix

23:11 <karolherbst> fixing opt_replace_struct_wrapper_cast comes first

23:11 <karolherbst> don't think too much about it

23:11 <dj-death> karolherbst: why uint has no explicit_stride ?

23:11 <karolherbst> I just hacked it

23:11 <jenatali> dj-death: Explicit stride is only a thing on arrays and matrices in nir

23:11 <jenatali> And deref_cast I guess

23:11 <jenatali> Since those can be treated as arrays using deref_ptr_as_array

23:11 <dj-death> oh

23:12 <dj-death> so you have to trust the deref here

23:12 <karolherbst> maybe we should just set explicit_stride on struct members? dunno

23:12 <dj-death> yeah

23:12 qyliss has joined #dri-devel

23:12 <karolherbst> though that gets funky with packed structs

23:12 <dj-death> I mean they have a location

23:12 <jenatali> I'd want to hear from gfxstrand for this particular issue

23:18 <karolherbst> can't we just sneak a fix past her this time? 🙃

23:24 mvlad has quit [Remote host closed the connection]

23:26 <dj-death> karolherbst: not fixing all the issues though :)

23:27 <dj-death> struct delta64 { uint64_t v0; uint64_t v1;

23:27 <dj-death> } data = *((global struct delta64 *)&query[qw_offset]);

23:27 <dj-death> this local variable ends up in scratch

23:28 <jenatali> dj-death: How are you expecting that to not end up in scratch?

23:28 <jenatali> Range analysis on qw_offset to realize it can only be 0 or 1 and turn it into a bcsel or something?

23:31 <dj-death> jenatali: https://pastebin.com/rL8rfNUw

23:31 <dj-death> jenatali: a simplified example also scratching

23:32 <jenatali> dj-death: Ohh I see

23:32 <jenatali> I misread it at first :)

23:33 <karolherbst> mhhh

23:34 <karolherbst> my hope is that with the spirv backend we can just turn on -O2 and llvm gives us less silly code :D

23:35 <airlied> I admire your confidence (v2)

23:35 <karolherbst> I'm sure it will work out in the end

23:36 <jenatali> As long as the end isn't like 10 years from now

23:37 <dj-death> the problem here is that it's casting the first field which is uint64_t

23:37 <dj-death> into a uvec4

23:37 <dj-death> interesting

23:39 <dj-death> I guess we could add more special casing

23:42 <dj-death> if you're casting a struct's first field to a type that covers the entire struct

23:42 <dj-death> you actually want the initial deref to the struct

23:43 <jenatali> Yeah

23:43 <dj-death> 64 %9 = deref_var &data (function_temp struct.delta64)

23:43 <dj-death> 64 %38 = deref_struct &%9->field0 (function_temp uint64_t) // &data.field0

23:43 <dj-death> 64 %39 = deref_cast (uvec4 *)%38 (function_temp uvec4) (ptr_stride=16, align_mul=0, align_offset=0)

23:43 <dj-death> can just replace %39 with %9

23:43 <jenatali> Weird...

23:45 <jenatali> Yeah maybe not even having to qualify "that covers the entire struct," the only thing you need to make sure is that you're not trying to go the other way and cast to the first member of an inner struct
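[editor's note] A small C sketch of why the cast of `&data.field0` to a 16-byte vector can be treated as an access to the whole struct. `struct toy_uvec4` is a hypothetical stand-in for `uvec4`, the helper names are invented for this note, and byte copies are used instead of pointer casts to keep the example well-defined C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct delta64 { uint64_t v0; uint64_t v1; };

/* Hypothetical stand-in for uvec4: four 32-bit lanes, 16 bytes. */
struct toy_uvec4 { uint32_t lane[4]; };

/* The wide type covers exactly the whole struct. */
_Static_assert(sizeof(struct toy_uvec4) == sizeof(struct delta64),
               "vector type must cover the entire struct");

/* Load 16 bytes starting at the address of the first field. */
static struct toy_uvec4 load_from_first_field(const struct delta64 *d)
{
    struct toy_uvec4 out;
    const unsigned char *base = (const unsigned char *)d;
    assert(offsetof(struct delta64, v0) == 0);
    memcpy(&out, base + offsetof(struct delta64, v0), sizeof out);
    return out;
}

/* Load 16 bytes starting at the struct itself. */
static struct toy_uvec4 load_from_struct(const struct delta64 *d)
{
    struct toy_uvec4 out;
    memcpy(&out, d, sizeof out);
    return out;
}
```

Both loads read the same bytes because the first field sits at offset 0, which is the reason the chat suggests the cast of the first field could be rewritten in terms of the original struct deref.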

23:54 heat has joined #dri-devel