ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
<gfxstrand>
I really hate that this stupid query crash cost me like 4 hours of CTS runtime today. *grumble*
psykose has quit [Ping timeout: 480 seconds]
<karolherbst>
*pat *pat*
<gfxstrand>
*prrrr*
<epony>
popcorn teatiem
psykose has joined #dri-devel
oneforall2 has joined #dri-devel
guru__ has joined #dri-devel
oneforall2 has quit [Read error: Connection reset by peer]
jewins has quit [Ping timeout: 480 seconds]
oneforall2 has joined #dri-devel
columbarius has joined #dri-devel
co1umbarius has quit [Ping timeout: 480 seconds]
guru__ has quit [Ping timeout: 480 seconds]
youmukon1 has joined #dri-devel
youmukon1 has quit []
youmukon1 has joined #dri-devel
youmukon1 has quit []
youmukon1 has joined #dri-devel
youmukonpaku1337 is now known as Guest1397
youmukon1 is now known as youmukonpaku1337
Guest1397 has quit [Ping timeout: 480 seconds]
Kayden has joined #dri-devel
yyds has joined #dri-devel
oneforall2 has quit [Remote host closed the connection]
oneforall2 has joined #dri-devel
psykose has quit [Ping timeout: 480 seconds]
sukrutb_ has quit [Ping timeout: 480 seconds]
epony has quit [Ping timeout: 480 seconds]
psykose has joined #dri-devel
psykose has quit [Remote host closed the connection]
psykose has joined #dri-devel
antoniospg has quit []
idr has quit [Quit: Leaving]
lemonzest has quit [Quit: WeeChat 4.0.4]
lemonzest has joined #dri-devel
guru__ has joined #dri-devel
<i509vcb>
In the Vulkan spec there is a mention of VK_ERROR_UNKNOWN possibly being returned by any command. But outside of that does the spec saying vkWhatever returns some specified error codes mean other error codes are technically valid to return?
<i509vcb>
Question is related to me noticing that vkCreateWaylandSurfaceKHR states VK_ERROR_OUT_OF_HOST_MEMORY and VK_ERROR_OUT_OF_DEVICE_MEMORY are failure return codes, but Mesa can return VK_ERROR_SURFACE_LOST_KHR if you happen to hit some code paths
tristan has joined #dri-devel
tristan is now known as Guest1402
<i509vcb>
s/valid/invalid
psykose has quit [Ping timeout: 480 seconds]
oneforall2 has quit [Ping timeout: 480 seconds]
psykose has joined #dri-devel
guru__ has quit [Read error: Connection reset by peer]
<zmike>
sounds like bug
<i509vcb>
It kind of makes sense that you could instantly lose a surface because the wl_display you gave to vulkan had a protocol error, but the spec seems to ignore the existence of SURFACE_LOST in that case
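The mismatch described here can be checked mechanically. Below is a minimal sketch, assuming a hand-written table of documented return codes: `DOCUMENTED_RESULTS`, `ALWAYS_ALLOWED` and `undocumented_result` are hypothetical names made up for illustration, not part of Mesa or the Vulkan headers. The table lists only the codes mentioned in the discussion, and VK_ERROR_UNKNOWN is treated as allowed for every command, as stated above.

```python
# Hypothetical table: the result codes discussed above for one entry point.
DOCUMENTED_RESULTS = {
    "vkCreateWaylandSurfaceKHR": {
        "VK_SUCCESS",
        "VK_ERROR_OUT_OF_HOST_MEMORY",
        "VK_ERROR_OUT_OF_DEVICE_MEMORY",
    },
}

# Per the discussion, VK_ERROR_UNKNOWN may come back from any command.
ALWAYS_ALLOWED = {"VK_ERROR_UNKNOWN"}


def undocumented_result(command: str, result: str) -> bool:
    """Return True if `result` is not listed for `command` in the table.

    Commands missing from the table have no documented codes here, so
    anything except the always-allowed codes is flagged for them.
    """
    allowed = DOCUMENTED_RESULTS.get(command, set()) | ALWAYS_ALLOWED
    return result not in allowed
```

With this table, `undocumented_result("vkCreateWaylandSurfaceKHR", "VK_ERROR_SURFACE_LOST_KHR")` flags exactly the case raised above, while VK_ERROR_UNKNOWN passes.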
ayaka_ has joined #dri-devel
oneforall2 has joined #dri-devel
guru__ has joined #dri-devel
crabbedhaloablut has joined #dri-devel
oneforall2 has quit [Ping timeout: 480 seconds]
kzd_ has joined #dri-devel
kzd_ has quit []
orbea has quit [Remote host closed the connection]
orbea has joined #dri-devel
kzd has quit [Ping timeout: 480 seconds]
<DemiMarie>
Someone ought to fuzz Mesa and make sure it returns VK_ERROR_OUT_OF_HOST_MEMORY where needed. /hj
jewins has joined #dri-devel
aravind has joined #dri-devel
Guest1402 has quit [Ping timeout: 480 seconds]
sukrutb_ has joined #dri-devel
JohnnyonFlame has joined #dri-devel
<Lynne>
it's not like linux will ever return null for malloc, sadly, but on windows it's possible
<ishitatsuyuki>
there are some CTS tests that use a custom allocator to simulate this. it's the C programmer's greatest foe ;P
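A toy model of that kind of test, to show the idea rather than the actual CTS implementation: an allocator that fails on a chosen call, and a sweep that fails each allocation in turn and checks that the error is reported and nothing leaks. `FailingAllocator`, `create_object` and `sweep_failure_points` are hypothetical names, and the three allocation sizes are arbitrary.

```python
class FailingAllocator:
    """Allocator that fails on the N-th call (0-based) to simulate OOM."""

    def __init__(self, fail_at):
        self.fail_at = fail_at
        self.calls = 0
        self.live = set()  # handles that were allocated and not yet freed

    def alloc(self, size):
        index = self.calls
        self.calls += 1
        if index == self.fail_at:
            return None  # models malloc() returning NULL
        handle = object()
        self.live.add(handle)
        return handle

    def free(self, handle):
        self.live.discard(handle)


def create_object(allocator, sizes=(64, 128, 256)):
    """Make several allocations; on failure, unwind and report an error."""
    acquired = []
    for size in sizes:
        handle = allocator.alloc(size)
        if handle is None:
            # Release everything acquired so far, newest first.
            for earlier in reversed(acquired):
                allocator.free(earlier)
            return "VK_ERROR_OUT_OF_HOST_MEMORY", None
        acquired.append(handle)
    return "VK_SUCCESS", acquired


def sweep_failure_points(num_allocs=3):
    """Fail each allocation in turn; check the error cascades without leaks."""
    for fail_at in range(num_allocs):
        allocator = FailingAllocator(fail_at)
        result, obj = create_object(allocator)
        assert result == "VK_ERROR_OUT_OF_HOST_MEMORY"
        assert obj is None
        assert not allocator.live  # the error path released everything
    return True
```

Sweeping the failure point over every allocation is what makes the otherwise never-executed error paths actually run.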
tristan has joined #dri-devel
tristan is now known as Guest1408
ayaka_ has quit [Remote host closed the connection]
ayaka_ has joined #dri-devel
Leopold_ has quit []
bmodem has joined #dri-devel
<HdkR>
Linux not returning null for malloc? That's easy, just run out of virtual address space
<airlied>
just be a 32-bit game :-P
Leopold_ has joined #dri-devel
<HdkR>
Yea, a 32-bit game is easy mode :D
<kode54>
that reminds me
<kode54>
I can't get Yuzu to run on Vulkan on ANV right now
<kode54>
it's dying and throwing a terminating exception because some Vulkan call returns VK_ERROR_UNKNOWN
<kode54>
naturally, the console output doesn't say where this is being thrown from
<airlied>
alyssa: do you do function calls? :-) 24687 has some spirv/nir bits
jewins has quit [Ping timeout: 480 seconds]
ayaka_ has quit [Ping timeout: 480 seconds]
bmodem has quit [Ping timeout: 480 seconds]
bmodem has joined #dri-devel
bmodem has quit [Excess Flood]
bmodem has joined #dri-devel
Guest1408 has quit [Ping timeout: 480 seconds]
rauji___ has joined #dri-devel
<Lynne>
ishitatsuyuki: it's my greatest regret, writing all this nice and neat resilient code to cascade all errors, but never actually using it or seeing it run
<Lynne>
linux should give oom'd programs at least half a chance of closing carefully by letting malloc return null, but far too much code has been written under the assumption it won't
tristan has joined #dri-devel
tristan is now known as Guest1411
tzimmermann has joined #dri-devel
<HdkR>
Lynne: I would love some inotify system to register low memory situations. Kind of like cgroup notifications but app level
aravind has quit [Ping timeout: 480 seconds]
<kode54>
dj-death: do I need to poke that issue where I posted that trace?
aravind has joined #dri-devel
<dj-death>
kode54: that would be helpful
<kode54>
will do
<dj-death>
kode54: I thought you said it was a crash
<kode54>
I meant the one where i915 was running slowly for one game
<kode54>
I added the traces and generated a new log
<kode54>
but then I didn't realize you went on holiday
<dj-death>
normally all VK_ERROR_* should go through vk_errorf
<kode54>
this is a different thing
<dj-death>
there should be a trace somewhere
<kode54>
VK_ERROR_UNKNOWN was Yuzu
<kode54>
the log I added traces for was Borderlands: GOTY Enhanced
Guest1411 has quit [Ping timeout: 480 seconds]
<kode54>
I'm apparently tracking multiple issues
<kode54>
in different hings
<kode54>
*things
mszyprow has joined #dri-devel
<kode54>
not sure what to do about yuzu
<kode54>
I'll have to look at the source code to see why it's just throwing an exception
<kode54>
yuzu doesn't have a single reference to vk_errorf
camus has quit [Read error: Connection reset by peer]
camus has joined #dri-devel
<dj-death>
kode54: I meant the vulkan driver
<kode54>
oh
<kode54>
how do I get those messages if the app isn't able to show them?
<kode54>
which environment variable do I need to set to make them all just dump to the console?
JohnnyonFlame has quit [Read error: Connection reset by peer]
<kj>
gfxstrand: to double check, it's not acceptable to compile down to SPIR-V or NIR (and serialize) at build time for internal shaders?
<kj>
So the options would be to check in glsl and glsl_to_nir(), or build the shader in nir at runtime
rasterman has joined #dri-devel
vliaskov has joined #dri-devel
<kj>
Asking because for pvr we still need to unhardcode some internal shaders so would be nice to write them up with something more higher level than rogue ir (which we've done atm)
<kj>
I recall a conversation here about setting up a common way of doing things for internal shaders but not sure what happened with that
epony has quit [autokilled: This host violated network policy. Mail support@oftc.net if you think this is in error. (2023-09-01 08:47:19)]
<dj-death>
kode54: thanks yeah
<dj-death>
kode54: really looks like a window system issue
<kode54>
which window system should I use?
<dj-death>
kode54: looks like the game is like waiting to get a new buffer for an image for like 160ms
<kode54>
is there one especially suited to this task?
<dj-death>
kode54: I think most people use gnome-shell
<kode54>
Xorg or Wayland?
<kode54>
I have nasty window scaling glitches on both gnome and plasma
<kode54>
resizing the scaling of one output to 200% causes the window shadows to glitch out
<kode54>
would love to have usable dumping so we can get to the bottom of these GuC crashes
<dj-death>
if only we could have the type of dma-fence the kernel is waiting on
cmichael has quit [Quit: Leaving]
<kode54>
oh, the other one? that would be great
YuGiOhJCJ has quit [Quit: YuGiOhJCJ]
ayaka_ has quit [Ping timeout: 480 seconds]
yyds has quit [Remote host closed the connection]
kts has quit [Quit: Konversation terminated!]
swizzlefowl has joined #dri-devel
<kode54>
maybe DSB?
<swizzlefowl>
Hello
youmukon1 has joined #dri-devel
<zamundaaa[m]>
MrCooper: I'm doing a bunch of performance work for KWin and found something pretty unexpected
youmukonpaku1337 has quit [Read error: Connection reset by peer]
<zamundaaa[m]>
When I'm rendering to a gbm buffer (imported as an EGLImage, which is used as the color attachment for an fbo) and call glFinish(), the fds of the buffer aren't always immediately readable afterwards
<zamundaaa[m]>
This seems to happen quite seldomly on AMD, and more often on Intel. Are my expectations for how this works just wrong, or could there be some driver bugs involved?
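For reference, the readability check in question is presumably a zero-timeout poll() on the buffer's fd; that is an assumption, since the message does not say how readability is tested. As far as I understand the kernel's dma-buf documentation, POLLIN on a dma-buf fd is meant to become ready once pending writes to the buffer have completed. A minimal sketch of such a check follows; `fd_is_readable` is a hypothetical helper that works on any pollable fd, and a pipe stands in for the dma-buf in the demo.

```python
import os
import select


def fd_is_readable(fd: int) -> bool:
    """Non-blocking check: does poll() report POLLIN for this fd right now?"""
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    events = poller.poll(0)  # timeout of 0 ms: return immediately
    return any(revents & select.POLLIN for _, revents in events)


if __name__ == "__main__":
    # A pipe is only a stand-in; a real test would pass the dma-buf fd.
    read_end, write_end = os.pipe()
    print(fd_is_readable(read_end))  # False: nothing written yet
    os.write(write_end, b"x")
    print(fd_is_readable(read_end))  # True: data is pending
    os.close(read_end)
    os.close(write_end)
```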
donaldrobson has quit [Ping timeout: 480 seconds]
camus has quit [Ping timeout: 480 seconds]
yuq825 has left #dri-devel [#dri-devel]
youmukonpaku1337 has joined #dri-devel
youmukon1 has quit []
<DemiMarie>
kode54: Fuzz the GuC and report the bugs to Intel?
<DemiMarie>
Sorry
bmodem has quit [Ping timeout: 480 seconds]
<DemiMarie>
Asahi Lina has some experience debugging firmware crashes.
bmodem has joined #dri-devel
jewins has joined #dri-devel
<alyssa>
kj: compiling glsl/cl to spir-v at build-time is fine.
<alyssa>
compiling to nir & serializing is somewhat more sketchy. but intel goes all the way to hw binaries at build time so YMMV i guess
bmodem has quit [Ping timeout: 480 seconds]
<alyssa>
i'll probably send out common code for doing CL kernels at build time in a reasonably generic way. I have a vague plan to let CL C be usable for certain vertex/fragment shaders too, if that's something that's needed.
<alyssa>
Not sure what kinds of shaders you're talking about. For small stuff usually nir_builder is the right call, it's just the chunky monkey kernels that really benefit.
<DavidHeidelberg[m]>
eric_engestrom: the build job limit sounds like good idea, MR Ack :)
fab has quit [Quit: fab]
<kj>
alyssa: thanks, we have compute shaders for queries which are fairly simple, but there's also a whole bunch of shaders used for transfer stuff which might be a bit more involved (haven't looked in depth there)
kzd has joined #dri-devel
<penguin42>
If I've got a 'Compute Shader LLVM IR' dumped from rusticl debug, is there any way I can push it through llvm to see what it does?
<alyssa>
kj: yeah.. I've done some pretty intense nir_builder but yeah working with C is a lot nicer :p
<alyssa>
see agx_nir_lower_texture.c if you want to be scared :p
<alyssa>
(not really a candidate for CL at this point)
OftenTimeConsuming has quit [Remote host closed the connection]
OftenTimeConsuming has joined #dri-devel
<karolherbst>
penguin42: llvm has tooling for their stuff, like llvm-dis and whatnot
<penguin42>
karolherbst: Yeh so I've seen those, but I don't know how to go from the debug output of mesa to feeding that into llvm so I can play around with llvm to see why it's doing what it's doing
<karolherbst>
yeah... sadly I don't really know much about how to dig into those things deeper from an AMD perspective, might want to ask around in #radeon as there are some LLVM developers who might be able to help out with it
<penguin42>
karolherbst: Ack, do you know how to throw that debug into llvm ?
<karolherbst>
no idea how to trigger the normal pipeline, however there is AMD_DEBUG=asm or something to print the actual hardware level IR
<karolherbst>
or something
<penguin42>
karolherbst: Yeh so I have the IR and I have the asm, I wanted to play around with what was inbetween; mostly this is trying to understand where that weird load ordering thing came from
<karolherbst>
ahh
<karolherbst>
I guess for that you'll have to compile LLVM and see what passes it runs
<karolherbst>
maybe there is an LLVM option to print what it runs
<karolherbst>
dunno
<penguin42>
yeh I guess there is once I can figure out how to run it :-)
<karolherbst>
just use your local LLVM build instead of your system one
<penguin42>
karolherbst: But what options do I pass to llvm to take that IR and spit out that asm?
<penguin42>
karolherbst: I don't see any of that at the moment because I only have the rusticl debug
fab has joined #dri-devel
<karolherbst>
there is no simple solution here
<penguin42>
ok
<karolherbst>
your best bet is to just use the current mesa code
<karolherbst>
and whatever radeonsi is doing
<karolherbst>
I don't know if you can do that stuff on the cli even
<penguin42>
ok - that's what I was really after, because I was assuming doing it from the CLI would be the best way to add llvm debug/tracing/etc
<karolherbst>
there might be a way, I just don't know it
<karolherbst>
but you can also just write a small tool doing the same thing
<karolherbst>
e.g. copying the pipeline radeonsi is using
<penguin42>
karolherbst: I can see optimising shader code can drive people nuts; I was finding one change to my shader made rusticl faster and ROCm slower or the other way around
<karolherbst>
yeah...
<karolherbst>
optimizing compilers be like that
<karolherbst>
at some point it's hard to find those changes which are always a benefit
<karolherbst>
so, some stuff gets slower, some stuff gets faster
<penguin42>
karolherbst: I'm suspecting some of this might be 'bank clashes' - but wth knows; AMDs pretty profiling tools look like they need their kernel drivers
<eric_engestrom>
DavidHeidelberg[m]: thanks! mind saying that on the MR? :P
<eric_engestrom>
pendingchaos: (sorry for the delay, been doing too many things lately and I forgot to read my mentions here) I'll continue making new 23.1.x releases until 23.2.0 is out, no matter how long it takes
tzimmermann has quit [Quit: Leaving]
Duke`` has joined #dri-devel
mripard has quit [Quit: mripard]
Jeremy_Rand_Talos_ has quit [Read error: Connection reset by peer]
Jeremy_Rand_Talos_ has joined #dri-devel
Mis012[m]1 is now known as Mis012[m]
Jeremy_Rand_Talos_ has quit [Remote host closed the connection]
Jeremy_Rand_Talos_ has joined #dri-devel
Jeremy_Rand_Talos_ has quit [Remote host closed the connection]
Jeremy_Rand_Talos_ has joined #dri-devel
sgruszka has quit [Ping timeout: 480 seconds]
yyds has quit [Remote host closed the connection]
dviola has quit [Quit: WeeChat 4.0.4]
ahajda has quit [Quit: Going offline, see ya! (www.adiirc.com)]
Guest1428 has quit [Remote host closed the connection]
lemonzest has quit [Quit: WeeChat 4.0.4]
dviola has joined #dri-devel
<Lynne>
cargo is nice and all until a fresh sync takes 500 megabytes and you're on a bad connection, and a crate decides it absolutely must use nightly, as all crates do
<Venemo>
is gitlab down again?
<Lynne>
just a throwaway comment
ahajda has joined #dri-devel
lemonzest has joined #dri-devel
rasterman has quit [Quit: Gettin' stinky!]
jimc has quit [Read error: Connection reset by peer]
mszyprow has quit [Ping timeout: 480 seconds]
<alyssa>
something something cargo cult
<gfxstrand>
:P
<karolherbst>
🦀
lynxeye has quit [Quit: Leaving.]
<anarsoul>
Lynne: just don't do sync when you're on a bad connection
<Lynne>
having a bad connection is hardly a choice
<tnt>
I've got an application causing : "[drm] GPU HANG: ecode 12:1:85dcfdfb, in ngscopeclient" (intel 12th gen, vulkan app).
<tnt>
How would one go about tracing what's going on ?
frankbinns has quit [Remote host closed the connection]
<cmarcelo>
does anyone foresee glsl_function_type() being useful again (only user was spirv, but stopped in favor of own implementation, right now is dead code)? deciding here if I can just remove or make changes to improve with others as part of an ongoing MR.
tobiasjakobi has joined #dri-devel
tobiasjakobi has quit []
mszyprow has joined #dri-devel
<dcbaker>
mattst88: did you mean to request a review from marge on the intel_clc series, or did you mean to assign it to marge?
<mattst88>
dcbaker: lol, derp
<mattst88>
thanks for noticing that
kts has joined #dri-devel
<dcbaker>
that looks good to me, btw. I'd still like to get it to where we don't need to do that, but there are so many assumptions in meson's implementation that things are for the host and only the host it's turning into a slug fest with some seriously annoying problems
<alyssa>
dcbaker: I'm experimenting with a generic mesa_clc
<dcbaker>
so, same problem then?
<alyssa>
current tentative plan is that it goes CLC->SPIR-V but does not touch any NIR
<alyssa>
which should be a lot fewer deps but yes, same problem until clang can do that itself
<dcbaker>
the good news is Mesa isn't the only project with this problem, it turns out that other big complex projects run into the same issue when cross compiling
<dcbaker>
at least, good in that everyone agrees it needs to happen, lol
<DemiMarie>
Sorry for all the unanswerable questions I asked earlier!
djbw has joined #dri-devel
<alyssa>
nod
<alyssa>
i expect by end-of-year asahi will have a hard build-dep on clc
<alyssa>
We have a significant need from it and we have buy in from Fedora and Intel's already doing it for raytracing so sure yeah why not
<alyssa>
and asahi only needs to build on arm and x86, so no LLVM problem
<dcbaker>
sigh. LLVM.
junaid has joined #dri-devel
<dcbaker>
karolherbst: I'm going to start reviewing the next round of the crates in meson work next week, I know gfxstrand has been trying it out. Are there any crates you need/want?
Nyamiou has joined #dri-devel
gouchi has joined #dri-devel
Nyamiou has quit []
<alyssa>
dcbaker: Debian has some build rules to disable LLVM on exotic architectures (':
<alyssa>
So no CLC deps in common code without angering somebody.
<alyssa>
Although... if they're cross building it shouldn't matter?
<karolherbst>
dcbaker: uhm.. mostly just syn and serde
<alyssa>
Like you should be able to run intel_clc/mesa_clc on the host with the host LLVM and then do an LLVM-free target mesa build using the precompiled kernels
<alyssa>
you don't get Rusticl support but the BVH kernels etc should work fine in that set up
<alyssa>
we dont support that on the mesa side but we.. probably could?
<dcbaker>
karolherbst: cool, I'm pretty sure syn already works
<dcbaker>
I'll make sure we test out serde
<karolherbst>
cool
<dcbaker>
alyssa: yeah.... I'm just not sure how you'd go about supporting OpenCL without llvm/clang at this point
<karolherbst>
there are probably random others I might want to use in the future, but those would be a good start to drop some code
<alyssa>
dcbaker: ~~gcc-spirv when~~ delet
<dcbaker>
alyssa: I mean, if someone else wants to write the code and it works...
<alyssa>
dcbaker: :p
<alyssa>
I don't love runtime LLVM deps, honestly I build -Dllvm=disabled myself up until now
<alyssa>
but I feel entitled to a buildtime LLVM dep o:)
<alyssa>
(I already build mesa with clang, this is just more of that :D)
<dcbaker>
I don't care that much about buildtime deps, but I usually build with gcc and getting the right version of LLVM can be a real pain sometimes
<alyssa>
I do know an llvm spirv target was talked about, I wonder if mesa_clc will be obsoleted in due time..
<alyssa>
except for libclc, src/compiler/clc/ isn't doing much that clang couldn't do itself..
<alyssa>
:q
<karolherbst>
yeah.. maybe.. if it's not causing any regressions in the CTS that is
<alyssa>
yeah
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
glennk has quit [Remote host closed the connection]
glennk has joined #dri-devel
glennk has quit [Remote host closed the connection]
a-865 has quit [Ping timeout: 480 seconds]
glennk has joined #dri-devel
<airlied>
zmike: you should have made that blog a monetized twitter post, controversy gets clicks!
a-865 has joined #dri-devel
<zmike>
airlied: there's no controversy there, just people who agree with me and people who are wrong
mszyprow has quit [Ping timeout: 480 seconds]
<airlied>
that's the attitude to get more elon bucks :-P
<alyssa>
so it turns out I've been doing something absolutely stupid in mesa for years
<alyssa>
kewl.
<karolherbst>
heh
<alyssa>
well, loop's gone now, with -O0
<karolherbst>
:D
<karolherbst>
do we want to know?
<alyssa>
No
<alyssa>
Even so, with -O3 there's a big pile of load/store_scratch that goes away with -O0
<karolherbst>
yeah....
<alyssa>
i'm guessing I'm missing some NIR copyprop pass somewhere
<alyssa>
seems like we should be able to see thru that
<karolherbst>
so you have scratch stuff with O3 but not with O0?
<karolherbst>
anyway.. you can always copy whatever rusticl is doing to get rid of those things
<karolherbst>
the entire pipeline is cursed
<karolherbst>
kinda
<karolherbst>
but it kinda also works
<airlied>
the translator docs I think state O0 only is supported, anything else is a crap shoot
<karolherbst>
yeah.. I'm also a bit hesitant about even landing some of the llvm opt work because of this
<airlied>
but yeah it might be worth running some llvm passes, but also I think we can do a lot on the NIR side to step up
<karolherbst>
it's literally not tested anywhere
<airlied>
and close the gaps
<airlied>
if we want NIR to be a real life compute compiler
<karolherbst>
right.. I didn't want to get more optimized binaries, just smaller ones
<karolherbst>
and I was able to reduce binary sizes by like 60%
<karolherbst>
it's just not very stable
<airlied>
I suppose in theory spirv-opt could be used as well, if it ever did anything useful :-P
<karolherbst>
it speeds up caching/ reduces peak memory usage and other benefits, sadly we probably can't rely on it
<karolherbst>
yeah... that gives me another 25% reduction
<karolherbst>
it's all in the MR
<karolherbst>
sadly.. I can't use the impressive `MergeFunctions` LLVM pass
<karolherbst>
so the benefits are all kinda smallish
<karolherbst>
`MergeFunctions` generates function pointers in a few places
<karolherbst>
alyssa: anyway, I think the asahi CL stuff is ready to land, I've listed all the remaining problems, but nothing stands out really and I mitigated the linear image issue as much as possible. Now you simply can't map 3D images, but whatever. Maybe I should just assign to marge and... figure out timestamps after that
mvlad has quit [Remote host closed the connection]
Mangix has quit [Ping timeout: 480 seconds]
<alyssa>
karolherbst: I had scratch with O0 but disappeared with O3. I think I screwed up my pass order or whatever, will look at it harder, NIR should be able to breeze thru this
<karolherbst>
yeah.. it should
<karolherbst>
just run all the passes 5 times or something
<alyssa>
Lol
<alyssa>
20:01 airlied | if we want NIR to be a real life compute compiler
<alyssa>
Yeah... A big chunk of stuff that CL wants, VK also wants and we don't have any LLVM to cheat off there. So trying to get NIR into shape seems like the better long-term approach, idk
<alyssa>
nir_opt_deref doesn't remove the cast and nir_opt_constant_folding doesn't see through the cast, so that turns into a load_global (!) instead of a load_const
Duke`` has quit [Ping timeout: 480 seconds]
<karolherbst>
constant isn't in shader constant memory though
<karolherbst>
it's just a ubo (more or less)
<karolherbst>
just with global addressing
<alyssa>
still supposed to be constant folded.
<karolherbst>
does the constant variable have a constant initializer?
<alyssa>
yes
<alyssa>
it's just the cast in the way
junaid has quit [Remote host closed the connection]
<karolherbst>
so without the cast it would constant fold? Mhh.. normally we kinda drop pointless casts but there are a few restrictions in place
<karolherbst>
but yeah, if the source is constant known at compile time we should constant fold it
<alyssa>
admittedly dont fully understand whats happening here but it should also constant fold..
<karolherbst>
it might be that most of the folding only reliably works once IO is lowered
<alyssa>
But it's too late by then, since lowering I/O turns this into load_global(load_constant_base_ptr)
<karolherbst>
mhhh... yeah, it shouldn't...
<alyssa>
this needs to be optimized away in derefs
mszyprow has joined #dri-devel
<karolherbst>
I think the problem here is, that casting a constant memory pointer to a different address space is kinda UB
<alyssa>
there's no constant memory pointer in the C code?
<alyssa>
*CL kernel
<alyssa>
just literals that clang decided to turn into constant memory
<karolherbst>
right... it's kinda weird tbh
<karolherbst>
what's the CLC source?
<alyssa>
agx_pack(..) { ..}
<karolherbst>
not really sure what that gets generated into, but in theory it should just be a stack variable getting fields assigned, so I'm kinda confused why it's doing this kinda nonsense in nir
<karolherbst>
or rather, I don't see where this cast would come from
<karolherbst>
does it look better in the spir-v? though I suspect not
<karolherbst>
or maybe?
<karolherbst>
what does the nir straight out of spirv_to_nir look like?
<karolherbst>
but anyway... casting from constant to generic is just not legal
<karolherbst>
the thing is... because it's all coming from C you also can't just drop random cast, because $reasons
<karolherbst>
like e.g. if you'd do (global* int)some_local_memory_ptr, you also can't just load from the local address, because it's technically a bug in the source code and UB
<karolherbst>
but only in the sense of your pointer is probably pointing to invalid memory
<karolherbst>
but what if you do (local* int)(global* int)...
<alyssa>
there's no cast to global?
<alyssa>
what's happening is that there's a constant struct
<alyssa>
that would be fine, if we split the struct with split_struct_vars
<alyssa>
but that bails on complex uses, because of the deref_ptr_as_array
<alyssa>
which in turn nir_deref.c claims should be eliminated by nir_opt_deref but that's not happening
<karolherbst>
ehhh wait.. I misinterpreted the " 64 %9 = deref_cast (uvec2 *)%5 (constant uvec2) (ptr_stride=8, align_mul=0, align_offset=0)" thing...
<alyssa>
presumably my pass order is busted.
<karolherbst>
if it's a pointless cast, opt_deref should kinda be able to get rid of it
<karolherbst>
alyssa: btw, did you call explicit_type?
<karolherbst>
some of the passes rely on explicit type information
<alyssa>
which is being inserted by nir_lower_memcpy, I think
<karolherbst>
it's kinda funky, that with the constant struct initializer nir at some point does load_const, but for whatever reason it thinks it should pipe that through scratch memory...
<karolherbst>
the deref chains for store and load are different
<karolherbst>
so we fail to see they are equal
<karolherbst>
or rather point to equal things
<karolherbst>
there are dumb LLVM reasons for it, and the translator is also not being super nice to us
<karolherbst>
so when storing it, you have explicit struct member accesses
<karolherbst>
but on load you don't have the struct information and it just does raw vec/scalar loads
<karolherbst>
it's really annoying
<karolherbst>
however, we should still be able to optimize it away :D
<karolherbst>
it's just that our opt_deref isn't smart enough for that yet
<alyssa>
alright..
<karolherbst>
I think there is an MR for that...
<karolherbst>
maybe not
<karolherbst>
gfxstrand might remember
<alyssa>
in this case at least, the obvious problem is that we're lowering memcpy to a raw copy of bytes, which fundamentally impedes other opts
<karolherbst>
but yeah.. I think ultimately this is something we can only clean up after io lowering
<karolherbst>
mhhh.. yeah, just..
<karolherbst>
In my example there is no memcpy
<alyssa>
two solutions are to either lower memcpys of structs to memcpys of each element separately, if we know it's tightly packed & so on
<alyssa>
or to teach struct lowering to delete memcpies
<alyssa>
uh
<alyssa>
or to teach struct splitting to split memcpys
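The first option, splitting a struct memcpy into per-field copies when the struct is tightly packed, can be sketched on a toy layout description. This is an illustration of the idea only, not NIR code: `Field`, `is_tightly_packed` and `split_struct_memcpy` are hypothetical names, and a struct is modelled as a flat list of (name, offset, size) entries.

```python
from typing import List, NamedTuple, Optional, Tuple


class Field(NamedTuple):
    name: str
    offset: int  # byte offset within the struct
    size: int    # size in bytes


def is_tightly_packed(fields: List[Field], struct_size: int) -> bool:
    """True if the fields cover [0, struct_size) with no gaps or overlaps."""
    cursor = 0
    for field in sorted(fields, key=lambda f: f.offset):
        if field.offset != cursor:
            return False  # padding hole or overlapping fields
        cursor += field.size
    return cursor == struct_size


def split_struct_memcpy(
    fields: List[Field], struct_size: int, copy_size: int
) -> Optional[List[Tuple[str, int, int]]]:
    """Split a whole-struct memcpy into per-field copies.

    Returns a list of (field name, offset, size) copies, or None when the
    copy has to stay an opaque byte copy (partial copy, or padding present).
    """
    if copy_size != struct_size or not is_tightly_packed(fields, struct_size):
        return None
    ordered = sorted(fields, key=lambda f: f.offset)
    return [(f.name, f.offset, f.size) for f in ordered]
```

For a struct of two 4-byte fields copied whole, this yields two 4-byte copies that later per-field passes could reason about; a struct with padding, or a partial copy, is left alone.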
<karolherbst>
ehh wait.. I failed to use my search function.. there is a memcpy
<karolherbst>
let's see...
<karolherbst>
yeah soo.. we can't do much useful with that memcpy
<karolherbst>
it's just taking the raw pointer and copies the function_temp stuff into it
<karolherbst>
so out of LLVM/SPIR-V it's already a plain byte copy
<karolherbst>
and not much we can really do about it
<karolherbst>
and I don't think that before IO lowering is the place where we could actually resolve that, because we'd have to know the actual offsets the load/stores go to
<karolherbst>
in order to propagate it
<karolherbst>
the downside of doing this after io lowering is, that we already allocated scratch space
<karolherbst>
I honestly don't know what would be the best path forward here
<karolherbst>
maybe we can convince LLVM to not do this nonsense? but then we can also get spir-v doing it anyway
alyssa has quit [Quit: alyssa]
alyssa has joined #dri-devel
<alyssa>
ok.. I think we can split the memcpy, at least in the simple case I'm looking at
<alyssa>
but the original case didn't have a memcpy there, just stores with deref_ptr_as_array..
<alyssa>
oh, but there's legitimately a cast happening in that one
Mangix has joined #dri-devel
<alyssa>
even though it's a cast between.. morally equivalent things
<karolherbst>
I wonder if the better strategy is to simply convert everything to byte arrays as a intermediate step before io lowering.. :D
<karolherbst>
but that's going to be messy
gouchi has quit [Quit: Quitte]
<alyssa>
i mean.. trying to unlower scratch back to SSA sounds like you're in for a bad time
<karolherbst>
yup
<karolherbst>
I think all solutions here are messy in one or the other way
<alyssa>
i think the memcpy is the root brokenness
<karolherbst>
sure
<alyssa>
This is the nonsense that we get with everything up until lowering memcpys, triggered in my kernel by passing a struct around: