ChanServ changed the topic of #asahi-gpu to: Asahi Linux GPU development (no user support, NO binary reversing) | Keep things on topic | GitHub: https://alx.sh/g | Wiki: https://alx.sh/w | Logs: https://alx.sh/l/asahi-gpu
MajorBiscuit has quit [Ping timeout: 480 seconds]
alyssa has joined #asahi-gpu
<alyssa> 8: 0509000e00c0f200 device_load 0, i32, xyzw, r1_r2_r3_r4, u0_u1, r0, unsigned
<alyssa> 10: 0529040e00c0f200 device_load 0, i32, xyzw, r5_r6_r7_r8, u2_u3, r0, unsigned
<alyssa> 18: 0541080e00c0f200 device_load 0, i32, xyzw, r8_r9_r10_r11, u4_u5, r0, unsigned
<alyssa> 20: 3800 wait 0
<alyssa> I wonder why this code sequence doesn't want to work :V
<alyssa> I just want multiple loads in flight :d
<alyssa> 2 loads seems to work, but 3 no?
<alyssa> which is strange because Apple's compiler is happy to generate sequences like that
<alyssa> why does my wait not work but there's does? :P
<alyssa> theirs
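A quick sanity check one can run on a paste like the one above is whether any two loads that are in flight before the `wait` write to overlapping destination registers. The sketch below is only an illustration: the helper names `dest_registers` and `overlapping_loads` are hypothetical, and it assumes the destination is the one operand written as an underscore-joined register range (for example `r1_r2_r3_r4`). The log does not say whether an overlap is related to the failure being discussed.

```python
import re

# Assumption: the destination operand is the only underscore-joined rN range on the line.
DEST_RE = re.compile(r"\b(r\d+(?:_r\d+)+)\b")

def dest_registers(line):
    """Return the set of destination register numbers named on a disassembly line."""
    match = DEST_RE.search(line)
    if match is None:
        return set()
    return {int(token[1:]) for token in match.group(1).split("_")}

def overlapping_loads(lines):
    """List (i, j, shared_registers) for every pair of lines whose destinations overlap."""
    regs = [dest_registers(line) for line in lines]
    return [
        (i, j, sorted(regs[i] & regs[j]))
        for i in range(len(regs))
        for j in range(i + 1, len(regs))
        if regs[i] & regs[j]
    ]

# The three loads pasted above, copied verbatim from the log.
loads = [
    "device_load 0, i32, xyzw, r1_r2_r3_r4, u0_u1, r0, unsigned",
    "device_load 0, i32, xyzw, r5_r6_r7_r8, u2_u3, r0, unsigned",
    "device_load 0, i32, xyzw, r8_r9_r10_r11, u4_u5, r0, unsigned",
]

for i, j, shared in overlapping_loads(loads):
    print(f"loads {i} and {j} both write r{shared}")
```

For this paste the check reports that the second and third loads both name r8 as a destination.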
<alyssa> oh, hold on
<alyssa> now when I make it load vec4 it instead changes to
<alyssa> 1e: 0501000a00c8f200 device_load.TODO 0, i32, xyzw, r0_r1_r2_r3, u0_u1, r0, unsigned, lsl 2, 0, 0, 4, 0
<alyssa> 26: 0529a00e00c8f200 device_load 0, i32, xyzw, r5_r6_r7_r8, u0_u1, r5, unsigned, lsl 2
<alyssa> 2e: 0551800e00c8f200 device_load 0, i32, xyzw, r10_r11_r12_r13, u0_u1, r4, unsigned, lsl 2
<alyssa> 36: 3800 wait 0
<alyssa> ain't this what we saw this morning?
<alyssa> much ado about a single bit.
<alyssa> does it for vec3
<alyssa> and vec2
<alyssa> but not scalars
<alyssa> so, so many questions
<alyssa> what in the world could this single celebrity bit do
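To see which bit is being talked about, one can XOR the `.TODO` encoding against the two plain `device_load` encodings from the same paste and keep only the difference common to both. This is a minimal sketch, assuming the hex strings list the instruction bytes in the order shown by the disassembler. The helper name `byte_diffs` is hypothetical, and nothing here says what the bit means.

```python
def byte_diffs(a_hex, b_hex):
    """Return {(byte_index, xor_mask)} for every byte that differs between two encodings."""
    a, b = bytes.fromhex(a_hex), bytes.fromhex(b_hex)
    if len(a) != len(b):
        raise ValueError("encodings must have the same length")
    return {(i, x ^ y) for i, (x, y) in enumerate(zip(a, b)) if x != y}

# Encodings copied verbatim from the paste above.
todo_load = "0501000a00c8f200"                        # device_load.TODO
plain_loads = ["0529a00e00c8f200", "0551800e00c8f200"]  # plain device_load

# Differences caused by operands (destination, index register) vary per pair;
# whatever survives the intersection is common to both comparisons.
common = set.intersection(*(byte_diffs(todo_load, p) for p in plain_loads))

for index, mask in sorted(common):
    print(f"byte {index}: mask {mask:#04x}")
```

On these three encodings the only shared difference is mask 0x04 in byte 3 (0x0a in the `.TODO` load versus 0x0e in the others), so in this paste the bit is clear on the `.TODO` instruction and set on the plain ones.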
bluetail0 has joined #asahi-gpu
bluetail8 has joined #asahi-gpu
bluetail8 has quit []
bluetail has quit [Ping timeout: 480 seconds]
bluetail0 has quit [Ping timeout: 480 seconds]
<TellowKrinkle> Weird, I'm getting this on my M1 (no .TODO):
<TellowKrinkle> 24: 0529a40e00c8f200 device_load 0, i32, quad, r5_r6_r7_r8, u2_u3, r5, unsigned, lsl 2
<TellowKrinkle> 34: 3800 wait 0
<TellowKrinkle> 1c: 0501040e00c8f200 device_load 0, i32, quad, r0_r1_r2_r3, u2_u3, r0, unsigned, lsl 2
<TellowKrinkle> 2c: 0551240e01c8f200 device_load 0, i32, quad, r10_r11_r12_r13, u2_u3, r9, unsigned, lsl 2
<TellowKrinkle> (Compute shader that loads 3 float4s, does an fma, and stores the result)
<alyssa> TellowKrinkle: and the plot thickens
<alyssa> this is in a fragment shader. I wonder if that makes a difference.
<alyssa> indeed I don't get it in a vertex shader
<alyssa> wtfff
<alyssa> so if I can get the non-TODO one in a VS
<alyssa> why isn't this working for my compiler
<alyssa> so... what gives?!
<alyssa> i get it, it's only okay when apple's compiler does it..
<alyssa> same index, different base, maybe that's part
<alyssa> 14: 3e9108e02500 convert f_to_u32, r4, r15, rtz
<alyssa> 1a: 0501800e00c8f200 device_load 0, i32, xyzw, r0_r1_r2_r3, u0_u1, r4, unsigned, lsl 2
<alyssa> 22: 0531840e00c8f200 device_load 0, i32, xyzw, r6_r7_r8_r9, u2_u3, r4, unsigned, lsl 2
<alyssa> 2a: 0559880e00c8f200 device_load 0, i32, xyzw, r11_r12_r13_r14, u4_u5, r4, unsigned, lsl 2
<alyssa> 32: 3800 wait 0
<alyssa> nope no difference
<alyssa> so apple just... generates asm that looks like mine... but it doesn't work for me...
<TellowKrinkle> BTW on that fragment shader one, were you doing anything with side effects (e.g. writing to device memory)? I can only get the .TODO if I do that and I don't also add "[[early_fragment_tests]]"
<alyssa> no side effects..
<alyssa> now i'm not getting the .TODO in the frag shader either
<alyssa> uh
<alyssa> right, wasn't for scalar loads, only vector
<alyssa> uh why don't i get it for this vector.
<alyssa> signed index, 3 different buffers, same idx
<alyssa> 3 different buffers, same idx, but an UNSIGNED index and I get the .TODO on the first
<alyssa> there now i got it with a signed index
<alyssa> i honestly think the logic is just
<alyssa> bool APPLE_PROPRIETARY_COMPILER_should_set_random_bit(void) {
<alyssa>     if (setting_the_bit_would_fuck_with_alyssa_more_than_not_setting_it()) {
<alyssa>         return true;
<alyssa>     } else {
<alyssa>         return false;
<alyssa>     }
<alyssa> }
<alyssa> maybe it's some sort of cache hint, idk
<alyssa> in terms of data structure changes
swaggie has quit [Read error: Connection reset by peer]
swaggie has joined #asahi-gpu
<alyssa> for the VS, unk2 is 1 on the no buffer load case and 2 on the 3 load case
<alyssa> uses way more regs obviously
<alyssa> different uniforms obviously
<alyssa> for the FS, unk 2 is set when there are loads and unset when there are none
<alyssa> that seems potentially more interesting
<alyssa> but other than that, nothing
<alyssa> so we have 2 random magic bits with no discernible pattern
<alyssa> and inexplicable deqp fails on linux
<alyssa> and ... yeah, this seems like a normal state of affairs, let's play supertuxkart instead
<alyssa> unless i have some unrelated bug where the VS is now faster causing me to lose some other race condition
<alyssa> that's.. unlikely but i'm running out of ideas
cylm_ has quit [Ping timeout: 480 seconds]
balrog has quit [Quit: Bye]
<alyssa> __builtin_ffs();
<alyssa> I found my bug
<alyssa> Unrelated to any of the magic, just a pure academic error
<alyssa> this is going on Spot The Bug
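The log does not say what the error actually was. For readers unfamiliar with the builtin, one property that is easy to trip over is that GCC's `__builtin_ffs` returns one plus the index of the least significant set bit, and 0 when the input is 0. The sketch below models that behaviour in Python for a 32-bit value. The function name `ffs` is just a stand-in for the builtin, and presenting this as the relevant pitfall is an assumption.

```python
def ffs(x: int) -> int:
    """Model of __builtin_ffs on a 32-bit value: 1-based index of the lowest set bit, 0 if none."""
    x &= 0xFFFFFFFF          # keep only the low 32 bits, like a C unsigned view of the int
    return (x & -x).bit_length()  # x & -x isolates the lowest set bit

# A 0-based bit index needs an explicit "- 1", and the zero case needs its own handling.
for value in (0b0001, 0b1000, 0b1010, 0):
    one_based = ffs(value)
    zero_based = one_based - 1 if one_based else None
    print(f"{value:#06b}: ffs={one_based}, zero-based index={zero_based}")
```

Using the return value directly as a bit index is off by one, and subtracting 1 without checking for zero yields -1.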
<TellowKrinkle> Okay the ".TODO" seems to only be added if the input comes from a flat interpolated input?
<alyssa> TellowKrinkle: I was getting it with uint(smooth input)
<TellowKrinkle> alyssa: Good to know
<TellowKrinkle> Okay I was able to get the TODO with smooth input
<TellowKrinkle> buffer[uint(smooth.x)] * buffer[uint(smooth.y)] => "device_load.TODO"
<TellowKrinkle> buffer[uint(smooth.x) * 8 + 0] * buffer[uint(smooth.x) * 8 + 2] => "device_load"
balrog has joined #asahi-gpu
<TellowKrinkle> buffer[uint(smooth.x) * 8 + 0] * buffer[uint(smooth.y) * 8 + 2] => device_load
balrog has quit [Quit: Bye]
<alyssa> truly an RNG
<TellowKrinkle> So far, it seems that if the input is smooth, doing anything other than just converting it to a uint removes the .TODO. But if the input is flat, I can do all sorts of math on it and it stays .TODO. Can't possibly imagine what such a flag would be useful for...
<alyssa> brain
evtyz has joined #asahi-gpu
evtyz has quit [Quit: My MacBook Air has gone to sleep. ZZZzzz…]
Gues__________________________ has joined #asahi-gpu
Gues__________________________ has quit [Quit: My MacBook Air has gone to sleep. ZZZzzz…]
balrog has joined #asahi-gpu
WindowPain_ has joined #asahi-gpu
WindowPain has quit [Ping timeout: 480 seconds]
bluetail8 has joined #asahi-gpu
MajorBiscuit has joined #asahi-gpu
Hibyehello_ has joined #asahi-gpu
Major_Biscuit has joined #asahi-gpu
Hibyehello has quit [Ping timeout: 480 seconds]
MajorBiscuit has quit [Ping timeout: 480 seconds]
SSJ_GZ has joined #asahi-gpu
cylm_ has joined #asahi-gpu
swaggie has quit [Remote host closed the connection]
swaggie has joined #asahi-gpu
swaggie has quit [Remote host closed the connection]
swaggie has joined #asahi-gpu
SSJ_GZ has quit [Ping timeout: 480 seconds]
swaggie has quit [Ping timeout: 480 seconds]
bcrumb has joined #asahi-gpu
swaggie has joined #asahi-gpu
swaggie has quit [Ping timeout: 480 seconds]
SSJ_GZ has joined #asahi-gpu
bcrumb has quit [Quit: WeeChat 3.7.1]
bluetail84 has joined #asahi-gpu
bluetail8 has quit [Ping timeout: 480 seconds]
bluetail84 is now known as bluetail8
swaggie has joined #asahi-gpu
swaggie has quit [Ping timeout: 480 seconds]
pjakobsson_ has quit [Remote host closed the connection]
bcrumb has joined #asahi-gpu
bcrumb has quit []
VinDuv has quit [Quit: ZNC 1.8.2+deb2+b1 - https://znc.in]
VinDuv has joined #asahi-gpu
LinuxM1 has joined #asahi-gpu
bcrumb has joined #asahi-gpu
bcrumb has quit []
LinuxM1 has quit [Ping timeout: 480 seconds]
Major_Biscuit has quit [Ping timeout: 480 seconds]
bcrumb has joined #asahi-gpu
snouhaud has joined #asahi-gpu
systwi_ has quit [Ping timeout: 480 seconds]
snouhaud has quit []
snouhaud has joined #asahi-gpu
systwi has joined #asahi-gpu
bcrumb has quit [Quit: WeeChat 3.7.1]
bcrumb has joined #asahi-gpu
bcrumb has quit []
swaggie has joined #asahi-gpu
swaggie has quit [Ping timeout: 480 seconds]
cylm_ has quit [Ping timeout: 480 seconds]
Dcow has quit [Remote host closed the connection]
Dcow has joined #asahi-gpu
Gues__________________________ has joined #asahi-gpu
Gues__________________________ has quit [Ping timeout: 480 seconds]
leftas has joined #asahi-gpu
mkurz has joined #asahi-gpu
<mkurz> marcan: Lina updated the GPU tracker issue (https://github.com/AsahiLinux/linux/issues/72), but linked to the same kwin issue twice ("... plus KWin bug, plus core Mesa bug...") She didn't link to the Mesa bug actually.
Dcow has quit [Remote host closed the connection]
Dcow has joined #asahi-gpu
Dcow has quit [Read error: Connection reset by peer]
Dcow has joined #asahi-gpu
Rayyan_ is now known as Rayyan
snouhaud has quit []
<alyssa> Linear texture: must be of type MTLTextureType2D or linear MTLTextureType2DArray, textureType (MTLTextureType1DArray) disallowed
<alyssa> pray tell, metal, how do I get a linear MTLTextureType2DArray
<alyssa> bloody hell
Dcow has quit [Remote host closed the connection]
Dcow has joined #asahi-gpu
Dcow has quit [Ping timeout: 480 seconds]
SSJ_GZ has quit [Ping timeout: 480 seconds]
Major_Biscuit has joined #asahi-gpu