marcan changed the topic of #asahi-gpu to: Asahi Linux GPU development (no user support, NO binary reversing) | Keep things on topic | GitHub: https://alx.sh/g | Wiki: https://alx.sh/w | Logs: https://alx.sh/l/asahi-gpu
chengsun_ has joined #asahi-gpu
capta1nt0ad has joined #asahi-gpu
chengsun has quit [Ping timeout: 480 seconds]
chengsun_ has quit [Ping timeout: 480 seconds]
capta1nt0ad has quit [Remote host closed the connection]
<chadmed> alyssa: works on kernel 332ca551eb93
Emantor_ has quit []
Emantor has joined #asahi-gpu
zalyx has quit [Quit: later alligator]
zalyx has joined #asahi-gpu
user982492 has joined #asahi-gpu
ma has joined #asahi-gpu
ma4 has quit [Ping timeout: 480 seconds]
user982492 has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
mini0n has joined #asahi-gpu
alyssa has joined #asahi-gpu
<alyssa> oh lol, pool allocations in magic.c mean the BO handles are wrong, I assume that's why macOS+mesa was unstable
<alyssa> (and why Linux was fine)
mini0n has quit []
SSJ_GZ has joined #asahi-gpu
corion has joined #asahi-gpu
corion has quit [Quit: Page closed]
mattgirv has quit [Server closed connection]
mattgirv_ has joined #asahi-gpu
cr1901 has quit [Read error: No route to host]
cr1901 has joined #asahi-gpu
<lina> jannau: Is this with yesterday's clustering fixes, or only the incomplete ones from wednesday?
<lina> steven: I replied on #asahi-stream but we should probably talk about texture issues here since alyssa is here ^^
<jannau> lina: current gpu/rust-wip is broken. not sure that clustering itself is the problem as setting WaitForPowerOff avoids the issue as well
corion has joined #asahi-gpu
n1c has quit [Server closed connection]
n1c has joined #asahi-gpu
robinp has joined #asahi-gpu
<jannau> lina: bisected to bedf1a6cc218669 "drm/asahi: render,buffer: Implement TVB auto-expansion"
corion has quit [Quit: Page closed]
nuup has quit [Server closed connection]
nuup has joined #asahi-gpu
corion has joined #asahi-gpu
corion has quit [Remote host closed the connection]
corion has joined #asahi-gpu
corion_ has joined #asahi-gpu
corion has quit [Remote host closed the connection]
corion_ has quit []
<lina> Interesting!
<lina> I did that late last night so it didn't get much testing...
<lina> I think I'll have to go back to tracing macOS VM accesses around that part of the code, there's probably something I'm doing wrong...
<lina> jannau: setting initial_tvb_size large enough should also avoid the problem, right?
<lina> You can also add debug_flags=0x2000 to get messages when TVB growth happens, which might be useful
<lina> This might just be that the TVB is too small to work properly and the logic for the initial size is wrong, or missing some factors
<lina> I know it's missing tile size, but I think that mesa branch doesn't have varying tile size yet?
chengsun has joined #asahi-gpu
anuejn has joined #asahi-gpu
<jannau> doesn't look like mesa:asahi/main has varying tile size
<jannau> if the initial tvb size was wrong, shouldn't I already see issues with "drm/asahi: buffer,render: Add TVB minimum size formula and enforce it"?
corion has joined #asahi-gpu
vup has joined #asahi-gpu
<jannau> lina: https://www.jannau.net/asahi/gpu/2022-12-03_gpu_tvb_buffer_grow_supertuxkart_start_race_bedf1a6cc21.log, supertuxkart start at "Dez 03 11:50:30", start race at "Dez 03 11:50:57" I think
<lina> jannau: If the growth is what breaks then setting the initial TVB size higher would avoid that
<lina> jannau: Okay, that's weird. I don't think supertuxkart is more than one process? So that should be one TVB, so it not growing linearly is wrong...
<lina> Unless it creates multiple GL contexts by opening the device more than once somehow?
Dcow has quit [Ping timeout: 480 seconds]
<lina> Oh wait, those are other apps aren't they
<lina> jannau: The grep cut off the fault info, can you get me that? ^^
Dcow has joined #asahi-gpu
LinuxM1 has joined #asahi-gpu
<jannau> lina: still broken with an insane initial_tvb_size=0x200, no tvb resizes with that, fault is the same with a different vm_slot
<jannau> is 'address: 0xc' expected? I guess not
LinuxM1 has quit [Quit: Leaving]
corion has quit [Quit: Page closed]
c10l has quit [Quit: Bye o/]
goldsoultheory has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
c10l has joined #asahi-gpu
xiaojun has joined #asahi-gpu
xiaojun has quit []
chadmed_ has joined #asahi-gpu
yamii has joined #asahi-gpu
<lina> jannau: That sounds like a null pointer, but then it can't bisect to that commit because that commit's code is never running...
goldsoultheory has joined #asahi-gpu
<jannau> lina: bisect runtested 918e75228b2fe6c
<jannau> lina: bisect tested 918e75228b2fe6c explicitly without reproducing the issue
<lina> This stuff sometimes isn't reproducible... yesterday's clustering stuff broke glmark2-refract at 1080p initially but not later
<lina> But larger sizes always broke
<lina> The only part of bedf1a6cc218669 that changed that doesn't only run with TVB resizes is the 4 lines of code around @@ -360,10 +386,6 @@ impl Buffer::ver {
<lina> Can you try reverting (re-adding) those and seeing if it does anything?
<lina> If that's not it then I don't understand
<lina> You can also try removing the matching store in auto_grow() which also runs unconditionally
<lina> (I basically just moved that store)
goldsoultheory has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
<jannau> sigh, bisect seems to be bad, error reproducible with bedf1a6cc2186 reverted
chadmed_ has quit [Remote host closed the connection]
<jannau> testing supertuxkart now with kms instead of wayland and now everything is bad, landing now on "drm/asahi: Add overflow detection/guard areas to allocator" but probably only since I marked the previous commit as good without testing
<jannau> disabling Clustering still fixes it
<lina> I don't think the previous commit could've caused it?
<lina> It sounds like this is probably broken ever since I enabled clustering (so 44113d0e62)
<jannau> I'd guess so too but unfortunately not consistently broken
goldsoultheory has joined #asahi-gpu
goldsoultheory has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
<lina> alyssa: If you get a chance, try echo 0x01000000000000 > /sys/module/asahi/parameters/debug_flags
<lina> It seems to make things 5-10% faster (at least on M1 Ultra), but I have no idea what it does or whether always enabling that has a downside
<lina> jannau: I can reproduce it, by the way ^^
<lina> But I still have no idea what it is...
SSJ_GZ has quit [Ping timeout: 480 seconds]
<lina> Okay, now I know what it is
<lina> asahi.debug_flags=0xf00002000 at boot time (important) makes it work, and also spam dmesg with buffer overflow errors
<lina> Yay, my allocator debug stuff is useful! ^^
<lina> It's the cluster tilemaps!
<lina> So why are those undersized...
<lina> Or maybe I still got the stride somewhere wrong...
SSJ_GZ has joined #asahi-gpu
<lina> Actually I think in this particular case you don't even need to set it at boot time, that only matters for the kernel allocators anyway
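[Editor's note] The debug_flags values quoted above are bitmasks, so it can help to see which bits each one sets. The snippet below is only an illustrative sketch: the helper name set_bits is made up for this note, and the only flag meaning taken from the log is that 0x2000 enables messages on TVB growth. What the remaining bits do is not stated here, beyond lina's remark that the combined value turns on her allocator debug checks.

```python
# Hypothetical helper (not part of the driver): list the bit positions set in a mask.
def set_bits(mask):
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]

TVB_GROWTH_MSGS = 0x2000        # from the log: prints messages when the TVB grows
BOOT_VALUE = 0xf00002000        # value lina passed as asahi.debug_flags
PERF_EXPERIMENT = 0x01000000000000  # value lina suggested echoing into sysfs

# Strip the known TVB-growth bit to see what else the boot value enables.
remaining = BOOT_VALUE & ~TVB_GROWTH_MSGS

print(hex(remaining))             # 0xf00000000
print(set_bits(BOOT_VALUE))       # [13, 32, 33, 34, 35]
print(set_bits(PERF_EXPERIMENT))  # [48]
```

So the boot-time value is the TVB growth flag (bit 13) plus four adjacent bits (32 to 35), and the experimental performance flag is a single separate bit (48).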
rappet has quit [Server closed connection]
rappet has joined #asahi-gpu
Rayyan has quit [Server closed connection]
Rayyan has joined #asahi-gpu
corion has joined #asahi-gpu
<lina> OK, looks like we have double the usual number of tilemaps?
user982492 has joined #asahi-gpu
ayke has quit [Server closed connection]
ayke has joined #asahi-gpu
corion has quit [Quit: Page closed]
Dcow has quit [Remote host closed the connection]
Dcow has joined #asahi-gpu
deflax has quit [Server closed connection]
deflax has joined #asahi-gpu
jn has quit [Server closed connection]
jn has joined #asahi-gpu
os has quit [Server closed connection]
os has joined #asahi-gpu
paddatrapper_ has quit [Server closed connection]
paddatrapper_ has joined #asahi-gpu
<lina> jannau: Should be fixed, but I'm really not happy with the magic *2 factor...
<lina> If you ever notice weird breakage again (especially clustering-related), try that flag. It should quickly find if a buffer is undersized somewhere.
corion has joined #asahi-gpu
c10l has quit [Read error: Connection reset by peer]
c10l has joined #asahi-gpu
goldsoultheory has joined #asahi-gpu
ChaosPrincess has quit [Quit: WeeChat 3.7.1]
ChaosPrincess has joined #asahi-gpu
SSJ_GZ has quit [Ping timeout: 480 seconds]
Dcow has quit [Remote host closed the connection]