#asahi-gpu on 2023-07-05 — irc logs at oftc.irclog.whitequark.org

2022-12-21 00:46 ChanServ changed the topic of #asahi-gpu to: Asahi Linux GPU development (no user support, NO binary reversing) | Keep things on topic | GitHub: https://alx.sh/g | Wiki: https://alx.sh/w | Logs: https://alx.sh/l/asahi-gpu

00:20 crabbedhaloablut has quit [Read error: Connection reset by peer]

00:21 crabbedhaloablut has joined #asahi-gpu

00:27 nsklaus has quit [Remote host closed the connection]

01:49 chadmed has quit [Quit: Konversation terminated!]

01:50 chadmed has joined #asahi-gpu

01:59 balrog has quit [Ping timeout: 480 seconds]

02:01 yuyichao has joined #asahi-gpu

02:12 <alyssa> lina: 42 files changed, 2624 insertions(+), 469 deletions(-)

02:13 <alyssa> wow, sure whittled down my patch stack a lot T_T

02:17 mikelee has joined #asahi-gpu

02:19 mikelee has quit [Remote host closed the connection]

02:42 <alyssa> eh, we'll get through it :)

02:51 mikelee has joined #asahi-gpu

02:53 <alyssa> Opened some more "hopefully not at all scary" MRs to whittle things down

02:55 <alyssa> lina: https://gitlab.freedesktop.org/asahi/mesa/-/merge_requests/95 <-- this one you definitely want to look @

03:47 lina has quit [Remote host closed the connection]

03:49 lina has joined #asahi-gpu

03:50 <lina> Looks like my previous kernel fix didn't fix the Firefox hangs, it just made them not spin a CPU core... ;;

03:59 PyroPeter_ has joined #asahi-gpu

04:01 cyrozap has quit [Quit: ZNC 1.8.2+deb2+b1 - https://znc.in]

04:01 PyroPeter has quit [Ping timeout: 480 seconds]

04:01 cyrozap has joined #asahi-gpu

04:06 cyrozap has quit []

04:06 cyrozap has joined #asahi-gpu

04:09 mikelee has quit [Read error: Connection reset by peer]

04:09 cyrozap has quit []

04:10 cyrozap has joined #asahi-gpu

04:27 mikelee has joined #asahi-gpu

05:13 mikelee has quit [Remote host closed the connection]

05:14 mikelee has joined #asahi-gpu

05:15 mikelee has quit [Remote host closed the connection]

05:15 mikelee has joined #asahi-gpu

05:15 mikelee has quit [Remote host closed the connection]

05:16 mikelee has joined #asahi-gpu

05:24 crabbedhaloablut has quit []

05:26 crabbedhaloablut has joined #asahi-gpu

05:37 cylm has quit [Ping timeout: 480 seconds]

05:40 mikelee has quit [Remote host closed the connection]

05:40 mikelee has joined #asahi-gpu

05:41 mikelee has quit [Remote host closed the connection]

05:41 mikelee has joined #asahi-gpu

06:05 hightower3 has quit []

06:13 mikelee_ has joined #asahi-gpu

06:20 mikelee has quit [Ping timeout: 480 seconds]

07:38 bisko has joined #asahi-gpu

08:16 chadmed has quit [Remote host closed the connection]

08:29 nsklaus has joined #asahi-gpu

08:42 chadmed has joined #asahi-gpu

10:03 chadmed has quit [Remote host closed the connection]

10:03 chadmed has joined #asahi-gpu

11:36 <lina> jannau: Do you remember what test was failing with the TVB thing?

11:37 <jannau> lina: no

11:37 <jannau> at least not off the top of my head

11:38 <lina> Do you remember if it was dEQP-GLES3 or dEQP-GLES31 or something else?

11:38 <jannau> lina: deqp-gles3/functional/shaders/builtin_functions/precision/matrixcompmult/highp_fragment/mat3

11:43 <jannau> lina: with asahi.initial_tvb_size=64 and alyssa's es31 branch only atomic tests failed

11:45 <jannau> yes, es31. the top 2 commits were unfinished tests for the heisenbug

11:49 <jannau> yes, that looks like atomics tests

11:50 <jannau> I suspect some atomics tests might succeed spuriously

11:50 <lina> These are the tests that failed without "atomic" in the name but I think they're related?

11:51 <lina> dEQP-GLES31.functional.compute.basic.image_barrier_multiple,Fail

11:51 <lina> dEQP-GLES31.functional.compute.basic.ssbo_cmd_barrier_multiple,Fail

11:51 <lina> dEQP-GLES31.functional.compute.indirect_dispatch.gen_in_compute.multiple_groups,Fail

11:51 <lina> dEQP-GLES31.functional.compute.indirect_dispatch.upload_buffer.multiple_groups,Fail

11:51 <lina> dEQP-GLES31.functional.image_load_store.early_fragment_tests.early_fragment_tests_stencil,Fail

11:51 <lina> dEQP-GLES31.functional.image_load_store.early_fragment_tests.no_early_fragment_tests_depth,Fail

11:52 <jannau> looks familiar, Alyssa looked at some/all of them and said they use atomics

12:09 bisko has quit [Ping timeout: 480 seconds]

12:17 <alyssa> correct

12:19 <lina> TL;DR today I fixed a bunch of kernel stuff, both more sizing/param issues (I have testable test data for it now!) and also implemented the sync TVB growth event, though I don't think we actually use/need it now; it was getting triggered by some bad settings I think. But you can test it with ASAHI_MESA_DEBUG=synctvb if you want (see open MR).

12:19 <lina> I got through GLES3 in that mode but GLES31 deadlocked or something, I'll look into it later with lockdep (might be a regression with synctvb, if so we don't care). Without any overrides things work ^^

12:38 <alyssa> woot

12:38 <alyssa> any progress on our bugs? or is this the strategy of "well, it definitely fixed SOMETHING!"

12:38 <alyssa> :-D

12:40 possiblemeatball has quit [Quit: Quit]

13:01 chadmed has quit [Ping timeout: 480 seconds]

13:03 chadmed has joined #asahi-gpu

13:39 <lina> alyssa: Well the one jannau ran into with the TVB size at least... and I'm pretty sure I fixed other stuff too, yes ^^

13:52 <alyssa> I heh

13:53 <alyssa> it's my understanding the atomics with clustering issue and the heisenbug are the hot items

13:53 <alyssa> (blocking CTS)

13:54 maria has quit [Ping timeout: 480 seconds]

13:55 maria has joined #asahi-gpu

14:13 WindowPa- has joined #asahi-gpu

14:13 WindowPain has quit [Read error: Connection reset by peer]

15:11 <_jannau__> I think only atomics are blocking CTS on t600x, the heisenbug didn't reproduce there

15:16 cylm has joined #asahi-gpu

15:21 Dementor has quit [Remote host closed the connection]

15:23 Dementor has joined #asahi-gpu

15:36 nela0 has joined #asahi-gpu

15:38 nela has quit [Ping timeout: 480 seconds]

15:38 nela0 is now known as nela

15:51 <alyssa> _jannau__: let's do process of elimination

15:52 <alyssa> _jannau__: can you git clone github.com/dougallj/applegpu on macOS on t600x, run https://paste.debian.net/1285009 and send the output?

15:54 <alyssa> (Or anyone else with an M1 Pro/Max/Ultra)

16:04 <dottedmag> alyssa: https://paste.debian.net/1285010/

16:04 <dottedmag> m1max, if it matters

16:08 balrog has joined #asahi-gpu

16:11 <alyssa> dottedmag: thanks

16:12 <alyssa> that indeed sets 1 bit that isn't set on M1

16:12 <alyssa> bit 45

16:16 <alyssa> I wrote a patch and it wedged my M1

16:16 <alyssa> that sounds like we're on the right track

16:16 <alyssa> reason: AfFault,

16:16 <alyssa> spicy

16:17 <alyssa> ooh and green screen when shutting down

16:17 <alyssa> did i wedge dcp too? powerful mesa patch

16:21 <alyssa> _jannau__: https://rosenzweig.io/0001-HACK-Clustered.patch

16:21 <alyssa> Give that a try on t600x

16:21 <alyssa> with the atomics tests

16:21 <alyssa> if that changes anything, let me know drm_asahi_params_global::num_clusters_total so I can do the proper runtime detection and not wedge t8103

16:27 <jannau> alyssa: all atomic tests in es31 pass now on t6002

16:27 <alyssa> Oof.

16:27 <alyssa> Alright.

16:27 <alyssa> I mean, that's good but

16:27 <alyssa> spicy.

16:28 <alyssa> I had a hunch already that bit 45 controls cache of some kind

16:28 <alyssa> this.. adds to that suspicion >:)

16:33 * alyssa digs up her notes on cache bits

16:34 <jannau> num_clusters_total should be 2 / 4 / 8 for m1 pro / max / ultra

16:35 <alyssa> k, yeah

16:36 <alyssa> then https://rosenzweig.io/0001-asahi-agx-Set-coherency-bit-for-clustered-targets.patch should work everywhere

16:36 <jannau> the lower core count versions always have cores disabled in all clusters
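
A minimal sketch of the runtime check alyssa describes above, not the actual Mesa patch: set bit 45 only when the kernel reports more than one cluster. The function name `apply_coherency_bit`, treating the descriptor as a plain 64-bit integer, and the assumption that plain M1 (t8103) reports a single cluster are all hypothetical; only bit 45 and the 2 / 4 / 8 cluster counts come from the log.

```python
# Hypothetical sketch: gate the "coherency" bit on the cluster count.
COHERENCY_BIT = 1 << 45  # bit 45, as identified in the log


def apply_coherency_bit(word: int, num_clusters_total: int) -> int:
    """Return `word` with bit 45 set only on multi-cluster GPUs."""
    if num_clusters_total > 1:
        return word | COHERENCY_BIT
    return word


# Cluster counts for M1 Pro / Max / Ultra are from the log; the value 1
# for plain M1 (t8103) is an assumption made for illustration.
for name, clusters in [("m1", 1), ("m1 pro", 2), ("m1 max", 4), ("m1 ultra", 8)]:
    print(f"{name}: {apply_coherency_bit(0, clusters):#x}")
```

Keeping the check on the cluster count rather than on a chip name means new multi-cluster parts would pick up the bit without a table update, assuming they behave like t600x.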

16:37 <jannau> shall I start a full CTS run?

16:38 <alyssa> jannau: Let me get you a non-stupid branch to use

16:39 <jannau> ok

16:42 Guest4807 has quit [Quit: Bridge terminating on SIGTERM]

16:42 rhysmdnz has quit [Quit: Bridge terminating on SIGTERM]

16:43 <alyssa> kicking off CTS on M1 to make sure I didn't break anything in the meantime

16:43 <alyssa> jannau: alyssa/mesa:agx/es31-v2

16:45 Jamie has joined #asahi-gpu

16:45 rhysmdnz has joined #asahi-gpu

16:45 Jamie is now known as Guest5071

16:45 <alyssa> That doesn't have your wide_color fix on the mesa side, and the 10-bit thing still hasn't landed upstream anyway, so unfortunately still an x11_egl run

16:45 <alyssa> but... should pass, probably, maybe? O:)

16:46 <jannau> aye

16:46 <alyssa> (Strictly you could do a Wayland run with your patch and point to it and say "look it's your bug" but it would complicate things.)

16:47 <alyssa> anyway, M1 CTS is running

16:47 <jannau> we could do wayland and disable 10-bit formats

16:47 <alyssa> meh

16:47 <alyssa> easier just to do the x11_egl run for now

16:47 <jannau> m1 ultra x11_egl CTS is running as well

16:47 <alyssa> who will win? =D

16:47 <alyssa> nominally the ultra but the CTS is completely single threaded CPU bound

16:48 <alyssa> so..

16:48 <jannau> the failing cts run was slower on the ultra than a successful one on m1

16:48 <jannau> iirc 41 minutes vs. 28 min

16:49 <alyssa> sure

16:49 <alyssa> deqp-runner does a bunch of debug stuff for failing tests, idk what the real CTS runner is doing

16:53 balrog has quit [Quit: Bye]

16:56 balrog has joined #asahi-gpu

17:37 yuka has quit [Remote host closed the connection]

17:40 <alyssa> M1 seemed to finish

17:40 yuka has joined #asahi-gpu

17:40 <jannau> finished here as well after 52 minutes, still failed

17:41 <alyssa> :'(

17:41 <jannau> KHR-GLES31.core.draw_indirect.advanced-twoPass-transformFeedback-arrays

17:41 <alyssa> what this time?

17:41 <alyssa> oof

17:41 <jannau> KHR-GLES31.core.draw_indirect.advanced-twoPass-transformFeedback-elements

17:41 <jannau> for various configs

17:42 <jannau> for all configs

17:43 <alyssa> Interesting

17:43 <alyssa> definitely passes on M1

17:44 <jannau> error is in all cases: "(x,y)= (0,0). Color RGBA(0,0,0,1) is different than expected RGBA(0.1,0.2,0.3,1)"

17:44 <alyssa> jannau: The potentially "spicy" part of those tests is that they feed transform feedback results into the indirect draw

17:45 <alyssa> since we dispatch xfb with the VDM (i.e. as vertex shaders with no fragment output), that's a forward VDM->VDM dependency, that does not require a full flush of the batch, but it does require a memory barrier

17:45 <alyssa> I'm wondering if this is morally the same issue as the atomics

17:46 <alyssa> the barrier we're using is strong enough to flush a cluster but not the whole system

17:46 <alyssa> See line 2892 of agx_state.c

17:47 <alyssa> try setting more bits (this will require adding extra fields in asahi/lib/cmdbuf.xml)

17:47 <alyssa> If someone has the ability to run wrap.dylib + agxdecode against t600x this corresponds to a metal memory barrier, that's where I got those magic bits in the first place

17:48 <alyssa> but in this case since we know what we're looking for bruteforcing might be faster anyway lol

18:10 <alyssa> jannau: any luck?

18:15 <jannau> alyssa: no. I think we're looking at different agx_state.c files. is line 2892 'cfg.unknown_30 = frag_tex_count >= 4;'?

18:17 <alyssa> uh, no

18:17 <alyssa> am i on the wrong branch

18:18 <alyssa> ok i was definitely way off

18:18 <alyssa> https://gitlab.freedesktop.org/alyssa/mesa/-/blob/agx/es31-v2/src/gallium/drivers/asahi/agx_state.c#L3228-3241

18:19 <alyssa> hopefully that makes more sense X_X

18:19 <jannau> yes

18:19 <alyssa> so sorry about that X_X

18:20 <jannau> adding more bits at the end or in the gaps? I suppose the answer is yes

18:21 <alyssa> yes

18:36 <ella-0> alyssa: tests i'm using are VK.api.copy_and_blit.core.image_to_image.simple_tests.*, branch is https://gitlab.freedesktop.org/Ella-0/mesa/-/tree/wip/agxv/meta

18:37 <alyssa> ella-0: thx

18:41 <jannau> alyssa: becomes a flake with bit (4 and) 24 set

18:42 <alyssa> uhoh

18:42 <alyssa> that's still setting all the other bits too?

18:43 <jannau> yes, those are in addition

18:43 <alyssa> interesting

18:43 lena6 has quit [Remote host closed the connection]

18:43 <alyssa> may be necessary to see what metal does after all :|

18:45 <alyssa> [encoder memoryBarrier: MTLBarrierScopeBuffers afterStages: MTLRenderStageVertex beforeStages: MTLRenderStageVertex]

18:46 <alyssa> or maybe need even more bits set. IDK

18:47 <jannau> all good with bits 4, 21-26 in addition

18:48 <alyssa> \o/

18:50 <alyssa> you going to write the patch then?

18:50 <jannau> if bit 27 didn't break it I'd say bits 20 to 27 are for each cluster

18:50 <jannau> let me first test if I need all bits

18:52 <alyssa> I could easily believe that

18:54 <jannau> minimal set seems to be 4, 24, 26
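
For reference, the bit sets jannau mentions can be written out as masks. This is only arithmetic on the bit numbers from the log; which command word the bits live in and what each bit means are not established here, and `mask_from_bits` is a hypothetical helper.

```python
def mask_from_bits(*bits: int) -> int:
    """OR together single-bit values for the given bit positions."""
    mask = 0
    for bit in bits:
        mask |= 1 << bit
    return mask


# "all good with bits 4, 21-26 in addition"
working_set = mask_from_bits(4, *range(21, 27))
# "minimal set seems to be 4, 24, 26"
minimal_set = mask_from_bits(4, 24, 26)

print(hex(working_set))  # 0x7e00010
print(hex(minimal_set))  # 0x5000010
```

These masks cover only the extra bits; per the log they are set in addition to the bits the barrier already used.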

18:56 <alyssa> are we sure it doesn't flake like that, though?

19:03 <jannau> reasonably sure. no fail in 1000 repeats and if I omit any of the bits it fails in 50% +/- 10
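
A rough way to put a number on "reasonably sure", assuming each repeat is an independent trial with a fixed failure probability (a simplifying assumption; real flakes are often correlated). With zero failures in n runs, the largest failure rate still consistent with that at 95% confidence is the p that solves (1 - p)^n = 0.05. The function names are hypothetical.

```python
def max_failure_rate(passes: int, confidence: float = 0.95) -> float:
    """Upper bound on the per-run failure rate after `passes` clean runs.

    Solves (1 - p) ** passes == 1 - confidence for p.
    """
    return 1.0 - (1.0 - confidence) ** (1.0 / passes)


def chance_of_clean_streak(failure_rate: float, runs: int) -> float:
    """Probability that `runs` independent runs all pass."""
    return (1.0 - failure_rate) ** runs


# 1000 clean repeats bound the residual flake rate at roughly 0.3%.
print(f"{max_failure_rate(1000):.4f}")
# If the ~50% failure rate seen with a bit omitted still applied,
# 1000 passes in a row would be vanishingly unlikely.
print(chance_of_clean_streak(0.5, 1000))
```

Under those assumptions, 1000 passes rule out the 50% flake seen with a bit missing, but a rare flake well below 1% could still go unnoticed.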

19:05 <alyssa> woo!

19:05 * alyssa eagerly awaits the patch.. and the qpas ;)

19:11 <jannau> cts is running and https://gitlab.freedesktop.org/asahi/mesa/-/merge_requests/99

19:11 <jannau> let me test if that breaks something on m1/m2

19:26 <alyssa> tangentially related, I don't think G13J exists

19:26 <alyssa> If I'm not mistaken m1 ultra is G13D

19:27 <alyssa> I may be mistaken of course

19:30 <jannau> you're correct. I think I mixed it up with the SoC family codename

19:33 <alyssa> nod

19:33 <alyssa> apple refers to the collection as G13X which isn't really right either :-D

19:42 systwi has quit [Ping timeout: 480 seconds]

20:03 <jannau> "56/56 sessions passed, conformance test PASSED"

20:14 <alyssa> WHOO!

20:28 systwi has joined #asahi-gpu

20:36 systwi_ has joined #asahi-gpu

20:37 systwi has quit [Ping timeout: 480 seconds]

20:41 c10l has quit [Quit: Bye o/]

20:42 c10l has joined #asahi-gpu

20:47 systwi has joined #asahi-gpu

20:51 systwi_ has quit [Ping timeout: 480 seconds]

20:57 <jannau> and the next target where fractional scaling breaks the x11_egl CTS. let's hope my fix gets merged soon so we can run under native wayland

21:05 mkurz has joined #asahi-gpu

21:16 <alyssa> :-D

21:25 systwi_ has joined #asahi-gpu

21:25 systwi has quit [Remote host closed the connection]

21:54 cylm_ has joined #asahi-gpu

21:56 cylm has quit [Ping timeout: 480 seconds]

22:23 systwi has joined #asahi-gpu

22:27 systwi_ has quit [Ping timeout: 480 seconds]

22:46 compassion178 has joined #asahi-gpu

22:49 compassion17 has quit [Ping timeout: 480 seconds]

22:49 compassion178 is now known as compassion17

23:06 pyropeter1 has joined #asahi-gpu

23:08 PyroPeter_ has quit [Ping timeout: 480 seconds]

23:35 nsklaus has quit [Remote host closed the connection]