#asahi-gpu on 2022-10-26 — irc logs at oftc.irclog.whitequark.org

2022-08-14 19:46 ChanServ changed the topic of #asahi-gpu to: Asahi Linux: porting Linux to Apple Silicon macs | GPU / 3D graphics stack black-box RE and development (NO binary reversing) | Keep things on topic | GitHub: https://alx.sh/g | Wiki: https://alx.sh/w | Logs: https://alx.sh/l/asahi-gpu

00:01 chengsun has quit [Quit: Quit]

00:01 chengsun has joined #asahi-gpu

00:01 Etrien___ has quit [Ping timeout: 480 seconds]

00:09 ella-0 has quit [Read error: Connection reset by peer]

00:59 sadams0978 has joined #asahi-gpu

01:00 sadams0978 has quit [Quit: Konversation terminated!]

01:45 Etrien has quit [Read error: Connection reset by peer]

01:46 Etrien has joined #asahi-gpu

01:57 pthariensflame has joined #asahi-gpu

02:06 pthariensflame has quit [Quit: Textual IRC Client: www.textualapp.com]

02:15 <alyssa> It's really interesting comparing statistics for a given compiled shader *across* instruction sets

02:15 <alyssa> here's a chunky shader from glmark2 -bterrain

02:15 <alyssa> shaders/glmark/1-22.shader_test - MESA_SHADER_FRAGMENT shader: 1294 inst, 8600 bytes, 50 halfregs, 1 threads, 0 loops, 0:0 spills:fills

02:16 <alyssa> shaders/glmark/1-22.shader_test - MESA_SHADER_FRAGMENT shader: 1249 inst, 16.015625 cycles, 16.015625 fma, 3.453125 cvt, 0.000000 sfu, 0.125000 v, 0.000000 t, 0.000000 ls, 632 quadwords, 1 threads, 0 loops, 0:0 spills:fills

02:16 <alyssa> top is Apple M1, bottom is Mali-G57

02:16 <alyssa> off the bat you'll notice that the statistics for Mali are a lot more advanced, because we actually understand the uarch there

02:16 <alyssa> (AGX will get similar stuff in time~)

02:16 <alyssa> but here's an obvious one:

02:17 <alyssa> AGX is 8600 bytes

02:17 <alyssa> Mali is ~10,000 bytes

02:17 <alyssa> (quadwords = 16 bytes)

02:17 <alyssa> This is interesting, a Mali-G57 instruction can actually do *more* than an AGX instruction

02:18 <alyssa> there are fewer of them (1249 vs 1294), but they're bigger

02:18 <alyssa> Why? mali-g57 uses fixed-length instructions, all are 8 bytes, and many are sparse

02:18 <alyssa> AGX uses variable length instructions, in this shader instructions are an average of 6.65 bytes
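
The size comparison above can be reproduced from the two statistics lines. This is a minimal sketch: the numbers are copied from the log, while the variable names are illustrative and not taken from Mesa.

```python
# Figures copied from the two shader-db lines quoted in the log.
agx_insts, agx_bytes = 1294, 8600          # Apple M1 (AGX)
mali_insts, mali_quadwords = 1249, 632     # Mali-G57

QUADWORD_BYTES = 16  # "(quadwords = 16 bytes)"
mali_bytes = mali_quadwords * QUADWORD_BYTES

# Average encoded size per instruction on each ISA.
agx_avg = agx_bytes / agx_insts
mali_avg = mali_bytes / mali_insts

print(f"AGX:  {agx_insts} inst, {agx_bytes} bytes, {agx_avg:.2f} bytes/inst")
print(f"Mali: {mali_insts} inst, {mali_bytes} bytes, {mali_avg:.2f} bytes/inst")
```

The Mali average comes out slightly above 8 bytes because it is derived from the quadword count rather than from the instruction count alone.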

02:20 <alyssa> Similarly, register pressure

02:20 <alyssa> it's not explicitly printed for Mali-G57, but "1 thread" means that it's using at least 33 dword registers

02:20 <alyssa> whereas AGX is using 50/2 = 25 dword registers
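
The register comparison is the same kind of arithmetic. A small sketch, assuming two 16-bit half registers per 32-bit dword register; the 33-register lower bound for Mali is the figure stated in the log, and the helper name is hypothetical.

```python
def halfregs_to_dwords(halfregs: int) -> int:
    """Convert a count of 16-bit half registers to 32-bit registers, rounding up."""
    return (halfregs + 1) // 2

agx_dwords = halfregs_to_dwords(50)   # "50 halfregs" from the AGX statistics line
mali_min_dwords = 33                  # lower bound implied by "1 thread", per the log

print(f"AGX uses {agx_dwords} dword registers")
print(f"Mali uses at least {mali_min_dwords}, i.e. at least "
      f"{mali_min_dwords - agx_dwords} more than AGX")
```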

02:20 <alyssa> why would the same program use so many fewer registers on AGX?

02:21 <alyssa> one answer is "I did a better job at register allocation for AGX than Mali"

02:32 chengsun_ has joined #asahi-gpu

02:35 chengsun has quit [Ping timeout: 480 seconds]

02:37 chengsun_ has quit [Read error: Connection reset by peer]

02:37 chengsun has joined #asahi-gpu

02:50 chengsun_ has joined #asahi-gpu

02:52 chengsun has quit [Ping timeout: 480 seconds]

02:55 chengsun_ has quit [Quit: Quit]

02:57 chengsun has joined #asahi-gpu

03:13 chengsun_ has joined #asahi-gpu

03:16 chengsun has quit [Ping timeout: 480 seconds]

03:22 chengsun_ has quit [Ping timeout: 480 seconds]

03:58 ella-0 has joined #asahi-gpu

04:00 Graypup_ has quit [Quit: meow]

04:01 Graypup_ has joined #asahi-gpu

04:31 bluetail has joined #asahi-gpu

04:55 Etrien has quit [Read error: Connection reset by peer]

04:55 Etrien has joined #asahi-gpu

05:37 SSJ_GZ has joined #asahi-gpu

05:44 Etrien has quit [Read error: Connection reset by peer]

05:45 Etrien has joined #asahi-gpu

05:55 <lina> Nice!! <3

06:12 Etrien__ has joined #asahi-gpu

06:17 Etrien has quit [Ping timeout: 480 seconds]

06:26 Etrien__ has quit [Read error: Connection reset by peer]

06:26 Etrien has joined #asahi-gpu

08:05 sharonmary6[m] has quit []

08:05 MatrixTravelerbot[m]1 has quit []

08:05 psydroid[m] has quit []

08:05 arisu has quit []

08:05 Ella[m] has quit []

08:05 Lucy[m] has quit [Quit: Bridge terminating on SIGTERM]

08:05 Soroush has quit []

08:06 Dcow has joined #asahi-gpu

08:11 capta1nt0ad has joined #asahi-gpu

08:44 subatomic has joined #asahi-gpu

08:46 Dcow_ has joined #asahi-gpu

08:53 Dcow has quit [Ping timeout: 480 seconds]

08:57 subatomic has quit [Quit: Textual IRC Client: www.textualapp.com]

09:00 capta1nt0ad has quit [Quit: Konversation terminated!]

09:01 chengsun has joined #asahi-gpu

09:03 geochip has joined #asahi-gpu

09:48 geochip has quit [Quit: leaving]

09:59 chadmed has joined #asahi-gpu

10:31 chadmed has quit [Quit: Konversation terminated!]

10:32 chadmed has joined #asahi-gpu

10:34 chadmed has quit []

10:40 kov has quit [Quit: Coyote finally caught me]

10:40 kov has joined #asahi-gpu

11:00 chadmed has joined #asahi-gpu

11:48 chadmed has quit [Read error: No route to host]

12:07 chadmed has joined #asahi-gpu

13:23 Gaspare has joined #asahi-gpu

13:37 chadmed has quit [Read error: No route to host]

13:40 n1c has quit [Quit: ZNC 1.8.2+deb1+focal2 - https://znc.in]

13:41 n1c has joined #asahi-gpu

13:46 Gaspare has quit [Read error: Connection reset by peer]

15:05 <lina> So the M1 Ultra just needed a bunch of new buffers and some larger ones, initdata changes, and a couple constants changed, and then it worked.

15:05 <lina> There's a Z acceleration/hierarchical Z thing that showed up, and then 5 new buffers adjacent to the tiler?

15:06 <lina> I get the feeling they're actually trying to balance work within single jobs between dies, which would explain why they need some new buffers to transfer things around.

15:07 <lina> Parallelizing fragment processing is trivial, but vertex is not, since it interacts with the tiler/sorting stuff

15:09 <alyssa> lina: as I texted you, the "Z acceleration" buffer is probably just from depth compression, which Apple's driver is aggressive about enabling and took me a lot of time to figure out how to disable

15:09 <alyssa> it does visually look like hier-z but I'm not convinced it actually is

15:14 <lina> alyssa: I've never seen it enabled on the M1 Mini, ever. I have zero hits for those pointers in all my historical hypervisor logs. But on this one, it showed up, and mesa was faulting without it... so I'm not sure you disabled it ^^

15:20 <alyssa> Very curious

15:20 <alyssa> I'd love to see the Mesa patch if you've pushed

15:20 <alyssa> (the diff from t8103 mesa I mean)

15:21 <lina> Let me do that!

15:22 <lina> The size is just random though, haven't worked out how to calculate any of it yet.

15:24 <alyssa> o

15:24 <alyssa> OK

15:25 <lina> Last 3 commits here: https://gitlab.freedesktop.org/asahilina/mesa/-/commits/asahi/wip

15:29 <lina> The stencil one is just a guess though, haven't actually seen it yet

15:48 <lina> Looks like it's 1/32 compression and it uses POT addressing, so align zbuffer size to POT and divide by 32 for the accel buffer size

15:48 <lina> It seems every 8x4 block of Z pixels maps to one accel buffer byte, 0x03 means clear.
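
The sizing rule described above can be sketched as follows. This rests on assumptions the log does not settle: the Z buffer size is counted in pixels (so that one 8x4 block maps to one byte), and the power-of-two rounding is applied to the total size rather than to each dimension. The function names are hypothetical and are not Mesa's.

```python
PIXELS_PER_ACCEL_BYTE = 8 * 4  # one accel buffer byte per 8x4 block of Z pixels
ACCEL_CLEAR_BYTE = 0x03        # value reported in the log as meaning "clear"

def next_pot(n: int) -> int:
    """Smallest power of two that is >= n (n must be positive)."""
    if n <= 0:
        raise ValueError("size must be positive")
    return 1 << (n - 1).bit_length()

def accel_buffer_size(z_pixels: int) -> int:
    """Assumed rule: round the Z buffer size up to a power of two, then divide by 32."""
    return next_pot(z_pixels) // PIXELS_PER_ACCEL_BYTE

# Example: a 1920x1080 depth buffer, with the accel buffer initialised to "clear".
size = accel_buffer_size(1920 * 1080)
accel = bytes([ACCEL_CLEAR_BYTE]) * size
print(size, len(accel))
```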

15:49 <lina> Still need to look at the deflake buffer sizes... 2 of them are obvious due to adjacency, but do you remember how you figured out the third bound?

16:45 <alyssa> lina: guess

16:45 <alyssa> or maybe not

16:45 <alyssa> no, adjacency as well

16:45 <alyssa> finding the next allocation in the same BO and using that as an upper bound
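
The adjacency trick can be illustrated in a few lines: sort the known allocation addresses within one BO and treat the gap to the next allocation as an upper bound on each buffer's size. The addresses below are made up for illustration and do not come from any real trace.

```python
def size_upper_bounds(starts: list[int], bo_end: int) -> dict[int, int]:
    """Map each allocation start to the distance to the next allocation.

    The last allocation is bounded by the end of the BO. The result is only
    an upper bound: the real buffer may be smaller than the gap.
    """
    ordered = sorted(starts)
    next_starts = ordered[1:] + [bo_end]
    return {addr: nxt - addr for addr, nxt in zip(ordered, next_starts)}

# Hypothetical offsets of three buffers inside a BO that ends at 0x8000.
bounds = size_upper_bounds([0x1000, 0x4000, 0x1800], bo_end=0x8000)
for addr, bound in bounds.items():
    print(f"buffer at {addr:#x}: at most {bound:#x} bytes")
```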

17:24 Gaspare has joined #asahi-gpu

18:01 Gaspare has quit [Ping timeout: 480 seconds]

18:31 Dcow_ has quit [Remote host closed the connection]

18:31 Dcow has joined #asahi-gpu

18:40 Dcow has quit [Ping timeout: 480 seconds]

19:11 Dcow has joined #asahi-gpu

19:55 yuyichao_ has quit [Remote host closed the connection]

19:55 yuyichao_ has joined #asahi-gpu

20:27 rwhitby has joined #asahi-gpu

20:41 Dcow has quit [Remote host closed the connection]

20:42 Dcow has joined #asahi-gpu

20:43 Dcow has quit [Remote host closed the connection]

20:43 Dcow has joined #asahi-gpu

21:05 rwhitby has quit [Quit: rwhitby]

21:15 Etrien__ has joined #asahi-gpu

21:18 <phire> I suspect the acceleration buffer stuff is just because whatever alyssa did to force a linear buffer has regressed for some reason

21:20 Etrien has quit [Ping timeout: 480 seconds]

21:30 SSJ_GZ has quit [Read error: No route to host]

21:31 Etrien__ has quit [Read error: Connection reset by peer]

21:31 Etrien has joined #asahi-gpu

21:57 <alyssa> s/linear/uncompressed/, IIRC it's still twiddled

21:57 <alyssa> Metal really doesn't like rendering to linear

21:59 <alyssa> It's probably not a *big* deal to support properly in mesa but that patch is not the way to do it

22:06 <phire> so it's just compressed vs uncompressed?

22:16 <alyssa> Yes, I think so

23:30 <jannau> hah, dcp supports XRGB after all, just not as a separate pixel format but via a flag in dcp_surface (either unk1 or unk2)

23:32 <jannau> but it took me far too long to realize that fbcon displayed just a transparent terminal

23:47 <phire> classic bug

23:56 <alyssa> jannau: Woof

23:56 <alyssa> How'd you discover the flag?