#panfrost on 2022-12-24 — irc logs at oftc.irclog.whitequark.org

2022-12-21 00:45 ChanServ changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard + Bifrost + Valhall - Logs https://oftc.irclog.whitequark.org/panfrost - I don't know anything about WSI. That's my story and I'm sticking to it.

01:44 jolan has joined #panfrost

01:52 alyssa has joined #panfrost

02:24 <alyssa> yeet

02:25 <alyssa> merry christmas I hope you enjoy your mesa coding style compliant panfrost

02:40 <alyssa> and yeeted

02:59 <CounterPillow> you too!

03:21 <alyssa> italove: rebased your open MRs

03:21 <alyssa> and rebased mine

03:21 <alyssa> and also rebased my old opencl branch but apparently all of the good stuff I already landed and all that's left are some hacks we never figured out

03:24 q4a1 has quit []

03:25 q4a has joined #panfrost

03:25 <q4a> alyssa: What panfrost task will you do next?

03:26 <alyssa> after "chill the heck out", you mean? :-)

03:26 * alyssa has a todo list somewhere here

03:27 <q4a> yes

03:28 <alyssa> v10 gles3.1 support

03:28 <alyssa> immediate next task is finishing up the pandecode side and getting those patches out

03:29 <alyssa> that's pretty close I just keep getting distracted with shiny things like running Dolphin on my other driver, hehe ^.^

03:29 <alyssa> The challenge with CSF is 100% on the kernel side

03:30 <alyssa> Given that we have conformant gles3.1 on v9, the Mesa side for v10 is... well, it's not trivial but there's nothing really novel happening

03:30 <alyssa> Way more cache coherency snafu than any Mali I've seen so far, though, so that's really fun (-:

03:31 <alyssa> I am suspicious that the rk3588 board I have might have been from a bad early batch of the SoC ... fresh one is on the way from a different manufacturer, hopefully that works out better

03:32 <q4a> Thanks for your work. I have v10 and I'm ready to help with simple things or test when needed

03:33 <alyssa> thanks :)

03:33 <alyssa> I have a few months of uni to finish up

03:33 <alyssa> after that, the sky's the limit :-)

03:36 <alyssa> q4a: If you're interested in learning about compilers, there's a lot of "low hanging" tasks in the Valhall compiler you could work on

03:37 <alyssa> instruction selection optimizations and such

03:37 <q4a> Yes. I'm interested

03:37 <alyssa> stuff that probably doesn't actually help fps in real workloads, so I can't justify spending time on it anymore, but it's lots of fun and a good learning experience

03:37 <alyssa> Okie

03:38 <alyssa> I can write up some issues on gitlab about ideas to work on

03:38 <q4a> it will be great!

03:38 <alyssa> :D

03:39 <q4a> Do I need a specific kernel for those tasks?

03:40 <alyssa> Mmh, that's tricky

03:40 <alyssa> What Mali hardware do you have other than v10?

03:40 <q4a> rk3288

03:40 <alyssa> right. different compiler then.

03:40 <alyssa> Mmh, most of what I have in mind you would be writing unit tests for

03:41 <alyssa> so it actually shouldn't matter what hw you have

03:41 <alyssa> once your unit tests pass, obviously i would run it through deqp on v9

03:41 <alyssa> which reminds me we really need to get v9 in CI

03:41 * alyssa mumbles

03:44 <alyssa> q4a: So, for environment, I recommend setting up drm-shim

03:44 <alyssa> https://docs.mesa3d.org/drivers/panfrost.html#drm-shim

03:45 <alyssa> and shader-db https://gitlab.freedesktop.org/mesa/shader-db

03:45 <alyssa> with the commands there you can then run panfrost's compilers for any target GPU you like on a shader you craft, or on a big pile of shaders as you choose

03:45 <alyssa> readme for shader-db helps

03:45 <alyssa> `python3 report.py before.txt after.txt` will generate some nice stats

03:46 <alyssa> or, would. I think you need a patch I forgot to upstream

03:46 <q4a> All this should work on rk3288?

03:47 <alyssa> sure

03:47 <alyssa> or an x86 machine or whatever

03:47 <alyssa> only requirement is that you're running Linux (or maybe BSD)

03:47 <alyssa> https://rosenzweig.io/shader-db-patch.txt <-- this patch to shader-db will make report.py work

03:48 <alyssa> with valhall

03:48 <alyssa> as the docs explain, PAN_GPU_ID=9093 will target a Valhall processor as you want

03:48 <alyssa> I have an alias "run-g57" that expands to "LD_PRELOAD=~/mesa/build/src/panfrost/drm-shim/libpanfrost_noop_drm_shim.so PAN_GPU_ID=9093 ./run"
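
For anyone who would rather script this than keep a shell alias, here is a minimal Python sketch of the same setup. The function name `g57_command` and its parameters are hypothetical, made up for illustration; the library path, `PAN_GPU_ID=9093` and `./run` are taken from the alias above, and the Mesa build is assumed to live in `~/mesa/build`.

```python
import os


def g57_command(shader_test, mesa_build="~/mesa/build", gpu_id="9093"):
    """Build (argv, env) mirroring the "run-g57" alias quoted in the log.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    env = dict(os.environ)
    # Preload the no-op drm-shim so no real Mali hardware is required.
    env["LD_PRELOAD"] = os.path.join(
        os.path.expanduser(mesa_build),
        "src/panfrost/drm-shim/libpanfrost_noop_drm_shim.so",
    )
    # 9093 is the GPU ID the log gives for targeting Valhall.
    env["PAN_GPU_ID"] = gpu_id
    return ["./run", shader_test], env
```

It could then be launched from a shader-db checkout with something like `subprocess.run(argv, env=env, cwd=shader_db_dir)`, where `shader_db_dir` is wherever shader-db was cloned.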

03:49 <alyssa> I guess the proper way to do your own mesa builds now is to run `meson devenv` inside of your `mesa/build/` folder

03:49 <alyssa> within the proper mesa devenv, with drm-shim as above I can run e.g

03:50 <alyssa> $ run-g57 shaders/glmark/1-18.shader_test

03:50 <alyssa> and it'll print out some stats about the shader it compiled:

03:50 <alyssa> shaders/glmark/1-18.shader_test - MESA_SHADER_POSITION shader: 30 inst, 3.000000 cycles, 0.343750 fma, 0.062500 cvt, 0.062500 sfu, 0.000000 v, 0.000000 t, 3.000000 ls, 16 quadwords, 2 threads, 0 loops, 0:0 spills:fills

03:50 <alyssa> shaders/glmark/1-18.shader_test - MESA_SHADER_VARYING shader: 123 inst, 6.000000 cycles, 1.515625 fma, 0.109375 cvt, 0.812500 sfu, 0.000000 v, 0.000000 t, 6.000000 ls, 64 quadwords, 2 threads, 0 loops, 0:0 spills:fills

03:50 <alyssa> shaders/glmark/1-18.shader_test - MESA_SHADER_FRAGMENT shader: 22 inst, 0.250000 cycles, 0.093750 fma, 0.187500 cvt, 0.062500 sfu, 0.250000 v, 0.000000 t, 0.000000 ls, 16 quadwords, 2 threads, 0 loops, 0:0 spills:fills
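
The stats lines above are regular enough to parse by hand when comparing two runs. The following is a rough sketch, not shader-db's own `report.py`; the name `parse_stats` is hypothetical and the field layout is assumed from the three lines printed in the log only.

```python
import re

# "<path> - <STAGE> shader: <value> <name>, <value> <name>, ..."
STAT_LINE = re.compile(r"^(?P<path>\S+) - (?P<stage>\w+) shader: (?P<stats>.*)$")


def parse_stats(line):
    """Return (path, stage, stats dict), or None if the line does not match."""
    m = STAT_LINE.match(line.strip())
    if m is None:
        return None
    stats = {}
    for field in m.group("stats").split(", "):
        value, _, name = field.partition(" ")
        if name == "spills:fills":
            # Written as "0:0 spills:fills", so split it into two counters.
            spills, fills = value.split(":")
            stats["spills"] = int(spills)
            stats["fills"] = int(fills)
        elif "." in value:
            stats[name] = float(value)
        else:
            stats[name] = int(value)
    return m.group("path"), m.group("stage"), stats
```

Feeding it the MESA_SHADER_FRAGMENT line gives a dict with `inst` equal to 22 and `cycles` equal to 0.25, which can then be diffed against a run taken after a compiler change.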

03:50 <alyssa> and if I want to see the assembly

03:50 <alyssa> `BIFROST_MESA_DEBUG=shaders run-g57 shaders/glmark/1-18.shader_test` will show me the asm

03:50 <alyssa> and intermediate IR and so on

03:51 <alyssa> https://rosenzweig.io/output.txt

03:51 <alyssa> at the top is the final optimized NIR shader

03:52 <alyssa> next up is the final optimized Valhall instructions, but before register allocation

03:52 <alyssa> next up is the shader after register allocation

03:53 <alyssa> then ^ plus the late Valhall specific passes that aren't used on Bifrost

03:53 <alyssa> finally, a disassembly of the compiled shader itself, i.e. what the hardware actually executes

03:53 <alyssa> Looking at that shader, already lots of little inefficiencies jump out

03:55 <alyssa> so here's a low hanging fruit of the above variety: folding the F16_TO_F32 instruction into the FMA_RSCALE.f32 instruction

03:55 <alyssa> F16_TO_F32 r0, ^r0.h0

03:55 <alyssa> FMA_RSCALE.f32 r0, ^r0, 0x3F800000, 0x0.neg, ^r1

03:55 <alyssa> this can be written more efficiently as

03:55 <alyssa> FMA_RSCALE.f32 r0, ^r0.h0, 0x3F800000, 0x0.neg, ^r1

03:55 <alyssa> i think

03:56 <alyssa> uh, no, apparently it can't. oof.

03:57 <alyssa> ok, here's a different issue

03:57 <alyssa> IADD_IMM.i32 r1, 0x0, #0x18

03:57 <alyssa> FMA_RSCALE.f32 r0, ^r0, 0x3F800000, 0x0.neg, ^r1

03:57 <alyssa> that "IADD_IMM.i32" instruction just loads a constant

03:57 <alyssa> It would be more efficient to reserve a "fast access uniform" (FAU entry) for the constant and write in one instruction instead

03:58 <alyssa> FMA_RSCALE.f32 r0, ^r0, 0x3F800000, 0x0.neg, u0
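
To make the idea concrete, here is a toy Python model of the rewrite. It is only a sketch under heavy assumptions: `Instr` and `push_constants` are hypothetical names, the instruction list is a made-up stand-in for the compiler's IR, and nothing here reflects how Mesa's pass is actually written or any hardware limits on uniform use. It treats `IADD_IMM.i32 rN, 0x0, #imm` as a constant load, assigns the constant a uniform slot, and rewrites later reads of `rN` to that slot.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    op: str
    dest: str
    srcs: list


def push_constants(instrs):
    """Toy rewrite: turn constant-loading IADD_IMMs into uniform reads.

    Returns (new instruction list, uniform slot values).
    """
    uniforms = []    # index = uniform slot, value = constant to upload
    const_regs = {}  # register currently holding a constant -> "uN"
    out = []
    for ins in instrs:
        if ins.op == "IADD_IMM.i32" and ins.srcs[0] == "0x0":
            value = int(ins.srcs[1].lstrip("#"), 16)
            if value not in uniforms:
                uniforms.append(value)
            const_regs[ins.dest] = f"u{uniforms.index(value)}"
            continue  # the load itself is no longer needed
        # Sources are matched with the "^" marker stripped, as written in the log.
        srcs = [const_regs.get(s.lstrip("^"), s) for s in ins.srcs]
        # Overwriting a register ends its life as a known constant.
        const_regs.pop(ins.dest, None)
        out.append(Instr(ins.op, ins.dest, srcs))
    return out, uniforms
```

Run on the two instructions quoted above, the toy drops the `IADD_IMM.i32` and leaves a single `FMA_RSCALE.f32` whose last source is `u0`, with the value 0x18 recorded as the constant that would have to be uploaded into that slot.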

03:58 <alyssa> `src/panfrost/bifrost/valhall/va_lower_constants.c` is responsible for lowering constants

03:59 <alyssa> read that pass, you'll see there's a todo for using uniforms

04:00 <alyssa> `src/panfrost/bifrost/valhall/test/test-lower-constants.cpp` tests that pass. you'll want to write unit tests for the optimization you're trying to write first, and then you can run them from your mesa/build with `meson test --suite=panfrost`

04:00 <alyssa> also read `bi_opt_push_ubo.c` and the push data structure

04:01 <alyssa> and the sysvals infrastructure

04:01 <alyssa> you'll need to extend them somehow to push constants

04:01 <alyssa> and then upload those constants in the driver

04:05 <q4a> ok. I need some time to read, build and test this stuff

04:09 floof58 is now known as Guest283

04:09 floof58 has joined #panfrost

04:10 Guest283 has quit [Ping timeout: 480 seconds]

04:15 <alyssa> good luck!

04:41 <q4a> I built mesa, shader-db and got my asm output: https://pastebin.ubuntu.com/p/Z8XNRzhDZc/

04:42 <q4a> for `BIFROST_MESA_DEBUG=shaders run-g57 shaders/glmark/1-18.shader_test`

04:43 <q4a> Need to read more about asm instructions..

04:56 <alyssa> q4a: people.collabora.com/~alyssa/Valhall-Documentation.pdf

04:56 <alyssa> that's not updated anymore but it's mostly accurate

04:57 <alyssa> src/panfrost/bifrost/valhall/ISA.xml is updated, however.

05:01 <q4a> thanks!

05:14 Daanct12 has joined #panfrost

05:59 cphealy has quit [Remote host closed the connection]

05:59 cphealy has joined #panfrost

06:03 warpme_____ has quit []

08:04 Daanct12 has quit [Quit: Leaving]

09:45 rasterman has joined #panfrost

12:10 Net147 has quit [Quit: Quit]

12:11 Net147 has joined #panfrost

12:50 MajorBiscuit has joined #panfrost

18:03 cphealy has quit []

18:15 cphealy has joined #panfrost

18:46 warpme_____ has joined #panfrost

20:15 kenzie7 has quit []

20:16 kenzie7 has joined #panfrost

20:28 rasterman has quit [Quit: Gettin' stinky!]

20:48 Daanct12 has joined #panfrost

20:54 Danct12 has quit [Ping timeout: 480 seconds]

22:17 floof58 has quit [Ping timeout: 480 seconds]

22:24 floof58 has joined #panfrost

23:47 MajorBiscuit has quit [Quit: WeeChat 3.6]

23:53 avane_ has quit [Ping timeout: 480 seconds]

23:54 avane has joined #panfrost