#panfrost on 2023-02-06 — irc logs at oftc.irclog.whitequark.org

2022-12-21 00:45 ChanServ changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard + Bifrost + Valhall - Logs https://oftc.irclog.whitequark.org/panfrost - I don't know anything about WSI. That's my story and I'm sticking to it.

00:01 <jdavidberger> htop shows 3 idle cores and one that maxes out at like 5-10%. It's only doing ~10 dispatches over a few seconds for the test, so I don't think the CPU should be slowing it down. It gets very, very close to 25 GFLOPS if nothing else is running and the shaders are cached and everything, which is suspiciously close to 32 * 800 MHz

00:04 <cphealy> That is definitely suspiciously close to 32*800MHz!

00:05 <cphealy> What method are you using to come up with the value of 38 GFLOPS?

00:09 <jdavidberger> From the datasheet, gen1 G52 is 32 operations per cycle but gen2 should be 48 operations per cycle, so 48 * 800 MHz = 38.4 GFLOPS

00:10 <cphealy> "FP32 Operations/Cycle"?

00:10 <HdkR> And then you learn that 32 operations/cycle != 32 FP32 operations/cycle

00:11 <cphealy> How are you interpreting the 32/48 value for Mali G52 as gen1 and gen2?

00:12 <cphealy> I wasn't aware that there are different gens of G52.

00:13 <jdavidberger> HdkR: True; but that linked datasheet lists it as FP32 operations/clock

00:15 <jdavidberger> cphealy: That is pretty unclear in the datasheet but the hardware revision is r1p0 and the max thread count as reported by the GPU matches the 768 number on the RK3566

00:17 <cphealy> I think the 32/48 is independent of r1p0 vs some other revision. My understanding of the revision is it is more along the lines of newer revision with bugs fixed. Basically the same as Cortex-A55 r1p0 vs Cortex-A55 r1p1 as an example.

00:26 <jdavidberger> Maybe -- the official documentation from ARM leaves something to be desired here for sure. But the GPU reporting 768 threads available with the drmIoctl call makes me think it's the second gen and I gotta think the 32->48 bump comes from the max threads going from 512->768

00:36 <cphealy> I have a different theory on the "FP32 Operations/Cycle" reporting 32/48: If you look at the top of that table, you will see an entry for "Arithmetic Units". With every GPU that has two numbers for the "FP32 Operations/Cycle", you will see two numbers for the "Arithmetic Units", so the correct FP32 Operations/Cycle number is likely tied directly to how many arithmetic units the GPU has.

00:36 <cphealy> It could be that you have a G52 with 2 arithmetic units as opposed to 3.

00:46 <cphealy> This article provides additional detail: https://www.anandtech.com/show/12501/arm-launches-new-mali-g52-g31-gpus-new-display-and-video-ip

00:59 <cphealy> According to this datasheet: https://www.boardcon.com/download/Rockchip_RK3566_Datasheet_V1.1.pdf The GPU is ARM Mali G52 1-Core-2EE. This would mean that you have a single shader core with 2 execution units. This would mean 32 is the correct number for FP32 Operations/Cycle.
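The arithmetic behind the two candidate peak figures can be sketched as follows. This is a minimal illustration, assuming peak throughput is simply the datasheet's "FP32 Operations/Cycle" value multiplied by the 800 MHz clock mentioned in the discussion; the function name `peak_gflops` is made up for this example.

```python
def peak_gflops(fp32_ops_per_cycle: int, clock_mhz: float) -> float:
    """Theoretical peak in GFLOPS: ops/cycle * cycles/second, scaled to giga."""
    return fp32_ops_per_cycle * clock_mhz / 1000.0


# 2EE configuration (32 FP32 ops/cycle) vs 3EE configuration (48 FP32 ops/cycle),
# both at the 800 MHz clock quoted in the chat.
two_ee = peak_gflops(32, 800)
three_ee = peak_gflops(48, 800)

print(f"2EE: {two_ee:.1f} GFLOPS, 3EE: {three_ee:.1f} GFLOPS")
```

Under these assumptions the 2EE figure lands at 25.6 GFLOPS, which is consistent with the roughly 25 GFLOPS measured earlier.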

01:03 <jdavidberger> that makes sense. The 768 thread thing really threw me off in that chart. Which is a bummer but glad I only spent one day trying to talk it into going faster

01:06 hanetzer has joined #panfrost

01:10 <cphealy> ;-)

01:13 <jdavidberger> I have the application code I need to run specced out at ~20 GFLOPS. Hopefully I'm not that far off. Thanks for the help; it's good not to spin my wheels for nothing

01:48 camus has joined #panfrost

02:20 alpernebbi has quit [Ping timeout: 480 seconds]

02:21 alpernebbi has joined #panfrost

04:17 wicastC has quit [Remote host closed the connection]

04:18 wicastC has joined #panfrost

05:37 davidlt_ has joined #panfrost

05:43 davidlt__ has joined #panfrost

05:45 rcf has quit [Quit: WeeChat 3.8-dev]

05:45 hanetzer1 has joined #panfrost

05:46 rcf has joined #panfrost

05:47 hanetzer has quit [Ping timeout: 480 seconds]

05:49 davidlt_ has quit [Ping timeout: 480 seconds]

06:10 davidlt__ has quit [Ping timeout: 480 seconds]

06:19 hanetzer2 has joined #panfrost

06:20 hanetzer1 has quit [Ping timeout: 480 seconds]

06:46 guillaume_g has joined #panfrost

06:50 guillaume_g has quit []

07:22 furry has joined #panfrost

07:26 furry has left #panfrost [#panfrost]

08:01 jdavidberger has quit [Quit: Leaving.]

08:58 rasterman has joined #panfrost

09:19 chewitt has quit [Quit: Zzz..]

10:19 <robmur01> cphealy: spot on - G52r0 implicitly has 3EE shader cores, G52r1 is configurable for 2EE or 3EE

11:03 <q4a> Looks like I was wrong. Some Mali GPUs (like G610) support geometry shaders. Check "GL_EXT_geometry_shader" and "Geometry Shader" in the Android AIDA64 report: https://gist.github.com/q4a/b89a371e67e0504995ed23a96af8b2a3

11:03 Leopold has quit [Ping timeout: 480 seconds]

11:11 <HdkR> q4a: They support the extension yes. The hardware converts them to compute shaders.

11:11 <HdkR> Well, driver converts them to compute shaders :P

11:26 <robmur01> right, just because the *driver* reports a capability doesn't mean it has to be implemented in hardware. Take llvmpipe, for instance ;)

11:28 <q4a> Then how can I check whether it's a software or hardware implementation?

11:28 <HdkR> It's all compute shader baby

11:28 <HdkR> No mali supports GS in hardware :)

12:03 MajorBiscuit has joined #panfrost

12:29 guillaume_g has joined #panfrost

13:08 guillaume_g has quit []

13:25 TheKit[m] has joined #panfrost

14:08 jelly has quit [Read error: Connection reset by peer]

14:09 jelly-hme has joined #panfrost

14:42 rcf1 has joined #panfrost

14:53 rcf has quit [Quit: WeeChat 3.4.1]

14:59 jdavidberger has joined #panfrost

15:17 alyssa has joined #panfrost

15:17 <alyssa> Has MRT + blend shaders been broken on Midgard this entire time?

15:17 <alyssa> Answer is likelier than you'd think

15:22 <cphealy> robmur01: Are there readable registers in the GPU that expose how many AUs each shader core has? Also, are there readable registers in the GPU that expose how many shader cores the GPU has?

15:56 <stepri01> shader cores is easy - SHADER_PRESENT is a bitmap of which cores are implemented, so the number of bits set is the number of shader cores

15:59 <stepri01> number of AUs is stored in CORE_FEATURES (on GPUs where it means something)

16:00 <stepri01> (or Execution Engines to use the correct term)
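Counting shader cores from SHADER_PRESENT, as described above, amounts to a population count on the register value. The sketch below assumes the register has already been read into an integer; the helper name `shader_core_count` and the sample values are made up for illustration and are not taken from any real register dump.

```python
def shader_core_count(shader_present: int) -> int:
    """Count the set bits in a SHADER_PRESENT-style bitmap (one bit per core)."""
    if shader_present < 0:
        raise ValueError("register value must be non-negative")
    return bin(shader_present).count("1")


# Hypothetical register values, chosen only to exercise the function.
# Note the bitmap may be sparse, so the highest set bit is not the core count.
for value in (0x1, 0x7, 0x5):
    print(f"SHADER_PRESENT={value:#x} -> {shader_core_count(value)} core(s)")
```

The CORE_FEATURES field layout is not given in the discussion, so decoding the execution engine count is left out of this sketch.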

16:02 <cphealy> stepri01: Execution Engines is equivalent to Arithmetic Units in public ARM Mali datasheet vernacular, correct?

16:03 <alyssa> stepri01: FWIW, the userspace does want to know the clock speed for clinfo...

16:03 <alyssa> right now I hardcode 800MHz...

16:03 <alyssa> IDK what any app can actually do with the information lol

16:04 <stepri01> cphealy: I'm not sure where you are seeing AUs in public docs. https://developer.arm.com/documentation/102546/0100/The-Bifrost-Shader-Core refers to them as EEs. But I think they are the same thing

16:05 <stepri01> alyssa: Yes I know - I tried to argue against providing the data (because it's almost certainly useless) but I just got pointed to the spec and didn't have much of an argument :(

16:05 <stepri01> hardcoding a random number seems like a good idea

16:05 <alyssa> mood

16:05 <alyssa> I don't care one way or the other tbh

16:18 <jdavidberger> Here is the dump of DRM registers for the G52 2EE; the AU count in CORE_FEATURES is 2, which matches the benchmark -- https://gist.github.com/jdavidberger/eb1458ea59974be4784acb30c3e240dc

16:52 MajorBiscuit has quit [Ping timeout: 480 seconds]

16:56 ungeskriptet[m] is now known as ungeskriptet

17:00 <robclark> stepri01, alyssa: drm/msm exposes max clk.. I use it for things like calculating % utilization from perfcntrs.. IMO it is a perfectly reasonable thing to expose to userspace
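One way the max clock could feed a utilization estimate is sketched below. This is only an illustration, assuming a performance counter that reports active GPU cycles over a sampling window; the function name `utilization_percent` and the sample numbers are hypothetical and do not reflect any actual drm/msm or Panfrost interface.

```python
def utilization_percent(active_cycles: int, max_clk_hz: float, elapsed_s: float) -> float:
    """Active cycles as a percentage of the cycles available at the max clock."""
    if max_clk_hz <= 0 or elapsed_s <= 0:
        raise ValueError("max_clk_hz and elapsed_s must be positive")
    return 100.0 * active_cycles / (max_clk_hz * elapsed_s)


# Hypothetical sample: 400M active cycles in one second on an 800 MHz part.
print(f"{utilization_percent(400_000_000, 800_000_000, 1.0):.1f}%")
```

Because the denominator uses the maximum clock, a GPU running at a lower DVFS operating point would appear less busy than it really is.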

17:01 <stepri01> robclark: The real problem is that "no idea" isn't an allowed response, and in some situations the kernel really doesn't know

17:01 <stepri01> It's also badly specified as things like DVFS mean that it's not the actual frequency

17:03 <robclark> surely the kernel knows the max freq.. it doesn't have to report the current freq, only the max

17:05 <stepri01> only if it is actually managing the clocks. On a FPGA platform it might not be known, and using a software model there isn't necessarily such a thing as a clock

17:06 <stepri01> so you end up with a "lie with a hardcoded number" path in the driver and wonder what exactly you gain by trying to give a real number on real hardware

17:06 <stepri01> beyond the specific case of profiling using hardware specific counters the number is useless and no application should use it

17:07 <stepri01> so why it exists in a supposedly hardware-agnostic spec is beyond me

17:09 stepri01 has quit [Quit: leaving]

17:14 <robclark> yeah, not sure why it is in spec.. but I don't think weird developer-only edge cases would convince me that it isn't something that the kernel should expose

17:40 p0g0 has quit [Ping timeout: 480 seconds]

17:53 <alyssa> especially given that neither FPGAs nor software models are available to us mere mortals

18:01 davidlt has joined #panfrost

18:09 <jdavidberger> Is there some concept of max invocations on bifrost that is possibly less than just GL_MAX_COMPUTE_WORK_GROUP_COUNT * GL_MAX_COMPUTE_WORK_GROUP_SIZE?

18:12 <jdavidberger> Specifically, when I run glDispatchCompute(65535,1,1) with a local size of {256,1,1} I expect 0xFFFF00 invocations and I'm seeing ... I think 0x2AAA80 invocations based on timings; hard to tell but definitely less
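The expected invocation count follows directly from the dispatch parameters. The sketch below just redoes that arithmetic and compares it with the estimate quoted above; the variable names are made up for illustration, and the observed figure is the timing-based guess from the chat, not a measured counter.

```python
# Dispatch parameters quoted in the chat: glDispatchCompute(65535, 1, 1)
# with a local size of {256, 1, 1}.
work_groups = 65535 * 1 * 1
local_size = 256 * 1 * 1

expected = work_groups * local_size  # total invocations the dispatch should run
observed = 0x2AAA80                  # estimate from timings, per the chat

print(f"expected: {expected:#x} ({expected})")
print(f"observed: {observed:#x} ({observed})")
print(f"fraction completed: {observed / expected:.4f}")
```

With these numbers the estimate works out to one sixth of the expected total.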

18:38 davidlt has quit [Ping timeout: 480 seconds]

19:24 jelly-hme is now known as jelly

19:26 jdavidberger has quit [Ping timeout: 480 seconds]

19:27 jdavidberger has joined #panfrost

19:34 davidlt has joined #panfrost

19:39 davidlt_ has joined #panfrost

19:43 davidlt has quit [Ping timeout: 480 seconds]

19:52 davidlt_ has quit [Ping timeout: 480 seconds]

19:54 Guest3758 has joined #panfrost

20:12 <alyssa> jdavidberger: Check dmesg, any chance the job is timing out?

20:32 <jdavidberger> [222057.754333] panfrost fde60000.gpu: gpu sched timeout, js=1, config=0x7b00, status=0x8, head=0xa279140, tail=0xa279140, sched_job=0000000065196d57 🤦

20:35 <jdavidberger> thanks, that explains the behavior completely

20:40 Guest3758 has quit [Ping timeout: 480 seconds]

20:45 <alyssa> Woof...

20:45 <alyssa> jdavidberger: I don't recommend it (long-running compute kernels will lock up your graphical session, since there's no preemption), but you can patch the kernel to bump the timeout

20:45 <alyssa> (particularly if you're doing headless compute stuff)

20:46 <alyssa> it would be nice to get a proper fix but it's nontrivial and if it was going to happen it would have been 3 years ago :|

20:47 <alyssa> https://cgit.freedesktop.org/drm-misc/tree/drivers/gpu/drm/panfrost/panfrost_job.c#n25

20:47 <alyssa> timeout in milliseconds

21:04 greenjustin_ has joined #panfrost

21:06 greenjustin is now known as Guest3794

21:06 greenjustin_ is now known as greenjustin

21:11 Guest3794 has quit [Ping timeout: 480 seconds]

21:45 Guest3791 has joined #panfrost

21:46 q4a1 has joined #panfrost

21:47 robmur01_ has joined #panfrost

21:48 larunbe has joined #panfrost

21:48 Consolatis_ has joined #panfrost

21:48 avane_ has joined #panfrost

21:48 FLHerne_ has joined #panfrost

21:48 pjakobsson_ has joined #panfrost

21:48 italove5 has joined #panfrost

21:49 samuelig_ has joined #panfrost

21:50 pbsds12 has joined #panfrost

21:50 mriesch_ has joined #panfrost

21:50 pendingchaos_ has joined #panfrost

21:50 mmind00_ has joined #panfrost

21:51 CounterPillow_ has joined #panfrost

21:51 alpernebbi has quit [charon.oftc.net helix.oftc.net]

21:51 bbrezillon has quit [charon.oftc.net helix.oftc.net]

21:51 mmind00 has quit [charon.oftc.net helix.oftc.net]

21:51 q4a has quit [charon.oftc.net helix.oftc.net]

21:51 indy has quit [charon.oftc.net helix.oftc.net]

21:51 alarumbe has quit [charon.oftc.net helix.oftc.net]

21:51 jelly has quit [charon.oftc.net helix.oftc.net]

21:51 pbsds1 has quit [charon.oftc.net helix.oftc.net]

21:51 pendingchaos has quit [charon.oftc.net helix.oftc.net]

21:51 sergi has quit [charon.oftc.net helix.oftc.net]

21:51 pjakobsson has quit [charon.oftc.net helix.oftc.net]

21:51 robmur01 has quit [charon.oftc.net helix.oftc.net]

21:51 avane has quit [charon.oftc.net helix.oftc.net]

21:51 italove has quit [charon.oftc.net helix.oftc.net]

21:51 xdarklight has quit [charon.oftc.net helix.oftc.net]

21:51 Consolatis has quit [charon.oftc.net helix.oftc.net]

21:51 nergzd723 has quit [charon.oftc.net helix.oftc.net]

21:51 thecycoone[m] has quit [charon.oftc.net helix.oftc.net]

21:51 simon-perretta-img has quit [charon.oftc.net helix.oftc.net]

21:51 rellla has quit [charon.oftc.net helix.oftc.net]

21:51 CounterPillow has quit [charon.oftc.net helix.oftc.net]

21:51 mav has quit [charon.oftc.net helix.oftc.net]

21:51 jenneron[m] has quit [charon.oftc.net helix.oftc.net]

21:51 mriesch has quit [charon.oftc.net helix.oftc.net]

21:51 mntirc has quit [charon.oftc.net helix.oftc.net]

21:51 samuelig has quit [charon.oftc.net helix.oftc.net]

21:51 FLHerne has quit [charon.oftc.net helix.oftc.net]

21:51 FLHerne_ is now known as FLHerne

21:53 CounterPillow_ is now known as CounterPillow

21:55 alpernebbi has joined #panfrost

21:55 indy has joined #panfrost

21:57 rellla has joined #panfrost

21:58 xdarklight has joined #panfrost

21:59 hanetzer3 has joined #panfrost

22:00 mav has joined #panfrost

22:00 hanetzer2 has quit [Ping timeout: 480 seconds]

22:02 bbrezillon has joined #panfrost

22:02 mntirc has joined #panfrost

22:02 simon-perretta-img has joined #panfrost

22:08 rasterman has quit [Quit: Gettin' stinky!]

22:10 jelly has joined #panfrost

22:29 mmind00_ has left #panfrost [#panfrost]

22:30 mmind00 has joined #panfrost

22:50 pendingchaos_ is now known as pendingchaos

22:57 mmind00_ has joined #panfrost

22:57 mmind00_ has quit []

22:58 mmind00 has quit [charon.oftc.net helix.oftc.net]

22:58 simon-perretta-img has quit [charon.oftc.net helix.oftc.net]

22:58 mntirc has quit [charon.oftc.net helix.oftc.net]

22:58 bbrezillon has quit [charon.oftc.net helix.oftc.net]

22:58 jelly has quit [charon.oftc.net helix.oftc.net]

23:06 simon-perretta-img has joined #panfrost

23:07 mntirc has joined #panfrost

23:07 bbrezillon has joined #panfrost

23:15 jelly has joined #panfrost

23:45 jdavidberger has quit [Quit: Leaving.]