ChanServ changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://oftc.irclog.whitequark.org/panfrost - <macc24> i have been here before it was popular
<alyssa> Looking at perf counters with IDVS on bump + high poly
<alyssa> execution engine is starving most of the time
<alyssa> <17% of EE time is actually doing work
<alyssa> 8% of the time is reads from the load/store pipe
<alyssa> (EE time)
<alyssa> 7% of the EE time is writes from the load/store pipe
<alyssa> 3% of the EE time is varying interpolation
<alyssa> really not clear where the time is going. I guess memory access.
<HdkR> No counters for memory stalls?
<alyssa> I don't quite understand the memory counters
<alyssa> Read stall cycles (l2_rd_msg_in_stall): 367282887
<alyssa> Write stall cycles (l2_wr_msg_in_stall): 228086275
<alyssa> that seems.... significant.
<HdkR> Definitely big numbers at least
<alyssa> so 79% of the time we're stalled
<alyssa> wonder if the SoC is super memory bandwidth starved
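[editor's note] The log does not say how the 79% figure was derived. A minimal sketch of one plausible reading, assuming the read and write stall counters are summed (treated as non-overlapping) and divided by a total cycle count. The total is not quoted in the log, so the sketch only back-solves what it would have to be under that assumption; the function names are hypothetical.

```python
# Counter values quoted in the log above.
READ_STALL_CYCLES = 367_282_887   # l2_rd_msg_in_stall
WRITE_STALL_CYCLES = 228_086_275  # l2_wr_msg_in_stall


def stall_fraction(read_stall, write_stall, total_cycles):
    """Fraction of total_cycles spent stalled.

    Assumption (not confirmed by the log): read and write stalls do not
    overlap, so they can simply be added together.
    """
    if total_cycles <= 0:
        raise ValueError("total_cycles must be positive")
    return (read_stall + write_stall) / total_cycles


def implied_total_cycles(read_stall, write_stall, fraction):
    """Total cycle count that would make the stalls equal `fraction`."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return (read_stall + write_stall) / fraction


combined = READ_STALL_CYCLES + WRITE_STALL_CYCLES
implied = implied_total_cycles(READ_STALL_CYCLES, WRITE_STALL_CYCLES, 0.79)

print(f"combined stall cycles: {combined}")
print(f"implied total cycles at 79%: {implied:.3e}")
```

If the actual total cycle counter is far from the implied value (roughly 7.5e8), the 79% was probably computed some other way.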
rasterman has quit [Quit: Gettin' stinky!]
<alyssa> S922x: 4.8 GiB/s
<alyssa> RK3399: 6.6 GiB/s
<alyssa> Guess that explains some cases where the rk3399 is faster
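[editor's note] To put the two bandwidth figures in perspective, a rough sketch that converts them into a per-frame memory traffic budget. The 60 fps target is an assumption for illustration only, and the figures are taken as peak bandwidth shared by everything on the SoC, so the GPU's real budget would be lower.

```python
# Peak memory bandwidth figures quoted in the log, in GiB/s.
BANDWIDTH_GIB_S = {
    "S922x": 4.8,
    "RK3399": 6.6,
}

TARGET_FPS = 60  # assumed frame rate, not from the log


def per_frame_budget_mib(bandwidth_gib_s, fps):
    """Upper bound on memory traffic per frame, in MiB."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return bandwidth_gib_s * 1024 / fps


budgets = {
    soc: per_frame_budget_mib(bw, TARGET_FPS)
    for soc, bw in BANDWIDTH_GIB_S.items()
}
ratio = BANDWIDTH_GIB_S["RK3399"] / BANDWIDTH_GIB_S["S922x"]

for soc, mib in budgets.items():
    print(f"{soc}: about {mib:.1f} MiB per frame at {TARGET_FPS} fps")
print(f"RK3399 / S922x bandwidth ratio: {ratio:.3f}")
```

Under these assumptions the RK3399 has a bit over a third more bandwidth to spend per frame, which fits the observation that it wins in some bandwidth-bound cases.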
<HdkR> Need more 400GB/s ARM chips
<alyssa> brb porting panfrost to the m1 max
Net147 has quit [Quit: Quit]
Net147 has joined #panfrost
vstehle has quit [Ping timeout: 480 seconds]
JulianGro has quit [Remote host closed the connection]
vstehle has joined #panfrost
rasterman has joined #panfrost
JulianGro has joined #panfrost
alyssa has quit [Quit: leaving]
<cphealy> alyssa: how are you getting the counters when running glmark2? I'd like to do some comparisons on the platforms I'm working with.
<cphealy> additionally, given the relationship with memory bandwidth, it makes sense that AFBC will have a significant impact on the numbers. With AFBC today, IIRC we have YTR available but not SC (solid color). Getting SC enabled with AFBC would likely give another significant reduction in bandwidth over what we have with AFBC today.
macc24_ has quit []
macc24 has joined #panfrost
<macc24> cphealy: i think you can use perfetto
jernej_ has joined #panfrost
jernej has quit [Ping timeout: 480 seconds]
<robclark> there are some example trace configs in https://gitlab.freedesktop.org/mesa/mesa/-/tree/main/src/tool/pps/cfg
<cphealy> cool, tnx macc24 and robclark!
megi has quit [Quit: WeeChat 3.3]
megi has joined #panfrost
<macc24> cphealy: sure no prob, hit me up if any problems arise in anything
kenzie has quit [Quit: The Lounge - https://thelounge.chat]
kenzie has joined #panfrost
floof58 has quit [Ping timeout: 480 seconds]
floof58 has joined #panfrost
JulianGro has quit [Remote host closed the connection]
alyssa has joined #panfrost
<alyssa> cphealy: perfetto is the 'right' way, I think
<alyssa> though for just dumping the counters at a moment in time, it's more convenient to use https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/panfrost/perf/quick.c
<alyssa> ^for me
<alyssa> compile mesa with -Dtools=panfrost
<alyssa> and it'll be in mesa/build/src/panfrost/perf/panquick
<alyssa> the SC modifier, uh, I don't know what that does.
<alyssa> I don't think it works the way you think though, the GPU will produce/consume SC blocks regardless
<alyssa> I don't know if any drivers other than mali-dp actually care about the SC modifier
<alyssa> the AFBC UABI is horribly underspecified
<cphealy> Regarding SC, my thought was that it is similar to YTR in that it can be used when scanning out to display IP with AFBC decoders supporting YTR (or SC in this case.)
<cphealy> alyssa: thanks for the details on the quick perf tool.
rasterman has quit [Quit: Gettin' stinky!]
<alyssa> cphealy: For YTR, there is literally a bit that Mesa sets saying "use YTR" or "don't use YTR"
<alyssa> For SC, there is no such bit... Either the GPU uses SC or not.
<alyssa> I don't understand what the modifier is supposed to do.
macc24 has quit [Ping timeout: 480 seconds]