ChanServ changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://oftc.irclog.whitequark.org/panfrost - <macc24> i have been here before it was popular
camus has joined #panfrost
Danct12 has joined #panfrost
derzahl has joined #panfrost
derzahl has quit [Remote host closed the connection]
derzahl has joined #panfrost
derzahl has quit [Remote host closed the connection]
Danct12 has quit [Remote host closed the connection]
derzahl has joined #panfrost
derzahl has quit [Remote host closed the connection]
Danct12 has joined #panfrost
Danct12 has quit [Read error: Connection reset by peer]
Danct12 has joined #panfrost
Danct12 has quit [Quit: Leaving]
guillaume_g has joined #panfrost
Danct12 has joined #panfrost
rasterman has joined #panfrost
nlhowell has joined #panfrost
Danct12 has quit [Quit: Leaving]
rkanwal has joined #panfrost
icecream95 has quit [Ping timeout: 480 seconds]
Net147 has quit [Quit: Quit]
Net147 has joined #panfrost
fahien has joined #panfrost
italove8 has joined #panfrost
CME has quit []
CME has joined #panfrost
rkanwal has quit [Remote host closed the connection]
icecream95 has joined #panfrost
<icecream95>
alyssa: Tarball with some performance data for various RA patches: https://0x0.st/oByD.zst
rkanwal has joined #panfrost
icecream95 has left #panfrost [#panfrost]
fahien has quit [Ping timeout: 480 seconds]
italove8 has quit [Ping timeout: 480 seconds]
<technopoirot>
HdkR: thanks for the Mali-G31 MP1 links
<robmur01>
fun fact: G51 and G31 "MP2" are technically still a single core, just a much fatter one than the respective MP1 configs
rkanwal has quit [Ping timeout: 480 seconds]
<cphealy>
robmur01: What about G52-MP2? Is that truly 2 cores?
<robmur01>
cphealy: yes (and whether they're skinny 2EE cores or full-fat 3EE cores is a global choice, no weird mix-and-match like G51 MP3 and up)
<cphealy>
ack
<CounterPillow>
I'm on 2EE because Rockchip needed to fit more random blitters onto the chip :(
<CounterPillow>
wouldn't be a problem if we cloned you a few times :D
<CounterPillow>
alyssa: btw regarding the DRM_FORMAT_MOD_INVALID stuff, I noticed that weston doesn't break, only plasma does. It seems to try using AFBC on a plane that doesn't support it and then gives up. If this is something that needs fixing in the kernel let me know and I'll put it on my list of rough rockchip edges
<alyssa>
CounterPillow: I don't actually remember whose bug it is
<alyssa>
There was discussion and the consensus IIRC was to treat INVALID as LINEAR in Mesa
<alyssa>
but that might've been to avoid breaking the world rather than actual correctness
<CounterPillow>
I see. Anyway, I thought it was interesting that weston doesn't seem to trigger the "is this modifier supported" function in the kernel but kwin_wayland_drm does. Maybe weston just piggybacks on whatever mode the console had set, and the console seems to get it right
<alyssa>
Weston is the gold standard for implementing modern DRM interfaces correctly.
<jernej>
ideally, all kernel drivers should report modifiers they support, even if it's only linear
<alyssa>
It's everyone else I'm worried about.
<CounterPillow>
I'm currently compiling kwin (and with that, seemingly half of KDE, yikes) from git source to see if the changes they've made to the DRM backend have improved things at all
<jernej>
I already fixed one of them, to make it work with panfrost
<CounterPillow>
Seemingly only the overlay planes support afbc
<alyssa>
CounterPillow: Rereading that MR, it does look like rockchip and friends might be broken
<alyssa>
but just like new kernels can't break old userspace, we try pretty hard to ensure new userspaces won't break old kernels
<CounterPillow>
I see
<alyssa>
(for some value of old. it's not "forever", but long enough that if the issue is in user's hands, it's too late :p)
<CounterPillow>
Among all the things I want to look into, I'll add "fix rockchip's drm driver" to the list then. It would also be nice if plasma started using the plane that supports afbc though, because framebuffer updates with lots of changed pixels seem to cause occasional hitches with panfrost right now, and I suspect memory bottlenecks
<alyssa>
Might help
<alyssa>
Don't discount the possibility that the hardware + software are both dogslow :-p
<CounterPillow>
it seems to work better in weston (at least in supertux2)
<CounterPillow>
the thing I'm observing is that an application will composite smoothly at 60fps as a 720p window but dip to <30fps for a few frames every second or so if run at 1080p
<CounterPillow>
or the whole desktop has a hitch when I bring a window in the background to the foreground and it does a big repaint
<alyssa>
Woof.
<CounterPillow>
teeworlds is the weirdest case, where the application itself reports 60fps frametimes in GALLIUM_HUD but it's clearly not 60fps by the time it leaves the compositor
<alyssa>
Related -- if vsync is disabled in Neverball, GALLIUM_HUD claims fps is 700+ but it feels about 4fps
<alyssa>
because we don't yet implement context priorities
<alyssa>
or preemption
<alyssa>
so Neverball hogs all the GPU cycles and sway doesn't get any cycles left to actually present the frames
<cphealy>
Sounds like a use case where EGL context priority would be useful. ;-)
<alyssa>
Yep.
<alyssa>
I might implement it if working on the kernel didn't make me feel so sad :-p
<CounterPillow>
In the fullscreen teeworlds case, having working direct scanout in the compositor would probably also improve things considerably as it'd completely eliminate the job getting starved
<CounterPillow>
Hmmm, apparently kwin already does direct scan-out. Well that just adds to the mystery
floof58 has quit [Remote host closed the connection]
<alyssa>
CmdCopyQueryPoolResults looks really annoying to implement, since it needs a compute shader
<alyssa>
(Would be perfect for the CSF MCU.......)
<jekstrand>
alyssa: \o/
<anarsoul>
alyssa: hm, adding context priority should be pretty straightforward, the only tricky thing is to maintain backwards compatibility
Danct12 has joined #panfrost
icecream95 has joined #panfrost
<icecream95>
alyssa: Also on the topic of priorities, if I break a fragment job into 32x32 pixel tiles, then it can lock up the desktop for ages (see #6572)
<icecream95>
So it appears that the cost of the larger nodearrays on RA performance is about 20% when SIMD is used, but only 15% otherwise