ChanServ changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://oftc.irclog.whitequark.org/panfrost - <macc24> i have been here before it was popular
anholt has quit [Quit: Leaving]
anholt has joined #panfrost
jelly has quit [Ping timeout: 480 seconds]
digetx has quit [Ping timeout: 480 seconds]
digetx has joined #panfrost
hexdump01 has joined #panfrost
hexdump0815 has quit [Ping timeout: 480 seconds]
Daaanct12 has joined #panfrost
Daanct12 has quit [Read error: Connection reset by peer]
<icecream95> Hmm.. duet gives me a gravity reading of 10.595 ms^-2. But if I turn it upside down, it is 9.022 ms^-2! That's.. not very accurate
<icecream95> Calibration helped a lot, it's now reading at 1.000004 gravities
<HdkR> Much better
<icecream95> Accelerometer manufacturers seem to think that calibration only requires a mat3x4 matrix to be multiplied with sensor data, but I found that the offset was much more inaccurate than the scale
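A minimal sketch of the offset-versus-scale point, using the two readings quoted above. It assumes a simple one-axis model in which the sensor reports `scale + offset` when pointing up and `scale - offset` (in magnitude) when flipped; the function names are illustrative and are not taken from any driver or datasheet. A full 3x4 calibration matrix would apply the same idea per axis, with the fourth column carrying the offsets.

```python
def two_point_calibration(reading_up, reading_down):
    """Estimate offset and scale for one axis from two static readings.

    reading_up:   magnitude reported with the axis pointing up
    reading_down: magnitude reported with the device turned upside down
    Assumed model: reading_up = scale + offset, reading_down = scale - offset.
    """
    offset = (reading_up - reading_down) / 2.0
    scale = (reading_up + reading_down) / 2.0
    return offset, scale


def correct(raw, offset, scale):
    """Convert a signed raw reading into units of one gravity."""
    return (raw - offset) / scale


# Readings from the log, in m/s^2.
offset, scale = two_point_calibration(10.595, 9.022)
print(f"offset = {offset:.4f} m/s^2, scale = {scale:.4f} m/s^2 per gravity")
print(correct(10.595, offset, scale), correct(-9.022, offset, scale))
```

Under this model the offset is about 0.79 m/s^2 while the scale term lands at about 9.81 m/s^2, which is consistent with the observation that the offset, not the scale, was the larger error.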
felixvalentini has joined #panfrost
felixvalentini has quit []
felixvalentini has joined #panfrost
jelly has joined #panfrost
nlhowell has joined #panfrost
nlhowell has quit [Read error: Connection reset by peer]
MajorBiscuit has joined #panfrost
nlhowell has joined #panfrost
guillaume_g has joined #panfrost
spy has joined #panfrost
spy has quit []
anholt__ has joined #panfrost
anholt_ has quit [Ping timeout: 480 seconds]
anholt has quit [Ping timeout: 480 seconds]
anholt has joined #panfrost
anholt__ has quit [Ping timeout: 480 seconds]
anholt has quit [Ping timeout: 480 seconds]
anholt has joined #panfrost
anholt__ has joined #panfrost
robmur01 has quit [Read error: Connection reset by peer]
robmur01 has joined #panfrost
icecream95 has quit [Ping timeout: 480 seconds]
felixvalentini has quit [Ping timeout: 480 seconds]
felixvalentini has joined #panfrost
nlhowell is now known as Guest406
nlhowell has joined #panfrost
Guest406 has quit [Read error: Connection reset by peer]
rasterman has joined #panfrost
anholt is now known as Guest419
anholt__ is now known as anholt
<alyssa> jekstrand: so... removing nir_variable?
<alyssa> I think I've dealt with most of the fallout, varyings aside.
<jekstrand> Yup. What exactly do you use it for today?
<alyssa> so many things
<jekstrand> ok...
<alyssa> - determine locations on various intrinsics (switched to I/O semantics)
<alyssa> - check if MRT is in use (switched to outputs_written)
<alyssa> - nir_lower_fragcolor (rewrote half the pass to work either before or after I/O lowering)
<jekstrand> hrm... Maybe iris still uses variables. How is this controlled?
<alyssa> - varying linking on Midgard and Bifrost. basically we need to build a table of locations -> pipe_formats, depending on types/sizes/interpolation/number of components/etc
<alyssa> there's no obvious way to replace this. possibly a very complicated walk over the shader instructions using the I/O semantics. much more complicated (and slower!) than nir_foreach_shader_out.
<jekstrand> Yeah...
<alyssa> jekstrand: TTBOMK, radeonsi is the only driver to yeet all the variables in the state tracker.
<jekstrand> I'm getting confused now
* jekstrand goes and re-reads
<alyssa> Someone moved radeonsi's list of I/O lowering passes into nir_lower_io, added a flag to make mesa/st call that early before the driver ever sees the original shader, and then gated all transform feedback info on that flag.
felixvalentini has quit []
<alyssa> Which means if Panfrost needs transform feedback info (for lowering on Valhall), it needs to pretend to be radeonsi.
<alyssa> I've mostly typed out all the changes needed. But I'm not thrilled about this.
<alyssa> This stuff went in with little review and would've been nak'd if it had the _amd suffixing it should have had.
<jekstrand> Yeah.... This lower_io_variable stuff is bullshit
<alyssa> yes, well
<alyssa> I need to lower transform feedback and so dealing with it is on me now...
<alyssa> and even after all that, dEQP-GLES3.functional.shaders.fragdepth.compare.fragcoord_z isn't passing.
<alyssa> I don't know how that passes on si.
<jekstrand> Ok, just raged at Marek. :)
<alyssa> I don't care for rage, I just need Valhall support landed in the next 2 weeks ;P
<jekstrand> heh
<jekstrand> Yeah
<jekstrand> But, seriously, that all needs to be in si_finalize_nir
* jekstrand is tempted to move it and send a MR.
<cwabbott> iirc the problem is that gallium maintains the fiction that streamout info is not part of the shader
<alyssa> yes
<cwabbott> should we just... pass it to the driver
<cwabbott> ?
<cwabbott> that MR did seem kinda dodgy but I never got past the cover letter to see what he was actually doing
<cwabbott> seems like a classic marek solution to the problem
nlhowell is now known as Guest426
nlhowell has joined #panfrost
<jekstrand> cwabbott: I'm typing :)
Guest426 has quit [Ping timeout: 480 seconds]
rasterman has quit [Ping timeout: 480 seconds]
rasterman has joined #panfrost
* macc24 notices references to logs of this channel in a cadmium for
<macc24> fork*
<macc24> there is a cadmium contributor among us xD
robmur01_ has joined #panfrost
moa has joined #panfrost
austriancoder_ has joined #panfrost
steev_ has joined #panfrost
Moe_Icenowy has joined #panfrost
jolan_ has joined #panfrost
alyssa_ has joined #panfrost
spawacz_ has joined #panfrost
robmur01 has quit [synthon.oftc.net reflection.oftc.net]
Guest419 has quit [synthon.oftc.net reflection.oftc.net]
alyssa has quit [synthon.oftc.net reflection.oftc.net]
spawacz has quit [synthon.oftc.net reflection.oftc.net]
floof58 has quit [synthon.oftc.net reflection.oftc.net]
rcf has quit [synthon.oftc.net reflection.oftc.net]
jambalaya has quit [synthon.oftc.net reflection.oftc.net]
jolan has quit [synthon.oftc.net reflection.oftc.net]
bluebugs has quit [synthon.oftc.net reflection.oftc.net]
MoeIcenowy has quit [synthon.oftc.net reflection.oftc.net]
ckeepax1 has quit [synthon.oftc.net reflection.oftc.net]
orkid has quit [synthon.oftc.net reflection.oftc.net]
remexre has quit [synthon.oftc.net reflection.oftc.net]
steev has quit [synthon.oftc.net reflection.oftc.net]
austriancoder has quit [synthon.oftc.net reflection.oftc.net]
steev_ is now known as steev
orkid has joined #panfrost
rcf has joined #panfrost
remexre has joined #panfrost
ckeepax1 has joined #panfrost
floof58 has joined #panfrost
robmur01_ is now known as robmur01
austriancoder_ has quit []
austriancoder has joined #panfrost
anholt_ has joined #panfrost
jambalaya has joined #panfrost
spawacz_ has quit []
spawacz has joined #panfrost
<cphealy> icecream95: thanks for the hint on the Mali DDK "core_mask" sysfs entry. I was able to switch between 1 and 2 cores on the G52-MP2 and saw performance drop in half when switching to 1 core. Very useful!
<cphealy> device:/sys/devices/platform/13040000.mali# cat core_mask
<cphealy> Current core mask (JS2) : 0x3
<cphealy> Available core mask : 0x3
<cphealy> Current core mask (JS0) : 0x3
<cphealy> Current core mask (JS1) : 0x3
<cphealy> device:/sys/devices/platform/13040000.mali# echo 1 > core_mask
<cphealy> device:/sys/devices/platform/13040000.mali# cat core_mask
<cphealy> Current core mask (JS0) : 0x1
<cphealy> Current core mask (JS1) : 0x1
<cphealy> Current core mask (JS2) : 0x1
<cphealy> Available core mask : 0x3
<cphealy> device:/sys/devices/platform/13040000.mali#
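The masks in the paste above are bitfields with one bit per shader core, so 0x3 means two cores and 0x1 means one. A small sketch that parses text in the format shown and counts the enabled cores; the helper names are hypothetical, and the sample string is copied from the paste rather than read from a real device.

```python
import re

# Matches both "Current core mask (JS0) : 0x1" and "Available core mask : 0x3".
LINE_RE = re.compile(
    r"(Current|Available) core mask(?: \((JS\d+)\))?\s*:\s*(0x[0-9a-fA-F]+)"
)


def parse_core_mask(text):
    """Return {label: mask} where label is a job slot name or 'available'."""
    masks = {}
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if m:
            label = m.group(2) or "available"
            masks[label] = int(m.group(3), 16)
    return masks


def core_count(mask):
    """Number of cores enabled in a mask (one bit per core)."""
    return bin(mask).count("1")


SAMPLE = """Current core mask (JS0) : 0x1
Current core mask (JS1) : 0x1
Current core mask (JS2) : 0x1
Available core mask : 0x3"""

masks = parse_core_mask(SAMPLE)
for label, mask in masks.items():
    print(label, hex(mask), core_count(mask), "core(s)")
```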
<macc24> cphealy: is that g52 inside what i think it is? ;)
<cphealy> I can only guess what you are thinking... ;-)
rasterman has quit [Quit: Gettin' stinky!]
<robmur01> bear in mind that an MP2 may have up to 4x the L2 cache of a "real" MP1, depending on the respective configs (128K-256K vs. 64K-128K), so numbers may not be exactly 100% representative
<cphealy> robmur01: good point! Is there another magical sysfs for capping max L2 used? ;-)
<robmur01> not sure - maybe for larger configurations that have multiple L2s, but for G52 that's only possible for MP3 and larger configs
MajorBiscuit has quit [Quit: WeeChat 3.4]
nlhowell has quit [Ping timeout: 480 seconds]
nlhowell has joined #panfrost
<alarumbe> hi alyssa_, I sent the kernel driver patch for devcoredump to the dri-devel ML
jambalaya has quit [Remote host closed the connection]
jambalaya has joined #panfrost
jambalaya has quit [Remote host closed the connection]
jambalaya has joined #panfrost
anholt_ has quit [Quit: Leaving]
rasterman has joined #panfrost
alyssa_ has quit []
alyssa has joined #panfrost
<alyssa> ok, got AFBC on Valhall wired up
<alyssa> FPS ~doubles on glmark2 -btexture --off-screen :-)
WoC has joined #panfrost
<alyssa> supertuxkart's deferred renderer is still unplayable though
<alyssa> kinda disappointing
<alyssa> PAN_MESA_DEBUG=perf suggests it's splitting batches due to slow clears
<alyssa> (and shadowing)
<cphealy> alyssa: with AFBC on Valhall, what modifiers are being used?
<cphealy> Anything like this: modifier: ARM_BLOCK_SIZE=16x16,MODE=YTR|SPARSE|TILED|SC (0x800000000000351)
<alyssa> yes. that one
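For reference, the modifier value quoted above can be unpacked by hand. The sketch below assumes the bit layout from the kernel's `drm_fourcc.h` as best recalled (vendor in the top byte, Arm modifier type in the next four bits, AFBC block size in the low four bits, feature flags above that); the constants are transcribed here for illustration, so check them against the header before relying on them.

```python
ARM_VENDOR = 0x08   # DRM_FORMAT_MOD_VENDOR_ARM
AFBC_TYPE = 0x0     # DRM_FORMAT_MOD_ARM_TYPE_AFBC

BLOCK_SIZES = {1: "16x16", 2: "32x8", 3: "64x4", 4: "32x8_64x4"}

# (bit, name) pairs for the AFBC feature flags, lowest bit first.
FLAGS = [
    (1 << 4, "YTR"),
    (1 << 5, "SPLIT"),
    (1 << 6, "SPARSE"),
    (1 << 7, "CBR"),
    (1 << 8, "TILED"),
    (1 << 9, "SC"),
    (1 << 10, "DB"),
    (1 << 11, "BCH"),
    (1 << 12, "USM"),
]


def decode_afbc(modifier):
    """Split a 64-bit AFBC modifier into (block size, list of flag names)."""
    vendor = (modifier >> 56) & 0xFF
    mod_type = (modifier >> 52) & 0xF
    if vendor != ARM_VENDOR or mod_type != AFBC_TYPE:
        raise ValueError("not an Arm AFBC modifier")
    block = BLOCK_SIZES.get(modifier & 0xF, "unknown")
    flags = [name for bit, name in FLAGS if modifier & bit]
    return block, flags


block, flags = decode_afbc(0x0800000000000351)
print(f"ARM_BLOCK_SIZE={block},MODE={'|'.join(flags)}")
```

The low bits 0x351 break down as 0x1 (16x16), 0x10 (YTR), 0x40 (SPARSE), 0x100 (TILED) and 0x200 (SC), matching the string cphealy pasted.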
* alyssa considers harder getting rid of blend shaders.
<alyssa> (for GL. they already don't exist in VK.)
<alyssa> If an app doesn't use blend shaders, the extra overhead from having to maybe have a blend shader is nontrivial.
<alyssa> (Especially on Valhall, but even on Bifrost it adds a bunch of RA constraints which hurts code gen especially if MRT is in use.)
<alyssa> If an app *does* use blend shaders, they really want them inlined, I suspect the i-cache cost of the extra variants+inlining is < the i-cache cost of jumping to a blend shader in a random other place
<alyssa> Would save 2 instructions (~4 on Valhall) per blended render target, fewer RA constraints, and possibly better scheduling possible.
<alyssa> In terms of number of variants, it's not "# of blend states" like on AGX
<alyssa> it's "# of blend shader states (+ 1 for non-blend shader states)"
<alyssa> Only a single variant if the app doesn't use blend shaders.
<alyssa> (The combinatorics do start sucking with MRT. but again, that's where we want inlining most.)
<alyssa> to get benefit from the blend shader setup would need the same fragment shader being used with different framebuffer format / blend state combinations after reducing hardware blending.
<alyssa> I suppose that effect would have to be measured. But it seems a mild thing to add variants for.
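A rough sketch of the variant-count argument above, not of Panfrost's actual code. It assumes each draw's blend configuration is described per render target as either `None` (the fixed-function blender can handle it) or some hashable description of a state that needs shader blending; all names and state strings here are made up for illustration.

```python
def variant_key(blend_states):
    """Key a fragment shader variant by its per-render-target blend needs.

    blend_states: one entry per render target. None means hardware blending
    suffices; anything else describes a state that needs shader blending.
    All hardware-only configurations share a single key.
    """
    if all(state is None for state in blend_states):
        return None
    return tuple(blend_states)


def count_variants(draws):
    """Number of distinct variants one fragment shader would need."""
    return len({variant_key(states) for states in draws})


# An app that never needs shader blending: a single variant.
hw_only = [(None,), (None,), (None, None)]

# An app that sometimes needs it: one variant per distinct shader-blend
# state, plus one shared variant for everything the hardware handles.
mixed = [(None,), ("logicop_xor",), ("logicop_xor",), ("unsupported_fb_format",)]

print(count_variants(hw_only), count_variants(mixed))
```

With multiple render targets the key becomes a tuple with one entry per target, which is where the combinatorics mentioned above start to grow.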
<alyssa> Also, special shoutout to the blend constant inlining we do which is bonkers and could be fixed properly (replaced with a sysval) if we yeet the blend shader infrastructure
<alyssa> make the GL driver more VK, in a sense.
<alyssa> (what you'd get using Zink+PanVK)
<anholt> vc4 just baked blending into the shader as variants (and const color is a sysval). worked great other than gl-1.0-blend-func.
<alyssa> anholt: Heh, nice. Mali *does* have blending hardware, it's just limited (doesn't support every blend mode or fb format) hence the shader fallback.
<alyssa> Valhall adds float blending hardware.. so that should take further pressure off there.
<alyssa> once I wire it up.
rasterman has quit [Quit: Gettin' stinky!]
<cwabbott> the question is, is the reduction of stuttering worth it?
<cwabbott> the problem with variants isn't the extra binaries, it's the stuttering
<cwabbott> depends on whether most real-world things hit the HW blending path
<cwabbott> also, with SSA-based RA and live-range splitting the extra constraints aren't a problem anymore
<cphealy> alyssa: regarding AFBC, do you see the same with Bifrost? I'm still seeing only DRM_FORMAT_MOD_ARM_16X16_BLOCK_U_INTERLEAVED with bifrost.
<cwabbott> well, aren't a problem for generated code quality, except for an extra move or two... I guess it still makes things more complicated in RA if you have to support it
WoC has quit [Remote host closed the connection]
<alyssa> cwabbott: right..
<alyssa> maybe I botched the implementation, but both current and SSA RAs produce a ton of moves with MRT
<alyssa> and increases the register demand at the moment of the BLEND instruction
<alyssa> (because we don't know how many registers the blend shader will need)
rasterman has joined #panfrost
<alyssa> [vh-plus 1b418f7238d] WIP: panfrost: Support FP16 blending
<alyssa> cwabbott: To be honest, I don't think I'm smart/experienced enough to implement SSA-based RA. I tried.
<alyssa> Topped out trying to implement spilling.
rasterman has quit [Quit: Gettin' stinky!]
icecream95 has joined #panfrost
<alyssa> CSF counters here are interesting
<alyssa> Not sure if you came across that
<icecream95> alyssa: Interesting. I suspect that there are two levels of command stream, one which is executed on the MCU and another for the iterators
<icecream95> That matches what I've settled on for decoding in panwrap
<icecream95> Oh wait I haven't committed that change yet
erle has quit [Ping timeout: 480 seconds]
<alyssa> I'd believe it
nlhowell is now known as Guest741
nlhowell has joined #panfrost
Guest741 has quit [Ping timeout: 480 seconds]