ChanServ changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://oftc.irclog.whitequark.org/panfrost - <macc24> i have been here before it was popular
erle has quit [Ping timeout: 480 seconds]
erle has joined #panfrost
psydroid[m] has quit [Ping timeout: 480 seconds]
JulianGroOld[m] has quit [Ping timeout: 480 seconds]
jenneron[m] has quit [Ping timeout: 480 seconds]
toggleton[m] has quit [Ping timeout: 480 seconds]
go4godvin has quit [Ping timeout: 480 seconds]
unevenrhombus[m] has quit [Ping timeout: 480 seconds]
Dylanger has quit [Ping timeout: 480 seconds]
strongtz[m] has quit [Ping timeout: 480 seconds]
stebler[m] has quit [Ping timeout: 480 seconds]
CalebFontenotHaileysCuteNerdyB has quit [Ping timeout: 480 seconds]
rasterman has quit [Quit: Gettin' stinky!]
erle has quit [Ping timeout: 480 seconds]
erle has joined #panfrost
jenneron[m] has joined #panfrost
JulianGroOld[m] has joined #panfrost
toggleton[m] has joined #panfrost
anarsoul is now known as anarsoul|2
anarsoul|2 is now known as anarsoul
pjakobsson has quit []
guillaume_g has joined #panfrost
pendingchaos has quit [Remote host closed the connection]
pendingchaos has joined #panfrost
greenjustin has quit [Ping timeout: 480 seconds]
hexdump01 has joined #panfrost
hexdump01 has quit []
hexdump0815 has joined #panfrost
<hexdump0815>
greenjustin: (in case you read this offline in the irc logs) i have tested your reworked gpu freq scaling patch on mainline and it seems to work well so far
<hexdump0815>
i did not do too much testing yet, but so far i did not see any gpu page faults etc. and it seems to scale the gpu freq properly
<hexdump0815>
this is on kukui-jacuzzi-kappa with v5.18-rc6
<hexdump0815>
i'm using it on xorg xfce on debian bullseye still with some old mesa version right now (21.3.2)
camus has quit [Remote host closed the connection]
rasterman has joined #panfrost
camus has joined #panfrost
CounterPillow has quit [Quit: Bye.]
CounterPillow has joined #panfrost
CounterPillow has quit []
CounterPillow has joined #panfrost
CounterPillow has quit []
CounterPillow has joined #panfrost
CounterPillow has quit []
CounterPillow has joined #panfrost
rkanwal has joined #panfrost
MajorBiscuit has joined #panfrost
MajorBiscuit has quit [Quit: WeeChat 3.4]
megi has quit [Quit: WeeChat 3.5]
megi has joined #panfrost
pjakobsson has joined #panfrost
<jekstrand>
alyssa: I think the time has come to split up bi_finalize_nir/bi_optimize_nir a bit.
<alyssa>
jekstrand: Uh oh
<jekstrand>
alyssa: For panvk, we want the same preprocess/finalize pattern the Intel drivers have.
<jekstrand>
Basic idea is preprocess does up-front stuff that you know you want for all shaders like lower_tex, etc. and runs the optimization loop
<jekstrand>
finalize does everything you need to do at the last minute before passing to the back-end.
* alyssa
is listening
<jekstrand>
Then panvk will do:
<jekstrand>
1. SPIR-V -> NIR and maybe a tiny bit to inline functions, etc.
<jekstrand>
2. bi_preprocess_nir
<jekstrand>
3. panvk-specific lowering like handling descriptor sets
<jekstrand>
4. compile_from_nir which calls bi_finalize_nir
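[editor's sketch] The four steps above can be modelled as a toy pipeline. This is only an illustration of the ordering, not Mesa code: a "shader" here is a plain list recording which stages ran. Only bi_preprocess_nir, bi_finalize_nir and compile_from_nir are names from the discussion; spirv_to_nir, panvk_lower_descriptor_sets, opt_loop and the stage strings are hypothetical stand-ins.

```python
# Toy model: a "shader" is a list of stage names, appended to as stages run.
# Hypothetical helpers; this does not call or mirror any real Mesa API.

def opt_loop(shader):
    shader.append("opt_loop")

def spirv_to_nir(spirv):
    # Step 1: translate, plus a tiny bit of early work such as inlining.
    return ["spirv_to_nir", "inline_functions"]

def bi_preprocess_nir(shader):
    # Step 2: up-front lowering wanted for all shaders, then the opt loop.
    shader.append("lower_tex")
    opt_loop(shader)

def panvk_lower_descriptor_sets(shader):
    # Step 3: driver-specific lowering, run on already-preprocessed NIR.
    shader.append("lower_descriptor_sets")

def bi_finalize_nir(shader):
    # Last-minute work right before the back-end.
    shader.append("last_minute_lowering")

def compile_from_nir(shader):
    # Step 4: the compile entry point calls finalize itself.
    bi_finalize_nir(shader)
    shader.append("backend")
    return shader

def panvk_compile(spirv):
    shader = spirv_to_nir(spirv)
    bi_preprocess_nir(shader)
    panvk_lower_descriptor_sets(shader)
    return compile_from_nir(shader)

stages = panvk_compile(b"")
print(stages)
```

The point of the split is visible in the ordering: descriptor set lowering sees a shader that has already been through lower_tex and the optimization loop, without panvk duplicating those passes.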
<alyssa>
ok...
<alyssa>
Why is that better than what panvk does now?
<jekstrand>
Because we're having to duplicate a bunch of stuff that's currently in bi_finalize_nir or bi_optimize_nir so it can happen before descriptor set lowering.
<alyssa>
also can we kill the nir_lower_flrp nonsense across the tree while we're at it? I moved it from opt loop to "preprocess" on AGX and it hasn't blown up yet
<alyssa>
Hmmmm, okay.
<jekstrand>
Also, blend lowering should go between preprocess and finalize
<alyssa>
Aha, sure
<alyssa>
wait, but then the blend code doesn't get optimized
<alyssa>
is that ok?
<alyssa>
I guess for Bifrost it should be fine... Midgard relies heavily on opt_vectorize running after lower_blend
<jekstrand>
The optimization loop gets run twice, once in preprocess and once in finalize
<alyssa>
oh?
<jekstrand>
The one in finalize shouldn't do much
<alyssa>
doesn't that hurt compile time (at least for GLES)?
<jekstrand>
We'll have to see. A single noop run shouldn't be bad
<alyssa>
Hope so
<alyssa>
In general I'm a little suspicious of our optimization loops in Mesa
<jekstrand>
Me too
<alyssa>
I realize there are some opts (esp. algebraic) that really need many runs to converge
<alyssa>
but that's not true of every opt. and I think we just throw everything into the big loop and don't think about convergence issues (unless passes fight) and that's it..
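[editor's sketch] A toy fixed-point loop, to make the convergence point concrete. It assumes a made-up expression format (nested tuples) and a single made-up algebraic pass; nothing here is NIR. The pass rewrites top-down, so one run can expose work that only the next run picks up, and a loop over already-optimized input costs exactly one no-progress run.

```python
# Toy expressions: a leaf is a str or int, a node is (op, lhs, rhs).
# simplify_once and opt_loop are hypothetical names for this sketch.

def simplify_once(expr):
    """One top-down pass of x+0 -> x and x*1 -> x. Returns (expr, progress)."""
    if not isinstance(expr, tuple):
        return expr, False
    op, a, b = expr
    if (op == "add" and b == 0) or (op == "mul" and b == 1):
        # Rewrite this node and stop; children wait for the next run.
        return a, True
    a, progress_a = simplify_once(a)
    b, progress_b = simplify_once(b)
    return (op, a, b), progress_a or progress_b

def opt_loop(expr, passes):
    """Run every pass repeatedly until a full round makes no progress."""
    runs = 0
    progress = True
    while progress:
        progress = False
        for run_pass in passes:
            expr, made_progress = run_pass(expr)
            progress = progress or made_progress
        runs += 1
    return expr, runs

first = opt_loop(("add", "x", ("mul", 0, 1)), [simplify_once])
second = opt_loop(first[0], [simplify_once])
print(first)   # ('x', 3)
print(second)  # ('x', 1)
```

Here the first call needs three rounds (two that make progress, one to confirm convergence), while the second call on the converged result is a single no-op round. A pass that cannot be re-enabled by the others gains nothing from sitting inside the loop.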
<jekstrand>
Yeah
greenjustin has joined #panfrost
<jekstrand>
ugh... I'd forgotten that flrp was that ugly. I think I dug into why a while ago but gave up
<jekstrand>
Of course, the commit message says nothing about why it needs to be part of the opt loop and why it needs this weird lower once business.
<jekstrand>
*sigh*
<alyssa>
I guess if you're Intel with funny flrp lowering it might make sense
<alyssa>
for any driver that wants the straightforward lowering... just call it once with eg lower_tex?
<jekstrand>
maybe?
<alyssa>
I should shader-db that I guess
<jekstrand>
alyssa: Does GLES not have support for struct varyings?
<alyssa>
jekstrand: It gets lowered
<alyssa>
not sure if by NIR or by GLSL
<alyssa>
Ideally panvk would lower and the backend remains unaware
psydroid[m] has joined #panfrost
go4godvin has joined #panfrost
go4godvin is now known as Guest513
guillaume_g has quit []
<jekstrand>
alyssa: Agreed. I just don't know what lowering to call. :-/
<alyssa>
Woof.
<alyssa>
jekstrand: split_struct_vars doesn't want to do it
<jekstrand>
I think the reason Intel drivers don't care is because they never look at varying variables
<alyssa>
and look at what instead? sem.location?
<jekstrand>
Just location
<jekstrand>
Which we map to sensible things
<alyssa>
Hm, might be doable for us too
<jekstrand>
Iris might be doing some remapping somewhere. I don't remember
<alyssa>
I can give that a look when I'm done staring at dEQP-GLES2.functional.shaders.loops.do_while_dynamic_iterations.nested_tricky_dataflow_1_fragment
<jekstrand>
It's all a bit of a mess. There are about 4 different ways the information can get passed around and everyone who touches seems to feel the need to add another. :angry:
<alyssa>
woof
<jekstrand>
Like bi_emit_fragment_out uses driver_location to look up the variable so it can get the regular location. Why don't we just use the regular location?!?
<jekstrand>
Because gallium...
<alyssa>
Uhhhh
<alyssa>
that might be unnecessary?
<alyssa>
sem.location would probably be better?
<jekstrand>
idk. sem is also a gallium thing
<alyssa>
it is?
<jekstrand>
Oh, that's right... The intel drivers do their own remapping using the VUE map thing
<jekstrand>
woof
<cwabbott>
looking up the variable sounds like a leftover from before sem.location was added
<cwabbott>
and sem.location is definitely not just a gallium thing, nir_lower_io sets it
<cwabbott>
the original point of driver_location is that it's the index into the table o' inputs/outputs setup by the backend
<cwabbott>
or if your driver can do some fancy compression, it would facilitate that
<cwabbott>
but then intel bypasses that and does their own compression
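[editor's sketch] A toy illustration of the distinction described above: the semantic location names a slot, while driver_location is an index into a dense table set up by the backend. The slot numbers and the function name are hypothetical, and this is not how nir_lower_io or any driver actually assigns them.

```python
# Hypothetical sketch: pack sparse semantic slots into dense table indices.

def assign_driver_locations(semantic_locations):
    """Map each used semantic slot to a dense index, in ascending slot order."""
    unique_slots = sorted(set(semantic_locations))
    return {slot: index for index, slot in enumerate(unique_slots)}

used_slots = [40, 0, 33]  # made-up semantic slot numbers
table = assign_driver_locations(used_slots)
print(table)  # {0: 0, 33: 1, 40: 2}
```

With a table like this, code that already knows the semantic slot has no need to go back through the variable to recover it.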
jekstrand has quit [Remote host closed the connection]
jekstrand has joined #panfrost
soreau has quit [Ping timeout: 480 seconds]
pch has quit [Remote host closed the connection]
pch has joined #panfrost
jekstrand has quit [Ping timeout: 480 seconds]
soreau has joined #panfrost
robmur01 has quit [Quit: Leaving]
rasterman has quit [Remote host closed the connection]
rasterman has joined #panfrost
jekstrand has joined #panfrost
jekstrand is now known as Guest534
jekstrand has joined #panfrost
Guest534 has quit []
jekstrand has quit []
jekstrand has joined #panfrost
<jekstrand>
alyssa: Does panfrost support FB fetch from depth/stencil?
<jekstrand>
I expect no
<alyssa>
jekstrand: No, but the hardware does, and we could support it if we cared
<alyssa>
No use case yet.
<jekstrand>
kk
<jekstrand>
Nifty
<alyssa>
jekstrand: Also, future Malis might remove it. not sure yet.
<jekstrand>
alyssa: Vulkan requires support for depth input attachments. You can go through textures if you want to split the render pass after every draw.
<alyssa>
jekstrand: Ahhh
<jekstrand>
For an immediate renderer, it's a flush. It sucks but oh, well. On a tiler, it's death.
<anholt_>
is mali like qcm where you can just texture your tile buffer?
<alyssa>
anholt_: depends what you mean by texture
<alyssa>
It goes through the load/store pipe, not the texture pipe
<alyssa>
(so it's samplerless)
<alyssa>
but you can definitely read from the tilebuffer cheaply, which the driver does to implement blending in a bunch of cases
<anholt_>
oh, I just misunderstood jekstrand's question, it was about fbfetch specifically, not just input attachments.
<jekstrand>
I'm reworking input attachment lowering so it can optionally re-route to FB fetch
<jekstrand>
For stuff in previous subpasses, you'll still get texturing. For self-dependencies, though, you'll get FB fetch.
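[editor's sketch] The routing rule described above, as a toy decision function. The function name, parameters and return strings are hypothetical; the real lowering works on NIR and is not shown in this log.

```python
# Hypothetical sketch of the routing decision for one input attachment read.

def route_input_attachment(written_in_subpass, read_in_subpass,
                           fb_fetch_enabled=True):
    """Pick how a read of an input attachment would be lowered."""
    is_self_dependency = written_in_subpass == read_in_subpass
    if is_self_dependency and fb_fetch_enabled:
        # Same subpass: read the tile buffer directly.
        return "fb_fetch"
    # Earlier subpass, or re-routing not enabled: go through a texture.
    return "texture"

print(route_input_attachment(1, 1))  # fb_fetch
print(route_input_attachment(0, 1))  # texture
```

The fb_fetch_enabled flag stands in for the "optionally" in the message above, so a driver that does not want the re-route still gets plain texturing.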
<alyssa>
do real apps (not benchmarks) make use of multiple subpasses?
<alyssa>
(Android Vulkan deferred renderers I guess?)
<jekstrand>
aztec ruins. :)
<jekstrand>
Yes, there are some real apps that do
<jekstrand>
I know someone who's working on writing a brand new render engine right now that's very much designed to take advantage of subpasses.
<jekstrand>
In the mobile world, people do actually target subpasses.
<jekstrand>
In desktop, not so much.
<alyssa>
right, ok
<alyssa>
SuperTuxKart on mobile would benefit a lot from subpasses
<alyssa>
icecream95: You played with that, right?
rasterman has quit [Quit: Gettin' stinky!]
* jekstrand
hates FB fetch. :-/
Danct12 has quit [Quit: Quitting]
<alyssa>
jekstrand: Why?
<jekstrand>
I get to do even more FS output var vec size mangling. :-(
<alyssa>
Oh..
<alyssa>
maybe load_output is a bad idea and we'd rather expose a load_tile intrinsic independent of the var...
<jekstrand>
We could, maybe
rasterman has joined #panfrost
icecream95 has joined #panfrost
strongtz[m] has joined #panfrost
stebler[m] has joined #panfrost
<icecream95>
alyssa: I tried making STK use framebuffer fetch rather than switching framebuffers all the time, but I think I gave up because recompiling after each change took too long
Danct12 has joined #panfrost
anarsoul|2 has joined #panfrost
anarsoul has quit [Read error: Connection reset by peer]
CalebFontenotHaileysCuteNerdyB has joined #panfrost