ChanServ changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://oftc.irclog.whitequark.org/panfrost - <macc24> i have been here before it was popular
<alyssa>
robmur01: Bifrost does and Valhall does not
<alyssa>
and it's probably saner to switch the whole compiler to Valhall-style
<alyssa>
Not forking the compiler for Valhall was *probably* still the right call but decisions like this make me less sure I admit
<robmur01>
does Valhall lose the weird restrictions on crossing 4GB boundaries and suchlike?
<alyssa>
which restrictions
<alyssa>
I'm just talking about pointers for LOAD and STORE instructions and such
<icecream95>
malloc: variables.c:3234: assertion botched. Fun.. panfrost.ko is causing memory corruption in unrelated processes again
<icecream95>
But it's only used for some atomic instructions, for others it seems to encode whether it's on an image or an SSBO.. maybe?
<alyssa>
There are different atom instructions
<alyssa>
I'll have XML typed out in a few hours if I don't keep getting distracted by IRC and sketchy ways of cooking fish
<HdkR>
Sketchy ways of cooking fish. In the dishwasher on high-heat loads
<alyssa>
HdkR: that's the one!
<icecream95>
alyssa: "I think that was v10 only". From comparing shaders between v9 and v10, I still think that it is possible that there are no ISA changes in v10
<icecream95>
In fact, it might still be possible (if unlikely) that the *only* change between v9 and v10 is the CSF
<alyssa>
v10 definitely has cmdstream changes (beyond CSF itself)
<alyssa>
CSF kernel support indicates tiler changes
* icecream95
decides that it would be pointless to argue about what exactly constitutes a "cmdstream change"
<alyssa>
ATOM1.i32 and ATOM1_RETURN.i32 have the same opcode
<alyssa>
Grumble.
<alyssa>
Distinguished only by sr_count
<icecream95>
alyssa: You mean the incr/decr instruction?
<alyssa>
Yeah
<alyssa>
ATOM_C1 on bifrost
<alyssa>
Guess I can model as ATOM_C1_RETURN always, and just allow omitting the staging reg
<alyssa>
Presumably that's what the hardware is actually doing
<alyssa>
and separating out the opcodes is just an assembler syntax detail to make Valhall feel more like Bifrost
<alyssa>
That syntax is a little clunky but it gets the point across
<alyssa>
(it "should" be "ATOM_C1.i32.slot0.ainc.wait0 r50, offset:0x0")
<alyssa>
For the case of "increment and discard the result"
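A minimal sketch of the modelling alyssa describes above: a single instruction record whose staging register is optional, with the two assembler mnemonics derived from sr_count alone. The class, field names and the one-register-per-i32-result rule are assumptions for illustration, not taken from ISA.xml or the Mesa sources.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Atom1:
    """One record for both ATOM1 and ATOM1_RETURN (hypothetical fields)."""
    address_reg: int
    dest_reg: Optional[int] = None  # staging register; None discards the result

    @property
    def sr_count(self) -> int:
        # Assumption: a 32-bit result occupies exactly one staging register.
        return 0 if self.dest_reg is None else 1


def mnemonic(sr_count: int) -> str:
    # Same opcode either way; only the staging register count differs.
    return "ATOM1_RETURN.i32" if sr_count else "ATOM1.i32"


discard = Atom1(address_reg=50)             # increment and discard the result
fetch = Atom1(address_reg=50, dest_reg=2)   # increment and keep the old value
```

With this shape the "increment and discard" case is just the returning form with the destination left out, which matches the guess about what the hardware is doing.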
<icecream95>
At least it should make it more clear that there is an empty destination.. the order of registers for atomic operations always confused me
<alyssa>
is that a question?
<icecream95>
no
* alyssa
wonders why Valhall phone won't turn on
<alyssa>
Dead battery maybe? boring
<alyssa>
I have atomics, I have shared mem, I do not yet have the interaction
<anarsoul>
alyssa: compute shaders?
<alyssa>
yes
<alyssa>
Something you don't have to worry about over at #lima ;)
<anarsoul>
:P
<icecream95>
alyssa: At least ATOM1* using the same opcode is better than FATAN_ASSIST, which does two different operations returning f32 or v2f16 depending on a modifier bit
<alyssa>
oof
<icecream95>
alyssa: On the topic of random instructions, FSIN_TABLE.u6 is actually floating point--it will return -0 if the source is not in a specific range of values
<alyssa>
Hmm?
<alyssa>
icecream95: also, just posted the MR adding atomics to ISA.xml
<alyssa>
and have been typing away at the mesa side
<alyssa>
32-bit should be correct given the tests are passing :-p
<alyssa>
64-bit is untested so YMMV, it's not used by gles31 at least
<alyssa>
Gets >90% passing on dEQP-GLES31 so that's nice
<HdkR>
oooo
<alyssa>
so at >90% for the whole gles31 cts, I think
<HdkR>
GJ! :D
<alyssa>
looks like this stuff will unfortunately miss the branch point because there are too many hacks in my branch and I'm part time but hey
<icecream95>
alyssa: The domain of FSIN_TABLE is 524288.0 to 1048575.9375
<alyssa>
it's supposed to just look at the bottom 6 bits, hence the name..
<icecream95>
But if it did that, then out of bounds values would return a wrong value.. so it returns -0 instead
<alyssa>
Interesting
<alyssa>
so this is the hardware being "clever"?
<icecream95>
Note that the domain is every value of a certain exponent.. the exponent where the seventh mantissa bit has a value of 4
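The quoted range can be checked with plain binary32 arithmetic: 524288.0 is 2**19, and 1048575.9375 is the largest float below 2**20, so the domain is every positive value with unbiased exponent 19. The sketch below only verifies that arithmetic and the weight of the seventh mantissa bit; it does not model FSIN_TABLE itself, and the helper names are made up for illustration.

```python
import struct


def f32_bits(x: float) -> int:
    """Raw IEEE 754 binary32 bit pattern of x."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def in_reported_domain(x: float) -> bool:
    # Sign bit clear and biased exponent 146 (unbiased 19),
    # i.e. 2**19 <= x < 2**20.
    return (f32_bits(x) >> 23) == 19 + 127


def mantissa_bit_weight(bit: int, exponent: int = 19) -> float:
    # Weight of mantissa bit `bit` (0 = least significant) in a binary32
    # value with the given unbiased exponent.
    return 2.0 ** (exponent - 23 + bit)
```

At this exponent the least significant mantissa bit is worth 0.0625, so the bottom six bits cover 0 to 3.9375 and the seventh bit (index 6) is worth 4.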
<icecream95>
alyssa: I'm confused as to how I managed to apparently get ATOM and ATOM_RETURN mixed up, thinking that the latter was 0x68 rather than 0x120
<icecream95>
I think that it was just a mistake on my part, and there are no weird things like swapping the operations on v10
<alyssa>
sure
<icecream95>
alyssa: Implementing adjacent nodes for >vec4 support was pretty easy after all of my RA optimisations: 1 file changed, 68 insertions(+), 9 deletions(-)
<alyssa>
:)
<icecream95>
(nodearray branch on my fdo repo)
<icecream95>
Only build-tested, of course
<icecream95>
Implementing the actual splitting of nodes is an exercise for the reader
<icecream95>
Next step: Splitting nodes even more until everything is vec1 :P
<alyssa>
I mean, if splitting works "that" well it seems like the natural thing to do :-p
<alyssa>
you should know that Valhall has real write masks that I'd like to wire up at some point..
<icecream95>
Well yeah, I spent quite a bit of time trying to get Ghidra to handle that correctly
<icecream95>
About splitting.. vec4 is a good size for NEON to deal with because the constraints are 7 bits, perfect for signed 8-bit operations
* alyssa
regretting her life choices intensifies
<alyssa>
should've gone with real SSA RA when I had the chance... should've should've should've...
<icecream95>
alyssa: But trying to optimise LCRA is just so much fun!
<alyssa>
I see that yes
<macc24>
fun fact: mt8186 has mali g52
<alyssa>
8186?
<alyssa>
is this a thing I have to deal with now?
<macc24>
some upcoming mtk chromebook chip
<macc24>
they put a gpu that is years old into soc that's still not released in 2022 xD
* alyssa
wires up images
<alyssa>
...or maybe I should just be upstreaming harder
<icecream95>
alyssa: What happens if someone tries to use 32-bit instructions to extract the top half of e.g. the program counter?
<alyssa>
elaborate?
<icecream95>
va_pack_src won't pack e.g. BIR_FAU_PROGRAM_COUNTER correctly if the index has a nonzero offset
* alyssa
grumbles
<alyssa>
you are of course correct, fixing
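A toy model of the class of bug reported here: a 64-bit special value read as two 32-bit words, where a packer that drops the word offset always selects the low word. The encoding (word select in the low bit) and all function names are hypothetical and are not the real va_pack_src or the Valhall FAU layout.

```python
def pack_fau_buggy(fau_index: int, offset: int) -> int:
    # Drops `offset`, so a request for the upper word encodes the lower word.
    return fau_index << 1


def pack_fau_fixed(fau_index: int, offset: int) -> int:
    # Hypothetical layout: the low bit selects which 32-bit word is read.
    assert offset in (0, 1)
    return (fau_index << 1) | offset


def read_word(value64: int, encoded: int) -> int:
    # What the toy "hardware" returns for an encoded source.
    half = encoded & 1
    return (value64 >> (32 * half)) & 0xFFFFFFFF


pc = 0x0000007F12345000  # made-up 64-bit program counter value
```

Reading the top half with the fixed packer yields 0x7F, while the buggy packer quietly returns 0x12345000 for the same request.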
<icecream95>
alyssa: e.g. va_pack_atom_opc.. so we have this nice XML with all of the enums, and then you decide to use a bunch of magic numbers?
<alyssa>
...Would you rather I generate piles of C code?
<alyssa>
because I can do that C:
<icecream95>
alyssa: "va_optimizer". s/r//? Optionally also s/z/s/
<alyssa>
done
<alyssa>
icecream95: re magic numbers, maybe the right way forward is generating VA_ATOM_OPC_AADD etc enums from the XML, but open coding the Bifrost->Valhall enum translation?
<icecream95>
alyssa: "Offending code:". I guess I would find it offensive too if someone printed me to fp when the fprintf was to stderr
<alyssa>
Yes, very offensive. Fixed.
<alyssa>
(Trying to generate the enum translation would require some degree of coupling between the IR and the Valhall ISA definitions... That coupling is what got us into the current Bifrost ISA.xml mess that you complained about a few hours ago.)
<icecream95>
alyssa: on magic numbers.. I guess you could do that. Otherwise you'd have to do something like reference Bifrost names from the Valhall XML and make everything more of a mess
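A rough sketch of the compromise discussed above: generate VA_ATOM_OPC_* style enums from the XML, and leave the Bifrost->Valhall translation as hand-written C that only refers to the generated names. The XML schema, the element and attribute names, and the encoding values below are invented for illustration and do not reflect the real Valhall ISA.xml.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML fragment; real schema and encodings may differ.
ISA_XML = """<valhall>
  <enum name="Atomic operation">
    <value name="AADD" encoding="2"/>
    <value name="ASMIN" encoding="8"/>
  </enum>
</valhall>"""


def emit_c_enum(xml_text: str, enum_name: str, prefix: str) -> str:
    """Render one <enum> from the XML as a C enum with prefixed names."""
    root = ET.fromstring(xml_text)
    enum = next(e for e in root.iter("enum") if e.get("name") == enum_name)
    lines = [f"enum {prefix.lower()} {{"]
    for value in enum.iter("value"):
        lines.append(f"   {prefix}_{value.get('name')} = {value.get('encoding')},")
    lines.append("};")
    return "\n".join(lines)


generated = emit_c_enum(ISA_XML, "Atomic operation", "VA_ATOM_OPC")
```

The generator never mentions Bifrost names, so the XML stays free of IR coupling; the open-coded switch in the packer is the only place both sets of names meet.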