ChanServ changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs - <macc24> i have been here before it was popular
nlhowell is now known as Guest1322
nlhowell has joined #panfrost
Guest1322 has quit [Ping timeout: 480 seconds]
icecream95 has joined #panfrost
<icecream95> Argh, developing v3d is too much work with 250 ms latency to "your" board
<HdkR> There's only so much that mosh can do :(
<alyssa> that's a shame for v3d..
<icecream95> Nah, I don't even have UDP for mosh
<alyssa> icecream95: My colleagues were quite impressed by your CSF work, by the way.
nlhowell has quit [Ping timeout: 480 seconds]
<icecream95> alyssa: Can I have actual hardware now?
<icecream95> There's only so much you can do looking through 2GB of traces from dEQP
<alyssa> which gen?
<alyssa> v9 is commercially available, v10 I don't have.
<icecream95> The manufacturer of a v10 board is offering me hardware when it comes out, but I still have no idea what happened to the v7 board they promised
* alyssa chuckles
<icecream95> Currently I'm implementing shader printf for v3d to make debugging the problem there easier
<alyssa> sounds fun
<alyssa> for you, not for the v3d hardware ;)
<alyssa> oh, um, I'm sorry for probably breaking your push optimizations..
<alyssa> [with 6b2eda6b729 ("pan/bi: Reorder pushed uniforms to avoid moves")]
<icecream95> What, the reordering patch? I was planning to revert that in my fork
<alyssa> Fair enough
<icecream95> Hmm.. I've written tools for replaying ELF core dumps of GPU memory, but what I really need is for the kernel to make a unified CPU and GPU core dump
<icecream95> (I should look at the devcoredump patches to see if that's a better format.. but then I'd have to work out how to import that into Ghidra)
* alyssa hasn't looked at either in depth so can't comment there
<icecream95> Also in the long list of half-finished things is a new scheduler for Bifrost with inspiration from GCC
<alyssa> oh?
* alyssa always wanted to do more with the Bifrost compiler, but is now refocused on Valhall :|
<alyssa> ...Same went for the Midgard compiler, to be fair.
<icecream95> One problem I note with kernel-space BO dumping is that you can't include annotations such as source code for the shaders
<alyssa> Right..
<HdkR> Did the ChromeOS BO object naming never make it upstream to Linux?
<HdkR> ChromeOS/Android
<HdkR> BO object. redundancy department
<icecream95> HdkR: Can it include MBs worth of text data with each BO?
<HdkR> I think it is an arbitrary length "name"
<HdkR> Just does a memcpy in the kernel implementation afaik
<alyssa> icecream95: Unrelated, not sure if you're still interested in Midgard/OpenCL, but the "AVG" instruction is hadd
<alyssa> (and "AVG.round" is "rhadd")
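For reference, hadd and rhadd are the OpenCL names for the halving add and the rounding halving add: (a + b) >> 1 and (a + b + 1) >> 1, computed as if the intermediate sum could not overflow. A minimal sketch of the usual overflow-free bit tricks follows; the 32-bit unsigned lane width and the MASK32 name are assumptions for illustration, not something stated in the channel.

```python
MASK32 = 0xFFFFFFFF  # assumed lane width: 32-bit unsigned


def hadd(a, b):
    """Halving add: floor((a + b) / 2) without a wider intermediate.

    Uses a + b == 2 * (a & b) + (a ^ b).
    """
    return ((a & b) + ((a ^ b) >> 1)) & MASK32


def rhadd(a, b):
    """Rounding halving add: floor((a + b + 1) / 2), also overflow-free.

    Uses a + b == 2 * (a | b) - (a ^ b).
    """
    return ((a | b) - ((a ^ b) >> 1)) & MASK32
```

Both helpers stay within 32 bits even when a and b are at the maximum value, which is the point of having a dedicated instruction.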
<alyssa> Bifrost instruction naming tends to be more sensible than midgard..
<icecream95> That reminds me, both Midgard and Bifrost seem to generate inefficient instruction sequences for loops..
<alyssa> Yep.
<alyssa> NIR makes it a bit annoying to do otherwise, and we've all shrugged and said "loops are going to be slow anyway.."
<alyssa> jekstrand and I have talked about this a bit.
<icecream95> I did try to make a NIR pass which rearranges the blocks, but I didn't manage to get it to work properly
<alyssa> CF is tricky.
<alyssa> Adding do...while to NIR sounds good on paper, until you realize it means auditing every NIR pass that does nontrivial CF manipulation
<icecream95> Or just run that pass after all the other ones doing CF manipulation? :)
<alyssa> heh, well, yes
<alyssa> I'm sure jekstrand would love that ;)
<alyssa> A helper to detect do...while from idiomatic NIR, and using that information when translating NIR->BIR, is a possible compromise
<alyssa> Not invasive to the core NIR data structures, still gets the code gen we need
<alyssa> and then a "regular" NIR pass can rearrange the blocks, as you put it, to go from idiomatic while loops to `if { do...while }` etc
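The rearrangement from an idiomatic while loop to `if { do...while }` can be sketched outside NIR. The two functions below compute the same sum; the branch counters are a toy cost model (one count per taken or tested jump), not a measurement of Midgard or Bifrost code, and the function names are made up for illustration.

```python
def sum_while(n):
    """Idiomatic form: condition tested at the top, jump back at the bottom."""
    total, i, branches = 0, 0, 0
    while True:
        branches += 1        # conditional branch at the loop header
        if not (i < n):
            break
        total += i
        i += 1
        branches += 1        # unconditional jump back to the header
    return total, branches


def sum_rotated(n):
    """Rotated form: if { do ... while }, one conditional branch per iteration."""
    total, i, branches = 0, 0, 0
    branches += 1            # guard, executed once before the loop
    if i < n:
        while True:
            total += i
            i += 1
            branches += 1    # conditional back-edge at the bottom
            if not (i < n):
                break
    return total, branches
```

In this model the while form costs 2n + 1 branches for n iterations and the rotated form costs n + 1, which is the saving the rotation is after.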
<icecream95> At least on Midgard, there are a lot of blocks which only jump to another block which could be removed without touching NIR
<alyssa> *nod*
<alyssa> Not sure if there'd be problem interactions with nir_from_ssa, I guess if the pass is careful it's fine
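Removing blocks that only jump to another block is often called jump threading. A minimal sketch on a hypothetical CFG representation follows (a dict mapping block name to an instruction list and a successor list); it is not the Midgard backend's data structure, and it ignores phi nodes and any ordering constraint relative to going out of SSA.

```python
def thread_jumps(blocks, entry):
    """Drop empty single-successor blocks and retarget their predecessors.

    blocks maps name -> (instructions, successors). The entry block is
    always kept, and cycles of empty blocks are left alone.
    """
    def resolve(name):
        # Follow empty forwarding blocks until reaching a real target.
        seen = set()
        while name != entry and name not in seen:
            instrs, succs = blocks[name]
            if instrs or len(succs) != 1:
                break
            seen.add(name)
            name = succs[0]
        return name

    result = {}
    for name, (instrs, succs) in blocks.items():
        if resolve(name) != name:
            continue  # pure forwarding block, no longer referenced
        result[name] = (instrs, [resolve(s) for s in succs])
    return result
```

For example, with "else" and "fwd" both empty, the branch in "entry" ends up pointing straight at "exit":

```python
cfg = {
    "entry": (["cmp"], ["then", "else"]),
    "then": (["add"], ["fwd"]),
    "else": ([], ["fwd"]),
    "fwd": ([], ["exit"]),
    "exit": (["ret"], []),
}
threaded = thread_jumps(cfg, "entry")
```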
rasterman has joined #panfrost
<icecream95> ..At one point I tried enabling compression (via OpenCL kernels) for every texture upload in my fork, but I had to disable that after switching tabs in Firefox took more than a second
<alyssa> Oof
<alyssa> ~~Does switching tabs in Firefox usually take less than that?~~
<icecream95> I know, it's far too slow even with software rendering
<icecream95> (The problem was that it was decompressing and recompressing the entire texture after every partial upload, many of which were on the order of 64x64 pixels)
<alyssa> Ouch
<HdkR> That's a good candidate for a small texture heuristic :D
<icecream95> It should be possible to only mess with the blocks that actually got changed.. but you still have to move the rest of the data to make space
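To put a number on "only the blocks that actually got changed": the sketch below lists the superblocks a partial upload overlaps. The 16x16 superblock size is an assumption for illustration (AFBC has other layouts), the function name is made up, and the sketch does not model the harder part mentioned above, moving the remaining payload data to make space.

```python
def touched_superblocks(x, y, w, h, block=16):
    """Return (bx, by) indices of superblocks overlapped by an upload rectangle.

    x, y, w, h are in pixels; block is the assumed superblock edge length.
    """
    bx0, by0 = x // block, y // block
    bx1, by1 = (x + w - 1) // block, (y + h - 1) // block
    return [(bx, by)
            for by in range(by0, by1 + 1)
            for bx in range(bx0, bx1 + 1)]
```

Under these assumptions an aligned 64x64 upload touches 16 superblocks, while a 1024x1024 texture contains 4096 of them, so recompressing everything does roughly 256 times the necessary work.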
<icecream95> I'm thinking of having a single massive BO for all AFBC textures, so that the data blocks can be shared between them
<alyssa> why is mali like this
<icecream95> ..It doesn't even need to be a single BO, really, as long as they are all within a 4GB region
vstehle has quit [Ping timeout: 480 seconds]
<icecream95> Unfortunately, it seems that AFRC is only supported on the lower-end v10 GPUs.. but it probably won't be capable of many of the fun tricks AFBC can do
* alyssa forgot AFRC was a thing
<icecream95> AFBC took me a week to RE, once I was scared that ARM would release its specification. How long will AFRC take?
* alyssa laughs
<alyssa> I don't know anything about AFRC except a few minutes of reading kernel patches and the Arm press release, so...
<icecream95> Was all the stuff with Arm just for Morello, and they don't care about Panfrost otherwise?
<alyssa> a weekend? :)
<alyssa> daniels: ^^
<cphealy> Is anyone else using the G52 experiencing issues with AFBC where only DRM_FORMAT_MOD_ARM_16X16_BLOCK_U_INTERLEAVED gets used? This is what I'm experiencing on Mesa 22.1-devel
rasterman has quit [Quit: Gettin' stinky!]
<daniels> icecream95: I really can’t speak for them, sorry (but wb!)
atler is now known as Guest1324
atler has joined #panfrost
Guest1324 has quit [Ping timeout: 480 seconds]
* alyssa is going to need to fix a bunch of broken Bifrost passes before adding proper 64-bit support, wee...
<alyssa> hm. v9 has more in common with v6 than v7
<alyssa> maybe v7 really is the oddball here :'p
* alyssa supposes message preloads aren't coming back
<alyssa> Pass: 14118, Fail: 75, Warn: 37, Skip: 34, Flake: 1, Duration: 6:05, Remaining: 0
<alyssa> i mean. i guess that's progress.
<alyssa> failing some clipping and interpolation tests, wonder what's subtly different from bifrost there..
atler is now known as Guest1327
atler has joined #panfrost
Guest1327 has quit [Ping timeout: 480 seconds]
<alyssa> I guess the scissor code is changed
<alyssa> oh something screwed up with the point size handling, I see.
soreau has quit [Read error: Connection reset by peer]
soreau has joined #panfrost
<icecream95> message preloads?.. I remember seeing some shaders on v10 where it looked like some registers were preloaded..
<icecream95> alyssa: ^^
<alyssa> ah, alright
<alyssa> Pass: 14128, Fail: 65, Warn: 38, Skip: 34, Duration: 6:00, Remaining: 0
<alyssa> ..some of those are regressions though >.>
<icecream95> "Duration: 6:00". Have you tried mimalloc? That at least speeds up shader compilation a lot
<alyssa> haven't looked at run time yet, the same test suite is under 2 minutes on the n2..
<HdkR> If only all performance issues could be solved with a drop in memory allocator
<icecream95> On the topic of tools to make things faster, mold is a really fast linker, especially after I reported a bug that broke linking Mesa
<HdkR> ooo, I'm super interested in trying out mold. I keep forgetting to try it
tjcorley has quit [Ping timeout: 480 seconds]
tjcorley has joined #panfrost
vstehle has joined #panfrost
camus has joined #panfrost
camus1 has quit [Remote host closed the connection]
dmh_ has joined #panfrost
dmh has quit [Ping timeout: 480 seconds]
pjakobsson has quit [Ping timeout: 480 seconds]
pjakobsson has joined #panfrost
macc24 has quit [Ping timeout: 480 seconds]
unevenrhombus[m] has joined #panfrost
rasterman has joined #panfrost
floof58_ has joined #panfrost
floof58_ has quit []
erlehmann has quit [Ping timeout: 480 seconds]
floof58_ has joined #panfrost
floof58 has quit [Ping timeout: 480 seconds]
floof58_ has quit []
floof58 has joined #panfrost
floof58 has quit []
floof58 has joined #panfrost
erlehmann has joined #panfrost
nlhowell has joined #panfrost
erlehmann has quit [Ping timeout: 480 seconds]
AreaScout_ has quit [Server closed connection]
AreaScout_ has joined #panfrost
rellla has quit [Server closed connection]
rellla has joined #panfrost
wilkom has quit [Server closed connection]
erlehmann has joined #panfrost
wilkom has joined #panfrost
camus1 has joined #panfrost
camus has quit [Ping timeout: 480 seconds]
camus1 has quit []
tomeu has quit [Server closed connection]
tomeu has joined #panfrost
stepri01 has quit [Server closed connection]
stepri01 has joined #panfrost
camus has joined #panfrost
megi has quit [Server closed connection]
megi has joined #panfrost
camus has quit []
camus has joined #panfrost
hl` has quit [Server closed connection]
hl` has joined #panfrost
floof58 has quit [Ping timeout: 480 seconds]
floof58 has joined #panfrost
camus has quit [Remote host closed the connection]
floof58 has quit []
floof58 has joined #panfrost
tanty has quit [Server closed connection]
tanty has joined #panfrost
nlhowell has quit [Ping timeout: 480 seconds]
jambalaya has quit [Server closed connection]
jambalaya has joined #panfrost
nlhowell has joined #panfrost
nlhowell is now known as Guest1400
nlhowell has joined #panfrost
Guest1400 has quit [Ping timeout: 480 seconds]
br has quit [Server closed connection]
br has joined #panfrost
nlhowell has quit [Ping timeout: 480 seconds]
HayashiEsme[m] has quit [Server closed connection]
HayashiEsme[m] has joined #panfrost
dhewg has quit [Server closed connection]
dhewg has joined #panfrost
<alyssa> icecream95: fixed 200+ piglits, so I had to go fix another 200+, so...
<alyssa> icecream95: tag you're it again :-p
<alyssa> actually i'm still it, cube array fix incoming
<alyssa> UnexpectedPass: 168
<alyssa> ok i'll take it
<icecream95> alyssa: Is there anything non-Piglit that actually needs GL_CLAMP to work properly?
<alyssa> probably not
<icecream95> But my OpenCL patches would probably fix more CL Piglits than that…
<alyssa> I don't mind OpenCL patches :-)
<icecream95> I think that Arm using CRC for transaction elimination was maybe not the best idea, because it makes collision attacks ~unfixable
<icecream95> Probably the only real way of avoiding attacks is to randomly (per application start) change the order of pixels to be fed into the CRC
<alyssa> I don't disagree.
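The reason a CRC is weak against deliberate collisions is that it is linear over GF(2): for equal-length inputs, the CRC of an XOR of three buffers equals the XOR of their CRCs, so finding colliding data reduces to linear algebra rather than brute force. The sketch below uses zlib.crc32 as a stand-in, since the polynomial the hardware uses is not stated here. permuted_crc is a hypothetical illustration of the per-start pixel reordering idea, with bytes standing in for pixels; it stays linear for a fixed seed, and only hides the order from an attacker.

```python
import random
import zlib


def xor_bytes(*chunks):
    """XOR equal-length byte strings together."""
    out = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, byte in enumerate(chunk):
            out[i] ^= byte
    return bytes(out)


def crc_of_xor(a, b, c):
    """Predict crc32(a ^ b ^ c) from the three individual checksums."""
    return zlib.crc32(a) ^ zlib.crc32(b) ^ zlib.crc32(c)


def permuted_crc(tile, seed):
    """CRC over the tile's bytes in a seed-dependent order (hypothetical)."""
    order = list(range(len(tile)))
    random.Random(seed).shuffle(order)
    return zlib.crc32(bytes(tile[i] for i in order))
```

With three equal-length tiles a, b and c, `zlib.crc32(xor_bytes(a, b, c))` matches `crc_of_xor(a, b, c)` without ever hashing the combined tile.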
<DPA> GL_CLAMP? I would expect programs to use that to render text & icons and similar things, to avoid artifacts at the edge of them from the next glyph / icon inside the same texture.
<alyssa> DPA: You're looking for GL_CLAMP_TO_EDGE
<alyssa> GL_CLAMP is a nonsense legacy thing
<DPA> Oh my god, I had no idea!
<alyssa> confusing, eh?
<DPA> Yes, very much.
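The difference shows up in how the coordinate is clamped before linear filtering. In legacy desktop GL, GL_CLAMP clamps s to [0, 1], so a sample at the edge blends in the border color, whereas GL_CLAMP_TO_EDGE clamps to [1/(2N), 1 - 1/(2N)] for a texture N texels wide. A 1D sketch follows; the function names are made up for illustration and mipmapping is ignored.

```python
import math


def clamp_legacy(s, size):
    """GL_CLAMP: clamp the normalized coordinate to [0, 1]."""
    return min(max(s, 0.0), 1.0)


def clamp_to_edge(s, size):
    """GL_CLAMP_TO_EDGE: clamp to the centers of the outermost texels."""
    half = 1.0 / (2 * size)
    return min(max(s, half), 1.0 - half)


def border_weight(s, size):
    """Fraction of a linear-filtered 1D sample taken from outside [0, size)."""
    u = s * size - 0.5
    i0 = math.floor(u)
    frac = u - i0
    weight = 0.0
    if i0 < 0 or i0 >= size:
        weight += 1.0 - frac   # left tap falls on the border
    if i0 + 1 < 0 or i0 + 1 >= size:
        weight += frac         # right tap falls on the border
    return weight
```

For a 4-texel texture sampled past the right edge, the GL_CLAMP path takes half of the result from the border, while the GL_CLAMP_TO_EDGE path takes none, which is why the latter is what text and icon atlases want.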
<icecream95> "CLAMP .. is broken for nearest filtering". Well, it isn't broken, I think it just handles it exactly the same as it does for linear filtering, which turns out to be different from what APIs expect
<icecream95> Fixing Piglits seems to be a good way to get negative LOC added from your patch..
<icecream95> alyssa: D'ya think this is the sort of hack that Panfrost needs upstream?
<icecream95> But wait there's more:
<icecream95> ?quit
<icecream95> oops
wolfshappen has quit [Ping timeout: 480 seconds]