ChanServ changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://oftc.irclog.whitequark.org/panfrost - <macc24> i have been here before it was popular
sigmaris has quit [Ping timeout: 480 seconds]
camus has quit []
megi has quit [Quit: WeeChat 3.2]
megi has joined #panfrost
<icecream95> I've just sent my first non-faulting Panfrost GPU jobs over the network
<icecream95> I just need to make it send the updated BO data back to the driver, then we can have accelerated Vulkan in web browsers...
* icecream95 remembers what happened last time he tried doing stupid things with Vulkan
sigmaris has joined #panfrost
<tomeu> alyssa: using the m1 for panfrost development sounds great to me, this is how I test my wip code:
<tomeu> sudo systemd-nspawn -D ~/nfsroot-panfrost sh -c "su tomeu -c 'ninja -C ~/deqp-build deqp-vk'" && sudo systemd-nspawn -D ~/nfsroot-panfrost sh -c "su tomeu -c 'ninja -j16 -C /home/tomeu/mesa-build' && ninja -j16 -C /home/tomeu/mesa-build install" && ssh 10.42.0.62 PAN_I_WANT_A_BROKEN_VULKAN_DRIVER=1
<tomeu> ~/deqp-build/external/vulkancts/modules/vulkan/deqp-vk -n dEQP-VK.renderpass.suballocation.unused_clear_attachments.*
<tomeu> I'm building with gcc under qemu as this is an x86 machine, so on your m1 it should be really fast
<icecream95> tomeu: With my patches for sending ioctls and BOs over a network you could run everything (including dEQP on panvk) on an x86 machine and only use the ARM device for sending the jobs to the GPU
<icecream95> You'd still need to test it natively to make sure synchronisation works correctly etc.
<macc24> tomeu: FYI gcc in qemu seems to have broken recently
<macc24> at least for me
<tomeu> icecream95: interesting!
<tomeu> macc24: I think it has been years since the last time I upgraded the debian on the nfsroot/container :)
<macc24> tomeu: jesus christ
<tomeu> my only interaction with the machine is that cmdline above, so I'm more than happy to not have to worry about it
rasterman has joined #panfrost
camus has joined #panfrost
<bbrezillon> alyssa: adding panfrost_dump.c to my TODO list ;). Right now I'm finishing the panvk+panfrost-common-lib per-gen split (I'm almost done BTW)
camus has quit [Read error: Connection reset by peer]
warpme_ has joined #panfrost
<tomeu> bbrezillon: have pushed a bunch of stuff from your blend+blit branch to https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12095/commits
<tomeu> plus the start of the split
<tomeu> do you think we could merge that next? and if so, what are the biggest issues that need to be addressed before we can do so?
wwilly has quit [Ping timeout: 480 seconds]
wwilly has joined #panfrost
<bbrezillon> tomeu: I'd like to merge the per-gen stuff first
wwilly has quit []
<tomeu> bbrezillon: ok, and then that MR?
<bbrezillon> tomeu: didn't look at it yet, but yes, that's the idea
camus has joined #panfrost
<alyssa> bbrezillon: awesome sauce
camus has quit [Read error: Connection reset by peer]
<bbrezillon> now I need to split the XMLs...
<alyssa> bbrezillon: I can split the XMLs
<alyssa> if you want to get back to literally anything else
camus has joined #panfrost
<bbrezillon> alyssa: I'm taking care of v7 right now, depending on how it goes I might let you do the others
<alyssa> bbrezillon: Fair enough =)
<Lyude> alyssa: asking since I vaguely remember this being a thing that had to be solved for panfrost at one point, what was the solution for us fixing the whole "failed to open foo: /usr/lib64/dri/foo.so". Did we just write up small dri shims for each different display chip using panfrost?
<Lyude> oh we might have it handled actually
<alyssa> Hmm?
<alyssa> `kmsro`
<alyssa> I think is the term to grep for
<Lyude> alyssa: gotcha
<alyssa> but yes basically that
<alyssa> robmur01: Noooo! my bifrost fp16 instr_invalid_enc faults are back, and now I have a reproducing shader that doesn't even use v2f32_to_v2f16 *or* f16_to_f32
<alyssa> i'm increasingly convinced the scheduler is fine and there's some rare bug in the packing code
<alyssa> Lyude: i am decidedly tempted to break out your assembler to debug this
<alyssa> of course i have no idea if I can still remember cwabbott's bifrost syntax :p
<Lyude> neat o:
<alyssa> i just
<alyssa> do i have a subtle bug? does the disasm have the same bug? is there a hw bug?
<alyssa> why is the DDK unaffected
<alyssa> and oh how I wish I could poke the DDK at a finer level
<alyssa> it's hard to figure out how it packs certain things when I have so little influence over clause scheduling
<alyssa> For this buggy shader, my schedule is 1 cycle faster than the DDK
<alyssa> Is the DDK missing a schedule opportunity? Or is there something subtly wrong about the schedule explaining the fault?
<robmur01> well, FWIW my instinct says that if two things are slightly different and one of them is wrong, there's a high chance that the difference is significant :)
<robmur01> +1 for disassemble that thing
wwilly has joined #panfrost
<alyssa> robmur01: hm
<alyssa> new/old bifrost syntax is too different uhm
<alyssa> bifrost is canceled
<HdkR> Perfect, long live valhall
<alyssa> help
<robmur01> can't you get both shader binaries and run them through the same tool?
<macc24> \o/ no more g72 quirks
<alyssa> robmur01: which tool?
<alyssa> I can't get the shader binaries close enough since Bifrost scheduling heuristics are basically just calling `rand()`
<robmur01> oh, I thought disassemblers existed already
<robmur01> this whole graphics malarkey still baffles me sometimes :)
<alyssa> disassembler yes
<alyssa> but clause scheduling makes that less than useful
<alyssa> Ok, this is significant --
<alyssa> *FMA.v2f16 r0:t0, t0.h00, 0x409a3f9a /* 4.820264 */, #0.neg
<alyssa> +NOP.i32 t1
<alyssa> *MKVEC.v2i16 r1:t0, t0.h1, 0x00003c00 /* 0.000000 */
<alyssa> +NOP.i32 t1
<alyssa> versus
<alyssa> C *FMA.v2f16 r0:t0, t1.h00, 0x409a3f9a /* 4.820264 */, #0.neg
<alyssa> +MKVEC.v2i16 r1:t1, t.h1, 0x00003c00 /* 0.000000 */
<alyssa> (DDK on top, Panfrost on the bottom)
<alyssa> Apparently DDK's clause scheduling does have logic preventing FMA.v2f16 / MKVEC.v2i16 from being put in a tuple together? but ... why?!
<alyssa> robmur01: daniels suggested I vary the modifier
<alyssa> apparently MKVEC.v2i16 with t0.h1 won't fuse (see above)
<alyssa> but the DDK -will- fuse MKVEC.v2i16 with t0.h0
<alyssa> there is absolutely no valid architectural reason for that distinction to matter
<alyssa> which just underscores that there is internal pipeline state leaking through and I mean
<alyssa> 14:05
<alyssa> I'll just end up playing whack-a-mole if I keep adding rules for these symptoms
<macc24> alyssa: how high a tolerance do you have for dumb questions?
<alyssa> macc24: fairly low but i can make an exception for you ;)
<macc24> alyssa: is bifrost compiler padding with nops?
<alyssa> macc24: yes, and that's not a dumb question
<macc24> why?
<alyssa> because bifrost is dumb
<alyssa> bifrost's arithmetic logic unit has two units, FMA and ADD
<HdkR> If you can't fill the pipeline with work, then you need to pad :>
<alyssa> they can execute different kinds of instructions
<alyssa> it always executes FMA/ADD/FMA/ADD/... in that order
<alyssa> so if you don't have an instruction ready for a given unit (FMA or ADD)... you have to put in a NOP instead
<alyssa> more modern designs like Valhall do the same thing, but they do it internal to the hardware. i.e. if they don't have work ready, they'll just stall, the compiler doesn't have to literally tell the hardware to stall.
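[editor's note] The NOP padding alyssa describes above can be sketched roughly as follows. This is a toy model only, not the Panfrost compiler's actual scheduler: the function name pack_tuples, the (unit, name) input format and the strictly in-order policy are all assumptions made for illustration. It only shows why a slot with no matching instruction ends up holding a NOP.

```python
NOP = "NOP"

def pack_tuples(instrs):
    """Pack (unit, name) pairs into (FMA, ADD) tuples, in program order.

    Hypothetical helper: `unit` is "FMA" or "ADD". Slots strictly
    alternate FMA, ADD, FMA, ADD, ... so whenever the next instruction
    does not match the slot that is up next, that slot gets a NOP.
    """
    slots = []
    for unit, name in instrs:
        # Even positions are FMA slots, odd positions are ADD slots.
        wanted = "FMA" if len(slots) % 2 == 0 else "ADD"
        if unit != wanted:
            slots.append(NOP)  # nothing ready for this unit: pad
        slots.append(name)
    if len(slots) % 2:
        slots.append(NOP)  # pad the trailing ADD slot
    return [(slots[i], slots[i + 1]) for i in range(0, len(slots), 2)]

if __name__ == "__main__":
    # Two FMA-only instructions in a row force a NOP into the ADD slot.
    for fma, add in pack_tuples([("FMA", "a"), ("FMA", "b"), ("ADD", "c")]):
        print(f"*{fma}  +{add}")
```

A real scheduler would also reorder instructions and track dependencies to fill more slots; this sketch pads without reordering, which is the simplest case of what is being described.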
<macc24> alyssa: what would happen if compiler didn't do any nops
<robclark> that is kinda the opposite direction from every other gpu vendor.. which seems to go in the direction of "make the compiler figure it out"
<HdkR> https://en.wikipedia.org/wiki/Hazard_(computer_architecture) You hit these kinds of problems without nops :)
<alyssa> robclark: yeah, they tried that and called it bifrost and we're still cleaning up the ashes
<macc24> we need to make driver developers make gpus and gpu engineers make drivers
<macc24> that'll solve all problems
<anholt_> broadcom had basically the same people doing the HW and the simulator and the docs and the driver. it's a good way to go.
<anholt_> though the docs can suffer
<macc24> i... wouldn't take broadcom as an example
<anholt_> it's the simplest, best-documented, most stable GPU I've worked on. shame about their display engines.
atler is now known as Guest2626
atler has joined #panfrost
Guest2626 has quit [Ping timeout: 480 seconds]
atler has quit [Read error: Connection reset by peer]
atler has joined #panfrost
Putti has quit [Ping timeout: 480 seconds]
enunes has quit [Ping timeout: 480 seconds]
atler is now known as Guest2631
atler has joined #panfrost
Guest2631 has quit [Ping timeout: 480 seconds]
atler has quit [Ping timeout: 480 seconds]
atler has joined #panfrost
warpme_ has quit [Quit: Connection closed for inactivity]
atler is now known as Guest2637
atler has joined #panfrost
Guest2637 has quit [Ping timeout: 480 seconds]
rasterman has quit [Quit: Gettin' stinky!]
camus has quit []