ChanServ changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://oftc.irclog.whitequark.org/panfrost - <macc24> i have been here before it was popular
`join_subline has quit [Read error: Connection reset by peer]
`join_subline has joined #panfrost
atler is now known as Guest109
atler has joined #panfrost
Guest109 has quit [Ping timeout: 480 seconds]
alpernebbi has quit [Ping timeout: 480 seconds]
greenjustin has quit [Ping timeout: 480 seconds]
alpernebbi has joined #panfrost
<icecream95> Oops.. MRT writeout seems broken on Midgard. The compiler doesn't seem to ensure that RTs are written in the correct order
<icecream95> (But why does there need to be a "correct order" at all?)
<icecream95> "pan/mdg: Don't read base for combined stores" seems completely broken to me
<icecream95> ++bugs_from_alyssa_rewriting_my_code;
hexdump01 has joined #panfrost
hexdump0815 has quit [Ping timeout: 480 seconds]
MoeIcenowy has quit [Quit: ZNC 1.8.2 - https://znc.in]
<icecream95> And the same bug affects Bifrost as well?
MoeIcenowy has joined #panfrost
urja has quit [Ping timeout: 480 seconds]
MoeIcenowy has quit [Ping timeout: 480 seconds]
MoeIcenowy has joined #panfrost
Moe_Icenowy has joined #panfrost
Moe_Icenowy has quit []
Moe_Icenowy has joined #panfrost
MoeIcenowy has quit [Ping timeout: 480 seconds]
pendingchaos has quit [Read error: Connection reset by peer]
guillaume_g has joined #panfrost
<icecream95> To be fair, things are broken anyway, the bug just made it worse
fatdog has joined #panfrost
fatdog has quit []
<tomeu> any ideas on why our CI isn't catching those?
urja has joined #panfrost
<icecream95> tomeu: No one thought to write a test combining the features, I guess
<tomeu> we could add a new test to piglit, but I would have expected deqp to already have one for that
<icecream95> dEQP has tests for fragdepth, but it seems that none of them use MRT
icecream95 has quit [Ping timeout: 480 seconds]
icecream95 has joined #panfrost
nlhowell has joined #panfrost
nlhowell has quit [Ping timeout: 480 seconds]
nlhowell has joined #panfrost
rasterman has joined #panfrost
<icecream95> How long is it supposed to take to clone the dEQP ("VK-GL-CTS") repo?
<icecream95> At the current rate it's going, it'll take more than 10 hours. CPU-bound, not network!
<icecream95> Trying with --depth=1 seems to be working a bit better
nlhowell has quit [Ping timeout: 480 seconds]
rkanwal has joined #panfrost
nlhowell has joined #panfrost
<icecream95> Hmm.. updating Sway from the version in Debian stable makes glmark2 *much* faster
<icecream95> sway still has a higher CPU usage than glmark2 itself
<icecream95> I think a lot of the overhead scales with resolution, so for high-res displays the overhead might be noticeable even for applications not at a ridiculous FPS
<icecream95> T860 is still not better than T760 even with a new sway
pendingchaos has joined #panfrost
nlhowell has quit [Read error: Connection reset by peer]
nlhowell has joined #panfrost
icecream95 has quit [Ping timeout: 480 seconds]
nlhowell has quit [Ping timeout: 480 seconds]
atler has quit [Quit: atler]
alyssa has joined #panfrost
atler has joined #panfrost
q4a has joined #panfrost
erle has joined #panfrost
pi__ has quit [Read error: Connection reset by peer]
pi has joined #panfrost
erle has quit [Ping timeout: 480 seconds]
<alyssa> I should probably reinstall linux on my veyron so I can use it as a midgard test board...
<alyssa> Just, such a pain to set up these chromebooks >.>
<macc24> alyssa: if only there was a project that would automate putting linux on chromebooks with just one command
<alyssa> if only they had standards compliant bootloaders so I could use d-i
<macc24> d-i?
<alyssa> debian installer
<macc24> you know that u-boot runs on veyron machines
<macc24> ?
jenneron has joined #panfrost
<jenneron> macc24: i work on integrating postmarketOS with depthcharge, so the only command needed is "pmbootstrap install --sdcard /dev/sdX", but there are only 3 exynos chromebooks supported so far
<alyssa> jenneron: what's the use case of pmOS on chromebooks?
<macc24> jenneron: with any luck adding more chromebooks is trivial
<alyssa> (as opposed to vanilla Alpine)
<jenneron> same use case as for other devices - automation and packaging device-specific things
<macc24> not much device specific about chromebooks, aside from dtb
<macc24> and their bootloader
<jenneron> secondary u-boots, ucm configs
<macc24> ugh
<macc24> good luck adding more devices :)
<jenneron> there should be more soon
<macc24> feel free to ask me about chromebooks on #aarch64-laptops
<alyssa> jekstrand: So I'm trying to figure out how to handle transform feedback properly on Valhall, I recall you looked at this.
<alyssa> For some background, no Mali has transform feedback hardware, but there's also no hw for geom/tess so that's ok
<alyssa> On Midgard and Bifrost, all varyings are written to driver-allocated buffers. This sucks hard. But it means we can emulate XFB by using user-allocated buffers instead.
<macc24> alyssa: wait... i thought transform feedback was garbage to implement on tilers?
<alyssa> That... sort of works.
<alyssa> Ish.
<alyssa> it happens to pass ES3.1 conformance. It is very much not big GL conformant.
<alyssa> And it doesn't work at all on Valhall.
<jekstrand> ok...
<alyssa> Valhall does have a legacy geometry flow like Midgard/Bifrost, with driver-allocated varying buffers, but there's extra padding with instancing so that legacy XFB path isn't conformant even for ES3.0
<alyssa> (broken KHR-GLES31 test iirc)
<alyssa> We also don't want to use that legacy geometry flow at all.
<alyssa> The "proper" geometry flow on Valhall has the hardware allocate varyings itself on the fly.
<alyssa> This is huge! It means we don't need to do index buffer scans, it means we don't need to do awful cmdbuf patching, etc. Big win, especially for Vulkan
<alyssa> But it means our hack to implement XFB won't work, we have no choice but lowering XFB away to store_global instructions.
<alyssa> For GLES3.1 class XFB, that's not too hard, and is implemented in !15720
<alyssa> Full desktop GL XFB is a lot more work (tessellating strips/quads, etc)
<jekstrand> oof...
<jekstrand> yeah
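[editor's note] A rough sketch of the idea alyssa describes above: with no transform feedback hardware, each captured varying write is lowered to an ordinary store into the user's XFB buffer. This is not Panfrost's or NIR's actual code. The XfbOutput layout, the store_global stand-in and the address formula are assumptions for illustration, and the sketch assumes each vertex invocation maps one-to-one to an output record (no strip or quad tessellation, which is the harder desktop GL case).

```python
import struct
from dataclasses import dataclass


@dataclass
class XfbOutput:
    """One captured varying (hypothetical layout, for illustration only)."""
    buffer: int      # index of the XFB buffer to write to
    offset: int      # byte offset inside one vertex record
    components: int  # number of 32-bit float components captured


def store_global(memory: bytearray, address: int, values) -> None:
    """Stand-in for a store_global instruction: write floats at a byte address."""
    struct.pack_into(f"<{len(values)}f", memory, address, *values)


def lowered_xfb_write(buffers, strides, outputs, vertex_index, values) -> None:
    """What one vertex invocation does once XFB is lowered to plain stores."""
    for out, vals in zip(outputs, values):
        # Assumed addressing: one record per vertex, packed at a fixed stride.
        address = vertex_index * strides[out.buffer] + out.offset
        store_global(buffers[out.buffer], address, vals[:out.components])


# Capture a vec4 followed by a vec2 into a single buffer, 24 bytes per vertex.
outputs = [XfbOutput(buffer=0, offset=0, components=4),
           XfbOutput(buffer=0, offset=16, components=2)]
strides = [24]
buffers = [bytearray(24 * 3)]
for v in range(3):
    lowered_xfb_write(buffers, strides, outputs, v,
                      [(float(v), float(v), float(v), 1.0), (0.5, 0.25)])
```

The point of the sketch is that the shader needs the per-output buffer, offset and stride at compile time, which is why the discussion that follows is about getting XFB information to the driver.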
<alyssa> But all this relies on having XFB information with the driver
<alyssa> In theory nir_xfb_info has all the info we need. But that only works if you pretend to be radeonsi really hard.
<alyssa> !15720 adds lower_io_cb callback to avoid breaking radeonsi or rocking the boat too hard.
<alyssa> but even that doesn't work right
<alyssa> because tons of existing code relies on being called "early", ie before nir_lower_io
<alyssa> If we've suddenly decided we want everything to run late, nobody told me.
<jekstrand> Right
<jekstrand> Last week when we talked about this I started trying to figure out how to un-plumb the radeonsi stuff.
<alyssa> right
<jekstrand> I never got quite finished but I think the radeonsi thing is the wrong approach. We want the XFB info attached to the nir_shader somehow.
<alyssa> OK
<jekstrand> I think we can, I just didn't have the persistence to finish last week. I got too distracted by shiny new drivers that needed descriptor set code.
<alyssa> That would avoid disrupting the existing lower_io code in Panfrost, which I appreciate for one.
<alyssa> getting started on an AGX vulkan driver so soon? ;-)
<jekstrand> Maybe? Who knows? I do many things. :-P
<jekstrand> I can try to get back to that this week. I've got some radv stuff I'd like to finish and I really need to push to get my new implicit sync ioctls landed. That's mostly just waiting on one more review, though.
<alyssa> I don't mind taking over that series, I just need clear direction on what to do.
<alyssa> Because so far nobody agrees and Valhall support is blocked on politics
<jekstrand> Yeah, I hear you.
<jekstrand> I just pushed the branch as-is to radeonsi/lower-xfb-in-finalize
<alyssa> Asterisk: I don't have any AMD hardware and I intend to keep it that way :-p
<jekstrand> Where I was trying to go with it was to eventually get to where the nir_xfb_info is attached to the nir_shader and generated either fairly early by nir_gather_xfb or later pulling info from the GLSL compiler.
<jekstrand> That's fine. I can help with the debugging.
<jekstrand> Where I stopped was just trying to figure out what order to move gallium stuff around.
<jekstrand> I also want to get rid of gather_xfb_with_varyings and have it be one pass that just always does that because why not.
<jekstrand> But it's super-annoying because ANV also serializes that struct and I was maybe trying to make that more automatic (but maybe that was a fool's errand)
<jekstrand> Anyway, it's one of those things where I've got a general idea where it's going but need to massage it until it gets there.
<alyssa> All of this just makes me sad.
<jekstrand> I'm happy to try and pick it back up tomorrow.
<jekstrand> It all makes me very very sad.
<jekstrand> It's pretty clear all the XFB stuff was very bolted on
<alyssa> Yeah, for sure.
<alyssa> ~So why can't I just keep bolting on XFB stuff so I can flip the Mali-G57 switch on~ delet
<jekstrand> Like the XFB linking stuff was "Oh, we need more info. Here, let me add a thing." Then radeonsi was "I need a thing to happen roughly here so I'll add a callback"
<jekstrand> Like the only reason why radeonsi needs the callback is because it needs to poke at the GLSL structures to get the XFB info and it can't do that in its back-end.
<jekstrand> At least that's the only reason I can see.
<jekstrand> Which is the same as what panfrost needs
<alyssa> Right..
<jekstrand> And the only reason why I made nir_gather_xfb return the xfb_info the first time rather than stuffing it in the shader was because I was lazy and everything that goes in a nir_shader needs to support clone, [de]serialize, sweep, etc. and I didn't want to bother.
<jekstrand> So it's a lot of history of people doing something pragmatic and we need to just all knock it off and someone needs to clean it up so we can start doing the right thing.
<alyssa> not sure we're looking at the same branch?
<jekstrand> And I'm happy to do that. I just need to get some spoons back first.
<alyssa> Understood
<jekstrand> And given that I didn't fall asleep last night until 4:30 AM and slept badly after that, I'm not likely to get any spoons today.
<jekstrand> So we'll try for it tomorrow. :)
<alyssa> :'(
<jekstrand> I'll be fine. I'm functional. But some things require more than "functional"
<alyssa> Yeah, I get that
<alyssa> So I should hold off on touching XFB for a few days to give you time to fix the common code, and then it should be possible to rebase the Valhall MR without any fun politics involved?
<jekstrand> That sounds like a decent plan.
<alyssa> OK
<jekstrand> The RADV patches I'm working on are hell to rebase but there's no actual time pressure.
<alyssa> Woof
<alyssa> what about the early/late I/O lowering story?
<jekstrand> IDK
<jekstrand> That's all a mess with history too
<jekstrand> And I'm not that inclined to swim up that stream right now.
<alyssa> Yeah, I feel that.
<jekstrand> Fixing xfb_info seems tractable if annoying
<jekstrand> I've at least got a plan.
<alyssa> Wee
<jekstrand> I'm sorry that 80% of that plan is in my head, not the branch but that's where it sits right now.
<alyssa> no worries :)
<cphealy> With this, it would be the reverse of the more common method with existing upstream drivers by having underlays instead of overlays.
<cphealy> whoops, wrong chat...
<alyssa> down to 8 shaders in my shader-db that spill
<alyssa> all from Android games, wee
<alyssa> interestingly, all but 1 are compute kernels
<alyssa> Fixed spilling on the 1 non-compute kernel
<alyssa> of the remaining 7 that spill, there are only 3 distinct shaders
<alyssa> Another bunch of spills eliminated with a similar fix
bbrezillon has quit [Read error: Connection reset by peer]
bbrezillon has joined #panfrost
jenneron_ has joined #panfrost
jenneron has quit [Ping timeout: 480 seconds]
<alyssa> er my grepfu was wrong, missed another shader that spilled heavily
erle has joined #panfrost
rasterman has quit [Remote host closed the connection]
rasterman has joined #panfrost
<alyssa> alyssa@scootaloo:~$ DISPLAY=:0 glmark2-es2 --off-screen
<alyssa> Error: Error loading EGL library
<alyssa> am I having a PEBKAC?
rkanwal has quit [Quit: rkanwal]
`join_subline has quit [Ping timeout: 480 seconds]
`join_subline has joined #panfrost
Rathann has joined #panfrost
<alyssa> manhattan on mt8192 up to ~21.7fps, woo!
<alyssa> I think that's better than before at least
<alyssa> uh apparently it was 22.9 a commit before
rasterman has quit [Quit: Gettin' stinky!]
<alyssa> grumble. I guess the latency matters more than I thought.
<alyssa> 23.7 with a stupid simple latency-focused scheduler
<alyssa> despite way more spilling!
<alyssa> the web claims this should be hitting 60fps though...
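[editor's note] The log does not say how the "stupid simple latency-focused scheduler" works. As a generic illustration only (not the Panfrost compiler's implementation), a latency-focused list scheduler can greedily pick the ready instruction with the longest latency-weighted path to the end of the block. The instruction names and latencies below are made up. Note that this heuristic ignores register pressure entirely, which is consistent with it spilling more.

```python
def critical_path(instrs, deps, latency):
    """Longest latency-weighted distance from each instruction to block end.

    instrs is in program order, so every user of a value appears after it.
    """
    users = {i: [] for i in instrs}
    for i, srcs in deps.items():
        for s in srcs:
            users[s].append(i)
    dist = {}
    for i in reversed(instrs):
        dist[i] = latency[i] + max((dist[u] for u in users[i]), default=0)
    return dist


def schedule(instrs, deps, latency):
    """Greedy list scheduling: always issue the ready instruction with the
    longest remaining critical path, so long-latency work starts early."""
    dist = critical_path(instrs, deps, latency)
    done, order = set(), []
    remaining = list(instrs)
    while remaining:
        ready = [i for i in remaining
                 if all(s in done for s in deps.get(i, ()))]
        pick = max(ready, key=lambda i: dist[i])
        order.append(pick)
        done.add(pick)
        remaining.remove(pick)
    return order


# Hypothetical block: a slow texture fetch and a short ALU chain feed one fma.
instrs = ["add", "mul", "tex", "fma"]
deps = {"mul": ["add"], "fma": ["tex", "mul"]}
latency = {"add": 1, "mul": 1, "tex": 20, "fma": 1}
order = schedule(instrs, deps, latency)
```

Here the texture fetch is hoisted to the front so the ALU chain can overlap with its latency.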
icecream95 has joined #panfrost
floof58 has quit [Quit: floof58]
floof58 has joined #panfrost
<anarsoul> alyssa: icecream95: ping
<anarsoul> I don't think that the wording covers drawing commands that use the same buffer (or rather the same part of the buffer)
<anarsoul> otherwise it'd require a buffer copy for each drawing call
<HdkR> What are you wondering about with buffer_storage?
<HdkR> Oh yea. That bit of the spec is rough. Following x86 semantics quite heavily
Rathann has quit [Quit: Leaving]