#panfrost on 2022-05-24 — irc logs at oftc.irclog.whitequark.org

2022-03-22 11:57 ChanServ changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://oftc.irclog.whitequark.org/panfrost - <macc24> i have been here before it was popular

01:06 `join_subline has quit [Read error: Connection reset by peer]

01:20 `join_subline has joined #panfrost

01:54 atler is now known as Guest109

01:54 atler has joined #panfrost

01:56 Guest109 has quit [Ping timeout: 480 seconds]

02:18 alpernebbi has quit [Ping timeout: 480 seconds]

02:18 greenjustin has quit [Ping timeout: 480 seconds]

02:18 alpernebbi has joined #panfrost

02:44 <icecream95> Oops.. MRT writeout seems broken on Midgard. The compiler doesn't seem to ensure that RTs are written in the correct order

02:46 <icecream95> (But why does there need to be a "correct order" at all?)

02:59 <icecream95> "pan/mdg: Don't read base for combined stores" seems completely broken to me

03:00 <icecream95> ++bugs_from_alyssa_rewriting_my_code;

03:13 hexdump01 has joined #panfrost

03:15 hexdump0815 has quit [Ping timeout: 480 seconds]

03:51 MoeIcenowy has quit [Quit: ZNC 1.8.2 - https://znc.in]

04:03 <icecream95> And the same bug affects Bifrost as well?

04:13 MoeIcenowy has joined #panfrost

04:31 urja has quit [Ping timeout: 480 seconds]

04:43 MoeIcenowy has quit [Ping timeout: 480 seconds]

04:45 MoeIcenowy has joined #panfrost

04:56 Moe_Icenowy has joined #panfrost

04:57 Moe_Icenowy has quit []

04:58 Moe_Icenowy has joined #panfrost

05:00 MoeIcenowy has quit [Ping timeout: 480 seconds]

06:06 pendingchaos has quit [Read error: Connection reset by peer]

06:20 guillaume_g has joined #panfrost

06:24 <icecream95> To be fair, things are broken anyway, the bug just made it worse

06:56 fatdog has joined #panfrost

06:56 fatdog has quit []

07:11 <tomeu> any ideas on why our CI isn't catching those?

07:22 urja has joined #panfrost

07:56 <icecream95> tomeu: No one thought to write a test combining the features, I guess

07:59 <tomeu> we could add a new test to piglit, but I would have expected deqp to already have one for that

08:02 <icecream95> dEQP has tests for fragdepth, but it seems that none of them use MRT

08:26 icecream95 has quit [Ping timeout: 480 seconds]

08:31 icecream95 has joined #panfrost

08:31 nlhowell has joined #panfrost

09:01 nlhowell has quit [Ping timeout: 480 seconds]

09:11 nlhowell has joined #panfrost

09:13 rasterman has joined #panfrost

09:17 <icecream95> How long is it supposed to take to clone the dEQP ("VK-GL-CTS") repo?

09:18 <icecream95> At the current rate it's going, it'll take more than 10 hours. CPU-bound, not network!

09:35 <icecream95> Trying with --depth=1 seems to be working a bit better

09:45 nlhowell has quit [Ping timeout: 480 seconds]

09:48 rkanwal has joined #panfrost

09:50 nlhowell has joined #panfrost

09:51 <icecream95> Hmm.. updating Sway from the version in Debian stable makes glmark2 *much* faster

09:51 <icecream95> sway still has a higher CPU usage than glmark2 itself

09:53 <icecream95> I think a lot of the overhead scales with resolution, so for high-res displays the overhead might be noticeable even for applications not at a ridiculous FPS

09:55 <icecream95> T860 is still not better than T760 even with a new sway

10:07 pendingchaos has joined #panfrost

10:16 nlhowell has quit [Read error: Connection reset by peer]

10:18 nlhowell has joined #panfrost

10:20 icecream95 has quit [Ping timeout: 480 seconds]

10:43 nlhowell has quit [Ping timeout: 480 seconds]

11:26 atler has quit [Quit: atler]

12:04 alyssa has joined #panfrost

12:20 atler has joined #panfrost

12:37 q4a has joined #panfrost

13:36 erle has joined #panfrost

14:38 pi__ has quit [Read error: Connection reset by peer]

14:39 pi has joined #panfrost

14:55 erle has quit [Ping timeout: 480 seconds]

16:04 <alyssa> I should probably reinstall linux on my veyron so I can use it as a midgard test board...

16:04 <alyssa> Just, such a pain to set up these chromebooks >.>

16:05 <macc24> alyssa: if only there was a project that would automate putting linux on chromebooks with just one command

16:06 <alyssa> if only they had standards compliant bootloaders so I could use d-i

16:06 <macc24> d-i?

16:06 <alyssa> debian installer

16:07 <macc24> you know that u-boot runs on veyron machines

16:07 <macc24> ?

16:08 jenneron has joined #panfrost

16:10 <jenneron> macc24: i work on integrating postmarketOS with depthcharge, so the only command needed is "pmbootstrap install --sdcard /dev/sdX", but there are only 3 exynos chromebooks supported so far

16:10 <alyssa> jenneron: what's the use case of pmOS on chromebooks?

16:10 <macc24> jenneron: with any luck adding more chromebooks is trivial

16:10 <alyssa> (as opposed to vanilla Alpine)

16:11 <jenneron> same use case as other devices - automation and packaging device-specific things

16:12 <macc24> not much device specific about chromebooks, aside from dtb

16:12 <macc24> and their bootloader

16:12 <jenneron> secondary u-boots, ucm configs

16:12 <macc24> ugh

16:12 <macc24> good luck adding more devices :)

16:13 <jenneron> there should be more soon

16:13 <macc24> feel free to ask me about chromebooks on #aarch64-laptops

16:20 <alyssa> jekstrand: So I'm trying to figure out how to handle transform feedback properly on Valhall, I recall you looked at this.

16:21 <alyssa> For some background, no Mali has transform feedback hardware, but there's also no hw for geom/tess so that's ok

16:21 <alyssa> On Midgard and Bifrost, all varyings are written to driver-allocated buffers. This sucks hard. But it means we can emulate XFB by using user-allocated buffers instead.

16:22 <macc24> alyssa: wait... i thought transform feedback was garbage to implement on tilers?

16:22 <alyssa> That... sort of works.

16:22 <alyssa> Ish.

16:22 <alyssa> it happens to pass ES3.1 conformance. It is very much not big GL conformant.

16:22 <alyssa> And it doesn't work at all on Valhall.

16:22 <jekstrand> ok...

16:23 <alyssa> Valhall does have a legacy geometry flow like Midgard/Bifrost, with driver-allocated varying buffers, but there's extra padding with instancing so that legacy XFB path isn't conformant even for ES3.0

16:23 <alyssa> (broken KHR-GLES31 test iirc)

16:23 <alyssa> We also don't want to use that legacy geometry flow at all.

16:24 <alyssa> The "proper" geometry flow on Valhall has the hardware allocate varyings itself on the fly.

16:24 <alyssa> This is huge! It means we don't need to do index buffer scans, it means we don't need to do awful cmdbuf patching, etc. Big win, especially for Vulkan

16:25 <alyssa> But it means our hack to implement XFB won't work, we have no choice but lowering XFB away to store_global instructions.

16:25 <alyssa> For GLES3.1 class XFB, that's not too hard, and is implemented in !15720

16:26 <alyssa> Full desktop GL XFB is a lot more work (tessellating strips/quads, etc)
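
[editor's note] A rough sketch of what "lowering XFB away to store_global" amounts to, for readers unfamiliar with the idea: instead of relying on the hardware or a driver-allocated varying buffer, the vertex shader itself computes an address in the user's buffer and writes each captured output there. This is not the code from !15720. The function name, its parameters, and the choice to drop writes that would overflow the buffer are all assumptions made for illustration, and only the simple case of one write per vertex is shown, not the strip/quad decomposition that desktop GL needs.

```python
import struct


def lower_xfb_store(buffer, base, stride, offset, vertex_id, components):
    """Emulate one lowered transform-feedback write (hypothetical helper).

    buffer     -- bytearray standing in for the user-allocated XFB buffer
    base       -- byte offset of the XFB binding within the buffer
    stride     -- bytes captured per vertex
    offset     -- byte offset of this output within one vertex record
    vertex_id  -- index of the vertex being captured
    components -- float values of the captured output
    """
    # Address arithmetic the lowered shader would do before the store.
    addr = base + vertex_id * stride + offset
    data = struct.pack("<%df" % len(components), *components)

    # Assumption: a write that would run past the buffer is dropped.
    if addr + len(data) > len(buffer):
        return False

    # Stand-in for the store_global instruction.
    buffer[addr:addr + len(data)] = data
    return True
```

For example, with a 64-byte bytearray, base 0 and stride 16, capturing a vec4 for vertex 1 fills bytes 16..31 and leaves the rest untouched, while vertex 4 would start at byte 64 and is rejected.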

16:26 <jekstrand> oof...

16:26 <jekstrand> yeah

16:27 <alyssa> But all this relies on having XFB information with the driver

16:27 <alyssa> In theory nir_xfb_info has all the info we need. But that only works if you pretend to be radeonsi really hard.

16:27 <alyssa> !15720 adds lower_io_cb callback to avoid breaking radeonsi or rocking the boat too hard.

16:28 <alyssa> but even that doesn't work right

16:28 <alyssa> because tons of existing code relies on being called "early", ie before nir_lower_io

16:28 <alyssa> If we've suddenly decided we want everything to run late, nobody told me.

16:29 <jekstrand> Right

16:29 <jekstrand> Last week when we talked about this I started trying to figure out how to un-plumb the radeonsi stuff.

16:29 <alyssa> right

16:29 <jekstrand> I never got quite finished but I think the radeonsi thing is the wrong approach. We want the XFB info attached to the nir_shader somehow.

16:30 <alyssa> OK

16:30 <jekstrand> I think we can, I just didn't have the persistence to finish last week. I got too distracted by shiny new drivers that needed descriptor set code.

16:30 <alyssa> That would avoid disrupting the existing lower_io code in Panfrost, which I appreciate for one.

16:30 <alyssa> getting started on an AGX vulkan driver so soon? ;-)

16:30 <jekstrand> Maybe? Who knows? I do many things. :-P

16:31 <jekstrand> I can try to get back to that this week. I've got some radv stuff I'd like to finish and I really need to push to get my new implicit sync ioctls landed. That's mostly just waiting on one more review, though.

16:32 <alyssa> I don't mind taking over that series, I just need clear direction on what to do.

16:32 <alyssa> Because so far nobody agrees and Valhall support is blocked on politics

16:33 <jekstrand> Yeah, I hear you.

16:33 <jekstrand> I just pushed the branch as-is to radeonsi/lower-xfb-in-finalize

16:33 <alyssa> Asterisk: I don't have any AMD hardware and I intend to keep it that way :-p

16:33 <jekstrand> Where I was trying to go with it was to eventually get to where the nir_xfb_info is attached to the nir_shader and generated either fairly early by nir_gather_xfb or later pulling info from the GLSL compiler.

16:33 <jekstrand> That's fine. I can help with the debugging.

16:34 <jekstrand> Where I stopped was just trying to figure out what order to move gallium stuff around.

16:34 <jekstrand> I also want to get rid of gather_xfb_with_varyings and have it be one pass that just always does that because why not.

16:34 <jekstrand> But it's super-annoying because ANV also serializes that struct and I was maybe trying to make that more automatic (but maybe that was a fool's errand)

16:35 <jekstrand> Anyway, it's one of those things where I've got a general idea where it's going but need to massage it until it gets there.

16:35 <alyssa> All of this just makes me sad.

16:35 <jekstrand> I'm happy to try and pick it back up tomorrow.

16:35 <jekstrand> It all makes me very very sad.

16:35 <jekstrand> It's pretty clear all the XFB stuff was very bolted on

16:35 <alyssa> Yeah, for sure.

16:35 <alyssa> ~So why can't I just keep bolting on XFB stuff so I can flip the Mali-G57 switch on~ delet

16:36 <jekstrand> Like the XFB linking stuff was "Oh, we need more info. Here, let me add a thing." Then radeonsi was "I need a thing to happen roughly here so I'll add a callback"

16:36 <jekstrand> Like the only reason why radeonsi needs the callback is because it needs to poke at the GLSL structures to get the XFB info and it can't do that in its back-end.

16:36 <jekstrand> At least that's the only reason I can see.

16:36 <jekstrand> Which is the same as what panfrost needs

16:37 <alyssa> Right..

16:37 <jekstrand> And the only reason why I made nir_gather_xfb return the xfb_info the first time rather than stuffing it in the shader was because I was lazy and everything that goes in a nir_shader needs to support clone, [de]serialize, sweep, etc. and I didn't want to bother.

16:38 <jekstrand> So it's a lot of history of people doing something pragmatic and we need to just all knock it off and someone needs to clean it up so we can start doing the right thing.

16:38 <alyssa> not sure we're looking at the same branch?

16:38 <jekstrand> And I'm happy to do that. I just need to get some spoons back first.

16:38 <alyssa> Understood

16:39 <jekstrand> And given that I didn't fall asleep last night until 4:30 AM and slept badly after that, I'm not likely to get any spoons today.

16:39 <jekstrand> So we'll try for it tomorrow. :)

16:39 <alyssa> :'(

16:39 <jekstrand> I'll be fine. I'm functional. But some things require more than "functional"

16:39 <alyssa> Yeah, I get that

16:40 <alyssa> So I should hold off on touching XFB for a few days to give you time to fix the common code, and then it should be possible to rebase the Valhall MR without any fun politics involved?

16:40 <jekstrand> That sounds like a decent plan.

16:40 <alyssa> OK

16:40 <jekstrand> The RADV patches I'm working on are hell to rebase but there's no actual time pressure.

16:40 <alyssa> Woof

16:40 <alyssa> what about the early/late I/O lowering story?

16:41 <jekstrand> IDK

16:41 <jekstrand> That's all a mess with history too

16:41 <jekstrand> And I'm not that inclined to swim up that stream right now.

16:41 <alyssa> Yeah, I feel that.

16:42 <jekstrand> Fixing xfb_info seems tractable if annoying

16:42 <jekstrand> I've at least got a plan.

16:43 <alyssa> Wee

16:43 <jekstrand> I'm sorry that 80% of that plan is in my head, not the branch but that's where it sits right now.

16:43 <alyssa> no worries :)

17:38 <cphealy> With this, it would be the reverse of the more common method with existing upstream drivers by having underlays instead of overlays.

17:38 <cphealy> whoops, wrong chat...

17:52 <alyssa> down to 8 shaders in my shader-db that spill

17:52 <alyssa> all from Android games, wee

17:52 <alyssa> interestingly, all but 1 are compute kernels

18:19 <alyssa> Fixed spilling on the 1 non-compute kernel

18:19 <alyssa> of the remaining 7 that spill, there are only 3 distinct shaders

18:30 <alyssa> Another bunch of spills eliminated with a similar fix

18:40 bbrezillon has quit [Read error: Connection reset by peer]

18:44 bbrezillon has joined #panfrost

19:23 jenneron_ has joined #panfrost

19:29 jenneron has quit [Ping timeout: 480 seconds]

19:38 <alyssa> er my grepfu was wrong, missed another shader that spilled heavily

19:47 erle has joined #panfrost

20:19 rasterman has quit [Remote host closed the connection]

20:21 rasterman has joined #panfrost

20:39 <alyssa> alyssa@scootaloo:~$ DISPLAY=:0 glmark2-es2 --off-screen

20:39 <alyssa> Error: Error loading EGL library

20:40 <alyssa> am I having a PEBKAC?

20:42 rkanwal has quit [Quit: rkanwal]

20:59 `join_subline has quit [Ping timeout: 480 seconds]

21:06 `join_subline has joined #panfrost

21:32 Rathann has joined #panfrost

21:36 <alyssa> manhattan on mt8192 up to ~21.7fps, woo!

21:36 <alyssa> I think that's better than before at least

21:38 <alyssa> uh apparently it was 22.9 a commit before

21:40 rasterman has quit [Quit: Gettin' stinky!]

21:40 <alyssa> grumble. I guess the latency matters more than I thought.

22:13 <alyssa> 23.7 with a stupid simple latency-focused scheduler

22:14 <alyssa> despite way more spilling!
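
[editor's note] The log does not say how this scheduler works. As a generic illustration only, one textbook shape of a latency-focused scheduler is greedy list scheduling by critical path: among the instructions whose dependencies are satisfied, always issue the one with the longest latency-weighted chain still ahead of it. The sketch below assumes a made-up instruction representation (names mapped to latencies and dependency sets) and does not model register pressure at all, which is one way such a scheduler can end up spilling more.

```python
def critical_path_schedule(latency, deps):
    """Order instructions by longest remaining latency chain (hypothetical helper).

    latency -- dict: instruction name -> latency in cycles
    deps    -- dict: instruction name -> set of names it depends on
    """
    # Invert the dependency edges so each instruction knows its users.
    users = {name: set() for name in latency}
    for name, sources in deps.items():
        for src in sources:
            users[src].add(name)

    # Priority = latency of the longest chain from this instruction to the end.
    priority = {}

    def height(name):
        if name not in priority:
            tail = max((height(u) for u in users[name]), default=0)
            priority[name] = latency[name] + tail
        return priority[name]

    for name in latency:
        height(name)

    # Greedy list scheduling: repeatedly pick the ready instruction with the
    # highest priority, breaking ties by name so the result is deterministic.
    remaining = {name: set(deps.get(name, ())) for name in latency}
    order = []
    while remaining:
        ready = [name for name, srcs in remaining.items() if not srcs]
        pick = min(ready, key=lambda name: (-priority[name], name))
        order.append(pick)
        del remaining[pick]
        for srcs in remaining.values():
            srcs.discard(pick)
    return order
```

With a long-latency "tex" feeding "mul", and an independent "add" also feeding the final "st", the texture fetch is issued first so that its latency overlaps with the independent work.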

22:15 <alyssa> the web claims this should be hitting 60fps though...

22:20 icecream95 has joined #panfrost

22:47 floof58 has quit [Quit: floof58]

22:47 floof58 has joined #panfrost

23:01 <anarsoul> alyssa: icecream95: ping

23:01 <anarsoul> so regarding https://www.khronos.org/registry/OpenGL/extensions/ARB/ARB_buffer_storage.txt

23:03 <anarsoul> I don't think that the wording covers drawing commands that use the same buffer (or rather the same part of the buffer)

23:04 <anarsoul> otherwise it'd require a buffer copy for each drawing call

23:06 <HdkR> What are you wondering about with buffer_storage?

23:09 <anarsoul> HdkR: see https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16684

23:22 <HdkR> Oh yea. That bit of the spec is rough. Following x86 semantics quite heavily

23:58 Rathann has quit [Quit: Leaving]