#panfrost on 2022-03-09 — irc logs at oftc.irclog.whitequark.org

2021-07-26 22:56 ChanServ changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://oftc.irclog.whitequark.org/panfrost - <macc24> i have been here before it was popular

00:10 <jekstrand> bbrezillon: New panvk secondary command buffer patches: https://gitlab.freedesktop.org/jekstrand/mesa/-/commits/panvk/exec-cmd-if-second

00:10 <jekstrand> bbrezillon: I think maybe I actually like that. I may hate it in the morning, though. (-:

00:17 macc24 has quit [Ping timeout: 480 seconds]

00:23 macc24 has joined #panfrost

00:31 camus has joined #panfrost

00:32 camus1 has quit [Read error: Connection reset by peer]

00:48 JulianGro has quit [Remote host closed the connection]

01:29 vstehle has quit [Ping timeout: 480 seconds]

06:00 vstehle has joined #panfrost

06:34 Daanct12 has joined #panfrost

06:45 <tomeu> jekstrand: can't you just run the CTS in the same way as the CI?

06:50 Daanct12 has quit [Read error: Connection reset by peer]

07:10 Daanct12 has joined #panfrost

07:14 <Daanct12> looks like i was able to crash my pinephone pro

07:14 <Daanct12> i type `timedemo 1` in ioq3 console (latest git) and suddenly everything slows down

07:15 <Daanct12> but if i kill the q3 process then panfrost throws up some error and then crashes

07:15 <Daanct12> this is on sway though, not too sure if it happens on x11

07:46 <bbrezillon> jekstrand: just wanted to have all the descriptor emission code in panvk_[vX_]cs.{c,h}. If we're worried about the perf cost of this extra function call, I'd rather make all emit helpers inline functions defined in panvk_cs.h, because I kinda like this separation (panvk_vX_cmd_buffer.c is already quite big without those emit functions)

07:58 erlehmann has quit [Ping timeout: 480 seconds]

08:52 cphealy has quit [Ping timeout: 480 seconds]

08:53 pendingchaos has quit [Ping timeout: 480 seconds]

09:10 <jekstrand> tomeu: It only runs a tiny subset of tests

09:11 <jekstrand> tomeu: If I want to know how much we fixed, it won't tell me.

09:11 <tomeu> yeah, but in theory the ones that pass

09:11 <jekstrand> Yeah, but when adding new features, it doesn't say what more passes

09:11 <tomeu> if we pass more, we should extend coverage in CI

09:11 <tomeu> ah, right

09:11 <tomeu> for that we need full runs, but hopefully you don't need to do that often

09:12 <tomeu> jekstrand: you could easily send whole runs to CI and shard, but would be good to coordinate that so you don't bog down devices for too long

09:13 <tomeu> we have 7 vim3 boards you could use for that

09:18 <jekstrand> tomeu: If I'm actually hacking on panvk more than a few hours/week, I'll easily dominate 7 boards. :-/

09:19 <jekstrand> If doing local runs is a real problem, I may just have daniels or guy send me a few more. It's not like vim3s are expensive.

09:20 <tomeu> yeah, hopefully you won't spend too much time taking care of your farm

09:21 <tomeu> or if you want to add valhall support to panvk, you can get much more powerful boards with mediatek socs

09:22 <bbrezillon> tomeu: can I get one too ? :P

09:23 <daniels> bbrezillon: of course, what do you need?

09:24 <bbrezillon> I was just kidding. Don't know when I'll get back to panvk dev, and there's a lot to address on Bifrost before that

09:26 <tomeu> they are actually relatively cheap, but come with a screen attached, so not ideal for a farm on your desk :)

09:29 erlehmann has joined #panfrost

09:45 rasterman has joined #panfrost

10:06 MajorBiscuit has joined #panfrost

10:18 camus has quit [Remote host closed the connection]

10:19 camus has joined #panfrost

10:22 tjcorley has quit [Ping timeout: 480 seconds]

10:52 pendingchaos has joined #panfrost

11:13 <macc24> i also want a mt8192 machine definitely-for-development-i-promise

11:14 <macc24> xD

11:56 Rathann has joined #panfrost

11:57 erlehmann has quit [Ping timeout: 480 seconds]

12:01 erlehmann has joined #panfrost

12:18 JulianGro has joined #panfrost

12:25 camus1 has joined #panfrost

12:25 camus has quit [Read error: Connection reset by peer]

12:44 <bbrezillon> jekstrand: there's no generic EndCommandBuffer() implem anymore. I guess I should implement it directly in the primary command buffer implementation

12:55 <bbrezillon> jekstrand: and unfortunately https://gitlab.freedesktop.org/jekstrand/mesa/-/commit/950f23da6f0627e00b2ad19e356f719cd722535b#a9366c3de4e72a81120f5663284043e672c21a80_987_989 doesn't work

12:58 <bbrezillon> I guess that'd work if we had the concept of cmd_dispatch_table, and we'd only force the beginning of the device dispatch table to the default cmd dispatchers

12:58 <bbrezillon> but even then, we'd need default implems for End/BeginCommandBuffer and all other cmd entrypoints that don't start with 'Cmd'

12:59 <bbrezillon> unless we consider those to not be cmd entrypoints, but I'm not convinced that's a good idea

13:03 Daanct12 has quit [Quit: Quit]

13:17 <bbrezillon> hm, actually it works if I move this vk_device_dispatch_table_from_entrypoints() at the beginning

13:26 <bbrezillon> jekstrand: pushed a working version to https://gitlab.freedesktop.org/bbrezillon/mesa/-/commits/panvk-exec-cmd-if-second

13:28 <bbrezillon> if you're happy with this version, I'll update the MR

13:53 _99 has joined #panfrost

13:55 _99 has left #panfrost [#panfrost]

14:20 camus1 has quit []

15:16 cphealy has joined #panfrost

15:21 nlhowell has joined #panfrost

15:22 nlhowell is now known as Guest1705

15:22 nlhowell has joined #panfrost

15:29 Guest1705 has quit [Ping timeout: 480 seconds]

15:41 <jekstrand> bbrezillon: Yeah, I guess we'll need begin/end

15:42 <jekstrand> Woah! deqp-vk run finished in 10 hours! Must have gotten rid of a lot of crashes.

15:46 <jekstrand> bbrezillon: RE: allocation scope. Yeah, SCOPE_COMMAND is wrong. I suspect we want SCOPE_OBJECT or SCOPE_DEVICE

15:47 <jekstrand> VK_SYSTEM_ALLOCATION_SCOPE_OBJECT specifies that the allocation is scoped to the lifetime of the Vulkan object that is being created or used.

15:47 <jekstrand> So I think we want OBJECT

15:47 <bbrezillon> okay, seems to match my understanding then

15:48 <bbrezillon> and I realized panvk was missing a custom secondary_CmdBindDescriptorSets ...

15:57 <jekstrand> bbrezillon: Yeah, I've been thinking about that one.

15:58 <jekstrand> bbrezillon: I think what we want to do is add vk_device::[un]ref_pipeline_layout function pointers and say you have to reference count if you want to use command recording.

15:59 <jekstrand> Then we can implement all the manual stuff in src/vulkan/runtime/vk_cmd_enqueue_manual.c or something

15:59 <bbrezillon> sounds better than what I did in dozen...

16:00 <jekstrand> If we're going to have panvk, lavapipe, and dozen all using this thing, we may as well try to share the manual enqueue funcs.

16:00 pjakobsson has joined #panfrost

16:00 <jekstrand> bbrezillon: Yeah, pipeline and descriptor set layouts have annoying timelines. We reference count descriptor set layouts in ANV for $REASONS.

16:01 <jekstrand> s/timelines/lifetimes/

16:02 <bbrezillon> I ended up copying relevant data from set layouts to pipeline layout in dozen

16:02 pjakobsson_ has quit [Ping timeout: 480 seconds]

16:02 <bbrezillon> I think you were the one suggesting that :)

16:03 <bbrezillon> but even with that, it still makes the manual CmdBindDescriptorSets() kind of ugly/open-coded

16:03 <jekstrand> Why? Ref the pipeline layout and stuff everything in the struct. When you destroy, unref.
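The "ref at record time, unref at destroy" idea can be modelled in a few lines. This is only a sketch of the lifetime rule, not Mesa code: `PipelineLayout`, `CmdQueue` and their methods are hypothetical names, and the real proposal is a pair of `vk_device` function pointers in C.

```python
class PipelineLayout:
    """Hypothetical stand-in for a driver's pipeline layout object."""

    def __init__(self):
        self.refcount = 1       # reference held by the application's handle
        self.destroyed = False

    def ref(self):
        self.refcount += 1
        return self

    def unref(self):
        self.refcount -= 1
        if self.refcount == 0:
            self.destroyed = True   # a real driver would free the object here


class CmdQueue:
    """Hypothetical list of recorded commands for a secondary command buffer."""

    def __init__(self):
        self.entries = []

    def enqueue_bind_descriptor_sets(self, layout, first_set, sets):
        # Take a reference so the layout outlives the application's handle.
        self.entries.append(("bind_descriptor_sets", layout.ref(), first_set, list(sets)))

    def free(self):
        # Drop every reference that was taken at record time.
        for entry in self.entries:
            if entry[0] == "bind_descriptor_sets":
                entry[1].unref()
        self.entries.clear()


layout = PipelineLayout()
queue = CmdQueue()
queue.enqueue_bind_descriptor_sets(layout, 0, ["set0", "set1"])
layout.unref()                        # the application destroys its handle first
still_alive = not layout.destroyed    # the recorded command keeps it alive
queue.free()                          # the last reference goes away with the recording
```

The point of the sketch is the ordering: the application may destroy the layout before the recorded commands are replayed or freed, so the recording has to hold its own reference.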

16:04 <bbrezillon> you mean the set layouts?

16:05 <jekstrand> BindDescriptorSets takes a pipeline layout

16:06 <bbrezillon> yeah, I was just digressing

16:06 <bbrezillon> since you mentioned the pipeline and set layout being uncorrelated

16:07 <bbrezillon> but sure, your solution works fine and avoids open-coding the helper in 3 drivers

16:16 <jekstrand> bbrezillon: I've not read all your comments yet. Been doing e-mail and chat back-log so far. Do you want me to keep going on secondaries and try to get it actually working and passing non-trivial tests? Or did you want to run with it some more?

16:19 <bbrezillon> jekstrand: I had it passing the basic tests

16:19 <bbrezillon> at least on panfrost

16:19 erlehmann has quit [Ping timeout: 480 seconds]

16:19 pjakobsson has quit [Remote host closed the connection]

16:22 <jekstrand> bbrezillon: Old version or patches on mine?

16:23 <bbrezillon> yours

16:23 <bbrezillon> triggered a lavapipe CI run

16:24 <jekstrand> \o/

16:24 <bbrezillon> to see if we regress things when transitioning to cmd_queue wrappers

16:24 <jekstrand> Thanks!

16:24 <bbrezillon> and we do => https://gitlab.freedesktop.org/bbrezillon/mesa/-/jobs/19578045 :)

16:24 <jekstrand> Of course!

16:24 erlehmann has joined #panfrost

16:24 <jekstrand> Where'd you push the updated branch? I'll take a look.

16:25 <jekstrand> I've got a pretty awesome lavapipe testing machine. :D

16:28 <bbrezillon> https://gitlab.freedesktop.org/bbrezillon/mesa/-/commits/panvk-exec-cmd-if-second

16:30 <jekstrand> cool. I'll pull and debug once I read some IMG comments.

16:35 <jekstrand> And... my vim3 died :(

16:59 Rathann has quit [Quit: Leaving]

17:03 <tomeu> my vim3 died as well, the odroid n2 seems more reliable

17:05 <macc24> imagine having your computers break before they become obsolete. this statement was sponsored by business laptop fans

17:18 <bbrezillon> jekstrand: hm, I was looking at the descriptor_set_layout refcount logic, and it looks like the allocator passed to vkDestroyDescriptorSetLayout is ignored. Isn't that a problem if the caller uses an allocator that's different from the device allocator?

17:20 <jekstrand> It means they always get the device allocator

17:20 <jekstrand> Which sucks but it is what it is

17:27 <bbrezillon> ah, right, you pass the device allocator in the create path
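A toy model of why the allocator passed at destroy time can be ignored once the object is reference counted. All names here are hypothetical, and the stated motivation (the last unref may come from a path that has no caller allocator at hand) is one plausible reading of the exchange rather than something the log spells out.

```python
class Allocator:
    """Hypothetical allocator that just tracks what is live."""

    def __init__(self, name):
        self.name = name
        self.live = []


class SetLayout:
    def __init__(self, device_alloc):
        self.alloc = device_alloc   # remembered for the eventual free
        self.refcount = 1
        device_alloc.live.append(self)


def create_set_layout(device_alloc, app_alloc=None):
    # app_alloc is deliberately ignored: the object always comes from the
    # device allocator, so any later unref knows how to free it.
    return SetLayout(device_alloc)


def unref_set_layout(layout):
    layout.refcount -= 1
    if layout.refcount == 0:
        layout.alloc.live.remove(layout)


def destroy_set_layout(layout, app_alloc=None):
    # Destroy is just "drop the application's reference".
    unref_set_layout(layout)


device_alloc = Allocator("device")
app_alloc = Allocator("app")
layout = create_set_layout(device_alloc, app_alloc)
layout.refcount += 1                    # e.g. something else still uses it
destroy_set_layout(layout, app_alloc)
alive_after_destroy = layout in device_alloc.live
unref_set_layout(layout)                # last user lets go
freed_at_last_unref = layout not in device_alloc.live
```

Because create and free both go through the device allocator, a mismatched allocator at destroy time cannot cause a free through the wrong allocator; the cost is that the application's allocator is never used for this object.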

17:33 davidlt has joined #panfrost

17:43 erlehmann has quit [Ping timeout: 480 seconds]

17:50 rcf has quit [Quit: WeeChat 3.2.1]

18:18 rcf has joined #panfrost

18:29 robmur01 has quit [Quit: Leaving]

18:36 MajorBiscuit has quit [Quit: WeeChat 3.4]

19:09 * jekstrand plugs in serial in hopes of figuring out why his VIM3 is dying

19:23 <jekstrand> panfrost kernel bugs. :(

19:37 <HdkR> Welcome to the kernel bug party \o/

19:37 <bbrezillon> jekstrand: ok, so there's one bug remaining in lvp after the transition to vk_cmd_queue helpers

19:38 <jekstrand> bbrezillon: Oh, can you throw me the latest branch?

19:38 <jekstrand> bbrezillon: I'm looking at an old one and it's blowing up bad

19:38 <bbrezillon> already pushed

19:38 <jekstrand> kk

19:38 <jekstrand> bbrezillon: What's the bug?

19:39 <jekstrand> bbrezillon: I'm seeing it blow up on PIPELINE_BARRIER2 right now

19:39 <bbrezillon> uh, apparently the last version introduced new bugs :)

19:40 <bbrezillon> but I think it all resolves to the same issue

19:40 <bbrezillon> vk_common_Xxx -> Xxx2 wrappers

19:40 <jekstrand> yup

19:40 <jekstrand> How did lavapipe handle that before?

19:41 <bbrezillon> we don't know what lvp implements in lvp_execute_cmd_buffer()

19:41 <daniels> zmike studiously avoiding this channel ;)

19:42 <HdkR> Just make panvk good enough to run typical x86 games and he'll be forced to come over :P

19:43 <daniels> HdkR: but how could you possibly run x86-64 games on aarch64 ... ?

19:43 <HdkR> I hear there is a Fosdem talk from a little known project that sheds some light on this

19:43 <jekstrand> bbrezillon: What I don't get is how it avoided generating those commands before.

19:44 <bbrezillon> it just had lvp_CmdXxx auto-generated with vk_commands_gen.py

19:44 <bbrezillon> so the core wasn't overloading the CmdXxx implementation with its own wrapper

19:47 <jekstrand> bbrezillon: I think I'm seeing what's going on mayb

19:47 <jekstrand> *maybe

19:47 <jekstrand> and might have a plan

19:48 <bbrezillon> guess we could pass a handler table where each entry is a function that takes a cmd_entry and a void pointer, and then use that table for both filling the device dispatch table, and automating lvp_execute_cmd_buffer() a bit
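The handler-table idea can be sketched as one table driving both sides: it decides which record-time entrypoints exist, and it is what the replay loop dispatches through. This is a minimal illustration with hypothetical names (`HANDLERS`, `build_dispatch_table`, `execute_cmd_buffer`), not the lavapipe or vk_cmd_queue code.

```python
def handle_draw(entry, state):
    state["log"].append(("draw", entry["vertex_count"]))


def handle_dispatch(entry, state):
    state["log"].append(("dispatch", entry["groups"]))


# One table: command name -> function taking (cmd_entry, opaque state pointer).
HANDLERS = {"draw": handle_draw, "dispatch": handle_dispatch}


def build_dispatch_table(handlers):
    """Create record-time entrypoints for exactly the commands in the table."""
    def make_enqueue(name):
        def enqueue(cmd_buffer, **args):
            cmd_buffer.append({"type": name, **args})
        return enqueue
    return {name: make_enqueue(name) for name in handlers}


def execute_cmd_buffer(cmd_buffer, handlers, state):
    """Replay recorded entries through the same table."""
    for entry in cmd_buffer:
        handlers[entry["type"]](entry, state)


dispatch = build_dispatch_table(HANDLERS)
cmd_buffer = []
dispatch["draw"](cmd_buffer, vertex_count=3)
dispatch["dispatch"](cmd_buffer, groups=(8, 1, 1))
state = {"log": []}
execute_cmd_buffer(cmd_buffer, HANDLERS, state)
```

A command without a handler never gets an enqueue entrypoint, so the driver cannot record something it is unable to replay.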

19:58 rasterman has quit [Quit: Gettin' stinky!]

20:10 <jekstrand> bbrezillon: pushed my branch. Should work now.

20:10 <jekstrand> bbrezillon: I'm going to start moving hand-written wrappers

20:14 <bbrezillon> jekstrand: panvk/exec-cmd-if-second ?

20:14 <jekstrand> bbrezillon: yup

20:14 <bbrezillon> I only see my changes there

20:14 <jekstrand> bbrezillon: I squashed things

20:15 <bbrezillon> mind pointing to the relevant commit?

20:15 <jekstrand> look at the lavapipe commit

20:15 <jekstrand> it's got a new function in there with an allowlist

20:17 <bbrezillon> ok, so that's done manually

20:17 <bbrezillon> got it

20:17 <jekstrand> I don't see another way

20:17 <jekstrand> Also, we've been fighting the auto-generation in lavapipe for a long time because of stuff like that

20:17 <jekstrand> It can now start using 2 wrappers if we have a list.

20:18 pendingchaos has quit [Remote host closed the connection]

20:18 pendingchaos has joined #panfrost

20:18 <bbrezillon> me neither, but I thought you had a brilliant idea :)

20:18 <jekstrand> Nope. I just did the typing. :)

20:19 <bbrezillon> okay, I'll respin with those changes and push that tomorrow then

20:20 <jekstrand> I'll send out an MR with some of the changes and the lavapipe stuff before EOD

20:21 <jekstrand> I'm trying to move some lavapipe stuff into common code now for hand-written enqueue funcs

20:21 <bbrezillon> great!

20:21 <bbrezillon> Cc me on the MR and I'll review it

20:21 <jekstrand> cool

20:25 pendingchaos_ has joined #panfrost

20:25 pendingchaos has quit [Read error: Connection reset by peer]

20:25 pendingchaos_ is now known as pendingchaos

20:26 erlehmann has joined #panfrost

20:36 cphealy has quit []

20:36 davidlt has quit [Ping timeout: 480 seconds]

20:45 cphealy has joined #panfrost

21:04 jernej has quit [Remote host closed the connection]

21:06 jernej has joined #panfrost

21:09 <jekstrand> bbrezillon: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15311

21:09 <jekstrand> bbrezillon: CI is running now.

21:10 <jekstrand> bbrezillon: I've also re-pushed my panvk/exec-cmd-if-second branch on top of that.

21:10 <jekstrand> bbrezillon: I've not done CmdBindDescriptorSets yet

21:19 tanty has quit []

21:22 tanty has joined #panfrost

21:25 tanty has quit []

21:28 tanty has joined #panfrost

21:31 <jekstrand> bbrezillon: Ok, I should have BindDescriptorSets now. Only compile-tested but should roughly work.

22:04 nlhowell has quit [Ping timeout: 480 seconds]

22:06 <icecream95> *phew*, using ioctl and mmap in Rust is a lot harder than in C.. hopefully I won't need to do any more of those

22:16 <icecream95> alyssa: Fittingly, commit afbc24a234e in 22.0 is one of your Panfrost commits :)

22:17 <icecream95> It wasn't AFBC related, though..

22:17 rasterman has joined #panfrost

22:20 <anarsoul> icecream95: unsafe {} ?

22:22 <icecream95> anarsoul: I've got it working now, I don't need suggestions. I did end up with four unsafe blocks in total..

22:23 <anarsoul> hope your unsafe blocks are safe :)

22:23 <icecream95> They caused a few kernel Oopses before I fixed the bugs there ;)

22:46 rasterman has quit [Quit: Gettin' stinky!]

22:52 <jekstrand> I think you're generally allowed to assume the kernel is safe in rust. :)

22:59 <jekstrand> Does layered rendering really require geometry shaders on Mali?

22:59 <jekstrand> I'm seeing pan_prepare_rt assert that `last_layer == first_layer`

23:00 <icecream95> jekstrand: There is no HW support for geometry shaders..

23:00 <icecream95> How does layered rendering work?

23:00 <icecream95> (From the API point of view)

23:00 <jekstrand> On some hardware, you can output a layer ID from the vertex shader

23:00 <jekstrand> In GL, I think it requires outputting it from the geometry shader unless you have an extension

23:01 <jekstrand> Trying to figure out how it's all linked in Vulkan ATM

23:03 <icecream95> One way would be to store a position for every layer, but make it e.g. all zeroes for inactive layers

23:03 <icecream95> Then you can run the tiler job for each layer

23:04 <icecream95> Another way is to use the same tiler heap in multiple fragment jobs, and check against the layer ID in the fragment shader

23:04 <jekstrand> You don't want to broadcast a primitive across all layers. I want to select the layer from the VS

23:05 <icecream95> So my suggestion is to write the position for one layer, but every other layer sets it to zero and the tiler removes the primitive

23:05 <jekstrand> You only render to one layer at a time

23:06 <jekstrand> Well, you have them all bound but each primitive only goes to one

23:07 <jekstrand> I guess if shaderOutputLayer isn't supported and you don't support GS, layered rendering is a no-op and everything goes to layer 0.

23:07 <jekstrand> Kind-of weird that it's allowed at all in that case, though.

23:07 <icecream95> Usually vertex shaders write a single output position, but it should be possible to have multiple outputs, one for each layer

23:07 <jekstrand> I don't want one for each layer

23:07 <jekstrand> I only want one output position

23:08 <jekstrand> I just need to be able to direct it to a particular layer

23:08 <icecream95> ..but if you don't write to the others then the tiler will read uninitialised data for the vertex

23:08 <icecream95> (So if you memset the BO to 0 then you don't have to write to the others)
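The zero-filled position buffer idea, as a CPU-side toy. It assumes a non-indexed triangle list, a single layer value per vertex, and that an all-zero triangle is dropped by the tiler (as suggested in the discussion); `write_layered_positions` and `surviving_triangles` are hypothetical names.

```python
ZERO = (0.0, 0.0, 0.0, 0.0)


def write_layered_positions(vertices, num_layers):
    """vertices: list of (position, layer) pairs, as a vertex shader might emit."""
    # One position buffer per layer, zero-filled like a memset BO.
    buffers = [[ZERO] * len(vertices) for _ in range(num_layers)]
    for index, (position, layer) in enumerate(vertices):
        buffers[layer][index] = position    # only the selected layer is written
    return buffers


def surviving_triangles(buffer):
    """Model of the tiler: triangles whose positions are all zero are dropped."""
    triangles = [buffer[i:i + 3] for i in range(0, len(buffer), 3)]
    return [tri for tri in triangles if any(pos != ZERO for pos in tri)]


tri_a = [((0.0, 0.0, 0.0, 1.0), 0), ((1.0, 0.0, 0.0, 1.0), 0), ((0.0, 1.0, 0.0, 1.0), 0)]
tri_b = [((0.5, 0.5, 0.0, 1.0), 2), ((1.0, 0.5, 0.0, 1.0), 2), ((0.5, 1.0, 0.0, 1.0), 2)]
buffers = write_layered_positions(tri_a + tri_b, num_layers=3)
per_layer_counts = [len(surviving_triangles(b)) for b in buffers]
```

Each layer then gets its own tiler job over its own position buffer, which is where the memory cost comes from: one full position buffer per layer.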

23:08 <jekstrand> :-/

23:09 <jekstrand> On a tiler, maybe you want one vertex data BO per layer?

23:09 <jekstrand> That seems kinda crazy

23:10 <icecream95> (I don't know how the blob does this, if it supports the feature; this is just thinking what could work from my knowledge of the tiler unit)

23:13 <jekstrand> That's fair

23:14 <icecream95> "vertex data BO per layer". Which vertex data? Position data can be separated from varyings, so the draw descriptors for each layer would share the varyings, but use a different buffer for positions

23:19 <icecream95> But a third way of implementing it would be: Allocate an index buffer per layer, and in the vertex shader only add the vertex to the index buffer corresponding to the layer, so that the tiler doesn't see the other vertices. This cuts down on memory bandwidth, because you only have to write up to four bytes per layer, rather than 16 for a position

23:19 <jekstrand> yeah

23:21 <icecream95> Well.. that's easy for drawArrays, but when you already have an index buffer it could get tricky
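The index-buffer-per-layer variant for the easy drawArrays case, again as a sequential CPU model. On real hardware the appends would have to be done by the vertex shader with some ordering guarantee, which this sketch ignores; `build_layer_index_buffers` is a hypothetical name.

```python
def build_layer_index_buffers(vertex_layers, num_layers):
    """vertex_layers[i] is the layer chosen for vertex i of a non-indexed draw."""
    index_buffers = [[] for _ in range(num_layers)]
    for vertex_id, layer in enumerate(vertex_layers):
        # Only the selected layer's index buffer ever sees this vertex.
        index_buffers[layer].append(vertex_id)
    return index_buffers


# Two triangles: the first goes to layer 1, the second to layer 0.
index_buffers = build_layer_index_buffers([1, 1, 1, 0, 0, 0], num_layers=2)
```

Positions are written once and shared by all layers; only the small per-layer index lists differ, which is the bandwidth saving mentioned above.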

23:28 <icecream95> (The blob does not expose the feature, which is not a surprise)

23:29 <jekstrand> the blob doesn't expose geometry shaders?

23:31 <icecream95> It does, but not multiViewport

23:31 <jekstrand> This is different from multiviewport

23:31 <jekstrand> multiviewport just lets you have a different viewport per layer. The base GS feature adds layered rendering.

23:32 <icecream95> ..it does?

23:32 <jekstrand> At least in GL

23:32 <jekstrand> Vulkan also ties them together weirdly

23:34 <icecream95> Oh.. even if the shaderOutputLayer feature is not enabled, then it can be used from geometry shaders

23:34 <jekstrand> Yes

23:34 <jekstrand> It's very strange

23:34 <jekstrand> shaderOutputLayer really means shaderOtherThanGeometryOutputLayer

23:38 <icecream95> Here's a fourth way of implementing the feature: Rewrite the fixed-function tiler in OpenCL, and then you can add whatever features you like to it. I've already written a software tiler in C that mostly works

23:38 <jekstrand> hehe. Sure. :)

23:51 <jekstrand> Does bifrost run with a subgroup size of 4 in the FS?

23:52 <jekstrand> I'm reading the nir_op_fddx code and that's the only thing that makes sense

23:56 <icecream95> jekstrand: The "Arm Mali GPU Datasheet" lists v6 Bifrost as having a "warp width" of 4, v7 having 8, and Valhall having 16

23:57 <icecream95> +CLPER.i32 on v7 supports subgroup sizes of 2, 4 and 8

23:57 <jekstrand> Ok. That explains the weird &ing then

23:57 <jekstrand> It uses 1 and 2 where I expected ~2 and ~1 but on a warp size of 4, it's the same.
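A quick check of that equivalence: with only two bits of lane ID, masking with 1 keeps the same bits as masking with ~2, and likewise for 2 and ~1. The 8-lane comparison is included only to show where the two forms stop agreeing.

```python
WARP_WIDTH = 4
lanes = range(WARP_WIDTH)

# Lane IDs 0..3 fit in two bits, so clearing one bit equals keeping the other.
and_1_matches_not_2 = all((lane & 1) == (lane & ~2) for lane in lanes)
and_2_matches_not_1 = all((lane & 2) == (lane & ~1) for lane in lanes)

# With three bits of lane ID the shortcut no longer holds (lane 4 is the first mismatch).
differs_with_8_lanes = any((lane & 1) != (lane & ~2) for lane in range(8))
```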

23:58 <icecream95> Note the BI_SUBGROUP_SUBGROUP4 below

23:58 simon-perretta-img has joined #panfrost

23:59 <jekstrand> Ah

23:59 <jekstrand> Yeah, that makes sense

23:59 <jekstrand> Ok, I think I know how to do coarse/fine on bifrost. Not sure how much anyone cares but it's easy so I'll type it tomorrow once my machine is freed up.
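For reference, coarse versus fine derivatives over a single 2x2 quad can be modelled as below. The lane layout (bit 0 selects the column, bit 1 selects the row) and the choice of the first row and column for the coarse variants are assumptions for illustration, not taken from the Bifrost compiler.

```python
def ddx_fine(values, lane):
    left = lane & ~1                 # left-hand lane of this lane's row
    return values[left | 1] - values[left]


def ddx_coarse(values, lane):
    return values[1] - values[0]     # every lane reuses the first row


def ddy_fine(values, lane):
    top = lane & ~2                  # top lane of this lane's column
    return values[top | 2] - values[top]


def ddy_coarse(values, lane):
    return values[2] - values[0]     # every lane reuses the first column


quad = [1.0, 4.0, 2.0, 9.0]          # per-lane input values for lanes 0..3
fine_x = [ddx_fine(quad, lane) for lane in range(4)]
coarse_x = [ddx_coarse(quad, lane) for lane in range(4)]
fine_y = [ddy_fine(quad, lane) for lane in range(4)]
coarse_y = [ddy_coarse(quad, lane) for lane in range(4)]
```

Fine derivatives differ per row or column of the quad, while coarse derivatives give one value to all four lanes.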