#dri-devel on 2023-06-15 — irc logs at oftc.irclog.whitequark.org

2022-12-21 00:45 ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar

00:00 <DemiMarie> Sorry for being as noisy as a badly tuned engine

00:02 lyudess has joined #dri-devel

00:05 benjamin1 has joined #dri-devel

00:08 Lyude has quit [Ping timeout: 480 seconds]

00:13 benjamin1 has quit [Ping timeout: 480 seconds]

00:15 sghuge has quit [Remote host closed the connection]

00:16 sghuge has joined #dri-devel

00:21 co1umbarius has joined #dri-devel

00:22 columbarius has quit [Ping timeout: 480 seconds]

00:35 columbarius has joined #dri-devel

00:36 co1umbarius has quit [Ping timeout: 480 seconds]

00:38 benjamin1 has joined #dri-devel

00:42 jewins has quit [Ping timeout: 480 seconds]

00:46 benjamin1 has quit [Ping timeout: 480 seconds]

00:57 ngcortes has quit [Ping timeout: 480 seconds]

01:11 benjamin1 has joined #dri-devel

01:19 kzd has quit [Quit: kzd]

01:19 gfxstrand has quit [Ping timeout: 480 seconds]

01:19 benjamin1 has quit [Ping timeout: 480 seconds]

01:29 kzd has joined #dri-devel

01:44 benjamin1 has joined #dri-devel

01:52 ngcortes has joined #dri-devel

01:53 benjamin1 has quit [Ping timeout: 480 seconds]

02:01 <DavidHeidelberg[m]> MrCooper: mupuf: eric_engestrom anholt & others CI interested people: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23629#note_1959074 I did real testing, it seems to work correctly, but I'll be happy for another wave of reviews or acking

02:02 <DavidHeidelberg[m]> Now when we enable one farm, only jobs on that farm get tested.

02:02 <DavidHeidelberg[m]> If we enable two farms, only those two farms get tested, all other jobs will be skipped. If we disable a farm, nothing except the basic build will be tested (and that should go away in future too, but needs more "playing")

02:03 <DavidHeidelberg[m]> also ^ gallo if you find time to say final verdict over the MR :D

02:04 bbrezill1 has joined #dri-devel

02:09 bbrezillon has quit [Ping timeout: 480 seconds]

02:13 yuq825 has joined #dri-devel

02:20 benjamin1 has joined #dri-devel

02:33 benjaminl has joined #dri-devel

02:34 mbrost has quit [Ping timeout: 480 seconds]

02:36 benjamin1 has quit [Ping timeout: 480 seconds]

02:43 <mareko> when I clone piglit, the default branch is master, how to fix that?

02:44 <mareko> nevermind, I cloned the wrong repo

03:09 mbrost has joined #dri-devel

03:14 Company has joined #dri-devel

03:18 heat has quit [Read error: No route to host]

03:19 heat has joined #dri-devel

03:29 heat has quit [Remote host closed the connection]

03:30 heat has joined #dri-devel

03:46 macromorgan has joined #dri-devel

04:04 Daanct12 has joined #dri-devel

04:09 Guest2769 has quit [Ping timeout: 480 seconds]

04:11 Danct12 has joined #dri-devel

04:14 Daanct12 has quit [Ping timeout: 480 seconds]

04:14 ngcortes has quit [Read error: Connection reset by peer]

04:17 TMM has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]

04:17 TMM has joined #dri-devel

04:29 mbrost has quit [Ping timeout: 480 seconds]

04:35 bmodem has joined #dri-devel

04:40 heat has quit [Ping timeout: 480 seconds]

05:04 kzd has quit [Ping timeout: 480 seconds]

05:06 Dr_Who has quit [Read error: Connection reset by peer]

05:12 bmodem1 has joined #dri-devel

05:12 bmodem has quit [Read error: Connection reset by peer]

05:13 Dr_Who has joined #dri-devel

05:29 fab has joined #dri-devel

05:56 sima has joined #dri-devel

06:04 Leopold has joined #dri-devel

06:11 Leopold has quit []

06:13 Leopold_ has joined #dri-devel

06:30 fab has quit [Quit: fab]

06:32 alanc has quit [Remote host closed the connection]

06:32 alanc has joined #dri-devel

06:33 bmodem has joined #dri-devel

06:37 bmodem1 has quit [Ping timeout: 480 seconds]

06:48 rasterman has joined #dri-devel

06:58 sghuge has quit [Remote host closed the connection]

06:58 sghuge has joined #dri-devel

07:00 Leopold_ has quit []

07:00 Leopold_ has joined #dri-devel

07:15 fab has joined #dri-devel

07:18 Guest2279 has quit [Read error: Connection reset by peer]

07:18 peelz has joined #dri-devel

07:19 peelz is now known as Guest3145

07:19 Guest3145 has quit [Read error: Connection reset by peer]

07:20 notpeelz has joined #dri-devel

07:21 YuGiOhJCJ has joined #dri-devel

07:26 jkrzyszt has joined #dri-devel

07:27 Danct12 is now known as Guest3147

07:27 Danct12 has joined #dri-devel

07:29 <MrCooper> karolherbst: should be easy to use -Woverloaded-virtual=1 instead of -Wno-error=overloaded-virtual in CI at least

07:30 Leopold_ has quit []

07:30 <MrCooper> BTW, there actually is a libc++ (from LLVM), it's not the same as libstdc++ (from GCC) though

07:30 rasterman has quit [Quit: Gettin' stinky!]

07:31 Leopold has joined #dri-devel

07:35 bmodem1 has joined #dri-devel

07:35 bmodem has quit []

07:59 lynxeye has joined #dri-devel

08:00 swalker_ has joined #dri-devel

08:01 tzimmermann has joined #dri-devel

08:01 swalker_ is now known as Guest3151

08:01 swalker__ has joined #dri-devel

08:06 tursulin has joined #dri-devel

08:07 pochu has joined #dri-devel

08:08 Guest3151 has quit [Ping timeout: 480 seconds]

08:16 vliaskov has joined #dri-devel

08:46 <karolherbst> MrCooper: yeah.. but our code seems to also trigger at =1.. anyway, I've opened an MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23656

08:49 <MrCooper> I agree with gerddie, which is actually an argument against switching to -Woverloaded-virtual=1 as well

08:50 <karolherbst> yeah, sadly fixing it in codegen is a bit more of a rewrite, it's a bit pointless because there is nothing wrong with the code as is, just... uhhh

08:59 <MrCooper> mainly that we can ignore warnings for CI, we should not hide them though

09:09 <karolherbst> sure, my point is rather that in nouveau's instance of that warning it's a non issue, but anyway... I'll think of something, probably just renaming the method and call it a day

09:11 AndroUser2 has joined #dri-devel

09:12 AndroUser2 has quit [Read error: Connection reset by peer]

09:12 AndroUser2 has joined #dri-devel

09:20 AndroUser2 has quit [Ping timeout: 480 seconds]

09:25 JohnnyonFlame has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]

09:33 Leopold has quit []

09:35 Leopold_ has joined #dri-devel

09:37 mavchatz has quit []

09:39 mvchtz has joined #dri-devel

10:05 heat has joined #dri-devel

10:14 rasterman has joined #dri-devel

10:23 Leopold_ has quit [Remote host closed the connection]

10:31 AndroUser2 has joined #dri-devel

10:38 Leopold has joined #dri-devel

11:04 djbw_ has quit [Remote host closed the connection]

11:09 jfalempe has quit [Ping timeout: 480 seconds]

11:14 swalker__ has quit [Remote host closed the connection]

11:14 swalker__ has joined #dri-devel

11:17 swalker__ has quit [Remote host closed the connection]

11:19 bmodem1 has quit [Ping timeout: 480 seconds]

11:22 pochu_ has joined #dri-devel

11:23 pochu has quit [Ping timeout: 480 seconds]

11:40 jfalempe has joined #dri-devel

11:48 bmodem has joined #dri-devel

12:06 <dolphin> airlied, sima: Anything blocking drm-intel-next merge or just time?

12:06 <dolphin> jani is on vacation and asked me to look after the merge, thus pinging

12:08 <dolphin> no drm-intel-fixes this week (just one unused variable fix and one gen3 color fix, they can get in via -next)

12:12 fab has quit [Ping timeout: 480 seconds]

12:21 Arsen has quit [Quit: Quit.]

12:26 Arsen has joined #dri-devel

12:31 leonardo has joined #dri-devel

12:32 leonardo is now known as DottorLeo

12:40 DottorLeo has quit [Quit: Konversation terminated!]

12:41 DottorLeo has joined #dri-devel

12:42 <DottorLeo> hi!

12:43 <DottorLeo> hakzsam: i've seen your patch for IB on GFX6, can it bring better performance on a SI card? :)

12:47 <DottorLeo> mareko: https://gitlab.freedesktop.org/mareko/mesa/-/commit/10c889df309efeeef20a27d04c2637bcb76f8dea talks about GFX7, but shouldn't it fix GFX6 or affect both generations?

12:48 <mlankhorst> drm-tip is broken, fixing up now

12:51 Dr_Who has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]

12:56 Danct12 has quit [Quit: WeeChat 3.8]

13:05 alyssa has joined #dri-devel

13:05 idr has quit [Ping timeout: 480 seconds]

13:06 <hakzsam> DottorLeo: not really

13:08 alyssa has left #dri-devel [#dri-devel]

13:12 <pendingchaos> am I correct in thinking that nir_loop_analyze doesn't actually use the loop invariance information?

13:12 <pendingchaos> it seems to be using it to remove invariant variables from process_list, but later only phis in that list are visited, and phis are never invariant

13:20 DottorLeo has quit [Quit: Konversation terminated!]

13:28 <karolherbst> jenatali: how is `clc.h` used inside MS's CL code? Can I include "util/macros.h" there for `PACKED`? I want to enforce that clc_optional_features is packed, because I have to use it for hashing. Or do you think we can be trusted to not mess it up and ensure that struct is always packed anyway?

13:30 <jenatali> karolherbst: We just copy/paste it into our external project. I'd prefer to keep it standalone if possible but if it needs to grow more dependencies I can live with it

13:31 <karolherbst> yeah.. I mean, in this case it doesn't feel worth the effort as clc_optional_features is just an array of bools basically, but I have to hash it into my key and kinda don't want anybody to mess with it in a wrong way

13:31 <jenatali> Does PACKED even work on MSVC?

13:31 <karolherbst> maybe I just leave a comment and hope for the best?

13:31 <karolherbst> huh...

13:31 <karolherbst> I don't think so

13:31 <jenatali> I know we have #pragma pack but I don't know that it works on inline attribute-style stuff

13:32 <jenatali> I'd just throw in a static assert on the size of the type

13:32 <karolherbst> mhhh yeah, probably good enough

13:33 <karolherbst> but also annoying

13:34 <karolherbst> the only thing relevant here is not having any padding

13:35 <jenatali> If you have C++ there's a type trait for that

13:36 <jenatali> Which, the header ends up getting included in C++ for our tests

13:36 iive has joined #dri-devel

13:36 <karolherbst> mhhhhhh

13:38 <karolherbst> I'll just leave a comment

13:38 <jenatali> Ok
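
The "static assert on the size of the type" idea above can be illustrated with a minimal sketch. It uses Python's ctypes, which follows the platform's native C layout rules, rather than the real clc.h; the struct and field names below are hypothetical stand-ins, not the actual clc_optional_features definition. The check is the same one a C static assert would express: the struct has no padding if its size equals the sum of its field sizes. (The C++ type trait jenatali mentions is presumably std::has_unique_object_representations, but that is an assumption.)

```python
import ctypes


class OptionalFeatures(ctypes.Structure):
    # Hypothetical stand-in for a struct made only of bools.
    _fields_ = [
        ("fp16", ctypes.c_bool),
        ("fp64", ctypes.c_bool),
        ("int64", ctypes.c_bool),
        ("images", ctypes.c_bool),
    ]


class MixedFeatures(ctypes.Structure):
    # Hypothetical counter-example: a wider field after a bool makes the
    # native layout insert alignment padding between the two fields.
    _fields_ = [
        ("fp16", ctypes.c_bool),
        ("max_dims", ctypes.c_uint32),
    ]


def has_padding(struct_type):
    """Return True if sizeof(struct) exceeds the sum of its field sizes."""
    field_bytes = sum(ctypes.sizeof(ftype) for _name, ftype in struct_type._fields_)
    return ctypes.sizeof(struct_type) != field_bytes
```

A struct that passes this check can be hashed byte-for-byte without uninitialised padding leaking into the key, which is the property karolherbst cares about.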

13:40 jewins has joined #dri-devel

13:41 Dark-Show has joined #dri-devel

13:45 nchery has quit [Ping timeout: 480 seconds]

13:45 nchery has joined #dri-devel

14:12 alyssa has joined #dri-devel

14:13 <alyssa> Do we have any optimizations for `ivec2(gl_FragCoord.xy)` or `floor(gl_FragCoord.xy)`?

14:13 <alyssa> grepping shader-db, that seems to come up in a bunch of places

14:13 <alyssa> For some meta shader stuff I added a load_pixel_coord, returning an ivec2

14:14 kzd has joined #dri-devel

14:14 <alyssa> this is what both AGX and Mali Bifrost+ implement natively, with load_frag_coord.xy lowered in the backend to float(load_pixel_coord) + 0.5

14:15 fab has joined #dri-devel

14:15 <alyssa> so wondering if this is something other hardware has too, and if anyone else would care for nir_opt_intrinsics to optimize `f2i(load_frag_coord.xy) -> load_pixel_coord` and `floor(load_frag_coord.xy) -> float(load_pixel_coord)`, saving the add

14:15 <alyssa> or... maybe that's not even necessary?

14:16 <alyssa> Maybe if the load_frag_coord -> load_pixel_coord lowering happens early, then opt_algebraic would chew through the ALU for free?

14:17 YuGiOhJCJ has quit [Quit: YuGiOhJCJ]

14:17 <alyssa> looks like we would need to add a few rules to nir_opt_algebraic but that's easier than opt_intrinsics

14:18 <alyssa> looks like ir3 would potentially benefit

14:20 yuq825 has left #dri-devel [#dri-devel]

14:22 <alyssa> yeah, it'd be ir3 + agx + bifrost

14:22 <alyssa> might still be worth it just to share some lowering
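
The algebra behind the idea above can be checked with a minimal sketch. It assumes the lowering alyssa describes (load_frag_coord.xy becomes float(load_pixel_coord) + 0.5) and non-negative pixel coordinates; the Python function names are illustrative only and are not NIR API. Python floats are doubles, so for fp32 the identity is expected to hold only while coordinates stay well below 2^23.

```python
import math


def load_frag_coord(pixel):
    """Backend lowering described above: float(load_pixel_coord) + 0.5."""
    x, y = pixel
    return (float(x) + 0.5, float(y) + 0.5)


def f2i(vec):
    """Float-to-int conversion that truncates toward zero, like ivec2(...)."""
    return tuple(int(c) for c in vec)


def ffloor(vec):
    """Component-wise floor that keeps the result as floats."""
    return tuple(float(math.floor(c)) for c in vec)


def identities_hold(width, height):
    """Check both rewrites for every pixel of a width x height target."""
    for y in range(height):
        for x in range(width):
            frag = load_frag_coord((x, y))
            if f2i(frag) != (x, y):
                return False
            if ffloor(frag) != (float(x), float(y)):
                return False
    return True
```

Since both f2i and floor discard the added 0.5, the add is dead once the lowering happens early, which is why a few algebraic rules would be enough.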

14:22 AndroUser2 has quit [Read error: Connection reset by peer]

14:23 AndroUser2 has joined #dri-devel

14:24 greaser|q has joined #dri-devel

14:24 greaser|q has quit []

14:26 fxkamd has joined #dri-devel

14:26 fab has quit [Ping timeout: 480 seconds]

14:35 greaser|q has joined #dri-devel

14:47 Jakdaw has quit [Quit: Leaving]

14:57 tzimmermann has quit [Quit: Leaving]

14:57 anholt_ has joined #dri-devel

15:03 anholt has quit [Ping timeout: 480 seconds]

15:06 fab has joined #dri-devel

15:17 kxkamil2 has quit []

15:18 pochu_ has quit []

15:24 Duke`` has joined #dri-devel

15:25 Dark-Show has quit [Quit: Leaving]

15:28 bmodem has quit [Ping timeout: 480 seconds]

15:35 bbrezill1 has quit []

15:35 bbrezillon has joined #dri-devel

15:37 smiles_ has quit [Remote host closed the connection]

15:40 jkrzyszt has quit [Ping timeout: 480 seconds]

15:57 sauce has quit []

15:59 kxkamil has joined #dri-devel

15:59 JohnnyonFlame has joined #dri-devel

16:02 sauce has joined #dri-devel

16:21 lynxeye has quit [Quit: Leaving.]

16:28 AndroUser2 has quit [Ping timeout: 480 seconds]

16:33 benjaminl has quit [Quit: WeeChat 3.8]

16:36 benjaminl has joined #dri-devel

16:48 heat_ has joined #dri-devel

16:48 gfxstrand has joined #dri-devel

16:48 heat has quit [Read error: Connection reset by peer]

16:54 mbrost has joined #dri-devel

17:17 nchery has quit [Remote host closed the connection]

17:23 jewins has quit [Remote host closed the connection]

17:23 jewins1 has joined #dri-devel

17:23 djbw_ has joined #dri-devel

17:30 nchery has joined #dri-devel

17:31 jewins1 has quit [Ping timeout: 480 seconds]

17:39 ngcortes has joined #dri-devel

17:42 AndroUser2 has joined #dri-devel

17:45 AndroUser2 has quit [Read error: Connection reset by peer]

17:45 AndroUser2 has joined #dri-devel

18:06 jewins has joined #dri-devel

18:25 AndroUser2 has quit [Remote host closed the connection]

18:30 AndroUser2 has joined #dri-devel

18:40 sima has quit [Ping timeout: 480 seconds]

18:41 ngcortes has quit [Read error: Connection reset by peer]

18:42 <jenatali> gfxstrand: I'm looking at hooking up D3D's (recently revised) render pass descriptions in Dozen, which aim to take advantage of subpass dependency info for TBDR (*cough* QC *cough*). I only need a very small subset of additional info on top of what dynamic rendering gives me. In your opinion, should I be trying to extend the render pass -> dynamic rendering stuff in core Vulkan, or should I just implement my own render pass state machine in Dozen?

18:45 <alyssa> jenatali: I would love a way to take advantage of subpass deps with dynamic rendering plus

18:45 <alyssa> Figuring out exactly how to extend the runtime to plumb that through is still an open problem

18:45 <jenatali> Yeah, I see the TODO

18:45 <alyssa> What does this look like in D3D?

18:46 <alyssa> What do you envision the runtime doing for Dozen?

18:46 <gfxstrand> If we could do a runtime thing, that'd be cool.

18:46 <gfxstrand> IDK how practical it is, though.

18:46 <alyssa> cool assuming the thing that Dozen wants is remotely similar to what the actual hw drivers want ;)

18:46 TMM has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]

18:46 TMM has joined #dri-devel

18:46 <alyssa> which is why I'm asking since I didn't know D3D had subpasses

18:47 <jenatali> Additional load/store ops. When ending rendering to an attachment that will be used in a future subpass, there's additional store ops indicating how it'll be used in the future (color attachment, input attachment, storage image)

18:47 <jenatali> And then when beginning the next one, you specify the same modality as a load op

18:47 <alyssa> Interesting

18:47 <alyssa> Do you have a link I can read about this?

18:47 <jenatali> Essentially a "don't flush this from tile memory yet" option

18:48 <jenatali> https://github.com/microsoft/DirectX-Specs/blob/master/d3d/RenderPasses.md

18:48 <alyssa> oh it's chunky ok

18:48 <jenatali> Yeah, our normal APIs for non-TBDR are just "bind color attachments" and "bind depth stencil attachment"

18:49 <gfxstrand> You know, with Microsoft implementing Vulkan on D3D12 and Valve implementing D3D12 on Vulkan, it almost makes you wish we had some sort of IDK... working group? where we could all get together and discuss stuff so we could have just one API. 🤔

18:49 <jenatali> The render pass stuff we added several years back looks like dynamic rendering today

18:49 <alyssa> gfxstrand: mood

18:49 Kayden has quit [Quit: -> JF]

18:49 <alyssa> Apple: just watch me

18:49 <jenatali> Heh

18:49 <jenatali> So you're saying that we should all switch to OneAPI? That's what I heard

18:50 <alyssa> Intel: 🤡

18:50 <gfxstrand> jenatali: hehe

18:51 <gfxstrand> jenatali: But which One of the APIs?

18:51 <alyssa> openvg++

18:51 <jenatali> Yeah

18:51 <Sachiel> obviously a new one that will fix all the issues

18:51 <jenatali> https://xkcd.com/927/

18:51 <alyssa> jenatali: ok, I do not have sufficient bandwidth for that .md to soak in today

18:52 <gfxstrand> Come on, folks... There's an even subtler jab in there that y'all're missing. 😅

18:52 * alyssa punches Faith, subtly

18:52 <alyssa> that jab?

18:52 <alyssa> :-p

18:52 <gfxstrand> No, not that jab

18:52 <alyssa> Oh

18:52 * alyssa retracts punch

18:53 <jenatali> Apparently too subtle for me

18:53 <alyssa> same

18:53 <gfxstrand> Intel's OneAPI is more like a OneOfEachAPI

18:53 <alyssa> ahaha

18:53 <jenatali> Yeah... I'd noticed that

18:54 <gfxstrand> Do you have any idea how many APIs are in OneAPI? It's a lot.

18:54 <alyssa> jenatali: Anyway, if you can figure out a reasonable way to augment the common VkRenderPass to express subpass deps, that'd be really awesome

18:54 <gfxstrand> I lost count a long time ago

18:55 <alyssa> It's just never been clear to me exactly what that would look like

18:55 <jenatali> alyssa: Unfortunately I don't necessarily know exactly what information all the other drivers would need

18:55 <gfxstrand> Hopefully quite a lot like D3D12 render passes and/or whatever dynamic input attachments thing Khronos cooks up in $FUTURE_DATE

18:55 <jenatali> It's hard for me to guess what's going too far and throwing in the kitchen sink vs being too conservative and not being useful for anyone else

18:55 <alyssa> I can speak to both agxv and panvk for whatever it's worth

18:56 <jenatali> So what I'd want on top of dynamic rendering is a list of input attachments, and a list of preserve attachments, where the preserve also indicates what it's being preserved for, i.e. what's the next use going to be

18:56 <jenatali> I don't need to know when or really anything else
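
The information jenatali asks for can be derived from a classic render pass description, as in this minimal sketch. The input format (one dict per subpass mapping attachment name to use kind) and all names are hypothetical; this is neither the Mesa runtime's data model nor the D3D12 API. For each subpass it computes which attachments are inherited from an earlier subpass and which must be preserved, together with the kind of their next use.

```python
def annotate_subpasses(subpasses):
    """Annotate each subpass with inherited and preserved attachments.

    subpasses: list of {attachment_name: use_kind} dicts, in execution order.
    Returns a list of (inherited, preserve_for) tuples, where inherited is a
    sorted list of names and preserve_for maps name -> kind of the next use.
    """
    result = []
    seen = set()  # attachments touched by this or any earlier subpass
    for index, uses in enumerate(subpasses):
        inherited = sorted(name for name in uses if name in seen)
        seen.update(uses)
        preserve_for = {}
        for name in sorted(seen):
            # Find the first later subpass that uses this attachment.
            for later in subpasses[index + 1:]:
                if name in later:
                    preserve_for[name] = later[name]
                    break
        result.append((inherited, preserve_for))
    return result


# Hypothetical deferred-shading pass with three subpasses.
DEFERRED = [
    {"albedo": "color", "depth": "depth"},
    {"albedo": "input", "depth": "input", "lit": "color"},
    {"lit": "input", "final": "color"},
]
```

Note that the result says nothing about when the next use happens, only what it is, matching the "I don't need to know when" constraint.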

18:57 <alyssa> OK.. I'm trying to figure out exactly how that'd map to hw

18:57 <alyssa> Is the idea that each subpass would be a separate begin/endrendering, but with that extra info to let the driver coalesce them?

18:57 <jenatali> Yep

18:58 <Sachiel> like the suspend/resume flags, but actually useful

18:59 <alyssa> Alright

18:59 <jenatali> We also have suspend/resume for continuing the same exact kind of rendering, but yeah

18:59 <alyssa> jenatali: So, Mali and AGX are both closer to Apple's imageblock model than Vulkan's render passes

19:00 <jenatali> Unfortunately that means nothing to me. I need to learn Metal one of these days...

19:00 <alyssa> Nah you're not missing out

19:00 <alyssa> here's a simplified programming model for AGX:

19:00 <alyssa> - tilebuffer is untyped on-chip memory (the same as GL shared memory)

19:01 <alyssa> - an instruction to store to shared ("tilebuffer") memory at a given offset with a given format implicitly indexed by the pixel coordinate

19:01 <alyssa> - the same instruction, but load

19:02 <alyssa> - an instruction to blit a block of shared memory (at a particular offset with a particular format) to a storage image

19:02 <alyssa> and that's it

19:02 <alyssa> notice there are no subpasses, renderpasses, render targets, color buffers, .. etc

19:03 <alyssa> How do you actually program that?

19:03 <alyssa> For the regular (single-subpass) case, it's pretty straightforward:

19:03 <jenatali> Yeah that sounds fun :P

19:03 <alyssa> 1. Fragment shaders store their output with something like "local store pixel <colour>, offset = 0, format = rgba8"

19:04 <alyssa> 2. fbfetch is similarly "local load pixel <colour>, offset = 0, format = rgba8"

19:05 <alyssa> 3. At the end of the frame, we insert an "end-of-tile" program that runs once per tile and contains ~1 instruction/render target "blit block at offset 0 of format rgba8 to image 0" and we bind each colour buffer as a corresponding image
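
The simplified programming model above can be mimicked in software with a small sketch. Everything here is a hypothetical toy: the class, method names, and the single supported format ("rgba8") are illustrative and do not reflect real AGX instructions or the Mesa driver. The point is only that store, load, and end-of-tile blit all address the same untyped per-pixel bytes by offset and format.

```python
import struct


class Tile:
    """Toy model of one tile: untyped on-chip bytes for every pixel."""

    def __init__(self, width, height, bytes_per_pixel):
        self.width = width
        self.height = height
        self.bpp = bytes_per_pixel
        self.mem = bytearray(width * height * bytes_per_pixel)

    def _base(self, x, y, offset):
        # Implicit indexing by pixel coordinate, plus a shader-chosen offset.
        return (y * self.width + x) * self.bpp + offset

    def store_pixel(self, x, y, rgba, offset=0, fmt="rgba8"):
        """Like 'local store pixel <colour>, offset, format'."""
        if fmt != "rgba8":
            raise ValueError("only rgba8 is modelled in this sketch")
        base = self._base(x, y, offset)
        self.mem[base:base + 4] = struct.pack("4B", *rgba)

    def load_pixel(self, x, y, offset=0, fmt="rgba8"):
        """Like 'local load pixel <colour>, offset, format' (fbfetch)."""
        if fmt != "rgba8":
            raise ValueError("only rgba8 is modelled in this sketch")
        base = self._base(x, y, offset)
        return struct.unpack("4B", bytes(self.mem[base:base + 4]))

    def blit_to_image(self, image, offset=0, fmt="rgba8"):
        """End-of-tile program: copy one block of the tile to an image."""
        for y in range(self.height):
            for x in range(self.width):
                image[(x, y)] = self.load_pixel(x, y, offset, fmt)
```

Two render targets are then just two offsets into the same per-pixel storage, each written out by its own blit at the end of the tile.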

19:05 <alyssa> How do subpasses fit in?

19:05 tursulin has quit [Ping timeout: 480 seconds]

19:06 <alyssa> Well, Apple doesn't implement them so this is all based on the Vulkan driver I've written in my head and might be wrong ;-p

19:06 <alyssa> But -- ideally, all the subpasses are concatenated together

19:06 <alyssa> first all the draws of subpass 0, then all the draws of subpass 1, etc

19:07 <alyssa> Within each subpass, fragment shaders can load from input attachments (at one set of offsets) and store to outputs (at different offsets)

19:08 <jenatali> Yep

19:08 <alyssa> Notably the layout of the tilebuffer can change from draw-to-draw (with some careful barriers), so that input attachments that aren't used anymore don't need to stay around and we can fit more in the tilebuffer

19:08 <alyssa> And then finally, at the end of the frame we have our end-of-tile program that stores out just the parts that actually have STORE_OP_STORE, at whatever set of offsets we had for the final layout at the end of the render pass

19:09 <jenatali> Yep, that all sounds right

19:09 <alyssa> Mali's programming model isn't *quite* as loose, since there's a bit more fixed function hardware, but it's the same idea

19:10 <alyssa> All of the subpasses' draws are concatenated into one big batch, the hardware "render targets" are just the ones that are STOREd at the very end, in the middle the LD_TILE/ST_TILE instructions take driver-controlled offsets and formats to make things work according to the (possibly evolving) layout

19:11 <alyssa> Where this gets hard is when there isn't enough space in the tilebuffer to accommodate the whole render pass at once

19:11 <alyssa> in that case, some attachments need to spill to memory

19:11 <jenatali> Yeah. The one thing I'm not super familiar with is how TBDRs are expected to compute a tile layout given the model that we currently have, but what we ended up with was the compromise between "app developers will actually use this" vs "gives enough information that a driver can compute a layout"

19:11 <alyssa> or we need to split up the render pass

19:11 <alyssa> or something like that

19:11 <alyssa> right, this is one of the sticky points

19:12 <alyssa> for best perf the driver has to do something like register allocation

19:12 <alyssa> look at each attachment, look which subpasses it's "live" for, allocate offsets for each subpass allowing overlap for subpasses that don't "interfere" with each other
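
The register-allocation analogy above can be sketched as a greedy first-fit allocator. This is a minimal illustration under simplifying assumptions, not any driver's algorithm: each attachment keeps one fixed offset for its whole live range (the log notes the real layout can change from draw to draw), live ranges are inclusive subpass intervals, and the function and attachment names are hypothetical.

```python
def allocate_tilebuffer(attachments):
    """Assign per-pixel byte offsets so interfering attachments never overlap.

    attachments: list of (name, size_bytes, first_subpass, last_subpass).
    Returns ({name: offset}, peak_bytes).
    """
    placed = []  # (offset, size, first, last) for attachments already placed
    offsets = {}
    peak = 0
    for name, size, first, last in sorted(attachments, key=lambda a: (a[2], a[0])):
        # Byte ranges held by attachments whose live range overlaps this one.
        busy = sorted(
            (p_off, p_off + p_size)
            for p_off, p_size, p_first, p_last in placed
            if p_first <= last and first <= p_last
        )
        offset = 0
        for lo, hi in busy:
            if offset + size <= lo:
                break  # fits in the gap before this busy range
            offset = max(offset, hi)
        placed.append((offset, size, first, last))
        offsets[name] = offset
        peak = max(peak, offset + size)
    return offsets, peak


# Hypothetical deferred pass: sizes in bytes per pixel, live ranges in subpasses.
GBUFFER = [
    ("albedo", 4, 0, 1),
    ("normal", 4, 0, 1),
    ("lit", 8, 1, 2),
    ("final", 4, 2, 2),
]
```

With this input, "final" reuses the bytes of "albedo" because their live ranges do not interfere, so the peak is 16 bytes per pixel instead of 20. If the peak exceeds the hardware tilebuffer size, the driver would have to spill or split the pass, which this sketch does not attempt.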

19:12 <jenatali> My guess is that the driver would do some kind of 2-pass approach over the commands to construct a more holistic view of the pass before actually building the layout, but this is really black-box to me

19:13 <alyssa> right. so one of the open questions is whether THAT code could be shared if we had common render passes

19:13 <alyssa> I know VkRenderPass + monolithic pipelines is designed to have enough information for this to work (TBDRs had a big say in Vulkan 1.0)

19:14 <alyssa> I don't know if it's still possible to do with dynamic rendering plus, especially if you throw pipelines out

19:14 Kayden has joined #dri-devel

19:14 <jenatali> Yeah, makes sense. That's a level of detail more than what I need, like I was saying, all I currently need is, for the current subpass, which attachments are inherited and which will be inherited in the future

19:14 <alyssa> Yeah, makes sense

19:14 <alyssa> Qualcomm's blob is presumably doing all those heroics underneath

19:15 <jenatali> Which, obviously having more information is fine, I can just ignore it, as long as I'm able to get those bits from whatever we end up with

19:15 <alyssa> Sure

19:15 <jenatali> So, the question is, how far do I go right now? :P

19:15 <alyssa> heh

19:15 <alyssa> The other question I have is whether D3D12 style render passes are sufficient to implement subpass merging on Mali/PowerVR/AGX/VideoCore

19:15 <alyssa> The true TBDRs

19:16 <jenatali> Is Adreno that different?

19:16 <alyssa> Yes, Adreno (uniquely) can behave as either a TBDR or an IMR

19:16 <alyssa> ("gmem" vs "sysmem")

19:17 <alyssa> which has substantial implications for how this stuff gets implemented in their driver

19:17 <jenatali> Huh. Didn't realize that, but it makes sense. This whole thing is about allowing them to behave as TBDR though

19:18 <alyssa> (For ex, on Adreno to read from a subpass input attachment you... still use a texture instruction with a magic texture descriptor that loads from gmem instead of system memory. IIRC there's an ugly point somewhere in Vulkan that exists to accommodate that.)

19:18 <alyssa> (This is very different than Mali/AGX where a tilebuffer load is a completely different instruction than texturing)

19:18 <jenatali> Probably the fact that input attachment descriptors have to exist in descriptor sets

19:21 <alyssa> At any rate my question is, given the information your thing would give, where would I compute tilebuffer layouts?

19:21 <alyssa> and how would I spill?

19:21 anujp has quit [Remote host closed the connection]

19:21 <jenatali> Right, I don't see how you would

19:22 <jenatali> For rendering that was actually done with a VK1.0 render pass, you'd want all the info in that pass to compute the tile layouts

19:22 <jenatali> In a hypothetical future where there's a dynamic rendering extension that provides this type of attachment inheritance info, you'd probably want to do the same heroics that the Windows QC driver is doing to reconstruct it

19:23 <alyssa> I guess that's my other question, how is QC's D3D12 driver doing it?

19:23 <jenatali> But IMO it doesn't really make sense to take all of the info from a VK1.0 pass, throw it out and strip it down to dynamic rendering, and then make the driver reconstruct it just because they might also have to deal with constructing it out of nothing sometimes

19:23 <alyssa> Yeah, valid

19:23 <jenatali> 🤷‍♂️ I wasn't really involved in the design of this feature

19:24 <jenatali> Outside of "is there enough info in Vulkan to map to this" :P

19:24 <alyssa> Heh

19:24 <alyssa> FWIW, the actual plan for AGXV is to do dynamic rendering only (and not get the win for VK 1.0 render passes) on the assumption that virtually no content we care about uses subpass merging

19:25 <jenatali> Yeah that makes a lot of sense

19:25 <jenatali> Except maybe some benchmarks which we apparently care about

19:25 <alyssa> and that number will decrease even further if someday Vulkan gains a dynamic rendering approach to subpasses

19:25 <alyssa> which would likely be much much much easier to implement in driver (since the tilebuffer layout becomes the app's problem)

19:25 <alyssa> (This is Metal's approach, basically.)

19:26 <jenatali> Makes sense

19:26 <alyssa> Metal's imageblocks map basically directly to the hardware model I explained

19:27 <jenatali> So, back to my original question: is it worth trying to add limited subpass dependency info on top of dynamic rendering in the Vulkan runtime code?

19:27 <alyssa> I don't know.

19:27 <alyssa> If it isn't invasive I wouldn't nak it, and i don't think gfxstrand would

19:27 <alyssa> But realistically I'm unsure if any other drivers would be able to use it, except maybe turnip

19:27 <alyssa> cwabbott: flto: danylo: questions for you^^

19:28 <jenatali> I guess, I could, while processing dynamic rendering begin, just look at the command buffer render pass state and reconstruct it in my driver... that might be simpler

19:29 <jenatali> Without doing the full render pass state machine myself or trying to guess what other drivers might want

19:31 <jenatali> I'll start with that

19:34 gouchi has joined #dri-devel

19:36 macc24_ has quit [Remote host closed the connection]

19:39 lyudess has quit []

19:39 Lyude has joined #dri-devel

19:45 <danylo> So you want to stitch together dynamic renderpasses if attachment layout allows it? Are there real world examples where there are big renderpasses that are worth stitching together?

19:45 <danylo> Our code in Turnip for dynamic RPs is already cursed...

19:45 <alyssa> danylo: turnip implements vkrenderpass itself and then dynamic on top of RP, right?

19:46 <alyssa> we're talking about flipping that, implementing dynamic rendering + extra info of some kind and then common VkRenderPass

19:46 <danylo> Not really on top

19:47 <danylo> Ok, probably on top =)

19:53 <alyssa> ed125e6cca1 ("tu: Initial support for dynamic rendering")

19:53 <alyssa> unless that got changed later?

19:53 <alyssa> I haven't followed super close

19:56 gildekel has joined #dri-devel

20:01 <danylo> Ugh, I guess that's not the topic I'm ready to discuss during the vacation

20:03 <alyssa> Oh! I didn't realize you were on holiday

20:03 * alyssa zips lips

20:03 <alyssa> sorry!

20:06 <danylo> I mean, I could answer random questions, but dynamic rp ones are headache inducing =)

20:06 <alyssa> understood :p

20:07 <airlied> dolphin: just me, should merge today

20:07 <alyssa> today's project has been teaching AGX how to be an immediate mode renderer because the tilebuffer is too small to pass that one dEQP :~)

20:11 <gfxstrand> hehe

20:11 <gfxstrand> I keep typing more spilling code

20:12 <gfxstrand> There seems to be a lot of it.

20:12 <gfxstrand> I might need to spill my spilling code...

20:13 Leopold has quit []

20:14 <alyssa> real

20:15 <alyssa> gfxstrand: any chance I could interest you in writing an AGX rust compiler after you're done with NAK

20:15 <alyssa> :p

20:16 <jenatali> Hm. There's no indication in a render pass whether an attachment that's used in a single subpass as both depth/stencil and an input attachment is going to be read-only from the depth bind point?

20:16 <jenatali> That's... surprising

20:19 Leopold_ has joined #dri-devel

20:19 <gfxstrand> jenatali: That's annoyingly tricky. I think there's an image layout that lets you do it.

20:20 <jenatali> Yeah probably. Right now the common subpass attachment translation just smashes it to feedback loop :)

20:20 <gfxstrand> Actually, it's just GENERAL isn't it?

20:20 <gfxstrand> Yeah, that case is stupid-annoying to optimize

20:21 <gfxstrand> Because answering the question "Is it written?" is non-trivial.

20:21 <jenatali> Yeah it depends on which pipelines will be used

20:21 <jenatali> Which is why we put that info upfront in our render pass description...

20:22 <jenatali> I'm just really surprised there's not some kind of flag on the depth/stencil attachment reference indicating that it's a read-only reference. Trying to reverse engineer that instead of just being upfront about it seems weird

20:24 <gfxstrand> Right

20:26 <gfxstrand> jenatali: Reading docs, it looks like you can use VK_IMAGE_LAYOUT_DEPTH_STENCIL_READ_ONLY_OPTIMAL with input attachments.

20:26 rasterman has quit [Quit: Gettin' stinky!]

20:26 <jenatali> Ok, so it seems like some checks need to be added in vk_common_CreateRenderPass2 to look at the layouts before smashing them to feedback loops

20:27 <gfxstrand> Yeah

20:31 <jenatali> Something like https://gitlab.freedesktop.org/-/snippets/7645
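
The kind of check discussed above could look roughly like the following sketch. It is not the content of the linked snippet and not Mesa code: the function name and its boolean parameters are hypothetical, and only two fully read-only Vulkan layouts are considered (per-aspect layouts such as depth-read-only with writable stencil are deliberately ignored).

```python
# Layouts in which the whole depth/stencil image is read-only.
READ_ONLY_DEPTH_STENCIL_LAYOUTS = {
    "VK_IMAGE_LAYOUT_DEPTH_STENCIL_READ_ONLY_OPTIMAL",
    "VK_IMAGE_LAYOUT_READ_ONLY_OPTIMAL",
}


def is_feedback_loop(used_as_depth_stencil, used_as_input, layout):
    """Decide whether a subpass attachment must be treated as a feedback loop.

    An attachment bound both as depth/stencil and as an input attachment is
    only a feedback loop if its layout allows writes through the depth bind
    point; a read-only layout makes the combination harmless.
    """
    if not (used_as_depth_stencil and used_as_input):
        return False
    return layout not in READ_ONLY_DEPTH_STENCIL_LAYOUTS
```

The design choice is to look at the layout first and fall back to the conservative feedback-loop treatment only when writes cannot be ruled out.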

20:32 <alyssa> I'm afraid to ask but what's the requirements around feedback loops in Vulkan?

20:32 <alyssa> Feedback loops on our hardware are.... problematic

20:32 <jenatali> Yeah D3D doesn't support them

20:33 <jenatali> Looks like the requirements are either explicit memory barriers annotating a write->read point in the loop, or else fragment ordering flags somewhere else to ensure proper ordering

20:34 <jenatali> Hence why I care about this, read-only depth/stencil + input attachment is a totally valid use case, feedback loops much less so

20:34 <alyssa> nod

20:34 anujp has joined #dri-devel

20:37 heat_ has quit [Remote host closed the connection]

20:39 heat has joined #dri-devel

20:47 ecm has joined #dri-devel

20:48 heat has quit [Remote host closed the connection]

20:48 <ecm> hi

20:48 alyssa has quit [Quit: leaving]

20:48 heat has joined #dri-devel

20:48 <ecm> I've been having trouble with libEGL since yesterday as it keeps saying libEGL warning: failed to get driver name for fd -1

20:49 <ecm> I do not understand what is meant by this

20:49 <ecm> according to glxinfo my gpu is working fine with nouveau

20:49 <ecm> only EGL throws this error

20:51 <ecm> the EGL debug log for this https://0x0.st/HTH2.txt

20:52 <ecm> strace: https://0x0.st/HTH_.txt

20:55 <karolherbst> the fd being -1 is kinda odd

20:56 <karolherbst> ecm: what's the mesa version?

20:56 <ecm> mesa-23.1.2_1

20:56 <karolherbst> interesting...

20:57 <karolherbst> wayland/xorg?

20:57 <ecm> xorg

20:57 <karolherbst> mhhh.. maybe zink gets in the way or something? kinda weird...

20:58 <ecm> my gpu doesn't support vulkan, zink shouldn't even be used right ?

20:59 ngcortes has joined #dri-devel

20:59 <karolherbst> yeah.. well.. my theory is that zink failing could mess things up

20:59 <karolherbst> but it could also be something entirely different

20:59 <karolherbst> did anything change yesterday?

21:00 <airlied> ecm: are all your packages the same version?

21:01 <ecm> karolherbst: I only reinstalled mesa-vaapi yesterday

21:01 <karolherbst> mhhhhh

21:01 <karolherbst> yeah, maybe airlied is right then.. are all your mesa sub packages on the same version?

21:02 <ecm> https://voidlinux.org/packages/?arch=x86_64&q=mesa

21:02 <ecm> all seem to be 23.1.2

21:03 <karolherbst> yeah sure, but is this true for your system as well?

21:03 <ecm> I've updated to recent rn, so ig

21:03 <karolherbst> mesa-XvMC seems to be 22.2.4 at least

21:04 <karolherbst> is it installed for you?

21:04 <ecm> haven't installed that one

21:04 <karolherbst> okay

21:04 <karolherbst> then I get back to blaming zink

21:05 <ecm> is there some way I can run without checking for zink ?

21:05 <ecm> to test this

21:05 <karolherbst> maybe GALLIUM_DRIVER=nouveau

21:06 <ecm> nope, doesn't change anything

21:07 <karolherbst> maybe something odd on the EGL side, I don't know, might need some EGL experts here... it's a bit weird that GLX works though

21:08 <ecm> yeah, think it's an EGL-only problem

21:08 <ecm> is some lib or file missing that it's searching for ??

21:08 <karolherbst> shouldn't

21:10 <karolherbst> it's kinda weird, because at least the mpv strace seems to do nouveau calls just fine

21:10 <karolherbst> DRM_IOCTL_NOUVEAU_GEM_PUSHBUF => commands sent to the GPU

21:10 <karolherbst> DRM_IOCTL_NOUVEAU_GEM_CPU_PREP => waiting for the GPU to finish

21:14 <ecm> yes, nouveau works fine for everything

21:15 <karolherbst> do you know if it works in EGL applications besides eglinfo?

21:15 <ecm> no, all EGL apps fail like this

21:15 <ecm> firefox webrender, mpv, etc.

21:15 <karolherbst> mhhh

21:15 <karolherbst> but mpv is using nouveau

21:16 <ecm> wait I'll send eglinfo

21:16 <ecm> eglinfo: https://0x0.st/HTHU.txt

21:17 <karolherbst> is there a second GPU on your system by any chance?

21:18 <ecm> I have integrated graphics

21:18 <karolherbst> did you uninstall/disable the driver in any way?

21:19 <ecm> of the integrated graphics ?

21:19 <karolherbst> yeah

21:19 lemonzest has quit [Quit: WeeChat 3.6]

21:19 <karolherbst> I wouldn't be surprised if EGL could get confused if it fails to create a context on the integrated GPU

21:19 <karolherbst> eglinfo will try to do so on all GPUs

21:20 <karolherbst> and other EGL applications might run into the same thing then

21:23 <ecm> ok, I reinstalled the mesa-intel-dri to test this, will reboot

21:25 ecm` has joined #dri-devel

21:25 ecm has quit [Read error: Connection reset by peer]

21:25 <ecm`> nope, no change

21:26 lemonzest has joined #dri-devel

21:26 <ecm`> how could I have disabled intel graphics ? some config file or var ?

21:27 <karolherbst> does `DRI_PRIME=0 glxinfo` or `DRI_PRIME=1 glxinfo` change anything?

21:30 <ecm`> no

21:30 <ecm`> both are same

21:30 <karolherbst> both nouveau?

21:30 Duke`` has quit [Ping timeout: 480 seconds]

21:30 <karolherbst> what's the integrated GPU anyway? Or is it a nouveau only system and I misunderstood?

21:31 <ecm`> they show vendor: Mesa renderer: NVA8

21:31 <ecm`> wait should vendor be nouveau for nouveau ?

21:31 <karolherbst> nah, the renderer is the chipset name

21:32 krushia has quit [Ping timeout: 480 seconds]

21:32 <karolherbst> but anyway, is this a single GPU system with the nvidia one as an integrated GPU?

21:32 <karolherbst> or are there two?

21:32 <ecm`> the former

21:32 <karolherbst> ahh okay, my mistake then

21:33 Company has quit [Quit: Leaving]

21:34 <ecm`> there is one GPU NVS-3100m, and i5-520m with integrated graphics

21:34 <ecm`> it's a t510

21:35 <karolherbst> okay, so there are two GPUs

21:36 <karolherbst> `lspci | grep -e VGA -e 3D` should tell anyway

21:36 <ecm`> that just gives 01:00.0 VGA compatible controller: NVIDIA Corporation GT218M [NVS 3100M] (rev a2)

21:36 <ecm`> only the gpu

21:37 <karolherbst> mhhh

21:37 <ecm`> so my integrated graphics isn't working ?

21:37 <karolherbst> it should still be listed as a PCI device in either case

21:37 <karolherbst> maybe it's disabled in the firmware? dunno

21:37 avoidr has joined #dri-devel

21:37 <karolherbst> maybe lspci lists it as something else?

21:37 <karolherbst> mind sharing your `dmesg`?

21:38 <kisak> Ironlake, oof. Could have been disabled in the bios since Optimus of that era wasn't quite viable.

21:39 <karolherbst> yeah....

21:39 <ecm`> lspci: https://0x0.st/HTHC.txt

21:39 <ecm`> dmesg: https://0x0.st/HTXr.txt

21:40 <karolherbst> `[ 9.554213] intel ips 0000:00:1f.6: failed to get i915 symbols, graphics turbo disabled until i915 loads` oof

21:40 <karolherbst> what is even that

21:40 <ecm`> lol

21:40 <karolherbst> but in any case, I don't think this should cause any issues

21:40 <karolherbst> as the device is like not there... at least for now

21:41 <karolherbst> could change once you `modprobe i915`

21:41 <karolherbst> what's the output of `ls /sys/class/drm`?

21:42 <ecm`> nope modprobe i915 doesn't add anything to dmesg or EGL

21:43 <ecm`> card0 card0-DP-1 card0-DP-2 card0-DP-3 card0-LVDS-1 card0-VGA-1 renderD128 ttm version

21:43 <kisak> ecm`: can you check the bios for something like ... Config > Display > Switchable Graphics and if it exists, if it's set to something that would hinder you.

21:44 <ecm`> ok

21:45 <kisak> pulling random forum noise from a decade ago, so it's not really a trusted info source.

21:48 fab has quit [Quit: fab]

21:48 gouchi has quit [Remote host closed the connection]

21:48 ecm` has quit [Read error: Connection reset by peer]

21:49 bgs has quit [Remote host closed the connection]

21:51 ecm has joined #dri-devel

21:51 <ecm> no such setting lol, i guess it's too old for that

21:51 <ecm> BIOS has very few options

22:04 AndroUser2 has quit [Ping timeout: 480 seconds]

22:04 AndroUser2 has joined #dri-devel

22:05 idr has joined #dri-devel

22:14 AndroUser2 has quit [Remote host closed the connection]

22:15 <cmarcelo> karolherbst: for your comment about the struct of bools not having internal padding in CLC, jenatali pointed out in another context https://en.cppreference.com/w/cpp/types/has_unique_object_representations... seems to me you can stick a static_assert(std::has...<struct clc_...>()); in the cpp file to provide some compile time check for that. (or inside an ifdef cplusplus) which the cpp including it

22:15 <cmarcelo> will pick it up.
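
A minimal sketch of the compile-time check suggested above, assuming C++17 and a common ABI where `int` is 4-byte aligned. The struct names `packed_flags` and `padded_flags` are hypothetical; the real CLC struct name is elided in the log and is not reproduced here.

```cpp
#include <type_traits>

// Hypothetical stand-in for a struct of bools with no internal padding.
struct packed_flags {
    bool a;
    bool b;
    bool c;
};

// Hypothetical counter-example: on common ABIs the int member forces one
// byte of alignment padding after the three bools.
struct padded_flags {
    bool a;
    bool b;
    bool c;
    int  d;
};

// Passes: every byte of packed_flags belongs to a member.
static_assert(std::has_unique_object_representations_v<packed_flags>,
              "packed_flags must not contain padding");

// The trait reports the padding in padded_flags, so asserting the positive
// form on this struct would fail to compile.
static_assert(!std::has_unique_object_representations_v<padded_flags>,
              "padded_flags is expected to contain padding");
```

If the struct lives in a header shared with C, the assert can sit behind `#ifdef __cplusplus` so that only the C++ translation units including it evaluate the check.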

22:16 <karolherbst> ohhh.. interesting...

22:16 <karolherbst> let me play around with that then

22:16 <jenatali> Didn't I say that this morning when we were talking about it?

22:17 <jenatali> I was on my phone so I didn't have the exact name of the type trait, but I definitely said there was a type trait you could use to static assert it in C++

22:17 <karolherbst> ehh yeah.. but somehow my brain turned that into "check the size is known"

22:17 <karolherbst> yeah... I didn't think that far

22:21 AndroUser2 has joined #dri-devel

22:30 <karolherbst> seems to work just fine: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23685

22:32 iive has quit [Quit: They came for me...]

22:34 <jenatali> Did you test the negative by adding padding and making sure it breaks? I fully expect it to, just wondering

22:37 kts has joined #dri-devel

22:38 <cmarcelo> jenatali: I tested { bool; bool; bool; int; } and it barfed

22:38 <cmarcelo> s/barfed/correctly failed to compile/

22:39 <jenatali> Great :)

22:44 <airlied> gfxstrand: got any opinion how best to do VK_MESA extensions, since we'd probably want to prototype them in mesa before sending to vulkan, but they'd involve patching the vk registry/headers we have in mesa which seems like a bad maintenance plan

22:45 <airlied> but I also feel bad upstreaming things to vulkan if we aren't going to document them to a competent level

22:48 <gfxstrand> airlied: For what sort of thing?

22:48 <gfxstrand> We do internal extensions all the time

22:48 <gfxstrand> If it's going to be an actual published extension, though, it should go in the spec.

22:49 kts has quit [Ping timeout: 480 seconds]

22:50 <airlied> gfxstrand: VK_MESA_video_decode_av1 is the first one

22:50 * airlied doesn't feel nice cluttering up the spec :-P

22:50 <gfxstrand> Why not?

22:50 <gfxstrand> video is already spec clutter. :-P

22:50 <airlied> because when VK_KHR_video_decode_av1 comes along it'll get funky to tell the difference :-P

22:50 <airlied> but maybe there isn't a nicer way

22:51 <gfxstrand> Then why are we shipping a thing that's just going to be replaced by a subtly different thing?

22:51 <gfxstrand> And how is that any different from the EXT mess we already have?

22:51 benjamin1 has joined #dri-devel

22:51 <airlied> we are shipping a thing because it will actually be useful in the current lifetime of the universe

22:51 jfalempe_ has joined #dri-devel

22:51 <gfxstrand> lol

22:52 <gfxstrand> Then spec it

22:52 <gfxstrand> If the universe dies before the KHR one comes out, there won't be any spec clutter.

22:52 jfalempe has quit [Read error: Connection reset by peer]

22:57 benjaminl has quit [Ping timeout: 480 seconds]

23:01 JohnnyonFlame has quit [Read error: Connection reset by peer]

23:09 AndroUser2 has quit [Remote host closed the connection]

23:10 AndroUser2 has joined #dri-devel

23:14 smiles_1111 has joined #dri-devel

23:19 anujp has quit [Remote host closed the connection]

23:23 Kayden has quit [Quit: -> happy hour]

23:25 ngcortes has quit [Remote host closed the connection]

23:29 AndroUser2 has quit [Remote host closed the connection]

23:31 AndroUser2 has joined #dri-devel

23:36 ecm has left #dri-devel [ERC 5.4 (IRC client for GNU Emacs 28.2)]

23:40 vliaskov has quit [Remote host closed the connection]

23:40 idr has quit [Ping timeout: 480 seconds]

23:53 ecm has joined #dri-devel

23:57 <ecm> karolherbst: looking at the mesa-23.1.2 source, loader_get_driver_for_fd gives the error "failed to get driver name for fd -1"

23:58 <ecm> and loader_get_driver_for_fd is used in platform_x11.c with dri2_dpy->fd_render_gpu

23:59 <ecm> but there is a debug log just before that if the dri2_dpy->fd_render_gpu == -1, which never appears in all these errors