ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
<DemiMarie> Sorry for being as noisy as a badly tuned engine
lyudess has joined #dri-devel
benjamin1 has joined #dri-devel
Lyude has quit [Ping timeout: 480 seconds]
benjamin1 has quit [Ping timeout: 480 seconds]
sghuge has quit [Remote host closed the connection]
sghuge has joined #dri-devel
co1umbarius has joined #dri-devel
columbarius has quit [Ping timeout: 480 seconds]
columbarius has joined #dri-devel
co1umbarius has quit [Ping timeout: 480 seconds]
benjamin1 has joined #dri-devel
jewins has quit [Ping timeout: 480 seconds]
benjamin1 has quit [Ping timeout: 480 seconds]
ngcortes has quit [Ping timeout: 480 seconds]
benjamin1 has joined #dri-devel
kzd has quit [Quit: kzd]
gfxstrand has quit [Ping timeout: 480 seconds]
benjamin1 has quit [Ping timeout: 480 seconds]
kzd has joined #dri-devel
benjamin1 has joined #dri-devel
ngcortes has joined #dri-devel
benjamin1 has quit [Ping timeout: 480 seconds]
<DavidHeidelberg[m]> MrCooper: mupuf: eric_engestrom anholt & others CI interested people: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23629#note_1959074 I did real testing, it seems to work correctly, but I'll be happy for another wave of reviews or acking
<DavidHeidelberg[m]> Now when we enable one farm, only jobs on that farm get tested.
<DavidHeidelberg[m]> If we enable two farms, only jobs on those two farms get tested; all other jobs will be skipped. If we disable a farm, nothing except the basic build will be tested (and that should go away in the future too, but needs more "playing")
<DavidHeidelberg[m]> also ^ gallo if you find time to say final verdict over the MR :D
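The farm rule described above can be sketched as a small job filter. This is only an illustration, not the logic from the MR: the job names, farm names, the `select_jobs` helper and the "enable"/"disable" encoding are all hypothetical, and it assumes the basic build jobs always run.

```python
def select_jobs(jobs, farm_changes):
    """Pick the jobs to run for an MR that toggles CI farms.

    jobs: list of (job_name, farm) pairs; farm is None for a basic build job
          that does not depend on any farm (assumption).
    farm_changes: dict mapping farm name -> "enable" or "disable".
    """
    # Only farms that the MR enables contribute hardware jobs.
    enabled = {farm for farm, action in farm_changes.items() if action == "enable"}
    selected = []
    for name, farm in jobs:
        # Basic build jobs always run; farm jobs run only on enabled farms.
        if farm is None or farm in enabled:
            selected.append(name)
    return selected


# Hypothetical job list for illustration only.
JOBS = [
    ("basic-build", None),
    ("gl-test-a", "farm-a"),
    ("vk-test-b", "farm-b"),
    ("gl-test-c", "farm-c"),
]
```

With this sketch, enabling `farm-a` and `farm-b` selects the build plus the two matching jobs, while disabling a farm leaves only `basic-build`.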
bbrezill1 has joined #dri-devel
bbrezillon has quit [Ping timeout: 480 seconds]
yuq825 has joined #dri-devel
benjamin1 has joined #dri-devel
benjaminl has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
benjamin1 has quit [Ping timeout: 480 seconds]
<mareko> when I clone piglit, the default branch is master, how to fix that?
<mareko> nevermind, I cloned the wrong repo
mbrost has joined #dri-devel
Company has joined #dri-devel
heat has quit [Read error: No route to host]
heat has joined #dri-devel
heat has quit [Remote host closed the connection]
heat has joined #dri-devel
macromorgan has joined #dri-devel
Daanct12 has joined #dri-devel
Guest2769 has quit [Ping timeout: 480 seconds]
Danct12 has joined #dri-devel
Daanct12 has quit [Ping timeout: 480 seconds]
ngcortes has quit [Read error: Connection reset by peer]
TMM has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
TMM has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
bmodem has joined #dri-devel
heat has quit [Ping timeout: 480 seconds]
kzd has quit [Ping timeout: 480 seconds]
Dr_Who has quit [Read error: Connection reset by peer]
bmodem1 has joined #dri-devel
bmodem has quit [Read error: Connection reset by peer]
Dr_Who has joined #dri-devel
fab has joined #dri-devel
sima has joined #dri-devel
Leopold has joined #dri-devel
Leopold has quit []
Leopold_ has joined #dri-devel
fab has quit [Quit: fab]
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
bmodem has joined #dri-devel
bmodem1 has quit [Ping timeout: 480 seconds]
rasterman has joined #dri-devel
sghuge has quit [Remote host closed the connection]
sghuge has joined #dri-devel
Leopold_ has quit []
Leopold_ has joined #dri-devel
fab has joined #dri-devel
Guest2279 has quit [Read error: Connection reset by peer]
peelz has joined #dri-devel
peelz is now known as Guest3145
Guest3145 has quit [Read error: Connection reset by peer]
notpeelz has joined #dri-devel
YuGiOhJCJ has joined #dri-devel
jkrzyszt has joined #dri-devel
Danct12 is now known as Guest3147
Danct12 has joined #dri-devel
<MrCooper> karolherbst: should be easy to use -Woverloaded-virtual=1 instead of -Wno-error=overloaded-virtual in CI at least
Leopold_ has quit []
<MrCooper> BTW, there actually is a libc++ (from LLVM), it's not the same as libstdc++ (from GCC) though
rasterman has quit [Quit: Gettin' stinky!]
Leopold has joined #dri-devel
bmodem1 has joined #dri-devel
bmodem has quit []
lynxeye has joined #dri-devel
swalker_ has joined #dri-devel
tzimmermann has joined #dri-devel
swalker_ is now known as Guest3151
swalker__ has joined #dri-devel
tursulin has joined #dri-devel
pochu has joined #dri-devel
Guest3151 has quit [Ping timeout: 480 seconds]
vliaskov has joined #dri-devel
<karolherbst> MrCooper: yeah.. but our code seems to also trigger at =1.. anyway, I've opened an MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23656
<MrCooper> I agree with gerddie, which is actually an argument against switching to -Woverloaded-virtual=1 as well
<karolherbst> yeah, sadly fixing it in codegen is a bit more of a rewrite, it's a bit pointless because there is nothing wrong with the code as is, just... uhhh
<MrCooper> mainly that we can ignore warnings for CI, we should not hide them though
<karolherbst> sure, my point is rather that in nouveau's instance of that warning it's a non issue, but anyway... I'll think of something, probably just renaming the method and call it a day
AndroUser2 has joined #dri-devel
AndroUser2 has quit [Read error: Connection reset by peer]
AndroUser2 has joined #dri-devel
AndroUser2 has quit [Ping timeout: 480 seconds]
JohnnyonFlame has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
Leopold has quit []
Leopold_ has joined #dri-devel
mavchatz has quit []
mvchtz has joined #dri-devel
heat has joined #dri-devel
rasterman has joined #dri-devel
Leopold_ has quit [Remote host closed the connection]
AndroUser2 has joined #dri-devel
Leopold has joined #dri-devel
djbw_ has quit [Remote host closed the connection]
jfalempe has quit [Ping timeout: 480 seconds]
swalker__ has quit [Remote host closed the connection]
swalker__ has joined #dri-devel
swalker__ has quit [Remote host closed the connection]
bmodem1 has quit [Ping timeout: 480 seconds]
pochu_ has joined #dri-devel
pochu has quit [Ping timeout: 480 seconds]
jfalempe has joined #dri-devel
bmodem has joined #dri-devel
<dolphin> airlied, sima: Anything blocking drm-intel-next merge or just time?
<dolphin> jani is on vacation and asked me to look after the merge, thus pinging
<dolphin> no drm-intel-fixes this week (just one unused variable fix and one gen3 color fix, they can get in via -next)
fab has quit [Ping timeout: 480 seconds]
Arsen has quit [Quit: Quit.]
Arsen has joined #dri-devel
leonardo has joined #dri-devel
leonardo is now known as DottorLeo
DottorLeo has quit [Quit: Konversation terminated!]
DottorLeo has joined #dri-devel
<DottorLeo> hi!
<DottorLeo> hakzsam: i've seen your patch for IB on GFX6, can it bring better performance on a SI card? :)
<DottorLeo> mareko: https://gitlab.freedesktop.org/mareko/mesa/-/commit/10c889df309efeeef20a27d04c2637bcb76f8dea talks about GFX7, but shouldn't it fix GFX6 or affect both generations?
<mlankhorst> drm-tip is broken, fixing up now
Dr_Who has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
Danct12 has quit [Quit: WeeChat 3.8]
alyssa has joined #dri-devel
idr has quit [Ping timeout: 480 seconds]
<hakzsam> DottorLeo: not really
alyssa has left #dri-devel [#dri-devel]
<pendingchaos> am I correct in thinking that nir_loop_analyze doesn't actually use the loop invariance information?
<pendingchaos> it seems to be using it to remove invariant variables from process_list, but later only phis in that list are visited, and phis are never invariant
DottorLeo has quit [Quit: Konversation terminated!]
<karolherbst> jenatali: how is `clc.h` used inside MS's CL code? Can I include "util/macros.h" there for `PACKED`? I want to enforce that clc_optional_features is packed, because I have to use it for hashing. Or do you think we can be trusted to not mess it up and ensure that struct is always packed anyway?
<jenatali> karolherbst: We just copy/paste it into our external project. I'd prefer to keep it standalone if possible but if it needs to grow more dependencies I can live with it
<karolherbst> yeah.. I mean, in this case it doesn't feel worth the effort as clc_optional_features is just an array of bools basically, but I have to hash it into my key and kinda don't want anybody to mess with it in a wrong way
<jenatali> Does PACKED even work on MSVC?
<karolherbst> maybe I just leave a comment and hope for the best?
<karolherbst> huh...
<karolherbst> I don't think so
<jenatali> I know we have #pragma pack but I don't know if that works on inline attribute-style stuff
<jenatali> I'd just throw in a static assert on the size of the type
<karolherbst> mhhh yeah, probably good enough
<karolherbst> but also annoying
<karolherbst> the only thing relevant here is not having any padding
<jenatali> If you have C++ there's a type trait for that
<jenatali> Which, the header ends up getting included in C++ for our tests
iive has joined #dri-devel
<karolherbst> mhhhhhh
<karolherbst> I just leave a comment
<jenatali> Ok
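The "no padding" property discussed above can be checked mechanically: a struct is free of padding when its total size equals the sum of its field sizes, which is also what a static assert on the size would enforce in C. A minimal sketch using Python's `ctypes` as a stand-in for the C struct; the field names and both struct types here are hypothetical, not the real `clc_optional_features` layout.

```python
import ctypes


def has_padding(struct_type):
    """Return True if the struct's size exceeds the sum of its field sizes."""
    field_bytes = sum(ctypes.sizeof(ftype) for _name, ftype in struct_type._fields_)
    return ctypes.sizeof(struct_type) != field_bytes


class OptionalFeatures(ctypes.Structure):
    # Hypothetical stand-in: a struct that is "basically an array of bools".
    _fields_ = [
        ("fp16", ctypes.c_bool),
        ("fp64", ctypes.c_bool),
        ("int64", ctypes.c_bool),
        ("images", ctypes.c_bool),
    ]


class PaddedExample(ctypes.Structure):
    # A bool followed by a 32-bit integer: on common ABIs the compiler
    # inserts padding bytes after the bool to align the integer.
    _fields_ = [
        ("flag", ctypes.c_bool),
        ("count", ctypes.c_uint32),
    ]
```

An all-bool struct passes the check; adding a wider field later is exactly the kind of change that would silently introduce padding and make hashing the raw bytes unreliable.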
jewins has joined #dri-devel
Dark-Show has joined #dri-devel
nchery has quit [Ping timeout: 480 seconds]
nchery has joined #dri-devel
alyssa has joined #dri-devel
<alyssa> Do we have any optimizations for `ivec2(gl_FragCoord.xy)` or `floor(gl_FragCoord.xy)`?
<alyssa> grepping shader-db, that seems to come up in a bunch of places
<alyssa> For some meta shader stuff I added a load_pixel_coord, returning an ivec2
kzd has joined #dri-devel
<alyssa> this is what both AGX and Mali Bifrost+ implement natively, with load_frag_coord.xy lowered in the backend to float(load_pixel_coord) + 0.5
fab has joined #dri-devel
<alyssa> so wondering if this is something other hardware has too, and if anyone else would care for nir_opt_intrinsics to optimize `f2i(load_frag_coord.xy) -> load_pixel_coord` and `floor(load_frag_coord.xy) -> float(load_pixel_coord)`, saving the add
<alyssa> or... maybe that's not even necessary?
<alyssa> Maybe if the load_frag_coord -> load_pixel_coord lowering happens early, then opt_algebraic would chew through the ALU for free?
YuGiOhJCJ has quit [Quit: YuGiOhJCJ]
<alyssa> looks like we would need to add a few rules to nir_opt_algebraic but that's easier than opt_intrinsics
<alyssa> looks like ir3 would potentially benefit
yuq825 has left #dri-devel [#dri-devel]
<alyssa> yeah, it'd be ir3 + agx + bifrost
<alyssa> might still be worth it just to share some lowering
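The two rewrites proposed above rest on simple arithmetic: if `load_frag_coord.xy` is `float(load_pixel_coord) + 0.5`, then truncating or flooring it recovers the integer pixel coordinate. A small sketch that checks this for non-negative pixel coordinates, assuming fragment coordinates sit at pixel centers; the Python helper names are illustrative and are not NIR API.

```python
import math


def load_frag_coord_xy(pixel_coord):
    # Backend lowering described in the discussion: float(load_pixel_coord) + 0.5
    x, y = pixel_coord
    return (float(x) + 0.5, float(y) + 0.5)


def f2i(vec):
    # Float-to-int conversion truncates toward zero.
    return tuple(int(c) for c in vec)


def floor_vec(vec):
    # Component-wise floor, returned as floats like the GLSL floor() builtin.
    return tuple(float(math.floor(c)) for c in vec)


def rewrites_hold(width, height):
    """Check both rewrites for every pixel of a width x height framebuffer."""
    for y in range(height):
        for x in range(width):
            frag = load_frag_coord_xy((x, y))
            # f2i(load_frag_coord.xy) -> load_pixel_coord
            if f2i(frag) != (x, y):
                return False
            # floor(load_frag_coord.xy) -> float(load_pixel_coord)
            if floor_vec(frag) != (float(x), float(y)):
                return False
    return True
```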
AndroUser2 has quit [Read error: Connection reset by peer]
AndroUser2 has joined #dri-devel
greaser|q has joined #dri-devel
greaser|q has quit []
fxkamd has joined #dri-devel
fab has quit [Ping timeout: 480 seconds]
greaser|q has joined #dri-devel
Jakdaw has quit [Quit: Leaving]
tzimmermann has quit [Quit: Leaving]
anholt_ has joined #dri-devel
anholt has quit [Ping timeout: 480 seconds]
fab has joined #dri-devel
kxkamil2 has quit []
pochu_ has quit []
Duke`` has joined #dri-devel
Dark-Show has quit [Quit: Leaving]
bmodem has quit [Ping timeout: 480 seconds]
bbrezill1 has quit []
bbrezillon has joined #dri-devel
smiles_ has quit [Remote host closed the connection]
jkrzyszt has quit [Ping timeout: 480 seconds]
sauce has quit []
kxkamil has joined #dri-devel
JohnnyonFlame has joined #dri-devel
sauce has joined #dri-devel
lynxeye has quit [Quit: Leaving.]
AndroUser2 has quit [Ping timeout: 480 seconds]
benjaminl has quit [Quit: WeeChat 3.8]
benjaminl has joined #dri-devel
heat_ has joined #dri-devel
gfxstrand has joined #dri-devel
heat has quit [Read error: Connection reset by peer]
mbrost has joined #dri-devel
nchery has quit [Remote host closed the connection]
jewins has quit [Remote host closed the connection]
jewins1 has joined #dri-devel
djbw_ has joined #dri-devel
nchery has joined #dri-devel
jewins1 has quit [Ping timeout: 480 seconds]
ngcortes has joined #dri-devel
AndroUser2 has joined #dri-devel
AndroUser2 has quit [Read error: Connection reset by peer]
AndroUser2 has joined #dri-devel
jewins has joined #dri-devel
AndroUser2 has quit [Remote host closed the connection]
AndroUser2 has joined #dri-devel
sima has quit [Ping timeout: 480 seconds]
ngcortes has quit [Read error: Connection reset by peer]
<jenatali> gfxstrand: I'm looking at hooking up D3D's (recently revised) render pass descriptions in Dozen, which aim to take advantage of subpass dependency info for TBDR (*cough* QC *cough*). I only need a very small subset of additional info on top of what dynamic rendering gives me. In your opinion, should I be trying to extend the render pass -> dynamic rendering stuff in core Vulkan, or should I just implement my own render pass state machine in Dozen?
<alyssa> jenatali: I would love a way to take advantage of subpass deps with dynamic rendering plus
<alyssa> Figuring out exactly how to extend the runtime to plumb that through is still an open problem
<jenatali> Yeah, I see the TODO
<alyssa> What does this look like in D3D?
<alyssa> What do you envision the runtime doing for Dozen?
<gfxstrand> If we could do a runtime thing, that'd be cool.
<gfxstrand> IDK how practical it is, though.
<alyssa> cool assuming the thing that Dozen wants is remotely similar to what the actual hw drivers want ;)
TMM has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
TMM has joined #dri-devel
<alyssa> which is why I'm asking since I didn't know D3D had subpasses
<jenatali> Additional load/store ops. When ending rendering to an attachment that will be used in a future subpass, there's additional store ops indicating how it'll be used in the future (color attachment, input attachment, storage image)
<jenatali> And then when beginning the next one, you specify the same modality as a load op
<alyssa> Interesting
<alyssa> Do you have a link I can read about this?
<jenatali> Essentially a "don't flush this from tile memory yet" option
<alyssa> oh it's chunky ok
<jenatali> Yeah, our normal APIs for non-TBDR is just "bind color attachments" and "bind depth stencil attachment"
<gfxstrand> You know, with Microsoft implementing Vulkan on D3D12 and Valve implementing D3D12 on Vulkan, it almost makes you wish we had some sort of IDK... working group? where we could all get together and discuss stuff so we could have just one API. 🤔
<jenatali> The render pass stuff we added several years back looks like dynamic rendering today
<alyssa> gfxstrand: mood
Kayden has quit [Quit: -> JF]
<alyssa> Apple: just watch me
<jenatali> Heh
<jenatali> So you're saying that we should all switch to OneAPI? That's what I heard
<alyssa> Intel: 🤡
<gfxstrand> jenatali: hehe
<gfxstrand> jenatali: But which One of the APIs?
<alyssa> openvg++
<jenatali> Yeah
<Sachiel> obviously a new one that will fix all the issues
<alyssa> jenatali: ok, I do not have sufficient bandwidth for that .md to soak in today
<gfxstrand> Come on, folks... There's an even subtler jab in there that y'all're missing. 😅
* alyssa punches Faith, subtly
<alyssa> that jab?
<alyssa> :-p
<gfxstrand> No, not that jab
<alyssa> Oh
* alyssa retracts punch
<jenatali> Apparently too subtle for me
<alyssa> same
<gfxstrand> Intel's OneAPI is more like a OneOfEachAPI
<alyssa> ahaha
<jenatali> Yeah... I'd noticed that
<gfxstrand> Do you have any idea how many APIs are in OneAPI? It's a lot.
<alyssa> jenatali: Anyway, if you can figure out a reasonable way to augment the common VkRenderPass to express subpass deps, that'd be really awesome
<gfxstrand> I lost count a long time ago
<alyssa> It's just never been clear to me exactly what that would look like
<jenatali> alyssa: Unfortunately I don't necessarily know exactly what information all the other drivers would need
<gfxstrand> Hopefully quite a lot like D3D12 render passes and/or whatever dynamic input attachments thing Khronos cooks up in $FUTURE_DATE
<jenatali> It's hard for me to guess what's going too far and throwing in the kitchen sink vs being too conservative and not being useful for anyone else
<alyssa> I can speak to both agxv and panvk for whatever it's worth
<jenatali> So what I'd want on top of dynamic rendering is a list of input attachments, and a list of preserve attachments, where the preserve also indicates what it's being preserved for, i.e. what's the next use going to be
<jenatali> I don't need to know when or really anything else
<alyssa> OK.. I'm trying to figure out exactly how that'd map to hw
<alyssa> Is the idea that each subpass would be a separate begin/endrendering, but with that extra info to let the driver coalesce them?
<jenatali> Yep
<Sachiel> like the suspend/resume flags, but actually useful
<alyssa> Alright
<jenatali> We also have suspend/resume for continuing the same exact kind of rendering, but yeah
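The extra information requested above (a list of input attachments plus a list of preserved attachments tagged with their next use) can be pictured as a small record attached to each begin/end rendering pair. A rough sketch under stated assumptions: `RenderingInfo`, its fields, the next-use strings and `can_coalesce` are all hypothetical and do not correspond to any existing Vulkan or Mesa runtime structure.

```python
from dataclasses import dataclass, field


@dataclass
class RenderingInfo:
    """Hypothetical per-pass info layered on top of dynamic rendering."""
    color_attachments: list
    # Attachments this pass reads as input attachments.
    input_attachments: list = field(default_factory=list)
    # Attachment name -> how it will be used next: "input", "color" or "storage".
    preserve: dict = field(default_factory=dict)


def can_coalesce(prev, nxt):
    """True if every input attachment of `nxt` was preserved by `prev` as an input.

    A driver could take this as a hint that the two passes may be kept in
    tile memory together instead of flushing in between.
    """
    if not nxt.input_attachments:
        return False
    return all(prev.preserve.get(name) == "input" for name in nxt.input_attachments)


# Hypothetical two-pass example: a G-buffer pass feeding a lighting pass.
gbuffer = RenderingInfo(color_attachments=["albedo", "normal"],
                        preserve={"albedo": "input", "normal": "input"})
lighting = RenderingInfo(color_attachments=["hdr"],
                         input_attachments=["albedo", "normal"])
```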
<alyssa> jenatali: So, Mali and AGX are both closer to Apple's imageblock model than Vulkan's render passes
<jenatali> Unfortunately that means nothing to me. I need to learn Metal one of these days...
<alyssa> Nah you're not missing out
<alyssa> here's a simplified programming model for AGX:
<alyssa> - tilebuffer is untyped on-chip memory (the same as GL shared memory)
<alyssa> - an instruction to store to shared ("tilebuffer") memory at a given offset with a given format implicitly indexed by the pixel coordinate
<alyssa> - " but load
<alyssa> - an instruction to blit a block of shared memory (at a particular offset with a particular format) to a storage image
<alyssa> and that's it
<alyssa> notice there are no subpasses, renderpasses, render targets, color buffers, .. etc
<alyssa> How do you actually program that?
<alyssa> For the regular (single-subpass) case, it's pretty straightforward:
<jenatali> Yeah that sounds fun :P
<alyssa> 1. Fragment shaders store their output with something like "local store pixel <colour>, offset = 0, format = rgba8"
<alyssa> 2. fbfetch is similarly "local load pixel <colour>, offset = 0, format = rgba8"
<alyssa> 3. At the end of the frame, we insert an "end-of-tile" program that runs once per tile and contains ~1 instruction/render target "blit block at offset 0 of format rgba8 to image 0" and we bind each colour buffer as a corresponding image
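The single-subpass flow in the three steps above can be modelled in a few lines: per-pixel untyped tile memory, a store and a load at a byte offset, and an end-of-tile blit into an image. This is a toy model for illustration only; the tile size, the bytes-per-pixel budget and the `Tile` class are assumptions, and only an rgba8 format is handled.

```python
TILE_W, TILE_H = 4, 4      # hypothetical tile dimensions
BYTES_PER_PIXEL = 16       # hypothetical tilebuffer budget per pixel


class Tile:
    def __init__(self):
        # Untyped on-chip memory, implicitly indexed by pixel coordinate.
        self.mem = {(x, y): bytearray(BYTES_PER_PIXEL)
                    for y in range(TILE_H) for x in range(TILE_W)}

    def store_pixel(self, xy, offset, rgba):
        # "local store pixel <colour>, offset, format = rgba8"
        self.mem[xy][offset:offset + 4] = bytes(rgba)

    def load_pixel(self, xy, offset):
        # fbfetch: "local load pixel <colour>, offset, format = rgba8"
        return tuple(self.mem[xy][offset:offset + 4])

    def blit_block(self, offset, image, origin):
        # End-of-tile program: copy the rgba8 block at `offset` to the image.
        ox, oy = origin
        for (x, y), pixel_bytes in self.mem.items():
            image[(ox + x, oy + y)] = tuple(pixel_bytes[offset:offset + 4])
```

A fragment shader writing red to pixel (1, 2) of a tile whose origin is (8, 8) ends up at image coordinate (9, 10) once the end-of-tile blit runs.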
<alyssa> How do subpasses fit in?
tursulin has quit [Ping timeout: 480 seconds]
<alyssa> Well, Apple doesn't implement them so this is all based on the Vulkan driver I've written in my head and might be wrong ;-p
<alyssa> But -- ideally, all the subpasses are concatenated together
<alyssa> first all the draws of subpass 0, then all the draws of subpass 1, etc
<alyssa> Within each subpass, fragment shaders can load from input attachments (at one set of offsets) and store to outputs (at different offsets)
<jenatali> Yep
<alyssa> Notably the layout of the tilebuffer can change from draw-to-draw (with some careful barriers), so that input attachments that aren't used anymore don't need to stay around and we can fit more in the tilebuffer
<alyssa> And then finally, at the end of the frame we have our end-of-tile program that stores out just the parts that actually have STORE_OP_STORE, at whatever set of offsets we had for the final layout at the end of the render pass
<jenatali> Yep, that all sounds right
<alyssa> Mali's programming model isn't *quite* as loose, since there's a bit more fixed function hardware, but it's the same idea
<alyssa> All of the subpasses' draws are concatenated into one big batch, the hardware "render targets" are just the ones that are STOREd at the very end, in the middle the LD_TILE/ST_TILE instructions take driver-controlled offsets and formats to make things work according to the (possibly evolving) layout
<alyssa> Where this gets hard is when there isn't enough space in the tilebuffer to accommodate the whole render pass at once
<alyssa> in that case, some attachments need to spill to memory
<jenatali> Yeah. The one thing I'm not super familiar with is how TBDRs are expected to compute a tile layout given the model that we currently have, but what we ended up with was the compromise between "app developers will actually use this" vs "gives enough information that a driver can compute a layout"
<alyssa> or we need to split up the render pass
<alyssa> or something like that
<alyssa> right, this is one of the sticky points
<alyssa> for best perf the driver has to do something like register allocation
<alyssa> look at each attachment, look which subpasses it's "live" for, allocate offsets for each subpass allowing overlap for subpasses that don't "interfere" with each other
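The register-allocation analogy above can be made concrete with a first-fit allocator over live ranges: each attachment is live from its first to its last subpass, and two attachments may share tilebuffer bytes only if their ranges do not overlap. A minimal sketch, not taken from any driver; the `Attachment` record, the sizes and the capacity values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attachment:
    name: str
    size: int    # bytes per pixel in the tilebuffer
    first: int   # first subpass that uses it
    last: int    # last subpass that uses it


def interferes(a, b):
    # Two attachments interfere if their live ranges share any subpass.
    return a.first <= b.last and b.first <= a.last


def allocate(attachments, capacity):
    """Return {name: offset}, or None if the pass does not fit (spill/split needed)."""
    placed = []   # (attachment, offset) pairs already assigned
    layout = {}
    for att in sorted(attachments, key=lambda a: (a.first, a.name)):
        # Byte ranges occupied by attachments that are live at the same time.
        busy = sorted((off, off + other.size)
                      for other, off in placed if interferes(att, other))
        offset = 0
        for start, end in busy:
            if offset + att.size <= start:
                break                 # found a gap large enough
            offset = max(offset, end)
        if offset + att.size > capacity:
            return None
        placed.append((att, offset))
        layout[att.name] = offset
    return layout


# Hypothetical deferred-shading style pass with three subpasses.
EXAMPLE = [
    Attachment("albedo", 4, 0, 1),
    Attachment("normal", 4, 0, 1),
    Attachment("depth", 4, 0, 2),
    Attachment("hdr", 8, 1, 2),
    Attachment("final", 4, 2, 2),
]
```

In this example `final` reuses the bytes of `albedo`, which is dead by subpass 2, so the pass needs 20 bytes per pixel rather than 24. When the capacity is too small the sketch simply reports failure, which is the point where a real driver would have to spill or split the pass.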
<jenatali> My guess is that the driver would do some kind of 2-pass approach over the commands to construct a more holistic view of the pass before actually building the layout, but this is really black-box to me
<alyssa> right. so one of the open questions is whether THAT code could be shared if we had common render passes
<alyssa> I know VkRenderPass + monolithic pipelines is designed to have enough information for this to work (TBDRs had a big say in Vulkan 1.0)
<alyssa> I don't know if it's still possible to do with dynamic rendering plus, especially if you throw pipelines out
Kayden has joined #dri-devel
<jenatali> Yeah, makes sense. That's a level of detail more than what I need, like I was saying, all I currently need is, for the current subpass, which attachments are inherited and which will be inherited in the future
<alyssa> Yeah, makes sense
<alyssa> Qualcomm's blob is presumably doing all those heroics underneath
<jenatali> Which, obviously having more information is fine, I can just ignore it, as long as I'm able to get those bits from whatever we end up with
<alyssa> Sure
<jenatali> So, the question is, how far do I go right now? :P
<alyssa> heh
<alyssa> The other question I have is whether D3D12 style render passes are sufficient to implement subpass merging on Mali/PowerVR/AGX/VideoCore
<alyssa> The true TBDRs
<jenatali> Is Adreno that different?
<alyssa> Yes, Adreno (uniquely) can behave as either a TBDR or an IMR
<alyssa> ("gmem" vs "sysmem")
<alyssa> which has substantial implications for how this stuff gets implemented in their driver
<jenatali> Huh. Didn't realize that, but it makes sense. This whole thing is about allowing them to behave as TBDR though
<alyssa> (For ex, on Adreno to read from a subpass input attachment you... still use a texture instruction with a magic texture descriptor that loads from gmem instead of system memory. IIRC there's an ugly point somewhere in Vulkan that exists to accommodate that.)
<alyssa> (This is very different than Mali/AGX where a tilebuffer load is a completely different instruction than texturing)
<jenatali> Probably the fact that input attachment descriptors have to exist in descriptor sets
<alyssa> At any rate my question is, given the information your thing would give, where would I compute tilebuffer layouts?
<alyssa> and how would I spill?
anujp has quit [Remote host closed the connection]
<jenatali> Right, I don't see how you would
<jenatali> For rendering that was actually done with a VK1.0 render pass, you'd want all the info in that pass to compute the tile layouts
<jenatali> In a hypothetical future where there's a dynamic rendering extension that provides this type of attachment inheritance info, you'd probably want to do the same heroics that the Windows QC driver is doing to reconstruct it
<alyssa> I guess that's my other question, how is QC's D3D12 driver doing it?
<jenatali> But IMO it doesn't really make sense to take all of the info from a VK1.0 pass, throw it out and strip it down to dynamic rendering, and then make the driver reconstruct it just because they might also have to deal with constructing it out of nothing sometimes
<alyssa> Yeah, valid
<jenatali> 🤷‍♂️ I wasn't really involved in the design of this feature
<jenatali> Outside of "is there enough info in Vulkan to map to this" :P
<alyssa> Heh
<alyssa> FWIW, the actual plan for AGXV is to do dynamic rendering only (and not get the win for VK 1.0 render passes) on the assumption that virtually no content we care about uses subpass merging
<jenatali> Yeah that makes a lot of sense
<jenatali> Except maybe some benchmarks which we apparently care about
<alyssa> and that number will decrease even further if someday Vulkan gains a dynamic rendering approach to subpasses
<alyssa> which would likely be much much much easier to implement in driver (since the tilebuffer layout becomes the app's problem)
<alyssa> (This is Metal's approach, basically.)
<jenatali> Makes sense
<alyssa> Metal's imageblocks map basically directly to the hardware model I explained
<jenatali> So, back to my original question: is it worth trying to add limited subpass dependency info on top of dynamic rendering in the Vulkan runtime code?
<alyssa> I don't know.
<alyssa> If it isn't invasive I wouldn't nak it, and i don't think gfxstrand would
<alyssa> But realistically I'm unsure if any other drivers would be able to use it, except maybe turnip
<alyssa> cwabbott: flto: danylo: questions for you^^
<jenatali> I guess, I could, while processing dynamic rendering begin, just look at the command buffer render pass state and reconstruct it in my driver... that might be simpler
<jenatali> Without doing the full render pass state machine myself or trying to guess what other drivers might want
<jenatali> I'll start with that
gouchi has joined #dri-devel
macc24_ has quit [Remote host closed the connection]
lyudess has quit []
Lyude has joined #dri-devel
<danylo> So you want to stitch together dynamic renderpasses if attachment layout allows it? Are there real world examples where there are big renderpasses that are worth stitching together?
<danylo> Our code in Turnip for dynamic RPs is already cursed...
<alyssa> danylo: turnip implements vkrenderpass itself and then dynamic on top of RP, right?
<alyssa> we're talking about flipping that, implementing dynamic rendering + extra info of some kind and then common VkRenderPass
<danylo> Not really on top
<danylo> Ok, probably on top =)
<alyssa> ed125e6cca1 ("tu: Initial support for dynamic rendering")
<alyssa> unless that got changed later?
<alyssa> I haven't followed super close
gildekel has joined #dri-devel
<danylo> Ugh, I guess that's not the topic I'm ready to discuss during the vacation
<alyssa> Oh! I didn't realize you were on holiday
* alyssa zips lips
<alyssa> sorry!
<danylo> I mean, I could answer random questions, but dynamic rp ones are headache inducing =)
<alyssa> understood :p
<airlied> dolphin: just me, should merge today
<alyssa> today's project has been teaching AGX how to be an immediate mode renderer because the tilebuffer is too small to pass that one dEQP :~)
<gfxstrand> hehe
<gfxstrand> I keep typing more spilling code
<gfxstrand> There seems to be a lot of it.
<gfxstrand> I might need to spill my spilling code...
Leopold has quit []
<alyssa> real
<alyssa> gfxstrand: any chance I could interest you in writing an AGX rust compiler after you're done with NAK
<alyssa> :p
<jenatali> Hm. There's no indication in a render pass whether an attachment that's used in a single subpass as both depth/stencil and an input attachment is going to be read-only from the depth bind point?
<jenatali> That's... surprising
Leopold_ has joined #dri-devel
<gfxstrand> jenatali: That's annoyingly tricky. I think there's an image layout that lets you do it.
<jenatali> Yeah probably. Right now the common subpass attachment translation just smashes it to feedback loop :)
<gfxstrand> Actually, it's just GENERAL isn't it?
<gfxstrand> Yeah, that case is stupid-annoying to optimize
<gfxstrand> Because answering the question "Is it written?" is non-trivial.
<jenatali> Yeah it depends on which pipelines will be used
<jenatali> Which is why we put that info upfront in our render pass description...
<jenatali> I'm just really surprised there's not some kind of flag on the depth/stencil attachment reference indicating that it's a read-only reference. Trying to reverse engineer that instead of just being upfront about it seems weird
<gfxstrand> Right
<gfxstrand> jenatali: Reading docs, it looks like you can use VK_IMAGE_LAYOUT_DEPTH_STENCIL_READ_ONLY_OPTIMAL with input attachments.
rasterman has quit [Quit: Gettin' stinky!]
<jenatali> Ok, so it seems like some checks need to be added in vk_common_CreateRenderPass2 to look at the layouts before smashing them to feedback loops
<gfxstrand> Yeah
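The check suggested above amounts to looking at the layouts before treating a combined depth/stencil plus input attachment as a feedback loop. A hedged sketch of that decision, not the logic in `vk_common_CreateRenderPass2`: layouts are plain strings here, `classify_ds_input_use` is a hypothetical helper, and only the read-only layout named in the discussion is treated as read-only.

```python
# Layouts under which the depth/stencil bind point cannot write the image.
# Only the layout mentioned in the discussion is listed; a real check
# would likely need to cover more layouts.
READ_ONLY_DS_LAYOUTS = {"VK_IMAGE_LAYOUT_DEPTH_STENCIL_READ_ONLY_OPTIMAL"}


def classify_ds_input_use(ds_layout, input_layout):
    """Classify an attachment used as both depth/stencil and input in one subpass."""
    if ds_layout in READ_ONLY_DS_LAYOUTS and input_layout in READ_ONLY_DS_LAYOUTS:
        # Nothing can write the image, so reading it as an input is safe.
        return "read_only"
    # Otherwise be conservative: the image may be written while it is read.
    return "feedback_loop"
```

With `VK_IMAGE_LAYOUT_GENERAL` on either reference the sketch stays conservative, which matches the point that answering "is it written?" is non-trivial in that case.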
<alyssa> I'm afraid to ask but what are the requirements around feedback loops in Vulkan?
<alyssa> Feedback loops on our hardware are.... problematic
<jenatali> Yeah D3D doesn't support them
<jenatali> Looks like the requirements are either explicit memory barriers annotating a write->read point in the loop, or else fragment ordering flags somewhere else to ensure proper ordering
<jenatali> Hence why I care about this, read-only depth/stencil + input attachment is a totally valid use case, feedback loops much less so
<alyssa> nod
anujp has joined #dri-devel
heat_ has quit [Remote host closed the connection]
heat has joined #dri-devel
ecm has joined #dri-devel
heat has quit [Remote host closed the connection]
<ecm> hi
alyssa has quit [Quit: leaving]
heat has joined #dri-devel
<ecm> I've been having trouble with libEGL since yesterday as it keeps saying libEGL warning: failed to get driver name for fd -1
<ecm> I do not understand what is meant by this
<ecm> according to glxinfo my gpu is working fine with nouveau
<ecm> only EGL throws this error
<ecm> the EGL debug log for this https://0x0.st/HTH2.txt
<karolherbst> the fd being -1 is kinda odd
<karolherbst> ecm: what's the mesa version?
<ecm> mesa-23.1.2_1
<karolherbst> interesting...
<karolherbst> wayland/xorg?
<ecm> xorg
<karolherbst> mhhh.. maybe zink gets into the way or something? kinda weird...
<ecm> my gpu doesn't support vulkan, zink shouldn't even be used right ?
ngcortes has joined #dri-devel
<karolherbst> yeah.. well.. my theory is that zink failing could mess things up
<karolherbst> but it could also be something entirely different
<karolherbst> did anything change yesterday?
<airlied> ecm: are all your packages the same version?
<ecm> karolherbst: I only reinstalled mesa-vaapi yesterday
<karolherbst> mhhhhh
<karolherbst> yeah, maybe airlied is right then.. are all your mesa sub packages on the same version?
<ecm> all seem to be 23.1.2
<karolherbst> yeah sure, but is this true for your system as well?
<ecm> I've updated to recent rn, so ig
<karolherbst> mesa-XvMC seems to be 22.2.4 at least
<karolherbst> is it installed for you?
<ecm> haven't installed that one
<karolherbst> okay
<karolherbst> then I get back to blaming zink
<ecm> is there some way I can run without checking for zink ?
<ecm> to test this
<karolherbst> maybe GALLIUM_DRIVER=nouveau
<ecm> nope, doesn't change anything
<karolherbst> maybe something odd on the EGL side, I don't know, might need some EGL experts here... it's a bit weird that GLX works though
<ecm> yeah, think its an EGL-only problem
<ecm> is some lib or file missing that it's searching for ??
<karolherbst> shouldn't
<karolherbst> it's kinda weird, because at least the mpv strace seems to do nouveau calls just fine
<karolherbst> DRM_IOCTL_NOUVEAU_GEM_PUSHBUF => commands sent to the GPU
<karolherbst> DRM_IOCTL_NOUVEAU_GEM_CPU_PREP => waiting for the GPU to finish
<ecm> yes, nouveau works fine for everything
<karolherbst> do you know if it works in EGL applications besides eglinfo?
<ecm> no, all EGL apps fail like this
<ecm> firefox webrender, mpv, etc.
<karolherbst> mhhh
<karolherbst> but mpv is using nouveau
<ecm> wait I'll send eglinfo
<karolherbst> is there a second GPU on your system by any chance?
<ecm> I have integrated graphics
<karolherbst> did you uninstall/disable the driver in any way?
<ecm> of the integrated graphics ?
<karolherbst> yeah
lemonzest has quit [Quit: WeeChat 3.6]
<karolherbst> I wouldn't be surprised if EGL could get confused if it fails to create a context on the integrated GPU
<karolherbst> eglinfo will try to do so on all GPUs
<karolherbst> and other EGL applications might run into the same thing then
<ecm> ok, I reinstalled the mesa-intel-dri to test this, will reboot
ecm` has joined #dri-devel
ecm has quit [Read error: Connection reset by peer]
<ecm`> nope, no change
lemonzest has joined #dri-devel
<ecm`> how could I have disabled intel graphics ? some config file or var ?
<karolherbst> does `DRI_PRIME=0 glxinfo` or `DRI_PRIME=1 glxinfo` change anything?
<ecm`> no
<ecm`> both are same
<karolherbst> both nouveau?
Duke`` has quit [Ping timeout: 480 seconds]
<karolherbst> what's the integrated GPU anyway? Or is it a nouveau only system and I misunderstood?
<ecm`> they show vendor: Mesa renderer: NVA8
<ecm`> wait should vendor be nouveau for nouveau ?
<karolherbst> nah, the renderer is the chipset name
krushia has quit [Ping timeout: 480 seconds]
<karolherbst> but anyway, is this a single GPU system with the nvidia one as an integrated GPU?
<karolherbst> or are there two?
<ecm`> the former
<karolherbst> ahh okay, my mistake then
Company has quit [Quit: Leaving]
<ecm`> there is one GPU NVS-3100m, and i5-520m with integrated graphics
<ecm`> it's a t510
<karolherbst> okay, so there are two GPUs
<karolherbst> `lspci | grep -e VGA -e 3D` should tell anyway
<ecm`> that just gives 01:00.0 VGA compatible controller: NVIDIA Corporation GT218M [NVS 3100M] (rev a2)
<ecm`> only the gpu
<karolherbst> mhhh
<ecm`> so my integrated graphics isn't working ?
<karolherbst> it should still be listed as a PCI device in either case
<karolherbst> maybe it's disabled in the firmware? dunno
avoidr has joined #dri-devel
<karolherbst> maybe lspci lists it as something else?
<karolherbst> mind sharing your `dmesg`?
<kisak> Ironlake, oof. Could have been disabled in the bios since Optimus of that era wasn't quite viable.
<karolherbst> yeah....
<karolherbst> `[ 9.554213] intel ips 0000:00:1f.6: failed to get i915 symbols, graphics turbo disabled until i915 loads` oof
<karolherbst> what is even that
<ecm`> lol
<karolherbst> but in any case, I don't think this should cause any issues
<karolherbst> as the device is like not there... at least for now
<karolherbst> could change once you `modprobe i915`
<karolherbst> what's the output of `ls /sys/class/drm`?
<ecm`> nope modprobe i915 doesn't add anything to dmesg or EGL
<ecm`> card0 card0-DP-1 card0-DP-2 card0-DP-3 card0-LVDS-1 card0-VGA-1 renderD128 ttm version
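The listing above contains a single `cardN` entry, which is what indicates that only one GPU is bound to a DRM driver. A minimal C++17 sketch of reading such a listing follows. The function and enum names are hypothetical, and the naming rules it assumes are the usual sysfs ones: `cardN` is a primary node, `cardN-<connector>` is a connector, and `renderDN` is a render node.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical classification of names found under /sys/class/drm.
enum class DrmEntry { PrimaryNode, RenderNode, Connector, Other };

// True if the string is non-empty and consists only of decimal digits.
static bool all_digits(const std::string &s) {
    if (s.empty()) return false;
    for (unsigned char c : s)
        if (!std::isdigit(c)) return false;
    return true;
}

// Classify one directory entry by its name alone.
DrmEntry classify_drm_entry(const std::string &name) {
    // "renderD128" style: render node.
    if (name.rfind("renderD", 0) == 0 && all_digits(name.substr(7)))
        return DrmEntry::RenderNode;
    if (name.rfind("card", 0) == 0) {
        const std::string rest = name.substr(4);
        const std::size_t dash = rest.find('-');
        // "card0" style: primary node, one per DRM device.
        if (dash == std::string::npos)
            return all_digits(rest) ? DrmEntry::PrimaryNode : DrmEntry::Other;
        // "card0-LVDS-1" style: a connector belonging to card0.
        if (all_digits(rest.substr(0, dash)) && dash + 1 < rest.size())
            return DrmEntry::Connector;
    }
    // Anything else, e.g. "ttm" or "version".
    return DrmEntry::Other;
}

// Number of DRM devices implied by a listing.
std::size_t count_primary_nodes(const std::vector<std::string> &names) {
    std::size_t n = 0;
    for (const auto &name : names)
        if (classify_drm_entry(name) == DrmEntry::PrimaryNode) ++n;
    return n;
}
```

Fed the nine names from the listing above, `count_primary_nodes` finds only `card0`, so a system with a working second GPU driver would be expected to show a `card1` as well.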
<kisak> ecm`: can you check the bios for something like ... Config > Display > Switchable Graphics and if it exists, if it's set to something that would hinder you.
<ecm`> ok
<kisak> pulling random forum noise from a decade ago, so it's not really a trusted info source.
fab has quit [Quit: fab]
gouchi has quit [Remote host closed the connection]
ecm` has quit [Read error: Connection reset by peer]
bgs has quit [Remote host closed the connection]
ecm has joined #dri-devel
<ecm> no such setting lol, i guess it's too old for that
<ecm> BIOS has very few options
AndroUser2 has quit [Ping timeout: 480 seconds]
AndroUser2 has joined #dri-devel
idr has joined #dri-devel
AndroUser2 has quit [Remote host closed the connection]
<cmarcelo> karolherbst: for your comment about the struct of bools not having internal padding in CLC, jenatali pointed out in another context https://en.cppreference.com/w/cpp/types/has_unique_object_representations ... seems to me you can stick a static_assert(std::has...<struct clc_...>()); in the cpp file to provide some compile time check for that. (or inside an ifdef cplusplus), so that the cpp file including it
<cmarcelo> will pick it up.
<karolherbst> ohhh.. interesting...
<karolherbst> let me play around with that then
<jenatali> Didn't I say that this morning when we were talking about it?
<jenatali> I was on my phone so I didn't have the exact name of the type trait, but I definitely said there was a type trait you could use to static assert it in C++
<karolherbst> ehh yeah.. but somehow my brain turned that into "check the size is known"
<karolherbst> yeah... I didn't think that far
AndroUser2 has joined #dri-devel
iive has quit [Quit: They came for me...]
<jenatali> Did you test the negative by adding padding and making sure it breaks? I fully expect it to, just wondering
kts has joined #dri-devel
<cmarcelo> jenatali: I tested { bool; bool; bool; int; } and it barfed
<cmarcelo> s/barfed/correctly failed to compile/
<jenatali> Great :)
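The compile-time check discussed above can be sketched as follows. This is a minimal C++17 illustration, not the code from the actual change: `flags_packed` and `flags_padded` are hypothetical stand-ins for the elided `clc_...` struct, and the padded case assumes a typical ABI where `int` is 4-byte aligned.

```cpp
#include <type_traits>

// Hypothetical stand-in for the struct of bools discussed above.
struct flags_packed {
    bool a;
    bool b;
    bool c;
};

// Same members plus an int, mirroring the negative test: on typical ABIs
// the compiler inserts one padding byte before `d` to align it.
struct flags_padded {
    bool a;
    bool b;
    bool c;
    int d;
};

// Passes: there are no padding bytes, so two objects with equal values
// also have equal object representations.
static_assert(std::has_unique_object_representations_v<flags_packed>);

// The trait is false once padding is present. Asserting it positively,
// as a real guard would, fails to compile for this struct.
static_assert(!std::has_unique_object_representations_v<flags_padded>);
```

Placing the positive assertion next to the struct definition, inside an `#ifdef __cplusplus` block if the header is shared with C, makes any later member that introduces padding a build error.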
<airlied> gfxstrand: got any opinion how best to do VK_MESA extensions, since we'd probably want to prototype them in mesa before sending to vulkan, but they'd involve patching the vk registry/headers we have in mesa which seems like a bad maintenance plan
<airlied> but I also feel bad upstreaming things to vulkan if we aren't going to document them to a competent level
<gfxstrand> airlied: For what sort of thing?
<gfxstrand> We do internal extensions all the time
<gfxstrand> If it's going to be an actual published extension, though, it should go in the spec.
kts has quit [Ping timeout: 480 seconds]
<airlied> gfxstrand: VK_MESA_video_decode_av1 is the first one
* airlied doesn't feel nice cluttering up the spec :-P
<gfxstrand> Why not?
<gfxstrand> video is already spec clutter. :-P
<airlied> because when VK_KHR_video_decode_av1 comes along it'll get funky to tell the difference :-P
<airlied> but maybe there isn't a nicer way
<gfxstrand> Then why are we shipping a thing that's just going to be replaced by a subtly different thing?
<gfxstrand> And how is that any different from the EXT mess we already have?
benjamin1 has joined #dri-devel
<airlied> we are shipping a thing because it will actually be useful in the current lifetime of the universe
jfalempe_ has joined #dri-devel
<gfxstrand> lol
<gfxstrand> Then spec it
<gfxstrand> If the universe dies before the KHR one comes out, there won't be any spec clutter.
jfalempe has quit [Read error: Connection reset by peer]
benjaminl has quit [Ping timeout: 480 seconds]
JohnnyonFlame has quit [Read error: Connection reset by peer]
AndroUser2 has quit [Remote host closed the connection]
AndroUser2 has joined #dri-devel
smiles_1111 has joined #dri-devel
anujp has quit [Remote host closed the connection]
Kayden has quit [Quit: -> happy hour]
ngcortes has quit [Remote host closed the connection]
AndroUser2 has quit [Remote host closed the connection]
AndroUser2 has joined #dri-devel
ecm has left #dri-devel [ERC 5.4 (IRC client for GNU Emacs 28.2)]
vliaskov has quit [Remote host closed the connection]
idr has quit [Ping timeout: 480 seconds]
ecm has joined #dri-devel
<ecm> karolherbst: looking at the mesa-23.1.2 source, loader_get_driver_for_fd gives the error "failed to get driver name for fd -1"
<ecm> and loader_get_driver_for_fd is used in platform_x11.c with dri2_dpy->fd_render_gpu
<ecm> but there is a debug log just before that if the dri2_dpy->fd_render_gpu == -1, which never appears in all these errors