ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
vliaskov has quit [Read error: Connection reset by peer]
Danct12 has joined #dri-devel
columbarius has joined #dri-devel
benjamin1 has joined #dri-devel
co1umbarius has quit [Ping timeout: 480 seconds]
heat has joined #dri-devel
benjamin1 has quit [Ping timeout: 480 seconds]
benjamin1 has joined #dri-devel
heat_ has joined #dri-devel
heat has quit [Read error: Connection reset by peer]
benjamin1 has quit [Ping timeout: 480 seconds]
yuq825 has joined #dri-devel
Peste_Bubonica has quit [Quit: Leaving]
benjamin1 has joined #dri-devel
benjamin1 has quit [Ping timeout: 480 seconds]
yyds has joined #dri-devel
benjamin1 has joined #dri-devel
simon-perretta-img_ has joined #dri-devel
simon-perretta-img__ has joined #dri-devel
simon-perretta-img has quit [Ping timeout: 480 seconds]
benjamin1 has quit [Ping timeout: 480 seconds]
simon-perretta-img_ has quit [Ping timeout: 480 seconds]
kzd has quit [Quit: kzd]
djbw has quit [Read error: Connection reset by peer]
djbw has joined #dri-devel
JohnnyonFlame has joined #dri-devel
benjamin1 has joined #dri-devel
benjamin1 has quit [Ping timeout: 480 seconds]
oneforall2 has quit [Remote host closed the connection]
oneforall2 has joined #dri-devel
benjamin1 has joined #dri-devel
Danct12 is now known as Guest9121
Danct12 has joined #dri-devel
benjamin1 has quit [Ping timeout: 480 seconds]
benjamin1 has joined #dri-devel
benjamin1 has quit [Ping timeout: 480 seconds]
heat_ has quit [Ping timeout: 480 seconds]
benjamin1 has joined #dri-devel
yyds has quit [Remote host closed the connection]
yyds has joined #dri-devel
benjamin1 has quit [Ping timeout: 480 seconds]
aravind has joined #dri-devel
benjamin1 has joined #dri-devel
benjamin1 has quit [Ping timeout: 480 seconds]
urja has quit [Read error: Connection reset by peer]
urja has joined #dri-devel
benjamin1 has joined #dri-devel
YuGiOhJCJ has joined #dri-devel
benjamin1 has quit [Ping timeout: 480 seconds]
bmodem has joined #dri-devel
a-865 has quit [Ping timeout: 480 seconds]
benjamin1 has joined #dri-devel
Company has quit [Remote host closed the connection]
benjamin1 has quit [Ping timeout: 480 seconds]
a-865 has joined #dri-devel
Guest9121 has quit [Remote host closed the connection]
bgs has joined #dri-devel
bmodem has quit [Ping timeout: 480 seconds]
bmodem has joined #dri-devel
bmodem has quit [Excess Flood]
bmodem has joined #dri-devel
bmodem has quit [Excess Flood]
bmodem has joined #dri-devel
junaid has joined #dri-devel
benjamin1 has joined #dri-devel
junaid_ has joined #dri-devel
junaid has quit [Remote host closed the connection]
shashanks has joined #dri-devel
benjamin1 has quit [Ping timeout: 480 seconds]
junaid_ has quit []
shashanks_ has quit [Ping timeout: 480 seconds]
YuGiOhJCJ has quit [Quit: YuGiOhJCJ]
noodle has joined #dri-devel
<noodle> is it too late for the new nouveau vulkan driver to make it into 23.2?
frieder has joined #dri-devel
sima has joined #dri-devel
benjamin1 has joined #dri-devel
Haaninjo has joined #dri-devel
benjamin1 has quit [Ping timeout: 480 seconds]
smaeul_ has joined #dri-devel
smaeul has quit [Ping timeout: 480 seconds]
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
ohmacs^ has quit [Ping timeout: 480 seconds]
ohmacs^ has joined #dri-devel
benjamin1 has joined #dri-devel
RAOF has quit [Remote host closed the connection]
RAOF has joined #dri-devel
benjamin1 has quit [Ping timeout: 480 seconds]
Leopold_ has joined #dri-devel
sghuge has quit [Remote host closed the connection]
sghuge has joined #dri-devel
Leopold__ has quit [Ping timeout: 480 seconds]
TMM has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
TMM has joined #dri-devel
Leopold_ has quit [Remote host closed the connection]
Leopold_ has joined #dri-devel
f11f12 has joined #dri-devel
benjamin1 has joined #dri-devel
<airlied> noodle: there would be no point including it even if it could make it
<airlied> having a release of an in-development driver would waste a lot of time
<noodle> airlied: virtio-experimental is however released
noodle has left #dri-devel [Leaving]
benjamin1 has quit [Ping timeout: 480 seconds]
pcercuei has joined #dri-devel
<airlied> yeah which is a bad idea since some kernel patches are not upstream
jani has quit []
ngcortes has quit [Ping timeout: 480 seconds]
jani has joined #dri-devel
jani has quit []
jani has joined #dri-devel
swalker__ has joined #dri-devel
RAOF_ has joined #dri-devel
swalker_ has joined #dri-devel
RAOF_ has quit [Remote host closed the connection]
swalker_ is now known as Guest9144
RAOF_ has joined #dri-devel
RAOF has quit [Ping timeout: 480 seconds]
RAOF_ is now known as RAOF
swalker__ has quit [Ping timeout: 480 seconds]
benjamin1 has joined #dri-devel
lynxeye has joined #dri-devel
<zzoon_vacations_till_6th_Aug[m> airlied: IIRC, you already have a branch for av1 decoding for anv. so you're going to work on it for landing the feature? (to anv)
benjamin1 has quit [Ping timeout: 480 seconds]
yyds has quit [Quit: Lost terminal]
donaldrobson has joined #dri-devel
yyds has joined #dri-devel
benjamin1 has joined #dri-devel
benjamin1 has quit [Ping timeout: 480 seconds]
benjamin1 has joined #dri-devel
benjamin1 has quit [Ping timeout: 480 seconds]
JohnnyonFlame has quit [Read error: Connection reset by peer]
benjamin1 has joined #dri-devel
benjamin1 has quit [Ping timeout: 480 seconds]
andrey-konovalov has quit [Quit: ZNC - http://znc.in]
sumits has quit [Quit: ZNC - http://znc.in]
aravind has quit [Ping timeout: 480 seconds]
<emersion> daniels: what can i do to unstick the vk wl tearing MR?
benjamin1 has joined #dri-devel
benjamin1 has quit [Ping timeout: 480 seconds]
donaldrobson_ has joined #dri-devel
donaldrobson has quit [Ping timeout: 480 seconds]
benjamin1 has joined #dri-devel
andrey-konovalov has joined #dri-devel
benjamin1 has quit [Ping timeout: 480 seconds]
aravind has joined #dri-devel
benjamin1 has joined #dri-devel
donaldrobson_ has quit [Ping timeout: 480 seconds]
donaldrobson has joined #dri-devel
benjamin1 has quit [Ping timeout: 480 seconds]
anarsoul|2 has quit [Remote host closed the connection]
anarsoul has joined #dri-devel
frieder has quit [Read error: Connection reset by peer]
bmodem has quit [Ping timeout: 480 seconds]
benjamin1 has joined #dri-devel
benjamin1 has quit [Ping timeout: 480 seconds]
devarsh_ has quit [Quit: Connection closed for inactivity]
donaldrobson has quit [Ping timeout: 480 seconds]
donaldrobson has joined #dri-devel
yyds has quit [Quit: Lost terminal]
yyds has joined #dri-devel
benjamin1 has joined #dri-devel
bgs has quit [Remote host closed the connection]
benjamin1 has quit [Ping timeout: 480 seconds]
vliaskov has joined #dri-devel
yyds has quit [Remote host closed the connection]
<zmike> how is drm-shim supposed to work? I'm trying it for r600 and it doesn't seem to be picking up the shim at all
donaldrobson_ has joined #dri-devel
donaldrobson has quit [Ping timeout: 480 seconds]
<zmike> cc alyssa
<daniels> emersion: I've got it on our list to try to push through
<emersion> ty!
Danct12 has quit [Quit: WeeChat 4.0.2]
Danct12 has joined #dri-devel
benjamin1 has joined #dri-devel
aravind has quit [Ping timeout: 480 seconds]
benjamin1 has quit [Ping timeout: 480 seconds]
<zmike> aha I got it
aravind has joined #dri-devel
Company has joined #dri-devel
benjamin1 has joined #dri-devel
Danct12 has quit [Quit: WeeChat 4.0.3]
benjamin1 has quit [Ping timeout: 480 seconds]
donaldrobson has joined #dri-devel
donaldrobson_ has quit [Ping timeout: 480 seconds]
junaid has joined #dri-devel
greenjustin_ has joined #dri-devel
<alyssa> gfxstrand: does it make sense to have textures that are simultaneously bindless and non-bindless (indicated with a backend_flag)
<alyssa> this is making sense to me did i not sleep enough
<zmike> how would a bindless and non-bindless texture work
<alyssa> for gl, regular non-bindless texture with a texture_index(+texture_offset), and also a texture_handle that points to the descriptor in memory (i.e. texture_handle == binding_table_base_address + texture_index*stride)
<alyssa> for vk, same thing but with more descriptor sets
<alyssa> so that way the hardware can use the non-bindless part but software can get the address of the descriptor from the bindless part
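A minimal sketch of the address arithmetic alyssa describes above, where the bindless handle of a non-bindless texture is just the address of its descriptor in the binding table. The function names and the `DESC_STRIDE` value are hypothetical and chosen only for illustration; the real descriptor stride is hardware-specific and is not stated in the discussion.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical descriptor size in bytes (an assumption for this sketch). */
#define DESC_STRIDE 24u

/* texture_handle == binding_table_base_address + texture_index * stride */
static uint64_t
texture_handle_for_index(uint64_t binding_table_base, uint32_t texture_index)
{
   return binding_table_base + (uint64_t)texture_index * DESC_STRIDE;
}

/* Inverse mapping: recover the binding-table slot from a handle that is
 * known to point into the same binding table. */
static uint32_t
texture_index_for_handle(uint64_t binding_table_base, uint64_t handle)
{
   assert(handle >= binding_table_base);
   assert((handle - binding_table_base) % DESC_STRIDE == 0);
   return (uint32_t)((handle - binding_table_base) / DESC_STRIDE);
}
```

With this mapping the hardware can keep using `texture_index` for the actual read, while software that needs the descriptor contents (for example a size query) can compute its address from the same index.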
<alyssa> maybe my monolithic texture lowering is Bad and that's the problem
<alyssa> a relevant case is reading from an array texture
<zmike> sounds hard to comprehend, but my brain is very smooth
<alyssa> since we use the hardware array texture read but also we clamp the array index in software
<alyssa> and that all happens in the backend
<alyssa> so the driver needs to turn all array texture reads into bindless access since the clamping creates a txs which only works on bindless
<alyssa> but then the hardware read ends up being bindless too for no good reason
<alyssa> but maybe the real answer here is that the clamping should happen before the driver decides whether to force bindless access, so the txs is separate from the tex
benjamin1 has joined #dri-devel
<alyssa> that seems significantly more sensible actually..
<alyssa> thanks faith
benjamin1 has quit [Ping timeout: 480 seconds]
junaid_ has joined #dri-devel
junaid has quit [Remote host closed the connection]
junaid_ has quit []
donaldrobson has quit [Ping timeout: 480 seconds]
junaid has joined #dri-devel
donaldrobson has joined #dri-devel
greenjustin_ has quit [Remote host closed the connection]
greenjustin_ has joined #dri-devel
<DavidHeidelberg[m]> I'm thinking about a small weekend hackfest after XDC; who would be interested in joining? One of the topics would be CI, but I would be happy if anyone interested in working on any part of Mesa would join.
yuq825 has left #dri-devel [#dri-devel]
DPA has quit [Ping timeout: 480 seconds]
simon-perretta-img__ has quit []
f11f12 has quit [Quit: Leaving]
simon-perretta-img has joined #dri-devel
benjamin1 has joined #dri-devel
yyds has joined #dri-devel
benjamin1 has quit [Ping timeout: 480 seconds]
<alyssa> I kinda wish gallium had a state tracker (~:
<HdkR> VKGallium
junaid has quit [Remote host closed the connection]
DPA has joined #dri-devel
benjamin1 has joined #dri-devel
<gfxstrand> alyssa: It's not totally crazy
<gfxstrand> alyssa: Like, you could reserve 5 bits of backend_flag to store the set index or something like that if you wanted.
<gfxstrand> And make the semantics u[set_idx] + bindless_handle
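One way to picture gfxstrand's suggestion is the sketch below: 5 bits of a per-instruction flag word hold the descriptor set index, and the descriptor address is the set's base plus the bindless handle. Which 5 bits are used (the low ones here), the helper names, and the `set_bases` array standing in for `u[set_idx]` are all assumptions made for illustration, not an existing Mesa interface.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: the set index lives in the low 5 bits of the flag word. */
#define SET_IDX_BITS 5u
#define SET_IDX_MASK ((1u << SET_IDX_BITS) - 1u)

/* Store set_idx in the reserved bits, leaving the other flag bits untouched. */
static uint32_t
pack_set_index(uint32_t backend_flags, uint32_t set_idx)
{
   assert(set_idx <= SET_IDX_MASK);
   return (backend_flags & ~SET_IDX_MASK) | set_idx;
}

static uint32_t
unpack_set_index(uint32_t backend_flags)
{
   return backend_flags & SET_IDX_MASK;
}

/* Semantics from the discussion: u[set_idx] + bindless_handle, where
 * set_bases[] plays the role of the per-set base addresses. */
static uint64_t
descriptor_address(const uint64_t *set_bases, uint32_t backend_flags,
                   uint32_t bindless_handle)
{
   return set_bases[unpack_set_index(backend_flags)] + bindless_handle;
}
```

Because the set index is known at compile time, it can sit in the instruction's flags, while the handle stays a runtime value, matching the point that the exact descriptor offset is not known statically.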
benjamin1 has quit [Ping timeout: 480 seconds]
<alyssa> gfxstrand: hmm that's actually reasonable as heck
<gfxstrand> alyssa: In Vulkan, the descriptor set index will always be known at compile time for textures/images.
<gfxstrand> Exact descriptor offset won't but the index will.
<alyssa> sure
<alyssa> same for GL with my fake GL descriptor sets
<alyssa> is that terrible?
<alyssa> that I have descriptor sets in my GL driver? :P
<gfxstrand> Sure, you can just fix it to 0 or 1 or wherever you put the fake descriptor set
<gfxstrand> Terrible? Not at all.
<alyssa> no, I have multiple :-D
<gfxstrand> Sure, why not?
<alyssa> :-D
<alyssa> They're not really descriptor sets
<alyssa> It's just that, when merging shader stages, a single hardware shader needs to be able to access the binding tables from multiple shader stages
<alyssa> so I model this the same way Zink+AGXV would ... descriptor sets per stage
<gfxstrand> Sure
<DemiMarie> Why is writing a Vulkan driver and using Zink for OpenGL much harder than writing an OpenGL driver directly, given that even if Vulkan is not a great fit to the hardware, OpenGL seems worse?
<DemiMarie> alyssa: I trust that if you say it is, then it is. I’m just curious _why_ it is.
<gfxstrand> Vulkan has a lot more boilerplate to get going. It also has a lot less room for hacks.
<gfxstrand> Hardware doesn't support primitive restart? On GL, you can scan from the CPU. Not great for performance, but great for getting a driver off the ground.
<i509vcb> There is also the fact that for zink to get started you need some pretty non-default extensions
<gfxstrand> Queries funky? In GL, you can do whatever you want. In Vulkan, you have to figure out how to get it to fit in a query pool and use compute shaders as needed to make vkCmdCopyQueryPoolResults() happen.
<gfxstrand> There's that, too.
<gfxstrand> A Vulkan 1.0 driver capable of running some Vulkan apps and a Zink driver are not the same feature level.
<DemiMarie> Why does Zink need so many extensions?
<i509vcb> I'm trying to understand why I am getting asserts when trying to set up sync types in agxv before I see if it's a kernel bug. https://gitlab.freedesktop.org/-/snippets/7672
<alyssa> 15:40 <gfxstrand> A Vulkan 1.0 driver capable of running some Vulkan apps and a Zink driver are not the same feature level.
<i509vcb> The assert I hit has this "We can only have one timeline mode" comment
<alyssa> this is the bigger thing for me
<alyssa> I already have a GL 3.1 driver, right
<alyssa> I would rather spend my time on ARB_geometry_shader so we get a GL 3.3 driver
<alyssa> instead of agxv plumbing so Zink gives us a GL 2.1 driver and DXVK/VKD3D don't work
<alyssa> Eventually, we'll probably have a gl4 native driver and zink+agxv will be gl4 and we'll look at performance comparisons to decide the fate of the native driver
<DemiMarie> Thanks alyssa!
<alyssa> in the meantime ... zink does not solve any of my problems, and it creates a bunch of new ones
<i509vcb> DemiMarie: some features in minimum vulkan 1.0 that are required are implemented with extensions. VK_EXT_custom_border_color is one of those requirements apparently
buduar has joined #dri-devel
<i509vcb> Plus if you don't want to languish in slow wl_shm presentation you need to implement things like VK_EXT_external_memory_dma_buf
<gfxstrand> That's a bit of a salty take but not too far off.
<alyssa> gfxstrand: I mean, Zink does solve problems
<alyssa> Just not any of the ones I have
<alyssa> at least not right now
<DemiMarie> alyssa: to be clear, this is not a feature request (and I hope it did not come across as such!)
<gfxstrand> It also depends a lot on your priorities. If you're bringing up hardware that the Linux desktop has never run on, what are you going to do first? A GL driver that's able to run GNOME? Or a Vulkan driver that's able to run Zink at a level where it can run GNOME?
<DemiMarie> makes sense
<alyssa> Yup
<i509vcb> Imagination seems to have taken the zink route from what I recall?
<alyssa> Gallium makes it really easy to hack enough together for a GL 2 driver that can run GNOME
<alyssa> i509vcb: Imagination has a different set of problems than we do, though
<gfxstrand> I mean, sure, if you're trying to reduce the over-all work required to get to Vulkan + GL nirvana, Vulkan-first might make sense. But if your goal is to run GNOME as quickly as possible, writing a gallium driver is the way to do that.
<alyssa> ++
<gfxstrand> Part of the reason why I can forget about GL for NVK is because the nouveau GL driver can already run GNOME.
<alyssa> +++
<gfxstrand> Sure, it sucks, but it can run GNOME.
<gfxstrand> And its ability to run GNOME is good enough to composite whatever game you're running on NVK+DXVK.
<alyssa> ++++
<DemiMarie> How much of this is because Mesa has more helpers for OpenGL than for Vulkan?
<gfxstrand> So for going from where we are today to the maximum gaming experience as fast as possible, forgetting about GL entirely (including Zink, sorry) and focusing on solid Vulkan with DXVK features is the path.
benjamin1 has joined #dri-devel
<gfxstrand> DemiMarie: Some? Vulkan makes it harder to have helpers but also there's definite gaps and we're working on filling those gaps.
<HdkR> Perfect for running Neverball under
<gfxstrand> But, also, how wicked that curve is depends on hardware.
<gfxstrand> On NVIDIA, we don't need many of those helpers because the hardware has basically everything built-in.
<alyssa> AGX is basically a really fast software rasterizer ;~P
<gfxstrand> Literally the only things not built in are MSAA resolves and blits and blits are kinda there but horrible.
<DemiMarie> gfxstrand: I see. Is part of it because Vulkan requires features that would typically be brought up later?
<alyssa> We need ALL the helpers :D
<gfxstrand> Not really
<gfxstrand> But certain things like copying images to/from tiled memory are pretty basic things you don't think about.
<ids1024[m]> You're also probably more likely to be running Vulkan-based games on an Nvidia GPU than on Apple Silicon anyway. Though that would be interesting with a good Vulkan driver and fast x86 emulation.
* DemiMarie forgot that games are the main reason for these fancy new APIs
<gfxstrand> In theory FEX-EMU should make that tractable.
<buduar> Why not to go at GL ES+EGL and program a really performant driver under whatever gpu accelerator, it's not very hard?
<HdkR> gfxstrand: <3
<alyssa> ids1024[m]: Right, that's the other part of the calculus ... NVIDIA GPUs have full hardware support for {geometry shaders, tessellation shaders, transform feedback, ..} and it's reasonably straightforward to implement in the drivers
<gfxstrand> buduar: I can't tell if that's sarcastic or not.
<alyssa> So going from 0 to DXVK on NVIDIA hardware is a lot more straightforward than AGX where none of the above has effective hardware support
<gfxstrand> :D
<alyssa> but a Vulkan driver that doesn't support those features (dumb as they are) won't be able to run any games other than, like, VkQuake
<DemiMarie> alyssa: is MoltenVK helpful at all, at least in terms of “how do I translate X to something AGX actually implements”?
<i509vcb> I guess you could describe agx as being very shader heavy?
<alyssa> i509vcb: yeah
<alyssa> DemiMarie: absolutely not
<alyssa> moltenvk is a massive pile of hacks
<buduar> gfxstrand, it's not sarcastic, ES does everything correctly, the precision can be lifted, cause they save die area , es is best.
<alyssa> and moltenvk is broken in all the places you would expect given where agx doesn't have support for things
<ids1024[m]> DemiMarie: > * <@demi:invisiblethingslab.com> forgot that games are the main reason for these fancy new APIs
<ids1024[m]> For better or worse, for most normal graphics stuff that aren't games on Linux you just need GLES 2.0 or so. Maybe some fancy professional video software also does fancy things with Vulkan.
<gfxstrand> buduar: First off, "really performant driver" is "very hard" no matter what API or hardware.
<alyssa> TBH, seeing moltenvk claim support for stuff makes me immensely sad because we're trying to do things Right but they get to advertise the punch sooner by layering hacks on hacks and shipping the broken thing fast
<buduar> gfxstrand, it's only tiny extension , have a look at this https://github.com/jermp/s_indexes
<DemiMarie> alyssa: wow, I was not expecting that!
<DemiMarie> Does AGX have any fixed function stuff at all?
<i509vcb> Well GLES 3.2 is nice to have. From what I recall there is some HDR related stuff that a wayland compositor can actually use there
<i509vcb> (or was it 3.0?)
<alyssa> DemiMarie: sure. It's got a rasterizer, texture fetch hardware, and .. yeah those are the biggies
<buduar> the groundwork has been there for so many years, the natural continued extension is only that, and it's magical
<alyssa> Depth/stencil unit, primitive assembly, clipping/culling
<alyssa> It does have a tessellator but it's not sufficient for any of GL/ES/VK/D3D
<alyssa> MoltenVK is broken in precisely those places, Apple's GL driver falls over to tessellating on the CPU and your performance goes off a cliff
<i509vcb> Metal apparently does have mesh related stuff advertised but I imagine how the hardware implements it can be very weird
<alyssa> i509vcb: there's no mesh hardware, it's done entirely in software
<i509vcb> oof
<i509vcb> that sounds brutal
<DemiMarie> alyssa: that’s interesting, not least because it tells me what stuff genuinely cannot be emulated in shaders efficiently
<alyssa> our current understanding is that the mesh shaders run as compute kernels that generate geometry by something like device_generated_commands, creating draws with regular vertex shaders
<DemiMarie> GPU-side JIT?
<alyssa> as far as we know, the only trick it has (that an application doesn't have) is a mechanism to allocate memory dynamically from a shader
<alyssa> but even that is implemented in firmware with a kernel dance, not hardware
<buduar> gfxstrand, also look at this what they managed to hack on dma https://people.ece.cornell.edu/land/courses/ece4760/RP2040/C_SDK_DMA_machine/DMA_machine_rp2040.html
<i509vcb> I'd guess if agx is so shader heavy Apple would try to put as much die space into compute/shader execution
<alyssa> i509vcb: well yeah, that's the tradeoff. drop all the fixed function hardware and you can get more shader cores
<alyssa> for implementing Metal, agx is the right design
<alyssa> for D3D or VK or GL... less great.
<alyssa> but critically, entirely possible.
<alyssa> I want to defeat the narrative that AGX somehow "can't" support conformant GL and Vulkan
<gfxstrand> buduar: I don't see how any of those links have anything to do with what's being discussed.
<DemiMarie> how much will not having that hardware hurt Vulkan performance?
<alyssa> It can. Apple chooses not to.
<alyssa> That's a political choice and one that Apple should not be making
<i509vcb> I've found agx to be quite performant from my use with the gl 3.1 driver
<alyssa> and everywhere MoltenVK fails conformance, that's on Apple
<DemiMarie> alyssa: why is that?
<DemiMarie> why is it on Apple and not a MoltenVK bug?
<i509vcb> Someone is certainly going to get the wild idea of trying to run mesa's asahi driver on macOS to get proper Vulkan eventually
<alyssa> i'd rather people switch to linux :~)
<DemiMarie> I had the same thought
<buduar> gfxstrand, but i do see, cause vulkan there is no need to handle any cpu threads, the compilation is so tiny it's bus traffic only, dma can do that
<buduar> when you order bunch of loops in the compiler, dma can handle it
<buduar> but this correct compiling is quite tiny
<DemiMarie> Also does this channel have logs?
<gfxstrand> Yeah, no... That's not how any of this works.
<i509vcb> DemiMarie: yes
<DemiMarie> i509vcb: where?
<i509vcb> same place as #wayland
<DemiMarie> i509vcb: thanks
<i509vcb> Back to what I was initially here to ask...
<i509vcb> gfxstrand: on the snippet I linked above, what would be typically causing that weird assert for vk_sync?
<gfxstrand> i509vcb: Good question. That's definitely odd.
bmodem has joined #dri-devel
<gfxstrand> Oh, that assert has a comment on it! You have more than one timeline type
<i509vcb> This happens in agxv if you were wondering
<gfxstrand> i509vcb: Does your kernel driver support timeline sync objects?
<gfxstrand> If so, then you don't need all that `sync_timeline_type` stuff.
<i509vcb> From what I recall yes, but it's untested
<gfxstrand> Sounds like a good time to test it!
<i509vcb> So I guess I'll need to talk to lina about finding the bugs then
<gfxstrand> The other option is that you can do `device->drm_syncobj_type.features &= ~VK_SYNC_FEATURE_TIMELINE` to disable timeline sync objs.
<gfxstrand> And then the emulation will work fine.
<i509vcb> yup I guess it's a problem with the kernel driver since the timeline deqps just hang forever
buduar has quit [Ping timeout: 480 seconds]
benjamin1 has quit [Ping timeout: 480 seconds]
<i509vcb> Hmm although the emulation doesn't like a different assert apparently
<gfxstrand> We really should make the vk_drm_syncobj_get_type() take a `bool supports_timelines`
buduar has joined #dri-devel
<alyssa> i509vcb: timeline sync needs to work for the kernel merge, so please branch off the driver with real timeline sync and a deqp case hitting the kernel bug and send it to lina for debug
<alyssa> thanks :)
<i509vcb> ok so I guess it's time to build kernels
alyssa has left #dri-devel [#dri-devel]
<gfxstrand> Yes, it should work for the kernel merge
Danct12 has joined #dri-devel
<buduar> Sure it does work so, if you offload to dma, there is no need for threads on CPU, and there's no need to do any locking with performance reasons. And there is no need to fixate sport results by killing underaged Estonian kids. Thread would issue bus instructions and after that alu, for performance reasons it's not needed, human is not able to trace such perf. If you compile correctly there's no CPU threads needed to fill the pipeline with more
<buduar> data, leave them to os smp.
bmodem has quit [Quit: bmodem]
bmodem has joined #dri-devel
Guest9144 has quit [Remote host closed the connection]
anki is now known as xantoz
aravind has quit [Ping timeout: 480 seconds]
lynxeye has quit [Quit: Leaving.]
crabbedhaloablut has joined #dri-devel
yyds has quit [Remote host closed the connection]
donaldrobson has quit [Ping timeout: 480 seconds]
greenjustin_ is now known as greenjustin
buduar has quit []
benjamin1 has joined #dri-devel
djbw has quit [Remote host closed the connection]
djbw has joined #dri-devel
vliaskov has quit [Ping timeout: 480 seconds]
alyssa has joined #dri-devel
<alyssa> zmike: Does Zink support stores&atomics from geometry shaders?
<alyssa> (provided the underlying vulkan driver supports vertexPipelineStoresAndAtomics, I mean)
<alyssa> If so -- I am wondering if it is subtly broken
<alyssa> The Vulkan spec ("9.8.1 Geometry Shader Execution") implies a geometry shader might be invoked multiple times
<alyssa> but the GL spec ("7.13.1 Shader Memory Access Ordering") implies a geometry shader is invoked exactly once per primitive
<anholt> alyssa: our conclusion has been that the GL spec didn't really mean that, and tests have been fixed over time to allow multiple execution.
<alyssa> anholt: Alright :+1:
<alyssa> So I can implement the Vulkan behaviour even for GL and hopefully everyone is happy?
<anholt> (or, maybe, the GL spec meant that at the time, but they realized whoops, and also nobody needed that detail, so we all just pretended that's what it meant all along)
* alyssa doesn't understand how side effects in vertex shaders can possibly be *useful*, but..
<alyssa> whole bunch of KHR-GLES31 tests look bogus
djbw has quit [Read error: Connection reset by peer]
<alyssa> KHR-GLES31.core.shader_atomic_counters.basic-usage-vs, KHR-GLES31.core.shader_atomic_counters.advanced-usage-multi-stage, etc
* alyssa can make the tests pass with enough hacks but that's not really the point
benjamin1 has quit [Ping timeout: 480 seconds]
heat_ has joined #dri-devel
heat_ has quit [Remote host closed the connection]
Leopold__ has joined #dri-devel
zf has quit [Ping timeout: 480 seconds]
Leopold_ has quit [Ping timeout: 480 seconds]
benjamin1 has joined #dri-devel
djbw has joined #dri-devel
junaid has joined #dri-devel
benjamin1 has quit [Ping timeout: 480 seconds]
benjamin1 has joined #dri-devel
bmodem has quit [Ping timeout: 480 seconds]
junaid has quit [Ping timeout: 480 seconds]
Kayden has quit [Quit: -> JF]
benjamin1 has quit [Ping timeout: 480 seconds]
sassefa has joined #dri-devel
junaid has joined #dri-devel
benjamin1 has joined #dri-devel
aravind has joined #dri-devel
junaid has quit [Remote host closed the connection]
junaid has joined #dri-devel
bylaws has joined #dri-devel
<bylaws> alyssa: adreno geom shaders execute once per vertex in both GL and VK so should definitely be fine
<alyssa> :+1:
<alyssa> wait the geom shader is once per vertex?!
benjamin1 has quit [Ping timeout: 480 seconds]
<bylaws> Output vertices I mean
<bylaws> It's invoked max_vertices times per prim
sima has quit [Ping timeout: 480 seconds]
<alyssa> that's.. also bizarre, wow, lol
benjamin1 has joined #dri-devel
crabbedhaloablut has quit []
gouchi has joined #dri-devel
gouchi has quit [Remote host closed the connection]
sassefa has quit [Remote host closed the connection]
<robclark> alyssa: it does let the GS run for each output vertex in parallel
Kayden has joined #dri-devel
<alyssa> robclark: fair enough. I guess that helps reduce divergence and stuff?
<robclark> I guess it really depends on the structure of the GS..
ngcortes has joined #dri-devel
benjamin1 has quit [Ping timeout: 480 seconds]
junaid has quit [Ping timeout: 480 seconds]
aravind has quit [Ping timeout: 480 seconds]
aravind has joined #dri-devel
ayaka has quit [Ping timeout: 480 seconds]
ngcortes has quit [Ping timeout: 480 seconds]
MoeIcenowy has quit [Ping timeout: 480 seconds]
ishitatsuyuki has quit [Ping timeout: 480 seconds]
benjamin1 has joined #dri-devel
<doras> karolherbst: is anything needed for https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/318?
<karolherbst> not really. I should probably just merge it...
benjamin1 has quit [Ping timeout: 480 seconds]
<karolherbst> somebody would have to make a release, I do not know how to make one, but I guess it's fine to wait anyway
Haaninjo has quit [Quit: Ex-Chat]
karolherbst has quit [Remote host closed the connection]
karolherbst has joined #dri-devel
benjamin1 has joined #dri-devel
aravind has quit [Ping timeout: 480 seconds]
alyssa has quit [Quit: alyssa]
ngcortes has joined #dri-devel
MoeIcenowy has joined #dri-devel
ayaka has joined #dri-devel
<doras> karolherbst: thanks. I agree that it should be fine to wait.
benjamin1 has quit [Ping timeout: 480 seconds]
a-865 has quit [Quit: ChatZilla 0.17 [SeaMonkey 2.53.17/20230727221859]]
benjamin1 has joined #dri-devel
greenjustin has quit [Ping timeout: 480 seconds]
paulk-bis has joined #dri-devel
paulk has quit [Ping timeout: 480 seconds]
ngcortes has quit [Remote host closed the connection]
ngcortes has joined #dri-devel
TMM has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
TMM has joined #dri-devel
benjamin1 has quit [Ping timeout: 480 seconds]
ngcortes has quit [Ping timeout: 480 seconds]
benjamin1 has joined #dri-devel
pa has quit [Ping timeout: 480 seconds]
<karolherbst> I'm really running into the weirdest issues... fs_visitor::split_virtual_grfs where num_vars is 4750009 :')
<gfxstrand> Uh...
benjamin1 has quit [Ping timeout: 480 seconds]
<karolherbst> ehh.. might be a memory corruption actually
<karolherbst> maybe not
<karolherbst> the nir_shader be like `constant_data_size: 4000000,`
<gfxstrand> hehe
<karolherbst> yeah.. nir_print_shader in gdb throws a `Cannot access memory at address 0x7ffffeddcb6f`
<karolherbst> something smells
<karolherbst> prog vars with chip-spv seem to work great, except...
<karolherbst> probably some weirdo overflow somewhere...
<gfxstrand> valgrind is your friend
<karolherbst> valgrind is not my bf anymore, libasan is my new bff
<karolherbst> but it's kinda weird
shashanks_ has joined #dri-devel
<karolherbst> I don't even know why gdb complains about it, because I don't even know what that pointer is supposed to be
<karolherbst> the nir is somewhere else
<karolherbst> nir_print_shader is somewhere else...
<karolherbst> ehhh... probably my stack just got trashed
Kayden has quit [Quit: -> home]
ngcortes has joined #dri-devel
shashanks has quit [Ping timeout: 480 seconds]
<karolherbst> "con 64 %6750014 = load_const (0x00000000003d08cc = 3999948)" mhhhhh
<karolherbst> we probably shouldn't inline massive loops or something
<karolherbst> actually.. what's the spirv
<karolherbst> the spirv literally has 65 SSA values
<karolherbst> yeah.... something really weird happens and a very small nir explodes massively in size
<karolherbst> "decl_var constant INTERP_MODE_NONE float[1000000] __chip_var__initializer = null" ah yes....
<karolherbst> we _might_ want to lower that to a loop
<gfxstrand> karolherbst: lol
<karolherbst> yeah.. it's the memcpy lowering
pa has joined #dri-devel
<gfxstrand> Yeah, we probably want a threshold on the size. :joy:
<karolherbst> or we always lower it to a loop and let the loop unroller do its magic
<gfxstrand> And maybe something smarter if we know alignments and that the size is well-aligned.
<karolherbst> myyy
<karolherbst> mhhh
<gfxstrand> Because actual loop case copies one byte at a time
<karolherbst> maybe
<karolherbst> yeah...
<karolherbst> I think it's one element rather
<karolherbst> or is it really byte?
<gfxstrand> byte
<karolherbst> it's all based on derefs still tho
<gfxstrand> It's memcpy, mate
<karolherbst> ehhh
<karolherbst> right..
<karolherbst> though
<gfxstrand> (Imagine I said that in a semi-passable Australian accent)
<karolherbst> the lowered nir uses uvec4
<karolherbst> ahh yeah
<karolherbst> memcpy lowering isn't that dumb
<gfxstrand> We could probably make the loop do vec4...
<karolherbst> lower_memcpy has this "copy_type_for_byte_size" function which decides how big an element is
<karolherbst> and the biggest thing is vec4
<gfxstrand> Like, emit 3 loops: copy in vec4s, copy what's left in dwords, copy what's left in bytes.
<karolherbst> it's already done it seems :D
<gfxstrand> Not in the loop case, it isn't
<karolherbst> you even wrote the code
<karolherbst> ahh
<karolherbst> ohh, that's what you meant
<gfxstrand> The unrolled case can do anything
<gfxstrand> The loop case needs work
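The three-loop idea gfxstrand proposes for the variable-size case can be sketched in plain C as below. This is only an illustration of the chunking strategy, not the NIR lowering itself: `copy_in_three_loops` is a hypothetical name, a 16-byte chunk stands in for a vec4 of dwords, and the fixed-size `memcpy` calls are used as the portable way to express unaligned wide loads and stores. Alignment, which the discussion mentions as a further refinement, is ignored here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy `size` bytes using progressively smaller element sizes so that most
 * of the data moves in wide chunks and only the tail moves byte by byte. */
static void
copy_in_three_loops(void *dst, const void *src, size_t size)
{
   uint8_t *d = dst;
   const uint8_t *s = src;
   size_t i = 0;

   /* Loop 1: copy in vec4s (4 dwords = 16 bytes per iteration). */
   for (; i + 16 <= size; i += 16)
      memcpy(d + i, s + i, 16);

   /* Loop 2: copy what's left in dwords (4 bytes per iteration). */
   for (; i + 4 <= size; i += 4)
      memcpy(d + i, s + i, 4);

   /* Loop 3: copy what's left in bytes. */
   for (; i < size; i++)
      d[i] = s[i];

   assert(i == size);
}
```

For a 39-byte copy this runs two vec4 iterations, one dword iteration, and three byte iterations, instead of 39 iterations of a byte-at-a-time loop.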
<karolherbst> I see...
<karolherbst> uhh
<karolherbst> right
<karolherbst> if the size is a constant then we are smart
<gfxstrand> Yup
<karolherbst> otherwise we'd need peak memcpy...
<karolherbst> uhm...
<karolherbst> mhhh
<karolherbst> let's just make the const size thing also emit a loop and let the loop unroller optimize it for now... maybe
<karolherbst> we can always optimize the variable thing later
<karolherbst> mhhh
<karolherbst> might as well just merge those branches and be smarter about element size selection...
<karolherbst> maybe it wouldn't be too bad after all
<karolherbst> I can try to write the code... shouldn't be tooooo bad
<gfxstrand> I'm already typing
<karolherbst> okay
<karolherbst> but I like how chip-spv handrolls their Initializer kernel and doesn't use the SPIR-V initializer stuff... saves me the trouble of implementing that as well
pcercuei has quit [Quit: dodo]
orbea has quit [Remote host closed the connection]
orbea has joined #dri-devel
<karolherbst> well.. it's faster this way anyway as then you are not limited to one thread...
alyssa has joined #dri-devel
Kayden has joined #dri-devel
a-865 has joined #dri-devel
ngcortes has quit [Remote host closed the connection]
ngcortes has joined #dri-devel
heat has joined #dri-devel