ChanServ changed the topic of #asahi-gpu to: Asahi Linux GPU development (no user support, NO binary reversing) | Keep things on topic | GitHub: https://alx.sh/g | Wiki: https://alx.sh/w | Logs: https://alx.sh/l/asahi-gpu
brolin has quit [Ping timeout: 480 seconds]
* alyssa is talking to a maintainer of another (non-Mesa/Linux) project
<alyssa> wiki style has come up, lol
<alyssa> "I don't need review, you do" is unbalanced
<alyssa> "I need review just like you" is one way to balance but
<alyssa> "nobody needs to pre-merge review, look post-merge" is another (~:
<alyssa> =P
brolin has joined #asahi-gpu
flibitijibibo has quit [Quit: Leaving]
brolin has quit [Ping timeout: 480 seconds]
brolin has joined #asahi-gpu
hightower3 has joined #asahi-gpu
hightower2 has quit [Ping timeout: 480 seconds]
Method has quit [Ping timeout: 480 seconds]
brolin has quit [Ping timeout: 480 seconds]
stsmwg has joined #asahi-gpu
cylm has joined #asahi-gpu
r0ni has quit [Quit: Textual IRC Client: www.textualapp.com]
pyropeter1 has joined #asahi-gpu
eidial has joined #asahi-gpu
PyroPeter_ has quit [Ping timeout: 480 seconds]
eidial has quit [Remote host closed the connection]
drubrkletern has joined #asahi-gpu
cylm has quit [Ping timeout: 480 seconds]
Method has joined #asahi-gpu
cr1901 has quit [Remote host closed the connection]
cr1901 has joined #asahi-gpu
drubrkletern has quit [Remote host closed the connection]
cr1901 has quit [Remote host closed the connection]
cr1901 has joined #asahi-gpu
<chadmed> i feel dirty for not reviewing jannau's ebuilds before slapping that rebase and merge button
<chadmed> one day our luck is going to run out and 300 issues are going to be posted about some silly lil ebuild nuking people's root fs
<jannau> chadmed: -EWRONGCHAN, but it's mostly just renaming ebuilds for version bumps
<chadmed> ah i was just adding $0.02 to alyssa's musings on review :P
<marcan> nobody reviews my kernel code and apparently I broke suspend in the latest release, sooooo :P
<jannau> I look at them post merge but I wouldn't have spent enough time on brcmfmac to spot that the change breaks suspend
hightower3 has quit [Ping timeout: 480 seconds]
chadmed has quit [Remote host closed the connection]
chadmed has joined #asahi-gpu
nimprod3l has joined #asahi-gpu
<sven> I try to not look at Broadcom stuff if i can avoid it for my sanity ;)
<austriancoder> alyssa: I am always open to learn something
hightower2 has joined #asahi-gpu
nimprod3l has quit [Quit: Leaving]
nsklaus has joined #asahi-gpu
hays has quit [Remote host closed the connection]
cylm has joined #asahi-gpu
<alyssa> mew
hays has joined #asahi-gpu
Brandon has joined #asahi-gpu
Brandon is now known as heliumlake
c10l48 has quit []
c10l48 has joined #asahi-gpu
ColnMustard has joined #asahi-gpu
heliumlake has quit [Remote host closed the connection]
ColnMustard has left #asahi-gpu [Leaving]
brandon has joined #asahi-gpu
brandon is now known as heliumlake
heliumlake has quit []
i509vcb has quit [Quit: Connection closed for inactivity]
mkurz has quit [Ping timeout: 480 seconds]
mkurz has joined #asahi-gpu
r0ni has joined #asahi-gpu
mkurz has quit [Remote host closed the connection]
SinSinati5 has joined #asahi-gpu
stipa has joined #asahi-gpu
SinSinati5 has quit [Remote host closed the connection]
i509vcb has joined #asahi-gpu
c10l48 has quit [Ping timeout: 480 seconds]
nimprod3l has joined #asahi-gpu
<alyssa> early frag tests + side fx: first instruction is a `sample_mask ~0, 1`, which I guess triggers tests to happen
<alyssa> after that normal
<alyssa> (before that is wait pix 512, 3 and fencing)
<alyssa> 3c: f5aa memory_barrier 2, 2, 10
<alyssa> 3e: f5ae memory_barrier 3, 2, 10
<alyssa> 40: f5a9 memory_barrier 2, 1, 10
<alyssa> 42: f5ad memory_barrier 3, 1, 10
<alyssa> fencing needed for image_write -> texture_load
<alyssa> for 2D
<alyssa> interestingly no wait is required
<alyssa> maybe there's a builtin wait
nimprod3l has quit [Quit: Leaving]
<Guest2261> mew
cylm has quit [Ping timeout: 480 seconds]
<alyssa> Guest2261: indeed
<alyssa> a control barrier with a texture fence:
<alyssa> 4e: f596 memory_barrier 1, 2, 9
<alyssa> 50: f5aa memory_barrier 2, 2, 10
<alyssa> 52: f5ae memory_barrier 3, 2, 10
<alyssa> 54: 6800 threadgroup_barrier
<alyssa> 56: f5a9 memory_barrier 2, 1, 10
<alyssa> 58: f5ad memory_barrier 3, 1, 10
<alyssa> the same basic ops there
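The five memory_barrier encodings quoted in the two dumps above are consistent with a simple packing of the second byte: the first operand in bits 3:2, the second in bits 1:0, and the third in the high nibble. The sketch below is inferred only from those five samples; it is not a documented AGX encoding. The function name is hypothetical, and treating the leading 0xf5 byte as a fixed opcode byte is an assumption.

```python
def decode_memory_barrier(word_hex: str) -> tuple:
    """Decode operands from a 4-hex-digit dump such as 'f5aa'.

    Assumption (inferred from the samples in the log only): the first
    byte is always 0xf5 and the second byte packs the three operands.
    """
    raw = bytes.fromhex(word_hex)
    if len(raw) != 2 or raw[0] != 0xF5:
        raise ValueError("not a memory_barrier sample of the assumed form")
    operand_byte = raw[1]
    first = (operand_byte >> 2) & 0x3   # bits 3:2
    second = operand_byte & 0x3         # bits 1:0
    third = (operand_byte >> 4) & 0xF   # bits 7:4
    return first, second, third


if __name__ == "__main__":
    for sample in ("f5aa", "f5ae", "f5a9", "f5ad", "f596"):
        print(sample, "memory_barrier %d, %d, %d" % decode_memory_barrier(sample))
```

More samples would be needed to confirm what each field means or whether the layout holds for other operand values.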
possiblemeatball has joined #asahi-gpu
c10l48 has joined #asahi-gpu
<zzywysm> FYI Apple dropped a huge patch on top of wine that adds DirectX12 support: https://github.com/apple/homebrew-apple/blob/main/Formula/game-porting-toolkit.rb
<alyssa> I'm aware.
<alyssa> life goes on for us *shrug*
<alyssa> ultimately FOSS will win out technically, usually does
<alyssa> upstream community-backed FOSS, even
<alyssa> there are good reasons gamers prefer radv to amdvlk, despite the latter even being open source (!)
<TellowKrinkle> AFAIK the open source part of the patch is mostly proton stuff. The DX12 support is not included, it's a closed source DX-Metal translator from Apple
<zzywysm> merely thought it would make lina's livestreamed games testing more interesting
<alyssa> that was my read as well
<alyssa> TellowKrinkle's
<alyssa> at any rate, it's not super on topic here
<TellowKrinkle> How? Lina isn't going to be running their DX-Metal translator in Linux...
<TellowKrinkle> (Also vkd3d is a thing, it works pretty well with Vulkan drivers that aren't MoltenVK)
<alyssa> vkd3d + agxv + fex + linux is the dream
<alyssa> how will vkd3d + agxv + fex + linux stack up against d3dmetal + apple's metal driver + rosetta + macOS?
<alyssa> well, we'll know in a few years I suppose.
* eric_engestrom likes this future, much better than the "have to keep a windows partition around just for gaming" from a couple of decades ago
<alyssa> I like the former better than the latter >:
c10l48 has quit []
c10l48 has joined #asahi-gpu
LinuxM2 has joined #asahi-gpu
mkurz has joined #asahi-gpu
stsmwg has quit [Quit: Lost terminal]
Armlin has joined #asahi-gpu
yuka has quit [Remote host closed the connection]
yuka has joined #asahi-gpu
c10l48 has quit [Ping timeout: 480 seconds]
c10l48 has joined #asahi-gpu
Armlin has quit []
c10l48 has quit []
c10l48 has joined #asahi-gpu
possiblemeatball has quit [Quit: Quit]
c10l48 has quit [Read error: Connection reset by peer]
c10l48 has joined #asahi-gpu
hightower2 has quit [Ping timeout: 480 seconds]
brolin has joined #asahi-gpu
c10l48 has quit [Ping timeout: 480 seconds]
c10l48 has joined #asahi-gpu
c10l48 has quit [Ping timeout: 480 seconds]
hightower2 has joined #asahi-gpu
c10l48 has joined #asahi-gpu
c10l484 has joined #asahi-gpu
c10l48 has quit [Read error: Connection reset by peer]
<ChaosPrincess> alyssa: so, the main feature of geometry shaders is the ability to write out multiple vertices. But their max count is known and capped ahead of time. So on hw level, what is the difference between a multi-output vertex shader and a geometry one?
<alyssa> ChaosPrincess: By multi-output I assume you mean Metal's "amplification"?
<alyssa> As far as I know, that's limited to outputting 2 vertices (instead of 1) from the VS
<alyssa> which makes it ~useless for our purposes
<alyssa> (Geometry shaders need like 256 min-max vertices)
hightower3 has joined #asahi-gpu
hightower3 has quit []
LinuxM2 has quit [Quit: Leaving]
<ChaosPrincess> Okay, not quite multi output with vertex but more like compute ones writing to an array
<alyssa> Right, geometry shaders on AGX are conceptually implemented as compute shaders writing to an array and then fed into the rasterizer with a passthrough vertex shader
<alyssa> Similar for mesh shaders and tessellation shaders
<ChaosPrincess> Seems too "simple"
<alyssa> From what i can tell, D3DMetal is doing geometry-on-mesh-on-vertex which sounds like it has "interesting" performance characteristics
<ChaosPrincess> there is bound to be some weird corner case that makes it ass
<alyssa> ChaosPrincess: Well, yeah. Devil's in the details.
<alyssa> Piles and piles of em
<alyssa> but the basic idea isn't too bad
brolin has quit [Ping timeout: 480 seconds]
<alyssa> You get lots of weirdness in all the non-corner cases, actually
<alyssa> like... geometry shaders output triangle *strips*, not triangles
<alyssa> and they can end the strip and start a new strip whenever they want
<ChaosPrincess> Can agx take triangle strips as vertex shader inputs?
<alyssa> yes, that's fine
<alyssa> but either you need to do extra copying (synthesizing vertices that don't actually exist to feed in triangles), or you need to generate an index buffer with primitive restart so you can feed in strips
<alyssa> both are decidedly more complicated than "just write to an array"
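The two options described above can be sketched on the CPU as follows. This is a minimal illustration, not the driver's actual code: the function names are hypothetical, the restart value assumes 32-bit indices, and the winding fix follows the usual OpenGL strip convention.

```python
RESTART = 0xFFFFFFFF  # assumed primitive restart value for 32-bit indices


def strips_to_triangle_list(strips):
    """Option 1: unroll strips into independent triangles.

    Vertices shared between neighbouring triangles are duplicated,
    which is the "extra copying" cost mentioned in the log.
    """
    triangles = []
    for strip in strips:
        for i in range(len(strip) - 2):
            a, b, c = strip[i], strip[i + 1], strip[i + 2]
            # Every odd triangle in a strip has flipped winding;
            # swap the first two vertices to keep it consistent.
            triangles.append((a, b, c) if i % 2 == 0 else (b, a, c))
    return triangles


def strips_to_restart_indices(strip_lengths):
    """Option 2: keep vertices as written and emit an index buffer.

    Vertices are assumed to be stored back to back in output order;
    a restart index separates consecutive strips.
    """
    indices = []
    base = 0
    for n, length in enumerate(strip_lengths):
        if n > 0:
            indices.append(RESTART)
        indices.extend(range(base, base + length))
        base += length
    return indices


if __name__ == "__main__":
    print(strips_to_triangle_list([[0, 1, 2, 3], [4, 5, 6]]))
    print(strips_to_restart_indices([4, 3]))
```

In a real geometry shader the strip boundaries are only known at run time, so either variant would have to be produced by the shader itself rather than by host code like this.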
<ChaosPrincess> So i take it you also need to write indices from "geometry"
<alyssa> often
<alyssa> similar issues on the input side
<alyssa> if you feed a geometry shader with inputs with an index buffer, well if you're doing compute, you get to do primitive assembly yourself
<alyssa> have fun reading the index buffer yourself in compute
<alyssa> etc
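As a rough illustration of "doing primitive assembly yourself", the sketch below maps a primitive ID to the three vertex IDs a compute invocation would have to read from the index buffer. It is a simplified model under stated assumptions: the function name and the topology strings are hypothetical, only triangle lists and strips are covered, and primitive restart is ignored.

```python
def assemble_triangle(index_buffer, prim_id, topology="list"):
    """Return the vertex IDs of triangle `prim_id`.

    Mirrors what fixed-function input assembly would normally do
    before the geometry stage; restart indices are not handled.
    """
    if topology == "list":
        first = 3 * prim_id
        order = (0, 1, 2)
    elif topology == "strip":
        first = prim_id
        # Odd strip triangles swap the first two vertices to keep winding.
        order = (0, 1, 2) if prim_id % 2 == 0 else (1, 0, 2)
    else:
        raise ValueError("unsupported topology: %r" % (topology,))
    if prim_id < 0 or first + 2 >= len(index_buffer):
        raise IndexError("primitive %d is out of range" % prim_id)
    return tuple(index_buffer[first + k] for k in order)


if __name__ == "__main__":
    print(assemble_triangle([5, 6, 7, 8, 9, 10], 1))      # triangle list
    print(assemble_triangle([5, 6, 7, 8], 1, "strip"))    # triangle strip
```

A shader version would also need to handle 8-, 16-, and 32-bit index types, base vertex offsets, and the other input topologies, which is where much of the complexity mentioned above comes from.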
<alyssa> none of these problems are impossible to solve
<alyssa> but there are lots and lots of them
<alyssa> and then.. transform feedback
<alyssa> but i don't want to talk about that right now o_o
<ChaosPrincess> Is primitive assembly "after vertex" or "before raster"?
<ChaosPrincess> Guess I'm assuming it's a fixed function block - c/d?
c10l484 has quit [Read error: Connection reset by peer]
c10l484 has joined #asahi-gpu
aafeke_ has joined #asahi-gpu
<alyssa> force-early-z has depth unchanged, pass type "Translucent punch through", tri merging enabled
<alyssa> late-z has depth any, pass type Punch through, tri merging disabled
<alyssa> with fragcoord z input
<alyssa> zs_emit at the end of the shader
<alyssa> with late-z and no output, still zs_emit at very end
<alyssa> unconditional discard 1/2 set
<alyssa> unknown 1:0 cleared
<alyssa> (from 1)
<alyssa> unknown 3:0 next to cf bindings cleared
<alyssa> and unknown 7 cleared to 0
<alyssa> in occlusion query 2
<TellowKrinkle> Is tri merging mixing pixels from different triangles in one quad? Do they have to disable that for anything that needs derivatives?
<alyssa> TellowKrinkle: It's not 100% clear how tri merging works in the hw, but yes, disabled for anything needing derivatives
<alyssa> for force early with no rt, just sample mask at the start
<alyssa> pass type trans punch through
<alyssa> tri merging enabled
<alyssa> unconditional discard 1/2 still set
<alyssa> well all those bits are set so, um,
<alyssa> oh i can't
<alyssa> duh
<alyssa> duh
aafeke_ has quit [Quit: aafeke_]
aafeke_ has joined #asahi-gpu
aafeke_ has quit [Quit: aafeke_]
<alyssa> Disturbingly, it seems txf does not ignore the sampler (-:
<alyssa> in particular, the hw applies the wrap mode (-:
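To make the consequence concrete: if the sampler's wrap mode is applied to integer texel coordinates, an out-of-range fetch lands on a different texel depending on the sampler bound. The sketch below models only the two most common wrap modes as they are conventionally defined in graphics APIs; the log does not describe how the hardware treats each mode, and the function and mode names are hypothetical.

```python
def wrap_texel_coord(coord, size, mode):
    """Map an integer texel coordinate into [0, size) for a wrap mode.

    Conventional API semantics are assumed here, not measured
    hardware behaviour.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if mode == "repeat":
        # Python's % is non-negative for a positive modulus.
        return coord % size
    if mode == "clamp_to_edge":
        return min(max(coord, 0), size - 1)
    raise ValueError("unsupported wrap mode: %r" % (mode,))


if __name__ == "__main__":
    for mode in ("repeat", "clamp_to_edge"):
        print(mode, [wrap_texel_coord(x, 4, mode) for x in (-1, 0, 3, 5)])
```

A driver that wants API-defined out-of-bounds txf results would therefore need to care which sampler state is in effect during the fetch.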
compassion18 has joined #asahi-gpu
<alyssa> and also disturbingly, I still can't find where bindless samplers are in memory
<alyssa> they just.. don't exist?
<alyssa> how is this memory possibly mapped?
<alyssa> lina: I think this might be one for you. wrap.dylib is giving me spooky action
compassion1 has quit [Ping timeout: 480 seconds]
compassion18 is now known as compassion1
<alyssa> In Metal, if you have a sampler in an argument buffer and use it
<alyssa> where does that sampler live in GPU memory?!
<alyssa> There's some global heap of sampler descriptors *somewhere*
<alyssa> but where
<alyssa> i'm dumping all known memory and not seeing anything happen
cylm has joined #asahi-gpu
brolin has joined #asahi-gpu
<alyssa> 2023-06-07 19:37:07.887488-0400 app[4245:43641] Execution of the command buffer was aborted due to an error during execution. Invalid Resource (00000009:kIOGPUCommandBufferCallbackErrorInvalidResource)
<alyssa> that looks, possibly relevant
<alyssa> IDK. -ELATE
<alyssa> lina: Definitely one for you I think
brolin has quit [Ping timeout: 480 seconds]
<alyssa> we're specifically interested in samplers encoded to argument buffers, where does the actual sampler descriptor end up in memory?
<alyssa> This probably will correspond to UAPI stuff
mkurz has quit [Ping timeout: 480 seconds]
Z750 has quit [Quit: bye]
Z750 has joined #asahi-gpu
Z750 has quit []
Z750 has joined #asahi-gpu