ChanServ changed the topic of #asahi-gpu to: Asahi Linux: porting Linux to Apple Silicon macs | GPU / 3D graphics stack black-box RE and development (NO binary reversing) | Keep things on topic | GitHub: https://alx.sh/g | Wiki: https://alx.sh/w | Logs: https://alx.sh/l/asahi-gpu
alexstore06 has quit [Remote host closed the connection]
yuyichao_ has joined #asahi-gpu
alexstore06 has joined #asahi-gpu
nsklaus_ has quit [Ping timeout: 480 seconds]
nsklaus_ has joined #asahi-gpu
user982492 has joined #asahi-gpu
alexstore06 has quit [Remote host closed the connection]
alexstore06 has joined #asahi-gpu
alexstore06 has quit [Remote host closed the connection]
PhilippvK has joined #asahi-gpu
phiologe has quit [Ping timeout: 480 seconds]
skipwich has quit [Quit: DISCONNECT]
skipwich has joined #asahi-gpu
alexstore06 has joined #asahi-gpu
alexstor_ has joined #asahi-gpu
alexstore06 has quit [Remote host closed the connection]
alexstor_ has quit [Ping timeout: 480 seconds]
alexstore06 has joined #asahi-gpu
user982492 has quit [Read error: Connection reset by peer]
user982492_ has joined #asahi-gpu
alexstore06 has quit [Ping timeout: 480 seconds]
user982492_ has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
user982492 has joined #asahi-gpu
jacoxon has joined #asahi-gpu
darkapex has joined #asahi-gpu
user982492 has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
alexstore06 has quit [Remote host closed the connection]
alexstore06 has joined #asahi-gpu
alexstor_ has joined #asahi-gpu
alexstore06 has quit [Ping timeout: 480 seconds]
alexstor_ has quit [Remote host closed the connection]
alexstore06 has joined #asahi-gpu
darkapex3 has joined #asahi-gpu
nsklaus_ has quit [Quit: WeeChat 3.2]
darkapex2 has quit [Ping timeout: 480 seconds]
alyssa has joined #asahi-gpu
<alyssa>
Curious if AGX supports indirection on the constant (uniform) file
<alyssa>
rereading the Metal optimization guide suggests "maybe, if you hint the compiler just right"
yuyichao_ has joined #asahi-gpu
<alyssa>
constant T &arr [[buffer(0)]]
<alyssa>
typedef struct { data[8] } T
<alyssa>
something like that
<alyssa>
there will be magic bits for HSR, [[early_fragment_tests]] may help identify which
alexstore06 has left #asahi-gpu [Leaving...]
yuyichao has quit [Ping timeout: 480 seconds]
alexstore06 has joined #asahi-gpu
yuyichao has joined #asahi-gpu
jacoxon has quit []
jacoxon has joined #asahi-gpu
yuyichao_ has quit [Ping timeout: 480 seconds]
yuyichao has quit [Ping timeout: 480 seconds]
minecrell has quit [Read error: Connection reset by peer]
yuyichao has joined #asahi-gpu
minecrell has joined #asahi-gpu
alexstore06 has quit [Quit: Leaving...]
yuyichao has quit [Ping timeout: 480 seconds]
nsklaus has joined #asahi-gpu
X-Scale` has joined #asahi-gpu
X-Scale has quit [Ping timeout: 480 seconds]
jacoxon has quit []
amw has quit [Ping timeout: 480 seconds]
radex has quit [Read error: Connection reset by peer]
X-Scale has joined #asahi-gpu
user982492 has joined #asahi-gpu
X-Scale` has quit [Ping timeout: 480 seconds]
aleasto has quit [Quit: Konversation terminated!]
aleasto has joined #asahi-gpu
aleasto has quit [Quit: Konversation terminated!]
aleasto has joined #asahi-gpu
nsklaus has quit [Read error: Connection reset by peer]
nsklaus has joined #asahi-gpu
nsklaus has quit []
user982492 has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
jbowen has quit [Quit: leaving]
crucifix has quit [Quit: Leaving]
user982492 has joined #asahi-gpu
<phire>
alyssa: oh, I didn't know GPU/compilers had problems doing indirection on the constant/uniform file. I rely a lot on that in dolphin's ubershader
<phire>
then again, the ubershaders were more bound on indexing into the dynamic register file.
<phire>
because some shader ISAs simply don't support it, and their drivers will instead put the array you were dynamically indexing out into main memory, and performance tanked
<alyssa>
phire: There are two issues being mixed up here (and in the Apple presentation I dug up)
<alyssa>
One is literally doing a load from constant/uniform memory, and the other is pushing the uniform to a dedicated uniform register / push constant / fast access uniform / register-mapped uniform
<alyssa>
Any reasonable ISA can do indirection on the former, I don't know if AGX can do indirection on the latter
<alyssa>
The former is still fast (since it'll be hot in cache) but performing a memory load is inherently more expensive than a pushed uniform that can be accessed anywhere for free
<alyssa>
I don't believe AGX can do dynamic indexing of the register file.
aleasto has quit [Quit: Konversation terminated!]
lucifer178[m] has joined #asahi-gpu
<phire>
I was only using uniform indexing into the register file. When I was talking to an Apple driver engineer back in like 2015, it didn't support it
<phire>
after talking with me, they actually experimented with an optimisation pass that converted uniform indirection of small arrays into if/else trees (which I was experimenting with manually to speed up Nvidia GPUs)
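[Illustrative sketch, not from the channel] The if/else-tree transformation phire describes can be shown in plain C++ rather than in a real shader language. The array size of 4, the `Uniforms` type, and both function names are assumptions made up for this example; the actual compiler pass and the arrays it targeted are not described in the log. The point is only that every array access in the rewritten version uses a compile-time-constant index, so it could in principle be served from fixed registers instead of a memory load.

```cpp
#include <array>
#include <cstddef>

// Hypothetical 4-entry "uniform" array; the size and name are assumptions.
using Uniforms = std::array<float, 4>;

// Straightforward version: one dynamically indexed access.
// On hardware without indexed register access this may become a memory load.
float load_indexed(const Uniforms& u, std::size_t i) {
    return u[i];
}

// Rewritten version: a balanced if/else tree over the index.
// Each leaf reads a constant index, so no dynamic indexing remains.
// Assumes i < 4; any larger index falls through to the last leaf.
float load_if_tree(const Uniforms& u, std::size_t i) {
    if (i < 2) {
        if (i == 0) {
            return u[0];
        } else {
            return u[1];
        }
    } else {
        if (i == 2) {
            return u[2];
        } else {
            return u[3];
        }
    }
}
```

For an array of N entries the tree costs about log2(N) compares per access, which is why this would only be attractive for small arrays, as in the message above.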