marcan changed the topic of #asahi-gpu to: Asahi Linux: porting Linux to Apple Silicon macs | GPU / 3D graphics stack black-box RE and development (NO binary reversing) | Keep things on topic | GitHub: https://alx.sh/g | Wiki: https://alx.sh/w | Logs: https://alx.sh/l/asahi-gpu
omoiti has quit [Remote host closed the connection]
PeterEaston has joined #asahi-gpu
TheJollyRoger has quit [Ping timeout: 268 seconds]
Necrosporus has quit [Ping timeout: 240 seconds]
omoiti has joined #asahi-gpu
JusticeEX has quit [Ping timeout: 245 seconds]
zkrx has quit [Ping timeout: 246 seconds]
zkrx has joined #asahi-gpu
JusticeEX has joined #asahi-gpu
phiologe has quit [Ping timeout: 244 seconds]
phiologe has joined #asahi-gpu
odmir has quit [Remote host closed the connection]
odmir has joined #asahi-gpu
odmir has quit [Ping timeout: 272 seconds]
_whitelogger_ has quit [Remote host closed the connection]
_whitelogger_ has joined #asahi-gpu
omoiti has quit [Read error: Connection reset by peer]
PeterEaston has quit [Quit: PeterEaston]
maor26 has joined #asahi-gpu
irradiated_lemmi has quit [Quit: Idle for 30+ days]
stemnic has quit [Remote host closed the connection]
<DarkShadow44>
do you know what exactly u0_u1 do here?
<DarkShadow44>
I think I understand most of the others, but that currently trips me up. The GPU just softfaults, and I guess my lack of understanding of this part is to blame
k|r|b|t|g|t is now known as krbtgt
<DarkShadow44>
hmm maybe the address of the buffer on the gpu
JusticeEX has joined #asahi-gpu
<chrisf>
DarkShadow44: should be the base address of the buffer
<chrisf>
DarkShadow44: there's a better description of the "simple" loads and stores in the older docs at https://hthh.github.io/m1-gpu-re/
<chrisf>
dougall has since figured out a lot more of the encoding but we're short a description of exactly what happens
<DarkShadow44>
Yeah, I expected something like that
<chrisf>
DarkShadow44: some of the details of that pseudocode are wrong [what happens with a hole in the mask, etc], i had pretty incomplete information at the time
<DarkShadow44>
though I'm not entirely sure why the main GPU program expects them starting from u0_u1 but the .metallib expects them from r20_r21
<DarkShadow44>
makes sense, thanks
<chrisf>
DarkShadow44: when the buffer is a parameter to the entrypoint function you know it's uniform across all lanes
<chrisf>
DarkShadow44: some other rando function may have different values in every lane
<DarkShadow44>
that does make sense. Maybe the Metal calling convention just starts function parameters from r20 upwards...
<chrisf>
i havent looked in detail at the calling convention, if you want to figure it out, great
<DarkShadow44>
then again, it doesn't matter too much which registers are used for what, since we can make up our own conventions
<DarkShadow44>
anyways, I now have enough set up to upload arbitrary programs(*) to the GPU and run them
<DarkShadow44>
just moving the uniforms to the required registers is enough for the .metallib shaders to work
<DarkShadow44>
what's kinda nice: The buffer for the shaders can be reused multiple times, meaning I could upload and run ten different shaders without having to restart the program
<DarkShadow44>
similar to what dougallj/applegpu is doing, but all in C and within one single process
<chrisf>
for real driver we probably wont even have much of a convention -- mesa approach is generally "inline the universe"
<DarkShadow44>
eh, depends on what we're given. When we get already inlined bytecode, even easier
<chrisf>
we'll get already-inlined NIR
<chrisf>
we "just" have to lower it to the isa
<DarkShadow44>
okay
<chrisf>
not trivial but also doesnt look like it will be completely horrible :)
<DarkShadow44>
the shader infrastructure doesn't sound too far either, y'all seem to have done some amazing work already
<chrisf>
so you've figured out command stream for compute-only?
<chrisf>
graphics command stream has plenty of dragons
<DarkShadow44>
not really, no. I just use the github code to inject arbitrary shaders onto the gpu - replacing the original shader at a fixed offset...
<DarkShadow44>
it's good enough for me to test the ISA though
<DarkShadow44>
still wondering how exactly we get to the kernel part of things, btw
<DarkShadow44>
assuming there is a kernel layer beneath all that
<chrisf>
there is a kernel driver, yes
<chrisf>
marcan and others are going to have fun when we eventually need a linux one. not my area, but i hope it's possible to get under the existing driver and trace the mmio etc
<chrisf>
the good news is that it probably doesnt hold many fiddly secrets
<DarkShadow44>
hopefully, because kernel stuff is finicky
omoiti has joined #asahi-gpu
omoiti has quit [Remote host closed the connection]