ChanServ changed the topic of #asahi-gpu to: Asahi Linux GPU development (no user support, NO binary reversing) | Keep things on topic | GitHub: https://alx.sh/g | Wiki: https://alx.sh/w | Logs: https://alx.sh/l/asahi-gpu
<i509vcb> nir_load_global_constant can be used to read from a buffer, but how do I tell nir to store that value into some uniform registers?
<i509vcb> I have sorted sysvals so that uniform registers can be assigned, but I need to load a value from the sysval buffer and store it in a uniform reg for use
<i509vcb> agx_emit_store_preamble?
<alyssa> i509vcb: short answer is "you don't"
<alyssa> if you emit load_global_constant, that means the value is read /in the shader/ directly
<alyssa> and there are no uniform registers involved from the driver's perspective
<alyssa> no layouts for the driver to consider, no usc_uniform calls
<i509vcb> hmm okay, so really the only usc_uniform call I'd probably do then is tell the driver where I uploaded the agxv_sysvals_whatever
<alyssa> in practice, the compiler will internally generate store_preamble instructions to optimize the load into a uniform, but that's completely invisible to the driver
<alyssa> right
<alyssa> otoh, if you want to calculate layouts yourself, then you use usc_uniform in the driver and you have to emit load_preamble directly -- not load_global_constant
<alyssa> and set the compiler input accordingly so your uniforms don't get stomped on
<i509vcb> I guess part of this question was reading
<i509vcb> > agx_usc_uniform is then used to copy from that struct to uniform registers according to the layout that you picked earlier
<i509vcb> But from a theoretical performance standpoint, loading that value into a uniform register would just generate a redundant mov?
<alyssa> uhh there are two options
<alyssa> 1. you do not assign uniforms, you just generate load_global_const and let the compiler go to town
<alyssa> 2. you do assign uniforms, and instead of generating load_global_const, you generate load_preamble and the compiler won't touch that
chadmed has quit [Ping timeout: 480 seconds]
<alyssa> in the second case only you use agx_usc_uniform to copy things into place
<alyssa> in the first case the compiler deals with it
<i509vcb> I guess the former sounds better from a maintenance standpoint since we already upload all the stuff into a buffer
<alyssa> sure, it's less work
<i509vcb> And I doubt I can do much better than the compiler right now
<alyssa> sure
<alyssa> in #1, the only load_preamble you generate is for the address of the sysval buffer
<alyssa> and the only usc_uniform is for that single 64-bit address at u0_u1
<alyssa> probably not optimal for a production vk driver but it's easier to add the optimizations later
<i509vcb> I've left the bound buffers in the uniforms for now
<i509vcb> Mainly because of the compatibility stuff currently there
<i509vcb> although it is definitely chunky
<i509vcb> 32 uint64_t means 128 uniform registers just for buffers
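[editor's note] The register count above can be checked with a little arithmetic. This is only a sketch: it assumes each uniform register is 16 bits wide, and `uniform_regs_for` is a hypothetical helper, not a driver or compiler function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: one uniform register holds 16 bits (one "half"). */
#define UNIFORM_REG_BITS 16u

/* Hypothetical helper: how many 16-bit uniform registers are needed to
 * hold `count` values of `bytes_each` bytes each. */
static unsigned uniform_regs_for(unsigned count, size_t bytes_each)
{
   unsigned bits = count * (unsigned)bytes_each * 8u;

   /* Values are expected to be a whole number of registers wide. */
   assert(bits % UNIFORM_REG_BITS == 0);
   return bits / UNIFORM_REG_BITS;
}
```

Under that assumption, `uniform_regs_for(32, sizeof(uint64_t))` gives 128, and a single 64-bit address takes 4 registers.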
as400 has quit [Remote host closed the connection]
as400 has joined #asahi-gpu
<i509vcb> I'm going to take moving over fully to the sysval buffer a little more slowly. I've found that trying to do push constants, descriptors and vbo stuff at the same time as num_workgroups is a bit hard to test properly
ourdumbfuture has quit [Quit: My Mac has gone to sleep. ZZZzzz…]
ourdumbfuture has joined #asahi-gpu
chadmed has joined #asahi-gpu
jeisom has quit [Ping timeout: 480 seconds]
ourdumbfuture has quit [Quit: My Mac has gone to sleep. ZZZzzz…]
cylm has quit [Ping timeout: 480 seconds]
<i509vcb> I *think* this lowering is correct, not sure why vulkan keeps saying I am loading (0, 0, 0) through: https://gist.github.com/i509VCB/fce24eb2825733cc4bfacef043b398ba
<i509vcb> That's replacing load_sysval_agx
<i509vcb> Which in the case of load_num_workgroups is this: https://gist.github.com/i509VCB/b343bb352ecaad0f23245330359add54
<i509vcb> The base is intentionally hardcoded to 0 as I have shifted other things over in the uniform registers
<i509vcb> huh it's flaking
<i509vcb> I guess I learned what happens when you use agx_usc_uniform on the same uniform registers twice
<i509vcb> You get a race condition
<i509vcb> still doesn't fix the issue where it always fails if the sysval being loaded has a binding that isn't zero...
novafacing992 has joined #asahi-gpu
novafacing9921 has joined #asahi-gpu
novafacing992 has quit [Ping timeout: 480 seconds]
novafacing9921 is now known as novafacing992
crabbedhaloablut has joined #asahi-gpu
<i509vcb> agxdecode.dump.0000 seems to tell me this about the first uniform which should be a pointer to a sysvals struct:
<i509vcb> Uniform
<i509vcb> Tag: Uniform
<i509vcb> Size (halfs): 4
<i509vcb> Start (halfs): 0
<i509vcb> Buffer: 0x5ffffac34c
<i509vcb> 000000 22 AD 3B 08 D5 D1 09 05
<i509vcb> but is that 000000 the content of the buffer or the content of the uniform?
<i509vcb> s/content of buffer/content of buffer being pointed to
<i509vcb> I'll also link the preamble's nir, the 32x3 load seems suspicious with %5 being 2 where it should be 8?
<i509vcb> I'd expect 32x3 %15 = @load_constant_agx (%13, %5 (0x8)) (access=none, base=0, format=r32_uint, sign_extend=0)?
albertobasaglia has joined #asahi-gpu
Misthios has joined #asahi-gpu
Mary has quit [Quit: The Lounge - https://thelounge.chat]
systwi has quit [Ping timeout: 480 seconds]
systwi has joined #asahi-gpu
Mary has joined #asahi-gpu
Mary has quit []
Mary has joined #asahi-gpu
Mary has quit [Quit: The Lounge - https://thelounge.chat]
Mary has joined #asahi-gpu
jeisom has joined #asahi-gpu
Ndfkjhw4 has joined #asahi-gpu
ourdumbfuture has joined #asahi-gpu
albertobasaglia has quit [Ping timeout: 480 seconds]
<alyssa> that's a lot of messages
<alyssa> this would load the /address/ of the sysval, you need to wrap this whole thing in a load_const_agx
<alyssa> at least as it's used in the GL driver, load_sysval_agx is intended to return the value
<alyssa> for load_num_workgroups that means there would be 2 dependent loads in this case
<alyssa> load_global_const(load_global_const(load_preamble(0) + offset to the num_workgroups))
<alyssa> since you put the address in the buffer and then you need to load from the address in the buffer
<alyssa> agx_usc_uniform works like a memcpy: move the contents of this GPU memory into these uniforms
<alyssa> if you want to put an address in a uniform, you need to pool_upload the address to the cmdbuf pool and pass /that/ address to usc_uniform
<alyssa> that 0000 is a hexdump of the content of the uniform
<alyssa> (alternatively, you could pass the entirety of the sysval buffer to agx_usc_uniform and then use the load_sysval_agx you implemented above, without the load. but that is very chunky since you don't use most of those uniforms in a given pipeline, which is why the gl driver has the layout code.)
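[editor's note] The memcpy behaviour and the two dependent loads described above can be modelled on the CPU. This is a toy sketch, not driver code: "GPU memory" is a byte array, an "address" is an offset into it, and every `toy_*` name, the `toy_sysvals` layout, and the chosen offsets are hypothetical stand-ins for pool_upload, agx_usc_uniform, load_preamble and load_global_const.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint8_t toy_mem[256];      /* toy "GPU memory"; addresses are offsets */
static uint16_t toy_uniforms[16]; /* toy uniform file, 16 bits per register */

/* Hypothetical sysval layout: holds the address of num_workgroups. */
struct toy_sysvals { uint64_t num_workgroups_addr; };

/* Stand-in for pool_upload: copy data to `addr`, return that address. */
static uint64_t toy_upload(uint64_t addr, const void *data, size_t size)
{
   assert(addr + size <= sizeof(toy_mem));
   memcpy(&toy_mem[addr], data, size);
   return addr;
}

/* Stand-in for agx_usc_uniform: a memcpy from memory into uniforms. */
static void toy_usc_uniform(unsigned start, unsigned halfs, uint64_t src)
{
   assert(start + halfs <= 16 && src + halfs * 2u <= sizeof(toy_mem));
   memcpy(&toy_uniforms[start], &toy_mem[src], halfs * 2u);
}

/* Stand-in for a 64-bit load_preamble starting at uniform `start`. */
static uint64_t toy_load_preamble64(unsigned start)
{
   uint64_t v;
   assert(start + 4 <= 16);
   memcpy(&v, &toy_uniforms[start], sizeof(v));
   return v;
}

/* Stand-ins for 64-bit and 32-bit load_global_const. */
static uint64_t toy_load64(uint64_t addr)
{
   uint64_t v;
   assert(addr + sizeof(v) <= sizeof(toy_mem));
   memcpy(&v, &toy_mem[addr], sizeof(v));
   return v;
}

static uint32_t toy_load32(uint64_t addr)
{
   uint32_t v;
   assert(addr + sizeof(v) <= sizeof(toy_mem));
   memcpy(&v, &toy_mem[addr], sizeof(v));
   return v;
}

/* Driver side plus shader side for reading num_workgroups[i]. */
static uint32_t toy_read_num_workgroups(unsigned i)
{
   const uint32_t grid[3] = { 4, 2, 1 };
   assert(i < 3);

   /* Driver: upload the grid, then a sysval struct that points at it. */
   struct toy_sysvals sv = { toy_upload(64, grid, sizeof(grid)) };
   uint64_t sysvals_addr = toy_upload(128, &sv, sizeof(sv));

   /* Upload the address itself and hand /that/ location to usc_uniform,
    * so u0..u3 end up holding the address of the sysval struct. */
   uint64_t ptr_addr = toy_upload(192, &sysvals_addr, sizeof(sysvals_addr));
   toy_usc_uniform(0, 4, ptr_addr);

   /* Shader: load(load(load_preamble(0) + offset) + component offset). */
   uint64_t base = toy_load_preamble64(0);
   uint64_t nw = toy_load64(base + offsetof(struct toy_sysvals, num_workgroups_addr));
   return toy_load32(nw + i * sizeof(uint32_t));
}
```

In this model, calling `toy_usc_uniform(0, 4, sysvals_addr)` instead would copy the first 8 bytes of the struct into the uniforms rather than its address, which corresponds to the alternative of passing the sysval buffer contents directly.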
cylm has joined #asahi-gpu
ourdumbfuture has quit [Quit: My Mac has gone to sleep. ZZZzzz…]
ourdumbfuture has joined #asahi-gpu
maria has quit [Read error: Connection reset by peer]
maria has joined #asahi-gpu
systwi has quit []
systwi has joined #asahi-gpu
Mary has quit [Quit: The Lounge - https://thelounge.chat]
Mary has joined #asahi-gpu
ourdumbfuture has quit [Quit: My Mac has gone to sleep. ZZZzzz…]
ourdumbfuture has joined #asahi-gpu
ourdumbfuture has quit [Quit: My Mac has gone to sleep. ZZZzzz…]
lena6 has quit [Remote host closed the connection]
mikee3000 has quit [Quit: WeeChat 3.8]
jeisom has quit [Ping timeout: 480 seconds]
DarkShadow4444 has quit [Quit: ZNC - https://znc.in]
DarkShadow44 has joined #asahi-gpu
mikee3000 has joined #asahi-gpu
ourdumbfuture has joined #asahi-gpu
nela has quit [Ping timeout: 480 seconds]
nela has joined #asahi-gpu
ourdumbfuture has quit [Quit: My Mac has gone to sleep. ZZZzzz…]
ourdumbfuture has joined #asahi-gpu
ourdumbfuture has quit [Quit: My Mac has gone to sleep. ZZZzzz…]
Mary2 has joined #asahi-gpu
Mary2 has quit []
ourdumbfuture has joined #asahi-gpu
ourdumbfuture has quit [Quit: My Mac has gone to sleep. ZZZzzz…]
flom84 has joined #asahi-gpu
<i509vcb> alyssa: yep the missing second load_global_const was the issue
ourdumbfuture has joined #asahi-gpu
crabbedhaloablut has quit []
flom84 has quit [Ping timeout: 480 seconds]
<alyssa> :+1:
V has quit [Ping timeout: 480 seconds]
ourdumbfuture has quit [Quit: My Mac has gone to sleep. ZZZzzz…]
V has joined #asahi-gpu
ourdumbfuture has joined #asahi-gpu
Z750 has quit [Quit: bye]
Z750 has joined #asahi-gpu
Z750 has quit []
Z750 has joined #asahi-gpu
ourdumbfuture has quit [Quit: My Mac has gone to sleep. ZZZzzz…]
amarioguy has joined #asahi-gpu
jeisom has joined #asahi-gpu
ourdumbfuture has joined #asahi-gpu
chadmed has quit [Ping timeout: 480 seconds]
akemin_dayo has quit [Ping timeout: 480 seconds]
ourdumbfuture has quit [Quit: My Mac has gone to sleep. ZZZzzz…]
ourdumbfuture has joined #asahi-gpu
darkapex has quit [Remote host closed the connection]
darkapex has joined #asahi-gpu
chadmed has joined #asahi-gpu
cylm_ has joined #asahi-gpu
cylm has quit [Ping timeout: 480 seconds]
ourdumbfuture has quit [Quit: My Mac has gone to sleep. ZZZzzz…]
ourdumbfuture has joined #asahi-gpu
ourdumbfuture has quit [Quit: My Mac has gone to sleep. ZZZzzz…]