ChanServ changed the topic of #lima to: Development channel for open source lima driver for ARM Mali4** GPUs - Kernel driver has landed in mainline, userspace driver is part of mesa - Logs at https://oftc.irclog.whitequark.org/lima/
camus1 has joined #lima
camus has quit [Ping timeout: 480 seconds]
camus has joined #lima
camus1 has quit [Ping timeout: 480 seconds]
<anarsoul>
enunes: it's a bit tricky to add a second output for gl_FragDepth/gl_FragStencil with our regalloc :)
<anarsoul>
so it's all speculation, since I don't have access to documentation, but basically I think that whenever we hit an instruction with the "stop" bit set, it writes outputs from specific registers
<anarsoul>
i.e. reg0 for gl_FragColor (register can be changed, it's controlled by RSW)
<anarsoul>
and let's say reg1 for gl_FragDepth+gl_FragStencil
<anarsoul>
currently we don't even account writing to reg0 in regalloc
<anarsoul>
or rather any ssa destinations for nodes with is_end set
<anarsoul>
I guess I need to think how to fix it
<anarsoul>
basically with multiple store_output intrinsics using a single is_end flag won't work
<anarsoul>
so I think we need to rename is_end to is_store (with proper regalloc_index set, so it gets fixed reg number for output)
<anarsoul>
and then set output reg liveness from node to the end of the block
<anarsoul>
last instruction of the block (or discard) should have stop flag set
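[Editor's sketch] The proposal above (mark each output node as a store with a fixed register, keep that register live to the end of the block, and put the stop flag on the last instruction) can be illustrated with a toy model. This is not ppir code: `toy_node`, `toy_range`, `toy_compute_ranges` and `toy_stop_index` are hypothetical names invented for this note, and the discard case mentioned above is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a ppir node inside one block. */
struct toy_node {
    bool is_store;   /* node feeds a shader output (the proposed is_store) */
    int  fixed_reg;  /* fixed output register for stores, -1 otherwise */
    int  last_use;   /* index of the last reader in the block, -1 if none */
};

/* Live range of the value produced by a node, as instruction indices. */
struct toy_range {
    int start;
    int end;
    int fixed_reg;   /* -1 means the allocator may pick any register */
};

/* Ordinary values die at their last use. Store values stay live until
 * the end of the block, so nothing else can reuse the output register
 * before the instruction carrying the stop flag is reached. */
static void toy_compute_ranges(const struct toy_node *nodes, size_t count,
                               struct toy_range *out)
{
    assert(count > 0);
    for (size_t i = 0; i < count; i++) {
        out[i].start = (int)i;
        out[i].fixed_reg = nodes[i].is_store ? nodes[i].fixed_reg : -1;
        if (nodes[i].is_store)
            out[i].end = (int)count - 1;
        else
            out[i].end = nodes[i].last_use >= 0 ? nodes[i].last_use : (int)i;
    }
}

/* The stop flag goes on the last instruction of the block. */
static size_t toy_stop_index(size_t count)
{
    assert(count > 0);
    return count - 1;
}
```

With two store nodes in one block (say one pinned to reg0 and one to reg1), both ranges extend to the final instruction, which is the property a single is_end flag cannot express.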
camus1 has joined #lima
camus has quit [Ping timeout: 480 seconds]
camus has joined #lima
camus1 has quit [Ping timeout: 480 seconds]
camus1 has joined #lima
camus has quit [Ping timeout: 480 seconds]
oftc has joined #lima
oftc has left #lima [#lima]
xdarklight has joined #lima
camus has joined #lima
camus1 has quit [Ping timeout: 480 seconds]
camus1 has joined #lima
camus has quit [Remote host closed the connection]
icecream95[m] is now known as icecream95
camus has joined #lima
camus1 has quit [Ping timeout: 480 seconds]
<enunes>
anarsoul: yeah I can imagine regalloc or ppir in general needing some changes to pass the allocated register back to command stream... but it should be doable
<enunes>
but did you figure out which rsw field contains the registers to be used for gl_FragDepth/gl_FragStencil?
<rellla>
enunes, anarsoul: a little sum-up for the known bits in plbu and rsw bitstream i did some time ago -> https://pastebin.com/raw/6Jbppik1
camus1 has joined #lima
camus has quit [Remote host closed the connection]
camus has joined #lima
camus1 has quit [Ping timeout: 480 seconds]
camus1 has joined #lima
camus has quit [Ping timeout: 480 seconds]
camus has joined #lima
camus1 has quit [Read error: Connection reset by peer]
<anarsoul>
enunes: not yet, I need ppir to be able to handle gl_FragDepth/gl_FragStencil first
<anarsoul>
rellla: thanks!
<rellla>
anarsoul: small comment in the MR, other than that, i think you can keep the r-b. though i didn't test it ;)
<rellla>
anarsoul: we probably should think of some genxml ...
<rellla>
i haven't been aware of nir_shader_instructions_pass, but that saves some lines indeed.
<anarsoul>
rellla: hm, I think there's a mistake in your write up
<anarsoul>
vertex selector is in bits 0x00000C00 of multi_sample
<anarsoul>
and fields to select gl_FragColor are in 0xFF000000
<anarsoul>
4 bits per register, repeated 4 times
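[Editor's sketch] The two masks quoted above can be read and written with a generic mask-and-shift helper. The macro and function names below are hypothetical, not taken from the lima driver, and the masks are simply the values stated in the log. The helper treats 0xFF000000 as one opaque field and does not try to decode the per-register sub-fields, since the log does not pin down their exact layout.

```c
#include <assert.h>
#include <stdint.h>

/* Masks as quoted in the discussion; the names are made up for this note. */
#define TOY_MULTI_SAMPLE_VERTEX_SELECTOR_MASK 0x00000C00u
#define TOY_MULTI_SAMPLE_FRAG_COLOR_MASK      0xFF000000u

/* Position of the lowest set bit of a non-zero mask. */
static uint32_t toy_mask_shift(uint32_t mask)
{
    assert(mask != 0);
    uint32_t shift = 0;
    while (((mask >> shift) & 1u) == 0)
        shift++;
    return shift;
}

/* Read the field selected by mask out of a 32-bit word. */
static uint32_t toy_field_get(uint32_t word, uint32_t mask)
{
    return (word & mask) >> toy_mask_shift(mask);
}

/* Return word with the field selected by mask replaced by value;
 * bits of value that do not fit in the field are dropped. */
static uint32_t toy_field_set(uint32_t word, uint32_t mask, uint32_t value)
{
    return (word & ~mask) | ((value << toy_mask_shift(mask)) & mask);
}
```

A helper like this is enough for the kind of experiment discussed here: overwrite one field in a word, leave every other bit untouched, and observe what breaks.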
<anarsoul>
gl_FragDepth/gl_FragStencil will be a single register (that's how it's used for depth reload), and since we use $0 for it in reload shader it must be 0 currently
<anarsoul>
my guess is it's first unknown 4 bits in depth_test
<anarsoul>
it's actually easy to figure out
<anarsoul>
basically I can clobber gl_FragColor reg number in RSW for depth reload and it shouldn't break anything
<anarsoul>
but once I clobber gl_FragDepth/gl_FragStencil it should break depth reload :)
<anarsoul>
also we can change reload shader to use $2 as output and then poke RSW to fix color and depth reload :)
<anarsoul>
I'll try it later today
drod has joined #lima
camus1 has joined #lima
camus has quit [Read error: Connection reset by peer]