ChanServ changed the topic of #lima to: Development channel for open source lima driver for ARM Mali4** GPUs - Kernel driver has landed in mainline, userspace driver is part of mesa - Logs at https://oftc.irclog.whitequark.org/lima/
camus1 has joined #lima
camus has quit [Ping timeout: 480 seconds]
camus has joined #lima
camus1 has quit [Ping timeout: 480 seconds]
<anarsoul> enunes: it's a bit tricky to add a second output for gl_FragDepth/gl_FragStencil with our regalloc :)
<anarsoul> so it's all speculation, since I don't have access to documentation, but basically I think that whenever we hit an instruction with the "stop" bit set, it writes outputs from specific registers
<anarsoul> i.e. reg0 for gl_FragColor (register can be changed, it's controlled by RSW)
<anarsoul> and let's say reg1 for gl_FragDepth+gl_FragStencil
<anarsoul> currently we don't even account for writing to reg0 in regalloc
<anarsoul> or rather any ssa destinations for nodes with is_end set
<anarsoul> I guess I need to think how to fix it
<anarsoul> basically with multiple store_output intrinsics, using a single is_end flag won't work
<anarsoul> so I think we need to rename is_end to is_store (with proper regalloc_index set, so it gets fixed reg number for output)
<anarsoul> and then set output reg liveness from node to the end of the block
<anarsoul> last instruction of the block (or discard) should have stop flag set
camus1 has joined #lima
camus has quit [Ping timeout: 480 seconds]
camus has joined #lima
camus1 has quit [Ping timeout: 480 seconds]
camus1 has joined #lima
camus has quit [Ping timeout: 480 seconds]
oftc has joined #lima
oftc has left #lima [#lima]
xdarklight has joined #lima
camus has joined #lima
camus1 has quit [Ping timeout: 480 seconds]
camus1 has joined #lima
camus has quit [Remote host closed the connection]
icecream95[m] is now known as icecream95
camus has joined #lima
camus1 has quit [Ping timeout: 480 seconds]
<enunes> anarsoul: yeah I can imagine regalloc or ppir in general needing some changes to pass the allocated register back to command stream... but it should be doable
<enunes> but did you figure out which rsw field contains the registers to be used for gl_FragDepth/gl_FragStencil?
<rellla> enunes, anarsoul: a little sum-up for the known bits in plbu and rsw bitstream i did some time ago -> https://pastebin.com/raw/6Jbppik1
camus1 has joined #lima
camus has quit [Remote host closed the connection]
camus has joined #lima
camus1 has quit [Ping timeout: 480 seconds]
camus1 has joined #lima
camus has quit [Ping timeout: 480 seconds]
camus has joined #lima
camus1 has quit [Read error: Connection reset by peer]
<anarsoul> enunes: not yet, I need ppir to be able to handle gl_FragDepth/gl_FragStencil first
<anarsoul> rellla: thanks!
<rellla> anarsoul: small comment in the MR, other than that, i think you can keep the r-b. though i didn't test it ;)
<rellla> anarsoul: we probably should think of some genxml ...
<rellla> i haven't been aware of nir_shader_instructions_pass, but that saves some lines indeed.
<anarsoul> rellla: hm, I think there's a mistake in your write up
<anarsoul> vertex selector is in bits 0x00000C00 of multi_sample
<anarsoul> and fields to select gl_FragColor are in 0xFF000000
<anarsoul> 4 bits per register, repeated 4 times
<anarsoul> gl_FragDepth/gl_FragStencil will be a single register (that's how it's used for depth reload), and since we use $0 for it in reload shader it must be 0 currently
<anarsoul> my guess is it's first unknown 4 bits in depth_test
<anarsoul> it's actually easy to figure out
<anarsoul> basically I can clobber gl_FragColor reg number in RSW for depth reload and it shouldn't break anything
<anarsoul> but once I clobber gl_FragDepth/gl_FragStencil it should break depth reload :)
<anarsoul> also we can change reload shader to use $2 as output and then poke RSW to fix color and depth reload :)
<anarsoul> I'll try it later today
drod has joined #lima
camus1 has joined #lima
camus has quit [Read error: Connection reset by peer]
<anarsoul> so depth/stencil register is found
camus has joined #lima
<rellla> :)
camus1 has quit [Ping timeout: 480 seconds]
<anarsoul> I wonder why gl_FragColor register needs to be specified 4 times
camus1 has joined #lima
camus has quit [Ping timeout: 480 seconds]
<anarsoul> ugh, ppir seems to assume that ld_tex has a single successor in block :\
anarsoul|2 has joined #lima
anarsoul has quit [Ping timeout: 480 seconds]
anarsoul|2 has quit [Remote host closed the connection]
anarsoul has joined #lima
<anarsoul> fixed
<anarsoul> now need to fix regalloc to consider output regs...
<anarsoul> enunes: I'm thinking of getting rid of hard-coded output reg numbers
<anarsoul> and just let regalloc do its job by not omitting output reg
<enunes> anarsoul: awesome, nice new info
<enunes> I agree with just letting regalloc find an appropriate register
<enunes> maybe the 4 color output thing is for some multisample or MRT use case
<anarsoul> enunes: well, it doesn't look like MRT
<anarsoul> if at least one field is not set to the reg number, the result is very interesting :)
<anarsoul> (try it with e.g. kmscube)
<anarsoul> maybe it's multisample-related, but again, we don't have MSAA enabled atm
<anarsoul> actually, it's very likely it's something related to multi-sample since the word is called "multi-sample" :)
<enunes> anarsoul rellla so just a final heads up, tomorrow I have to disable the CI for a few days, this time for real :)
<anarsoul> guess it's time to merge texture-3d support :)
<anarsoul> it's still compiling for me on my rock64, I'll run deqp when it's done and then assign it to marge
<rellla> you are on a different timezone, anarsoul, so can probably use the time, when enunes is in bed ;)
<anarsoul> rellla: I rebased https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13213 could you take a look if it's still OK for you?
<anarsoul> :)
<anarsoul> rellla: I addressed your previous comments and there are no other changes, so I'll merge it in a few hours if you don't mind
camus has joined #lima
camus1 has quit [Read error: Connection reset by peer]
<rellla> anarsoul: a small layout thing, otherwise this lgtm. thanks for that!
<anarsoul> rellla: you're comparing removed lines with added
<anarsoul> indentation is correct here
<rellla> don't you have 3 whitespaces after "generic"? but maybe it's an issue because i'm on a mobile phone here...
<anarsoul> rellla: likely a phone thing
<rellla> so then :)
<anarsoul> equal sign is at the same position for type_generic and type_cube
<rellla> yes, but you can drop 2 ws in both lines?
<anarsoul> but why?
<anarsoul> numbers (i.e. 0x00 and 0x1f) are aligned perfectly
<anarsoul> so are equal signs
<rellla> wait
<rellla> this is the way, i would do it... but i'm also fine with a gap of 3
<rellla> moving through gitlab on mobile is fiddly btw :)
drod has quit [Remote host closed the connection]
<anarsoul> rellla: yeah
<anarsoul> rellla: I updated the MR
<rellla> i'm fine with it
camus1 has joined #lima
camus has quit [Read error: Connection reset by peer]
camus has joined #lima
camus1 has quit [Ping timeout: 480 seconds]
camus1 has joined #lima
camus has quit [Ping timeout: 480 seconds]