ChanServ changed the topic of #lima to: Development channel for open source lima driver for ARM Mali4** GPUs - Kernel driver has landed in mainline, userspace driver is part of mesa - Logs at https://oftc.irclog.whitequark.org/lima/
<anarsoul> oof, nir doesn't remove dead code for registers :(
<anarsoul> annoying, but shouldn't be difficult to fix up in ppir
<anarsoul> enunes: out of curiosity, do you have statistics on how many jobs your CI farm runs a day? :)
chewitt has joined #lima
<enunes> anarsoul: it is not a lot of jobs on average, since it only triggers on things affecting lima or tree-wide nir/gallium/mesa etc. I checked now and in the last few days it has been something like 20-30 individual jobs per day (each gitlab pipeline triggers 3 jobs)
<uis> anarsoul: Sorry, I meant the stage that generates instructions (including moves) after regalloc. It is (usually) only after the regalloc stage that the compiler knows whether a move is needed at all. As a result, instead of removing mov instructions, it just does not generate them.
<anarsoul> uis: lima does regalloc after scheduling, and it's actually scheduling that generates extra movs (in addition to nir -> ppir translation)
<anarsoul> it looks like we have some subtle bug in regalloc that got exposed when I started scheduling root nodes into the same instruction
<anarsoul> root nodes do not have dependencies on each other, so it is safe to schedule them into the same instruction
<anarsoul> yet regalloc somehow messes up :\
chewitt has quit [Quit: Zzz..]
<anarsoul> I think I know what the issue is
<anarsoul> it looks like we don't add interference on destinations of the instruction
<anarsoul> which was mostly fine until recently :)
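[editor's note] To illustrate the point about destinations of one instruction, here is a minimal sketch of interference-graph construction, not the actual ppir code. It assumes a simplified model where an instruction is a list of hypothetical (dest, srcs) nodes and all sources are read before any destination is written. The extra step is the one discussed above: destinations written by the same instruction interfere with each other even if none of them is live afterwards.

```python
from itertools import combinations

def build_interference(instrs):
    """Return interference edges as a set of frozenset({a, b}) pairs.

    instrs: list of instructions in program order; each instruction is a
    list of (dest, srcs) nodes (hypothetical model, not ppir's structures).
    """
    edges = set()
    live = set()  # registers live after the instruction being visited
    for instr in reversed(instrs):
        dests = [d for d, _ in instr if d is not None]
        # Usual rule: a destination interferes with everything live-out.
        for d in dests:
            for r in live:
                if r != d:
                    edges.add(frozenset((d, r)))
        # Extra rule: destinations of the same instruction are written
        # together, so they must not share a register.
        for a, b in combinations(sorted(set(dests)), 2):
            edges.add(frozenset((a, b)))
        # Standard backward liveness update: kill defs, then add uses.
        live -= set(dests)
        for _, srcs in instr:
            live |= set(srcs)
    return edges
```

Without the extra rule, two unused destinations in one instruction would get no edge between them and could be assigned the same register.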
<anarsoul> and with that fixed we lost 5 shaders in shader-db with regalloc failure
<anarsoul> yet 17 failures in deqp (and 2 unexpected improvements)
<anarsoul> btw, it looks like piglit isn't very good at catching compiler bugs. It often passes when deqp fails
<anarsoul> down to 1 failure, 2 unexpected improvements
<anarsoul> the last failure is quite interesting. If we have multiple "end" blocks we need to mark out registers as live at the last instruction of each "end" block
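[editor's note] A minimal sketch of the idea above, assuming a hypothetical CFG given as a dict from block name to successor names. Every block without successors is treated as an "end" block, and the output registers are seeded as live-out there before the usual liveness iteration runs. The names are illustrative, not taken from ppir.

```python
def seed_live_out(blocks, output_regs):
    """Seed the live-out set of every "end" block with the output registers.

    blocks: dict mapping block name -> list of successor block names
    output_regs: iterable of register names that must survive to the end
    """
    live_out = {name: set() for name in blocks}
    for name, succs in blocks.items():
        if not succs:  # no successors: this is an "end" block
            live_out[name] |= set(output_regs)
    return live_out
```

Seeding only the textually last block would miss any other block that also terminates the program.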
<anarsoul> enunes: spilling code behaviour is interesting :) see: https://gist.github.com/anarsoul/f66452ecdd2558ec5a431698ef424e32
<anarsoul> it looks like it generates a load/store pair for each node that uses the reg, but doesn't check whether it's already loaded or stored
<anarsoul> it will still work, but it's 5 more instructions
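[editor's note] A minimal sketch of spill-code insertion that tracks whether the spilled value is already loaded, instead of emitting a load/store pair per node. It assumes a hypothetical short run of nodes, each given as (name, reads, writes), over which a temporary can hold the value. This is not the lima spilling code.

```python
def spill_ops(nodes, spilled):
    """Return the op list for `nodes` with spill code for register `spilled`.

    nodes: list of (name, reads, writes) tuples in program order, where
    reads and writes are sets of register names (hypothetical model).
    """
    ops = []
    loaded = False  # is the spilled value already available in the temp?
    dirty = False   # was the temp modified, so it must be stored back?
    for name, reads, writes in nodes:
        if spilled in reads and not loaded:
            ops.append("load " + spilled)  # load only on first use
            loaded = True
        ops.append(name)
        if spilled in writes:
            loaded = True  # the temp now holds the fresh value
            dirty = True
    if dirty:
        ops.append("store " + spilled)  # one store after the last write
    return ops
```

For three nodes that all touch the same spilled register, this emits one load and at most one store rather than a pair per node.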
<anarsoul> I'm kind of surprised that regalloc didn't explode on my MR to enable combiner unit usage :)
<anarsoul> pure luck
<anarsoul> CI passes, I'll clean it up tonight and send an MR
<anarsoul> it's already a 15% improvement in glmark2 :) a lot of simple glmark2 shaders are now compiled into a single instruction
<anarsoul> I also have a DCE pass for ppir planned
<anarsoul> and mov coalescing