#lima on 2025-04-30 — irc logs at oftc.irclog.whitequark.org

2024-07-16 04:51 ChanServ changed the topic of #lima to: Development channel for open source lima driver for ARM Mali4** GPUs - Kernel driver has landed in mainline, userspace driver is part of mesa - Logs at https://oftc.irclog.whitequark.org/lima/

02:24 hexdump01 has quit [Ping timeout: 480 seconds]

08:36 mripard has joined #lima

09:10 mripard has quit [Quit: WeeChat 4.6.1]

14:59 linkmauve has joined #lima

15:25 dsimic is now known as Guest14843

15:25 dsimic has joined #lima

15:26 Guest14843 has quit [Ping timeout: 480 seconds]

17:31 <anarsoul|2> > So you can do "vec4 out = vec4(in.x, in.y, 0, -1);" in one go.

17:31 <anarsoul|2> it certainly does. It would improve texture swizzles *a lot*

17:32 <anarsoul|2> e.g. for R, RG formats

18:12 <anarsoul|2> sarbes: ^^

18:17 <sarbes> Texture swizzles are not inferred from the texture format?

18:18 <sarbes> Oh yeah, there are only L/A/LA formats, not R and RG.

18:18 <anarsoul|2> they are lowered in nir, and e.g. SWIZ(X, 0, 0, 1) can be expensive, since it requires 2 movs

18:19 <anarsoul|2> or SWIZ(X, W, 0, 1)

18:20 <anarsoul|2> I think these are the only formats that are emulated

18:21 <anarsoul|2> and technically, if the shader doesn't use the components set to 0 or 1, these will be dropped in the dce pass
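
To make the swizzle discussion above concrete, here is a minimal Python sketch of what a format swizzle such as SWIZ(X, 0, 0, 1) computes, plus a deliberately simplified cost model. The names `apply_swizzle` and `naive_mov_count` are hypothetical, and the "one mov for texel lanes, one mov for constant lanes" model is an assumption for illustration, not a description of what the lima compiler actually emits.

```python
# Selectors are either a source component name or a literal constant (0 or 1).
COMPONENTS = {"X": 0, "Y": 1, "Z": 2, "W": 3}

def apply_swizzle(texel, swiz):
    """Return the 4-component result of applying a format swizzle to a texel."""
    return tuple(
        texel[COMPONENTS[s]] if s in COMPONENTS else float(s)
        for s in swiz
    )

def naive_mov_count(swiz, used=(True, True, True, True)):
    """Toy cost model (assumption): one mov covers all lanes read from the
    texel, a second mov covers all lanes filled with constants. Lanes the
    shader never reads are ignored, mimicking dead code elimination."""
    live = [s for s, u in zip(swiz, used) if u]
    reads_texel = any(s in COMPONENTS for s in live)
    reads_const = any(s not in COMPONENTS for s in live)
    return int(reads_texel) + int(reads_const)

print(apply_swizzle((0.5, 0.25, 0.75, 1.0), ("X", 0, 0, 1)))
print(naive_mov_count(("X", 0, 0, 1)))
print(naive_mov_count(("X", 0, 0, 1), used=(True, False, False, False)))
```

Under this toy model, the full swizzle costs two movs, while a shader that only reads the first component pays for one, which matches the point about unused 0/1 components being dropped.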

19:04 <sarbes> How would a lowering look in this case? Not being very familiar with the compiler, I would try:

19:04 <sarbes> 1. look for a mov with a constant as a source, containing the values 0/1/-1, and fold both into a swizzle.

19:04 <sarbes> 2. if there is a parent mov with a single successor and no outmod, merge it into the swizzle.
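
The two steps proposed above can be sketched on a toy IR in Python. Everything here is hypothetical: the tuple-based lane encoding, the `swiz_const` op name, and the `movs` / `use_count` tables are invented for illustration and do not correspond to the real lima or NIR data structures.

```python
# Constants that the combined swizzle op is assumed to be able to encode.
FOLDABLE = {0.0, 1.0, -1.0}

def fold_lanes(lanes):
    """Step 1: fold a vector built from one register plus 0/1/-1 constants
    into a single op. Each lane is ('reg', name, comp) or ('const', value).
    Returns ('swiz_const', name, selectors), or None if it cannot be folded."""
    sources = {lane[1] for lane in lanes if lane[0] == "reg"}
    if len(sources) != 1:
        return None  # need exactly one source register
    if any(lane[0] == "const" and lane[1] not in FOLDABLE for lane in lanes):
        return None  # some constant is not 0, 1 or -1
    selectors = tuple(lane[2] if lane[0] == "reg" else lane[1] for lane in lanes)
    return ("swiz_const", sources.pop(), selectors)

def merge_parent_mov(op, movs, use_count):
    """Step 2: if the source comes from a mov with a single successor and
    no output modifier, read through it by composing the swizzles."""
    kind, src, selectors = op
    parent = movs.get(src)
    if parent is None or parent["outmod"] is not None or use_count.get(src, 0) != 1:
        return op
    composed = tuple(
        parent["swizzle"]["xyzw".index(s)] if isinstance(s, str) else s
        for s in selectors
    )
    return (kind, parent["src"], composed)

# vec4(t.x, t.y, 0, -1), where t = mov in.yxzw
lanes = [("reg", "t", "x"), ("reg", "t", "y"), ("const", 0.0), ("const", -1.0)]
movs = {"t": {"src": "in", "swizzle": "yxzw", "outmod": None}}
folded = fold_lanes(lanes)
print(folded)
print(merge_parent_mov(folded, movs, {"t": 1}))
```

The single-successor check matters because merging a mov that other instructions still read would not remove it, so nothing would be saved.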

19:12 <sarbes> There is also 0x12: "dest.xyzw = vec2(arg0.x + arg0.y + arg1.x, arg0.z + arg0.w + arg1.y).xyxy", but this seems even more exotic.
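
Read literally, the 0x12 formula quoted above can be written as a small Python reference function. This is only a transcription of the quoted expression; the function name `op_0x12` is hypothetical, and the argument order is taken as written.

```python
def op_0x12(arg0, arg1):
    """dest.xyzw = vec2(arg0.x + arg0.y + arg1.x,
                        arg0.z + arg0.w + arg1.y).xyxy"""
    a = arg0[0] + arg0[1] + arg1[0]
    b = arg0[2] + arg0[3] + arg1[1]
    return (a, b, a, b)  # the .xyxy swizzle repeats the two sums

print(op_0x12((1, 2, 3, 4), (10, 20)))  # (13, 27, 13, 27)
```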

20:15 <sarbes> Other opcodes seem to be:... (full message at <https://matrix.org/oftc/media/v1/media/download/AU8t2cC0aPLE8UGS8s_9SFZFKus-mXBBqWzLx0mowr4qQNQarE3c9bkV8NcSorusRtQA69SwP2X7NHWV6DXr0MlCeW0HBk0gAG1hdHJpeC5vcmcvQ3NrdnlZTE9ZQmpUZ0t5U3h0dkJzRVpu>)

20:16 <sarbes> maybe the arguments are the other way around.