#lima on 2025-04-17 — irc logs at oftc.irclog.whitequark.org

2024-07-16 04:51 ChanServ changed the topic of #lima to: Development channel for open source lima driver for ARM Mali4** GPUs - Kernel driver has landed in mainline, userspace driver is part of mesa - Logs at https://oftc.irclog.whitequark.org/lima/

01:35 ity has quit [Remote host closed the connection]

01:42 tanty has quit [Ping timeout: 480 seconds]

01:55 ity has joined #lima

02:06 tanty has joined #lima

03:06 chewitt has joined #lima

03:12 Daanct12 has joined #lima

03:59 tlwoerner_ has quit []

03:59 tlwoerner has joined #lima

04:49 Daanct12 has quit [Ping timeout: 480 seconds]

04:55 Daanct12 has joined #lima

06:23 Daanct12 has quit [Ping timeout: 480 seconds]

08:46 linkmauve has left #lima [Error from remote client]

10:56 linkmauve has joined #lima

11:51 tanty has quit [Quit: Ciao!]

11:53 tanty has joined #lima

11:54 tanty has quit []

12:37 tanty has joined #lima

14:14 fomys_ has quit [Read error: Connection reset by peer]

14:15 fomys has joined #lima

15:44 sarbes_ has joined #lima

15:54 sarbes_ has quit []

17:24 <enunes> sarbes: did you run the recent MR through shader-db? I think it would be useful to mention there whether it saves an instruction in any of those shaders or just has no diff

17:25 <enunes> I guess it might not save since it would be a const, but anyway, seems like a nice feature added there

17:51 <anarsoul> It should reduce the code size, but I don't expect it to save any cycles. Loading from constant reg is essentially free

17:56 chewitt has quit [Ping timeout: 480 seconds]

18:23 chewitt has joined #lima

18:33 <sarbes> test

18:33 <sarbes> Ah, I'm authenticated now.

18:36 <sarbes> I've run the shader-db now, but I'm not sure what to make of it: https://paste.centos.org/view/bed0cb32

18:40 chewitt has quit [Read error: Connection reset by peer]

18:41 chewitt has joined #lima

18:41 <sarbes> That's with a GL 3.0 override.

18:42 <sarbes> There seems to be no support for GL_EXT_shader_texture_lod

18:43 <sarbes> So for plain GLES 2.0, only the bias seems interesting.

19:19 chewitt has quit [Read error: Connection reset by peer]

19:20 chewitt has joined #lima

19:20 chewitt has quit []

20:02 chewitt has joined #lima

20:02 chewitt has quit []

20:18 <enunes> there should be no need to override with GL 3.0

20:19 <enunes> is that a diff between main and your branch?

20:19 <enunes> LOST is pretty bad, means it broke compilation for that shader

20:21 mripard_ has joined #lima

20:22 mripard__ has joined #lima

20:22 mripard has quit [Ping timeout: 480 seconds]

20:29 mripard_ has quit [Ping timeout: 480 seconds]

20:34 <sarbes> Eh, maybe the override messed things up. Here is a fresh run: https://paste.centos.org/view/4e8d77ce

20:34 <sarbes> No LOST anymore :)

20:38 <sarbes> I've commented out my change for the comparison.

20:42 <enunes> that seems pretty good for the affected shaders, maybe too good? :) hope it hasn't introduced an issue for them. saving over 10 instructions with this change seems a bit too good to be right

20:45 <sarbes> Yeah. Maybe I should write more tests?

20:47 <sarbes> Is there any way to get the shader binary?

20:47 <enunes> I would suggest starting by picking one of the affected shaders, running shader-db on it with LIMA_DEBUG=pp, diffing that against the same run on main, and taking a look at the output

20:49 <enunes> so something like $ LIMA_DEBUG=pp ./run shaders/godot3.4/34-59.shader_test > main and then with your branch
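
[Editor's note] The comparison enunes describes (capture the LIMA_DEBUG=pp output once on main and once on the branch, then diff the two) could be sketched as below. This is a minimal illustration using only Python's standard difflib, not part of shader-db or mesa. The helper names `diff_dumps` and `count_changes` are hypothetical, and the sample strings are placeholders, not real lima compiler output.

```python
import difflib


def diff_dumps(main_text, branch_text):
    """Return the unified-diff lines between two captured debug dumps."""
    return list(difflib.unified_diff(
        main_text.splitlines(),
        branch_text.splitlines(),
        fromfile="main",
        tofile="branch",
        lineterm="",
    ))


def count_changes(diff_lines):
    """Count (removed, added) lines, skipping the two file-header lines."""
    body = diff_lines[2:]  # the first two lines are '--- main' and '+++ branch'
    removed = sum(1 for line in body if line.startswith("-"))
    added = sum(1 for line in body if line.startswith("+"))
    return removed, added


# Placeholder text standing in for two captured dumps (NOT real lima output).
main_dump = "load const\nmov\ntexture\n"
branch_dump = "texture\n"

lines = diff_dumps(main_dump, branch_dump)
print("\n".join(lines))
print(count_changes(lines))
```

With real data, `main_dump` and `branch_dump` would be read from the two files produced by the runs on main and on the branch. A large count of removed lines is only a hint that instructions disappeared; the diff itself still has to be read to confirm the remaining code is correct.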

21:13 <sarbes> I can give you the output of shaders/godot3.4/37-58.shader_test if you want. It looks right to me.

21:14 <sarbes> GLSL optimizer reduces it to this: https://paste.centos.org/view/99c5e57f

21:15 <sarbes> Before the change, this is the shader output: https://paste.centos.org/view/9f84f7bd

21:16 <sarbes> And after: https://paste.centos.org/view/5955bbc1

21:20 <sarbes> From my understanding, the const load is happening later than texturing in the pipeline. So the texture unit has to wait for the second cycle in order to get the constant LOD originally.

21:22 <sarbes> And by supplying the constant via the instruction, the first cycle containing just the constant load+mov can be avoided.

21:30 <sarbes> Or maybe I'm misreading the output.