ChanServ changed the topic of #wayland to: https://wayland.freedesktop.org | Discussion about the Wayland protocol and its implementations, plus libinput | register your nick to speak
mclasen has joined #wayland
mclasen has quit [Ping timeout: 480 seconds]
Satan has joined #wayland
co1umbarius has joined #wayland
columbarius has quit [Ping timeout: 480 seconds]
ManMower has quit [Read error: Connection reset by peer]
ManMower has joined #wayland
ManMower has quit []
ManMower has joined #wayland
zebrag has quit [Quit: Konversation terminated!]
Moprius has quit [Quit: Konversation terminated!]
zebrag has joined #wayland
zebrag has quit [Quit: Konversation terminated!]
ManMower has quit [Quit: leaving]
ManMower has joined #wayland
lsd|2 has quit []
ki[m] has joined #wayland
txtsd is now known as Guest878
txtsd has joined #wayland
Guest878 has quit [Ping timeout: 480 seconds]
ahartmetz has quit [Ping timeout: 480 seconds]
maxzor has quit [Remote host closed the connection]
<pym_>
Working at STMicroelectronics, we are facing some performance issues using Weston 10.0 compared to Weston 8.0
<pq>
pym_, hi, nice to hear from you.
systwi has joined #wayland
eroc1990 has joined #wayland
<pym_>
sorry for my ignorance using this channel
<pym_>
pq means Pekka ?
systwi_ has quit [Ping timeout: 480 seconds]
<pq>
pym_, yup, I'm Pekka Paalanen, and no worries. :-)
<pym_>
As mentioned in my initial message, we had a bad time using Weston 10.0
<pym_>
the reason is the new shader management: a single shader to handle all pixel formats
<pym_>
(nice to meet you Pekka btw ;))
<daniels>
pym_: can you elaborate on why this is causing you issues please?
<pym_>
in fact we drop around 12 fps using weston-simple-egl -f -b
<pym_>
the main reason is new code that generates extra GPU cycles
<pym_>
For comparison, an RGBA shader cost 2 GPU instructions. The new shader costs 11!
<pym_>
The worst part is
<pym_>
if (c_input_is_premult) {
<pym_>
if (color.a == 0.0)
<pym_>
color.rgb = vec3(0, 0, 0);
<pym_>
else
<pym_>
color.rgb *= 1.0 / color.a;
<pym_>
}
<pym_>
this translates into 5 instructions: mov, rcp, jmp!
<pym_>
now I have spent time getting a better understanding of this change
<pym_>
If I understood correctly, prior to pre_curve (if any), we revert pre_mult for RGBA, Solid and External
<pym_>
afterwards we re-apply pre_mult (or apply pre_mult for other pixel formats)
<daniels>
yeah, I guess the problem is the view-alpha uniform
<daniels>
if you delete color.a *= alpha; in color_pipeline(), is your compiler able to see through that and optimise the color.rgb *= 1.0 / color.a; color.rgb *= color.a; away?
<pym_>
likely it doesn't; this change costs a lot for our GPU
<daniels>
if so, I guess we need a new variant for whether the view is solid or not
<pym_>
well not sure
<daniels>
right, pushing a new uniform isn't going to cost much, but color.a *= alpha; inside color_pipeline(); means that you can't see through it and optimise it away
<pym_>
in fact for RGBA, Solid and External, pre_mult is enabled, then the shader reverts this pre_mult before applying pre_curve
<pym_>
but if the color curve is identity ... why bother to revert pre_mult if we re-apply this pre_mult afterwards
<pym_>
I would suggest strengthening the "if" statement a little to avoid the pre_mult revert
hergertme has quit [Remote host closed the connection]
<pym_>
I just wanted your opinion on it
<daniels>
pym_: the first line of color_pipeline() is color.a *= alpha;
<daniels>
alpha is a uniform, so is run-time dynamic
<daniels>
hence in this case it's _not_ necessarily undoing the premult, because color.rgb *= color.a; can potentially be different to color.rgb *= 1.0 / color.a; given that the value of color.a may be changed in between
<pym_>
this line is not the problem and the compiler optimizes this part correctly
<daniels>
the compiler can't optimise it
<pym_>
sorry daniels I don't understand
<daniels>
ok
<daniels>
at line 276, we multiply color.rgb by (1.0 / color.a), i.e. divide it by color.a
<daniels>
at line 282, we multiply color.rgb by color.a again
<daniels>
now look inside color_pipeline()
<daniels>
the calls at 257 and 258 will be optimised out because they are constants
<daniels>
_however_
<daniels>
the problem is at line 255, where color.a is multiplied by 'alpha', which comes from a uniform value
<daniels>
uniforms are dynamic per-draw, not constant across the lifetime of the shader
<daniels>
so it is impossible for a compiler to optimise this
<daniels>
what would be required inside weston is an additional variant axis for when the view alpha is 1.0, which is the most common case
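[editor's note] The point daniels makes above can be checked numerically. The sketch below is a Python model of the arithmetic only, not Weston's shader code: the function name shade() and the sample pixel are hypothetical, the color curves are assumed to be identity and are left out, and exact fractions are used so rounding does not blur the comparison. It shows that the divide and multiply cancel only when the view alpha is 1.

```python
from fractions import Fraction as F

def shade(rgba, view_alpha):
    """Hypothetical model: un-premultiply, scale alpha by view alpha, re-premultiply."""
    r, g, b, a = rgba
    if a == 0:
        r = g = b = F(0)
    else:
        r, g, b = r / a, g / a, b / a   # models color.rgb *= 1.0 / color.a
    a = a * view_alpha                  # models color.a *= alpha (run-time uniform)
    return (r * a, g * a, b * a, a)     # models color.rgb *= color.a

# Premultiplied sample pixel (assumed values, chosen for easy arithmetic).
pixel = (F(1, 4), F(1, 8), F(0), F(1, 2))

opaque_view = shade(pixel, F(1))      # view alpha 1: div and mul cancel
faded_view = shade(pixel, F(1, 2))    # view alpha 1/2: they do not cancel

print(opaque_view == pixel)
print(faded_view == pixel)
```

Because view_alpha is only known per draw, a compiler looking at the shader alone cannot assume the first case.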
<pym_>
ok I get it
<pym_>
color.a has changed in between
<pym_>
hmmmm, what is the intent of this operation (not in Weston 8.0)?
<pym_>
this is very GPU-intensive
<pq>
daniels, pym_, the reason pre-multiplied alpha is first undone is that otherwise it is not possible to apply anything non-identity in color_pipeline(). view-alpha is not needing that, color_pre_curve() and color_mapping() are.
<pq>
no need for additional variants or anything, we could just move the view-alpha multiplication outside of color_pipeline() and after "pre-multiply for blending".
<pq>
it just means the multiply by view-alpha is 3 instead of 1 multiplications
<pq>
for the color managed case it is two multiplications more, but for the non-managed case it would put the div/mul by .a right next to each other.
<pq>
is that enough to optimize it, I'm not sure, so if we need a stronger solution, it would be possible to add more if-statements in the shader to eliminate the div/pipeline/mul sequence.
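[editor's note] pq's reordering can be sanity-checked the same way. This is a Python model under stated assumptions, not the Weston shader: current_order() and proposed_order() are hypothetical names, the color pipeline is assumed to be identity, and the proposed variant scales all four channels by the view alpha after the premultiply. Whether a real GLSL compiler then removes the adjacent div/mul is exactly the open question pq raises; the sketch only shows the two orderings give the same result.

```python
from fractions import Fraction as F

def unpremultiply(r, g, b, a):
    # Mirrors the shader's guard against dividing by a zero alpha.
    if a == 0:
        return F(0), F(0), F(0)
    return r / a, g / a, b / a

def current_order(rgba, view_alpha):
    r, g, b, a = rgba
    r, g, b = unpremultiply(r, g, b, a)
    a = a * view_alpha                  # view alpha sits between div and mul
    return (r * a, g * a, b * a, a)

def proposed_order(rgba, view_alpha):
    r, g, b, a = rgba
    r, g, b = unpremultiply(r, g, b, a)
    r, g, b = r * a, g * a, b * a       # div and mul by the same a, now adjacent
    # View alpha applied last, to color and alpha alike.
    return (r * view_alpha, g * view_alpha, b * view_alpha, a * view_alpha)

# Assumed premultiplied test pixels and view alphas.
samples = [
    (F(1, 4), F(1, 8), F(0), F(1, 2)),
    (F(0), F(0), F(0), F(0)),
    (F(1), F(1, 2), F(1, 4), F(1)),
]
alphas = [F(1), F(1, 2), F(0)]

same = all(current_order(p, v) == proposed_order(p, v)
           for p in samples for v in alphas)
print(same)
```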
<pym_>
At first I tried to get rid of the pre_mult undo if non-identity
fmuellner has quit [Ping timeout: 480 seconds]
<pym_>
but that does not sound like a good solution
<pq>
I think I had such a condition in the shader code while developing this, but it was complicating the code, and I didn't know it can cause that much of a performance problem.
<pym_>
from my pov, the compiler is good enough to find the optimization
<pq>
pym_, I would suggest filing a weston bug report with all this, and also explain how your attempts to modify the shader failed. It's end-of-day for me now.
<pym_>
it doesn't fail, but it does not look fully functional
<pq>
pym_, also, sorry, I don't quite remember where you mentioned this earlier.
<daniels>
pq: ^ I'm not sure if this is 100% correct and fixing everything, since it's more of a workaround than the grand glorious rewrite, but thought I'd just throw it out there anyway in case I get no further
pym_ has quit [Ping timeout: 480 seconds]
pym_ has joined #wayland
hardening has joined #wayland
rgallaispou has quit [Read error: Connection reset by peer]