ChanServ changed the topic of #wayland to: https://wayland.freedesktop.org | Discussion about the Wayland protocol and its implementations, plus libinput | register your nick to speak
txtsd is now known as Guest45
txtsd has joined #wayland
Guest45 has quit [Ping timeout: 480 seconds]
columbar1 has joined #wayland
columbarius has quit [Ping timeout: 480 seconds]
boistordu has joined #wayland
boistordu_ex has quit [Ping timeout: 480 seconds]
checkmanonetwo has joined #wayland
dcz has joined #wayland
checkmanonetwo has quit []
zebrag has quit [Remote host closed the connection]
pnowack has quit [Remote host closed the connection]
pnowack has joined #wayland
dos11 has joined #wayland
<The_Observer>
Hey, I hope it is okay to ask user support questions here. :) I have a problem regarding libinput with the touchpad of my new laptop (Librem 14). On the preinstalled OS (PureOS), right-clicks work when setting the touchpad Mouse Click Emulation to "Area" in GNOME Tweaks. This uses libinput's "button areas" click method. I want to get this working on my Arch system (EndeavourOS), which uses libinput as well (with X.Org).
<The_Observer>
However, according to the output of "libinput list-devices" on Manjaro, libinput does not support this click method for this touchpad. It says it supports "none". Here's the full output of this command on my Arch system: https://pastebin.com/Qwk0gnBA.
<The_Observer>
On PureOS the output is the same, except that it shows available click methods: "Click methods: *button-areas clickfinger" (and the lines Kernel and Group). The version of libinput is 1.16.4 on PureOS and 1.18.0 on Manjaro. Does anybody have an idea why it doesn't show those click methods as available on Arch but does on PureOS?
dos1 has quit [Read error: Connection reset by peer]
NoGuest17 has quit [Remote host closed the connection]
NoGuest17 has joined #wayland
<The_Observer>
If this is not the right place to ask, I'd very much appreciate it if someone could point me in the right direction! :) The usual support channels I use for Linux questions don't really seem to be able to help with this specific topic. And the libinput docs weren't helpful for me either beyond what I already figured out.
columbar1 has quit []
pnowack has quit [Quit: pnowack]
columbarius has joined #wayland
pochu has joined #wayland
pnowack has joined #wayland
pnowack has quit []
pnowack has joined #wayland
pnowack has quit [Remote host closed the connection]
pnowack has joined #wayland
mixfix41 has joined #wayland
mixfix41_ has quit [Ping timeout: 480 seconds]
dcz_ has quit [Ping timeout: 480 seconds]
NoGuest17 has quit [Remote host closed the connection]
pnowack has quit [Quit: pnowack]
NoGuest17 has joined #wayland
pnowack has joined #wayland
rasterman has joined #wayland
pnowack has quit []
pnowack has joined #wayland
<pq>
MrCooper, the problem with using KMS in the guest OS is that it essentially forces you down to just two buffers, right?
<MrCooper>
why?
<pq>
because when a pageflip signals complete, the old buffer that was in KMS is immediately free.
<pq>
and you can't do another pageflip until the previous completes
<MrCooper>
you can still use another buffer to prepare the next frame before the page flip completes
<pq>
preparing the next buffer usually doesn't take much time
<pq>
so there is no benefit from preparing in advance when you can't submit the flip anyway. It would only introduce more latency.
<MrCooper>
with another buffer, 60 fps can be sustained even if drawing each frame takes ~16 ms on average
<MrCooper>
right, it's the classic throughput vs latency trade-off
<pq>
yeah, but KMS can't keep more than 2 buffers reserved at any time, so you can't submit the third until the oldest one comes back
<MrCooper>
with only 2 buffers, drawing the next frame can only start once the flip completes, leaving significantly less time budget per frame
<pq>
the guest Weston could repaint before the previous flip completes, but all it does is lock the frame contents earlier without getting a chance to submit the frame earlier via guest KMS to host Weston
<pq>
yes
<pq>
that's also not exactly the problem
<MrCooper>
it's a problem at some point :)
<pq>
the chain is: guest Weston -> guest KMS -> qemu -> host Weston -> host KMS
<pq>
the goal: put guest Weston buffer zero-copy on host KMS plane, while maintaining full framerate
<pq>
I think it's possible to do by carefully adjusting the repaint-window time in each Weston, but that's specific to Weston only.
<MrCooper>
maintaining full frame-rate with only 2 buffers isn't possible in all cases; more buffers make it possible in more cases
<pq>
exactly
<pq>
and the guest KMS in between makes it impossible to have more than 2 buffers in flight
<MrCooper>
doesn't matter
<pq>
qemu would need to be cycling three buffers towards host Weston, and it cannot have that many because the buffers come from guest KMS.
<silver>
pq, does that mean there are around ten to twelve buffers throughout the entire chain? something like | + | -> || -> || -> || -> || -> ||
<pq>
silver, no. It would need three, but can have only two.
<MrCooper>
pq: if the guest uses more than 2 buffers, qemu will cycle through all of them as well
<pq>
MrCooper, how can the guest use more than 2 buffers?
<silver>
in each chain link? i.e. two passing through guest Weston while two are passing through Guest KMS and so on
<pq>
silver, no, in total.
<MrCooper>
pq: same way as on bare metal
<pq>
MrCooper, but KMS doesn't allow that.
<silver>
ok i get it, thanks pq
<MrCooper>
pq: can't work on bare metal either then? ;)
<pq>
MrCooper, bare metal does a copy into monitor cable, it's not zero-copy.
<pq>
while guest KMS is zero-copy
<pq>
KMS on bare metal guarantees that when the flip on vblank happens, the old buffer is immediately free. That is not true in guest KMS, it cannot complete the pageflip until the old buffer is actually free, and for that it needs to wait until the host Weston completes flip.
<pq>
If it was possible for guest KMS to complete a flip without immediately releasing the old buffer, then there could be more buffers in flight, and the full framerate would be much easier to achieve because the buffers could occupy multiple steps in the pipeline rather than needing to wait for each buffer to go through the whole pipeline before being able to do the next one.
aleasto has joined #wayland
<MrCooper>
"bare metal does a copy into monitor cable" makes no sense; the buffer is scanned out more or less according to the video mode timings, and can only be re-used once scanout has switched to another buffer
<pq>
yes
<MrCooper>
there is conceptually no difference between the cases; a buffer can be reused once the flip to another buffer has completed
<pq>
the difference is, guest KMS does not scan out. It gives the buffer to the next in chain.
<pq>
while bare-metal KMS can release the old buffer immediately when the new buffer enters scanout, the guest KMS cannot release the old buffer at the time it sends the new buffer forward.
<MrCooper>
that just affects the timings, not the basic principle
<pq>
and KMS rules say that you cannot complete the pageflip until the old buffer is free for re-use
<pq>
yes, the timings are the very issue
<pq>
like I said, it is possible to make it work, I think, by carefully adjusting repaint-window in each Weston.
<MrCooper>
if the timings don't allow sustaining full frame-rate with 2 buffers, more buffers can be used in both cases
<pq>
but that solution only works with Westons
<pq>
no, guest KMS forbids using more than 2 buffers
<pq>
well, you can *have* more than 2 buffers in a pool, but you can't have more than 2 in flight.
<pq>
It's the case of nested compositors with zero-copy, except in this case we have the guest KMS in the middle that says that you cannot submit a new frame until the frame before the currently submitted one has stopped displaying.
<emersion>
can OUT_FENCE_PTR help a bit maybe?
<emersion>
also, grumble grumble virtualization not a good fit for KMS grumble
<pq>
it might? But do KMS clients expect they might need more than 2 buffers?
<emersion>
hm
<pq>
exactly, or KMS not a good fit for virtualization :-)
<emersion>
ahah, yes
<emersion>
GBM surface is backed by the same logic used on other platforms as well, right?
<emersion>
i guess GBM surface is missing a way to release a buffer with a fence
<MrCooper>
the only possible issue I can see is if the guest KMS ends up taking longer than a refresh cycle between a buffer becoming ready for flip and the flip completing; if that was the case, sustaining full frame-rate would be impossible regardless of the amount of buffers. As long as that's not the case, more buffers can be used for better throughput
<pq>
I think so, yeah - but it doesn't matter, GBM is willing to allocate more than 2 buffers and the KMS client can listen for the fence itself before handing back to GBM. Which is how I imagine it working anyway.
<emersion>
right
<pq>
MrCooper, I have been assuming that all buffer rendering is already complete.
<MrCooper>
doesn't matter either :)
<pq>
so always immediately ready for flip
<pq>
MrCooper, the testing by Vivek anyway proves that the framerate gets halved if the host Weston uses KMS planes for the guest buffers, and that carefully adjusting timings can make it go faster. The problem is that not all compositors are Weston, they have different timing policies.
<MrCooper>
no timing policy can guarantee sustaining full frame-rate with 2 buffers (even on bare metal); adding more buffers can, in both cases
<MrCooper>
(assuming drawing a frame doesn't take longer than a refresh cycle)
<pq>
I assume that rendering never takes any significant amount of time even close to a refresh cycle, so on bare metal, yes it can.
<pq>
and yes, multiple buffers help *if* your problem is that *rendering* takes time.
<pq>
but in this case, the problem is not the rendering taking time.
<pq>
The problem is that both guest and host Weston latch on to the exact same refresh cycle of the actual monitor, while guest KMS in between forbids submitting another frame until the frame before the currently submitted frame has stopped displaying.
<pq>
So guest Weston submits the frame exactly when it is too late for host Weston to take it for the next vblank.
<pq>
hence presenting that frame has to wait another refresh cycle in the host Weston, and host Weston won't release the old frame from KMS until the new frame has completed KMS flip.
<pq>
The result is halved framerate.
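The timing described in the messages above can be reproduced with a small toy model. This is only a sketch under stated assumptions, not how Weston or qemu are implemented: time is counted in integer ticks, and `PERIOD`, `guest_window` and `host_window` are made-up names for one refresh cycle and for how long before a vblank each compositor repaints and submits. The host is assumed to pick up a guest buffer only if it arrived strictly before the host's own repaint point.

```python
PERIOD = 1000  # one refresh cycle in arbitrary integer ticks; vblank k is at k * PERIOD


def first_vblank_after(submit_time, host_window):
    """Index of the first vblank whose host repaint point is strictly after submit_time."""
    return (submit_time + host_window) // PERIOD + 1


def frames_roundtrip(refreshes, guest_window, host_window):
    """Guest may submit again only after its previous frame reached the display."""
    shown = 0
    submit = PERIOD - guest_window  # guest repaints guest_window before vblank 1
    while True:
        k = first_vblank_after(submit, host_window)
        if k > refreshes:
            return shown
        shown += 1
        # Flip completion reaches the guest at vblank k; the guest then
        # repaints guest_window before the following vblank.
        submit = (k + 1) * PERIOD - guest_window


def frames_pipelined(refreshes, guest_window, host_window):
    """Hypothetical decoupled case: guest submits every cycle without waiting."""
    shown = set()
    for n in range(1, refreshes + 1):
        k = first_vblank_after(n * PERIOD - guest_window, host_window)
        if k <= refreshes:
            shown.add(k)  # at most one new frame becomes visible per vblank
    return len(shown)
```

With equal windows, `frames_roundtrip(60, 400, 400)` gives 30 frames in 60 refreshes, which is the halved framerate. Letting the guest repaint earlier, as in `frames_roundtrip(60, 700, 400)`, gives 60. The hypothetical pipelined case `frames_pipelined(60, 400, 400)` gives 59, i.e. a new frame on every refresh but one cycle later.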
<MrCooper>
all I'm hearing is weston issues, no fundamental reason why more buffers couldn't be used for better throughput
<pq>
because you can't submit a new buffer to guest KMS until the pageflip of the previous has completed.
<pq>
you can't fill all the slots in the chain with buffers because of that
<emersion>
yeah, that makes sense
<pq>
also Weston is a red herring here. You *can* make it work by adjusting how the Westons schedule. But what do you do with other compositors then?
<pq>
no compositor can submit a frame to KMS until the previous flip completes
<silver>
i normally disable flipping when possible but here is what it says on the nv manpage
<pq>
even if qemu does not honour frame callbacks, it simply cannot zero-copy submit another frame until it gets one from guest KMS, and guest KMS cannot give a new frame until qemu makes it signal flip complete.
<pq>
and qemu can't signal guest KMS flip complete until it gets wl_buffer.release for the old buffer.
<pq>
so if qemu wants to be zero-copy, there is no way it could make actual use of a third buffer.
<pq>
to repeat, the chain is: guest compositor -> guest KMS -> qemu -> host compositor -> host KMS -> display
<MrCooper>
pq: you correctly point out there are other causes for not sustaining full frame-rate, for which more buffers don't help; I'm pointing out there are causes for which more buffers do help
<pq>
MrCooper, sure, but why mention those cases when we are talking about the Qemu use case?
<silver>
this is a very informative topic and i personally appreciate it very much, also forgive me for having to catch up during the middle of it. i do have one remaining question, Does bare metal exclude opengl, xrandr, egl, etc.?
<MrCooper>
because they can apply with qemu as well
<pq>
MrCooper, but the problems where they apply are not the problem at hand with qemu.
<MrCooper>
they're just being masked in the particular case you're focusing on
<pq>
silver, those are unrelated things.
<pq>
MrCooper, yes, I'm focusing on the problem that will remain a problem even if rendering never took any time at all.
<silver>
ok i was thinking they were possible solutions but aimed at drivers and thats why i asked if the goal could be the same with bare metal some how
pnowack has quit [Quit: pnowack]
<dottedmag>
pq: By any chance can QEMU do some trickery by giving the guest the appearance of releasing a buffer? Some MMU trick to switch the memory under-the-hood and keep the buffer to pass it further?
<pq>
dottedmag, I can't think of any way that would not involve a copy, and the whole point was to be zero-copy.
<emersion>
pq, if qemu controls the whole address space, maybe tricks would be possible?
<pq>
submitting a buffer to KMS is a non-destructive operation, and KMS clients do trust the buffer keeps its contents even afterwards.
<pq>
Wayland clients rely on the same
<emersion>
pq, copy-on-write maybe?
<dottedmag>
Well, can't QEMU just modify the page tables to point the buffer to another place in host memory, so that guest can start writing to it immediately, but the buffer content is still there for QEMU to use?
<pq>
dunno
<dottedmag>
Ah, contents has to be preserved. I see, ok.
<emersion>
ie, make the submitted buffer read-only, and map a new buffer on top used for new writes
<pq>
emersion, then at some point you need to copy either the remaining old contents over, or the written new content back.
<pq>
unless the writes cover everything
<pq>
also since we are talking about sharing buffers from inside the guest OS all the way to host KMS, I doubt such trickery would even be possible without explicit support for it in the host kernel.
<pq>
since those buffers need to be allocated compatible with host KMS to begin with, i.e. essentially allocated by host drivers
<pq>
or something
<pq>
I think the OUT_FENCE thing might be the best bet for a general solution going forward.
rasterman has quit [Quit: Gettin' stinky!]
flacks has quit [Quit: Quitter]
flacks has joined #wayland
pnowack has joined #wayland
rasterman has joined #wayland
<MrCooper>
pq: not sure how the out fence could make a difference; trying to submit a page flip before the previous one has completed will still fail
<pq>
MrCooper, if qemu can set the out fence in guest KMS, then qemu does not need to wait for the old buffer to actually become free, and it can signal pageflip completion earlier. This would make guest compositor use more than 2 buffers towards KMS, allowing it to submit another frame before the old buffer is released.
<pq>
hence filling all the stages in the chain with buffers, reaching full framerate zero-copy and with host KMS direct display.
txtsd has quit [Ping timeout: 480 seconds]
<pq>
It would result in proper pipelining regardless of the exact phase difference in the repaint and refresh cycles.
<MrCooper>
right now, AFAICT the out fence is signalled at the same time as the completion event is sent
<pq>
yes, the question is whether that can be decoupled.
<pq>
qemu guest DRM driver would change.
<MrCooper>
this is core code
<pq>
yes, so it would need to be decoupled
<MrCooper>
still, might be possible in theory
<pq>
like I implied, we would need to complete a pageflip without simultaneously releasing the old buffer, and that is unheard of in KMS.
<pq>
or then we would need to be able to submit another KMS flip while the previous flip has not yet completed.
<MrCooper>
the latter would effectively be possible with this
<pq>
yes
<emersion>
i wonder if there are non-qemu use-cases for this
<pq>
but it would depend on the KMS driver what it does
<emersion>
maybe GUD?
<pq>
what's that?
<emersion>
driver for USB displays
<pq>
oh that, hmm.
<emersion>
well, i'm not sure
<pq>
feels a bit far-fetched
<emersion>
was just thinking it could be combined with DAMAGE_RECTS to get more fine-grained damage regions perhaps
<emersion>
or better compression algos
<pq>
I have a hard time imagining any other use case that would not be better with "don't go through KMS to begin with".
<MrCooper>
pq: I think the main reason this isn't allowed with the current KMS API is that it could result in each flip getting replaced before it completed, i.e. a frozen screen
<dottedmag>
I wonder why dma-fence is actually named fence. Is it a reference to a real-world fence, or is there other explanation, perhaps historical?
<pq>
hmm... maybe a fence stops you from running through until the fence falls?
<dottedmag>
Yes, it's a synchronisation primitive. I just wonder why a new, narrowly used word was chosen when there are already plenty of them.
<pq>
there are many different kinds of synchronization primitives
<pq>
IIRC "fence" is already an overloaded term with sync primitives, too
<dottedmag>
Is it different from "future"? Seems to be quite similar: flips at some point of time from "not done" to "done".
<pq>
I don't know what "future" is
<pq>
but I think DRM fence and Vulkan fence were quite different, or something
<dottedmag>
So, is DRM fence a synchronization primitive that carries only "done/not done" flag, and allows waiting for it to switch, or is there more functionality?
<pq>
Yeah, it can be represented as an fd that can be passed from one process to another, and from one driver to another. Or maybe dma-fence is the official name for it? No, it's sync_file?
<dottedmag>
Ah, also passing around as a fd. Thanks, now it will be much easier to read the documentation :)
<pq>
I think there is *one* kind that can be passed around as an fd, and I'm never sure what it's called.
<pq>
since as a winsys dev, I don't get to deal with other kinds, if at all
<pq>
or, not in my current topics
<emersion>
yes, the kernel term is sync_file, at least for IN_FENCE_FD/OUT_FENCE_PTR
<emersion>
there are newer sync primitives introduced for vulkan purposes, named drm_syncobj
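The "done / not done, and you can wait for it" behaviour of a fence passed around as an fd can be sketched from userspace as follows. This assumes that a sync_file fd reports `POLLIN` once its fence has signalled, which is my understanding of the kernel behaviour rather than something stated in this discussion. The helper name `wait_fence` is made up, and in the usage note a plain pipe stands in for a real fence fd.

```python
import select


def wait_fence(fd, timeout_ms):
    """Return True if fd became readable (assumed: fence signalled) within timeout_ms."""
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    # poll() returns an empty list on timeout, a list of (fd, events) otherwise.
    return bool(poller.poll(timeout_ms))
```

As a stand-in, `r, w = os.pipe()` gives an fd `r` for which `wait_fence(r, 0)` is false until something is written to `w`. A real fence fd would instead come from the kernel, for example through OUT_FENCE_PTR.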
<pq>
swick, I wonder, in HLG, the OOTF is defined at the display end to adapt to different monitors and viewing environments. We wanted to do the same with contrast/brightness adj. Do you remember seeing notes about how that was supposed to work with PQ?
aleasto has quit [Remote host closed the connection]
<emersion>
pq, yes, feel free
<emersion>
zamundaaa: thanks for keeping pushing for drm-lease btw. this is a pretty complicated protocol, and there's always a lot of discussion about wording. i think we aren't too far from the end of the tunnel hopefully :)
<emersion>
pq, if we get more patches via email, i have an automated tool to convert them to GitLab MRs (with comments and everything). but probably not worth it to set up for now.
<pq>
emersion, cool, but yeah. I still have my old tools too, so applying a series from email is just a click for wayland and weston.
<emersion>
well, still need to manually convert to a MR, and if the patchset isn't perfect it becomes more complicated
pochu has quit [Ping timeout: 480 seconds]
audgirka_ has joined #wayland
ice9 has joined #wayland
<ice9>
i'm trying to run x11 game in wayland session but it doesn't start, any idea?
audgirka__ has joined #wayland
<emersion>
ice9: any error message? what compositor are you using?
<dottedmag>
ice9: This depends on the desktop environment you're using. Please ask in GNOME or KDE (or...) support channels
audgirka has quit [Ping timeout: 480 seconds]
audgirka_ has quit [Ping timeout: 480 seconds]
<pq>
hmm, so PQ is pre-baked for the monitor and environment... while HLG is a generic signal that needs adapting to monitor and environment.
<MrCooper>
pq: taking a step back, since the guest buffers are scanned out directly on the host, the host compositor can release the previous buffer as soon as the host flip completes, and the guest KMS driver should be able to complete its flip immediately after that; therefore I don't see any KMS related issue, I suspect it's purely bad interaction between the guest & host compositor repaint cycles
<pq>
MrCooper, yes, it is about the two compositor repaint cycles.
<pq>
MrCooper, usually you would be able to stack an arbitrary number of compositors nested, and it only increases the latency to display, but it won't reduce framerate, since every step on chain can have a buffer.
<pq>
that would be a "fully pipelined" presentation
<pq>
sticking guest KMS in there causes the "fully pipelined" to change into "end-to-end roundtrip" for each frame.
<MrCooper>
still not seeing the difference :( you submit a flip for the next buffer, and you can re-use the previous buffer when the flip completes, both in the host & guest
<pq>
the trade-off is a bit like video playback vs. interactive, queueing vs. immediate mode
<MrCooper>
the complexity added by the guest is that the guest compositor repaint cycle must run before the host one, otherwise it ends up lagging behind by one refresh cycle
<pq>
yes, but the problem is, when can the flip complete? Can it complete when the frame moves one step closer to display, or does it need to move all steps to the display, and then the release needs to come all steps back before it completes?
<pq>
remote displays have the same problem: do you let the frame rate be governed by end-to-end roundtrip latency, or do you send a stream of frames at the refresh rate of the monitor at the far end?
<emersion>
roundtrip sounds pretty bad
<emersion>
when latency is high
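To put rough numbers on the roundtrip case: if a new frame may only be sent once the previous one has been acknowledged, and frames only start on a refresh boundary, the achievable rate drops in whole-cycle steps. This is a back-of-the-envelope sketch under exactly those two assumptions, and the function name `roundtrip_fps` is made up for illustration.

```python
import math


def roundtrip_fps(refresh_hz, rtt_ms):
    """Frame rate when each frame waits for the previous end-to-end roundtrip."""
    period_ms = 1000.0 / refresh_hz
    # Each frame occupies a whole number of refresh cycles, at least one.
    cycles = max(1, math.ceil(rtt_ms / period_ms))
    return refresh_hz / cycles
```

For a 60 Hz monitor, a 5 ms roundtrip still allows 60 fps, while a 40 ms roundtrip spans three refresh cycles and leaves 20 fps. Streaming frames at the refresh rate would keep 60 fps in both cases, at the cost of latency.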
<MrCooper>
(even when the compositor repaint cycles aren't aligned, full frame-rate should be possible with more than 2 buffers, just with one refresh cycle higher latency)
<pq>
KMS can only do end-to-end roundtrip right now, because normally it's the final step in a chain, so end-to-end vs. stream makes no difference.
<pq>
MrCooper, exactly! But guest KMS in the middle denies the possibility to take advantage of more than 2 buffers.
<MrCooper>
so you keep saying, but I keep failing to see why :(
<pq>
it's because the pageflip completion signals two things at the same time: 1) you can flip again, and 2) the old buffer is free.
<pq>
so it's impossible to have more than 2 buffers reserved in KMS, so having a pool of even more buffers makes no difference.
<pq>
only two will ever be used at a time at most
<pq>
that means that below KMS, no matter how long the chain of elements is, there can only ever be two buffers in total in that chain
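The two-buffer limit follows from the coupling alone, which a toy model can show. This is a sketch, not the real KMS API: `ToyKms`, `flip`, `complete` and `max_in_flight` are hypothetical names, and the only behaviour borrowed from real KMS is that a flip submitted while another is pending is rejected as busy.

```python
class ToyKms:
    """Toy model: one pending flip at most; completion also frees the old buffer."""

    def __init__(self):
        self.current = None  # buffer being "displayed"
        self.pending = None  # flip submitted but not yet completed

    def flip(self, buf):
        if self.pending is not None:
            raise BlockingIOError("EBUSY: previous flip has not completed")
        self.pending = buf

    def complete(self):
        # One event does both: allows the next flip and releases the old buffer.
        released = self.current
        self.current, self.pending = self.pending, None
        return released

    def in_flight(self):
        return [b for b in (self.current, self.pending) if b is not None]


def max_in_flight(pool_size, frames):
    """Cycle a pool of buffers through ToyKms and report the peak number held."""
    kms = ToyKms()
    free = list(range(pool_size))
    peak = 0
    for _ in range(frames):
        kms.flip(free.pop(0))
        peak = max(peak, len(kms.in_flight()))
        released = kms.complete()
        if released is not None:
            free.append(released)
    return peak
```

`max_in_flight(3, 10)` and `max_in_flight(8, 100)` both return 2: a larger pool does not change how many buffers this model can hold at once.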
<MrCooper>
k, I think I realize what happens now, thanks for your persistence
<pq>
eh heh :-p
dcz_ has joined #wayland
audgirka__ has quit [Remote host closed the connection]
<dottedmag>
pq: So in theory one could add another flag to KMS pageflip saying "don't bother waiting for buffer to become free, just give me an event when it happens", and then compositors can be updated to start using it?
<pq>
dottedmag, that's what the out fence ptr KMS property basically does, but an open question is, can we assume that KMS apps that use that can also handle the old buffer not becoming free when the flip completes?
pochu has joined #wayland
<dottedmag>
pq: That's why I said "another flag", to make it opt-in.
<dottedmag>
"If you wish to perform well under guest KMS, set this flag and be ready that the buffer will be freed later"
pnowack has quit [Quit: pnowack]
pnowack has joined #wayland
tzimmermann has quit [Quit: Leaving]
reillybrogan has quit [Ping timeout: 480 seconds]
reillybrogan has joined #wayland
pochu has quit [Read error: No route to host]
pochu has joined #wayland
lanodan has quit [Quit: WeeChat 3.1]
lanodan has joined #wayland
pochu has quit [Ping timeout: 480 seconds]
The_Observer has quit [Quit: Page closed]
moa has joined #wayland
bluebugs is now known as Guest125
moa is now known as bluebugs
Guest125 has quit [Ping timeout: 480 seconds]
hug0 has joined #wayland
hug0 has quit []
rasterman has quit [Quit: Gettin' stinky!]
onelegend has joined #wayland
<onelegend>
I use wayland all night and wayland all day
xexaxo has joined #wayland
xexaxo has quit [Remote host closed the connection]
xexaxo has joined #wayland
xexaxo has quit [Remote host closed the connection]