ChanServ changed the topic of #wayland to: https://wayland.freedesktop.org | Discussion about the Wayland protocol and its implementations, plus libinput
<kchibisov> emersion: you could create a surface of size 1x1, then you resize to 800x600 with scale 2, then you swap buffers -> you crashed, because you've committed 1x1.
<kchibisov> Or, you have 2 windows rendering in your application, each window has the same context, you get a resize with the scale of 2, you call eglMakeCurrent, you resize, you apply scale -> you've crashed, because you've committed the old size with scale 2.
<kchibisov> Because eglMakeCurrent latched the buffer.
<kchibisov> So unless you study mesa's EGL code and bugs you'll have a very good chance to get crashes.
<emersion> kchibisov: I don't understand what this buffer latching is about
<kchibisov> I think this issue has all others linked.
<kchibisov> This is all relevant for egl, not for vulkan iirc.
<wlb> weston/11.0: Michael Tretter * backend-drm: schedule connector disable for detached head https://gitlab.freedesktop.org/wayland/weston/commit/a5d52075a07b libweston/backend-drm/ drm-internal.h drm.c kms.c
<wlb> weston Merge request !1319 merged \o/ (Schedule connector disable for detached head for weston 11.0 https://gitlab.freedesktop.org/wayland/weston/-/merge_requests/1319)
<kchibisov> Though, the toolkits could work around that by setting viewporter dst_size to the one they've resized to. So when everything matches compositors will ignore it, but if you got a mismatch due to EGL stuff it'll 'amortise' the issue. Though, I'd rather have mesa fixed wrt that context stuff.
<emersion> i still don't understand why locking/unlocking buffers has any impact on what happens on the wire Wayland-wise
<kchibisov> The user might think that they've resized the buffer to account for the new scale.
<kchibisov> But in reality they haven't.
<kchibisov> However the scale will be applied.
<emersion> so, makecurrent, resize, set_scale, swapbuffers, and you got a wrong size?
<kchibisov> So if you do (your surface was 501x401) eglMakeCurrent(), wl_egl_window_resize(800x600), wl_surface.set_buffer_scale(2), eglSwapBuffers() -> crash, 501 not divisible by 2.
<kchibisov> yes.
<emersion> why do you resize while the context is current?
<emersion> it sounds purely like a client design issue
<kchibisov> If you have 2 windows at the same time one could assume that to start working with the surface you need to make it current.
<kchibisov> That's what Qt for example did.
<kchibisov> And that's what I've assumed until I've read all the mesa's wayland egl code.
<kchibisov> And you can clearly resize once the context is current, you just can't call the function to make it current.
<kchibisov> One other case, is if you don't do eglMakeCurrent, but do wl_egl_create_window.
<kchibisov> Because it'll do the same to the actively current context.
<kchibisov> Hm, maybe it was for eglCreateContext actually, so if you have a current context and you create a new one, it'll latch the current one.
<kchibisov> One could make context current to compile shaders for example, then make it not current, but the buffer will still be latched, because you've called eglMakeCurrent.
<MrCooper> "latched" as in "attached and committed"?
<kchibisov> latched as in resizing won't apply until the eglSwapBuffers is called.
<kchibisov> So your resize will go into the next frame.
<kchibisov> But your scale can go to this frame.
<kchibisov> Because scale is set by client, but resize is done by mesa.
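The sequence kchibisov describes can be sketched as a toy model. This is only an illustration of the reported behaviour, not Mesa's actual logic: the class `ToyEglSurface` and all of its methods are hypothetical names, and the assumption (taken from the discussion above) is that making the context current latches the back buffer, so a later resize is deferred to the next frame while the scale applies immediately.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ToyEglSurface:
    """Hypothetical stand-in for an EGL window surface on Wayland."""
    width: int
    height: int
    scale: int = 1
    latched: bool = False                      # back buffer already picked
    pending: Optional[Tuple[int, int]] = None  # resize waiting for next frame

    def make_current(self) -> None:
        # Assumption from the log: this call latches the back buffer.
        self.latched = True

    def resize(self, width: int, height: int) -> None:
        if self.latched:
            # Too late for this frame; the new size is only remembered.
            self.pending = (width, height)
        else:
            self.width, self.height = width, height

    def set_buffer_scale(self, scale: int) -> None:
        # The scale is a client request and takes effect with this frame.
        self.scale = scale

    def swap_buffers(self) -> Tuple[int, int, int]:
        # Commit whatever buffer was latched, then apply a deferred resize.
        committed = (self.width, self.height, self.scale)
        if self.pending is not None:
            self.width, self.height = self.pending
            self.pending = None
        self.latched = False
        return committed


surface = ToyEglSurface(501, 401)
surface.make_current()
surface.resize(800, 600)
surface.set_buffer_scale(2)
first_commit = surface.swap_buffers()
second_commit = surface.swap_buffers()
```

In this model the first commit still carries the old 501x401 buffer together with scale 2, which is the mismatch that gets the client killed; only the second commit has the 800x600 size.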
<wlb> weston Issue #778 opened by Hyomin Kim (hyoputer.kim) The mouse button release signal is not called after moving the surface https://gitlab.freedesktop.org/wayland/weston/-/issues/778
<MrCooper> so the client must only change buffer scale if it also calls eglSwapBuffers?
<kchibisov> They must change the buffer scale before the buffer gets latched.
<kchibisov> So before the operations listed in egl spec (buffer age, etc), eglMakeCurrent(probably a bug), eglCreateContext(if there's any other current context on the calling thread, probably also a bug).
<kchibisov> The client though, could ensure that their size actually applied by querying the size.
<kchibisov> So they could try to resize, check if they resized, and if the sizes match they could set buffer scale.
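A minimal sketch of that "resize, check, then scale" idea follows. The helper `scale_to_apply` is a hypothetical name, and how the effective size is queried is left open in the log, so the queried size is passed in as a plain argument here.

```python
from typing import Tuple

Size = Tuple[int, int]


def scale_to_apply(effective_size: Size, wanted_size: Size,
                   current_scale: int, new_scale: int) -> int:
    """Pick the buffer scale that is safe to send with the next commit.

    effective_size is whatever the size query reports after the resize
    attempt; wanted_size is the size the client asked for.
    """
    if effective_size == wanted_size:
        # The resize really took effect, so the new scale matches the buffer.
        return new_scale
    # The buffer still has the old size: keep the old scale for this frame
    # and retry after the next swap.
    return current_scale


kept = scale_to_apply((501, 401), (800, 600), current_scale=1, new_scale=2)
applied = scale_to_apply((800, 600), (800, 600), current_scale=1, new_scale=2)
```

With the stale 501x401 size the helper keeps scale 1; once the query reports 800x600 it switches to scale 2.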
<kchibisov> Maybe I should stop caring about that and tell every 'my egl application crashes, your toolkit is bugged' bug report that it's not my issue and close them.
<kchibisov> Once fractional scaling is more widely adopted it won't be a thing anyway.
<emersion> it is a client bug though
<emersion> and i will not stop crashing your client
<kchibisov> My client is fine.
<kchibisov> Because I know how to write and when it'll latch.
<kchibisov> But I still believe that eglCreateContext must not latch.
<kchibisov> Because it doesn't even make any sense for it to latch other arbitrary context.
<emersion> jadahl: would it be possible to get this merged? https://github.com/flatpak/flatpak/pull/4920
<pq> _DOOM_ should not use libwayland-server as a generic utility library because if those utilities are lacking in some way, enhancements will probably not be accepted. That goes especially with the event loop stuff, and the other stuff is likely frozen by ABI anyway. So it's a risk having to use something else in the future anyway as your requirements grow.
<jadahl> emersion: i'm not a flatpak maintainer, I can only try to nag again
<pq> kchibisov, you have to be really *really* careful with EGL in order to be sure the buffer size after resize is what you expect. Once EGL or GL internally needs a buffer, its size is locked until the next swapbuffers. Use wl_egl_window API to query it, I think? Or always wl_egl_window_resize() immediately after eglSwapBuffers and live with that.
<kchibisov> pq: I know, I'm just afraid that aggressive 'kill the client' logic could do more harm with egl.
<kchibisov> Like you update your kwin, and then none of the Qt apps launch anymore.
<kchibisov> Because they have broken EGL handling.
<kchibisov> Which worked for them on anything other than Wayland.
<kchibisov> pq: my issue though, is not gl operations, but egl operations which don't do any rendering or affect the buffer.
<kchibisov> At least in the past daniels said that it's a 'mesa issue' here (https://gitlab.freedesktop.org/mesa/mesa/-/issues/6547 , last comment).
<pq> right, you seem to already know everything there is about this
<kchibisov> If we start saying that it's not a mesa issue, then we must update the spec one more time.
<pq> Unfortunately there is no other way to inform clients they are doing glitchy things than outright protocol error.
<kchibisov> I mean, it's fine to do a protocol error.
<pq> It is a Mesa issue if EGL API calls lock the back buffer with no reason to.
<kchibisov> it's just maybe not the right time for it due to known bugs in drivers.
<pq> But EGL API calls that actually do need a locked buffer, it's not a bug.
<kchibisov> pq: I agree, and I'm saying that we have a situation with 1)(mesa latches where it shouldn't) which affects Qt.
<pq> Qt? You were not talking about winit or glutin?
<kchibisov> I was talking about the situation we have in general.
<kchibisov> My code is resistant to that issue, because I'm well aware how it works.
<pq> ok
<kchibisov> You'll see Qt bug being linked.
<kchibisov> While it's not an issue right now, because you can't possibly crash due to that with the scaling from wl_surface.enter, once wl_surface.preferred_buffer_scale is used it'll start crashing in addition.
<kchibisov> Simply because you render the first frame with the scale of 1, and a scale of 1 plus a broken size is not a real issue other than a 'glitch'. But the new event will be delivered to you along with the first configure; you'll apply that scaling and crash if the latched buffer was not divisible.
<pq> No, you definitely should crash regardless of where you got your scale factor from.
<kchibisov> I mean, that with legacy you render first frame at scale of 1.
<pq> The protocol error is that the client is internally inconsistent: it promises the buffer size + viewporter results in integer surface size, and it doesn't.
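The consistency rule pq refers to can be sketched as a small check. This is a simplified illustration that ignores viewporter entirely; `surface_size` is a hypothetical helper and not any compositor's real code.

```python
from typing import Optional, Tuple


def surface_size(buffer_width: int, buffer_height: int,
                 scale: int) -> Optional[Tuple[int, int]]:
    """Return the surface size in logical pixels, or None if inconsistent.

    Without a viewport, the surface size is the buffer size divided by the
    buffer scale, and that division has to come out as whole numbers.
    """
    if scale < 1:
        raise ValueError("buffer scale must be a positive integer")
    if buffer_width % scale != 0 or buffer_height % scale != 0:
        # This is the case where a compositor may raise a protocol error.
        return None
    return (buffer_width // scale, buffer_height // scale)


stale = surface_size(501, 401, 2)
resized = surface_size(800, 600, 2)
unscaled = surface_size(501, 401, 1)
```

The 501x401 buffer is fine at scale 1 but has no valid surface size at scale 2, while 800x600 at scale 2 gives a 400x300 surface.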
<kchibisov> I understand, but I'm saying that client doesn't really control the buffer size with mesa.
<pq> oh, the initial scale 1
<pq> client does control the buffer size with Mesa EGL, but it is very hard to get it right.
<kchibisov> You have a broken result even when you 1) Create a EGL surface of size 800x600 2) and the next operation you do is an instant resize to 900x700.
<kchibisov> So whatever you've passed initially will be used.
<pq> yeah, you'd better use the right size from the beginning.
<kchibisov> Right, but it's sooo common, because folks don't want to have a Nullable type.
<pq> It's really wasteful to create one size and then immediately resize.
<kchibisov> The issue I'd at least fix is the eglCreateContext/Surface latching other context.
<pq> I don't understand how a Nullable type relates here.
<emersion> pq, if you create an EGLSurface with arbitrary size at init time, your surface is guaranteed to never be null
<emersion> but yeah it's not great
<pq> yes, I'd say those are bugs
<emersion> allocating large slices of GPU memory is slow
<kchibisov> I've seen a lot of folks doing a 1x1 wl_egl_window stuff.
<kchibisov> And then they resize to the right thing on new configure.
<kchibisov> I've fixed 3-4 other folks' clients myself due to that...
<kchibisov> And there's still a Qt issue due to exact same reason...
<pq> What do you want to happen? Compositors stop sending the protocol error for a while?
<kchibisov> I would at least not send them for dmabuf for a while.
<pq> Compositors certainly can stop sending protocol errors if they want to.
<kchibisov> If it was an error from wl_shm they must kill the client.
<pq> but if compositors stop sending the error, then the problem disappears. Why would any client side get fixed then?
<emersion> that sounds like a bad workaround
<pq> can people actually see the glitch?
<kchibisov> I can.
<pq> oooh, that gives me an idea
<kchibisov> That's the reason I'm aware of all of that, because I was fixing the glitch.
<emersion> EGL can do shm, too
<kchibisov> Hm, right...
<kchibisov> And it'll have the exact same logic as dmabuf path.
<pq> instead of a protocol error, a compositor could use placeholder "this window is bugged" graphics on the wl_surface until the size requirement is respected again.
<pq> make the glitch much more severe while not killing the app
<kchibisov> Just draw a bright pink window, it's really annoying.
<pq> yeah, or if you want it fancy, have a text in it saying to file a bug or something - so if it doesn't go away immediately and the window becomes unusable, the user gets a clue.
<kchibisov> Though, the thing is that once fractional scaling is in use, this glitch sort of goes away.
<pq> why?
<kchibisov> you'll have a wrong scaled client, but you can't kill it anymore.
<pq> huh? why?
<kchibisov> Because the client asked to be scaled to dst via viewporter.
<pq> where does the requirement of integer wl_surface size disappear?
<pq> doesn't fractional scaling depend on viewporter, which still guarantees integer wl_surface size?
<kchibisov> I mean, it's all integer, you just can submit a buffer of wrong size.
<kchibisov> And all the buffers are scale 1.
<pq> of course, but that's not a protocol error
<kchibisov> exactly.
<kchibisov> So the issue won't get fixed, but masked.
<pq> we're not after wrongly scaled clients, we're after protocol errors
<kchibisov> yeah, that's true.
<pq> wrongly scaled clients are client bugs, we compositors don't care
<kchibisov> I'm just saying that the root cause might not get actually fixed.
<pq> right
<kchibisov> Like if all toolkits do fractional scaling and all compositors do fractional scaling, the real issues will be tolerated.
<pq> it just joins the set of other client bugs that a compositor cannot detect or warn about - life as usual
<kchibisov> And given that Qt and kwin can do fractional scaling, the issue is not that big of a deal for kwin.
<pq> yup, unless people see the glitch, which the compositor now cannot make more visible either
<kchibisov> The only compositor where you can really observe glitches is sway, because it's tiling.
<pq> oh, I thought you were able to see the glitch on a floating window
<kchibisov> I know that glitch is observable in gnome and sway.
<kchibisov> And in gnome the way to observe it is to try to start your window maximized.
<kchibisov> If the buffer is not matching what gnome wants it won't make it maximized.
<q234rty> tbh I'm running a patched wlroots with that particular protocol error removed since the hidpi Xwayland patch I'm using also triggers that
<pq> right, subtle
<kchibisov> Maybe making Qt apps crash for a while is a way to go, it's widely used so it could get attention to get it fixed.
<kchibisov> Oh, if they pick initial dimensions like 480x240 they'll work around the bug.
<kchibisov> (technically 120x120)
<pq> emersion, thanks for noticing that Weston code with unmerged kernel UAPI.
<wlb> weston Merge request !1320 opened by Philipp Zabel (pH5) doc: add workaround for doxygen 1.9.6 bug with cairo >= 1.17.6 https://gitlab.freedesktop.org/wayland/weston/-/merge_requests/1320
<wlb> weston Merge request !1321 opened by Philipp Zabel (pH5) libweston: Document struct weston_mode https://gitlab.freedesktop.org/wayland/weston/-/merge_requests/1321
<wlb> weston Merge request !1321 merged \o/ (libweston: Document struct weston_mode https://gitlab.freedesktop.org/wayland/weston/-/merge_requests/1321)
<wlb> weston/main: Philipp Zabel * libweston: Document struct weston_mode https://gitlab.freedesktop.org/wayland/weston/commit/40df321d7c50 include/libweston/libweston.h
<wlb> weston Merge request !1320 merged \o/ (doc: add workaround for doxygen 1.9.6 bug with cairo >= 1.17.6 https://gitlab.freedesktop.org/wayland/weston/-/merge_requests/1320)
<wlb> weston/main: Philipp Zabel * doc: add workaround for doxygen 1.9.6 bug with cairo >= 1.17.6 https://gitlab.freedesktop.org/wayland/weston/commit/f8611607ecb5 doc/sphinx/meson.build
riteo has joined #wayland
<riteo> hi!
<riteo> kennylevinsen: sorry for the delay, I have one right here
<riteo> one sec...
<riteo> there
<riteo> the popup gets done, it gets a new buffer attached, gets destroyed, commits, syncs (I don't remember if I added it for good luck) aaand... Errno 32.
<riteo> (the segfault is artificial)
<riteo> uh by get destroyed I meant damaged, sorry
<kennylevinsen> are you reading/flushing the display yourself? if this had been a dispatch, it should have shown the protocol error that caused the disconnect
<riteo> I'm handling event handling myself yes
<riteo> but I do get protocol errors
<kennylevinsen> it could look like you flushed yourself, and exploded immediately as write failed despite there still being stuff to read
<riteo> Actually I don't think I ever manually flushed
<kennylevinsen> I don't see a protocol error here, just "error 32 while flushing the Wayland display"
<riteo> sorry, I meant that I usually get protocol errors
<kennylevinsen> (protocol error means that you received a message from the server telling you exactly what crime the client committed, followed by disconnection)
<kennylevinsen> right
<riteo> oh I wrote flushing in the error message thinking about it
<riteo> lemme check
<riteo> oops I indeed flush in the main polling thread lol
<riteo> there's an error condition in there though
<kennylevinsen> there are times it makes sense to flush, but you should not die on the failed *write* - wait for the failed dispatch/read
<riteo> and as I said I sometimes get protocol errors, also I have other logic to read the error so I'm not sure what I'm doing wrong there
<kennylevinsen> without the debug log *with* the protocol error, I cannot say what is done wrong
<kennylevinsen> but the complete log should show the exact sequence of events causing the issue
<riteo> oh wait, so I should keep dispatching even after a failed flush?
<kennylevinsen> dispatch until failed dispatch
<riteo> all right, let me change the code
<kennylevinsen> if you stop after failed flush, you might still have stuff to read (such as the protocol error!)
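The point about not stopping at the failed write can be shown with plain Unix sockets. This is a sketch under stated assumptions: it uses Python's `socket.socketpair()` instead of libwayland, the helper `drain_after_failed_write` and the message payload are made up for the illustration, and the peer plays the role of a compositor that queues an error and then disconnects.

```python
import socket
from typing import Tuple


def drain_after_failed_write(conn: socket.socket) -> Tuple[bool, bytes]:
    """Try to write, then keep reading until the read side reports EOF.

    Returns (write_failed, pending_bytes).
    """
    write_failed = False
    try:
        conn.sendall(b"request")
    except OSError:
        # The peer is gone, but do not give up yet: data it queued before
        # hanging up is still waiting to be read.
        write_failed = True

    chunks = []
    while True:
        data = conn.recv(4096)
        if not data:
            # End of stream: this is the failure to stop on.
            break
        chunks.append(data)
    return write_failed, b"".join(chunks)


client, server = socket.socketpair()
# Stand-in for a compositor posting an error and then disconnecting.
server.sendall(b"protocol error: example")
server.close()

write_failed, message = drain_after_failed_write(client)
client.close()
```

The write fails because the peer has already closed, yet the queued message is still delivered by the reads that follow. A client that exits on the failed write never sees it.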
<riteo> ohhh I see now, thanks for letting me know
<riteo> noooooo
<riteo> > xdg_surface@71: error 3: xdg_surface has never been configured
<riteo> it is indeed the dreaded issue I linked then
<riteo> thanks a lot for helping me with debugging the issue btw!
<riteo> sooo... I can only hope that mutter/kwin have some saner behaviour, otherwise we're in a bad situation
<riteo> but first I'll make a new log
<riteo> wait all of a sudden it looks like it doesn't dispatch anymore, it just spins...
<kennylevinsen> Unless it's a compositor bug, you're always in a bad place when you have protocol errors - even if other compositors are more lenient... :)
<kennylevinsen> but might be fixable
<riteo> I just noticed that in the docs it recommends to call the dispatch method both if the queue is empty and _after_ cancelling/reading the events: https://wayland.freedesktop.org/docs/html/apb.html#Client-classwl__display_1a40039c1169b153269a3dc0796a54ddb0
<riteo> why?
<riteo> sorry I meant "is _not_ empty"
<kennylevinsen> it dispatches until there is nothing left to dispatch (first loop), then writes, waits for data, and dispatches regardless of whether poll failed
<kennylevinsen> Dispatching events first ensures that you do not block waiting for new stuff to read when you already have old stuff to do (deadlock), and dispatching after is... Well because you just read stuff so you have work to do.
<riteo> I put this into a while loop though
<riteo> so it'll prepare read and dispatch right away
<riteo> it's a dedicated thread
<kennylevinsen> In that case you can just as well call wl_dispatch
<riteo> twice?
<kennylevinsen> *wl_display_dispatch
<kennylevinsen> Instead of rolling your own loop
<riteo> oh there's a reason
<riteo> we need a mutex
<riteo> the main thread can do some stuff so locking avoids changing/reading data while the events thread is still changing it
<kennylevinsen> Either way, follow the suggested standard method unless you're certain you do not need it - and certain you want to debug thsy yourself ;)
<kennylevinsen> *that you dumb phone
* riteo shrugs
<riteo> I'll just implement it as the docs say
<riteo> well, as I said I'll make a new log, send it here and... dunno, the issue is known
<riteo> it's definitely a bug though as inert objects should just take requests as no-ops
<riteo> that's just the nature of the asynchronous protocol
<emersion> drakulix[m], d_ed[m]: do you want to meet sometime this week?
<drakulix[m]> I was under the impression that there is a KDE conference this week and thus they couldn't make it.
<drakulix[m]> And given the protocols we want to talk about, I was thinking next week might be a better time for the next w-p meeting.
<emersion> oh right
<emersion> sorry, i didn't remember
<drakulix[m]> btw, do I remember correctly, that you wanted to post the minutes to the wiki?
<riteo> same thing but now we have a nice more reasonable debug log, nice :)
<riteo> well, I'll experiment and see how's the situation going elsewhere. Thanks for everything, cya!
riteo has quit [Quit: epic error moment]