ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
mbrost_ has quit [Ping timeout: 481 seconds]
rasterman has quit [Quit: Gettin' stinky!]
anujp has quit [Ping timeout: 480 seconds]
anujp has joined #dri-devel
<Company>
people just need to learn that there's 3 different names for each gpu: kernel driver, GL driver, Vulkan driver
<Company>
and those names are creatively chosen, not because they make sense
<Company>
but I can confirm from Gnome that people who aren't aware of this get very confused by those names
<airlied>
in some cases they made sense when they were chosen, but the meanings have shifted
<alyssa>
for us, kernel=gl=asahi, vk=honeykrisp which.. could be worse
<Company>
also, Gnome lost its creativity because we named everything gnome-thing
<feaneron>
you can't say that with a straight face
<Company>
which worked for a few years but now that we want to replace older apps with newer ones, we don't have names available anymore
* feaneron
named his app Boatswain, and all types start with Bs
<feaneron>
BsWindow, BsStreamDeck, it's all bs
<Company>
i'm still salty that it's called gnome-text-editor because typing that in a terminal takes way too long
<Company>
and autocomplete doesn't work either, because everything starts with "gnome-"
<zf>
KDE had the right idea :-)
<alyssa>
Company: rip gedit
<alyssa>
wait is gedit not gnome-text-editor
mbrost has joined #dri-devel
mattst88_ has quit []
<feaneron>
it is not
mattst88 has joined #dri-devel
<mattst88>
nope, separate thing
<mattst88>
gnome-text-editor is the new thing
<alyssa>
joy
<Company>
maintainer issues caused a fork
<Company>
so gedit still exists
<Company>
bunch of projects had some reckoning when the gtk4 transition happened, both because gtk4 nudged very hard towards design changes, from menu + toolbar towards "touch" design
<Company>
and because backends that weren't reasonably clean and operating under an X model suddenly had to deal with a toolkit that didn't bend over backwards to make that work on Wayland
<Company>
so you can no longer do updates via XCopyArea()
<pac85>
I thought gnome-text-editor was done from scratch, it looks very different than gedit (at least how I remember it)
<Company>
not sure - but 90% of it is GtkSourceView and that remained a thing
<Company>
it's either gedit with all the plugin stuff deleted, or redone from scratch around the GtkSourceView port to GTK4
<Company>
it's a case of "what do I do now that there's a lockdown?"
<alyssa>
i mean go back far enough and krita is a gimp fork right? ;P
The_Company has joined #dri-devel
Company has quit [Ping timeout: 480 seconds]
<DemiMarie>
alyssa: why is the userspace code not named Asahi too?
Company has joined #dri-devel
<DemiMarie>
Company: Yup, GTK4 very very much pushes one towards a certain design style, which makes porting some applications almost impossible. I have no idea how I would port Horizon EDA.
<Company>
DemiMarie: there's 2 answers to that. One of them is to find some modern UI designers to take on that topic, and change the parts that don't fit well anymore
<DemiMarie>
Company: fit well with *what*?
<Company>
DemiMarie: and the other option is to do an alternative/companion to libadwaita that focuses on that older style of application design
<DemiMarie>
Is the existing UI objectively bad in some way, even for applications that do not need touch support?
<Company>
none of that design is about touch support, that's just what people call it
<DemiMarie>
Is there scientific evidence that the newer design is objectively (as opposed to subjectively) superior?
<Company>
I have no idea
<Company>
though I'd be interested in how anyone would quantify "better" for design
<DemiMarie>
"How long does X take?" studies would be the most obvious one I can think of, but there are people (not me) who actually do work on this stuff.
<Company>
but people like developing apps this way, so that's what's happening
<DemiMarie>
Is GTK no longer intended to be used without libadwaita or another platform library?
<Company>
I always compare it to the web for the answer
<Company>
can you make a webpage without some framework? Sure. Should you? Probably not.
<Company>
GTK is trying to push the widgets that imply some sort of UI design out of the platform
<Company>
and focus on the core building blocks
<Company>
but that leaves you without the base widgets that make up the UI - sidebars, headerbars, toolbars, statusbars, etc (why are those all bars?)
<Company>
and you also have no design language, i.e. no consistent spacing, no good color/contrast choices, all the theming stuff is missing
<Company>
and that's basically what a framework/platform lib gives you
<Company>
if someone made a library with a toolbar widget, a menu and some MDI docking widget, so that you could implement Gimp's and Inkscape's UI with it, then you could port those and the whole Cinnamon/Mate apps to it and you could probably find a bunch more
<Company>
gedit!
<Company>
but you need to find someone who wants to create that library, and there's been a distinct lack of interest for years
libv has quit [Ping timeout: 480 seconds]
<DemiMarie>
At least some are just leaving GTK instead.
<Company>
that's also an option - depends on how much UI you have and how well other stuff fits
NiGaR has quit [Remote host closed the connection]
NiGaR has joined #dri-devel
NiGaR has quit [Remote host closed the connection]
<Company>
all I know is that the Gnome community is not gonna make it happen
<Lynne>
is there some database of the throughput of GPUs in terms of 32/16/8-bit (non-matrix) integer ops?
sima has joined #dri-devel
<MrCooper>
RAOF: my recommendation is to just signal the release point when the atomic commit completion event arrives, not materialize any fence before
<MrCooper>
zamundaaa[m]: even a client which prefers signalled release points might reasonably say "the release point has a fence, I can re-use this buffer, don't need to allocate another one"
Perseverance has joined #dri-devel
bnieuwenhuizen has quit [Quit: Bye]
bnieuwenhuizen has joined #dri-devel
Fijxu has quit [Ping timeout: 480 seconds]
Fijxu has joined #dri-devel
lynxeye has joined #dri-devel
<emersion>
sounds like a broken client
<MrCooper>
why? That's how I'd implement it for dynamic number of buffers
rgallaispou has joined #dri-devel
<MrCooper>
if there being a fence doesn't imply the buffer can be re-used, what does?
<emersion>
MrCooper: it's easy to wait for the timeline point to be signalled
<emersion>
as opposed to materialized
<MrCooper>
right, then there's no point materializing the release point from OUT_FENCE_PTR though?
<MrCooper>
materializing the release point with a compositor GPU work fence and the client re-using that buffer does make sense though
<emersion>
MrCooper: hm, what's the difference between a GPU work fence and a KMS out-fence? why does it make sense to use one but not the other?
<emersion>
i suppose because one will happen "sooner" than the other?
<MrCooper>
GPU fence is guaranteed to signal ASAP, OUT_FENCE_PTR might miss a display refresh cycle
<MrCooper>
and can't signal before the next cycle in the first place
<emersion>
right
* lynxeye
seems to be equally confused as emersion and RAOF
<emersion>
i agree
<emersion>
now i'm kinda wondering why OUT_FENCE_PTR exists in the first place
<lynxeye>
What's the point of the out fence if you can use it to wait for the GPU (scanout) to be done with the buffer?
<MrCooper>
emersion: as a trap? ;)
<emersion>
:D
<emersion>
it would be useful for writeback, maybe
<MrCooper>
it could be used instead of completion events in some cases
<MrCooper>
e.g. when turning off a CRTC?
LeviYun has joined #dri-devel
<lynxeye>
Wasn't the point of the fence that you could use it to pass back to whoever is waiting for the buffer to be free again instead of waiting for the atomic commit completion event and signaling that back to waiters?
<MrCooper>
lynxeye: if it was, it was ill-conceived
jkrzyszt_ has joined #dri-devel
mripard has quit [Quit: WeeChat 4.4.2]
<sima>
MrCooper, emersion lynxeye so for some tiling gpus it actually makes sense to start rendering while the flip is scheduled but hasn't happened yet
<sima>
when you're extremely limited on memory you can kinda get 3 buffers but only allocate 2 by max pipelining everything
<sima>
and with tilers you can do the tiling preprocessing while the buffer isn't needed yet
<sima>
android apparently does that for some platforms, which is why that out fence exists
<sima>
there's also an entire can of worms on the kernel side around that out fence breaking the kernel-internal rules to avoid dma_fence deadlocks
<lynxeye>
MrCooper: agreed. I mean the point of in and out fence FDs was to allow explicit fencing. But the out fence is useless for that, as scanout is potentially still using the FB after the out fence has signaled if no flip away from this FB is scheduled
<sima>
and I haven't found out a good way to fix it yet
<sima>
so, it's a bit a mess
<sima>
lynxeye, that sounds like compositor bug to me
warpme has joined #dri-devel
<lynxeye>
sima: Huh? Unless your tile buffer covers the whole FB (which I guess is pretty unlikely) you can't start rendering even on a tiler until the buffer isn't scheduled for scanout anymore.
<sima>
lynxeye, yeah hence the fence, because that allows you to queue up the entire batch to the gpu already and the gpu to start with vertex shader and putting the vertices into the right buckets
Fijxu has quit [Ping timeout: 480 seconds]
<sima>
hwcomposer v1 went even further and made that out fence a full future fence iirc
<sima>
and just assuming that both surface flinger and the app would get around to scheduling the flip
<lynxeye>
sima: Okay, so OUT_FENCE_PTR is really just a footgun for everyone not reading the docs carefully in the general (non-android) case.
<sima>
yeah I think unless your use-case is "extremely memory limited machine where you realistically can only ever allocate 2 buffers and are willing to trade latency for throughput by pipelining everything as deeply as possible" you do not want it
<sima>
maybe we should add that to the property docs
<sima>
if someone volunteers to type this patch I could upfront r-b it?
<lynxeye>
sima: I guess I can add some words of caution there. I still don't see how signaling that fence without a follow-up flip queued does improve pipelining. I do see how making this fence wait for queued flips before signaling would add lots of state tracking to the kernel, as the fence is on a atomic commit level, not the much simpler FB level.
Fijxu has joined #dri-devel
<sima>
lynxeye, hm maybe we're talking past each another still
<sima>
so you have two buffers, A currently being scanned out, B finished rendering, flip queued up
<sima>
now in the old days you'd need to stall the opengl stack until A becomes available
<sima>
with the out-fence userspace/app can start to queue up rendering and push it to the kernel
<sima>
and the gpu even start with vertex processing
<sima>
even while the flip is still queued up
<sima>
and if you miss the next vblank then sure there's going to be a stall, but the assumption is that you simply do not have memory for buffer 3, so there's no way to avoid it
<sima>
also with a more modern app/driver stack like vk you could do this yourself mostly
<sima>
but this was designed back when command parsing and relocations in the kernel were all the rage, and gl drivers were a lot dumber
<sima>
so queuing up in userspace means you still had to do that kernel work when the buffer was finally available
<lynxeye>
sima: Yea, I guess it makes sense for the case where you expect rendering and flips being busy. It's just that it doesn't make a lot of sense to signal that fence when there is no B flip queued up, which is what causes the footgun trap.
<sima>
lynxeye, yeah that's just compositor being broken
<sima>
but that's the same for gpu compositing, if you just hand back a random out-fence for a buffer that has no relationship with when the buffer isn't in use anymore, then yeah that's a bug
mlankhorst has quit [Ping timeout: 480 seconds]
<sima>
like gpu compositing might also need to later on recomposite, and if you've already signalled the out_fence for the only current buffer you have from the app, you're broken
<sima>
exact same thing applies to kms, except with kms direct plane scanout the recompositing is guaranteed
<lynxeye>
sima: It means the compositor can't implement a fenced release of the buffer by using the KMS out fence. With GPU rendering you can do that: if you know the client has a new buffer lined up for the next time you render, you can use the GPU render fence to pass back to the client for fenced release. If you do the same with the KMS out fence you might miss the vblank for scheduling the next flip and the fence signals too early, so the compositor/client can't use that for fenced release.
<sima>
oh I forgot: on some android they actually used the manual scanout buffer to hold the frame, so they could signal the buffer much earlier
<sima>
but that's again future fence semantics
<sima>
lynxeye, you only get the out_fence when you've submitted the flip to the kernel
<sima>
at which point if you miss the vblank your entire screen eats the miss and there's nothing you can do
<sima>
if you try to hand the fence back earlier it's a future fence, with all the perils that entails
Fijxu has quit [Ping timeout: 480 seconds]
<MrCooper>
lynxeye: the fence doesn't signal before the atomic commit completes
<sima>
and if you drop a frame because it's not ready but still hand out the out_fence for that flip back to the app, you're just broken
<lynxeye>
sima: You get the out fence when you submit the flip _to_ this buffer and it will signal when the buffer has been scanned out once. What you want for fenced release is a fence that singals when a next flip _away_ from that buffer is scheduled and scanout is done.
Fijxu has joined #dri-devel
<sima>
lynxeye, yeah that's just busted
<MrCooper>
which is indeed what RAOF's plan was presumably, it's still problematic for the reasons I describe
<MrCooper>
*described
<sima>
and unless you're ok with trading latency for more throughput with deeper pipelining, you shouldn't even do that 2nd approach, like MrCooper explained
<sima>
but the first one you've described is just plain broken
<sima>
and it would also be broken on the gpu compositor path, if your compositor ever needs that frame again for a recomposition
<sima>
unless you go with the X11 school of "surely background color is an acceptable fallback to a damage event"
mlankhorst has joined #dri-devel
<lynxeye>
sima: Again, it works for the render composition path if you know the client has lined up a new buffer for the next composition cycle, as the next render composition will use the new buffer for sure. So you can pass the render fence to the client for fenced release (which is what weston does today). With the KMS fence that's not possible; even if you have a flip to another buffer scheduled it might still miss the vblank and you end up with scanout reusing the buffer after the out fence has signaled.
<sima>
lynxeye, how?
<sima>
like buffer A is currently being scanned out, B is queued up with a kms flip
<sima>
you tell the app that it can render into A as soon as the out_fence for B has signalled
<sima>
the kernel misses a vblank
<MrCooper>
lynxeye: again, the fence doesn't signal before the commit completes
<sima>
the out_fence is delayed appropriately, it will not signal on the next vblank, but the next vblank after the flip actually happened
<MrCooper>
so what you describe can't happen
<sima>
so how does the app manage to render into A while it's still being scanned out?
lfryzek has quit []
Fijxu has quit [Ping timeout: 480 seconds]
<sima>
the docs should be really clear on this, if they're not we need to fix them
<MrCooper>
lynxeye: the commit missing a refresh cycle is a problem in itself
<sima>
but that's not a "latency/throughput tradeoff" but a "it's just broken" thing
Fijxu has joined #dri-devel
<lynxeye>
sima: Now I'm confused again. Why would the app want to wait until the out fence for B signals if it wants to render into A? That's a full scanout cycle of latency. Surely it would want to start rendering into A as soon as the display engine has flipped away from A?
<sima>
lynxeye, that's what will happen
<sima>
where do you see an additional vblank happening?
<sima>
the out_fence for B signals the moment the hw stops scanning out A and starts scanning out B
<sima>
it does _not_ signal when the hw has finished scanning out B for the first time
<lynxeye>
argh, seems I read the docs wrong _again_
<sima>
it's an out_fence for the flip itself, not for buffer B
<sima>
same applies to gpu compositing
<sima>
you don't need to wait with the out_fence for buffer A until you've finished rendering with B
<sima>
all you need is wait until you've committed to rendering with B and don't need A anymore
<sima>
so the out_fence for A would be whatever the end-of-batch fence for the last render job that used A is
<sima>
not the end-of-batch fence of the first render job that uses B
<sima>
otherwise you have a notch too much latency in your signalling
<sima>
exact same story with kms
Fijxu has quit [Quit: XD!!]
Fijxu has joined #dri-devel
lfryzek has joined #dri-devel
<lynxeye>
right, so it's totally fine to plug the out fence from commit with buffer B into the fenced release for buffer A.
lfryzek has quit []
jsa1 has quit [Remote host closed the connection]
<MrCooper>
it works correctly, so it's "fine" if you don't care about the issues I described
rasterman has joined #dri-devel
lfryzek has joined #dri-devel
<lynxeye>
MrCooper: But isn't this a policy decision you would want the client to make? If it doesn't want to allocate more buffers for any reason, it can pick a buffer with an unsignaled release fence for the next rendering, potentially waiting for a missed vblank or whatever. If the client cares about avoiding that, it should pick a buffer with the fence already signaled or potentially allocate a new one. How is this different from a GPU render fence received from the compositor?
<MrCooper>
how can the client know if it's a KMS fence or a GPU fence?
<MrCooper>
(warning, rhetorical question :)
<MrCooper>
if it can't, it can't make that choice
Fijxu has quit [Ping timeout: 480 seconds]
<lynxeye>
MrCooper: Why does the client care? If the buffer release fence is unsignaled it may stay in that state for quite a while, regardless if it's a KMS or GPU render fence. If you want to avoid blocking at any cost, you must use a buffer with a signaled fence.
<MrCooper>
as explained before above: GPU fences are guaranteed to signal ASAP, KMS ones aren't
<MrCooper>
look, feel free not to trust me on this, you can't say I didn't warn you though :)
tobiasjakobi has joined #dri-devel
<lynxeye>
MrCooper: I still don't get the difference. The flip for the KMS fence is queued, so it's ASAP as-in whatever the next reachable vblank is. The job signaling the compositor GPU render fence might be delayed in the same way by another job hogging the GPU queue.
<lynxeye>
If the client want to avoid blocking on buffer availability it must choose a buffer where the release fence is already signaled.
<MrCooper>
to put it differently, it's very unlikely that any future GPU work by the client could start before a GPU fence from the compositor signals anyway, this isn't true for a KMS fence though
<MrCooper>
a job which blocks the compositor GPU work also blocks future client GPU work
<lynxeye>
MrCooper: Agreed, if you talk about a single GPU with one execution queue. In a hybrid setup you might run the composition on a different GPU than the client.
<MrCooper>
that just makes GPU fences more problematic though, not KMS ones less so
<lynxeye>
right
<lynxeye>
I guess what I'm saying is that the client should always expect the fences to be problematic ;)
<MrCooper>
indeed, so clients should probably only re-use a buffer with unsignalled release point if they can't allocate another one
<lynxeye>
yep, randomly picking a buffer just because you received a fenced release for it is a recipe for hurt, maybe just more pronounced right now if the release fence happens to be a KMS one.
<sima>
MrCooper, there's also multi ctx scheduling in pretty much all hw
<sima>
or most at least
sarnex has quit [Ping timeout: 480 seconds]
<sima>
plus if someone hangs on a ringbuffer gpu you look at a multi-second timeout
<sima>
so not sure why the gpu fence is less problematic than the kms one
<sima>
my take is this is down to how much memory you can waste, and where you are on the latency/perf tradeoff
<sima>
which is really tricky
<MrCooper>
right (amdgpu being a notable exception, it's getting there though :), still unlikely that future client work can preempt already-flushed compositor work though
<sima>
yeah you generally don't win against the compositor
<sima>
but the latency/throughput still applies
<sima>
like maybe app wants to queue up the next frame as soon as possible, because it's cpu intensive to do that
<sima>
or it wants lowest latency and does everything super late close to next vblank
<sima>
or whatever the app feels like
dviola has quit [Ping timeout: 480 seconds]
<sima>
so I'm with lynxeye that I'm not seeing why returning a buffer with an unsignalled out_fence is harmful
<sima>
no matter which one
<sima>
if you don't care about memory, just allocate more winsys if you'd block otherwise
<sima>
if you don't, pick the right choice according to your latency/throughput goals
<sima>
ofc if the app is dumb and just blindly starts rendering the moment it gets a frame back
<sima>
you get to keep all the pieces
<MrCooper>
point is that assuming the client uses the same GPU as the compositor, it can use a buffer with unsignalled release GPU fence without penalty, whereas this isn't true with a KMS fence
<MrCooper>
*my point
<sima>
unless the goal was intentionally to not ever waste memory and prioritize throughput
<sima>
MrCooper, but does that case exist?
<sima>
like with a reasonable compositor you don't start the next frame before the previous one finished
<sima>
so if the compositor then picks a new buffer for that rendering, the old buffer is already not in use
<sima>
because the out-fence for the buffer is when it was last used for rendering, _not_ the out-fence for the first rendering of the next buffer
<MrCooper>
not sure what you're asking, it's like the majority of GL and possibly Vulkan apps
<sima>
I'm asking whether the compositor in the gpu path actually ever hands back a buffer with a non-signalled outfence
<sima>
unless it's kinda busted
<MrCooper>
mutter does
<sima>
or the compositor already decided to toss latency overboard, at which point the app trying is pointless
<sima>
how does that happen?
Fijxu has joined #dri-devel
<MrCooper>
it may set the GPU fence of its last compositing work on the release point as soon as the client has attached another buffer, which may be before that work has finished
<lynxeye>
same for weston iirc
<sima>
but that's the "we tossed latency already" case
<sima>
if the compositor doesn't toss latency, it waits with picking which buffer when it composites the next frame
<sima>
at which point the previous has finished
<MrCooper>
not sure what you mean by "toss latency"
<sima>
unless your app renders so fast that the new app frame finishes faster than the gpu compositing of the desktop
<MrCooper>
if anything this helps latency, doesn't hurt it
<sima>
if the compositor already commits to using B while A is still being used it might miss a frame because B isn't done yet
jkrzyszt has joined #dri-devel
<MrCooper>
not what I'm saying
<sima>
so I'm assuming we have a compositor which does a late decision about which frame, shortly before the point of no return for the next vblank
<MrCooper>
commits to using B while GPU work using A is still in flight
<sima>
then quickly queues up gpu work and issues the kms flip
<MrCooper>
gamescope maybe
<sima>
MrCooper, at that point, is B finished rendering or fence still pending?
<MrCooper>
finished (in mutter and most other compositors)
<MrCooper>
actually mutter also does something like what you describe (though the mechanics work a bit differently)
<sima>
so if B is finished, but A isn't yet, the app is rendering much faster than your compositor
<sima>
you're not going to have any problem at all
<MrCooper>
unless you care about benchmark numbers ;)
<sima>
but if the compositor commits to B before it's finished, it's not prioritizing latency
<daniels>
fwiw OUT_FENCE_PTR was indeed written to support the case where people wanted to queue up deeper pipelines of work without necessarily caring about immediate latency or hitches
<sima>
MrCooper, the app allocates more frames to keep the benchmark people happy and the compositor hopefully does mbox semantics for flips?
<daniels>
if you are gunning for the absolute minimal possible latency and getting some kind of new content on every refresh no matter what, then that is not the hammer for you
<sima>
unless you're super constrained and want to limit to just 2 buffers, at which point you'll block until the previous one is available no matter what
<sima>
and would much prefer you can block on a dma_fence since that block point is later
<MrCooper>
sima: the point is that if the client re-uses a buffer with an unsignaled KMS fence, its frame rate will be capped to the display refresh rate
<sima>
MrCooper, isn't that the point?
<sima>
if you want free-wheeling mbox winsys flips, you need to make sure those happen
<MrCooper>
not if the client wants to go as fast as possible, which it can with unsignalled GPU fences
<sima>
and sure usually the kms flip fence takes a bit longer than the gpu flip fence, but it would still not be mbox
<sima>
MrCooper, who says your compositor is not super dense and queued up that gpu fence behind a kms flip out_fence?
<sima>
if you want free-wheel, you need to do that
<MrCooper>
k, this is getting too hypothetical
<daniels>
sima: it's not just about being super-constrained, but by the time you've allocated a buffer with all the disruption that ensues (mmu etc), you're probably too late anyway
<sima>
I'm still not sure what's the use-case beyond benchmark numbers
<sima>
like if you have something like gamescope you still want to free-wheel and absolutely ignore every buffer with unsignalled out-fence
<sima>
daniels, yeah but aside from startup that shouldn't happen during runtime
<daniels>
yeah. if you want to build something like kmscube, then use OUT_FENCE_PTR and it'll be useful to you. if you want to build something like gamescope, build gamescope instead and don't use OUT_FENCE_PTR because it's not useful to you.
<daniels>
I don't think there's anything controversial there
<lynxeye>
I guess that takeaway is simple: if you don't want to block unexpectedly, don't use random buffers with unsignaled release fences, in which case you don't care if it's a render or kms fence and also don't care about compositor policy regarding latency vs. throughput.
<MrCooper>
I never claimed anything else; if you're fine with your compositor potentially producing orders of magnitude lower benchmark numbers than others, go for it! ;)
* daniels
shrugs
<sima>
daniels, there at least was more than just kmscube that wanted out_fence for deeper queues
<sima>
but it's really only "I can allocate 2 buffers but not 3 because simply not enough memory"
<daniels>
or, you do have 3 or 4 buffers, but for whatever reason (ease of design, deep hardware pipelines, slow hardware, whatever), you queue work up long in advance
<sima>
bit dinner time already, but let me try at least
* robclark
neither
<sima>
hm held_lock->references is an uint, at least here
<sima>
so overflowing that with lots of gem bo seems unlikely
<robclark>
tbf this is sparse/vm_bind type thing, so same bo could appear many times (but I think drm_exec should just be skipping the dups)
<robclark>
still, I don't think it would be 2^32
<sima>
robclark, we should skip the already locked ww_mutex for the EDEADLK case, those should never get to lockdep
<robclark>
right
<sima>
well for the initial trylock case, but the lock_acquired is only for the success case
<sima>
robclark, feels like quicksand and I'm scared
<sima>
can you repro this reasonably well?
<robclark>
so far just saw it once when running dEQP-VK.sparse_resources.buffer.\*
<sima>
hm
<robclark>
I can see if it is repeatable, but so far sample size of 1
<sima>
I guess first try to repro reliably or faster because I have no idea what's up here
<sima>
and then maybe we can try to trace ww_mutex_lock for dma-buf and see what's up
<robclark>
k.. just wanted to see if anyone else was familiar with that before I spent more time on it vs debugging $my_bugz
<sima>
nah this sounds like lockdep internals gone very wrong potentially
<sima>
but it's current->held_locks and if you somehow manage to corrupt that I'd expect the entire kernel to crash&burn much earlier
<robclark>
well, I _am_ playing with objs, locks, and mapping .. so can't rule out corrupting things, but yeah, I'd expect more of a fireball if things went wrong there
<sima>
robclark, before you waste time trying to repro
<sima>
ww_mutex_lock on the same lock in a loop, until you've gotten -EDEADLK UINT_MAX times?
<sima>
because I'm not entirely sure that's handled correctly, and it's about the only guess I have about what could go wrong
<sima>
because I'd guess you do not actually have UINT_MAX distinct ww_mutex in your machine
<robclark>
hmm, we could be re-using the same obj (and lock) many times..
<sima>
yeah
<DemiMarie>
MrCooper: why would a program ever want to render more than once per frame? That is just wasting the user's GPU and electricity.
<sima>
and then maybe walk those a few times
<sima>
and then maybe an accounting bug in lockdep so that you exhaust much quicker than 2^32 attempts
<sima>
a stretch at best, but the only one I've come up with
<robclark>
seems plausible
<sima>
robclark, and allocate the lock from a gem_bo or so, because lockdep handles allocated locks differently from static ones
<sima>
just to make sure you're not chasing the wrong phantom here
<robclark>
yeah, will hack that up in ~5 or so.. just looking at a different bug first
<sima>
I'll get myself stuffed with raclette meanwhile, it's ready now
<robclark>
enjoy
<mattst88>
mmmm, raclette
<karolherbst>
alyssa: I didn't even check how it's represented in the spirv, but I expect that's implemented with some new nir intrinsic we'd tell drivers to implement? Don't really know if this feature is all emulated in runtimes or if there are actually hw supporting it natively
<alyssa>
karolherbst: ye, the spirv is a pretty straightforward translation of the cl
<alyssa>
clang hoists the block function into its own function, lays out a structure for the capture, and passes it in as a u8* generic pointer
<karolherbst>
right...
<alyssa>
but the heavyweight enqueue is going to be in driver runtimes
<karolherbst>
I was more wondering to what enqueue_kernel translates to
<alyssa>
similar problem space to DGC in vulkan
<karolherbst>
like what spirv instruction is used there
<alyssa>
it's just an EnqueueKernel spirv instruction
<alyssa>
or something
<alyssa>
with a function pointer thing
<karolherbst>
ohh "OpEnqueueKernel"
<karolherbst>
yeah, now I also find it in the specs :)
<karolherbst>
I'd have ideas on how to implement it on nvidia, just 0 ideas on how to do it in gallium, but maybe that's just gonna be a driver thing in its entirety
<karolherbst>
+ whatever lowering we do in nir
<karolherbst>
there is also the problem of fencing or rather.. events it's called here
<karolherbst>
but that's just "OpGroupWaitEvents" I guess
<DemiMarie>
Where is the right place to request a Mesa-specific extension? WebGL and WebGPU implementations generating SPIR-V are having to resort to ugly hacks because Vulkan requires that all shaders terminate.
<airlied>
why would that be mesa specific?
<karolherbst>
at least there is no relation between API and kernel events :) that would be cursed
<robclark>
sima: no luck with the repeated locking in a loop, so I guess it is more complicated to repro than that
<robclark>
it seems to be dEQP-VK.sparse_resources.buffer.ssbo.sparse_binding_aliased.buffer_size_2_24 which triggers it
<robclark>
hmm, well maybe that was just coincidence
<pixelcluster>
DemiMarie: can you clarify? what hacks are you talking about? "shaders that don't terminate" doesn't sound like something that will really work ever to me but I might be missing something
<sima>
robclark, hm running low on ideas then
lynxeye has quit [Quit: Leaving.]
<karolherbst>
DemiMarie: shaders not terminating would need an entirely new entry point to be launched and probably have to be compute only
<karolherbst>
if you want to do compute, don't use graphics
<pixelcluster>
well, what valid webgpu/webgl app would ever want non-terminating shaders in the first place?
<karolherbst>
vulkan probably will need an extension for long running compute jobs anyway to implement CL on top of it tho
<pixelcluster>
oh yeah, for actual compute tasks I totally see the use case for persistent thread-style stuff
<karolherbst>
pixelcluster: at some point non-terminating or shaders running for 5 minutes doesn't really make much of a difference :D
<pixelcluster>
just not... webgl :P
<karolherbst>
so yeah.. I can totally see the use case for compute
<karolherbst>
but it might require a special entry point to launch such tasks
<karolherbst>
"just make shaders run long" won't fly
<pixelcluster>
yeah, I'm also kind of unsure about the lower levels of the stack
<karolherbst>
though with vulkan it could be an extension struct passed on the enqueue for compute have special timeout properties
<karolherbst>
*having
<karolherbst>
like "vk_ext_explicit_compute_timeout" or something
<DemiMarie>
pixelcluster: If a non-terminating shader was guaranteed to either cause `VK_ERROR_DEVICE_LOST` or just hang, that would be fine. The problem is that compilers are allowed to assume that shaders terminate, and a malicious shader can use this to defeat compiler-inserted bounds checks.
<karolherbst>
and then disallow using the fences for anything not compute
<karolherbst>
DemiMarie: well you can't prove they'll never terminate
<pixelcluster>
DemiMarie: well, such bounds checks failing should never constitute a security risk
<pixelcluster>
should they?
<DemiMarie>
pixelcluster: yes, they do
<karolherbst>
normal drivers will nuke the context anyway
<karolherbst>
so where is the issue?
<karolherbst>
or rather.. drivers should by default set up sane timeouts for GPU jobs
<pixelcluster>
how is OOB access by shaders a security concern?
<pixelcluster>
as in, this sounds like an issue that should rather be fixed
<karolherbst>
DOS is also a security issue
<pixelcluster>
oh yeah
<DemiMarie>
pixelcluster: The shaders are provided by websites, which are untrusted and might be malicious.
<pixelcluster>
but you get those even if you don't have the bounds check problem
<karolherbst>
DemiMarie well it's up to the browser to ensure there are no data leaks across tabs
<karolherbst>
if a browser uses the same VK "context" across tabs it's a bug in the browser
<karolherbst>
because it opted into sharing state across tabs by doing so
<pixelcluster>
DemiMarie: sure, ok, let me ask another way: how can OOB accesses result in worse behavior than having your context killed (which is what non-termination will do anyway, so it must be fine)?
<DemiMarie>
karolherbst: Is VK OOB guaranteed to not corrupt CPU memory, and do real browser implementations actually do this?
<DemiMarie>
pixelcluster: they can allow a website to access or tamper with data that it should not be able to access, and therefore perform a security exploit.
<karolherbst>
so it's like one application doing OOB can't mess with other applications
<karolherbst>
same thing
<DemiMarie>
karolherbst: Web browsers must not allow arbitrary native code execution, sandboxed or otherwise.
<karolherbst>
if your GPU doesn't have an MMU to do virtual memory, then maybe don't use the GPU
<pixelcluster>
well
<pixelcluster>
if the GPU doesn't have a MMU then how is it going to implement Vulkan
<karolherbst>
DemiMarie: well they do
<karolherbst>
and anyway, that's not the issue
<karolherbst>
an OOB access can't corrupt the state of other applications or GPU contexts
<DemiMarie>
karolherbst: there are of course vulnerabilities, but they are vulnerabilities, and that means that they must be fixed
<karolherbst>
if it can in your browser, it's a browser bug
<DemiMarie>
the issue is that an OOB access can corrupt the state of the application performing the access
<karolherbst>
yeah, life is rough sometimes
<karolherbst>
that's why browsers should sandbox their tabs
<DemiMarie>
That is good enough for almost everything, but browsers need to guarantee that there are no OOB accesses at all.
<karolherbst>
they don't
<DemiMarie>
karolherbst: that's why you should uses Chromium
<DemiMarie>
karolherbst: they try, and when they fail, they try again :)
<karolherbst>
an OOB access is pretty harmless if you ignore driver bugs
<karolherbst>
what's the threat model here anyway? you visit a website and.. it hangs your tab?
<pixelcluster>
actually
<karolherbst>
web browsers have all the tools necessary to isolate things properly
<pixelcluster>
which guarantees about OOB access are we even talking about
alyssa has left #dri-devel [#dri-devel]
<karolherbst>
I don't even know, but I'm hope bda isn't part of webgpu :)
<karolherbst>
*I
<pixelcluster>
I don't think any API (Vulkan especially not) guarantees anything about what happens in OOB accesses
<DemiMarie>
karolherbst: If one isn't running the UMD in-process, then yes, an OOB GPU access is harmless. That's why native contexts work.
<pixelcluster>
it's UB in exactly the same way non-terminating shaders are
<karolherbst>
DemiMarie: OOB GPU accesses do nothing to other applications
<DemiMarie>
The problem is that browsers do not use kernel ioctls directly, but rather userspace APIs.
<karolherbst>
the security boundary is your GPU context
<karolherbst>
you can have many of them in vulkan
<karolherbst>
use one per tab
<karolherbst>
done
<karolherbst>
if browser share it between tabs it's their problem and their bug
<pixelcluster>
erh
<pixelcluster>
I think in practice you want to use a process per tab
<karolherbst>
sure
<pixelcluster>
but yes
<karolherbst>
and a real vulkan instance per tab and all that
<zamundaaa[m]>
pixelcluster: VK_EXT_robustness2 does make guarantees about OOB accesses
<karolherbst>
vulkan has a bit of documentation on security boundaries across objects
<pixelcluster>
oh right lol robustness exists
<karolherbst>
if those aren't enough, then one can always add an ext tightening it up more
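For reference, wiring up the guarantees zamundaaa[m] mentions looks roughly like this at device creation. A sketch, assuming VK_EXT_robustness2 was already queried as supported; queues and the rest of the create info are elided:

```c
/* Chain VK_EXT_robustness2 features into VkDeviceCreateInfo so OOB
 * buffer/image accesses get defined behaviour (writes discarded,
 * reads return zero) and null descriptors become legal. */
VkPhysicalDeviceRobustness2FeaturesEXT robustness2 = {
    .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_ROBUSTNESS_2_FEATURES_EXT,
    .robustBufferAccess2 = VK_TRUE,
    .robustImageAccess2  = VK_TRUE,
    .nullDescriptor      = VK_TRUE,
};
VkDeviceCreateInfo dev_info = {
    .sType = VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO,
    .pNext = &robustness2,
    /* ... queues; "VK_EXT_robustness2" in ppEnabledExtensionNames ... */
};
```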
<pixelcluster>
well in any case, terminating shaders really aren't the (only) problem, what you are asking for is a subset of SPIR-V that has no UB at all
<karolherbst>
right.. but then you have vulkan features like bda which throw away your oob checks, so we can only hope that bda isn't exposed in webgpu :D
<pixelcluster>
I don't think it is, for obvious reasons :D
<karolherbst>
it's the wrong solution for this problem anyway
<karolherbst>
it would be fun though (tm)
<pixelcluster>
yes, UB is a thing and we won't get rid of it
<DemiMarie>
pixelcluster: nah, the problem with infinite loops specifically is that it is very hard to get rid of them without incredibly disgusting workarounds, like explicit iteration counters.
<karolherbst>
UB isn't a security problem in applications, and it's not one in GPU programming either as long as you don't pick holes in your security boundaries
<karolherbst>
well
<karolherbst>
it's a problem in applications like sshd :)
<karolherbst>
but that's not what we are talking about here anyway
<DemiMarie>
My preferred solution to all of this would be to compile Mesa to WebAssembly and give it access to the kernel ioctls via a virtGPU native context interface. If the WebAssembly module gets popped, who cares, it's in the browser process which already runs arbitrary WebAssembly. The hardware protections ensure that the compiled shader code can't do harm.
<DemiMarie>
Unfortunately, browser vendors don't take this approach, probably because they have to support platforms with proprietary drivers.
<karolherbst>
it's also another platform specific path
<pixelcluster>
at this point what's the difference between a webassembly module and a properly isolated renderer process?
<karolherbst>
the API used
<karolherbst>
though I fully understand browsers not wanting to isolate to such an extreme extent, because having 500 open tabs can kinda cause funky problems :)
<karolherbst>
though not sure if GPU contexts are that limited on modern GPUs
<karolherbst>
prolly not at all
<pixelcluster>
you could also restrict the fancy isolation to the fancy apis
<pixelcluster>
if you have 500 tabs each running webgpu shaders you have bigger problems
<DemiMarie>
Maybe exposing GPU access to random websites without user consent was not a great idea :)
<DemiMarie>
If WebGL and WebGPU required a permission it would prevent most of the abuse.
<linkmauve>
karolherbst, I currently have exactly 15321 open tabs. :p
<linkmauve>
And Firefox still works!
<Sachiel>
WebGL and WebGPU requiring permissions would just be an extra click from users away from the abuse
<karolherbst>
though did it create 15321 GPU contexts for its per-page rendering
<karolherbst>
I'm sure browsers are a bit smarter than that
<linkmauve>
I doubt so. ^^
<psykose>
a popup asking "Wanna get hacked? Y/N" wouldn't meaningfully change what browsers have to do to sandbox webgpu use at all
<psykose>
reminds me of cookies and privacy policies
<karolherbst>
reminds me of excel
<psykose>
hah
<karolherbst>
and people click accept anyway
<DemiMarie>
linkmauve: how many of those tabs are actual websites?
<karolherbst>
yeah apparently if you ask users "this thing wants to do X if you press deny it might not work" isn't really telling users to say deny :)
<linkmauve>
DemiMarie, all of them!
<linkmauve>
Of course not currently loaded.
<DemiMarie>
linkmauve: Wow!
<sima>
DemiMarie, so with mesa vk if you create a vkdevice per security context nothing should ever escape that gpu box, not even to your cpu side process
<DemiMarie>
sima: Nice, thanks! That should be enough for browsers then.
<sima>
unless you just put your cpu side datastructures into cpu mmapped gpu memory ofc
<sima>
but don't do that
<sima>
at least that's I'd say what we're aiming for from the kernel side
<DemiMarie>
sima: does Mesa do that?
<sima>
for gl the situation is a mess, but with arb_robustness you should have enough isolation to also not blow up too badly
<sima>
DemiMarie, it would be really, really stupid
<sima>
I've seen one really old intel libva that abused gpu bo for cpu datastructures and that's by far not the worst thing that codebase did
<DemiMarie>
sima: gotcha, just checking
* DemiMarie
wonders what the worst thing actually was
<sima>
I'm not going to uncover those nightmares
<DemiMarie>
Fair :)
<sima>
but I've watched developers who really don't shy away from reworking terrible code to fix it fold in less than a day sifting through it
<DemiMarie>
yeah at that point I would just do a from-scratch rewrite
<sima>
anyway for gl's arb_robustness I'm less sure how solid it is everywhere, so if you're paranoid maybe just limit to vk
<sima>
also don't enable the hsw vk implementation in anv, but I think that might have gotten nuked meanwhile
<sima>
DemiMarie, oh and ofc the usual "this is all aspirational, I'm not speaking for any vendor team including intels" disclaimer, but I think as a dri-devel stance it should be pretty solid
<DemiMarie>
sima: that's good enough for me :)
<DemiMarie>
(in particular, it is good enough for Qubes OS to enable native contexts at some point, which rely on this guarantee, at least as an opt-in option)
<sima>
it's after all still a horribly complex kludge of stuff written in C in both kernel and fw, plus hw is disappointingly also not bug free as we learned the hard way last few years :-/
<DemiMarie>
Is the FW generally of good quality code-wise?
* DemiMarie
wonders if it is time to write drivers and FW in Rust
<sima>
DemiMarie, yeah, even more so going forward if someone comes with a mesa vk driver and the kernel side doesn't just use standard vm_bind with full blown gpu mmu and hw context it's really questionable we'll consider it for merging I think
<sima>
just too busted design imo
<sima>
DemiMarie, I've never seen any fw code in my life for any gpu
<DemiMarie>
sima: Ah, I was thinking that since you work for Intel you would have at least talked to the people who did write it.
<sima>
for rust in the kernel, we'll hopefully get there, but it's going to be a while
<DemiMarie>
good news is that one can mix Rust and C, even in the same driver
sima has quit [Ping timeout: 480 seconds]
<benjaminl>
how does the bot decide which labels to put on mesa MRs?
<karolherbst>
based on touched files
<benjaminl>
hmm, so I have an MR with a commit that touches src/vulkan/runtime, but it didn't get the vulkan label