#dri-devel on 2022-12-20 — irc logs at oftc.irclog.whitequark.org

2022-12-16 22:42 ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar

00:00 maxzor has quit [Ping timeout: 481 seconds]

00:05 ayaka_ has quit [Ping timeout: 480 seconds]

00:07 MajorBiscuit has quit [Ping timeout: 480 seconds]

00:20 <Lynne> with vulkan encoding, gpu buffers will likely get sent directly to network devices

00:20 <Lynne> wonder how good pcie p2p is these days

00:32 pcercuei has quit [Quit: dodo]

00:34 K0bin has quit [Quit: K0bin]

00:38 zf_ is now known as zf

01:16 <tleydxdy> it works but there's some kernel hazards

01:18 yuq825 has joined #dri-devel

01:26 fxkamd has joined #dri-devel

01:26 warpme_____ has quit []

01:28 JohnnyonFlame has quit [Ping timeout: 480 seconds]

01:30 columbarius has joined #dri-devel

01:31 co1umbarius has quit [Ping timeout: 480 seconds]

01:52 fxkamd has quit []

02:05 ngcortes has quit [Remote host closed the connection]

02:05 ngcortes has joined #dri-devel

02:09 <italove> emersion: how can I do that?

02:11 ngcortes has quit [Read error: Connection reset by peer]

02:16 ybogdano has quit [Ping timeout: 480 seconds]

02:35 orbea has quit [Remote host closed the connection]

02:35 orbea has joined #dri-devel

02:46 heat_ has joined #dri-devel

02:46 heat has quit [Read error: Connection reset by peer]

02:50 heat_ has quit [Read error: Connection reset by peer]

02:51 heat has joined #dri-devel

03:13 <italove> nvm, setting WAFFLE_GBM_DEVICE to the right rendernode fixed the problem

03:16 JohnnyonFlame has joined #dri-devel

03:29 Lucretia has quit [Ping timeout: 480 seconds]

03:36 bmodem has joined #dri-devel

03:38 ayaka_ has joined #dri-devel

03:40 randy_ has joined #dri-devel

03:47 ayaka_ has quit [Ping timeout: 480 seconds]

03:51 heat_ has joined #dri-devel

03:52 heat has quit [Read error: No route to host]

04:08 randy__ has joined #dri-devel

04:08 randy_ has quit [Read error: Connection reset by peer]

04:24 YuGiOhJCJ has joined #dri-devel

04:24 randy__ has quit [Read error: Connection reset by peer]

04:24 ayaka_ has joined #dri-devel

04:28 Company has quit [Quit: Leaving]

04:48 aravind has joined #dri-devel

04:55 Lucretia has joined #dri-devel

05:02 randy_ has joined #dri-devel

05:08 ayaka_ has quit [Ping timeout: 480 seconds]

05:14 randy__ has joined #dri-devel

05:20 randy_ has quit [Ping timeout: 480 seconds]

05:23 heat_ has quit [Remote host closed the connection]

05:23 heat_ has joined #dri-devel

05:37 kts has joined #dri-devel

05:42 aravind has quit [Ping timeout: 480 seconds]

05:43 Leopold_ has quit [Remote host closed the connection]

05:43 Leopold_ has joined #dri-devel

05:50 kts has quit [Ping timeout: 480 seconds]

05:54 Duke`` has joined #dri-devel

06:12 kts has joined #dri-devel

06:12 <DrNick> anyone know why Radeons don't list VAProfileMPEG4Simple and VAProfileMPEG4AdvancedSimple?

06:14 <Lynne> yeah, mpeg4 decoding is incredibly buggy

06:14 <Lynne> for me it's an instant lockup if I enable it

06:40 <Lynne> got the nvidia drivers to cleanly create a context and an sps+pps session

06:40 <Lynne> almost figured out the slice params struct needed for frame encoding

06:42 <Lynne> airlied: still struggling with mesa registry??

06:48 fab has joined #dri-devel

06:51 <DrNick> oh it's disabled in the state tracker, not in radeonsi, that explains why I couldn't find it

06:59 <airlied> Lynne: nope all the new headers are pushed out now in main

07:00 <airlied> Lynne: just been typing on encode, there's a lot of typing before I even figure out how it works

07:06 tzimmermann has joined #dri-devel

07:07 ahajda_ has joined #dri-devel

07:09 rasterman has joined #dri-devel

07:22 MajorBiscuit has joined #dri-devel

07:25 bgs has joined #dri-devel

07:27 dcz_ has joined #dri-devel

07:32 bgs has quit [Remote host closed the connection]

07:33 alanc has quit [Remote host closed the connection]

07:33 alanc has joined #dri-devel

07:33 Major_Biscuit has joined #dri-devel

07:33 Leopold_ has quit [Ping timeout: 480 seconds]

07:36 danvet has joined #dri-devel

07:37 maxzor has joined #dri-devel

07:40 MajorBiscuit has quit [Ping timeout: 480 seconds]

07:44 frieder has joined #dri-devel

07:54 maxzor has quit [Remote host closed the connection]

07:54 maxzor has joined #dri-devel

08:00 maxzor_ has joined #dri-devel

08:00 maxzor has quit [Remote host closed the connection]

08:02 maxzor__ has joined #dri-devel

08:06 garrison has joined #dri-devel

08:06 heat_ has quit [Ping timeout: 480 seconds]

08:06 jagan_ has joined #dri-devel

08:07 garrison has quit []

08:07 garrison has joined #dri-devel

08:07 garrison has quit []

08:08 i-garrison has quit [Ping timeout: 480 seconds]

08:09 i-garrison has joined #dri-devel

08:09 maxzor_ has quit [Ping timeout: 480 seconds]

08:16 co1umbarius has joined #dri-devel

08:17 columbarius has quit [Ping timeout: 480 seconds]

08:23 maxzor__ has quit [Remote host closed the connection]

08:24 maxzor__ has joined #dri-devel

08:25 tzimmermann has quit [Quit: Leaving]

08:25 tzimmermann has joined #dri-devel

08:31 ayaka_ has joined #dri-devel

08:35 randy__ has quit [Ping timeout: 480 seconds]

08:37 randy_ has joined #dri-devel

08:37 <airlied> tzimmermann: yeah i have barely any knowledge of that patch left, but it does sound like systemd has it covered

08:38 <tzimmermann> airlied, thanks for confirming. i sent out a patchset yesterday that reverts this workaround

08:40 tursulin has joined #dri-devel

08:43 i-garrison has quit []

08:44 ayaka_ has quit [Ping timeout: 480 seconds]

08:44 i-garrison has joined #dri-devel

08:52 apinheiro has joined #dri-devel

08:55 <javierm> tzimmermann: saw your patch-set and looked good to me from a quick glance. I'll take a detailed look now

08:55 <tzimmermann> javierm, thanks

08:57 <tzimmermann> jfalempe, can i ask you for a review of https://lore.kernel.org/dri-devel/20221216193005.30280-1-tzimmermann@suse.de/ ?

08:58 <jfalempe> tzimmermann, yes sure.

08:58 <tzimmermann> thanks

08:59 <jfalempe> I've seen the kernel bot report, but missed your patch, maybe I need better mail filter rules on dri-devel

08:59 <javierm> tzimmermann: that one looks good to me as well. Why sparse reports it for ARCH=i386 but not for ARCH=x86 ?

08:59 lynxeye has joined #dri-devel

09:02 <tzimmermann> javierm, no idea. it runs on debian. maybe it is debian's way of naming things? what actually is 'i386' in this context? the kernel has long lost the ability to use i386 IIRC

09:03 <tzimmermann> jfalempe. ha! i know. my spam filter has a habit of constantly filtering out mails from the same people

09:04 <javierm> tzimmermann: dunno, maybe x86-32?

09:05 <javierm> the reported .config has CONFIG_X86=y and CONFIG_X86_32=y

09:07 <tzimmermann> jfalempe, got the r-b. thanks a lot

09:07 <jfalempe> tzimmermann, you're welcome ;)

09:09 <javierm> tzimmermann: https://elixir.bootlin.com/linux/latest/source/Makefile#L402

09:13 <tzimmermann> looks like compatibility

09:14 <javierm> tzimmermann: yeah. They could just use ARCH=x86 and set CONFIG_X86_32=y

09:15 JohnnyonFlame has quit [Ping timeout: 480 seconds]

09:15 <javierm> anyways, just wondered why sparse complained for x86_32 since grep'ing the code it seems that shouldn't affect the __iomem type modifier

09:20 pcercuei has joined #dri-devel

09:21 <emersion> MrCooper: iirc you said GNOME now waits for client buffers to be ready? how do you account for amdgpu using POLLOUT instead of POLLIN?

09:21 <MrCooper> that was fixed a while ago

09:22 <emersion> oh so POLLIN works now?

09:22 <emersion> that's good to know

09:22 <MrCooper> yep, so not handling it at all :)

09:22 <emersion> do you know in which kernel version?

09:22 <MrCooper> not offhand, but I can look it up

09:23 <emersion> i can grep the commit log, but not sure what to search for

09:23 <emersion> was it like 1 year ago? or less?

09:27 <javierm> tzimmermann: I was today years old when I noticed the samples/ directory :)

09:27 <tzimmermann> javierm, same here. i never noticed it before

09:29 maxzor__ has quit [Ping timeout: 480 seconds]

09:41 <MrCooper> emersion: AFAICT it was 5.15

09:42 <emersion> thanks!

09:42 JohnnyonFlame has joined #dri-devel

09:42 <MrCooper> np
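
The readiness wait discussed above (waiting for POLLIN on a client buffer's fd before using it) can be sketched roughly as follows. This is only an illustration: `wait_until_readable` is a hypothetical helper name, and a pipe stands in for the dma-buf fd, since a real one needs a GPU driver to create.

```python
import os
import select


def wait_until_readable(fd, timeout_ms):
    """Return True if fd reports POLLIN within timeout_ms, else False."""
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    events = poller.poll(timeout_ms)
    return any(revents & select.POLLIN for _fd, revents in events)


# Assumption: a pipe plays the role of the dma-buf fd for this demo.
read_end, write_end = os.pipe()

# Nothing has been written yet, so the fd is not readable.
not_ready = wait_until_readable(read_end, 0)

# Writing a byte stands in for "the GPU finished rendering".
os.write(write_end, b"x")
ready = wait_until_readable(read_end, 100)

os.close(read_end)
os.close(write_end)
```

A compositor would normally register such fds with its event loop rather than block on each one in turn.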

09:43 xypron has quit [Quit: xypron]

09:43 xypron has joined #dri-devel

09:46 xypron has quit []

09:47 xypron has joined #dri-devel

09:48 xypron has quit []

09:48 xypron has joined #dri-devel

09:48 <javierm> tzimmermann: it's great to see the benefits of the common aperture infra, which now allows removing the FBINFO_MISC_FIRMWARE flag

09:48 JohnnyonF has joined #dri-devel

09:49 xypron has quit []

09:49 <tzimmermann> javierm, yeah. but to be honest, this patchset is something like the third hard u-turn within the rabbit hole. i was actually out to fix something entirely different

09:50 <tzimmermann> and cleaning up first, made some sense

09:50 xypron has joined #dri-devel

09:50 <javierm> tzimmermann: I see

09:52 <tzimmermann> javierm, thanks for the r-bs. i'll merge it early in january

09:53 <javierm> tzimmermann: cool. As mentioned in the ML, I think we need an ack from kvm/virt folks before for patch #9

09:53 <tzimmermann> ok, np

09:54 <javierm> unlikely that there will be merge conflicts but just to be good citizens

09:55 JohnnyonFlame has quit [Ping timeout: 480 seconds]

09:57 tobiasjakobi has joined #dri-devel

09:58 tobiasjakobi has quit []

10:00 kts has quit [Quit: Leaving]

10:05 <tzimmermann> javierm, oh cool! thanks for reviewing those other patches as well

10:08 <javierm> tzimmermann: no worries, those were easy to review :)

10:08 <javierm> I'll also go through your "drm: Fix color-format selection in fbdev emulation" series

10:08 <javierm> I believe that's the last one that you wanted me to review?

10:10 <tzimmermann> javierm, there will be a v2 of the series. you may want to wait

10:11 <javierm> tzimmermann: ah Ok. But I thought that was only to change some of the kunit tests as reported by Jose?

10:11 <tzimmermann> after reading jose's comment, i think the tests need some fixes wrt endianness

10:11 <javierm> tzimmermann: that's what I understood. But if it's only about fixing the tests, feel free to carry my r-b's

10:11 <tzimmermann> if you like, you can review the rest of the patchset, of course

10:12 <tzimmermann> that shouldn't change much from the test fixes

10:12 <javierm> yeah

10:13 <javierm> tzimmermann: I prefer to do things in batch because I'm a terrible context switcher :)

10:13 <tzimmermann> :)

10:14 jkrzyszt has joined #dri-devel

10:18 <javierm> tzimmermann: I'm confused about DRM_FORMAT_BGRX8888, that's not DRM_FORMAT_HOST_XRGB8888 | DRM_FORMAT_BIG_ENDIAN, right?

10:19 <javierm> hmm, no. It's LE: #define DRM_FORMAT_BGRX8888 fourcc_code('B', 'X', '2', '4') /* [31:0] B:G:R:x 8:8:8:8 little endian */

10:19 <javierm> but then you also have include/drm/drm_fourcc.h:# define DRM_FORMAT_HOST_XRGB8888 DRM_FORMAT_BGRX8888

10:20 <tzimmermann> javierm, from what i understand, the bgrx format was there first and the big-endian flag came later

10:20 <tzimmermann> ?

10:20 <tzimmermann> the data is the same for bgrx and xrgb|be

10:22 <javierm> tzimmermann: right, that's what I was thinking. Hence what I meant in my first comment actually DRM_FORMAT_BGRX8888 == (DRM_FORMAT_XRGB8888 | DRM_FORMAT_BIG_ENDIAN)
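
The equivalence mentioned here is about the bytes in memory, not about the fourcc values themselves, which stay distinct. A small sketch makes that concrete. The `fourcc_code` arithmetic and the bit layouts follow the comments in drm_fourcc.h; the two `pack_*` helpers are hypothetical names used only for this illustration.

```python
import struct


def fourcc_code(a, b, c, d):
    """Same arithmetic as the fourcc_code() macro in drm_fourcc.h."""
    return ord(a) | (ord(b) << 8) | (ord(c) << 16) | (ord(d) << 24)


DRM_FORMAT_XRGB8888 = fourcc_code('X', 'R', '2', '4')
DRM_FORMAT_BGRX8888 = fourcc_code('B', 'X', '2', '4')
DRM_FORMAT_BIG_ENDIAN = 1 << 31


def pack_xrgb8888_be(x, r, g, b):
    """[31:0] x:R:G:B, with the 32-bit word stored big endian."""
    word = (x << 24) | (r << 16) | (g << 8) | b
    return struct.pack(">I", word)


def pack_bgrx8888_le(x, r, g, b):
    """[31:0] B:G:R:x, with the 32-bit word stored little endian."""
    word = (b << 24) | (g << 16) | (r << 8) | x
    return struct.pack("<I", word)


pixel = dict(x=0x00, r=0x11, g=0x22, b=0x33)
be_bytes = pack_xrgb8888_be(**pixel)
le_bytes = pack_bgrx8888_le(**pixel)

# Both layouts put the bytes x, R, G, B in memory in that order.
same_layout = be_bytes == le_bytes
# The format codes are still different numbers.
same_code = DRM_FORMAT_BGRX8888 == (DRM_FORMAT_XRGB8888 | DRM_FORMAT_BIG_ENDIAN)
```

So code that compares format codes has to treat the two spellings as aliases explicitly; a plain integer comparison will not do it.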

10:22 kts has joined #dri-devel

10:22 <javierm> tzimmermann: I'm trying to figure out if we can somehow simplify the else if statements by using DRM_FORMAT_HOST_XRGB8888

10:23 <javierm> and avoid doing the drm_fb_swab() unless is necessary

10:25 <javierm> tzimmermann: anyway, maybe that could be done as a follow-up anyways if is possible to even simplify drm_fb_blit() more

10:29 YuGiOhJCJ has quit [Quit: YuGiOhJCJ]

10:31 JohnnyonF has quit [Ping timeout: 480 seconds]

10:38 devilhorns has joined #dri-devel

10:38 srslypascal has quit [Ping timeout: 480 seconds]

10:51 ahajda_ has quit [Ping timeout: 480 seconds]

10:51 srslypascal has joined #dri-devel

10:57 ahajda_ has joined #dri-devel

11:03 <javierm> tzimmermann: patch #8 is awesome. I always found confusing that the drivers formats array mixed both emulated and native formats

11:10 ahajda_ has quit []

11:13 bmodem1 has joined #dri-devel

11:17 bmodem has quit [Ping timeout: 480 seconds]

11:21 ahajda_ has joined #dri-devel

11:26 <tzimmermann> thanks :)

11:38 JohnnyonFlame has joined #dri-devel

11:46 deathmist1 has joined #dri-devel

11:48 deathmist has quit [Ping timeout: 480 seconds]

12:10 jagan_ has quit [Remote host closed the connection]

12:31 apinheiro has quit [Ping timeout: 480 seconds]

12:41 maxzor has joined #dri-devel

12:43 jagan_ has joined #dri-devel

13:19 pjakobsson_ has quit [Remote host closed the connection]

13:27 warpme_____ has joined #dri-devel

13:34 kts has quit [Quit: Leaving]

13:45 Guest52 has quit []

13:50 <shadeslayer> looks like freedreno isn't using a POT alignment https://mesa.pages.freedesktop.org/-/mesa/-/jobs/33787109/artifacts/results/summary/results/trace@freedreno-a530@valve@counterstrike-source-v2.trace.html ... is there a way to get a symbolized backtrace?

14:04 yuq825 has quit []

14:08 cef has quit [Ping timeout: 480 seconds]

14:11 troy has joined #dri-devel

14:12 troy is now known as Guest271

14:12 Guest271 has quit []

14:15 JohnnyonFlame has quit [Ping timeout: 480 seconds]

14:17 kts has joined #dri-devel

14:31 cef has joined #dri-devel

14:40 djbw has quit [Read error: Connection reset by peer]

14:54 devilhorns has quit []

14:59 <lina> I just ran into something that looks very broken in kmsro...

15:00 <lina> I'm having issues with KWin breaking badly with whole screen capture.

15:00 <DavidHeidelberg[m]> shadeslayer: I have branch WIP for that

15:00 <lina> The screen resource is allocated with renderonly_create_gpu_import_for_resource, which then gets imported into the GPU

15:01 <lina> KWin then exports that fd again and re-imports it as a texture, which then gets re-imported into the display controller with renderonly_create_gpu_import_for_resource

15:01 <lina> But that returns the same GBM handle, since there is only one unique name for any given buffer, and there is no reference counting mechanism here

15:02 <lina> So when it's done with the texture import, renderonly_scanout_destroy gets called and destroys the handle on the display controller side

15:02 <lina> And then next time KWin wants to present that frame, there's no buffer any more

15:02 <DavidHeidelberg[m]> shadeslayer: see the last three commits: https://gitlab.freedesktop.org/okias/mesa/-/commits/split-dwarf

15:07 <emersion> lina, drivers are responsible for ref'counting

15:09 <lynxeye> emersion: seems lina is right, renderonly is broken in that case. Drivers do the deduplication and refcounting on the GPU BO level, but there is nothing in renderonly that would prevent a double import of the same BO on the KMS side, so you get non-refcounted handles

15:10 <lynxeye> renderonly needs a handle hashtable for the KMS side to do proper refcounting.
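
A minimal sketch of the failure mode and of the handle table suggested here, under stated assumptions: `FakeKmsDevice` and `HandleTable` are hypothetical names for this illustration, not Mesa code. The fake device only models the behaviour described in the discussion, where importing the same buffer twice yields the same handle and a single close destroys it.

```python
import threading


class FakeKmsDevice:
    """Toy stand-in for a KMS fd: handles are de-duplicated, not refcounted."""

    def __init__(self):
        self._handles = {}  # buffer name -> handle
        self._next = 1

    def import_buffer(self, name):
        # Importing the same buffer again returns the existing handle.
        if name not in self._handles:
            self._handles[name] = self._next
            self._next += 1
        return self._handles[name]

    def close_handle(self, handle):
        # A single close destroys the handle for every user.
        for name, h in list(self._handles.items()):
            if h == handle:
                del self._handles[name]

    def is_alive(self, handle):
        return handle in self._handles.values()


class HandleTable:
    """Userspace refcounts layered on top of the device's handles."""

    def __init__(self, device):
        self._device = device
        self._lock = threading.Lock()
        self._refs = {}  # handle -> reference count

    def acquire(self, name):
        # Import under the lock so a concurrent release cannot close the
        # handle between the import and the refcount bump.
        with self._lock:
            handle = self._device.import_buffer(name)
            self._refs[handle] = self._refs.get(handle, 0) + 1
            return handle

    def release(self, handle):
        with self._lock:
            self._refs[handle] -= 1
            if self._refs[handle] == 0:
                del self._refs[handle]
                self._device.close_handle(handle)


device = FakeKmsDevice()
table = HandleTable(device)

scanout = table.acquire("framebuffer")   # original scanout import
texture = table.acquire("framebuffer")   # re-import of the same buffer
table.release(texture)                   # texture user is done
alive_after_texture_release = device.is_alive(scanout)
table.release(scanout)                   # last user is done
alive_after_final_release = device.is_alive(scanout)
```

Without the table, the first release would have closed the shared handle and left the scanout user with nothing to present.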

15:15 <shadeslayer> DavidHeidelberg[m]: ah thanks, I guess I'll have to wait for that :(

15:15 <DavidHeidelberg[m]> shadeslayer: you can apply and re-run (if it isn't flake)

15:15 <tzimmermann> i have a one-line patch for xorg (incl r-b) that makes the xserver work with ofdrm. could someone please merge it? https://gitlab.freedesktop.org/xorg/xserver/-/merge_requests/990

15:16 <lina> lynxeye: Yeah, I'm implementing that now (following the way drivers do it)

15:16 <emersion> renderonly needs to die :S

15:18 bgs has joined #dri-devel

15:22 <shadeslayer> I'll have to have a look over xmas :)

15:23 <lynxeye> emersion: welcome to the club ;)

15:23 maxzor has quit [Remote host closed the connection]

15:27 maxzor has joined #dri-devel

15:38 <danvet> lynxeye, you need to de-dupe in libdrm/userspace

15:39 <danvet> and it's intentional

15:39 <danvet> so needs an import cache to do the de-duping

15:39 <lynxeye> danvet: sure, that's what all the drivers do for the GPU side, but it's missing for the KMS side of renderonly

15:40 <danvet> lynxeye, oh in the compositor?

15:40 <lynxeye> danvet: No, renderonly helpers in Mesa.

15:41 <danvet> ah

15:41 <danvet> well guess those should get fixed or something

15:41 <danvet> maybe the entire import cache shared as much as possible

15:42 <danvet> tzimmermann, looked at your patch, cried in Xorg

15:43 <danvet> lynxeye, emersion does kmsro fully own the kms side gem id space? or do we have an unfixable bug in there ...

15:44 <danvet> this feels vaguely similar to some of the other reimport fun we've discussed in the past

15:45 <lynxeye> danvet: I guess everyone is using this KMS handle space through the GBM device sitting on top of the renderonly screen, so should be okay to fix this at the renderonly level.

15:46 <lina> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20397

15:46 fxkamd has joined #dri-devel

15:47 <lina> I guess that could break if clients are doing their own BO allocs/exports directly though, but I'm not sure if anyone does that.

15:48 Company has joined #dri-devel

15:48 <tzimmermann> danvet, cried?

15:49 <danvet> tzimmermann, why does xorg have hard coded list of all this stuff

15:50 <tzimmermann> danvet, because the alternative is to redesign the xserver's backend code :p i'm already glad that the workaround is easy

15:50 <tzimmermann> but i get your point

15:51 JohnnyonFlame has joined #dri-devel

15:51 <emersion> lina: clients specifically are forbidden to do it on a FD shared with GBM, because that breaks the ref'counting

15:52 <danvet> lina, very quick read of all this, but the HandleToFD side seems missing?

15:52 <danvet> you get the aliasing for any shared buffer

15:52 <danvet> and it might come back somehow

15:52 <danvet> biggest mistake we didn't do this in libdrm :-/

15:56 Haaninjo has joined #dri-devel

16:07 <lynxeye> danvet: biggest mistake was to not refcount handles in the kernel. You can always add userspace caching on top to avoid the syscalls, but not having refcounts at this level was just setting up a trap for everyone.

16:08 <emersion> indeed

16:10 <lina> danvet: I don't think we care about HandleToFD, do we? The FD is its own reference, and then if/when it comes back either we still have the original reference or we import it again fresh.

16:11 <lina> It doesn't matter if the BO gets destroyed while the FD still exists, because that means we aren't actually using it in the kms device since there is no direct reference.

16:11 <lina> And then if it comes back it'll create a fresh handle when imported.

16:12 <emersion> lina: the import de-duplicates handles

16:13 <emersion> ah, eh

16:13 <emersion> misread, sorry

16:14 heat_ has joined #dri-devel

16:17 <lynxeye> lina: danvet: FD export doesn't go via renderonly. If you ask for the FD all the renderonly drivers will give you the FD exported from the GPU side BO, which is already properly deduped and refcounted, not the KMS that is handled by renderonly. IOW: it's fine.

16:23 ybogdano has joined #dri-devel

16:23 <danvet> lina, if the gem bo of an exported buffer survives until you re-import, you get the same gem bo id back

16:23 <danvet> so the exact same unrefcounted gem bo id problem

16:24 <danvet> lynxeye, I only read one line, and that called HandleToFD on the kms_fd

16:24 <danvet> hence why I asked

16:24 <danvet> but if it's all good, then good :-)

16:38 tursulin has quit [Ping timeout: 480 seconds]

16:46 <danvet> lynxeye, hm thinking about this some more, can't you still end up in a loop

16:47 <danvet> hm, I guess I'm just confused maybe

16:55 gouchi has joined #dri-devel

16:56 <lina> danvet: If the gem bo is alive, we have an object for it, so then on re-import we get back the same bo ID and find the object that way (and increase the refcount).

16:57 gouchi has quit []

16:57 <lina> The path that calls HandleToFD is right after BO creation, which already allocated the BO out of the map

16:57 ybogdano has quit [Ping timeout: 480 seconds]

16:57 <danvet> lina, yeah I'm just confused :-)

16:59 <lynxeye> danvet: one of the primary virtues of renderonly is its ability to confuse people

17:02 <danvet> it's pretty good at that, I admit

17:02 <lina> ^^

17:02 bmodem1 has quit [Ping timeout: 480 seconds]

17:04 <danvet> lina, still trying to make a fool of myself ... :-) so I don't think it's a real issue, but it looks a bit strange that you're adding the array entry in renderonly_create_kms_dumb_buffer_for_resource() before the refcount and everything is set up

17:05 <danvet> not an issue in mesa, but it freaks me out with my kernel hat somewhere :-)

17:06 Major_Biscuit has quit [Ping timeout: 480 seconds]

17:07 <danvet> also doesn't the FDToHandle outside of the mutex potentially race with a concurrent destroy?

17:07 <danvet> destroy&re-create of a different buffer but same gem bo_id

17:12 <Lynne> lina: I think kwin is faulty for giving pw an fd it doesn't really have ownership over

17:12 ybogdano has joined #dri-devel

17:13 <Lynne> no proper synchronization either, if pw maps it in userspace and starts copying while the compositor renders another frame to it

17:14 <Lynne> IIRC pw only supports linear buffers last I took a look

17:14 <danvet> pw some screen grabber?

17:15 frieder has quit [Remote host closed the connection]

17:16 <Lynne> pipewire

17:19 <jadahl> Lynne: it supports modifiers (explicit/implicit), as long as both sides of the pipewire stream do

17:19 <jadahl> any copying happens while the copier holds on to the buffer, and it won't be overwritten until it's released

17:27 <Lynne> I believe most clients want linear buffers though, because they map them for transfer, and a badly behaving client could slow down a compositor

17:27 <jadahl> if clients intend to mmap they shouldn't ask for dmabufs to begin with

17:28 <jadahl> if they go via egl/vulkan/.. then don't think linear is necessary?

17:29 <Lynne> well, the protocol started out as linear-only and host-visible

17:31 kts has quit [Remote host closed the connection]

17:32 kts has joined #dri-devel

17:42 tzimmermann has quit [Quit: Leaving]

17:45 <lina> Lynne: This has nothing to do with PW directly, this is just KWin re-importing its own framebuffer as a texture itself (to blit it into the buffer that it then gives PW).

17:45 <lina> Also this works with modifiers after another bugfix in Kwin, PW supports that now and I verified OBS is getting fancy compressed framebuffers ^^

17:48 <lina> danvet: But the way util_sparse_array works is that it allocates stuff for you at a given slot when necessary... there's no way to initialize things before adding them to the array, it's always the other way around ^^

17:50 <lina> In that particular case it's a fresh BO so we're guaranteed to have a fresh handle and fresh array slot so we're the only possible owner (note the assert that the refcnt is zero indicating a free slot), so the lock doesn't need to cover initialization later on.

17:51 <lina> That was about the first comment, re FDToHandle you're right, there's a race (and it's probably in a bunch of drivers too!)

17:51 <danvet> yeah that case is ok (wouldn't be in the kernel, there you need to extend the lock or inverse the sequence because kernel needs to be robust against userspace guessing ids before the syscall has finished)

17:51 <danvet> it just freaked me out because kernel :-)

17:52 <lina> Yeah, but in the kernel you probably wouldn't use that kind of data structure ^^

17:52 <danvet> xarray is pretty much the same thing

17:52 <lina> It's kind of funky, apparently it's inherently threadsafe? But we still need the lock to tie it to refcounting anyway I think so that doesn't buy us much.

17:52 <danvet> just slightly different interface, as in caller allocates

17:52 <lina> Yeah, so you can initialize it first

17:52 <danvet> which you want anyway because GFP_ hierarchy

17:52 <danvet> yup

17:53 <lina> I need to write Rust abstractions for that ~next up, since I'm adding command queues to the UAPI (for compute) real soon now and that means I finally need to start keeping track of objects other than GBM BOs.

17:53 <danvet> yeah xarray is also rcu safe in most things, (plus spinlock for when it isn't)

17:53 <danvet> but in practice it's simpler to just use the spinlock to also protect what callers need instead of scratching your head about rcu

17:53 <danvet> oh no xarray rust wrapper yet?

17:54 <lina> Nope!

17:54 <lina> (Honestly, the existing Rust support is... very barebones, even downstream. I discovered right before release that there was no module device ID alias generation support...)

17:54 <danvet> oops

17:54 <lina> (I think I'm the only person actually using any of this in a production driver with real users ^^;;)

17:54 <lina> But that's probably good since it's actually getting everything usable slowly!

17:55 <danvet> I thought the nvme driver was fairly real?

17:55 <lina> Real yes but I'm not sure it has any users... considering the module alias stuff.

17:55 <danvet> that would need mod alias stuff, or did they simply use the PCI_ macros for that

17:55 <danvet> hm I guess you can bind it through sysfs at runtime if it's just for playing around

17:55 <lina> They probably just only ever tested it built-in, or specified a single alias manually (which was the only thing that worked)

17:56 <lina> It works built in, the alias generation is just for module autoloading

17:56 <danvet> sysfs has an interface to add at least pci bindings at runtime

17:56 <danvet> oh right, once loaded it's pci_register_driver or whatever it was

17:56 <lina> Yeah

17:57 <lina> I'm not sure if they have the actual PCI match tables at all either, but if that isn't in Rust4Linux downstream I'm sure it's in a branch someone has somewhere (R4L has OpenFirmware match support, it's not hard to adapt that to other bus types)

17:58 <lina> IIRC device DMA/etc isn't merged into r4l either, so yeah, there's lots of moving parts still work in progress...

17:59 <lina> I think I'm not stepping on anyone's toes since I mostly stuck to DRM. The only core stuff I touched was some bugfixing, minor details, and general OF/Device Tree support.

17:59 <lina> GPUs are special snowflakes enough they tend not to overlap too much with "normal" drivers...

18:00 <lina> Oh yeah, I did add the IOMMU pgtable ops abstractions, but same thing, I doubt anyone else will need that for quite a while...

18:01 <danvet> yeah bypassing dma-api and directly wrestling iommu is not something many drivers do

18:01 <danvet> kvm and some vfio stuff and that's about it I think

18:02 <lina> Oh we still hit dma-api behind the scenes I think (in DRM), it's just that there's no actual IOMMU for it to interact with (though there could be one...), and instead the GPU has its own MMU layer that just happens to reuse the IOMMU pagetable code.

18:04 <lina> I had an interesting conversation with the Rust folks about what is safe and what is unsafe... technically, those IOMMU pagetable abstractions are 100% safe!

18:04 <Lynne> Venemo: latest mesh shader changes break mpv, btw, I had noticed this when it was still a draft, but thought it would be fixed

18:04 <Lynne> segfault in radv_amdgpu_get_bo_list

18:05 <lina> Sure, you can ask it to map any random GPU VA to any random "physical" address... but all it's doing is changing some data in safely allocated memory (page tables).

18:05 <lina> It's up to the GPU driver to actually install the base pointer into the IOMMU and make it all scary and real, and that's the only unsafe operation ^^

18:06 <lina> Safety is quite interesting (and hard) to define when it comes to device drivers.

18:28 rsjw has joined #dri-devel

18:48 K0bin has joined #dri-devel

18:48 kts has quit [Quit: Leaving]

18:53 alatiera8 has joined #dri-devel

18:56 alatiera has quit [Ping timeout: 480 seconds]

19:11 ajax has quit [Remote host closed the connection]

19:45 djbw has joined #dri-devel

19:49 JohnnyonFlame has quit [Ping timeout: 480 seconds]

20:05 K0bin has quit [Quit: K0bin]

20:07 jkrzyszt has quit [Remote host closed the connection]

20:07 jkrzyszt has joined #dri-devel

20:53 ahajda_ has quit []

21:01 lynxeye has quit [Quit: Leaving.]

21:26 <DemiMarie> lina: that is why one makes a type of base pointers that can always be safely inserted into the IOMMU and which always point to a well-formed page table. Functions that modify the page tables are then marked `unsafe` unless they guarantee all needed invariants.

21:29 maxzor has quit [Remote host closed the connection]

21:29 bgs has quit [Remote host closed the connection]

21:30 lemonzest has quit [Quit: WeeChat 3.6]

21:38 rasterman has quit [Quit: Gettin' stinky!]

21:43 K0bin has joined #dri-devel

21:45 Haaninjo has quit [Quit: Ex-Chat]

22:02 <Venemo> Lynne: which mesh shader changes? and how do I reproduce the issue?

22:03 dcz_ has quit [Ping timeout: 480 seconds]

22:03 <Venemo> Lynne: I wish you had told me sooner

22:04 <Lynne> sorry -_-

22:04 <Lynne> let me bisect

22:10 <Lynne> Venemo: e10b2f273e8a48a2db977469d30f6ed1014484c4

22:12 <Venemo> Lynne: this? https://gitlab.freedesktop.org/mesa/mesa/-/commit/e10b2f273e8a48a2db977469d30f6ed1014484c4

22:12 <Lynne> yeah

22:13 <Venemo> weird

22:13 <Venemo> and how do I reproduce this crash?

22:14 <Lynne> mpv --no-config --gpu-context=waylandvk

22:14 <Lynne> seems to only trigger with wayland

22:14 <Venemo> I'll take a look

22:16 Duke`` has quit [Ping timeout: 480 seconds]

22:25 <Venemo> Lynne: that command doesn't seem to do anything other than showing some help

22:25 <Venemo> Lynne: if I add a video file at the end of that command, it plays the video without issue

22:26 apinheiro has joined #dri-devel

22:26 <Venemo> this is with mpv-0.35.0-1.fc37.x86_64; which version did you use?

22:26 danvet has quit [Ping timeout: 480 seconds]

22:26 <Lynne> 0.35.0-25

22:27 <Lynne> from nov 25, let me update it

22:27 <Lynne> did you try --vo=gpu-next?

22:28 <Venemo> no, why would I? that's not what you wrote above

22:28 <Venemo> tried now, still works

22:30 <Lynne> still broken here with git master

22:30 <Venemo> is there a specific video or video format that is affected?

22:31 <Venemo> I'm going to try its master

22:31 <Lynne> no, everything's broken

22:31 <Lynne> I am on sway, which may make a difference

22:33 <Venemo> sigh, the build from its main branch is broken...

22:34 <Lynne> it won't build? what's the problem?

22:34 <Venemo> it's OK just needs a manual LD_LIBRARY_PATH

22:34 <Venemo> but it still works fine

22:35 <Venemo> I'm running Gnome, do you think that makes a difference?

22:36 <Venemo> can you give me a backtrace please? preferably in a pastebin (don't paste it into IRC)

22:39 <Lynne> narrowed it down more, only sway with the vulkan renderer makes mpv crash

22:40 <Venemo> I don't think I can help without seeing a backtrace

22:41 <Lynne> https://0x0.st/o5BU.txt

22:41 <Venemo> can you try the same with a debug build of mesa please?

22:42 <Venemo> so that it shows line numbers

22:42 <Lynne> from the disassembly, looks like it segfaults in add eax,DWORD PTR [rcx+0x44], which looks like it's in the first changed loop

22:42 <Lynne> but let me do a debug build

22:46 <Lynne> it was, with num_extra_cs = 1, extra_cs_array == NULL

22:46 <Venemo> okay I wonder how that is possible

22:47 <Lynne> well, before, !!extra_cs was used to detect to extend the loops

22:47 <Venemo> Lynne: can you give me the backtrace with the debug build please? the previous one seems to be missing some frames

22:48 <Venemo> num_extra_cs should be 0 when there are no extra CSs

22:48 <Lynne> https://0x0.st/o5BD.txt

22:50 <Lynne> extra_cs_array[0] == NULL, rather

22:51 <Venemo> Lynne: this seems to be unrelated to num_extra_cs, line 796 is: handles[i].bo_handle = extra_bo_array[i]->bo_handle; this line is not changed by that patch

22:53 <Venemo> actually the backtrace seems to not match the code, the second frame points to an empty line

22:53 <Lynne> you're on the commit rather than HEAD, right?

22:53 <Venemo> I'm on 55df7ad571470562ffa3f6d71c32787f11b61b14 (latest main)

22:53 <Venemo> which commit are you on?

22:53 <Lynne> the bad commit

22:54 <Lynne> I think the issue is that before, initial_preamble_cs could be NULL

22:54 <Venemo> ok gimme a sec, will take a look at that

22:54 Akari has joined #dri-devel

22:54 <Lynne> the fix is simple, instead of hardcoding 1 on function calls, it should be !!initial_preamble_cs

22:55 <Venemo> I'm trying to think of a use case when it can be NULL and honestly I'm not aware of any

22:56 rsjw has quit [Ping timeout: 480 seconds]

22:56 <Lynne> making that change in the 3 calls fixes it
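
The crash and Lynne's suggested change (pass `!!initial_preamble_cs` as the count instead of a hardcoded 1) can be illustrated with a toy model. This is not the radv code: `count_bo_handles`, `submit`, and the `num_buffers` field are hypothetical stand-ins, and a `TypeError` on `None` plays the role of the NULL dereference.

```python
def count_bo_handles(extra_cs_array, num_extra_cs):
    """Sum the buffer counts of the first num_extra_cs command streams."""
    total = 0
    for i in range(num_extra_cs):
        # Indexing into a missing (None) command stream fails here, which
        # models the NULL pointer dereference in the C code.
        total += extra_cs_array[i]["num_buffers"]
    return total


def submit(initial_preamble_cs, hardcode_count):
    """Toy submit path: the preamble is the only extra command stream."""
    extra = [initial_preamble_cs]
    if hardcode_count:
        count = 1  # assumes a preamble always exists
    else:
        count = int(initial_preamble_cs is not None)  # !!initial_preamble_cs
    return count_bo_handles(extra, count)


preamble = {"num_buffers": 3}

with_preamble = submit(preamble, hardcode_count=False)
without_preamble = submit(None, hardcode_count=False)

try:
    submit(None, hardcode_count=True)
    hardcoded_count_crashed = False
except TypeError:
    hardcoded_count_crashed = True
```

Deriving the count from the pointer keeps the loop bounds and the array contents in agreement even when a queue has no preamble at all.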

22:56 <Venemo> I don't understand how the preamble can be NULL

22:58 <Venemo> Lynne: if you go to frame #4 radv_queue_submit_normal, is submit.initial_preamble_cs really NULL?

22:59 <Lynne> yup

23:00 <Venemo> Lynne: what path does the code take in radv_update_preambles?

23:00 <Venemo> does it take the early exit?

23:02 <Lynne> seems to take the queue->qf == RADV_QUEUE_TRANSFER path

23:02 <Venemo> hm

23:02 <Venemo> interesting, I wonder why it does that on sway and not on gnome

23:03 <Lynne> hmm, well, it's got a vulkan renderer

23:03 <Lynne> I guess a difference is that AFAIK, application's semaphores are given to the compositor now

23:04 <Lynne> queue->qf is indeed RADV_QUEUE_TRANSFER

23:05 <Venemo> okay

23:08 apinheiro has quit [Quit: Leaving]

23:09 <Venemo> Lynne: can you apply this commit (on top of main) and see if this helps? https://gitlab.freedesktop.org/Venemo/mesa/-/commit/421bec9d0de49df0052da6b7ab8bd3d87348c45f

23:12 <Lynne> yup, that works

23:12 rsjw has joined #dri-devel

23:14 <Venemo> Lynne: thanks for letting me know, here is a MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20401

23:14 <Venemo> I forgot that 0 preambles is possible...

23:14 jkrzyszt has quit [Ping timeout: 480 seconds]

23:16 <Lynne> np, cheers

23:24 Akari has quit [Ping timeout: 480 seconds]

23:25 fxkamd has quit [Remote host closed the connection]

23:26 fxkamd has joined #dri-devel

23:42 fab has quit [Quit: fab]

23:44 Major_Biscuit has joined #dri-devel