#dri-devel on 2022-07-08 — irc logs at oftc.irclog.whitequark.org

2022-03-22 11:57 ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar

00:09 YuGiOhJCJ has quit [Remote host closed the connection]

00:09 YuGiOhJCJ has joined #dri-devel

00:09 <jenatali> Looks like the t760s are out to lunch again https://gitlab.freedesktop.org/mesa/mesa/-/jobs/25057937

00:15 fahien has joined #dri-devel

00:15 ngcortes has quit [Read error: Connection reset by peer]

00:15 fahien has quit []

00:15 co1umbarius has joined #dri-devel

00:17 columbarius has quit [Ping timeout: 480 seconds]

00:20 kts has joined #dri-devel

00:22 gawin has quit [Ping timeout: 480 seconds]

00:23 maxzor_ has quit [Ping timeout: 480 seconds]

00:25 <daniels> jenatali: it’ll get fixed soon but itmt they’ll be blocked for 90min periods whilst people submit drm updates which terminally hang them

00:26 <daniels> s/whilst/whenever/

00:26 <jenatali> Oof

00:26 <jenatali> But makes sense

00:28 <daniels> (and we only have two of them because that platform is old enough that the others all died)

00:31 ngcortes has joined #dri-devel

00:34 toolchains has joined #dri-devel

00:35 anholt has quit [Ping timeout: 480 seconds]

00:47 <JoniSt> oh no

00:49 toolchains has quit [Ping timeout: 480 seconds]

00:58 toolchains has joined #dri-devel

01:06 toolchains has quit [Ping timeout: 480 seconds]

01:08 toolchains has joined #dri-devel

01:09 JoniSt has quit [Quit: KVIrc 5.0.0 Aria http://www.kvirc.net/]

01:18 ngcortes has quit [Remote host closed the connection]

01:24 toolchains has quit [Read error: Connection timed out]

01:29 Daanct12 has joined #dri-devel

01:49 ybogdano has quit [Ping timeout: 480 seconds]

01:51 toolchains has joined #dri-devel

02:00 toolchains has quit [Ping timeout: 480 seconds]

02:05 sul has quit [Ping timeout: 480 seconds]

02:06 sul has joined #dri-devel

02:09 toolchains has joined #dri-devel

02:29 toolchains has quit [Read error: Connection timed out]

02:38 heat has quit [Ping timeout: 480 seconds]

02:44 pixelclu- has joined #dri-devel

02:49 pixelcluster has quit [Ping timeout: 480 seconds]

02:57 kts has quit [Ping timeout: 480 seconds]

02:57 anholt has joined #dri-devel

03:01 bmodem has joined #dri-devel

03:03 ella-0_ has joined #dri-devel

03:06 ella-0 has quit [Read error: Connection reset by peer]

03:12 ppascher has joined #dri-devel

03:23 JohnnyonFlame has joined #dri-devel

03:31 JohnnyonF has quit [Ping timeout: 480 seconds]

03:32 toolchains has joined #dri-devel

03:37 Daanct12 has quit [Ping timeout: 480 seconds]

03:40 toolchains has quit [Ping timeout: 480 seconds]

03:48 Daanct12 has joined #dri-devel

04:23 toolchains has joined #dri-devel

04:23 kts has joined #dri-devel

04:25 Duke`` has joined #dri-devel

04:31 off^ has joined #dri-devel

05:02 shankaru has joined #dri-devel

05:06 jewins has quit [Ping timeout: 480 seconds]

05:07 Daanct12 has quit [Remote host closed the connection]

05:08 Daanct12 has joined #dri-devel

05:18 YuGiOhJCJ has quit [Remote host closed the connection]

05:18 YuGiOhJCJ has joined #dri-devel

05:19 Daanct12 has quit [Quit: Leaving]

05:27 off^ has quit [Ping timeout: 480 seconds]

05:29 toolchains has quit [Ping timeout: 480 seconds]

05:29 itoral has joined #dri-devel

05:34 Daanct12 has joined #dri-devel

05:39 shankaru has left #dri-devel [#dri-devel]

05:40 shankaru1 has joined #dri-devel

05:49 Duke`` has quit [Ping timeout: 480 seconds]

05:49 JohnnyonF has joined #dri-devel

05:50 Daanct12 has quit [Remote host closed the connection]

05:50 Daanct12 has joined #dri-devel

05:56 toolchains has joined #dri-devel

05:57 JohnnyonFlame has quit [Ping timeout: 480 seconds]

06:04 Daaanct12 has joined #dri-devel

06:04 toolchains has quit [Ping timeout: 480 seconds]

06:05 srslypascal has quit [Remote host closed the connection]

06:05 srslypascal has joined #dri-devel

06:05 off^ has joined #dri-devel

06:06 slattann has joined #dri-devel

06:07 toolchains has joined #dri-devel

06:08 adarshgm has joined #dri-devel

06:08 <adarshgm> test message

06:10 Daanct12 has quit [Ping timeout: 480 seconds]

06:22 alanc has quit [Remote host closed the connection]

06:25 MajorBiscuit has joined #dri-devel

06:26 toolchains has quit [Ping timeout: 480 seconds]

06:28 alanc has joined #dri-devel

06:30 Company has quit [Quit: Leaving]

06:46 toolchains has joined #dri-devel

06:46 tzimmermann has joined #dri-devel

06:47 <tomeu> robclark: what happens with the t760 is that it regressed badly in 5.19-rcX, and people are running it in repos that don't have the fix in their -external-fixes branch

06:48 <tomeu> so it times out

06:52 <tomeu> daniels: I can reduce the limit in my repo, sure

06:52 <tomeu> hopefully there aren't that many forks out there yet

06:55 toolchains has quit [Ping timeout: 480 seconds]

06:56 ppascher has quit [Ping timeout: 480 seconds]

07:16 toolchains has joined #dri-devel

07:24 toolchains has quit [Ping timeout: 480 seconds]

07:30 <pq> kchibisov, EGL Surfaceless platform supports pbuffers, because they are the way to get an EGLSurface there.

07:35 nchery has quit [Read error: Connection reset by peer]

07:42 off^ has quit [Ping timeout: 480 seconds]

07:44 toolchains has joined #dri-devel

07:50 rasterman has joined #dri-devel

07:52 toolchains has quit [Ping timeout: 480 seconds]

07:56 whald has joined #dri-devel

07:59 lynxeye has joined #dri-devel

08:01 ahajda__ has joined #dri-devel

08:02 whald has quit []

08:02 whald has joined #dri-devel

08:02 mvlad has joined #dri-devel

08:07 JohnnyonFlame has joined #dri-devel

08:08 <whald> is GBM supposed to be thread-safe? so can I gbm_bo_alloc on one thread and then gbm_bo_map/unmap on a different thread?

08:14 JohnnyonF has quit [Ping timeout: 480 seconds]

08:15 <whald> because it crashes for me on Intel and AMD, here are the stack traces: https://pastebin.com/LLP1u6ct (Intel is on unmap, AMD is on map)

08:19 <pq> whald, if you do 'git grep pthread' in Mesa's src/gbm/, there are no hits. So I guess that's your answer.

08:20 <pq> hmm, there is 'mtx' though, but it looks like there are very few uses of that

08:21 <lynxeye> whald: That's a bit of a gray area. It seems the map/unmap calls are using context operations, which are only supposed to be called by the thread where the context is current. However there is no way to make the context current on the calling thread via the GBM API...

08:23 toolchains has joined #dri-devel

08:23 <pq> lynxeye, how does that work in single-threaded programs even? Creating a gbm_device implicitly makes a context and makes it current?

08:23 <whald> lynxeye, that's a bit of a bummer, as map/unmap can be quite expensive. i'm trying to offload some slow operations from the main thread to some thread pool. :-/

08:24 <lynxeye> pq: I would need to look up the details again, but I think that's effectively what happens.

08:25 <pq> ...and what context is that, exactly? Like, if I change my EGLContext in the same thread, does that screw up GBM?

08:26 <pq> e.g. if I use GBM stand-alone without EGL, but I also use EGL Device platform to make an EGLContext to use GL with.

08:28 <pq> there is absolutely no indication of any thread-locality or contexts going on in the GBM API, so this is all a big surprise to me

08:29 FireBurn has joined #dri-devel

08:29 <FireBurn> Vulkan is broken on my 6800M *again*

08:29 <FireBurn> Bisecting now

08:29 <pq> I would have assumed that if I do my own locking around GBM, access from multiple threads would be fine.

08:29 fahien has joined #dri-devel

08:29 <lynxeye> pq: All good questions, where I don't have a definite answer without reading the code again. gbm_map/unmap was always kind of a strange thing. All other GBM operations deal with allocations, etc. which are screen level operations, that are thread safe. map/unmap is the only thing in the GBM API that needs context operations and I don't think anyone gave it any thought, as to how the usage model for this is supposed to look like.

08:30 <pq> lynxeye, interesting

08:31 <FireBurn> What are the chances of getting a PRIME system added to CI? I mean it breaks on a weekly basis

08:31 bmodem has quit []

08:31 <pq> incidentally, gbm_dri_bo_map() does use a mutex that literally nothing else does.

08:31 <lynxeye> The best advice I can give is: don't use GBM map/unmap, but import the GBM BOs into a higher level API, where this context stuff is actually defined and use that to fill the buffers.

08:33 <pepp> FireBurn: fwiw I tested PRIME earlier this week and it was working fine, so the regression should be recent

08:33 <pepp> and I agree: having a PRIME system in CI would be useful

08:34 <pq> aha, looks like the only purpose of that mutex is to protect dri->context, which gbm_dri_bo_map() creates on demand, and passes it explicitly to dri->image->mapImage()

08:34 <pq> so it doesn't look like it's thread-local in any way

08:35 <pq> but it does mean that if you call GBM from multiple threads, you are using the dri->image API from multiple threads simultaneously with the one and the same context

08:35 <FireBurn> pepp: Yeah me too, I'm going to hazard a guess at the vulkan/wsi stuff that's just landed

08:36 <pq> also, gbm_dri_bo_unmap() uses dri->context without any locking or CPU memory barriers

08:36 <lynxeye> pq: right and context operations are not thread safe, so that's a massive footgun right there

08:37 toolchains has quit [Ping timeout: 480 seconds]

08:37 <whald> lynxeye, pq so with GBM map/unmap being not really up for the task, what can I do instead of going full OpenGL for getting my hands on the pixels?

08:37 <pq> whald, there's your answer: you need to make sure only a single thread can use stuff related to the same gbm_device at a time.
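
pq's conclusion here (only one thread at a time may touch a given gbm_device) can be sketched as a caller-side lock around map/unmap. This is a minimal sketch, not Mesa code: `fake_bo`, `fake_bo_map` and `fake_bo_unmap` are hypothetical stand-ins for a real `struct gbm_bo` and the real `gbm_bo_map()`/`gbm_bo_unmap()` calls, used so the example builds without libgbm. It also leans on pq's own assumption that separate gbm_device objects do not share a context.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for a buffer object; real code would hold a struct gbm_bo *. */
struct fake_bo {
    uint8_t pixels[16];
    int mapped;
};

/* One lock per device: every map/unmap on BOs of that device takes it. */
struct locked_device {
    pthread_mutex_t lock;
    int calls; /* number of map/unmap calls made under the lock */
};

struct locked_device g_dev = { PTHREAD_MUTEX_INITIALIZER, 0 };

/* Stand-ins for gbm_bo_map()/gbm_bo_unmap(); here they only flip a flag. */
static void *fake_bo_map(struct fake_bo *bo)
{
    bo->mapped = 1;
    return bo->pixels;
}

static void fake_bo_unmap(struct fake_bo *bo)
{
    bo->mapped = 0;
}

/* Serialize the map call so two threads never enter the device at once. */
void *locked_bo_map(struct locked_device *dev, struct fake_bo *bo)
{
    pthread_mutex_lock(&dev->lock);
    void *ptr = fake_bo_map(bo);
    dev->calls++;
    pthread_mutex_unlock(&dev->lock);
    return ptr;
}

/* Same lock for unmap, which the log notes is otherwise unprotected. */
void locked_bo_unmap(struct locked_device *dev, struct fake_bo *bo)
{
    pthread_mutex_lock(&dev->lock);
    assert(bo->mapped); /* unmap without a prior map is a caller bug */
    fake_bo_unmap(bo);
    dev->calls++;
    pthread_mutex_unlock(&dev->lock);
}
```

The lock covers only the calls into the device, not the time spent reading pixels between map and unmap, so a slow reader on another thread does not hold it.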

08:38 pcercuei has joined #dri-devel

08:38 <pq> (i'm jumping to the assumption that gbm_device objects do not share contexts.)

08:39 <pq> oh, there is GBM_ALWAYS_SOFTWARE end var, hadn't heard about that one before

08:39 <pq> *env var

08:39 <whald> pq, hmm, having a separate gbm_device for the background thread seems doable, but to pass the buffers i'll have to export / re-import I guess. this is getting out of hand. multi-threading was a mistake, again. :-)

08:40 <pq> yup :-)

08:41 <pq> what else do you do with your gbm_device than just alloc/import/map/unmap?

08:41 YuGiOhJCJ has quit [Quit: YuGiOhJCJ]

08:43 <whald> i think just doing the map on the main thread and doing the pixel peeping on a separate thread will not work either, because at least on intel there is a chance that the map will not allocate scratch space but instead do a mmap of the GPU memory, and then there would be some cache coherency stuff missing.

08:44 <whald> pq, i'm also using gbm surfaces to have something to attach to the drm device for scanout. but that's about it. does GBM offer more that I may abuse? :)

08:45 <pq> hmm, that doesn't sound legit for GBM API... I mean, the API does expect you to map and unmap to flush.

08:46 bmodem has joined #dri-devel

08:46 <pq> but if you do that, I wouldn't expect which actually accesses the data to be significant

08:46 <pq> *which thread

08:46 <MrCooper> yeah, never unmapping will result in sadness if the implementation uses a scratch buffer

08:47 <pq> whald, yeah, that's about it. But since you have gbm_surface, don't you also go all the way to EGL and OpenGL anyway?

08:48 <whald> pq, right, mapping / unmapping on the main thread would solve my immediate problem because with linear buffers on intel those operations are cheap. and it will probably still work on other platforms, performance might be degraded.

08:48 <MrCooper> whald: that is required for correctness, deferring the unmap will not work correctly in general e.g. with radeonsi

08:48 <pq> cross-domain access is hard :-)

08:50 <whald> pq, yes, we're using OpenGL for rendering. but I don't have an OpenGL context at hand in the part of the application where I'm doing the separate-thread-pixel-peeping. and that would be quite a refactor...

08:50 <MrCooper> whald: in other words, writes to the mapped memory may not be visible to the GPU before unmap (and GPU writes to the buffer may not be visible in the mapped memory after map)

08:52 <whald> MrCooper, but having a sequence where it goes like T1: create -> T1: map -> T2: peep pixels -> T1: unmap would be fine, right? T1 is responsible for managing the broader state and will e.g. block the buffer from re-use until T2 is done, that's the way it is set up already.

08:54 <MrCooper> I guess that should work, not sure I see what problem it solves though :)

08:55 <MrCooper> T2 or any other thread making use of the buffer with the GPU will need to wait for T1 to unmap first anyway

08:55 toolchains has joined #dri-devel

08:55 <whald> MrCooper, the pixel peeping is pretty slow, almost 200ms and I absolutely cannot block the main thread for that long.

08:55 adarshgm has quit [Ping timeout: 480 seconds]

08:56 <MrCooper> ah, if it's only for reading from the buffer, that makes sense

08:56 <whald> MrCooper, I should have added that T1 does more interesting things while T2 is churning along, but the buffer and its mapping are private to T2 until it finishes.

08:57 <kchibisov> pq: isn't surfaceless platform requires different extension when creating a display?

08:58 <pq> kchibisov, EGL Surfaceless platform is a different platform yes. It does not use a EGLNativeDisplay.

08:58 <kchibisov> Yeah, I was talking about EGL_KHR_platform_wayland, since with it you don't have pbuffers at all.

08:58 <pq> I know.

08:59 <pq> If you really want pbuffers, Surfaceless can give them to you.

08:59 <MrCooper> whald: almost 200ms seems very slow though, what resolution buffer is that?

08:59 <kchibisov> Oh, I don't want them, I was curious since I've seen this line in extension.

08:59 <kchibisov> https://github.com/KhronosGroup/EGL-Registry/blob/84f25dd4c04a01ea48480f7296ba9d64d435fa87/extensions/KHR/EGL_KHR_platform_wayland.txt#L84

09:00 <kchibisov> Thought that it could be mentioned that pbuffers will also always fail.

09:00 aravind has joined #dri-devel

09:02 <pq> kchibisov, well, they don't *have to* fail. Nothing stops implementing pbuffers.

09:03 <icecream95> MrCooper: I've seen cases where mapping a large BO device-side (i.e. importing it) takes 100 ms or so, I could believe that a CPU map could take as long

09:03 <pq> unlike pixmaps which simply don't exist in Wayland

09:03 toolchains has quit [Remote host closed the connection]

09:05 <pq> kchibisov, btw. the notes about EGL_DEFAULT_DISPLAY on Wayland are absolutely non-sensical. I've no idea why it was allowed. Maybe it's a shortcut for doing off-screen rendering on the same GPU as the winsys, but not being able to interact with the winsys at all.

09:06 sravn has joined #dri-devel

09:07 <pq> you can't even make a EGLSurface with Wayland EGL_DEFAULT_DISPLAY

09:18 <whald> MrCooper, 200ms is for a full HD buffer. the code doing the processing is not (yet) streamlined at all, I can probably speed it up by a factor of 5 or more. but that's still way too slow for the main thread.

09:19 rgallaispou1 has quit [Read error: Connection reset by peer]

09:20 <MrCooper> factor 5 would still mean 25 fps max, though multiple reader threads might work

09:22 JohnnyonFlame has quit [Ping timeout: 480 seconds]

09:26 <whald> MrCooper, I'm software-decoding some custom "video" stream coming in over UDP and want to exfiltrate previews (so a JPEG encode every 10s is what happens on the "other" thread). we're targeting an intel Atom CPU/GPU combo which has only two cores anyway, so using more threads won't work. processing an udp packet is in the 10-20 us range, and we're receiving about 40k of those if things get busy. so the JPEG encode is orders of

09:26 <whald> magnitude slower on that Atom than anything else we do.

09:28 rkanwal has joined #dri-devel

09:31 <linkmauve> whald, have you tried using the hardware JPEG encoder, if your SoC has one?

09:32 <linkmauve> Check `vainfo`, if there is a VAProfileJPEGBaseline with encode support (VAEntrypointEncPicture), it could let you offload that operation.

09:33 <whald> linkmauve, yep, and it's pretty fast. *but* with GBM I cannot directly create NV12 BOs (not supported), so I go the 1 R8 BO (for Y) and one RG88 BO (for UV) route... *but* the vaapi API refuses to accept multi-object NV12 buffers. pretty sad, eh? :-)

09:34 <linkmauve> Can’t you allocate a single bo and import it into EGL with offsets so that it ends up in a layout that libva accepts?

09:35 <whald> it's not the vaapi API per-se, but the intel-media-driver chickens out if the UV plane has offset == 0, which effectively means the UV has to come after the Y data.
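
The layout whald describes (UV data following the Y data in one buffer) can be sketched as plain offset arithmetic. This is a minimal sketch of a generic single-buffer NV12 layout only; `nv12_layout`, `nv12_single_buffer_layout` and the `stride_align` parameter are hypothetical names, and it does not model whatever extra alignment or tiling requirements intel-media-driver imposes, which the log says exist but does not spell out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct nv12_layout {
    uint32_t y_offset;  /* luma starts at the beginning of the buffer */
    uint32_t y_stride;  /* bytes per luma row */
    uint32_t uv_offset; /* chroma starts right after the last luma row */
    uint32_t uv_stride; /* bytes per interleaved CbCr row */
    size_t   size;      /* total bytes needed for both planes */
};

static uint32_t align_up(uint32_t v, uint32_t a)
{
    return (v + a - 1) / a * a;
}

/* stride_align is an assumption of this sketch; it must be even and non-zero. */
struct nv12_layout nv12_single_buffer_layout(uint32_t width, uint32_t height,
                                             uint32_t stride_align)
{
    struct nv12_layout l;

    assert(stride_align != 0 && stride_align % 2 == 0);

    l.y_offset  = 0;
    l.y_stride  = align_up(width, stride_align);
    l.uv_offset = l.y_stride * height;
    /* Half as many CbCr samples per row, two bytes each: same row pitch as Y. */
    l.uv_stride = l.y_stride;
    /* The chroma plane has half as many rows as the luma plane. */
    l.size = (size_t)l.uv_offset + (size_t)l.uv_stride * ((height + 1) / 2);
    return l;
}
```

With this layout the UV plane always has a non-zero offset into the same buffer, which is the property the multi-object R8 + RG88 approach lacks.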

09:35 <linkmauve> :|

09:35 <linkmauve> You should open an issue for that.

09:36 <linkmauve> Does the i915 driver work better?

09:36 JohnnyonFlame has joined #dri-devel

09:37 <whald> linkmauve, I already thought about arranging the data in a single BO by hand, but it seems there are various requirements to get it right, all somewhere in the intel-media-driver. maintaining this would be hell. so i thought encoding a JPEG every 10s with a core to spare can't be that hard. or so.

09:37 pcercuei has quit [Read error: Connection reset by peer]

09:37 off^ has joined #dri-devel

09:37 pcercuei has joined #dri-devel

09:38 <whald> linkmauve, which i915 driver do you mean?

09:38 <linkmauve> The vaapi one.

09:38 frieder has joined #dri-devel

09:38 <linkmauve> It’s named libva-intel-driver in my distribution, and provides /usr/lib/dri/i965_drv_video.so.

09:39 <linkmauve> I’ve usually had better results on my Kaby Lake than with the new one, for instance it does support VP9 encoding while the newer one doesn’t.

09:40 <whald> linkmauve, i just gave it a try: "libva error: /run/opengl-driver/lib/dri/i965_drv_video.so init failed". hmm.

09:40 <whald> (I'm on nixos, that's why the path is strange)

09:42 fahien1 has joined #dri-devel

09:43 fahien is now known as Guest4519

09:43 fahien1 is now known as fahien

09:43 frieder has quit [Remote host closed the connection]

09:47 Guest4519 has quit [Ping timeout: 480 seconds]

09:59 bmodem has quit []

10:02 off^ has quit [Ping timeout: 480 seconds]

10:11 <FireBurn> Looks like it was ff13fc381d59fc8a5b06a40b6bb857503c6e7711 that broke things for me

10:13 fahien has quit [Ping timeout: 480 seconds]

10:14 fahien has joined #dri-devel

10:17 heat has joined #dri-devel

10:22 adarshgm has joined #dri-devel

10:42 rkanwal has quit [Ping timeout: 480 seconds]

10:47 <MrCooper> Venemo: ^

10:47 kts has quit [Ping timeout: 480 seconds]

10:50 adarshgm has quit [Ping timeout: 480 seconds]

10:58 <hakzsam> FireBurn: broke what?

11:13 gawin has joined #dri-devel

11:16 kts has joined #dri-devel

11:23 JoniSt has joined #dri-devel

11:25 Daaanct12 has quit [Remote host closed the connection]

11:33 JohnnyonFlame has quit [Read error: Connection reset by peer]

11:35 JoniSt has quit [Quit: KVIrc 5.0.0 Aria http://www.kvirc.net/]

11:37 mclasen has joined #dri-devel

11:39 off^ has joined #dri-devel

11:40 JoniSt has joined #dri-devel

11:41 itoral has quit [Remote host closed the connection]

11:41 rkanwal has joined #dri-devel

11:42 itoral has joined #dri-devel

11:43 nchery has joined #dri-devel

11:46 icecream95 has quit [Ping timeout: 480 seconds]

11:50 <FireBurn> Rendering on my 6800M

11:50 <FireBurn> Might be a prime thing

11:50 <FireBurn> Or it might not be that commit :/ Tried reverting it and still seeing the issue

11:59 <Venemo> MrCooper: yeah?

12:01 <Venemo> FireBurn: please open an issue report in the Mesa gitlab, choose the radeon Vulkan issue template and fill in the details of how to reproduce your issue. Thank you.

12:03 <Venemo> That is, assuming you experience the problem running a Vulkan app

12:04 off^ has quit [Ping timeout: 480 seconds]

12:08 aravind has quit [Ping timeout: 480 seconds]

12:15 itoral has quit [Remote host closed the connection]

12:22 off^ has joined #dri-devel

12:29 fahien has quit [Ping timeout: 480 seconds]

12:31 fahien has joined #dri-devel

12:35 zehortigoza has joined #dri-devel

12:44 dri-logg1r has joined #dri-devel

12:46 dri-logger has quit [Ping timeout: 480 seconds]

12:46 mslusarz has quit [Ping timeout: 480 seconds]

12:49 mslusarz has joined #dri-devel

12:50 off^ has quit [Ping timeout: 480 seconds]

12:52 glisse has quit [Remote host closed the connection]

12:52 glisse has joined #dri-devel

12:52 mareko has quit [Remote host closed the connection]

12:52 mareko has joined #dri-devel

12:54 pixelclu- has quit []

12:54 pixelcluster has joined #dri-devel

12:58 <FireBurn> Someone beat me to it https://gitlab.freedesktop.org/mesa/mesa/-/issues/6826

13:01 <FireBurn> Someone else with a 6800M :D

13:07 <jenatali> :( hopefully jekstrand can see what's going wrong because that looked fine to me and I can't really test it out to debug

13:14 sul has quit [Ping timeout: 480 seconds]

13:14 sul has joined #dri-devel

13:16 alyssa has joined #dri-devel

13:17 maxzor has joined #dri-devel

13:19 iive has joined #dri-devel

13:30 off^ has joined #dri-devel

13:41 kts has quit [Ping timeout: 480 seconds]

13:42 nirya has joined #dri-devel

13:43 <linkmauve> digetx, did you manage to fix anything wrt synchronisation between wl_buffer and vaapi?

13:49 wvanhauwaert has joined #dri-devel

13:50 off^ has quit [Ping timeout: 480 seconds]

13:56 wvanhauwaert has quit [Remote host closed the connection]

14:00 jewins has joined #dri-devel

14:01 fxkamd has joined #dri-devel

14:02 off^ has joined #dri-devel

14:10 Haaninjo has joined #dri-devel

14:14 Company has joined #dri-devel

14:17 <linkmauve> digetx, I opened a PR instead: https://github.com/mpv-player/mpv/pull/10382

14:19 camus has quit []

14:23 kts has joined #dri-devel

14:28 <alyssa> eric_engestrom: to be clear I just build with the meson defaults

14:28 <alyssa> which is, TTBOMK, a debug optimized build, so neither NDEBUG nor DEBUG defined

14:28 <eric_engestrom> exactly

14:28 <eric_engestrom> and I expect most people do that too

14:28 <alyssa> I don't think I realized that "debug build with -O2" is different from "debug optimized"

14:28 <alyssa> that's really confusing

14:29 <eric_engestrom> which is why I agree with your MR

14:29 <alyssa> the MR justification is dumber ... cheap asserts are gated behind !NDEBUG, that's what NDEBUG is there for
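
The split alyssa and eric_engestrom are discussing can be sketched as follows: cheap checks go through `assert()`, which is compiled in unless `NDEBUG` is defined, while expensive checks sit behind a separate `DEBUG` macro. This is a minimal sketch of that convention as described in the log, not code from Mesa; `sorted_max`, `is_sorted` and the two `*_enabled` helpers are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#ifdef DEBUG
/* Expensive check: an O(n) walk, compiled only when DEBUG is defined. */
static int is_sorted(const int *v, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        if (v[i - 1] > v[i])
            return 0;
    }
    return 1;
}
#endif

/* Returns the largest element of an ascending array. */
int sorted_max(const int *v, size_t n)
{
    assert(n > 0); /* cheap: active unless NDEBUG is defined */
#ifdef DEBUG
    assert(is_sorted(v, n)); /* expensive: needs DEBUG as well */
#endif
    return v[n - 1];
}

/* Report which class of checks this translation unit was built with. */
int cheap_asserts_enabled(void)
{
#ifdef NDEBUG
    return 0;
#else
    return 1;
#endif
}

int expensive_checks_enabled(void)
{
#ifdef DEBUG
    return 1;
#else
    return 0;
#endif
}
```

Built with neither macro defined, which is what alyssa says the meson defaults give, the cheap assert runs and the sortedness walk is compiled out.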

14:29 <eric_engestrom> it's just your MR description which sounded like the two were a bit confused, that's why I explained the difference

14:29 <alyssa> oh, right

14:29 <alyssa> yeah I don't understand any of this

14:30 <eric_engestrom> perhaps we should rename `DEBUG` to `EXPENSIVE_ASSERT` or something (:

14:30 <alyssa> 100%

14:30 <alyssa> learning that I don't have list/mutex asserts in any of my builds (despite making use of the latter) gave me an "emperor has no clothes" moment yesterday

14:30 <eric_engestrom> haha

14:31 gawin has quit [Ping timeout: 480 seconds]

14:33 <alyssa> eric_engestrom: most DEBUG use is in drivers, fwiw

14:33 <alyssa> and not my drivers

14:33 <alyssa> so I can't do perf testing for that

14:34 <alyssa> there are some weird uses in Mesa, though

14:34 <alyssa> like os_log_message, in util/, which prints to stderr in all builds but can be overridden with a GALLIUM env var in debug builds only

14:34 <alyssa> *blink*

14:35 <ajax> error message handling in mesa does not suffer from what you might call a unity of design

14:35 <alyssa> truth

14:35 <alyssa> when should debug_printf be used? who knows

14:35 <ajax> part of me wants to overhaul it all but part of me wants to leave that for like a gsoc project

14:36 <alyssa> yeah that's fair

14:38 <eric_engestrom> I haven't looked at the detail of any gsoc project, but I had the feeling they were a lot bigger than that

14:38 <eric_engestrom> perhaps we should put in place a list of small tasks like this for newcomers to get into the code

14:39 <ajax> we have label:good-first-task at least

14:39 <alyssa> eric_engestrom: unifying message handling in mesa is a lot bigger than one might think ....

14:42 <eric_engestrom> fair, I might well be underestimating it

14:43 <ajax> i think about it like this:

14:43 <ajax> tiletamine:~/git/mesa% git grep -w fprintf.stderr src | wc -l

14:43 <ajax> 2455

14:43 <ajax> pretty sure some of those should be going to a better place than wherever fd 2 happens to be pointing

14:45 <ajax> the secret here being you can't have good error messages without actually good error handling in the code, so fixing the messages to be any good probably requires fixing the code around them a bit too

14:45 JohnnyonFlame has joined #dri-devel

14:46 <ajax> best kind of error message is one you don't have to print because you fixed the algorithm so the condition can't happen anymore

14:51 <jenatali> FWIW on Windows, 95%+ of those would be better off printing to OutputDebugString to be visible in a debugger for apps with no console attached

14:51 <jenatali> So, yeah an error logging overhaul that enables that would be super welcome
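
What jenatali is asking for, one logging entry point that picks a platform-appropriate sink instead of a bare `fprintf(stderr, ...)`, might look roughly like this. It is a minimal sketch and not Mesa's src/util/log.h; `format_log_line` and `sketch_log` are hypothetical names. The platform calls used are `OutputDebugStringA()` on Windows and `__android_log_print()` on Android.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#if defined(_WIN32)
#include <windows.h>
#elif defined(__ANDROID__)
#include <android/log.h>
#endif

/* Build "tag: msg\n" into buf; returns the number of characters stored. */
size_t format_log_line(char *buf, size_t cap, const char *tag, const char *msg)
{
    assert(buf != NULL && cap > 0);

    int n = snprintf(buf, cap, "%s: %s\n", tag, msg);
    if (n < 0) {
        buf[0] = '\0';
        return 0;
    }
    /* snprintf reports the untruncated length; clamp to what actually fits. */
    return (size_t)n < cap ? (size_t)n : cap - 1;
}

/* Single entry point: callers never decide where the message ends up. */
void sketch_log(const char *tag, const char *msg)
{
    char line[512];

    format_log_line(line, sizeof(line), tag, msg);
#if defined(_WIN32)
    OutputDebugStringA(line); /* visible in an attached debugger */
    fputs(line, stderr);      /* still useful when a console exists */
#elif defined(__ANDROID__)
    __android_log_print(ANDROID_LOG_ERROR, tag, "%s", msg); /* logcat */
#else
    fputs(line, stderr);
#endif
}
```

Keeping the formatting separate from the sink is a design choice of this sketch; it makes the formatting testable without capturing stderr.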

14:53 <robclark> tomeu: idk, possibly we need a list of maintainer trees and branches and merge all the -external-fixes? So far I've been just trying to get the more limited case of CI for an individual driver in an individual driver tree working.. once that is sorted I guess we can figure out how to roll it up so CI still works when airlied merges -next/-fixes branches

14:55 whald has quit [Remote host closed the connection]

14:59 <alyssa> jenatali: and Android has its own place (logcat?) which freedreno uses

14:59 <eric_engestrom> ajax, jenatali: could you write this in an issue, so that it's not lost?

14:59 <jenatali> alyssa: Yep

14:59 <eric_engestrom> the "where to log to" thing was (partially?) resolved with the common logging infra, but I don't know how widely used it is

15:00 nchery is now known as Guest4535

15:00 nchery has joined #dri-devel

15:00 <alyssa> in panfrost I've mainly solved this by not logging things.......

15:00 <alyssa> (-:

15:00 <eric_engestrom> (I mean src/util/log.h)

15:00 Guest4535 has quit [Ping timeout: 480 seconds]

15:01 <ajax> eric_engestrom: sure. there's at least one open issue already iirc, i'll see if i can find it

15:02 <ajax> but also like

15:02 <ajax> src/glx basically doesn't know about src/util/anything

15:03 <ajax> so yeah there's common logging but there's also still some uncommon logging to get rid of

15:04 <eric_engestrom> this common logging happened after I ~left mesa a couple of years ago I think, so I didn't follow it much

15:04 <eric_engestrom> but it's something that I had wanted to do for a while

15:05 <eric_engestrom> I just checked and src/egl/main/egllog.h is still there, I thought it would've been swallowed into the common stuff

15:05 ahajda__ has quit [Ping timeout: 480 seconds]

15:06 <eric_engestrom> that should be an easy enough task; I'll make an issue and tag it good-first-task

15:09 ybogdano has joined #dri-devel

15:16 MajorBiscuit has quit [Quit: WeeChat 3.5]

15:17 tzimmermann has quit [Quit: Leaving]

15:22 <tomeu> robclark: yeah, hopefully when people start using it, breakage will happen less often and will last for shorter periods of time, but we need things to be stable enough to get started

15:22 gouchi has joined #dri-devel

15:22 <tomeu> so I think I'm going to reduce coverage quite a bit to reduce churn and flakiness

15:22 <tomeu> and we can increase it again later when more people are keeping an eye on regressions

15:23 <tomeu> and maybe we can do something in kernelci.org so less breakage makes it to drm-next

15:25 fxkamd has quit []

15:29 Duke`` has joined #dri-devel

15:48 AlexisHernndezGuzmn[m] has joined #dri-devel

15:52 <robclark> tomeu: one idea, maybe a per gitlab tree CI variable thing to select between "short" and "full" tests? Ie. for CI runs before merging things into msm-next, I want to do a full run on qc runners but maybe on a few sanity tests on mtk/intel/amd..

15:53 <robclark> *only a few...

15:55 <tomeu> we can try that, sure

16:07 alyssa has quit [Quit: leaving]

16:08 krushia has quit [Quit: Konversation terminated!]

16:08 <robclark> jenatali: re: OutputDebugString .. hook it up in mesa_log.. that is where logcat stuff is hooked up for android

16:09 <jenatali> robclark: It is, IIRC, it's just that not everyone uses it

16:09 <jenatali> E.g. nir logging goes straight to stderr

16:09 <robclark> fix other code to use mesa_log as needed

16:10 <robclark> we've hooked some nir stuff up to it

16:10 <jenatali> Oh I agree, it's just work and it hasn't been important enough to me to prioritize it

16:11 <robclark> we've kinda been fixing things as and when we need the msgs to not go into the ether on android ;-)

16:12 <jenatali> Makes sense. When I was doing some Android stuff I appreciated that logcat was there

16:12 <jenatali> But I was fighting it because I was doing an in-tree build which is hardcoded to --buildtype=release, and switching it to debug breaks backtrace logging, and even then NDEBUG is defined and DEBUG isn't

16:13 <robclark> as best I can tell, debugging anything on android is a pita

16:14 <jenatali> Amen

16:14 <jenatali> Especially because I couldn't figure out how to get CPU debuggers working... every time they attached to a process it crashed

16:14 <jenatali> So... printf debugging via logcat, yay

16:16 <robclark> so back when android was a container on CrOS, I had reasonably good luck just building mesa without stripping symbols and attaching gdb from outside the container.. but the whole vm thing makes it much harder

16:18 rcf has joined #dri-devel

16:41 alyssa has joined #dri-devel

16:43 fahien has quit [Ping timeout: 480 seconds]

16:56 <jenatali> I don't suppose anybody wants to sign off on a mapi/glapi patch to fix Windows non-shared-glapi builds? https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16713

16:56 fahien has joined #dri-devel

16:59 stuart has joined #dri-devel

17:00 Duke`` has quit [Ping timeout: 480 seconds]

17:05 <alyssa> jenatali: does Windows support futexes?

17:05 <jenatali> alyssa: No. I don't even know what that is :)

17:05 <alyssa> compatible with util/futex.h I mean

17:05 <alyssa> Wikipedia claims Microsoft patented them so I would have thunk

17:06 <alyssa> "Futexes have been implemented in Microsoft Windows since Windows 8 or Windows Server 2012 under the name WaitOnAddress"

17:06 <jenatali> Huh

17:06 <alyssa> Bit of an X/Y problem

17:06 <alyssa> simple_mtx is only backed by C11 mutexes if we don't have futexes

17:07 mvlad has quit [Remote host closed the connection]

17:07 <alyssa> (if we do have futexes, we implement simple_mtx with atomics and a futex ourselves)

17:07 <JoniSt> Not *quite* the same though, linux futex has some more features but they shouldn't be relevant for mutexes

17:07 <alyssa> (and then the simple mtx initializer becomes trivial)

17:08 <alyssa> so if we support util/futex.h on Windows, via WaitOnAddress apparently, lygstate doesn't need to merge the wildly unpopular !17122 and everyone is happy

17:08 pcercuei has quit [Read error: Connection reset by peer]

17:08 pcercuei has joined #dri-devel

17:08 Duke`` has joined #dri-devel

17:08 <jenatali> Well, part of it should still merge, removing the non-simple mutex initializer

17:09 <jenatali> But agreed if we can keep the simple mutex initializer, that'd be nice

17:10 <alyssa> incidentally the C11 impl of simple_mtx_assert_locked makes me very sad.

17:10 <jenatali> Yeah looks like a WIN32 path could be added to util/futex.h

17:12 <alyssa> looks like futex_wait/futex_wake map pretty directly to WaitOnAddress/WakeByAddressAll
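
The mapping alyssa points at can be sketched as a pair of thin wrappers. This is a minimal sketch under stated assumptions, not the patch jenatali later posted: `sketch_futex_wait` and `sketch_futex_wake` are hypothetical names, timeouts are left out, and the return values of the two platform paths are not made identical. The Windows branch uses `WaitOnAddress()`, `WakeByAddressSingle()` and `WakeByAddressAll()`; the Linux branch uses the raw `futex` system call.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for syscall() under strict -std= modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(_WIN32)
#include <windows.h> /* link with Synchronization.lib */

/* Block while *addr still equals value; may also return spuriously. */
int sketch_futex_wait(uint32_t *addr, int32_t value)
{
    assert(addr != NULL);
    return WaitOnAddress(addr, &value, sizeof(value), INFINITE) ? 0 : -1;
}

/* Wake one waiter if count is 1, otherwise wake all of them. */
int sketch_futex_wake(uint32_t *addr, int count)
{
    assert(addr != NULL);
    if (count == 1)
        WakeByAddressSingle(addr);
    else
        WakeByAddressAll(addr);
    return 0; /* Windows does not report how many waiters were woken */
}

#elif defined(__linux__)
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Returns 0 after a wake-up, -1 if *addr != value or on error. */
int sketch_futex_wait(uint32_t *addr, int32_t value)
{
    assert(addr != NULL);
    return (int)syscall(SYS_futex, addr, FUTEX_WAIT, value, NULL, NULL, 0);
}

/* Returns the number of waiters woken. */
int sketch_futex_wake(uint32_t *addr, int count)
{
    assert(addr != NULL);
    return (int)syscall(SYS_futex, addr, FUTEX_WAKE, count, NULL, NULL, 0);
}

#else
#error "this sketch only covers Windows and Linux"
#endif
```

On both platforms a waiter has to re-check the value after returning, since either wait can wake up without the word having changed.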

17:12 slattann has quit [Read error: Connection reset by peer]

17:13 <alyssa> also comparing the 3 existing impls, lol at OpenBSD being the only reasonable one.

17:13 <jenatali> Yeah I'm trying it out, will see what blows up

17:13 <alyssa> glhf

17:14 <alyssa> I can't decide which impl is more unreasonable, Linux or FreeBSD

17:14 kts has quit [Ping timeout: 480 seconds]

17:21 kts has joined #dri-devel

17:24 <jenatali> Hm, not sure I can implement this without pulling in windows.h, which sucks since that's a big header to include in another header

17:27 <jenatali> Guess I could add a futex.c for Windows only

17:36 <alyssa> Blink

17:43 Akari has joined #dri-devel

17:44 <ajax> where does the Xvfb we use in CI come from?

17:44 alyssa has quit [Quit: i can't focus]

17:52 <DrNick> I like how the futex version of simple_mtx_assert_locked() treats destroyed mutexes as locked

17:54 <daniels> ajax: Debian

17:57 flto has joined #dri-devel

17:59 agx has quit [Read error: Connection reset by peer]

17:59 agx has joined #dri-devel

17:59 Kayden has quit [Quit: -> JF]

18:19 lynxeye has quit [Quit: Leaving.]

18:25 <jenatali> alyssa: !17431

18:25 <jenatali> Let's see what CI says about it

18:30 kts has quit [Ping timeout: 480 seconds]

18:41 LexSfX has quit []

18:44 LexSfX has joined #dri-devel

18:54 lemonzest has joined #dri-devel

19:07 fahien has quit [Quit: fahien]

19:19 <jenatali> Holy crap, p_atomic_add_return is wrong in the MSVC path :O

19:26 <HdkR> It's amazing how incorrect you can make some operations and things somehow work.

19:26 <HdkR> Life finds a way

19:26 <jenatali> Yeah there's really not many hits on that function in the tree though

19:26 rkanwal has quit [Ping timeout: 480 seconds]

19:26 <jenatali> And it's only the "_return" part that's wrong

19:26 <HdkR> I had compareexchange incorrect for months and months and didn't realize :P

19:32 <jenatali> Unfortunately it's used by code that tries to drop multiple references at once... which means if those references should've brought the count to 0, bam that's a leak
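
The log does not say what exactly was wrong in the MSVC path. Purely as an illustration, the sketch below assumes one plausible way an "add and return" helper can go wrong: returning the value from before the addition (what a fetch-and-add primitive yields) instead of the value after it. It uses C11 atomics and hypothetical names, and shows why dropping several references at once then never observes zero.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Correct: returns the NEW value, i.e. old value plus delta. */
static int32_t sketch_add_return(_Atomic int32_t *v, int32_t delta)
{
   return atomic_fetch_add(v, delta) + delta;
}

/* Buggy (assumed failure mode): returns the OLD value. The addition itself
 * still happens, so only the "_return" part is wrong. */
static int32_t sketch_add_return_buggy(_Atomic int32_t *v, int32_t delta)
{
   return atomic_fetch_add(v, delta);
}

/* Drop n references at once; true means the object should be freed. */
static bool sketch_drop_refs(_Atomic int32_t *refcnt, int32_t n)
{
   assert(n > 0);
   return sketch_add_return(refcnt, -n) == 0;
}

/* Same logic on top of the buggy helper: with a count of 3 and n == 3 the
 * counter does reach 0, but the caller sees 3 and never frees the object. */
static bool sketch_drop_refs_buggy(_Atomic int32_t *refcnt, int32_t n)
{
   assert(n > 0);
   return sketch_add_return_buggy(refcnt, -n) == 0;
}
```

Because the stored counter is still updated correctly, a bug of this shape stays invisible unless a caller actually inspects the return value, which fits the observation that few call sites in the tree use the function.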

19:46 ppascher has joined #dri-devel

19:59 zehortigoza has quit [Remote host closed the connection]

20:04 off^ has quit [Ping timeout: 480 seconds]

20:06 alyssa has joined #dri-devel

20:06 alyssa has left #dri-devel [#dri-devel]

20:19 sarnex has joined #dri-devel

20:34 Haaninjo has quit [Quit: Ex-Chat]

20:35 alyssa has joined #dri-devel

20:35 <alyssa> Do Gallium drivers have a way to detect the API of the frontend?

20:36 <alyssa> I have some slow legacy paths for big GL (and nine?) compat, I'd like to assert(!GLES) because it's a pretty serious bug to hit with GLES

20:36 maxzor_ has joined #dri-devel

20:36 maxzor has quit [Remote host closed the connection]

20:37 <alyssa> (Would have caught !17430)

20:38 <jekstrand> jenatali: Oof. I'll look now

20:40 <anholt> alyssa: none that I know of.

20:41 <anholt> well, nothing proper. rasterizer->point_tri_clip incidentally tells you gles2+.

20:43 <alyssa> heh, right. that'd be a pretty nasty hack..

20:44 <zmike> I've thought for a while that it would be nice to have a context/screen create flag for such a thing

20:45 <alyssa> I guess I'm of 2 minds

20:57 <anholt> oh, neat. tarceri's fix fixed portal2 on crocus as well.

20:58 <daniels> anholt: does this mean no more crashes?

20:59 <jekstrand> jenatali: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17434 Oops....

21:00 <jenatali> jekstrand: Whoops

21:01 <jekstrand> I can't believe I managed to entirely fry Wayland. :-/

21:02 <jenatali> Eh, happens. I've had my share of similar bugs breaking Windows :P

21:04 <anholt> daniels: I just mean the white rendering.

21:05 <daniels> anholt: ah ok - is portal2.trace still crashy on 630 these days, or was that (now-fixed?) MMU?

21:05 <zmike> it fixed this https://gitlab.freedesktop.org/mesa/mesa/-/issues/6240

21:05 <daniels> jekstrand: so off-message :(

21:15 <jekstrand> daniels: hehe

21:15 <jekstrand> daniels: If you want to spend some free brain cycles, we should test WSI in CI somehow.

21:17 <daniels> jekstrand: I have negative brain cycles this week somehow, but also yes

21:18 <daniels> I'm not volunteering for X11, but it's pretty easy for Wayland since you just whack the compositor in a separate thread and use protocol messages to sync on a semaphore so you can inspect whatever you like from the client side

21:18 <daniels> (may or may not have written this for another EGL stack many years ago)

21:31 icecream95 has joined #dri-devel

21:51 Kayden has joined #dri-devel

21:56 <jekstrand> daniels: Could we get xwayland going inside a headless Weston?

21:56 sarnex has quit [Read error: Connection reset by peer]

21:56 sarnex has joined #dri-devel

21:56 <daniels> jekstrand: yep

21:59 <jekstrand> daniels: Well, if you're interested in wiring something up, this should help quite a bit: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17436

21:59 <jekstrand> jenatali: You may be interested in those too ^^

21:59 <jekstrand> jenatali: Since you're in a WSI review mode. :D

21:59 <jenatali> Yeah, looking

22:01 <daniels> I mean, if you are the compositor, then you can also hobble it to be shm-only

22:02 <jekstrand> daniels: We don't currently auto-detect well

22:02 <jekstrand> daniels: And I'm not sure we do. Most of the time, you want to not support WSI there rather than fall back to a SHM path

22:02 <daniels> yeah, strong agree

22:02 <jekstrand> Like, I think that's what Wayland PRIME would do today.

22:02 <daniels> really?

22:02 <daniels> last I saw it was just blitting to a linear dmabuf

22:02 <jekstrand> Yeah, no one's done any work on Wayland PRIME

22:03 <jekstrand> Nope, not for Wayland.

22:03 <jekstrand> It's not hard to hook up but no one's done it.

22:03 <daniels> I've never personally tested it, but it looked like the code other people had written should be doing that

22:03 <daniels> falling back to shm seems rather worse

22:03 <jekstrand> IMO, the hardest part is just figuring out the WL code to detect when you're on a different GPU and enable the blit path.

22:04 <jekstrand> Apart from that, it's like 5 LOC to add the path

22:05 <daniels> that isn't hard though - we literally send a path to the device that the compositor will be using for GPU imports?

22:05 <jekstrand> Sure.

22:05 <jekstrand> It's just that no one's done the typing.

22:06 <daniels> jekstrand: if you want to do that typing, I'll happily talk you through it, but I literally don't own a multi-GPU machine :P

22:06 <daniels> and I'm not about to go the eGPU route any time soon

22:07 <jenatali> FWIW we find having a software device that pretends to be a GPU incredibly helpful for this kind of stuff

22:07 <jenatali> Something like VGEM or maybe an extended version of that could be helpful for testing these kinds of paths

22:07 <jekstrand> daniels: IDK if I own one either. :)

22:07 <jekstrand> daniels: I did but I moved the RADV card out of my HSW

22:08 <daniels> jekstrand: I'll pop something WSI-ish off your stack in return then :)

22:09 <daniels> jenatali: yeah, dmabufs being software-mappable is super useful there from the compositor side - just need to make swrast clients use vgem to allocate to better test those paths

22:11 <jenatali> We've also got a configurable software GPU driver, where you can "plug in" multiple of them and connect virtual monitors to them, to test all kinds of crazy scenarios, and then at the end of the day you can just read back the displayed output from the other side of the compositor

22:12 <zmike> this seems like it has a lot of overlap with my lavapipe dmabuf wip 🤔

22:12 <daniels> jenatali: yeah, there's the beginnings of work to make vkms be controllable via configfs

22:13 <jenatali> Cool :)

22:13 <alyssa> 1

22:13 alyssa has quit [Quit: whoops]

22:14 <jekstrand> jenatali: Is the RB for the ANV patch too?

22:14 <jenatali> Yep

22:14 <jenatali> Seems straightforward enough

22:15 off^ has joined #dri-devel

22:15 <jekstrand> thanks

22:19 Duke`` has quit [Ping timeout: 480 seconds]

22:27 gouchi has quit [Remote host closed the connection]

22:41 off^ has quit [Ping timeout: 480 seconds]

22:50 alarumbe has joined #dri-devel

22:51 iive has quit []

22:54 srslypascal has quit [Ping timeout: 480 seconds]

22:54 clever has joined #dri-devel

23:05 off^ has joined #dri-devel

23:09 rasterman has quit [Quit: Gettin' stinky!]

23:16 sarnex has quit [Quit: Quit]

23:24 sarnex has joined #dri-devel

23:44 luc4 has joined #dri-devel

23:54 ngcortes has joined #dri-devel