#dri-devel on 2023-05-30 — irc logs at oftc.irclog.whitequark.org

2022-12-21 00:45 ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar

00:15 ultra has quit [Quit: Ultra]

00:15 ultra has joined #dri-devel

00:53 co1umbarius has joined #dri-devel

00:55 columbarius has quit [Ping timeout: 480 seconds]

00:58 sukrutb has quit [Ping timeout: 480 seconds]

01:05 leo60228 has quit [Read error: No route to host]

01:06 leo60228 has joined #dri-devel

01:35 JohnnyonF has quit []

01:51 yuq825 has joined #dri-devel

02:18 <DemiMarie> gfxstrand: is it okay if I send you a private message? I have some rather in-depth questions regarding the future of GPUs and security.

02:23 mwk[m] has joined #dri-devel

02:25 sukrutb has joined #dri-devel

02:32 <DemiMarie> In Qubes OS, for instance, security is the absolute highest priority. Right now, Qubes OS is stuck with software rendering, so even a 2x performance hit over bare metal would still be a huge win. Direct-to-firmware submission means that the GPU firmware could come under fire from some pretty high-level attackers. Mobile GPU firmware has been designed to defend against malicious userspace for quite a while now, but most desktop GPU firmware

02:32 <DemiMarie> has not, and relying on the security of a closed source privileged blob does not give me a warm fuzzy feeling. The more protections between guests and the GPU, the better, so long as they do not have the stupendous attack surface that running the shader compiler on the host entails.

02:33 <DemiMarie> Intel SR-IOV is on by default in Windows Sandbox IIUC, so that should be pretty safe. Apple has always cared a LOT about app sandbox escapes, and while the AGX driver must go off of what alyssa and lina could reverse-engineer, I consider the risk of a vulnerability due to this to be comparable to (if not less than!) the risk of a memory corruption hole in one of the other drivers. AMD and especially Nvidia are quite concerning, though.

02:33 sukrutb has quit [Remote host closed the connection]

03:40 aravind has joined #dri-devel

03:45 TMM has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]

03:45 TMM has joined #dri-devel

03:50 oneforall2 has quit [Remote host closed the connection]

03:50 heat_ has quit [Remote host closed the connection]

03:50 oneforall2 has joined #dri-devel

03:50 heat_ has joined #dri-devel

03:54 ondracka has joined #dri-devel

04:04 ondracka has quit [Quit: Leaving]

04:05 bmodem has joined #dri-devel

04:18 JohnnyonFlame has joined #dri-devel

05:00 kzd has quit [Ping timeout: 480 seconds]

05:04 sgruszka has joined #dri-devel

05:15 itoral has joined #dri-devel

05:20 JohnnyonFlame has quit [Ping timeout: 480 seconds]

05:23 sima has joined #dri-devel

05:27 pallavim has quit [Ping timeout: 480 seconds]

05:51 Duke`` has joined #dri-devel

06:09 tzimmermann has joined #dri-devel

06:14 Company has quit [Quit: Leaving]

06:30 alanc has quit [Remote host closed the connection]

06:31 alanc has joined #dri-devel

06:40 <doras> jenatali: I see. Thanks for letting me know. I wasn't familiar with this guideline.

06:47 <lina> DemiMarie: I already found and reported a shader-to-full-system-control CVE in Apple's GPU stack but that one never affected Linux ^^

06:48 frieder has joined #dri-devel

06:48 <dolphin> DemiMarie: I guess the overall challenge is that until a few years ago GPUs were designed mostly with performance in mind; security is only an addition of the last half a dozen years

06:48 <lina> Given the kernel side is written in Rust I'd be surprised if you can find something to actually exploit from userland directly in the kernel driver other than a DoS (there's plenty of ways to DoS the system anyway...)

06:49 <tjaalton> dcbaker: is mesa 23.0.3 last of the series?

06:49 <lina> And Apple did fix the one hole in their firmware privilege separation I could find...

06:50 <dolphin> If you look at the hardware, there was global address space used by all applications on older hardware, and I think nouveau still doesn't care to zero VRAM passing between different applications.

06:52 <lina> We do zero... the one thing that I'm not so sure about is cross-process tile memory leakage. I don't know whether the shader cores/firmware guarantee tile memory is cleared when switching between VM contexts. We could mitigate that using a shader program prelude in kernel-managed, GPU-RO memory, although it would have some performance cost and complicate the UAPI...

06:55 jfalempe has joined #dri-devel

06:58 sghuge has quit [Remote host closed the connection]

06:58 sghuge has joined #dri-devel

07:02 rauji___ has quit []

07:08 aravind has quit [Ping timeout: 480 seconds]

07:10 mauld has joined #dri-devel

07:14 aravind has joined #dri-devel

07:22 Jeremy_Rand_Talos__ has joined #dri-devel

07:25 Jeremy_Rand_Talos_ has quit [Remote host closed the connection]

07:31 pochu has joined #dri-devel

07:34 rasterman has joined #dri-devel

07:35 <jani> lumag: https://intel-gfx-ci.01.org/ and maybe #intel-gfx-ci or ping DragoonAethis

07:36 <jani> lumag: also https://gitlab.freedesktop.org/gfx-ci/i915-infra/

07:48 <MrCooper> karolherbst_: the end of .gitlab-ci/container/cross_build.sh has a workaround for that

07:57 lynxeye has joined #dri-devel

08:01 pcercuei has joined #dri-devel

08:15 heat_ has quit [Remote host closed the connection]

08:16 heat_ has joined #dri-devel

08:19 vliaskov has joined #dri-devel

09:07 kts has joined #dri-devel

09:11 aravind has quit [Ping timeout: 480 seconds]

09:14 cmichael has joined #dri-devel

09:19 aravind has joined #dri-devel

09:28 sarahwalker has joined #dri-devel

09:44 itoral has quit [Remote host closed the connection]

09:56 <MrCooper> tjaalton: looks like chromium doesn't take the Mesa version into account correctly: https://bugs.chromium.org/p/chromium/issues/detail?id=1442633

10:01 <karolherbst_> MrCooper: it doesn't. That workaround is because the foreign packages would install foreign python. I need native and foreign, but they install both into the same locations

10:01 karolherbst_ is now known as karolherbst

10:02 <MrCooper> gotcha

10:02 <karolherbst> I already use that workaround, but with rustc/bindgen we need both and it's annoying :(

10:32 kts has quit [Quit: Konversation terminated!]

10:38 bmodem1 has joined #dri-devel

10:44 bmodem has quit [Ping timeout: 480 seconds]

10:52 <DragoonAethis> lumag: Re: Patchwork tests, we don't have the setup documented anywhere unfortunately, but the tl;dr is that you can access Patchwork via its API: https://gitlab.freedesktop.org/patchwork-fdo/patchwork-fdo/-/blob/master/docs/rest.rst - from that you pick up events like new series/revs, trigger a pipeline on the CI system of your choice, pull the latest target kernel for a given project, apply mbox (grab it from /api/1.0/series/{id}/

10:52 <DragoonAethis> revisions/{rev}/mbox/), build, run tests, report success/failures over the API for whichever stage failed
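
The flow DragoonAethis describes above can be sketched roughly as follows. This is only a minimal illustration, not the Intel CI setup: the base URL, the helper names and the step list are assumptions made for the example; only the /api/1.0/series/{id}/revisions/{rev}/mbox/ path comes from the message itself.

```python
# Hypothetical sketch of the Patchwork-driven CI flow described above.
# Only the mbox path template is taken from the chat; the base URL and
# the helper names are invented for illustration.

MBOX_PATH = "/api/1.0/series/{series_id}/revisions/{rev}/mbox/"


def mbox_url(base_url: str, series_id: int, rev: int) -> str:
    """Build the URL of the mbox for one revision of a series."""
    return base_url.rstrip("/") + MBOX_PATH.format(series_id=series_id, rev=rev)


def plan_pipeline(base_url: str, series_id: int, rev: int) -> list:
    """Return the ordered steps a CI job might run for a new revision."""
    return [
        ("fetch-kernel", "pull the latest target kernel for the project"),
        ("apply-mbox", mbox_url(base_url, series_id, rev)),
        ("build", "build the patched tree"),
        ("test", "run the test suite"),
        ("report", "report success or the failing stage back over the API"),
    ]
```

Calling plan_pipeline("https://patchwork.example.org", 1234, 2) would yield the five steps in order, with the apply-mbox step carrying the mbox URL for revision 2 of series 1234.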

10:57 <DragoonAethis> On the hardware testing, good starting points are either https://mupuf.org/blog/2021/02/08/setting-up-a-ci-system-preparing-your-test-machine/ (ping mupuf) or https://www.lavasoftware.org/index.html (ARM-friendly) - we have some in-house scripting to make the tests go brrr, but we're moving towards a cleaner setup with b2c et al

10:59 <mupuf> DragoonAethis, lumag: FYI, we now also support arm64 in our infra

11:00 <mupuf> riscv64 has experimental support

11:00 <mupuf> and armv6 should also be supported very soon

11:01 <DragoonAethis> mupuf: Uh, v6, not v7?

11:01 <mupuf> DragoonAethis: if you have a v7 CPU, it will run v6 code

11:02 macromorgan is now known as Guest1741

11:02 macromorgan has joined #dri-devel

11:02 <mupuf> the difference between the two is more related to the NEON instructions, which isn't something that impacts system applications

11:05 <DragoonAethis> Yeah, it's just odd, I remember the last v6 boards that did any sort of graphics being ancient and slow

11:07 <mupuf> oh, right, yeah

11:07 <mupuf> just wanted to go for maximum compatibility

11:08 <mupuf> but b2c is not designed for gfx testing, is it? It is designed for running containers from an initrd as fast as possible

11:08 <mupuf> if people want to use it to test NVME drivers, it's up to them ;)

11:09 Guest1741 has quit [Ping timeout: 480 seconds]

11:09 <DragoonAethis> Well, if you're testing NVMe drivers on ARMv6, you'll have a bad time with CPUs not keeping up either :D

11:09 <DragoonAethis> But sure, I guess Pi Picos etc could work in there

11:09 pochu has quit [Quit: leaving]

11:25 pochu has joined #dri-devel

11:29 macromorgan is now known as Guest1743

11:29 macromorgan has joined #dri-devel

11:31 kts has joined #dri-devel

11:34 macromorgan is now known as Guest1744

11:34 macromorgan has joined #dri-devel

11:36 Guest1743 has quit [Ping timeout: 480 seconds]

11:41 Guest1744 has quit [Ping timeout: 480 seconds]

11:51 bmodem1 has quit [Ping timeout: 480 seconds]

11:58 <mupuf> DragoonAethis: lol, yeah, but that's not the point ;)

11:58 * mupuf wants to be able to test i2c drivers :D

12:07 <lumag> DragoonAethis, thanks for the pointer. I didn't know about the /project/N/events/ API.

12:09 <lumag> mupuf, we mostly default to a custom version of drm-ci via gitlab. But I will take a look, thanks

12:12 <mupuf> lumag: FYI, I started working on a patchwork to gitlab bridge some time ago: https://gitlab.freedesktop.org/mupuf/patchwork-bridge/-/blob/bridge/bridge.py

12:15 <mupuf> the bridge will apply the patches on top of a tree and push that to a branch when a new revision comes

12:15 <mupuf> and it will report back to patchwork when it is done, so that an email can be sent from there

12:15 <mupuf> we can move that to gfx-ci/patchwork-bridge if you want to work on it :)

12:16 <lumag> mupuf, probably we can use gitlab actions to ping patchwork?

12:16 <mupuf> you mean scheduled pipelines? yeah, that's what the project does

12:18 * mupuf would suggest not putting this pipeline in drm-ci though

12:19 <mupuf> it would be best to have one project responsible for the mirroring for multiple projects, rather than having every project replicate the same thing over and over again

12:19 <mupuf> drm-ci already did that by importing mesa CI verbatim, let's not add more :D

12:26 <DragoonAethis> mupuf: I'm working on something a bit weirder right now, because we have A Mess(tm)

12:26 * mupuf is intrigued

12:27 <DragoonAethis> Basically we've got GitHub for all the internal projects and GitLab for some public stuff

12:27 <DragoonAethis> And then Patchwork that talks with mailing lists and shares some GitLab projects

12:28 <DragoonAethis> Internal rules and governance for attaching GitHub runners makes it somewhat painful to work with, we can't host some CI bits on the public GitLab instance either because I'd like to share as much infra as possible between internal and public flows

12:28 <DragoonAethis> And we have 5 people who know Jenkins already and 1 who knows GitHub/Lab CI

12:28 <DragoonAethis> So

12:29 <DragoonAethis> Jenkins <-> Forge Bridge <-> GitHub/GitLab/Patchwork/Gerrit/whatever you want

12:29 <DragoonAethis> Basically an API that keeps track of all the triggers, started pipelines, reporting back results using the appropriate upstream APIs

12:31 <DragoonAethis> And most importantly, it keeps track of what processes are in progress - all the public CI queue stuff is "fake", so to speak

12:31 <DragoonAethis> And we don't really know where each process starts and where it breaks until someone comes and yells at us

12:32 <DragoonAethis> And that's supposed to finally be the one source that just binds it all into sensible queues, what is running and what stopped unexpectedly, etc

12:38 JohnnyonFlame has joined #dri-devel

12:42 <DavidHeidelberg[m]> jljusten: ping, you have nice piglit Debian package, what would you say to push piglit nightlies into experimental?

12:44 <hakzsam> alyssa: gfxstrand: would you be able to review https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23254 please?

12:47 smilessh has joined #dri-devel

12:52 smiles_1111 has quit [Ping timeout: 480 seconds]

12:59 <mupuf> DragoonAethis: good luck...

13:08 MajorBiscuit has joined #dri-devel

13:12 Company has joined #dri-devel

13:13 <tjaalton> MrCooper: you mean comment #15?

13:15 <MrCooper> yeah, also in the Fedora bug report people are hitting the issue after upgrading to a newer upstream version

13:16 <MrCooper> so I'm afraid changing (PACKAGE_)VERSION won't help after all

13:19 <tjaalton> meh

13:20 <MrCooper> I do wonder if Mesa couldn't check whether the cached data is compatible though

13:26 YuGiOhJCJ has quit [Quit: YuGiOhJCJ]

13:39 heat_ has quit [Read error: No route to host]

13:39 heat has joined #dri-devel

13:49 <DragoonAethis> mupuf: Thanks, I'm gonna need it ;-;

13:57 kts has quit [Quit: Konversation terminated!]

14:01 fxkamd has joined #dri-devel

14:04 junaid has joined #dri-devel

14:07 yuq825 has quit []

14:16 Haaninjo has joined #dri-devel

14:18 <alyssa> hakzsam: done to the best of my ability

14:18 <hakzsam> thanks!

14:22 heat_ has joined #dri-devel

14:23 kzd has joined #dri-devel

14:24 heat has quit [Read error: No route to host]

14:26 aravind has quit [Ping timeout: 480 seconds]

14:26 <alyssa> any time

14:26 <alyssa> well,

14:26 <alyssa> most times

14:27 <alyssa> 9am-5pm Mon-Fri, excluding stat holidays illness or planned vacations

14:27 <alyssa> weird express "any time",.

14:27 <alyssa> expression

14:29 <DemiMarie> lina: I strongly recommend adding the shader prelude. If it turns out to be unnecessary then it can just be removed, but it seems that adding it would be a uAPI break and Linus doesn’t like those.

14:31 aravind has joined #dri-devel

14:32 idr has joined #dri-devel

14:48 rauji___ has joined #dri-devel

14:48 <gfxstrand> DemiMarie: I don't think your notion that mobile GPU vendors are doing a good job at security holds up. Just a couple of years ago, someone had a WebGPU app which could reboot your phone.

14:50 <alyssa> gfxstrand: you sure it wasn't a feature

14:50 <alyssa> WebReboot?

14:50 <alyssa> if we're allowed a WebUSB why not expose all the other drivers to arbitrary JS right?

14:51 <DavidHeidelberg[m]> is it somehow implied that we still have a problem with running spec/arb_vertex_attrib_64bit/execution/vs_in/vs-input-uint_uvec4-double_dmat3x4_array2-position.shader_test inside mesa and we need to patch piglit 3 years later (read with spongebob voice) after initial introduction? https://gitlab.freedesktop.org/mesa/mesa/-/commit/576f7b6ea52d39406df119b336396bfa41628726

14:52 <DavidHeidelberg[m]> MrCooper: ^ do you recall how serious it was? :P I guess with flake/fails/skips list we can drop the patch now

15:03 <MrCooper> it may no longer be needed in practice, though conceptually running a slightly different set of tests each time still seems odd

15:05 <lina> DemiMarie: I don't think we should add anything until we understand if there is a problem at all or not. I think the basic mitigation can be done without a UAPI break, you just need UAPI changes to get back some of the efficiency you lost.

15:08 kts has joined #dri-devel

15:13 kts has quit [Quit: Konversation terminated!]

15:14 pallavim has joined #dri-devel

15:14 pochu has quit [Quit: leaving]

15:15 tzimmermann has quit [Quit: Leaving]

15:17 <DemiMarie> lina: fair! Hopefully this can be understood before merge.

15:17 kts has joined #dri-devel

15:18 <DemiMarie> gfxstrand: people are at least looking.

15:18 sgruszka has quit [Remote host closed the connection]

15:21 kts has quit []

15:28 <gfxstrand> DemiMarie: They're looking on desktop, too. In fact, desktop has typically been a decade ahead of mobile when it comes to a lot of basic security stuff like per-context page tables.

15:28 alyssa has left #dri-devel [#dri-devel]

15:28 <a1batross> Hi there! Did anybody start mainlining rk628d bridge?

15:28 <gfxstrand> My point is that the notion that mobile is more secure is a total misnomer.

15:28 Haaninjo has quit [Ping timeout: 480 seconds]

15:30 <DemiMarie> gfxstrand: I was not aware of that, thanks! That might alleviate some of mwk @mwk:invisiblethingslab.com’s concerns.

15:34 frieder has quit [Remote host closed the connection]

15:34 <DavidHeidelberg[m]> MrCooper: slightly different? what does that mean?

15:35 <gfxstrand> That doesn't mean there are no bugs. Of course there are bugs. Some of them may even have security implications. With closed-source firmware our ability to find and fix bugs is indeed limited. But the notion that Mali and Qualcomm have better firmwares than Nvidia or AMD just because they're mobile is just nonsense.

15:36 <MrCooper> DavidHeidelberg[m]: as an example for illustration, say there are 6 tests A B C D E F; each piglit invocation would randomly run a different subset of those, e.g. "B C F" then "A B E"...
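
MrCooper's A..F illustration can be written down as a small sketch. It is only a model of "each run picks a different subset", not how piglit actually selects tests; the function name and the seed parameter are assumptions made for the example.

```python
import random

# Toy model of running a random subset of tests per invocation.
# pick_subset and its seed argument are hypothetical; piglit's real
# selection logic is not shown in the chat.


def pick_subset(tests, count, seed):
    """Pick `count` tests using a seeded generator, returned in sorted order."""
    rng = random.Random(seed)
    return sorted(rng.sample(list(tests), count))


ALL_TESTS = ["A", "B", "C", "D", "E", "F"]

# The same seed always gives the same subset; different seeds may give
# different subsets, which is what makes run-to-run results hard to compare.
run_one = pick_subset(ALL_TESTS, 3, seed=1)
run_one_again = pick_subset(ALL_TESTS, 3, seed=1)
```

Pinning the seed (or deriving it from something stable, such as the build) is one way such a scheme could be made reproducible.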

15:39 <DavidHeidelberg[m]> MrCooper: we could fix that to run everything? ... and then don't use it in CI anyway because of how long it will run :D

15:40 <MrCooper> yeah, not sure how bad it would be

15:41 TMM has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]

15:41 TMM has joined #dri-devel

15:42 <MrCooper> oh, and actually I don't remember if the subset changes per invocation, or just per build of piglit

15:50 fab has joined #dri-devel

15:57 fab has quit [Quit: fab]

15:57 fab has joined #dri-devel

16:00 <robclark> gfxstrand, DemiMarie: most of the security issues I see getting reported thru bug bounty program are just kernel issues (race conditions where userspace can trigger UAF, etc.. handles are horrible).. but I'd probably not be super excited about userspace cmd submission from vm guest directly to gpu fw

16:04 heat_ has quit [Read error: No route to host]

16:04 heat_ has joined #dri-devel

16:08 aravind has quit [Ping timeout: 480 seconds]

16:09 junaid has quit [Remote host closed the connection]

16:11 djbw has joined #dri-devel

16:17 <gfxstrand> dschuermann, cwabbott: Any suggestions for papers to read about SSA spilling?

16:17 <gfxstrand> I found the Braun and Hack paper that ACO cites. I guess I'll start by reading that one.

16:19 <gfxstrand> Looks like ACO and ir3 both cite that paper. I guess that's the one, then.

16:23 cmichael has quit [Quit: Leaving]

16:24 agd5f_ has quit []

16:25 agd5f has joined #dri-devel

16:27 djbw has quit [Read error: Connection reset by peer]

16:28 <DemiMarie> robclark: exactly!

16:29 Major_Biscuit has joined #dri-devel

16:31 MajorBiscuit has quit [Ping timeout: 480 seconds]

16:38 djbw has joined #dri-devel

16:38 benjaminl has joined #dri-devel

16:38 sarahwalker has quit [Remote host closed the connection]

16:39 <cwabbott> gfxstrand: yes, that's the one

16:41 <cwabbott> at least for once there's actually a paper to point to that more-or-less describes the algorithm you actually need to use and isn't totally impractical

16:50 lynxeye has quit [Quit: Leaving.]

16:50 <jenatali> alyssa: Ping for !23173

17:10 sukrutb has joined #dri-devel

17:21 Major_Biscuit has quit [Ping timeout: 480 seconds]

17:24 eukara has joined #dri-devel

17:43 <dcbaker> tjaalton: I've cut one last release just now, so 23.0.4 will be the last. Sorry for being *so* late with it.

17:44 <tjaalton> dcbaker: okay, thanks!

17:45 alyssa has joined #dri-devel

17:45 <alyssa> anholt_: Is nir-to-tgsi expected to handle instructions where multiple sources are indirect?

17:46 <alyssa> It doesn't seem to work, since addr_reg[0] gets used twice (with the second index clobbering the first), but maybe this was impossible to happen before I went playing with nir_reg

17:46 AndrewR has joined #dri-devel

17:47 <eric_engestrom> dcbaker: do you feel like taking on 23.2?

17:47 <dcbaker> yeah, I can take on 23.2

17:47 <eric_engestrom> (no worries if you want to skip that one)

17:48 <eric_engestrom> dcbaker: see !23205 for the 23.2 schedule

17:53 <tjaalton> +1 for early august release :)

17:53 <tjaalton> early-to-mid

17:57 <robclark> alyssa: the hw that I'm familiar with that has indirect reg access only has a single address register.. so you can't have an instruction that uses two different addr reg values

17:57 <alyssa> robclark: Hmm, ok

17:58 <alyssa> Well, this pass is mostly for ir3's benefit so I can avoid this at a NIR level

17:58 <alyssa> Maybe I botched locals_to_reg and there's a reason it didn't do this before

17:59 <alyssa> Oh.

17:59 <alyssa> The real reason is that locals_to_reg generates moves and doesn't coalesce them, whereas my new thing can coalesce the moves..

17:59 <alyssa> That'd do it, I guess
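
The single-address-register constraint discussed above can be illustrated with a toy legalizer. This is a sketch under stated assumptions, not nir-to-tgsi or nir_lower_locals_to_reg code: the Src type, the legalize function and the tuple-based instruction format are all invented for the example. It shows why uncoalesced moves avoid the problem: every indirect source after the first is copied to a temporary first, so no instruction needs two address values at once.

```python
from dataclasses import dataclass
from typing import Optional

# Toy model: an instruction is (opcode, dest, sources). A source with a
# non-None `index` is an indirect access and needs the (single) address
# register. All names here are hypothetical.


@dataclass(frozen=True)
class Src:
    reg: str
    index: Optional[str] = None  # value holding the indirect index, if any


def legalize(op, dest, srcs):
    """Split an instruction so each emitted one has at most one indirect source."""
    emitted = []
    new_srcs = []
    seen_indirect = False
    tmp_count = 0
    for src in srcs:
        if src.index is not None and seen_indirect:
            # A second indirect source would clobber the address register,
            # so read it through a separate mov into a temporary.
            tmp = f"tmp{tmp_count}"
            tmp_count += 1
            emitted.append(("mov", tmp, (src,)))
            new_srcs.append(Src(tmp))
        else:
            seen_indirect = seen_indirect or src.index is not None
            new_srcs.append(src)
    emitted.append((op, dest, tuple(new_srcs)))
    return emitted
```

For example, legalize("add", "r0", [Src("a", "i"), Src("b", "j")]) emits a mov of b[j] into tmp0 followed by an add that reads a[i] and tmp0.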

18:00 <alyssa> robclark: How competent is ir3's backend copyprop of array registers, btw?

18:00 <robclark> you can throw extra mov's at the frontend, no prob

18:00 <alyssa> sounds good

18:01 <robclark> (also, in what case is ir3 involved with anything that uses ntt?)

18:01 <alyssa> I have helpers to chase movs at NIR->backend time, but they're really geared for "legacy" backends ... If ir3 can just eat the moves in the frontend it's probably cleaner

18:01 <alyssa> (It's not. I'm ripping out nir_register which means I have to worry about every backend.)

18:02 <robclark> ahh

18:02 <alyssa> (Today I'm porting nir_lower_locals_to_reg to the new intrinsics and debugging fallout on softpipe+nir-to-tgsi since)

18:07 <eric_engestrom> dcbaker: don't forget to post the 23.0.x on the website too :)

18:09 <eric_engestrom> (`./post_release.py 23.0.1` and adjust the date to when it actually was)

18:15 <dcbaker> thanks for the reminder! I always forget about the website :(

18:26 oneforall2 has quit [Remote host closed the connection]

18:29 Haaninjo has joined #dri-devel

18:30 oneforall2 has joined #dri-devel

18:33 Danct12 has joined #dri-devel

18:36 jewins has joined #dri-devel

18:36 Daanct12 has quit [Ping timeout: 480 seconds]

19:00 junaid has joined #dri-devel

19:02 <DemiMarie> robclark: the best solution would be for the GPU firmware to be open source and subject to public security review. Ideally formal methods would be used to prove at least the absence of runtime errors, such as memory corruption or undefined behavior.

19:03 <airlied> https://arxiv.org/abs/2305.12784 even the hw doesn't like you :-P

19:06 <DemiMarie> IMO data-dependent power consumption is as much a vulnerability as data-dependent timing.

19:07 <DemiMarie> Or DVFS needs to be disabled.

19:08 <DemiMarie> If DVFS is enabled and data-dependent power consumption is present, there is a security problem unless the DVFS is based on something (like instruction counts) that is data independent.

19:15 <alyssa> airlied: interesting paper

19:15 JohnnyonF has joined #dri-devel

19:21 JohnnyonFlame has quit [Ping timeout: 480 seconds]

19:33 cheako has joined #dri-devel

19:39 Leopold_ has quit [Remote host closed the connection]

19:41 Leopold has joined #dri-devel

19:41 BobBeck9 has quit []

19:41 BobBeck has joined #dri-devel

19:47 benjaminl has quit [Ping timeout: 480 seconds]

19:48 heat_ has quit [Remote host closed the connection]

19:48 heat_ has joined #dri-devel

20:05 junaid has quit [Remote host closed the connection]

20:22 Lyude has quit [Quit: Bouncer restarting]

20:23 <karolherbst> is there a way to make the kernel build system not append silly things like "-dirty" to my kernel version?

20:28 Lyude has joined #dri-devel

20:28 <alyssa> karolherbst: yes

20:29 <alyssa> CONFIG_LOCALVERSION_AUTO

20:29 <karolherbst> so I set that to y and it's doing something sane?

20:31 <alyssa> karolherbst: no, unset it

20:31 <alyssa> that's the CONFIG_APPEND_SILLY_THINGS option

20:32 <karolherbst> well.. it's not set for me

20:32 <alyssa> uhhh

20:33 <karolherbst> maybe it's something inside installkernel doing it.. dunno

20:36 Duke`` has quit [Ping timeout: 480 seconds]

20:47 JohnnyonFlame has joined #dri-devel

20:47 benjaminl has joined #dri-devel

20:54 JohnnyonF has quit [Ping timeout: 480 seconds]

21:10 heat_ has quit [Remote host closed the connection]

21:11 heat has joined #dri-devel

21:15 macromorgan has quit [Read error: Connection reset by peer]

21:15 macromorgan has joined #dri-devel

21:16 macromorgan is now known as Guest1777

21:16 macromorgan has joined #dri-devel

21:18 eloy_ has quit [Ping timeout: 480 seconds]

21:18 benjamin1 has joined #dri-devel

21:23 eloy_ has joined #dri-devel

21:24 Guest1777 has quit [Ping timeout: 480 seconds]

21:24 benjaminl has quit [Ping timeout: 480 seconds]

21:35 sima has quit [Ping timeout: 480 seconds]

22:07 rauji___ has quit []

22:13 <jenatali> How upset would people be if I added a nir option to accept mediump in the backend? I.e. to disable the late opts that turn it into actual 16bit converts

22:13 <alyssa> jenatali: define "mediump"

22:13 <alyssa> x2ymp conversions?

22:13 <alyssa> or something else?

22:13 <jenatali> Yeah

22:14 <alyssa> meh

22:14 <jenatali> Just the conversions

22:14 <alyssa> I really want to kill off nir_shader_compiler_options but failing that, .. yeah

22:14 <alyssa> if nobody has deleted options->intel_vec4 after all this time

22:14 <alyssa> disabling f2fmp lowering is mild.

22:14 <jenatali> Unless you want to move it into args to the algebraic opt directly, I'd be fine with that too

22:14 <alyssa> yeah that's what I'm waffling about

22:15 <alyssa> The way nir_opt_algebraic was supposed to work is that all the drivers would have their own passes

22:15 <alyssa> but.. whoops

22:15 <jenatali> Basically, DXIL has a bit which says whether 16bit types should be min-precision or native, and if the only thing that produces 16-bit types are f2fmp then I don't want to set the native bit

22:15 <alyssa> sure

22:16 <alyssa> I think there's a possible future where nir_opt_algebraic.py just generates lists of rules based on an options dict

22:17 <alyssa> and then each driver would define their own backend_nir_opt_algebraic passes inheriting those rules (setting the options they need and also appending their own rules)
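
The "possible future" alyssa sketches above might look something like the following. This is a daydream-level illustration only: the function names, the option key and the rule tuples are assumptions for the example and are not the current nir_opt_algebraic.py interface.

```python
# Hypothetical sketch: a shared rule list generated from an options dict,
# with each driver appending its own rules. Names and rule contents are
# illustrative only.


def build_common_rules(options):
    """Return the shared (search, replace) rules enabled by `options`."""
    rules = [
        # An always-on example rule: a + 0 -> a for integers.
        (("iadd", "a", 0), "a"),
    ]
    # A backend that wants to see mediump conversions itself (as jenatali
    # describes for DXIL) would turn this option off.
    if options.get("lower_mediump_conversions", True):
        rules.append((("f2fmp", "a"), ("f2f16", "a")))
    return rules


def build_driver_rules(options, extra_rules=()):
    """Common rules for these options, followed by driver-specific rules."""
    return build_common_rules(options) + list(extra_rules)


# A made-up backend that keeps f2fmp and adds one rule of its own.
example_backend_rules = build_driver_rules(
    {"lower_mediump_conversions": False},
    extra_rules=[(("imul", "a", 1), "a")],
)
```

Because the options are consumed when the rule list is generated, nothing is checked at runtime, which is the contrast with nir_shader_compiler_options that alyssa points out just below.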

22:17 <jenatali> I like the sound of that, though there's some benefit to be had by making a large rule list instead of a bunch of independent passes

22:17 <karolherbst> soo.. like the nir_shader_compiler_options stuff?

22:17 <alyssa> karolherbst: nir_shader_compiler_options is checked at runtime

22:18 <karolherbst> right...

22:18 <alyssa> jenatali: nir_opt_algebraic.py would realistically still be a big rules list

22:18 <jenatali> And I'd probably want some way to keep some options at runtime

22:21 <karolherbst> mhhh

22:23 <alyssa> anyway this is just kinda me daydreaming

22:24 <alyssa> because my actual big thing right now is torching nir_register

22:24 <karolherbst> good luck

22:24 <jenatali> Looking forward to it

22:24 <alyssa> which is the NIR equivalent of rolling a boulder uphill

22:24 <alyssa> pretty close to the point where it becomes blocked on "convert 15 drivers" instead of plumbing

22:24 <jenatali> That analogy implies someone is going to bring it back

22:24 <karolherbst> I can rely on you pinging me on more stuff I have to fix in codegen to make that reality?

22:24 <alyssa> jenatali: No, it implies it's liable to roll down, smack me in the face, and make me roll down with it like in the cartoons

22:25 <alyssa> karolherbst: mais oui

22:25 <jenatali> :P

22:26 <karolherbst> I wonder if I'll really figure out a way to move all the stupid lowering post SSA or if I just torch codegen once NAK is done

22:26 <karolherbst> though it's probably fine.. most of those passes are SSA compatible already

22:27 <karolherbst> and some are obsolete

22:27 pcercuei has quit [Quit: dodo]

22:31 <alyssa> jenatali: the good news is that, like in cartoons, it's totally harmless to me

22:31 <jenatali> Heh

22:31 <DemiMarie> How does Mesa handle running out of memory?

22:32 <karolherbst> we try to handle malloc fails, but it's all pointless as malloc doesn't fail anyway

22:33 Leopold___ has joined #dri-devel

22:34 <robclark> same way everything else does... crash!

22:34 <karolherbst> well.. you wanna kill processes on high memory pressure

22:34 <karolherbst> but yeah...

22:34 <karolherbst> the way things work on linux is, that... you can't handle geoing out of memory anyway

22:34 <karolherbst> *going

22:34 <robclark> "cooperative-low-memory-killing"

22:37 <zf> unless of course you're running out of something other than physical memory, or you're not on linux :-)

22:37 <alyssa> karolherbst: speaking of my NIR reworks, I think https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23208 needs a marge assign

22:37 <karolherbst> ohh right.. I wanted to test that first

22:38 benjaminl has joined #dri-devel

22:38 <Hi-Angel> dcbaker: I don't see in 23.0.4 release https://docs.mesa3d.org/relnotes/23.0.4.html this commit https://gitlab.freedesktop.org/mesa/mesa/-/commit/275cf62e20f9b42d69dea146e41589bc205799d0 ☹

22:39 <Hi-Angel> Ohhh

22:39 <Hi-Angel> I'm stupid

22:40 <Hi-Angel> Never mind, I'm just still waiting for the 22.3.x release

22:40 <Hi-Angel> I thought that was it, sorry

22:40 Leopold has quit [Ping timeout: 480 seconds]

22:41 <Hi-Angel> My friend is just on Fedora, which doesn't update to the next major Mesa release yet, so I'm waiting for the previous bugfix release to tell my friend she can update her system

22:43 <Hi-Angel> So, anyway, sorry for the confusion

22:43 <robclark> hmm, fedora is on 23.0.3

22:44 <Lynne> speaking of OOM, I wish linux had a mode to make malloc fail

22:44 <Hi-Angel> Oh, then wait

22:44 <Hi-Angel> I am right

22:44 <karolherbst> Lynne: you can disable overcommitting
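
For reference, the overcommit policy karolherbst mentions is exposed on Linux through /proc/sys/vm/overcommit_memory. The helpers below are a small sketch for inspecting it; the function names are made up for the example, and the reader simply returns None where the file does not exist (for instance on non-Linux systems).

```python
# Sketch for inspecting the Linux overcommit policy. The helper names are
# hypothetical; the three mode values are the ones the kernel defines for
# vm.overcommit_memory.

OVERCOMMIT_MODES = {
    0: "heuristic overcommit (default)",
    1: "always overcommit",
    2: "strict accounting, no overcommit",
}


def describe_overcommit(mode):
    """Map a vm.overcommit_memory value to a short description."""
    return OVERCOMMIT_MODES.get(mode, "unknown mode")


def read_overcommit_mode(path="/proc/sys/vm/overcommit_memory"):
    """Return the current mode as an int, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None
```

Only in mode 2 do allocations fail up front once the commit limit is reached, which is the case where malloc error handling stops being purely cosmetic.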

22:44 <Lynne> that would make all the effort we spend on ffmpeg surviving that worth it

22:45 <Hi-Angel> dcbaker: yeah, right, the previous 23.0.3 release was a month ago, and the 23.0.4 from yesterday didn't include the commit :c

22:45 benjamin1 has quit [Ping timeout: 480 seconds]

22:45 <airlied> Hi-Angel: was considering pushing 23.1.1 into f38

22:45 <karolherbst> Lynne: thing is.. sometimes you overcommit on purpose

22:45 <Hi-Angel> airlied: awww, nice!

22:46 <karolherbst> Lynne: there are also issues in regards to fork()

22:46 <zf> there is also ulimit -v

22:46 <dcbaker> hi-angel: the release was ready last week, but I got busy and didn't make it. Since it was fully validated I made it today as-is. If I pulled more patches today I'd restart the validation cycle and it would be a couple days till the release happened, when there were already a number of really important fixes queued

22:47 <dcbaker> I can make a 23.0.5 if there's critical stuff that needs to go in and cut one more release

22:47 <Hi-Angel> I see

22:47 <Hi-Angel> No, it's not critical, basically I don't think it affects anyone but one user…

22:48 <Hi-Angel> Alrighty then… will be waiting for 23.1.1 to get into f38 C:

22:49 <karolherbst> anyway... all of the malloc error handling is purely cosmetic

22:51 <karolherbst> speaking of overcommitting, Kayden: the SVM work on iris is kinda cursed, because I don't think people will like it if the driver unconditionally allocates like multiple TB of virtual memory... so maybe frontends need to opt in on a screen level or something...

22:52 <karolherbst> I kinda wished all of that wouldn't be so painful to implement

22:53 <Kayden> yeah, that sounds kinda awful

22:54 <karolherbst> maybe I don't use malloc and do some mmap magic in svmAlloc instead...

22:54 <karolherbst> but uhh...

22:54 <Lynne> karolherbst: I meant only when it makes sense, not always when running out of memory

22:54 <karolherbst> "when it makes sense"?

22:54 <Kayden> TBH I'm not really sure why you need to do that

22:55 <karolherbst> Kayden: well.. the thing is, I have to allocate memory which has the identical address on the GPU and CPU side

22:55 <Kayden> there are really only a couple places where we have specific buffers that have to be in certain places

22:55 <Lynne> yeah, when the kernel decides to OOM you

22:55 <karolherbst> and I can't really fail it

22:55 <Lynne> gives you half a chance of at least closing down

22:55 <karolherbst> I can probably call malloc until I hit a hole in the GPUs vm

22:56 <Kayden> you could probably just MAP_FIXED on the CPU side the spot for our binding tables and so on

22:56 <karolherbst> or I call mmap and use a good starting address

22:56 <Kayden> to make sure that nothing gets malloc'd there

22:56 <karolherbst> yeah.. I do all of that

22:56 <karolherbst> _but_

22:56 <karolherbst> _OTHER is the annoying heap

22:56 <Kayden> the one with no restrictions of any kind is the annoying one? :)

22:56 <karolherbst> yeah

22:56 <karolherbst> because

22:57 <karolherbst> malloc shouldn't allocate memory at the same location

22:58 <Kayden> it sounds like we just need to make our GPU allocations visible on the CPU side

22:58 <karolherbst> right

22:58 <Kayden> at one point anv (well, hasvk) used memfds and userptr for allocations

22:59 <Kayden> couldn't we just mmap nothing to get a CPU address and then use that as the GPU address?

22:59 <Kayden> and then just tell util_vma that there's a BO at this address, instead of letting it pick arbitrarily
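
Kayden's idea above (get a CPU address from mmap, then record that same address in the GPU address-space allocator) can be modelled in a few lines. This is a toy sketch, not iris or util_vma code: ToyVmaHeap, reserve_fixed and reserve_cpu_range are hypothetical names, and the heap only tracks ranges rather than programming any GPU page tables.

```python
import ctypes
import mmap

# Toy model of "use the CPU mapping's address as the GPU address".
# ToyVmaHeap is a stand-in for a GPU VMA allocator and is NOT Mesa's
# util_vma API; it only records which address ranges are taken.


def reserve_cpu_range(size):
    """Create an anonymous mapping and return (mapping, its CPU address)."""
    mapping = mmap.mmap(-1, size)
    address = ctypes.addressof(ctypes.c_char.from_buffer(mapping))
    return mapping, address


class ToyVmaHeap:
    def __init__(self, start, size):
        self.start = start
        self.end = start + size
        self.used = []  # sorted list of (start, end) ranges

    def reserve_fixed(self, address, size):
        """Mark [address, address + size) as occupied, or raise ValueError."""
        end = address + size
        if address < self.start or end > self.end:
            raise ValueError("range outside heap")
        for used_start, used_end in self.used:
            if address < used_end and used_start < end:
                raise ValueError("range overlaps an existing reservation")
        self.used.append((address, end))
        self.used.sort()
        return address
```

A caller would do mapping, addr = reserve_cpu_range(size) and then heap.reserve_fixed(addr, size), so the buffer object ends up at an address that is valid and identical on both sides; keeping the mapping alive is what stops malloc from handing out the same range.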

22:59 <karolherbst> yeah.. that's kinda my alternative idea

23:00 <karolherbst> I'd still like to make that opt in, so we don't mmap for nothing

23:00 <Kayden> yeah, it'd be nice to know if SVM is required

23:00 <Kayden> not sure how expensive that'd be, honestly

23:00 <Kayden> it might not be too bad... it's probably not free though

23:01 <karolherbst> ehh.. as long as it's all done on allocation it's fine

23:01 <Kayden> suballocator and BO caches both do mitigate that, yeah.

23:02 <karolherbst> and then I can still just reuse malloc's return value, because every bo allocation cuts a hole via mmap.. it's a lot of mmaps, but... that's probably better than sizing OTHER to e.g. 1TB and cut such a hole

23:03 <karolherbst> it's still expensive, but oh well...

23:03 <karolherbst> mhhhh

23:04 <karolherbst> yeah no... I mean.. maybe I dig a bit on how intel's stack is doing that, but when I looked into it, it's really hard to follow

23:05 <karolherbst> though I don't think they bother with all those different heaps in the first place

23:12 rasterman has quit [Quit: Gettin' stinky!]

23:14 <karolherbst> Kayden: actually... I think I could really just skip malloc, it's a bit cursed, but it might not be too bad... so if we back bos with mmap, and I use mmap instead of malloc, I could e.g. say mmap to find the first gap at a starting address

23:14 <karolherbst> fragmentation would be a problem, but...

23:15 <karolherbst> anyway.. I'd check what other drivers do first before I settle on any approach

23:18 Danct12 has quit [Ping timeout: 480 seconds]

23:22 vliaskov has quit [Remote host closed the connection]

23:47 Haaninjo has quit [Quit: Ex-Chat]

23:54 ngcortes has joined #dri-devel

23:54 rsalvaterra has quit [Read error: Connection reset by peer]