#dri-devel on 2023-03-10 — irc logs at oftc.irclog.whitequark.org

2022-12-21 00:45 ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar

00:03 Kayden has joined #dri-devel

00:13 ybogdano is now known as Guest7217

00:13 Guest7126 is now known as ybogdano

00:31 krushia has quit [Remote host closed the connection]

00:37 <DavidHeidelberg[m]> anholt_: CI passed, time to merge new mesa-swrast machines?

00:37 <gfxstrand> airlied: RE: Vulkan Beta...

00:38 ngcortes has joined #dri-devel

00:38 <gfxstrand> airlied: Yeah, part of the XML parser changes was to unify the skipping of provisional stuff.

00:38 <gfxstrand> I didn't realize we had video in the tree at that time.

00:38 <gfxstrand> (Might be good for CI to build-test that....)

00:39 <gfxstrand> airlied: IDK what the right thing to do is. I kinda want to keep as much beta stuff disabled by default as we can. That way we're guaranteed that we never ship beta stuff accidentally.

00:39 <gfxstrand> But, also, I don't want to make the build system a mess by passing an extra flag into every single python generator.

00:39 <airlied> gfxstrand: not sure we can build stuff without the define at all

00:40 <airlied> so I don't see how we could accidentally ship stuff

00:40 <gfxstrand> Yeah

00:40 <gfxstrand> Let me see what breaks when I drop the check

00:40 nchery has quit [Remote host closed the connection]

00:40 nchery has joined #dri-devel

00:40 <airlied> like I'd wire it up if I had a clue how, but my python brain wasn't full across it

00:41 <airlied> I did consider a ci job for it, but it seemed like overkill until I add drivers code in the tree that I care about under beta

00:41 <airlied> thus far I've avoided merging beta encoding for radv

00:41 <gfxstrand> that's okay

00:41 <airlied> since I think the spec is a bit too churny, also doing it in CI I felt was unfairly pushing maintaining beta builds onto others when it might not be trivial

00:42 <airlied> every header update would require radv changes in the beta code which might be very non-trivial

00:49 JohnnyonFlame has joined #dri-devel

00:53 Kayden has quit [Quit: leave office]

01:02 stuarts has quit []

01:12 krushia has joined #dri-devel

01:15 columbarius has joined #dri-devel

01:17 co1umbarius has quit [Ping timeout: 480 seconds]

01:18 kzd has joined #dri-devel

01:20 nchery has quit [Quit: Leaving]

01:29 kzd has quit [Quit: kzd]

01:32 kzd has joined #dri-devel

01:39 Guest7217 has quit [Ping timeout: 480 seconds]

01:44 idr has quit [Ping timeout: 480 seconds]

01:49 appusony has joined #dri-devel

01:52 ngcortes has quit [Remote host closed the connection]

01:52 ngcortes has joined #dri-devel

01:53 kzd has quit [Quit: kzd]

01:58 windleaves has joined #dri-devel

02:03 heat has quit [Read error: No route to host]

02:03 heat has joined #dri-devel

02:04 wind has quit [Ping timeout: 480 seconds]

02:21 khfeng_ has quit [Ping timeout: 480 seconds]

02:32 alyssa has joined #dri-devel

02:33 <alyssa> my MR is waiting for sanity

02:33 <alyssa> oddly poetic

02:36 khfeng_ has joined #dri-devel

02:36 FireBurn has quit [Ping timeout: 480 seconds]

02:41 ybogdano is now known as Guest7231

02:55 ngcortes has quit [Read error: Connection reset by peer]

02:56 khfeng has joined #dri-devel

02:59 khfeng_ has quit [Ping timeout: 480 seconds]

03:06 khfeng has quit [Remote host closed the connection]

03:06 khfeng has joined #dri-devel

03:14 kzd has joined #dri-devel

03:38 khfeng has quit [Ping timeout: 480 seconds]

03:52 rsalvaterra_ has joined #dri-devel

03:52 rsalvaterra is now known as Guest7237

03:52 rsalvaterra_ is now known as rsalvaterra

03:56 Guest7238 has quit [Ping timeout: 480 seconds]

03:57 Leopold__ has joined #dri-devel

03:57 bmodem has joined #dri-devel

03:58 Leopold has quit [Ping timeout: 480 seconds]

04:03 Zopolis4 has joined #dri-devel

04:16 smiles has quit [Ping timeout: 480 seconds]

04:26 <alyssa> eric_engestrom: oh wow enough banging my head on things later, I think I got clang-format ci working

04:27 <alyssa> bruteforce works!

04:49 heat has quit [Ping timeout: 480 seconds]

05:03 alyssa has quit [Quit: leaving]

05:30 bmodem1 has joined #dri-devel

05:32 bmodem1 has quit []

05:36 bmodem has quit [Ping timeout: 480 seconds]

05:39 bmodem has joined #dri-devel

05:44 Company has quit [Read error: Connection reset by peer]

06:01 fxkamd has quit []

06:19 Kayden has joined #dri-devel

06:27 bgs has joined #dri-devel

06:31 khfeng has joined #dri-devel

06:40 aravind has joined #dri-devel

06:41 kts has joined #dri-devel

06:43 fab has joined #dri-devel

06:55 kts has quit [Quit: Konversation terminated!]

07:09 dn^ has joined #dri-devel

07:24 ahajda has joined #dri-devel

07:26 JohnnyonFlame has quit [Read error: Connection reset by peer]

07:31 danvet has joined #dri-devel

07:34 <eric_engestrom> alyssa: awesome! I'll review it now :)

07:38 tzimmermann has joined #dri-devel

07:42 fab has quit [Quit: fab]

07:45 agd5f_ has joined #dri-devel

07:50 kts has joined #dri-devel

07:52 agd5f has quit [Ping timeout: 480 seconds]

07:55 robobub has quit []

07:55 kzd has quit [Quit: kzd]

07:58 sghuge has quit [Remote host closed the connection]

07:58 sghuge has joined #dri-devel

08:04 YuGiOhJCJ has joined #dri-devel

08:07 jljusten has quit [Ping timeout: 480 seconds]

08:12 hansg has joined #dri-devel

08:18 fab has joined #dri-devel

08:18 tursulin has joined #dri-devel

08:20 <bbrezillon> danylo, danvet: Hi. I've been looking in more detail at the nouveau VM_BIND stuff, and there's still one thing that's not clear to me. Is drm_sched_backend_ops::run_job() considered to be part of the signaling path? IOW, can we allocate from this callback (most/all? drivers seem to allocate their fence object from there with GFP_KERNEL flags), and if we can allocate these fences with

08:20 <bbrezillon> GFP_KERNEL, what makes things different for page table allocations?

08:20 <danvet> yeah those are all driver bugs unfortunately

08:20 <danvet> I thought amd reworked to fix this, but the rework never got out of amd into other drivers

08:20 <bbrezillon> ok, so it's indeed considered to be in the signaling path

08:20 <danvet> yeah

08:21 <bbrezillon> is this documented?

08:21 <danvet> no

08:21 <danvet> I did a patch to add signalling annotations and broke everything

08:21 <danvet> so I think best to do is to resurrect that, but add a knob to disable it per-driver

08:21 vliaskov has joined #dri-devel

08:21 <danvet> and then set that knob for all current drivers except amdgpu

08:22 <danvet> https://lore.kernel.org/dri-devel/20200604081224.863494-10-daniel.vetter@ffwll.ch/

08:22 <danvet> https://lore.kernel.org/dri-devel/20200604081224.863494-15-daniel.vetter@ffwll.ch/

08:23 <danvet> I think I've discussed this with thomas hellstrom or mlankhorst or mbrost in the past too

08:24 <danvet> opt-out might need to be per-callback even since iirc amdgpu still has issues with console_lock in the tdr path

08:25 <javierm> danvet, bbrezillon: I believe that topic was also discussed by Christian and Lina in the thread about the agx driver

08:25 <danvet> yeah I need to catch up on that

08:25 <MrCooper> DemiMarie: "kernels from distros like RHEL are often out of date" do not let the base kernel version fool you, most of the code is pretty close to upstream (in RHEL 8 & 9 ATM, older versions are more or less frozen at this point)

08:25 <javierm> danvet: I read everything but have to admit that I only understand half of it :)

08:25 <danvet> I also need to come up with some idea for the annotations in the display helpers

08:26 <danvet> because they fire in a bunch of common but unfixable cases for something that not really anyone is using ...

08:26 <danvet> javierm, yeah that's the usual end result when talking about memory reclaim

08:26 <danvet> utter confusion by the people who got lost halfway through

08:27 <danvet> and terminally sad faces by the leftover few :-/

08:27 <javierm> I see... is not only me then

08:27 <danvet> someone I pointed at the plumbers discussion with könig, gfxstrand and me from 2 years ago summarized it with "I don't really understand, but your facial expressions really worried me"

08:28 <bbrezillon> Unless I'm missing something, for simple drivers (panfrost, etnaviv, ...), it's mostly a matter of pre-allocating the fence object at submit time, and filling it at run_job() time, so no allocation happens in there

08:28 jljusten has joined #dri-devel

08:29 <bbrezillon> but no matter how we solve it, it should probably be added to the drm_sched_backend_ops::

08:29 <bbrezillon> doc

08:29 jkrzyszt has joined #dri-devel

08:29 <danvet> bbrezillon, yup

08:30 <danvet> (on both actually)

08:30 <danvet> hm if amdgpu switched to shadow buffer then the console lock thing is also sorted now, and they should work with the full annotations

08:30 <bbrezillon> are there other callbacks that are supposed to run in the signaling path?

08:31 <danvet> all of them actually :-)

08:31 <danvet> but run_job and timedout_job are the ones where people usually get it wrong

08:32 <danvet> bbrezillon, this is why my first patch annotates the entire scheduler thread

08:33 <bbrezillon> yeah, but I'll probably leave that to you. Just wanted to update the docs, and fix panfrost :P

08:34 <danvet> well with the annotations you get much better testing generally

08:34 <bbrezillon> unless you want to update the docs along with the annotation

08:34 <bbrezillon> sure, I'm not claiming we shouldn't add the annotation, just saying it's worth documenting it too

08:35 ice9 has joined #dri-devel

08:35 <bbrezillon> because the current situation is, we don't have the annotation, and people keep doing the same mistake

08:38 <bbrezillon> actually, this came up while I was discussing the page table pre-allocation stuff with robmur01 on #panfrost, and he rightfully pointed me to all those drivers allocating stuff in the run_job() path, and this is where I got confused, not knowing if this was part of the signaling critical section and all drivers were buggy, or if it was actually allowed to allocate from there

08:42 kts has quit [Quit: Konversation terminated!]

08:43 <danvet> bbrezillon, yeah probably the docs are a good first step, you're volunteering?

08:44 <danvet> maybe in the main struct text a link to the dma-fence signalling doc section for further details, and then a one-line in each callback that they're signalling critical sections and therefore very limited in what they can do, specifically can't allocate memory with GFP_KERNEL

08:45 <danvet> and then commit message with the note that sadly most drivers get this wrong

08:47 Lightsword_ has left #dri-devel [#dri-devel]

08:48 Lightsword has joined #dri-devel

08:49 <bbrezillon> danvet: yep, I can do that, and fix panfrost along the way

08:49 <danvet> thx

08:50 <danvet> https://dri.freedesktop.org/docs/drm/driver-api/dma-buf.html?highlight=dma_buf#dma-fence-cross-driver-contract and the section right below probably what you want to link to

08:51 rasterman has joined #dri-devel

08:56 dn^ has quit [Remote host closed the connection]

08:57 agd5f has joined #dri-devel

09:01 lynxeye has joined #dri-devel

09:02 agd5f_ has quit [Ping timeout: 480 seconds]

09:06 fab has quit [Quit: fab]

09:07 fab has joined #dri-devel

09:08 apinheiro has joined #dri-devel

09:15 smiles has joined #dri-devel

09:19 fab has quit [Remote host closed the connection]

09:24 Ahuj has joined #dri-devel

09:37 Haaninjo has joined #dri-devel

09:54 jaganteki has quit [Remote host closed the connection]

10:03 ice9 has quit [Ping timeout: 480 seconds]

10:20 fab has joined #dri-devel

10:21 vliaskov_ has joined #dri-devel

10:22 Zopolis4 has quit []

10:26 vliaskov has quit [Ping timeout: 480 seconds]

10:28 ice99 has joined #dri-devel

10:30 fab has quit [Quit: fab]

10:30 fab has joined #dri-devel

10:35 kts has joined #dri-devel

10:37 MajorBiscuit has joined #dri-devel

10:51 jaganteki has joined #dri-devel

10:55 devilhorns has joined #dri-devel

11:31 jaganteki has quit [Remote host closed the connection]

11:39 devilhorns has quit []

11:48 _xav_ has joined #dri-devel

11:50 Company has joined #dri-devel

11:51 xroumegue has quit [Ping timeout: 480 seconds]

11:54 ice99 has quit [Ping timeout: 480 seconds]

11:55 fab has quit [Ping timeout: 480 seconds]

11:57 jaganteki has joined #dri-devel

12:35 YuGiOhJCJ has quit [Quit: YuGiOhJCJ]

12:38 srslypascal is now known as Guest7270

12:39 srslypascal has joined #dri-devel

12:45 Guest7270 has quit [Ping timeout: 480 seconds]

12:54 mohamexiety has joined #dri-devel

13:05 pochu has quit [Quit: leaving]

13:17 kts has quit [Quit: Konversation terminated!]

13:20 kts has joined #dri-devel

13:36 <daniels> this is basically like The Purge but for lavapipe/llvmpipe

13:41 <zmike> haha

13:42 kts has quit [Quit: Konversation terminated!]

13:45 rasterman has quit [Ping timeout: 480 seconds]

13:57 bmodem has quit [Ping timeout: 480 seconds]

14:03 <daniels> fwiw it's looking like our remaining showstopper - apart from missing swrast - is a GitLab bug where it just doesn't hand jobs out to runners. if you see anything stuck in pending for wild amounts of time like 30-40min it's that. if the pipeline is still running other jobs, you can cancel & retry the pending-forever one and it'll sail straight through

14:03 <daniels> I've collected enough about it to figure out how to unbreak it, so we should have a monumentally stupid workaround in place later this afternoon

14:09 <javierm> tzimmermann: thanks for your explanations, makes sense. Feel free to add my r-b to patch #1 too if you resend/apply

14:16 <tzimmermann> javierm, thank you. i do expect that some code can be shared at some point. i simply don't want to get ahead of myself

14:20 <javierm> tzimmermann: yes, makes sense

14:23 MajorBiscuit has quit [Ping timeout: 480 seconds]

14:26 jaganteki has quit [Remote host closed the connection]

14:33 heat has joined #dri-devel

14:37 zehortigoza has quit [Remote host closed the connection]

14:38 oneforall2 has quit [Quit: Leaving]

14:39 oneforall2 has joined #dri-devel

14:47 <DavidHeidelberg[m]> where I can grep for the wayland-dEQP-EGL.functional.negative_api.create_pixmap_surface ? piglit/deqp?

14:49 <daniels> DavidHeidelberg[m]: VK-GL-CTS

14:49 <DavidHeidelberg[m]> thx

14:50 <daniels> though afaict that should be fine as we return EGL_BAD_PARAMETER

14:50 <daniels> is it failing?

14:52 <DavidHeidelberg[m]> daniels: it's fixed by mesa with wayland; but when mesa is without wayland, it should be skipped, not failed I guess?

14:52 <daniels> it should still be passing on x11

14:52 <DavidHeidelberg[m]> oh, then I just workarounded issue by enabling wayland :D

14:52 <DavidHeidelberg[m]> see https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21786/diffs

14:54 kts has joined #dri-devel

14:55 <daniels> heh yeah, I think that's just broken CTS

14:57 rasterman has joined #dri-devel

15:03 MajorBiscuit has joined #dri-devel

15:11 Dr_Who has joined #dri-devel

15:20 fxkamd has joined #dri-devel

15:30 zehortigoza has joined #dri-devel

15:38 Ahuj has quit [Ping timeout: 480 seconds]

15:40 ice9 has joined #dri-devel

15:48 aravind has quit [Ping timeout: 480 seconds]

15:54 FireBurn has joined #dri-devel

15:57 ice9 has quit [Ping timeout: 480 seconds]

16:07 tursulin has quit [Ping timeout: 480 seconds]

16:14 tobiasjakobi has joined #dri-devel

16:14 fab has joined #dri-devel

16:15 tobiasjakobi has quit []

16:15 <DemiMarie> bbrezillon: is the whole dma-fence design the actual problem?

16:16 <DemiMarie> Where core MM stuff gets blocked on async GPU work instead of paging the memory out and having the GPU take an IOMMU fault.

16:17 <DemiMarie> Any chance this can be fixed in drivers other than AMD?

16:17 tzimmermann has quit [Quit: Leaving]

16:21 alyssa has joined #dri-devel

16:21 <alyssa> daniels: i wrote a CI thing and it *works*?!

16:21 <alyssa> send help something must have gone terribly wrong

16:22 Duke`` has joined #dri-devel

16:22 gouchi has joined #dri-devel

16:23 <zmike> I'm here to help, who needs a spot

16:23 krushia has quit [Quit: Konversation terminated!]

16:23 <alyssa> daniels: also, re unassigning, this is what "Needs merge"

16:23 <alyssa> helps with :D

16:24 <alyssa> "backlog of MRs to assign to Marge on a rainy day"

16:24 <zmike> would be nice to have a cron job to sweep that a couple times a day

16:25 <bbrezillon> DemiMarie: not sure I follow. Did I say the dma-fence design was the problem?

16:28 <DemiMarie> bbrezillon: no, but it seems to me that the dma-fence design will keep causing bugs until it is changed

16:30 <daniels> alyssa: hrm?

16:32 lynxeye has quit [Quit: Leaving.]

16:33 <alyssa> daniels: which part

16:35 <daniels> alyssa: 'wrote a CI thing'?

16:36 <alyssa> daniels: clang-format lint job

16:36 <alyssa> i thought i would just end up waiting for eric or okias to do it :p

16:36 <daniels> oh, nice

16:37 <alyssa> (..next step is getting panfrost clang-format-clean so I can flip it on there because apparently it isn't right now, whoops)

16:44 <bbrezillon> DemiMarie: I'm no expert on these things, but my understanding is that it's not the dma_fence design itself that's problematic, but more the memory reclaim logic. With mem shrinkers waiting on dma_fence objects to release memory, and waitable allocation happening in the job submission path, you might just deadlock.

16:46 paulk has joined #dri-devel

16:47 <Hazematman> Hey any people familiar with nouveau, tegra, virgl, or svga able to look over my MR to change the default PIPE_CAP_DMABUF behavior? https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21654

16:47 <Hazematman> This MR adds support for the `get_screen_fd` API to all the gallium drivers, so I need some people to look over changes to those gallium drivers

16:49 alyssa has left #dri-devel [#dri-devel]

16:51 <gfxstrand> The dma_fence design itself is fine. It's designed that way for very good reasons. There are problems we need to solve but they're more around how things have become tangled up inside drm than they are about dma_fence.

16:54 <bbrezillon> DemiMarie: if you have a way to swapout some memory accessed by in-flight jobs, you might be able to unblock the situation, but I'm sure this 'no allocation in the scheduler path' rule is here to address the problem where a job takes too long to finish and the shrinker decides to reclaim memory anyway.

16:56 <bbrezillon> I think the problem is that drm_sched exposes a fence to the outside world, and it needs a guarantee that this fence will be signaled, otherwise other parties (the shrinker) might wait for an event that's never going to happen

16:56 <gfxstrand> Yup

16:57 <bbrezillon> that comes from the fact it's not the driver fence that's exposed to the outside world, but an intermediate object, which is indirectly signaled by the driver fence, that's created later on when the scheduler calls ->run_job()

16:58 <gfxstrand> Once a fence has been exposed, even internally within the kernel, it MUST signal in finite time.

16:58 stuarts has joined #dri-devel

16:59 <gfxstrand> If you allocate memory, that could kick off reclaim which can then have to wait on the GPU and you're stuck.

17:00 <bbrezillon> so the issue most drivers have, is that they allocate this driver fence in the ->run_job() path with GFP_KERNEL (waitable allocation), which might kick the GPU driver shrinker, which in turn will wait on the fence exposed by the drm_sched, which will never be signaled because the driver is waiting for memory to allocate its driver fence :-)

17:01 <bbrezillon> what gfxstrand said :-)

17:05 <robclark> embedding the hw fence in the job struct is probably easy enough to avoid that.. but then when you start calling into other subsys (iommu, runpm, etc) it starts getting a bit more terrifying
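The deadlock described above, and the "pre-allocate at submit time, fill in at run_job() time" fix, can be modelled in a few lines. This is only a toy userspace sketch in Python, not kernel code: `ToyAllocator`, `Job`, `run_job` and `buggy_run_job` are hypothetical names invented for illustration, and the `may_block` flag merely stands in for a GFP_KERNEL-style allocation that is allowed to enter reclaim.

```python
# Toy model only: none of these names are real kernel or drm_sched APIs.

class Fence:
    def __init__(self):
        self.seqno = None
        self.signaled = False


class ToyAllocator:
    """Counts allocations and rejects blocking ones in a signalling section."""

    def __init__(self):
        self.in_signalling_section = False
        self.allocations = 0

    def alloc_fence(self, may_block=True):
        if may_block and self.in_signalling_section:
            # Stands in for GFP_KERNEL inside run_job(): reclaim could end up
            # waiting on the fence the scheduler has already published.
            raise RuntimeError("blocking allocation in fence signalling path")
        self.allocations += 1
        return Fence()


class Job:
    def __init__(self, allocator):
        # Submit time: blocking allocations are still safe here, so the
        # driver fence is embedded in the job up front.
        self.hw_fence = allocator.alloc_fence(may_block=True)


def run_job(allocator, job, seqno):
    """Fixed pattern: only fills in the pre-allocated fence."""
    allocator.in_signalling_section = True
    try:
        job.hw_fence.seqno = seqno  # no allocation happens here
        return job.hw_fence
    finally:
        allocator.in_signalling_section = False


def buggy_run_job(allocator, seqno):
    """Pattern the discussion calls a driver bug: allocate in run_job()."""
    allocator.in_signalling_section = True
    try:
        fence = allocator.alloc_fence(may_block=True)
        fence.seqno = seqno
        return fence
    finally:
        allocator.in_signalling_section = False
```

In this model the fixed path performs exactly one allocation per job, at submit time, while the buggy path trips the check. That is roughly the kind of mistake the signalling annotations mentioned earlier in the log are meant to catch at runtime.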

17:08 dviola has joined #dri-devel

17:10 MajorBiscuit has quit [Quit: WeeChat 3.6]

17:15 jaganteki has joined #dri-devel

17:21 FireBurn has quit [Quit: Konversation terminated!]

17:22 krushia has joined #dri-devel

17:29 <bbrezillon> robclark: yeah, that's actually where the whole discussion started. I was trying to see if we could pass a custom allocator to the pgtable/iommu subsystem for page table allocation, so we can pre-allocate pages for the page table update, and avoid allocation in the run_job() path

17:31 <bbrezillon> didn't really think of the runpm stuff, but if allocations can happen in the rpm_get_sync() path, that will be challenging too...

17:31 <bbrezillon> I mean, blocking allocations, of course

17:32 <robclark> bbrezillon: maybe spiff out iommu_iotlb_gather a bit more.. to also handle allocations

17:32 <bbrezillon> yep, was pestering robmur01 with that yesterday :-)

17:32 <vsyrjala> at least acpi is absolutely terrifying in terms of doing memory allocations in runtime pm paths/etc.

17:32 <robclark> the gather struct is already used to defer freeing pages to optimize tlb flushing

17:33 <bbrezillon> the other option would be to just re-implement the page table logic

17:33 kzd has joined #dri-devel

17:34 Ahuj has joined #dri-devel

17:34 mohamexiety has quit []

17:34 <robclark> that is something I'd prefer avoiding

17:35 <bbrezillon> which we might have to do if we want to use some fancy TTM helpers and get advanced memory reclaim involving reclaims of page-tables, not just memory backing GEM objects (didn't really check what the TTM TT abstraction looks like)

17:35 <robclark> I did have a branch somewhere a while back that plumbed the gather struct more in the map path (since map can also trigger free's and tlb flushes)

17:36 <robclark> we don't need to store pgtables in vram, so I don't think that is useful

17:36 <robclark> (and not really sure how good ttm is in general with reclaim)

17:38 <bbrezillon> don't know how good TTM reclaim is, but I'm sure panfrost reclaim is not great :-)

17:38 <robclark> does panfrost have reclaim other than just madv?

17:39 <bbrezillon> nope

17:40 Guest7231 is now known as ybogdano

17:40 <bbrezillon> just ditched the whole reclaim stuff in pancsf, hoping someone could come up with a good reclaim-implementation solution for new drivers :-)

17:40 <bbrezillon> and then I realized TTM had some of that

17:40 <robclark> so I did add common lru and lru iteration

17:40 <robclark> drm_gem_lru_scan()

17:41 <bbrezillon> that's a good start

17:41 <robclark> use that and reclaim is mostly not too bad except random places that might allocate memory

17:41 <bbrezillon> I'll have a look, thanks for the pointer

17:42 <robclark> _but_ I'm not doing iommu map/unmap from scheduler like a VM_BIND impl would do.. that kinda forces the issue of hoisting allocation out of io-pgtable.. but it is a useful thing to do because we can be more clever about tlb invalidates that way

17:43 <bbrezillon> robclark: just curious, where do the page table live if they're not in vram?

17:43 <robclark> there is no vram ;-)

17:43 <robclark> it is all just ram

17:44 <bbrezillon> yeah, I mean, it's the same on pancsf

17:44 <bbrezillon> but it's still memory that's accessed by the GPU

17:44 <robclark> not _really_

17:45 <bbrezillon> okay, the MMU in front of the GPU :)

17:45 <robclark> it is memory accessed by the (io)mmu which might happen to be part of the gpu

17:45 <robclark> right

17:45 <bbrezillon> but I think I get where you were going with 'we don't need to reclaim pgtable mem'

17:46 <bbrezillon> tearing down a mapping will automatically release the memory, since it's all synchronous and dealt on the CPU side

17:47 <bbrezillon> *the pgtable memory

17:48 <robclark> hmm, that is kinda immaterial.. we don't need the pgtable to be in special memory that isn't system memory.. but you can still get into deadlock, because allocations that don't set GFP_ATOMIC or similar flags that allow the allocation to fail can recurse into shrinker

17:48 <bbrezillon> sure, we still need to pre-allocate

17:48 <robclark> yup

17:49 <bbrezillon> guess the question is, should we could pgtable memory in the shrinker

17:49 <robclark> (or, you do for the VM_BIND case.. if you do iommu map somewhere else where it is safe to allocate that is fine)

17:49 <bbrezillon> because currently we don't

17:49 <bbrezillon> *should we count

17:49 <robclark> don't count.. but that isn't the problem

17:50 <robclark> the problem is that fence signals allows other pages to be reclaimed

17:50 <robclark> but the allocation of pages for pgtable can trigger shrinker which could depend on other gpu pages to become avail to free

17:51 <robclark> so, one idea.. which 90% solves it (and at least reduces the # of pages you need to pre-allocate)

17:51 <bbrezillon> sure, I'm just thinking about why TTM keeps track of page table memory in its reclaim logic. The alloc in signalling path is completely orthogonal and needs to be addressed for async VM_BIND

17:51 <robclark> hmm.. or nm.. I was going to say do iommu map synchronously but unmap from run_job(). but that doesn't quite work

17:52 <robclark> TTM is probably doing that because if you have vram (dGPU) you have pgtable in vram

17:52 <bbrezillon> unmap might need to allocate too

17:52 <robclark> reading pgtable over pci is not going to work out great

17:53 <robclark> right, both map and unmap can free and alloc... but limited cases so you probably could put an upper bound on # of pages you'd need to pre-allocate

17:56 <bbrezillon> there's an upper bound for map operations too. I mean, in addition to what you'll always need for the map anyway (the more you map, the more level of page tables you'll have to pre-allocate, but compared to what you'll need for the map operation itself, it should be negligible)

17:57 <bbrezillon> so maybe allocating for the worst case is not such a big deal

17:57 kts has quit [Quit: Konversation terminated!]

17:58 <bbrezillon> I guess you can run into problems if you have a lot of small async map/unmap operations, because then the amount of pages you allocate for the worst case can be much bigger than the amount of page you'd allocate if you knew the state of the VA space
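A rough way to put numbers on the worst case discussed above is to count how many page-table pages a map operation could need if none of the intermediate tables exist yet. The sketch below assumes a layout the log does not specify: 4 KiB pages, 512 entries per table and 4 levels with an always-present root. It also ignores block (huge page) mappings, which would only lower the count. The function name and constants are hypothetical.

```python
# Assumed layout (not taken from the discussion): 4 KiB pages,
# 9 bits of index per level, 4 levels, root table always allocated.
PAGE_SHIFT = 12
BITS_PER_LEVEL = 9
LEVELS = 4


def worst_case_pgtable_pages(iova, size):
    """Upper bound on table pages needed to map [iova, iova + size)."""
    if size <= 0:
        raise ValueError("size must be positive")
    first = iova >> PAGE_SHIFT
    last = (iova + size - 1) >> PAGE_SHIFT
    total = 0
    # Walk the non-root levels from the leaf tables upwards and count how
    # many distinct tables the page range touches at each level.
    for level in range(1, LEVELS):
        shift = BITS_PER_LEVEL * level
        total += (last >> shift) - (first >> shift) + 1
    return total
```

Under these assumptions a single 1 GiB mapping at offset 0 needs at most 514 table pages against 262144 data pages, which is why the overhead looks negligible for large maps. A single 4 KiB mapping still needs up to 3, so a thousand small async operations reserve up to 3000 table pages in the worst case, far more than a populated VA space would actually consume.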

17:59 <robclark> probably don't free pages you didn't happen to use for current map/unmap and keep them avail for next time?

18:01 agd5f_ has joined #dri-devel

18:04 kts has joined #dri-devel

18:08 agd5f has quit [Ping timeout: 480 seconds]

18:09 kzd has quit [Quit: kzd]

18:11 <bbrezillon> sure, we can keep a pool of free pages around to speed up allocation
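The "keep unused pages around for next time" idea can be sketched as a small pool that is topped up where blocking allocation is safe and drained from the run_job()-like path without ever allocating. This is a toy Python model: `PagePool`, `reserve` and `take` are hypothetical names, and `bytearray(4096)` merely stands in for a page allocation.

```python
class PagePool:
    """Toy pre-allocation pool; not a real kernel or iommu interface."""

    def __init__(self):
        self.free = []
        self.allocated_total = 0

    def reserve(self, needed):
        """Top up to `needed` free pages. Call only where blocking is safe,
        for example at submit time, before any fence is published."""
        while len(self.free) < needed:
            self.free.append(bytearray(4096))
            self.allocated_total += 1

    def take(self):
        """Hand out a page from the signalling path: never allocates."""
        if not self.free:
            raise RuntimeError("pool exhausted: reservation was too small")
        return self.free.pop()
```

If an operation reserves four pages but uses only one, the three leftovers stay in the pool, so the next `reserve(4)` allocates a single new page rather than four. That is the effect being described: the worst-case reservation is paid once and then amortised across operations.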

18:15 mohamexiety has joined #dri-devel

18:23 lkw has joined #dri-devel

18:23 kzd has joined #dri-devel

18:26 <daniels> I'm aware our shared runners are completely starved; this is exacerbated by the job distribution being unfair, which I'm typing up a patch for

18:28 kts has quit [Quit: Konversation terminated!]

18:28 lkw has quit [Quit: leaving]

18:53 <daniels> ok, hopefully that's done now

19:04 iive has joined #dri-devel

19:07 alanc has quit [Remote host closed the connection]

19:08 alanc has joined #dri-devel

19:14 <DemiMarie> bbrezillon gfxstrand: the idea I had is that the shrinker could call some sort of MMU notifier callback, which is not allowed to block or fail. That callback is responsible for unmapping the needed MMU/IOMMU entries, flushing any necessary TLB, and canceling any not-yet-submitted jobs. Jobs that are already in progress will just fault when they try to access paged-out memory, at which point it is up to the GPU driver to recover.

19:14 <DemiMarie> Of course, this is equivalent to requiring all GPUs to support pageable memory on the host side.

19:18 Zopolis4 has joined #dri-devel

19:20 <DemiMarie> More generally, having memory reclaim blocked on userspace-provided shaders seems very unsafe. I’m curious why drivers are not required to pin all memory that a shader might have access to, unless the GPU supports retryable page faults.

19:25 ybogdano is now known as Guest7297

19:25 ybogdano has joined #dri-devel

19:26 <robclark> gpu faults because the system is overcommitted on memory isn't going to be hugely popular ;-)

19:26 <robclark> I mean, your average 2GB or 4GB chromebook is under constant memory pressure ;-)

19:27 <robclark> demand-paging on gpu side is something that some gpu's could do.. although compared to CPU where you are stalling one task, stalling the gpu for a fault is stalling 100's or 1000's of tasks.. so not sure if that is exactly great

19:30 <DemiMarie> robclark: what about requiring memory to be pre-pinned (and marked unavailable for shrinking)? Also, can’t even `GFP_KERNEL` fail?

19:32 <robclark> GFP_KERNEL in practice can't fail for small allocations, IIRC

19:33 smiles has quit [Ping timeout: 480 seconds]

19:34 <robclark> you can in the shrinker skip over memory that has unsignaled fences.. and this is probably what you want to do for early stages of shrinking. But under more extreme memory pressure you want to be able to wait for things that will become avail to reclaim in the near future
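The two-stage behaviour described here, skipping busy buffers at first and waiting on them only under heavier pressure, can be sketched as below. It is a toy Python model, not how any real shrinker is written: `Buffer`, `shrink`, `may_wait` and the `wait_for_fence` callback are hypothetical names, and the wait is only acceptable because a published fence must signal in finite time, as stated earlier in the log.

```python
class Buffer:
    """Toy buffer object with a single fence state flag."""

    def __init__(self, name, fence_signaled):
        self.name = name
        self.fence_signaled = fence_signaled


def shrink(buffers, nr_to_free, may_wait, wait_for_fence):
    """Free up to nr_to_free buffers and return the names that were freed.

    may_wait=False models the early, cheap pass that skips buffers whose
    fence has not signaled. may_wait=True models the pass under heavier
    memory pressure, which calls wait_for_fence(buf) before freeing.
    """
    freed = []
    for buf in list(buffers):
        if len(freed) >= nr_to_free:
            break
        if not buf.fence_signaled:
            if not may_wait:
                continue  # still in use by the GPU, skip for now
            wait_for_fence(buf)  # expected to return once the fence signals
        buffers.remove(buf)
        freed.append(buf.name)
    return freed
```

With buffers `a` (idle), `b` (busy) and `c` (idle), the first pass frees `a` and `c` and leaves `b` alone. Only the second pass, with `may_wait=True`, waits on `b` and frees it.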

19:47 Ahuj has quit [Ping timeout: 480 seconds]

19:48 ngcortes has joined #dri-devel

20:01 smiles has joined #dri-devel

20:03 gio has quit []

20:10 smiles has quit [Ping timeout: 480 seconds]

20:18 <airlied> bbrezillon: I think the latest nouveau bits have refactored that or are in the process of refactoring

20:19 hansg has quit [Quit: Leaving]

20:23 <airlied> DemiMarie: pin all the things isn't really a winning method, eventually everyone pins all of memory with chrome tabs

20:24 <airlied> dma-fence is pretty much pin all memory until a shader has finished with it, and you know it's finished when it signals a dma-fence

20:31 ybogdano is now known as Guest7305

20:31 Guest7297 is now known as ybogdano

20:35 <airlied> dakr: ^ probably should read above just to reconfirm

20:36 JohnnyonFlame has joined #dri-devel

21:04 <dakr> airlied, bbrezillon: Yes, it's not even entirely fixed in V2.

21:04 <DemiMarie> airlied: So there are a couple problems there.

21:04 <DemiMarie> First is that a shader can loop forever. Timeout detection can handle that, but it runs into another problem, which is that resetting the GPU denies service to other legitimate users of the GPU.

21:05 <airlied> DemiMarie: yes welcome to GPUs

21:05 <DemiMarie> The second is that some workloads (notably compute) have legitimate long-running shaders.

21:05 <dakr> In V2 I still have the page table cleanup in the job_free() callback. The page table cleanup needs to take the same lock as the page table allocation does. Since job_free() can stall the job's run() callback, this is still a potential deadlock.

21:05 <dakr> I'll fix this up in V3.

21:06 <airlied> DemiMarie: for compute to do long running you have to have page faults

21:06 <airlied> at which point you don't need to wait for dma fences in throty

21:06 <airlied> theory

21:06 <DemiMarie> airlied: which GPUs support page faults?

21:06 <airlied> so compute jobs that are long running are not meant to use dma-fences

21:07 <airlied> DemiMarie: for compute jobs, I think all current gen, most last gen

21:07 <DemiMarie> the other option would be to use IOMMU tricks to remap the pages behind the GPU’s back

21:07 <airlied> not sure where it falls over

21:07 <DemiMarie> airlied: what about for graphics jobs?

21:07 <airlied> nope graphics jobs aren't pagefault friendly

21:07 <DemiMarie> airlied: why?

21:07 <airlied> and you don't really want to take a pagefault in your fragment shader

21:07 <airlied> too many fixed function pieces

21:08 <DemiMarie> and those cannot take page faults?

21:08 <airlied> eventually they might get to where it could work, but it will always be horrible

21:08 <airlied> not usually

21:08 <airlied> texture engines and ROPs generally

21:08 <DemiMarie> what about letting the IOMMU remap pages transparently to the GPU?

21:10 Guest7305 has quit [Ping timeout: 480 seconds]

21:10 <airlied> don't think we always have an iommu, and I'm not sure if you could do that in any race free way

21:11 <DemiMarie> If the GPU could retry requests one could use break-before-make

21:12 <DemiMarie> airlied: GPUs really need to get better at hostile multi-tenancy

21:12 <airlied> they have been getting better, just not sure when they'll be finished

21:13 <DemiMarie> what progress has been made?

21:16 <airlied> you couldn't even pagefault a few gens ago :-)

21:18 <DemiMarie> If I were designing a GPU I would be very tempted to have each shader core run its own little OS (provided by the driver) with full interrupt and exception handling support.

21:21 <DemiMarie> Or at least have the host be able to migrate pages on the GPU.

21:39 Duke`` has quit [Ping timeout: 480 seconds]

21:45 ybogdano is now known as Guest7312

21:45 ybogdano has joined #dri-devel

21:46 <robclark> then you'd have llvmpipe :-P

21:46 <robclark> but I wouldn't share a gpu between hostile tenants

21:47 <dottedmag> DemiMarie: don't you have

21:47 <dottedmag> _special_ GPUs for hostile multitenancy

21:48 <robclark> there is server class stuff that supports sr-iov

21:48 <robclark> I'm sure there is a whole big class of u-arch info leak issues hiding there ;-)

21:55 <airlied> generally they are often compute only

21:56 <DemiMarie> robclark: personally, I would be fine with using LLVMpipe, but unfortunately for Qubes users, the GUI toolkit and application writers are not.

21:57 <DemiMarie> dottedmag robclark airlied: Intel Gen12+ iGPUs support SR-IOV, though driver support for it is not (yet) upstreamed. That said, Qubes OS needs to work (and have decent performance) even without such hardware.

21:58 <DemiMarie> Just how long does it take to reset a GPU? Because at least Apple M1+ GPUs reset so quickly that one could reset after every frame and still have a usable desktop.

21:59 apinheiro has quit [Ping timeout: 480 seconds]

22:01 <robclark> probably on the order of a few ms .. so might be ok for desktop workloads but probably not for games.. I don't have #'s for reset but resuming gpu is ~1.5-2ms for modern adrenos..
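
To put "a few ms" in context, here is a rough back-of-the-envelope sketch of how much of a frame budget a per-frame reset would consume. The 2 ms cost is an assumption borrowed from the upper end of the resume figure quoted above, not a measured reset time, and the refresh rates are illustrative; `reset_share_of_frame` is a hypothetical helper.

```python
def reset_share_of_frame(reset_ms: float, refresh_hz: float) -> float:
    """Fraction of one frame's time budget spent on a reset of reset_ms."""
    frame_budget_ms = 1000.0 / refresh_hz
    return reset_ms / frame_budget_ms


if __name__ == "__main__":
    assumed_reset_ms = 2.0  # assumption: reset costs about as much as resume
    for hz in (60, 144):
        share = reset_share_of_frame(assumed_reset_ms, hz)
        print(f"{hz} Hz: {share:.0%} of the frame budget")
```

Under these assumptions a per-frame reset costs roughly an eighth of a 60 Hz frame but more than a quarter of a 144 Hz frame, which is consistent with "ok for desktop workloads but probably not for games".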

22:03 <robclark> if resetting the gpu was the fastest way to do it then qcom wouldn't have this "zap" shader mechanism to take the gpu out of protected mode since same information leak concerns there

22:04 gouchi has quit [Remote host closed the connection]

22:08 vliaskov_ has quit [Remote host closed the connection]

22:13 agd5f_ has quit []

22:14 agd5f has joined #dri-devel

22:14 <agd5f> DemiMarie, All gfx9 derived GPUs can support recoverable GPU page faults, but that was dropped in gfx10 and newer because it takes a lot of die area and the performance generally makes games unusable. If you want fast games, everything needs to be resident

22:15 <agd5f> you can also preempt to deal with long shaders

22:17 <agd5f> if there is memory pressure, stop the jobs, deal with the pressure, let them run again

22:19 fab has quit [Quit: fab]

22:19 kzd has quit [Quit: kzd]

22:22 kzd has joined #dri-devel

22:27 apinheiro has joined #dri-devel

22:28 ahajda has quit [Quit: Going offline, see ya! (www.adiirc.com)]

22:32 danvet has quit [Ping timeout: 480 seconds]

22:36 <DemiMarie> agd5f robclark: thanks for taking your time to answer my questions!

22:37 <DemiMarie> airlied dottedmag too

22:38 <robclark> np

22:46 avocicltb^ has joined #dri-devel

22:48 Zopolis4 has quit []

22:49 ybogdano is now known as Guest7321

22:49 Guest7312 is now known as ybogdano

23:14 <anholt_> daniels: I'm supposed to be off work today, but I've taken down the swrast runners again now. Going to need someone competent to maintain them if we're going to, so we're asking around, but probably going to need to move the load back to equinix for a bit. :/

23:30 bgs has quit [Remote host closed the connection]

23:46 MajorBiscuit has joined #dri-devel

23:51 apinheiro has quit [Quit: Leaving]

23:59 MajorBiscuit has quit [Quit: WeeChat 3.6]