#panfrost on 2023-03-10 — irc logs at oftc.irclog.whitequark.org

2022-12-21 00:45 ChanServ changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard + Bifrost + Valhall - Logs https://oftc.irclog.whitequark.org/panfrost - I don't know anything about WSI. That's my story and I'm sticking to it.

01:42 megi1 has joined #panfrost

01:43 megi has quit [Ping timeout: 480 seconds]

01:44 falk689 has quit [Remote host closed the connection]

01:44 falk689 has joined #panfrost

02:15 atler is now known as Guest7227

02:15 atler has joined #panfrost

02:17 Guest7227 has quit [Ping timeout: 480 seconds]

02:20 alpernebbi has quit [Ping timeout: 480 seconds]

02:21 alpernebbi has joined #panfrost

03:53 chewitt has joined #panfrost

03:57 Leopold__ has joined #panfrost

03:58 Leopold has quit [Ping timeout: 480 seconds]

07:17 guillaume_g has joined #panfrost

07:47 Googulator has quit [Read error: Connection reset by peer]

07:50 <bbrezillon> robmur01: uh, yeah, you're right, we allocate the job fence there :-/. Then I don't really get why we need to pre-allocate the page tables (that had been raised on dri-devel, something about allocs in the signaling path can deadlock, but run_job() doesn't seem to be part of the signaling path)

07:55 <bbrezillon> a failure means we'll have the drm_sched job done fence signaled early with an error, potentially leading to page faults on submit jobs if the requested map requests didn't take place. I guess it's a bit more problematic for partial unmaps (we're holding memory we should have freed), but not that much, because we won't release the pages backing the mapped object until the VM mapping

07:55 <bbrezillon> is gone, so no risk of touching memory that has been re-allocated to someone else AFAICT.

08:10 <bbrezillon> looking at https://lore.kernel.org/lkml/20230217134820.14672-10-dakr@redhat.com/, they really split things so no page table allocations happen in the vm_bind run_job() path, but they do alloc the dma_fence in the exec submission path (actual hardware jobs). So either they expect the fence object allocation to never fail/wait or I'm missing something.

08:32 <bbrezillon> robmur01: I asked danvet on #dri-devel, and he says all drivers are buggy, and run_job() is in the signaling path

08:51 rasterman has joined #panfrost

09:34 Googulator has joined #panfrost

10:37 MajorBiscuit has joined #panfrost

12:00 chewitt has quit [Quit: Zzz..]

12:17 hanetzer3 has joined #panfrost

12:19 hanetzer2 has quit [Ping timeout: 480 seconds]

12:21 hanetzer3 has quit []

12:48 <robmur01> bbrezillon: I guess I don't have a good mental picture of how the DRM scheduler works :/

12:50 <robmur01> but I would note that what we don't have, compared to the "big" GPUs, is any of the issues involved in having to allocate pagetables from GPU VRAM

12:52 <robmur01> and if we ever fail to allocate a single kernel page, then OOM will likely already have big memory hogs like GPU-using processes in its sights, so the failure may be moot :)

13:41 hanetzer has joined #panfrost

13:45 rasterman has quit [Ping timeout: 480 seconds]

14:23 MajorBiscuit has quit [Ping timeout: 480 seconds]

14:49 chewitt has joined #panfrost

14:57 rasterman has joined #panfrost

15:00 chewitt has quit [Quit: Zzz..]

15:00 chewitt has joined #panfrost

15:01 chewitt has quit []

15:03 MajorBiscuit has joined #panfrost

15:11 Dr_Who has joined #panfrost

16:27 guillaume_g has quit []

16:46 paulk has joined #panfrost

17:10 MajorBiscuit has quit [Quit: WeeChat 3.6]

18:04 Googulator has quit [Ping timeout: 480 seconds]

18:21 <robclark> robmur01: the issue is that you don't want allocating memory to be a dependency of shrinker being able to reclaim memory

18:23 <robclark> I kinda think that spiffing out the tlb_gather struct thing to be both allocator and deferred free'er would work

18:24 <robmur01> Ah, that much makes sense, so I guess the subtlety is all in how shrinker and sched interact?

18:24 <robclark> right

18:24 <robmur01> don't suppose there's an idiot's guide to that anywhere? :)

18:25 <robclark> hmm, I think there is a sect in dma-buf rst that alludes to it.. one sec

18:26 <robclark> https://www.kernel.org/doc/html/next/driver-api/dma-buf.html#indefinite-dma-fences mentions mm dependencies.. but yeah, isn't super detailed about it

18:27 <robclark> (userspace fences have similar problem because userspace could be trapped faulting in a page which ends up in shrinker)

18:30 <robmur01> thanks, that at least has the flavour of being helpful

18:32 <robclark> np.. probably would be useful to *somewhere* document a bit better all the dragons

18:35 <robmur01> so at a wild guess, is it that because run_job has started, then shrinker needs to wait for that job to finish, but if the allocation in run_job ends up in reclaim then that's going to wait for the shrinker?

18:37 <robclark> in the VM_BIND case, I think it is because the shrinker could be waiting on a fence that will be signaled when run_job completes

18:38 <robclark> ie you don't want to unpin/reclaim pages that still have iommu map

18:38 <robclark> so you need to wait for the unmap to complete
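
The dependency chain described above can be sketched as a toy wait-for graph. This is only an illustration under stated assumptions: the node names are made-up labels rather than kernel symbols, and the "page_alloc" to "shrinker" edge assumes the allocation happens under memory pressure and enters reclaim.

```python
# Toy wait-for graph; node names are illustrative labels, not kernel symbols.

def find_cycle(waits_for, start):
    """Return the nodes of a wait-for cycle reachable from start, or None."""
    path = []         # current depth-first path
    on_path = set()   # same nodes, for quick membership tests
    done = set()      # nodes fully explored without finding a cycle

    def visit(node):
        if node in on_path:
            # Back edge: close the loop from where node first appeared.
            return path[path.index(node):] + [node]
        if node in done:
            return None
        path.append(node)
        on_path.add(node)
        for nxt in waits_for.get(node, ()):
            cycle = visit(nxt)
            if cycle is not None:
                return cycle
        path.pop()
        on_path.remove(node)
        done.add(node)
        return None

    return visit(start)


# run_job() allocates: the allocation may enter reclaim, which needs the shrinker.
ALLOC_IN_RUN_JOB = {
    "shrinker": ["unmap_fence"],   # cannot reclaim pages that are still mapped
    "unmap_fence": ["run_job"],    # signaled when run_job completes
    "run_job": ["page_alloc"],     # page table allocation inside run_job
    "page_alloc": ["shrinker"],    # assumed: memory pressure sends us into reclaim
}

# Same chain, but everything run_job needs was allocated before queuing.
PREALLOCATED = dict(ALLOC_IN_RUN_JOB, run_job=[])
```

With the allocation inside run_job, `find_cycle(ALLOC_IN_RUN_JOB, "shrinker")` walks back to the shrinker; with `PREALLOCATED` it returns `None`, because run_job no longer waits on anything.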

18:39 <robmur01> OK, that's getting clearer

18:39 <robmur01> TBH unmap should never need to allocate - splitting blocks is an abomination

18:41 <robmur01> as for map, could panfrost_job_push() walk the BOs and "map" non-present PTEs every 2MB to ensure the tables are pre-populated?

18:42 <robclark> even if unmap did not allocate it could be queued up behind a map.. so that doesn't completely help

18:43 <robclark> hmm, maybe? I was thinking it would be easier to give a way to ask the io-pgtable what is the worst case # of pages it would need to allocate for a given size buffer

18:43 <robclark> and then just make sure the driver's pool of pages is big enough before enqueuing to scheduler (ie. from userspace ioctl ctx)
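
A rough sketch of what such a worst-case query might compute. Everything here is an assumption made for illustration: a 4 KiB granule, 512 descriptors per table, three table levels below an already-allocated top level, and every page mapped individually (block mappings would need fewer tables). The function names are hypothetical and are not part of io-pgtable.

```python
PAGE_SHIFT = 12      # assumed 4 KiB granule
BITS_PER_LEVEL = 9   # assumed 512 eight-byte descriptors per 4 KiB table


def worst_case_table_pages(size, levels_below_top=3):
    """Upper bound on new table pages needed to map `size` bytes.

    Assumes the start address is page aligned but otherwise unknown, so the
    range may straddle a table boundary at every level at once.
    """
    if size <= 0:
        return 0
    npages = -(-size // (1 << PAGE_SHIFT))   # round up to whole pages
    total = 0
    for level in range(1, levels_below_top + 1):
        # Number of pages covered by one table at this level
        # (level 1 is the leaf table: 512 pages, i.e. 2 MiB).
        pages_per_table = 1 << (BITS_PER_LEVEL * level)
        # n consecutive pages at the worst offset touch ceil((n-1)/P) + 1 tables.
        total += -(-(npages - 1) // pages_per_table) + 1
    return total


def pool_has_room(pool_pages, size):
    """Hypothetical check a driver could run in ioctl context before queuing."""
    return pool_pages >= worst_case_table_pages(size)
```

Under these assumptions a single page needs at most 3 tables, a 2 MiB buffer at most 6, and a 1 GiB buffer at most 517. The bound is deliberately pessimistic, since a pool that is too small is the failure being avoided.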

18:44 <robmur01> (or just every 1GB for ranges that are going to be mappable with 2MB blocks, to save freeing those bottom-level tables again when the real mapping happens)
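
The stride idea can be sketched as plain address arithmetic: one touch per aligned span is enough to make sure the table covering that span exists. This assumes a 4 KiB granule where one leaf table covers 2 MiB and the next level up covers 1 GiB; `touch_points` is a hypothetical helper, not something panfrost_job_push() does today.

```python
SZ_2M = 2 << 20   # span covered by one leaf table (assumed 4 KiB granule)
SZ_1G = 1 << 30   # span covered by one table of 2 MiB block entries


def touch_points(iova, size, stride):
    """Return one address inside [iova, iova + size) per stride-aligned span."""
    if size <= 0:
        return []
    first = iova - iova % stride            # align start down
    last_byte = iova + size - 1
    last = last_byte - last_byte % stride   # align last byte down
    # Clamp the first point so it stays inside the requested range.
    return [max(addr, iova) for addr in range(first, last + 1, stride)]
```

For example, a 12 KiB range starting at 0x1ff000 straddles a 2 MiB boundary, so `touch_points(0x1ff000, 0x3000, SZ_2M)` yields two addresses, one per leaf table. Passing `SZ_1G` instead gives the coarser variant for ranges that will later be mapped with 2 MiB blocks.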

18:45 <robclark> I do have a semi selfish reason for wanting to let driver handle alloc/free.. since there were some cases where map would free pages and need tlb ops, which we kinda don't handle in msm

18:45 <robclark> I started trying to fix that at one point, but it involved plumbing thru the gather struct in a bunch of places

18:45 <robclark> and then got distracted by something else

18:50 <robmur01> hmm, not sure an allocator would really help in that case - that's when map needs to replace an existing table entry with a block mapping, so the TLBI is even more fundamental than the free

18:51 <robmur01> This reminds me that somewhere I think I got about 80% of the way through hooking up the freelist stuff, I should dig that up again sometime...

18:55 <robclark> alloc is kinda separate from free.. but mostly needs to be plumbed thru the same places.. as far as replacing existing mapping, it seems to me that as long as the old page exists until the tlbi then it doesn't matter which version gets hit on a translation.. as long as tlbi happens before the new mapping is considered visible to the gpu.. what am I missing?

19:09 <robmur01> right, it doesn't really matter who frees the page and how, as long as it only happens after the TLBI, but the TLBI must also happen before map() returns

19:10 <robmur01> therefore you can defer the free outside the scope of io-pgtable, but not the TLBI
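
The ordering rule stated here (TLBI before map() returns, free any time after the TLBI) can be shown with a toy model. It is only a sketch: `ToyTable`, `map_block` and `drain` are hypothetical names, the "tlbi" entry is just a log record, and nothing here reflects the actual io-pgtable code.

```python
class ToyTable:
    """Toy page table that records the order of operations."""

    def __init__(self):
        self.entries = {}   # slot index -> what the slot currently points at
        self.log = []       # ordered record of what happened

    def map_block(self, slot, block, freelist):
        """Install a block mapping, replacing whatever the slot held."""
        old = self.entries.get(slot)
        self.entries[slot] = block
        self.log.append(("write", slot))
        if old is not None:
            # The invalidate cannot be deferred: it happens before we return.
            self.log.append(("tlbi", slot))
            # The free can be deferred: hand the old table to the caller.
            freelist.append(old)


def drain(freelist, log):
    """Caller-side deferred free, run some time after map_block() returned."""
    while freelist:
        log.append(("free", freelist.pop(0)))
```

Replacing a populated slot logs the write and the TLBI inside `map_block()`, while the "free" record only appears once the caller decides to run `drain()`.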

19:11 <robclark> ok, what I had in mind was roughly in line with that sentiment except it would be before msm

19:11 <robclark> ok, what I had in mind was roughly in line with that sentiment except it would be before msm's map() returns

19:11 <robclark> ie. outside of io-pgtable.. since io-pgtable has dummy tlb ops

19:11 <robclark> so it would still happen before the mapping becomes visible..

19:11 <robclark> just outside of the pgtable helpers

19:16 <robmur01> eww, I'd forgotten about all that stuff.... never understood why you don't just copy the TLB ops from ttbr1_cfg :/

19:19 <robclark> the issue, IIRC, was that the gpu could be suspended when userspace triggers unmap

19:24 <robclark> I guess there is probably room to do _something_ better... like add some more adreno-smmu-priv ops for runpm or something

19:25 <robmur01> Ah, OK, but you could still have shim ops like "if (pm_runtime_get_if_in_use()) return real_ops->op();"

19:25 <robmur01> snap :)

19:32 <robclark> yeah, I think something like that could work.. need to convince myself that it can be non-racy
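
The shim idea from the exchange above, as a toy model. `ToyDevice` only imitates the documented behaviour of pm_runtime_get_if_in_use() (take a reference only if the device is active and already in use); `make_shim` is a hypothetical helper. Unlike the one-liner quoted in the chat, this sketch also drops the reference afterwards, which a real version would presumably need to do.

```python
class ToyDevice:
    """Stand-in for a runtime-PM managed device (illustrative only)."""

    def __init__(self, active):
        self.active = active
        self.usage_count = 1 if active else 0

    def get_if_in_use(self):
        # Take a reference only if the device is active and already in use.
        if self.active and self.usage_count > 0:
            self.usage_count += 1
            return True
        return False

    def put(self):
        self.usage_count -= 1


def make_shim(dev, real_op):
    """Wrap real_op so it only runs while the device is powered."""

    def shim(*args):
        if not dev.get_if_in_use():
            return None   # suspended: skip the op instead of touching hardware
        try:
            return real_op(*args)
        finally:
            dev.put()     # drop the reference taken above

    return shim
```

Whether skipping the op on a suspended device is safe depends on the resume path invalidating everything, and the race robclark mentions (the device suspending between the check and the op) is exactly what holding the reference is meant to close.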

21:07 pbsds has joined #panfrost

21:11 pbsds3 has quit [Ping timeout: 480 seconds]

21:11 pbsds is now known as pbsds3

22:08 atler has quit [Quit: atler]

22:28 Rathann has joined #panfrost

23:21 Rathann has quit [Remote host closed the connection]

23:46 MajorBiscuit has joined #panfrost

23:59 MajorBiscuit has quit [Quit: WeeChat 3.6]