ChanServ changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard + Bifrost + Valhall - Logs https://oftc.irclog.whitequark.org/panfrost - I don't know anything about WSI. That's my story and I'm sticking to it.
megi1 has joined #panfrost
megi has quit [Ping timeout: 480 seconds]
falk689 has quit [Remote host closed the connection]
falk689 has joined #panfrost
atler is now known as Guest7227
atler has joined #panfrost
Guest7227 has quit [Ping timeout: 480 seconds]
alpernebbi has quit [Ping timeout: 480 seconds]
alpernebbi has joined #panfrost
chewitt has joined #panfrost
Leopold__ has joined #panfrost
Leopold has quit [Ping timeout: 480 seconds]
guillaume_g has joined #panfrost
Googulator has quit [Read error: Connection reset by peer]
<bbrezillon> robmur01: uh, yeah, you're right, we allocate the job fence there :-/. Then I don't really get why we need to pre-allocate the page tables (that was raised on dri-devel, something about how allocs in the signaling path can deadlock, but run_job() doesn't seem to be part of the signaling path)
<bbrezillon> a failure means we'll have the drm_sched job done fence signaled early with an error, potentially leading to page faults on submitted jobs if the requested map operations didn't take place. I guess it's a bit more problematic for partial unmaps (we're holding memory we should have freed), but not that much, because we won't release the pages backing the mapped object until the VM mapping
<bbrezillon> is gone, so no risk of touching memory that has been re-allocated to someone else AFAICT.
<bbrezillon> looking at https://lore.kernel.org/lkml/20230217134820.14672-10-dakr@redhat.com/, they really split things so no page table allocations happen in the vm_bind run_job() path, but they do alloc the dma_fence in the exec submission path (actual hardware jobs). So either they expect the fence object allocation to never fail/wait or I'm missing something.
<bbrezillon> robmur01: I asked danvet on #dri-devel, and he says all drivers are buggy, and run_job() is in the signaling path
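A minimal sketch of the pattern that rule implies, assuming a generic drm_sched driver (every name below except the dma_fence/drm_sched APIs is hypothetical, not the actual panfrost code): the done fence is allocated in ioctl context, where failure just unwinds the submit, and run_job() only hands it out.

```c
#include <linux/dma-fence.h>
#include <linux/slab.h>
#include <drm/gpu_scheduler.h>

/* Hypothetical driver fence/job; the done fence is allocated at
 * submit time so run_job() never has to allocate. */
struct my_fence {
	struct dma_fence base;
	spinlock_t lock;
};

struct my_job {
	struct drm_sched_job base;
	struct dma_fence *done_fence;
};

/* ioctl context: sleeping allocations are fine, and a failure here
 * is harmless because nothing has been queued yet */
static int my_job_prepare(struct my_job *job, u64 ctx, u64 seqno)
{
	struct my_fence *f = kzalloc(sizeof(*f), GFP_KERNEL);

	if (!f)
		return -ENOMEM;
	spin_lock_init(&f->lock);
	dma_fence_init(&f->base, &my_fence_ops, &f->lock, ctx, seqno);
	job->done_fence = &f->base;
	return 0;
}

/* fence signalling path: from here until the done fence signals,
 * nothing may block on memory reclaim */
static struct dma_fence *my_run_job(struct drm_sched_job *sched_job)
{
	struct my_job *job = container_of(sched_job, struct my_job, base);

	my_hw_submit(job);		/* hypothetical hardware queue */
	return dma_fence_get(job->done_fence);
}
```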
rasterman has joined #panfrost
Googulator has joined #panfrost
MajorBiscuit has joined #panfrost
chewitt has quit [Quit: Zzz..]
hanetzer3 has joined #panfrost
hanetzer2 has quit [Ping timeout: 480 seconds]
hanetzer3 has quit []
<robmur01> bbrezillon: I guess I don't have a good mental picture of how the DRM scheduler works :/
<robmur01> but I would note that what we don't have, compared to the "big" GPUs, is any of the issues involved in having to allocate pagetables from GPU VRAM
<robmur01> and if we ever fail to allocate a single kernel page, then OOM will likely already have big memory hogs like GPU-using processes in its sights, so the failure may be moot :)
hanetzer has joined #panfrost
rasterman has quit [Ping timeout: 480 seconds]
MajorBiscuit has quit [Ping timeout: 480 seconds]
chewitt has joined #panfrost
rasterman has joined #panfrost
chewitt has quit [Quit: Zzz..]
chewitt has joined #panfrost
chewitt has quit []
MajorBiscuit has joined #panfrost
Dr_Who has joined #panfrost
guillaume_g has quit []
paulk has joined #panfrost
MajorBiscuit has quit [Quit: WeeChat 3.6]
Googulator has quit [Ping timeout: 480 seconds]
<robclark> robmur01: the issue is that you don't want allocating memory to be a dependency of shrinker being able to reclaim memory
<robclark> I kinda think that spiffing up the tlb_gather struct thing to be both allocator and deferred free'er would work
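A rough sketch of that idea, assuming an iommu_iotlb_gather-style struct extended with a page pool; everything except iommu_iotlb_gather itself is made up for illustration.

```c
#include <linux/iommu.h>
#include <linux/list.h>
#include <linux/gfp.h>

#define MY_POOL_MAX 16	/* arbitrary for the sketch */

/* Hypothetical gather extension: carries pages to hand out during
 * map, and pages to free only after the TLB invalidation. The
 * free_after_tlbi list is assumed INIT_LIST_HEAD'd by the caller. */
struct my_pt_gather {
	struct iommu_iotlb_gather gather;	/* existing deferred-TLBI state */
	struct list_head free_after_tlbi;	/* replaced table pages */
	struct page *pool[MY_POOL_MAX];		/* pre-filled in ioctl context */
	unsigned int nr_pool;
};

/* ioctl context: fill the pool with sleeping allocations */
static int my_pt_gather_prefill(struct my_pt_gather *g, unsigned int n)
{
	while (g->nr_pool < n) {
		struct page *p = alloc_page(GFP_KERNEL | __GFP_ZERO);

		if (!p)
			return -ENOMEM;
		g->pool[g->nr_pool++] = p;
	}
	return 0;
}

/* run_job() context: never allocates, only drains the pool */
static struct page *my_pt_gather_get(struct my_pt_gather *g)
{
	return g->nr_pool ? g->pool[--g->nr_pool] : NULL;
}

/* replaced table pages are queued, not freed: they must outlive
 * the TLBI */
static void my_pt_gather_put(struct my_pt_gather *g, struct page *p)
{
	list_add(&p->lru, &g->free_after_tlbi);
}
```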
<robmur01> Ah, that much makes sense, so I guess the subtlety is all in how shrinker and sched interact?
<robclark> right
<robmur01> don't suppose there's an idiot's guide to that anywhere? :)
<robclark> hmm, I think there is a section in dma-buf rst that alludes to it.. one sec
<robclark> https://www.kernel.org/doc/html/next/driver-api/dma-buf.html#indefinite-dma-fences mentions mm dependencies.. but yeah, isn't super detailed about it
<robclark> (userspace fences have a similar problem because userspace could be trapped faulting in a page which ends up in shrinker)
<robmur01> thanks, that at least has the flavour of being helpful
<robclark> np.. probably would be useful to *somewhere* document a bit better all the dragons
<robmur01> so at a wild guess, is it that because run_job has started, the shrinker needs to wait for that job to finish, but if the allocation in run_job ends up in reclaim, that's going to wait for the shrinker?
<robclark> in the VM_BIND case, I think it is because the shrinker could be waiting on a fence that will be signaled when run_job completes
<robclark> ie you don't want to unpin/reclaim pages that still have an iommu mapping
<robclark> so you need to wait for the unmap to complete
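Spelled out as code, the cycle being described looks roughly like this (a hypothetical shrinker, not msm's or panfrost's actual one; dma_fence_wait and SHRINK_STOP are real kernel APIs, the rest is illustrative):

```c
#include <linux/dma-fence.h>
#include <linux/shrinker.h>

/* Hypothetical shrinker illustrating the dependency cycle: it must
 * wait for the VM_BIND unmap fence before reusing pages, so that
 * fence's signalling path (run_job) must never enter reclaim. */
static unsigned long my_shrink_scan(struct shrinker *shrinker,
				    struct shrink_control *sc)
{
	struct my_bo *bo = my_pick_purgeable_bo();	/* hypothetical */

	if (!bo)
		return SHRINK_STOP;

	/* can't unpin pages the IOMMU still maps... */
	dma_fence_wait(bo->unmap_fence, false);
	/* ...and that fence only signals once the unmap run_job()
	 * completes. If run_job() sleeps in a GFP_KERNEL allocation,
	 * reclaim can re-enter this shrinker and deadlock. */
	return my_release_bo_pages(bo);			/* hypothetical */
}
```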
<robmur01> OK, that's getting clearer
<robmur01> TBH unmap should never need to allocate - splitting blocks is an abomination
<robmur01> as for map, could panfrost_job_push() walk the BOs and "map" non-present PTEs every 2MB to ensure the tables are pre-populated?
<robclark> even if unmap did not allocate it could be queued up behind a map.. so that doesn't completely help
<robclark> hmm, maybe? I was thinking it would be easier to give a way to ask the io-pgtable what the worst case # of pages it would need to map a given size buffer
<robclark> and then just make sure the driver's pool of pages is big enough before enqueuing to scheduler (ie. from userspace ioctl ctx)
<robmur01> (or just every 1GB for ranges that are going to be mappable with 2MB blocks, to save freeing those bottom-level tables again when the real mapping happens)
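The worst-case bound robclark has in mind is easy to sketch for a 4KiB-granule table with 512 eight-byte entries per level; the helper below is illustrative, not a real io-pgtable API, and assumes the top-level table already exists.

```c
#include <linux/kernel.h>
#include <linux/sizes.h>

/* Hypothetical helper: worst case number of table pages that mapping
 * 'size' bytes at 'iova' can force the walker to allocate, for 4KiB
 * tables holding 512 PTEs each. */
static size_t my_pgtable_worst_case(unsigned long iova, size_t size,
				    int levels)
{
	unsigned long span = SZ_4K;	/* VA covered by one leaf PTE */
	size_t pages = 0;
	int lvl;

	for (lvl = levels - 1; lvl > 0; lvl--) {
		span *= SZ_4K / sizeof(u64);	/* 512x wider per level */
		/* one table per 'span' of VA touched, counting the
		 * partially covered ends */
		pages += DIV_ROUND_UP(size + (iova & (span - 1)), span);
	}
	return pages;
}
```

The driver would then top up its page pool to at least that count from the ioctl path before handing the job to the scheduler, so the run_job() path never needs to allocate.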
<robclark> I do have a semi selfish reason for wanting to let driver handle alloc/free.. since there were some cases where map would free pages and need tlb ops, which we kinda don't handle in msm
<robclark> I started trying to fix that at one point, but it involved plumbing thru the gather struct in a bunch of places
<robclark> and then got distracted by something else
<robmur01> hmm, not sure an allocator would really help in that case - that's when map needs to replace an existing table entry with a block mapping, so the TLBI is even more fundamental than the free
<robmur01> This reminds me that somewhere I think I got about 80% of the way through hooking up the freelist stuff, I should dig that up again sometime...
<robclark> alloc is kinda separate from free.. but mostly needs to be plumbed thru the same places.. as far as replacing an existing mapping, it seems to me that as long as the old page exists until the tlbi then it doesn't matter which version gets hit on a translation.. as long as tlbi happens before the new mapping is considered visible to the gpu.. what am I missing?
<robmur01> right, it doesn't really matter who frees the page and how, as long as it only happens after the TLBI, but the TLBI must also happen before map() returns
<robmur01> therefore you can defer the free outside the scope of io-pgtable, but not the TLBI
<robclark> ok, what I had in mind was roughly in line with that sentiment, except it would be before msm's map() returns
<robclark> ie. outside of io-pgtable.. since io-pgtable has dummy tlb ops
<robclark> so it would still happen before the mapping becomes visible..
<robclark> just outside of the pgtable helpers
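Put together, the ordering they are converging on would look something like this in a driver-level map wrapper. This is a sketch with made-up msm-flavoured names; it assumes io-pgtable runs with dummy TLB ops and parks replaced table pages on a driver list, and it uses the older-style ->map() callback with a gfp argument that io-pgtable had at the time.

```c
#include <linux/list.h>
#include <linux/io-pgtable.h>

/* Hypothetical map wrapper: io-pgtable itself never flushes or frees
 * (dummy TLB ops); the wrapper does the TLBI before returning, and
 * only frees replaced tables after that TLBI. */
static int my_mmu_map(struct my_mmu *mmu, unsigned long iova,
		      phys_addr_t paddr, size_t size, int prot)
{
	int ret;

	ret = mmu->pgtbl_ops->map(mmu->pgtbl_ops, iova, paddr, size,
				  prot, GFP_KERNEL);
	if (ret)
		return ret;

	/* the TLBI may not be deferred: it must complete before the
	 * new mapping is considered visible to the GPU */
	my_tlb_inv_range(mmu, iova, size);		/* hypothetical */

	/* the frees may be deferred, but never past the TLBI */
	my_free_stale_tables(&mmu->stale_tables);	/* hypothetical */
	return 0;
}
```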
<robmur01> eww, I'd forgotten about all that stuff.... never understood why you don't just copy the TLB ops from ttbr1_cfg :/
<robclark> the issue, IIRC, was that the gpu could be suspended when userspace triggers unmap
<robclark> I guess there is probably room to do _something_ better... like add some more adreno-smmu-priv ops for runpm or something
<robmur01> Ah, OK, but you could still have shim ops like "if (pm_runtime_get_if_in_use()) return real_ops->op();"
<robmur01> snap :)
<robclark> yeah, I think something like that could work.. need to convince myself that it can be non-racy
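The shim they both typed at the same time would be a one-liner per op. pm_runtime_get_if_in_use() and pm_runtime_put_autosuspend() are the real runtime-PM helpers; the surrounding types and ops are illustrative.

```c
#include <linux/pm_runtime.h>
#include <linux/io-pgtable.h>

/* Hypothetical shim TLB op: if the GPU is runtime-suspended its TLB
 * has already lost state, so the invalidation can be skipped; if it
 * is in use, the get/put keeps it powered across the real op. */
static void my_shim_tlb_flush_all(void *cookie)
{
	struct my_gpu *gpu = cookie;

	if (pm_runtime_get_if_in_use(gpu->dev) <= 0)
		return;	/* suspended (or RPM disabled): nothing to flush */

	gpu->real_tlb_ops->tlb_flush_all(cookie);
	pm_runtime_put_autosuspend(gpu->dev);
}
```

The race to convince oneself about is the window between the in-use check and the op; since pm_runtime_get_if_in_use() only succeeds by taking a usage-count reference, the device cannot suspend underneath the real op once the check passes.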
pbsds has joined #panfrost
pbsds3 has quit [Ping timeout: 480 seconds]
pbsds is now known as pbsds3
atler has quit [Quit: atler]
Rathann has joined #panfrost
Rathann has quit [Remote host closed the connection]
MajorBiscuit has joined #panfrost
MajorBiscuit has quit [Quit: WeeChat 3.6]