ChanServ changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://oftc.irclog.whitequark.org/panfrost - <macc24> i have been here before it was popular
derzahl has joined #panfrost
Daanct12 has joined #panfrost
davidlt has joined #panfrost
davidlt has quit [Ping timeout: 480 seconds]
Daanct12 has quit [Remote host closed the connection]
Daanct12 has joined #panfrost
derzahl has quit [Read error: Connection reset by peer]
pch has quit [singleton.oftc.net synthon.oftc.net]
hanetzer has quit [singleton.oftc.net synthon.oftc.net]
jambalaya has quit [singleton.oftc.net synthon.oftc.net]
DVulgaris has quit [singleton.oftc.net synthon.oftc.net]
Lyude has quit [singleton.oftc.net synthon.oftc.net]
soreau has quit [singleton.oftc.net synthon.oftc.net]
floof58 has quit [singleton.oftc.net synthon.oftc.net]
jstultz has quit [singleton.oftc.net synthon.oftc.net]
bluebugs has quit [singleton.oftc.net synthon.oftc.net]
rcf has quit [singleton.oftc.net synthon.oftc.net]
cphealy has quit [singleton.oftc.net synthon.oftc.net]
tlwoerner has quit [singleton.oftc.net synthon.oftc.net]
narmstrong_ has quit [singleton.oftc.net synthon.oftc.net]
erle has quit [singleton.oftc.net synthon.oftc.net]
FLHerne has quit [singleton.oftc.net synthon.oftc.net]
indy has quit [singleton.oftc.net synthon.oftc.net]
mav has quit [singleton.oftc.net synthon.oftc.net]
Consolatis has quit [singleton.oftc.net synthon.oftc.net]
xdarklight has quit [singleton.oftc.net synthon.oftc.net]
strongtz[m] has quit [singleton.oftc.net synthon.oftc.net]
AreaScout_ has quit [singleton.oftc.net synthon.oftc.net]
stebler[m] has quit [singleton.oftc.net synthon.oftc.net]
sigmaris has quit [singleton.oftc.net synthon.oftc.net]
enunes has quit [singleton.oftc.net synthon.oftc.net]
go4godvin has quit [singleton.oftc.net synthon.oftc.net]
hl` has quit [singleton.oftc.net synthon.oftc.net]
karolherbst has quit [singleton.oftc.net synthon.oftc.net]
atler has quit [singleton.oftc.net synthon.oftc.net]
tomeu has quit [singleton.oftc.net synthon.oftc.net]
ndufresne has quit [singleton.oftc.net synthon.oftc.net]
mriesch has quit [singleton.oftc.net synthon.oftc.net]
pjakobsson has quit [singleton.oftc.net synthon.oftc.net]
jernej has quit [singleton.oftc.net synthon.oftc.net]
alpernebbi has quit [singleton.oftc.net synthon.oftc.net]
jelly has quit [singleton.oftc.net synthon.oftc.net]
DPA- has quit [singleton.oftc.net synthon.oftc.net]
pendingchaos has quit [singleton.oftc.net synthon.oftc.net]
unevenrhombus[m] has quit [singleton.oftc.net synthon.oftc.net]
urja has quit [singleton.oftc.net synthon.oftc.net]
cyrozap has quit [singleton.oftc.net synthon.oftc.net]
simon-perretta-img has quit [singleton.oftc.net synthon.oftc.net]
falk689 has quit [singleton.oftc.net synthon.oftc.net]
italove has quit [singleton.oftc.net synthon.oftc.net]
jschwart has quit [singleton.oftc.net synthon.oftc.net]
robmur01 has quit [singleton.oftc.net synthon.oftc.net]
tanty has quit [singleton.oftc.net synthon.oftc.net]
kenzie has quit [singleton.oftc.net synthon.oftc.net]
CME has quit [singleton.oftc.net synthon.oftc.net]
alarumbe has quit [singleton.oftc.net synthon.oftc.net]
dhewg has quit [singleton.oftc.net synthon.oftc.net]
bbrezillon has quit [singleton.oftc.net synthon.oftc.net]
digetx has quit [singleton.oftc.net synthon.oftc.net]
erle has joined #panfrost
spawacz has joined #panfrost
Major_Biscuit has joined #panfrost
davidlt has joined #panfrost
Daaanct12 has joined #panfrost
Daaanct12 has quit [Remote host closed the connection]
Daanct12 has quit [Ping timeout: 480 seconds]
rasterman has joined #panfrost
Major_Biscuit has quit [Ping timeout: 480 seconds]
<icecream95>
Hmm... should I rely on userfaultfd or mprotect/SIGSEGV handlers for tracking writes to the doorbell page for v10 panwrap?
<icecream95>
I guess I could even point the blob to a completely different set of pages, then proxy writes over in a different thread
<HdkR>
any reason to use userfaultfd if you're monitoring in-process? That's more useful for out of process fault handling isn't it?
<icecream95>
HdkR: Someone might want to override the SEGV handler, but it's less likely for userfaultfds to be messed with by the application being traced
<HdkR>
Oh, you're tracing arbitrary applications? Then yea, userfaultfd is the way to go. I've never really seen anything use it
<icecream95>
But does userfaultfd allow reprotecting a page as soon as an access completes?
<HdkR>
faulting thread sleeps until a response is given in the userfaultfd handling, so it should?
<HdkR>
Can kind of do whatever you want
Major_Biscuit has joined #panfrost
<icecream95>
Hmm.. the other problem is how to copy writes from the MCU back to the userfaultfd-protected pages
<icecream95>
But with how the blob works, I don't think it's super important for that to always be updated.. mostly it's to prevent overflowing the ring buffer, I think
<icecream95>
And it'd take about a thousand batches for that to overflow, so it doesn't matter if we are a *little* behind
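[editor's sketch] A rough illustration of why being "a little behind" is tolerable, assuming the ring is tracked with monotonically increasing insert/extract counters and holds about a thousand batches. The class, its field names and the capacity are hypothetical and are not taken from the blob or from panwrap. With a stale extract offset the producer only underestimates free space, so it can stall early but cannot overwrite unread entries.

```python
class RingProducer:
    """Hypothetical producer side of a command ring (illustration only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.insert = 0        # total entries ever written by the producer
        self.seen_extract = 0  # last consumer offset we observed; may be stale

    def update_extract(self, extract):
        # Consumer offsets only move forward, so never go backwards.
        self.seen_extract = max(self.seen_extract, extract)

    def free_slots(self):
        # A stale seen_extract makes this smaller than reality, never larger.
        return self.capacity - (self.insert - self.seen_extract)

    def submit(self):
        """Write one entry; return False if the ring looks full."""
        if self.free_slots() == 0:
            return False
        self.insert += 1
        return True


ring = RingProducer(capacity=1000)   # "about a thousand batches" (assumed)
accepted = sum(ring.submit() for _ in range(1001))
print(accepted, ring.free_slots())
```

Only once the copied-back offset lags by the whole capacity does `submit()` start refusing work, which is the case the proxying has to avoid.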
Daanct12 has joined #panfrost
guillaume_g has joined #panfrost
<icecream95>
But then how do I make a page become 'missing' again? Otherwise I can only catch writes...
nlhowell has joined #panfrost
Major_Biscuit has quit [Ping timeout: 480 seconds]
nlhowell has quit [Ping timeout: 480 seconds]
<icecream95>
I guess I could just mmap(MAP_FIXED) in a new page which has not been faulted in yet..
camus has quit [Read error: Connection reset by peer]
Major_Biscuit has joined #panfrost
camus has joined #panfrost
<icecream95>
I wonder if it would be better to just poll memory in another thread... but that wouldn't be as fun
Daanct12 has quit [Ping timeout: 480 seconds]
<icecream95>
Meh, I'll go with the wait. It's not as if the blob ever renders at more than 100 fps anyway
Major_Biscuit has quit [Ping timeout: 480 seconds]
pch has joined #panfrost
<icecream95>
I have to say that Arm are very forward thinking... I'm sure people will eventually need > 4 GB ring buffers for submitting GPU command lists /s
camus has quit [Remote host closed the connection]
camus has joined #panfrost
<icecream95>
Merge request for v10 support created! (against panloader, Mesa will hopefully be soon)
<alyssa>
always about the same time in the OpenCL CTS
<alyssa>
those ones don't even implicate panfrost ...
<alyssa>
(this is with Dmitry's fixes)
<alyssa>
I don't even know where to begin with that
* alyssa
enables CONFIG_DEBUG_PREEMPT
<alyssa>
and lockdep, why is lockdep not enabled?
<alyssa>
OK. Just enabled a big pile of debug options. Now we wait and see if I get more useless information out of this splat
<alyssa>
(Unfortunately my current reproducer takes like 15 minutes)
davidlt has quit [Ping timeout: 480 seconds]
<alyssa>
I must say, on the list of things I wanted to do today, I don't think "debug lock splat" made the cut .....
<alyssa>
but I suppose the relevant bugs affect much more than just OpenCL
<jekstrand>
Yeah, the OpenCL CTS likes to torment your threading
<alyssa>
OK, I have splat!
<alyssa>
jekstrand: OpenCL CTS + Karol's runner for squaring the torment
<alyssa>
this is really deep in kernel guts, but at least I have helpful debug info
<alyssa>
drm_gem_get_pages called shmem_read_mapping_page
<alyssa>
which uses the GFP from the mapping passed
<alyssa>
this mapping is seemingly GFP_KERNEL
<alyssa>
so that's part 1
<alyssa>
part 2 is investigating the preemption disable
<alyssa>
logs say it was disabled in get_page_from_freelist
<alyssa>
that's the shrinker, I guess
<robclark>
alyssa: I've seen some folio splats like that on -rc2.. hmm, but also w/ some of my own patches that do more eviction/shrink.. that said other than the shrinker connection it seems like unrelated bug?
<alyssa>
robclark: IDK, I'm way over my head here
<alyssa>
I don't understand where get_page_from_freelist is called from, and why it disables preemption
<robclark>
it's called in page allocation path.. which can be basically anything that can allocate memory.. but things like GFP_ATOMIC should be used in allocation paths when you hold spin locks and things like that
<robmur01>
FWIW first thing I'd do is try something newer than rc2. There have definitely been... issues... this cycle - rc3 didn't even boot for some of us
<alyssa>
OK
<alyssa>
sounds like an "interesting" rebase, currently on some downstream hell because mainlining for this SoC is stalled...
<alyssa>
what would you recommend I rebase against?
<robclark>
that doesn't look like something that should be atomic.. and yeah, only reason I'm on -rc2 is because that is what drm-next is on and msm-next can't be ahead of drm-next
<alyssa>
robclark: hm? (the first sentence)
<robmur01>
I'd expect rc5 to be a bit more solid
<robclark>
you can sprinkle might_sleep() around the call-stack
<alyssa>
(this branch is linux-next 20220614 plus ~200 patches, mostly SoC specific, really delightful actually ....)
<alyssa>
(admittedly a lot of this seems specific to mt8195 and is maybe not needed on mt8192)
<robclark>
try git-rebase first and see how badly it goes.. usually there isn't as much churn btwn -rc's compared to trying to rebase across a merge window
<alyssa>
rebase on..?
<alyssa>
oh, rc5, er ok
<robclark>
oh, that said, the splat actually tells you:
<alyssa>
we'll see how rc5 fares instead of linux-next, assuming no SoC support slipped through the cracks of commits in next \ rc5
<robclark>
oh, linux-next .. is a great way to beta test everyone else's bugs ;-)
<alyssa>
truth.
<robmur01>
sounds like that one started in next-20220614 and lived for maybe a day or two - such great luck you have there!
<alyssa>
robmur01: truthfully.
<robmur01>
yup, my general rule of thumb would be run -next if you want to find bugs in -next, run mainline before about rc4 to check for critical bugs, run late RCs or release tags to do any actual development work
<alyssa>
that sounds sane.
<robclark>
yeah, same.. I try to stick to mainline when developing my own bugs and regressions :-P
<robmur01>
developing on -next might count as some form of self-flagellation
<robclark>
yeah
alyssa has quit [Quit: leaving]
alyssa has joined #panfrost
<alyssa>
`Purging 1275068416 bytes`
<alyssa>
That's a lot of bytes :|
<alyssa>
Ok, so it gets a little further after the uprev, more splat though coming right up
<alyssa>
that part is very clearly in panfrost, though
<robclark>
hmm, 0x4c000000 bytes.. is a fairly roundish #
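[editor's sketch] A quick check of that arithmetic in Python. Only the byte count comes from the log; expressing it in 16 MiB chunks is just a convenient way to show how round the number is.

```python
purged = 1275068416  # byte count from the "Purging ... bytes" log line

MIB = 1 << 20

as_hex = hex(purged)
size_mib = purged // MIB
chunks_16mib, remainder = divmod(purged, 16 * MIB)

print(as_hex, size_mib, chunks_16mib, remainder)
```

So the purge is exactly 1216 MiB, or 76 chunks of 16 MiB with nothing left over.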
<alyssa>
might be able to get it myself
<alyssa>
robclark: TBF might just be the CL CTS being dumb
<alyssa>
although refusing to cache BOs above a certain size might be wise.
<robclark>
oh, yeah, I think we cap it at 64MB
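[editor's sketch] A toy model of a size-capped BO cache, to make the idea concrete. The 64 MB figure is from the chat (read here as 64 MiB, which is an assumption); the `BoCache` class, its methods and the exact-size bucketing are hypothetical and do not mirror any real driver's cache.

```python
MAX_CACHED_BO_SIZE = 64 * 1024 * 1024  # cap from the chat, assumed to mean 64 MiB


class BoCache:
    """Toy cache that tracks freed buffer sizes only (hypothetical)."""

    def __init__(self, max_bo_size=MAX_CACHED_BO_SIZE):
        self.max_bo_size = max_bo_size
        self.free = {}  # size in bytes -> number of cached buffers of that size

    def put(self, size):
        """Return True if the buffer was cached, False if the caller should free it."""
        if size > self.max_bo_size:
            return False
        self.free[size] = self.free.get(size, 0) + 1
        return True

    def take(self, size):
        """Return True if a cached buffer of exactly this size could be reused."""
        if self.free.get(size, 0) == 0:
            return False
        self.free[size] -= 1
        return True

    def cached_bytes(self):
        return sum(size * count for size, count in self.free.items())


cache = BoCache()
small_cached = cache.put(4096)
huge_cached = cache.put(1275068416)  # the size from the purge message
print(small_cached, huge_cached, cache.cached_bytes())
```

With a cap like this, one oversized allocation is released immediately instead of sitting in the cache until a purge.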
Major_Biscuit has joined #panfrost
<alyssa>
robclark: btw, any plans to do conformant cl 3.0 on freedreno?
<robclark>
hmm, doesn't cl3 want you to have annoying things like generic pointers?
<alyssa>
Seemingly not
<alyssa>
cl3 made optional a pile of stuff that was mandatory in cl2
<alyssa>
because that's not confusing or anything
rkanwal has quit [Ping timeout: 480 seconds]
<robclark>
at any rate.. cl is firmly in the category of "I poke at it from time to time on weekends, and not a thing $day_job cares about at all"
<alyssa>
got it
<alyssa>
that's m1 for me, so. :p
<robclark>
there is some work on clvk (though IMO cl on vk still has some, umm, gaps).. but we apparently don't want to ship any native cl drivers
<alyssa>
no?
<robclark>
we apparently don't like things that aren't vk ;-)
<alyssa>
right.
<robclark>
idk, situation might be different if amd and intel had production quality mesa based cl stacks
<robclark>
but I can't argue against not having more vendor gpu stacks.. intel's non-mesa video stack is bad enough ;-)
<alyssa>
yeah
<anarsoul>
well, it works
<anarsoul>
I assume you're talking about video-decoding stack
<alyssa>
mm, tasty tasty circular locking
<robclark>
anarsoul: vaapi? I think we have at least three different versions of it depending on which intel chip you are talking about.. it's a mess
<anarsoul>
yet it works (at least in firefox)
<anarsoul>
but yeah, I agree that overall videodecoding stack in linux is a mess
<alyssa>
Pass 2290 Fails 16 Crashes 6 Timeouts 0
<alyssa>
so >99% by a hair. I'll take it.
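[editor's sketch] The "by a hair" can be checked directly, assuming the pass rate is taken over all four result categories in the summary line.

```python
results = {"pass": 2290, "fail": 16, "crash": 6, "timeout": 0}

total = sum(results.values())
pass_rate = results["pass"] / total

print(f"{total} tests, {pass_rate:.2%} passing")
```

Two more non-passing tests would have dropped the rate below 99%.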
<alyssa>
most of the fails are math_brute_force .... delightful ....
<alyssa>
crashes seemingly are more kernel bugs
<anarsoul>
cl kernel or linux kernel? :)
<alyssa>
linux for the crashes, cl for the fails
Major_Biscuit has quit [Ping timeout: 480 seconds]