<axboe>
the sparseness ends up being an issue for the spin_table
<axboe>
which the hack works around
<axboe>
so might work with stock dtb
<jannau>
I thought you tried with the unmodified dtb. I use https://paste.debian.net/1231019/ to simulate the sparseness and that works with my patch and the stock dtb
<axboe>
jannau: for your patch I used unmodified
<axboe>
jannau: like you asked :)
<axboe>
jannau: see the debug output I badly typed
<jannau>
ok. I saw that but I can't explain how the change could cause that
user982492 has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
Guest119 has left #asahi-dev [WeeChat 3.3]
<sven>
axboe: do you want to submit that tipd interrupt mask fix or should I do it?
<axboe>
sven: either way is fine with me, just let me know what you prefer
<sven>
if you already have a patch that’s working it’s probably the easiest to just submit it
<axboe>
ok, will do so right now
<axboe>
sven: adding an actual changelog :)
<axboe>
sven: do you want a suggested-by in there?
<sven>
:D
<sven>
uh, sure
<axboe>
credit where credit is due
<axboe>
Sven Peter <sven@svenpeter.dev>
<axboe>
that your preferred email?
<sven>
yup
<axboe>
shipped
<sven>
nice :)
user982492 has joined #asahi-dev
user982492 has quit []
<axboe>
I'm all for flushing out patches, hate carrying them
<alyssa>
here here
<j`ey>
so with the tps issue, it was a DEFER_PROBE thing or?
<axboe>
I know you guys all know this, but I'm still blown away by the perf here. game changing imho
<j`ey>
axboe: we're a bit late on the carrying patches part :P
<sven>
yeah… we’re carrying way too many patches right now.
<axboe>
j`ey: I know! but can always get done ;)
<j`ey>
sven: 157 patches now
<sven>
:(
<sven>
At least I’m only responsible for maybe 10
<j`ey>
sven: Im hopeful!
<axboe>
with a big backlog like that, useful to just try and flush out some of the independent ones. at least that makes you feel like you made progress ;)
<jannau>
sven: t6000-dart could use a ping for robin
<sven>
I already pinged him off-list last week or so
<sven>
I guess I could also re-submit that ugly usb hack
<axboe>
sven: oh, reminds me, didn't find any time to fiddle with shared tags yet
<axboe>
sven: as a temporary improvement, adjusting AQ to 2 would make sense imho
<axboe>
I've only once seen higher AQ depth be useful, and that was under provisioning and with discard sizes being smaller where then having a higher admin queue depth made a difference
<axboe>
sven: speaking of patches that could be pushed out... ;)
<axboe>
sven: upstream would look kindly at the apple-nvme driver, as it can be combined with a "remove various weird apple workarounds or quirks" in the generic one
<sven>
oh, I agree. But nvme requires this rtkit protocol that’s also required for smc. I’m just waiting for marcan to finish that part
<axboe>
ah ok, dependencies
<sven>
apple-nvme right now only works for platform devices fwiw
<axboe>
ok, so we can't drop the generic quirks
<axboe>
(not a huge issue imho, it's not very invasive)
<axboe>
it's more of an hch'ism
<sven>
:-)
<axboe>
as long as they stay out of the fast path, then I don't really care
<sven>
I considered adding a pcie backend to apple nvme but that driver is imho already big enough for the initial submission
<j`ey>
.. did I just realise that the first 'h' stands for Herr?
<axboe>
sven: I think the platform one is a good start
<axboe>
j`ey: if we ever end up at the same conference, I'll tell you the story of the time hch took my kids to watch the lego movie
<j`ey>
haha ok :D
<sven>
yeah, it’s also been tested by a few people for at least weeks (and sometimes even months) now without any major issues
<sven>
we’ll have to figure out how to deal with the shared rtkit dependency but I guess in the worst case we just create an immutable branch somewhere so that both trees can merge it after review
<j`ey>
jannau: for the atomic thing did you actually get the sleep-during-atomic BUG, or just from manual inspection?
<jannau>
I hit a sleep during atomic BUG
<kettenis>
I hope the bindings we came up with will be fairly uncontroversial since they largely just use generic stuff
<axboe>
sven: two subsequent boots I've run into aq timeouts
<axboe>
haven't seen that before
<sven>
huh, don’t think I ever saw that one either
<axboe>
oh sorry, io queue
<axboe>
hmm weird
<jannau>
the same I saw with the old cpufreq patches?
<axboe>
times out tag 10, then 2, then 3 and resets controller
<axboe>
30 seconds between each
<axboe>
boots fine, log into X and start a bunch of things
<sven>
guess I jinxed it by claiming no one had any major issues ;)
<sven>
so it doesn’t even completion poll the commands but they actually time out?
<axboe>
it does
<axboe>
hang on, took a pic
<axboe>
completioned polled for all 3, on last one it resets the controller, reset fails, IO errors ensue
<axboe>
let me know if you want to see the pic
<sven>
sure
<sven>
it should only reset when completion polling didn’t find that command
<axboe>
could just be timing and I'll still hit it, but seemed to be trivial to hit before and seems solid now
<sven>
strange, I can reproduce it with the cpufreq driver as well
<sven>
and adding that lock seems to fix it too
<sven>
i just don’t understand why
timokrgr has quit [Quit: User left the chat]
<axboe>
sven: trying to come up with a theory :)
<axboe>
but having unserialized issue seems like asking for trouble
<axboe>
though not quite sure why it needs to be serialized against irq
<axboe>
how relaxed is the arm memory model?
<kettenis>
very relaxed
<sven>
very. that writel has a dma barrier though
<axboe>
ok
<axboe>
we _might_ just need an appropriate barrier so that it's visible on a different CPU
<axboe>
but not sure how that's even easily doable without serialization anyway
<axboe>
then again, not an ordering expert, I can always convince myself either way...
<axboe>
I'd sugggest just going with the lock + irq disable around the issue for now
<sven>
yeah, same here.. time to watch will deacon’s io ordering talk again ;)
timokrgr has joined #asahi-dev
<axboe>
sven: cuddle up with a cup of your favorite beverage and open Documentation/memory-barrier.txt ;)
<sven>
I assumed that writel to MMIO should be enough. It’s dma_wmb which should order it against the previous writes to normal memory and then just a single STR
<sven>
:D
<axboe>
sven: my worry was between CPUs
<axboe>
issue on N, irq right after on N+1 - seems far fetched though
<sven>
hmmm… I wonder if there’s anything in the irq handler that could mess something up in that case
<axboe>
the spin unlock is a barrier, and as irq grabs the lock too first, I think it'd be safe
<axboe>
iirc you grab the lock around the whole irq handling
<sven>
yeah, I’m pretty sure most of the irq handling is inside the lock
<axboe>
another thing I just noticed, there seems to be excessive tcb clearing in there
<axboe>
since we memset at alloc, do we really need to clear in both inval and submit?
<sven>
ah, no
<sven>
once should be enough
<axboe>
seems we can kill both of those, and just set flags to zero
<sven>
true, that’s even better
<axboe>
I've got a patch for that too :)
<axboe>
and now booted
<sven>
i still wonder what exactly messes up the submission if it’s not inside a lock though
<axboe>
sven: one idea
<axboe>
you have two issues going on, one is half way through memcpy of command, other happens faster and submits the tag
<sven>
the memcpys should be to two different tags though
<axboe>
not sure how the nvme hw side works if first tag isn't written yet
<axboe>
yep
<axboe>
but only explanation I can find is that we trigger an interrupt for completion and don't find it, then it gets polled later
<sven>
the hw first looks into the tcb at the specified tag, then reads the actual index into the “queue” (which is really more like an array here) and then reads the queue at that index
<sven>
I just set index = tag so that I don’t have to deal with a tcb index and a queue index
<axboe>
sven: assuming you're keeping the ->enabled state for adding suspend, because I don't think it's needed right now
<sven>
i think the enabled originally comes from NVMEQ_ENABLED in pci.c. i convinced myself at some point that i didn't need it but then went back a few days later because i convinced myself that i do need it and then decided to keep it for now just-in-case IIRC
<sven>
and that irq handling indeed looks better :)
<axboe>
yeah I don't think you need it right now, but for suspend you probably will. so I left it alone
<axboe>
I'll toss you that tcb patch too in a bit
<axboe>
you can just fold in or however you prefer, I don't need attribution since it's not in tree yet