<sven>
rqou_: 4k kernels will never work with vfio (or anything else that uses the iommu api directly instead of going through dma-iommu)
<rqou_>
why not though? the vfio api seems to be able to expose iommu page sizes that differ from cpu page sizes
<kode54>
kernel doesn't yet support mixed page sizes
<kode54>
otherwise this 16k/4k issue would be better dealt with by supporting that and having separate 16k/4k processes
<rqou_>
right, sven wrote something that can support mixed iommu/cpu page sizes
<rqou_>
i don't understand why it can't work with vfio though
<sven>
you’d have to audit all the code to make sure nothing makes the implicit iommu page size = cpu page size assumption
<sven>
that’s easy enough for dma-iommu but gets annoying for everything that uses the raw iommu api
<rqou_>
seems like it should be possible for vfio?
<rqou_>
vfio exposes iommu page sizes in the userspace api
<sven>
I don’t know the api well enough
<rqou_>
neither do i lol, i just used it for the first time today
<rqou_>
but VFIO_IOMMU_GET_INFO returns a struct containing a bitmap of supported iova page sizes
<rqou_>
so this looks like it should be theoretically possible
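A minimal sketch of the query being described here, assuming an already-opened VFIO type1 container fd (`container_fd` is illustrative):

```c
#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* Ask the container for IOMMU info; iova_pgsizes is a bitmap where bit N
 * set means a 2^N-byte IOVA page size is supported. */
static int dump_iova_pgsizes(int container_fd)
{
	struct vfio_iommu_type1_info info = { .argsz = sizeof(info) };

	if (ioctl(container_fd, VFIO_IOMMU_GET_INFO, &info) < 0)
		return -1;

	if (info.flags & VFIO_IOMMU_INFO_PGSIZES)
		printf("supported IOVA page sizes: 0x%llx\n",
		       (unsigned long long)info.iova_pgsizes);
	return 0;
}
```

On a hypothetical 4k-CPU/16k-IOMMU setup the lowest set bit of that bitmap would be 16384, which is exactly the mismatch under discussion.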
<sven>
you’d have to check if anyone even looks into that and handles the iommu > cpu case
<rqou_>
well, i tried, but it fails before it can even get there
<sven>
that could also be meant for iommu < cpu or for large page support and all that
<rqou_>
i think it's probably for large page support, but it doesn't actually say so explicitly
<sven>
and then it’s a user space api so who knows what people do with it
<rqou_>
as far as i understand it, returning an iommu page size > cpu page size and then failing when attempting to map an improperly aligned page does not violate the api specification
<rqou_>
but who knows what userspace does
<rqou_>
and by userspace afaict i just mean "qemu"
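Under that reading, a well-behaved client would have to derive the granule from the bitmap and align everything to it before mapping; a sketch, with illustrative names:

```c
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* Take the smallest IOVA page size from the bitmap returned by
 * VFIO_IOMMU_GET_INFO and refuse to map anything not aligned to it.
 * With a 16k IOMMU under a 4k kernel the granule is 16384 even though
 * the process page size is 4096. */
static int map_dma_aligned(int container_fd, uint64_t vaddr, uint64_t iova,
                           uint64_t size, uint64_t iova_pgsizes)
{
	uint64_t granule = iova_pgsizes & -iova_pgsizes; /* lowest set bit */

	if ((vaddr | iova | size) & (granule - 1))
		return -1; /* misaligned for this IOMMU; the kernel would reject it */

	struct vfio_iommu_type1_dma_map map = {
		.argsz = sizeof(map),
		.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE,
		.vaddr = vaddr,
		.iova  = iova,
		.size  = size,
	};
	return ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &map);
}
```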
<sven>
right, the same is true for the raw iommu api but I’m still sure lots of random things will break
<sven>
I’d at least hide that case behind an explicit opt-in
<rqou_>
fair enough
<rqou_>
won't be the first time vfio has a bunch of random scary opt-in config options
<sven>
Or just not support it at all :D
<rqou_>
like the option that outright ignores improperly configured iommu
<sven>
dma-iommu makes sense to allow distro installers to boot and for stuff like box64/fex
<rqou_>
at least on x86, vfio has a bunch of enthusiast users doing totally wild things
<sven>
If you want something special like vfio you’re an enthusiast and can just switch to 16k kernels 🤷🏻‍♂️
<rqou_>
yeah, i did that for testing, but i'm hitting a different issue unrelated to any of this
<rqou_>
btw 16k kernels break sublime text (because jemalloc)
<sven>
(Or apply a scary patch/config options)
<rqou_>
vfio even on a 16k kernel still has the problem with "unsafe interrupt remapping"
<sven>
yes, and hopefully by shipping 16k kernels people start fixing their code
<rqou_>
but i have no idea what that actually means or how interrupts actually work on pcie
<rqou_>
there's an option to tell vfio to ignore that, which works
<rqou_>
idk if that's something we can get properly supported
<sven>
hm, I’d have to actually understand vfio and see what’s up with that
<sven>
but I really just know that vfio exists and roughly what it does :D
<rqou_>
i've used it for "enthusiast" gpu passthrough on x86 (mostly works)
<rqou_>
and like i said, i was testing it on m1 hoping i could prototype a driver in userspace
<sven>
the scary patch to make vfio work is to just add that iommu large page whatever bit to the define for a non-dma-iommu domain
<sven>
where work probably means “will crash somewhere”
<rqou_>
lol
<rqou_>
btw there are also scary/cursed high performance networking use cases where people run userspace pci drivers
<rqou_>
idk if anybody would be nuts enough to try that on m1
<rqou_>
once we have tbt it's possible!
<rqou_>
anyways, i'm not familiar with the linux dma-iommu api, but vfio's interface is quite high-level, so it "should" be possible to get it working with mismatched page sizes with "just" some coding
<rqou_>
but not super important
<kettenis>
kevans91: LOL, yes, that can't be right (it seems to work just fine though)
<kettenis>
thanks for spotting that
<sven>
rqou_: the problem really is that client/userspace drivers won’t expect that weird page size mismatch
<sven>
it’s the same for the iommu api: clients do get access to the iommu granule but they’ll likely make assumptions that are no longer true in our case
<sven>
so the solution is to make that weird case an explicit opt-in. That’s essentially what my patch does, and then dma-iommu uses that opt-in
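A hypothetical sketch of that opt-in; the identifiers below are made up for illustration and are not the names from sven's actual patch:

```c
#include <linux/bitops.h>
#include <linux/errno.h>
#include <linux/iommu.h>

/* Illustrative only: refuse to attach a client to a domain whose
 * smallest IOMMU page exceeds the CPU page size unless the client has
 * explicitly declared it can handle that, so nothing hits the
 * granule > PAGE_SIZE case by accident. */
static int attach_if_granule_ok(struct iommu_domain *domain,
				struct device *dev, bool opts_in)
{
	unsigned long granule = 1UL << __ffs(domain->pgsize_bitmap);

	if (granule > PAGE_SIZE && !opts_in)
		return -EINVAL;

	return iommu_attach_device(domain, dev);
}
```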
<kevans91>
kettenis: no problem, thanks for the good reference drivers :-)
<kevans91>
kettenis: it looks like you can kill sc_cpuremap over in aplintc, it's write-only as of fast ipi support
<rqou_>
sven: oh ok, that makes sense. thanks for that explanation
<kettenis>
kevans91: it will be needed again when I add support for running interrupts on other cores
<kettenis>
so I've left it in place for now
<kevans91>
kettenis: ah, makes sense- I'll resist the urge to rip out our equivalent, then =-)