<fionera[m]>
<sven> "but that’s possible on these..." <- this was what I wanted to ask ^^ Was curious why parallels on macos doesnt support it
<Lucy[m]>
Does macOS even allow that?
<steev>
I don't have anything that's actually Thunderbolt, but I thought it did, fionera[m] - https://kb.parallels.com/en/124266 says you can use e.g. an eGPU
<steev>
ahh directly
<fionera[m]>
yeah, I would like to have the raw PCIe device in the VM :D
<fionera[m]>
though that's probably something that macOS blocks
<MichaelMesser[m]>
I assume that would apply to VMs as well.
<marcan>
that parallels link is for intel machines
<marcan>
M1 VMs on macOS do not support passthrough of anything and definitely not eGPUs
<MichaelMesser[m]>
macOS on Intel doesn't support passthrough either
<marcan>
ah, that link is only about paravirt graphics, not passthrough
<marcan>
it's technically possible to make eGPUs work (with performance limitations) on M1 devices, but it's like a whole driver refactoring/development project to pull it off in a way that can work
<marcan>
so you'd have to find someone well versed in graphics and they'd likely have to spend months working on it, for each GPU driver/vendor, and then everything needs to be upstreamed, and upstream might not like the (rather intrusive) changes it brings
<marcan>
likely needs userspace mesa changes too
<marcan>
and you'd still end up with pathologically bad performance for some workloads
<marcan>
VMs don't change any of that (the guest still needs the same driver changes)
<marcan>
technically with a VM you could hook all BAR accesses and make it "work" with no changes to the guest, but performance would be hilariously bad
<marcan>
like 100x slower bandwidth than native VRAM accesses bad
<marcan>
would be trivial to test that with the m1n1 hypervisor once Thunderbolt/PCIe stuff is in Asahi, if someone wants to find out just how bad it would be
<marcan>
mostly just completing unaligned r/w support to make it work
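(For context: "completing unaligned r/w support" means teaching the hypervisor's MMIO trap handler to split a guest access that isn't naturally aligned into aligned device accesses. A minimal sketch of the splitting logic in C; the function name and structure are illustrative assumptions, not m1n1's actual code:)

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Split an unaligned write into the widest naturally aligned device
 * accesses that cover it, since Device memory only tolerates accesses
 * aligned to their own size. Assumes little-endian, as on Apple Silicon.
 * Hypothetical sketch, not m1n1's actual implementation. */
static void mmio_write_unaligned(volatile uint8_t *dev, uint64_t off,
                                 const uint8_t *buf, size_t len)
{
    while (len) {
        size_t chunk;
        uint64_t v = 0;

        /* Pick the widest access whose size divides the current offset. */
        if (!(off & 7) && len >= 8)
            chunk = 8;
        else if (!(off & 3) && len >= 4)
            chunk = 4;
        else if (!(off & 1) && len >= 2)
            chunk = 2;
        else
            chunk = 1;

        memcpy(&v, buf, chunk); /* the source buffer may itself be unaligned */

        switch (chunk) {
        case 8: *(volatile uint64_t *)(dev + off) = v; break;
        case 4: *(volatile uint32_t *)(dev + off) = (uint32_t)v; break;
        case 2: *(volatile uint16_t *)(dev + off) = (uint16_t)v; break;
        default: dev[off] = (uint8_t)v; break;
        }

        off += chunk;
        buf += chunk;
        len -= chunk;
    }
}
```

(Each trapped access also pays a full exception round trip, which is where the "hilariously bad" bandwidth above comes from.)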
<MichaelMesser[m]>
Does anything other than eGPUs use this feature?
<marcan>
not to my knowledge
<marcan>
potentially weird GPU-like things (like FPGA boards with lots of onboard RAM) could
<marcan>
but the problem with GPUs is that making them work "properly" the sane way involves potentially modifying every app/game
<marcan>
GPUs are really the only device where this is exposed to a huge variety of applications
<marcan>
if you have something more niche you can just work around the problem in the driver/app much more easily
<MichaelMesser[m]>
So if Apple does not care about eGPUs, this is unlikely to be fixed in later hardware?
<Lucy[m]>
Any way to know if it's fixed in the M2? Probably not yet, right?
<chadmed>
i wonder if they even consider it a bug at all
<marcan>
this isn't unique to the M1, it's the same on many other ARM systems
<chadmed>
yeah, I doubt they'll fix it
<marcan>
it's not a "bug", it's software and hardware assuming everything is like x86
<marcan>
they *could* make it like x86 to make this work, but it not being like x86 isn't a bug
<chadmed>
not having eGPU support also encourages software vendors to fix their applications to work better on AGX rather than direct customers to just go buy GPUs that "work"
<marcan>
that too
<Lucy[m]>
Didn't you say that performance would be impacted though?
<marcan>
the workaround you'd have to do to make it work impacts performance
<chadmed>
and even if they did make device memory look like x86, i doubt many PC consumer grade cards would work out of the box because of the weird Intel-specific DMA address filtering which itself would require software workarounds that would nuke performance in any case
<chadmed>
so there's no point fixing any of this
<Lucy[m]>
Right. Makes sense then
<chadmed>
this has all already been experienced on ppc64 workstations
<chadmed>
very, very few PCIe cards work properly on those; there's like 3 consumer-grade cards in total that work as intended on Talos boards, for example
<chadmed>
it's going to be an interesting tug of war once ARM platforms start becoming commonplace on less "integrated" stuff than Apple's
<chadmed>
who will capitulate first, the add-in cards or the SoC vendors?
<marcan>
I think we still haven't 100% validated that PCIe doesn't work with any memory type other than Device-nGnRE
<marcan>
I just looked up the ARM spec for PCIe integration, and it says Normal-NC is allowed (but not Normal)
<marcan>
we've never tried Normal-NC, just Normal
<marcan>
I'm not holding my breath but it's worth checking
<marcan>
might do it a bit later
<marcan>
(Normal-NC would make eGPUs work sanely)
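(For reference, the names being thrown around here are the standard ARMv8 memory-attribute encodings from the ARM ARM, where nG/nR/nE stand for non-Gathering / non-Reordering / no Early write acknowledgement. The ones discussed, rendered as C constants:)

```c
/* ARMv8 MAIR attribute encodings (per the ARM ARM) for the memory types
 * under discussion. */
#define MAIR_ATTR_DEVICE_nGnRnE 0x00 /* strictest Device type */
#define MAIR_ATTR_DEVICE_nGnRE  0x04 /* posted writes; the usual MMIO type */
#define MAIR_ATTR_DEVICE_nGRE   0x08 /* reordering allowed */
#define MAIR_ATTR_DEVICE_GRE    0x0c /* gathering + reordering allowed */
#define MAIR_ATTR_NORMAL_NC     0x44 /* Normal, non-cacheable */
#define MAIR_ATTR_NORMAL        0xff /* Normal, write-back cacheable */
```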
<chadmed>
are cards expected to gracefully handle partial writes on x86?
<marcan>
yes
<kettenis>
as far as I know NVIDIA uses Device-GRE mappings instead of Normal-NC in their (proprietary) graphics stack
<marcan>
that won't work with userland that tries to make unaligned accesses to VRAM
<marcan>
which some userland does
<kettenis>
it'd be slow since the kernel would emulate the misaligned loads and stores
<kettenis>
and presumably NVIDIA's userland code doesn't do this
<kettenis>
(I mean unaligned accesses to VRAM from userland)
<marcan>
it's up to individual applications, not nvidia's userland code
<marcan>
e.g. memcpy will often do unaligned accesses for performance
<marcan>
that's why we can't just fix this in drivers
<marcan>
there is no way unaligned emulation will perform well when literally memcpy hits it
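(To make the memcpy point concrete: an optimized memcpy is free to issue wide, unaligned, even overlapping loads and stores, which Device memory forbids. The driver-side workaround is an explicitly aligned copy loop, in the spirit of the kernel's memcpy_toio(); this standalone sketch is an illustration under those assumptions, not any project's actual code:)

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy into a Device/VRAM mapping using only aligned 32-bit stores.
 * Illustrative sketch; assumes len is a multiple of 4. The problem marcan
 * describes is that nothing forces arbitrary applications to use a helper
 * like this instead of plain memcpy(). */
static void copy_to_vram32(volatile uint32_t *dst, const void *src, size_t len)
{
    const uint8_t *s = src;
    uint32_t word;

    for (size_t i = 0; i < len / 4; i++) {
        memcpy(&word, s + 4 * i, sizeof(word)); /* source may be unaligned */
        dst[i] = word;                          /* aligned device store */
    }
}
```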
<kettenis>
well, does OpenGL/Vulkan/CUDA allow mapping VRAM directly into userland
<kettenis>
or is that an Intel/AMD extension?
<marcan>
I think every GPU driver allows that?
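(As an illustration of what "mapping VRAM directly into userland" looks like through OpenGL, assuming a GL 4.4 context and a loader exposing these entry points; whether the returned pointer is really a BAR mapping or a shadow in system RAM is the driver's choice:)

```c
#include <GL/gl.h> /* plus a loader (e.g. glad/GLEW) for GL 4.4 entry points */

/* Create a buffer with a persistent, coherent mapping and hand the raw
 * pointer back to the application. From here on, any code in the process,
 * including optimized memcpy() with unaligned wide stores, can touch it. */
static void *map_buffer_directly(GLuint *vbo, GLsizeiptr size)
{
    const GLbitfield flags =
        GL_MAP_WRITE_BIT | GL_MAP_PERSISTENT_BIT | GL_MAP_COHERENT_BIT;

    glGenBuffers(1, vbo);
    glBindBuffer(GL_ARRAY_BUFFER, *vbo);
    glBufferStorage(GL_ARRAY_BUFFER, size, NULL, flags);
    return glMapBufferRange(GL_ARRAY_BUFFER, 0, size, flags);
}
```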
<kettenis>
the USB ones certainly don't ;)
<marcan>
do USB GPUs exist, besides that abomination Lina is working on?
<marcan>
(DisplayLink is not a GPU, it's a display controller)
<kettenis>
I was thinking of DisplayLink as an example of something that doesn't expose a VRAM framebuffer
<kettenis>
but yeah, not really a GPU
<marcan>
yeah, this isn't about framebuffers, those end up being driver-managed; it's about things like texture buffers, VBOs, etc.
<kettenis>
I'm not familiar enough with the various graphics APIs, but at least some of the OpenGL functionality to directly map stuff like that from VRAM is an extension that drivers don't have to implement
<kettenis>
ah, well, KVM throws in a whole other complication
<kettenis>
and usually these discussions happen in the context of the open-source (Mesa) graphics stack
<kettenis>
which definitely uses memcpy and assumes that non-aligned access works
<kettenis>
it'd be interesting to see what an Apple Silicon Mac Pro looks like and whether Apple is going to support PCIe GPUs in those
<marcan>
still need to test that normal-NC really does not work
<marcan>
nope, does not work, and in fact you get a nice AMCC panic from macOS
<marcan>
panic(cpu 0 caller 0xfffffe0013443760): "AMCC PLANE3 PIO request with RO flag set error: INTSTS 0x0000000000400000 AFERRLOG0/1/2/3 0x101000/0x17f1406/0x2000000/0x40001 ADDR 0x600101000 CMD/SIZE/TYPE 0x14(CifNCWr)/0x7f/0x1 AID/TID 0x10/0" @AppleT8101PlatformErrorHandler.cpp:1323
<marcan>
presumably RO=reorder
<marcan>
nGnRnE and nGnRE both work for PCIe BARs
<marcan>
not sure how nGnRnE is supposed to actually work since PCIe writes are posted, but at least the fabric does not complain
<kettenis>
with nGnRnE you simply don't get the benefits of posting at the CPU level
<marcan>
yeah, it's just still posting behind the scenes which seems a bit odd
<kettenis>
compatible with x86 ;)
<marcan>
heh
<marcan>
I really need to reverse engineer that AMCC error handler stuff, it'd be very useful
<kettenis>
wonder whether GRE or GnRE works
<marcan>
(I only have some stub stuff in m1n1 experiments)
<kettenis>
since G is what you really care about for mapping "prefetchable" PCI BARs
<kettenis>
but even then, it is probably not worth spending time on making eGPUs work
<marcan>
kettenis: tested they do
<marcan>
but yeah, won't fix the alignment problem
<marcan>
all Device modes work for aPCIeC, but Normal modes do not
<kettenis>
so that means that all pci drivers that use ioremap_wc() to map a prefetchable BAR won't work on these machines
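(The pattern kettenis means is common in GPU and framebuffer drivers; on arm64, ioremap_wc() yields a Normal-NC mapping, so the map call succeeds and it's the first access through it that the fabric rejects. A hedged sketch of the idiom, with pdev and the BAR index as placeholders:)

```c
#include <linux/io.h>
#include <linux/pci.h>

/* Typical prefetchable-BAR mapping idiom. On these machines the
 * write-combined (Normal-NC) path maps fine but faults on first access. */
static void __iomem *map_bar(struct pci_dev *pdev, int bar)
{
    resource_size_t start = pci_resource_start(pdev, bar);
    resource_size_t len = pci_resource_len(pdev, bar);

    if (pci_resource_flags(pdev, bar) & IORESOURCE_PREFETCH)
        return ioremap_wc(start, len); /* Normal-NC on arm64 */

    return ioremap(start, len);        /* Device-nGnRE */
}
```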
<fionera[m]>
Oh wow, I really started a discussion about that :) I was interested because I was wondering if I can pass it to a VM, either Windows or Linux, and not only GPUs but also other PCIe devices like network cards. (Yes, I use my eGPU case to test enterprise NICs on my laptop)
<kettenis>
the answer is pretty much no for linux (either passthrough or native)
<milek7_>
GL_MAP_COHERENT_BIT is required from GL 4.4
<milek7_>
before that, I think it is possible to get away with an intermediary buffer in the driver
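(A sketch of the pre-4.4 escape hatch milek7_ describes: without GL_MAP_COHERENT_BIT, a driver can hand the application a plain system-RAM shadow from its map call and only touch VRAM itself, with aligned accesses, at unmap time. These are entirely hypothetical driver internals, not any real implementation:)

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical non-coherent mapping path: the app writes to a malloc'd
 * shadow, and the driver flushes it to VRAM with aligned stores on unmap. */
struct mapped_buf {
    volatile uint32_t *vram; /* device mapping; aligned access only */
    void *shadow;            /* what the application actually sees */
    size_t len;              /* assumed to be a multiple of 4 */
};

static void *driver_map(struct mapped_buf *b)
{
    b->shadow = malloc(b->len); /* plain RAM: memcpy-safe for the app */
    return b->shadow;
}

static void driver_unmap(struct mapped_buf *b)
{
    const uint8_t *s = b->shadow;
    uint32_t word;

    for (size_t i = 0; i < b->len / 4; i++) {
        memcpy(&word, s + 4 * i, sizeof(word));
        b->vram[i] = word; /* aligned 32-bit device store */
    }
    free(b->shadow);
    b->shadow = NULL;
}
```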
<unrelentingtech>
<chadmed> "and even if they did make device..." <- what is the intel specific dma address filtering thing? amdgpu generally just works on non-broken aarch64 with no workarounds