ChanServ changed the topic of #asahi-dev to: Asahi Linux: porting Linux to Apple Silicon macs | Non-development talk: #asahi | General development | GitHub: https://alx.sh/g | Wiki: https://alx.sh/w | Logs: https://alx.sh/l/asahi-dev
jovahd has joined #asahi-dev
jovahd has quit [Quit: WeeChat 4.0.4]
jeisom has joined #asahi-dev
as400 has quit [Remote host closed the connection]
as400 has joined #asahi-dev
sawyer has quit [Quit: sawyer]
ourdumbfuture has quit [Quit: My Mac has gone to sleep. ZZZzzz…]
gabuscus has quit []
chadmed has quit [Remote host closed the connection]
jeisom has quit [Ping timeout: 480 seconds]
lena6 has joined #asahi-dev
gabuscus has joined #asahi-dev
malfunction54 has joined #asahi-dev
tristan2_ has joined #asahi-dev
tristan2 has quit [Ping timeout: 480 seconds]
Graypup__ has quit [Quit: meow]
Graypup_ has joined #asahi-dev
<marcan>
jannau: with my genpd defer patches, it should work if you don't give the genpd to simpledrm as long as dcp owns it and *it* knows how to handle it. however, if you remove DCP from the device tree, just simpledrm will break without explicit multi pd handling.
Graypup_ has quit [Quit: meow]
Graypup_ has joined #asahi-dev
<jannau>
this is with display/dcp disabled since they must not take over the framebuffer
<jannau>
it will not be a problem for dcp in the final state as the phy will get its own node and everything has a single power-domain
<jannau>
adding the code to simpledrm shouldn't be a problem. it already has code to handle multiple clocks and regulators
faustine has joined #asahi-dev
chadmed has joined #asahi-dev
chadmed has quit []
chadmed has joined #asahi-dev
faustine has quit [Quit: Lost terminal]
crabbedhaloablut has joined #asahi-dev
<marcan>
yup
<marcan>
in a future where m1n1 can do atcphy stuff and point the display at multiple places, it probably makes sense for it to add the relevant PDs to the framebuffers dynamically
<jannau>
why is t602x' ps_dptx_phy_ps always-on? is it not blocked on t602x laptops and disabling it breaks the display? If that's the case we should either add it to dcp's or the panel's power-domain
ellyq has quit [Read error: Connection reset by peer]
mps has quit [Quit: leaving]
compassion1785 has quit [Ping timeout: 480 seconds]
compassion1785 has joined #asahi-dev
eiln has joined #asahi-dev
mps has joined #asahi-dev
<marcan>
jannau: probably yes, back when I did bringup on that I assume I saw it was breaking the display and did the quick fix.
<marcan>
maz: so we have another fun challenge now. turns out running x86 games in a 4K VM on a 16K host is actually doable and already proven to work well. that neatly sidesteps the whole 4K kernel pain for us, and might easily be the way to go at this point.
<marcan>
but, that means we need TSO for KVM. right now it's a prctl on the host (not sure if you saw that patch) and I assume it would work as-is to enable TSO globally in the VM if KVM just keeps that state untouched from process context.
<marcan>
but of course it would be more efficient to only have TSO in the VM where needed, which would mean forwarding that control to the guest. in principle, AIUI the guest can directly twiddle that bit without any VM exits, but we have to explicitly allow that and it would be IMPDEF of course.
<marcan>
alternatively we could have higher level hypervisor calls to forward this to prctl without dropping all the way to qemu or whatever, since I assume a vm exit (or more) to qemu for every context switch in the guest is a really bad idea
<marcan>
actually I'm not sure if the guest ACTLR behaves as intended already, since apple came up with an IMPDEF ACTLR_EL12... I feel like I looked into this already but I'm not sure any more. it might be that right now EL1 always gets default ACTLR (ACTLR_EL12) even if EL2 set something in its ACTLR_EL2, regardless of what is enabled in HACR_EL2.
<marcan>
so we might at the very least need to copy ACTLR_EL12 (impdef) <= ACTLR_EL2 to make the prctl work "automatically" for kvm host processes
<marcan>
(I know you're not going to like any of this, but I hope we can come up with at least a least-horrible solution because, really, using TSO is non-negotiable here, the perf boost for x86 emu is major)
<marcan>
I'd be happy with enabling TSO globally in the VM (Apple already do this for Rosetta on Linux VMs on macOS anyway), we just need to signal it to the guest somehow so FEX can pick it up.
eiln has quit [Quit: WeeChat 4.0.4]
eiln has joined #asahi-dev
hightower3 has joined #asahi-dev
hightower4 has quit [Ping timeout: 480 seconds]
cy8aer has quit [Remote host closed the connection]
roxfan has joined #asahi-dev
cy8aer has joined #asahi-dev
<maz>
marcan: the one thing I don't want to create is some new form of ABI at the userspace or hypercall level upstream. I know you went down that way in the Asahi tree, and that's fine by me as long as you keep it there. *how* you make it work is interesting though. If you have a bit of a spec/write-up that describes the various controls in the AUX regs at both ELs.
<maz>
... I'd be interested in reviewing it.
jeisom has joined #asahi-dev
<eiln>
rebased and pushed the just-works heap mapping
<eiln>
marcan: I thought unk_size was firmware size too, but 0x180000 worked on 13.5 (0x100000). turns out unk_size only has to be greater than the current size, which the 0x180000 unknowingly covered.
<eiln>
sorry I wasn't clear. it doesn't have to alloc bottom-up anymore. but it'd be helpful (for me) to leave as is until we confirm the firmware boot mess across all variants. I also haven't checked vm_size for all dts. (rightfully) moving to iova.c is trivial, and I'll handle it right after
<eiln>
how far do your other machines go before halting? :P
<marcan>
maz: I mean, eventually this should be upstreamed in *some* form. Especially if we're going down the VM route, it will be incredibly silly to require downstream patches on the guest kernel when hardware support is otherwise not an issue.
<marcan>
I'm not particularly keen on the "this will be downstream forever because upstream won't take it in any form" outcome :/
eiln has quit [Quit: WeeChat 4.0.4]
<marcan>
re ACTLR, I just realized I do know how this works with EL1 since obviously I've tested this in m1n1 HV. I'll write a quick spec.
eiln has joined #asahi-dev
<eiln>
wait my key expired
<marcan>
eiln: will look at camera again a bit later, going to sort out this TSO thing first :p
<eiln>
sounds good, because I messed up my signature lol
<marcan>
so basically, for VM-wide TSO, we need to poke ACTLR_EL12 (e.g. copy it from ACTLR_EL2 to inherit whatever the host configured for that process, or something else)
<marcan>
and then ideally signal the VM somehow, so software running in the guest can know it's in a TSO environment
<marcan>
for exposing TSO to the VM dynamically, we either disable trapping that reg (and AIDR) and signal it somehow, or we provide some kind of hypercall to poke it.
<marcan>
bit 4 sticks some extra x86 flags in CPSR (I think?)
<marcan>
not sure what the rest of the bits do exactly. those 3 are the ones we care about for x86 emu, and TSO is by far the most important one.
<marcan>
sorry not CPSR, APSTATE I think? which I think is IMPDEF
<marcan>
supposedly NZCV which should map to CPSR bits unless they added state somewhere else...
<marcan>
ah, it's in NZCV bits 26-27 and then save/restore on exceptions is via ASPSR_EL1 bits 1-2 (which is also IMPDEF).
<marcan>
which I think apple actually calls APSTATE_EL1? so maybe it's not exceptions? let me check...
<maz>
if you don't need the dynamic flip of TSO inside the guest, then it's pretty easy to do, and we could probably stick that in a module -- just hook whatever you need to do in the arch-specific vcpu_load/vcpu_put helpers (the module interface itself is to be created).
<marcan>
nah it's definitely SPSR semantics, not PSTATE semantics
<marcan>
I'll keep my name then (some of the apple names are terribad)
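A minimal sketch of the flag save/restore described above, assuming the two extra flags sit in NZCV bits 26-27 and are mirrored into ASPSR_EL1 bits 1-2 on exception entry. The bit ranges come from the discussion; the bit ordering between the two registers and the helper names are assumptions for illustration only.

```python
# Bit positions taken from the discussion; which NZCV bit maps to which
# ASPSR_EL1 bit is an assumption.
NZCV_APFLG_SHIFT = 26   # extra flags live in NZCV bits 26-27
ASPSR_APFLG_SHIFT = 1   # saved copy lives in ASPSR_EL1 bits 1-2
APFLG_MASK = 0b11


def save_extra_flags(nzcv: int, aspsr: int) -> int:
    """Return aspsr with the two extra NZCV flag bits copied into bits 1-2."""
    flags = (nzcv >> NZCV_APFLG_SHIFT) & APFLG_MASK
    aspsr &= ~(APFLG_MASK << ASPSR_APFLG_SHIFT)
    return aspsr | (flags << ASPSR_APFLG_SHIFT)


def restore_extra_flags(nzcv: int, aspsr: int) -> int:
    """Return nzcv with bits 26-27 restored from the saved aspsr bits 1-2."""
    flags = (aspsr >> ASPSR_APFLG_SHIFT) & APFLG_MASK
    nzcv &= ~(APFLG_MASK << NZCV_APFLG_SHIFT)
    return nzcv | (flags << NZCV_APFLG_SHIFT)
```

This only models the SPSR-like round trip; it says nothing about what the flags mean to an x86 emulator.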
<marcan>
maz: well, we all know how stable the Linux kernel APIs are and how sustainable modules are long-term...
<maz>
they've been around for 30 years.
<marcan>
modules yes
<maz>
and indeed, I don't plan to have a stable ABI.
<maz>
that's the distros problem.
<marcan>
yeah, and the point of upstreaming is... not having that problem :)
<maz>
quite.
<maz>
but this still breaks a lot of things: migration of such a guest is fscked.
<marcan>
sure, but I think it's fair to say if you start enabling IMPDEF features migration is fscked
<maz>
(well, migrating an apple guest to anything else is fscked for plenty of reasons)
<marcan>
(I mean that's pretty obvious)
<marcan>
(and nobody cares for our use case)
<marcan>
one thing to keep in mind is that if we ever enable the APFLG feature especially, that one *definitely* has to be exposing this impdef stuff to EL0, and there's no trapping this
<maz>
for IMPDEF regs, you could disable TIDCP, and context switch that in the same location.
<marcan>
which means if we're doing that we might as well do what apple intended here, and just expose this raw to the guest and let it deal with it the same way my patch already does on bare metal
<maz>
as long as there is an _EL12/EL02 accessor
<marcan>
and then all KVM has to do is save/restore context for these things
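The save/restore idea can be modelled with a toy register file. This is only a sketch: the register names follow the discussion, while `Cpu`, `Vcpu`, the `inherit_host_tso` switch and the TSO bit position are hypothetical and not taken from KVM or from any posted patch.

```python
from dataclasses import dataclass, field

ACTLR_TSO = 1 << 1  # assumed bit position, for illustration only


@dataclass
class Vcpu:
    # Guest-visible ACTLR value kept while the vCPU is not loaded.
    saved_actlr_el1: int = 0


@dataclass
class Cpu:
    # Toy register file; real hardware is accessed with MSR/MRS.
    regs: dict = field(
        default_factory=lambda: {"ACTLR_EL2": 0, "ACTLR_EL12": 0}
    )


def vcpu_load(cpu: Cpu, vcpu: Vcpu, inherit_host_tso: bool = False) -> None:
    """Install the guest's ACTLR, optionally inheriting TSO from the host."""
    value = vcpu.saved_actlr_el1
    if inherit_host_tso:
        value = (value & ~ACTLR_TSO) | (cpu.regs["ACTLR_EL2"] & ACTLR_TSO)
    cpu.regs["ACTLR_EL12"] = value


def vcpu_put(cpu: Cpu, vcpu: Vcpu) -> None:
    """Save whatever the guest left in its ACTLR."""
    vcpu.saved_actlr_el1 = cpu.regs["ACTLR_EL12"]
```

With `inherit_host_tso=True` this corresponds to the VM-wide variant (copy from ACTLR_EL2); with it off, the guest owns the bit and the hypervisor only preserves it across switches.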
<marcan>
how are CPU implementors passed to the guest right now? is there passthrough mode or is it always some qemu thing?
<maz>
MIDR values are seen as the host's, directly from KVM. which means that if your vcpu thread migrates from one type to another, you see it.
<marcan>
ah, so that isn't trapped?
<maz>
KVM has no provision to cope with asymmetric systems.
<maz>
no, this can be overloaded with VMIDR_EL2.
<marcan>
I mean what does KVM do right now?
<maz>
KVM just exposes the host view. nothing else.
<marcan>
and from what I see in the code, that includes AIDR_EL1
<maz>
yup.
<marcan>
that means my ACTLR code right now will... actually break. because KVM is claiming to be an Apple CPU that supports Apple IMPDEF features, but it doesn't.
<marcan>
at the very least we need to zero out AIDR_EL1 in KVM to make the current state of things not weirdly broken...
<maz>
well, that's IMPDEF... so any behaviour is compliant!
<marcan>
well yes, but if you're claiming to be a specific CPU and then not supporting its features, that's kind of broken isn't it :)
<maz>
show me the spec of that COPU, and we'll talk! :D
<maz>
CPU*
<marcan>
oh come on :p
<marcan>
look I know this sucks for everyone involved but I'm just trying to make this all work in the least shitty way for everyone involved
<marcan>
maz: anyway, I need to get dinner, but I assume we're not going to have a straight answer for this all anyway at this point so... let's say I want to just implement this in KVM the "apple way" with the features and context switching and everything. re context switching, does this affect the userspace interface and/or how hard is it to add to that without breaking the world? or can I just ignore ...
<marcan>
... all that as long as we don't migrate, and the state stays in the kernel?
ourdumbfuture has joined #asahi-dev
<maz>
well, I'm willing to help. but I'm also not going to turn KVM upside down for that. my proposal is to allow a module to change the IMPDEF state at load/put time. for stuff such as AIDR, we could either expose it as writable to userspace, or trap it and forward that trap to userspace.
<maz>
that will give you the possibility to save/restore the state as long as there is an EL12 accessor.
<marcan>
sure, either option would work for AIDR
<marcan>
if the IMPDEF state is in a module, keep in mind we also need somewhere to *save* that state
jeisom has quit [Ping timeout: 480 seconds]
<marcan>
though honestly, I'm not sure a module buys us much
<marcan>
we're going to have downstream kernels for hwe anyway, and I'm not sure maintaining a kernel patchset vs a module with an unstable API/ABI makes much difference, and for users it also doesn't really make a difference whether they install a module or a whole separate kernel...
chadmed has quit [Read error: Connection reset by peer]
<marcan>
what's the status of this one? is it supposed to work? should I start looking through macOS traces? :p
<eiln>
that means the asc hasn't even booted, likely an err in ctrr setup. I patched src/isp.c to add the heap to the phandle so of_iommu can find it. is that there? ah-'s t6000 booted before
<jannau>
ah- got further than that, not sure if that's already integrated
<marcan>
eiln: which patch? (I'm still on my branches as is)
<eiln>
I pushed to isp-dapf with yours cherry-picked
<marcan>
the heap is supposed to be linked statically in the DT, I added that to my t6000 one
<eiln>
nothing hardware yet, opcode 0x004 is PRINT_ENABLE. only 0xe00000 vs t8103 0x1800000 is suspicious though.
<marcan>
that just looks like a fault after the driver gives up, not the problem
<marcan>
hm wait something is weird here
<marcan>
that initial DART map looks off by a page
<marcan>
ah no it's just the dumper is weird
<jannau>
the end printing in proxyclient is weird / off-by-one
<jannau>
not sure what I thought when doing it
hightower4 has joined #asahi-dev
<marcan>
eiln: isn't this multi DART business the same old thing from USB?
<marcan>
apple handles it by mirroring DART registers, we just instantiate multiple IOMMUs
<marcan>
they do this weirdo thing with USB where certain requests go through different DARTs
<marcan>
same story here it looks like
<marcan>
hold on, let me just go back to t8103 and try to clean this up
<marcan>
jannau: we didn't need anything special in USB for this right? the iommu code just handles multiple IOMMUs?
hightower3 has quit [Ping timeout: 480 seconds]
<jannau>
marcan: yes, just list multiple "iommus"
<jannau>
that is/will be the easy solution for multiple display output. the display-subsystem node will just list all dart-disp*s, that requires no code in the device drivers
<jannau>
it's currently limited to 2 darts but I think it should easily extend to more than 2
<eiln>
ISP/ANE/AVE just needs TTBR and TLB invalidation mirrored
<marcan>
that's just a shortcut for "there are multiple DARTs and you need to configure them all"
<marcan>
that's effectively what you're doing and what Apple does for USB too
<marcan>
they do it as a horrible hack in the IOMMU driver
<marcan>
instead of just instantiating the hardware multiple times, which is actually what is going on here
<marcan>
so we just do that
<marcan>
jannau: oh, we need 3 here :/
<marcan>
was that done in the DART driver?
<jannau>
marcan: I think it's just increasing MAX_DARTS_PER_DEVICE
<sven>
the dart driver only supports two right now, I think you can just change a define to make that three
<marcan>
ah yeah
<jannau>
in apple-dart.c
<sven>
yeah, that
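As a rough illustration of "instantiate the hardware multiple times and configure them all", the toy model below keeps one page table shared by every DART listed for a device. It is not the apple-dart.c implementation: `Dart` and `StreamDomain` are hypothetical names, and only the `MAX_DARTS_PER_DEVICE` limit (raised from 2 to 3 here) comes from the discussion.

```python
MAX_DARTS_PER_DEVICE = 3  # the discussion raises this from 2


class Dart:
    """Toy stand-in for one DART instance (not the real driver)."""

    def __init__(self, name):
        self.name = name
        self.ttbr = None       # page table base this DART walks
        self.tlb_flushes = 0   # number of invalidations issued


class StreamDomain:
    """One page table shared by every DART listed for a device."""

    def __init__(self, darts):
        if len(darts) > MAX_DARTS_PER_DEVICE:
            raise ValueError("too many DARTs for one device")
        self.darts = list(darts)

    def attach(self, pagetable_base):
        # Every DART gets the same translation table base.
        for dart in self.darts:
            dart.ttbr = pagetable_base

    def invalidate(self):
        # A mapping change must be flushed on all instances.
        for dart in self.darts:
            dart.tlb_flushes += 1
```

Seen this way, "mirroring TTBR and TLB invalidation" is just what falls out of applying each operation to every instance.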
<marcan>
hm, now I broke t8103 the same way
<marcan>
and I also don't see things mirrored in the DARTs
<eiln>
where is it not mirrored?
<marcan>
I mean the multi-dart thing isn't working now
<marcan>
[ 0.371541] apple-dart 22c0e8000.iommu: adding as dart #0
<marcan>
[ 0.372900] apple-isp 22a000000.isp: failed to init iommu: -517
<marcan>
something's wack here
<marcan>
oh that's just EPROBE_DEFER lol
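For reading such log lines: -517 is one of the kernel-internal error codes defined in include/linux/errno.h, which never reach userspace and so are missing from the usual errno tables. A small sketch of a decoder follows; the helper name `decode_errno` is made up for this example.

```python
import re

# Kernel-internal error codes from include/linux/errno.h.
KERNEL_INTERNAL_ERRNO = {
    512: "ERESTARTSYS",
    513: "ERESTARTNOINTR",
    514: "ERESTARTNOHAND",
    515: "ENOIOCTLCMD",
    516: "ERESTART_RESTARTBLOCK",
    517: "EPROBE_DEFER",
}


def decode_errno(line):
    """Return the symbolic name for a trailing ': -NNN' in a log line, if known."""
    match = re.search(r":\s*(-\d+)\s*$", line)
    if not match:
        return None
    return KERNEL_INTERNAL_ERRNO.get(-int(match.group(1)))
```

EPROBE_DEFER means the driver asked to be probed again later, typically because a dependency had not been probed yet.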
<eiln>
my understanding is that they are not real darts. and initing them as separate iommus might erase the special tunables
<marcan>
they are absolutely real DARTs
<marcan>
this is just Apple being very, very stupid in how they represent things in their device tree
<marcan>
which is a pattern they have
<marcan>
I believe the underlying reason for the multiple DARTs is performance or perhaps different configurations, where they connect different hardware memory initiators to different DARTs so it doesn't bottleneck on one
<marcan>
with USB we could see certain kinds of requests wind up on one DART or the other (ask sven about that)
<marcan>
that EPROBE_DEFER is wrong, isp is making it up
<marcan>
and of *course* apple made that inconsistent
<maz>
marcan: well, if you only plan to hit upstream after a few *years*, that's probably not something I should be concerned about, and you'll have to convince other people than me!
<marcan>
lol
<maz>
and on a completely unrelated note, I've finally tagged CS v3.2. no significant change with the state of the v3-dev branch, only silkscreen and LCSC reference updates.
<marcan>
nice :)
<marcan>
eiln: and everything works with 3 proper DARTs :)
<marcan>
eiln: best part? it works without the "CTRR"/tunables setup at all
<marcan>
because unlike Apple, our DART driver actually knows how to initialize stuff without relying on random hardcoded register pokes in the device tree
<eiln>
dart-tunables-instance-* is in the dt, and is required for the hack
<marcan>
yes, because that sequence is basically initializing the DART
<marcan>
which our driver knows how to do normally
<eiln>
why would they do this? isn't this easier?
<marcan>
because apple lol
<marcan>
their hardware engineering is pretty good
<marcan>
their software engineering... is not.
<marcan>
I'm so sorry you got caught up in this mess and I/we didn't catch it earlier and tell you what it's about :(
<marcan>
would've probably saved you quite a bit of time :/
<marcan>
but yeah our DART driver knows how to share page tables and everything, it literally just works... (other than that inconsistent bypass thing I had to patch)
<eiln>
the "tunables" masks unused regs tho, 0x64/0x68/0x6c. interesting
<eiln>
it's fine, I figured out a hack during the ANE days
<eiln>
and it boots all the way?
<marcan>
t8103 works all the way to videao, yeah
<marcan>
*video
<marcan>
and yeah, they do init some unknown regs but it doesn't seem to matter
<marcan>
if it ever does we should try to work out what they do
<marcan>
t6000 is still broken though
<marcan>
eiln: pushed what I have to the same old branches
<sven>
Most darts have tunables that can just be ignored
<sven>
They sometimes use regs we have no clue about
<sven>
And yeah, dwc3 also has two darts that are merged into a single one inside adt
<sven>
and the split makes no sense
<sven>
all of device mode and half of host mode goes through the first and the other half of host mode through the second one
<sven>
I‘d love to see why they need this specific split :D
mps has quit [Quit: leaving]
<sven>
and, uh, maybe I misremember but I thought I mentioned this a while ago when we discussed dapf. maybe I should’ve put more emphasis on just how much of a hack the adt can be
roxfan has quit [Ping timeout: 480 seconds]
<eiln>
marcan: ah- timed out here (enable APPLE_ISP_DEBUG to print fw logs)
<eiln>
I should, sigh. not enough time in the day..
<marcan>
the args.pad_40[5] = 0x90; is not necessary, so it looks like just the sensor ID
<marcan>
that should be in the device tree then, since it's in the ADT too
<marcan>
(I'm assuming that's what it is since it matches)
<marcan>
anyway, good night ;)
<marcan>
and yeah, mood
lena6 has quit [Ping timeout: 480 seconds]
<ChaosPrincess>
does m1n1 have a convenient way to diff a large memory range?
<ChaosPrincess>
in other news, ave firmware finally talks to me: "FW Cfg: prod, tag: AppleAVE2FW-6040.2.1, SHA: f5fece645"
completenoob has quit [Ping timeout: 480 seconds]
<marcan>
\o/
<ChaosPrincess>
marcan: didn't you have something on stream that showed pretty coloured hexdump diffs?
<marcan>
that's regmon
<marcan>
proxyutils.RegMonitor
<ChaosPrincess>
ty
<marcan>
the various m1n1 shells have a hook where if a "mon" variable exists, it'll call mon.poll() after every input
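For diffing a memory range with that hook, something along these lines could work. This is a hypothetical stand-in, not proxyutils.RegMonitor: the `read` callable, its `(addr, size) -> bytes` signature and the class name are assumptions.

```python
class RangeMonitor:
    """Hypothetical poll()-style monitor that reports changed bytes."""

    def __init__(self, read, start, size, log=print):
        self.read = read      # callable(addr, size) -> bytes-like
        self.start = start
        self.size = size
        self.log = log
        self.last = None      # previous snapshot, None before first poll

    def poll(self):
        """Snapshot the range and log every byte that changed since last time."""
        cur = bytes(self.read(self.start, self.size))
        changed = []
        if self.last is not None:
            changed = [
                i for i, (a, b) in enumerate(zip(self.last, cur)) if a != b
            ]
            for i in changed:
                self.log(
                    f"{self.start + i:#x}: {self.last[i]:02x} -> {cur[i]:02x}"
                )
        self.last = cur
        return changed
```

Binding an instance to a variable named `mon` would then have the shell call `mon.poll()` after every input, as described above.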
ourdumbfuture has quit [Quit: My Mac has gone to sleep. ZZZzzz…]
faustine has joined #asahi-dev
<eiln>
I went to dim my computer, but I should reply
<eiln>
ChaosPrincess: you built on the isp stuff right? the ipc code can nearly be shared. the TERMINAL channel outputs the logs w/o having to memdiff
<ChaosPrincess>
eiln: not directly on top of isp, but using isp as reference, terminal outputs the logs, yes, but the channel table is different, the iovas are 0xffffffff, i am diffing to find out how exactly the current channel position is signalled
roxfan has joined #asahi-dev
<marcan>
if code can be shared here, it might make sense to have some kind of libispfw or something, especially since both of these drivers are in media/ (ane OTOH, not sure, depends on how different it ends up being)
<marcan>
but that refactoring can come later too, up to you
<marcan>
probably easier to hack on the code separately at first
<ChaosPrincess>
im still in the "huge piles of hardcoded register pokes" phase
<eiln>
they handle memory really, really weird. the firmware iova starts at ipc_iova and is masked by 0x80000000.
<eiln>
also echo "1 2 hi" > /tmp/ave_log.cfg gets the userspace driver to talk a lot. I'm fairly certain this is a scanf CVE
hightower2 has joined #asahi-dev
<ChaosPrincess>
eiln: yes, the firmware sends the address it wants the kernel to add to iovas
<ChaosPrincess>
also, thats fw ver?
<eiln>
there's a firmware bug causing an irq hang (preventing more logs). It should be as chatty as ISP. I was trying to patch the instruction, but apparently since ventura we can't write to segment-ranges anymore, hence I was downgrading earlier
ourdumbfuture has joined #asahi-dev
jacksonchen666 has quit [Ping timeout: 480 seconds]
faustine has quit [Quit: Lost terminal]
ourdumbfuture has quit [Quit: My Mac has gone to sleep. ZZZzzz…]
ourdumbfuture has joined #asahi-dev
eiln has quit [Ping timeout: 480 seconds]
Retr0id has quit [Read error: Connection reset by peer]
Retr0id has joined #asahi-dev
roxfan has quit [Read error: Connection reset by peer]
midou has quit [Ping timeout: 480 seconds]
amarioguy has joined #asahi-dev
jeisom has joined #asahi-dev
<amarioguy>
quick question, how exactly do you probe for the PCIe XHCI host controller BAR (or in U-boot terms, the "HCCR"/"HCOR")
<amarioguy>
trying to follow the u-boot and linux code that probes for it has not exactly been the easiest lol, is it just a read from the ECAM region?
<marcan>
there's no firmware, you are the firmware so you get to *assign* a BAR.
<marcan>
(via ECAM writes)
<marcan>
I mean if you're not running on u-boot of course
<j`ey>
why do isp and ave share anything?
<amarioguy>
marcan: ah that makes sense, so i just assign a BAR to the device?
<marcan>
j`ey: same codebase
<marcan>
amarioguy: yes (after doing a whole bunch of other pcie init)
<amarioguy>
ah neat
<amarioguy>
(long story short, trying to upload XHCI controller fw in my edk2 fork lol, was trying to follow u-boot code beforehand)
<sven>
you’ll also have to configure all the bridges above the device
<amarioguy>
yea that tracks, bus0 seems to just be all the root bridges (w/devices behind those bridges)
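Two pieces of the ECAM side can be sketched from the standard PCIe layout: where a function's config registers sit in the ECAM window, and how a BAR's size is decoded after writing all ones to it. The sketch handles only the low 32 bits of a memory BAR and leaves out bridge setup and the rest of the PCIe init mentioned above; the function names are made up for this example.

```python
BAR0 = 0x10  # standard config-space offset of BAR0


def ecam_offset(bus, dev, fn, reg):
    """Offset of a config register inside the ECAM window."""
    assert bus < 256 and dev < 32 and fn < 8 and reg < 4096
    return (bus << 20) | (dev << 15) | (fn << 12) | reg


def bar_size(readback):
    """Size of a 32-bit memory BAR, given the value read back
    after writing 0xFFFFFFFF to it (0 means unimplemented)."""
    mask = readback & 0xFFFFFFF0   # drop the type/prefetchable bits
    if mask == 0:
        return 0
    return ((~mask) & 0xFFFFFFFF) + 1
```

Assigning the BAR then amounts to picking an aligned address of at least that size inside a window the upstream bridges forward, and writing it back to the same register.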
<marcan>
I would hope edk2 already has some semblance of PCIe support?
crabbedhaloablut has quit []
<marcan>
(that can do this for you)
<marcan>
anyway, I really need to sleep :p
<amarioguy>
marcan: yeah i'm pretty sure it does, i'm probably just being dumb and not seeing smth obvious lol
<sven>
yeah, I’d try to avoid writing that code myself. That uboot and Linux code is partly hard to follow because the whole setup is a bit tricky
<sven>
(maybe not for the apple case where you know about all devices but to do it correctly for the general case of arbitrary pcie busses)
ellyq has quit [Read error: Connection reset by peer]
<jannau>
amarioguy: any progress with the cursed mac studio dcp swap_surface in m1n1?
<jannau>
that said, I still have to test the PR on m1 devices
ellyq has joined #asahi-dev
<amarioguy>
jannau: i just ended up blowing away the install and reinstalling 12.3.1 lol, wasn't really using the ventura or sonoma partitions too much anyways
<jannau>
ok, no worries
midou has joined #asahi-dev
deflated8837_ has quit [Remote host closed the connection]
darkapex2 has quit [Remote host closed the connection]
darkapex2 has joined #asahi-dev
<jannau>
marcan: asahi alarm needs a mesa-asahi-edge rebuild for llvm-16