marcan changed the topic of #asahi-dev to: Asahi Linux: porting Linux to Apple Silicon macs | Non-development talk: #asahi | General development | GitHub: https://alx.sh/g | Wiki: https://alx.sh/w | Logs: https://alx.sh/l/asahi-dev
<jannau> we might not want all gpio irqs to act as wakeup source though
amarioguy has joined #asahi-dev
<jannau> marcan: https://github.com/AsahiLinux/linux/pull/78 with t8112 cache info and dart rename is open as well
<jannau> the color/timing mode tracing would require FTRACE in the edge kernel config
<j`ey> yeah you should be able to mark individual irqs
<marcan> in s2idle there aren't "wakeup sources" per se, what there are is IRQs marked IRQF_NO_SUSPEND
<marcan> which means those IRQs will not be masked in s2idle, then the drivers are free to do whatever they want with them
<marcan> the pin controller should have that set for its parent IRQ and then the downstream SPI driver should set it for its GPIO IRQ
<marcan> then other GPIOs will be masked in s2idle
<marcan> remember in s2idle the kernel is just running normally, it's just that by default all IRQs are masked
<marcan> we unconditionally unmask mailbox IRQs in suspend under the assumption that rtkit drivers will suspend themselves anyway (except SMC which needs to stay alive)
<marcan> (this is not done properly yet for the helper and for GPU and others)
<marcan> (nor DCP for that matter, not sure if DCP wants to actually shut down rtkit on suspend?)
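[editor's note] The IRQF_NO_SUSPEND rule marcan describes above can be sketched as a toy model. This is not kernel code (in the kernel, a driver passes IRQF_NO_SUSPEND when requesting its IRQ); the class, function, and IRQ names below are hypothetical and only illustrate why both the pin controller's parent IRQ and the SPI driver's GPIO IRQ need the flag.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Irq:
    """Toy stand-in for an interrupt line (hypothetical, not a kernel structure)."""
    name: str
    no_suspend: bool = False          # models IRQF_NO_SUSPEND being set
    parent: Optional["Irq"] = None    # e.g. a GPIO IRQ's pin controller IRQ


def delivered_in_s2idle(irq: Irq) -> bool:
    """An IRQ is only delivered in s2idle if it and every parent stay unmasked."""
    node = irq
    while node is not None:
        if not node.no_suspend:
            return False  # masked somewhere along the chain
        node = node.parent
    return True


# Assumed topology, following the discussion above.
pinctrl_parent = Irq("pinctrl-parent", no_suspend=True)
spi_hid_gpio = Irq("spihid-gpio", no_suspend=True, parent=pinctrl_parent)
other_gpio = Irq("other-gpio", parent=pinctrl_parent)

if __name__ == "__main__":
    for irq in (spi_hid_gpio, other_gpio):
        print(irq.name, delivered_in_s2idle(irq))
```

In this model the keyboard's GPIO IRQ stays live while every other GPIO on the same pin controller is masked, which matches the behaviour jannau asked for.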
<j`ey> ah, 2e34ca7f13a2d5f8fd665d568b6493a20d4b6efe
<jannau> ah, so the dead code in spihid doesn't make much sense for s2idle. I'd assumed s2idle could use the same suspend path
<jannau> macos powers dcp completely down for DPMS (t8103 with internal displays), I've not yet checked j314 or mac mini/studio
<marcan> I might do a suspend/resume/spihid hacking stream today, lots of stuff to fix there
<marcan> "Does anyone know how to fix it?" *posts a screenshot that tells them how to fix it*
<marcan> why are users...
<bluetail> ^ addiction to instant happy chemicals without ever trying to get their hands dirty
<TellowKrinkle> "The YouTube tutorial said to do this. When I did it, it didn't do this. It must be broken."
* bluetail chuckles
sam___ has quit [Remote host closed the connection]
zalyx has quit [Quit: later alligator]
zalyx has joined #asahi-dev
user982492 has joined #asahi-dev
user982492 has quit []
user982492 has joined #asahi-dev
<marcan> rebased on 6.1 final and merged jannau's stuff. only compile tested, will play around with suspend later today
user982492 has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
jbowen_ has quit []
jbowen has joined #asahi-dev
user982492 has joined #asahi-dev
opticron_ is now known as opticron
renatorabelo has quit [Quit: Leaving]
renatorabelo has joined #asahi-dev
user982492 has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
<cyrozap> marcan, sven: Heads-up, there's an issue with ASMedia's USB host controllers where if you try reading from multiple USB devices simultaneously that are connected to the same ASMedia host controller (doesn't matter whether it's through a hub or they're plugged directly into the HC's USB ports), the host controller will attempt a DMA write to an address it doesn't have permission to access. I can easily
<cyrozap> reproduce the error on POWER9 systems, which have a very strict IOMMU, so I was curious if the same issue affects Apple Silicon systems, which I assume also have a strict IOMMU. Even if the platform doesn't support any kind of capability like EEH (Enhanced Error Handling), where the IOMMU reports misbehaving devices to the OS, it would be interesting to see if any data gets lost or memory corruption
<cyrozap> occurs.
<cyrozap> I first wrote about this issue a couple of years ago, but have yet to discover exactly what's going wrong, and where. You can find the original thread where I discussed it here: https://lore.kernel.org/all/CAO3ALPyB1JDvvC27JGgAoTuHh0w+897tPhmTKX9PQWBFCrrnbQ@mail.gmail.com/
mini0n has quit [Read error: Connection reset by peer]
<marcan> cyrozap: we have a strict IOMMU and haven't seen such issues yet (maybe it doesn't apply to our firmware? I can't imagine Apple shipping broken USB controllers... wait, I can >_>)
<marcan> we do have IOMMU fault interrupts hooked up as far as I know
<sven> yeah, dart will complain in dmesg when there’s a fault
<marcan> cyrozap: I saw the zero addr quirk thing. for us, we do get full proper fault addresses, so iff we can reproduce the same problem we might be able to shed light on it
<marcan> we could try running the same firmware as you?
<sven> oh, macOS would also panic if there’s a dart fault fwiw
<sven> so the usb controller breaking there would be very obvious
<marcan> yeah
cylm has joined #asahi-dev
amarioguy has quit [Ping timeout: 480 seconds]
amarioguy has joined #asahi-dev
renato__ has joined #asahi-dev
renatorabelo has quit [Ping timeout: 480 seconds]
renato__ has quit [Ping timeout: 480 seconds]
SSJ_GZ has joined #asahi-dev
<ChaosPrincess> marcan: mind cherry-picking c5895db and 3e14469 from https://github.com/WhatAmISupposedToPutHere/linux/commits/spi-nor ? Those add the binding for the nvram, so you can have a linux equivalent of `bless`
mkurz has quit [Ping timeout: 480 seconds]
bcrumb has joined #asahi-dev
mkurz has joined #asahi-dev
mkurz has quit [Remote host closed the connection]
mkurz has joined #asahi-dev
mkurz has quit []
mkurz has joined #asahi-dev
mkurz has quit []
mkurz has joined #asahi-dev
MajorBiscuit has joined #asahi-dev
mkurz has quit []
mkurz has joined #asahi-dev
mkurz has quit [Read error: Connection reset by peer]
mkurz has joined #asahi-dev
mkurz has quit [Remote host closed the connection]
mkurz has joined #asahi-dev
Mary has quit [Quit: The Lounge - https://thelounge.chat]
heyuan has joined #asahi-dev
Mary has joined #asahi-dev
heyuan has quit [Quit: ERC 5.4 (IRC client for GNU Emacs 28.2)]
tobhe_ is now known as tobhe
bcrumb has quit [Quit: WeeChat 3.7.1]
Mary has quit [Quit: The Lounge - https://thelounge.chat]
Mary has joined #asahi-dev
jluthra has quit [Remote host closed the connection]
jluthra has joined #asahi-dev
Dementor has quit [Ping timeout: 480 seconds]
mkurz has quit [Read error: Connection reset by peer]
mkurz has joined #asahi-dev
psykose has quit [Remote host closed the connection]
psykose has joined #asahi-dev
gladiac has joined #asahi-dev
user982492 has joined #asahi-dev
bcrumb has joined #asahi-dev
bcrumb has quit []
bcrumb has joined #asahi-dev
ncopa has joined #asahi-dev
<ncopa> Hey (with the intention to run alpine linux with asahi kernel)
<ncopa> actually.... Hey (with the intention to get in touch!)
<ncopa> > If you are a developer for a distribution and interested in officially supporting Apple Silicon machines, please get in touch!
<mps> ncopa: join #asahi-alt
<ncopa> ok. thanks!
<mps> this is mostly a development channel
<ncopa> "If you are a developer for a distribution and interested in officially supporting Apple Silicon machines, please get in touch! We’d love to work with you to make it happen." https://asahilinux.org/2022/03/asahi-linux-alpha-release/#can-i-install-other-distros
<ncopa> is that #asahi-alt better for that? ^^^
<mps> #asahi-alt is for alternate distro discussions
bcrumb has quit [Quit: WeeChat 3.7.1]
bcrumb has joined #asahi-dev
bcrumb has quit []
<mps> ncopa: some people have already been using alpine on m1 for a full year
bcrumb has joined #asahi-dev
mofux has joined #asahi-dev
gladiac has quit [Quit: k thx bye]
mofux has quit [Remote host closed the connection]
mofux has joined #asahi-dev
user982492 has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
MajorBiscuit has quit [Ping timeout: 480 seconds]
mkurz has quit [Read error: Connection reset by peer]
mofux has quit [Ping timeout: 480 seconds]
Dementor has joined #asahi-dev
mkurz has joined #asahi-dev
<cyrozap> marcan: Firmware version shouldn't matter--I've seen the issue with every ASMedia USB host controller I've tested, from the original ASM1042 through the ASM3142 (it seems I haven't tried the ASM3242 yet). But I'll load the Apple ASM3142 firmware on an ASM3142 card I have here and see if I can trigger the fault, just to be sure. The most reliable way I've found to trigger it is to plug in two USB NVMe
<cyrozap> drives into the same host controller, then start two reads from them (`sudo dd if=/dev/sdN of=/dev/null bs=1G iflag=direct status=progress`) as simultaneously as possible. Doing this triggers the issue almost immediately. It also works with USB hard drives, though it may trigger slightly less than immediately.
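[editor's note] A minimal sketch of the reproduction cyrozap describes: start one `dd` direct read per drive as close to simultaneously as possible. The `dd` arguments are copied from the message above; the helper names are hypothetical, the device paths must be supplied by the user, and the script is assumed to be run as root (the original command uses `sudo`).

```python
import subprocess
import sys


def build_dd_cmd(device, block_size="1G"):
    """Build the dd invocation quoted in the log for one block device."""
    return [
        "dd",
        f"if={device}",
        "of=/dev/null",
        f"bs={block_size}",
        "iflag=direct",
        "status=progress",
    ]


def run_parallel(cmds):
    """Start every command back to back, then wait for all of them.

    Returns the exit codes in the same order as the commands.
    """
    procs = [subprocess.Popen(cmd) for cmd in cmds]
    return [proc.wait() for proc in procs]


if __name__ == "__main__":
    # Usage (as root): python3 repro.py /dev/sdX /dev/sdY
    devices = sys.argv[1:]
    if len(devices) < 2:
        sys.exit("pass at least two block devices on the same host controller")
    print(run_parallel([build_dd_cmd(dev) for dev in devices]))
```

Launching the processes from one loop keeps the start times close together, which is the part cyrozap says matters for triggering the fault quickly.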
<cyrozap> btw, thanks for helping me look into this!
renato__ has joined #asahi-dev
<cyrozap> I'm not sure if there's a way to enable tracing the DMA buffers (or whatever they're called) allocated by the xHCI driver, but assuming you can successfully trigger the fault, it would be helpful to know what buffer is being overrun (assuming that's what the issue is and the xHC isn't just trying to write to some random address).
renato__ is now known as renatorabelo
<cyrozap> And if you _can't_ trigger the fault, I'll be really interested to know about that, since that would mean that the issue likely has something to do specifically with the POWER9 IOMMU or its kernel driver (though this issue doesn't occur with any of the TI or Fresco Logic USB host controllers I've tried).
<zzywysm_> cyrozap: there is also the possibility that Apple's hardware qual team found the same bug and made ASMedia fix it for them
beeblebrox has joined #asahi-dev
<zzywysm_> cyrozap: you may notice that SSDs are far more reliable in Apple hardware than anywhere else, and it's for the same reason
mkurz has quit [Quit: Leaving]
<cyrozap> zzywysm_: Yeah, that's why I'm testing on my ASM3142 card the firmware that Apple ships--I want to see if Apple had them fix the issue in firmware.
<sven> if this can be reproduced on macOS I’m pretty sure apple will make them fix it (assuming they haven’t already)
bcrumb has quit [Quit: WeeChat 3.7.1]
cylm has quit [Ping timeout: 480 seconds]
<cyrozap> I've confirmed that the issue still exists even when running the firmware Apple ships. Tested with Linux 6.0.11 on POWER9.
<cyrozap> ASM3142 firmware version: 191118_70_11_11 (the one I extracted from a kernelcache)
<cyrozap> It'll be interesting to see if this issue happens on Apple hardware, and whether it happens in just Linux or in macOS, too.
<jannau> cyrozap: seems to work here without issues, both SSDs copied 70GB with 226MB/s (mac studio, linux 6.1-rc8 + asahi)
kettenis has quit [Remote host closed the connection]
kettenis has joined #asahi-dev
<cyrozap> jannau: Interesting... Would you mind running `dd if=/usr/lib/firmware/asmedia/asm2214a-apple.bin bs=1 skip=128 count=6 2>/dev/null | xxd -ps`? That will print the version of the firmware you have installed.
<jannau> cyrozap: fw version is 191118701108, both SSDs (500G / 256G) now completely read without issue
<cyrozap> jannau: Thanks for your help! It's especially interesting to see that you're running a slightly older version than what I have (191118_70_11_08 vs. 191118_70_11_11), and yet haven't encountered the issue. Meanwhile, I've uploaded a much more recent firmware (210330_70_02_40) to my ASM3142 card and haven't been able to trigger the fault yet.
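[editor's note] The `dd | xxd` pipeline above reads 6 bytes at offset 128 of the firmware image and prints them as hex. A rough Python equivalent follows, with a helper that regroups the hex string the way cyrozap writes versions (191118_70_11_08). The function names are hypothetical, and the grouping is taken only from the notation used in this log, not from any ASMedia documentation.

```python
def read_fw_version(path, offset=128, length=6):
    """Return the version bytes of a firmware image as a lowercase hex string."""
    with open(path, "rb") as fw:
        fw.seek(offset)
        data = fw.read(length)
    if len(data) != length:
        raise ValueError(f"{path}: file too short to contain a version field")
    return data.hex()


def format_fw_version(hex_version):
    """Regroup 12 hex digits as XXXXXX_XX_XX_XX, the notation used in the log."""
    if len(hex_version) != 12:
        raise ValueError("expected 12 hex digits")
    parts = [hex_version[:6], hex_version[6:8], hex_version[8:10], hex_version[10:12]]
    return "_".join(parts)


if __name__ == "__main__":
    import sys

    # Usage: python3 fwver.py /usr/lib/firmware/asmedia/asm2214a-apple.bin
    print(format_fw_version(read_fw_version(sys.argv[1])))
```

Applied to the value jannau reported, `format_fw_version("191118701108")` gives `191118_70_11_08`.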
<jannau> my firmware version is the one shipped with macos 12.3
renatorabelo has quit [Ping timeout: 480 seconds]
renatorabelo has joined #asahi-dev
<cyrozap> I was using the one from macOS 13.0, but I don't think the slight version difference should matter that much since I know the fault can be triggered on even older versions.
<cyrozap> I just managed to trigger the fault on firmware version 210330_70_02_40. When I tried earlier, I was using a cheap USB 3 thumb drive (64 GB, 100 MB/s max read speed) and a Crucial P5 250 GB SSD in a SuperSpeed Plus Gen 2x1 enclosure. This time, instead of the cheap USB drive, I used a Crucial P5 Plus 2 TB SSD in a SuperSpeed Plus Gen 2x1 enclosure, so now both SSDs plugged into the card support 10 Gbps
<cyrozap> USB. This shouldn't be necessary, since I've been able to trigger the fault with mechanical hard drives on 5 Gbps USB, but maybe it's the fact that, even with the newer firmware, the USB link is now getting saturated (I saw ~973 MB/s on the P5 Plus drive before the system detected the bad write).
<cyrozap> Also, I think it might be easier to trigger the issue with UAS drives.
<cyrozap> Though it may not be strictly necessary.
SSJ_GZ has quit [Ping timeout: 480 seconds]