#asahi-dev on 2022-12-12 — irc logs at oftc.irclog.whitequark.org

00:04 <jannau> we might not want all gpio irqs to act as wakeup source though

00:07 amarioguy has joined #asahi-dev

00:11 <jannau> marcan: https://github.com/AsahiLinux/linux/pull/78 with t8112 cache info and dart rename is open as well

00:12 <jannau> the color/timing mode tracing would require FTRACE in the edge kernel config

00:15 <j`ey> yeah you should be able to mark individual irqs

00:16 <marcan> in s2idle there aren't "wakeup sources" per se, what there are is IRQs marked IRQF_NO_SUSPEND

00:16 <marcan> which means those IRQs will not be masked in s2idle, then the drivers are free to do whatever they want with them

00:16 <marcan> the pin controller should have that set for its parent IRQ and then the downstream SPI driver should set it for its GPIO IRQ

00:16 <marcan> then other GPIOs will be masked in s2idle

00:17 <marcan> remember in s2idle the kernel is just running normally, it's just that by default all IRQs are masked
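
A toy model of the masking behaviour marcan describes above, written in Python rather than as kernel code. Only IRQF_NO_SUSPEND is a real kernel flag name; the Irq class, enter_s2idle, can_fire and the IRQ names are hypothetical and exist only for illustration. It assumes a child GPIO IRQ can only be delivered while its parent IRQ is also unmasked.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Irq:
    name: str
    no_suspend: bool = False          # stands in for IRQF_NO_SUSPEND
    parent: Optional["Irq"] = None    # e.g. the pin controller's parent IRQ
    masked: bool = False


def enter_s2idle(irqs):
    """Mask every IRQ that was not requested with the no-suspend flag."""
    for irq in irqs:
        irq.masked = not irq.no_suspend


def can_fire(irq):
    """An IRQ is delivered only if it and all of its parents are unmasked."""
    while irq is not None:
        if irq.masked:
            return False
        irq = irq.parent
    return True


# Hypothetical IRQ tree: one pin controller parent IRQ, two GPIO IRQs below it.
pinctrl_parent = Irq("pinctrl-parent", no_suspend=True)
spihid_gpio = Irq("spihid-gpio", no_suspend=True, parent=pinctrl_parent)
other_gpio = Irq("other-gpio", parent=pinctrl_parent)

enter_s2idle([pinctrl_parent, spihid_gpio, other_gpio])
```

In this model the keyboard GPIO IRQ stays deliverable during s2idle while the other GPIO IRQ is masked, which is the split described in the messages above.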

00:17 <marcan> we unconditionally unmask mailbox IRQs in suspend under the assumption that rtkit drivers will suspend themselves anyway (except SMC which needs to stay alive)

00:17 <marcan> (this is not done properly yet for the helper and for GPU and others)

00:18 <marcan> (nor DCP for that matter, not sure if DCP wants to actually shut down rtkit on suspend?)

00:18 <j`ey> ah, 2e34ca7f13a2d5f8fd665d568b6493a20d4b6efe

00:21 <jannau> ah, so the dead code in spihid makes not much sense for s2idle. I'd assumed s2idle could use the same suspend path

00:25 <jannau> macos powers dcp completely down for DPMS (t8103 with internal displays), I've not yet checked j314 or mac mini/studio

00:37 <marcan> I might do a suspend/resume/spihid hacking stream today, lots of stuff to fix there

00:48 <marcan> https://www.reddit.com/r/AsahiLinux/comments/zjapki/asahi_installation_disk_resize_error/

00:48 <marcan> "Does anyone know how to fix it?" *posts a screenshot that tells them how to fix it*

00:48 <marcan> why are users...

00:54 <bluetail> ^ addiction to instant happy chemicals without ever trying to get their hands dirty

01:03 <TellowKrinkle> "The YouTube tutorial said to do this. When I did it, it didn't do this. It must be broken."

01:06 * bluetail chuckles

01:06 sam___ has quit [Remote host closed the connection]

01:17 zalyx has quit [Quit: later alligator]

01:19 zalyx has joined #asahi-dev

01:42 user982492 has joined #asahi-dev

01:44 user982492 has quit []

01:45 user982492 has joined #asahi-dev

01:56 <marcan> rebased on 6.1 final and merged jannau's stuff. only compile tested, will play around with suspend later today

02:00 user982492 has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]

02:02 jbowen_ has quit []

02:03 jbowen has joined #asahi-dev

02:04 user982492 has joined #asahi-dev

03:16 opticron_ is now known as opticron

04:04 renatorabelo has quit [Quit: Leaving]

04:08 renatorabelo has joined #asahi-dev

05:53 user982492 has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]

06:28 <cyrozap> marcan, sven: Heads-up, there's an issue with ASMedia's USB host controllers where if you try reading from multiple USB devices simultaneously that are connected to the same ASMedia host controller (doesn't matter whether it's through a hub or they're plugged directly into the HC's USB ports), the host controller will attempt a DMA write to an address it doesn't have permission to access. I can easily

06:28 <cyrozap> reproduce the error on POWER9 systems, which have a very strict IOMMU, so I was curious if the same issue affects Apple Silicon systems, which I assume also have a strict IOMMU. Even if the platform doesn't support any kind of capability like EEH (Enhanced Error Handling), where the IOMMU reports misbehaving devices to the OS, it would be interesting to see if any data gets lost or memory corruption

06:28 <cyrozap> occurs.

06:29 <cyrozap> I first wrote about this issue a couple of years ago, but have yet to discover exactly what's going wrong, and where. You can find the original thread where I discussed it here: https://lore.kernel.org/all/CAO3ALPyB1JDvvC27JGgAoTuHh0w+897tPhmTKX9PQWBFCrrnbQ@mail.gmail.com/

07:01 mini0n has quit [Read error: Connection reset by peer]

07:03 <marcan> cyrozap: we have a strict IOMMU and haven't seen such issues yet (maybe it doesn't apply to our firmware? I can't imagine Apple shipping broken USB controllers... wait, I can >_>)

07:03 <marcan> we do have IOMMU fault interrupts hooked up as far as I know

07:13 <sven> yeah, dart will complain in dmesg when there’s a fault

07:15 <marcan> cyrozap: I saw the zero addr quirk thing. for us, we do get full proper fault addresses, so if we can reproduce the same problem we might be able to shed light on it

07:15 <marcan> we could try running the same firmware as you?

07:17 <sven> oh, macOS would also panic if there’s a dart fault fwiw

07:17 <sven> so the usb controller breaking there would be very obvious

07:20 <marcan> yeah

07:30 cylm has joined #asahi-dev

07:35 amarioguy has quit [Ping timeout: 480 seconds]

07:35 amarioguy has joined #asahi-dev

07:37 renato__ has joined #asahi-dev

07:44 renatorabelo has quit [Ping timeout: 480 seconds]

07:47 renato__ has quit [Ping timeout: 480 seconds]

08:10 SSJ_GZ has joined #asahi-dev

08:25 <ChaosPrincess> marcan: mind cherry-picking c5895db and 3e14469 from https://github.com/WhatAmISupposedToPutHere/linux/commits/spi-nor ? Those add the binding for the nvram, so you can have a linux equivalent of `bless`

09:38 mkurz has quit [Ping timeout: 480 seconds]

09:40 bcrumb has joined #asahi-dev

09:59 mkurz has joined #asahi-dev

10:05 mkurz has quit [Remote host closed the connection]

10:05 mkurz has joined #asahi-dev

10:08 mkurz has quit []

10:09 mkurz has joined #asahi-dev

10:10 mkurz has quit []

10:10 mkurz has joined #asahi-dev

10:10 MajorBiscuit has joined #asahi-dev

10:11 mkurz has quit []

10:11 mkurz has joined #asahi-dev

11:14 mkurz has quit [Read error: Connection reset by peer]

11:21 mkurz has joined #asahi-dev

11:29 mkurz has quit [Remote host closed the connection]

11:33 mkurz has joined #asahi-dev

11:34 Mary has quit [Quit: The Lounge - https://thelounge.chat]

11:34 heyuan has joined #asahi-dev

11:34 Mary has joined #asahi-dev

11:43 heyuan has quit [Quit: ERC 5.4 (IRC client for GNU Emacs 28.2)]

11:45 tobhe_ is now known as tobhe

12:20 bcrumb has quit [Quit: WeeChat 3.7.1]

13:01 Mary has quit [Quit: The Lounge - https://thelounge.chat]

13:06 Mary has joined #asahi-dev

13:08 jluthra has quit [Remote host closed the connection]

13:08 jluthra has joined #asahi-dev

13:10 Dementor has quit [Ping timeout: 480 seconds]

14:37 mkurz has quit [Read error: Connection reset by peer]

14:53 mkurz has joined #asahi-dev

16:07 psykose has quit [Remote host closed the connection]

16:08 psykose has joined #asahi-dev

16:23 gladiac has joined #asahi-dev

16:29 user982492 has joined #asahi-dev

16:42 bcrumb has joined #asahi-dev

16:43 bcrumb has quit []

16:43 bcrumb has joined #asahi-dev

16:44 ncopa has joined #asahi-dev

16:48 <ncopa> Hey (with the intention to run alpine linux with asahi kernel)

16:51 <ncopa> actually.... Hey (with the intention to get in touch!)

16:51 <ncopa> > If you are a developer for a distribution and interested in officially supporting Apple Silicon machines, please get in touch!

16:53 <mps> ncopa: join #asahi-alt

16:54 <ncopa> ok. thanks!

16:54 <mps> this is mostly development channel

16:57 <ncopa> "If you are a developer for a distribution and interested in officially supporting Apple Silicon machines, please get in touch! We’d love to work with you to make it happen." https://asahilinux.org/2022/03/asahi-linux-alpha-release/#can-i-install-other-distros

16:57 <ncopa> is that #asahi-alt better for that? ^^^

16:57 <mps> #asahi-alt is for alternate distro discussions

16:58 bcrumb has quit [Quit: WeeChat 3.7.1]

16:58 bcrumb has joined #asahi-dev

16:58 bcrumb has quit []

16:59 <mps> ncopa: some people have already been using alpine on m1 for a full year

17:13 bcrumb has joined #asahi-dev

17:18 mofux has joined #asahi-dev

17:33 gladiac has quit [Quit: k thx bye]

17:36 mofux has quit [Remote host closed the connection]

17:39 mofux has joined #asahi-dev

17:43 user982492 has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]

17:45 MajorBiscuit has quit [Ping timeout: 480 seconds]

17:45 mkurz has quit [Read error: Connection reset by peer]

17:47 mofux has quit [Ping timeout: 480 seconds]

17:51 Dementor has joined #asahi-dev

17:51 mkurz has joined #asahi-dev

19:21 <cyrozap> marcan: Firmware version shouldn't matter--I've seen the issue with every ASMedia USB host controller I've tested, from the original ASM1042 through the ASM3142 (it seems I haven't tried the ASM3242 yet). But I'll load the Apple ASM3142 firmware on an ASM3142 card I have here and see if I can trigger the fault, just to be sure. The most reliable way I've found to trigger it is to plug in two USB NVMe

19:21 <cyrozap> drives into the same host controller, then start two reads from them (`sudo dd if=/dev/sdN of=/dev/null bs=1G iflag=direct status=progress`) as simultaneously as possible. Doing this triggers the issue almost immediately. It also works with USB hard drives, though it may trigger slightly less than immediately.
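
The reproduction steps cyrozap gives above could be scripted roughly as follows, so that both reads start as close together as possible. This is a minimal sketch: the dd arguments are taken from the message above, while the helper names dd_read_cmd and run_parallel are hypothetical. It assumes the script runs as root (the original command uses sudo) and that the caller passes the real block devices.

```python
import subprocess


def dd_read_cmd(device, block_size="1G"):
    """Build the direct-I/O read command quoted in the log for one device."""
    return [
        "dd",
        f"if={device}",
        "of=/dev/null",
        f"bs={block_size}",
        "iflag=direct",
        "status=progress",
    ]


def run_parallel(commands):
    """Start every command before waiting on any, then return the exit codes."""
    # Popen returns as soon as the child is spawned, so the reads overlap.
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [proc.wait() for proc in procs]
```

A call would look like run_parallel([dd_read_cmd("/dev/sdX"), dd_read_cmd("/dev/sdY")]), where /dev/sdX and /dev/sdY are placeholders for the two USB drives on the same host controller.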

19:25 <cyrozap> btw, thanks for helping me look into this!

19:33 renato__ has joined #asahi-dev

19:35 <cyrozap> I'm not sure if there's a way to enable tracing the DMA buffers (or whatever they're called) allocated by the xHCI driver, but assuming you can successfully trigger the fault, it would be helpful to know what buffer is being overrun (assuming that's what the issue is and the xHC isn't just trying to write to some random address).

19:36 renato__ is now known as renatorabelo

19:41 <cyrozap> And if you _can't_ trigger the fault, I'll be really interested to know about that, since that would mean that the issue likely has something to do specifically with the POWER9 IOMMU or its kernel driver (though this issue doesn't occur with any of the TI or Fresco Logic USB host controllers I've tried).

19:52 <zzywysm_> cyrozap: there is also the possibility that Apple's hardware qual team found the same bug and made ASMedia fix it for them

19:53 beeblebrox has joined #asahi-dev

19:53 <zzywysm_> cyrozap: you may notice that SSDs are far more reliable in Apple hardware than anywhere else, and it's for the same reason

19:58 mkurz has quit [Quit: Leaving]

20:00 <cyrozap> zzywysm_: Yeah, that's why I'm testing on my ASM3142 card the firmware that Apple ships--I want to see if Apple had them fix the issue in firmware.

20:12 <sven> if this can be reproduced on macOS I’m pretty sure apple will make then fix it (assuming they haven’t already)

20:12 <sven> *them

20:29 bcrumb has quit [Quit: WeeChat 3.7.1]

20:32 cylm has quit [Ping timeout: 480 seconds]

20:49 <cyrozap> I've confirmed that the issue still exists even when running the firmware Apple ships. Tested with Linux 6.0.11 on POWER9.

20:51 <cyrozap> ASM3142 firmware version: 191118_70_11_11 (the one I extracted from a kernelcache)

21:01 <cyrozap> It'll be interesting to see if this issue happens on Apple hardware, and whether it happens in just Linux or in macOS, too.

21:02 <jannau> cyrozap: seems to work here without issues, both SSDs copied 70GB with 226MB/s (mac studio, linux 6.1-rc8 + asahi)

21:06 kettenis has quit [Remote host closed the connection]

21:06 kettenis has joined #asahi-dev

21:26 <cyrozap> jannau: Interesting... Would you mind running `dd if=/usr/lib/firmware/asmedia/asm2214a-apple.bin bs=1 skip=128 count=6 2>/dev/null | xxd -ps`? That will print the version of the firmware you have installed.

21:28 <jannau> cyrozap: fw version is 191118701108, both SSDs (500G / 256G) now completely read without issue

21:36 <cyrozap> jannau: Thanks for your help! It's especially interesting to see that you're running a slightly older version than what I have (191118_70_11_08 vs. 191118_70_11_11), and yet haven't encountered the issue. Meanwhile, I've uploaded a much more recent firmware (210330_70_02_40) to my ASM3142 card and haven't been able to trigger the fault yet.
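
The dd and xxd pipeline from 21:26 reads 6 bytes at offset 128 of the firmware file and prints them as hex. A rough Python equivalent is sketched below. The offset and length come from that command; the function names are hypothetical, and the underscore grouping simply mirrors how versions are written in this log (for example 191118701108 becomes 191118_70_11_08) rather than any documented ASMedia format.

```python
def read_fw_version_hex(path, offset=128, length=6):
    """Return the version bytes as a lowercase hex string, like `xxd -ps`."""
    with open(path, "rb") as f:
        f.seek(offset)
        raw = f.read(length)
    if len(raw) != length:
        raise ValueError(f"{path} is too short to contain a version field")
    return raw.hex()


def format_fw_version(hex_str):
    """Group 12 hex digits as 6_2_2_2, matching the notation used in the log."""
    if len(hex_str) != 12:
        raise ValueError("expected 12 hex digits")
    return "_".join([hex_str[:6], hex_str[6:8], hex_str[8:10], hex_str[10:12]])
```

For example, format_fw_version(read_fw_version_hex(path)) would be called with path set to the firmware file named in the 21:26 message.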

21:38 <jannau> my firmware version is the one shipped with macos 12.3

21:49 renatorabelo has quit [Ping timeout: 480 seconds]

21:52 renatorabelo has joined #asahi-dev

21:53 <cyrozap> I was using the one from macOS 13.0, but I don't think the slight version difference should matter that much since I know the fault can be triggered on even older versions.

22:12 <cyrozap> I just managed to trigger the fault on firmware version 210330_70_02_40. When I tried earlier, I was using a cheap USB 3 thumb drive (64 GB, 100 MB/s max read speed) and a Crucial P5 250 GB SSD in a SuperSpeed Plus Gen 2x1 enclosure. This time, instead of the cheap USB drive, I used a Crucial P5 Plus 2 TB SSD in a SuperSpeed Plus Gen 2x1 enclosure, so now both SSDs plugged into the card support 10 Gbps

22:12 <cyrozap> USB. This shouldn't be necessary, since I've been able to trigger the fault with mechanical hard drives on 5 Gbps USB, but maybe it's the fact that, even with the newer firmware, the USB link is now getting saturated (I saw ~973 MB/s on the P5 Plus drive before the system detected the bad write).

22:13 <cyrozap> Also, I think it might be easier to trigger the issue with UAS drives.

22:13 <cyrozap> Though it may not be strictly necessary.

23:20 SSJ_GZ has quit [Ping timeout: 480 seconds]