#asahi-dev on 2023-04-28 — irc logs at oftc.irclog.whitequark.org

00:11 brolin has quit [Ping timeout: 480 seconds]

00:39 <maz> I'm running 6.3++ on my mini with mass storage on the USB-A ports, and it works just fine.

00:46 brolin has joined #asahi-dev

01:19 eiln has joined #asahi-dev

01:48 elvishjerricco has joined #asahi-dev

01:54 elvishjerricco has quit [Read error: Connection reset by peer]

01:55 brolin has quit [Ping timeout: 480 seconds]

01:56 brolin has joined #asahi-dev

02:00 gabuscus has quit []

02:06 benpoulson has joined #asahi-dev

02:08 elvishjerricco has joined #asahi-dev

02:12 elvishjerricco has quit [Read error: Connection reset by peer]

02:14 benpoulson has quit [Ping timeout: 480 seconds]

02:14 elvishjerricco has joined #asahi-dev

02:17 gabuscus has joined #asahi-dev

02:17 elvishjerricco has quit [Read error: Connection reset by peer]

02:17 elvishjerricco has joined #asahi-dev

02:23 elvishjerricco has quit [Read error: Connection reset by peer]

02:31 _rudi1 is now known as _rudi

02:32 elvishjerricco has joined #asahi-dev

02:38 benpoulson has joined #asahi-dev

02:47 benpoulson has quit [Ping timeout: 480 seconds]

02:52 brolin has quit [Ping timeout: 480 seconds]

03:00 elvishjerricco has quit [Read error: Connection reset by peer]

03:07 brolin has joined #asahi-dev

03:09 elvishjerricco has joined #asahi-dev

03:12 elvishjerricco has quit [Read error: Connection reset by peer]

03:15 elvishjerricco has joined #asahi-dev

03:57 brolin has quit [Ping timeout: 480 seconds]

04:19 <marcan> if I had to guess this is a bad interaction between a change in 6.3 and that very specific USB device

04:20 <marcan> could even be a firmware bug in the device itself that just coincidentally is getting hit now

04:20 <marcan> one of the things we changed in 6.3 is cpuidle, so it could even be just a random timing difference

04:25 <marcan> the logs just sound like the device itself is resetting, there's no actual errors

04:28 <marcan> I don't see any suspicious commits in usb core or xhci

04:29 <marcan> if I could repro it locally I could outright use a USB analyzer but...

05:04 greguu has joined #asahi-dev

05:45 sjg has quit [Read error: Connection reset by peer]

05:45 msteffen has quit [Read error: Connection reset by peer]

05:45 sjg has joined #asahi-dev

05:45 jbowen has quit [Write error: connection closed]

05:45 msteffen has joined #asahi-dev

05:45 joshtaylor has quit [Read error: Connection reset by peer]

05:45 joshtaylor has joined #asahi-dev

05:45 jbowen has joined #asahi-dev

05:59 cylm_ has joined #asahi-dev

06:48 steven has quit [Quit: ZNC 1.8.2 - https://znc.in]

06:49 steven has joined #asahi-dev

07:12 kettenis has joined #asahi-dev

07:15 rhysmdnz has quit [Quit: Bridge terminating on SIGTERM]

07:15 Guest12357 has quit [Quit: Bridge terminating on SIGTERM]

07:18 Jamie has joined #asahi-dev

07:18 rhysmdnz has joined #asahi-dev

07:18 Jamie is now known as Guest12394

07:24 cylm_ has quit [Ping timeout: 480 seconds]

07:38 kettenis has quit [Ping timeout: 480 seconds]

08:06 drubrkletern has quit [Remote host closed the connection]

08:09 bps has joined #asahi-dev

08:37 nsklaus has joined #asahi-dev

09:00 mattgirv has quit [Ping timeout: 480 seconds]

10:03 hightower3 has joined #asahi-dev

10:09 kettenis has joined #asahi-dev

10:09 hightower2 has quit [Ping timeout: 480 seconds]

10:57 maz has quit [Ping timeout: 480 seconds]

10:59 <lina> marcan: Umm, your cpuidle stuff is making lockdep very angry ^^;;

11:06 <povik> https://github.com/AsahiLinux/linux/commit/7edfa962bd38fc79c38b845c7cc87cc2f762cb42

11:07 <povik> seems like PM core makes DART go suspend even with active consumers, if those consumers don't support runtime PM

11:07 <povik> that goes away when a driver for the consumer runs pm_runtime_set_active explicitly

11:07 <povik> could this be the same issue as the one being worked around for PCIe?

11:08 <povik> this happens with SIO (yes, that was the SIO boot breakage...) for which pm_runtime_enabled is false though

11:08 <povik> that's at probe time, maybe pm_runtime_enabled is true, like for those PCIe devices, when the device link is being set up

11:09 <povik> jannau: ^

11:44 <jannau> povik: did you already rebase on asahi-wip?

11:45 <jannau> is the problem that dart-sio gets probed, goes into runtime suspend, the power-domain gets shut down, and then sio probes?

11:46 <jannau> that should be fixed in asahi-wip

11:47 <povik> i don't think this is the same issue you have in mind

11:47 <povik> the dart gets suspended/resumed while sio is in probe

11:47 <povik> and of course it gets hit by the dart resume reset like the pcie devices get

11:47 <povik> so PM core doesn't keep DART up while SIO is up

11:50 <jannau> are dart-sio and dart in different power-domains? I don't think runtime-pm state of dart-sio matters as long as the pd is on

11:51 <povik> it does because of this reset

11:51 <povik> https://github.com/AsahiLinux/linux/blob/2fb9510f953092b1cc94e0cb26713f2f152754f8/drivers/iommu/apple-dart.c#L1341

11:51 <jannau> that's not the problem for pcie though

11:51 <povik> and the same DART will get resumed because of ADMAC, which shares it

11:56 VinDuv has quit [Quit: ZNC 1.8.2+deb2+b1 - https://znc.in]

11:56 VinDuv has joined #asahi-dev

12:07 <jannau> we might want DL_FLAG_RPM_ACTIVE for every device. I suppose that fixes sio

12:07 cafebabe has quit [Quit: Connection closed for inactivity]

12:07 <jannau> after checking Documentation/driver-api/device_link.rst

12:08 <povik> that would fix it, but my understanding is DL_FLAG_RPM_ACTIVE would keep the DART up even for devices that do support runtime PM and are suspended

12:10 <povik> i am surprised it doesn't work as is, that the device link to a consumer not supporting runtime PM doesn't prevent the DART from suspending implicitly

12:10 <jannau> "DL_FLAG_RPM_ACTIVE can be specified to runtime resume the supplier and prevent it from suspending before the consumer is runtime suspended."

12:10 <povik> aha.

12:10 <povik> i only read the text above device_link_add

12:10 <povik> which wasn't as clear

12:13 <povik> the qualifier before that makes me a little uneasy though: 'Two other flags are specifically targeted at use cases where the device link is added from the consumer's ``->probe`` callback:'

12:14 <povik> but maybe it does work as we need

12:15 <jannau> apple_dart_probe_device() is called from the driver core before the consumer's probe callback

12:15 <povik> yeah
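
[editor's sketch] The exchange above turns on when a device link keeps its supplier (the DART) resumed. The toy model below is only an illustration of the documentation sentence jannau quotes, not kernel code: the `Device` and `Link` classes, the `supplier_stays_up` helper, and the numeric flag values are all hypothetical, and the real logic in drivers/base/core.c and drivers/base/power/runtime.c has more cases. It assumes that a plain runtime-PM link only pins the supplier when the consumer itself runtime-resumes, which a consumer without runtime PM never does.

```python
from dataclasses import dataclass

# Toy stand-ins named after the kernel flags; the numeric values are made up.
DL_FLAG_PM_RUNTIME = 1 << 0
DL_FLAG_RPM_ACTIVE = 1 << 1


@dataclass
class Device:
    name: str
    rpm_enabled: bool = False  # does the driver use runtime PM at all?
    usage: int = 0             # references that keep the device resumed

    def may_suspend(self) -> bool:
        return self.usage == 0


class Link:
    """Consumer -> supplier link in this toy model (not kernel code)."""

    def __init__(self, consumer: Device, supplier: Device, flags: int):
        self.consumer = consumer
        self.supplier = supplier
        self.flags = flags
        self.holds_ref = False
        if flags & DL_FLAG_RPM_ACTIVE:
            # Resume the supplier now and pin it until the consumer suspends.
            self._get()

    def _get(self) -> None:
        if not self.holds_ref:
            self.supplier.usage += 1
            self.holds_ref = True

    def _put(self) -> None:
        if self.holds_ref:
            self.supplier.usage -= 1
            self.holds_ref = False

    def consumer_runtime_resume(self) -> None:
        # Only consumers that use runtime PM ever go through this path.
        if self.consumer.rpm_enabled and self.flags & DL_FLAG_PM_RUNTIME:
            self._get()

    def consumer_runtime_suspend(self) -> None:
        if self.consumer.rpm_enabled:
            self._put()


def supplier_stays_up(flags: int, consumer_rpm_enabled: bool) -> bool:
    """True if the supplier is pinned right after the link is created."""
    dart = Device("dart")
    consumer = Device("consumer", rpm_enabled=consumer_rpm_enabled)
    Link(consumer, dart, flags)
    return not dart.may_suspend()
```

In this model a consumer without runtime PM (like SIO in the discussion) leaves the DART free to suspend unless the link also carries the RPM_ACTIVE stand-in, while a consumer that does use runtime PM releases the DART again once it suspends.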

12:42 roxfan2 has joined #asahi-dev

12:43 roxfan has quit [Ping timeout: 480 seconds]

12:43 maz has joined #asahi-dev

13:04 bps2 has joined #asahi-dev

13:10 bps has quit [Ping timeout: 480 seconds]

13:20 abd has joined #asahi-dev

14:08 gladiac has joined #asahi-dev

15:00 bps2 has quit [Ping timeout: 480 seconds]

15:12 <ChaosPrincess> is dcp the only thing rn that is firmware version dependent?

15:13 <marcan> and GPU

15:13 <marcan> and some stuff in the ADT bindings but only m1n1 cares about that

15:14 <ChaosPrincess> right, and we are about to support 13.3 firmwares soon(tm)?

15:14 <marcan> correct

15:14 <ChaosPrincess> cool

15:14 <marcan> lina: argh :(

15:14 <marcan> I'll take a look

15:17 <ChaosPrincess> will we have a path to update 12.3 machines to newer versions?

15:23 <marcan> once it's useful, yes

15:23 <marcan> initially only the new machines will get 13.3

15:23 <ChaosPrincess> im sorta maybe looking at another ip block that may end up being firmware dependent

15:24 <ChaosPrincess> idk for sure yet, but seems to be

15:25 <_jannau_> there goes my plan to support dcpext only on 13.3 or later (more out of annoyance than difficulty to support both)

15:25 <marcan> lina: see https://lore.kernel.org/linux-arm-kernel/20220726145430.bfwidmw6xmeppbfb@bogus/ , looks like it's known broken upstream (turn off CONFIG_PROVE_RAW_LOCK_NESTING to shut it up)

15:26 <marcan> _jannau_: dcpext would be a very good reason to call it useful

15:26 cylm_ has joined #asahi-dev

15:26 <marcan> what I mean is that we'll probably release in O(1-2 weeks) for M2 Pro/Max only

15:26 <marcan> I don't expect dcpext to be done by then :p

15:31 <_jannau_> not sure, it seems to work fine ignoring type-c/atc-phy. I'm currently down to tipd behaves differently on every other connect but at least consistently

15:31 <kettenis> I should have some time to get u-boot in releasable shape next week

15:32 <marcan> still need to fix oslog but if you're bored: for MTP, we need to allocate oslog out of the end of the sram region in the DT

15:32 <_jannau_> but yes, 1-2 weeks is not enough time to test it on other systems and fix issues

15:32 <marcan> instead of RAM

15:32 <marcan> since that sticks in linux, and forwarding RAM mappings is a major PITA

15:32 balrog has quit [Quit: Bye]

15:32 <marcan> also the size field extends down 12 more bits in the message, the DVA isn't as large as we thought

15:32 <marcan> size is in bytes

15:33 <marcan> (not 4k pages)

15:33 <kettenis> that makes more sense

15:33 <marcan> there's plenty of space at the end of SRAM so it's perfectly fine to steal some space for oslog there IMO

15:34 <kettenis> you're going to change the linux driver to do that?

15:34 jbowen has quit []

15:34 <marcan> the linux driver doesn't have to do anything :)

15:34 <marcan> since it never chainloads anything else and will inherit that from u-boot if chainloaded

15:34 jbowen has joined #asahi-dev

15:35 <kettenis> ah, ok

15:35 <marcan> (need to fix the size field though, just that)

15:36 <_jannau_> it would make sense to allocate from sram on linux as well so that the behavior between us booting with chainload/HV and user install with u-boot is consistent

15:36 <ChaosPrincess> what does sram mean in this context? s=static or s=shared?

15:36 <marcan> both I guess?

15:36 <ChaosPrincess> s=static sounds expensive

15:37 <marcan> _jannau_: doesn't make much of a difference though

15:37 <marcan> ChaosPrincess: eh, it's 1MB

15:37 <marcan> I mean I guess it could be eDRAM but I don't know if they do that

15:38 balrog has joined #asahi-dev

15:39 gordonfreeman has joined #asahi-dev

15:42 <gordonfreeman> hello! I'm trying to move the single step handler of m1n1 from py to c. single stepping is ok, but I couldn't handle the SSTEP_LOWER exception which comes after stepping.

15:43 <gordonfreeman> I'm actually moving the py code to c verbatim. The issue in the python comment in handle_step happens in C, too: "not sure why MDSCR_EL1.SS needs to be disabled here but otherwise if also SPSR.SS=0 no instruction will be executed after eret and instead a debug exception is generated again"

15:45 <gordonfreeman> what I currently do in handle_step C version is: u64 mdscr_el1 = BIT(15); /* MDE=1 */ msr(MDSCR_EL1, mdscr_el1); return true;

15:45 <gordonfreeman> this prevents the debug exception generation but also the kernel panics after the first handling.

15:45 <gordonfreeman> Am I missing something?
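
[editor's sketch] The behaviour quoted from the Python comment matches the Arm software-step state machine, which depends on both MDSCR_EL1.SS and the SS bit restored from SPSR on eret. The sketch below only classifies the three architectural states from those two bits. The function name `step_state` is hypothetical, and it assumes software step exceptions are enabled for the exception level being returned to. It does not try to explain the panic reported above.

```python
# Bit positions from the Arm architecture (AArch64).
MDSCR_SS = 1 << 0    # MDSCR_EL1.SS: software step enable
MDSCR_MDE = 1 << 15  # MDSCR_EL1.MDE: monitor debug events enable
SPSR_SS = 1 << 21    # SPSR_ELx.SS: restored into PSTATE.SS on eret


def step_state(mdscr: int, spsr: int) -> str:
    """Classify the software-step state that applies after eret.

    Assumes step exceptions are enabled for the target exception level.
    """
    if not mdscr & MDSCR_SS:
        return "inactive"            # no stepping at all
    if spsr & SPSR_SS:
        return "active-not-pending"  # one instruction runs, then a step exception
    return "active-pending"          # step exception before any instruction runs
```

Writing only BIT(15) to MDSCR_EL1, as in the C snippet above, clears SS, so `step_state(MDSCR_MDE, 0)` lands in the inactive state. Leaving MDSCR_EL1.SS set while SPSR.SS is 0 gives the active-pending state, which is the "debug exception is generated again" case the comment describes.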

16:00 <povik> jannau: reading further in drivers/base/core.c and power/runtime.c that ACTIVE flag indeed looks like what we want

16:10 <mkurz> marcan: you say "...release in O(1-2 weeks) for M2 Pro/Max only...", are there also plans to release the gpu/mesa fixes for m1 pro/max as well soon? Thanks!

16:15 gladiac has quit [Quit: k thx bye]

16:18 <sven> marcan: well, I technically have about two weeks of overtime that I can take as vacation now :D

16:19 <sven> probably won’t do that because I’d go crazy fighting against the atcphy and typec mess for two weeks to get dcpext working ;)

16:26 <jannau> sven: do I remember correctly that you got at some point TimingElements with empty ColorModes for dcpext? do you remember what caused that? I see that currently on 13.3

16:38 <sven> yeah, that was when the dptx dance was incorrect

16:38 <sven> I didn’t know about the dcp->ap calls then iirc

16:50 <jannau> hmm, those are still coming through and the identifiers should be correct but I guess some message could have changed as well

16:51 <sven> is that on a m1?

16:52 <jannau> and there is still the sub header version increased to 4 and possibly 2 additional bytes of padding in the epic cmd header

16:52 <jannau> yes, m1 mac mini

16:59 crabbedhaloablut has quit []

17:01 crabbedhaloablut has joined #asahi-dev

17:13 <povik> wrote up what we know in a patch:

17:13 <povik> https://tpaste.us/PEQ1

17:28 pthariensflame has joined #asahi-dev

17:33 <jannau> fixed, got the reply to the max lane count wrong

17:33 pthariensflame has quit [Quit: Textual IRC Client: www.textualapp.com]

17:58 brolin has joined #asahi-dev

18:26 gordonfreeman has quit [Remote host closed the connection]

19:28 brolin has quit [Ping timeout: 480 seconds]

19:30 brolin has joined #asahi-dev

19:39 brolin has quit [Ping timeout: 480 seconds]

19:40 brolin has joined #asahi-dev

20:01 brolin has quit [Ping timeout: 480 seconds]

20:12 roxfan has joined #asahi-dev

20:15 roxfan2 has quit [Ping timeout: 480 seconds]

20:29 bcrumb has joined #asahi-dev

20:36 abd has quit [Ping timeout: 480 seconds]

20:37 bcrumb has quit [Quit: WeeChat 3.8]

21:09 abd has joined #asahi-dev

21:34 brolin has joined #asahi-dev

21:49 kettenis has quit [Ping timeout: 480 seconds]

21:55 cylm_ has quit [Ping timeout: 480 seconds]

22:12 kettenis has joined #asahi-dev

22:17 bcrumb has joined #asahi-dev

22:17 bcrumb has quit []

22:18 eiln has quit [Read error: Connection reset by peer]

23:36 kettenis has quit [Ping timeout: 480 seconds]