#asahi-dev on 2021-12-18 — irc logs at oftc.irclog.whitequark.org

2021-07-26 22:57 ChanServ changed the topic of #asahi-dev to: Asahi Linux: porting Linux to Apple Silicon macs | General development | GitHub: https://alx.sh/g | Wiki: https://alx.sh/w | Logs: https://alx.sh/l/asahi-dev

01:02 <tpw_rules> hm, that sure seems to be the case. bizarre

01:10 bpye has joined #asahi-dev

01:11 skipwich has quit [Quit: DISCONNECT]

01:12 skipwich has joined #asahi-dev

01:14 bpye4 has joined #asahi-dev

01:14 skipwich has quit []

01:14 <marcan> sven: that would be a cute hack, if we can just have the kmutil m1n1 chainload something from AuxKC :D

01:16 <tpw_rules> marcan: btw i fixed the issues with the device tree. u-boot has a hook to essentially do a merge of the device tree it loads with its own device tree

01:18 <marcan> tpw_rules: it works if you haven't fully initialized rtkit yet

01:18 <marcan> it's weird

01:18 <marcan> we don't know why it doesn't work if you do a full init

01:19 <tpw_rules> "it"?

01:19 rwhitby has joined #asahi-dev

01:19 <marcan> the reset thing

01:19 bpye has quit [Ping timeout: 480 seconds]

01:20 <tpw_rules> oh. well i'm not sure if what u-boot is doing is a full init. do you remember exactly how you tested that function?

01:20 <tpw_rules> i mean my whole problem is it's not initialized enough i can get it to shut back down

01:20 <tpw_rules> so maybe i'm just not testing the reset properly

01:21 bpye has joined #asahi-dev

01:22 rwhitby has quit []

01:22 alyssa has quit [Quit: leaving]

01:26 bpye4 has quit [Ping timeout: 480 seconds]

01:45 Dcow_ has joined #asahi-dev

02:20 Emantor has quit [Quit: ZNC - http://znc.in]

02:20 Emantor has joined #asahi-dev

02:23 <marcan> tpw_rules: the reset seems to work if you have it crash init by not setting up SART properly. IIRC

02:23 <marcan> but if it progresses beyond that it stops working

02:23 <marcan> I've also seen it half-work; you get the first message from RTKit, then it hangs

02:23 yuyichao has joined #asahi-dev

02:49 <tpw_rules> hm, it seems not setting up SART at all is not enough to let it be resettable

02:53 riker77_ has joined #asahi-dev

02:58 riker77 has quit [Ping timeout: 480 seconds]

02:58 riker77_ is now known as riker77

03:44 kov has quit [Quit: Coyote finally caught me]

03:44 kov has joined #asahi-dev

03:52 skoobasteeve has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]

04:01 PhilippvK has joined #asahi-dev

04:01 skoobasteeve has joined #asahi-dev

04:04 phiologe has quit [Ping timeout: 480 seconds]

04:33 Dcow_ has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]

04:37 Dcow has quit [Ping timeout: 480 seconds]

04:39 Dcow has joined #asahi-dev

04:44 amarioguy has quit [Ping timeout: 480 seconds]

04:46 Dcow_ has joined #asahi-dev

04:51 Dcow has quit [Ping timeout: 480 seconds]

05:09 <Jamie[m]1> does anyone have a convenient method for syncing up events in MMIO traces to the thing that caused it (i.e. a way to get a marker to show up in the trace from macOS userland?)

05:21 <marcan> Jamie[m]1: you could use the m1racles register :D

05:22 <marcan> hv_exc.c has a handler for it, you could have that emit an event

05:22 <marcan> and then read/write that register to trigger events; you get a whole two bits for writes (though macOS writes zeroes to it all the time, so you need to filter that out either by value or EL1)

05:23 <marcan> though since this is trapped, you can use the full 64 bits of data

05:23 <marcan> or indeed as many registers as you want, treating it as a weirdo hypercall

05:23 <marcan> I'd actually been meaning to implement this for this use case, especially for the GPU stuff, just haven't gotten there yet

05:24 <marcan> another option would be a debug break; I think we trap brks unconditionally right now (and lower them automatically since a patch I sent some time ago), at least from EL1 but presumably EL0 too

05:25 Glanzmann has joined #asahi-dev

05:25 <Glanzmann> tpw_rules: I'm your beta tester.

05:30 <Jamie[m]1> haha sounds good

05:32 skipwich has joined #asahi-dev

05:37 <tpw_rules> kettenis: i fixed u-boot. my shower thought is that the addresses you give to rtkit should probably be 16k aligned

05:37 <tpw_rules> and doing that fixed it it seems. i'll actually like clean up and test it tomorrow

05:38 <tpw_rules> anyway night

07:20 skipwich has quit [Quit: DISCONNECT]

07:38 amarioguy has joined #asahi-dev

08:15 amarioguy has quit [Ping timeout: 480 seconds]

08:33 the_lanetly_052 has joined #asahi-dev

08:46 ChaosPrincess has quit [Quit: WeeChat 3.3]

08:46 thunfisch has quit [Remote host closed the connection]

08:47 thunfisch has joined #asahi-dev

09:00 l3k[m] has joined #asahi-dev

09:00 Sebhl[m] has joined #asahi-dev

09:01 thunfisch has quit [Ping timeout: 480 seconds]

09:24 <sven> Hmmm… iirc SART should allow anything aligned to 4K

09:25 <sven> im still baffled SART can be skipped and nvme then still works :/

09:27 aleasto has joined #asahi-dev

09:30 thunfisch has joined #asahi-dev

09:32 <Glanzmann> sven: Porting the mac mini to Linux was much easier: https://github.com/torvalds/linux/commit/b74ba22f030eb7ab88f7d8954ad18ecc0ac5ce3c

09:33 <Glanzmann> s/the/the last/

09:34 <j`ey> and the next macs will also be easier than the current gen!

09:34 <j`ey> (hopefully :))

09:34 <sven> M1 pro/Max was already simple even though they changed a lot

09:35 <kettenis> sven: I can confirm that if I provide memory blocks that are 16K aligned, I do see the final APPLE_RTKIT_MGMT_SET_AP_PWR_STATE message

09:36 <sven> weird

09:36 <sven> but good to know

09:36 <kettenis> but then nvme in u-boot hangs, presumably because I'm not acking syslog messages

09:36 <sven> so I guess only the syslog thread hangs then if you don’t setup the SART buffers correctly

10:45 ChaosPrincess has joined #asahi-dev

11:09 amarioguy has joined #asahi-dev

11:45 amarioguy has quit [Ping timeout: 480 seconds]

12:02 PhilippvK has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]

12:06 phiologe has joined #asahi-dev

12:46 Dcow has joined #asahi-dev

13:20 VinDuv has quit [Quit: ZNC 1.8.2+deb2+b1 - https://znc.in]

13:57 Dcow has quit [Quit: Textual IRC Client: www.textualapp.com]

14:17 Dcow has joined #asahi-dev

14:30 yuyichao has quit [Quit: Konversation terminated!]

14:31 yuyichao has joined #asahi-dev

14:48 aeadio has joined #asahi-dev

14:48 aead has quit [Remote host closed the connection]

14:48 MajorBiscuit has joined #asahi-dev

15:15 MajorBiscuit has quit [Quit: WeeChat 3.3]

15:15 MajorBiscuit has joined #asahi-dev

15:21 Dcow has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]

15:25 VinDuv has joined #asahi-dev

15:26 m42uko has quit [Ping timeout: 480 seconds]

15:31 MajorBiscuit has quit [Ping timeout: 480 seconds]

15:35 Dcow has joined #asahi-dev

15:38 m42uko has joined #asahi-dev

15:44 Dcow has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]

15:55 Dcow has joined #asahi-dev

15:55 amarioguy has joined #asahi-dev

16:04 Dcow has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]

16:15 tdmm has joined #asahi-dev

16:24 <tpw_rules> sven: the problem is specifically giving rtkit non 16k aligned addresses. sart doesn't seem to care

16:25 <sven> yes, I just refer to them as SART buffers as opposed to NVMMU buffers used for most nvme data

16:25 <tpw_rules> ah ok

16:26 <sven> I guess rtkit runs with 16k pages as well and just doesn’t want to map non aligned buffers for those

16:26 <sven> SART should only require 4K alignment or so
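
The alignment point can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the 16K figure is the value reported to work in this log, and the helper names `align_up` and `is_aligned` are hypothetical, not taken from u-boot, m1n1, or the Linux driver.

```python
RTKIT_ALIGN = 0x4000  # 16 KiB, the alignment that reportedly made RTKit accept the buffers


def align_up(addr, align=RTKIT_ALIGN):
    """Round addr up to the next multiple of align (align must be a power of two)."""
    return (addr + align - 1) & ~(align - 1)


def is_aligned(addr, align=RTKIT_ALIGN):
    """Return True if addr is a multiple of align (align must be a power of two)."""
    return addr & (align - 1) == 0


# A 4K-aligned address is not necessarily 16K-aligned.
candidate = 0x1000
fixed = align_up(candidate)
```

Under this assumption, a buffer at 0x1000 would satisfy a 4K requirement such as the one sven describes for SART, yet would still need to be moved up to 0x4000 before being handed to RTKit.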

16:43 Dcow has joined #asahi-dev

16:50 tdmm has left #asahi-dev [#asahi-dev]

16:53 aleasto has quit [Ping timeout: 480 seconds]

17:01 the_lanetly_052 has quit [Ping timeout: 480 seconds]

17:05 amarioguy has quit [Ping timeout: 480 seconds]

17:11 amarioguy has joined #asahi-dev

17:11 <kettenis> sven: so if I don't start the syslog endpoint nvme works in u-boot and in OpenBSD

17:12 <sven> hrm, so that one isn't mandatory after all then

17:12 <sven> let me try that with the firmware i have

17:12 <tpw_rules> are you able to shut the processor down?

17:13 <kettenis> no, because I have no way to verify that I can start it back up again

17:15 <kettenis> and if I start the syslog endpoint I run into issues

17:16 <kettenis> I can make u-boot work if I poll the mailbox

17:16 <kettenis> and afterwards I can use nvme on the mini which boots from usb

17:17 <kettenis> but on the macbook pro (which boots from nvme) OpenBSD hangs

17:17 <tpw_rules> well let me see if i can shut it down then. for whatever reason i don't actually get any syslog messages besides that 0xb type response

17:18 <sven> you'll only get syslog messages once you start bringing up nvme

17:21 <kettenis> I see the syslog messages if I start the endpoint now

17:21 <sven> feels like we have different firmware then :D

17:21 <tpw_rules> maybe i am not checking for them in the right place then

17:21 <sven> iirc the first message for me comes when the admin queue is set up

17:22 <kettenis> the messages start appearing when I start initializing the nvme stuff

17:22 <sven> ah ok, i just misunderstood then

17:22 <kettenis> not sure if that is when the admin queue is set up

17:26 <tpw_rules> no, it seems if i never initialize syslog, then the management interface doesn't respond to anything

17:26 <tpw_rules> endpoint*

17:26 <sven> that sounds like what happened when i tried to boot without syslog originally

17:28 <sven> maybe we should define what we mean by "enable syslog". could be "send start ep for the syslog endpoint" but could also be "send this 0xb command which we thought was syslog_init"

17:28 <tpw_rules> i just tried that first thing

17:32 <kettenis> https://gist.github.com/kettenis/b0a5f5e6a6d5957a2bc93ea5707e4603

17:33 <kettenis> that is what happens if I don't enable syslog

17:34 <tpw_rules> and trying that second thing, the management interface still does respond to the shutdown messages. and uboot is able to reinitialize rtkit, but hangs trying to reinitialize the nvme itself

17:35 <sven> hm... so i know that the syslog thread in the rtkit firmware registers a callback for "power state changes" (i.e. that 0xb) command. so maybe that just gets stuck if the endpoint isn't initialized?

17:35 <tpw_rules> although now that i think about it, my shutdown code does not try to shut down the nvme. so maybe i need to do that better

17:36 wCPO has joined #asahi-dev

17:42 <tpw_rules> well, maybe linux knows how to do that

17:45 <tpw_rules> hm, it sure seems to... let me garbage collect my setup and make sure this works for real

17:45 <tpw_rules> sven: good call on not sending that 0xb command

17:51 <sven> so the thing is if you send 0xb ("set AP power state") twice with the same power state RTKit panics

17:52 <sven> if you never send 0xb with 0x20 as the argument the power state remains at 0x10

17:52 <sven> and that's also the thing you have to send to shut it down

17:53 <tpw_rules> i'm sending 0x01 to shut it down

17:53 <tpw_rules> but 0x10 with message type 6 ("set IOP power state")

17:54 <sven> hrm, i might've confused those two

17:54 <sven> but they are modeled after what iBoot does anyway

17:55 <sven> so if RTKit thinks the AP power state is 0x01 and you send it 0xb with 0x01 again it'll just panic

17:55 <tpw_rules> that's rude of it

17:55 <sven> yup. and what's even worse is that flipping that reset bit can't revive the co-processor after a panic
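
One way to avoid the duplicate-state panic described here is to remember the last AP power state that was sent and skip repeats. The following is a minimal sketch of that idea, not code from any of the projects discussed: the class name, the `send` callable and its signature are hypothetical, and the message type 0xb and the state values are simply the ones quoted in this conversation.

```python
MSG_SET_AP_PWR_STATE = 0xB  # "set AP power state", as named in the log


class ApPowerStateGuard:
    """Remembers the last AP power state sent and suppresses repeats."""

    def __init__(self, send):
        self._send = send  # hypothetical callable: send(msg_type, value)
        self._last = None  # nothing sent yet by this guard

    def set_ap_power_state(self, state):
        """Send the state unless it equals the previous one; return True if sent."""
        if state == self._last:
            return False  # a repeat is reported to make RTKit panic, so skip it
        self._send(MSG_SET_AP_PWR_STATE, state)
        self._last = state
        return True


# Record outgoing messages in a list instead of talking to real hardware.
sent = []
guard = ApPowerStateGuard(lambda msg_type, value: sent.append((msg_type, value)))
guard.set_ap_power_state(0x20)
guard.set_ap_power_state(0x20)  # suppressed duplicate
guard.set_ap_power_state(0x01)
```

A guard like this only knows about messages it sent itself. If an earlier boot stage already set the state, the guard starts out blind, which matches the situation in this log where u-boot and Linux each talk to the same co-processor.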

17:56 <tpw_rules> but probably explains why earlier experiments were having linux tell me it crashed

17:56 <sven> the code in nvme/dev parses the crashlog it sends fwiw

17:56 <sven> it would've complained about something like "Invalid AP Power" there

17:58 <tpw_rules> i never quite figured out how to actually dump them for that to parse

17:58 <sven> hm?

17:58 <sven> the crashlog is just written to the buffer it requested originally

17:59 <tpw_rules> oh, you mean the linux branch

17:59 <tpw_rules> i thought you were talking about m1n1

17:59 <sven> :D

17:59 <sven> m1n1 also has a crashlog parser somewhere

17:59 <tpw_rules> yeah but it just accepts a file on the hard disk, which i'm not sure how to get

17:59 <sven> ah

17:59 <sven> just open("crashlog.b", "w").write(iface.readmem(crashlog_addr, 0x8000)) or something like that
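
A slightly more careful version of that one-liner might look like the sketch below. It assumes the memory reader returns a bytes-like object, so the file is opened in binary mode ("wb") rather than text mode. `dump_crashlog` and `fake_readmem` are hypothetical names; `fake_readmem` only stands in for something like `iface.readmem` so the example runs without hardware.

```python
import os
import tempfile


def dump_crashlog(readmem, addr, size=0x8000, path="crashlog.b"):
    """Read size bytes at addr via readmem and write them to path in binary mode."""
    data = bytes(readmem(addr, size))
    with open(path, "wb") as f:  # binary mode: the crashlog is raw bytes, not text
        f.write(data)
    return len(data)


def fake_readmem(addr, size):
    """Stand-in for a real memory reader; returns size zero bytes."""
    return bytes(size)


# Write into a temporary directory so the example leaves the working directory alone.
out_path = os.path.join(tempfile.mkdtemp(), "crashlog.b")
written = dump_crashlog(fake_readmem, 0x1000, path=out_path)
```

The 0x8000 default is the size used in sven's line above; the real buffer size is whatever the firmware requested for its crashlog buffer.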

18:08 MajorBiscuit has joined #asahi-dev

18:09 MajorBiscuit has quit []

18:14 <tpw_rules> hm, i think i made a mistake when testing because i can't get it to work now. it just crashes in linux again. but i will try the parser out

18:16 <kettenis> fwiw, i pushed some fixes to the u-boot branch

18:16 <kettenis> including the hack to not start the syslog endpoint

18:22 aleasto has joined #asahi-dev

18:28 <tpw_rules> ah, simply not sending the 0xb command to shut it down seems to solve the problem for good

18:28 <tpw_rules> but kettenis i assume you are not interested in having uboot shut nvme down yet because that will break openbsd

18:33 nabaiste^ has joined #asahi-dev

18:36 <kettenis> right

18:36 <kettenis> I need to add the code to start things back up to OpenBSD first

18:48 MajorBiscuit has joined #asahi-dev

19:01 Glanzmann has quit [Quit: EOF]

19:06 Major_Biscuit has joined #asahi-dev

19:07 MajorBiscuit has quit [Ping timeout: 480 seconds]

19:21 yamii has quit [Quit: WeeChat 3.3]

19:22 yamii has joined #asahi-dev

19:41 Dcow has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]

19:49 aeadio has quit [Quit: ZNC - https://znc.in]

20:06 aead has joined #asahi-dev

20:10 MajorBiscuit has joined #asahi-dev

20:11 <jannau> sigh, I still don't understand how sio is set up but at least I can now read the SPI transfers through RegMonitor

20:15 <jannau> the dma buffers are unfortunately not at constant offsets in SIO's memory

20:15 MajorBiscuit has quit []

20:16 Major_Biscuit has quit [Ping timeout: 480 seconds]

20:16 <jannau> but the device provides the information available in ioreg

20:16 <jannau> i.e. product and vendor id's and strings

20:17 <kettenis> I suppose using gpiola to decode the transfers is a bit too involved...

20:17 <sven> hrm, doesn't it just map buffers in DART and then send the points as messages?

20:18 <sven> *pointers

20:20 <sven> you can also pass smartio_debug=0xff and make XNU more verbose

20:21 MajorBiscuit has joined #asahi-dev

20:29 <jannau> I can't spot a pointer in the messages

20:29 aleasto has quit [Remote host closed the connection]

20:30 <sven> hm.. so what i remember from looking at it very briefly was that there were a lot of setup messages at the beginning

20:30 <sven> oh... i think the pointer was shifted

20:30 <sven> and it was at the beginning of the IOVA space iirc

20:31 <jannau> yes, there are a couple of setup messages

20:31 <sven> those already should have pointers to IOVA space in them

20:31 MajorBiscuit has quit [Quit: WeeChat 3.3]

20:32 MajorBiscuit has joined #asahi-dev

20:32 * sven should've taken notes :(

20:35 <sven> sio_dart_tracer.dart.iotranslate(0x388<<12) that's still in my m1n1_history
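
The `0x388<<12` in that history line is a page-number-to-address conversion. The sketch below shows the arithmetic under the assumption that the message field holds the IOVA shifted right by 12 bits; whether every SIO setup message uses that encoding is not established in this log, and the function names are hypothetical.

```python
PAGE_SHIFT = 12  # shift taken from the m1n1 history line above


def field_to_iova(field, shift=PAGE_SHIFT):
    """Turn a shifted pointer field from a message into a full IOVA."""
    return field << shift


def iova_to_field(iova, shift=PAGE_SHIFT):
    """Inverse of field_to_iova; the IOVA must be aligned to 1 << shift."""
    if iova & ((1 << shift) - 1):
        raise ValueError("IOVA is not aligned to the shift granularity")
    return iova >> shift


iova = field_to_iova(0x388)
```

Here `iova` is 0x388000, the value that would be passed to the DART tracer's `iotranslate` in the line above to find the backing physical memory.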

20:45 <jannau> https://gist.github.com/jannau/b111faa66f59cd7eae9dca56bddfec75 is what I see

20:47 <jannau> ignore the identifiers except for CHANNEL I have no evidence for anything

21:00 <jannau> it might write the address I'm interested in at a fixed offset from the address of one setup message

21:05 <sven> hm... i never got to the actual DMA packets fwiw

21:06 <sven> could also be that it just allocates a ringbuffer at the beginning and then just keeps reusing that somehow

21:10 <jannau> for spi3 it appears to be a fixed rx and a fixed tx buffer

21:24 ey3ball[m] has joined #asahi-dev

21:25 <povik> 21:20 < sven> you can also pass smartio_debug=0xff and make XNU more verbose

21:25 <povik> don't tell me sio stands for smartio

21:29 <sven> ioreg | grep -i smartio

21:29 <sven> | | | | +-o AppleSmartIO <class AppleSmartIO, id 0x100000499, registered, matched, active, busy 0 (17 ms), retain 6>

21:29 <sven> | | | | +-o sio-dma <class AppleSmartIODMANub, id 0x100000189, registered, matched, active, busy 0 (14 ms), retain 8>

21:29 <sven> | | | | +-o IODMAController0000008A <class AppleSmartIODMAController, id 0x1000004be, registered, matched, active, busy 0 (3 ms), retain 11>

21:29 <sven> povik: ^--

21:35 <povik> oh no

21:35 <povik> that strikes me as ironic given what we suspect the sio coprocessor does/is :-p

21:36 <sven> its firmware is ~1MB or so. assuming that most of that is rtkit it shouldn't be too hard to confirm how smart it actually is :)

22:12 <sven> huh. i just shutdown nvme after writing a lot of data and it seems that remove_sq/cq actually flushed some more data because it took a few seconds to complete. and that was after a flush already

22:28 MajorBiscuit has quit [Quit: WeeChat 3.3]

22:34 Glanzmann has joined #asahi-dev

22:35 MajorBiscuit has joined #asahi-dev

22:49 psykose has quit [Ping timeout: 480 seconds]

22:49 psykose has joined #asahi-dev

23:34 MajorBiscuit has quit [Quit: WeeChat 3.3]

23:36 Dcow has joined #asahi-dev

23:46 MajorBiscuit has joined #asahi-dev