marcan changed the topic of #asahi to: Asahi Linux: porting Linux to Apple Silicon macs | General project discussion | GitHub: https://alx.sh/g | Wiki: https://alx.sh/w | Topics: #asahi-dev #asahi-re #asahi-gpu #asahi-stream #asahi-offtopic | Keep things on topic | Logs: https://alx.sh/l/asahi
<JTL>
chadmed: with MRIs, it makes me almost fall asleep while in the machine :D
riker77 has quit [Quit: Quitting IRC - gone for good...]
riker77 has joined #asahi
tomtastic has quit [Ping timeout: 480 seconds]
tomtastic has joined #asahi
kettenis_ is now known as kettenis
jvoisin has joined #asahi
<alyssa>
marcan: with the help of cpu_pstates.py and a bunch of ugly mesa rebases, I finally managed to get my m1 warm :-p
<rowang077[m]>
I wonder, is the difference between the M1 MacBook Air and the Pro due to a cooling difference? As in, is it thermal throttling or a different configuration done by Apple?
<kettenis>
the base model has a 7-core GPU instead of an 8-core GPU, so that is a configuration difference
<kettenis>
but otherwise it is just thermals
<rowang077[m]>
Isn't that configurable? I thought you can get an 8-core GPU with the air.
<kettenis>
yes, that's not the base model
<rowang077[m]>
so there is no soc/configured difference between the upgraded air and the Pro/mini as far as we know?
<rowang077[m]>
So let's say I were to liquid cool all M1 computers. The perf delta between them would be negligible.
<alyssa>
cherrypicked more commits from marcan 's tree, so now the hack in m1n1 goes away :)
<alyssa>
looking forward to seeing my tree lighten
<marcan>
rowang077[m]: correct
<marcan>
people have modified Airs to have better cooling already
<alyssa>
marcan: I butchered the git history when cherrypicking (🤷), but I now have your latest pwrstate patches in my tree
<marcan>
cool!
<alyssa>
and can confirm that everything works as before + NVMe works with stock m1n1 now
<marcan>
awesome :)
<alyssa>
so that's one hack gone, 17 to go :-p
<kettenis>
gotta rename your branch at some point
<alyssa>
kettenis: this is now in "fewer-hacks-2"
<alyssa>
you will notice there is no "no-hacks" branch
<kettenis>
see ;)
<marcan>
alyssa: it's monocommit crap, but if you want to try cherry-picking cpufreq/wip (just pushed) on top, that should give you working cpufreq (including MCC control)
<marcan>
and scheduling
<marcan>
I *think* the devicetree merge shouldn't be too bad :)
<alyssa>
my whole tree is monocommit crap tbh
<marcan>
heh
<alyssa>
I figure it'll be sorted out at upstream time.
<alyssa>
and when people keep doing random fixup commits and force pushes and entire drivers get swapped in and out ... yeah, not problem to do the git surgery for them 12 times over
<marcan>
:p
<marcan>
hopefully as things are less interdependent the merges will become a lot easier
<alyssa>
yeah, here's hoping
<alyssa>
I just totally gave up on device tree tbh.
<marcan>
I still want to at least have a *script*, something I keep around manually, of how to merge a known-good branch, as a base
<marcan>
I think that cherry-pick shouldn't be too bad
<marcan>
it changes cpus which haven't changed since forever
<alyssa>
I was trying to maintain history but then with the pstate->pwrstate I noped out and squashed into one big "all the new DT since upstream"
<marcan>
and adds some devices at the top of the devices part
<marcan>
lol
<alyssa>
and so much of the history was "add these 6 devices because Corellium" so it's not history worth preserving at this point
<marcan>
heh yeah
<alyssa>
anyway, let's see cpufreq/wip
<alyssa>
what's the core base/power change?
<marcan>
just an API variant I was missing, and actually some of it is obsolete and can be removed I think
<marcan>
(it was for a prior attempt)
<alyssa>
ack
<alyssa>
I suppose you'll be cleaning up the git history in the next few days anyway so no point in me doing surgery now.
<marcan>
yeah, exactly
<alyssa>
anyway, cherry-picked, let's give her a spin
<marcan>
just apply it if you want to try it functionally
<alyssa>
I do
<marcan>
just force pushed with the obsolete change removed, but you don't have to care
<marcan>
did you get any conflicts?
<alyssa>
a trivial one on the DT (versus DCP), otherwise no ☺️
<alyssa>
TTY> FDT: Usable memory is 0x800e08000..0x9dad88000 (0x1d9f80000)
<alyssa>
TTY> FDT: failed to get reg property of CPU
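The size in parentheses on the "Usable memory" line is the end address minus the start address. A minimal Python check, assuming the printed end address is exclusive (the variable names are hypothetical and only for illustration):

```python
# Values copied from the m1n1 "FDT: Usable memory" line above.
start = 0x800e08000
end = 0x9dad88000  # assumed exclusive, which matches the printed size

# The size m1n1 prints in parentheses is simply end - start.
size = end - start
print(hex(size))
```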
<alyssa>
uh
<alyssa>
guess I botched the DT cherrypick
<marcan>
yeah that looks like a bad merge of some sort
<marcan>
oh wait hold on
<marcan>
no, that's me
<alyssa>
rip
<marcan>
alyssa: pushed m1n1 update
<marcan>
sorry, forgot about that one
<alyssa>
still not happy..
<alyssa>
(yes i chainloaded)
<marcan>
same error?
<alyssa>
oh err better now
<alyssa>
but have a kpanic
<alyssa>
with errr pcie wat?
<marcan>
heh
<marcan>
paste?
<alyssa>
sysfs_create_groups/kernfs_activate
<marcan>
I mean backtrace
<marcan>
btw, you know about the "lower" thing in the hv, right?
<alyssa>
hm?
<marcan>
ah, you're running this base metal for now?
<marcan>
*bare
<alyssa>
yeah
<marcan>
ok
<marcan>
that makes pasting the backtrace harder :p
<alyssa>
yes
<alyssa>
would screencap but dealing with an irl thing
<marcan>
ok
<marcan>
I'm going to sleep soon :p
<alyssa>
(and typing with 1 hand and uh)
<marcan>
alyssa: yeah you're going to have to run the HV, I think the important bits scrolled off the screen
<marcan>
I suspect there's probably a warning or something useful before the big explosion
<marcan>
this looks like I or someone else botched something with kobjects/sysfs, or possibly that something else exploded first and the failure path is broken
<marcan>
(and left behind bad state)
povik has quit [Quit: Page closed]
aleasto has quit [Ping timeout: 480 seconds]
<j_ey>
marcan: wat, people have modified the airs to have better cooling??
aleasto has joined #asahi
<marcan>
yeah, it's actually easy, you remove a thermal insulator and replace it with a thermal pad
<marcan>
only problem is it might cook your lap ;)
<marcan>
but I hear it works well on a desk
<j_ey>
oh, so that's what the lap in laptop stands for
<j_ey>
:p
<kettenis>
see the youtube link for the LTT video I posted
<alyssa>
marcan: uhm it just booted fine now. uh.
nskl has joined #asahi
<alyssa>
3 good boots in a row. Hm.
<alyssa>
Wonder how I got into such a bad state
<alyssa>
hm. maybe just got the system screwed up, i was having trouble with m1n1 at first too. meh.
nsklaus_ has quit [Ping timeout: 480 seconds]
<alyssa>
anyway have cpufreq+mcc cherrypicked
<alyssa>
it doesn't.. seem to be working... looking for obvious problems on my side
<alyssa>
currently missing: SPI, shutdown, wifi, thunderbolt.
VinDuv has quit [Quit: ZNC 1.8.2+deb2+b1 - https://znc.in]
VinDuv has joined #asahi
tylo has quit [Ping timeout: 480 seconds]
<alyssa>
cherry-picking the corellium wifi commits, I'm curious what the story is there
<alyssa>
of course building a kernel with wireless support means rebuilding the whole kernel. on my chromebook. grumble.
<alyssa>
there is a lot in this commit I don't understand yet
<alyssa>
there's also a lot of debugging stubs in here as I can see.
tylo has joined #asahi
tylo has quit [Ping timeout: 480 seconds]
<alyssa>
looking at the commit hunk by hunk, we have:
<alyssa>
1. error handling change, can probably be dropped
<alyssa>
2,3. adding BCM4378 to the allowlist
<alyssa>
4. Doing something funny with the firmware path. This smells of hack.
<alyssa>
ok we're ceasing to make sense to me
<alyssa>
ripping apart the mp_global_t struct because... why? this is hack.
<alyssa>
will need to get fixed i guess..
<alyssa>
5. module parameters for mac address, chip id, and nvram. these seem to be unused, maybe was leftover from debug?
<alyssa>
this is related to this "OTP" thing that's Apple specific. I'm trying to understand what that is.
<alyssa>
I guess they use it to read out the vendor/modrev/module/chiprev/serial. but they don't seem to *do* anything with that information?
<alyssa>
so can all that code maybe be dropped?
<alyssa>
if so, the commit becomes a /lot/ smaller :)
<alyssa>
okay, back to the top
<alyssa>
next they rip apart mp_global for no clear reason. probably was a debug thing..
<alyssa>
next functional thing is brcmf_fw_set_macaddr .. sketchy string parsing code in kernel space but the functional change is prepending the string "macaddr=aa:bb:cc:dd:ee:ff" into NVRAM with the appropriate MAC used
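The functional change described here, putting a "macaddr=" entry in front of the NVRAM contents, can be sketched in a few lines of Python. This is only an illustration and not the Corellium or kernel code: the helper names format_mac and prepend_macaddr are hypothetical, and treating the NVRAM as newline-separated key=value text is an assumption.

```python
def format_mac(mac: bytes) -> str:
    """Render a 6-byte MAC address as aa:bb:cc:dd:ee:ff."""
    if len(mac) != 6:
        raise ValueError("expected a 6-byte MAC address")
    return ":".join(f"{b:02x}" for b in mac)


def prepend_macaddr(nvram: bytes, mac: bytes) -> bytes:
    """Return the NVRAM text with a macaddr= entry placed in front.

    Assumes the NVRAM is newline-separated key=value text (hypothetical layout).
    """
    entry = f"macaddr={format_mac(mac)}\n".encode("ascii")
    return entry + nvram


# Example with a made-up NVRAM entry and the placeholder MAC from the log.
patched = prepend_macaddr(b"boardtype=0x1\n", bytes.fromhex("aabbccddeeff"))
print(patched.decode("ascii"))
```

Building the entry from raw bytes, rather than parsing and rewriting strings, sidesteps the "sketchy string parsing" concern, though a real driver would take the address from wherever the platform provides it.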
<alyssa>
I don't see where they set that MAC address. if it was OTP, fine. but actually the MAC address should be from the ADT anyway.
<alyssa>
some unexplained change to nvram length, probably MAC related
<kettenis>
m1n1 makes sure the mac address is available as a local-mac-address property
<alyssa>
allowing vif==NULL for no p2p whatever that is, need to read the context
<alyssa>
kettenis: yeah, saw that
<kettenis>
there should be a standard interface to fetch it
<alyssa>
oh and the registers are moved around. How... lovely.
<alyssa>
feedc0de ... uh
<alyssa>
....RANDOMBYTES? how.. random....
<alyssa>
okay. er. right. uhm. okay
<kettenis>
feel free to give patrick@openbsd.org a shout
<kettenis>
he added support to our bwfm(4) driver
<kettenis>
and I don't think he was impressed by what corellium did...
<kettenis>
(unfortunately I can't get it to work anymore; some subtle thing in the PCIe driver maybe)
<alyssa>
nod
<alyssa>
oh and this needs SMC support to light up the wifi PCI port right..
<alyssa>
forgot that particular detail
<kettenis>
yup
<alyssa>
I had a cherrypick of corellium SMC but that's probably dead by now.
<kettenis>
yeah, that needs to be redone based on sven's mailbox/rtproc stuff
<alyssa>
....oh and that's another "rebuild the whole kernel" config change.
<alyssa>
j_ey: I hear you and Sven have an SMC?
<alyssa>
oh, found it
<kettenis>
the way I envisaged the SMC stuff working is by having a "pwren-gpios" property next to the "reset-gpios" property on the root port node
<alyssa>
nod
<kettenis>
pwren-gpios is how the hifive unmatched handles powering up devices behind the bridge
<kettenis>
but maybe this needs to tie in with rfkill somehow
<alyssa>
nod
<alyssa>
ok this will involve too many hacks for me to deal with right now
<alyssa>
who needs wifi anyway.
<j_ey>
us with the laptops D:
<alyssa>
D:
<alyssa>
j_ey: thanks for volunteering to finish this mess
aleasto has quit [Quit: Konversation terminated!]
* alyssa
is tapping our.
<alyssa>
*out.
* alyssa
rolled back the smc+wifi stuff
<alyssa>
and now wow look how bug-free my kernel feels now! :-p
aleasto has joined #asahi
<j_ey>
D:
<alyssa>
so. better question. why is there a 1s stall during boot. and why does it seem to be the fault of the ... UART?
<alyssa>
does not work (1 sec delay): net.ifnames=0 rw root=/dev/nvme0n1p4 rootwait rootfstype=ext4
<alyssa>
marcan: FWIW, linux (kernel + userspace) boots 20% faster if I run cpu_pstates.py first
<alyssa>
(the kernel has the cpufreq patches)
<alyssa>
I assume all the cpufreq scheduling only kicks in later in boot, so cranking everything up early on is a win.
<alyssa>
I don't think that's a bug in your cpufreq/mcc, and I don't think linux is easy to fix here
<alyssa>
so I sort of wonder if it makes sense to set everything to max in m1n1, and then let the cpufreq/mcc drivers turn it down once the system is booted and that infrastructure is kicking
<alyssa>
What I don't know is if that "turning it down" happens correctly.
<kettenis>
downside of cranking stuff up in m1n1 is that it leaves the cores spinning in a high power state
<kettenis>
but maybe this could be done in the code that runs just after we set them loose
<j_ey>
or at least make the p cores higher than 600mhz
<alyssa>
kettenis: where are you worried about?
<alyssa>
idling in m1n1? (do it in the kboot path)
<alyssa>
or in the OS itself? (if the kernel has cpufreq drivers, it'll downclock a few seconds after boot. if it doesn't, it should expect bad pmgmt anyway and at least this way it goes brrr)
<alyssa>
I guess u-boot/grub
<maz>
alyssa: do we really care about *that*?
<maz>
alyssa: on my system, that's 0.2 of a second to probe all three ports...
<alyssa>
🤷
<alyssa>
I could make the same quip about _relaxed accessors
<maz>
not quite. relaxed accessors have a system-wide effect, and are used all over the place. boot happens *once*.
<alyssa>
fair
* alyssa
boots a stupid number of times, probably because she's used to suspend/resume being broken on all of her devices
<maz>
I mean, I'm all for a speedy boot. but if we are going to start optimising for boot time, there are a lot of other things that would help. disable kernel logging, for a start.
povik has joined #asahi
<alyssa>
fair enough!
<maz>
anyway, I wasn't supposed to stop by IRC, just on my way to pick some booze in the basement! :D
<alyssa>
enjoy! :-p
* alyssa
is writing a python SMC driver because.. uh... that's a great question
<povik>
have fun
<alyssa>
thanks :-p
<alyssa>
I would like to understand it better
<alyssa>
there's my PC excuse
<povik>
i finally got to the point where i have something in linux to test
<alyssa>
:D
<povik>
(audio-driver-wise)
<povik>
few days of coding with no testing
<povik>
what can go wrong
<povik>
also, i havent yet tried linux on the m1 at all
<alyssa>
giggle
<povik>
today will be a first
<povik>
and already loaded with new code
aleasto has quit [Quit: Konversation terminated!]
<povik>
so the question is where do i get userspace with alsa tools
<povik>
or maybe just trying with some userspace will be enough at first