<chadmed>
is there something deeper going on in m1n1 that causes that little bench: piece of assembly to hang the system? i cant see why it would
<chadmed>
but when i try to run too many iterations of it (trying to make a power virus) the system irrecoverably hangs and i have to hard reboot it
ramitgoolry[m] has joined #asahi-dev
WindowPain_ has joined #asahi-dev
<jannau>
chadmed: proxy timeout? how long is the estimated run time?
<chadmed>
its not actually the run time in this case, i have it set to few enough loops that it comes back successfully after running
<jannau>
you could run it on a secondary core with smp_call()
<chadmed>
im doing that, im following along with what was done to measure the pstate latencies
WindowPain has quit [Read error: Connection reset by peer]
<chadmed>
what ive done differently is put the bench routine under a while True, which is what causes the uart timeout
<chadmed>
idk why it would do this since smp_call should always return in time and just run again
<jannau>
not sure what smp_call does if the previous call is still running on the same core
<jannau>
should just clobber the previous return value
<chadmed>
thats what i expected but the shell just respectfully waits for it to finish doing whatever its doing, which seems to be never because the routine has hung somewhere
<jannau>
but the second smp_call can cause a proxy timeout if the first is still running (on the same core)
<jannau>
smp_call waits just before the secondary core enters the called function
roxfan has joined #asahi-dev
fetsorn has joined #asahi-dev
<chadmed>
seemed to have been a race between smp_call and the interpreter iterating over the loop. luckily i seem to be able to get reasonably sane data by just beating the race
<chadmed>
amazing what stepping away for a couple of hours and coming back well fed can do :P
fetsorn has quit [Remote host closed the connection]
<jannau>
do you have a smp_wait or any other kind of synchronization with the code you're running
<jannau>
if not, why wouldn't it be racy?
<chadmed>
i did try using the smp_call_sync() instead but it gave me the same results *shrug*
<chadmed>
i might even just have a dodgy usb cable who knows
<jannau>
smp_call_sync waits for the result and will time out if the bench code takes more than 3 seconds
<jannau>
the same might happen for the second smp_call on the same core. it will wait until the first call has finished
norb has joined #asahi-dev
<chadmed>
it was the thunderbolt port on my dock, i started getting uart checksum errors too :/
<chadmed>
plugged into a thunderbolt port on the actual host machine and all is well now
<norb>
Hi devs, first of all a BIG THANKS. That was an incredibly smooth installation on my M1 Pro. I would like to know if I can help with something. I have been a Debian Developer for ~20 years, now running Arch as my main desktop (M1 Pro and desktop), normally build my own kernels, and I'm not scared of playing around with sysinternals and testing wild stuff. Any pointer is welcome!
<marcan>
that should be fun for Linux kernel debugging
<marcan>
chadmed: smp_call without waiting for completion is not supposed to work, no
<marcan>
you need to use smp_wait or smp_call_sync
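The call-then-wait pattern described above can be sketched with a toy model. This is not m1n1 code: `ToyCore`, its `call()` and `wait()` methods, and `bench()` are hypothetical stand-ins built on Python threads, and the blocking behaviour of a second call is an assumption taken from the discussion rather than from the m1n1 source.

```python
import threading


class ToyCore:
    """Toy model of one secondary core; NOT the m1n1 proxy API."""

    def __init__(self):
        self._worker = None
        self._result = None

    def call(self, fn, *args):
        # Assumed behaviour (per the discussion): a second call on the same
        # core blocks until the previous one has finished.
        if self._worker is not None:
            self._worker.join()

        def run():
            self._result = fn(*args)

        self._worker = threading.Thread(target=run)
        self._worker.start()

    def wait(self):
        # Block until the outstanding call completes, then return its value.
        self._worker.join()
        return self._result


def bench(iterations):
    # Stand-in for the assembly loop: just burn some cycles.
    total = 0
    for i in range(iterations):
        total += i
    return total


core = ToyCore()
results = []
for _ in range(3):
    core.call(bench, 10_000)
    results.append(core.wait())  # synchronize before issuing the next call
```

Waiting after every call means the loop never issues a new call while the previous one is still running, which is the situation that was suspected of causing the timeouts.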
<marcan>
but yeah, sounds like your underlying issue was something else too
<chadmed>
yeah i figured, as i said though turned out the weird errors were due to a dodgy port on the host end, all cleared up now
<chadmed>
new fun challenge: a single core uses so little power in most of the pstates that its basically just noise to the SMC
<marcan>
norb: if you're still a debian project member, I'd love to see official work on that end for integrating with these machines :) (even if just discussion at this stage)
<marcan>
we have some folks doing unofficial debian images but nothing official that I'm aware of
<marcan>
chadmed: unsurprising tbh :)
<norb>
I'm not doing Debian stuff anymore, have moved completely to arch. I can help with anything Debian related, but don't use it myself anymore
<marcan>
also reminds me someone needs to write hwmon, it's on my endless pile but that's the kind of thing anyone can pick up given the smc scaffolding already in place
<marcan>
maybe there should be an official-ish list of "things for folks to pick up"
<chadmed>
marcan: i cant say im shocked either, just trying to think of ways to get values for opp-microwatt without performing invasive surgery on a motherboard
<marcan>
AIUI the SoC has internal power meters, which are probably readable through PMP if not directly
<marcan>
those should give you better data than SMC
<marcan>
and measure in accumulated joules I believe?
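If the counters really do accumulate energy, an average power figure for something like opp-microwatt would come from the difference of two readings divided by the elapsed time. A minimal sketch, assuming the readings are already in joules (the actual units are not established in this discussion) and using a hypothetical helper name:

```python
def average_power_uw(energy_start_j, energy_end_j, time_start_s, time_end_s):
    """Average power in microwatts between two accumulated-energy readings.

    Assumes both readings are in joules and the counter did not wrap.
    """
    elapsed_s = time_end_s - time_start_s
    if elapsed_s <= 0:
        raise ValueError("end time must be after start time")
    return (energy_end_j - energy_start_j) / elapsed_s * 1e6


# Made-up readings: 2.5 J consumed over 2 s is 1.25 W.
power_uw = average_power_uw(1.0, 3.5, 10.0, 12.0)
```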
<marcan>
actually hold on, CLPC keeps complaining and I don't *think* that hits PMP
<marcan>
so maybe those are raw
<marcan>
chadmed: want me to take a quick look at that and see if I find it quickly?
<marcan>
clpc node is under pmgr so yeah, must be raw
<chadmed>
if you want or have time, if its going to be an effort though dont worry too much about it
<marcan>
not sure how much of that init is required, but some of it seems to be to get good numbers for the energy counter, I think?
<marcan>
this is for t8103/mac mini
<marcan>
not sure what the units are either
<marcan>
also the pcore/ecore units seem to be different for some reason
<marcan>
interestingly the energy counters seem to be the same for all the boost states of the pcores, in fact it peaks a bit higher at 2988 MHz
<marcan>
which makes sense, since those boost states all use the same voltage AIUI
<marcan>
so that means the dynamic power per clock is the same, and due to static power, higher clocks actually save (a tiny bit of) energy
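The reasoning above can be checked with a simple model: at a fixed voltage the dynamic energy per cycle is constant, while static power is paid for as long as the work runs. A minimal sketch with made-up constants; only the 2988 MHz figure comes from the discussion, and the higher clock, static power and per-cycle energy are purely illustrative:

```python
def energy_joules(cycles, freq_hz, static_w, dyn_j_per_cycle):
    """Energy to retire a fixed number of cycles at a fixed voltage."""
    runtime_s = cycles / freq_hz
    static_j = static_w * runtime_s        # shrinks as the clock rises
    dynamic_j = dyn_j_per_cycle * cycles   # independent of the clock
    return static_j + dynamic_j


CYCLES = 3_000_000_000   # fixed amount of work
STATIC_W = 0.1           # illustrative static (leakage) power
DYN_J_PER_CYCLE = 1e-9   # illustrative dynamic energy per cycle

e_2988 = energy_joules(CYCLES, 2.988e9, STATIC_W, DYN_J_PER_CYCLE)
e_higher = energy_joules(CYCLES, 3.2e9, STATIC_W, DYN_J_PER_CYCLE)  # hypothetical boost clock
saving_j = e_2988 - e_higher
```

With these numbers the saving is only a few millijoules out of roughly 3 J, which matches the "tiny bit" described above.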
<marcan>
not sure if all those magic constants are the same for everyone or calibration stuff calculated off of the values read or what
<marcan>
also some counters aren't working yet
<chadmed>
the one thing that strikes me as odd is the seemingly different units between the pcores and ecores
<marcan>
yeah
<marcan>
I might have the init wrong
<marcan>
let me look closer
<chadmed>
arent the clusters at 0x210e00000/0x211e00000?
<marcan>
cluster globals yes
___nick___ has joined #asahi-dev
<chadmed>
oh yeah didnt notice the change in set_pstate()
___nick___ has quit []
___nick___ has joined #asahi-dev
<marcan>
chadmed: some of the init constants do seem different, also those regs are arrays of pstates (8 bits each) and they are preceded in the full HV log (which I did not paste verbatim) by reads of 8 regs that seem to contain pstate info for 8 pstates
<marcan>
so they're reading those and computing transposed arrays of that info
<marcan>
and there are significant differences between pcores and ecores so it seems plausible that the units differ
<marcan>
will look in more detail later
<chadmed>
cool, ill spend some time on it tomorrow too. i was just trying some ideas out while i had some free time before exams so dont feel obligated to go too deep, im sure there are bigger fish to fry :)
<jannau>
was there a change in the hypervisor that could make linux boot or the vuart very slow?
<jannau>
almost slow enough to see single character updates in the earlycon up to the linux mem init at least
the_lanetly_052___ has joined #asahi-dev
the_lanetly_052__ has quit [Ping timeout: 480 seconds]
roxfan has quit [Ping timeout: 480 seconds]
dbancroft[m] has joined #asahi-dev
povik has quit [Ping timeout: 480 seconds]
<marcan>
jannau: if you have updated recently, the gdbstub was merged, which touched a bunch of stuff. there was also the spinlock thing and a fix for that.
<marcan>
(prior)
<marcan>
and related changes
<marcan>
nothing specific to the vuart but those things touched the general exception entry stuff
<jannau>
I think it was before I rebased onto the gdbserver merge
<jannau>
I was previously on "display: Report time spent modesetting". I'll check if it's a "regression"
povik has joined #asahi-dev
roxfan has joined #asahi-dev
<jannau>
any idea how the macOS selected timing mode could persist in dcp over reboot and poweroff (macbook pro 14")?
<sven>
nvram setting + DCP firmware patching from iboot maybe
<jannau>
I don't think patching is necessary, iboot probably just configures it and it persists after that
<jannau>
the display is initialized
* jannau
is just surprised that the timing mode is saved in nvram
<sven>
oh, true
<jannau>
and a little annoyed since modesets crash dcp on the macbook but not on other devices
___nick___ has quit [Ping timeout: 480 seconds]
jeffmiw has joined #asahi-dev
fetsorn has joined #asahi-dev
<jeffmiw>
11:28 <marcan> maybe there should be an official-ish list of "things for folks to pick up" <= I very much like this idea, I think it can really help the project
fetsorn has quit [Remote host closed the connection]
kloenk has quit [Remote host closed the connection]
<jeffmiw>
marcan: I will be happy to look at hwmon and get something going. I need to catch up on SMC. I suppose the best is to start from bits/110-smc ? any suggestion/direction is more than welcome :)
<j`ey>
jeffmiw: just put commits on top of the asahi branch
<j`ey>
jeffmiw: also look at drivers/power/supply/macsmc_power.c, you can see how to write a smc driver from that
<jannau>
sigh, something is broken. I had now really glacial startup from disabling the MMU in m1n1. I recorded 2 min and 14 seconds
<jeffmiw>
thanks j`ey !
kloenk has joined #asahi-dev
<jannau>
and I only grabbed the phone and started recording after I saw it was slow
<j`ey>
is this on ultra?
<povik>
could you bisect it to some m1n1 change?
<jannau>
it was not that slow earlier. no, on the macbook pro 14"
<povik>
ah
<marcan>
jeffmiw: I have a big TXT of SMC key stuff, let me throw it somewhere
<jeffmiw>
marcan: cool
<jeffmiw>
how did you extract it ? from tracing macos under hv I suppose ?
<jeffmiw>
marcan, sven: I added contact names (yours, and mine, in this case respectively for hwmon & i2c irq) so people know who to reach just in case, feel free to remove if you don't like it
<sven>
works for me
nicolas17 has joined #asahi-dev
<jeffmiw>
if and once everyone is ok with this proposal, I'll add it to the developer section of the table of contents
<povik>
i think you should just go ahead, in worst case it will get corrected later
* povik
expanded on the i2c task
<jeffmiw>
added to the ToC/SideBar then
fetsorn has joined #asahi-dev
fetsorn has quit [Remote host closed the connection]
psykose has quit [Remote host closed the connection]
psykose has joined #asahi-dev
nicolas17 has quit [Quit: Konversation terminated!]
<amarioguy>
sven: i can probably pick up the i2c stuff
<amarioguy>
unless someone else is already working on it