ChanServ changed the topic of #asahi-dev to: Asahi Linux: porting Linux to Apple Silicon macs | General development | GitHub: https://alx.sh/g | Wiki: https://alx.sh/w | Logs: https://alx.sh/l/asahi-dev
amw has joined #asahi-dev
user982492 has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
psykose has quit [Remote host closed the connection]
psykose has joined #asahi-dev
phiologe has joined #asahi-dev
<marcan> jmr2: linux.py is unaffected by the DT selection thing, it assumes you know what you're doing
<marcan> what does the filtering is payload mode
user982492 has joined #asahi-dev
PhilippvK has quit [Ping timeout: 480 seconds]
rkt has joined #asahi-dev
jmr2 has joined #asahi-dev
<jmr2> Got it. Thanks.
jmr2 has quit []
marcan has left #asahi-dev [#asahi-dev]
marcan has joined #asahi-dev
rkt has quit [Quit: rkt]
rkt has joined #asahi-dev
kov has quit [Quit: Coyote finally caught me]
kov has joined #asahi-dev
rkt has quit [Quit: rkt]
rkt has joined #asahi-dev
rkt has left #asahi-dev [#asahi-dev]
rohin2 has joined #asahi-dev
rohin2 has quit [Ping timeout: 480 seconds]
kenzie has quit [Quit: Ping timeout (120 seconds)]
kenzie has joined #asahi-dev
rkt has joined #asahi-dev
rkt is now known as Guest4338
rkt_ has joined #asahi-dev
rkt_ is now known as rkt
Guest4338 has quit [Ping timeout: 480 seconds]
chadmed has quit [Quit: Konversation terminated!]
chadmed has joined #asahi-dev
aleasto has joined #asahi-dev
nsklaus has joined #asahi-dev
rkt has quit [Quit: rkt]
rkt has joined #asahi-dev
user982492 has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
rkt has quit [Quit: rkt]
rkt has joined #asahi-dev
chadmed has quit [Quit: Konversation terminated!]
chadmed has joined #asahi-dev
<povik> so, about that NCO block on M1
<povik> we already know it provides audio clocks
<povik> turns out macos writes to its ranges, it's just that they also resolve as error-handler[15], so i didn't notice the writes earlier
<povik> now i would like to know the relationship between the outputted frequency and the nco's registers
<povik> i tried measuring and plotting the dependence but it didnt enlighten me
<povik> on the contrary, i now am really curious what the dependence is, because that plot is nuts
<povik> so if someone wanted to look up this information in a kext, i wouldnt be angry at all
rkt has quit [Quit: rkt]
rkt has joined #asahi-dev
<marcan> povik: do you have some numbers?
<povik> sure, what you have in mind? addresses?
<marcan> normally an NCO would be using some kind of accumulator design, followed by some kind of cleanup pass or something like that
<marcan> I mean like register values vs frequency
<marcan> whatever you plotted
<povik> give me a sec
<marcan> I see the setFrequency proto in the kext, but this is the kind of code I don't want to look at before trying to blackbox it
<marcan> this kind of stuff is easy to accidentally veer into copyright issues with
<marcan> so it looks like we have 4 NCO clock inputs, 5 NCOs (selectable I guess?), and 6 MCA clock selects
<marcan> interesting fanout
skipwich has quit [Quit: DISCONNECT]
skipwich has joined #asahi-dev
rkt has quit [Quit: rkt]
<marcan> povik: https://mrcn.st/p/I7B97zDA some clocktree notes (mostly not useful)
<marcan> getting the feeling the iBoot clock numbers are kinda bullshit
<marcan> (some of them anyway)
rkt has joined #asahi-dev
<marcan> how did you measure the freq? dma frequency?
<povik> marcan: should i upload the plot somewhere also, or are you handy with gnuplot? :-)
<povik> yeah, watching the dma data rate
<marcan> tbh I'm more interested in throwing this into python ;)
<marcan> what's with the gaps?
<marcan> 859 2816.5058968377184
<marcan> 1118 47902.4243718684
<povik> see the comment at the top, it is over two arbitrary ranges
<povik> and within the ranges there are gaps due to the frequency being too low
<povik> see the iboot value measures as
<povik> 1118 47902.4243718684
<marcan> ack
<marcan> that one should be exactly 48000 if they didn't screw it up
<marcan> so I guess that's your error margin
<povik> sure, it gives you an idea what's the measurement error
<povik> ...
<povik> i sped it up and increased the error margin for later values, this is one of them
<marcan> so one thing to keep in mind is that if there is a PLL or similar cleanup pass after it, the other reg values might affect it and that might cause insanity for some ranges
<marcan> but let's see
<marcan> the value is definitely a bitfield
<povik> great. i was trying to isolate some fields yesterday late in night but with no result
<marcan> I'm getting the feeling this register is kind of interesting
rkt has quit [Quit: rkt]
<marcan> okay, this has to be a shift register of some sort
<marcan> if you sort by frequency you get patterns like these: https://mrcn.st/p/O30FM5sz
<marcan> povik: ^
<marcan> the bottom two bits are some kind of finetune
<marcan> and obviously there is measurement error
<marcan> let me see if I can look up some NCO designs
<povik> interesting
<povik> that NCO has some heritage, looking at compatible=
<marcan> yeah
<marcan> povik: do you have your test script somewhere?
rkt has joined #asahi-dev
rkt has quit [Quit: rkt]
rkt has joined #asahi-dev
<povik> had to clean up a bit, it depends on some m1n1.hw.admac changes
<povik> marcan: ^ experiments/mca_mark.py
<povik> usage in comment at top
<povik> i guess it is obvious to you how to reduce the error margin
<marcan> povik: I just figured out a way to get more accurate measurements
<povik> do tell
<marcan> povik: I just figured out a way to get more accurate measurements
<marcan> er
<marcan> 47999.384773510596
<marcan> I just use the countval from admac reports
<povik> of course
<povik> :)
<marcan> that's in clkref units
<sven> marcan: huh, weird w.r.t DAPF.
<sven> but at least the locked DART makes sense now. otherwise you could just switch the mapped firmware region to bypass-dapf i guess
<sven> i wonder what those two weird bypass streams are now though
vivg[m] has joined #asahi-dev
<marcan> povik: look at this: https://mrcn.st/p/iEE2U6QR
<marcan> ignore the third column, but the fourth column is the divider vs. the input clock
<marcan> it's always in units of 0.5 and has an offset of 0.25 (set elsewhere?)
<marcan> so the register values are kind of crazy but we are enumerating integer dividers of the refclk here after all
<marcan> I wonder if...
<marcan> maybe this is an LFSR counter?
<marcan> that would make sense, it gives better timing
<marcan> if so, then the question is what's the polynomial
<marcan> okay, it's the wikipedia one, of course
<marcan> 11 bits, x^9 + x^5 + 1
<marcan> plus a normal binary counter for the low bits
<marcan> cute
<marcan> povik: mystery solved; hardware engineers being clever
<marcan> I think the other registers are fine tuning, doing the fractional part
mikebeaton[m] has joined #asahi-dev
<marcan> I'm not sure if there is an easy way of generating the LFSR value other than a table, though
<marcan> I mean, the table is easy to generate
<marcan> but I don't know if there is a formula
<marcan> 0x1ffc 144176271.4194868
<marcan> 0x1bfc 219604.30389050167
<marcan> so that's the rollover, at all 1s, and 0 is special (not normally valid in an LFSR)
<povik> marcan: great!
<povik> i will look at this in detail later, but i am happy we have this figured out
<povik> i volunteer adding this to m1n1
<povik> (i mean, at least having dump_pmgr show it up)
<marcan> that calculates the lookup table and then computes the divider based on the reg value
<marcan> matches all of them fine
<marcan> ignore the last stuff, the div calculation is what matters
<marcan> I'm not sure where the 4.25 thing comes from, probably other registers
<jix> marcan: you can use modular exponentiation by squaring to directly compute the n-th step of a lfsr, but for 11 bits I wouldn't bother
mikebeaton[m] has quit [Quit: issued !quit command]
<jix> without using carryless (i.e. polynomial over F_2) multiplication instructions (which I assume arm also has these days) it might even be slower than the naive approach for 11 bits
<marcan> heh, yeah, table it is then
<jix> with those instructions it's probably faster than generating the table, but that's a lot of added complexity
<marcan> this is a kernel driver, not going to use NEON for this
<maz> marcan: kernel_neon_begin()/end(). the crypto code is full of it! :-)
<marcan> yes but it feels very silly for this ;)
<maz> I'm all for silly things, today...
* rkt is reviewing today’s chat log with a web browser
<marcan> povik: the next register is the fractional part, but it's also not just a straight thing
<marcan> I think it's probably a delta for the accumulator or so
<marcan> povik, jix: want to take a crack at this relation for the NCO fractional part register? https://mrcn.st/p/8L7N8zW5
<marcan> it's a weird S shape in a log plot
<jix> frac = 1 / (np.exp2(23.5 - np.log2(reg)) + 2) # is pretty close
<jix> but that's just eyeballed, so no idea if that's close enough for what it's used for
<jix> oh and it's not close for the small values, I assumed they are noise because they jump all over the place in a log plot, but that might not be the case?
<marcan> yes, the small ones are noisy
<marcan> there's error in the measurement
<chadmed> is this lfsr used to derive the dac's sample rate? if so values not between 8000 and 384000 should never ever matter
<marcan> maybe, but there's no good reason to hardcode it when we know the formula
<marcan> magic numbers suck
<chadmed> oh yeah for sure im just saying that noise causing inaccuracy in low values shouldnt be too much of an issue unless the hardware is doing double duty for something else
<marcan> jix: I wonder how that comes about from the implementation...
<marcan> chadmed: I mean the noise is just in my measurement
<jix> marcan: same, do you have a rough idea what it does?
<marcan> it's supposed to be an NCO
<marcan> we know the half-integer part is that counter
<marcan> I would expect further fractions to be the phase accumulator thing
<marcan> but this does not match what I expect from a phase accumulator
<marcan> jix: 0.5 - 0.5 / ((reg / (2**22.5)) + 1) makes more sense
<marcan> (equivalent)
<marcan> the sqrt(2) factor there bothers me
<marcan> actually 22.6 matches even better
<jix> maybe 2**22.??? is just 0x600000 ?
<marcan> it might just... yeah that
<marcan> why a factor of 3 then though :D
<marcan> ahh wait
<marcan> the other registers are related
<jix> also 0x60a000 is even closer ... but it also seems to be off by a small factor
<marcan> 0xff9f5200 is -0x60ae00
<marcan> which is another register
<marcan> and yes it does that
nsklaus has quit [Quit: Textual IRC Client: www.textualapp.com]
nsklaus has joined #asahi-dev
<jix> hmm not a small factor but a small offset... if you use 0x60ae00 and then add 2**(-13) at the end it's really close (but that offset might also be measurement error? no idea how you measure this)
<marcan> there may be some measurement error, or the other register is messing with things
<marcan> jix: so basically (1 - 1 / (1 - (r1 / r2))) / 2
<jix> or equivalently 1/2 * r1 / (r1 - r2)
<marcan> yeah, I was about to say...
<marcan> so it's just two periods
<marcan> how long to be +0.5 and how long not to
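The two forms of the fractional-part relation quoted above can be checked against each other with exact rational arithmetic. This is only a sketch: the function names are made up for illustration, `r2` is taken as the signed interpretation of the second increment register (0xff9f5200 read as -0x60ae00, as noted in the log), and the `r1` value is an arbitrary test input, not a measured one.

```python
from fractions import Fraction

def to_signed32(value):
    """Interpret a 32-bit register value as two's complement."""
    return value - (1 << 32) if value & 0x80000000 else value

def frac_form_a(r1, r2):
    # (1 - 1 / (1 - (r1 / r2))) / 2, as written in the log
    return (1 - 1 / (1 - Fraction(r1, r2))) / 2

def frac_form_b(r1, r2):
    # 1/2 * r1 / (r1 - r2), the equivalent form from the log
    return Fraction(1, 2) * Fraction(r1, r1 - r2)

def frac_fit(reg, scale=0x60ae00):
    # 0.5 - 0.5 / ((reg / scale) + 1), the earlier fit with 2**22.5
    # replaced by the 0x60ae00 constant suggested later in the log
    return Fraction(1, 2) - Fraction(1, 2) / (Fraction(reg, scale) + 1)

r2 = to_signed32(0xff9f5200)   # -0x60ae00
r1 = 0x100000                  # arbitrary illustrative value

print(frac_form_a(r1, r2) == frac_form_b(r1, r2) == frac_fit(r1))
```

All three expressions reduce to `r1 / (2 * (r1 + 0x60ae00))` when `r2 = -0x60ae00`, which is why the eyeballed fit and the two-register formula agree.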
<jix> can you do the measurement over a longer interval or so to figure out if the 2**(-13) are just measurement error? I expect it to be... but a constant offset could also be plausible
<marcan> pretty sure it's error
rkt has quit [Quit: rkt]
<marcan> ok, so small values take a while to converge, which tells me these are accumulator increments indeed
<marcan> so basically the logic would be a state flip-flop driving a mux that selects the increment register
<marcan> hm, sec
<marcan> yeah, so any time the top bit of the accumulator toggles, the flip-flop toggles
<marcan> and the flip-flop drives an LSB's worth of divider
<marcan> and that's all of the data registers then, reg0 is presumably control, might want to poke around the bits there at some point
<marcan> so the first register is the increment for the ddiv = +0 state, and the second for the ddiv = +0.5 state
<marcan> which means there's an inverse relationship, if r2 is 2*r1, then it spends half as long at 0.5, so the overall div is 0.16666
<marcan> povik: and this is how you reverse an NCO without looking at the kext
<marcan> :-)
<marcan> ah wait, there was one more register
<marcan> not sure what that one's about
<marcan> macos sets these to 0x5ad200 0xff9f5200 0x7f9f5200
<marcan> maybe the third one is just some initializer, perhaps it needs a control reg kick to set?
<marcan> hmmm
<marcan> yes, that's what it is
<marcan> so you have to stop the NCO first, clearing bit 31 of the control register
<brentr123[m]> So hyped for the m1 pro stream
<marcan> then you write the config
<marcan> 00: ctl 04: div/lfsr 08: low-inc 0c: high-inc 10: init accumulator
<marcan> then you set bit 31 of the ctl reg and it goes off
<marcan> that way you can use small incs without having to wait for the accumulator to wrap back across the midpoint
<marcan> the midpoint is indeed 0x80000000
<marcan> and top bit 1 means high div and use high-inc
<marcan> so really that register can just always be set to 0x80000000 I guess, I have no idea why you'd bother with anything else?
<marcan> (or 0x7fffffff)
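The programming sequence described above (stop, write config, start) might be sketched roughly as follows. `FakeRegs` and `program_nco` are hypothetical names used only for illustration, not m1n1 or kernel APIs. The offsets follow the layout given in the log, the register values are the ones the log reports macOS setting, and preserving the other ctl bits is an assumption.

```python
# Register offsets as listed in the log
NCO_CTL, NCO_DIV, NCO_LOW_INC, NCO_HIGH_INC, NCO_ACC_INIT = 0x00, 0x04, 0x08, 0x0c, 0x10
CTL_ENABLE = 1 << 31

class FakeRegs:
    """Hypothetical stand-in for MMIO access that records writes in order."""
    def __init__(self, initial=None):
        self.mem = dict(initial or {})
        self.writes = []

    def read32(self, offset):
        return self.mem.get(offset, 0)

    def write32(self, offset, value):
        value &= 0xffffffff
        self.mem[offset] = value
        self.writes.append((offset, value))

def program_nco(regs, div_reg, low_inc, high_inc_reg, acc_init=0x80000000):
    # Assumption: the remaining ctl bits are left as found
    ctl = regs.read32(NCO_CTL) & ~CTL_ENABLE
    regs.write32(NCO_CTL, ctl)                 # stop the NCO (clear bit 31)
    regs.write32(NCO_DIV, div_reg)             # LFSR-encoded divider
    regs.write32(NCO_LOW_INC, low_inc)
    regs.write32(NCO_HIGH_INC, high_inc_reg)   # already negated
    regs.write32(NCO_ACC_INIT, acc_init)       # initial accumulator value
    regs.write32(NCO_CTL, ctl | CTL_ENABLE)    # start it again (set bit 31)

regs = FakeRegs({NCO_CTL: 0x80300000})
program_nco(regs, 0x45e, 0x5ad200, 0xff9f5200, 0x7f9f5200)
for offset, value in regs.writes:
    print(f"{offset:02x}: {value:#010x}")
```

The 0x45e divider value and the 0x80300000 control value are taken from the register dump that appears later in this log; how the divider value is encoded is not modelled here.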
<marcan> >>> 5952000+6336000
<marcan> 12288000
<marcan> >>> 48000 * 256
<marcan> 12288000
<marcan> ha.
<jix> I guess it's always a good idea to avoid small incs anyway... i.e. multiply them by the largest constant that still fits
<marcan> nah, see what xnu did there
<marcan> (r1 - r2) == desired_freq
<marcan> that's gotta make sense mathematically
<jix> wait is it doing a zigzag around the midpoint or is it always wrapping around in one direction? (or I guess you could do it either if the increment regs are as wide as the accumulator reg)
<marcan> zigzag
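A toy model of the zigzag may make the behaviour easier to picture. It assumes what the log describes: a 32-bit accumulator whose top bit selects which increment is added, with the high increment stored negated. `simulate_dither` is a made-up name, and one loop iteration stands for one step of the accumulator. This models the description only, not the hardware.

```python
MID = 0x80000000
MASK = 0xffffffff

def simulate_dither(low_inc, high_inc_reg, steps, acc=MID):
    """Count how many steps are spent in the 'high divider' state."""
    high_steps = 0
    for _ in range(steps):
        if acc & MID:
            # Top bit set: high divider selected, add the (negated) high-inc
            high_steps += 1
            acc = (acc + high_inc_reg) & MASK
        else:
            # Top bit clear: low divider selected, add low-inc
            acc = (acc + low_inc) & MASK
    return high_steps

# Increment values the log reports macOS setting
high = simulate_dither(0x5ad200, 0xff9f5200, 64)
print(high, "of 64 steps in the high state")
```

With these values the accumulator pattern repeats every 64 steps and spends 31 of them in the high state, which is the ratio `low_inc / (low_inc + |high_inc|)`.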
<marcan> I actually suspect macos has some rounding error here...
<marcan> hm, wait, no, that's too much
<marcan> ah, had it backwards
<marcan> still has some error though
<povik> marcan: re: this is how you reverse NCO -- 10/10, would give marcan puzzling data again
<povik> funny thing is yesterday i noticed the similarities in frequency of shifted bit patterns, in couple instances
<povik> but ruled that out as probably a meaningless coincidence
<povik> and went on looking for fixed subfields in the registers
<marcan> jix: yeah I don't know why they use fout as the sum of the params
<marcan> so I measure 47999.3839 but I calculate that macos should be setting 48000.5586 and I don't know if that's measurement error or what
<marcan> I'd *think* my timer and this thing would be frequency-locked to some extent, which is a bit weird
<marcan> unless the input clock is a bit of a lie and not exact
<jix> pretty strange to not set it exactly anyway
<marcan> https://mrcn.st/p/w1wM9Yvh is my attempt at calculating the values
<marcan> (high_inc needs to be negated in the register)
<jix> what's fin?
<marcan> input freq
<jix> no, what's the frequency
<marcan> 900000000, supposedly
<jix> that's quite high 0o
<marcan> probably why they used an LFSR divider ;)
<jix> so this is used to adjust the fractional period of the lfsr divider? (just making sure I understand the whole chain here)
<marcan> yes
<marcan> the lfsr divider works in 0.5 units (half clocks I guess?)
<marcan> and then this accumulator toggles between two adjacent dividers
<jix> what d1/d2 values does mac os set this to?
<jix> (shouldn't be a choice here AFAIUI, just wanting to make sure that I have everything right)
<marcan> ok, something's off
<marcan> NCO1: #156: 900000000 0 0xa8/0x23b048000: nco: 0x80300000 0x45e 0x5ad200 0xff9f5200 0x7f9f5200 0x0
<marcan> I get a divider of 4.50005766 with the flop one way, and 4.05639163 the other
<marcan> that doesn't look right
<marcan> (for the highest freq)
<jix> uh I think the math for the values you said mac os sets checks out? (also not quite following the last thing you said)
<marcan> does it?
<jix> I get that it should set a fractional value of 31/64 and that's what 5952000 / (5952000 + 6336000) reduces to... except wait is it actually the right thing to do? it matches what mac os does but uh not sure
<marcan> I'm probably doing the math wrong
<marcan> also I get better numbers with a div6
<marcan> so I think that must just be a weird artifact of the lowest div
<marcan> 00000000 540858.4014394722 -0000001 6.50008392 0.000000
<marcan> 00000000 585929.9563842667 -0000001 6.00007725 0.000000
<marcan> jix: I'm probably getting some 1/x relationships wrong
<jix> I also totally don't trust that what I'm computing is the right thing, just that it ends up telling me mac os is setting 48k exactly
<jix> hmm but no it should be fine I think... now let me try to do this the other way around to get from freq to incs
<marcan> jix: ok, my interpretation of the dither was wrong
<marcan> it's per output cycle, not input cycle
<marcan> that's where my error comes from
<jix> ah yeah, what I did was just to use high_inc and low_inc to directly match the fractional part of the divider period ... and that should check out if it's per output cycle as the formula for the fractional part already takes into account that the longer output cycles are longer
<marcan> and yeah that works out to 48k from macos
<marcan> yeah okay and that makes the math make sense
<marcan> div = (2 * fin // fout)
<marcan> low_inc = (2 * fin - div * fout)
<marcan> high_inc = fout - low_inc
<marcan> simple.
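As a worked check of the three lines above, here is a small sketch using the input frequency quoted in the log (900000000) and the 48000 * 256 target. `nco_params` and `encode_high_inc` are illustrative names. `div` comes out in half-clock units, and the LFSR encoding of the divider register is not covered here.

```python
from fractions import Fraction

def nco_params(fin, fout):
    div = 2 * fin // fout              # integer divider in half-clock units
    low_inc = 2 * fin - div * fout     # remainder of that division
    high_inc = fout - low_inc
    return div, low_inc, high_inc

def encode_high_inc(high_inc):
    # The log notes high_inc needs to be negated in the register
    return (-high_inc) & 0xffffffff

fin, fout = 900_000_000, 48000 * 256
div, low_inc, high_inc = nco_params(fin, fout)
print(div, hex(low_inc), hex(encode_high_inc(high_inc)))
# -> 146 0x5ad200 0xff9f5200

# Consistency check: dithering between div and div + 1 half-units with
# fraction low_inc / (low_inc + high_inc) gives back fout exactly.
effective = Fraction(2 * fin) / (div + Fraction(low_inc, low_inc + high_inc))
print(effective == fout)
```

The two increment values match the 0x5ad200 and 0xff9f5200 that macOS writes, and their sum is `fout`, as observed earlier in the log.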
<marcan> still not sure where that measurement error comes from, but shrug
<marcan> okay, I should *not* have let myself get nerdsniped with this today :S
<jix> ah and that's why the difference is fout... by scaling both values by fout you can use an integer division here, that also explains why they're not using a reduced fraction
<marcan> yeah
<jix> eh sum, not difference, well depends on how you view it ^^
bps2 has quit [Ping timeout: 480 seconds]
stzsch has quit [Remote host closed the connection]
<marcan> (also, I just realized I messed up the polynomial earlier; it's x^11 + x^9 + 1)
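For reference, a table for an 11-bit LFSR with the corrected polynomial can be generated in a few lines. This sketch uses the common right-shifting Fibonacci form with taps 11 and 9. The shift direction, bit order and starting state used by the hardware are assumptions here, so the table may need mirroring or re-indexing before it matches real register values.

```python
def lfsr11_sequence(seed=0x7ff):
    """Enumerate the states of an 11-bit Fibonacci LFSR (taps 11 and 9)."""
    state = seed
    states = []
    while True:
        states.append(state)
        # Taps 11 and 9 map to bit 0 and bit 2 when shifting right
        bit = (state ^ (state >> 2)) & 1
        state = (state >> 1) | (bit << 10)
        if state == seed:
            break
    return states

states = lfsr11_sequence()
# Map each LFSR state to the number of steps taken from the all-ones seed
steps_from_seed = {value: index for index, value in enumerate(states)}
print(len(states))  # 2047 states, every non-zero 11-bit value once
```

Since the polynomial is maximal-length, the sequence visits all 2047 non-zero states, consistent with the remark in the log that 0 is special.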
user982492 has joined #asahi-dev
<marcan> povik: so to do this "properly" in linux we need two clock drivers: one for the NCOs and one for the clocksels that go after them
user982492 has quit [Quit: Textual IRC Client: www.textualapp.com]
agraf has quit [Ping timeout: 480 seconds]
gig has joined #asahi-dev
<povik> marcan: AFAIK the muxes that go after NCOs can remain in the state left by iboot
<povik> so, as an option, we can fabulate in our DT that the MCA blocks are connected to NCOs directly
jeffmiw has joined #asahi-dev
aleasto has quit [Remote host closed the connection]
agraf has joined #asahi-dev
agraf has quit [Quit: ZNC 1.7.2 - https://znc.in]
X-Scale` has joined #asahi-dev
agraf has joined #asahi-dev
X-Scale has quit [Ping timeout: 480 seconds]
jacoxon has joined #asahi-dev
<marcan> we could do that, yeah
rohin2 has joined #asahi-dev
bps2 has joined #asahi-dev
jacoxon has quit []
X-Scale has joined #asahi-dev
X-Scale` has quit [Ping timeout: 480 seconds]
doggkruse has joined #asahi-dev
doggkruse has quit [Ping timeout: 480 seconds]
rohin2 has quit [Ping timeout: 480 seconds]