ChanServ changed the topic of #aarch64-laptops to: Linux support for AArch64 Laptops (Chrome OS Trogdor Devices - Asus NovaGo TP370QL - HP Envy x2 - Lenovo Miix 630 - Lenovo Yoga C630 - Lenovo ThinkPad X13s - and various other snapdragon laptops) - https://oftc.irclog.whitequark.org/aarch64-laptops
weirdtreething has quit [Remote host closed the connection]
<icecream95>
Is it true that EL2 cannot force interrupts to be masked at EL1? If so, that's annoying, because if a nested hypervisor has HCR.IMO==0, then interrupts need to be masked regardless of PSTATE.
<icecream95>
Perhaps there is some GIC configuration I can set, I'll look at the specifications...
weirdtreething has joined #aarch64-laptops
chrisl has joined #aarch64-laptops
chrisl has quit [Ping timeout: 480 seconds]
<icecream95>
Hmm so it seems that reinjecting exceptions shouldn't be too hard. Won't help if there's an interrupt storm though, but *surely* no hypervisor would maliciously cause an interrupt storm to prevent being run nested?
hightower4 has joined #aarch64-laptops
hightower3 has quit [Ping timeout: 480 seconds]
chrisl has joined #aarch64-laptops
chrisl has quit [Ping timeout: 480 seconds]
minecrell has quit [Read error: Connection reset by peer]
minecrell has joined #aarch64-laptops
tobhe has joined #aarch64-laptops
<icecream95>
Hmm, (on vivobook) noticing blue screens often happen after plugging in/removing charger while suspended
tobhe_ has quit [Ping timeout: 480 seconds]
hexdump0815 has joined #aarch64-laptops
hexdump01 has quit [Ping timeout: 480 seconds]
chrisl has joined #aarch64-laptops
chrisl has quit [Ping timeout: 480 seconds]
alfredo has joined #aarch64-laptops
<icecream95>
Seems it also can happen even when charging for the whole duration of suspend
<chenxuecong[m]>
Is there a new debian iso for x13s?
alfredo has quit [Quit: alfredo]
chrisl has joined #aarch64-laptops
chrisl has quit [Ping timeout: 480 seconds]
<steev>
should be able to use any testing+ debian iso for x13s
<zdykstra>
Void Linux just released new ISOs, the aarch64 iso supports the x13s now
neggles has joined #aarch64-laptops
jhovold has joined #aarch64-laptops
chrisl has joined #aarch64-laptops
jglathe_angrybox has joined #aarch64-laptops
chrisl has quit [Ping timeout: 480 seconds]
<maz>
icecream95: this is indeed broken in the architecture. work is under way to retroactively fix it.
chrisl has joined #aarch64-laptops
chrisl has quit [Ping timeout: 480 seconds]
<icecream95>
maz: A retroactive fix? How does that work?
<icecream95>
But I guess good to see that problems are being noticed and fixed eventually.
<icecream95>
Next can you retroactively add a fine-grained trap bit for ISB? :)
<maz>
icecream95: I'm not at liberty to say yet, but it is a clarification that shouldn't have any material impact for existing SW, and make NV work correctly in the situations that matter.
<maz>
hopefully it will be made public soon-ish.
<icecream95>
okay, will wait for it then!
<maz>
trapping ISB looks like the worst idea in the history of the architecture, but I'm listening... ;-)
<icecream95>
maz: Since TLB invalidation doesn't have to take effect until context synchronisation, hypervisor code that notices that it's racing with TLBI could temporarily enable a trap
<icecream95>
Otherwise the hypervisor could end up trying to fix up Stage 2 page tables forever if another PE keeps doing TLBI
<icecream95>
(Though that could be worked around by waiting for a bit in the hypervisor TLBI trap handler.)
<icecream95>
But if you trap ISB, then you can delay writing the new Stage 2 page tables until then, so ensure forward progress. (This is assuming PE-local Stage 2 tables.)
<maz>
I don't understand your rationale. TLBI takes effect after a subsequent DSB, not ISB.
<icecream95>
But it requires an ISB on other cores that are affected by it, doesn't it?
<maz>
only if you are doing things like affecting pages containing instructions and need to make sure the other PEs are observing the new instruction stream, in which case you need to IPI the other PEs.
<maz>
given how rare TLBIs are in general compared to ISBs, your approach seems very heavy handed.
<maz>
note that there is definitely a forward progress issue with shadowing S2, but that's pretty similar on all architectures.
<maz>
your "PE-local S2" is interesting though. are you doing that to avoid CnP-like effects?
<icecream95>
Mostly because I saw the S2_MMU_PER_VCPU define in Linux and immediately thought that they were vcpu-local
<icecream95>
No particular reason that it would be better otherwise
<maz>
no, that's only a "rule of thumb" pre-allocation of S2 shadow contexts so that we don't have to recycle these too often.
<maz>
but the deeper you nest, the more this becomes a bottleneck on its own, so I'm looking at alternatives...
<maz>
up to L3, it's manageable. L4 falls off the cliff perf-wise.
<maz>
but these shadow S2 contexts are definitely VM-wide. which is a slight violation of the architecture, and we should do PE-local unless the guest asks for CnP.
<icecream95>
Aren't there also algorithmic complexity issues, where a deep context switch can require going through all of the shallower levels multiple times?
<icecream95>
I guess that could be solved through software, if you let the top-level hypervisor know about everything below it
<maz>
of course. trap amplification definitely is a thing, and bites you in the rear badly.
<maz>
but I hate paravirt with a passion, so not keen on this sort of half-arsed nesting.
<icecream95>
Even the other obvious fix of n-stage translation (for large n) and n exception levels would fall off a cliff when you run out of TLB entries...
<maz>
TLBs are not a problem. the PTW is pretty good at refilling things speculatively. what would really help is a less braindead ERET handling, direct injection of interrupts at arbitrary depths, and VNCR-specific TLB invalidation instructions.
<maz>
on the other hand, this is NV we're talking about. the moment you say that, you have already thrown the very notion of performance out of the window! :)
<icecream95>
If you are 100 levels deep, then you would need 100x the number of S2+ TLB entries, and each miss could take hundreds of microseconds... but maybe you aren't thinking that deep yet?
<icecream95>
(Well I guess you don't need to cache every level, but the lookups are still slow. One nice thing about collapsing everything into Stage 2 is that accesses shouldn't get slower with depth, as long as the hypervisors don't change the memory layout too much.)
<maz>
TLBs are not the problem, because you actually execute very little hypervisor code. it is trap amplification that kills you. just sending an IPI to another vcpu at any depth results in a totally crazy dance.
chrisl has joined #aarch64-laptops
<maz>
PTs are rather stable structures outside of startup and teardown phases. but traps are there at all times. it hurts even more when the HW is crap (see the CNTVOFF issue that plagues x1e).
<icecream95>
Well certainly with the current design the trap amplification seems to be the bigger issue
<icecream95>
On the topic of counters, do you know whether setting CNTPOFF causes a reboot because the CPU is broken or if it's trapped to EL3 and firmware is broken? My guess is the latter
<icecream95>
(Or if that's all correct, and software is somehow magically supposed to know not to set it?)
<icecream95>
Is useful to have a 4-byte reboot instruction sequence, I guess.
srinik has joined #aarch64-laptops
chrisl has quit [Ping timeout: 480 seconds]
<maz>
my understanding is that it is the latter. QC has a history of producing firmware that is lacking the relevant enable bits at EL3.
<maz>
given that they advertise ECV+CNTPOFF, I can only assume that they implemented the feature, and that some SW person only copy-pasted the SCR_EL3 value from another firmware implementation.
<icecream95>
The hypervisor (not EL3) also seems to have a bug where ID_AA64DFR0_EL1.MTPMU is set to 0, which is not permitted from Armv8.6 if PMUv3 is supported. That one's not so serious I guess
<icecream95>
(It's -1 if accessed from EL2)
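The 0 versus -1 readings above are easier to follow when the ID field is decoded as a signed 4-bit value. A minimal sketch, assuming MTPMU sits at bits [51:48] of ID_AA64DFR0_EL1; the helper names and the sample register values are hypothetical and do not come from real hardware:

```python
# Assumption: ID_AA64DFR0_EL1.MTPMU occupies bits [51:48].
MTPMU_LSB = 48


def signed_field(reg: int, lsb: int, width: int = 4) -> int:
    """Extract a bit field and sign-extend it (two's complement)."""
    raw = (reg >> lsb) & ((1 << width) - 1)
    if raw & (1 << (width - 1)):
        raw -= 1 << width
    return raw


def mtpmu(reg: int) -> int:
    """Return the MTPMU field of a raw ID_AA64DFR0_EL1 value."""
    return signed_field(reg, MTPMU_LSB)


# Made-up register values, one per case discussed in the channel.
as_seen_from_el1 = 0x0 << MTPMU_LSB   # field reads as 0
as_seen_from_el2 = 0xF << MTPMU_LSB   # field reads as 0b1111, i.e. -1
```

With these inputs, `mtpmu(as_seen_from_el1)` gives 0 and `mtpmu(as_seen_from_el2)` gives -1, matching the two readings quoted above.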
<Jasper[m]>
I'm not sure if x1e or 8cx3 are armv8.6
<icecream95>
x1e is Armv8.7
<maz>
I thought that was only in SMT implementations.
<Jasper[m]>
icecream95: I was technically correct :^)
<Jasper[m]>
Nah okay, mb
<Jasper[m]>
I knew it wasn't v9, guess I thought it was older in general
<icecream95>
maz: Well, it's only set to 1 in SMT implementations, but 0 has implementation defined behaviour. So this is a "negative feature" where newer implementations have to explicitly deny support?
<maz>
the whole version numbering is a joke anyway. it's all a cherry-picking exercise with a few constraints.
<icecream95>
*cough* WFxT on x1e *cough*
<maz>
that one doesn't even work, due to the CNTVOFF issue.
<HdkR>
icecream95: Well, X1E is technically ARMv8.5 because of missing features :)
<icecream95>
HdkR: Which missing features?
<HdkR>
1GHz cycle counter specifically
<HdkR>
Mandatory in v8.6/v9.1
<icecream95>
Oh right, that thing. Maybe that's a bug too?
<icecream95>
You just need to overclock it a bit more, until it's the right speed
<HdkR>
Not sure, could be they support FEAT_CNTSC but the hypervisor deletes it or something
<HdkR>
Their stock cycle counter runs at 19.2MHz
<icecream95>
So now I've seen it called v8.5, v8.6, and v8.7...
<maz>
CNTSC is not visible to EL2. that's an EL3 feature.
<maz>
(or rather secure feature).
<HdkR>
The only visible missing piece is that the cycle counter doesn't run fast enough :D
<icecream95>
Probably the hardware supports it but EL3 disables it because it made Windows not boot :)
<icecream95>
(Or made Windows run too fast.)
<maz>
doubt it, MS were the ones asking for it.
<HdkR>
Me too
<icecream95>
Microsoft didn't ask nicely enough then?
<HdkR>
Apple hardware has roughly an equivalent feature, but it can be controlled per-process
<HdkR>
I assume once it gets wired up to Linux then it'll be system-wide since per-process isn't really necessary here
<maz>
CNTSC really is a system-wide feature, affecting all security states at the same time. there is no "per-context" scaling, unfortunately.
<maz>
which kills VM migration in a lot of cases.
<icecream95>
CNTSCR has a 24.8 fixed-point scale, so might still end up being off by 0.1% or so
<icecream95>
No wait other way around! 8.24 is much better
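The difference between the two fixed-point layouts can be checked with a little arithmetic. A rough sketch, assuming the scale factor is the ratio of the two frequencies mentioned above (1 GHz target, 19.2 MHz source) and that the register value is simply the nearest representable fixed-point number; the function names are hypothetical:

```python
from fractions import Fraction

SECONDS_PER_DAY = 24 * 60 * 60

# Assumed scale factor: 1 GHz target / 19.2 MHz source, about 52.0833.
SCALE = Fraction(1_000_000_000, 19_200_000)


def quantise(scale: Fraction, frac_bits: int) -> Fraction:
    """Round a scale factor to the nearest value with frac_bits fraction bits."""
    return Fraction(round(scale * (1 << frac_bits)), 1 << frac_bits)


def drift_seconds_per_day(scale: Fraction, frac_bits: int) -> float:
    """Seconds gained or lost per day purely from rounding the scale factor."""
    relative_error = abs(quantise(scale, frac_bits) - scale) / scale
    return float(relative_error * SECONDS_PER_DAY)


drift_24_8 = drift_seconds_per_day(SCALE, 8)    # 8 fraction bits
drift_8_24 = drift_seconds_per_day(SCALE, 24)   # 24 fraction bits
```

Under these assumptions the 24.8 layout would lose about 2.16 seconds per day to rounding alone, while 8.24 stays in the tens of microseconds. The integer part (52) also fits in the 8 integer bits of the 8.24 layout.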
<HdkR>
maz: Yea, Apple only implemented it per-process for backwards compatibility. Apparently a ton of applications hardcoded 24MHz for them. Linux doesn't have that problem since so many devices run their counters at different frequencies
<HdkR>
icecream95: ARM ARM doesn't make any consistency claims for the cycle counter, although it recommends <10s drift in 24 hours
<HdkR>
So I think it's fine :D
<icecream95>
10 seconds? That's worse than Sinclair!
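For scale, the 10 seconds in 24 hours figure can be turned into parts per million and into counter ticks. A small sketch using only the numbers quoted in the channel (10 s, 24 h, 19.2 MHz); the function names are hypothetical:

```python
SECONDS_PER_DAY = 24 * 60 * 60


def drift_budget_ppm(max_drift_s: float = 10.0) -> float:
    """Convert a per-day drift allowance in seconds to parts per million."""
    return max_drift_s / SECONDS_PER_DAY * 1e6


def ticks_off_per_second(freq_hz: float, ppm: float) -> float:
    """How many counter ticks per second a given ppm error corresponds to."""
    return freq_hz * ppm / 1e6


budget = drift_budget_ppm()                              # roughly 115.7 ppm
slack = ticks_off_per_second(19_200_000, budget)         # roughly 2222 ticks/s
```

So the recommendation allows an error of a bit over 100 ppm, which at 19.2 MHz is a couple of thousand ticks every second.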
minecrell7 has joined #aarch64-laptops
minecrell has quit [Quit: Ping timeout (120 seconds)]
chrisl has joined #aarch64-laptops
chrisl has quit [Ping timeout: 480 seconds]
icecream95 has quit [Quit: rcirc on GNU Emacs 29.1]
chrisl has joined #aarch64-laptops
chrisl has quit [Ping timeout: 480 seconds]
alfredo has joined #aarch64-laptops
<konradybcio>
wonder if a human could achieve a total drift of <10s in 24h of counting.. not judging the intermediate values of course ;)
<konradybcio>
maz if you have any suggestions / complaints other than what you submitted workarounds for in linux already, i'll happily pass those on to the relevant folks..
alfredo has quit [Ping timeout: 480 seconds]
<maz>
konradybcio: I would love to have some feedback at least on the two issues I'm working around: CNTPOFF_EL2 resets the box, and CNTVOFF_EL2 works erratically when HCR_EL2.E2H==1.
<maz>
konradybcio: especially for the second one, whether there is a better workaround than blindly trapping/emulating everything, which really sucks.
<konradybcio>
i'm probably not allowed to talk about or promise bugfixes as i'm just a run-of-the-mill engineer.. but i can certainly take what you say in public and forward it to the internal folks maz
<maz>
konradybcio: I'm not asking for bug fixes. I'm only asking for someone who has access to the implementation details to reproduce my findings and maybe come up with an actual characterisation of the problem. they can then decide whether what we have in Linux is the right thing, and if it isn't, whether they want to propose something better.
<maz>
because at the moment, this reflects pretty badly on QC, especially when you look at the cost of the workaround.
<maz>
(yes, you can read QC as Quality Control... ;-)
chrisl has joined #aarch64-laptops
srinik has quit [Ping timeout: 480 seconds]
chrisl has quit [Ping timeout: 480 seconds]
alfredo has joined #aarch64-laptops
srinik has joined #aarch64-laptops
chrisl has joined #aarch64-laptops
chrisl has quit [Ping timeout: 480 seconds]
alfredo has quit [Read error: Connection reset by peer]
jglathe_angrybox has quit [Remote host closed the connection]
chrisl has joined #aarch64-laptops
chrisl has quit [Ping timeout: 480 seconds]
chrisl has joined #aarch64-laptops
chrisl has quit [Ping timeout: 480 seconds]
chrisl has joined #aarch64-laptops
chrisl has quit [Ping timeout: 480 seconds]
chrisl has joined #aarch64-laptops
chrisl has quit [Ping timeout: 480 seconds]
SpieringsAE has joined #aarch64-laptops
<SpieringsAE>
Huh it seems like my camera isn't working under windows, so now both the sd-card reader and the camera seem to be borked
SpieringsAE has quit [Quit: SpieringsAE]
alpernebbi has joined #aarch64-laptops
<JensGlathe[m]>
That's sad
chrisl has joined #aarch64-laptops
chrisl has quit [Ping timeout: 480 seconds]
<JensGlathe[m]>
Are you doing anything special, or is it a design issue?
todi1 has joined #aarch64-laptops
todi has quit [Ping timeout: 480 seconds]
jglathe_volterra has quit [Remote host closed the connection]
chrisl has joined #aarch64-laptops
chrisl has quit [Ping timeout: 480 seconds]
srinik has quit [Ping timeout: 480 seconds]
chrisl has joined #aarch64-laptops
Triebholz has joined #aarch64-laptops
Treibholz has quit [Ping timeout: 480 seconds]
chrisl has quit [Ping timeout: 480 seconds]
jhovold has quit [Ping timeout: 480 seconds]
icecream95 has joined #aarch64-laptops
<icecream95>
SpieringsAE: The camera not working could also be a symptom of some of the firmware not loading correctly. Have you tried turning it off and on again? Do Live Captions still work (to test if it's an NPU issue)?
chrisl has joined #aarch64-laptops
chrisl has quit [Ping timeout: 480 seconds]
mbuhl has quit [Remote host closed the connection]