ChanServ changed the topic of #aarch64-laptops to: Linux support for AArch64 Laptops (Asus NovaGo TP370QL - HP Envy x2 - Lenovo Miix 630 - Lenovo Yoga C630)
laine has joined #aarch64-laptops
laine has quit [Remote host closed the connection]
laine has joined #aarch64-laptops
laine has quit [Remote host closed the connection]
laine has joined #aarch64-laptops
<HdkR> Hmmmm, that single threaded result on the Lenovo implies longer term tests are working with appropriate perf
<HdkR> But that might also imply that that massive 8MB L3 is working while the 6MB SLC isn't :P
laine has quit [Ping timeout: 480 seconds]
<HdkR> steev: Do you have any kernel debugging flags enabled in your config? I'm seeing high kernel load under perf in some instances
laine has joined #aarch64-laptops
<steev> i do believe there are, yeah
<steev> i was trying to get uh... what was it called
<steev> this working
<steev> and it required some kernel config options... but i never did get it
<HdkR> Toggled a bunch of random settings, hope it resolves this performance delta with short lived applications
<steev> let me know which ones, and i can look into disabling them too
<steev> actually... i could compare to mani's defconfig
<HdkR> Does seem like a ton of differences
<steev> nah, the only real one i can think of is the fault injection
<steev> function*
<steev> the rest are mostly just modules becoming built-ins
laine has quit [Ping timeout: 480 seconds]
<HdkR> Actually, I think this has to be cache related. Profiling only code that does linear writes to memory, Lenovo takes 52% more time
<HdkR> 51ms -> 78ms in absolute time
<steev> actually, that's the difference between what i run and mani's defconfig
<HdkR> Quite small
<steev> indeed, most of the changes are just the new options
<HdkR> Is there any way to determine if L3 and SLC is running at max clocks?
<HdkR> Also no idea how to even see SLC in the dts
<HdkR> llcc_bwmon_opp_table?
<steev> maybe (and check the dtsi)
<HdkR> The opp tables seem weird. 1.5GB/s on the top end?
<steev> i blame bamse
<HdkR> oop, that's in kilobytes, not kilobits. 15.2GB/s is still significantly lower than what sc8280xp offers
<HdkR> Unless this is operating modes that once you hit that threshold it hits whatever opp-6 level is
<HdkR> If I delete all but the largest, would I get the highest config?
<steev> maybe?
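For reference, the unit arithmetic behind the correction above: interconnect OPP entries in the devicetree express bandwidth through the `opp-peak-kBps` property, which is in kilobytes per second. A minimal Python sketch of the conversion follows. The sample value is hypothetical, chosen only to reproduce the 15.2GB/s figure quoted in the chat; it is not copied from the sc8280xp dtsi.

```python
def kbps_to_gbs(peak_kBps: int) -> float:
    """Convert an opp-peak-kBps value (kilobytes/s) to decimal GB/s."""
    return peak_kBps * 1_000 / 1_000_000_000


# Hypothetical top-end entry, picked to match the 15.2GB/s quoted in the chat.
EXAMPLE_PEAK_KBPS = 15_200_000

if __name__ == "__main__":
    print(f"{kbps_to_gbs(EXAMPLE_PEAK_KBPS):.1f} GB/s")
```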
<HdkR> It's very tricky when I can't tell what frequency each of these levels of cache are running at. As far as I'm aware there is no debugfs available for it
<steev> maybe someone in -msm knows?
<HdkR> Deleting all the opp values except the top end one seemingly gave a few percentage in geekbench but in measurable absolute times it didn't seem to improve hmmm
<HdkR> I am seeing that epss_l3 is falling down the generic compatible path but sc7180/sc7280/sc8180x has a specific compatible
<bamse> HdkR: are you saying that i should turn those numbers to 11?
<HdkR> Not really, it doesn't solve my issue with short-term program execution and I'm just trying random ideas since I'm not a kernel dev
<bamse> HdkR: iirc i dumped the frequency table from epss, and then spread the values across the cpu frequencies haphazardly
<HdkR> Considering I have no idea what that means, sounds reasonable to me :P
<steev> i don't even know how to do things like dump the frequency tables
<HdkR> Same
<steev> still trying to figure out how to actually capture all of the bluetooth traffic so i can figure out the frame reassembly crap
<bamse> steev: for l3, you go into the osm-l3.c and you find a dev_dbg()...make that print and you'll get the table
<steev> oh
<HdkR> bamse: Is there any way to determine if SLC is running maxed out?
<bamse> HdkR: might be possible to measure the clock rates
<HdkR> Since moving from 51GB/s of sm8350 to 68GB/s of sc8280xp and faster CPU cores should be a clear win rather than a loss
<HdkR> So I can only assume cache
<bamse> HdkR: what are you measuring there?
<HdkR> Absolute time to write code in to a linear buffer
<bamse> sized for hitting each cache?
<HdkR> tasksetting to the X1 of the sm8350 and the X1C of the sc8280xp
<HdkR> not sized since this is just measuring total time while running an application
<HdkR> It's probably only a few megabytes
<HdkR> So that's testing linear reads I see
<bamse> yes, because when doing write i noticed in some cases the cache performance was limited by ddr performance
<HdkR> That would be a good test as well in my case, since L3 cache size has double between the two platforms and DDR perf has increased
<HdkR> has doubled*
<HdkR> And basically testing what I'm measuring right now
<bamse> when using writes, i saw something like l3 performance being limited by ddr performance
<HdkR> Oh, L3 and SLC size doubled going from sm8350 to sc8280xp
<HdkR> 4MB+3MB -> 8MB+6MB
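The kind of test discussed above (timing linear writes into buffers sized to land in different cache levels) could be sketched roughly as below. This is not the `mybw` tool mentioned in the chat and not HdkR's actual measurement; the function names, the `PIN_CPU` setting, and the buffer sizes are assumptions for illustration. Python interpreter overhead keeps the absolute numbers far below hardware bandwidth, so the output is only useful for comparing two machines running the same script; a C version would be needed for real figures.

```python
import os
import time

CHUNK = 4096     # bytes written per slice assignment
PIN_CPU = None   # assumption: set to a core number to mimic `taskset -c N`


def pin_to_cpu(cpu):
    """Pin this process to a single CPU (Linux only)."""
    os.sched_setaffinity(0, {cpu})


def time_linear_write(size_bytes, repeats=5):
    """Best-of-N wall time in seconds to overwrite a buffer front to back."""
    buf = bytearray(size_bytes)
    view = memoryview(buf)
    pattern = bytes([0xA5]) * CHUNK
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for off in range(0, size_bytes, CHUNK):
            end = min(off + CHUNK, size_bytes)
            view[off:end] = pattern[:end - off]   # handles a short tail
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    if PIN_CPU is not None:
        pin_to_cpu(PIN_CPU)
    # Sizes chosen to straddle the 8MB L3 and 6MB SLC mentioned above.
    for mib in (1, 2, 4, 8, 16, 64):
        size = mib * 1024 * 1024
        secs = time_linear_write(size)
        print(f"{mib:3d} MiB: {secs * 1e3:9.3f} ms  {size / secs / 1e9:6.2f} GB/s")
```

Running it once pinned to a big core on each device, with the same Python version, would show whether the relative drop appears once the buffer outgrows a given cache level.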
<bamse> hmm, did i merge that slc-fix for 6.3?...
<HdkR> Everything just goes bigger. I can't see why perf would go down :(
<steev> which slc fix?
<bamse> the incorrect slice ids
<steev> i don't recall seeing anything about that... but seems like something johan would have added if it were in
<steev> oh
<steev> from abel?
<bamse> yes
<steev> i don't see it in actual -next yet
<steev> but it *should* be in what hdkr is using
<bamse> then i should pick that up for 6.3-rc
<bamse> doesn't seem to be able to boot my sm8350 device i guess we have some fixes needed for that as well
<steev> it boots the x13s just fine
<steev> i've rebooted a bunch while playing around with bluetooth today :)
<steev> and tim's v2 is arguably worse than v1 - at least in terms of figuring out which patch to use...
<bamse> well... HdkR didn't test with upstream sm8350 at least...
<bamse> it definitely needs some tlc
<HdkR> I'm still on steev/lenovo-x13s-next-20230210
<steev> you should switch to the 6.2 stuff (6.2-rc8)
<HdkR> Would this bring in the mentioned slc slice id fix?
<HdkR> oh no, I see a commit about it
<steev> no, it's already in that one too
<bamse> applied it and re-ran mybw...there it didn't make a difference at least
<steev> but tbh... next is kinda.... iffy tbh
<bamse> steev: hoping we can get 6.3-rc1 up and running, and polish that iffyness out of it...
<HdkR> So if SLC got a fix, maybe L3 needs a fix as well? :)
<bamse> no, not that problem
<bamse> HdkR: but if you run "mybw" on your x13s, what do you get?
<HdkR> Let's see
<HdkR> Looks like it is peaking out at around 24GB/s atm
<bamse> do you have a functional sm8350 that you can run it on as well?
<HdkR> aye
<bamse> ohh
<HdkR> tasksets just to ensure it stayed on X1/X1C cores
<bamse> that's what i get...
<HdkR> Well there's quite a difference there
<bamse> your number matches pretty much what i had before i fixed the l3 scaling
<bamse> s/fixed/enabled...
<HdkR> So L3 needs a fix you say? :P
<bamse> definitely!
<HdkR> Where do I get this fix?
<bamse> 9235253ec73d ("interconnect: qcom: osm-l3: Add per-core EPSS L3 support") and a few others...went into v6.2-rc1
<bamse> before that we got ~15% lower geekbench score in linux compared to windows...
<bamse> with that we're ~5% faster
<HdkR> Which is more in line with what I'd expect from just the clock speed boost even
<HdkR> Grabbing steev's latest branch which has that commit
<bamse> steev: ahh, i didn't pick the llcc bug fix, because johan's ask is reasonable
hexdump01 has joined #aarch64-laptops
hexdump0815 has quit [Ping timeout: 480 seconds]
<steev> Ah, that makes sense
<steev> HdkR: yours should have it too
<steev> Well, the next should
<steev> Either way
<HdkR> Building now at least
<steev> I just prefer what’s in 6.2 because the 6.3 iffyness
<HdkR> I'm on the 6.2.0-rc8 branch now at least
<HdkR> Which seemingly fixed the mybw program's results at least
<steev> interesting
<HdkR> Didn't solve the memory writing issue though sadly
<steev> i'm not sure why he dropped the GPU and other one as well
<steev> * dropped the LLCC_GPU and LLCC_WRCACHE max_cap changes
<steev> i'm assuming those numbers just aren't in the docs?
<bamse> HdkR: do you have a similarly simple test that manifests the problem you're seeing?
<bamse> HdkR: i don't have any more cards up my sleeve...but i'd be happy to take a look (and/or spread the word...)
<HdkR> bamse: Sadly nothing simple. I'm grabbing some xray stats of FEX running `/usr/bin/true`
<HdkR> I think this 6.2.0-rc8 branch solved the issue I was having with NFS hardlocking though
<HdkR> Two full builds without hang, so that's nice
alfredo has joined #aarch64-laptops
<clover[m]> steev: im on 6.2.0-rc8-0-x13s+
<clover[m]> seems like back to dummy output for audio
<clover[m]> gpu still works
<clover[m]> bt works
alfredo has quit [Ping timeout: 480 seconds]
<steev> dmesg output?
<steev> there are still issues here and there, and we're waiting on another patchset from srini
<steev> it might just be a reboot is needed if you got something like an underflow
<steev> i wish there was a way to unload the audio modules, but it seems like they rely on something being built in, and it can't be removed
alfredo has joined #aarch64-laptops
<HdkR> Scheduler seems a bit spicy but this game might just be abusive
<steev> maybe you should just get thunks working
<steev> disclaimer: i have no idea what thunks are
<HdkR> lol
<HdkR> We're working on it but 32-bit thunking is complex and upstream doesn't want our patches to make it easier
<steev> i hate when upstream doesn't want things to make it easier
<steev> not even sarcasm
<HdkR> It's indeed a pain
<amstan> do the thunks!
<amstan> (i also don't know what those are, but the word sounds fun)
<HdkR> hah, it's in progress!
<HdkR> I /think/ we have a way to fix the 32-bit allocator problem but it's going to take some effort
<steev> i should also look into fex... i wanna play darkest dungeon
<HdkR> It's marked as working on my game list. Don't think I ever perf tested it, probably fine
<steev> if that game had perf issues... that would be very sad. it's not exactly an intense game
<HdkR> That's what I was thinking but you never know. Sometimes games do terrible things
<steev> heh
<HdkR> Some game's audio loops right now can get really intense due to sleeping for too long, then trying to quickly do work
alfredo has quit [Quit: alfredo]
<HdkR> I think this is mostly from the virtual cycle counter not being fast enough. Really looking forward to those ARMv9.1 CPUs enforcing 1GHz
<steev> [ 24.290021] platform sound: deferred probe pending
<steev> OH
<steev> you need
<steev> make sure those options are set in your config
<steev> you're probably missing +CONFIG_SC_LPASSCSR_8280XP=y
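A quick way to check whether a running kernel has an option such as `CONFIG_SC_LPASSCSR_8280XP` set is to read its config. The sketch below assumes the config is exposed at `/proc/config.gz` (which requires `CONFIG_IKCONFIG_PROC`) or, failing that, at the common distro location `/boot/config-<release>`; neither is guaranteed to exist on every system, and the helper names are made up for illustration.

```python
import gzip
import os
import re


def read_kernel_config():
    """Return the running kernel's config text, or None if it is not exposed."""
    try:
        with gzip.open("/proc/config.gz", "rt") as fh:  # needs CONFIG_IKCONFIG_PROC
            return fh.read()
    except OSError:
        pass
    try:
        # Assumption: distro-style config file next to the kernel image.
        with open(f"/boot/config-{os.uname().release}") as fh:
            return fh.read()
    except OSError:
        return None


def config_value(config_text, option):
    """Return the option's value ('y', 'm', ...) or None if unset or absent."""
    match = re.search(rf"^{re.escape(option)}=(.*)$", config_text, re.MULTILINE)
    return match.group(1) if match else None


if __name__ == "__main__":
    text = read_kernel_config()
    if text is None:
        print("kernel config not exposed on this system")
    else:
        print("CONFIG_SC_LPASSCSR_8280XP =",
              config_value(text, "CONFIG_SC_LPASSCSR_8280XP"))
```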
<steev> hm, i thought i pulled in mani's patch that deals with the timeouts too, shouldn't be that many, but maybe that's just what we get now, unsuspend shows them moreso than at boot
<steev> either way, it's 3am here, and i should head to bed since the neighbor's kid isn't screaming anymore
Guest5298 has joined #aarch64-laptops
<Guest5298> any news about X13s firmware update to expose EL2?
<HdkR> That would be pretty sick
<HdkR> But is it even likely to occur?
Guest5298 is now known as Caterpillar2
krzk has joined #aarch64-laptops
<Caterpillar2> HdkR: perhaps javierm broonie know if it will be released or not. Everybody here is under f** NDA
<Caterpillar2> buying X13s has been a huge mistake
<Caterpillar2> maybe one day I will destroy it and put the video on youtube
<HdkR> Ah. I see.
alfredo has joined #aarch64-laptops
alfredo has quit [Quit: alfredo]
jhovold has joined #aarch64-laptops
<qzed> Keep in mind I "know" this stuff mostly from random internet posts/comments and from things I've heard:
<qzed> Qualcomm runs some firmware stuff in EL2
<qzed> or that's what people say at least
<qzed> so my guess is no one and not even an OEM if they're asking nicely is going to get access to it anytime soon
<qzed> at least on their WoA stack, the android stack seems to be different
<qzed> and don't ask me why they can't do it the android way...
<qzed> so our best bet for KVM is to somehow make use of the bypervisor thing they have set up there
<qzed> *hypervisor
<qzed> also: some android devices suffer from the same problem:
<ardb> qzed: iirc the chromeos arm64 laptops boot at el2
<qzed> ah right, was mixing up android with chromeos I think
Caterpillar2 has quit [Quit: Konversation terminated!]
krzk has quit [Quit: Lost terminal]
<javierm> ardb: correct, the chromebooks support KVM
falk689 has quit [Quit: No Ping reply in 180 seconds.]
falk689 has joined #aarch64-laptops
<steev> yeah but chromebooks aren't your bog standard oem... google can say "make this happen"... we can't
<clover[m]> steev: yep, that config value was my missing puzzle piece
<steev> clover[m]: oops, sorry
<clover[m]> rc8 working fine now
<steev> good to hear :)
<steev> there is one issue i occasionally see here, wifi drops and starts spamming [52013.164196] ath11k_pci 0006:01:00.0: HTC Rx: insufficient length, got 1480, expected 1488 (the got/expected change); simply `sudo modprobe -r ath11k_pci; sudo modprobe ath11k_pci` works
<HdkR> That's suspiciously close to MTU. Wonder what happens if you reduce
<steev> i was just gonna mention it to mani and then forget about it til next time i hit it :P
<steev> i suppose i could try that too
<HdkR> Drop to 1400 and hope it goes away? :P
<HdkR> Does the kernel expose a way to query the CPU cores cache sizes these days? I need to double check the MSR list
<HdkR> `L2 L#0 (0KB) + L1d L#0 (0KB) + L1i L#0 (0KB) + Core L#0 + PU L#0 (P#0)` lstopo doesn't understand them at least :D
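On the question of querying cache sizes: Linux exposes per-CPU cache information under `/sys/devices/system/cpu/cpuN/cache/indexM/`, with attributes such as `level`, `type` and `size`. The kernel can only fill these in from what firmware or the devicetree describes, so on a platform where that description is incomplete the `size` attribute may simply be absent, which would be consistent with lstopo printing 0KB. A small sketch that reads whatever is available (the function name and the `sysfs_root` parameter are illustrative, and a system-level cache like the SLC is not expected to show up here):

```python
import glob
import os


def read_cache_info(cpu=0, sysfs_root="/sys/devices/system/cpu"):
    """Return a list of dicts with level/type/size for one CPU's caches.

    Attributes the kernel does not expose are reported as None.
    """
    entries = []
    pattern = os.path.join(sysfs_root, f"cpu{cpu}", "cache", "index*")
    for index_dir in sorted(glob.glob(pattern)):
        info = {}
        for field in ("level", "type", "size"):
            try:
                with open(os.path.join(index_dir, field)) as fh:
                    info[field] = fh.read().strip()
            except OSError:
                info[field] = None
        entries.append(info)
    return entries


if __name__ == "__main__":
    for entry in read_cache_info(0):
        print(f"L{entry['level']} {entry['type']}: {entry['size']}")
```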
spikerguy has joined #aarch64-laptops
<clover[m]> Welcome spikerguy:
Caterpillar2 has joined #aarch64-laptops
<qzed> anyone here with decent PCI subsystem knowledge (especially pm / power-off related)? or alternatively does someone know dark magic and has a lamb to sacrifice? or can point me to someone who knows?
<qzed> turns out ResetSystem isn't just broken on some ARM/Qualcomm devices but also on the x86 Surface Pro 9 and Surface Laptop 5
<qzed> and it seems PCI shutdown stuff is to blame somehow
<qzed> which is a bit out of my depth... so I was wondering whether anyone here has some pointers
<qzed> (IRCs, people to ask... anything really)
<steev> manni might know, he's been touching the arm/qualcomm stuff
<steev> mani_s when he's on irc
<qzed> the quick summary of the problem is: EFI's ResetSystem returns (doesn't seem to crash / page-fault as far as I can tell) under certain conditions
<qzed> in particular, it returns after some PCI shutdown functions ran
<qzed> so it looks like firmware is expecting some devices to be left on I guess
<qzed> which means if we run those PCI shutdown callbacks for some devices, the system essentially stays on and goes to a busy halt loop
<qzed> now I can write a PCI quirk for that... but I'm not sure if that's the best approach, feels very janky
laine has joined #aarch64-laptops
alfredo has joined #aarch64-laptops
alfredo has quit [Ping timeout: 480 seconds]
<HdkR> oop, looks like the new kernel didn't solve the hanging problem after all. Just significantly less likely to occur
spikerguy has quit [Quit: Page closed]
<steev> that... sucks... but is also good to know?
<HdkR> Good to know so I can warn people yes
<steev> just buy them a 2TB ssd to put in them ;)
<steev> although looks like lenovo only even has 512GB on their site available :(
<steev> i was looking to get a 1T
<HdkR> I've actually got a 2TB on the way for replacing the 512GB one
<HdkR> That's exactly the one I bought
<HdkR> There aren't many 2230 or 2242 drives available. Shame it is DRAM-less but eh
<steev> yeah, i noticed that too
<HdkR> WD's drives in the smaller form factor have the same problem
<HdkR> I just think it physically doesn't have the space and they don't use PoP
<steev> i don't know enough about storage technologies :(
<steev> but once you get it... i definitely wanna know what you think of it
<HdkR> It will /probably/ be fine
<steev> i wanna know speed wise, because i do a lot of building as well
<steev> i build kali packages on it
<steev> in fact... right now i'm building rust 1.65... which required building 1.64 first *sigh*
<steev> because neither 1.64 nor 1.65 will be pushed to sid/testing before bookworm releases
<HdkR> I could probably run some fio benches before and after
falk689 has quit [Remote host closed the connection]
falk689 has joined #aarch64-laptops
Caterpillar2 has quit [Quit: Konversation terminated!]