ChanServ changed the topic of #aarch64-laptops to: Linux support for AArch64 Laptops (Asus NovaGo TP370QL - HP Envy x2 - Lenovo Miix 630 - Lenovo Yoga C630)
<qzed> steev: thanks, I'll have to check that... the SPX doesn't have any hid-over-i2c devices but could be the light sensor thing or something else is acting up
<steev> i was going back through the irc log to find the wifi thing and saw that mentioned back then too
<steev> qzed: my knowledge level isn't as high as yours but iirc, i was just looking via irqtop
<qzed> steev: I'm not really an expert on that stuff either, especially all that SoC/qualcomm stuff... still got a lot to learn on that
<qzed> right now I have pretty much no idea what's going wrong with suspend... so I decided to look into SAM/EC based thermal sensors instead
<steev> i'm more of an "educated guess based on what i'm reading, and grepping around the source code and hoping something looks readable"
<qzed> meaning the SPX has now support for a bunch of skin thermal sensors... but still no suspend xD
<qzed> that's usually a good start xD
<qzed> I love https://elixir.bootlin.com/linux for that stuff
<steev> thermal sensors are good - that reminds me that with 5.19.0 rc4 (came out a little bit ago), bottom is back to saying there are no sensors on the flex 5g
<qzed> that + some grepping
<qzed> hmm, SoC based sensors are working for me... only ath10k times out reading or something
<steev> yeah, the ath10k one has been a thing for a long time
<steev> and dmesg gets spammed saying some sort of timeout
<steev> [ 743.943167] ath10k_snoc 18800000.wifi: failed to synchronize thermal read
<qzed> right, yeah
<steev> i was gonna poke kvalo, but i don't think they're on irc, unfortunately (i didn't see them in #linux-wireless on libera)
<qzed> so, finally looking at irqtop: there's a `GICv3 19 Level arch_timer` with a ton of interrupts, but I'm not sure if that's excessive... delta of ~3000 seems high though
<qzed> and I think that explains the cpu0 thing because that has %irq of 60 and %delta of 80
<steev> for me that is only at around 59
<qzed> while on other cores that's at ~5 and ~2 max
<steev> the big one(s) for me are 89c000.i2c and hid-over-i2c (which i'm assuming are the same and not actually 2 different irqs?) at 397
<qzed> so it seems that all the busy stuff showing up is interrupt handling...
<qzed> should be the same, I think
<qzed> hmm, okay I'm not so sure any more if that's all there is to the cpu usage... would explain why it's bound to cpu0, but my desktop has a delta of 6k to 7k and the cpu usage doesn't seem to be that extreme...
<qzed> but could also be a result of the slower cores... hard to say
<qzed> (delta for system timer)
<qzed> also considering I have CONFIG_HZ=1000 the arch_timer delta with 3k over 3 seconds kinda seems reasonable
<qzed> okay, so setting CONFIG_HZ to 250 reduces the number of interrupts, but seems to have little impact on the CPU usage... so maybe not related...
<qzed> but cpu usage can also be misleading...
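A rough sketch of the arithmetic behind the CONFIG_HZ remarks above, assuming the irqtop delta covers a 3 second window (as qzed describes) and that the CPU in question is busy enough to take every periodic tick. The function name is made up for illustration; with tickless idle the real count can be lower.

```python
def expected_tick_interrupts(config_hz: int, interval_s: float) -> int:
    """Upper bound on periodic timer ticks one busy CPU takes in interval_s seconds."""
    return int(config_hz * interval_s)


# Compare the two CONFIG_HZ values mentioned in the discussion.
for hz in (1000, 250):
    print(hz, expected_tick_interrupts(hz, 3.0))
```

By this estimate a delta of about 3000 on arch_timer is what CONFIG_HZ=1000 would produce on its own, so the count alone does not point at a misbehaving interrupt source.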
<steev> can also confirm that suspend doesn't work here, and i'm reminded that i need to disable gnome trying to suspend automatically
<qzed> funny enough autosuspend was the only time I was able to pull a log... other times it either didn't wake back up or if it did usb-serial didn't come back up
<steev> i ended up with a black screen with a cursor in the top left, i haven't actually checked the log yet
<steev> i don't see anything in my log, just a bunch of i2c_hid_get_input: incomplete report (21/1) repeated about 60 times or so
<steev> journalctl just shows reached target sleep, entering sleep state suspend, pm: suspend entry (s2idle)
<steev> bamse: occasionally seeing https://paste.debian.net/1245341/ with -rc4
<steev> that's on the flex5g, on the c630 see similar, but not quite the same - https://paste.debian.net/1245343/
<qzed> I think you can ignore those... or set CONFIG_RCU_EXP_CPU_STALL_TIMEOUT to a higher value
<qzed> if that value isn't already reasonably high
<qzed> I have again no idea what a reasonable stall time for a warning is here... but I've seen them too and set that to 250ms, which pretty much eliminated them
<steev> looks like it's currently set to 20, i'm assuming that's the default
<qzed> `default 20 if ANDROID` and `default 0 if !ANDROID`
<qzed> and `0` means it takes the value from `RCU_CPU_STALL_TIMEOUT`, which is in seconds
<steev> bamse's config sets it to 21000
<qzed> that's RCU_CPU_STALL_TIMEOUT in ms
<qzed> and that defaults to 21s
<qzed> so defaults all around I guess
<qzed> as far as I can tell, the "expedited" timeouts are just an early warning thing for things taking too long
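A small sketch of the fallback rule as qzed reads it from the Kconfig text: an expedited timeout of 0 falls back to RCU_CPU_STALL_TIMEOUT, converted from seconds to milliseconds. The function name is hypothetical and this only mirrors the description in the chat, not the kernel's actual code.

```python
def effective_exp_stall_timeout_ms(exp_timeout_ms: int, stall_timeout_s: int = 21) -> int:
    """Resolve the expedited RCU stall timeout in milliseconds.

    exp_timeout_ms  -- CONFIG_RCU_EXP_CPU_STALL_TIMEOUT (ms, 0 means "use the fallback")
    stall_timeout_s -- CONFIG_RCU_CPU_STALL_TIMEOUT (seconds, 21 per the discussion)
    """
    if exp_timeout_ms == 0:
        return stall_timeout_s * 1000
    return exp_timeout_ms


print(effective_exp_stall_timeout_ms(0))    # fallback path
print(effective_exp_stall_timeout_ms(20))   # the ANDROID default quoted above
print(effective_exp_stall_timeout_ms(250))  # the value qzed settled on
```

Under this reading, the 21000 seen in bamse's config is just the 21 second default expressed in milliseconds, while a config with ANDROID enabled warns after only 20 ms.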
<steev> ah, yeah i do have android enabled
<steev> also, fwiw, i re-did the mp stuff, and... i don't see more usb devices here
<qzed> hmm okay, I have no idea what's connected to those on the flex, so can't say if that's the way it's supposed to be or not
hexdump0815 has joined #aarch64-laptops
hexdump01 has quit [Ping timeout: 480 seconds]
hexdump01 has joined #aarch64-laptops
hexdump0815 has quit [Ping timeout: 480 seconds]
<steev> okay so
<steev> i used fbgrab, and... it's definitely there, it's just... not actually showing on the display? https://usercontent.irccloud-cdn.com/file/VVutiNsr/fb0.png
<steev> hm, or not
<steev> bottom should be running, but it's still showing the login prompt
SallyAhaj has quit [Read error: Connection reset by peer]
SallyAhaj has joined #aarch64-laptops
jhovold has joined #aarch64-laptops
SallyAhaj has quit [Quit: SallyAhaj]
matthias_bgg has joined #aarch64-laptops
vsp has joined #aarch64-laptops
vsp_ has joined #aarch64-laptops
vsp has quit [Ping timeout: 480 seconds]
Guest3173 is now known as jelly
vsp_ has quit [Remote host closed the connection]
SallyAhaj has joined #aarch64-laptops
matthias_bgg has quit [Ping timeout: 480 seconds]
<qzed> steev: I've rebased my stuff on 5.19-rc4 with bamse's 5.19-rc1 patches and things work as well as before
<qzed> I've had to pull in https://github.com/linux-surface/kernel/commit/6d1214e0dacc0b228b43bc0e2f9d20e1d4f475f9 otherwise the display wouldn't turn on most of the times
<steev> yeah, i have that patch here
<qzed> I've dropped a couple of commits that bamse also seems to have dropped, including the UCSI stuff
<qzed> do you get any additional info from /sys/kernel/debug/devices_deferred?
<steev> not the why, just that they are deferred
<qzed> hmm... did you try to bind them manually?
<steev> see, this is where my lack of knowledge comes into play
<steev> i don't know how to bind them manually
<qzed> ah, so there's some interface to bind drivers manually to devices
<qzed> you do have command line access, right?
<steev> oh yeah - that's the thing
<qzed> it should be `echo c300000.power-controller | sudo tee /sys/bus/platform/drivers/qcom_aoss_qmp/bind`
<steev> i get
<steev> video, as long as i don't fw_devlink=permissive, just that the above power controller is deferred
<qzed> okay
<qzed> so manually binding the thing could maybe give some additional output
<steev> hm
<steev> so, all i have in qcom_aoss_qmp is uevent
<steev> and usb c works, so i've just plugged in a wifi device to that, so i can "use it"
<qzed> oh huh wait, I also only have the uevent node... and no bind... okay that's a bit weird
<steev> maybe the qcom_aoss_reset which does have a bind?
<qzed> ah, the driver has `.suppress_bind_attrs = true` set for some reason...
<qzed> do you have any warnings or error messages for that in the log?
<steev> https://paste.debian.net/1245407 doesn't seem like any warnings or errors
<qzed> hmm, those look okay
<qzed> that should expose the bind/unbind attributes
<steev> i looked at the previous boot where i'd done fw_devlink=permissive and everything there is fine
<steev> er the same
<steev> yeah, was thinking of doing that next
<qzed> I think I had a similar thing some time back on a linux-next branch where some edp-phy would be deferred indefinitely for no apparent reason, but the "disable deferred probe timeout" thing fixed that for me
<qzed> also in that case I could bind the phy later and everything would work... so if that works for you it seems to be similar
<qzed> this series broke that for me in linux-next: https://lore.kernel.org/netdev/20220601070707.3946847-1-saravanak@google.com/
<qzed> but I think that should not be included in rc4
<bamse> steev: you shouldn't get any rcu stalls there...you only see that on -rc4? i.e. a regression since earlier -rc? or did you change your patch stack as well?
<bamse> steev: or perhaps just did more testing?
<steev> bamse: only seen it since rc4
<steev> qzed: so, interestingly (and correct, it's not in rc4) - since looking at that patch - i rebooted and set deferred_probe_timeout=250 and... apparently at this point, whatever they were waiting on... are there now, because the display just "went away" and if i ssh in, that power-controller is no longer in devices_deferred
<steev> still about to test with that suppress patched out
<qzed> the suppress thing should just give you the option to manually bind it, that's mainly useful if things are in place but the driver core decides to not try again for some reason (in that case you can then manually "try again")
<steev> it just tells me... no such device
<steev> bind/unbind exist now though
<qzed> huh okay... ENODEV normally indicates that the driver isn't intended for the device...
<qzed> maybe some path in the driver tries to access something that isn't there and that returns ENODEV?
<qzed> then the driver might fail with ENODEV rather than EPROBE_DEFER, and probing doesn't get actually deferred
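A minimal sketch of the manual bind attempt discussed above, written so the three outcomes seen in the chat are told apart: no `bind` file at all (which is what `.suppress_bind_attrs = true` produces), a write that fails with ENODEV, and success. The helper name and the `sysfs_root` parameter are assumptions added for illustration; the path layout follows the `echo ... | sudo tee` command quoted earlier.

```python
import errno
import os


def try_bind(device: str, driver: str, bus: str = "platform", sysfs_root: str = "/sys") -> str:
    """Write a device name to the driver's sysfs bind attribute and report what happened."""
    bind_path = os.path.join(sysfs_root, "bus", bus, "drivers", driver, "bind")
    if not os.path.exists(bind_path):
        # Either the driver is not loaded or it hides bind/unbind on purpose.
        return "no bind attribute (driver missing or suppress_bind_attrs set)"
    try:
        # The write is flushed when the file is closed, still inside this try block.
        with open(bind_path, "w") as f:
            f.write(device)
    except OSError as e:
        if e.errno == errno.ENODEV:
            return "ENODEV: driver did not accept the device, or its probe returned ENODEV"
        return f"bind failed: {e}"
    return "bound"


# Example call (needs root on a real system):
# print(try_bind("c300000.power-controller", "qcom_aoss_qmp"))
```

Running it needs root, same as the shell one-liner, and turning on debug output for the driver before retrying may show which path returned the error.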
<steev> bamse: it also doesn't occur every time
<steev> qzed: i don't honestly know - i'm gonna try redoing my patches - because i probably have some i shouldn't as well
<qzed> you could try enabling debug output and then try binding again
<bamse> steev: do you have the sync_state callback in the interconnect driver commented out?
<steev> yes
<steev> interconnect: sc8180x: Disable sync_state for now
<bamse> i had some strange problems with that in the past, when the cpu goes to idle and the busses are turned off at "random" times
<bamse> we've made some progress since then, but i think we're still missing a few driver votes
<bamse> iirc ufs doesn't vote for its bandwidth...so if everything else goes idle, a transaction there would crash the system
<steev> gonna do what i shoulda done a while ago
<steev> test with your rc1 branch itself, rather than my frankenkernel
<steev> oh
<steev> oh god
<steev> i think i might know the issue
<steev> i haven't actually been using the 5.19 devicetree because i haven't been copying it in, i've been booting 5.19 with the 5.11 devicetree (dtbloader)
<steev> will be re-testing with the 5.19 rc1 from bamse and then redoing my patchset
<bamse> might be a few things not working so awesomely with that 5.11 dtb...
<steev> indeed
jhovold has quit [Ping timeout: 480 seconds]
djakov_ has joined #aarch64-laptops
djakov has quit [Ping timeout: 480 seconds]
<steev> hm
<steev> well the bad news is... your rc1 isn't workin for me either
<steev> need to do more digging
<steev> blergh
<steev> i think i know why
<steev> bamse: your rc1 does the same thing - once i put the firmware, where it's expected to be (the installer puts it in qcom/LENOVO/82AK/ your dts looks in qcom/sc8180x ) - no video output once it's "supposed" to probe drm, even with video=efifb
<steev> well, the firmware installer script*
<steev> gonna re-try my rc4 with stuff in the correct spot
<steev> oh interesting, i see the same issue with your rc1 that qzed had with bluetooth where it's looking for crbtfw01.tlv instead of crbtfw21.tlv
klardotsh has joined #aarch64-laptops
<steev> [ 2.107592] gpio gpiochip0: (c440000.spmi:pmic@0:gpio@c000): not an immutable chip, please consider fixing it!
<steev> [ 2.111940] gpio gpiochip1: (c440000.spmi:pmic@4:gpio@c000): not an immutable chip, please consider fixing it!
<qzed> I see those on the spx too
<steev> rc4 does the same, no video output once it's supposed to switch over to msm
<steev> even if video=efifb is set
<qzed> do you have any deferred devices? for me something like this happens occasionally and it's somewhat reduced with the "disable deferred probe timeout" thing
<steev> oh yes
<steev> ae9a000.displayport-controller platform: supplier aec2a00.phy not ready
<steev> panel platform: supplier backlight not ready
<steev> backlight platform: supplier c440000.spmi:pmic@5:lpg not ready
<steev> interestingly... the backlight is on?
<qzed> right, the first one is what I get when that happens
<steev> so just reboot a few times?
<qzed> without disabling the deferred probe timeout that works for me like 1 in 5 times or so... with the deferred probe timeout it's the other way round (1 in 5 boots display stays black)
<qzed> I think you can also manually bind the edp-phy driver
<qzed> but I'm not sure whether the other things may cause problems
<steev> hm, i definitely have the disable deferred probe timeout
<qzed> as I don't have those
<qzed> I think if the backlight causes the panel to refuse to probe, the edp stuff won't probe either because that depends on the panel
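A quick sketch for following the supplier chain in the deferred list steev pasted, to see which device everything ultimately waits on. The line format is assumed from those three pasted lines only, and the function names are made up for illustration.

```python
import re

# Matches lines like "panel platform: supplier backlight not ready".
SUPPLIER_RE = re.compile(r"^(\S+)\s+(\S+): supplier (\S+) not ready$")


def parse_deferred(text: str) -> dict:
    """Map each deferred device to the supplier it is waiting on."""
    waiting = {}
    for line in text.splitlines():
        m = SUPPLIER_RE.match(line.strip())
        if m:
            device, _bus, supplier = m.groups()
            waiting[device] = supplier
    return waiting


def root_blocker(device: str, waiting: dict) -> str:
    """Follow supplier links until reaching something that is not itself deferred."""
    seen = set()
    while device in waiting and device not in seen:
        seen.add(device)
        device = waiting[device]
    return device


SAMPLE = """\
ae9a000.displayport-controller platform: supplier aec2a00.phy not ready
panel platform: supplier backlight not ready
backlight platform: supplier c440000.spmi:pmic@5:lpg not ready
"""

waiting = parse_deferred(SAMPLE)
print(root_blocker("panel", waiting))  # prints c440000.spmi:pmic@5:lpg
```

On a live system the same text could come from /sys/kernel/debug/devices_deferred. For the pasted lines, the panel traces back to the lpg node and the displayport controller to aec2a00.phy, so those two are the drivers worth checking first.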
<qzed> can you post a dmesg log?
<qzed> okay, I can't see the errors that I get when the panel stuff goes wrong... so I guess it's not even trying to probe the drm stuff because of the backlight somehow?
<steev> probably
<qzed> > rtc-pm8xxx c440000.spmi:pmic@0:rtc@6000: setting system clock to 1970-02-15T00:31:22 UTC (3889882)
<qzed> interesting, had the same thing on the spx and thought that MS decided maybe not to use the qcom rtc
<qzed> but I guess that doesn't work properly for the flex either?
<steev> the qcom rtc is garbage (no offense to whomever)
<steev> the c630 uses it, and it... can't actually be used as an rtc
<qzed> ooof
<qzed> luckily for me, the EC on the SPX provides an interface for one...
<steev> there's a thread... somewhere on arm.com, or maybe the code aurora forums, it's explained
<qzed> so your problem somehow is "supplier c440000.spmi:pmic@5:lpg not ready", which IIRC is the PWM driver for the backlight
<steev> Unfortunately, the aarch64 cores can NOT change the RTC time (only consume it)
<steev> right
<qzed> and I can't really see what would stop it from probing
<qzed> I mean it has practically no dependencies... except the parent pmic maybe?
<steev> but we enable that
<qzed> yeah
<steev> only thing i can think is... it's already on when it tries to be enabled and thus fail?
<qzed> i'd think that should then complain in the log
<qzed> I assume you do have CONFIG_LEDS_QCOM_LPG set?
<steev> i'm gonna pretend the answer was yes
<steev> gdi
<qzed> heh xD
<steev> i also may or may not be missing PHY_QCOM_EDP
<bamse> it's possible that i have the edp phy builtin...or at least in the ramdisk
<steev> i... don't have it enabled at all
<bamse> to avoid some probe deferral in the dp driver
<steev> it's module in your dump, but iirc i put it into the initramfs before
<bamse> yeah, but now that they moved the panel to live on the aux bus things changed
<bamse> so they register the device, which triggers module loading, and then they assume the device has probed
<steev> ah, let me build it in then
<bamse> but if you don't have the driver available (at least in the ramdisk) that will just be a pretty tight loop retrying that
<steev> well, when i rebuilt this one to test... i forgot to re-enable them
<steev> for some reason, make savedefconfig throws those away
<bamse> odd
<steev> okay lpg module, edp phy built-in, firmware in qcom/sc8180x instead of the other place (okay, i'm cheating and symlinked), booting and... same thing
<steev> and nothing in devices_deferred this time
<steev> oh
<steev> we got something
<steev> [ 8.279041] [drm:msm_dp_modeset_init [msm]] *ERROR* eDP aux_bus not found
<steev> [ 8.279397] [drm:_dpu_kms_initialize_displayport:633] [dpu error]modeset_init failed for DP, rc = -19
<bamse> do you have the new fancy aux-bus based layout in your dts?
<steev> uhh
<qzed> doesn't look like it
<steev> i thought i had whatever is in your rc1
<qzed> I think the rc1 stuff still has the old layout
<steev> hm
<steev> or it has both?
<qzed> there should be an aux-bus node
<qzed> let me try and whip something up
<bamse> i thought i pushed out the updated dts?
<bamse> i do know that i did quite a bit of refactoring of the dts to make it ready to send to the list...but the machine is powered down at the moment, so i can't check
<steev> i don't see it
<steev> unless it was in the sc8280xp branch
<steev> nope :)
<bamse> no, i rebased the sc8180x branch...ripped out the external display and started preparing to push the rest upstream
<bamse> but in the last branch i synced to my laptop i hadn't moved the panel...
<bamse> i'll have to power up the machine again, so i can update the branch...or maybe finish the work and post the patches
<bamse> just need the thunderstorm to pass...
<qzed> didn't compile-test it though
<steev> compiles fine
<steev> ohhh, i bet that storm is headed this way
<steev> can tell because it's not hotter than satan's asshole outside
<steev> that didn't quite work, and ssh doesn't appear to be working either. it's nbd, i'll just wait for the storm to pass :)
<bamse> steev: we had 98F here when i was outside ~2pm...now it's 70F...and the thunder was shaking the house
<bamse> steev: haven't yet figured out if i need to disconnect my things here or not...but i'd rather not figure out the hard way
<steev> yeah, it's only 83F here currently
<steev> it's "dark" out but not storming
<bamse> according to the radar the worst of it is in between us...
<bamse> and to the east
<steev> no rush, i'm trying to figure out why my rpi4 is not acting the same as my coworker's in australia
<steev> it actually feels really good out