<fda> hello, i installed RC3 to avm 1200. i've attached serial console. the flash from openwrt 19 to rc3 was ok. the device has eth0 + br-lan with the configured br-lan
<fda> BUT it is not reachable by lan and it cant ping other devices
<fda> so i manually deleted br-lan and assigned an ip to eth0 directly. but still no ping to/from lan
<fda> what can i do?
Rentong has quit [Ping timeout: 480 seconds]
<fda> some terminal output https://pastebin.com/Jbgm2gGu
<russell--> fda: there is nothing in the bridge
<russell--> what's your /etc/config/network look like?
<fda> @russell-- on sysupgrade i set "revert default"
<fda> i think there is something broken since openwrt19, maybe kernel-things
<fda> im not experienced with kernel/dts internals, but maybe https://github.com/openwrt/openwrt/commit/f3e45c45cb ?
<russell--> is there a switch?
<russell--> swconfig dev switch0 show | grep port
<fda> the device has 2xwlan + 1xlan
<fda> it's a "repeater" / access point
<fda> "Failed to connect to the switch. Use the "list" command to see which switches are available."
<russell--> okay, probably no switch
<russell--> you want the bridge
<russell--> because the wireless interfaces will be added to it
<fda> later, first i need the wired connection to work at all
<russell--> so try rebooting and do the brctl again and see if eth0 is added
<fda> it was added by default! to test i removed it
<fda> because there is no connection in and out
<fda> that's why i attached a serial console
<russell--> is it running dnsmasq? are you sure there is a device at 192.168.1.2?
<fda> i tried different. eth0 is not working
<russell--> RX bytes:25443 (24.8 KiB) TX bytes:5823 (5.6 KiB)
<russell--> says different
<fda> i know
<fda> im pinging from another host
<fda> but that gets no answers
<fda> and i flushed all iptables rules via the serial console to test
<fda> im currently building an image with tcpdump...
<russell--> if you plug in a device that expects a dhcp server, does it get an address?
<fda> depends on the vlan :)
<fda> my network is not the problem
<russell--> okay, good luck
<fda> :((((((
<fda> there is something with the kernel config
Rentong has joined #openwrt-devel
danitool has quit [Ping timeout: 480 seconds]
romany has quit [Remote host closed the connection]
Rentong has quit [Ping timeout: 480 seconds]
hexa- has quit [Quit: WeeChat 3.2]
hexa- has joined #openwrt-devel
<digitalcircuit> Now testing OpenWRT at https://github.com/openwrt/openwrt/compare/openwrt-21.02...digitalcircuit:openwrt-21.02-cpufreq-dtsivolt-cache , and that plus reverting the 1.4 GHz L2 cache DTSI change. I'll have to wait 8 hours to be sure, but it seems like I might've narrowed stress-ng down to an error-causing command (even if it hasn't YET hard-rebooted like the SFTP test):
<digitalcircuit> nice -n 5 stress-ng --oomable -t 8h --times --cache 1 --cache-level 2 # On loop, tracking success and failure count
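[editor's note] The "on loop, tracking success and failure count" part of the command above could be scripted roughly as below. This is only a sketch: the helper name `run_and_count`, the per-iteration timeout of 10 minutes, and the idea of counting by exit status are assumptions, not what digitalcircuit actually ran. The stress-ng flags are the ones quoted in the log.

```python
import subprocess
from collections import Counter

# Flags copied from the command in the log; the shorter "-t 10m" per
# iteration is an assumption made for this sketch.
STRESS_CMD = ["nice", "-n", "5", "stress-ng", "--oomable", "-t", "10m",
              "--times", "--cache", "1", "--cache-level", "2"]


def run_and_count(cmd, runs, runner=subprocess.run):
    """Run cmd `runs` times and tally how often it exits with status 0."""
    tally = Counter()
    for _ in range(runs):
        result = runner(cmd)
        # A non-zero exit status is counted as a failure.
        tally["success" if result.returncode == 0 else "failure"] += 1
    return tally

# Example call on the router (not executed here):
#   print(run_and_count(STRESS_CMD, runs=48))
```

Passing the runner in as a parameter keeps the counting logic testable on a machine that has no stress-ng installed.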
Acinonyx has joined #openwrt-devel
Acinonyx_ has quit [Ping timeout: 480 seconds]
<digitalcircuit> ...and nevermind that, stress-ng runs into errors with 1.4 GHz L2 disabled too. Might still work to trigger the reboot, will continue tinkering and testing.
<mangix> digitalcircuit: i am amazed someone runs stress-ng
<mangix> that package was added to openwrt by accident
<digitalcircuit> mangix: Noted! It was suggested for me to try stress-ng back when I was trying to simplify my test case. The way I discovered the issue was by Deja Dup (duplicity) uploading 203 GB in 25 MB chunks over OpenSSH to a USB drive connected to my NBG6817.
<digitalcircuit> Understandably, that's an involved setup to ask someone else to verify an issue with, hence trying to figure out a simpler way to recreate this. See also: https://bugs.openwrt.org/index.php?do=details&task_id=3099#comment9712 and https://lists.openwrt.org/pipermail/openwrt-devel/2021-July/035729.html
<digitalcircuit> (I hope I'm not annoying anyone here either; I've been trying to figure this issue out for months. I know the 1.4 GHz L2 cache frequency is what allows the hard reboot to happen, but I haven't figured out any fix.)
<mangix> digitalcircuit: so before and after the patchset it fails?
lmore377_ has joined #openwrt-devel
lmore377 has quit [Ping timeout: 480 seconds]
<mangix> backported latest stress-ng to 21.02
<digitalcircuit> mangix: Err, referring to the SFTP test failing (causing a hard reboot), or stress-ng? If the latter, I haven't figured out the right stress-ng parameters yet to recreate the SFTP test (without needing SSH/etc).
<digitalcircuit> And stress-ng on OpenWRT master has failed to build when I last tried in the past week or so (it was still building on 21.02).
<mangix> :)
<mangix> wonder why
lmore377_ has quit [Ping timeout: 480 seconds]
lmore377 has joined #openwrt-devel
<digitalcircuit> I'm not sure - I tracked it down to something with the Makefile referencing the Linux headers for IO parameters resulting in an invalid definition due to double [#define]s ( https://github.com/ColinIanKing/stress-ng/blob/master/Makefile#L431-L438 ) but wasn't certain on how to fix that.
<mangix> hmmm maybe a missing liburing dependency
<digitalcircuit> Not sure... The resulting "io-uring.h" file got created, just with invalid syntax (I'd need to retry to check to give a firm answer).
<digitalcircuit> Given my difficulty in recreating the issue with stress-ng, I had postponed that for a while, but even scripting an SFTP upload using GNOME's GIO stack in Python 3 hasn't yet recreated the issue either - I think I need to upload multiple random files, not a single one.
<digitalcircuit> Alternatively, I may need to run stress-ng in 2-4 second bursts, to mimic the SFTP upload being in chunks, exercising the CPU governor. I know that the patchset to enable 1.4 GHz L2 cache mentions potential issues with clocking and modified the cpufreq driver to try to prevent those situations, and I'm wondering if I'm somehow bypassing that protection.
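[editor's note] The "2-4 second bursts" idea could be planned with something like the following. It is a sketch under assumptions: the function names, the fixed 1 second idle gap between bursts, and rounding each burst to whole seconds are all invented for illustration; only the stress-ng flags come from the log.

```python
import random


def burst_schedule(total_seconds, min_burst=2.0, max_burst=4.0, idle=1.0, seed=None):
    """Return (burst_seconds, idle_seconds) pairs that cover total_seconds."""
    rng = random.Random(seed)
    schedule, elapsed = [], 0.0
    while elapsed < total_seconds:
        burst = rng.uniform(min_burst, max_burst)
        schedule.append((burst, idle))
        elapsed += burst + idle
    return schedule


def burst_command(burst_seconds):
    """Build one stress-ng invocation for a single burst (whole seconds)."""
    return ["stress-ng", "--cache", "1", "--cache-level", "2",
            "-t", f"{round(burst_seconds)}s"]
```

A driver would walk the schedule, run `burst_command(burst)` with `subprocess.run`, then `time.sleep(idle)`, so the CPU governor sees load arriving in chunks rather than as one continuous run.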
<digitalcircuit> OpenWRT 19.07 didn't have the issue. OpenWRT 21.02 is currently broken for me, https://github.com/openwrt/openwrt/compare/openwrt-21.02...digitalcircuit:openwrt-21.02-cpufreq-dtsivolt-cache doesn't change that, but if I then revert https://github.com/openwrt/openwrt/commit/911822cb6ed1447a0d0becfcce51543ff4a6e377 (disabling 1.4 GHz L2 frequency), it's fine. The "master" branch was fine until that patch to fix 1.4 GHz.
<digitalcircuit> (I feel I'm way over my head, trying to learn as I go, pardon all the uncertainty!)
<mangix> well, it's a sort of abandoned platform by Qualcomm
<mangix> Ansuel's doing most of the work
<mangix> that ethernet latency commit told me I wasn't crazy :)
<digitalcircuit> Ah, that's unfortunate of Qualcomm. And I do appreciate Ansuel's work - I want to be clear that I'm not trying to put them or anyone else's efforts down! Ultimately if I can't figure out what's going wrong, if I just need to add runtime /sys/ parameter or otherwise disable 1.4 GHz L2 cache frequency just for me, I'm fine with that.
<digitalcircuit> Ethernet latency commit?
<digitalcircuit> mangix: Noted, thanks! I've actually had that in mind because in theory I shouldn't be hitting 384 MHz anyways, yet the cpufreq stats/trans_table (?) shows enter/exit for 384 MHz. That plus the warnings in https://github.com/openwrt/openwrt/commit/3efbfe5465e0d3cbc52c37a2b80e8f4f2d4b35da makes me wonder if 384 MHz CPU + 1.4 GHz CPU isn't prevented in all cases.
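[editor's note] One way to check whether 384 MHz is really being entered is to read the cpufreq statistics from sysfs. The sketch below parses the `time_in_state` file (one "frequency-in-kHz counter" pair per line) that sits next to `trans_table`. The helper names and the sample numbers in the comment are made up; the real values on an NBG6817 are not shown in this log.

```python
from pathlib import Path

# Standard cpufreq stats location for the first CPU.
STATS = Path("/sys/devices/system/cpu/cpu0/cpufreq/stats/time_in_state")


def parse_time_in_state(text):
    """Map each frequency in kHz to its time counter."""
    table = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            table[int(parts[0])] = int(parts[1])
    return table


def entered(table, freq_khz):
    """True if the CPU has spent any time at freq_khz."""
    return table.get(freq_khz, 0) > 0

# Example call on the router (not executed here):
#   table = parse_time_in_state(STATS.read_text())
#   print(entered(table, 384000))
```

Sampling this before and after a backup run would show whether the counter for 384000 kHz moves while the L2 cache is at its highest frequency.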
<digitalcircuit> (Err, 1.4 GHz L2 cache)
<digitalcircuit> Nothing shows in the router's hardwired serial console about timings, so I must've not enabled the right kernel debugging settings or that logging isn't being hit.
* digitalcircuit realizes the problem he's chasing (trying to use a USB drive + router instead of having a separate NAS) and the resulting nondeterministic crash that only he seems to be getting consistently (majority of SFTP backup runs) might sound crazy :)
<Monkeh> Have you considered just hitting it with a hammer and moving on? :P
<digitalcircuit> Monkeh: Heh, the thought certainly has crossed my mind :D The less-destructive variant of "just live on custom builds forevermore" might be what I end up doing. I just stubbornly WANT to fix this. It'd bother me to give up (especially since I'm trying to apply for some Quality Engineering positions).
<digitalcircuit> (Having a NBG6817 in two different households adds to the motivation - I thought it'd make it easier for me to support both, which it does, but it means issues affect both too.)
Tapper has joined #openwrt-devel
<slh> just to re-iterate, I've been seeing unexplained crashes on lantiq (and freezes/ hangs on ath79) as well, so it might not all be ipq806x specific
<Monkeh> I have in the past seen some odd crashes on lantiq which I never did get to the bottom of
<slh> in my case without modem involvement (using an external vigor 130; bthub5 just used as router terminating the PPPoE session)
<Monkeh> These were all in modem use
<Monkeh> Although I have an HH5A which is more than a little crash happy
<Monkeh> But that one's definitely faulty
<slh> the bthub5 was the next best thing (after tl-wdr4300 and tl-wdr3600 failed, by freezing up hard under high throughput conditions) - and it worked 'better' (sudden/ unexplained reboots, but no eternal hangs)
lmore377_ has joined #openwrt-devel
lmore377 has quit [Ping timeout: 480 seconds]
lmore377 has joined #openwrt-devel
lmore377_ has quit [Ping timeout: 480 seconds]
Rentong has joined #openwrt-devel
valku has quit [Quit: valku]
<digitalcircuit> slh: Noted, thanks for chiming in! Hmm. Maybe there's larger changes that happened between 19.07 and 21.02 that are made worse (for me at least) by the 1.4 GHz L2 cache change that's specific to ipq8065.
<slh> digitalcircuit: not sure, is /etc/init.d/cpufreq present in 21.02.0? (only if it was backported, no idea if it was)
<digitalcircuit> slh: Yep, it's in 21.02.0rc3, possibly rc2, but not rc1 ( https://github.com/openwrt/openwrt/commit/0b0bec56ea1ac97098a7a546df2ebd62f7011129 ). I also manually backported that to an earlier version when trying to figure out my SFTP issue.
Rentong has quit [Ping timeout: 480 seconds]
<slh> damned, that (avoid 384 MHz) would have been such an easy thing to backport ;)
<digitalcircuit> slh: I know... I was so excited when I thought it worked, only for the frustratingly nondeterministic nature of this sudden/unexplained reboot to strike :)
<digitalcircuit> I'm starting to wonder if hooking up GDB over serial console might not actually be that crazy of an idea.
<digitalcircuit> (I don't recall specifics, just that there's a guide on OpenWRT kernel debugging via serial console.)
<digitalcircuit> I've been splitting my focus between recreating this more reliably (without needing to run 0.5-20 hours of full system backups), and trying to determine exactly what's going wrong, too.
Borromini has joined #openwrt-devel
Rentong has joined #openwrt-devel
Rentong has quit [Ping timeout: 480 seconds]
rejoicetreat has joined #openwrt-devel
f5 has joined #openwrt-devel
Tapper has quit [Ping timeout: 480 seconds]
Tapper has joined #openwrt-devel
danitool has joined #openwrt-devel
Rentong has joined #openwrt-devel
<blocktrron_> fda: there's the possibility the PHY delays are configured incorrectly now, as the behavior changed with kernel 5.4
<blocktrron_> try to change phy-mode from rgmii-rxid to rgmii-id
Rentong has quit [Ping timeout: 480 seconds]
Borromini has quit [Quit: Lost terminal]
rejoicetreat has quit [Ping timeout: 481 seconds]
goliath has joined #openwrt-devel
danitool has quit [Quit: Cubum autem in duos cubos, aut quadratoquadratum in duos quadratoquadratos]
rmilecki has joined #openwrt-devel
dedeckeh has joined #openwrt-devel
_lore_ has quit [Ping timeout: 480 seconds]
_lore_ has joined #openwrt-devel
Rentong has joined #openwrt-devel
Tapper has quit [Ping timeout: 480 seconds]
Rentong has quit [Ping timeout: 480 seconds]
Tapper has joined #openwrt-devel
f5 has quit [Ping timeout: 480 seconds]
<fda> blocktrron_: i tested more. the ips are set correctly; i pinged other devices and other devices pinged it. 0 pings succeed, but the devices do show up in the ARP table!
<fda> i tried to change dts like https://github.com/openwrt/openwrt/commit/f3e45c45cb , but i dont know dts files
<fda> does the phy-mode need to be changed in the dts? or is there a way to do it on the running system?
<fda> also tcpdump shows broadcasts of the network. very odd
f5 has joined #openwrt-devel
<fda> @blocktrron_ works!!!! i just changed the 1 string in the dts
<fda> many thanks!
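[editor's note] The "1 string in the dts" change suggested by blocktrron_ (phy-mode from rgmii-rxid to rgmii-id) amounts to a single substitution in the device tree source. The sketch below does it on DTS text; the helper name is invented, and the `&eth0` node in the usage comment is hypothetical, since the actual DTS of the device is not quoted in this log.

```python
import re

# Matches a property of the form: phy-mode = "<mode>";
_PHY_MODE = re.compile(r'(phy-mode\s*=\s*")([^"]+)(";)')


def set_phy_mode(dts_text, new_mode, old_mode=None):
    """Replace phy-mode values; if old_mode is given, only touch that mode."""
    def repl(match):
        if old_mode is not None and match.group(2) != old_mode:
            return match.group(0)  # leave other modes untouched
        return match.group(1) + new_mode + match.group(3)
    return _PHY_MODE.sub(repl, dts_text)

# Example (hypothetical node name):
#   set_phy_mode('&eth0 {\n\tphy-mode = "rgmii-rxid";\n};\n',
#                "rgmii-id", old_mode="rgmii-rxid")
```

The image still has to be rebuilt and flashed afterwards, as fda did.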
Tapper has quit [Ping timeout: 482 seconds]
<fda> can someone delete the openwrt 21-RC files for the 1200, since networking does not work with them?
Tapper has joined #openwrt-devel
svanheule has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
Rentong has joined #openwrt-devel
por has joined #openwrt-devel
por has quit [Remote host closed the connection]
svanheule has joined #openwrt-devel
por has joined #openwrt-devel
Tapper has quit [Remote host closed the connection]
Tapper has joined #openwrt-devel
Tapper has quit [Ping timeout: 480 seconds]
valku has joined #openwrt-devel
arifre has quit [Quit: Page closed]
arifre has joined #openwrt-devel
svanheule has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
Tapper has joined #openwrt-devel
svanheule has joined #openwrt-devel
svanheule has quit []
svanheule has joined #openwrt-devel
Tapper has quit [Ping timeout: 480 seconds]
Tapper has joined #openwrt-devel
svanheule_ has joined #openwrt-devel
svanheule has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
svanheule_ is now known as svanheule
svanheule has quit []
svanheule has joined #openwrt-devel
rmilecki has quit [Ping timeout: 480 seconds]
rmilecki has joined #openwrt-devel
Tapper has quit [Ping timeout: 480 seconds]
Tapper has joined #openwrt-devel
rsalvaterra_ has joined #openwrt-devel
rsalvaterra has quit [Ping timeout: 480 seconds]
por has quit [Remote host closed the connection]
goliath has quit [Quit: SIGSEGV]
aleasto has joined #openwrt-devel
ecloud has quit [Ping timeout: 480 seconds]
aleasto has quit [Ping timeout: 480 seconds]
aleasto has joined #openwrt-devel
ecloud has joined #openwrt-devel
Tapper has quit [Remote host closed the connection]
Tapper has joined #openwrt-devel
Tapper has quit [Remote host closed the connection]
Tapper has joined #openwrt-devel
danitool has joined #openwrt-devel
Rentong has quit [Remote host closed the connection]
Rentong has joined #openwrt-devel
goliath has joined #openwrt-devel
Rentong has quit [Ping timeout: 480 seconds]
Rentong has joined #openwrt-devel
shibboleth has joined #openwrt-devel
Rentong has quit [Ping timeout: 480 seconds]
Rentong has joined #openwrt-devel
aleasto has quit [Remote host closed the connection]
Rentong has quit [Ping timeout: 480 seconds]
philipp64 has quit [Quit: philipp64]
Rentong has joined #openwrt-devel
philipp64 has joined #openwrt-devel
philipp64 has quit []
philipp64|work has quit [Quit: philipp64|work]
Rentong has quit [Ping timeout: 480 seconds]
rsalvaterra_ has quit []
rsalvaterra has joined #openwrt-devel
<rsalvaterra> Heh… elfutils hate being compiled with gcc 11… :)
<stintel> I think they hate being compiled in general :P
<rsalvaterra> stintel: Well, GCC is complaining about mismatched pointer types in arrays (-Warray-parameter), and rightly so, from the ones I fixed…
<rsalvaterra> But since I'm on a "oh, gcc 11, let's break routers!" mood… :P
<stintel> yeah I never even finished switching to gcc10 as default
<rsalvaterra> Yeah! Elfutils are building now. I think the next client is BusyBox…
<rsalvaterra> … but actually I'm wrong, yay!
<owrt-snap-builds> Build [#184](https://buildbot.openwrt.org/master/images/#builders/29/builds/184) of `pistachio/generic` failed.
Rentong has joined #openwrt-devel
Rentong has quit [Ping timeout: 480 seconds]
<hauke> stintel: there were still 2 bugs with gcc 10
<hauke> I think you also looked into them
<hauke> did you fix one of them?
<stintel> I fixed one, another is in my staging tree
<stintel> for busybox I think
<stintel> feel free to check
jlsalvador2 has joined #openwrt-devel
<stintel> I'm driving back to Bulgaria this weekend but hardware problems are delaying my departure
<stintel> won't be able to look into gcc anytime soon
<hauke> stintel: ok
<hauke> stintel: ok you fixed the problem in mdnsd
<hauke> then there is still umbim
<hauke> I think this needs some bigger changes or we ignore the warning
jlsalvador has quit [Ping timeout: 480 seconds]
jlsalvador2 is now known as jlsalvador
<rsalvaterra> stintel: My Omnia is working fine with a gcc 11-compiled image (for my config, of course). And it's smaller too. I'm happy. :P
<shibboleth> are any of the current COTS tplink/qca/intel 11ax devices supported?
<shibboleth> i don't really care about ax, waiting for 6e, but the cpus/socs are nice
<hauke> shibboleth: mt7915 is supported
<shibboleth> mediatek
<shibboleth> no offense to nbd, but no way
<hauke> Intel client wifi with AX should also work
<shibboleth> a while back there was some back and forth between devs testing ath11 devices?
<shibboleth> hauke, yeah, but routers/devices.
<slh> xiaomi ax3600 and ax9000 are being worked on, the former is further ahead (pretty much fully working, apart from the big caveat at the end), the latter is significantly better hardware - but the elephant in the room is ath11k leaking memory like a sieve
<slh> (ax9000 includes a third qcn9074 radio, which is hard to get working so far, PCIe not behaving, very, very fresh ath11k support for this revision, etc.)
<slh> there are also plenty of other ipq807x devices on the market, but the xiaomi ones are by far the cheapest (and therefore the first victims)
Rentong has joined #openwrt-devel
dedeckeh has quit [Remote host closed the connection]
<shibboleth> https://wikidevi.wi-cat.ru/TP-LINK_Archer_AX50, surely cheaper, but ofc not qca
<hexa-> you'd want mt76 for ax anyway
Rentong has quit [Ping timeout: 480 seconds]
<PaulFertser> mt7915 for 802.11ax
Acinonyx_ has joined #openwrt-devel
Acinonyx has quit [Ping timeout: 480 seconds]
silver has quit [Ping timeout: 480 seconds]
<fda> jow: do you know how to get the latest kernel version during build by git? uname shows "5.10.46". but "opkg list-upgradable" reports errors like
<fda> * pkg_hash_check_unresolved: cannot find dependency kernel (= 5.10.46-1-78562504a701e22ae17585f15e8cf6aa) for kmod-nfnetlink
<fda> * pkg_hash_fetch_best_installation_candidate: Packages for kmod-nfnetlink found, but incompatible with the architectures configured
<fda> seems like it is something with the package feeds and "shallow clone"
<fda> i cloned the main repo just with "git clone https://github.com/openwrt/openwrt.git DIR"
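[editor's note] The opkg error above asks for an exact kernel package version, which includes a release number and a trailing hash on top of the "5.10.46" that uname reports. The sketch below just splits such a message into its parts so the hash can be compared with the one of the installed kernel package. The helper name and regular expression are assumptions; how the hash is derived is not explained in this log.

```python
import re

# Pattern for: "cannot find dependency kernel (= <ver>-<rel>-<hash>) for <pkg>"
_DEP = re.compile(
    r"kernel \(= (?P<version>[\d.]+)-(?P<release>\d+)-(?P<hash>[0-9a-f]{32})\)"
    r" for (?P<package>\S+)"
)


def parse_kernel_dep(line):
    """Return version, release, hash and package from an opkg error line, or None."""
    match = _DEP.search(line)
    return match.groupdict() if match else None
```

On the device, the result can be compared against what `opkg list-installed kernel` prints; a differing hash would explain why the kmod from the feed is rejected even though the version number matches.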
<blocktrron_> fda: i know, I've pushed this fix ;)
<fda> -.-
<fda> i dont know the nick names...
Rentong has joined #openwrt-devel
Rentong has quit [Ping timeout: 480 seconds]
rmilecki has quit [Ping timeout: 480 seconds]
isak has joined #openwrt-devel
silver has joined #openwrt-devel
Tapper has quit [Remote host closed the connection]
Tapper has joined #openwrt-devel
philipp64|work has joined #openwrt-devel
rsalvaterra_ has joined #openwrt-devel
Tapper has quit [Remote host closed the connection]
Tapper has joined #openwrt-devel
rsalvaterra has quit [Ping timeout: 480 seconds]