<ynezz>
hm, in QSDK on ipq8074 I get the proper board_id=0xa4: "ath11k_pci 0001:01:00.0: chip_id 0x0 chip_family 0x0 board_id 0xa4 soc_id 0xffffffff", likely because they pass cnss2.bdf_pci1=0xa4 from the bootloader
<ynezz>
in OpenWrt I get 0xff, is there some way to override it to the correct board_id ?
dangole has joined #openwrt-devel
robimarko has joined #openwrt-devel
<ynezz>
qcom,board_id=<0xa4>; is in the DTS, probably some QSDK feature, I can't find a similar feature in the tree
<robimarko>
ynezz: You don't need to override it
<robimarko>
We used to support that, but upstream shot it down as you can just use the variant string to match properly
<ynezz>
ah, ok
<robimarko>
You are getting 0xff as none of the vendors except for QCA itself bothered to fuse the board ID
<robimarko>
Is that a pluggable card?
<ynezz>
no
<robimarko>
Then just using the variant string to match up the BDF will work just fine
<Ansuel>
creating the 2 containers for the action is just an extra 20 seconds
<ynezz>
I was talking about the buildworker image, IMO it's fine to include it in your tools container for CI
<Ansuel>
yep, if we don't plan on adding it to buildworker then yes, I will add it to tools; I prefer to have more control since the actions lack some features
<ynezz>
BTW I'm seeing issues with qcn9074 firmware loading, it seems to be looking for board.bin file
<ynezz>
but that ipq-wifi is installing it as board-2.bin, what should be changed?
<ynezz>
"user.notice 11-ath11k-caldata: Unable to serve ath11k/QCN9074/hw1.0/board.bin request"
<ynezz>
"ath11k_pci 0001:01:00.0: failed to fetch board data for bus=pci,qmi-chip-id=0,qmi-board-id=255 from ath11k/QCN9074/hw1.0/board-2.bin" is really misleading
<Ansuel>
ynezz today wants to suffer
<robimarko>
ynezz: board.bin is the fallback
<robimarko>
It will try that if it fails finding the correct BDF in board-2.bin or board-2.bin is missing
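The lookup order robimarko describes can be sketched as follows. This is only a model in Python, not the driver's code; the function name and the dict standing in for a parsed board-2.bin are assumptions made for illustration.

```python
def find_board_data(board2_entries, key, board_bin=None):
    """Return (source, data) following the fallback order described above.

    board2_entries: dict mapping board name strings to BDF blobs, standing in
                    for a parsed board-2.bin (None if the file is missing).
    key:            the board name string the driver is looking for.
    board_bin:      raw contents of board.bin, or None if it is absent.
    """
    # First choice: a matching entry inside board-2.bin.
    if board2_entries is not None and key in board2_entries:
        return "board-2.bin", board2_entries[key]
    # Fallback: board.bin, used when board-2.bin is missing or has no match.
    if board_bin is not None:
        return "board.bin", board_bin
    raise FileNotFoundError(f"no board data for {key}")
```

This also shows why the board-2.bin error can look misleading: a failed match there is followed by a separate request for board.bin.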
rua has joined #openwrt-devel
<ynezz>
yep, but that should be correct BDF
<robimarko>
Can you run ath11k-bdencoder -i on the board-2.bin?
<robimarko>
And did you set the variant string in the DTS?
<ynezz>
yes, and it loads that file via board.bin so it should be ok
<ynezz>
robimarko: "ath11k_pci 0001:01:00.0: DT bdf variant name not set." seems like a clue, that DT node was on the other PCI bus, works now :o
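For context, the board-2.bin error quoted earlier shows the key that is being looked up. A minimal Python sketch of how such a key is put together, assuming the DT variant string is appended as a trailing ",variant=" field; the helper name and the example variant are hypothetical.

```python
def board_name(bus, chip_id, board_id, variant=None):
    """Build a lookup key in the format seen in the error message."""
    name = f"bus={bus},qmi-chip-id={chip_id},qmi-board-id={board_id}"
    if variant:
        # Assumption: the variant from the device tree is appended like this.
        name += f",variant={variant}"
    return name


# Without a variant the key is the generic one from the error message.
print(board_name("pci", 0, 255))
# With a variant (made-up value here) the key becomes board specific.
print(board_name("pci", 0, 255, "Example-Board"))
```

With board_id stuck at 0xff (255), the variant is the only part of the key that tells boards apart, which is why a DT node on the wrong PCI bus leads to a failed match.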
rua has quit [Quit: Leaving.]
<djfe>
robimarko: about bootipq, yes it's just a command to boot from spi
<djfe>
the board I have access to, is a Zyxel WRE6606
<djfe>
if Kernel is larger than 4MiB, then bootipq won't load enough of the Kernel and booting fails
<djfe>
In another ipq40xx device they replaced bootcmd https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=45eb57f12f3a128a1822a20b7e536527ab92ca67
<djfe>
lzma-loader is probably the better alternative, since they only seem to support gzip natively
<djfe>
I know, it takes effort but it's probably the better solution, right?
<robimarko>
djfe: I don't know, I am not that much of an lzma-loader fan
<djfe>
looking at devices with KERNEL_SIZE = 4096k there are quite a few devices affected
<djfe>
why not?
<robimarko>
To put it mildly, it's a mess currently
<robimarko>
But, if you want to port it, please go ahead cause there isn't much of an alternative that works without tweaking the bootloader
<djfe>
ok but increasing compat and asking users to modify their uboot environment could lead to issues as well.
<djfe>
Well, yes ^^
<djfe>
The biggest problem for me is finding out how this is done.
<djfe>
Also if I do it, it will probably take quite a while since I haven't implemented anything similar before
<robimarko>
Well, then you gotta find a volunteer to do so
<robimarko>
Cause it is low-level work
<djfe>
yes, definitely not my strong suit, but I would like to get to know stuff like that eventually
<djfe>
Finding volunteers: I wanted to bring the issue to attention here by asking, if anyone else was already affected by this and had to recover their devices via serial like me :)
<djfe>
I kind of want to prevent people having to recover their devices after installing release candidates ^^
<djfe>
I'm not sure KERNEL_SIZE is set for every affected device, it wasn't set for the wre6606.
<robimarko>
It's probably not set on all devices
<djfe>
First I'm going to do damage prevention, trying to figure out which devices are missing KERNEL_SIZE, so it will rather fail building than booting later on
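A build-time guard of the kind djfe describes could look roughly like this. It is a Python sketch rather than actual OpenWrt build code; the function names are hypothetical, and the 4096k default is taken from the KERNEL_SIZE value mentioned above.

```python
import re


def parse_size(text):
    """Convert a size such as '4096k' or '4m' to bytes (k = KiB, m = MiB)."""
    match = re.fullmatch(r"(\d+)([km]?)", text.strip().lower())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    return int(number) * {"": 1, "k": 1024, "m": 1024 * 1024}[unit]


def check_kernel_fits(kernel_bytes, limit="4096k"):
    """Raise if the kernel is larger than the limit, else return spare bytes."""
    max_bytes = parse_size(limit)
    if kernel_bytes > max_bytes:
        raise ValueError(
            f"kernel is {kernel_bytes} bytes, limit is {max_bytes} bytes"
        )
    return max_bytes - kernel_bytes
```

Failing here means the image is never produced, instead of producing one that the bootloader only partially loads.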
<ynezz>
robimarko: is there any preference for handling of ipq807x devices using eMMC storage ?
<robimarko>
ynezz: In what sense?
<djfe>
It might be all spi/nor flash devices on ipq40xx, but I'm not entirely certain, yet.
<ynezz>
robimarko: there is no such device in the tree yet, but there might be some in the works
<robimarko>
ynezz: Qnap 301W and NBG use eMMC
<ynezz>
ah, I've missed that, thanks
<robimarko>
So, everything should be there
<robimarko>
djfe: It's not all devices, it depends on what the vendor did in the bootloader
<robimarko>
But it would be safe to assume a lot of them have the 4MB limit
* f00b4r0
discovers that his mobile provider is currently in recovery position, nation-wide. Impressive
<Ansuel>
recovery?
<f00b4r0>
SNAFU
<f00b4r0>
nation-wide outage
<Ansuel>
OH LOL
<f00b4r0>
which is only like the second time in under 6 months, which is bound to have some consequences, given this is the state-owned "historical" provider, namely Orange.
<aparcar[m]>
stintel: do you mind I enable blank templates again or do you think people will go nuts?
<stintel>
blank templates ?
<robimarko>
ynezz: Yeah, it seems to hang for a bit more on preinit than on 5.15
<Ansuel>
aparcar maybe there is a way to allow blank templates only for members ?
<Mangix>
I installed dd-wrt on a wr940nv4 just because the stock firmware doesn't support disabling WAN. Doesn't support WPA3. Wonder if I can flash OpenWrt with support for that.
<aparcar[m]>
Ansuel: strangely not, I looked for something like "announcements" but it doesn't seem to be there either
rua has quit [Ping timeout: 480 seconds]
<aparcar[m]>
Ansuel: what's the status of the CI ;)? Still waiting for CI build of another repo
<robimarko>
While TRIM has caused headaches in the past for SSD users, hopefully the eMMC reporting of TRIM capabilities is reliable and this doesn't end up causing issues / non-zeroing-out behavior for quirky devices...
<Ansuel>
imho the feature to zero-out sector is broken
<robimarko>
Why?
<Ansuel>
for that single emmc
<robimarko>
Yeah
<robimarko>
Ansuel: They say in the commit description: If an eMMC card supports TRIM and indicates that it erases to zeros
<robimarko>
But, I don't see how they check whether that TRIM erases to zero?
<Ansuel>
ehhh i wonder if they assume support
<robimarko>
mmc_can_trim does however also check for EXT_CSD_SEC_GB_CL_EN
<robimarko>
Whatever that means
<Ansuel>
ynezz aparcar can you check and review this? github.com/openwrt/openwrt/pull/12774
<mrkiko>
robimarko: hi!! No, but ... I am not writing actively to the flash as the device is running normally
<mrkiko>
robimarko: reading the memory with grep didn't cause any error... I would need some writing to try
<mrkiko>
robimarko: I am able to cause I/O errors with fstrim - [93010.835112] I/O error, dev loop0, sector 16902 op 0x3:(DISCARD) flags 0x800 phys_seg 1 prio class 2
<Ansuel>
we should test
<Ansuel>
if fstrim causes the same error on 5.15
<Ansuel>
but my idea is that it wasn't supported
<Ansuel>
for emmc
<mrkiko>
Ansuel: I can confirm that
<robimarko>
I suspect TRIM was broken all along but never used
<Ansuel>
yes and the datasheet is just for the meme
<Ansuel>
or a copy past error
<Ansuel>
copy paste*
<mrkiko>
for now, I can not observe fs corruptions with e2fsck with the -n flag; and after having mounted ro both / and /overlay
<mrkiko>
To write-stress a little, I installed curl and python3 packages...
bluew has joined #openwrt-devel
<mrkiko>
but not seeing any of the initial WRITE_ZERO ones
<robimarko>
mrkiko: Does fstrim cause issues in 5.15 is my question?
<robimarko>
Cause, it looks damn likely that TRIM was broken on these eMMC from the start but I don't think OpenWrt ever triggers TRIM
<Ansuel>
robimarko trim doesn't make that much sense on 4gb emmc since doing it by sw should not cause that big of a perf regression
<robimarko>
Ansuel: There should be no regression as it wasn't being used, cause AFAIK you need to periodically call fstrim or the like to actually use it
floof58 has quit [Remote host closed the connection]
<robimarko>
Ideally we would leave it but it seems broken
<mrkiko>
robimarko: on 5.15 fstrim said operation not supported I think
<robimarko>
That would explain this never showing up before
<mrkiko>
but there are two kinds of I/O errors involved here and I don't have the knowledge to correlate them - one is discard-related, the other (the ones I reported yesterday) might not be
<Ansuel>
well one might be
<Ansuel>
validating the trim and noticing that it failed
<Ansuel>
the other is 0 in the wrong place
<robimarko>
Write zeroes one is 100% TRIM being broken
<robimarko>
Cause its trying to offload that by using TRIM commands
<Ansuel>
if the feature is broken the kernel expects a zeroed block while in reality it's something else, and that is reported
lucenera has joined #openwrt-devel
<robimarko>
It expects that block to be zeroed out
<robimarko>
mrkiko: What is the eMMC model you have?
floof58 has joined #openwrt-devel
<mrkiko>
so, it expects the block to be zeroed out but it's not? Or writes zeroes somewhere it shouldn't? furthermore, the write zero ones are triggered at boot as far as I can tell, at least after sysupgrade and with no intervention on my side
<Ansuel>
it's probably called when the emmc is mounted
<robimarko>
It's called once the EXT4 fs is mounted
<mrkiko>
robimarko: how do I retrieve this information?
<robimarko>
It's all in /sys/class/mmc_host/mmc0/mmc0:0001
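A small Python sketch for collecting the identification attributes from that sysfs directory. The path is the one given above; the selection of attribute files (name, manfid, oemid, fwrev, date) and the function name are assumptions about what is useful to report, and which files exist depends on the card.

```python
from pathlib import Path

# Attribute files to read; missing ones are reported as None.
FIELDS = ("name", "manfid", "oemid", "fwrev", "date")


def read_emmc_info(card_dir="/sys/class/mmc_host/mmc0/mmc0:0001", fields=FIELDS):
    """Read the selected sysfs attributes of an MMC card into a dict."""
    info = {}
    for field in fields:
        try:
            info[field] = (Path(card_dir) / field).read_text().strip()
        except OSError:
            # The attribute or the whole directory is missing or unreadable.
            info[field] = None
    return info


if __name__ == "__main__":
    for key, value in read_emmc_info().items():
        print(f"{key}: {value}")
```

The name and manfid values are the ones that matter when comparing a card against the quirks table.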
<robimarko>
Some Kingston ones are already in the quirks table
<Ansuel>
lovely they report support but it's broken
<Ansuel>
i guess they just enable it for the marketing
<robimarko>
Well, classic thing
<mrkiko>
As for the device name, gatto in Italian = cat in English, although a "t" letter is missing. Forno is oven - because the device might get a little bit hot, even though sustainably in my experience
<robimarko>
I added mine to the quirks list and no more errors
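The idea of a quirks list can be sketched in Python as a table matched against the card's name and manufacturer ID. This is not the kernel's table: the entries, the TRIM_BROKEN label and the helper names below are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Fixup:
    name: Optional[str]    # product name from sysfs "name"; None matches any
    manfid: Optional[int]  # manufacturer ID from sysfs "manfid"; None matches any
    quirk: str             # label of the workaround to apply


# Hypothetical entries, not real table contents.
FIXUPS = (
    Fixup(name="EXMPL1", manfid=0x01, quirk="TRIM_BROKEN"),
    Fixup(name=None, manfid=0x02, quirk="TRIM_BROKEN"),
)


def quirks_for(name, manfid_text, table=FIXUPS):
    """Return the set of quirk labels that apply to a card."""
    manfid = int(manfid_text, 16)  # sysfs reports a hex string such as 0x000001
    return {
        fixup.quirk
        for fixup in table
        if fixup.name in (None, name) and fixup.manfid in (None, manfid)
    }
```

A card that matches an entry stops advertising the broken feature, so the errors go away without touching anything else.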
<robimarko>
Funnily enough, it seems that TRIM kind of works
<robimarko>
Cause fstrim has no errors, but it doesn't seem to like writing zeros
<Ansuel>
well problem is exactly that
<Ansuel>
the feature half working and having fw bug
<mrkiko>
I hope it doesn't break hw, or that it doesn't break the device enough to need to boot from initramfs and restore partition tables and so on :D
<robimarko>
Nah
<robimarko>
It's just failing to do what's being asked for
<mrkiko>
:D
<robimarko>
Worst it can do is mess up the overlay
<mrkiko>
no problem then
<Ansuel>
mrkiko worst that can happen is corrupted fs
<mrkiko>
but so far the impression is - the code is detecting errors and behaving well. As said, no detectable corruptions for now
<mrkiko>
with e2fsck
<robimarko>
AFAIK, it will see that I/O call failed and just revert to the old SW path
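The behaviour robimarko describes (an offloaded write-zeroes request fails, then the plain software path takes over) can be modelled in a few lines of Python. This is a toy simulation under that assumption, not kernel code; the class and function names are invented.

```python
class FlakyDevice:
    """Toy block device whose offloaded zeroing may be broken."""

    def __init__(self, size, offload_works):
        self.data = bytearray(b"\xff" * size)
        self.offload_works = offload_works

    def trim_zero(self, start, length):
        # Offloaded path: the device is asked to zero the range itself.
        if not self.offload_works:
            raise OSError("offloaded write-zeroes failed")
        self.data[start:start + length] = bytes(length)

    def write(self, start, payload):
        # Software path: zeros are sent as ordinary data.
        self.data[start:start + len(payload)] = payload


def write_zeroes(dev, start, length):
    """Try the offload first, fall back to a normal write on I/O error."""
    try:
        dev.trim_zero(start, length)
        return "offload"
    except OSError:
        dev.write(start, bytes(length))
        return "software"
```

In this model the range ends up zeroed either way; the only visible difference is the logged I/O error, which matches mrkiko seeing errors but no corruption in e2fsck.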
<Ansuel>
yep was similar to the problem of ipq8064 using different EC configuration on some partition
<robimarko>
I'll add the patch for the Micron eMMC I have and send it upstream
<robimarko>
Maybe they have some ideas
<Ansuel>
reporting error but the thing wasn't touched
<mrkiko>
infact... probably the main reason to try to avoid these errors is to avoid confusion in a sense
<robimarko>
Don't get me wrong, I/O errors are something to always be looked into
<robimarko>
It makes even more sense since ynezz reported even in HS400 his eMMC works without any errors
<robimarko>
It would also explain why benchmarking does not trigger anything
<mrkiko>
if I can test, I'll do so, so feel free to ping me
<robimarko>
mrkiko: You don't happen to know somebody with high-res pictures of the NBG, or maybe somebody who knows the eMMC model?
<Ansuel>
no image on FCC?
<robimarko>
I can't make out the eMMC IC, let alone the model
<robimarko>
Anyway, off to bed
robimarko has quit [Quit: Leaving]
goliath has quit [Quit: SIGSEGV]
tlj has quit [Ping timeout: 480 seconds]
csharper2005 has quit [Ping timeout: 480 seconds]
cmonroe has quit [Ping timeout: 480 seconds]
cmonroe has joined #openwrt-devel
cmonroe has quit [Ping timeout: 480 seconds]
schwicht has quit [Read error: Connection reset by peer]
schwicht has joined #openwrt-devel
Tapper has quit [Ping timeout: 480 seconds]
Danct12 has quit [Ping timeout: 480 seconds]
<SlimeyX>
are the black antenna leads usually 2.4 and the white or grey ones 5G?
<SlimeyX>
AP with dual radios
<\x>
can't tell
<\x>
depends
<\x>
just try and see signal readings on which is better if in doubt ;) got a phone right?
<SlimeyX>
heh yeah
<SlimeyX>
ap has two of the exact same cards trying to figure out which should do which band
ptudor has joined #openwrt-devel
<\x>
which ap is this
<SlimeyX>
adtran bsap-1930
<SlimeyX>
nm i found it in my notes
<SlimeyX>
grey 5 black 2.4
rsalvaterra has quit [Read error: Connection reset by peer]