#linux-sunxi on 2022-11-03 — irc logs at oftc.irclog.whitequark.org

2022-08-14 19:44 ChanServ changed the topic of #linux-sunxi to: Allwinner/sunxi development - Did you try looking at our wiki? https://linux-sunxi.org - Don't ask to ask. Just ask and wait for an answer! - This channel is logged at https://oftc.irclog.whitequark.org/linux-sunxi

00:01 <apritzel> dikiy: I just see that smaeul already decreased the mask to 9 bits: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8b33dfe0b

00:43 apritzel has quit [Ping timeout: 480 seconds]

01:41 cnxsoft has joined #linux-sunxi

01:41 swiftgeek has quit [Ping timeout: 480 seconds]

01:42 pabs has quit [Quit: Don't rest until all the world is paved in moss and greenery.]

01:44 pabs has joined #linux-sunxi

01:47 swiftgeek has joined #linux-sunxi

02:00 swiftgeek_ has joined #linux-sunxi

02:03 jernej has quit [Quit: Free ZNC ~ Powered by LunarBNC: https://LunarBNC.net]

02:04 jernej has joined #linux-sunxi

02:04 swiftgeek_ has quit []

02:04 swiftgeek_ has joined #linux-sunxi

02:05 swiftgeek_ has quit []

02:05 swiftgeek_ has joined #linux-sunxi

02:06 swiftgeek is now known as Guest300

02:06 swiftgeek_ is now known as swiftgeek

02:07 Guest300 has quit [Ping timeout: 480 seconds]

02:25 warpme____ has quit []

02:39 vagrantc has quit [Quit: leaving]

03:34 moteen has joined #linux-sunxi

03:40 moteen has quit [Remote host closed the connection]

04:20 JohnDoe_71Rus has joined #linux-sunxi

05:09 cnxsoft1 has joined #linux-sunxi

05:09 cnxsoft has quit [Read error: Connection reset by peer]

05:27 moteen has joined #linux-sunxi

05:33 moteen has quit [Read error: Connection reset by peer]

06:17 evgeny_boger has joined #linux-sunxi

06:25 evgeny_boger has quit [Ping timeout: 480 seconds]

07:24 apritzel has joined #linux-sunxi

07:40 szemzoa has quit [Remote host closed the connection]

07:43 mps has quit [Ping timeout: 480 seconds]

08:03 apritzel has quit [Ping timeout: 480 seconds]

08:51 mps has joined #linux-sunxi

09:38 apritzel has joined #linux-sunxi

10:47 dikiy_ has joined #linux-sunxi

10:49 dikiy has quit [Ping timeout: 480 seconds]

10:53 dikiy has joined #linux-sunxi

10:55 dikiy_ has quit [Ping timeout: 480 seconds]

11:05 dikiy has quit []

11:07 dikiy has joined #linux-sunxi

11:10 <dikiy> apritzel: I already did the test (https://github.com/smaeul/timer-tools) that shows (https://pastebin.com/xbTeksJN) that genmask(8,0) is not enough.

11:11 <dikiy> as smaeul suggested, changing 8 to 7 helped and the test passed (no failures over several hours of testing)

11:11 <dikiy> I would test the performance impact, but I don't know how

11:13 <dikiy> And as for the kernel parameter: it could default to 8, for example. But if somebody finds out (in rare cases like mine, I suppose) that 8 is not enough, they could set the kernel parameter to 7 (or 6, whatever)

11:15 <apritzel> dikiy: I understand the idea, but a command line parameter would just be a hack

11:15 <dikiy> The point is that nobody with good boards (the majority) should suffer because of 5% of bad boards. And unfortunately, pine64 doesn't consider this buggy HW and doesn't send a replacement

11:16 <apritzel> keep in mind that the arch timer is inside the SoC, so it's not some board production issue

11:16 <gamiee> dikiy: is it happening on PINE A64 LTS? Or another PINE64 device?

11:17 <dikiy> pinephone OG

11:17 <apritzel> it's an Allwinner A64 problem

11:17 <apritzel> there might be good and bad batches of SoCs, nobody knows

11:17 <gamiee> Yes, I know it's an SoC problem, just wondering whether it's an issue with one A64 batch or with all revisions of A64

11:18 <gamiee> (I wonder if it's fixed in A64-H)

11:18 <apritzel> and this "nobody with good boards (the majority) should suffer" is the reason I asked for a performance assessment

11:18 <dikiy> I'm just tired of needing to cross-build the kernel every time I want to get an update...

11:18 <apritzel> because if it turns out to be totally negligible, we don't need to boil the ocean, and just decrease the mask (again)

11:19 <dikiy> apritzel: how could I measure the performance penalty?

11:20 <apritzel> good question, for a start it should be relatively easy to check how often we actually call this function

11:20 <apritzel> which I guess depends on the workload

11:22 <dikiy> apritzel: is this a function?

11:23 <apritzel> IIRC I once measured that the actual sysreg access takes 3 cycles, so we may overthink the performance impact here

11:23 <dikiy> as far as I can see, it is some inline definition

11:23 <apritzel> sun50i_a64_read_cnt[pv]ct_el0 are functions

11:23 <dikiy> ah, I see

11:25 <dikiy> can I set some counter on the syscall without recompiling the kernel?

11:27 <apritzel> I think we already suffer because we need a workaround at all, so just slightly increasing the chance of a false positive might not be noticeable at all

11:27 <dikiy> 9->8 and now 8->7 gives something like a 4 times penalty, doesn't it?

11:28 <dikiy> *would give

11:30 <apritzel> dikiy: I'd say we go from 1/1024 to 1/256, which is technically 4 times, but still with a very low probability
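
The 1/1024 and 1/256 figures quoted above can be checked with a little arithmetic. The following is a minimal Python sketch, not kernel code: it assumes the low counter bits are uniformly distributed at the moment of a read and counts a single fixed bit pattern. The helper names `genmask` and `pattern_chance` are hypothetical and introduced only for illustration.

```python
from fractions import Fraction


def genmask(high, low):
    """Bit mask with bits low..high set, in the spirit of the kernel's GENMASK(high, low)."""
    return ((1 << (high + 1)) - 1) & ~((1 << low) - 1)


def pattern_chance(mask_bits):
    """Chance that the low mask_bits of a uniformly random value equal one fixed pattern."""
    return Fraction(1, 1 << mask_bits)


# Assumption: a 10-bit mask versus an 8-bit mask, matching the numbers in the chat.
for bits in (10, 9, 8):
    print(bits, "bits:", pattern_chance(bits))
```

If the check rejects two patterns (all ones and all zeros) instead of one, each chance doubles, but the ratio between a 10-bit and an 8-bit mask stays at 4.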

11:30 <dikiy> it is like we need to wait in a loop until enough time has passed

11:30 <dikiy> ah, so I didn't understand the caveat

11:31 <dikiy> it is like we had a chance of 1/1024 to get stuck for some time in a loop, and now it would be 1/256

11:31 <apritzel> dikiy: check the code, it just immediately reads again, and we just impose an upper limit

11:32 <apritzel> it's not stuck, normally you would just do another read, and that's it

11:32 <dikiy> I lack some understanding of multiprocessing. For example, can the kernel be interrupted while in this loop?

11:33 <dikiy> ah.. the bug has a fingerprint, yes?

11:33 <dikiy> like 7fffffff at the end

11:34 <apritzel> yes, that's the idea: the erratum produces that pattern

11:35 <dikiy> in this case it shouldn't be a big deal I think

11:35 <apritzel> re: preemption: why would that matter? if the scheduler interrupts, we don't care about 10 cycles or so anymore

11:36 <dikiy> should I report it somewhere?

11:36 <apritzel> dikiy: in the normal case we just read once, and the value is fine, and there is just the general overhead of an arch timer workaround, plus the comparison (which costs nothing in the grand scheme of things, really)

11:37 <apritzel> dikiy: if we hit the erratum case, then it's great, because we detect this, and just pay 10 or so more cycles to prevent a big problem

11:38 <apritzel> dikiy: the only downside is that this bit pattern can of course appear just normally, without anything being wrong, that's this false positive case I mentioned above
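
The read-and-retry behaviour described above can be sketched as follows. This is an illustrative Python simulation, not the kernel implementation: the check `((value + 1) & mask) <= 1` is an assumed way of spotting low bits that are all ones or all zeros, and `is_suspect`, `read_counter`, `read_raw`, and the retry limit of 150 are names and values chosen for this sketch.

```python
def is_suspect(value, mask_bits):
    """True if the low mask_bits of value are all ones or all zeros."""
    mask = (1 << mask_bits) - 1
    # All ones + 1 wraps to 0 under the mask; all zeros + 1 gives 1.
    return ((value + 1) & mask) <= 1


def read_counter(read_raw, mask_bits=8, max_retries=150):
    """Read the counter, re-reading while the value looks suspect.

    read_raw is a hypothetical callable standing in for the raw register read.
    Returns (value, number_of_reads). The loop is bounded by max_retries,
    so a reader that always returns a suspect value cannot hang the caller.
    """
    retries = max_retries
    while True:
        value = read_raw()
        retries -= 1
        if not is_suspect(value, mask_bits) or retries == 0:
            return value, max_retries - retries
```

With a fake reader that returns 0x7ff, 0x800 and 0x801 in turn, the sketch needs three reads and returns 0x801. An ordinary value such as 0x1234 is accepted on the first read, which is the common case; a false positive costs only one extra read.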

11:39 <dikiy> now I see that shrinking the mask doesn't matter a lot compared with the whole workaround scheme

11:39 <apritzel> exactly

11:39 <apritzel> dikiy: yes, please report this to the mailing list, including the maintainers (both sunxi and arch timer)

11:39 <apritzel> scripts/get_maintainer.pl will tell you who they are

11:40 <dikiy> tbh I have no idea how the lists work.

11:40 <dikiy> just google linux kernel mailing list?

11:40 <apritzel> just send an email to the addresses that this Perl script outputs

11:44 <dikiy> could somebody run the script for me? I don't have a kernel tree right now

11:46 <apritzel> for just a report you can send to the sunxi maintainers, as listed here: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/MAINTAINERS#n1816

11:46 <apritzel> eventually you should make a patch, and then having a kernel tree is your smallest problem ;-)

11:47 <dikiy> I could simply modify the already existing patch :)

11:48 <apritzel> but why? You just change the code, commit that, and let "git format-patch" do the dirty work for you

11:48 <dikiy> is it ok to send an email to all three of them?

11:48 <dikiy> because the only command I know is git clone xD

11:48 <dikiy> I'm not a programmer

11:50 <apritzel> dikiy: just send an email to Chen-Yu, Jernej and Samuel, and CC: the Linux-ARM and sunxi lists mentioned in that paragraph

11:51 <dikiy> thank you!

11:52 <apritzel> regarding performance measurements: normally you should be able to just use ftrace, but those two functions are of course marked as "notrace" ;-)

11:54 <dikiy> haha :D

12:24 dikiy has quit [Ping timeout: 480 seconds]

12:44 grming has joined #linux-sunxi

12:53 JohnDoe_71Rus has quit []

12:57 dikiy_ has joined #linux-sunxi

14:19 cnxsoft1 has quit []

14:31 grming has quit [Ping timeout: 480 seconds]

14:42 grming has joined #linux-sunxi

14:43 cnxsoft has joined #linux-sunxi

14:46 warpme____ has joined #linux-sunxi

14:54 <maz> apritzel: is this still about the A64 arch timer jumping around?

14:54 <apritzel> maz: yeah, I am afraid so :-(

14:55 <maz> apritzel: I thought this one was done and dusted? is this appearing on new HW for which the workaround is not sufficient? or was the workaround broken in the first place?

14:56 <apritzel> the latter, apparently, it's still the same A64. For some odd reason there are chips out there that seem to be worse than we thought

14:57 <maz> great :-(

14:57 <apritzel> seems like not everyone reported issues, and if they did, then not through the right channels

15:01 cnxsoft has quit []

15:02 <maz> meh. seems like 6.1 is going to be interesting on the "timer workaround" front, since I broke XGene...

15:21 JohnDoe_71Rus has joined #linux-sunxi

17:07 dikiy___ has joined #linux-sunxi

17:09 dikiy_ has quit [Ping timeout: 480 seconds]

17:17 grming has quit [Quit: Konversation terminated!]

17:30 dikiy___ has quit [Read error: Connection reset by peer]

17:32 dikiy_ has joined #linux-sunxi

17:37 <palmer> maz: looks like the RISC-V timers are broken too!

17:49 dikiy_ has quit [Ping timeout: 480 seconds]

17:51 <gamiee> wait, what, really?

17:53 apritzel_ has joined #linux-sunxi

18:02 vagrantc has joined #linux-sunxi

18:03 apritzel has quit [Ping timeout: 480 seconds]

18:16 dikiy has joined #linux-sunxi

18:20 apritzel_ has quit [Ping timeout: 480 seconds]

18:23 dikiy has quit [Quit: leaving]

18:34 dikiy has joined #linux-sunxi

18:35 <dikiy> aperezdc: sorry, I've lost the connection and don't know how to recover the PM chat

18:52 <aperezdc> dikiy: sorry, but I think we never talked in private... maybe you meant to ping someone else?

19:38 apritzel_ has joined #linux-sunxi

19:56 JohnDoe_71Rus has quit []

20:00 apritzel_ has quit [Ping timeout: 480 seconds]

20:04 <karlp> pretty sure they meant ap<tab> to get to apritzel :)

20:09 <dikiy> aahaha, yeah

20:09 <dikiy> I only remember the first letters ))

20:23 <aperezdc> Happens "=)

20:36 ftg has joined #linux-sunxi

20:37 dok has quit [Ping timeout: 480 seconds]

20:45 bauen1_ has joined #linux-sunxi

20:47 bauen1 has quit [Ping timeout: 480 seconds]

21:32 apritzel_ has joined #linux-sunxi

21:41 grming has joined #linux-sunxi

21:45 <dikiy> apritzel_:

21:58 indy has quit [Ping timeout: 480 seconds]

22:01 apritzel_ has left #linux-sunxi [#linux-sunxi]

22:02 apritzel has joined #linux-sunxi

23:02 macromorgan has quit [Remote host closed the connection]

23:02 macromorgan has joined #linux-sunxi