#linux-sunxi on 2021-10-18 — irc logs at oftc.irclog.whitequark.org

2021-07-26 22:56 ChanServ changed the topic of #linux-sunxi to: Allwinner/sunxi development - Did you try looking at our wiki? https://linux-sunxi.org - Don't ask to ask. Just ask and wait for an answer! - This channel is logged at https://oftc.irclog.whitequark.org/linux-sunxi

01:01 apritzel has quit [Ping timeout: 480 seconds]

02:10 cnxsoft has joined #linux-sunxi

02:17 macromorgan is now known as Guest3245

02:17 Guest3245 has quit [Read error: Connection reset by peer]

02:17 macromorgan has joined #linux-sunxi

02:56 swiftgeek has joined #linux-sunxi

03:22 chewitt has joined #linux-sunxi

03:32 Danct12 has quit [Quit: Quitting]

04:25 sh1 has joined #linux-sunxi

06:22 chewitt has quit [Quit: Zzz..]

06:28 apritzel has joined #linux-sunxi

06:50 lunixoid has joined #linux-sunxi

07:18 jernej has quit [Quit: Free ZNC ~ Powered by LunarBNC: https://LunarBNC.net]

07:19 jernej has joined #linux-sunxi

07:31 tnovotny has joined #linux-sunxi

07:44 apritzel has quit [Ping timeout: 480 seconds]

08:02 cnxsoft has quit [Read error: Connection reset by peer]

08:08 apritzel has joined #linux-sunxi

08:23 apritzel has quit [Ping timeout: 480 seconds]

08:57 ynezz has quit [Ping timeout: 480 seconds]

09:00 apritzel has joined #linux-sunxi

09:25 cnxsoft has joined #linux-sunxi

09:31 ynezz has joined #linux-sunxi

09:31 ynezz is now known as Guest3267

09:32 Guest3267 has quit []

09:32 ynezz_ has joined #linux-sunxi

09:36 ynezz_ is now known as ynezz

09:36 ynezz has quit []

09:38 ynezz has joined #linux-sunxi

09:39 ynezz is now known as Guest3268

09:40 Guest3268 is now known as ynezz

10:45 warpme_ has joined #linux-sunxi

13:06 Asara has quit [Quit: leaving]

13:06 Asara has joined #linux-sunxi

14:44 JohnDoe_71Rus has joined #linux-sunxi

15:00 cnxsoft has quit []

15:19 fcas has joined #linux-sunxi

15:25 fcas has quit []

15:26 fcas has joined #linux-sunxi

15:30 <fcas> I'm having some trouble with modules using H3. Linux 5.4.111, based on orange pi pc plus. Some of them are freezing (stop heartbeat, ssh/uart unresponsive) after random times (days in most cases)

15:31 <jakllsch> emmc problems?

15:31 <fcas> tried with both, mmc and SD Card

15:33 <fcas> tried to run some heat cycles too (heat to 60°C, freeze with non-conductive spray) but couldn't make them freeze

15:33 <apritzel> fcas: what modules? Linux kernel modules? If yes, which ones, and are you sure it's related to them, and not some general system stability problem?

15:35 <fcas> the entire system gets unresponsive, I need to reset the module (I mean system on module, not a specific linux module)

15:35 <fcas> reset or power cycle

15:37 <apritzel> fcas: power supply issues are a common cause for random crashes/freezes

15:38 <fcas> is there a recommended way to post images here?

15:38 <jernej> fcas: IMO best way to get anything useful, catch kernel messages (enable kernel messages on serial console and monitor it) and also try without any out-of-tree patches or drivers

15:38 tnovotny has quit [Quit: Leaving]

15:39 <apritzel> fcas: what does "Linux 5.4.111, based on orange pi pc plus" mean? Some OrangePi provided *kernel*, or mainline, and just the DT based on that board?

15:40 <fcas> mainline; the DT and the schematic of the module are based on orange pi pc plus

15:41 <jernej> in other words, custom board?

15:41 <fcas> yes

15:42 <jernej> any kernel patches or non-mainline drivers?

15:43 <fcas> no, none. Just changed boot cmd script a little, to make it able to boot from mmc after writing to it

15:49 <fcas> I tried with 4 power supplies (2 usb chargers, one of them able to provide 2A per output), two based on TI TPS54260 (different designs, both used in other products, one of them is used today to supply a Beaglebone board)

15:51 <fcas> I noticed that the RTC crystal isn't oscillating, but orange pi doesn't even connect those pins to anything so I didn't pursue this subject

15:55 <jernej> if you want to enable RTC external signal, you have to adjust DT accordingly

15:55 <jernej> *external crystal

15:56 <fcas> ok, ty. But it's safe to say that it shouldn't freeze the OS?

15:56 <jernej> correct

15:58 <jernej> just try to catch kernel oops or whatever it is on serial

15:58 <jernej> if it is stability issue, it will most probably be something else every time
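
The advice above (capture whatever the kernel prints on the serial console and compare it across several freezes) can be sketched in Python. This is only a minimal sketch, not something posted in the channel: the function names `timestamp_lines` and `log_console` are hypothetical, and it assumes the serial adapter has already been set to the console's baud rate and raw mode by other means before the script opens it. Prefixing each line with the host time makes it easier to see whether consecutive crashes happen in different places.

```python
import datetime


def timestamp_lines(stream, now=datetime.datetime.now):
    """Yield each console line prefixed with the host time it was received.

    `stream` is any iterable of bytes lines, e.g. a serial device opened
    in binary mode. Undecodable bytes are replaced rather than dropped,
    since a crashing board may emit garbage.
    """
    for raw in stream:
        text = raw.decode("utf-8", errors="replace").rstrip("\r\n")
        yield f"{now().isoformat(timespec='seconds')} {text}"


def log_console(port_path, log_path):
    """Append timestamped console output from port_path to log_path.

    Assumption: the port is already configured (baud rate, raw mode);
    this function only reads from it.
    """
    with open(port_path, "rb", buffering=0) as port, \
            open(log_path, "a", encoding="utf-8") as log:
        for line in timestamp_lines(port):
            log.write(line + "\n")
            log.flush()  # keep the log current in case the host is interrupted
```

Calling `log_console("/dev/ttyUSB0", "console.log")` would then run until the port is closed; the device path here is an assumed example, not one given in the log.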

15:59 <fcas> is there any way to make the issue more likely to happen if it's related to stability?

16:00 <fcas> I tried to use stress-ng package, sometimes it runs to completion (successful test) and sometimes it freezes the board

16:01 lunixoid has quit []

16:02 <DuClare> Have you considered playing with the clock frequency fcas?

16:03 <fcas> I tried to use the default DT clock config only

16:03 <fcas> but I can see this in dmesg:

16:03 <fcas> Sep 20 10:43:59 flexb-v1-0-0-sds16 kernel: /cpus/cpu@0 missing clock-frequency property Sep 20 10:43:59 flexb-v1-0-0-sds16 kernel: /cpus/cpu@1 missing clock-frequency property Sep 20 10:43:59 flexb-v1-0-0-sds16 kernel: /cpus/cpu@2 missing clock-frequency property Sep 20 10:43:59 flexb-v1-0-0-sds16 kernel: /cpus/cpu@3 missing clock-frequency property

16:04 <fcas> but it's the default DT on that config

16:05 <apritzel> fcas: you can ignore those messages, that's some leftover from old times

16:05 <jernej> smaeul: I think we're missing some pieces for CEC wake up on H6. Message CEC_MSG_GIVE_DEVICE_POWER_STATUS (0x8f) is not a part of wakeupctrl flags. I guess there are additional flags in 0x7d32.

16:06 <apritzel> fcas: if you feel brave, try to send a patch to RMK ;-)

16:07 <apritzel> fcas: DRAM setup could be another issue, many times it can be worked around by lowering the DRAM frequency in U-Boot

16:09 <fcas> that seems pretty viable to change, I will check how to do it. But I used memtester a lot before trying with stress-ng

16:12 <apritzel> fcas: many DRAM issues in the past were found with lima-memtester, which was stress-testing the DRAM particularly (multiple DMA masters + CPU), but that requires a BSP kernel, IIRC

16:13 <fcas> I used the one provided here: https://layers.openembedded.org/layerindex/recipe/123434/

16:14 <fcas> and this version of stress-ng: https://layers.openembedded.org/layerindex/recipe/121674/

16:19 <apritzel> fcas: the problem is that most of those memtesters are designed to find faulty DRAM cells, but the most common DRAM problems on Allwinner boards are timing issues

16:20 <fcas> is there some recommended config to make it as reliable as possible?

16:21 <DuClare> If the configs in mainline dtses don't work for you, I'd recommend you just try until you find something that works

16:21 <apritzel> fcas: you wish ;-)

16:22 <apritzel> fcas: one common way to work around the problem is to just lower the DRAM frequency, which relaxes some of the other timing parameters

16:25 fcas has quit [Quit: Page closed]

16:50 fcas has joined #linux-sunxi

17:08 igraltist has quit [Remote host closed the connection]

17:09 igraltist has joined #linux-sunxi

17:14 Danct12 has joined #linux-sunxi

17:17 apritzel has quit [Ping timeout: 480 seconds]

17:39 apritzel has joined #linux-sunxi

17:44 <fcas> I'm using an LPDDR3 RAM, are those settings needed in U-Boot: SUNXI_DRAM_LPDDR3 SUNXI_DRAM_LPDDR3_STOCK? they are set as N

17:45 <fcas> and DRAM_CLK is set to 624, my DDR lists JEDEC 1600 (at the same time, it clearly states that it should be backwards compatible with slower speeds)

17:47 <jernej> fcas: it's the SoC that's limited, not the DRAM chip

17:48 warpme_ has quit [Quit: Connection closed for inactivity]

17:48 apritzel has quit [Ping timeout: 480 seconds]

18:28 apritzel has joined #linux-sunxi

18:40 qCactus has quit [Ping timeout: 480 seconds]

18:41 igraltist has quit [Remote host closed the connection]

18:42 igraltist has joined #linux-sunxi

18:55 igraltist has quit []

18:56 apritzel has quit [Ping timeout: 480 seconds]

19:01 apritzel has joined #linux-sunxi

19:02 igraltist has joined #linux-sunxi

19:06 JohnDoe_71Rus has quit []

19:11 jelly-hme is now known as jelly

19:11 <apritzel> fcas: those settings (624 MHz, JEDEC 1600) assume that everything is configured correctly, which is probably not the case

19:12 <apritzel> and that's not a problem of your board, it applies to basically every board, with our current DRAM setup in U-Boot

19:13 <apritzel> we simply don't know enough about the DRAM controller to tune it up to the spec limits

19:14 <apritzel> in the past I have heard reports of people running at 800 MHz (on A64), but this was with specific (magic) settings and proper board design

19:16 <apritzel> fcas: so you should ignore those theoretical values, and start to tune down CONFIG_DRAM_CLK a notch, to see if stability improves

19:16 <apritzel> which is not easy if it's hard to reproduce, but nobody said that board design would be a walk in the park ;-)

19:20 <fcas> I just used 480 instead of 624

19:21 <fcas> let's see how it works, after that I can ask here if we should try to increase the speed and redo all testing and settings

19:22 <apritzel> yeah, 480 sounds good as a start. If you can find a quicker reproducer, that would of course be helpful
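
The change discussed above, lowering CONFIG_DRAM_CLK from 624 to 480 in the U-Boot configuration, could be scripted roughly as below when several values need to be tried in turn. This is a sketch under assumptions: the helper name `set_dram_clk` is hypothetical, and it only edits defconfig-style text of the form `CONFIG_DRAM_CLK=<value>`; rebuilding and reflashing U-Boot is not shown.

```python
import re


def set_dram_clk(defconfig_text, mhz):
    """Return defconfig_text with CONFIG_DRAM_CLK set to mhz.

    An existing CONFIG_DRAM_CLK line is replaced; if there is none,
    the setting is appended at the end.
    """
    line = f"CONFIG_DRAM_CLK={mhz}"
    pattern = re.compile(r"^CONFIG_DRAM_CLK=.*$", re.MULTILINE)
    if pattern.search(defconfig_text):
        return pattern.sub(line, defconfig_text)
    if defconfig_text and not defconfig_text.endswith("\n"):
        defconfig_text += "\n"
    return defconfig_text + line + "\n"
```

Stepping the value down one notch at a time and re-running the same stress test after each change keeps the comparison between settings fair.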

19:24 <apritzel> IIUC the trick with lima-memtester was that it used 3D graphics (DMA from texture memory to the GPU, then into the framebuffer, plus the DE DMAing from the framebuffer to the video transmitters), plus:

19:25 <apritzel> CPU -> DRAM traffic, so the DRAM controller is put under heavy stress

19:25 <apritzel> you could try to replicate this, when you can run some 3D benchmark alongside some other memory stress test

19:26 <anarsoul> yeah, GPU is pretty sensitive to DRAM stability

19:27 <anarsoul> for utgard case it writes tile heap and varyings into memory and then reads them back

19:27 <anarsoul> "glmark2-drm -b refract" should give you decent load for DRAM controller
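
The idea above, running a 3D benchmark alongside another memory stress test so that GPU and CPU hit the DRAM controller at the same time, might be sketched as follows. Only `glmark2-drm -b refract` comes from the discussion; the `--run-forever` option, the stress-ng worker count, memory size and timeout, and the function name `run_together` are illustrative assumptions to be adjusted for the board.

```python
import subprocess
import time

# GPU side: the benchmark suggested in the channel, looped (assumed option).
GPU_LOAD = ["glmark2-drm", "-b", "refract", "--run-forever"]
# CPU side: stress-ng virtual-memory workers; the values are only examples.
CPU_LOAD = ["stress-ng", "--vm", "2", "--vm-bytes", "128M", "--timeout", "1h"]


def run_together(commands, duration_s):
    """Start all commands at once, stop them after duration_s seconds.

    Returns the exit code of each command, in order. The wait ends
    early if every command has already exited on its own.
    """
    procs = [subprocess.Popen(cmd) for cmd in commands]
    try:
        deadline = time.monotonic() + duration_s
        while time.monotonic() < deadline and any(p.poll() is None for p in procs):
            time.sleep(0.1)
    finally:
        for p in procs:
            if p.poll() is None:
                p.terminate()  # stop whatever is still running
        codes = [p.wait() for p in procs]
    return codes
```

On the target, `run_together([GPU_LOAD, CPU_LOAD], duration_s=3600)` would keep both loads running for an hour; if the board freezes, the script itself stops too, so the serial console remains the place to look for the last kernel messages.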

19:28 <apritzel> anarsoul: nice, thanks for the hint

19:36 <anarsoul> looks like original memtester just used textured cube

19:37 <anarsoul> so I guess kmscube would also do? :) but "glmark2-drm -b refract" should definitely generate more memory traffic

19:42 <apritzel> anarsoul: do you have such "worst-case" scenarios for Midgard and Bifrost as well?

19:52 <jernej> speaking of stability issues - I observed some on OrangePi 3 with lowest frequency when switching to higher one

19:53 <jernej> slightly increasing lowest setting helped another user but not me

19:53 <anarsoul> apritzel: sorry, no

19:53 <jernej> *lowest voltage

19:53 <jernej> another user reported random crash on PineH64, which also looks like stability issue

19:54 <jernej> I suspect voltage is set too low for some frequencies and bins

20:12 <apritzel> jernej: this is CPU OPPs you are talking about?

20:15 <jernej> apritzel: yes

20:16 <apritzel> random and occasional issues, or reproducible?

20:18 <jernej> well, I can relatively easily reproduce issue on my OPi3s (I have two), except sometimes, when it appears stable

20:20 <jernej> by easy I mean no hard work, just starting and stopping video playback

20:20 <jernej> sometimes this is after two or three videos, sometimes 20

20:21 <jernej> and it doesn't crash in same place

20:21 <jernej> but often when doing some memory operation

20:24 fcas has quit [Quit: Page closed]

20:24 Daanct12 has joined #linux-sunxi

20:25 <jernej> one user with a similar issue claimed that using the OPi Lite2 image on OPi 3 gave him a stable system (minus some HW)

20:26 <jernej> after comparing and testing, it turns out that higher minimum CPU voltage was the reason for better stability on his board

20:28 <anarsoul> jernej: video playback with hw decoding?

20:29 <jernej> yes, CPU at that time usually goes to lowest frequency (or close)

20:30 <jernej> if there is no overlay

20:30 Daanct12 has quit [Quit: Quitting]

20:30 Danct12 has quit [Ping timeout: 480 seconds]

21:16 Danct12 has joined #linux-sunxi

21:18 Danct12 has quit [Remote host closed the connection]

21:18 Danct12 has joined #linux-sunxi

22:34 diego71 has quit [Ping timeout: 480 seconds]