cnxsoft has quit [Read error: Connection reset by peer]
apritzel has joined #linux-sunxi
apritzel has quit [Ping timeout: 480 seconds]
ynezz has quit [Ping timeout: 480 seconds]
apritzel has joined #linux-sunxi
cnxsoft has joined #linux-sunxi
ynezz has joined #linux-sunxi
ynezz is now known as Guest3267
Guest3267 has quit []
ynezz_ has joined #linux-sunxi
ynezz_ is now known as ynezz
ynezz has quit []
ynezz has joined #linux-sunxi
ynezz is now known as Guest3268
Guest3268 is now known as ynezz
warpme_ has joined #linux-sunxi
Asara has quit [Quit: leaving]
Asara has joined #linux-sunxi
JohnDoe_71Rus has joined #linux-sunxi
cnxsoft has quit []
fcas has joined #linux-sunxi
fcas has quit []
fcas has joined #linux-sunxi
<fcas>
I'm having some trouble with modules using H3. Linux 5.4.111, based on orange pi pc plus. Some of them are freezing (stop heartbeat, ssh/uart unresponsive) after random times (days in most cases)
<jakllsch>
emmc problems?
<fcas>
tried with both, mmc and SD Card
<fcas>
tried to run some heat cycles too (heat to 60°C, freeze with non-conductive spray) but couldn't make them freeze
<apritzel>
fcas: what modules? Linux kernel modules? If yes, which ones, and are you sure it's related to them, and not some general system stability problem?
<fcas>
the entire system gets unresponsive, I need to reset the module (I mean system on module, not a specific Linux module)
<fcas>
reset or power cycle
<apritzel>
fcas: power supply issues are a common cause for random crashes/freezes
<fcas>
is there a recommended way to post images here?
<jernej>
fcas: IMO best way to get anything useful, catch kernel messages (enable kernel messages on serial console and monitor it) and also try without any out-of-tree patches or drivers
tnovotny has quit [Quit: Leaving]
<apritzel>
fcas: what does "Linux 5.4.111, based on orange pi pc plus" mean? Some OrangePi provided *kernel*, or mainline, and just the DT based on that board?
<fcas>
mainline, DT and schematic of the module based on orange pi pc plus
<jernej>
in other words, custom board?
<fcas>
yes
<jernej>
any kernel patches or non-mainline drivers?
<fcas>
no, none. Just changed boot cmd script a little, to make it able to boot from mmc after writing to it
<fcas>
I tried with 4 power supplies (2 USB chargers, one of them able to provide 2A per output), two based on TI TPS54260 (different designs, both used in other products, one of them is used today to supply a Beaglebone board)
<fcas>
I noticed that the RTC crystal isn't oscillating, but orange pi doesn't even connect those pins to anything so I didn't pursue this subject
<jernej>
if you want to enable RTC external signal, you have to adjust DT accordingly
<jernej>
*external crystal
<fcas>
ok, ty. But it's safe to say that it shouldn't freeze the OS?
<jernej>
correct
<jernej>
just try to catch kernel oops or whatever it is on serial
<jernej>
if it is stability issue, it will most probably be something else every time
<fcas>
is there any way to make the issue more likely to happen if it's related to stability?
<fcas>
I tried to use the stress-ng package, sometimes it runs to completion (successful test) and sometimes it freezes the board
lunixoid has quit []
<DuClare>
Have you considered playing with the clock frequency fcas?
<fcas>
I tried to use the default DT clock config only
<apritzel>
fcas: you can ignore those messages, that's some leftover from old times
<jernej>
smaeul: I think we're missing some pieces for CEC wake up on H6. Message CEC_MSG_GIVE_DEVICE_POWER_STATUS (0x8f) is not a part of wakeupctrl flags. I guess there are additional flags in 0x7d32.
<apritzel>
fcas: if you feel brave, try to send a patch to RMK ;-)
<apritzel>
fcas: DRAM setup could be another issue, many times it can be worked around by lowering the DRAM frequency in U-Boot
<fcas>
that seems pretty viable to change, I will check how to do it. But I used memtester a lot before trying with stress-ng
<apritzel>
fcas: many DRAM issues in the past were found with lima-memtester, which was stress-testing the DRAM particularly (multiple DMA masters + CPU), but that requires a BSP kernel, IIRC
<apritzel>
fcas: the problem is that most of those memtesters are designed to find faulty DRAM cells, but the most common DRAM problems on Allwinner boards are timing issues
<fcas>
is there some recommended config to make it as reliable as possible?
<DuClare>
If the configs in mainline dtses don't work for you, I'd recommend you just try until you find something that works
<apritzel>
fcas: you wish ;-)
<apritzel>
fcas: one common way to work around the problem is to just lower the DRAM frequency, which relaxes some of the other timing parameters
fcas has quit [Quit: Page closed]
fcas has joined #linux-sunxi
igraltist has quit [Remote host closed the connection]
igraltist has joined #linux-sunxi
Danct12 has joined #linux-sunxi
apritzel has quit [Ping timeout: 480 seconds]
apritzel has joined #linux-sunxi
<fcas>
I'm using LPDDR3 RAM, are these settings needed in U-Boot: SUNXI_DRAM_LPDDR3 and SUNXI_DRAM_LPDDR3_STOCK? They are set to N
<fcas>
and DRAM_CLK is set to 624, my DDR lists JEDEC 1600 (at the same time, it clearly states that it should be backwards compatible with slower speeds)
<jernej>
fcas: it's the SoC that's limited, not the DRAM chip
warpme_ has quit [Quit: Connection closed for inactivity]
apritzel has quit [Ping timeout: 480 seconds]
apritzel has joined #linux-sunxi
qCactus has quit [Ping timeout: 480 seconds]
igraltist has quit [Remote host closed the connection]
igraltist has joined #linux-sunxi
igraltist has quit []
apritzel has quit [Ping timeout: 480 seconds]
apritzel has joined #linux-sunxi
igraltist has joined #linux-sunxi
JohnDoe_71Rus has quit []
jelly-hme is now known as jelly
<apritzel>
fcas: those settings (624 MHz, JEDEC 1600) assume that everything is configured correctly, which is probably not the case
<apritzel>
and that's not a problem of your board, it applies to basically every board, with our current DRAM setup in U-Boot
<apritzel>
we simply don't know enough about the DRAM controller to tune it up to the spec limits
<apritzel>
in the past I have heard reports of people running at 800 MHz (on A64), but this was with specific (magic) settings and proper board design
<apritzel>
fcas: so you should ignore those theoretical values, and start to tune down CONFIG_DRAM_CLK a notch, to see if stability improves
<apritzel>
which is not easy if it's hard to reproduce, but nobody said that board design would be a walk in the park ;-)
<fcas>
I just used 480 instead of 624
<fcas>
let's see how it works, after that I can ask here if we should try to increase the speed and redo all testing and settings
<apritzel>
yeah, 480 sounds good as a start. If you can find a quicker reproducer, that would of course be helpful
<apritzel>
IIUC the trick with lima-memtester was that it used 3D graphics (DMA from texture memory to the GPU, then into the framebuffer, plus the DE DMAing from the framebuffer to the video transmitters), plus:
<apritzel>
CPU -> DRAM traffic, so the DRAM controller is put under heavy stress
<apritzel>
you could try to replicate this, when you can run some 3D benchmark alongside some other memory stress test
<anarsoul>
yeah, GPU is pretty sensitive to DRAM stability
<anarsoul>
for utgard case it writes tile heap and varyings into memory and then reads them back
<anarsoul>
"glmark2-drm -b refract" should give you decent load for DRAM controller
<apritzel>
anarsoul: nice, thanks for the hint
<anarsoul>
looks like original memtester just used textured cube
<anarsoul>
so I guess kmscube would also do? :) but "glmark2-drm -b refract" should definitely generate more memory traffic
<apritzel>
anarsoul: do you have such "worst-case" scenarios for Midgard and Bifrost as well?
<jernej>
speaking of stability issues - I observed some on OrangePi 3 with lowest frequency when switching to higher one
<jernej>
slightly increasing lowest setting helped another user but not me
<anarsoul>
apritzel: sorry, no
<jernej>
*lowest voltage
<jernej>
another user reported random crash on PineH64, which also looks like stability issue
<jernej>
I suspect voltage is set too low for some frequencies and bins
<apritzel>
jernej: this is CPU OPPs you are talking about?
<jernej>
apritzel: yes
<apritzel>
random and occasional issues, or reproducible?
<jernej>
well, I can relatively easily reproduce issue on my OPi3s (I have two), except sometimes, when it appears stable
<jernej>
by easy I mean no hard work, just starting and stopping video playback
<jernej>
sometimes this is after two or three videos, sometimes 20
<jernej>
and it doesn't crash in same place
<jernej>
but often when doing some memory operation
fcas has quit [Quit: Page closed]
Daanct12 has joined #linux-sunxi
<jernej>
one user with a similar issue claimed that using the OPi Lite2 image on OPi 3 gave him a stable system (minus some HW)
<jernej>
after comparing and testing, it turns out that higher minimum CPU voltage was the reason for better stability on his board
<anarsoul>
jernej: video playback with hw decoding?
<jernej>
yes, CPU at that time usually goes to lowest frequency (or close)
<jernej>
if there is no overlay
Daanct12 has quit [Quit: Quitting]
Danct12 has quit [Ping timeout: 480 seconds]
Danct12 has joined #linux-sunxi
Danct12 has quit [Remote host closed the connection]