<cyrozap>
apritzel: I PM'd you a link to the file. I'll try with the one you posted now.
aggi has quit [Ping timeout: 480 seconds]
<cyrozap>
The SPL you posted didn't work for me--same error.
aggi has joined #linux-sunxi
<apritzel>
cyrozap: interesting, I dug out my OPi Zero2, and both your version and mine indeed failed ...
<machinehum>
apritzel: Amazing
<machinehum>
Boss level
<cyrozap>
apritzel: I just sent you the SPL I built a year ago that _does_ work with FEL. Maybe you could try that on your Zero2, too, and see if that works?
<cyrozap>
Just to confirm that your Zero2 isn't broken.
<apritzel>
cyrozap: but the respective version (GCC 12.1.0) worked on my TV box - the only difference should be some slightly different DRAM setup parameters
<apritzel>
cyrozap: also: GCC 11.2.0 failed as well, but GCC 10.3.0 worked
<apritzel>
cyrozap: and your version as well
<cyrozap>
Well, I'm glad you can reproduce the issue now, and it's not just me, haha :)
<apritzel>
cyrozap: thanks for helping to narrow this down, I will have a deeper look tomorrow, and probably pick the brains of our compiler guys
<apritzel>
cyrozap: might be some undefined behaviour in U-Boot, newer compilers tend to exploit it more heavily
<cyrozap>
And thank you for looking into this so diligently! I'm going to re-build the old version of the SPL with GCC 12.1.0 again, and then pop both the working SPL and the broken SPL into Ghidra and see if there are any major differences around the "Return to FEL" code.
<apritzel>
cyrozap: my hunch is that the problem is in the DRAM code, because the slightly different code between the Mate X96 and the OPiZero2 seems to make a difference
juri_ has quit [Remote host closed the connection]
evgeny_boger has joined #linux-sunxi
juri_ has joined #linux-sunxi
apritzel has joined #linux-sunxi
evgeny_boger has quit [Ping timeout: 480 seconds]
pg12_ has joined #linux-sunxi
pg12 has quit [Ping timeout: 480 seconds]
evgeny_boger has joined #linux-sunxi
ftg has joined #linux-sunxi
<cyrozap>
apritzel: I think I might have found a possible source of the problem, but I may also have simply misunderstood the Allwinner boot process.
JohnDoe_71Rus has quit []
<cyrozap>
I used the command `sunxi-fel spl u-boot/spl/sunxi-spl.bin` to load the working U-Boot SPL (built last year) and return to FEL mode. Then, I used `sunxi-fel dump 0x20000 0xa000` to dump what looks mostly like the SPL, but with several differences: 1) The eGON header is slightly modified (I assume to tell the boot ROM to enter FEL mode). 2) Bytes 0x21000 to 0x22000 are filled with (apparently random) data that differs significantly from the SPL that was originally uploaded.
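The comparison described above (uploaded SPL versus the SRAM dump taken from 0x20000) can be sketched in a few lines of Python. This is only an illustration, not the tool cyrozap used: the function name `diff_ranges`, the merge threshold `gap`, and the synthetic test data are all assumptions made for the example.

```python
def diff_ranges(original, dumped, base=0x20000, gap=16):
    """Return (start, end) address ranges where the two images differ.

    Differing bytes separated by fewer than `gap` matching bytes are merged
    into one range. Addresses are offset by `base`; `end` is exclusive.
    Only the overlapping length of the two images is compared.
    """
    length = min(len(original), len(dumped))
    ranges = []
    start = last = None
    for i in range(length):
        if original[i] != dumped[i]:
            if start is None:
                start = i
            elif i - last > gap:
                # Too far from the previous difference: close that range.
                ranges.append((base + start, base + last + 1))
                start = i
            last = i
    if start is not None:
        ranges.append((base + start, base + last + 1))
    return ranges


# Synthetic stand-ins for the real files (hypothetical data, not a real SPL):
original = bytes(0xA000)
dumped = bytearray(original)
dumped[0x28] = 0x01                       # one modified header byte
dumped[0x1000:0x2000] = b"\xff" * 0x1000  # the overwritten 4 KiB block

for start, end in diff_ranges(original, dumped):
    print(f"{start:#x}-{end:#x}")
```

With real data, `original` and `dumped` would be the bytes read from the SPL binary and from the saved dump.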
<apritzel>
cyrozap: yes, the BROM modifies the SPL header to record the boot source in it, plus the SPL writes to it as well to communicate something to U-Boot proper, like the detected DRAM size
<apritzel>
and then, depending on the SoC type, we use part of the SRAM for stack and other data
<apritzel>
IIRC for the H616 we use the end of SRAM C for that, so below 0x58000
<cyrozap>
Oh, I see now--on H616, 0x21000 is used as a swap buffer: `{ .buf1 = 0x21000, .buf2 = 0x52a00, .size = 0x1000 }`
<apritzel>
plus the BROM uses parts of SRAM A1 for its stack
<apritzel>
yeah, this is the BROM stack, which we have to preserve when we later plan to re-enter FEL
<apritzel>
so sunxi-fel backs up this part, into SRAM C, and when we jump back into FEL at the end of the SPL, we restore this, then return to the BROM
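The backup/restore dance described here can be modelled as two in-place swaps of the `{ .buf1, .buf2, .size }` region quoted above. The following is a simplified simulation, not the sunxi-fel implementation: the `SwapBuffer` class, the `swap` and `read` helpers, and the fill patterns are hypothetical, and it assumes the part of the SPL that belongs at `buf1` is parked at `buf2` until the swap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SwapBuffer:
    buf1: int  # region holding the BROM stack (must survive for FEL re-entry)
    buf2: int  # scratch region in SRAM C
    size: int


# Values quoted in the discussion for the H616.
H616_SWAP = SwapBuffer(buf1=0x21000, buf2=0x52A00, size=0x1000)
BASE = 0x20000  # address of sram[0] in this model


def swap(mem, sb, base=BASE):
    """Exchange the contents of buf1 and buf2 in place."""
    a, b = sb.buf1 - base, sb.buf2 - base
    mem[a:a + sb.size], mem[b:b + sb.size] = mem[b:b + sb.size], mem[a:a + sb.size]


def read(mem, addr, size, base=BASE):
    return bytes(mem[addr - base:addr - base + size])


sram = bytearray(0x40000)                  # models 0x20000..0x5ffff
stack = b"\xAA" * H616_SWAP.size           # stand-in for the BROM stack
spl_chunk = b"\x55" * H616_SWAP.size       # stand-in for SPL bytes meant for buf1

sram[0x1000:0x2000] = stack                # BROM stack currently at buf1
sram[0x32A00:0x33A00] = spl_chunk          # SPL part parked at buf2

swap(sram, H616_SWAP)                      # before running the SPL
assert read(sram, H616_SWAP.buf1, H616_SWAP.size) == spl_chunk

swap(sram, H616_SWAP)                      # before returning to the BROM
assert read(sram, H616_SWAP.buf1, H616_SWAP.size) == stack
```

In this model, a dump taken after returning to FEL shows the stack contents at 0x21000 rather than SPL code, which matches the "apparently random" block observed earlier.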
<cyrozap>
Wow, what a nightmare. Makes me glad that MediaTek provides plenty of SRAM on their chips (between 256kB and 512kB on the devices I own) dedicated to the SPL ("preloader", in their terms), with a separate 64kB SRAM for the mask ROM's data and stack.
disctanger has joined #linux-sunxi
<cyrozap>
Well, anyways, I guess that pokes a hole in my "return-to-FEL code is getting corrupted" hypothesis.
<cyrozap>
Oh, and for the sake of completeness, comparing the old U-Boot SPL to the recently-built one didn't turn up any significant results.
<cyrozap>
As far as I could tell, none of the code that might affect the "return to FEL" code path was changed in a way that changed any of the logic.
<cyrozap>
That's why I thought maybe the offsets were what were causing the problems--while the code was mostly the same, a bunch of it got moved around inside the binary in the newly-built SPL.
disctanger has quit [Remote host closed the connection]
JohnDoe_71Rus has joined #linux-sunxi
<apritzel>
cyrozap: thanks for checking that, and bummer that it didn't show something easily
<apritzel>
cyrozap: can you check whether the address of fel_stash_addr moved?
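One way to answer this kind of question is to compare symbol tables from the two builds, for example from the three-column text that `nm` prints for an ELF file (address, type, name). The sketch below assumes that format; the helper names and the sample addresses are invented for illustration, and `fel_stash_addr` is simply the name used in the question above.

```python
def parse_nm(text):
    """Parse `nm`-style lines ("address type name") into {name: address}."""
    symbols = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:  # undefined symbols have no address column; skip them
            addr, _kind, name = parts
            symbols[name] = int(addr, 16)
    return symbols


def moved_symbols(old, new):
    """Return {name: (old_addr, new_addr)} for symbols present in both builds
    whose address changed."""
    return {
        name: (old[name], new[name])
        for name in old.keys() & new.keys()
        if old[name] != new[name]
    }


# Made-up sample listings, not taken from a real build:
old_nm = """\
0000000000020060 T _start
00000000000277c0 B fel_stash_addr
                 U some_undefined_symbol
"""
new_nm = """\
0000000000020060 T _start
0000000000027a40 B fel_stash_addr
"""

for name, (before, after) in moved_symbols(parse_nm(old_nm), parse_nm(new_nm)).items():
    print(f"{name}: {before:#x} -> {after:#x}")
```

With real builds, the two listings would come from running `nm` on each SPL ELF image (for instance `spl/u-boot-spl` in the U-Boot build tree).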
cnxsoft has quit []
evgeny_boger has quit [Ping timeout: 480 seconds]
vpeter has quit [Remote host closed the connection]
bauen1_ has quit [Ping timeout: 480 seconds]
vpeter has joined #linux-sunxi
vpeter has quit [Read error: Connection reset by peer]
vagrantc has joined #linux-sunxi
vpeter has joined #linux-sunxi
hentai has quit [Read error: Connection reset by peer]