JohnnyonFlame has quit [Ping timeout: 480 seconds]
lynxeye has joined #etnaviv
mwalle has joined #etnaviv
<mwalle>
Hi, I'm trying to enable Etnaviv support on the NXP LS1028A SoC (which contains a GC7000 core similar to the i.MX8, as far as I know). It also contains a mali-dp display controller. Thanks to the nice guys in #linux-dri I was already able to add kmsro support for mali-dp
pcercuei has joined #etnaviv
<mwalle>
If I start glmark, I'm getting a black screen and GPU hang / trying-to-recover messages
<mwalle>
I don't see any interrupts either, I presume there should be some, no?
<lynxeye>
mwalle: First thing: get the exact GPU model/rev from etnaviv, then dig into the Vivante kernel driver hwdb to get the feature bits of that core and add it to the etnaviv hwdb.
<mwalle>
oh, I'm running the latest linux-next and mesa-21.0.3 (the latter just because that was on my current buildroot setup)
<mwalle>
lynxeye: ok let me have a look
* austriancoder
hopes that the kernel-based hwdb does not get extended and that we switch over to a userspace-based one in Mesa
<austriancoder>
lynxeye: btw, what happened to the kernel patches - nothing has landed in 5.13 yet
<lynxeye>
austriancoder: We've had like 3 patches staged the last time around, so it dropped off my prio list and I didn't send a pull-req in time for 5.13. :/ Should all land in 5.14.
<lynxeye>
mwalle: Not the etnaviv driver, the Vivante kernel driver.
<lynxeye>
Basically you need to build an etnaviv hwdb entry from the values in the Vivante driver.
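For context, the etnaviv hwdb lives in drivers/gpu/drm/etnaviv/etnaviv_hwdb.c and consists of struct etnaviv_chip_identity initializers. A minimal sketch of what such an entry looks like is shown below; every numeric value here is a placeholder, not real GC7000/LS1028A data, and the full struct has more minor_featuresN words than are shown.

    /*
     * Sketch of an etnaviv hwdb entry as it would be added to the
     * etnaviv_chip_identities[] table in etnaviv_hwdb.c. All values are
     * placeholders; the real numbers have to be taken from the Vivante
     * kernel driver's feature database for this core.
     */
    {
            .model = 0x7000,          /* GC7000 */
            .revision = 0x6202,       /* placeholder revision */
            .product_id = ~0U,        /* ~0U is used as a "don't care" match in existing entries */
            .customer_id = ~0U,
            .eco_id = ~0U,
            .stream_count = 16,
            .register_max = 64,
            .thread_count = 1024,
            .shader_core_count = 4,
            .vertex_cache_size = 16,
            .vertex_output_buffer_size = 1024,
            .pixel_pipes = 1,
            .instruction_count = 512,
            .num_constants = 320,
            .buffer_size = 0,
            .varyings_count = 16,
            .features = 0x00000000,         /* packed feature words ... */
            .minor_features0 = 0x00000000,  /* ... filled from the Vivante hwdb */
            /* ... minor_features1 through minor_features11 follow ... */
    },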
<austriancoder>
lynxeye: really? I am not happy with this - might it be possible to move etnaviv to drm-misc and/or add me as maintainer?
<lynxeye>
mwalle: If we already had the entry in etnaviv you wouldn't run into this again.
<mwalle>
lynxeye: ah :)
<mwalle>
but there are also "some" values in debugfs
<lynxeye>
austriancoder: While it's unfortunate, the changes are all in linux-next and I don't see any userspace changes depending on this, yet.
<lynxeye>
mwalle: The values in debugfs are what is read from the hardware, which is basically all lies since Vivante started to solely rely on the hwdb.
<austriancoder>
lynxeye: I tried to send in the patches in time to get into the next release (5.13 at the time). Also, even though there are no public userspace bits yet, the changes didn't introduce any new ioctls
<austriancoder>
lynxeye: okay I have to live with it ..
<lynxeye>
austriancoder: I can only apologize for missing the merge and will try to be more consistent there in the future. Still, everything you do in userspace needs to cope with those new params not being there, as we need to work with older kernels.
<austriancoder>
lynxeye: I am aware of that fact
<mwalle>
is there any script or documentation on how to convert that long per-bit feature list from the Vivante driver to the etnaviv 32-bit feature values?
<lynxeye>
mwalle: One of my coworkers started something, currently very specific to one GPU. I would be happy to see this cleaned up in the github etna_viv repo.
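The general idea behind such a conversion, roughly: the Vivante kernel driver's feature database stores one boolean per named feature, while the etnaviv hwdb wants those features packed into 32-bit words matching the GPU's feature registers. A standalone sketch of that mapping step follows; the feature names and the (word, bit) layout in the table are purely illustrative, not the real layout, which would have to come from the register database.

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    struct feature_bit {
            const char *name;   /* feature name as used in the Vivante feature database */
            unsigned int word;  /* 0 = .features, 1 = .minor_features0, 2 = .minor_features1, ... */
            unsigned int bit;   /* bit position inside that 32-bit word */
    };

    /* Illustrative mapping only, not the real bit layout. */
    static const struct feature_bit map[] = {
            { "REG_FastClear", 0, 0 },
            { "REG_Pipe3D",    0, 10 },
            { "REG_MMU",       2, 1 },
    };

    int main(void)
    {
            /* Features the Vivante database reports as enabled for the core (example input). */
            const char *enabled[] = { "REG_FastClear", "REG_Pipe3D", "REG_MMU" };
            uint32_t words[13] = { 0 };
            size_t i, j;

            for (i = 0; i < sizeof(enabled) / sizeof(enabled[0]); i++)
                    for (j = 0; j < sizeof(map) / sizeof(map[0]); j++)
                            if (strcmp(enabled[i], map[j].name) == 0)
                                    words[map[j].word] |= 1u << map[j].bit;

            printf(".features = 0x%08x,\n", words[0]);
            for (i = 1; i < 13; i++)
                    printf(".minor_features%zu = 0x%08x,\n", i - 1, words[i]);

            return 0;
    }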
<mwalle>
do I need the phys_baseaddr and the contiguous_mem properties from the vivante kernel driver?
<lynxeye>
mwalle: Nope, etnaviv uses the system CMA region
<lynxeye>
mwalle: Please take a look at the Mesa MR. If one of the PE pipe addresses isn't programmed correctly, you'll get GPU MMU faults with address 0
JohnnyonFlame has quit [Ping timeout: 480 seconds]
<mwalle>
lynxeye: I've updated to the latest Mesa main branch tip with your 5 patches on top. There are some changes: I still get the MMU fault at addr 0, but I also see something happening on the screen (well, mostly black, but there is definitely some pattern recognizable, which also seems to change with different glmark tests)
<mwalle>
I've also set ETNA_MESA_DEBUG=no_supertile
<lynxeye>
mwalle: Does it change more if you also set no_ts?
<marex>
lynxeye: so why don't you and austriancoder co-maintain the kernel driver anyway ?
<mwalle>
and in fact I also see something on the screen :)
<lynxeye>
mwalle: \o/
<mwalle>
lynxeye: is there a way to debug why there is that first fault? that looks like a valid address though
<lynxeye>
marex: Because one person doing it was working okay, at least as long as there was a big enough number of patches that I didn't forget about the staged stuff...
<lynxeye>
marex: Shared maintenance of a tree requires more coordination and in fact I almost never commit to drm-misc even though I have commit rights, due to the tooling overhead.
<marex>
lynxeye: seems austriancoder would like to help, so maybe give the co-maintainership model a try ?
<marex>
lynxeye: I saw it working pretty well in various projects
<lynxeye>
Wouldn't a "hey I haven't seen a kernel PR yet" ping at -rc6 time be easier for starters? The etnaviv list is always copied on all that kernel stuff after all...
<marex>
lynxeye: if you were down due to some infectious disease, maybe not ?
dos has joined #etnaviv
<lynxeye>
*shrug* TBH I think there are a lot of bigger fires right now.
<lynxeye>
mwalle: Does this hang always happen after starting glmark? May be due to some missing TLB flush or something like that if it just happens a single time at app start.
<mwalle>
lynxeye: yes, on every start, and just once
<marex>
mwalle: I think I observe the MMU faults on STM32MP1 as well during glmark
<marex>
mwalle: do you observe / trigger those consistently ?
<marex>
mwalle: for me it takes days to trigger one
<marex>
lynxeye: like being a single point of failure during pandemic ? ;-)
<mwalle>
marex: on every start right before (?) the first test
<marex>
lynxeye: I think I'll just drop this now
<marex>
mwalle: oh
<marex>
mwalle: and if you use this --run-forever arg, does it always happen on the first test ?
<mwalle>
marex: let me try
<marex>
mwalle: or in fact, if you just select the one test and use --run-forever
<marex>
mwalle: for me it happened on one of the later tests
<marex>
mwalle: in fact, did you scrape devcoredump out of the MMU fault yet ? :)
<mwalle>
marex: sorry I'm really a noob regarding this topic ;) I'm happy that I got that mali-dp and this running
<mwalle>
what is devcoredump?
<marex>
mwalle: there is some functionality where, if an MMU fault happens, the kernel triggers a userspace udev helper which writes out a file with the information
<marex>
there's a udev rule and script to store that info
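For reference, devcoredump data shows up in sysfs under /sys/class/devcoredump/devcdN/data (N is the instance number) and is discarded again after a timeout unless something saves it. A minimal sketch of the kind of helper such a udev rule could invoke, with error handling kept to a bare minimum and the udev rule itself left out, might look like this:

    #include <stdio.h>

    int main(int argc, char **argv)
    {
            /* argv[1]: path to the devcoredump data node, e.g.
             * /sys/class/devcoredump/devcd1/data (hypothetical instance)
             * argv[2]: output file to store the dump in */
            if (argc != 3) {
                    fprintf(stderr, "usage: %s <devcd-data-node> <output-file>\n", argv[0]);
                    return 1;
            }

            FILE *in = fopen(argv[1], "rb");
            FILE *out = fopen(argv[2], "wb");
            if (!in || !out) {
                    perror("fopen");
                    return 1;
            }

            /* Copy the dump out before the kernel drops it. */
            char buf[4096];
            size_t n;
            while ((n = fread(buf, 1, sizeof(buf), in)) > 0)
                    fwrite(buf, 1, n, out);

            fclose(out);
            fclose(in);
            return 0;
    }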
<lynxeye>
marex: I think those are very different issues. Faults at startup are usually a sign of either a missing TLB flush somewhere, or content from the last execution still being stuck in some caches, which the GPU attempts to drain out when it is started again.
<mwalle>
marex: no, it doesn't happen with --run-forever (well one time after start)
<mwalle>
btw NXP documented an erratum regarding the clock gating, which isn't working. I'm unsure if this depends on the core or on the fact that this is a Layerscape, where most clocks are static and can't be gated/changed anyway. But it seems that the Vivante GPU gates the input clock itself, so it might be a core erratum
<lynxeye>
mwalle: Huh, am I blind or is there no public errata sheet available for this SoC?
<lynxeye>
The Vivante GPUs normally have different ways to gate internal clocks, depending on the GPU core generation.
<mwalle>
"GPU hangs if clock gating for rasterizer, setup engine and texture engine are enabled" workaround is enable (?) module level clock gating and disable clock gating fot these three blocks
<mwalle>
lynxeye: yeah no public errata sheets..
<lynxeye>
mwalle: That's interesting. This is the mlcg in the etnaviv kernel driver and we don't have any code in place to avoid SE and TX clock gating.
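If the erratum turns out to apply to this core, the natural place for a workaround would be the module-level clock gating setup in the kernel driver (etnaviv_gpu_enable_mlcg() in etnaviv_gpu.c). A purely hypothetical sketch of such a quirk follows; the model/revision check and the exact DISABLE_MODULE_CLOCK_GATING_* field names for the rasterizer, setup engine and texture units are assumptions that would need to be verified against the register database and the erratum text.

    /*
     * Hypothetical sketch only: a quirk along the lines of the NXP erratum,
     * meant to sit inside etnaviv_gpu_enable_mlcg(), where the module-level
     * clock gating controls (pmc, the PM_MODULE_CONTROLS value) are already
     * being configured. The revision 0x6202 and the RA/SE/TX field names
     * are placeholders.
     */
    if (etnaviv_is_model_rev(gpu, GC7000, 0x6202)) {
            /* Keep MLCG enabled overall, but opt the affected blocks out of it. */
            pmc |= VIVS_PM_MODULE_CONTROLS_DISABLE_MODULE_CLOCK_GATING_RA;
            pmc |= VIVS_PM_MODULE_CONTROLS_DISABLE_MODULE_CLOCK_GATING_SE;
            pmc |= VIVS_PM_MODULE_CONTROLS_DISABLE_MODULE_CLOCK_GATING_TX;
    }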