ChanServ changed the topic of #etnaviv to: #etnaviv - the home of the reverse-engineered Vivante GPU driver - Logs https://oftc.irclog.whitequark.org/etnaviv
lynxeye has joined #etnaviv
pcercuei has joined #etnaviv
<tomeu> lynxeye: pH5: I'm debugging an issue with a larger model in which the GPU hangs once I submit more than a certain number of operations
<tomeu> the hangs I observe in those operations don't happen if I omit a few operations from the beginning of the model graph
<tomeu> I'm wondering if that could be because I'm asking the NPU to access memory it cannot really access
<tomeu> I came upon this, which points in that direction, but inconclusively in the NPU case: https://community.nxp.com/t5/i-MX-Processors/IMX8mp-NXP-s-6-1-22-2-0-0-Mickledore-galcore-ko-kernel-panic-gpu/m-p/1739413
<lynxeye> tomeu: Is it really hanging or is it just so slow that it might trigger the timeout handler?
<tomeu> do you know anything about how the GPU/NPU might be limited in what physical memory it is able to access?
<tomeu> I think it hangs, because further operations also time out, until the GPU is reset
<tomeu> but well, it could be that it takes a really, really long time, but I don't see why that would be
<lynxeye> tomeu: I'm not aware of any restrictions on physical memory access on the NPU side.
<lynxeye> With the i.MX8MP they fixed the fabric, so peripherals have full access to all of the DRAM.
<tomeu> hmm, I read " 3GB + 256MB are the entire memory region which is accessible by the GPU." in https://community.toradex.com/t/verdin-imx8mp-opencl-gpu-size/23547/2
pcercuei has quit [Quit: leaving]
<lynxeye> tomeu: Not sure what downstream is doing there, but the one big improvement between other i.MX8M* and the i.MX8MP is that the GPUs actually can access all of DRAM.
<lynxeye> On the other i.MX8M* they were limited by the fabric only handling 32bit DMA addresses.
<lynxeye> But if you want to check, you can just boot your system with mem=3G on the kernel commandline, which will make the kernel ignore all memory above the 4GB DMA address boundary.
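The arithmetic behind the mem=3G suggestion can be sketched as follows. This is a minimal illustration, assuming that DRAM on the i.MX8MP starts at physical address 0x40000000 (an assumption, not stated in the chat); the helper names are hypothetical. Under that assumption, 3 GiB of DRAM ends exactly at the 4 GiB boundary that 32-bit DMA addresses can reach.

```c
#include <stdint.h>

/* Assumption: DRAM on the i.MX8MP starts at physical address 0x40000000. */
#define DRAM_BASE   UINT64_C(0x40000000)
#define GIB         (UINT64_C(1) << 30)
/* First address that no longer fits in a 32-bit DMA address. */
#define DMA32_LIMIT (UINT64_C(1) << 32)

/* End (exclusive) of the DRAM the kernel uses when booted with mem=<gib>G. */
static uint64_t dram_end(uint64_t mem_gib)
{
    return DRAM_BASE + mem_gib * GIB;
}

/* Nonzero if all of that DRAM is reachable with 32-bit DMA addresses. */
static int fits_in_32bit_dma(uint64_t mem_gib)
{
    return dram_end(mem_gib) <= DMA32_LIMIT;
}
```

With these assumptions, dram_end(3) is 0x100000000, so a mem=3G boot keeps every page below the 4GB boundary, while any larger value does not.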
<tomeu> hmm, that's easy enough
<tomeu> but yeah, it's probably not that because further operations do run correctly (after the reset)
<tomeu> wonder what other resources are shared among jobs beside addresses
<tomeu> maybe event ids?
<tomeu> ahem, memsetting the command buffer fixed it...
<tomeu> sorry for wasting people's time :)
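The fix mentioned above (zeroing the command buffer before use) might look roughly like the sketch below. The chat does not show the actual driver code, so the cmd_buf struct and its helpers are hypothetical. The idea is only that a freshly allocated buffer is cleared, so words that were never written cannot be read back as stale commands.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical command buffer: a fixed-capacity array of 32-bit words. */
struct cmd_buf {
    uint32_t *words;
    size_t    capacity; /* in words */
    size_t    used;     /* in words */
};

/* Allocate the buffer and zero it. Returns 0 on success, -1 on failure. */
static int cmd_buf_init(struct cmd_buf *cb, size_t capacity)
{
    cb->words = malloc(capacity * sizeof(*cb->words));
    if (!cb->words)
        return -1;
    /* Clear the whole buffer so unused words hold zeros, not leftovers. */
    memset(cb->words, 0, capacity * sizeof(*cb->words));
    cb->capacity = capacity;
    cb->used = 0;
    return 0;
}

/* Append one word. Returns 0 on success, -1 if the buffer is full. */
static int cmd_buf_emit(struct cmd_buf *cb, uint32_t word)
{
    if (cb->used >= cb->capacity)
        return -1;
    cb->words[cb->used++] = word;
    return 0;
}

/* Release the buffer's storage. */
static void cmd_buf_fini(struct cmd_buf *cb)
{
    free(cb->words);
    cb->words = NULL;
    cb->capacity = cb->used = 0;
}
```

Using calloc instead of malloc plus memset would have the same effect here; the explicit memset is kept to mirror the wording in the chat.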
<cphealy> tomeu: in future, one trick that might be useful for determining whether the GPU is hung or just busy for a long time is to check with "perf" to see if there is data moving between the GPU and DRAM. There is a perf PMU driver for the i.MX8MP that allows seeing the throughput of individual IP cores.
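The decision logic behind that tip can be sketched as below. This is only an illustration: the dram_sample struct is hypothetical, and how the two snapshots are obtained (for example by reading the PMU counters through perf before and after a short interval) is left out because the chat does not give event names.

```c
#include <stdint.h>

/* Hypothetical snapshot of DRAM traffic attributed to the GPU/NPU. */
struct dram_sample {
    uint64_t bytes_read;
    uint64_t bytes_written;
};

enum gpu_state {
    GPU_NO_TRAFFIC, /* nothing moved: idle or possibly hung */
    GPU_BUSY        /* data is still moving: likely just slow */
};

/* Compare two snapshots taken some interval apart while a job is pending. */
static enum gpu_state classify(const struct dram_sample *before,
                               const struct dram_sample *after)
{
    uint64_t moved = (after->bytes_read - before->bytes_read) +
                     (after->bytes_written - before->bytes_written);
    return moved ? GPU_BUSY : GPU_NO_TRAFFIC;
}
```

A result of GPU_NO_TRAFFIC while a job is still outstanding would point toward a hang rather than a slow job, though a compute-bound phase with no memory traffic could look the same over a short interval.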
<tomeu> That's a good tip, thanks!
lynxeye has quit [Quit: Leaving.]