ChanServ changed the topic of #linux-msm to:
marvin24_ has joined #linux-msm
marvin24 has quit [Ping timeout: 480 seconds]
Daanct12 has joined #linux-msm
Guest2769 has quit [Ping timeout: 480 seconds]
Danct12 has joined #linux-msm
Daanct12 has quit [Ping timeout: 480 seconds]
junari_ has joined #linux-msm
Danct12 is now known as Guest3147
Danct12 has joined #linux-msm
pespin has joined #linux-msm
junari__ has joined #linux-msm
<minecrell> krzk: There are more dtbs_check warnings for the various remoteproc schema than I expected ("qcom,halt-regs:0: [134] is too short" fails on almost every DTB but also others). Anyway, the glink-edge change itself seems to work
junari_ has quit [Ping timeout: 480 seconds]
<krzk> minecrell: qcom,halt-regs is expected
<krzk> cool, then go with glink-edge
sricharan has joined #linux-msm
sricharan has quit [Remote host closed the connection]
junari_ has joined #linux-msm
junari__ has quit [Ping timeout: 480 seconds]
Danct12 has quit [Quit: WeeChat 3.8]
mort_5 is now known as mort_
anholt_ has joined #linux-msm
anholt has quit [Ping timeout: 480 seconds]
junari_ has quit [Remote host closed the connection]
junari_ has joined #linux-msm
junari_ has quit [Ping timeout: 480 seconds]
Caterpillar has joined #linux-msm
pespin has quit [Remote host closed the connection]
<minecrell> dianders: fwiw: I think the problem with that dcache_clean/inval patch is that it exposes devices that are not correctly marked as "dma-coherent" even though they are (or in the case of scm: behave) cache-coherent. When Linux writes DMA memory for such a device it will bypass the cache. But it may also keep some cache lines with stale data around
<minecrell> unchanged, with the assumption that they won't be seen by the device anyway because it's not cache-coherent.
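The failure mode described above can be sketched with a toy model. This is only an illustration, not kernel code: `ToySystem`, `run` and the address `0x1000` are hypothetical names, and it assumes the patch under discussion swaps a clean+invalidate for a clean-only operation when a DMA buffer is prepared, which leaves a stale line in the cache.

```python
class ToySystem:
    """Toy model: RAM plus one CPU cache, both dicts keyed by address."""

    def __init__(self):
        self.ram = {}
        self.cache = {}

    def cpu_write_cached(self, addr, value):
        # Write through a cacheable mapping: the data lands in the cache only.
        self.cache[addr] = value

    def cpu_write_uncached(self, addr, value):
        # Write through a non-cacheable mapping: goes straight to RAM and
        # leaves any existing cache line for that address untouched.
        self.ram[addr] = value

    def clean(self, addr):
        # Write the line back to RAM but keep it in the cache.
        if addr in self.cache:
            self.ram[addr] = self.cache[addr]

    def clean_invalidate(self, addr):
        # Write the line back to RAM and drop it from the cache.
        self.clean(addr)
        self.cache.pop(addr, None)

    def observer_read(self, addr, coherent):
        # A cache-coherent observer (device or firmware) sees cached lines;
        # a non-coherent one only ever sees RAM.
        if coherent and addr in self.cache:
            return self.cache[addr]
        return self.ram.get(addr)


def run(prep, coherent):
    """Prepare a buffer with `prep`, write it uncached, return what the observer reads."""
    system = ToySystem()
    system.cpu_write_cached(0x1000, "old")    # earlier cacheable use of the page
    getattr(system, prep)(0x1000)             # buffer preparation step
    system.cpu_write_uncached(0x1000, "new")  # CPU treats the observer as non-coherent
    return system.observer_read(0x1000, coherent)


if __name__ == "__main__":
    for prep in ("clean_invalidate", "clean"):
        for coherent in (False, True):
            print(prep, "coherent" if coherent else "non-coherent", run(prep, coherent))
```

In this model only the combination of a clean-only preparation and a coherent observer returns the old value, which matches the case of a node that behaves cache-coherently but is not marked "dma-coherent".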
<dianders> minecrell: But in this case the memory being affected is just RAM, not any device memory, right?
<minecrell> dianders: it's RAM there too I would say, RAM accessed by the nvme device
<minecrell> dianders: I would say chances are good that you can avoid your issue by marking the scm device in the DT as "dma-coherent;"
<minecrell> whether that is generally correct is a different question though since it could vary from scm call to scm call or from firmware version to firmware version
<dianders> minecrell: Indeed that does fix the issue for me. I don't think I understand the issue deeply enough to understand if that's a correct solution or just a hack, though. If the firmware changed to start mapping this memory as uncached then it would work with the old way and break with the new way, right? IIUC there's not a separate device on the memory bus here and the problem is that the firmware is just another client on the CPU?
<minecrell> dianders: for the first question: I would say yes, although I'm not a Linux dma-api expert either so maybe take my words with some caution :)
<minecrell> dianders: for the second question: I would say it doesn't really matter if you have a device or the firmware accessing the memory, the only difference is that a device is typically cache-coherent or not in hardware, while the firmware may decide to map the memory cacheable or uncacheable in software (this can change more easily)
<minecrell> argh, this would mean we would need to go through *all* SoCs and see which hardware components are maybe cache-coherent, and possibly scm as well...
pespin has joined #linux-msm
pespin has quit [Remote host closed the connection]
<minecrell> well I think Robin's reply effectively just confirmed everything from above ^ :/
<dianders> minecrell: Yup, now I'm trying to figure out what to do... :-P
<dianders> I guess moving all of SCM to the streaming API isn't a bad thing as it will mitigate the problem and should be 100% better. Then I guess it's a question of whether to mark SCM devices as dma-coherent or not.
<konradybcio> well, scm goes back to at least 2012
<konradybcio> that question immediately becomes problematic
<minecrell> dianders: there is also the comment in qcom_scm_ice_set_key() that explicitly mentions avoiding the streaming API afaict, so simply replacing it there isn't straightforward either
<minecrell> konradybcio: I guess it's possible that only tf-a maps things as cacheable, dunno
<minecrell> I mean, tf-a definitely does but who knows what the qcom firmware does...
<dianders> It would be easy to apply this change to just sc7180 chromebooks...
<minecrell> I think the more pressing issue is that now potentially every SoC has silent DMA corruption if DT nodes are missing "dma-coherent". I've never seen a comprehensive list of which hw components are cache-coherent on which SoC, seems like downstream often enough also just ignores that
<minecrell> It would be annoying to get disk corruption or something like that, not sure how likely that is to happen with this change
<konradybcio> annoying for some, dealbreaking for others.. bye bootloader if you're unlucky
<minecrell> dianders: For your case you can easily verify if tf-a maps things as cacheable or not. So if all SCM calls implemented in tf-a use cacheable, it should be fine and more correct to mark scm as "dma-coherent"
<dianders> minecrell: I'll see if I can find that bit of code.
<minecrell> I believe it's cacheable because MT_RO_DATA -> MT_MEMORY, rather than MT_DEVICE/MT_NON_CACHEABLE but not 100% sure https://github.com/ARM-software/arm-trusted-firmware/blob/master/include/lib/xlat_tables/xlat_tables_v2.h#L85-L93
<dianders> minecrell: Makes sense to me, thanks. I'll do a tiny bit more digging / confirming and then try posting up a patch that just applies to sc7180 Chromebooks.
<konradybcio> dianders: is there some call that identifies whether we're using qc/tf-a tz?
<dianders> konradybcio: not that I'm aware of, but the DT already needs to know if you're using QC's normal firmware or the Chromebook-style firmware for a number of reasons, so it's easy to just add this in sc7180-trogdor.dtsi
<dianders> There are whole piles of differences.
svarbanov_ has joined #linux-msm
svarbanov has quit [Read error: Connection reset by peer]