<bamse>
marc|gonzalez: you will hit the error on line 2205 in arm-smmu.c if you just reduce the number of interrupts...
<bamse>
marc|gonzalez: but qcom_smmu_cfg_probe() is called inbetween smmu->num_context_banks is read from hardware and that check...so perhaps you could try to just adjust that
<bamse>
marc|gonzalez: i.e. remove the last interrupts, and in qcom_smmu_cfg_probe() do if (smmu->num_context_banks == 12) smmu->num_context_banks = 11;
<bamse>
(just for testing purposes)
atipls has quit [Quit: ZNC 1.8.2+deb3build2 - https://znc.in]
<marc|gonzalez>
bamse: is reducing the number of interrupts supposed to change smmu->num_context_banks ?
<marc|gonzalez>
It doesn't in my test, but maybe I didn't understand what you wanted me to do...
<marc|gonzalez>
id = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_ID1);
<marc|gonzalez>
The number of context bank is determined at run-time by querying the HW, as far as I can tell
<marc|gonzalez>
bamse: downtream seems to check if the FW has marked some context banks as reserved
<marc|gonzalez>
Can't we do something similar in mainline?
<bamse>
marc|gonzalez: how do they check that?
<bamse>
marc|gonzalez: no, i thought just reducing the number of interrupts would affect num_context_banks, but what i'm suggesting is that you do both - just for testing
<marc|gonzalez>
bamse: apparently, vendor uses a function called arm_smmu_init_static_cbndx_list()
<marc|gonzalez>
Reducing the number of interrupts does NOT affect smmu->num_context_banks
<bamse>
marc|gonzalez: i know, that's what i'm telling you...
<marc|gonzalez>
I'm sorry, then I didn't understand what you said
<marc|gonzalez>
of course forcing the number down by hand will work
<bamse>
so, setting num_context_banks = 11 removes all your problems?
<marc|gonzalez>
Right now, I have just one (major) problem that accessing context bank 12 hoses the system. Any hack that results in not touching context bank 12 will make the system not be hosed...
<marc|gonzalez>
But my intuition is that there's nothing special about CB12 other than some dude at qcom thought "hey, we should reserve the last context bank, it has less chance of clashing with the OS"
<z3ntu>
bamse: quick question regarding qrtr nameserver, there's service number which identifies a let's say type of service (e.g. 4097 = DIAG service), there's version which I guess could be used for having multiple versions of the same service running with different function definitions. And there's 'instance' which is some semi-random number, what could be a decent description for that value? Also from what I can tell, the combination of
<z3ntu>
service+version+instance is unique, and from that a client gets the port number for the service they want to talk to?
<marc|gonzalez>
Except that the Linux dudes thought the same thing "hey, let's use the last CB to keep the FW happy"
<marc|gonzalez>
and thus FW & OS are trying to use the last CB at the same time
<marc|gonzalez>
and bad things happen(TM)(C)
<bamse>
marc|gonzalez: it seems just "correcting" num_context_banks would be the least intrusive change...
<bamse>
marc|gonzalez: said "linux dudes" is me...so, you're welcome :)
<bamse>
marc|gonzalez: i needed to pick a context bank that wasn't in use by the system, i could have searched for the first free one, but this was cleaner...and it doesn't seem to matter for your case anyways
<marc|gonzalez>
Proposed fix would be to use qsmmu->bypass_cbndx = smmu->num_context_banks - 2; instead of qsmmu->bypass_cbndx = smmu->num_context_banks - 1; ?
<bamse>
z3ntu: the "version" part doesn't seem to be in use universally, so i'd say service+instance is what uniquely identifies a service...but it should be mentioned that instance seems to be able to differ between platforms
<bamse>
marc|gonzalez: does that fix your problem?
<marc|gonzalez>
With a comment saying that some FW reserve the last CB?
<marc|gonzalez>
Well Angelo's work around was: "Use CB 11 instead of CB 12" so I would assume it fixes the issue, yes.
<marc|gonzalez>
That will work as long as the FW doesn't decide to reserve CB 11 though
<marc|gonzalez>
And it gets dirty when smmu->num_context_banks is 1 or 2
<bamse>
but can you confirm if that it actually works?
<aka_[m]>
qcom hypervisor probably reserves it for usage dedicated to trustzone trustlets
<aka_[m]>
bamse: isn't goal of qcom-iommu pretty much not touching stuff we don't need?
<z3ntu>
bamse: Ok, thanks! Hopefully I've got the description somewhat correct now :) will use this at a talk this weekend at FrOSCon about eSIM on Qualcomm hw
<marc|gonzalez>
bamse: in his patches, angelo said he managed to get audio on the audio jack. We don't have an audio jack, but my plan was to get audio working on the HDMI link. But I wanted to submit a fix for the SMMU crash before jumping in the rabbit hole.
<marc|gonzalez>
Basically, right now, if we enable lpass_q6_smmu, the driver code hoses the system
<marc|gonzalez>
well, not really the driver's fault, but an assumption does not hold on all systems
<marc|gonzalez>
bamse: there is another related issue, of course, in arm_smmu_device_reset()
<marc|gonzalez>
System blows up if we try to reset CB12
<bamse>
aka_[m]: that's the approach we take downstream, upstream we still let the arm-smmu driver initialize the hardware to some known state - to the best of our ability
<bamse>
z3ntu: cool, i'm interested in hearing that talk
<bamse>
marc|gonzalez: so, just setting last_s2cr = num_context_banks - 2; isn't fixing anything? but smmu->num_context_banks = 11; does?
deathmist1 has joined #linux-msm
<marc|gonzalez>
Maybe it's better if I send a patch to show exactly what I change?
deathmist has quit [Ping timeout: 480 seconds]
<marc|gonzalez>
bamse: if you say it's OK to "hide" the last context bank in the driver code, then I think it will work to just do --smmu->num_context_banks;
<marc|gonzalez>
Problem is: we don't want to do that for all SMMUs, just for the audio DSP SMMU.
<bamse>
marc|gonzalez: we probably want to make that conditional on the specific smmu, but yes
<marc|gonzalez>
if (last_context_bank_reserved_by_firmware) --smmu->num_context_banks;
<marc|gonzalez>
bamse: I will test unconditionally and we can discuss how to express the condition in parallel
<bamse>
marc|gonzalez: sounds good (the testing and concluding that it's all we need)
<marc|gonzalez>
bamse: OK but just to be explicit about the assumptions... The proposed change only helps when the last context bank is reserved by the FW. If the FW reserves any other CB, system blows up. We agree?
<marc|gonzalez>
You think it is unnecessary to grok arm_smmu_init_static_cbndx_list() in the vendor code?
<bamse>
marc|gonzalez: agreed
<bamse>
marc|gonzalez: perhaps not unnecessary, but you have to decide if it's worth your time
<bamse>
marc|gonzalez: that said, we do have things like protected-clocks and reserved-gpio-regions in dt already, so if there is an actual need we could argue in favor of something similar here
<marc|gonzalez>
I think we do need something like that in DT. How else do we say "ADSP SMMU has last CB reserved for FW" ?
<bamse>
marc|gonzalez: the code has been there for 4 years now, so how widespread is this problem?
<marc|gonzalez>
I love it when you guys ask me that
<marc|gonzalez>
It is widespread to my system :)
<marc|gonzalez>
I dunno why MSM8996 seems unaffected, I thought 8996 & 8998 were "very close"
<marc|gonzalez>
But since it's actually a FW diff, maybe "8996 vs 8998" is irrelevant
<marc|gonzalez>
bamse: can I make a formal submission? Are you the qcom smmu maintainer?
<marc|gonzalez>
Or do I need to run this by robclark perhaps?
<marc|gonzalez>
Bah, I'll just send a formal submission to kick start the discussion...
<bamse>
marc|gonzalez: you don't need to worry about who's the maintainer, just make a proper commit and ask us to review it...
<marc|gonzalez>
Will do, tomorrow. Cheers!
<bamse>
marc|gonzalez: but i don't think you should say "sometimes the last context bank is broken, remove it"...i think you should be more specific and say "this hardware reports 12 context banks, but only 11 are usable"
<marc|gonzalez>
I'll make a lengthy commit message explaining WTF is going on ;)
<bamse>
you mean "hardware reports 12 context banks, but only 11 are usable"?
<aka_[m]>
*by EL1/EL0 user
<aka_[m]>
Question to developers so to almost all ppl here.
<aka_[m]>
I want to add support from some awinic crap amplifier and i see i could hack it inside other driver, from what i see register definitions are plain inside .h
<aka_[m]>
Is in linux any specific struct for handling it?
<aka_[m]>
i mean for making array
<aka_[m]>
or just define struct with int index,int reg and build array per chip
<konradybcio>
does the rest of the logic match for these two chis?
<konradybcio>
chips
<konradybcio>
you can check e.g. the qmp phy drivers for an example of handling regs
f_ has quit [Remote host closed the connection]
f_ has joined #linux-msm
f_ has quit [Remote host closed the connection]
f_ has joined #linux-msm
danylo has quit [Quit: Ping timeout (120 seconds)]
<aka_[m]>
konradybcio: very similar stuff, going to verify also bitfields.
<aka_[m]>
Tho i would need to probably add second struct with features so if some feature don't exist i can guard it to not get some random big value from un-initialized index
<aka_[m]>
seems registers exist but are stuff is RD only
sergi has quit [Quit: Bridge terminating on SIGTERM]
z3ntu has quit []
Newbyte has quit []
KieranBingham[m] has quit [Quit: Bridge terminating on SIGTERM]
go4godvin has quit [Quit: Bridge terminating on SIGTERM]
flamingradian[m] has quit [Quit: Bridge terminating on SIGTERM]
kxxt[m] has quit [Quit: Bridge terminating on SIGTERM]
Adrian[m] has quit [Quit: Bridge terminating on SIGTERM]
naysco12138[m] has quit [Quit: Bridge terminating on SIGTERM]
aka_[m] has quit [Quit: Bridge terminating on SIGTERM]
konradybcio has quit [Quit: Bridge terminating on SIGTERM]
jane400[m] has quit [Quit: Bridge terminating on SIGTERM]
Niatemporarilynia-ematrixorgdo has quit [Remote host closed the connection]
Marijn[m] has quit [Remote host closed the connection]
JIaxyga[m] has quit [Remote host closed the connection]
mainliner[m] has quit [Remote host closed the connection]
Dinolek[m] has quit [Remote host closed the connection]
obbardc has quit [Read error: Connection reset by peer]
logicalerzor[m] has quit [Remote host closed the connection]
stefan-schmidt has quit [Remote host closed the connection]
travmurav[m] has quit [Remote host closed the connection]
danylo has joined #linux-msm
aka_[m] has joined #linux-msm
<aka_[m]>
seems its painful and fields are shifted what a shame
aka_[m] has quit [Remote host closed the connection]