ChanServ changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://oftc.irclog.whitequark.org/panfrost - <macc24> i have been here before it was popular
chewitt has joined #panfrost
Bennett has quit [Remote host closed the connection]
camus has joined #panfrost
camus1 has quit [Read error: Connection reset by peer]
nlhowell has joined #panfrost
nlhowell has quit [Ping timeout: 480 seconds]
jambalaya has quit [Remote host closed the connection]
nlhowell has joined #panfrost
jambalaya has joined #panfrost
nlhowell has quit [Ping timeout: 480 seconds]
camus1 has joined #panfrost
camus has quit [Ping timeout: 480 seconds]
JulianGro has joined #panfrost
camus has joined #panfrost
camus1 has quit [Ping timeout: 480 seconds]
soreau has quit [Ping timeout: 480 seconds]
rasterman has joined #panfrost
macc24 has joined #panfrost
camus1 has joined #panfrost
camus has quit [Remote host closed the connection]
camus has joined #panfrost
camus1 has quit [Remote host closed the connection]
soreau has joined #panfrost
camus1 has joined #panfrost
camus has quit [Read error: Connection reset by peer]
camus1 has quit [Remote host closed the connection]
camus has joined #panfrost
camus1 has joined #panfrost
camus has quit [Ping timeout: 480 seconds]
hyrc has quit []
camus1 has quit [Remote host closed the connection]
camus has joined #panfrost
nlhowell has joined #panfrost
<alyssa>
Oh ho ho wait
<alyssa>
the automatic varying allocation on Valhall ... the varyings get allocated into the tiler heap?
<rasterman>
well you have pretty pink and white triangles... :)
<rasterman>
i hate it when bisects screw up. i already had this screw up twice
<rasterman>
like once it told me it's been fixed between a range of like 8 commit revs
<rasterman>
this time i got a "this rev is broken"
<rasterman>
... finally
<rasterman>
now i have to manually check because i don't trust this...
<alyssa>
mmh
<alyssa>
maybe I should try to bisect from the other side, then
<alyssa>
start scribbling over DDK memory and see when it breaks o:)
<rasterman>
hahahahaha
<alyssa>
dougallj did this on apple to great success
<rasterman>
actually i'm really curious if over time morello can make this less likely. e.g. memory is "owned" by a shared lib
<rasterman>
and ONLY code executing from within that shared lib can write to it...
<alyssa>
whose side are you on :p
camus1 has joined #panfrost
<rasterman>
and to allow writing to memory the lib may alloc - it has to explicitly export a pointer to do that then "unexport" it to revoke such access
<rasterman>
(and only code inside that shlib mappings can do the export/unexport)
<rasterman>
that'd be nice...
<alyssa>
whose side are you on :p
<rasterman>
and .. i'm on my side...
<rasterman>
as someone who writes lots of shlibs... i'm tired of apps scribbling over data structs the shlib manages then deciding to blame the shlib :)
<rasterman>
it'd be also nice to lock mem down to a specific thread too :)
<rasterman>
the same way
camus has quit [Ping timeout: 480 seconds]
<rasterman>
i've had to debug problems before where someone ran code from a thread... and it worked 99.9999% of the time
<rasterman>
then every now and again the app under some heavy testing would lock up
<rasterman>
a linked list data struct would become a looped list (infinite now with no beginning/end)
<alyssa>
rust
<rasterman>
it eventually turned out to be writing from the thread (didn't know it was doing that), and the apis handling that were not threadsafe and were intended to be called from the mainloop only
<rasterman>
rewriting everything in rust is not a viable solution :)
<rasterman>
sit down and spend 5 years going nowhere and just re-writing code in rust (assuming you already gained expert status in rust too)
<rasterman>
so no ability to move on...
<rasterman>
probably will take 10y actually not 5y
<rasterman>
AND in the process you are likely to add new bugs as previously well debugged code/algorithms get rewritten and have new bugs - sure... no "memory stomping" bugs... but other new logic ones :)
<HdkR>
rasterman: Tagged memory can do what you're wanting. Allocate a tag per library and enforce ownership semantics that way
<HdkR>
Of course it won't work in all cases since you need to be careful about giving away ownership
<HdkR>
Most APIs just pass around memory without a care
<rasterman>
this problem above got solved by sliding a new object handle system beneath the existing api... object * ptrs became references in a table and needed a lookup. those tables are TLS and have other sanity checks (like checksums/hashes) and this then stops the ability to even access an object from a thread where it did not exist UNLESS you explicitly expose another thread's context
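A minimal sketch of the handle-table idea described above, in Python rather than the C such a library would presumably use. Everything here is hypothetical and for illustration only: `HandleTable`, `InvalidHandle`, and the CRC-based check are assumptions, not the actual library's design, and the explicit "expose another thread's context" escape hatch is left out. Callers hold an opaque handle instead of a pointer, the table lives in thread-local storage, and every lookup is validated.

```python
import threading
import zlib


class InvalidHandle(Exception):
    """Raised when a handle does not resolve in the calling thread's table."""


class HandleTable:
    """Hands out opaque (index, check) handles backed by a per-thread table."""

    def __init__(self):
        # threading.local gives every thread its own independent attributes.
        self._tls = threading.local()

    def _slots(self):
        # Lazily create this thread's table the first time the thread uses it.
        if not hasattr(self._tls, "slots"):
            self._tls.slots = {}
            self._tls.next_index = 1
        return self._tls.slots

    def create(self, obj):
        slots = self._slots()
        index = self._tls.next_index
        self._tls.next_index += 1
        # Sanity value tying the handle to the creating thread and slot
        # (an assumed stand-in for the "checksums/hashes" mentioned above).
        check = zlib.crc32(f"{threading.get_ident()}:{index}".encode())
        slots[index] = (check, obj)
        return (index, check)

    def resolve(self, handle):
        index, check = handle
        entry = self._slots().get(index)
        if entry is None or entry[0] != check:
            raise InvalidHandle(f"handle {handle!r} is not valid in this thread")
        return entry[1]
```

With this shape, a handle created on the main loop resolves there, while a worker thread that is handed the same handle gets `InvalidHandle` instead of silently touching the object.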
<rasterman>
HdkR: you mean MTE? or do you mean aligning your allocs to e.g. 16 bytes and using the lower 4 bits as the tag?
<HdkR>
Yes, MTE
<rasterman>
yeah. mte is nice. :)
<alyssa>
Ok, DDK does /not/ like me deleting its position shader resource table
<rasterman>
but the above would not have been solved by mte as it'd have been a valid ptr just accessed from the wrong thread
<rasterman>
but definitely having some kind of export/import ptr in an api gives you a point of control
<rasterman>
forcing alignment e.g. to 64bytes would allow 6 bits which would be nicer :)
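The arithmetic behind the alignment trick (16-byte alignment frees 4 low bits, 64-byte alignment frees 6) can be sketched as follows. This models addresses as plain integers and shows the software low-bit tagging variant, not MTE itself; the helper names `tag_bits`, `pack`, and `unpack` are made up for the example.

```python
def tag_bits(alignment: int) -> int:
    """Number of low address bits that are always zero for this alignment."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return alignment.bit_length() - 1


def pack(addr: int, tag: int, alignment: int) -> int:
    """Store a small tag in the unused low bits of an aligned address."""
    mask = alignment - 1
    if addr & mask:
        raise ValueError("address is not aligned")
    if not 0 <= tag <= mask:
        raise ValueError("tag does not fit in the free low bits")
    return addr | tag


def unpack(word: int, alignment: int) -> tuple[int, int]:
    """Split a tagged word back into (address, tag)."""
    mask = alignment - 1
    return word & ~mask, word & mask
```

For example, `tag_bits(16)` is 4 and `tag_bits(64)` is 6, matching the numbers in the discussion; the tag has to be masked off again before the address is used.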
<rasterman>
but the above object api solved it by making it an indirection and it made it essentially impossible to access an obj when not intended ... so that got done.
<rasterman>
but it was just an example of the kinds of things a proper capability system can do
<rasterman>
MTE is like a capability system for poor people. 4 bits... :)
<rasterman>
it's nice to have and slide in to today's arch. but a full 128bits is even nicer :)
<rasterman>
but there's a lot of research/work to do to bring in the idea of exporting/importing ptrs between "domains of ownership". it's not too common.
<alyssa>
hmm what is 112 bytes
<alyssa>
"2^4 * 7" "real helpful"
<alyssa>
ok yep words 30,31 of the idvs helper payload are the near/far planes
<HdkR>
Woo RE
<alyssa>
so it looks like I should target my investigation at this resource thing in position shaders, and the LEA_ATTR instruction (or whatever it actually is)
<alyssa>
but.. this isn't even accessing the resources. clearly I've broken multiple things
<daniels>
rasterman: if your oops points to dma_scheduler/dma_resv/dma_fence UAF, yeah
<robclark>
I assume daniels meant the first two
<daniels>
yeah, first two, soz
<rasterman>
i've tried bisecting like 4 times now - each time it ends up at a different hash ... and well... manually pre/post that it's also broken... it's been pissing me off :|
<rasterman>
it's been causing all sorts of fun side effects like network port not working right and other weird side effects too.
<rasterman>
i'll give that a shot to stuff those in and see. i closed up my bisecting terms for the day :)
alpernebbi has joined #panfrost
<rasterman>
actually let me try now
<rasterman>
i still have it up - just had to reconnect
<robmur01>
protip: always assume any kernel before about -rc3 to be catastrophically broken. These days I typically don't even bother bisecting things unless I'm still hitting them mid-cycle, except when there's some likelihood of it being related to something I've done :)
<rasterman>
i was hoping to figure it out before rel... :)
<rasterman>
but i've been a bit baffled that a bisect hasnt reliably pointed to something to scratch my head over :|
<rasterman>
but i didnt like the idea of a release having this broken
<robmur01>
bisecting across merges is an arse at the best of times, but particularly when those merges are branches with different bases all over the place
<rasterman>
yeah... so i find. it makes me yearn for linear history :)
<robmur01>
what's worst though is when the bisect result is utterly nonsensical but actually true
<rasterman>
you mean it found the commit but the commit itself makes no sense as to why that would cause it?
* robmur01
remembers figuring out when the merge of the input tree broke USB on Juno...