Daaanct12 has quit [Read error: Connection reset by peer]
Duke`` has quit [Ping timeout: 480 seconds]
saurabhg has quit [Remote host closed the connection]
saurabhg has joined #dri-devel
ybogdano has quit [Read error: Connection reset by peer]
chslt^ has quit [Ping timeout: 480 seconds]
saurabhg has quit [Remote host closed the connection]
saurabhg has joined #dri-devel
sergi has joined #dri-devel
toolchains has quit [Remote host closed the connection]
toolchains has joined #dri-devel
toolchains has quit [Ping timeout: 480 seconds]
<tango_>
are there bibliographical references about the mesa backends that use LLVM?
gouchi has joined #dri-devel
gouchi has quit []
frieder has joined #dri-devel
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
<tango_>
(I'm guessing these would be radeonsi, amdgpu and llvmpipe)
chslt^ has joined #dri-devel
toolchains has joined #dri-devel
Major_Biscuit has joined #dri-devel
Daanct12 has quit [Remote host closed the connection]
toolchains has quit [Ping timeout: 480 seconds]
icecream95 has joined #dri-devel
Major_Biscuit has quit []
chslt^ has quit [Ping timeout: 480 seconds]
tzimmermann has joined #dri-devel
saurabhg has quit [Remote host closed the connection]
nchery has quit [Read error: Connection reset by peer]
MajorBiscuit has joined #dri-devel
MajorBiscuit has quit [Quit: WeeChat 3.5]
chslt^ has joined #dri-devel
MajorBiscuit has joined #dri-devel
Company has joined #dri-devel
<mupuf>
anholt_: shit, so sorry, I was sure I did :s
<MrCooper>
FireBurn: maybe the git bisect run script could determine if the current commit contains the other commits using git rev-list
<MrCooper>
tango_: amdgpu is one of two radeonsi winsys, radeonsi's use of LLVM is unrelated to the winsys though; RADV, clover and the Gallium draw module also use LLVM
<tango_>
MrCooper: thanks. but I don't suppose there's something specific I can cite about that, is there?
<MrCooper>
not that I know of
<MrCooper>
which doesn't mean much when it comes to docs :)
<tango_>
oh well, I'll link to the mesa project website
<tango_>
damn, citing projects in scientific articles is a major pain
<MrCooper>
are source code references usable?
lynxeye has joined #dri-devel
rasterman has joined #dri-devel
vliaskov has joined #dri-devel
<tango_>
MrCooper: what do you mean?
<MrCooper>
references to specific files in the Mesa tree which use LLVM
ahajda has joined #dri-devel
<pq>
..at specific git sha1, because things change
<tango_>
MrCooper: I mean, a misc. entry could handle that, but I think that for my purposes just linking to the mesa project is enough
<MrCooper>
pq: yes, I was saving up details like that :)
chslt^ has quit [Ping timeout: 480 seconds]
<pq>
referred date is quite a coarse approximation of "when", because git does not record when commits became part of the upstream project unless the project uses merge commits always.
<pq>
release tags should work well
<pq>
I'd use both referred date and release tag.
lemonzest has joined #dri-devel
sergi has quit [Quit: Konversation terminated!]
pcercuei has joined #dri-devel
maxzor_ has joined #dri-devel
sergi has joined #dri-devel
sergi has quit []
sergi has joined #dri-devel
pcercuei has quit [Quit: brb]
pcercuei has joined #dri-devel
sergi has quit [Quit: Konversation terminated!]
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
sergi has joined #dri-devel
kts has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
mvlad has joined #dri-devel
lemonzest has quit [Ping timeout: 480 seconds]
chslt^ has joined #dri-devel
heat_ has joined #dri-devel
mclasen has joined #dri-devel
rkanwal has joined #dri-devel
thellstrom has joined #dri-devel
thellstrom has quit [Remote host closed the connection]
shankaru has joined #dri-devel
<daniels>
jekstrand: please poke in ~LAVA internally when that happens
<daniels>
anyway, both T760 devices seem fine now
digetx has joined #dri-devel
camus1 has quit [Read error: Connection reset by peer]
camus has joined #dri-devel
kts has quit [Quit: Konversation terminated!]
icecream95 has quit [Ping timeout: 480 seconds]
devilhorns has joined #dri-devel
kts has joined #dri-devel
Lightsword_ has left #dri-devel [#dri-devel]
kts has quit [Quit: Konversation terminated!]
Lightsword has joined #dri-devel
oneforall2 has joined #dri-devel
kts has joined #dri-devel
chslt^ has quit [Remote host closed the connection]
<jenatali>
I tried to restart it when it didn't kick off the first time so that MR didn't have to be re-assigned to Marge again
<daniels>
jenatali: don't trigger the performance jobs pre-merge
<daniels>
they're only supposed to be run post-merge
<jenatali>
I didn't, Marge did!
<jenatali>
Seems maybe there's a YML issue then not setting up the right rules for that job?
<daniels>
possibly, because it should only be triggered post-merge for analysis, not pre-merge
<jenatali>
Oh, and my canceling it started the merge (since it's optional) but then my restarting it shifted the MR into "merge when pipeline succeeds," I see
<jenatali>
Weird
<javierm>
jfalempe: yes, we forgot to include commit ("fb84efa28a48 drm/aperture: Run fbdev removal before internal helpers") in the set :(
<javierm>
jfalempe: but tzimmermann already queued that patch for drm-fixes so it should make it to -rc soon
<javierm>
jfalempe: err, I meant commit fb84efa28a48 ("drm/aperture: Run fbdev removal before internal helpers")
<jfalempe>
javierm, ok, so at least a known issue ;)
<jfalempe>
javierm, ok, so they already know the solution too.
<jfalempe>
I will add this info in the bz
<javierm>
jfalempe: yep, it would be good if Bruno can confirm it though since Justin added that patch because I pointed it out
<jfalempe>
javierm, yes, I will ask him to confirm the fix is working.
<daniels>
jenatali: yeah, so just leave it cancelled and it should progress - that particular machine is now back up and running as well
<daniels>
(perf tests lock to particular machines since the characteristics are surprisingly variable between 'identical' instances of the same model)
<jenatali>
daniels: Yeah Jason's MR merged with it cancelled, but not before Marge started running the next one on the now-stale tip of tree. My bad :(
<jenatali>
GitLab seems to have gotten real slow though all of a sudden
<MrCooper>
jenatali: the performance jobs are triggered by tomeu (or presumably rather a script which impersonates him), not by Marge
Peuc has joined #dri-devel
<jenatali>
MrCooper: Before I cancelled it, I looked at the top and it definitely said it was triggered by Marge
zehortigoza has joined #dri-devel
<MrCooper>
hmm, indeed; anholt_ , maybe a regression from your recent rules rework?
<daniels>
MrCooper: a timed pipeline, yeah
<MrCooper>
daniels: looks like they currently run automatically in Marge's pipelines though
<MrCooper>
then the timed pipeline triggers them again post-merge
KunalAgarwal[m] has quit []
KunalAgarwal[m] has joined #dri-devel
alyssa has joined #dri-devel
<alyssa>
icecream95 just pointed out that list_assert isn't defined for debugoptimized builds, only debug
<alyssa>
which hid a bug (failing to use a _safe iterator) because I only do debugoptimized (and release) builds
<alyssa>
Is this intended?
<alyssa>
Actually it looks like a pile of places in Mesa guard asserts by `#ifdef DEBUG` instead of `#ifndef NDEBUG`
<alyssa>
simple_mtx assertions, for example
<alyssa>
or is that fine?
<alyssa>
the meson docs suggest that debugoptimized is the same as debug -O2
<jenatali>
MrCooper: While we're talking Marge pipelines, I'm still getting one rogue a630 job in my Marge pipelines for unrelated components (Dozen most recently)
<jenatali>
Not sure who's the right person to dig into that though
<MrCooper>
if it's not anholt_, she might know who
<alyssa>
dcbaker: ^^ reading the docs, the debug vs debugoptimized difference doesn't make any sense to me, and the fact that b_ndebug is ostensibly independent of buildtype ....
<jenatali>
Cool. I don't really mind since they're usually not last to finish, and as long as they don't flake. But it's probably not intentional
nchery has joined #dri-devel
<alyssa>
dcbaker: Oh, this is all kinds of cursed.
<alyssa>
Meson defines NDEBUG for release builds to disable assertions
<alyssa>
Mesa defines DEBUG for plain debug builds only
<alyssa>
so debugoptimized has neither DEBUG nor NDEBUG set
<alyssa>
If the intent of NDEBUG is to guard assertions, then every(?) use of '#ifdef DEBUG' to guard an assertion is wrong.
<alyssa>
and should be `#ifndef NDEBUG`
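[Editor's sketch of the split described above, assuming the intended convention is "cheap asserts follow NDEBUG, heavy validation follows DEBUG". The macro names CHEAP_ASSERT and HEAVY_ASSERT and the list type are hypothetical and are not Mesa's actual definitions.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical intrusive list node, for illustration only. */
struct node {
    struct node *prev, *next;
};

/* O(n) walk that verifies every link; costly enough to count as "heavy". */
static bool list_is_consistent(const struct node *head)
{
    const struct node *n = head;
    do {
        if (n->next->prev != n || n->prev->next != n)
            return false;
        n = n->next;
    } while (n != head);
    return true;
}

/* Cheap checks: compiled in whenever NDEBUG is not defined, so they stay
 * active in a build that defines neither DEBUG nor NDEBUG. */
#ifndef NDEBUG
#define CHEAP_ASSERT(cond) assert(cond)
#else
#define CHEAP_ASSERT(cond) ((void)0)
#endif

/* Heavy validation: compiled in only when DEBUG is defined. */
#ifdef DEBUG
#define HEAVY_ASSERT(cond) assert(cond)
#else
#define HEAVY_ASSERT(cond) ((void)0)
#endif

static void node_init(struct node *n)
{
    n->prev = n;
    n->next = n;
}

/* Inserts item right after head. */
static void node_add(struct node *item, struct node *head)
{
    CHEAP_ASSERT(item != NULL && head != NULL);
    item->prev = head;
    item->next = head->next;
    head->next->prev = item;
    head->next = item;
    HEAVY_ASSERT(list_is_consistent(head));
}
```

[With neither macro defined, which is the debugoptimized situation described above, CHEAP_ASSERT still fires while HEAVY_ASSERT compiles to nothing.]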
<pq>
alyssa, wasn't there some difference between "heavy" debugs and "light" debugs? Maybe that's it? :-)
<alyssa>
pq: maybe? I don't remember learning about this one ;)
itoral has quit [Remote host closed the connection]
srslypascal is now known as Guest4347
srslypascal has joined #dri-devel
<alyssa>
pq: I guess I can see hiding "full fledged validation" behind true debug builds, but regular old asserts should be in debugoptimized too, no?
<alyssa>
are we (mesa devs) expected to run debug builds in addition to debugoptimized builds in addition to release builds?
<alyssa>
that's a lot of combinatoric hell.
<pq>
I'm no Mesa dev *shrug* :-)
<alyssa>
I guess the kernel is an even worse combinatoric hell of every debug feature separately Kconfig gated...
maxzor_ has quit [Remote host closed the connection]
maxzor_ has joined #dri-devel
mighty17 has left #dri-devel [#dri-devel]
jewins has joined #dri-devel
<mareko>
is it possible to overwrite x86 instructions of a loaded library? or are those mapped read only?
<pixelcluster>
can't you change the protect flags of the pages where the lib is loaded?
<mareko>
good question, can a user do that?
<pixelcluster>
I think yea
shankaru has joined #dri-devel
<pixelcluster>
mprotect() exists
<mareko>
can the kernel disallow mprotect on those pages?
<HdkR>
mprotect is a thing yes
KunalAgarwal[m] has quit []
KunalAgarwal[m] has joined #dri-devel
<HdkR>
and icache is coherent so as long as you know nothing is executing the code, it is "safe" to patch the code live.
<HdkR>
icache on x86*
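[Editor's sketch of the mprotect() flip mentioned above, done on a private anonymous page rather than on a real library mapping so that it is safe to run. The function name is hypothetical. Whether the same call succeeds on a library's text pages depends on the system's security policy, which this sketch does not exercise.]

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Maps one read-only page, makes it writable, stores a byte, then restores
 * read-only protection. Returns 0 on success, -1 if any step fails. */
static int flip_and_patch_page(void)
{
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0)
        return -1;
    size_t len = (size_t)page_size;

    /* Start read-only, standing in for a mapped text segment. */
    unsigned char *page = mmap(NULL, len, PROT_READ,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return -1;
    assert(page[0] == 0); /* anonymous mappings start zero-filled */

    int result = -1;
    if (mprotect(page, len, PROT_READ | PROT_WRITE) == 0) {
        page[0] = 0x90; /* stored as plain data, never executed here */
        if (mprotect(page, len, PROT_READ) == 0 && page[0] == 0x90)
            result = 0;
    }
    munmap(page, len);
    return result;
}
```

[mprotect() works on whole pages, so a real patcher would first round the target address down to a page boundary.]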
ManMower has joined #dri-devel
kts has quit [Ping timeout: 480 seconds]
<MrCooper>
that it's possible doesn't necessarily mean it's a good idea though :)
camus has quit []
<pepp>
mareko: depends on what you try to achieve, but using ptrace for this is a possibility (PTRACE_POKEDATA, PTRACE_PEEKDATA)
KunalAgarwal[m] has quit []
KunalAgarwal[m] has joined #dri-devel
<jenatali>
On Windows at least, it's relatively common to patch loaded binaries
<HdkR>
If someone is backpatching code and isn't an emulator, llvm-xray, hacking tool, or DRM then I'll glare at it for a long while
<HdkR>
and I already glare at DRM
<jenatali>
Microsoft publishes a library called "Detours" which makes it pretty easy to do. Since Windows doesn't have the equivalent of LD_PRELOAD, that's how things like game overlays work
<pixelcluster>
On windows I've seen viruses do it
<jenatali>
We also do it a lot for tests, to mock code that wasn't explicitly written for mocks and/or isn't amenable to mocks
<pq>
then someone rebuilds what you were patching and...?
<mareko>
can we forbid userspace from messing with Mesa libs this way? and then disallow submitting command buffers from any library that isn't Mesa in a specific path? (GPU hang protection)
<pq>
mareko, nope.
<mareko>
nope never or nope without changing the kernel?
<pq>
The first problem is how do you recognize "Mesa"?
<jenatali>
pq: The patching is typically just to redirect whole functions. You replace the first few instructions with a jump to your own code, and relocate the original elsewhere
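[Editor's sketch of the "replace the first few instructions with a jump" step, assuming the common x86 5-byte near jump (opcode 0xE9 followed by a 32-bit displacement). It only encodes the bytes into a buffer; it does not patch anything, and it does not relocate the displaced original instructions. The helper name is hypothetical.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define JMP_REL32_LEN 5

/* Encodes `jmp rel32` so that, if placed at address `from`, it would
 * transfer control to `to`. The displacement is relative to the end of
 * the jump instruction. Returns 0 on success, -1 if the target is out of
 * the +/-2 GiB range of a 32-bit displacement. */
static int encode_jmp_rel32(uint8_t out[JMP_REL32_LEN],
                            uint64_t from, uint64_t to)
{
    assert(out != NULL);

    int64_t disp = (int64_t)(to - (from + JMP_REL32_LEN));
    if (disp < INT32_MIN || disp > INT32_MAX)
        return -1;

    uint32_t rel = (uint32_t)(int32_t)disp;
    out[0] = 0xE9;                          /* near jump opcode */
    out[1] = (uint8_t)(rel & 0xff);         /* little-endian displacement */
    out[2] = (uint8_t)((rel >> 8) & 0xff);
    out[3] = (uint8_t)((rel >> 16) & 0xff);
    out[4] = (uint8_t)((rel >> 24) & 0xff);
    return 0;
}
```

[For example, a jump at 0x1000 to 0x2000 has displacement 0x2000 - 0x1005 = 0xFFB.]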
<pq>
If I build my own Mesa, is it still "Mesa"? If I add a patch, is it still "Mesa"?
<pixelcluster>
What would be the benefit of disallowing to patch a Mesa lib?
<pq>
(I mean a source patch, and rebuild)
<mareko>
it would protect us from GPU hangs
<pq>
What's the goal? I don't understand how that can relate to GPU hangs in any way.
<mareko>
it's a DoS attack
<pixelcluster>
well in this case the attacker can already execute arbitrary code
<mareko>
not arbitrary yet
<pq>
Do you mean that Mesa would be a trusted library known to not craft malicious GPU command sequences?
<mareko>
yes, a kernel would have a whitelist of trusted libraries
<pq>
Sounds like a nightmware.
<jenatali>
You need binary signing. How else can you know that "Mesa" is "Mesa" and not someone's custom built version with malicious content?
<MrCooper>
that's the Windows driver team kind of thinking
<mareko>
sysprof can tell that a kernel function is being called from a specific library, so I guess the kernel can do it too
<pq>
-w
<HdkR>
Seems more like if you get a report of a GPU hang and they are injecting code, you just close that bug report instead
<pq>
I routinely use my own built Mesa in parallel to the distro Mesa.
<pq>
would it kill my use?
<mareko>
yes if you enable it
<pq>
so, you would want all distros to enable this in the kernel?
<pq>
I run distro kernels
<mareko>
not distros
<pq>
well, that's just a practical annoyance, you still need binary signing to make it work at all like jenatali said
<jenatali>
Windows does stuff like this. You can create protected processes which will only load signed binaries, and we've got a feature called ACG (arbitrary code guard IIRC) which prevents any page that was writeable being made executable
<pq>
and gets pretty close to system lockdown and other stuff that many people hate with passion
<jenatali>
But it's all process level rather than systemwide
<jenatali>
Used for browsers or DRM
fxkamd has joined #dri-devel
<pq>
digital rights management, not our DRM, I presume :-)
<jenatali>
Right, yeah
<mareko>
we just need to make sure it's not in /home and the kernel needs to know the path, but signing doesn't seem necessary
<HdkR>
Sounds like it will break my emulation, since my x86 mesa binaries are in /home
<jenatali>
It depends who you're trying to protect this from. If you're trying to prevent a user, then filesystem protections locked to admins are sufficient. If you're trying to protect from an admin, then... yeah good luck, that's getting into media content protection territory and it sucks
<pixelcluster>
if the attacker can patch code in the mesa binary from /usr, that would pass that check though, wouldn't it?
<pq>
mareko, I'm still confused about what you think that could solve. What's the benefit and to whom?
<ManMower>
insert usb stick, run binaries from /media/something? bind mounts may also be an issue. I'm not sure anything that filters by a directory is robust.
<alyssa>
HdkR: what about a JIT? do you glare at JITs?
<mareko>
pq: datacenter customers
<HdkR>
alyssa: I glare at my JITs daily.
<pq>
Oooh!
<jenatali>
Sounds like this is about one workload causing GPU hangs which can end up affecting another?
<jenatali>
In which case, yeah as long as the workloads aren't admin, you can validate they're not running custom Mesa, and you can prevent them from patching the system version... that seems sufficient
<pq>
mareko, I was only thinking about a user like us on their own machines. Datacenter is totally different.
<mareko>
preventing any app from submitting arbitrary command buffers will make it possibly impossible to cause a GPU hang affecting other apps
<pq>
mareko, "virtualization" comes to mind, like... what's that thing that passes rendering API from guest to host? Then the host is guaranteed known system using the right Mesa to implement the API.
KunalAgarwal[m] has quit []
KunalAgarwal[m] has joined #dri-devel
<mareko>
signed drivers and signed kernels that users can't change would make it secure in virtualized environments
<pq>
I guess that depends on how you virtualize it. If you virtualize the API and not hardware interface or pci pass-through, then it doesn't matter what runs in the guest?
<mareko>
it's like PCI pass through
<pq>
if you so
<pq>
+say
<pq>
then you need a secure boot chain through the VM, guest bootloader and kernel, plus add userspace signing validation
<jenatali>
Yeah that's an interesting scenario, but it's tricky if you let the guest submit the work straight to the hardware. You need to have all the trust stuff in the guest instead of relying on the host to be trusted
<daniels>
you'd also need to forbid ptrace, which is not going to be popular
<daniels>
pq: virgl
<pq>
virgl!
<pq>
thanks :-D
<clever>
just thinking of the rpi's v3d (since i'm familiar with it), all messages submitted to hardware must contain physical addresses (pi4 has an extra mmu, and a shared 4g space for all 3d clients)
<clever>
there is no iommu in that case, so accepting messages from a guest directly is unwise
<clever>
but the linux driver has the same problem accepting messages from mesa, and prevents it by replacing all addresses kernel-side
<clever>
the client just has handles to objects, and then the driver replaces handles with addrs
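[Editor's sketch of the handle-to-address substitution described above. This is a simplified model, not the actual v3d driver interface: the struct layouts, the idea of a flat table indexed by handle, and the function name are all assumptions made for illustration.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical buffer object known to the trusted side. */
struct bo {
    uint32_t addr; /* real address, never shown to the client */
    uint32_t size;
};

/* Hypothetical relocation: "write the address of `handle` plus `offset`
 * into command word `cmd_index`". */
struct reloc {
    uint32_t cmd_index;
    uint32_t handle;
    uint32_t offset;
};

/* Validates every relocation first, then patches the command words.
 * Returns 0 on success, -1 if any entry is invalid, in which case the
 * command buffer is left untouched. */
static int apply_relocs(uint32_t *cmds, size_t n_cmds,
                        const struct reloc *relocs, size_t n_relocs,
                        const struct bo *table, size_t n_bos)
{
    assert(cmds != NULL && relocs != NULL && table != NULL);

    for (size_t i = 0; i < n_relocs; i++) {
        const struct reloc *r = &relocs[i];
        if (r->cmd_index >= n_cmds || r->handle >= n_bos)
            return -1;
        if (r->offset >= table[r->handle].size)
            return -1;
    }
    for (size_t i = 0; i < n_relocs; i++) {
        const struct reloc *r = &relocs[i];
        cmds[r->cmd_index] = table[r->handle].addr + r->offset;
    }
    return 0;
}
```

[The client only ever supplies handles and offsets, so it cannot point the hardware at memory it does not own. As noted in the discussion, this bounds addresses but does not tell you whether a command sequence will hang.]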
kts has joined #dri-devel
<pq>
since we're talking about preventing GPU hangs, I suppose the space of command sequences that can hang is practically infinite and unrecognizable, so that kind of validation probably won't work?
<clever>
yeah, you have a halting problem
<clever>
at least with v3d, i believe there is a cheap way to halt the render, and i think even a way to save its state to ram
<clever>
so you can context-switch the whole damn 3d core
<clever>
but a more complex gpu with its own firmware, thats harder...
<daniels>
pq: if you have a second GPU, you can execute the commands on that and see if it hangs or not, so you know whether or not it's safe to submit to the first GPU
<clever>
daniels: but at that point, just dedicate the 2nd gpu to the guest
<milek7>
what would enforcing signed mesa code even accomplish? if it runs in the same address space with other code, there are probably dozens of ways to hijack it anyway
<HdkR>
hardware cpu breakpoint, tinker with the command buffers before submission...
<HdkR>
Tinker with them directly in the app themselves, before submission...
<HdkR>
SECCOMP to capture ioctl syscall to tinker before submission...
<HdkR>
submitting bad command buffers through DRM directly
<HdkR>
userspace syscall dispatch for the same thing as seccomp but fancier
<HdkR>
mprotect the buffers or code to no permissions, capture the sigsegv, modify the buffers and submit yourself
<HdkR>
So many fun attack vectors
<daniels>
HdkR: yeah that was my point about ptrace - I thought seccomp only got RO syscall args
<daniels>
but yeah, there are about a billion ways for untrusted userspace code to submit ioctls you don't like
<HdkR>
daniels: Could be, I've tinkered with seccomp less than userspace dispatch which lets you do whatever you want
<clever>
daniels: yeah, so the kernel side needs to enforce that the syscall gave valid args
<clever>
dont trust userland
<HdkR>
Really the ideal thing would be your GPU that is punched through with SR-IOV has channels and if a channel is hung, hardware watchdog kill that context without anyone else noticing.
<HdkR>
But if the hardware isn't capable of that then you're SOL on that front :)
<clever>
in the case of v3d, it can only be rendering 1 frame at a time, and its simple to just whack a global reset button
<HdkR>
...and if this is implying that AMD hardware isn't capable of that then I need to start planning on a kill switch for a server
<clever>
so you can implement such a watchdog in software
<MrCooper>
HdkR: better start planning then :) though while Nvidia HW may be capable of this in theory, I'd be surprised if it was bullet proof
<HdkR>
MrCooper: Don't worry, planning on Nvidia GPUs in the same box. They can compete for most hang worthy
frieder has quit [Remote host closed the connection]
kunal_1072002[m] has quit []
kunal_1072002[m] has joined #dri-devel
tobiasjakobi has quit []
Duke`` has joined #dri-devel
kunal_10185[m] has joined #dri-devel
mclasen_ has joined #dri-devel
shankaru1 has joined #dri-devel
nchery has joined #dri-devel
shankaru has quit [Ping timeout: 480 seconds]
<dcbaker>
alyssa: DEBUG is supposed to be for expensive debug code like NIR validation
<alyssa>
dcbaker: NIR validation runs in debugoptimized and huge #s of bugs would slip thru if it didn't.
<dcbaker>
Interesting
mclasen has quit [Ping timeout: 480 seconds]
pcercuei has quit [Read error: Connection reset by peer]
pcercuei has joined #dri-devel
<dcbaker>
DEBUG actually predates Meson, I think someone tried to get rid of it around the time we dropped autotools, but I don’t remember who
<alyssa>
I guess problem #1 is nobody knows when to use DEBUG vs !NDEBUG and so it's pretty random what's binned as each
<alyssa>
Yeah.. I would be interested to know the perf impact on debugoptimized builds of setting DEBUG = !NDEBUG
<alyssa>
If it's a few % slower in exchange for 2x faster debugging... well, that's a win :p
<dcbaker>
Originally DEBUG was set in debugoptimized builds and was turned off because it was too slow
<dcbaker>
No idea what that would look like today though
pcercuei has quit []
<alyssa>
hmm
pcercuei has joined #dri-devel
<anholt_>
zmike: not sure why you showed me that mr?
<zmike>
anholt_: you're usually interested in ubsan stuff
<alyssa>
zmike: can those functions ever actually be called with null args? if so why?
<zmike>
it's just ubsan
<alyssa>
if not, assert(foo != NULL) should shut up ubsan while making the intent clear
<alyssa>
(at the top)
<alyssa>
design by contract *clap emoji*
<zmike>
yeah that doesn't work
<alyssa>
it should.
<zmike>
shrug
<zmike>
it doesn't
<anholt_>
zmike: I'm interested in asan and tsan. ubsan, in my experience, has been negative value (as in, doesn't fix any bugs that actually show up, but is a big hassle). not stopping others from working on it, though.
<alyssa>
does `if (foo == NULL) unreachable("booooo")` work?
<zmike>
anholt_: ah I misremembered
<alyssa>
if so, that's a release assert and if that's not good enough, ubsan needs to be fixed
<daniels>
__attribute__((nonnull))
<alyssa>
(or ignored, but preferably fixed)
<alyssa>
daniels: or that. or both.
<alyssa>
actually, does that exist? we don't use it in mesa
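[Editor's note: the attribute does exist as a GCC and Clang function attribute. The sketch below contrasts the two approaches raised above: a runtime assert and a nonnull annotation. The ATTR_NONNULL wrapper macro and both function names are hypothetical. One caveat to keep in mind: once a parameter is marked nonnull, the compiler may assume it is never NULL and drop NULL checks on it inside the function body.]

```c
#include <assert.h>
#include <stddef.h>

/* Wrapper so the code still builds on compilers without the attribute. */
#if defined(__GNUC__) || defined(__clang__)
#define ATTR_NONNULL(...) __attribute__((nonnull(__VA_ARGS__)))
#else
#define ATTR_NONNULL(...)
#endif

/* Approach 1: runtime contract, checked only when NDEBUG is not defined. */
static size_t count_nonzero_checked(const unsigned char *data, size_t len)
{
    assert(data != NULL);
    size_t n = 0;
    for (size_t i = 0; i < len; i++)
        if (data[i] != 0)
            n++;
    return n;
}

/* Approach 2: annotation on parameter 1. Callers passing a literal NULL
 * can be diagnosed at compile time; there is no runtime check here. */
ATTR_NONNULL(1)
static size_t count_nonzero_annotated(const unsigned char *data, size_t len)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++)
        if (data[i] != 0)
            n++;
    return n;
}
```

[The two are not mutually exclusive at the API level, but putting an explicit NULL check inside a nonnull-annotated function tends to draw a compiler warning, so the sketch keeps them in separate functions.]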