ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
<anholt> I'm not sure that the time savings we're talking about here is worth the complexity. I'd be more interested in what compiler tunables there might be to get us "symbol backtraces, maybe function input values if they're cheap" rather than a full most-debuggable binary.
smilessh has joined #dri-devel
<anholt> given how allergic most mesa devs are to thinking how to interact with the CI already, making any additional complexity for debugging makes it less and less likely that anyone ever uses it.
<DavidHeidelberg[m]> anholt: when transferring outside of us and network is under load, sometimes the transfers are really slow, so every MB seems to matter (in terms of performance).
<DavidHeidelberg[m]> *US
<DavidHeidelberg[m]> btw. with the docs ( https://mesa.pages.freedesktop.org/-/mesa/-/jobs/34548032/artifacts/public/debugging.html#working-with-core-dumps-generated-by-ci ) it doesn't seem so hard to set up with a few commands and debug. It's in general a normal coredump
exit70_ has quit []
<anholt> why is the debug.dwp not included in the unstripped mesa tarball?
<DavidHeidelberg[m]> anholt: because otherwise it would have to be linked by the linker at compilation time, which would slow down the build
<DavidHeidelberg[m]> so in the unstripped build there are only references to the debug.dwp (exactly to the .dwo files inside the .dwp)
exit70 has joined #dri-devel
<anholt> we're linking debug info at compile time today, right?
<Lynne> have to say, descriptor buffers are so much nicer to work with
danvet has quit [Ping timeout: 480 seconds]
rmckeever has joined #dri-devel
<DavidHeidelberg[m]> anholt: with my MR we can use split-debug (so not put debug into .o, but .dwo), otherwise sure
<anholt> I'm trying to understand why you need the split-debug complexity for the unstripped tarball.
<DavidHeidelberg[m]> anholt: in general, this page sums it up https://gcc.gnu.org/wiki/DebugFission ; for smaller projects it doesn't matter, but Mesa is already large enough for the difference to be seen
<anholt> (I haven't done the work myself, but I really suspect there's something better we could choose in our debugoptimized build's -g options that could make the cost of debug symbols low enough that it would give us debuggability without extra tarballs even)
<DavidHeidelberg[m]> anholt: well, even with full debug with debugoptimized it's not perfect since we lose all details, I assume doing something between would produce very limited results (not saying it couldn't produce something useful ofc)
<DavidHeidelberg[m]> I used my MR a few times and I have to say I would prefer to have meson 'debug' build, but ofc that's useless for flakes.
Zopolis4 has quit []
djbw has quit [Read error: Connection reset by peer]
djbw has joined #dri-devel
pcercuei has quit [Quit: dodo]
Haaninjo has quit [Quit: Ex-Chat]
anholt has quit [Ping timeout: 480 seconds]
liyi__ has joined #dri-devel
stuart has quit []
kts has joined #dri-devel
anholt has joined #dri-devel
kts has quit [Quit: Leaving]
kzd has quit [Quit: kzd]
columbarius has joined #dri-devel
co1umbarius has quit [Ping timeout: 480 seconds]
alyssa has left #dri-devel [#dri-devel]
fxkamd has quit [Remote host closed the connection]
ngcortes has quit [Remote host closed the connection]
fxkamd has joined #dri-devel
ngcortes has joined #dri-devel
Peuc has quit [Remote host closed the connection]
Peuc has joined #dri-devel
sauce has quit [Remote host closed the connection]
sauce has joined #dri-devel
unerlige1 has quit [Remote host closed the connection]
unerlige1 has joined #dri-devel
Sachiel has quit [Remote host closed the connection]
Sachiel has joined #dri-devel
kzd has joined #dri-devel
naseer__ has quit [Read error: Network is unreachable]
naseer__ has joined #dri-devel
warpme_____ has quit [Read error: Network is unreachable]
warpme_____ has joined #dri-devel
orbea has quit [Remote host closed the connection]
orbea has joined #dri-devel
ngcortes has quit [Read error: Connection reset by peer]
kzd has quit [Ping timeout: 480 seconds]
konstantin has joined #dri-devel
kzd has joined #dri-devel
nchery has quit [Ping timeout: 480 seconds]
konstantin_ has quit [Ping timeout: 480 seconds]
<karolherbst> uhh... why can't I trigger the fails CI runs into locally :(
aravind has joined #dri-devel
TMM has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
TMM has joined #dri-devel
pallavim has quit [Ping timeout: 480 seconds]
oneforall2 has quit [Remote host closed the connection]
oneforall2 has joined #dri-devel
kzd has quit [Quit: kzd]
kzd has joined #dri-devel
Zopolis4 has joined #dri-devel
macromorgan has joined #dri-devel
bmodem has joined #dri-devel
heat has quit [Ping timeout: 480 seconds]
JohnnyonFlame has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
kzd_ has joined #dri-devel
kzd has quit [Ping timeout: 480 seconds]
liyi__ has quit [Ping timeout: 480 seconds]
kzd has joined #dri-devel
liyi__ has joined #dri-devel
kzd_ has quit [Ping timeout: 480 seconds]
junaid has joined #dri-devel
junaid_ has joined #dri-devel
kzd_ has joined #dri-devel
kzd has quit [Ping timeout: 480 seconds]
kzd_ has quit []
kzd has joined #dri-devel
kzd has quit [Quit: kzd]
bgs has joined #dri-devel
kzd has joined #dri-devel
junaid_ has quit [Remote host closed the connection]
junaid has quit [Remote host closed the connection]
kzd has quit [Quit: kzd]
aravind has quit [Ping timeout: 480 seconds]
tzimmermann has joined #dri-devel
pjakobsson has quit []
alanc has quit [Remote host closed the connection]
danvet has joined #dri-devel
bluetail98 has quit []
alanc has joined #dri-devel
danvet has quit [Ping timeout: 480 seconds]
jkrzyszt has joined #dri-devel
danvet has joined #dri-devel
nchery has joined #dri-devel
<daniels> karolherbst: which driver?
sghuge has quit [Remote host closed the connection]
sghuge has joined #dri-devel
fab has joined #dri-devel
rasterman has joined #dri-devel
mvlad has joined #dri-devel
junaid has joined #dri-devel
junaid has quit [Read error: No route to host]
junaid has joined #dri-devel
vliaskov has joined #dri-devel
fab has quit [Ping timeout: 480 seconds]
danvet has quit [Ping timeout: 480 seconds]
ice9 has joined #dri-devel
danvet has joined #dri-devel
<linkmauve> “23:28:30 gfxstrand> IDK, Intel has managed to evolve their hardware for 15 years without deleting interesting formats.”, /me cries in ASTC.
danvet has quit [Ping timeout: 480 seconds]
danvet has joined #dri-devel
phasta has joined #dri-devel
<MrCooper> DavidHeidelberg[m]: FWIW, -ggdb/-ggdb3 might improve debuggability of debugoptimized builds, probably at the cost of bigger debuginfo though
junaid has quit [Remote host closed the connection]
Company has quit [Quit: Leaving]
pochu has joined #dri-devel
tursulin has joined #dri-devel
ahajda has joined #dri-devel
MajorBiscuit has joined #dri-devel
danvet has quit [Ping timeout: 480 seconds]
danvet has joined #dri-devel
Haaninjo has joined #dri-devel
darkbasic4 has joined #dri-devel
rmckeever has quit [Quit: Leaving]
pcercuei has joined #dri-devel
<HdkR> linkmauve: Everyone cries in ASTC, just like the hardware designers
apinheiro has joined #dri-devel
<karolherbst> daniels: llvmpipe mainly
danvet has quit [Ping timeout: 480 seconds]
danvet has joined #dri-devel
Haaninjo has quit [Quit: Ex-Chat]
darkbasic4 has quit [Remote host closed the connection]
sgruszka has joined #dri-devel
sgruszka has quit [Remote host closed the connection]
sgruszka has joined #dri-devel
jdavies has joined #dri-devel
jdavies is now known as Guest5126
jdavies_ has joined #dri-devel
jdavies_ has quit [Remote host closed the connection]
junaid has joined #dri-devel
Guest5126 has quit [Ping timeout: 480 seconds]
junaid has quit [Remote host closed the connection]
liyi__ has quit [Ping timeout: 480 seconds]
co1umbarius has joined #dri-devel
columbarius has quit [Ping timeout: 480 seconds]
anholt_ has joined #dri-devel
devilhorns has joined #dri-devel
anholt has quit [Ping timeout: 480 seconds]
bmodem has quit [Ping timeout: 480 seconds]
ppascher has quit [Ping timeout: 480 seconds]
camus has quit []
kts has joined #dri-devel
kts has quit []
kts has joined #dri-devel
kts has quit [Remote host closed the connection]
kts has joined #dri-devel
YuGiOhJCJ has quit [Ping timeout: 480 seconds]
<DavidHeidelberg[m]> MrCooper: sounds good, anyway I don't see -ggdb vs -ggdbX documented in GCC docs
YuGiOhJCJ has joined #dri-devel
<psykose> it's in the manpage
<psykose> the documentation doesn't say anything useful however
<psykose> you can grep ggdb here https://man7.org/linux/man-pages/man1/gcc.1.html
<psykose> it just repeats the same shit as -g
<psykose> unsure if it does anything at all
jkrzyszt has quit [Remote host closed the connection]
<psykose> output size is the same though the sha changes
<psykose> maybe what i built is just not reproducible
<psykose> as for ggdbX it's in the ggdblevel part
<psykose> -g3 -> ggdb3 -g2 -> ggdb2
phasta has quit [Quit: Leaving]
<karolherbst> where can I check what CTS version/tag a test is using?
<karolherbst> or is it all the same?
srslypascal is now known as Guest5140
srslypascal has joined #dri-devel
Guest5140 has quit [Read error: Connection reset by peer]
kts has quit [Quit: Leaving]
agd5f_ has joined #dri-devel
agd5f has quit [Ping timeout: 480 seconds]
Daaanct12 has quit [Quit: Quitting]
<karolherbst> I'm now even on the same CTS version and can't trigger the fails from CI :(
grillo has joined #dri-devel
grillo has left #dri-devel [#dri-devel]
grillo_0 has joined #dri-devel
YuGiOhJCJ has quit [Quit: YuGiOhJCJ]
ahajda_ has joined #dri-devel
heat has joined #dri-devel
ahajda has quit [Ping timeout: 480 seconds]
ahajda has joined #dri-devel
ice9 has quit [Ping timeout: 480 seconds]
ahajda_ has quit [Ping timeout: 480 seconds]
ahajda_ has joined #dri-devel
agd5f has joined #dri-devel
ahajda has quit [Ping timeout: 480 seconds]
agd5f_ has quit [Ping timeout: 480 seconds]
agd5f_ has joined #dri-devel
agd5f has quit [Ping timeout: 480 seconds]
agd5f_ has quit [Ping timeout: 480 seconds]
agd5f has joined #dri-devel
Zopolis4 has quit []
pochu has quit []
kzd has joined #dri-devel
vliaskov has quit [Remote host closed the connection]
devilhorns has quit []
kts has joined #dri-devel
ahajda__ has joined #dri-devel
TMM has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
TMM has joined #dri-devel
ahajda___ has joined #dri-devel
jrayhawk has quit [Quit: leaving]
ahajda_ has quit [Ping timeout: 480 seconds]
ahajda__ has quit [Ping timeout: 480 seconds]
ahajda___ has quit [Ping timeout: 480 seconds]
fxkamd has quit []
agd5f_ has joined #dri-devel
Duke`` has joined #dri-devel
<DavidHeidelberg[m]> hakzsam: I suppose you don't have Helen Android patches included in the VKCTS uprev, right?
<hakzsam> nope?
<DavidHeidelberg[m]> There are a few patches included in Helen's tree which cannot be upstreamed yet, if you apply them it would be best
agd5f has quit [Ping timeout: 480 seconds]
<hakzsam> I will try to remember, thanks
gouchi has joined #dri-devel
sgruszka has quit [Remote host closed the connection]
<emersion> can someone review this? https://patchwork.freedesktop.org/series/109887/
<emersion> just simple logging stuff
<karolherbst> okay.. so my MR regresses stuff, just not on my machine :'(
<karolherbst> this makes no sense...
sewn has joined #dri-devel
<karolherbst> gfxstrand: any idea how https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20161/diffs?commit_id=e908e08deb198153d92075889815192edb12eb30 could break OpenGL? I honestly don't see a path to this code from a GL perspective
<sewn> is this the right channel to ask for mesa compiling help? im experiencing a weird mesa build failure.
jrayhawk has joined #dri-devel
<karolherbst> oh uhhh....
djbw has quit [Read error: Connection reset by peer]
<karolherbst> oops
<jenatali> wael: Yeah, probably
<soreau> sewn: yes
<sewn> despite zstd being disabled via -Dzstd=disabled, libvulkan for amd drivers attempt to link to it, which causes a build failure
tzimmermann has quit [Quit: Leaving]
<soreau> is this with mesa git or a release?
<karolherbst> I think I found it...
<sewn> it also attempts to link to udev, which in the manually specified (variable, -Dpkg_config_path meson option) pkg config paths, it tries to link to it as well.
<sewn> release 22.3.5
djbw has joined #dri-devel
Company has joined #dri-devel
heat has quit [Read error: No route to host]
tursulin has quit [Ping timeout: 480 seconds]
heat has joined #dri-devel
vyivel has quit [Remote host closed the connection]
vyivel has joined #dri-devel
agd5f has joined #dri-devel
agd5f_ has quit [Ping timeout: 480 seconds]
pallavim has joined #dri-devel
ZenWalker has quit [Ping timeout: 480 seconds]
kts has quit [Quit: Leaving]
unerlige1 has left #dri-devel [#dri-devel]
kzd has quit [Quit: kzd]
unerlige has joined #dri-devel
kzd has joined #dri-devel
simon-perretta-img has quit [Ping timeout: 480 seconds]
bluetail98 has joined #dri-devel
tobiasjakobi has joined #dri-devel
tobiasjakobi has quit []
jkrzyszt has joined #dri-devel
MajorBiscuit has quit [Quit: WeeChat 3.6]
ZenWalker has joined #dri-devel
<gfxstrand> karolherbst: Uh... what?
<gfxstrand> Yeah, that makes no sense.
smilessh has quit [Ping timeout: 480 seconds]
heat_ has joined #dri-devel
heat has quit [Read error: No route to host]
<DemiMarie> jenatali: what does “crash” mean in this context? If it means that the kernel driver or GPU firmware crashed, that is a bug in the kernel or GPU firmware.
bestest has joined #dri-devel
<jenatali> It means that the GPU hung, which can also indicate a bug in the usermode driver generating commands that would hang the GPU, or a bug in an app
<bestest> I'm suffering startup crashes on mesa for the game Minit, which uses 32-bit YoYo Games Linux Runner 1.3 and appears to suffer from the issues described here https://gitlab.freedesktop.org/mesa/mesa/-/issues/1310 https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4181 . As far as I can tell, these issues were fixed 2 years ago, but I'm on the latest stable for mesa and the game still crashes. Was there a regression, or was the
<bestest> issue never properly fixed, or am I simply doing something wrong?
<eric_engestrom> sewn: that's really weird; the code that handles this is simple enough that I'm sure there's no bug in it:
<eric_engestrom> zstd is replaced with an empty dependency if you pass `-D zstd=disabled`, so anything after that will never even know zstd is a thing
<sewn> just to be sure, im building this with lib32 in mind, and pkg config path is set to /usr/lib32/pkgconfig
<bestest> Well, it does crash; is there any info I can provide that would help?
<bestest> I tried to get a backtrace, but it just returns no stack
<bestest> Oh, I'm sorry, I misread, my apologies
heat_ has quit [Read error: No route to host]
heat has joined #dri-devel
<soreau> sewn: have you tried removing the build directory and trying again?
<sewn> im building it with a package manager, so technically yes
<eric_engestrom> sewn: that shouldn't make any difference with that; it might not find things or find things it can't use if the lib32 config is missing/wrong, but that's it
<sewn> eric_engestrom: it says udev is found but is not actually in pkg config path
bestest has quit [Quit: Leaving]
kzd_ has joined #dri-devel
kzd has quit [Ping timeout: 480 seconds]
<karolherbst> gfxstrand: yeah... I still have no idea, but apparently my new version changes things and I replaced `deref->var->type` with `deref->type`...
<karolherbst> and now the llvmpipe CI tests aren't failing anymore
<karolherbst> or rather.. randomly crashing
<karolherbst> I still have no idea how that path is even hit..
<jenatali> Yeah deref->var would only be valid for direct variable derefs, but if you've got arrays, it wouldn't be set
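A minimal sketch of the point jenatali makes above, using toy stand-in structs rather than NIR's real definitions (every `toy_*` name here is hypothetical): each deref carries its own result type, while the variable pointer is only meaningful on a direct variable deref, so reading the type through the variable is unsafe once arrays are involved.

```cpp
#include <cassert>

// Toy stand-ins for illustration only; these are NOT NIR's real structs.
struct toy_type { int id; };
struct toy_variable { const toy_type *type; };

enum class toy_deref_kind { var, array };

struct toy_deref {
    toy_deref_kind kind;
    const toy_type *type;      // type of the value this deref yields; always set
    const toy_variable *var;   // only meaningful when kind == var
    const toy_deref *parent;   // only meaningful when kind != var
};

// Safe for any deref kind: uses the type stored on the deref itself.
const toy_type *deref_result_type(const toy_deref &d) {
    return d.type;
}

// Only valid on a direct variable deref; an array deref has no variable set.
const toy_type *deref_variable_type(const toy_deref &d) {
    assert(d.kind == toy_deref_kind::var);
    return d.var->type;
}
```

In this model, an array-element deref built on top of a variable deref reports the element type through `deref_result_type`, whereas `deref_variable_type` would trip its assertion on it.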
bluetail986 has joined #dri-devel
<DemiMarie> jenatali: are GPUs generally unable to prevent malicious userspace from freezing them?
<jenatali> In my experience, yes
<DemiMarie> Is this because of the lack of instruction-level preemption?
<jenatali> Effectively, running a shader on the GPU is kind of like running a userspace process. Someone authored the shader code. If it does something like dereference a null pointer... it's got to crash somehow
<jenatali> Some GPUs can report those types of errors, others just hang
konstantin_ has joined #dri-devel
<jenatali> That's about the extent of my knowledge though
<DemiMarie> I expect the shader to crash, but it should not take down other stuff on the GPU.
<jenatali> Right. Newer GPUs don't have to take down the whole GPU, that particular program would just hang, and then the host OS can reset the engine and set the GPU to work on a different task
iive has joined #dri-devel
<DemiMarie> This is especially important for VR/AR where a malicious shader must not be able to prevent the VR/AR from updating, as otherwise the human user might get sick.
agd5f_ has joined #dri-devel
<gfxstrand> karolherbst: Oh.... Yeah, deref->var->type wasn't going to work
<karolherbst> yeah.. but I'm more confused on how llvmpipe even hit that path...
<karolherbst> I think I still have a regression with AMD, but I can figure out what's wrong there...
bluetail98 has quit [Ping timeout: 480 seconds]
nchery is now known as Guest5166
nchery has joined #dri-devel
<karolherbst> all arb_bindless_texture related, which kind of makes sense *sigh*
neko2 has joined #dri-devel
konstantin has quit [Ping timeout: 480 seconds]
neko2 has left #dri-devel [#dri-devel]
agd5f has quit [Ping timeout: 480 seconds]
Guest5166 has quit [Ping timeout: 480 seconds]
danvet has quit [Read error: Connection reset by peer]
ngcortes has joined #dri-devel
neko2 has joined #dri-devel
<neko2> whoops of course I forgot nick reg
<neko2> hey all. I've encountered a rather nasty amdgpu reset/crash loop today that I can trigger pretty reliably. wondering if this the right channel or if I need to direct it somewhere else... memory is fuzzy but I'm sure it was dri something or another.
<karolherbst> neko2: probably just want to file a bug on gitlab or something, but there is a #radeon channel for more AMD specific things
rmckeever has joined #dri-devel
<neko2> right, I feel like it is more of a kernel thing anyway. yes there is a crash triggered by a program which I've yet to get to the bottom of, but there is this loop of reset fails on top.
<karolherbst> ohh it's more of a hardware thing
<neko2> I suspect sp
<neko2> *so
<neko2> I already think this hw is cursed tbh
<karolherbst> the hardware doesn't really support fault recovery, so all the kernel can do is a full GPU reset
<karolherbst> and often that doesn't go as well
<neko2> shall I just upload the dmesg somewhere so you can see if it's one of those such cases
<karolherbst> _but_ ultimately it's also Userspace sending faulty commands to the hardware
<karolherbst> mhh
danvet has joined #dri-devel
<karolherbst> yeah, I guess it makes sense to file a kernel bug, but also a mesa bug (if it's triggered through GL/VK)
<karolherbst> Userspace obviously shouldn't send garbage, but the kernel should also handle recovery better
<neko2> in this case the offender was libreoffice of all things
<neko2> I can only survive if I disable its acceleration
<karolherbst> mhh
<neko2> here is the log https://paste.rs/peC
<karolherbst> acceleration through OpenCL by any chance or general acceleration?
<neko2> I think general GL
<neko2> it didn't get past the splash screen
<neko2> it'd hang, a ring timeout would fire, the reset loop would happen for a while
<neko2> eventually it gives up and hangs on something else, and after that I lose system responsiveness
<karolherbst> yeah.. if the GPU keeps getting broken commands it will just crash again
<neko2> huh. but the kernel can identify it's soffice.bin. surely if a crash occurs the best thing is to punt the offender
<neko2> obv I'm not a kernel dev
<karolherbst> mhh, not really, or at least that's not what people would like to do
<neko2> well I mean it's standard for userspace
<neko2> you segfault, you have a bug, your results are meaningless
<neko2> you get murdered
<karolherbst> sure.. but it could also just reap the userspace process if it keeps crashing multiple times
<karolherbst> doesn't have to check the process name
<karolherbst> anyway.. there are multiple bugs to be fixed :)
<neko2> in any case, I already expect my hardware doesn't help with the recovery process, given that it's raven ridge on a quirky motherboard
<karolherbst> newer GPUs aren't better
<neko2> still, I have encountered enough bugs at this point to know this combo is a source of curses
<pixelcluster> neko2: it indeed starts out looking like a relatively "normal" gpu hang ("ring comp_1.1.0 timeout")
<karolherbst> well.. newer AMD GPUs
<pixelcluster> the recovery seems *really* cursed though, I never saw it fail like that
<karolherbst> some vendors care more about GPU resets, some.. don't
<neko2> karolherbst: yeahhhhh ngl as much as I hate it, think my next system will be team blue gpu
<pixelcluster> karolherbst: well it can work
<karolherbst> well, yes... but you have to reset the entire GPU
<neko2> pixelcluster: like I said. I think asrock have fucked several things in this firmware
<karolherbst> others aren't that cursed
<pixelcluster> it does work quite well on the steam deck
<neko2> ... wait heck, language check, rip
<karolherbst> yeah.. not saying that it doesn't work most of the time
<neko2> (sorry. I'll keep the f-strikes tactical but I think that one was deserved, this bios is truly awful in many ways)
<karolherbst> but I've seen it fail miserably and also seen it recovering properly
<pixelcluster> I think that is because recovery is better tested in the kernels they ship
<karolherbst> nah.. not all people are from the US here :P
<neko2> like to get this far with the crash loop, I had to disable iommu because this firmware's handling of this is likely broken based on looking up those crash logs (I don't have those sadly, something broke in saving the logs)
<neko2> it would fail on even the first reset otherwise
<pixelcluster> rip
<neko2> so that doesn't bode well
<karolherbst> mhh I can also imagine that enabling iommu makes things worse, because I don't think any developer tests with iommu enabled tbh
<neko2> side note: iommu.strict can be ignored in the cmdline, I had left that there but at this point I disabled it in firmware due to being broken on this system
<karolherbst> well.. can always file bugs
<neko2> not actually sure where to start with the crash loop filing that
<neko2> where do I even file a bug for amdgpu's kernel side
<karolherbst> good question
<neko2> even if they tell me "your hardware is screeeeewed"
<neko2> (I do suspect a 50% chance of this at this point)
<karolherbst> I mean.. it's their hardware, they probably know it's screwed
<karolherbst> :P
<karolherbst> but yeah.. in 95% of the cases where hardware is blamed, it's actually a software bug
<neko2> well when I say hardware, I mean the firmware intimately tied to it
<neko2> there were a few flags reading it that suggested something terribly wrong was occurring at a low level
<karolherbst> maybe
<karolherbst> but that doesn't really matter for GPU resets
<neko2> oh? I was under the impression the firmware had to be in cooperation to reset properly...
<neko2> at least as far as the primary GPU is concerned
<karolherbst> the GPU's firmware yes, but not really the motherboard one
<karolherbst> once you are in your OS the firmware doesn't really do much with the GPU anymore
<neko2> right. well, it's an integrated one, on a 2200G, so I doubt there would be weird firmware for it...?
<karolherbst> yeah and even if.. the driver has to deal with nasty GPU firmware
<karolherbst> OEMs usually get tools from nvidia/AMD to customize the GPU firmware, but that still follows rules
<neko2> just trying some keywords in the search atm to look for dupes
<jenatali> Why are apps awful...
<karolherbst> because they are written by humans :P
<jenatali> GFXBench apparently hardcodes VK_FORMAT_A2R10G10B10, which is optional, when they could just as easily use VK_FORMAT_A2B10G10R10, which is required
<karolherbst> maybe VK_FORMAT_A2R10G10B10 was faster on nvidia?
gouchi has quit [Remote host closed the connection]
mvlad has quit [Remote host closed the connection]
<jenatali> Fine, then check format support and use it if it's there, don't just assume that it is
gouchi has joined #dri-devel
<karolherbst> heh.. using different formats per GPU in a benchmark? that ain't fair :P
<neko2> when did nvidia ever play fair...
<neko2> *ducks*
<jenatali> Then... use the one that's guaranteed to be there
<neko2> ok, I'm not seeing any results thus far that have my particular looping crash issue. so I think it's safe to say I can file a new bug report there.
<karolherbst> you could submit patches and see what they say :D
<jenatali> Is that an OSS benchmark? I didn't think it was
<karolherbst> I mean.. you probably have access to the source, no?
<jenatali> Eh some groups at MSFT do, I don't
<karolherbst> heh
<neko2> karolherbst: btw, assuming the specific trigger (whatever the heck libreoffice is doing) is never asked for in fixing the crash loop, what would then be the steps to go through to fix the bug with libreoffice's whatever-it's-doing causing a crash to begin with? I presume I'd need API traces or such
<karolherbst> but I suspect they'll say no "because that would make older results invalid"
nchery has quit [Read error: Connection reset by peer]
<neko2> (and I imagine that'd be more the mesa side in terms of not sending something that'd crash to the gpu when given dubious input)
<karolherbst> neko2: probably?
<karolherbst> yeah soo there are always different pov here
<karolherbst> you could also argue that libreoffice might use the API incorrectly (if that's the case)
<neko2> oh, so it's not clear whose fault it is yet
<karolherbst> but regardless of that, mesa shouldn't really end up crashing the system
<karolherbst> and the kernel should be able to recover...
<karolherbst> but it also always depends on the things libreoffice is doing
<neko2> I personally think both sides should be corrected if possible tbh, especially if it runs on an older (stable?) kernel that has that bug
<karolherbst> some APIs specify that they can bring down the system if used incorrectly
<neko2> wait, I'm getting my layers confused now, nvm
<neko2> karolherbst: texture handles? x)
<neko2> (which I am told are sometimes just essentially GPU pointers)
<karolherbst> well.. anything where you hand in actual pointers can cause funky problems
<karolherbst> bindless_textures can be such thing in OpenGL e.g.
<neko2> right, bindless textures, that was what I was thinking of
<neko2> I am reminded of the confusingly named attrib pointer functions in openGL which were offsets, not actual pointers, despite the prototype... if I'm remembering that right anyway
<karolherbst> in any case, unprivileged Userspace shouldn't be able to bring down the system, because that's a CVE level bug
nchery has joined #dri-devel
<neko2> yeah, there was already a joke in archlinux-offtopic on libera earlier, what a way to DoS an amdgpu system, send them a spreadsheet to open (or in fact anything that LO is set to open)
<neko2> all I was doing was trying to just view a spreadsheet I'd been sent. T_T alas s!@# happens
<neko2> anyway, thanks all for the input, it seems there's a clear order to at least get that reset bug fixed, then at least once I can trigger the crash without bringing the house down I can then do more useful debugging of whatever the heck libreoffice is doing.
Haaninjo has joined #dri-devel
<jenatali> And of course using ABGR instead of ARGB works just fine
agd5f has joined #dri-devel
agd5f_ has quit [Ping timeout: 480 seconds]
jfalempe has quit [Quit: Leaving]
<cmarcelo> jenatali: what's the minimum required version of MSVC for Mesa?
<jenatali> cmarcelo: Either VS2019 or VS2022, not sure if we'd dropped 2019 yet
<jenatali> Any particular reason?
<jenatali> Ah CI still builds with 2019
<cmarcelo> jenatali: designated initializers... oldest clang / gcc we require support them even without C++20, I know "MSVC 2019 16.1" (is this what CI have?) does support it under /C++20. wondering if there's another flag for that in MSVC.
<jenatali> cmarcelo: No, it's only supported in C++20 mode
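For context, a small sketch of the construct under discussion. The `test_params` struct and `make_params` helper are hypothetical and not taken from the failing test; the point is only that this syntax is standard C++20, which GCC and Clang also accept as an extension in earlier modes, while MSVC (per the exchange above) needs C++20 mode.

```cpp
#include <cassert>

// Hypothetical aggregate, for illustration only.
struct test_params {
    int width;
    int height;
    bool use_compression;
};

test_params make_params() {
    // Designated initializers: members are named in declaration order,
    // and any member left out (height here) is value-initialized.
    test_params p = { .width = 64, .use_compression = true };
    assert(p.height == 0);
    return p;
}
```

Without designated initializers the same value would be written positionally as `{ 64, 0, true }`, which is portable to older standards but easier to get wrong when fields are added.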
<cmarcelo> jenatali: context: https://gitlab.freedesktop.org/mesa/mesa/-/jobs/36599451 line 726
<jenatali> Build that test as C++20?
ngcortes has quit [Ping timeout: 480 seconds]
<cmarcelo> jenatali: would you be ok with "if msvc: set C++20"?
<jenatali> Yeah, fine by me
<jenatali> For that test at least, not sure if we want to upgrade the whole tree
<cmarcelo> sure
<jenatali> cmarcelo: Ping me in the MR if you want an ack from me on the patch :)
<cmarcelo> Cool tks.
ngcortes has joined #dri-devel
nchery has quit [Remote host closed the connection]
nchery has joined #dri-devel
ngcortes has quit [Ping timeout: 480 seconds]
ngcortes has joined #dri-devel
junaid has joined #dri-devel
idr has joined #dri-devel
agd5f_ has joined #dri-devel
<jenatali> idr: Seems plausible. Where'd you find the 1930 number?
<idr> Other sources say 2019 needs /std:c++latest while 2022 can use /std:c++20.
<idr> Surely 38 random forum posts can't steer me wrong.
<jenatali> Yeah I'm just not sure if that's the version number that meson detects
<jenatali> Lemme see what it says for my compiler
<cmarcelo> (shared just to note that: eventually even 2019 got the c++20 too, but not sure if is old enough)
<idr> 2019 16.11... but maybe not 16.9 or 16.10?
agd5f has quit [Ping timeout: 480 seconds]
<jenatali> Yeah I think the first version number with support is 19.29
<jenatali> That lines up with the output from the build job (https://gitlab.freedesktop.org/idr/mesa/-/jobs/36605312) saying the version number is 19.29.30146
<jenatali> _MSC_VER is a macro available in the source, which apparently has no relation to the version reported through stdout? I dunno it's all a mess
<jenatali> Oh I see, 1930 == 19.30, that makes sense. So yeah, < 19.29 instead of < 1930 is what you want
<idr> Yeah... I was just typing something like that. :)
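The mapping worked out above (1930 == 19.30) can be sketched as follows. Both helper names are hypothetical, and the 1929 threshold simply encodes the belief stated in this conversation that 19.29 is the first version accepting a C++20 standard switch; it is not a verified cutoff.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: turn an _MSC_VER-style integer (e.g. 1930) into the
// "major.minor" prefix of the compiler's reported version (e.g. "19.30").
std::string msc_ver_to_version(int msc_ver) {
    assert(msc_ver >= 0);
    int major = msc_ver / 100;
    int minor = msc_ver % 100;
    // Keep the minor part two digits wide so 1900 maps to "19.00".
    std::string minor_str = (minor < 10 ? "0" : "") + std::to_string(minor);
    return std::to_string(major) + "." + minor_str;
}

// Assumption from the chat: 19.29 (_MSC_VER 1929) is the first version with
// a C++20 switch; anything older would fall back to the "latest" mode.
bool msvc_has_cxx20_switch(int msc_ver) {
    return msc_ver >= 1929;
}
```

So a build-system check written against the reported compiler version compares with 19.29, whereas a check inside source code would compare `_MSC_VER` with 1929.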
kzd_ has quit []
<idr> Maybe "if 'c++20' in cc.get_options()" would be better?
<idr> dcbaker: ^^^
agd5f has joined #dri-devel
neko2 has quit [Quit: leaving]
<idr> Or cpp_stds?
kzd has joined #dri-devel
<dcbaker> idr: yeah, if you can `override_options : ['cpp_std=c++20']` or `=c++latest` (meson should understand both), assuming a new enough version
<dcbaker> In famous words, I have some patches that I should finish up that would make that all more robust, but...
apinheiro has quit [Ping timeout: 480 seconds]
agd5f_ has quit [Ping timeout: 480 seconds]
* idr tries that...
<jenatali> Thanks. Sorry MSVC is a bit of a headache here :(
<demarchi> rodrigovivi: as we were talking earlier today, the subdir-ccflags in drivers/gpu/drm/xe/Makefile is applying the cflags to the whole dir instead of just to the display-related compilation units... do you know if there is a way to do one of the options below? 1) add a separate Makefile in the xe/display dir, so subdir-ccflags applies only to that, but still link everything in the xe.ko; or 2) replace the subdir-ccflags with something
<demarchi> else so it only applies to the display/%.o objects?
<jenatali> I wish it just supported designated initializers without having to be in C++20 mode
<dcbaker> I blame the C++ committee
agd5f_ has joined #dri-devel
<idr> dcbaker: Can you elaborate on "override_options"?
<dcbaker> @idr, ah, `override_options` is a keyword to pass to a build target like `cpp_args`, but it tells meson "Hey, you know that default option I told you about? Yeah, ignore that, do this instead" so you'd write something like `cpp_std_override = ['cpp_std=c++latest']\n executable(..., override_options : cpp_std_override)`
<dcbaker> sorry, I should have been more clear about that
<dcbaker> which will stop meson from putting two c++ standard arguments into the command line
<dcbaker> src/intel/compiler does that with c++17
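[editor's note] A minimal sketch of the `override_options` pattern dcbaker describes, assuming a project whose default is `cpp_std=c++17`. The project name, target name, source file, and the MSVC-only condition are hypothetical and are not taken from the Mesa tree; only `override_options` and the `cpp_std` values come from the discussion above.

```meson
# Hypothetical project; the default C++ standard is set once here.
project('override_example', 'cpp', default_options : ['cpp_std=c++17'])

cpp = meson.get_compiler('cpp')

# Choose a per-target override only where it is needed.
# An empty list leaves the project-wide default in place.
if cpp.get_id() == 'msvc'
  cpp_std_override = ['cpp_std=c++latest']
else
  cpp_std_override = []
endif

# override_options replaces the default cpp_std for this target only,
# so the command line carries a single C++ standard argument.
executable(
  'example',
  'example.cpp',
  override_options : cpp_std_override,
)
```

Whether `c++latest` is accepted depends on the Meson and compiler versions in use, as noted in the discussion.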
agd5f has quit [Ping timeout: 480 seconds]
<idr> So... how do I do that to select between option A, option B, or nothing?
<idr> Because the "obvious" things don't work.
<idr> I might have a thing that's good enough...
agd5f_ has quit [Ping timeout: 480 seconds]
<idr> That builds locally. :shrug:
<jenatali> That looks reasonable to me
<idr> jenatali: We'll see if it also looks reasonable to the CI. :)
Duke`` has quit [Ping timeout: 480 seconds]
<rodrigovivi> demarchi: I really don't know... I believe that that separated file under the display dir could do the trick... the worst part that is the one I marked with XXX in the xe/Makefile I believe can be now removed after your patch to include the files directly or to remove the need for the i915 files...
<demarchi> # XXX: Needed for i915 register definitions. Will be removed after xe-regs
<demarchi> this?
<demarchi> this is a nop
<demarchi> the line above it will add the include to all .o
Zopolis4 has joined #dri-devel
<demarchi> oh... you mean, if disabling display in the kconfig
<demarchi> rodrigovivi: no, we can't remove, unless we add more ifdefs around the code. I fixed several of those by removing display completely and checking the errors, but there are some hard ones:
<demarchi> drivers/gpu/drm/xe/xe_device_types.h -> display/intel_display_core.h -> the-world.h
<demarchi> and some files rely on this indirect include, like xe_pci.c
<rodrigovivi> ouch :(
apinheiro has joined #dri-devel
<rodrigovivi> it would be good to have something cleaner for this display reuse...
krushia has joined #dri-devel
danvet has quit [Ping timeout: 480 seconds]
danvet has joined #dri-devel
danvet has quit [Ping timeout: 480 seconds]
danvet has joined #dri-devel
<gfxstrand> Does anyone else remember this crazy loader bug where it falls over on vkGetPhysicalDeviceProperties2KHR() if you support gpdp2 but not 1.1?
<gfxstrand> Or maybe it's a crazy CTS bug?
<jenatali> It's a CTS bug
<gfxstrand> Oh, so someone does remember it. :)
<jenatali> The CTS doesn't enable the extension for gpdp2
<gfxstrand> That'll do it
<jenatali> Yeah... I was tripping over it constantly until I flipped on 1.1
<gfxstrand> Ugh
<jenatali> I assumed it was a regression but according to the history for those tests, nope
<jenatali> And I didn't see any issues filed about it in a quick skim. I probably should've filed one
<gfxstrand> I'm guessing it wasn't a problem until Mesa started doing the right thing and returning NULL if you don't enable an extension
<jenatali> Yeah I'd believe that
<gfxstrand> alright, I'll see if I can fix the CTS quick.
<gfxstrand> I'd make my intern do that but I want her to still like me. (-:
<demarchi> rodrigovivi: for now I'm keeping a hack commit on top "Undo display", that at least lets me test if the rest is moving in the right direction
<demarchi> we may need to rethink the display integration
<demarchi> i.e. I know the way it is right now is temporary, but it shouldn't be causing issues to the rest of the driver
junaid has quit [Remote host closed the connection]
danvet has quit [Ping timeout: 480 seconds]
gouchi has quit [Remote host closed the connection]
jkrzyszt has quit [Remote host closed the connection]
darkapex has quit [Remote host closed the connection]
macromorgan is now known as Guest5176
macromorgan has joined #dri-devel
darkapex has joined #dri-devel
danvet has joined #dri-devel
Guest5176 has quit [Ping timeout: 480 seconds]
jkrzyszt has joined #dri-devel
nchery has quit [Ping timeout: 480 seconds]
nchery has joined #dri-devel
apinheiro has quit [Quit: Leaving]
bgs has quit [Remote host closed the connection]
Haaninjo has quit [Quit: Ex-Chat]
<gfxstrand> Ok, now that I figured out how to actually get SSH to work...
jkrzyszt has quit [Remote host closed the connection]
macromorgan is now known as Guest5180
macromorgan has joined #dri-devel
<Ristovski> I love it when I reboot and get a random beep code but I can't reproduce it anymore
Guest5180 has quit [Read error: Connection reset by peer]
macromorgan is now known as Guest5181
macromorgan has joined #dri-devel
<psykose> they should have replays but for Life
rasterman has quit [Quit: Gettin' stinky!]
Guest5181 has quit [Ping timeout: 480 seconds]
<Ristovski> hmm maybe there already is `irltrace`, and it's being used to edit-and-replay Mike until he makes zink get 9000FPS in every benchmark possible
<zmike> sweatytowelguy.jpg
agd5f has joined #dri-devel
jkrzyszt has joined #dri-devel
warpme_____ has quit []
danvet has quit [Ping timeout: 480 seconds]
<gfxstrand> Why are these tests creating custom instances?!?