heat_ has quit [Remote host closed the connection]
heat_ has joined #dri-devel
<mareko>
I'm thinking of pinning Mesa threads to 1 core each including the app thread, so that they can't be moved between CPUs
<mareko>
benchmark results are too random without it, to the point that CPU performance is not testable
Leopold_ has joined #dri-devel
<jenatali>
That sounds like something reasonable behind an environment variable. I don't know that I agree with that idea for the general case
<airlied>
yeah sounds like something for testing, but would probably screw up in the real world
<mareko>
drivers could opt out
<mareko>
but it should work fine if the thread->core assignment is randomized
<jenatali>
Until you get an app that wants to pin its own threads
<jenatali>
Or the random assigns multiple app threads to the same core
<mareko>
it can do that before creating a GL context
<jenatali>
Yeah but that's not supposed to be part of the GL API. What if a context is bound to multiple threads sequentially?
<airlied>
mareko: still seems like a bad idea outside of benchmark comparisons
<airlied>
screwing with app thread pinning is definitely hostile behaviour
<mareko>
apps pinning their threads won't be affected
<airlied>
how do you know other apps won't be affected though?
<airlied>
like how do you even know what the app thread is
<jenatali>
Right. When do you test that condition? Fighting with an app over that seems bad
<airlied>
apps can have multiple threads on multiple contexts
<mareko>
the app thread calls MakeCurrent
<jenatali>
Definitely seems like driconf at best
<airlied>
like just the interactions with firefox or chromium make me shudder: lots of processes getting pinned
<mareko>
I've been thinking about it for a while, and the problems you're bringing up are solved problems, e.g. multi-context and multi-process thread distribution, using adjacent cores for each context, randomizing thread distribution within a core complex, not changing the thread affinity of app threads that are already assigned to a single core, etc.
<jenatali>
Random pinning also could be bad for benchmark determinism. E.g. if (on Windows at least) you pin to a core that's responsible for DPCs on one run and not on another
<mareko>
for starters let's just consider radeonsi and Linux
jewins has joined #dri-devel
<airlied>
mareko: but mesa isn't the OS scheduler
<airlied>
the gpu driver isn't in charge of making those decisions for the whole desktop env
<mareko>
apps do
<mareko>
libraries do, engines do, runtimes do
<airlied>
and we should add to the mess?
<airlied>
also I'm sure this isn't common across x86 vendor cpus, and even less so when it comes to non-gaming apps or video transcoding servers etc
<airlied>
this isn't the driver's job, it's the scheduler's job; if you want to become a scheduler developer, go do that :-)
<mareko>
there is always a cost-benefit ratio to everything
<jenatali>
Obviously if you scope it like that, it doesn't impact me so I don't really have any stake here. Still sounds like a bad idea to do generally but I'm open to changing my mind with data
<airlied>
like the driver doesn't have enough info to make decisions any better than anyone else, so it's likely outside of some benchmarks it'll make bad decisions just as often as good ones
<mareko>
hypothetically yes
<jenatali>
Yeah that's my take too, as an OS developer
<airlied>
you don't understand anything about the app's thread layout or usage patterns, and you'll also screw with the OS's ability to power down cores
heat has joined #dri-devel
<airlied>
power management is screwy enough
<airlied>
doing this in the driver is the wrong hammer, now you could maybe do it in something like gamemode
heat_ has quit [Read error: No route to host]
<airlied>
where you say I'm launching a game, and gamemode goes and screws all your thread affinities
<mareko>
it can do that
<airlied>
but I'd think amd vs intel cpus will want different things
<airlied>
or at least amd cpus seem to care a lot more about thread locality
<mareko>
the pinning would only happen at initialization based on what the affinity mask is, and then it's up to apps
<mareko>
generally you want threads to stay on the same core, so that you don't lose cached data
<mareko>
other than that, AMD only needs to pin when you have multiple L3 caches, so e.g. 8-core Zen3 doesn't need pinning at all
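A minimal sketch of the kind of L3-scoped pinning being described, not Mesa's actual code: look up which cores share an L3 with the CPU the calling thread is currently on via sysfs, then restrict the thread to that set. The path assumes the usual Linux layout where cache/index3 is the L3.
```c
/* Sketch: restrict the calling thread to the cores that share its L3.
 * Not Mesa's actual implementation; assumes cache/index3 is the L3. */
#define _GNU_SOURCE
#include <sched.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

static int pin_self_to_local_l3(void)
{
   int cpu = sched_getcpu();
   if (cpu < 0)
      return -1;

   char path[128];
   snprintf(path, sizeof(path),
            "/sys/devices/system/cpu/cpu%d/cache/index3/shared_cpu_list", cpu);

   FILE *f = fopen(path, "r");
   if (!f)
      return -1; /* no L3 info exposed: leave scheduling to the kernel */

   char list[256];
   if (!fgets(list, sizeof(list), f)) {
      fclose(f);
      return -1;
   }
   fclose(f);

   /* shared_cpu_list looks like "0-7" or "0-3,8-11": build a cpu_set_t. */
   cpu_set_t set;
   CPU_ZERO(&set);
   for (char *tok = strtok(list, ",\n"); tok; tok = strtok(NULL, ",\n")) {
      int lo, hi;
      if (sscanf(tok, "%d-%d", &lo, &hi) == 2) {
         for (int i = lo; i <= hi; i++)
            CPU_SET(i, &set);
      } else if (sscanf(tok, "%d", &lo) == 1) {
         CPU_SET(lo, &set);
      }
   }

   /* Only ever applied to threads Mesa owns, never to app threads. */
   return pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}
```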
<airlied>
you just reminded me of 8945, probably need to opt llvmpipe out of any pinning to cpus we aren't allowed to use
<airlied>
that we do that is also actively user hostile
<mareko>
an app pinning its thread calling GL to a single core is driver hostile
<mareko>
the current L3 cache pinning mechanism in Mesa has the best cost-benefit (i.e. pros/cons) ratio of any solution
<airlied>
an app can't be hostile to the driver, it can just have bad performance due to decisions it makes
<airlied>
if anyone cares they should contact the app developers and fix it
<airlied>
just because you have a driver, it doesn't mean you get to fix the world's problems in it
<airlied>
like yes it's easier to just hack it and move on, but it will screw up others trying to do the right thing
<mareko>
contacting app developers doesn't work :) ok, let's message 1 million app developers about how they should pin threads
<airlied>
they don't all do it wrong though do they
<mareko>
I think about it as a cost-benefit and ROI situation
<mareko>
the cost of doing it correctly in the kernel is too high
kts has joined #dri-devel
<mareko>
for a Mesa person that is
kts has quit [Remote host closed the connection]
<airlied>
yeah that's why I'd suggest something like gamemode
<airlied>
since it already screws with all those things
<mareko>
it's not possible to implement a thread scheduler in a different thread or process that does exactly what Mesa does right now, because pthreads are not visible in /proc (only the PID is), and thus you can't query which CPU a pthread is running on from a different pthread
<mareko>
the only pthread that can open its /proc/.../stat is self
<airlied>
huh threads are all under /proc/<pid>/task/<tid> ?
<mareko>
nope
Leopold_ has quit [Remote host closed the connection]
<mareko>
even if you open the stat file of task/tid and verify that read() works, and you pass the fd to another thread, the read() on the other thread fails
<mareko>
the last thing worth trying is sending the opened fd of the stat file over a socket to another thread, so that it's properly duped, but I don't know if that would work
<mareko>
other than that, you can only open the main thread (pid) from any thread, not tid
<mareko>
the stat file I mean
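For reference, a sketch of the two ways to ask "which CPU is this thread on" that are being debated: sched_getcpu() answers only for the calling thread, while /proc/<pid>/task/<tid>/stat carries a "processor" field (the 39th per proc(5)) for any task. Whether that file is readable from another thread is exactly what is in dispute here, so the second helper is best-effort.
```c
/* Sketch of the two queries under discussion. sched_getcpu() only answers
 * for the calling thread; the stat parser is best-effort, since whether
 * another thread may read /proc/<pid>/task/<tid>/stat is the open question
 * above. Field 39 of stat is "processor" per proc(5). */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Which CPU is the *calling* thread on right now? */
static int cpu_of_self(void)
{
   return sched_getcpu();
}

/* Last CPU a given task ran on, parsed from its stat file. */
static int cpu_of_task(pid_t pid, pid_t tid)
{
   char path[96], buf[1024];
   snprintf(path, sizeof(path), "/proc/%d/task/%d/stat", (int)pid, (int)tid);

   FILE *f = fopen(path, "r");
   if (!f)
      return -1;
   size_t n = fread(buf, 1, sizeof(buf) - 1, f);
   fclose(f);
   buf[n] = '\0';

   /* Skip past the comm field, which is wrapped in parentheses and may
    * itself contain spaces, then walk to field 39 (processor). */
   char *p = strrchr(buf, ')');
   if (!p)
      return -1;
   int field = 2;
   for (char *tok = strtok(p + 1, " "); tok; tok = strtok(NULL, " ")) {
      if (++field == 39)
         return atoi(tok);
   }
   return -1;
}
```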
<airlied>
seems like cpuset could be used for something like that
<mareko>
either the kernel or Mesa must do it
<mareko>
realistically it all sucks, practically it's the best we have
<airlied>
I would think having gamemode configure cpusets or cgroups would possibly be a better idea, when you say stat do you mean status?
<airlied>
stat doesn't seem to contain anything we'd want
glennk has quit [Ping timeout: 480 seconds]
<mareko>
/proc/pid/task/tid/stat contains the CPU number, which is useful when you want to move a set of threads to the same complex
<mareko>
but only the current thread can read it if it's not the main (process) thread, so it's as good as sched_getcpu
<airlied>
it's weird, I can read task from any process/thread my user is running
<airlied>
cat /proc/*/task/*/stat
<airlied>
or do you get 0 for the cpu in that case?
<mareko>
tids shouldn't be visible to getdents according to the documentation, so cat shouldn't print them
<airlied>
prints them here
<mareko>
but I can see them, hm
<airlied>
at least for firefox
<mareko>
pthread_create -> SYS_gettid in that thread -> trying to open /proc/self/task/tid/stat fails in another thread for me
<mareko>
I'm going to pursue the single core pinning idea as driconf at least
<mareko>
and possibly make it a default for radeonsi except pinning the app thread
<mareko>
zink will likely follow because it likes numbers
<airlied>
just get some numbers that aren't just specviewperf
<airlied>
real games showing actual fps changes is more likely to persuade people, or persuade phoronix to try it out :-P
<jenatali>
+1 multiple benchmarks, ideally from real apps would be great to see
simondnnsn has joined #dri-devel
The_Company has quit []
eukara has quit [Ping timeout: 480 seconds]
heat_ has joined #dri-devel
heat has quit [Read error: Connection reset by peer]
sima has joined #dri-devel
lemonzest has quit [Quit: WeeChat 4.1.2]
lemonzest has joined #dri-devel
kts has joined #dri-devel
ondracka_ has joined #dri-devel
heat_ has quit [Remote host closed the connection]
heat_ has joined #dri-devel
Duke`` has joined #dri-devel
YuGiOhJCJ has joined #dri-devel
heat_ has quit [Read error: No route to host]
heat_ has joined #dri-devel
Leopold_ has joined #dri-devel
itoral has joined #dri-devel
flynnjiang1 has joined #dri-devel
flynnjiang has quit [Read error: Connection reset by peer]
YuGiOhJCJ has quit [Ping timeout: 480 seconds]
YuGiOhJCJ has joined #dri-devel
Duke`` has quit [Ping timeout: 480 seconds]
ayaka has quit [Quit: byte]
ayaka has joined #dri-devel
kts has quit [Ping timeout: 480 seconds]
<HdkR>
An additional reason CPU pinning is a bad idea that I didn't see mentioned: in the big.LITTLE world we're in, pinning randomly can have catastrophic performance implications
<HdkR>
Obviously not a concern when the system has homogeneous CPU cores or cache layout, but Intel, AMD, and ARM devices are all changing this and it should be left to the kernel to schedule properly
<mareko>
airlied: there are not many GL games that still work anymore, most of Feral's games don't run on recent distros
<HdkR>
As a "fun" aside, on an AMD Phoenix system with Zen4 + Zen4c cores, userspace can't determine which cores are which. CPUID returns identical data. So if you pin on to a Zen4c core, you immediately lose performance.
<HdkR>
At least on Intel systems you can query which cores are "Atom" versus "Core"
<HdkR>
Getting randomly pinned on Meteor Lake's low power E-Core cluster is going to have a really bad time. Similar to getting randomly pinned to a Cortex-A53
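For what it's worth, the Intel-side query HdkR refers to can be sketched roughly like this, using GCC/Clang's <cpuid.h>. The leaf and bit numbers (leaf 0x07 EDX bit 15 for the hybrid flag, leaf 0x1A EAX[31:24] for the core type, 0x20 Atom vs 0x40 Core) are quoted from memory and should be verified against the SDM; note the answer is for whichever core the thread happens to be scheduled on.
```c
/* Rough sketch of the hybrid core-type query; verify the leaf/bit numbers
 * against the SDM before relying on them. */
#include <cpuid.h>

static const char *current_core_type(void)
{
   unsigned eax, ebx, ecx, edx;

   /* Leaf 0x07 EDX bit 15: this is a hybrid part. */
   if (!__get_cpuid_count(0x07, 0, &eax, &ebx, &ecx, &edx) ||
       !(edx & (1u << 15)))
      return "not hybrid (or unknown)";

   /* Leaf 0x1A EAX[31:24]: core type of the current CPU. */
   if (!__get_cpuid_count(0x1A, 0, &eax, &ebx, &ecx, &edx))
      return "unknown";

   switch (eax >> 24) {
   case 0x20: return "Atom (E-core)";
   case 0x40: return "Core (P-core)";
   default:   return "unknown";
   }
}
```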
<mareko>
that's all nice, but it doesn't help the fact that there is no better solution
<mareko>
what Mesa does right now is the best thing we have
<mareko>
it's so good that you don't want to run without it in most cases
<HdkR>
Indeed, the best solution ends up leaving the scheduling to the kernel knowing the downsides that get incurred there
<mareko>
it doesn't leave it to the kernel
<HdkR>
You've got the special case for AMD cpu systems that pin to a single L3 arrangement, but everything else should fall back to kernel affinity scheduling?
<mareko>
the GL calling thread is free
<mareko>
3 driver threads follow it using sched_getcpu and dynamic thread affinity adjustments
<mareko>
enabled by radeonsi and zink
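A hedged sketch of that "follow the app thread" behaviour, not the actual radeonsi/util code: the GL thread periodically samples sched_getcpu() and, when it has landed on a different L3, only Mesa's own threads get their affinity reset to that L3's mask. The per-L3 masks and CPU-to-L3 map are assumed to be built up front (e.g. from sysfs as in the earlier sketch).
```c
/* Hedged sketch only. Assumes prebuilt per-L3 masks, a CPU-to-L3 map, and
 * the list of threads Mesa itself created; called periodically from the
 * GL thread. */
#define _GNU_SOURCE
#include <sched.h>
#include <pthread.h>

struct follower_state {
   const cpu_set_t *l3_mask;   /* one affinity mask per L3 (assumed built) */
   const int *l3_of_cpu;       /* CPU index -> L3 index (assumed built) */
   pthread_t *driver_threads;  /* threads Mesa owns, never app threads */
   int num_driver_threads;
   int current_l3;             /* L3 the driver threads currently sit on */
};

static void follow_app_thread(struct follower_state *s)
{
   int cpu = sched_getcpu();
   if (cpu < 0)
      return;

   int l3 = s->l3_of_cpu[cpu];
   if (l3 == s->current_l3)
      return; /* the app thread stayed on the same L3, nothing to do */

   /* The app thread migrated to another L3: move only Mesa's own threads
    * after it so they keep sharing its cache. */
   for (int i = 0; i < s->num_driver_threads; i++)
      pthread_setaffinity_np(s->driver_threads[i],
                             sizeof(cpu_set_t), &s->l3_mask[l3]);
   s->current_l3 = l3;
}
```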
YuGiOhJCJ has quit [Ping timeout: 480 seconds]
<ishitatsuyuki>
there is cache affinity, which mareko just described and is already implemented. then there is also pinning the bottlenecked / single-threaded thread to the faster core, which doesn't seem to be widely done or getting plumbed yet
<mareko>
the affinity changes based on which core the GL calling thread is occupying
<mareko>
it actually looks pretty awesome in the gnome system monitor if you set the same color to CPUs of the same L3, only the cores of the same color are utilized, and when the app thread jumps to a different L3, the Mesa threads jump to that L3 too
YuGiOhJCJ has joined #dri-devel
<mareko>
the next step is to prevent the app thread from jumping to a different L3, and an even stronger solution would be to disallow threads from moving between cores completely
<mareko>
without that, performance is still too random sometimes
flynnjiang1 has quit [Read error: Connection reset by peer]
<mareko>
up to -15% random
<mareko>
we'll hardcode this and big.LITTLE stuff in Mesa if we have to
mszyprow has joined #dri-devel
<HdkR>
I guess I'll revisit this topic once I get caught by a game getting locked to a Cortex-A520 cluster
<mareko>
it's only applied to Zen CPUs for now
<HdkR>
That's good
<mareko>
usually you get 1 person trying to improve the situation by actually writing the code and 10 naysayers whose only job is to complain
<HdkR>
I can confirm that I'm a naysayer
<karolherbst>
airlied: fyi, llvmpipe breaks on llvm-18
<karolherbst>
"Callsite was not defined with variable arguments! ptr @llvm.coro.end"
<mareko>
Valve and low-mid APUs aren't even affected because you need a Zen CPU AND at least 2 L3 caches (12+ cores) to get the thread pinning code enabled
<mareko>
on Zen 1-2 you need 6+ cores
<HdkR>
and Phoenix isn't affected because it shares L3 due to the monolithic die
<mareko>
it has 8 cores max, so 1 L3
shoffmeister[m] has left #dri-devel [#dri-devel]
<airlied>
mareko: also we have a lot more apps using GL now than previously; with gtk4 most apps will use GL, and I'm not sure we want to pin every single desktop app to misc cores
<airlied>
again you might get better perf in one game, but are screwing the whole power management
<airlied>
like you would likely want to migrate some background tasks to perf cores, having mesa pin them is hostile
<airlied>
saying nobody else wants to write code to screw with cpu affinity in a gpu driver project is a bit disingenuous
<airlied>
nobody should be writing cpu affinity screwing code in a gpu driver project, we don't have the expertise to make those sort of decisions
<mareko>
yeah gtk4 changes the game
<airlied>
we might move gtk4 to vulkan sooner though, but initially will be gl4 based I think
<mareko>
the kernel folks apparently don't have the expertise either, or maybe it's too difficult to implement it there
<airlied>
I'm sure there are lots of scheduler people who would love to talk about a game optimised scheduler in the kernel :-P
<airlied>
kernel graphics folks probably don't, you'd want scheduler people
mszyprow has quit [Ping timeout: 480 seconds]
<mareko>
that's who I mean
<airlied>
I'd have thought amd would have some cpu sched people who were big on getting ccix locality right
<airlied>
like cluster aware scheduler was one thing in that area
<mareko>
it's been 7 years since Zen launched, where have the sched people been all this time?
<airlied>
I'm also aware that even if the kernel improves nobody will notice with apps if userspace overrides things anyways
<mareko>
trust me, the kernel does nothing today, I can test it immediately
<airlied>
I'm going to guess nobody has done much work outside of EPYC cpus
<airlied>
at AMD at least
<mareko>
yep, nothing
<airlied>
seems like the cluster scheduler guy at amd would be worth talking to to see if he has any interest in ryzen
<mareko>
Linux 6.5.0, the Mesa thread pinning improves performance by 11% in a simple open source game I just ran
<airlied>
esp since it seems to need L3 clusters
frieder has joined #dri-devel
fab has joined #dri-devel
<mareko>
airlied: if Mesa keeps the app thread alone and only prevents its own threads from moving, the kernel scheduler will be free to schedule the rest of the system around those
<mareko>
also those threads would be mostly idle with gtk apps
<mareko>
it's worth exploring as an improvement
vliaskov_ has joined #dri-devel
tursulin has joined #dri-devel
vliaskov__ has joined #dri-devel
jfalempe has quit [Remote host closed the connection]
sghuge has quit [Remote host closed the connection]
sghuge has joined #dri-devel
jsa has joined #dri-devel
vliaskov_ has quit [Ping timeout: 480 seconds]
mripard has quit [Quit: mripard]
mripard has joined #dri-devel
rasterman has joined #dri-devel
<MrCooper>
mareko: instead of going behind the backs of the kernel scheduler and user, you should work toward adding UAPI which gives the former the information to do a better job
<mareko>
not my expertise
<MrCooper>
too bad then, this definitely isn't Mesa's business though
<mareko>
we've already talked about that it is userspace's business though
<MrCooper>
I disagree
<mareko>
you can
jewins has quit [Ping timeout: 480 seconds]
<mareko>
the affinity API exists precisely for this
<mareko>
that gitlab issue is for swrast_dri.so, which doesn't use any of my Mesa thread pinning, so I don't know why people are even talking to me about it
<mareko>
other than cpu_detect
<ccr>
perhaps that fact should be mentioned in the issue(?)
<mareko>
yes
<MrCooper>
mareko: can you not try to make me responsible for solving your issues? I'm not the one who's proposing to make Mesa do something it shouldn't
<mareko>
MrCooper: There are no issues for me to solve. What we have in Mesa gives me 11% more CPU performance. That's what I care about. Mesa should absolutely continue doing that if there is no other alternative. If you disagree, that's on you.
vliaskov__ has quit [Read error: Connection reset by peer]
<MrCooper>
guess I'll have to live with that
lynxeye has joined #dri-devel
<mareko>
if there are problems with apps, we'll deal with them individually
<MrCooper>
if there's no issue to solve, you don't need to do the thing which started this discussion
pcercuei has joined #dri-devel
<mareko>
I think you are still bummed out by the fact that my initial implementation broke apps and we had an argument about it, but the current implementation is solid and has been enabled since 2020, adding up to 33% CPU performance, which is like a multi-generational uplift
<MrCooper>
no, it's always been clearly the wrong way to achieve this
<mareko>
actually since 2018 even
<MrCooper>
so you've had 6 years to get in touch with kernel scheduler developers
<mareko>
I have, trust me
<mareko>
well, AMD ones, that is
colemickens_ has joined #dri-devel
trofi has left #dri-devel [.]
<MrCooper>
mareko: did you propose to them adding UAPI to tell the kernel scheduler which threads should be kept closely together?
<MrCooper>
that might result in even better performance than what Mesa does now, which fights the scheduler and lags behind it
ADS_Sr_ has quit [Ping timeout: 480 seconds]
hansg has joined #dri-devel
yyds_ has joined #dri-devel
yyds has quit [Ping timeout: 480 seconds]
<mareko>
they wouldn't even talk about doing anything about the situation
<mareko>
it doesn't fight the scheduler, in fact, it complements the scheduler very well and gives the scheduler a lot of freedom because the affinity masks have up to 8 bits set
<mareko>
16 bits, actually
glennk has joined #dri-devel
<mareko>
apps are completely unaffected by the current behavior because Mesa doesn't pin threads, it constantly moves them where app threads are scheduled
<mareko>
and it only moves its own threads, not app threads
apinheiro has joined #dri-devel
<mareko>
a kernel scheduler wouldn't be more app-friendly than that
dtmrzgl has quit []
<mareko>
airlied: ^^
Leopold_ has quit [Remote host closed the connection]
<mareko>
if an app pins its GL context to an L3 cache, Mesa is nice and moves its threads under that L3 cache too, it's the nicest thing Mesa could do for apps :)
dtmrzgl has joined #dri-devel
<lynxeye>
mareko: The kernel scheduler could proactively move the driver thread to the same L3 cache before the next wakeup when it moves the app thread if it has the information that app and driver thread are closely cooperating.
<lynxeye>
it could even move the driver thread closer to the app thread (like same l2) if cpu capacity allows. Also L3 cache isn't the only thing to consider, on a multi-cluster setup like many modern ARM systems you want the threads on the same cluster. The kernel already has all the required topology information, Mesa would need to learn this information for each system.
Haaninjo has joined #dri-devel
<MrCooper>
lynxeye++
<MrCooper>
also, Mesa moving its threads might make the scheduler want to move the application thread somewhere else again in some cases
<MrCooper>
that's fighting, not cooperation
vliaskov has joined #dri-devel
sravn has quit []
sravn has joined #dri-devel
anarsoul|2 has joined #dri-devel
anarsoul has quit [Read error: Connection reset by peer]
azsxdcz_ has joined #dri-devel
mclasen has joined #dri-devel
glennk has quit [Ping timeout: 480 seconds]
azsxdcz_ has quit [Remote host closed the connection]
<mareko>
MrCooper: some of your statements are correct, but you need to choose better words and remove personal biases
<MrCooper>
take a look in the mirror
<mareko>
now you're just being an asshole
<MrCooper>
I could say the same about you
<mareko>
I try to use logic, not personal opinions
<MrCooper>
not a good word though
<MrCooper>
that makes two of us then
<mareko>
airlied: you owe me an apology for calling it "app hostile" while it's arguably friendly to apps, and for linking an unrelated ticket
<mareko>
a kernel solution would be better, but not by much
<mareko>
it wouldn't be very different from what's being done now
<MrCooper>
he doesn't owe you anything, certainly not for any of that
<MrCooper>
lynxeye explained the difference well
crabbedhaloablut has quit [Read error: Connection reset by peer]
crabbedhaloablut has joined #dri-devel
<pq>
mareko, please, be nicer than this.
glennk has joined #dri-devel
<mareko>
MrCooper: there is nothing "app hostile" in Mesa right now, that was undeserved
<mareko>
MrCooper: what lynxeye said is correct, but it wouldn't conflict with the current Mesa code, in fact, it would actually work very well with it because it would cause the Mesa code to have no effect
<mareko>
I'm not seeing here what you don't like about it
<mareko>
yes, we agree that the kernel should do it instead, but it would be more of the same thing, that's cooperation
hansg has quit [Remote host closed the connection]
Leopold_ has joined #dri-devel
<pq>
mareko, the discussion seemed be about the pinning you proposed to *add* into Mesa, and not about what Mesa already does.
<pq>
mostly
yyds_ has quit []
<pq>
It seems to me these two different topics are being confused.
Leopold__ has joined #dri-devel
<mareko>
pq: that seems true for most people, but MrCooper really doesn't like what Mesa currently does and I have a reason to believe that because it's a continuation of our heated debate from 2018
<pq>
alright
itoral has quit [Quit: Leaving]
Leopold_ has quit [Ping timeout: 480 seconds]
<pq>
most of the discussion here did look like it was about adding even more CPU pinning, though
<mareko>
yes, that is being considered
Leopold__ has quit [Remote host closed the connection]
<mareko>
pinning of the app thread wouldn't be friendly, but pinning and moving Mesa threads is only Mesa's business, nobody else's in userspace
jfalempe has joined #dri-devel
<pq>
I'm not sure... some cgroup CPU load balancer might very well want to pin threads differently
jfalempe has quit [Remote host closed the connection]
yuq825 has quit [Read error: Connection reset by peer]
jfalempe has joined #dri-devel
yuq825 has joined #dri-devel
<mareko>
hiding cores from the process will work, but as long as they are visible and the affinity API allows anything and Mesa created those threads, it can do whatever
<mareko>
so if you want something different, lower the privileges of the process
<pq>
that seems very inconvenient, and I wouldn't know how to do that
Daanct12 has quit [Quit: WeeChat 4.1.2]
macslayer has quit [Ping timeout: 480 seconds]
yyds has joined #dri-devel
<mupuf>
robclark: your subscription to igt-dev was removed
mszyprow has joined #dri-devel
Ermine has quit [Remote host closed the connection]
Ermine has joined #dri-devel
cef has quit [Ping timeout: 480 seconds]
glennk has quit [Ping timeout: 480 seconds]
cef has joined #dri-devel
apinheiro has quit [Quit: Leaving]
crabbedhaloablut has quit [Read error: Connection reset by peer]
crabbedhaloablut has joined #dri-devel
tzimmermann has quit [Ping timeout: 480 seconds]
kts has joined #dri-devel
kts has quit [Ping timeout: 480 seconds]
<robclark>
mupuf: gah... why do we even still use email list for igt?
kts has joined #dri-devel
<pinchartl>
robclark: I didn't know you preferred usenet over e-mail
jewins has joined #dri-devel
YuGiOhJCJ has quit [Quit: YuGiOhJCJ]
<robclark>
heh, well email is getting less and less useful.. not sure about switching to usenet but gitlab would seem like a no-brainer for igt
<mupuf>
robclark: yeah, Intel dropped the ball on this..
glennk has joined #dri-devel
kts has quit [Ping timeout: 480 seconds]
mvchtz has quit [Ping timeout: 480 seconds]
yuq825 has left #dri-devel [#dri-devel]
<any1>
Maybe we can vote on what to call the color format property here? Some candidates are "color format" and "force color format".
<any1>
sima, pq, emersion, swick[m]: ^
cef has quit [Ping timeout: 480 seconds]
Company has joined #dri-devel
<mripard>
color format seems like the most consistent choice
cef has joined #dri-devel
hansg has joined #dri-devel
<pq>
looking at CTA-861-H, I think color format is the right wording
<emersion>
if there's a conflict, it's on downstream
<any1>
pq: hehe, good point.
<pq>
any1, I wouldn't include "force" in it, because "force auto" would be odd, right? And props are always force anyway.
<any1>
emersion: Yeah, but there's no point in making their lives difficult. ;)
<emersion>
yeah, if it doesn't make the name ugly
<emersion>
i suppose it's a good thing the naming scheme is the wild west in the end
<pq>
I think all the existing downstream ones are ugly enough, "color" is not popular
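A kernel-side sketch of what registering the property under that name could look like, using the standard enum-property helpers; the enum values below are placeholders for illustration, not anything decided in this discussion.
```c
/* Sketch only; the value list is a placeholder. */
#include <linux/kernel.h>
#include <drm/drm_connector.h>
#include <drm/drm_property.h>

static const struct drm_prop_enum_list color_format_list[] = {
	{ 0, "Automatic" },
	{ 1, "RGB" },
	{ 2, "YCbCr 4:4:4" },
	{ 3, "YCbCr 4:2:2" },
	{ 4, "YCbCr 4:2:0" },
};

static int attach_color_format_prop(struct drm_connector *connector)
{
	struct drm_property *prop;

	prop = drm_property_create_enum(connector->dev, 0, "color format",
					color_format_list,
					ARRAY_SIZE(color_format_list));
	if (!prop)
		return -ENOMEM;

	/* Default to automatic selection. */
	drm_object_attach_property(&connector->base, prop, 0);
	return 0;
}
```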
<bl4ckb0ne>
what about camelsnake_Case
<emersion>
if there is a conflict just pick "сolor format" (yes c ≠ с)
<pq>
that's mean
<emersion>
i love UTF-8
<bl4ckb0ne>
thats just vile
<emersion>
bl4ckb0ne: so far we've managed to not be inconsistent inside a prop for the case style
<bl4ckb0ne>
ship it
<emersion>
that's the new milestone i suppose
kts has quit [Quit: Leaving]
kts has joined #dri-devel
greenjustin has joined #dri-devel
<any1>
Thanks people, looks like we'll be calling it "цолор формат"
<ccr>
:P
<ccr>
best to use unicode smileys
<ccr>
emojis
Dr_Who has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
<emersion>
"🎨⚙️"
<bl4ckb0ne>
just call it colour fourmat
macslayer has joined #dri-devel
<any1>
We also haven't exhausted names of famous painters like Michelangelo, Rembrandt and Bob Ross.
<pinchartl>
robclark: there's an ongoing trend to care less and less about @gmail.com for mailing lists. not sure if that applies to @google.com too :-)
<pinchartl>
Konstantin has been telling people that they should switch to a different e-mail provider to subscribe to mailing lists
<mripard>
lei is also a great solution
<robclark>
switching to a new email provider is work and probably expense for me.. I'm more likely to not bother since more of the stuff I care about is gitlab these days
<mripard>
(and it's free!)
<pinchartl>
lei is really good, yes
<pinchartl>
robclark: if it works for you. you will be missed :-)
kts has quit [Ping timeout: 480 seconds]
<ccr>
gmail is apparently getting rather annoying due to their spam filtering/refusal to talk with mail servers on some seemingly arbitrary set of rules
<robclark>
well, I mean mesa shifted off of email, so that mainly leaves dri-devel... and well, igt.. but igt is already hosted at gitlab
<robclark>
yeah, I mean gmail can be annoying.. but email is pita for workflow.. we might as well be faxing patches
tzimmermann has joined #dri-devel
tlwoerner has quit [Quit: Leaving]
kts has joined #dri-devel
tlwoerner has joined #dri-devel
<mripard>
deal
dviola has quit [Ping timeout: 480 seconds]
mvlad has joined #dri-devel
Duke`` has joined #dri-devel
<Company>
if glTexImage2D(..., GL_RGBA8, GL_BGRA, ...) is accepted by Mesa's GLES - is that a bug that I should file?
simondnnsn has quit [Read error: Connection reset by peer]
mripard has quit [Remote host closed the connection]
glennk has quit [Ping timeout: 480 seconds]
<Company>
daniels: is that only BGRA or is that something more common?
raket has left #dri-devel [#dri-devel]
mvchtz has joined #dri-devel
columbarius has joined #dri-devel
<daniels>
Company: only BGRA8
<Company>
I'm wondering if it's good enough to add an if (BGRA && is_gles()) internal_format = GL_BGRA;
<daniels>
yeah ... I decided earlier today that I was just going to do that
co1umbarius has quit [Ping timeout: 480 seconds]
<daniels>
because even when we do get the new extension, we still need to run on systems without it
<Company>
yeah, a new extension is not gonna help
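A sketch of the workaround being discussed, assuming libepoxy's combined header (what GTK links against); is_gles stands in for however the caller tracks which API it is on. The point is just that desktop GL accepts a sized GL_RGBA8 internal format with a GL_BGRA upload format, while the GLES EXT_texture_format_BGRA8888 path wants GL_BGRA as the internal format too.
```c
/* Sketch of the workaround; GL_BGRA and GL_BGRA_EXT share the same value. */
#include <epoxy/gl.h>

static void upload_bgra(GLsizei width, GLsizei height,
                        const void *pixels, int is_gles)
{
   /* Desktop GL accepts a sized internal format with a swizzled upload
    * format; the GLES EXT_texture_format_BGRA8888 path wants both enums
    * to be GL_BGRA. */
   GLint internal_format = is_gles ? GL_BGRA : GL_RGBA8;

   glTexImage2D(GL_TEXTURE_2D, 0, internal_format, width, height, 0,
                GL_BGRA, GL_UNSIGNED_BYTE, pixels);
}
```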
<gfxstrand>
Ah, Arm fixed-rate compression. That isn't scary at all. :sweat:
<daniels>
gfxstrand: it's not scary in and of itself; what's scary is that it's actually required
<gfxstrand>
Wait, what?!?
simondnnsn has joined #dri-devel
<daniels>
gfxstrand: RAM has somehow got rather more expensive over time
<daniels>
so we have at least one client who needs AFRC because their price point dictates very little RAM, but the feature demands dictate large enough buffers that you need lossy compression
<gfxstrand>
*sigh*
kts has quit [Quit: Leaving]
<gfxstrand>
I mean, I'm all for reducing waste but, uh...
jfalempe has quit [Read error: Connection reset by peer]
<gfxstrand>
How good are the compression rates? I guess it's probably better to do lossy compression than just throw 565 at it and call it a day.
<gfxstrand>
For as much as I kinda hate it, lossy render compression is a really neet tiler trick.
<RSpliet>
"visually lossless", according to their marketeers
<gfxstrand>
Well, yes. Everything's always "visually lossless". That's what they say about all those H.265 artifacts on cable TV, too. :-P
<Company>
I think that's my new favorite Vulkan definition
fab has quit [Quit: fab]
rasterman has joined #dri-devel
<daniels>
it's not just that it's configurable, it's that you're _required_ to configure it to use it, rather than it just occurring without you wanting it, which is definitely a good thing
<daniels>
(AFBC, which is actually lossless, given that it does the CCS trick of only saving memory bandwidth rather than space, just happens automatically)
<daniels>
I haven't done much intensive staring at it, but I'm told that you do really notice AFRC visually when you push it down to 2bpc :P
<Company>
AFRC is just cutting off bits, no?
<MrCooper>
man, CMake sure is sloooow compared to meson
<daniels>
Company: it's far smarter than just decimating precision
rasterman has quit []
<Company>
is there a description of how it works?
<daniels>
the only public sources I know of are patents, and I don't read patents as a rule
<RSpliet>
I imagine they pulled some tricks from texture compression algorithms. Well, not ASTC of course, the amount of computation you need for compressing that is bonkers, but simpler algorithms
<Company>
then I'll doubt the "far smarter" part for now
<Company>
the biggest innovation in there seems to be the guaranteed size reduction
<Company>
that'd be the typical kind of lossy compression you want for video/photographs
<Company>
that kills your desktop UIs and text
jfalempe has joined #dri-devel
djbw has joined #dri-devel
hansg has quit [Quit: Leaving]
lynxeye has quit [Quit: Leaving.]
callen92 has joined #dri-devel
hansg has joined #dri-devel
Leopold has joined #dri-devel
<CounterPillow>
the video/photo compression mostly kills your UIs and text due to chroma subsampling, which this likely doesn't do
simondnnsn has quit [Read error: Connection reset by peer]
rasterman has joined #dri-devel
simondnnsn has joined #dri-devel
jewins has quit [Ping timeout: 480 seconds]
glennk has joined #dri-devel
fab has joined #dri-devel
jewins has joined #dri-devel
colemickens_ has quit []
eukara has joined #dri-devel
dsrt^ has quit [Ping timeout: 480 seconds]
<Company>
CounterPillow: all kinds of subsampling kill it, because you lose straight lines
eukara has quit []
eukara has joined #dri-devel
<Company>
desktop UIs and fonts are absolute fans of using 1px wide lines - horizontal and vertical, too
dsrt^ has joined #dri-devel
Leopold has quit [Remote host closed the connection]
<JoshuaAshton>
Has there ever been discussion about code that allowed drm_sched jobs to be submitted with timeouts lower than `sched->timeout`? Anyone aware if this has come up before?
<JoshuaAshton>
Right now on AMDGPU, the timeouts are like 10s for gfx and 60s for compute
<JoshuaAshton>
Which is pretty extreme for Desktop
<JoshuaAshton>
Windows has 2s for the default TDR for applications
<JoshuaAshton>
It would be nice to be able to control this on a per-job granularity I think, so maybe long running applications can request up to those amounts, but regular desktop applications would get eg. 2s
<sima>
JoshuaAshton, yeah those are way too long, but they're also that long because people want to run compute jobs on there and otherwise those randomly fail
<sima>
the issue is also that without preemption if you have anything that actually takes too long your desktop freezes anyway, even if all the desktop stuff finishes rendering quickly
<sima>
it's all a bit frustrating big time suck :-/
<sima>
also compute jobs = some cts stuff iirc
<JoshuaAshton>
Yeah, amdgpu has 60s for compute.
<JoshuaAshton>
Hmmm
<sima>
iirc they bumped it from the 10s for other engines
<JoshuaAshton>
yea
<sima>
anholt iirc set something a _lot_ more reasonable for vc4/v3d
<JoshuaAshton>
pixelcluster and I have been doing a bunch of stuff to make GPU recovery actually work and useful on Deck and AMD in general
<JoshuaAshton>
one of the things I am seeing now that it actually *works* is that the timeout is a bit long for good-ish UX here
<sima>
yeah I think for desktop you probably want below 1s or so and actually recover
<sima>
ofc assumes your recovery is reliable enough
<pixelcluster>
I'd disagree
<JoshuaAshton>
below 1s is probably a bit much
<pixelcluster>
often games submit a bunch of stuff in bulk and it happens that frametimes spike over 1s
<sima>
hm yeah I guess for games, where it's the only thing running, more is ok
<sima>
but apps randomly stalling your desktop for multiple seconds is rather rough
<JoshuaAshton>
yea
<JoshuaAshton>
60s for compute is way too much on Deck. I think I will at least put the launch param to be 10s to be equal with gfx
<pixelcluster>
tried to work once with a raytracing app running in the background, 600ms per frame consistently
<pixelcluster>
can't recommend
<sima>
I guess for games the timeout + recovery needs to be quicker than what it takes the user to reach the reset button
<JoshuaAshton>
Maybe 5s would be a good compromise for Deck pixelcluster?
<JoshuaAshton>
Then we leave compute at 10
<sima>
JoshuaAshton, there's also the issue that the timeout is per job and you probably just want an overall frame timeout
<pixelcluster>
sounds good to me
<pixelcluster>
(5s timeout that is)
<JoshuaAshton>
sima: Yeah, it could be lots of really long small submissions
<JoshuaAshton>
Old Chrome used to do lots of small submissions e-e
<JoshuaAshton>
I guess my problem is less with games taking a long time, but actual hang detection
simon-perretta-img has quit [Ping timeout: 480 seconds]
<sima>
JoshuaAshton, yeah a privileged "kill this" ioctl on a fence might be useful for that
<JoshuaAshton>
I didn't consider also handling this from userspace hmmm
hansg has quit [Quit: Leaving]
<sima>
then compositor or whatever can decide when it's getting bad and force everything in that app to reset/recover
<JoshuaAshton>
That's interesting
simon-perretta-img has joined #dri-devel
<sima>
JoshuaAshton, aside from the issue of overall frame time (which the kernel doesn't really know), it also solves the issue that you want a much shorter timeout for the desktop than for a single full-screen app
<sima>
like the steam ui should probably not be stuck for 5s, but the game might have stalling frames and you don't want to kill it
<sima>
hm ... getting the privilege model for that ioctl right might be really tricky
<sima>
or at least I don't have any good ideas
jewins has quit [Remote host closed the connection]
jewins has joined #dri-devel
<JoshuaAshton>
Yeah
<Lynne>
JoshuaAshton: with your pr + kernel changes, what kind of hang can be recovered from?
<JoshuaAshton>
Another option is letting the submit ioctl pick the timeout (obv that will be min'ed with that of the drm_sched job)
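A hypothetical sketch of that option, just to make the shape concrete; this is not existing drm uAPI. The submit would carry a requested timeout in milliseconds and the kernel clamps it against the scheduler-wide value before arming the job's timeout, so userspace can only ever shorten it.
```c
/* Hypothetical sketch only; not existing drm uAPI. */
#include <linux/jiffies.h>
#include <linux/minmax.h>

static unsigned long job_timeout_jiffies(unsigned long sched_timeout,
					 u32 requested_ms)
{
	/* 0 means "no preference": fall back to the scheduler default. */
	if (!requested_ms)
		return sched_timeout;

	/* Userspace can only shorten the timeout, never extend it. */
	return min(msecs_to_jiffies(requested_ms), sched_timeout);
}
```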
<JoshuaAshton>
Lynne: Both soft recovery + MODE2 has been tested to be successful and not stomp others (although MODE2 may take one or two innocent contexts if they had stuff in CUs at the time)
<Lynne>
is an out of bounds read (e.g. from BDA) recoverable?
ngcortes has joined #dri-devel
<JoshuaAshton>
Yes, page faults should be fully recoverable, even with soft recovery, but you'll need some extra kernel patches to work around what appears to be a bug with interrupt handling buffer overflows on RDNA2 (and maybe other gens)
<JoshuaAshton>
There is also another kernel patch to fix soft recovery hang reporting
<JoshuaAshton>
You'll need both my MRs for it to work for both vk + gl also
sima has quit [Ping timeout: 480 seconds]
simon-perretta-img has quit [Ping timeout: 480 seconds]
jewins has quit [Ping timeout: 480 seconds]
simon-perretta-img has joined #dri-devel
Ahuj has quit [Quit: Leaving]
gouchi has quit [Remote host closed the connection]
gouchi has joined #dri-devel
jfalempe has quit [Remote host closed the connection]
ondracka_ has quit [Remote host closed the connection]
ondracka_ has joined #dri-devel
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
<Plagman>
JoshuaAshton: handling it from userspace is more or less what the point of the interface we added was
<Plagman>
the two udev rules are like a super minimal/naive implementation of it, but the idea was always that it could be fleshed out further as a bunch of logic in the compositor or session manager to decide what to kill and within what timeframes
<Plagman>
and how to message the user
<Plagman>
i'd like to see that convo revisited upstream, last time i think vetter took objection to the existence of such an interface, and it got amalgamated with kdevcoredump, which is not at all the same thing
jewins has joined #dri-devel
frieder has quit [Remote host closed the connection]
fab has quit [Quit: fab]
ngcortes has quit [Ping timeout: 480 seconds]
ondracka_ has quit [Ping timeout: 480 seconds]
Leopold_ has joined #dri-devel
Leopold_ has quit [Remote host closed the connection]
mvlad has quit [Remote host closed the connection]
gouchi has quit [Quit: Quitte]
ondracka_ has joined #dri-devel
alarumbe has quit [Quit: ZNC 1.8.2+deb2 - https://znc.in]
alarumbe has joined #dri-devel
ngcortes has joined #dri-devel
tursulin has quit [Ping timeout: 480 seconds]
ngcortes has quit [Read error: Connection reset by peer]
Duke`` has quit [Ping timeout: 480 seconds]
ngcortes has joined #dri-devel
glennk has quit [Ping timeout: 480 seconds]
ondracka_ has quit [Ping timeout: 480 seconds]
Aura has quit [Ping timeout: 480 seconds]
Aura has joined #dri-devel
kzd has joined #dri-devel
sarnex has quit [Read error: Connection reset by peer]
<JoshuaAshton>
Plagman: I think this (picking the TDR timeout time) and that are separate, but yes I think we should re-visit that
<Plagman>
yes
<Plagman>
agreed
<JoshuaAshton>
It would be nice for us to get feedback on an app hang so we can display a nice modal with "Oopsie woopsie" or whatever, and also for getting good feedback in Steam as to what apps are hanging for users.
sarnex has joined #dri-devel
Marcand has quit [Remote host closed the connection]
Aura has quit [Ping timeout: 480 seconds]
<zmike>
can I just say that "oopsie woopsie" gets my vote
mclasen has quit []
mclasen has joined #dri-devel
dsrt^ has quit [Ping timeout: 480 seconds]
anholt has joined #dri-devel
<karolherbst>
same
callen92 has quit [Ping timeout: 480 seconds]
dsrt^ has joined #dri-devel
jsa has quit [Read error: Connection reset by peer]