ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
feaneron has quit [Ping timeout: 480 seconds]
Company has quit [Quit: Leaving]
Nasina has quit [Read error: Connection reset by peer]
Nasina has joined #dri-devel
djbw has quit [Ping timeout: 480 seconds]
anholt has quit [Ping timeout: 480 seconds]
u-amarsh04 has quit [Remote host closed the connection]
alane has quit []
alane has joined #dri-devel
u-amarsh04 has joined #dri-devel
AldairsilvaSilva[m] has joined #dri-devel
epoch101 has quit []
u-amarsh04 has quit [Read error: Connection reset by peer]
luc has joined #dri-devel
anholt has joined #dri-devel
amarsh04 has joined #dri-devel
amarsh04 has quit []
u-amarsh04 has joined #dri-devel
yrlf has quit [Ping timeout: 480 seconds]
yrlf has joined #dri-devel
Daanct12 has joined #dri-devel
nerdopolis has quit [Ping timeout: 480 seconds]
kzd has quit [Ping timeout: 480 seconds]
u-amarsh04 has quit [Remote host closed the connection]
amarsh04 has joined #dri-devel
aravind has joined #dri-devel
Nasina has quit [Read error: Connection reset by peer]
Nasina has joined #dri-devel
Nasina has quit [Read error: Connection reset by peer]
dsimic is now known as Guest13291
dsimic has joined #dri-devel
Guest13291 has quit [Ping timeout: 480 seconds]
luc has quit [Remote host closed the connection]
Nasina has joined #dri-devel
Daanct12 has quit [Quit: WeeChat 4.6.1]
anholt has quit [Ping timeout: 480 seconds]
glennk has joined #dri-devel
dolphin has joined #dri-devel
paulk-bis has joined #dri-devel
aravind has quit [Ping timeout: 480 seconds]
paulk has quit [Ping timeout: 480 seconds]
Daanct12 has joined #dri-devel
Duke`` has joined #dri-devel
sima has joined #dri-devel
dviola has left #dri-devel [WeeChat 4.6.0]
dviola has joined #dri-devel
dviola has quit [Quit: WeeChat 4.6.0]
dviola has joined #dri-devel
djbw has joined #dri-devel
itoral has joined #dri-devel
phasta has joined #dri-devel
eric_engestrom has quit [Read error: Connection reset by peer]
eric_engestrom has joined #dri-devel
fab has joined #dri-devel
rasterman has joined #dri-devel
tzimmermann has joined #dri-devel
sghuge has quit [Remote host closed the connection]
mehdi-djait3397165695212282475 has joined #dri-devel
sghuge has joined #dri-devel
dt9 has joined #dri-devel
vliaskov has joined #dri-devel
amarsh04 has quit []
u-amarsh04 has joined #dri-devel
jkrzyszt has joined #dri-devel
jkrzyszt has quit [Quit: Konversation terminated!]
u-amarsh04 has quit [Ping timeout: 480 seconds]
zzoon[m] is now known as zzoon_back_15th[m]
jkrzyszt has joined #dri-devel
fab has quit [Quit: fab]
lynxeye has joined #dri-devel
u-amarsh04 has joined #dri-devel
sguddati has joined #dri-devel
<jfalempe>
What is the recommended way to fix dim with the new gitlab url? I can fetch from git@ssh.gitlab.freedesktop.org:drm/kernel.git but not from "git@gitlab.freedesktop.org:drm/kernel.git" or "ssh://git@gitlab.freedesktop.org/drm/kernel.git"
<jfalempe>
and dim setup always adds the latter, which doesn't work.
u-amarsh04 has quit []
<mlankhorst>
jfalempe: ssh.gitlab.freedesktop.org should work
<mlankhorst>
Host gitlab.freedesktop.org
<mlankhorst>
Hostname=ssh.gitlab.freedesktop.org
<mlankhorst>
in ~/.ssh/config
u-amarsh04 has joined #dri-devel
<jfalempe>
mlankhorst: thanks, I will try that.
<jani>
yes, that's for the first step, and after that drm-rerere update will switch to ssh.gitlab...
<jani>
there was really no way to automate this after the fact
sguddati has quit [Ping timeout: 480 seconds]
paulk-bis has quit []
paulk has joined #dri-devel
<jfalempe>
Thanks, that worked :)
pcercuei has joined #dri-devel
Kayden has quit [Quit: Leaving]
Kayden has joined #dri-devel
fab has joined #dri-devel
sguddati has joined #dri-devel
apinheiro has joined #dri-devel
u-amarsh04 has quit []
u-amarsh04 has joined #dri-devel
u-amarsh04 has quit [Ping timeout: 480 seconds]
fab has quit [Ping timeout: 480 seconds]
u-amarsh04 has joined #dri-devel
<sima>
mlankhorst, tzimmermann the backmerge in drm-misc-next lost the depends BROKEN from linus for DRM_HEADER_TEST due to 8e623137f112eb86ad949e3bcb6c0e5ae11a092a moving it
<sima>
I guess this needs a fixup or even more headlines about turds
u-amarsh04 has quit [Read error: Connection reset by peer]
<tzimmermann>
sima ok
<sima>
tzimmermann, thanks for taking care, and feel free to add my upfront a-b tag so that patch isn't stuck
u-amarsh04 has joined #dri-devel
<tzimmermann>
sima, patch is on dri-devel. if i hear nothing, i'll merge it this afternoon
<sima>
tzimmermann, lgtm
Company has joined #dri-devel
u-amarsh04 has quit [Ping timeout: 480 seconds]
kts has joined #dri-devel
<mlankhorst>
Not the turds!
Nasina has quit [Read error: Connection reset by peer]
yshui has joined #dri-devel
hellfire7734club[m] has joined #dri-devel
<sima>
bbrezillon, I guess my question is what userspace does if a job fails to allocate
<sima>
it avoids issues in the kernel, but if that means userspace can't really use it, we haven't gained much
mal_ is now known as mal
<bbrezillon>
sima: it does what it does today when a job faults
guludo has quit [Ping timeout: 480 seconds]
<bbrezillon>
on panfrost (old GPUs), it just ignores the fault and continues as if the GPU had done what was expected
<bbrezillon>
on panthor, the execution context is flagged as faulty and must be destroyed/recreated
<bbrezillon>
we have code handling that in mesa/gallium
<bbrezillon>
and panvk just reports it as a DEVICE_LOST
<sima>
bbrezillon, yeah if userspace is ok with that then I think it's all ok
guludo has joined #dri-devel
tonyk2 is now known as tonyk
hfg477[m] has joined #dri-devel
<sima>
it's probably not great though, since it just randomly makes the gpu unreliable under memory pressure
<sima>
but if you say it's the same as now then eh ...
dolphin has quit [Quit: Leaving]
<sima>
standardizing this more definitely sounds like a good idea
Nasina has joined #dri-devel
<sima>
did christian könig comment on any earlier versions?
<bbrezillon>
it's the same as now, without the potential deadlock, I guess
<sima>
ah then I guess it's better for sure
<bbrezillon>
I didn't Cc Christian on that patchset :-/
<sima>
yeah it's maybe also more for gfxstrand
<sima>
was also more thinking about maybe adding some howto overview sections
<sima>
but that's all orthogonal I think
<bbrezillon>
yeah, I can definitely add more docs
<sima>
oh alyssa also knows how to suffer through this with less DEVICE_LOST
<alyssa>
[citation needed]
Nasina has quit [Read error: Connection reset by peer]
devarsht[m] has joined #dri-devel
vsro has joined #dri-devel
<sima>
alyssa, apple streamout memory allocation fun
<sima>
or well, just preallocating it all
<bbrezillon>
sima: it's mostly infra patches in this patchset. Right now panfrost/panthor don't try hard when an allocation failure happens (NO_RETRY|NO_RECLAIM), but I guess this can be extended
<alyssa>
yeah asahi just preallocates a giant buffer and hopes you never run out
<alyssa>
i do not recommend it but i don't have a better option
<sima>
strictly speaking you can't do more than NO_RECLAIM in fence critical sections
<sima>
tzimmermann, reminds me, I owe you a long explainer on the atomic commit fence critical section pain
<bbrezillon>
yeah, that's what I thought
<bbrezillon>
fortunately, on CSF HW we have a way out
<sima>
the slightly better one is to preallocate in the kernel when we run such a context, so that that emergency reserve could be shared between contexts
<sima>
but it means you need to in-order schedule these
<bbrezillon>
(exception handler called on OOM, to flush the primitives)
<sima>
yeah, but that's only for tiler cache, not for emulating gs stuff
<alyssa>
bbrezillon: partial renders don't solve the general problem you get with geom/tess/xfb
<bbrezillon>
nah, emulating GS is another problem indeed
<sima>
or whatever the exact painful combo was, I'm really good at forgetting that nightmare fuel again :-D
<alyssa>
sima: any of geom/tess/xfb/mesh emulation hits this on asahi
<sima>
bbrezillon, but yeah tiler flush gets you out of this
<alyssa>
probably we could do better in a few simple cases but yeah
vsro has quit []
<sima>
tzimmermann, I guess ping me when I should start typing, or whether you prefer some mail somewhere
<sima>
bbrezillon, on the docs, that's really more a "would be nice, in some separate patch set" thing, just trying to distill all the various discussions into some docs and linking to infrastructure like your sparse shmem bo
<sima>
and defo cc: alyssa, gfxstrand and christian könig on that one as the people who have real understanding
<alyssa>
me? understanding? kernel code?
<alyssa>
i understand just enough to know i'm screwed and no more (:
<sima>
alyssa, oh also userspace uapi aspects
<sima>
plus someone gets to upstream the apple gem driver :-P
<alyssa>
kernels are magical things that come from dnf install
<alyssa>
what is upstream? what is driver?
Nasina has joined #dri-devel
kzd has joined #dri-devel
<sima>
well, it's all just there to run userspace, so who cares what is kernel and what is hw :-P
Nasina has quit [Read error: Connection reset by peer]
Daanct12 has quit [Quit: WeeChat 4.6.1]
kts has quit [Ping timeout: 480 seconds]
fab has quit [Ping timeout: 480 seconds]
<bbrezillon>
sima: I mean, if we can have the docs with the changes, that's probably better :-)
epoch101 has joined #dri-devel
<bbrezillon>
and I don't mind writing extensive docs for something I worked on, because at least I understand it (or I'm supposed to understand it)
<bbrezillon>
and just wanted to know if the overall approach is somewhat sound, or if we're going to hit a wall at some point
<sima>
well it's the same design wall as always, but just from scrolling around in the series it looks reasonable
jackson[m] has joined #dri-devel
haaninjo has joined #dri-devel
Nasina has joined #dri-devel
tomba_ is now known as tomba
heftig has joined #dri-devel
colinmarc has joined #dri-devel
<alyssa>
i think i might be able to do better in a few special cases with geom/tess/xfb
<alyssa>
but for the general thing.. yeah, we're kinda screwed here..
<alyssa>
why is "GPU gets unreliable under memory pressure" a.. bad thing, exactly?
<alyssa>
like. under memory pressure, you're going to get device loss in userspace anyway due to allocating command streams and stuff
<bbrezillon>
that ^
rasterman has quit [Quit: Gettin' stinky!]
<bbrezillon>
I don't get it either tbh :-)
<alyssa>
so whether we get device loss right now or loss in 10ms from now, seems, sort of irrelevant at that point
<alyssa>
I guess potentially the difference is that the CPU side alloc is allowed to go trigger the shrinker and succeed
hakzsam_ has left #dri-devel [#dri-devel]
<sima>
alyssa, we get unreliable much earlier than userspace would notice otherwise
<sima>
because we can only do GFP_NORECLAIM
hakzsam has joined #dri-devel
<sima>
whereas actual allocation failures are GFP_KERNEL or GFP_USER
<sima>
and the kernel loves to fill all the memory with caches, only leaving watermarks free, so if you exhaust those quickly enough you're out of luck
<alyssa>
right.. so it's about whether we can e.g. evict page caches and such?
<sima>
despite that there's potentially enormous amounts of memory around that's trivially reclaimable
<sima>
alyssa, yeah
<alyssa>
right ok
<sima>
well even entirely clean cache dropping isn't possible
<sima>
because locking
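A toy model of the point above, in Python rather than kernel code: most memory sits in reclaimable cache, so an allocation that may not reclaim fails as soon as the small free reserve is gone, while one that may reclaim still succeeds. The class name, page counts, and the `may_reclaim` parameter are all made up for illustration, and real watermark handling is more involved than this.

```python
class ToyMemory:
    """Hypothetical model: a small free pool plus a large reclaimable cache."""

    def __init__(self, free_pages, cache_pages):
        self.free_pages = free_pages
        self.cache_pages = cache_pages

    def alloc(self, pages, may_reclaim):
        """Return True on success. Only callers that may reclaim can evict cache."""
        if self.free_pages < pages and may_reclaim:
            # Evict just enough clean cache to satisfy the request.
            evicted = min(pages - self.free_pages, self.cache_pages)
            self.cache_pages -= evicted
            self.free_pages += evicted
        if self.free_pages < pages:
            # A no-reclaim caller fails here even if cache_pages is huge.
            return False
        self.free_pages -= pages
        return True
```

In this model the no-reclaim caller starts failing long before the system is actually out of memory, which is the "unreliable much earlier" effect described in the discussion.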
<bbrezillon>
but there are flags for that, no?
<bbrezillon>
like, retry-but-not-too-hard
<sima>
yeah it's the GFP hierarchy, but since dma_fence are maximally nasty you can't do any of them
<bbrezillon>
I remember i915 progressively increasing the reclaimness
<sima>
and rule of thumb is that already in the io-path you better have mempools because GFP_NOIO can just fail with sufficient amounts of bad luck
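The mempool idea mentioned above can be sketched roughly as follows. This is a Python toy model and not the kernel mempool API: `ReservePool`, `backing_alloc`, and the 64-byte element size are hypothetical. The reserve is filled up front, while blocking is still fine, and is only used once the normal allocation fails.

```python
class ReservePool:
    """Toy mempool-style reserve (illustrative only, not kernel code)."""

    def __init__(self, min_reserved, backing_alloc):
        # backing_alloc is a hypothetical callable returning an object or None.
        self.backing_alloc = backing_alloc
        self.min_reserved = min_reserved
        # Preallocate the reserve at setup time, outside any critical section.
        self.reserve = [bytearray(64) for _ in range(min_reserved)]

    def alloc(self):
        # Try the normal allocator first, fall back to the reserve.
        obj = self.backing_alloc()
        if obj is not None:
            return obj
        if self.reserve:
            return self.reserve.pop()
        return None  # reserve exhausted; progress now depends on a free()

    def free(self, obj):
        # Refill the reserve before handing memory back to the allocator.
        if len(self.reserve) < self.min_reserved:
            self.reserve.append(obj)
```

The guarantee only holds if every element taken from the reserve is eventually freed without needing another allocation first, which is why this fits I/O paths better than open-ended GPU jobs.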
<sima>
yeah but that's either not in dma-fence paths, or when it was that code, very much not a good idea
<alyssa>
bbrezillon: is the drm r4l call right now?
<sima>
once you don't hold nasty amounts of locks, you can set a lot more GFP flags to get at memory that's reclaimable
<alyssa>
or did i just miss it an hour ago because of timezones
<bbrezillon>
alyssa: was an hour ago :-/
<alyssa>
d'oh.
enick_500 has joined #dri-devel
<sima>
bbrezillon, like i915 had a GFP_NOFS path (avoids shrinkers) when holding locks, and then a lock-drop+retry fallback
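The pattern described above (a restricted attempt while holding locks, then drop the lock and retry with fewer restrictions) looks roughly like this as a Python toy. `alloc_light` and `alloc_heavy` are hypothetical stand-ins for a no-shrinker allocation and a full-reclaim allocation; none of this is i915 code.

```python
import threading


def alloc_with_fallback(lock, alloc_light, alloc_heavy):
    """Caller must hold `lock`. Returns (obj, dropped_lock).

    alloc_light: hypothetical cheap attempt that is safe under the lock.
    alloc_heavy: hypothetical attempt that may reclaim, so the lock
                 has to be dropped while it runs.
    """
    obj = alloc_light()
    if obj is not None:
        return obj, False
    lock.release()
    try:
        obj = alloc_heavy()
    finally:
        lock.acquire()
    # The lock was dropped, so the caller must revalidate its state.
    return obj, True
```

The second return value matters: anything the caller looked up before the call may have changed while the lock was released.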
<alyssa>
screwed by daylight savings once again. oh well, next week hopefully then
x512[m] has joined #dri-devel
robertmader[m] has joined #dri-devel
<bbrezillon>
sima: I see. Is they not was we can say, I want weak reclaim, and don't use that shrinker...
<sima>
oh there is
<sima>
you don't get dma_fence at the end of the batch
<sima>
but you need some kind of preempt
<sima>
so for compute workloads this works really well, if you want interop, not so much
epoch101 has quit []
<bbrezillon>
*Is there no way
djbw has quit [Ping timeout: 480 seconds]
<bbrezillon>
hm, not even trying to flush the FS cache is likely going to introduce a perf regression on panthor, and a higher failure rate on panfrost...
<bbrezillon>
because as you said, Linux tends to fill those caches as much as possible
<bbrezillon>
and relies on reclaim to flush those on memory pressure
lucianodev[m] has joined #dri-devel
Nasina has joined #dri-devel
fab has joined #dri-devel
Nasina has quit [Ping timeout: 480 seconds]
kts has joined #dri-devel
djbw has joined #dri-devel
un1c0rn has joined #dri-devel
phasta has quit [Quit: Leaving]
feaneron has quit [Quit: feaneron]
<cwabbott>
alyssa: gfxstrand: why do we have a separate load_global_constant when afaict it's identical to load_global with CAN_REORDER access bit set?
djbw has quit [Ping timeout: 480 seconds]
lanodan has quit [Remote host closed the connection]
<alyssa>
cwabbott: i think "historical accident & we should merge"
lanodan has joined #dri-devel
<alyssa>
i would offer to do the refactor but i'm spread really thin right now
fab has quit [Quit: fab]
bolson has joined #dri-devel
vyivel_ has quit [Remote host closed the connection]
vyivel has joined #dri-devel
<cwabbott>
alyssa: apparently right now we lower load_global(_constant) to load_global_ir3 which has 2x32 address and offset
<cwabbott>
I guess I should translate load_global_constant to that with WRITE_ONLY+CAN_REORDER rather than creating a load_global_constant_ir3?
<alyssa>
should be able to yea
<cwabbott>
*READ_ONLY
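A minimal sketch of the translation being proposed, as a Python toy model rather than NIR code. The flag names mirror Mesa's ACCESS_NON_WRITEABLE and ACCESS_CAN_REORDER, but the bit values, the string opcodes, and the `lower_global_load` helper are assumptions made for illustration.

```python
from enum import IntFlag


class Access(IntFlag):
    # Names follow Mesa's ACCESS_* flags; the bit values are made up.
    NON_WRITEABLE = 1
    CAN_REORDER = 2


def lower_global_load(op, access=Access(0)):
    """Map both global load flavours onto one backend op (toy model)."""
    if op == "load_global_constant":
        # The constant variant becomes a plain load plus access flags.
        access |= Access.NON_WRITEABLE | Access.CAN_REORDER
    elif op != "load_global":
        raise ValueError(f"unexpected op: {op}")
    return ("load_global_ir3", access)
```

With this shape the backend needs no separate constant-load intrinsic, since later passes can key off the access flags alone.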
bolson_ has joined #dri-devel
<cwabbott>
the reason I'm looking into it is that apparently PoE 2 runs terribly in part because it uses raw pointers instead of UBOs and we're not moving them to the preamble
<alyssa>
oof
bolson has quit [Ping timeout: 480 seconds]
<dj-death>
cwabbott: on intel we can use a different cache on some HW if I remember correctly
<dj-death>
constant cache vs data cache
<cwabbott>
right, but that's what ACCESS_NON_WRITEABLE is for
<cwabbott>
you don't need a separate intrinsic for that
<dj-death>
yeah, might need to be careful with the invalidation flags of the vulkan API though
<cwabbott>
seems like it was initially added by gfxstrand for OpenCL const memory (where it was already redundant) then Kayden made NonWriteable loads use it so now it's really the same thing
<dj-death>
it might be fine then
<Kayden>
Yeah, alyssa and I had talked about that a while ago. load_global with an access flag makes more sense
<Kayden>
just hadn't gotten around to refactoring
feaneron has joined #dri-devel
benjaminl has quit [Read error: Connection reset by peer]
benjaminl has joined #dri-devel
glennk has quit [Read error: Connection reset by peer]
djbw has joined #dri-devel
glennk has joined #dri-devel
jkrzyszt has quit [Quit: Konversation terminated!]
glennk has quit [Read error: Connection reset by peer]
glennk has joined #dri-devel
AldairsilvaSilva[m] has left #dri-devel [#dri-devel]