<karolherbst>
the tegra gallium driver has to detect this and do the same
<karolherbst>
and increase the refcount by 100000000 for resources managed this way
<karolherbst>
inside nvc0
<karolherbst>
maybe when unwrapping the resource?
MrCooper has quit [Remote host closed the connection]
MrCooper has joined #dri-devel
itoral has joined #dri-devel
gawin has joined #dri-devel
kevintang has joined #dri-devel
<jekstrand>
alyssa: yes
<alyssa>
jekstrand: Awww
* jekstrand
is a bit grumpy about it
<karolherbst>
alyssa: btw, you won
<alyssa>
what did I win
<alyssa>
a hug?
<alyssa>
awww thanks
<alyssa>
🤗
<karolherbst>
alyssa: first XDC related phoronix article award 2021 :D
<alyssa>
karolherbst: uh, no, v3dv talk before me
<karolherbst>
ohh crap
<karolherbst>
didn't see that one
<karolherbst>
nvm
<karolherbst>
you won regardless
<karolherbst>
I blame software
<tagr>
karolherbst: okay, I'll have to look into it
<karolherbst>
tagr: I already have a patch though, but maybe you find a better solution
<karolherbst>
will open an MR and cc you on that
<tagr>
karolherbst: yeah, that'd be great, I doubt that I'd find a better solution =)
<karolherbst>
I am sure you will as mine is very crappy
<karolherbst>
but it shows what needs to change at least
<karolherbst>
tagr: there are other bugs which might be easier for you to investigate. Something broke with modifiers and we end up with an invalid modifier requested, not sure what this is about, but reverting your modifier rework "fixes" that. But with all those fixes combined, gdm still stays black :/
<karolherbst>
but not black as in "it displays black" but black as in the tty switch didn't complete and I see the non-blinking cursor
<karolherbst>
this is on the jetson nano with the UEFI firmware and fedora 35 though
<tagr>
karolherbst: from a quick look the only "better" way to fix this that I can think of would be fairly involved
<karolherbst>
tagr: yeah.... but I don't know enough about the driver to do this :)
<tagr>
i.e. we'd need to basically introduce new (optional) callbacks for ref/unref
<karolherbst>
and those two comparisons also don't hurt as much..
<tagr>
so for drivers that wrap resources, they can proxy the ref/unref to the underlying operations
<tagr>
or underlying resources, rather
<karolherbst>
uhh.. yeah but that would add CPU overhead again
X-Scale` has quit [Ping timeout: 480 seconds]
<tagr>
yeah, that, too
<tagr>
but it'd be the only way I can think of that would keep the reference counts properly sync'ed
<karolherbst>
tagr: I was more thinking along the lines of just doing it once
<karolherbst>
like figure out where st/mesa is doing this and just catch those places
<karolherbst>
but that might be annoying to catch up with future changes
<tagr>
well, all of that happens within the state tracking code and none of that has any knowledge of driver-specifics
<karolherbst>
yeah.. probably
<tagr>
the underlying problem is that that code operates directly on the references, so there's no way to catch it from the driver
<tagr>
I suppose perhaps a middle-ground would be to add some sort of callback that drivers can use for that specific case
<tomeu>
krh: robclark: anholt_: can you think of something that chrome in chromeos does that requires a batch flush that wouldn't be caught by deqp-*?
<tagr>
so we wouldn't have to call back into the driver for every reference, but only for the case when we use that fast path
kevintang has quit [Remote host closed the connection]
<tomeu>
trying to figure out which flush panfrost is missing
kevintang has joined #dri-devel
<tomeu>
with a flush after every draw, the output is correct (just a bit slow :))
<tagr>
karolherbst: so instead of doing that p_atomic_add() in st_get_buffer_reference(), we'd basically be doing something like context->make_private_reference() which by default would just do the increment by 100000000, but on drivers like Tegra could be overridden to pass that onto the underlying reference
<tagr>
obviously that assumes that we can somehow get from gl_context to pipe_context, which I think we can via gl_context::st_context::pipe
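A minimal C sketch of the hook tagr proposes above; the struct and function names are illustrative stand-ins, not real gallium API, with the 100000000 bias taken from the discussion:

```c
#include <assert.h>

/* Bias st/mesa would apply to references it keeps for itself. */
#define PRIVATE_REF_BIAS 100000000

struct fake_resource {
    int refcount;
    struct fake_resource *wrapped; /* set by wrapper drivers like tegra */
};

/* Default hook: bias the refcount in place, like the current
 * p_atomic_add() in st_get_buffer_reference(). */
static void make_private_reference_default(struct fake_resource *res)
{
    res->refcount += PRIVATE_REF_BIAS;
}

/* Wrapper-driver override: forward the bias to the wrapped resource
 * so both reference counts stay in sync. */
static void make_private_reference_wrapped(struct fake_resource *res)
{
    make_private_reference_default(res->wrapped ? res->wrapped : res);
}
```

A driver like tegra would install the second variant on its context so the extra references land on the underlying nouveau resource.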
<karolherbst>
tagr: yeah... just a question if that's all worth the effort
<tagr>
shouldn't be that much effort, but I think it'll be the only way to cleanly do it
<tagr>
if you do it at unwrap time it's not guaranteed to remain in sync, right?
<karolherbst>
tagr: it should be, as this all happens at resource creation time somehow and stays like that forever
<karolherbst>
_but_ I think st/mesa could end up freeing resources and we would leak the internal nvc0 one regardless?
<karolherbst>
not sure
dllud has joined #dri-devel
dllud_ has quit [Read error: Connection reset by peer]
<karolherbst>
tagr: I think that we actually leak all that stuff...
<karolherbst>
as tegra would probably never know the resource gets freed, right?
<karolherbst>
st/mesa will unbind it in some way the driver doesn't know about, and then it's gone?
<tagr>
that should never happen, because the tegra resources always hold on to one reference
<karolherbst>
tagr: okay.. so worst case we have to change the resource_destroy cb
<karolherbst>
yeah.. we have to
<karolherbst>
tagr: soo.. tegra decreases the ref by one and frees its own
<karolherbst>
which in this case leaks those st/mesa managed resources
<karolherbst>
soo.. we need to track which resources we hacked the refcount up on, check that in tegra_screen_resource_destroy and call into nvc0
<tagr>
that part would be a lot easier if the driver got a notification whenever this special trick came into play
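A toy sketch of that destroy-side bookkeeping, assuming the wrapper marks resources whose refcount it biased; all names are invented, the real hook would be tegra_screen_resource_destroy:

```c
#include <assert.h>

#define PRIVATE_REF_BIAS 100000000

struct tracked_resource {
    int refcount;
    int biased; /* set when the refcount hack was applied */
    int freed;  /* stands in for actually releasing the resource */
};

static void resource_unref(struct tracked_resource *res)
{
    if (--res->refcount == 0)
        res->freed = 1;
}

/* Wrapper destroy callback: undo the bias first so the final unref
 * can actually reach zero instead of leaking the wrapped resource. */
static void wrapper_resource_destroy(struct tracked_resource *res)
{
    if (res->biased) {
        res->refcount -= PRIVATE_REF_BIAS;
        res->biased = 0;
    }
    resource_unref(res);
}
```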
calebccff has joined #dri-devel
<gawin>
hmm, for running steam with custom build of mesa `LD_LIBRARY_PATH="somepath/lib64" MESA_LOADER_DRIVER_OVERRIDE=crocus steam` should be enough? or should I do it per game
nirmoy has joined #dri-devel
<pq>
gawin, wasn't steam a 32-bit app itself? What about the game?
<gawin>
ah, yes, you're right, also most steam games are 32-bit
lemonzest has quit [Quit: WeeChat 3.2]
<HdkR>
That "most" qualifier for the native games is changing :P
alyssa has left #dri-devel [#dri-devel]
lemonzest has joined #dri-devel
<gawin>
is there an easy way to check if I'm using the devel build?
<MrCooper>
Help → System Information → scroll down to the Mesa version
jewins has joined #dri-devel
<gawin>
with just `lib` I got it working (errors from ubsan in the console), but steam still thinks it's 21.2.1, also looks like games are still using the older driver
<karolherbst>
tagr: maybe mareko has some input on that matter, as mareko was the one working on those changes. But I think as long as we only add a callback that's called just once, there wouldn't be too many concerns about adding a little bit of overhead there.
dllud has quit [Remote host closed the connection]
dllud has joined #dri-devel
sdutt has joined #dri-devel
sdutt has quit []
sdutt has joined #dri-devel
<gawin>
karolherbst: been wondering what hardware you use for nouveau, swapping pcie cards? or an igpu from the nForce family? I wanted to run some tests on the r300 driver, but my gpu from that era (x800) has a terrible "culture of work"
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
<gawin>
the turbine design is super loud (even when I tried to disconnect gpu from OS)
fxkamd has joined #dri-devel
<karolherbst>
gawin: just a desktop with pcie or laptops depending on what I've been working on
i-garrison has quit []
i-garrison has joined #dri-devel
kevintang has quit [Ping timeout: 480 seconds]
<robclark>
tomeu: not sure.. but just a guess, in chrome://flags disable partial swap?
mattrope has joined #dri-devel
tobiasjakobi has joined #dri-devel
Company has joined #dri-devel
mattrope has quit [Remote host closed the connection]
tobiasjakobi has quit [Remote host closed the connection]
<ajax>
mikezackles: mmm. me, eric_engestrom, kusma... probably we should call out wsi explicitly in REVIEWERS
<ajax>
my apologies for not getting to that one sooner, i'm semi-away-from-coding this month
tobiasjakobi has joined #dri-devel
tobiasjakobi has quit [Remote host closed the connection]
<bnieuwenhuizen>
huh TIL about REVIEWERS
<mikezackles>
no problem, take your time! Understand the need for time off. There's just so much traffic I was afraid it'd get buried
<eric_engestrom>
the REVIEWERS file is probably very outdated by now, and we don't address people by their email address but by their gitlab username now, so it's not usable unless you already know someone's username, in which case you probably know whether they might be interested in your patch
<eric_engestrom>
mikezackles: I saw your MR but didn't have time to really look at the code, but my first instinct was that suboptimal will mask errors with your change I think?
<eric_engestrom>
I mean, if a swapchain gets resized, you set it to suboptimal, but if an error happens to it later that error will be discarded because suboptimal now takes precedence
<mikezackles>
Ah I think I see what you mean
<ajax>
eric_engestrom: sounds like we need to update that file then
<eric_engestrom>
I don't have more time right now, but I'll try to look into the code this weekend (my day job isn't working on mesa anymore unfortunately)
<eric_engestrom>
ajax: yeah, I had an MR to convert it to gitlab's CODEOWNERS but there were objections, and it's been too long, I don't remember what
<mikezackles>
No prob, as long as it's on the radar. If I get some time I'll see if I can think of a fix
<eric_engestrom>
I think it's useful even if it's not automatic, but yeah it could be better
<eric_engestrom>
(and that message was only 2 months ago actually?! what is time?)
<eric_engestrom>
I'll try to rebase this MR soon, kusma left some comments
<eric_engestrom>
mikezackles: thanks for fixing this!
<mikezackles>
almost fixing lol. But yw :)
mbrost has joined #dri-devel
<ajax>
hey, you totally did.
Duke`` has joined #dri-devel
<eric_engestrom>
mikezackles: seriously, the most important step is noticing that something's wrong, and the next most important is figuring out *what* is wrong; you did both of those ;)
vbelgaum_ has joined #dri-devel
<mikezackles>
Haha, ok ok thanks guys :) Well while we're at it thank you for all the work on mesa in general
hch12907 has joined #dri-devel
fxkamd has quit []
anusha has joined #dri-devel
hch12907_ has joined #dri-devel
hch12907 has quit [Ping timeout: 480 seconds]
dviola has quit [Quit: WeeChat 3.2]
nirmoy has quit []
slattann has joined #dri-devel
sneil has quit [Quit: Leaving]
sneil has joined #dri-devel
Surkow|laptop has quit [Remote host closed the connection]
Surkow|laptop has joined #dri-devel
dviola has joined #dri-devel
gawin has quit [Ping timeout: 480 seconds]
<jenatali>
A better solution than CODEOWNERS would be allowing non-developers to tag MRs, since people can subscribe to whatever tags they'd like to be notified about
itoral has quit []
<jenatali>
That way there doesn't need to be a centralized list that can get out-of-date for who owns what, just tag it with what it changes and the right people will know
idr has joined #dri-devel
<jenatali>
Still bugs me that permission to label MRs is so limited, it doesn't make sense to me
anujp has joined #dri-devel
<bnieuwenhuizen>
jenatali: permission to label MRs is the same permission to be able to assign MRs to marge ...
<jenatali>
bnieuwenhuizen: Right, not saying it's our fault, I'm blaming GitLab for making it that way
<jenatali>
I just wish the permissions were more fine-grained
<ajax>
... i'm not totally sure that's true...
<ajax>
pfft. Reporter can label Issues but not Merge Requests.
<HdkR>
wha? I just opened an issue and couldn't stick labels on it
<bnieuwenhuizen>
HdkR: I think Reporter is more like "can edit issues" while creating them is mostly more open
<HdkR>
oh
<HdkR>
Definitely assumed here that "Reporter" was the person opening the issue, not a permission role
anusha has quit []
<mdnavare>
does anyone know what env variables GNOME Shell's "Launch using Dedicated Graphics Card" sets? does it set "DRI_PRIME"?
<MrCooper>
most likely
<mdnavare>
So when I right click and launch with the dedicated card, that internally should use switcherooctl launch, which should set the PRIME env variable, right?
<agd5f>
mdnavare, I suspect it just sets the env var and then runs the app
<mdnavare>
agd5f: Yes, but underneath it still relies on the PRIME variable, right?
<agd5f>
mdnavare, yeah
<mdnavare>
Okay great thanks a lot agd5f
<mdnavare>
Also, do you know any GL app that we can try launching with the GUI Launch option?
<mdnavare>
Or any game like xonotic ?
<agd5f>
should work for any OpenGL app
<HdkR>
Try launching steam ;)
<mdnavare>
HdkR: After I install Steam, can I launch it through the GUI? I have only tried launching it from the command line
gawin has joined #dri-devel
<LaserEyess>
steam comes with a desktop file, so yeah you could just use that or whatever launcher your DE/WM comes with
<LaserEyess>
but also you could launch a game from steam as well, not just steam, and that would be "launched from a GUI", no?
oneforall2 has quit [Remote host closed the connection]
slattann has quit []
gouchi has joined #dri-devel
<MrCooper>
current Steam has PrefersNonDefaultGPU=true & X-KDE-RunOnDiscreteGpu=true in the desktop file, so it's supposed to run on any non-default GPU by default on GNOME
oneforall2 has joined #dri-devel
<emersion>
non-deafult, by default… :P
<MrCooper>
different default :)
adavy has joined #dri-devel
mlankhorst has quit [Ping timeout: 480 seconds]
<mareko>
tagr: your driver seems to count references incorrectly if it can't handle that the state tracker likes to keep more references for itself
<karolherbst>
tagr: I just found more corruptions :(
<karolherbst>
mareko: the problem is that the st kind of lies and tegra is wrapping around nouveau
<karolherbst>
but yeah..
<karolherbst>
the entire construct is quite unstable
<karolherbst>
and personally I am not even sure it's worth the effort
<Kayden>
MrCooper: that seems kinda bad, Steam is just a web browser GUI app, it shouldn't be running on the discrete GPU. things *launched* by Steam on the other hand...those likely ought to...
anusha has joined #dri-devel
<karolherbst>
Kayden: we had this discussion again when I was complaining :D
<karolherbst>
*already
<Kayden>
otherwise just having steam open with e.g. your friends list and chat would keep the discrete GPU powered on utterly soaking your laptop battery even if it's just sitting there idling
<Kayden>
ah, okay
<karolherbst>
Kayden: not implemented or so I got told
<karolherbst>
ENOTIME
<karolherbst>
it can crash nouveau badly due to multithreading
<Kayden>
:/
<karolherbst>
Kayden: at first I got yelled at because why would valve change anything like that, turned out, their default desktop file shipped by valve was changed :P
<Kayden>
I don't really understand how nouveau has so many threading issues
<karolherbst>
Kayden: it's just one
<karolherbst>
Kayden: quick answer: shared command buffer across all contexts
<karolherbst>
tell that to others who think it's a good idea
<Kayden>
every context is supposed to have its own command buffer...
<karolherbst>
my MR is controversial 🙃
<karolherbst>
Kayden: no really :D
<karolherbst>
tell that to other big nouveau contributors
<Kayden>
I mean, that's not really a multi-threading issue, that's just multi-context is broken
<karolherbst>
yes
<karolherbst>
but it's not easy on NV hw
<karolherbst>
as...
<karolherbst>
you have a hw context
<karolherbst>
with state
<karolherbst>
and you don't want to context switch
<karolherbst>
and you have to change state and save/restore and....
<karolherbst>
you are lucky, you have your "draw command buffers" being their own thing and the hw doesn't care beyond those
<karolherbst>
NV hw does
<karolherbst>
so there is a lot of state in the context on the GPU
<Kayden>
GL_KHR_context_flush_control is going to have something to say about that too, because you can disable flushing on MakeCurrent, so even if you had a single thread, you are supposed to be able to build up multiple command buffers without flushing them on switch
<karolherbst>
it's,... annoying
<anholt_>
I mean, everyone has that. and you just reset your state at the start of your command buffer.
<karolherbst>
anholt_: we can't
<karolherbst>
at least.. our driver can't
<karolherbst>
we have to set every single bit again basically
<Kayden>
this really doesn't sound different, intel and others have contexts as well with a bunch of state in them
<Kayden>
and switching has overhead
<karolherbst>
yeah.. probably
<Kayden>
I think at one point the HW was copying multiple MB on context switch
<karolherbst>
but our driver doesn't track it
<karolherbst>
we don't have any idea what state is actually set on the GPU
<anholt_>
i915, v3d, vc4, freedreno, etc., everyone sets every bit of state at the start of their batch.
<karolherbst>
some bits.. yes
<karolherbst>
but not all of it
<karolherbst>
anholt_: we don't have batches
<Kayden>
iris aggressively inherits state from previous batches, actually
<anholt_>
what is it that you submit to the kernel?
<imirkin>
potentially one bit of missing info: nouveau doesn't make one hw context per gallium pipe_context.
<anholt_>
Kayden: yeah, iris is an exception because you can load/store context state.
<Kayden>
but the HW context is saved/restored
<karolherbst>
anholt_: okay.. let me rephrase.. there is no "end" or "bounds" to a batch, just a continuous stream of commands
<anholt_>
karolherbst: there's no "ok, now submit" ioctl?
<karolherbst>
anholt_: sure, that we have, but not because the hw works like that
<anholt_>
but your commands are all stored in a buffer, and then at some point you submit it, right?
<karolherbst>
sure
<anholt_>
that buffer is what I mean by a batch.
<karolherbst>
okay. I was always under the assumption that you always have some metadata attached to a batch
<karolherbst>
as in "this is a collection of context" and the hw being aware of it
<karolherbst>
so it can say "this batch finished"
<karolherbst>
or whatever
<karolherbst>
or the batch itself has state
<karolherbst>
maybe I also misremember things here
<karolherbst>
but we have none of this
<karolherbst>
just a ringbuffer
<karolherbst>
and the GPU keeps executing stuff
<anholt_>
what kind of metadata are you thinking? v3d's submit is "start, end offset". that's it.
<anholt_>
yeah, same.
<karolherbst>
okay
<karolherbst>
I thought there was more to it at least on some GPUs
<karolherbst>
anyway
<karolherbst>
I don't think our driver is written in a solid way anyway
<karolherbst>
and isn't at all multithreading capable
<karolherbst>
it's all broken
<Kayden>
you probably do want one HW context per pipe context
<karolherbst>
no
reductum has quit [Read error: Connection reset by peer]
<karolherbst>
we absolutely don't
<karolherbst>
we talk about ms of context switching
<karolherbst>
not ns
<karolherbst>
or
<karolherbst>
so
<karolherbst>
literally ms
<imirkin>
a context is a lot of data, so pretty expensive to switch out
<imirkin>
we've avoided going with hw contexts to represent GL contexts for this reason
<imirkin>
blob driver also avoids it, iirc
<karolherbst>
it involves firmware code getting executed
<imirkin>
(at least did at some point)
<Kayden>
okay, so
<jenatali>
Yeah, D3D (11 and older) also use a screen-level hardware context
<karolherbst>
imirkin: it's better on modern hw though
<anholt_>
yeah, so just reset your state at the start of your batch while multiplexing on a hw context
<karolherbst>
anholt_: can't do it :)
<karolherbst>
as I said: the driver doesn't track the state
<Kayden>
in that case, you do what anholt suggested: you queue up commands in userspace, and submit them in one big quantum of work
<Kayden>
that quantum of work executes in the ring at one time
<karolherbst>
yeah.. the driver needs to be rewritten quite a lot to make it all work
<imirkin>
yeah, i think that's the radeonsi approach
<Kayden>
after you submit you "forget" everything about the state of the GPU and flag everything dirty
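The pattern anholt_ and Kayden describe boils down to a dirty bitmask that gets fully set after each submit; a toy illustration with made-up bits and a counter standing in for real state packets:

```c
#include <assert.h>
#include <stdint.h>

#define DIRTY_ALL      UINT64_MAX
#define DIRTY_BLEND    (1ull << 0)
#define DIRTY_VIEWPORT (1ull << 1)

struct fake_ctx {
    uint64_t dirty;
    int emits; /* counts emitted state packets */
};

/* At draw time: emit one packet per dirty bit, then clear the mask. */
static void emit_state(struct fake_ctx *ctx)
{
    while (ctx->dirty) {
        ctx->dirty &= ctx->dirty - 1; /* clear lowest set bit */
        ctx->emits++;
    }
}

/* After the submit ioctl we "forget" what state the GPU has. */
static void submit_batch(struct fake_ctx *ctx)
{
    ctx->dirty = DIRTY_ALL;
}
```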
<karolherbst>
imirkin: the only sane one
<imirkin>
unfortunately a lot of those commands involve stalls as well
<Kayden>
i965 has been doing this since 2006
<karolherbst>
Kayden: yeah, basically
<karolherbst>
we could do some tracking, but we would always have to start from scratch
<karolherbst>
which.. we can't really
<imirkin>
so we've also avoided *that* approach :)
<karolherbst>
yeah...
<Kayden>
it sounds like you *can*, you just are deciding to sacrifice correctness to avoid stalls
<karolherbst>
a lot of WFI stuff
<karolherbst>
Kayden: depends
<karolherbst>
we don't have to set everything
<karolherbst>
shader buffer address is also part of the state
<Kayden>
we also try and queue up larger piles of work before submitting so that we don't have to reprogram the universe as often
<karolherbst>
which involves WFI
<karolherbst>
Kayden: ahh, good idea actually
<karolherbst>
but that kind of means writing a new driver anyway :p
<Kayden>
we have spun up several new drivers in tree recently... :)
<jenatali>
FWIW the D3D11 way of doing this is to essentially have a single context at the screen level, and then any multithreading is basically just recording software commands which are played back at flush-time, so you can track dirty state without having to reset the world
<jenatali>
It helps that D3D does have a "primary" context though where that doesn't need to happen
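jenatali's D3D11-style approach, reduced to a toy C sketch with invented names: secondary contexts record small software commands, and the single hardware context replays them at flush time, where dirty state can be tracked in one place:

```c
#include <assert.h>
#include <stddef.h>

typedef void (*cmd_fn)(int *hw_state);

struct cmd_list {
    cmd_fn cmds[64];
    size_t count;
};

/* A sample recorded command (illustrative). */
static void cmd_set_blend(int *hw_state)
{
    *hw_state |= 1;
}

/* Recording is cheap: just append a software command. */
static void record(struct cmd_list *list, cmd_fn fn)
{
    if (list->count < 64)
        list->cmds[list->count++] = fn;
}

/* At flush time the one hardware context replays the list. */
static void playback(struct cmd_list *list, int *hw_state)
{
    for (size_t i = 0; i < list->count; i++)
        list->cmds[i](hw_state);
    list->count = 0;
}
```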
<karolherbst>
yeah...
<karolherbst>
maybe I just write a new driver and make use of all the fancy new stuff
<karolherbst>
threaded context
<karolherbst>
and what else there is
<karolherbst>
:D
<Kayden>
tc is nice
<karolherbst>
or just wait until zink is done and just go vulkan only
<karolherbst>
no clue
<Kayden>
would need a vulkan driver first :)
<karolherbst>
yeah.. well
<karolherbst>
got some insider info on that
<HdkR>
:juicy:
<Kayden>
excellent(?)
<Kayden>
(assuming the insider info isn't "there isn't one and it'll never happen" hahaha)
hikiko_ has joined #dri-devel
<karolherbst>
we could make a perfectly fine GL driver and everything but at this point I am always at this just start from scratch moment
<karolherbst>
and then you have zink...
<karolherbst>
the thing is just that if you start to tweak the driver into being more robust, bits just fall apart left and right
<Kayden>
writing a new gallium driver is probably less work than you think
<karolherbst>
I even tried out context recovery and stuff
<karolherbst>
it worked.. except that the hw just threw random errors at me for... whatever reason
<Kayden>
especially since you can borrow a lot of code and ideas, but get the architecture how you want it
<karolherbst>
Kayden: yeah.. and we have the old code to cheat from
<karolherbst>
I mean.. our driver started with.. 15 year old hw?
<karolherbst>
and we essentially just copied from it for nv50 and nvc0?
hikiko has quit [Ping timeout: 480 seconds]
<karolherbst>
although nvc0 is better in a lot of places
<karolherbst>
but what if core features like command submission are just a huge pita?
<Kayden>
I'm not entirely sold that I got the iris architecture right...my mythical compute ring still hasn't shown up yet, and I might want more draw call reordering...
<Kayden>
but it's a hell of a lot better than i965 was
<Kayden>
just couldn't work on it anymore
<karolherbst>
what's the benefit of threaded context though? you essentially just do the actual work from one thread and your contexts submit to it, or what?
<karolherbst>
never really looked into it
<Kayden>
couple things
mlankhorst has joined #dri-devel
<Kayden>
a lot of the gallium hooks end up running in another thread
<Kayden>
so the GL frontend and your driver work can happen in parallel
<karolherbst>
okay, yeah, sounds like a good idea
<Kayden>
some things happen immediately, like creating objects, so you can talk about them
<dcbaker>
Kayden: your ring architecture is the right thing if we ever do media
<Kayden>
but also, there is a little bit of optimization in the queue as well
<HdkR>
Also wackloads of games do heavy GL + logic on the same thread. So offloading as much logic from that thread is hugely beneficial to those games...
<Kayden>
since draws are queued up before being dispatched...it can look at sequential draws and optimize them a bit
<karolherbst>
what's a bit annoying, and I don't have an answer for yet, is how to track context state and, once a "thread" or "context" is allowed to submit, diff the state, create the buffer and submit it to... a ring or something
<karolherbst>
but that gets hairy
<Kayden>
combine them into a single draw, for example
<karolherbst>
wondering if threaded context helps with that
<Kayden>
I don't think so
<karolherbst>
okay, so what you do is to essentially store draws in a buffer and submit them as you go at some random point?
<karolherbst>
mhhh
<Kayden>
most hooks store state and flag things dirty
<karolherbst>
maybe that's not a bad idea.. I'm already playing around with an approach like that in rusticl, where I just offload executing commands to a thread
<Kayden>
pipe->draw_vbo() looks at the dirty flags and emits commands based on the state you saved in your pipe_context
<karolherbst>
and I could track the current hw state in that worker and submit things from a queue
<karolherbst>
and context just submit into this queue
<Kayden>
you submit that when you hit a fence or flush
<karolherbst>
right..
<karolherbst>
and that queue gets flushed out completely once we hit those sync points
<Kayden>
or you hit some max, I limit the command buffer to 64KB for example
<karolherbst>
yeah..
<Kayden>
if I've queued up that many commands I just submit it anyway
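The size cap Kayden mentions is a one-line check at emit time; a toy sketch where the 64 KiB figure comes from the discussion above and the helpers are invented:

```c
#include <assert.h>
#include <stddef.h>

#define BATCH_MAX_BYTES (64 * 1024)

struct fake_batch {
    size_t used;
    int submits; /* counts exec ioctls for the test */
};

static void batch_submit(struct fake_batch *b)
{
    b->submits++;
    b->used = 0;
}

/* Commands accumulate in userspace; submit early only when the cap
 * is reached, instead of one ioctl per draw. */
static void batch_emit(struct fake_batch *b, size_t bytes)
{
    if (b->used + bytes > BATCH_MAX_BYTES)
        batch_submit(b);
    b->used += bytes;
}
```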
<karolherbst>
or when the GPU is bored
<karolherbst>
well..
<karolherbst>
close to getting bored
<Kayden>
we don't even do that
<karolherbst>
but we don't really have this information............ we actually have
jkrzyszt has quit [Ping timeout: 480 seconds]
<Kayden>
I don't think radeonsi does either
xexaxo has quit [Ping timeout: 480 seconds]
<karolherbst>
mhhh
<Kayden>
the assumption is the app is going to be feeding the GPU fast enough
<karolherbst>
so we could check the latest fence reached
<Kayden>
and if it isn't (CPU limited) then your job is to reduce overhead
<karolherbst>
Kayden: yeah... I meant as an alternative to waiting until the buffer is full
<Kayden>
not try and speculatively flush work early to keep the GPU going
<Kayden>
I mean, end of frame does happen
<karolherbst>
yeah..
<karolherbst>
except you do CL
<Kayden>
yeah, I don't know when you flush for CL
hikiko_ has quit [Remote host closed the connection]
<karolherbst>
if you tell the runtime
<karolherbst>
clover also has a limit on queued commands and flushes the queue on its own though
alyssa has joined #dri-devel
<alyssa>
jekstrand: "We wrote a little intel_clc build tool" gosh you're tempting me again
hikiko has joined #dri-devel
<Kayden>
so there you go, just flush then
<karolherbst>
yeah
<Kayden>
I don't think you're really going to gain much by flushing more frequently
<karolherbst>
just wondering if that actually leads to tiny stalls and nobody noticed yet
<karolherbst>
but with threaded context I highly doubt that ever happens
<Kayden>
you are going to have less ioctl overhead by waiting
<karolherbst>
ehhmmm.
<karolherbst>
Kayden: we don't have a "wait on fence" ioctl
<karolherbst>
:(
<Kayden>
I mean, not calling your exec ioctl all the time
<bnieuwenhuizen>
karolherbst: use syncobj
<karolherbst>
bnieuwenhuizen: yeah.. I think Ben is working on something there
<karolherbst>
Kayden: ahh..
<karolherbst>
we don't even batch commands
<karolherbst>
even though we could
<karolherbst>
ehh
<karolherbst>
command submissions
<Kayden>
so you just call ioctls all the ime to submit commands?
<karolherbst>
so it's always one buffer at once
<karolherbst>
Kayden: basically
<Kayden>
that's going to be a lot of userspace <-> kernel switching
<karolherbst>
well
<jekstrand>
alyssa: I've been tempted to use it for more stuff
<karolherbst>
not that bad
<karolherbst>
not for every command
<Kayden>
for every draw?
<ajax>
forgive me but did i seriously just read that nouveau thinks context rolls are too hard so it doesn't try
<alyssa>
jekstrand: just like. every time there's a change to indirect draws, or a request for GS/tess, I get a little sadder inside.
<karolherbst>
Kayden: mhhh a bit like that
<Kayden>
that's going to be hella expensive
<karolherbst>
I don't think it will always happen though
<karolherbst>
but there are a lot of things which flush out the buffer
<karolherbst>
like waiting on a bo
<karolherbst>
or mapping a bo
<karolherbst>
and stuff
<Kayden>
well, yeah, waiting on a BO referenced by a batch would flush, yes
<Kayden>
that's one of the things that can cause a flush
<Kayden>
have to trigger a submission before you can wait
<karolherbst>
we also have a few explicit places where we submit
<karolherbst>
but our buffers aren't that huge
<Kayden>
Not really sure how to compare the size of buffers - ours are larger, but Intel's command streamer also has, uh, very verbose commands
<karolherbst>
depends on the hw though, on newer chips it's less annoying
<Kayden>
so other drivers having a smaller one might be equivalent :)
<jekstrand>
alyssa: Yeah, for stuff like that, it's very tempting.
<alyssa>
Kayden: AGX has lovely small commands
<jekstrand>
alyssa: And if you don't want to carry a HW binary per HW version, you can just bake the SPIR-V or the NIR into the driver instead.
<jekstrand>
alyssa: Being able to write GPU stuff in C is a pretty nice feature.
<karolherbst>
Kayden: well.. our commands are weird.. but yeah
<karolherbst>
I think our buffers are like 512k or so
<alyssa>
jekstrand: Yeah I mean
<alyssa>
nir_builder really doesn't scale to the equivalent of 500 lines of C code
<bnieuwenhuizen>
on the order of 512k sounds about right for the max size of a command buffer
<karolherbst>
ohh right
<karolherbst>
queries was the huge problem
<karolherbst>
so queries kind of always lead to flushes
<karolherbst>
but meh
<karolherbst>
it's not really easy to tell when a flush happens
<jekstrand>
alyssa: Yup
<karolherbst>
as we don't do it explicitly
<karolherbst>
well, we can
<karolherbst>
but usually it gets flushed implicitly at random points in time :/
<karolherbst>
and I mean randomly
<karolherbst>
it can happen as you add more commands
<karolherbst>
and then it is flushed and the next command is in a new buffer
<karolherbst>
even within a draw call
<Kayden>
queries like...occlusion queries?
hikiko has quit [Remote host closed the connection]
<karolherbst>
yeah, but those might be safe
<Kayden>
so yeah, if you CPU access a query result, you need to flush
<Kayden>
this is all totally normal and all drivers handle these things
<karolherbst>
those are just the places where we do it explicitly
<Kayden>
yep
<bnieuwenhuizen>
FWIW radeonsi reserves a large part of the buffer for each draw call (like 16KiB or so) so that we don't have to switch buffers in the middle of the state setup
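The reservation trick bnieuwenhuizen describes amounts to checking for worst-case space before a draw's state setup begins, so any flush lands on a draw boundary; a toy sketch with illustrative sizes and names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CS_SIZE_BYTES (512 * 1024)
#define DRAW_RESERVE  (16 * 1024)

struct fake_cs {
    size_t used;
    int flushes;
};

/* Called at the top of draw_vbo: guarantee DRAW_RESERVE bytes so no
 * implicit buffer switch can land in the middle of state emission.
 * Returns true if it flushed (caller must re-emit full state). */
static bool cs_check_space(struct fake_cs *cs)
{
    if (cs->used + DRAW_RESERVE > CS_SIZE_BYTES) {
        cs->flushes++;
        cs->used = 0;
        return true;
    }
    return false;
}
```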
<Kayden>
you can look for iris_batch_flush in iris to see where we flush things explicitly
<karolherbst>
bnieuwenhuizen: yeah.... sounds like a good idea
<karolherbst>
I don't think the driver is terribly broken, just.. not future proof
hikiko has joined #dri-devel
<karolherbst>
I was debugging one game once and I was wondering why performance was terrible so I looked at the GPU load: 20% :/
<karolherbst>
and CPU was at like 100%
<jenatali>
Just add TC
<karolherbst>
yeah...
<jenatali>
Not that I should talk, we haven't done that yet :P hopefully we can get to that soon though
<karolherbst>
it always feels like I want to change everything, so I either have those 10 things I can fix one after the other
<karolherbst>
or just start from scratch
<alyssa>
i'm done rewriting panfrost
<karolherbst>
implicit flushes are super annoying though
<karolherbst>
essentially I want to rewrite all command submission + state tracking
<karolherbst>
and then I can just write a new driver :D
<karolherbst>
as I don't think drivers really do more than that.. maybe shader compiling :p
<karolherbst>
can keep the compiler
<jekstrand>
karolherbst: Make it NIR-only, then it's like half a new compiler. :)
<karolherbst>
jekstrand: volta+ is already nir only
<karolherbst>
still uses codegen though
<jekstrand>
Well, yeah, you need a back-end compiler
<jekstrand>
Though NV HW is nice enough, maybe you could codegen straight from NIR. :-P
<karolherbst>
:D
<karolherbst>
maybe
<jekstrand>
I doubt it
frieder has quit [Quit: Leaving]
<karolherbst>
but my thinking is always: why spend the time when we want vulkan anyway and zink looks promising
<karolherbst>
the question is.. can we do better than zink+vk with the limited resources we have
hikiko has quit [Remote host closed the connection]
<karolherbst>
which is also the question I ask myself in regards to codegen passes which aren't even good
<jenatali>
Our philosophy on the Windows side is that drivers for low-level APIs are simpler, and using mapping layers lets you focus optimizations in one spot that benefits everyone
<karolherbst>
most of it nir can do better
<jenatali>
So if I were you, I'd go all in on Vk+Zink
<karolherbst>
and that's just a fact
<karolherbst>
some things nir won't be able to do, so those can stay
<karolherbst>
jenatali: yeah.. hence me not in the mood of rewriting the driver :D
hikiko has joined #dri-devel
<karolherbst>
fixing mt is like the minimum I'd do
<alyssa>
karolherbst: If you're intent on rewriting everything, sure go straight to vk
<jenatali>
Just write a Vk driver then, that's like "rewriting" the driver and lets you ignore a lot of the hard problems, like when to flush
dviola has quit [Quit: WeeChat 3.2.1]
<alyssa>
^yep
<karolherbst>
yeah
* alyssa
still doesnt know what to do for asahi
<jenatali>
Also Vk queues -> hw contexts
<jenatali>
Much simpler
<karolherbst>
I mean.. I would totally go for a good GL driver, but I know we lack the support to actually do it
<alyssa>
I mean. I started a gles2 driver but it's mostly just a shell to exercise the compiler :P
<karolherbst>
jenatali: I doubt that's true for nvidia
<karolherbst>
I am sure they always do one context within an application regardless of what you do
<jenatali>
I can't speak to Linux and Vk, but I can tell you how they do Windows + D3D12 :P
<jenatali>
And it's not one context
<karolherbst>
uhh surprising
<karolherbst>
but d3d12 might only cover hw where it's not terrible
<jenatali>
Fermi+
<karolherbst>
old nv hw doesn't have... many contexts
<jenatali>
Though probably they do hacky stuff on Fermi
<karolherbst>
imirkin: do you know when it changed?
<karolherbst>
jenatali: yes
<karolherbst>
fermi isn't bindless
<jenatali>
Oh right, I remember that problem... yeah I'm pretty sure that got better with Kepler, but not fully fixed until after that
<karolherbst>
so this is very annoying
<karolherbst>
nvidia even accidentally shipped vulkan on fermi
<karolherbst>
and disabled it in the next release
<alyssa>
how do you accidentally ship vulkan
hikiko has quit [Read error: Connection reset by peer]
<karolherbst>
alyssa: I guess because of d3d12
<karolherbst>
if they enable d3d12 on fermi?
<alyssa>
no judgement, nvidia is no longer the worst hw vendor in the linux space 🤷
<karolherbst>
but
<karolherbst>
do they?
<jenatali>
They did
<karolherbst>
jenatali: are you sure that fermi is d3d12?
<karolherbst>
jenatali: yeah... how should I tell you.. the hw can't do it :p
<karolherbst>
for nouveau at some point we always spilled UBOs to SSBOs for all UBOs
<jenatali>
Guess the driver does some magic then. Dunno how though
<karolherbst>
at some point we started to do this for the first 6 UBOs only
<karolherbst>
jenatali: yeah.. using SSBOs :p
hikiko has quit [Read error: Connection reset by peer]
<karolherbst>
UBOs are also just... ubffers
<karolherbst>
*buffers
<karolherbst>
but with caching
<karolherbst>
so you don't even have to do any magic
<karolherbst>
you just don't bind it as a UBO and access it via global memory
<imirkin>
UBO's have a magic cache on nvidia
hikiko has joined #dri-devel
<imirkin>
which allows you to update a UBO while draws using that UBO are still ongoing
<imirkin>
without messing up those draws
<karolherbst>
yeah.. I suspect they upload it to the hw on launching the draw
<karolherbst>
or well..
<karolherbst>
we do
thellstrom has quit [Ping timeout: 480 seconds]
<karolherbst>
guess that's what the commands are doing
<karolherbst>
would be interesting to know when they start getting cached
<imirkin>
yeah, i dunno precisely when the data hits memory
<imirkin>
it does eventually though
<jekstrand>
jenatali: Not yet
<jekstrand>
jenatali: Maybe Monday? Tomorrow's not looking good for writing code.
<jenatali>
Heh. Fair enough
<karolherbst>
I should write 2 slides or so
<karolherbst>
wait
<karolherbst>
I am already doing this stuff for 10 years?
<karolherbst>
that's like a third of my life
soreau has quit [Remote host closed the connection]
<jenatali>
Hah. Ditto on both parts :P
soreau has joined #dri-devel
<karolherbst>
my laptop makes weird sounds when running latex...
<idr>
Maybe it's allergic?
<karolherbst>
well...
<karolherbst>
I think so
<karolherbst>
but this CPU here doesn't really like switching clocks that much
<karolherbst>
mhhh
<karolherbst>
no idea what kind of demo I want to show tomorrow.. probably some CTS test isn't really interesting to watch :D
slattann has joined #dri-devel
<imirkin>
karolherbst: just go with starcraft :)
<karolherbst>
ehhhh
<karolherbst>
it's not GL
<imirkin>
more fun to watch than CTS tests though
<karolherbst>
right
<karolherbst>
we can also look at code together :D
<alyssa>
i thought you were rewriting clover in rust?
<karolherbst>
alyssa: mhh?
<karolherbst>
I mean.. technically you could call it rewriting
<alyssa>
?
<emersion>
austriancoder: thanks for working on this! will review tomorrow
<karolherbst>
alyssa: I would rather see it as implementing OpenCL in Rust inside mesa
<alyssa>
so... RiiR of clover?
<karolherbst>
essentially
<karolherbst>
but I am working closer with the CL spec than with clover
jessica_24 has quit [Quit: Connection closed for inactivity]
<alyssa>
alright
columbarius has joined #dri-devel
co1umbarius has quit [Ping timeout: 480 seconds]
* airlied
watches XDC at 1.5x, y'all speak fast before speeding up :-P
<alyssa>
airlied: o:)
camus1 has joined #dri-devel
<karolherbst>
ehh right, I wanted to work on my slides
camus has quit [Ping timeout: 480 seconds]
nchery has quit [Remote host closed the connection]
thellstrom has joined #dri-devel
rasterman has quit [Quit: Gettin' stinky!]
lemonzest has quit [Quit: WeeChat 3.2]
nchery has joined #dri-devel
slattann has quit []
<columbarius>
daniels: in the proposed documentation for buffer exchange via DMA-BUFs, DRM_FORMAT_MOD_INVALID is handled as a token for an implicit modifier. Some apis like EGL support importing DMA-BUFs with implicit modifiers with the explicit api and this token used. Will this stay a special case, or is there an effort to get this convention into all api specifications?
dllud_ has joined #dri-devel
dllud has quit [Read error: Connection reset by peer]
<jenatali>
Oops, accidentally got another Phoronix article written about me
mlankhorst has quit [Ping timeout: 480 seconds]
<Sachiel>
competing with zmike?
dllud_ has quit [Read error: Connection reset by peer]
dllud has joined #dri-devel
<jenatali>
Guess so
<jenatali>
Unsurprisingly, the comments are completely different on his articles vs mine :P
<FLHerne>
I do wonder why Michael doesn't seem to hang out in here
<FLHerne>
he could get inane scoops even faster
<dcbaker>
At least he finally seems to have noticed that we've moved to gitlab
<FLHerne>
I think I've only had one Phoronix article so far, and it was so inaccurate I cancelled my subscription after he refused to correct it :p
<FLHerne>
[about KDevelop]
<jenatali>
They're usually accurate enough, but also, not really any substance
<FLHerne>
Yeah, it's usually like an RSS aggregator with more steps
<dcbaker>
He stopped writing them about me after using Meson ceased to be controversial and all of the cool kids started using it, lol
<Sachiel>
the substance is in the comments
<FLHerne>
The benchmarks are ok
gpuman_ has joined #dri-devel
<jenatali>
Eventually I'm going to start hobby development and still get articles claiming the will of Microsoft is making me contribute so we can do evil things
gpuman__ has quit [Ping timeout: 480 seconds]
<FLHerne>
to be fair, if you read Phoronix comments *everyone* is in on the conspiracy according to someone
<FLHerne>
KDE developers are helping the Qt Company lock everyone into their not-really-free library, GNOME want to stamp out all other DEs and turn Linux into something for tablets
<FLHerne>
Everyone is going to have to buy a RHEL subscription until Google replace the kernel with Fuchsia
<Sachiel>
maybe I should have said "the substance abuse is in the comments"
<agd5f>
FLHerne, sensationalism gets page views
<dcbaker>
Sachiel: "[substance] abuse"
<Company>
FLHerne: to be fair, nobody has suggested munching horse paste to improve things yet
<FLHerne>
jenatali: tbh, I still have a lingering suspicion that WSL is primarily to avoid young coders having to install Linux
<FLHerne>
which is obviously bad for the Linux ecosystem
<FLHerne>
but maybe I'm just paranoid, and certainly OpenGL-DX doesn't have much to do with that
<jenatali>
Is it though? IMO it's only bad once it ceases being actual Linux in a VM
<jenatali>
It's not my product or my decisions, but it seems like a win-win to me, more people get easier exposure to Linux, but they don't necessarily have to leave Windows to do it
<FLHerne>
I think so; it means a lot of people will never interact with an actually free desktop environment, nor use [predominantly-FOSS] IDEs and tools that run on Linux
<jenatali>
The problem in my mind would be if we tried to extend it to the point that WSL's Linux flavor is something unique that you could never get at any other way. Hence why we have no interest in getting native D3D12 apps running on WSL
<FLHerne>
It's a model where the lowest-effort option is to use a handful of Linux tools when it's most convenient, but there's less motivation to go further than that level of toe-dipping
<jenatali>
Eh, I can see where you're coming from, but I'd argue the opposite - where people previously would've had to go all-in to experience any of it, now there's an easy option for toe-dipping before that tipping point
gpuman__ has joined #dri-devel
JohnnyonFlame has joined #dri-devel
gpuman_ has quit [Ping timeout: 480 seconds]
<FLHerne>
I suppose we'll find out in five or ten years
aswarup_ has joined #dri-devel
Duke`` has quit [Ping timeout: 480 seconds]
danvet has quit [Ping timeout: 480 seconds]
aswarup_ has quit []
mbrost has quit [Ping timeout: 480 seconds]
gpuman_ has joined #dri-devel
gpuman__ has quit [Ping timeout: 480 seconds]
Hi-Angel has quit [Remote host closed the connection]
mbrost has joined #dri-devel
Hi-Angel has joined #dri-devel
gouchi has quit [Remote host closed the connection]
gawin has quit [Quit: Konversation terminated!]
gpuman__ has joined #dri-devel
<karolherbst>
tagr: if you find some time, mind debugging your modifier patches with gnome 41 on fedora 35? There is something weird going on and I have no clue what. This was the bug I actually wanted to debug, but those memory corruptions got in the way
gpuman_ has quit [Ping timeout: 480 seconds]
iive has quit [Ping timeout: 480 seconds]
mbrost has quit [Ping timeout: 480 seconds]
<alyssa>
jenatali: 🍿
<karolherbst>
FLHerne, jenatali: just wait until MS announces a Linux distribution with Windows as the desktop, then you can restart your argument :P
camus has joined #dri-devel
Hi-Angel has quit [Remote host closed the connection]
<FLHerne>
That would be pretty neat
<FLHerne>
I don't see it ever happening though
<karolherbst>
FLHerne: would you have seen MS contributing to Linux happening 10 years ago?
<karolherbst>
especially to mesa
<karolherbst>
getting into mono is also a surprising step
<FLHerne>
Mesa, honestly yes eventually
<karolherbst>
not sure what their end goal is, but I'd believe them if they say they are trying
gpuman_ has joined #dri-devel
<karolherbst>
wouldn't be the case 10 years ago
<FLHerne>
I still expect Mesa to replace most blob drivers eventually
<FLHerne>
including the Windows ones
<karolherbst>
that would be cool
<karolherbst>
but I don't think the outlook was that great 10 years ago
<FLHerne>
and I expected that ten years ago even if it was rather more remote then
camus1 has quit [Ping timeout: 480 seconds]
gpuman__ has quit [Ping timeout: 480 seconds]
pcercuei has quit [Quit: dodo]
<karolherbst>
was it really an expectation or just a hope?
<bnieuwenhuizen>
just an extrapolation from AMD being the second vendor with an official mesa driver around that time :P
<FLHerne>
iirc that was about when radeonsi and the first coming of Clover were the shiny thing
<FLHerne>
Yeah, that
<karolherbst>
ahh
<FLHerne>
then it slowed down a bit until Valve and all the ARM drivers came along
<karolherbst>
yeah well.. a bit sad about OpenCL dying and everything
<karolherbst>
but quite interesting that it's getting up a bit again
<FLHerne>
I'd believe them if they said it was the plan, but failing that I don't believe it's the plan
<karolherbst>
mhhh, not sure
<FLHerne>
you'd need to somehow fit the Windows stable-forever kernel driver ABI onto Linux... if you wanted to not break everyone's stuff
<FLHerne>
that or support every crazy peripheral upstream
<karolherbst>
I don't think this is the issue
<karolherbst>
the issue is just that like 99% of the desktops out there aren't user friendly
<karolherbst>
and we have too many of those "just use the cli" people
<FLHerne>
Surely "a Linux distribution with Windows as the desktop" bypasses that?
<jenatali>
FLHerne: I'd honestly doubt that Windows drivers will ship out of Mesa. Partly because unfortunately there's still licensing restrictions on the Windows DDK that I believe would prevent open-sourcing of Windows kernel drivers...
<karolherbst>
FLHerne: exactly
<FLHerne>
I remember it the other way round, you could replace the Windows desktop shell with KDE 4 once
<karolherbst>
not because the desktop will suddenly change anything
<karolherbst>
just that Microsoft have to provide a proper GUI for their GUI admins :p
<karolherbst>
*has
<bnieuwenhuizen>
jenatali: but mesa isn't kernel drivers anyway right?
<karolherbst>
so all of a sudden you have this drive to have GUIs for everything
<FLHerne>
jenatali: Yeah, but that's the sort of thing you can fix now that MS is doing Mesa :p
<karolherbst>
and a another company having to support that stuff
<karolherbst>
try to use linux without using the cli for a year
<jenatali>
bnieuwenhuizen: True, I believe the usermode DDK bits are also restrictively licensed
<karolherbst>
and report back next XDC :)
<jenatali>
FLHerne: Re-licensing things is uh... not in my capability lol
<karolherbst>
"not using the CLI challenge on Linux" or "why I gave up after three days"
<karolherbst>
heck.. I have to use the CLI to manage files
<karolherbst>
because using mv actually works and drag&drop can fail me
<FLHerne>
jenatali: You guys relicensed whole chunks of .NET and so on, clearly it's in someone's capability
<karolherbst>
so yes.. I don't think valve is to blame that their linux advances are failing
<jenatali>
True, we've also re-licensed the D3D12 API headers, but the DDK has different ownership and different licensing
Hi-Angel has joined #dri-devel
<jenatali>
Either way, not my call
<karolherbst>
I am already happy that now I am using a desktop with working mixed DPI scaling :)
<FLHerne>
Yeah, but when AMD come knocking and want to halve their driver stack, what's the rationale for saying 'no'?
<FLHerne>
there's a precedent for open-sourcing these kinds of headers now
<jenatali>
Yep, fair points. If we had a compelling reason, I bet we could convince people to make it happen
<FLHerne>
Maybe it'll be another decade
<FLHerne>
but I've been waiting a decade, so that's fine :p
<karolherbst>
I suspect that a graphics stack has a more difficult patent situation
<FLHerne>
There's a precedent for granting free open-source use of patents too; see exFAT
<karolherbst>
source being public is one thing
<karolherbst>
doesn't help if nobody can use it commercially like that
<karolherbst>
FLHerne: sure, not saying impossible
<karolherbst>
just "more difficult"
<karolherbst>
like.. just think about HDMI 2.2 or DP 2.0
<karolherbst>
so those older specs were public and now they decided to make them private ...
<bnieuwenhuizen>
wait, DP 2.0 too?
<karolherbst>
yeah, but VESA membership or so helps
<bnieuwenhuizen>
I thought HDMI 2.1+ were an issue?
<karolherbst>
VESA is nice to us
<karolherbst>
HDMI on the other hand.. is not
<karolherbst>
if there just would be enough company to yell at HDMI :p
<karolherbst>
*companies
<karolherbst>
jenatali: hey.. there is something you could help us with :p
rasterman has joined #dri-devel
<jenatali>
Pretty sure that also falls under the "not my call" category :P
<karolherbst>
jenatali: well... worth trying though :D
<karolherbst>
we kind of need the HDMI spec in a way we can make use of it inside linux :p