<rodrigovivi>
should we continue using STAGING config in our drm drivers and make sure we taint the kernel? or should we create a new DRM_STAGING tag depending on EXPERT and move the current usages to it and then add it to Xe?
<rodrigovivi>
demarchi ^
smiles_1111 has joined #dri-devel
Daaanct12 has quit [Remote host closed the connection]
smilessh has quit [Ping timeout: 480 seconds]
Danct12 has quit [Quit: WeeChat 3.8]
<sima>
rodrigovivi, for out-of-tree it's kinda whatever
<sima>
once it's merged I think the alpha_support_flag is good enough until it's all fully validated
<sima>
since that should taint too
<sima>
ofc if you're already 100% confident on the uapi and "not too many bugs to the point we can't support the thing" then you can enable right away imo
<sima>
since either way there's no taking back with uapi really
dcz_ has joined #dri-devel
dcz has quit [Ping timeout: 480 seconds]
mbrost has joined #dri-devel
mbrost_ has joined #dri-devel
jkrzyszt_ has quit [Remote host closed the connection]
<karolherbst>
how so? I just don't see the problem lockdep points out here
<sima>
oh just general sarcasm, there's usually not a good time ahead when your lockdep splat has 4 locks involved
<karolherbst>
yeah, but I also don't see the deadlock anyway
<sima>
btw if you have anyhing with more than 2 contexts (2 locks or lock + irq) the "Possible unsafe locking scenario:" is bs
<sima>
it's only correct for 2
<karolherbst>
ahh
<sima>
so you need to build the full graph yourself
<karolherbst>
shouldn't get printed then
<sima>
where's the entertainment in that
mszyprow has quit [Remote host closed the connection]
<sima>
but yeah the loop seems to be calling request_fw while holding a lock
<sima>
which has a chain back to mmap_sem
<sima>
which deadlocks, because request_fw needs userspace and that needs to be able to take faults :-)
<sima>
so pull that request_fw call out of any mutex section and you should be good
<karolherbst>
ahh
<karolherbst>
so generally, request_firmware while holidng locks == bad
<sima>
you get the splat only into some arbitrary vfs locks but that's the root issue I think
<sima>
yup
<karolherbst>
I have to try to trigger this anyway
<karolherbst>
mhhh
<karolherbst>
sima: yeah soo. bad news, I doubt we can move that request_firmware out of locked sections
<sima>
maybe put a mmap_read_lock(mm); mmap_read_unlock(mm); into request_fw to trigger it more reliably
<sima>
karolherbst, hence regrets :-P
<sima>
karolherbst, from a very quick look
<sima>
you get to add another hook plus bail-out path
<sima>
so you can request the fw in the ioctl, stuff it somewhere and retry
<karolherbst>
mhh. annoying
<sima>
or you make sure you load all these at driver init time, which is kinda how it should
<karolherbst>
yeah.. we already moved some firmware to that
<sima>
(aside from the annoyance of delays and I guess harder to hack around on)
<karolherbst>
and I was thinking about doing the same with this one then
<sima>
yeah, somehow you need to pull it out
<sima>
the annoyance with delays is that when you do this at driver load it's either a stall, or you do it async and then it's still a dependency, but you might accidentally cheat lockdep out of noticing it
<karolherbst>
Maybe I just talk to Ben about it, because with GSP all that changes anyway
<karolherbst>
and with GSP we get like multi MB firmware
<karolherbst>
sima: wasn't the problem with loading firmware on init, that we need an FS in the first place?
<karolherbst>
which kinda requires a userspace program context?
<sima>
karolherbst, the fs can be the initramfs
<karolherbst>
right.. but I was kinda under the impression it needs to start from an ioctl, $because
<karolherbst>
ohh wait, that was just a problem with suspend/resume/whatever
<karolherbst>
so on suspect you can't load firmware
<karolherbst>
*suspend
<karolherbst>
sima: so I guess what we should do is to call request_firmware_nowait on all firmware files we need and just make code needing that firmware to wait on its completion?
lplc has joined #dri-devel
<karolherbst>
which we should be able to do outside of locked areas
<karolherbst>
mhh
<karolherbst>
maybe not
<karolherbst>
uhhh
<karolherbst>
annoying
alpalcone has quit [Ping timeout: 480 seconds]
yuq825 has quit []
<sima>
karolherbst, yeah the wait is still an issue
<sima>
but if you just wait for a completion lockdep wont see the chain anymore
<sima>
"there I fixed it"
Haaninjo has joined #dri-devel
vliaskov has joined #dri-devel
fxkamd has joined #dri-devel
<karolherbst>
sima: yeah.. so I think the issue here is, that when all users of the GPU disappear, we kinda clean stuff up, but once there is another user, we reinitialize things, which might trigger the firmware to be loaded again. So my assumption is, if we just cache it the first time it should be fine...
<karolherbst>
it's kinda a pita situation tho... I mean the entire way of how firmware loading things
<karolherbst>
*works
<sima>
karolherbst, why do you clean up when all users disappear?
<karolherbst>
I have no idea
<sima>
is that just poor runtime pm because vgaswitcheroo didn't need more?
<karolherbst>
I doubt it
<sima>
but yeah loading the fw once and caching it is the way to go I think
<karolherbst>
we also do so because we had issues doing firmware stuff on suspend
<karolherbst>
for some stuff
<sima>
but first time still needs to be outside the locks
<karolherbst>
guess we should do it for more
<karolherbst>
mhhh
<sima>
well reloading fw on resume makes sense
<sima>
it's more "why is this connected to users"
<sima>
since ideally you suspend stuff even when there are users, as long as they didn't do anything for a while
<karolherbst>
the problem was, we needed new firmware on suspend
<sima>
yeah you need to cache
<karolherbst>
we already do
<sima>
hm what do you mean with "new fw" then?
<karolherbst>
I mean, we fixed _that_ issue already
<karolherbst>
but I don't think we need to do the initial loading outside locks
<karolherbst>
the problem is more or less: process releases all fds to nouveau, takes mmap lock on it clean up path. Another process spawns opening an fd to nouveau and starts the init path, that might trigger a firmware load and with bad timing -> dead lock
<karolherbst>
we kinda have locks per subsystem
<karolherbst>
and uhh.. it's kinda annoying
<karolherbst>
so I can see this happening midway through deinit/init
<sima>
karolherbst, "I can explain how lockdep is wrong in this specific case" is not very good approach to shutting up lockdep :-)
<karolherbst>
I'm not saying lockdep is wrong
<kode54>
sima: maybe 14013111325?
<sima>
kode54, ?
<kode54>
workaround/bug number
<kode54>
something to do with MCS?
<karolherbst>
I'm just saying it's a deadlock when you have a process freeing the last nouveau resources vs one allocating one and it's just bad timing
<sima>
karolherbst, hm I guess I misunderstood
<sima>
but also my take on lockdep splats is that they tell you the design isn't clean
<karolherbst>
so we if cache the firmware, on reinit it should be fine
<sima>
even if there's some specific reasons for why it can or cannot happen in practice
<karolherbst>
that case only happens if the firmware already got loaded once is my understanding here
<karolherbst>
and doesn't on first load
<sima>
yeah that's what I mean with "in practice this can't happen on first load, lockdep is wrong"
<sima>
because it's different files/mmap_sem/whatever
<karolherbst>
yeah, it's not complaining on first load here
<sima>
hm I need to look at the splat again
<karolherbst>
`__x64_sys_munmap` kinda gives it away I think
<karolherbst>
mhh not sure
<sima>
well if you say it doesn't splat on load
<sima>
then that means the chain of lock deps matter
<sima>
to tie the loop
<sima>
and a lot of those are in nouveau cleanup code
<sima>
so maybe there's another place where you have a not-so-great dependency that shouldn't be there for a clean design
<karolherbst>
I'm sure there are some of those
<karolherbst>
that splat with driver_probe_device is kinda weird
mbrost has joined #dri-devel
<karolherbst>
maybe it's also a race on driver init and userpsace processes already using the driver?
<sima>
lockdep doesn't care about races
<sima>
as long as you hit all the call paths
<karolherbst>
I know
<karolherbst>
but
mbrost_ has joined #dri-devel
<karolherbst>
mhhh
alpalcone has joined #dri-devel
<sima>
so I don't know enough nouveau.ko by far to say anything
<sima>
but in general taking big locks in ctor/dtor paths is smelly
<sima>
and having a request_fw makes these locks the biggest you can have in the kernel
lplc has quit [Ping timeout: 480 seconds]
<sima>
big = starting points in the dep graph
<sima>
the issue could be taking these locks in dtor/ctor (but in the issue there's 3, so which one is it?)
<sima>
or that the innermost one is too big due to the request_fw and really should just be a very, very inner mutex to serialize hw access
<sima>
sometimes it's unavoidable and then you need to push dtor work into a cleanup queue
fxkamd has quit []
<karolherbst>
right, but this is more ref counting stuff, where nouveau frees up resources on last use
<karolherbst>
it's also to deal with the fact, that you can enable/disable certain parts of the GPU
<sima>
yeah holding the mutex over more than the dec is an antipattern
<sima>
it's kref_put (good) vs kref_put_mutex (pain)
mbrost has quit [Ping timeout: 480 seconds]
<karolherbst>
yeah.. so we guard the initialization of those bits with locks
<karolherbst>
and we take the same lock on unref
<karolherbst>
mhh, it's all kinda weird, maybe Ben has any ideas
kzd has joined #dri-devel
Duke`` has joined #dri-devel
Danct12 has joined #dri-devel
iive has joined #dri-devel
<kode54>
might have been mesa!22802
<q4a>
Hi. Is there channel for Imagination GPU? like #_oftc_#panfrost:matrix.org for Mali or #_oftc_#d3d9:matrix.org for Nine.
<alyssa>
yeah
<alyssa>
#powervr
smilessh has joined #dri-devel
smiles_1111 has quit [Ping timeout: 480 seconds]
<q4a>
Thanks!
pochu has quit [Quit: leaving]
frankbinns has joined #dri-devel
tzimmermann has quit [Quit: Leaving]
frieder has quit [Remote host closed the connection]
cmichael has quit [Quit: Leaving]
Guest1014 has quit [Remote host closed the connection]
<pinchartl>
sima: would you consider dim to be usable for other subsystems ? not necessarily as such, but would the required refactoring of the script to move DRM-specific bits to a configuration file be something that would be a) doable and b) accepted upstream ?
<sima>
not sure
kts has joined #dri-devel
<sima>
in the end it's bash
kts has quit []
<pinchartl>
at least it's not perl ;-)
<sima>
but also, it has config files, both for user specific stuff (.dimrc) and for what kind of trees/branches you have and how the integration tree is built (nightly.conf for historical reasons)
<sima>
so I think the only thing needed is to make the "where is the branch with nightly.conf" configurable you're probably 90% there
<pinchartl>
I may have a look, depending on how much buy-in I would get for using it for V4L2
thellstrom1 has quit [Read error: Connection reset by peer]
clem_ has left #dri-devel [#dri-devel]
lynxeye has quit [Quit: Leaving.]
gouchi has joined #dri-devel
gouchi has quit []
kts has joined #dri-devel
pyromancy has joined #dri-devel
pyromancy has quit []
pyromancy has joined #dri-devel
dviola has quit [Ping timeout: 480 seconds]
dviola has joined #dri-devel
dviola has quit []
mbrost__ has joined #dri-devel
mbrost has joined #dri-devel
mbrost_ has quit [Ping timeout: 480 seconds]
<DemiMarie>
pinchartl: Perl > bash
pyromancy has quit [Quit: pyromancy]
mbrost__ has quit [Ping timeout: 480 seconds]
pyromancy has joined #dri-devel
ngcortes has joined #dri-devel
tursulin has quit [Ping timeout: 480 seconds]
mbrost has quit [Remote host closed the connection]
mbrost has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
kts has quit [Quit: Konversation terminated!]
psykose has quit [Remote host closed the connection]
jolan has quit [Quit: leaving]
jolan has joined #dri-devel
tango_ has quit [Ping timeout: 480 seconds]
gildekel has quit [Quit: Connection closed for inactivity]
tango_ has joined #dri-devel
<karolherbst>
gfxstrand: I think I know how to deal with crossworkgroup initializers: make them the frontends problem. We should simply add some APIs (maybe to clc) to query for all of them and the frontends just creates the nir_shader normally for them. And then it's entirely up to the frontend to when to call them
<karolherbst>
mhh actually... those are always blocks of constant memory, no? Though for destructors it might not be
<karolherbst>
Initializer + Finalizer
<karolherbst>
but yeah, for those I think clc should just expose that information
<karolherbst>
and make it part of `clc_kernel_info` or something
<karolherbst>
I wonder if we want to lower CrossWorkgroup Variable initializers to a kernel function just writing to memory passed in as the only arg
<karolherbst>
hh
<karolherbst>
mhhh
<jenatali>
Yeah, that was my thinking about how to handle it too
<karolherbst>
sadly I really need this soonish now :)
<jenatali>
Ping me if you need code review
<karolherbst>
I think the Initializer + Finalizer case is pretty straightforward
<karolherbst>
the variable initializer one is the annoying part
<karolherbst>
I think that `cl_khr_subgroups` line is actually wrong
<karolherbst>
soo.. this is where it's getting weird
<karolherbst>
OpenCL core is different from cl_khr_subgroups
<jenatali>
For initializers, I think it can just be reflected out and make it the frontend's problem. It just needs to ensure that the data is uploaded into the buffer before the first kernel is executed
<karolherbst>
the extension requires CL_DEVICE_SUB_GROUP_INDEPENDENT_FORWARD_PROGRESS where OpenCL core doesn't
<karolherbst>
so I'm not sure if it's safe to always expose the extension string via clc
<jenatali>
Oof
<jenatali>
Yeah maybe split that
<karolherbst>
subgroup + subgroup_ifp or subgroup_khr_ext?
<jenatali>
I think subgroup_ifp
<karolherbst>
the CTS is happy eihter way, but I don't want to debug applications where they actually check that correct in kernel and do things differently :)
<karolherbst>
*correctly
<karolherbst>
but I'm also not sure if we could just return true for all hw vendors here. I just don't know
karolherbst is now known as karolherbst_
<karolherbst_>
ohh it's for threads within a sub group
<karolherbst_>
yeah.. I guess not then
karolherbst_ is now known as karolherbst
<karolherbst>
ehh no, it's about subgroups towards each other actually
<karolherbst>
that's way more sane
<airlied>
konstantin_: asan with just a cts run doesn't produce anything locally?
<konstantin_>
One sec, lemme try
<konstantin_>
passes
rasterman has quit [Quit: Gettin' stinky!]
<konstantin_>
airlied: btw, why does llvmpipe copy constant buffers? I had to add it back because it caused some failures with the llvmpipe job but now, all dEQP-VK.descriptor_indexing.sampler_after_bind* test fail
<karolherbst>
jenatali: I think I have a better idea... spirv_to_nir sould just return a array like for constant, and if the frontend allocates memory it just uploads the data...
<karolherbst>
and for Initializer + Finalizer we just compile the kernel in the frontend as discussed above
<jenatali>
karolherbst: Yeah isn't that what I said?
<jenatali>
"For initializers, I think it can just be reflected out and make it the frontend's problem. It just needs to ensure that the data is uploaded into the buffer before the first kernel is executed"
<karolherbst>
ohh, didn't see that
<karolherbst>
yeah, it can be done on creation of the cl_program object and it just lives there or something...
<karolherbst>
yeah...
<karolherbst>
shouldn't be too hard then
<jenatali>
Yeah it has to live there, since it can be shared among kernels in the program
<karolherbst>
finalizers are funky
<karolherbst>
I kinda don't see the point, but I can see weird pointer magic happening
<karolherbst>
I suspect you could store pointers to memory there and retrieve it after the program object was destroyed
<karolherbst>
it's messed up, but...
jkrzyszt_ has quit [Ping timeout: 480 seconds]
apinheiro has joined #dri-devel
apinheiro has quit [Quit: Leaving]
dcz_ has quit [Ping timeout: 480 seconds]
<airlied>
konstantin_: complete CTS run though? it seems weird venus is triggering anything
<airlied>
konstantin_: I'm not actually 100% what was behind the consts copying, I think it comes from the olden days where consts could get overwritten somehow
<airlied>
so it would copy the consts, I'm not sure what value there really is doing it now if any
alyssa has left #dri-devel [#dri-devel]
Duke`` has quit [Ping timeout: 480 seconds]
tango_ has quit [Ping timeout: 480 seconds]
tango_ has joined #dri-devel
Haaninjo has quit [Quit: Ex-Chat]
Kayden has quit [Quit: -> insomnia]
q66 is now known as Guest1071
q66 has joined #dri-devel
elongbug_ has quit [Read error: Connection reset by peer]