ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
mbrost has joined #dri-devel
penguin42 has quit [Remote host closed the connection]
co1umbarius has joined #dri-devel
columbarius has quit [Ping timeout: 480 seconds]
mbrost has quit [Ping timeout: 480 seconds]
nehsou^ has quit [Remote host closed the connection]
agd5f has joined #dri-devel
TheCompany has quit []
agd5f_ has quit [Ping timeout: 480 seconds]
ngcortes has quit [Ping timeout: 480 seconds]
mdroper has quit [Read error: Connection reset by peer]
UnOriginalOne has joined #dri-devel
Danct12 has joined #dri-devel
ngcortes has joined #dri-devel
UnOriginal1 has joined #dri-devel
UnOriginalOne has quit [Ping timeout: 480 seconds]
UnOriginal1 has quit [Quit: Leaving]
ngcortes has quit [Ping timeout: 480 seconds]
sarnex has quit [Read error: Connection reset by peer]
sarnex has joined #dri-devel
Zopolis4 has joined #dri-devel
ayaka has joined #dri-devel
yuq825 has joined #dri-devel
krushia has joined #dri-devel
bmodem has joined #dri-devel
heat has quit [Remote host closed the connection]
heat has joined #dri-devel
heat has quit [Read error: No route to host]
heat has joined #dri-devel
dcz has joined #dri-devel
YuGiOhJCJ has joined #dri-devel
kzd has quit [Ping timeout: 480 seconds]
Zopolis4 has quit [Quit: Connection closed for inactivity]
bgs has joined #dri-devel
YuGiOhJCJ has quit [Ping timeout: 480 seconds]
YuGiOhJCJ has joined #dri-devel
itoral has joined #dri-devel
Duke`` has quit [Ping timeout: 480 seconds]
bmodem1 has joined #dri-devel
bmodem has quit [Ping timeout: 480 seconds]
heat has quit [Remote host closed the connection]
heat has joined #dri-devel
<tomeu> if anybody has trouble building rusticl with clang 16, try upgrading bindgen to something newer than 0.59.2 (what is in CI currently)
mszyprow has joined #dri-devel
Zopolis4 has joined #dri-devel
thellstrom has joined #dri-devel
elongbug_ has quit [Read error: Connection reset by peer]
elongbug_ has joined #dri-devel
<kode54> wow
<kode54> check out this um, thing
<kode54> (video recording of monitor showing some fine sync looking jank)
<kode54> (video on screen is of a bald eagle getting its beak planed, because that's a thing captive birds need?)
bgs has quit [Remote host closed the connection]
yixie has joined #dri-devel
sghuge has quit [Remote host closed the connection]
sghuge has joined #dri-devel
frieder has joined #dri-devel
fab has joined #dri-devel
tzimmermann has joined #dri-devel
tursulin has joined #dri-devel
fab has quit [Ping timeout: 480 seconds]
bmodem1 has quit [Ping timeout: 480 seconds]
bbrezillon has quit [Remote host closed the connection]
rasterman has joined #dri-devel
bbrezillon has joined #dri-devel
pochu has joined #dri-devel
<sima> kode54, looks like tile-y ccs compression lolz
dcz_ has joined #dri-devel
pcercuei has joined #dri-devel
dcz has quit [Ping timeout: 480 seconds]
eukara has quit [Ping timeout: 480 seconds]
lynxeye has joined #dri-devel
heat has quit [Remote host closed the connection]
heat has joined #dri-devel
vjaquez has quit [Remote host closed the connection]
sarahwalker has joined #dri-devel
thellstrom1 has joined #dri-devel
swalker_ has joined #dri-devel
swalker_ is now known as Guest1014
thellstrom has quit [Ping timeout: 480 seconds]
sarahwalker has quit [Ping timeout: 480 seconds]
bmodem has joined #dri-devel
bmodem has quit [Ping timeout: 480 seconds]
Company has joined #dri-devel
Leopold_ has joined #dri-devel
alpalcone has joined #dri-devel
lplc has quit [Ping timeout: 480 seconds]
rauji__ has quit []
dcz has joined #dri-devel
dcz_ has quit [Ping timeout: 480 seconds]
cmichael has joined #dri-devel
TMM has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
TMM has joined #dri-devel
vliaskov has joined #dri-devel
Zopolis4 has quit [Quit: Connection closed for inactivity]
heat has quit [Remote host closed the connection]
heat has joined #dri-devel
dviola has quit [Read error: Connection reset by peer]
dviola has joined #dri-devel
Lucretia has quit []
vliaskov has quit [Read error: Connection reset by peer]
YuGiOhJCJ has quit [Quit: YuGiOhJCJ]
mitchpk has joined #dri-devel
Leopold__ has joined #dri-devel
dcz has quit [Ping timeout: 480 seconds]
mitchpk has quit [Quit: Leaving]
dcz has joined #dri-devel
Leopold_ has quit [Ping timeout: 480 seconds]
itoral has quit [Ping timeout: 480 seconds]
TMM has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
TMM has joined #dri-devel
Daanct12 has joined #dri-devel
Daaanct12 has quit [Read error: Connection reset by peer]
Daaanct12 has joined #dri-devel
Daanct12 has quit [Ping timeout: 480 seconds]
<rodrigovivi> airlied sima: what's your take on this https://lore.kernel.org/all/2023051029-overspend-sherry-1b85@gregkh/ ?
<rodrigovivi> should we continue using STAGING config in our drm drivers and make sure we taint the kernel? or should we create a new DRM_STAGING tag depending on EXPERT and move the current usages to it and then add it to Xe?
<rodrigovivi> demarchi ^
smiles_1111 has joined #dri-devel
Daaanct12 has quit [Remote host closed the connection]
smilessh has quit [Ping timeout: 480 seconds]
Danct12 has quit [Quit: WeeChat 3.8]
<sima> rodrigovivi, for out-of-tree it's kinda whatever
<sima> once it's merged I think the alpha_support_flag is good enough until it's all fully validated
<sima> since that should taint too
<sima> ofc if you're already 100% confident on the uapi and "not too many bugs to the point we can't support the thing" then you can enable right away imo
<sima> since either way there's no taking back with uapi really
dcz_ has joined #dri-devel
dcz has quit [Ping timeout: 480 seconds]
mbrost has joined #dri-devel
mbrost_ has joined #dri-devel
jkrzyszt_ has quit [Remote host closed the connection]
mbrost has quit [Ping timeout: 480 seconds]
agd5f_ has joined #dri-devel
jkrzyszt_ has joined #dri-devel
psykose has joined #dri-devel
mbrost_ has quit [Ping timeout: 480 seconds]
agd5f has quit [Ping timeout: 480 seconds]
<karolherbst> sima: do you have some time to help me understand what's going on with https://gitlab.freedesktop.org/drm/nouveau/-/issues/209 ?
<sima> regrets?
<karolherbst> how so? I just don't see the problem lockdep points out here
<sima> oh just general sarcasm, there's usually not a good time ahead when your lockdep splat has 4 locks involved
<karolherbst> yeah, but I also don't see the deadlock anyway
<sima> btw if you have anyhing with more than 2 contexts (2 locks or lock + irq) the "Possible unsafe locking scenario:" is bs
<sima> it's only correct for 2
<karolherbst> ahh
<sima> so you need to build the full graph yourself
<karolherbst> shouldn't get printed then
<sima> where's the entertainment in that
mszyprow has quit [Remote host closed the connection]
<sima> but yeah the loop seems to be calling request_fw while holding a lock
<sima> which has a chain back to mmap_sem
<sima> which deadlocks, because request_fw needs userspace and that needs to be able to take faults :-)
<sima> so pull that request_fw call out of any mutex section and you should be good
<karolherbst> ahh
<karolherbst> so generally, request_firmware while holidng locks == bad
<sima> you get the splat only into some arbitrary vfs locks but that's the root issue I think
<sima> yup
<karolherbst> I have to try to trigger this anyway
<karolherbst> mhhh
<karolherbst> sima: yeah soo. bad news, I doubt we can move that request_firmware out of locked sections
<sima> maybe put a mmap_read_lock(mm); mmap_read_unlock(mm); into request_fw to trigger it more reliably
<sima> karolherbst, hence regrets :-P
<sima> karolherbst, from a very quick look
<sima> you get to add another hook plus bail-out path
<sima> so you can request the fw in the ioctl, stuff it somewhere and retry
<karolherbst> mhh. annoying
<sima> or you make sure you load all these at driver init time, which is kinda how it should
<karolherbst> yeah.. we already moved some firmware to that
<sima> (aside from the annoyance of delays and I guess harder to hack around on)
<karolherbst> and I was thinking about doing the same with this one then
<sima> yeah, somehow you need to pull it out
<sima> the annoyance with delays is that when you do this at driver load it's either a stall, or you do it async and then it's still a dependency, but you might accidentally cheat lockdep out of noticing it
<karolherbst> Maybe I just talk to Ben about it, because with GSP all that changes anyway
<karolherbst> and with GSP we get like multi MB firmware
<karolherbst> sima: wasn't the problem with loading firmware on init, that we need an FS in the first place?
<karolherbst> which kinda requires a userspace program context?
<sima> karolherbst, the fs can be the initramfs
<karolherbst> right.. but I was kinda under the impression it needs to start from an ioctl, $because
<karolherbst> ohh wait, that was just a problem with suspend/resume/whatever
<karolherbst> so on suspect you can't load firmware
<karolherbst> *suspend
<karolherbst> sima: so I guess what we should do is to call request_firmware_nowait on all firmware files we need and just make code needing that firmware to wait on its completion?
lplc has joined #dri-devel
<karolherbst> which we should be able to do outside of locked areas
<karolherbst> mhh
<karolherbst> maybe not
<karolherbst> uhhh
<karolherbst> annoying
alpalcone has quit [Ping timeout: 480 seconds]
yuq825 has quit []
<sima> karolherbst, yeah the wait is still an issue
<sima> but if you just wait for a completion lockdep wont see the chain anymore
<sima> "there I fixed it"
Haaninjo has joined #dri-devel
vliaskov has joined #dri-devel
fxkamd has joined #dri-devel
<karolherbst> sima: yeah.. so I think the issue here is, that when all users of the GPU disappear, we kinda clean stuff up, but once there is another user, we reinitialize things, which might trigger the firmware to be loaded again. So my assumption is, if we just cache it the first time it should be fine...
<karolherbst> it's kinda a pita situation tho... I mean the entire way of how firmware loading things
<karolherbst> *works
<sima> karolherbst, why do you clean up when all users disappear?
<karolherbst> I have no idea
<sima> is that just poor runtime pm because vgaswitcheroo didn't need more?
<karolherbst> I doubt it
<sima> but yeah loading the fw once and caching it is the way to go I think
<karolherbst> we also do so because we had issues doing firmware stuff on suspend
<karolherbst> for some stuff
<sima> but first time still needs to be outside the locks
<karolherbst> guess we should do it for more
<karolherbst> mhhh
<sima> well reloading fw on resume makes sense
<sima> it's more "why is this connected to users"
<sima> since ideally you suspend stuff even when there are users, as long as they didn't do anything for a while
<karolherbst> the problem was, we needed new firmware on suspend
<sima> yeah you need to cache
<karolherbst> we already do
<sima> hm what do you mean with "new fw" then?
<karolherbst> I mean, we fixed _that_ issue already
<karolherbst> but I don't think we need to do the initial loading outside locks
<karolherbst> the problem is more or less: process releases all fds to nouveau, takes mmap lock on it clean up path. Another process spawns opening an fd to nouveau and starts the init path, that might trigger a firmware load and with bad timing -> dead lock
<karolherbst> we kinda have locks per subsystem
<karolherbst> and uhh.. it's kinda annoying
<karolherbst> so I can see this happening midway through deinit/init
<sima> karolherbst, "I can explain how lockdep is wrong in this specific case" is not very good approach to shutting up lockdep :-)
<karolherbst> I'm not saying lockdep is wrong
<kode54> sima: maybe 14013111325?
<sima> kode54, ?
<kode54> workaround/bug number
<kode54> something to do with MCS?
<karolherbst> I'm just saying it's a deadlock when you have a process freeing the last nouveau resources vs one allocating one and it's just bad timing
<sima> karolherbst, hm I guess I misunderstood
<sima> but also my take on lockdep splats is that they tell you the design isn't clean
<karolherbst> so we if cache the firmware, on reinit it should be fine
<sima> even if there's some specific reasons for why it can or cannot happen in practice
<karolherbst> that case only happens if the firmware already got loaded once is my understanding here
<karolherbst> and doesn't on first load
<sima> yeah that's what I mean with "in practice this can't happen on first load, lockdep is wrong"
<sima> because it's different files/mmap_sem/whatever
<karolherbst> yeah, it's not complaining on first load here
<sima> hm I need to look at the splat again
<karolherbst> `__x64_sys_munmap` kinda gives it away I think
<karolherbst> mhh not sure
<sima> well if you say it doesn't splat on load
<sima> then that means the chain of lock deps matter
<sima> to tie the loop
<sima> and a lot of those are in nouveau cleanup code
<sima> so maybe there's another place where you have a not-so-great dependency that shouldn't be there for a clean design
<karolherbst> I'm sure there are some of those
<karolherbst> that splat with driver_probe_device is kinda weird
mbrost has joined #dri-devel
<karolherbst> maybe it's also a race on driver init and userpsace processes already using the driver?
<sima> lockdep doesn't care about races
<sima> as long as you hit all the call paths
<karolherbst> I know
<karolherbst> but
mbrost_ has joined #dri-devel
<karolherbst> mhhh
alpalcone has joined #dri-devel
<sima> so I don't know enough nouveau.ko by far to say anything
<sima> but in general taking big locks in ctor/dtor paths is smelly
<sima> and having a request_fw makes these locks the biggest you can have in the kernel
lplc has quit [Ping timeout: 480 seconds]
<sima> big = starting points in the dep graph
<sima> the issue could be taking these locks in dtor/ctor (but in the issue there's 3, so which one is it?)
<sima> or that the innermost one is too big due to the request_fw and really should just be a very, very inner mutex to serialize hw access
<sima> sometimes it's unavoidable and then you need to push dtor work into a cleanup queue
fxkamd has quit []
<karolherbst> right, but this is more ref counting stuff, where nouveau frees up resources on last use
<karolherbst> it's also to deal with the fact, that you can enable/disable certain parts of the GPU
<sima> yeah holding the mutex over more than the dec is an antipattern
<sima> it's kref_put (good) vs kref_put_mutex (pain)
mbrost has quit [Ping timeout: 480 seconds]
<karolherbst> yeah.. so we guard the initialization of those bits with locks
<karolherbst> and we take the same lock on unref
<karolherbst> mhh, it's all kinda weird, maybe Ben has any ideas
kzd has joined #dri-devel
Duke`` has joined #dri-devel
Danct12 has joined #dri-devel
iive has joined #dri-devel
<kode54> might have been mesa!22802
<q4a> Hi. Is there channel for Imagination GPU? like #_oftc_#panfrost:matrix.org for Mali or #_oftc_#d3d9:matrix.org for Nine.
<alyssa> yeah
<alyssa> #powervr
smilessh has joined #dri-devel
smiles_1111 has quit [Ping timeout: 480 seconds]
<q4a> Thanks!
pochu has quit [Quit: leaving]
frankbinns has joined #dri-devel
tzimmermann has quit [Quit: Leaving]
frieder has quit [Remote host closed the connection]
cmichael has quit [Quit: Leaving]
Guest1014 has quit [Remote host closed the connection]
<pinchartl> sima: would you consider dim to be usable for other subsystems ? not necessarily as such, but would the required refactoring of the script to move DRM-specific bits to a configuration file be something that would be a) doable and b) accepted upstream ?
<sima> not sure
kts has joined #dri-devel
<sima> in the end it's bash
kts has quit []
<pinchartl> at least it's not perl ;-)
<sima> but also, it has config files, both for user specific stuff (.dimrc) and for what kind of trees/branches you have and how the integration tree is built (nightly.conf for historical reasons)
<sima> so I think the only thing needed is to make the "where is the branch with nightly.conf" configurable you're probably 90% there
<pinchartl> I may have a look, depending on how much buy-in I would get for using it for V4L2
Daanct12 has joined #dri-devel
Danct12 has quit [Ping timeout: 480 seconds]
thenemesis has joined #dri-devel
smilessh has quit [Ping timeout: 480 seconds]
thenemesis has quit []
thenemesis has joined #dri-devel
thenemesis has quit []
jeeeun841351 has quit [Quit: The Lounge - https://thelounge.chat]
jeeeun841351 has joined #dri-devel
thellstrom has joined #dri-devel
thellstrom1 has quit [Read error: Connection reset by peer]
clem_ has left #dri-devel [#dri-devel]
lynxeye has quit [Quit: Leaving.]
gouchi has joined #dri-devel
gouchi has quit []
kts has joined #dri-devel
pyromancy has joined #dri-devel
pyromancy has quit []
pyromancy has joined #dri-devel
dviola has quit [Ping timeout: 480 seconds]
dviola has joined #dri-devel
dviola has quit []
mbrost__ has joined #dri-devel
mbrost has joined #dri-devel
mbrost_ has quit [Ping timeout: 480 seconds]
<DemiMarie> pinchartl: Perl > bash
pyromancy has quit [Quit: pyromancy]
mbrost__ has quit [Ping timeout: 480 seconds]
pyromancy has joined #dri-devel
ngcortes has joined #dri-devel
tursulin has quit [Ping timeout: 480 seconds]
mbrost has quit [Remote host closed the connection]
mbrost has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
kts has quit [Quit: Konversation terminated!]
psykose has quit [Remote host closed the connection]
jolan has quit [Quit: leaving]
jolan has joined #dri-devel
tango_ has quit [Ping timeout: 480 seconds]
gildekel has quit [Quit: Connection closed for inactivity]
tango_ has joined #dri-devel
<karolherbst> gfxstrand: I think I know how to deal with crossworkgroup initializers: make them the frontends problem. We should simply add some APIs (maybe to clc) to query for all of them and the frontends just creates the nir_shader normally for them. And then it's entirely up to the frontend to when to call them
<karolherbst> mhh actually... those are always blocks of constant memory, no? Though for destructors it might not be
<karolherbst> Initializer + Finalizer
<karolherbst> but yeah, for those I think clc should just expose that information
<karolherbst> and make it part of `clc_kernel_info` or something
<karolherbst> I wonder if we want to lower CrossWorkgroup Variable initializers to a kernel function just writing to memory passed in as the only arg
<karolherbst> hh
<karolherbst> mhhh
<jenatali> Yeah, that was my thinking about how to handle it too
<karolherbst> sadly I really need this soonish now :)
<jenatali> Ping me if you need code review
<karolherbst> I think the Initializer + Finalizer case is pretty straightforward
<karolherbst> the variable initializer one is the annoying part
<karolherbst> also
<karolherbst> I think that `cl_khr_subgroups` line is actually wrong
<karolherbst> soo.. this is where it's getting weird
<karolherbst> OpenCL core is different from cl_khr_subgroups
<jenatali> For initializers, I think it can just be reflected out and make it the frontend's problem. It just needs to ensure that the data is uploaded into the buffer before the first kernel is executed
<karolherbst> the extension requires CL_DEVICE_SUB_GROUP_INDEPENDENT_FORWARD_PROGRESS where OpenCL core doesn't
<karolherbst> so I'm not sure if it's safe to always expose the extension string via clc
<jenatali> Oof
<jenatali> Yeah maybe split that
<karolherbst> subgroup + subgroup_ifp or subgroup_khr_ext?
<jenatali> I think subgroup_ifp
<karolherbst> the CTS is happy eihter way, but I don't want to debug applications where they actually check that correct in kernel and do things differently :)
<karolherbst> *correctly
<karolherbst> but I'm also not sure if we could just return true for all hw vendors here. I just don't know
karolherbst is now known as karolherbst_
<karolherbst_> ohh it's for threads within a sub group
<karolherbst_> yeah.. I guess not then
karolherbst_ is now known as karolherbst
<karolherbst> ehh no, it's about subgroups towards each other actually
<karolherbst> that's way more sane
<airlied> konstantin_: asan with just a cts run doesn't produce anything locally?
<konstantin_> One sec, lemme try
<konstantin_> passes
rasterman has quit [Quit: Gettin' stinky!]
<konstantin_> airlied: btw, why does llvmpipe copy constant buffers? I had to add it back because it caused some failures with the llvmpipe job but now, all dEQP-VK.descriptor_indexing.sampler_after_bind* test fail
<karolherbst> jenatali: I think I have a better idea... spirv_to_nir sould just return a array like for constant, and if the frontend allocates memory it just uploads the data...
<karolherbst> and for Initializer + Finalizer we just compile the kernel in the frontend as discussed above
<jenatali> karolherbst: Yeah isn't that what I said?
<jenatali> "For initializers, I think it can just be reflected out and make it the frontend's problem. It just needs to ensure that the data is uploaded into the buffer before the first kernel is executed"
<karolherbst> ohh, didn't see that
<karolherbst> yeah, it can be done on creation of the cl_program object and it just lives there or something...
<karolherbst> yeah...
<karolherbst> shouldn't be too hard then
<jenatali> Yeah it has to live there, since it can be shared among kernels in the program
<karolherbst> finalizers are funky
<karolherbst> I kinda don't see the point, but I can see weird pointer magic happening
<karolherbst> I suspect you could store pointers to memory there and retrieve it after the program object was destroyed
<karolherbst> it's messed up, but...
jkrzyszt_ has quit [Ping timeout: 480 seconds]
apinheiro has joined #dri-devel
apinheiro has quit [Quit: Leaving]
dcz_ has quit [Ping timeout: 480 seconds]
<airlied> konstantin_: complete CTS run though? it seems weird venus is triggering anything
<airlied> konstantin_: I'm not actually 100% what was behind the consts copying, I think it comes from the olden days where consts could get overwritten somehow
<airlied> so it would copy the consts, I'm not sure what value there really is doing it now if any
alyssa has left #dri-devel [#dri-devel]
Duke`` has quit [Ping timeout: 480 seconds]
tango_ has quit [Ping timeout: 480 seconds]
tango_ has joined #dri-devel
Haaninjo has quit [Quit: Ex-Chat]
Kayden has quit [Quit: -> insomnia]
q66 is now known as Guest1071
q66 has joined #dri-devel
elongbug_ has quit [Read error: Connection reset by peer]
<karolherbst> dcbaker: updated my rust.bindgen + llvm fix https://github.com/mesonbuild/meson/pull/11733
<karolherbst> I don't really understand that loop in get_compiler_system_dirs, so I just copied it and hope for the best
Guest1071 has quit []
sima has quit [Ping timeout: 480 seconds]
pcercuei has quit [Quit: dodo]
smilessh has joined #dri-devel
vliaskov has quit [Remote host closed the connection]
thellstrom has quit [Ping timeout: 480 seconds]
fxkamd has joined #dri-devel
mbrost has joined #dri-devel
alyssa has joined #dri-devel
alyssa has left #dri-devel [#dri-devel]
mbrost_ has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
mbrost__ has joined #dri-devel
mbrost has joined #dri-devel
mbrost_ has quit [Ping timeout: 480 seconds]
mbrost__ has quit [Ping timeout: 480 seconds]
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
iive has quit [Quit: They came for me...]
Kayden has joined #dri-devel