ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
psykose has quit [Remote host closed the connection]
psykose has joined #dri-devel
columbarius has joined #dri-devel
co1umbarius has quit [Ping timeout: 480 seconds]
ngcortes has quit [Ping timeout: 480 seconds]
heat has quit [Remote host closed the connection]
kzd has quit [Quit: kzd]
kzd has joined #dri-devel
dviola has left #dri-devel [WeeChat 4.0.0]
dviola has joined #dri-devel
digetx has quit [Ping timeout: 480 seconds]
digetx has joined #dri-devel
Danct12 is now known as Guest4502
Danct12 has joined #dri-devel
Danct12 has quit []
TMM has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
TMM has joined #dri-devel
orbea1 has quit []
orbea has joined #dri-devel
aravind has joined #dri-devel
<orowith2os> karolherbst: was poking around the Mesa source tree, rusticl specifically, and noticed that rusticl is currently on Rust 2018. Have you not gotten around to updating to 2021 yet, or do you want to stay on 2018 for backwards compat, or?
<orowith2os> I'd love to give it a small attempt myself to see if I can manage it, test my skills, but want to check with you first.
aravind has quit [Ping timeout: 480 seconds]
bmodem has joined #dri-devel
lina has quit [Quit: Lost terminal]
Duke`` has joined #dri-devel
bmodem has quit [Ping timeout: 480 seconds]
bmodem has joined #dri-devel
bmodem has quit [Excess Flood]
bmodem has joined #dri-devel
kzd has quit [Ping timeout: 480 seconds]
<orowith2os> never mind, git logs were goofy 😛
<orowith2os> it seems like it uses Rust 2021 already? At least, it uses the stdlib of 2021.
fab has joined #dri-devel
zf has quit [Ping timeout: 480 seconds]
Duke`` has quit [Ping timeout: 480 seconds]
zf has joined #dri-devel
bmodem has quit [Ping timeout: 480 seconds]
bmodem has joined #dri-devel
Ultra has quit [Ping timeout: 480 seconds]
Ultra has joined #dri-devel
zf has quit [Ping timeout: 480 seconds]
lina has joined #dri-devel
itoral has joined #dri-devel
zf has joined #dri-devel
sgruszka has joined #dri-devel
f11f12 has joined #dri-devel
fab has quit [Quit: fab]
rasterman has joined #dri-devel
bgs has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
tursulin has joined #dri-devel
fab has joined #dri-devel
sghuge has quit [Remote host closed the connection]
sghuge has joined #dri-devel
tzimmermann has joined #dri-devel
mbrost has joined #dri-devel
fab has quit [Read error: Connection reset by peer]
fab has joined #dri-devel
fab is now known as Guest4518
eukara has quit [Ping timeout: 480 seconds]
bmodem has quit [Ping timeout: 480 seconds]
Guest4518 has quit []
bmodem has joined #dri-devel
bmodem has quit [Excess Flood]
fab_ has joined #dri-devel
bmodem has joined #dri-devel
fab_ is now known as Guest4519
jhli has quit [Remote host closed the connection]
frieder has joined #dri-devel
frankbinns has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
aravind has joined #dri-devel
<MrCooper> pq: "If FB_ID is non-0, solid_fill blob is ignored" is backwards compatible, isn't it?
bmodem has quit [Ping timeout: 480 seconds]
bmodem has joined #dri-devel
bmodem has quit [Excess Flood]
bmodem has joined #dri-devel
JohnnyonF has joined #dri-devel
JohnnyonFlame has quit [Ping timeout: 480 seconds]
cwegener1 has quit [Quit: WeeChat 3.8]
JohnnyonFlame has joined #dri-devel
kxkamil2 has quit []
elongbug has joined #dri-devel
Ahuj has joined #dri-devel
JohnnyonF has quit [Ping timeout: 480 seconds]
djbw has quit [Read error: Connection reset by peer]
i509vcb has quit [Quit: Connection closed for inactivity]
<lordheavy> Any way to debug "Couldn't create Clang invocation." with rusticl ? already tried RUSTICL_DEBUG=clc
<pq> MrCooper, I suppose so, if FB_ID=0 is not the (only) way to disable the plane.
<MrCooper> hmm, good point, that may be the case
<pq> thinking about someone leaving a non-0 solid_fill blob behind
<tarceri> Anyone got a setup with nvidia binary drivers installed and can test a piglit shader_test for me?
<MrCooper> pq: maybe CRTC_ID=0 disables the plane as well
<pq> MrCooper, I guess, but what does userspace rely on?
<pq> maybe DRM core required both CRTC_ID and FB_ID to be 0 together right now? That would work.
kxkamil has joined #dri-devel
Guest4519 has quit [Read error: Connection reset by peer]
aravind has quit [Ping timeout: 480 seconds]
<pq> OTOH, I don't even care about this kind of "new userspace left-overs" compatibility with old userspace, because it seems no-one else does either. And it gets exponentially more difficult when new things get added if done this ad hoc way.
lina has quit [Ping timeout: 480 seconds]
<MrCooper> looks like drm_atomic_plane_check does enforce that both are (non-)0
Haaninjo has joined #dri-devel
<pq> it would need extending to solid_color blob, too
<pq> to be future-proof
<MrCooper> that would break currently existing user space though
quantum5 has joined #dri-devel
quantum5- has quit [Ping timeout: 480 seconds]
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
<pq> MrCooper, yeah, OTOH when someone then adds a third way of putting content to planes, we would not be able to ad hoc make that backward-compatible, because the "all are guaranteed 0 for disable plane" card was already used and discarded.
<MrCooper> right
Haaninjo has quit [Ping timeout: 480 seconds]
<karolherbst> orowith2os: should be 2021
<karolherbst> atm we set the rustc req to 1.60, but I think there might be reason enough to bump that soon, now that the kernel is at 1.68.2 and firefox ESR at 1.65
fab_ has joined #dri-devel
fab_ is now known as Guest4527
cmichael has joined #dri-devel
YuGiOhJCJ has joined #dri-devel
MrCooper has quit [Quit: Leaving]
MrCooper has joined #dri-devel
MrCooper has quit [Remote host closed the connection]
MrCooper has joined #dri-devel
MrCooper has quit [Remote host closed the connection]
MrCooper has joined #dri-devel
funtoomen has joined #dri-devel
<funtoomen> Hi, I have recently read a Phoronix article about you switching to BLAKE3 instead of SHA1. If BLAKE3 is a cryptographic hash function, wouldn't it be faster to use a non-cryptographic hash function or even a checksum function? Do you need the benefits of cryptographic hash functions over other hash/checksum functions for the purpose of uniquely identifying Vulkan shaders?
<pendingchaos> yes, it needs to be cryptographic
<funtoomen> Why so?
<pendingchaos> because collisions aren't handled
<HdkR> Some internal hashing uses xxhash for things that don't need cryptographic
vliaskov has joined #dri-devel
<funtoomen> pendingchaos: what do you mean? wouldn't collisions happen almost never?
<pendingchaos> with a cryptographic hash, yes
<pendingchaos> not with a non-cryptographic hash
<pq> But how much is not too much? There is a difference when you have an adversary that is intentionally attempting to cause a collision, and hitting one just by sheer bad luck.
<dottedmag> funtoomen: Any performance suggestions should come with benchmark information, thanks
<pendingchaos> pq: I'm not sure I understand the question
<pendingchaos> a single collision is bad, because then the shader cache will use the wrong shader binary
<funtoomen> dottedmag: i mean, i'm just asking. i don't know anything about graphics driver development, but i know enough cryptology to know that cryptographic hash functions come with some drawbacks
<pq> All hash functions with hash shorter than input have collisions. With cryptographic hash functions it is just much harder to intentionally cause collisions, but they can still happen accidentally.
<pq> when does that theoretical concern become a practical concern, I have no clue
<pendingchaos> probably never
<pq> why?
<pq> Is the goal of making intentionally finding collisions as hard as possible equivalent to the goal of reducing the possibility that two shader texts collide by accident?
<pq> funtoomen, any idea?
<pendingchaos> because cryptographic hash functions are very good and have a relatively large output size
<pendingchaos> I would expect the former goal to help in the latter
<funtoomen> pq: in my opinion even a checksum function should make a collision *very* unlikely, and would come with quite some performance gain. but as i said i know nothing about graphics driver development, i'm just a cryptology enthusiast.
<pq> funtoomen, really nice to hear from that side :-)
<psykose> shader cache is trusted input so it does have the implication of someone wanting to intentionally collide it under some scenario
elongbug has quit [Remote host closed the connection]
elongbug has joined #dri-devel
<pq> psykose, but if an adversary is able to run shaders, would they not also have write access to the cache files? Maybe not on WebGL perhaps?
<psykose> other way around, can't run shaders but can write to cache
<pq> how?
<pq> you mean the cache is poisoned whatever ways, and then a legit app falls prey?
<psykose> perhaps
<psykose> i mean it's obviously a very niche scenario
<psykose> hm, though it is possible how you say it too
<pq> in that case, why wouldn't the attacker just look at the original cache what hashes have been used, and simply replace their contents?
<pq> guaranteed hit the legit app starts the next time
<pq> regardless of hash functions
<psykose> that's true :)
<funtoomen> so you use cryptographic functions just because they make an almost impossible thing (collision) even more impossible?
<HdkR> Targeted collisions are very much a concern
<pq> I suppose saving the original shader text in the cache for detecting collisions would be prohibitive from both performance and legal perspective? and storage?
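The collision check pq describes here (keeping the original key material next to each cache entry and comparing it on lookup) can be sketched roughly as follows. This is a toy in-memory illustration, not Mesa's disk cache: the class name, the `digest_fn` parameter, and the use of SHA-256 from Python's standard library as a stand-in hash are all assumptions made for the example.

```python
import hashlib


class VerifyingCache:
    """Toy cache that stores the full key material beside each entry.

    Hypothetical sketch: a lookup only counts as a hit when the stored
    source matches byte for byte, so a digest collision becomes a miss
    instead of returning the wrong binary.
    """

    def __init__(self, digest_fn=None):
        # digest_fn is injectable so a weak hash can be simulated in tests.
        self._digest = digest_fn or (lambda data: hashlib.sha256(data).digest())
        self._entries = {}  # digest -> (source, binary)

    def put(self, source: bytes, binary: bytes) -> None:
        self._entries[self._digest(source)] = (source, binary)

    def get(self, source: bytes):
        entry = self._entries.get(self._digest(source))
        if entry is None:
            return None
        stored_source, binary = entry
        # A matching digest alone is not proof; compare the key material.
        if stored_source != source:
            return None  # collision detected, treat as a cache miss
        return binary


if __name__ == "__main__":
    cache = VerifyingCache()
    cache.put(b"void main() {}", b"\x01\x02")
    print(cache.get(b"void main() {}"))
    print(cache.get(b"some other shader"))
```

The cost is the one raised in the discussion: every entry grows by the size of its source, and every hit pays for a full comparison.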
<funtoomen> HdkR: but if someone could try and target a collision, wouldn't the machine already be compromised?
<pq> not with WebGL, I suppose - unless you consider WebGL itself to compromise the machine in the first place
<HdkR> Remote execution doesn't necessarily mean the full system is compromised
<HdkR> Shutting down attack vectors is good :)
<karolherbst> yeah soo.. if it comes to hashes, the _correct_ way of using them is to verify the key matches the data set, though that would blow up our disk cache and make it more expensive, but... yeah, atm we have this potential problem of a key loading a different cache item
<funtoomen> HdkR: Yeah, i guess you are right. I'm curious what the performance drawbacks are, would be cool if someone more competent than me did a benchmark.
<dottedmag> Also Amdahl's law. How much time does hashing take after switching to blake3? That's why I asked for benchmarking: if hashing cost is now in the noise, then replacing the hash with the one that executes in no time won't improve anything.
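dottedmag's Amdahl's law argument can be made concrete with a few lines of arithmetic. The sketch below uses the standard formula; the 2% share of runtime spent hashing is a made-up number for illustration, not a measurement from Mesa.

```python
def amdahl_speedup(fraction: float, factor: float) -> float:
    """Overall speedup when `fraction` of the runtime becomes `factor` times faster."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if factor <= 0.0:
        raise ValueError("factor must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / factor)


if __name__ == "__main__":
    hashing_share = 0.02  # hypothetical: hashing is 2% of a cache lookup
    for factor in (2.0, 10.0, float("inf")):
        print(factor, amdahl_speedup(hashing_share, factor))
```

Even an infinitely fast hash cannot deliver more than `1 / (1 - fraction)`, which is why the share of time spent hashing has to be measured first.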
<karolherbst> but a cryptographic "safe" hash kinda mitigates that problem enough so people rely on it being sane
<funtoomen> ok, i think i get it now.
<HdkR> Blake3 is actually quite quick for being a cryptographic hash which helps :)
<karolherbst> but anyway.. sha1 is already broken, soo...
<funtoomen> yeah, thats why they switched i guess
<karolherbst> I am wondering if any user actually hit a hash collision already...
<funtoomen> same
<pq> they probably tried a different mesa version next - does that invalidate the whole cache?
<funtoomen> would love if someone calculated the probability based on the size of hash cache
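A rough answer to funtoomen's question comes from the birthday bound. The sketch below assumes hash outputs are uniformly distributed, so it only covers accidental collisions, not an attacker constructing one. The digest widths are 32 bits (a CRC-sized checksum), 64 bits, 160 bits (SHA-1's digest length) and 256 bits (BLAKE3's default output length); the entry count of one million is a hypothetical cache size.

```python
import math


def collision_probability(entries: int, bits: int) -> float:
    """Birthday-bound approximation: p ~= 1 - exp(-n(n-1) / (2 * 2**bits)).

    Assumes the hash behaves like a uniform random function.
    """
    if entries < 0 or bits <= 0:
        raise ValueError("entries must be >= 0 and bits must be > 0")
    exponent = entries * (entries - 1) / (2 * 2 ** bits)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-exponent)


if __name__ == "__main__":
    cache_entries = 1_000_000  # hypothetical number of cached shaders
    for width in (32, 64, 160, 256):
        print(width, collision_probability(cache_entries, width))
```

Under these assumptions a 32-bit checksum is almost certain to collide at that cache size, while the wider digests leave accidental collisions negligible.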
<karolherbst> pq: yes
<pq> so that would blow the problem away
<karolherbst> well... we use the build-id of the so files
<funtoomen> thank you all, i have to go
funtoomen has quit [Quit: Page closed]
<psykose> dottedmag: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22387 is like 50% already just over sha1 for full cache hits
<psykose> not sure what that 2.4s ratio is of cache:everything-else
<psykose> but it's really just low territory at that point i guess
<karolherbst> I wonder if it's better to just depend on a lib implementing blake3 instead of shipping one ourselves....
<psykose> well, it is a lib
<psykose> it's blake3 reference copied into the tree
<psykose> :D
<karolherbst> yeah...
<psykose> just like sha1 was and xxhash also is i think
<karolherbst> could also just depend on libblake instead or something :P
<psykose> aye
<pq> good luck adding any dependency to mesa, or even bumping existing ones
<karolherbst> ohh seems like blake3 is so fast, because it actually bothered to keep SIMD in mind
<dottedmag> I remember maintaining libsha1 (a copy-paste from somewhere else, SHA1 only) in an embedded distro just to build kdrive (or was it fontconfig?). It wasn't that fun.
<karolherbst> so the speed mainly comes from the fact you can run stuff in parallel
<karolherbst> fun.. seems like the rust impl of blake3 is the "main" one
<psykose> eh, i'd say the C one is also maintained and meant to be used
<psykose> not like some thing someone threw in
<karolherbst> yeah, but you can't run it with multiple threads
<psykose> well, on the same input no
<psykose> but shaders are parallelisable per-shader anyway, no?
<psykose> i.e. multiple hash threads, each takes 1, ..
<karolherbst> it's not limited to one operation
<karolherbst> you can run hashing one thing in parallel
<karolherbst> (at least with the rust impl)
<karolherbst> `b3sum` e.g. does this
<psykose> yeah, i'm referring to the C version and what you can do anyway, abstractly for things that don't parallel on one input
<psykose> personally i always found that to be an easier model
<karolherbst> right
<karolherbst> it probably doesn't make much sense if compilation of shaders happens in parallel already
<karolherbst> but it seems like most of the speed actually comes from doing it multithreaded
<karolherbst> yeah...
<psykose> are you sure? with b3sum it's actually slower with more threads on a random thing i tested
<karolherbst> huh.. weird
<psykose> ah, no
<psykose> i was misreading
<karolherbst> I wonder how its speed compares to sha1sum? with 1 thread
<karolherbst> but yeah.. it looks like 8-16 threads is somehow the sweet spot
<karolherbst> at least according to the paper
<karolherbst> ohh.. the "5.2 Multi-threading" section explains it quite well
<karolherbst> makes it sound like you can actually also do it on the GPU fairly easy
<karolherbst> okay
<psykose> personally though i'd say the User: time is the most interesting one, and how it's 2x faster than sha1sum
<karolherbst> yeah, so it's not _that_ much faster single threaded
<karolherbst> still faster tho
<psykose> more threads is 'nice' but you can't in general rely on it because it is more actual cpu resource even if wall is lower
<karolherbst> yeah
<psykose> (compression algorithms usually have a scaling issue there, like if you do more than -T4 on zstd you'll get less time but at like -T16 you're not even halving the time and using over 4x the cpu, so it becomes very wasteful unless you have a dedicated reason to do it, etc)
<karolherbst> yeah so in blake3 you chunk the input and each thread can operate on those chunks, and then the main thread chains it all together in order
<psykose> yeah, it scales quite efficiently
<karolherbst> you could probably calculate all those chunks on a GPU and just chain it on the CPU :D
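The chunk-then-combine model karolherbst describes can be illustrated with a small sketch. This is not BLAKE3 and does not produce BLAKE3-compatible digests: it uses BLAKE2b from Python's standard library for the per-chunk hashes, an arbitrary 64 KiB chunk size, and a single flat combining step where BLAKE3 uses a binary tree. All names here are made up for the example.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 64 * 1024  # illustrative value, not BLAKE3's chunk size


def _hash_chunk(indexed_chunk):
    index, chunk = indexed_chunk
    h = hashlib.blake2b(digest_size=32)
    h.update(index.to_bytes(8, "little"))  # bind each chunk to its position
    h.update(chunk)
    return h.digest()


def chunked_hash(data: bytes, workers: int = 4) -> bytes:
    """Hash chunks in worker threads, then chain the digests in order."""
    chunks = [
        (i, data[offset:offset + CHUNK_SIZE])
        for i, offset in enumerate(range(0, len(data), CHUNK_SIZE))
    ]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in input order, whatever order
        # the worker threads finish in.
        digests = list(pool.map(_hash_chunk, chunks))
    root = hashlib.blake2b(digest_size=32)
    root.update(len(data).to_bytes(8, "little"))
    for digest in digests:
        root.update(digest)
    return root.digest()


if __name__ == "__main__":
    payload = bytes(range(256)) * 4096  # 1 MiB of sample data
    print(chunked_hash(payload, workers=1).hex())
    print(chunked_hash(payload, workers=8).hex())
```

The result does not depend on the number of workers, because only the independent per-chunk work is parallel and the final combination is sequential.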
<karolherbst> but kinda cool
<karolherbst> being able to make full use of SIMD is nice
<karolherbst> at least it's a way more sane programming model they had in mind it seems
<psykose> nerd :p
<karolherbst> :P
<karolherbst> well.. most other things vectorize a linear operation which doesn't get you anywhere
itoral has quit [Remote host closed the connection]
YuGiOhJCJ has quit [Remote host closed the connection]
Guest4527 has quit [Quit: Guest4527]
kts has joined #dri-devel
frankbinns has quit [Remote host closed the connection]
<dottedmag> karolherbst: We put a shader into shader so you can hash while you hash?
kts has quit [Ping timeout: 480 seconds]
<karolherbst> yes
<javierm> tzimmermann: nice cleanup series. I only read the cover-letter for now but agree on the direction. I'll try to review the patches tomorrow
<javierm> tzimmermann: btw, I rebased last night the RFC to split FB in FB_CORE and FB. After your recent fbdev cleanups, I could drop two patches and now are only Kconfig and makefile changes :)
<tzimmermann> thanks, javierm
<tzimmermann> i wanted to use the firmware edid with simpledrm, but found that the rsp code is slightly chaotic. hence the cleanup
<javierm> tzimmermann: right. But having screen_info defined only for the arches that use it would be great. I remember having build issues due to some arches missing ifdefery for that
lina has joined #dri-devel
<tzimmermann> yes. and we can even do better, i think. i outline in the cover letter that we could enable it only when there are actual users. it's not in the patchset, but a follow-up would be straight forward.
<javierm> tzimmermann: yup, I read that. That's why I said agree on the direction :)
<tzimmermann> javierm, BTW have you seen https://gitlab.freedesktop.org/drm/amd/-/issues/2649
<tzimmermann> for some odd config, the fbdev console doesn't set up correctly. i'm trying to wrap my head around it, but it's confusing
<tzimmermann> if the primary display is off, the usb-attached monitors remain off as well.
<tzimmermann> it's a regression
<javierm> tzimmermann: hmm, no I haven't seen that bug before. Let me read it...
fab has joined #dri-devel
<MrCooper> tzimmermann: I suspect the lid being closed might just matter indirectly as well, e.g. via timing; AFAICT the fundamental issue is that the fbdev emulation doesn't correctly handle the hot-plugged DP MST connector
<tzimmermann> it does work if the old output_poll_changed callback has been set. i cannot get how this affects fbdev state. maybe i'll fill the code with printks and let the reporter run it before and after.
<javierm> tzimmermann, MrCooper: if amdgpu is using the generic fbdev emulation then setting .output_poll_changed should not be needed indeed
<javierm> I guess will have to dig that driver code
<MrCooper> "should" being the key word :)
<javierm> MrCooper: right :)
<tzimmermann> right. it "should" not be needed. so there's a bug somewhere
rasterman has quit [Quit: Gettin' stinky!]
sgruszka has quit [Ping timeout: 480 seconds]
<pq> Has anyone else had problems that when program and Mesa are built with ASan, ASan itself segfaults on exit (radeonsi), or everything futex-deadlocks before the program finished (llvmpipe)?
<pq> the program is Weston, fwiw
simon-perretta-img has joined #dri-devel
<pq> That's Mesa build with glvnd. Without glvnd I get reports of leaks inside Mesa with radeonsi, and the same futex deadlock with llvmpipe.
<pq> and I'm pretty sure it's not Weston leaking anything, since eglTerminate + eglReleaseThread should clean up.
illwieckz has quit [Ping timeout: 480 seconds]
illwieckz has joined #dri-devel
sgruszka has joined #dri-devel
<lordheavy> Ok, i have more information about the clang failure and rusticl - adding this simple patch https://paste.xinu.at/6CTg/
<MrCooper> pq: I think the EGLDisplay is inevitably leaked without EGL_KHR_display_reference (which Mesa doesn't support yet: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10118)
<lordheavy> gives me error: unknown argument: '-no-opaque-pointers'
<lordheavy> i think it's related to archlinux - so not a mesa bug - but i think this kind of information is useful instead of just the message 'Couldn't create Clang invocation'
<psykose> lordheavy: that looks like the mesa is using clang16 which doesn't recognise the no-opaque mode anymore
<psykose> https://gitlab.freedesktop.org/mesa/mesa/-/issues/7468 intel clc is not ported yet, so only works with 15 afaict
<psykose> that said i'm not a developer for that, just what i know of that combination more generally :)
JohnnyonFlame has quit [Ping timeout: 480 seconds]
<psykose> and you're right, error should be logged i think
<lordheavy> psykose: oh, thanks - now it's time to patch and test
<psykose> i don't think there will be a trivial patch unless it was accidentally left in there, as being opaque-pointers compatible usually takes a bunch more work
<psykose> good luck however
JohnnyonFlame has joined #dri-devel
<pq> MrCooper, oh. In that case I'd kinda expect to see more leaks than I do. Would leaking the display also leak shader bits?
<MrCooper> that's been my assumption, not sure though
<MrCooper> valgrind always reports tons of leaks in Mesa code for me with a Wayland compositor or Xwayland; I've been assuming they're mostly due to this, might be wishful thinking though :)
fab_ has joined #dri-devel
<pq> https://gitlab.freedesktop.org/-/snippets/7648 are the leaks I see from the main thread. There were more in other mesa created threads.
fab_ is now known as Guest4544
<lordheavy> psykose: great, removing '-no-opaque-pointers' at least fixed my issue with clinfo ;)
<psykose> :D
Haaninjo has joined #dri-devel
fab has quit [Ping timeout: 480 seconds]
Guest4544 is now known as fab
fab is now known as Guest4545
Company has joined #dri-devel
junaid has joined #dri-devel
<javierm> MrCooper, tzimmermann: after staring at the amdgpu code for a long time, the only thing that I can think of is that drm_fbdev_generic_setup() is only called when there are available connectors
<javierm> if after probing the driver, a connector is added then the generic fbdev won't be set-up ?
<tzimmermann> javierm, indeed. that's some weird code
<tzimmermann> that whole condition should be removed IMHO
<javierm> which might cause the issue since the drm_fbdev_generic_setup() -> drm_fbdev_generic_client_hotplug() won't happen
<MrCooper> javierm: sounds plausible, if it wasn't for the eDP connector existing even with the lid closed here (even with connected status)
<MrCooper> also, fbdev emulation works on the internal panel if I open the lid
<javierm> MrCooper: that's not what the shared dmesg log says though... at least how I read it
<javierm> [drm:drm_helper_probe_single_connector_modes [drm_kms_helper]] [CONNECTOR:78:eDP-1] status updated from unknown to connected
<tzimmermann> javierm, for the output_poll_changed to work, you'd still need generic_setup
<javierm> tzimmermann: hmm, right
<tzimmermann> still, that branch should probably go
<javierm> tzimmermann, MrCooper: but just by reading the code, I can't see a reason why drm_fb_helper_output_poll_changed() is needed that drm_fbdev_generic_setup() doesn't already
<MrCooper> javierm: FWIW, the lid of this laptop (which is affected by the same or at least very similar issue) is currently closed, and drm_info says eDP status is connected
<javierm> MrCooper: yeah, my suspicion was wrong if drm_fbdev_generic_setup() is needed for mode_config->output_poll_changed
<tzimmermann> javierm, MrCooper. IMHO there's something with the handling of deferred_setup https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/drm_fb_helper.c#L2329 as if it fails once and then cannot recover
sgruszka has left #dri-devel [#dri-devel]
<tzimmermann> but i cannot really point to the issue
<tzimmermann> at probe we call generic_setup()
<tzimmermann> it simulates a hotplug to initialize the display: https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/drm_fbdev_generic.c#L343
Ahuj has quit [Ping timeout: 480 seconds]
<tzimmermann> this apparently worked, as there's no error message in that debug log
<javierm> tzimmermann: yeah, some debug log in drm_fbdev_generic_setup() when it succeeds would be useful
<tzimmermann> the output_poll_changed callback is only called by the two functions starting at https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/drm_probe_helper.c#L691
<tzimmermann> it's immediately followed by client_dev_hotplug(), which calls our client code
<tzimmermann> and it should end up in the same place as output_poll_changed, namely drm_fb_helper_hotplug_event: https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/drm_fbdev_generic.c#L272
<javierm> yeah
<javierm> that's my conclusion as well so I don't understand why it is failing...
<tzimmermann> and i'm pretty sure that we take this branch, because our initial simulated hotplug did not fail. so dev->fb_helper should be set at this point
<javierm> yes
<tzimmermann> javierm, i don't find this line in the debug logs: https://elixir.bootlin.com/linux/v6.1.35/source/drivers/gpu/drm/drm_fb_helper.c#L2085
<tzimmermann> grep for hotplug_event and there's only the sysfs stuff
kzd has joined #dri-devel
<tzimmermann> that would still return an err of 0
<macromorgan> anyone have experience with tinydrm panel drivers? I'm confused about the rs pin and how I make it work with my SPI controller...
<javierm> tzimmermann: which would be !fb_helper->fb since is unlikely that the mutex grabbing in drm_master_internal_acquire() would fail
<javierm> so it seems your intuition is correct and the problem is in the delayed hotplug path
<tzimmermann> javierm, there's x11 running! so drm_master_internal_acquire() should fail
<tzimmermann> x11 is the drm master already
<tzimmermann> after doing the described vt-switch (alt+f2), the code would run last_close IIRC
f11f12 has quit [Quit: Leaving]
<tzimmermann> where delayed_hotplug is being handled
<tzimmermann> i still don't see the issue, though :/
<MrCooper> in my case, gnome-shell is most certainly not running yet at that point, plymouth might be though
JohnnyonFlame has quit [Ping timeout: 480 seconds]
<javierm> tzimmermann: yeah porque if (do_delayed) then drm_fb_helper_hotplug_event() should be called
<javierm> err, because. I don't know why I mixed spanish and english in the same sentence haha
<tzimmermann> de nada
<javierm> :D
<javierm> tzimmermann: so I'm also not seeing the issue... MrCooper maybe you can add some debug logs in drm_fb_helper_hotplug_event() and figure out whether it is called or not on switch to VT ?
alyssa has joined #dri-devel
<javierm> tzimmermann: but wouldn't that be the case too for the drm_fb_helper_output_poll_changed() -> drm_fb_helper_hotplug_event() path ?
<MrCooper> can do
<tzimmermann> MrCooper, thanks
padovan4 has joined #dri-devel
<javierm> MrCooper: great. I'm out of ideas
padovan4 is now known as padovan
Duke`` has joined #dri-devel
<MrCooper> thanks for the brainstorming guys
alyssa has quit [Quit: alyssa]
Haaninjo has quit [Ping timeout: 480 seconds]
bgs has quit [Remote host closed the connection]
alyssa has joined #dri-devel
tzimmermann has quit [Quit: Leaving]
Ahuj has joined #dri-devel
bmodem has quit [Ping timeout: 480 seconds]
TMM has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
TMM has joined #dri-devel
illwieckz has quit [Ping timeout: 480 seconds]
frieder has quit [Remote host closed the connection]
Haaninjo has joined #dri-devel
Haaninjo has quit []
clever has quit [Ping timeout: 480 seconds]
illwieckz has joined #dri-devel
cmichael has quit [Quit: Leaving]
<agd5f> javierm, we don't set up the fbdev code if the GPU doesn't have any display hardware on the GPU.
<agd5f> some GPUs may not have any display IPs at all, others may have display IPs, but no physical connectors on the board.
Ahuj has quit [Ping timeout: 480 seconds]
dylanchapell has joined #dri-devel
<javierm> agd5f: yeah, I think I understood the rationale of that logic. But that wasn't the issue anyways as tzimmermann mentioned. It's likely something in the deferred setup
<javierm> macromorgan: sorry, I missed your message before. What problem do you have, with what panel driver ?
benjaminl has joined #dri-devel
benjamin1 has quit [Ping timeout: 480 seconds]
benjamin1 has joined #dri-devel
<macromorgan> I'm trying to create a new panel driver for tinydrm based on this: https://github.com/FunKey-Project/linux/blob/FunKey_S/drivers/staging/fbtft/fb_st7789v.c
benjaminl has quit [Read error: Connection reset by peer]
rasterman has joined #dri-devel
<macromorgan> I'm having issues understanding though how to handle the RS pin (which is hardwired to the MISO pin)
djbw has joined #dri-devel
benjaminl has joined #dri-devel
<macromorgan> if I define the pinctrl for the SPI bus, won't that block me from using the MISO pin as the RS pin? Is there a helper function that does that in Linux I'm missing?
<javierm> macromorgan: there's already a panel driver for this chip I see: drivers/gpu/drm/panel/panel-sitronix-st7789v.c
benjamin1 has quit [Ping timeout: 480 seconds]
<macromorgan> that's for initing the panel via SPI but displaying it via DPI, if I'm not mistaken
<macromorgan> Mine is to both init and display via SPI
<macromorgan> I assumed that needed a different setup
<javierm> macromorgan: ah Ok. But still I wonder if it wouldn't be better to extend that driver to support both DPI and SPI transports
<javierm> mripard ^
<macromorgan> If that's the route we want to go. Honestly I'm just trying to get the panel to work, then I can worry about making it mainline conformant
<macromorgan> this is the first time I've worked with a pure SPI panel that used the MISO pin as a "switch" to note if we're sending data or commands
<javierm> macromorgan: yeah, that's normal for some of these SPI panels. It's usually called D/C (data or command) and not RS though (which sounds more like reset?)
mbrost has joined #dri-devel
<javierm> macromorgan: and what you do usually is to use a GPIO to toggle that pin
<javierm> macromorgan: it seems it is called D/CX in your chip datasheet, by looking at "8.4 Serial Interface" section in https://newhavendisplay.com/content/datasheets/ST7789V.pdf
kts has joined #dri-devel
<javierm> macromorgan: "In 4-lines serial interface, data packet contains just transmission byte and control bit D/CX is transferred by the D/CX pin"
<macromorgan> yep... sadly in my implementation it's hooked to the MISO pin
Guest4336 has quit []
leandrohrb5 has quit [Quit: The Lounge - https://thelounge.chat]
italove7 has quit []
dwlsalmeida has quit [Quit: The Lounge - https://thelounge.chat]
leandrohrb is now known as leandrohrb5
<javierm> macromorgan: I see... and you must use 4-wire, you can't support a 3-wire SPI setup?
dwlsalmeida8 has joined #dri-devel
<javierm> because with 3-wire you can have the D/C bit as a part of the 9-bit payload
<macromorgan> I honestly don't know
<macromorgan> new to SPI displays honestly
<javierm> macromorgan: look at the "8.4.2 Command write mode" section in the datasheet I shared
<macromorgan> okay will do
<javierm> you either can send a 9-bit payload (where the first bit is the D/CX to let the controller know whether the payload is data or a command) or a 8-bit payload (where the D/CX is out-of-band using a pin)
<macromorgan> so if I send an 8 bit payload over a 3-wire interface I should be golden right?
<macromorgan> I guess I can try that and see if it falls flat on its face or not
<javierm> macromorgan: yeah, that won't work because the chip won't know what you are sending it...
<macromorgan> ohh, wait, you mean send a 9 bit payload over a 3 wire interface
tursulin has quit [Ping timeout: 480 seconds]
<javierm> macromorgan: yes
<macromorgan> okay, let's try that :-)
<javierm> the 9-bit is <1-bit D/CX, 8-bit payload>
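The 9-bit framing javierm describes, one D/CX bit followed by the 8-bit payload, can be sketched as a small bit-packing routine. This is an illustration of the framing only, not driver code: the function name is made up, the words are packed MSB-first with zero padding at the end, and the polarity (0 for command, 1 for data) is an assumption that should be checked against section 8.4 of the datasheet.

```python
def pack_9bit_words(items):
    """Pack (is_data, byte) pairs into a byte stream of 9-bit words.

    Each word is <1-bit D/CX, 8-bit payload>, sent MSB-first. The D/CX
    polarity (0 = command, 1 = data) is assumed for this sketch. The last
    byte is zero-padded when the bit count is not a multiple of 8.
    """
    acc = 0      # bits waiting to be emitted
    nbits = 0    # how many bits are held in acc
    out = bytearray()
    for is_data, byte in items:
        if not 0 <= byte <= 0xFF:
            raise ValueError("payload must fit in 8 bits")
        word = (int(bool(is_data)) << 8) | byte
        acc = (acc << 9) | word
        nbits += 9
        while nbits >= 8:
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
        acc &= (1 << nbits) - 1  # keep only the bits not yet emitted
    if nbits:
        out.append((acc << (8 - nbits)) & 0xFF)
    return bytes(out)


if __name__ == "__main__":
    # One command byte followed by two data bytes (values are arbitrary).
    frame = pack_9bit_words([(False, 0x2A), (True, 0x00), (True, 0xEF)])
    print(frame.hex())
```

A controller that supports 9-bit words natively would not need this repacking; the routine only shows how the extra bit shifts every following byte in the stream.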
<javierm> macromorgan: but your chip has to be configured to use that interface
<javierm> see "6.2 Interface Logic Pins" section
<javierm> pins IM[3-0] are used for that but I don't know whether those are accessible on your design
<javierm> you want those to be 0 1 0 1 according to that datasheet
<macromorgan> they are not available and I don't know how they are configured... let me check the panel display sheet in case it has something on it
<javierm> err, it seems 1 1 0 1. I misread
<javierm> macromorgan: I've to leave but I had to implement something like that for the ssd130x driver: https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/solomon/ssd130x-spi.c#L21
<macromorgan> okay, thank you for your help. Gives me something to look at more
<javierm> but if you can't use a GPIO, then that's not an option for you :(
<javierm> macromorgan: you are welcome. Now that I think about it 3-wire SPI should also be supported for the ssd130x, maybe I should implement that too
jhli has joined #dri-devel
kts has quit [Ping timeout: 480 seconds]
dviola has quit [Ping timeout: 480 seconds]
eukara has joined #dri-devel
<anholt_> daniels: thanks. looks like I was already picking the radv runners, so I just need to --stress and we'll be good.
junaid has quit [Quit: leaving]
JohnnyonFlame has joined #dri-devel
dylanchapell has quit [Ping timeout: 480 seconds]
illwieckz has quit [Ping timeout: 480 seconds]
diego has joined #dri-devel
clever has joined #dri-devel
kasper93 has joined #dri-devel
rasterman has quit [Quit: Gettin' stinky!]
illwieckz has joined #dri-devel
<daniels> nice
LexSfX has joined #dri-devel
hussam has joined #dri-devel
<hussam> Hello. Does OpenCL 3.0 work with mesa on a 620 intel hd?
mbrost has quit [Ping timeout: 480 seconds]
kzd has quit [Ping timeout: 480 seconds]
junaid has joined #dri-devel
neonking has joined #dri-devel
kzd has joined #dri-devel
vliaskov has quit []
gouchi has joined #dri-devel
gouchi has quit []
CATS has quit [Ping timeout: 480 seconds]
Guest4545 has quit [Quit: Guest4545]
* alyssa glares at dEQP-EGL.functional.render.multi_context.*
<alyssa> I'm running it in a loop and it's passing reliably.
<alyssa> But I've SEEN it flake T_T
<karolherbst> hussam: yes
<karolherbst> well.. mostly
<alyssa> apparently my stress test isn't stressful enough
<karolherbst> you need help hitting flakes? tried running 200 threads in parallel?
<alyssa> That's an idea..
<hussam> karolherbst: What meson options do I need?
<karolherbst> that's how I fixed all the CL flakes I had left 🙃
<karolherbst> well.. maybe not 200, but...
<karolherbst> using "stress" to keep your CPU busy does help with finding CPU related flakes 🙃
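
A minimal sketch of the "run it in a loop until it flakes" approach discussed above, assuming a POSIX system. The helper name run_until_failure is hypothetical; it just re-runs a command and reports the first iteration that exits non-zero or dies from a signal. Keeping the CPU busy at the same time (for example with the stress tool mentioned above, or by starting several copies of this loop) is left to the caller.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the command in argv (NULL-terminated) up to max_iters times.
 * Returns the zero-based iteration of the first failing run,
 * or -1 if every run exited with status 0. */
int run_until_failure(char *const argv[], int max_iters)
{
    for (int i = 0; i < max_iters; i++) {
        pid_t pid = fork();
        if (pid < 0)
            return i;            /* treat a fork failure as a failed iteration */
        if (pid == 0) {
            execvp(argv[0], argv);
            _exit(127);          /* only reached if exec itself failed */
        }
        int status = 0;
        if (waitpid(pid, &status, 0) < 0)
            return i;
        if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
            return i;            /* non-zero exit or killed by a signal */
    }
    return -1;
}
```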
<karolherbst> hussam: gallium-rusticl=true
<hussam> will that generate the icd file?
<karolherbst> yes
<hussam> Thank you. I will try now.
CATS has joined #dri-devel
<karolherbst> jenatali: ever seen a "Attribute does not match Module context!" error?
<hussam> karolherbst: Done. hashcat says no devices found.
<jenatali> karolherbst: Sounds like you've got two LLVM contexts?
<karolherbst> jenatali: so I have a user seeing this: https://pastebin.com/eiQn4R21
<karolherbst> mhh but yeah..
<karolherbst> could be something silly inside gentoo again...
<karolherbst> but also makes no sense really..
<karolherbst> hussam: need to set RUSTICL_ENABLE=iris
<hussam> yay. that worked.
<karolherbst> it might or might not work correctly yet
<jenatali> Weird
<karolherbst> it's good enough to pass the CTS and run random stuff... but.. there are still issues I want to tackle before enabling anything by default :F
<karolherbst> jenatali: yeah...
<karolherbst> I'll ask for a LD_DEBUG=libs...
<karolherbst> jenatali: this "llvm::compression::zstd::compress" confuses me....
<psykose> how so
<karolherbst> why would zstd::compress be called when reporting an error?
<karolherbst> I'm sure it's just some optimized build and figuring out the symbols is screwed up
JohnnyonF has joined #dri-devel
<psykose> hhmm
<psykose> yeah, you're right
<psykose> that does not track at all so it's wrong symbols or similar brokenness
<psykose> well, maybe there's some magic path where the function call inside report_fatal_error is so broken it jumps into zstd::compress which then aborts on something :D
<psykose> corrupted memory can do anything
<karolherbst> jenatali: https://pastebin.com/JcLzijD3 .... mhhhh
<karolherbst> I don't know, but having two llvms....
<jenatali> Yeah 100% that's the problem
<karolherbst> `libigdfcl` that's intel, no?
<karolherbst> yep...
<karolherbst> mhhh
<karolherbst> :pain:
<psykose> i thought having two llvms just aborted on init
<psykose> or does that not happen on glibc
<karolherbst> good question
<psykose> on musl at least the second one will abort due to some Options thing being double registered, inside llvm's constructor
<karolherbst> yeah...
<karolherbst> that's usually what happens
<karolherbst> maybe something else happens now
<karolherbst> but anyway.... uhhh
<karolherbst> can we torch llvm? :D
<psykose> but cute dragon :(
<karolherbst> we keep the dragon
<karolherbst> I wonder if we really have to runtime load llvm....
<karolherbst> and just load it in a way it's only private to us
JohnnyonFlame has quit [Ping timeout: 480 seconds]
<jenatali> Like static linking...?
<karolherbst> but uhhh.....
<karolherbst> mnhhh
<karolherbst> I don't know if I'm in the mood for that kind of bikeshed comming to use
<karolherbst> *us
<karolherbst> _but_
<karolherbst> we could tell distributions if they support multiple llvm versions at once, then they either static link or we close all bugs
<psykose> it's a distribution issue to make sure everything in mesa path has the same llvm version, yes
<psykose> but it really is very obvious 99% of the time, you just get a load abort..
<karolherbst> ehh it's not inside mesa
<psykose> i don't know why this case is different
<karolherbst> mesa is fine
<karolherbst> well
<karolherbst> soo
<karolherbst> ever heard of the Vulkan ICD thing? so this was an idea they took from CL and CL does the same thing
<psykose> i am completely clueless about anything ICD :D
<karolherbst> the issue with CL is, that... all (like.. almost all) cl impls use LLVM
<karolherbst> loading multiple implementations at once
<psykose> yeah, all gotta match
<karolherbst> so the user has Intel's CL stack (on LLVM-15) and mesa (on LLVM-16)
<karolherbst> and the ICD dlopens those CL impls
<psykose> ah, i think that's the issue
<psykose> if you dlopen the conflicting llvm much later you get into this state
<karolherbst> yep, the user already confirmed it :)
<psykose> but yeah, it's a pain
<karolherbst> sooo.. the icd loads with `RTLD_LAZY|RTLD_LOCAL`
<karolherbst> _but_
<karolherbst> there is a `// | RTLD_DEEPBIND` in the code...
<karolherbst> and I wonder if that would fix it...
<karolherbst> but really.. this part is really broken on linux
<psykose> i dunno, it sounds like it would just somewhat hide it more
<karolherbst> maybe
<karolherbst> but this is something we have to figure out I guess
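
For reference, the dlopen flags being discussed can be sketched like this. It is an illustration under assumptions, not the actual ocl-icd or Khronos loader code: the function names are hypothetical, RTLD_DEEPBIND is a glibc extension (absent on musl, as noted above), and whether it actually avoids the two-LLVM clash is exactly the open question in the conversation.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes RTLD_DEEPBIND on glibc */
#endif
#include <dlfcn.h>
#include <stdbool.h>
#include <stddef.h>

/* Flags for loading a vendor CL library: lazy binding, symbols not exported
 * to libraries loaded later. With deep_bind set (and where the libc provides
 * it), the library's own dependencies are searched before the global scope. */
int icd_dlopen_flags(bool deep_bind)
{
    int flags = RTLD_LAZY | RTLD_LOCAL;
#ifdef RTLD_DEEPBIND
    if (deep_bind)
        flags |= RTLD_DEEPBIND;
#else
    (void)deep_bind; /* not available on this libc; fall back to the plain flags */
#endif
    return flags;
}

/* Hypothetical wrapper: returns the handle from dlopen, or NULL on failure
 * (dlerror() then describes the problem). */
void *open_vendor_library(const char *path, bool deep_bind)
{
    return dlopen(path, icd_dlopen_flags(deep_bind));
}
```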
<psykose> famously musl also does not have that, but distro side that's a 5 second patch for me so idc personally
<karolherbst> yeah so some distros support multiple LLVM versions
<karolherbst> like gentoo
<psykose> and similarly on distro side "match all the llvms" is just what i do
<karolherbst> and ubuntu
<airlied> and fedora :)
<psykose> e.g. in alpine mesa is 15 because blender is 15
<karolherbst> and I think fedora as well in theory
<karolherbst> I think dlopening llvm ourselves is probably the way to go here :/
<karolherbst> but llvmpipe devs will hate us
hikiko has quit [Ping timeout: 480 seconds]
<karolherbst> but uhhh...
<karolherbst> why is it such a huge issue with llvm
<psykose> it would be the same with any multiple-abi-versions dep
<psykose> llvm is just the famous one here :)
<karolherbst> ahh right.. because symbol versioning is broken or something
<psykose> clearly what we need is more market fragmentation
<psykose> so others start using definitelynotllvm that doesn't conflict
<karolherbst> yeah, maybe we just have to make more people run into this issue so it finally gets addressed
<psykose> libc side i don't think there's anyone super interested in addressing this in some meaningful way that i've seen for the loader
<karolherbst> but honestly.. why couldn't the icd dlopen our libraries in a way that dependencies are private to the library or something :/
<psykose> could be wrong
Kayden has quit [Quit: change locations]
<karolherbst> khronos' own loader uses RTLD_NOW mhhh
JohnnyonFlame has joined #dri-devel
<karolherbst> so the user had used khronos loader which is using RTLD_NOW mhhh
<karolherbst> I wonder if the loader should be fixed here...
JohnnyonF has quit [Ping timeout: 480 seconds]
<karolherbst> does anybody know if "RTLD_LOCAL" gets applied to dependencies as well?
<karolherbst> so if one loads Intel's CL impl with RTLD_LOCAL, would its LLVM dep be only local to it?
Duke`` has quit [Ping timeout: 480 seconds]
Kayden has joined #dri-devel
<psykose> if you mean things that are DT_NEEDED on the cl object and not something it itself also later dlopens then i think so
<psykose> didn't test though
<psykose> this stuff is so broken though i wouldn't be surprised if it's actually the opposite of that :D
<karolherbst> yeah...
<karolherbst> sooo.. I'm checking if it works with how ocl-icd loads things
<karolherbst> and if it does, I just file a bug against khronos loader or change it to RTLD_LOCAL
<karolherbst> because on fedora intel's stack also installs with llvm-15 :')
<psykose> yeah idk how that works at all
<psykose> i imagine mesa being 16 on any distro isn't actually functional for any later deps
<psykose> but there's always magic
<karolherbst> what magic
<psykose> magic of magic, the unknown :D
<psykose> (it's bedtime for me)
<psykose> nini karol
<karolherbst> rude
<psykose> <3
<karolherbst> :3
junaid has quit [Remote host closed the connection]
Daanct12 has joined #dri-devel
<karolherbst> yeah sooo...
<karolherbst> with ocl-icd it just works it seems
tursulin has joined #dri-devel
<karolherbst> or at least I think it does... mhh
<karolherbst> ehh no, both are built against llvm-15.. uhhh
Guest4502 has quit [Ping timeout: 480 seconds]
JohnnyonFlame has quit [Ping timeout: 480 seconds]
Kayden has quit [Quit: change loc]
<karolherbst> yeah dunno.. on fedora it just works
<karolherbst> must be a gentoo bug then
tursulin has quit [Ping timeout: 480 seconds]
<karolherbst> mattst88: so apparently on gentoo if a process loads llvm-15 and llvm-16 it crashes in weird ways. On fedora the same thing seems to work. Maybe fedora is doing something dodgy to make it work. Maybe gentoo also needs to do something dodgy. I have no idea, but just wanted to let you know
<karolherbst> this can happen if a user has intel's and mesa's CL impls installed and they are compiled against different LLVM versions
ngcortes has joined #dri-devel
<mattst88> karolherbst: ugh :(
<mattst88> oh, separately, did you make an MR with the patch Gentoo is carrying? the clang resource dir one
<alyssa> anholt_: any pointers about dEQP-EGL.functional.render.multi_context*?
<anholt_> alyssa: nope
<alyssa> Wheeee
<anholt_> disable any job reordering?
<alyssa> We're hitting very rare flakiness for it on Asahi.. but I see it's also on freedreno flake list so I'm thinking it's not driver specific
<alyssa> disabling the shader disk cache seems to make the flakes all but disappear, though I think I still hit.. the flake once with cache disabled
* alyssa should try to reproduce the flakiness on panfrost
<alyssa> looking through Mesa CI Daily Reports, I see a similar test (`wayland-dEQP-EGL.functional.color_clears.multi_context.gles1.rgba8888_window`) flaked(?) on llvmpipe
<alyssa> which seems like a harbinger
<alyssa> another related test is on the rpi3 flake list but I suspect the whole test group (not just that one) is flaky https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20452/diffs
<alyssa> I can't really imagine what could be going wrong for the multi_context tests (which AFAICT are still single-threaded?) to flake so rarely across... every driver that's running them in CI, seemingly?
<karolherbst> mattst88: I planned to do so tomorrow
Haaninjo has joined #dri-devel
benjamin1 has joined #dri-devel
benjaminl has quit [Ping timeout: 480 seconds]
<karolherbst> uhh multi_context?
<karolherbst> I think I was also seeing flakes in nouveau with that...
<karolherbst> but yeah, multi_context is single threaded
leo60228- has joined #dri-devel
leo60228 has quit [Ping timeout: 480 seconds]
i509vcb has joined #dri-devel
gtn has joined #dri-devel
Haaninjo has quit [Quit: Ex-Chat]
kzd has quit [Ping timeout: 480 seconds]
<alyssa> karolherbst: I don't understand how a single threaded test can flake on every driver
rasterman has joined #dri-devel
<alyssa> https://rosenzweig.io/flaker.xml if anyone is curious
<alyssa> I notice some interesting rectangular corruption
<alyssa> IDK if tile boundaries but.. doesn't seem natural
<alyssa> it just flakes /so rarely/ that I don't know how to debug this monster
<anholt_> MSAA only?
<alyssa> not sure, will try to capture more qpa's
<alyssa> also sometimes see GPU timeouts (though not faults). this seems distinct symptom from the fails.
<anholt_> wonder how much happier everything would be if we did multisample render to single sampled in the winsys.
<alyssa> heh
gtn has quit []
<alyssa> this would be a lot easier if I could actually reproduce the damn flake
<alyssa> got another fail, this time config 48, EGL_SAMPLES=4
kzd has joined #dri-devel
<alyssa> 47, EGL_SAMPLES=2
<alyssa> another 47
<alyssa> a lot of 47
<alyssa> 47 twice more
<alyssa> while that's a little odd
<alyssa> it's testing 24 configs, only 3(?) of which are EGL_SAMPLES=0
<alyssa> so while I have not observed a non-MSAA failure, that's not wholly unexpected
rasterman has quit [Quit: Gettin' stinky!]
<alyssa> hacked up the driver to pretend not to support MSAA, let's go
JohnnyonFlame has joined #dri-devel
<alyssa> without MSAA, so far no fails observed after over 4000 iterations
<alyssa> going to let this keep going just in case, but I think this is indeed solid evidence that yes, it's MSAA related.
<alyssa> anholt_: nice one :+1:
shashanks_ has joined #dri-devel
shashanks__ has quit [Ping timeout: 480 seconds]
<alyssa> ok, 10k iterations with no fail
<alyssa> yeah, I'd say this is indeed MSAA only.
eukara has quit [Ping timeout: 480 seconds]
elongbug has quit [Read error: Connection reset by peer]
<airlied> src/compiler/nir/nir_opt_algebraic.c: In function ‘nir_opt_algebraic’:
<airlied> src/compiler/nir/nir_opt_algebraic.c:1374082: note: ‘-Wmisleading-indentation’ is disabled from this point onwards, since column-tracking was disabled due to the size of the code/headers
<airlied> 1374082 | nir_foreach_function_impl(impl, shader) {
<airlied> nothing to see here, only 1.4M loc file :-P