ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
Company has quit [Quit: Leaving]
<DemiMarie> Is there a “how can I get started in Linux graphics” document anywhere? Asking because Qubes might be interested in Intel SR-IOV, Xen virtio-GPU native contexts, or some combination.
<DemiMarie> Is it safe to assume that if (unprivileged) userspace can freeze the GPU and the kernel fails to recover from it that is a kernel bug?
youmukon1 has joined #dri-devel
YuGiOhJCJ has joined #dri-devel
youmukonpaku1337 has quit [Ping timeout: 480 seconds]
YuGiOhJCJ has quit [Remote host closed the connection]
YuGiOhJCJ has joined #dri-devel
ngcortes has quit [Ping timeout: 480 seconds]
co1umbarius has joined #dri-devel
columbarius has quit [Ping timeout: 480 seconds]
flynnjiang has joined #dri-devel
youmukonpaku1337 has joined #dri-devel
DragoonAethis has quit [Quit: hej-hej!]
DragoonAethis has joined #dri-devel
youmukon1 has quit [Ping timeout: 480 seconds]
mvchtz has quit [Quit: WeeChat 3.5]
mvchtz has joined #dri-devel
jewins has quit [Ping timeout: 480 seconds]
youmukonpaku1337 has quit [Ping timeout: 480 seconds]
yuq825 has joined #dri-devel
flynnjiang has quit [Remote host closed the connection]
ayaka_ has joined #dri-devel
ayaka_ has quit [Remote host closed the connection]
<alyssa> karolherbst: I think the next big compute stack should be rusticl on zink on radv with the amd llvm backend
<alyssa> LLVM -> SPIRV -> NIR -> SPIRV -> NIR -> LLVM
<alyssa> :P
<HdkR> How many layers of translation can we go?
luc has joined #dri-devel
dri-logger has joined #dri-devel
mareko_ has joined #dri-devel
dri-logg1r has quit [Ping timeout: 480 seconds]
glisse has quit [Ping timeout: 480 seconds]
mareko has quit [Ping timeout: 480 seconds]
glisse has joined #dri-devel
<luc> hi all, discussions on the mailing list that reply to the **cover letter** of a patch series won't be recorded on https://patchwork.freedesktop.org/series/123257/, will they?
anarsoul|2 has quit [Read error: No route to host]
anarsoul has joined #dri-devel
tristan has joined #dri-devel
tristan is now known as Guest1985
yout has joined #dri-devel
yout has quit [Read error: Connection reset by peer]
Guest1985 has quit [Ping timeout: 480 seconds]
Duke`` has joined #dri-devel
yyds has joined #dri-devel
yyds has quit [Remote host closed the connection]
Haaninjo has joined #dri-devel
yyds has joined #dri-devel
Duke`` has quit [Ping timeout: 480 seconds]
fab has joined #dri-devel
sima has joined #dri-devel
itoral has joined #dri-devel
ohmltb^ has joined #dri-devel
kzd has quit [Ping timeout: 480 seconds]
bmodem has joined #dri-devel
i-garrison has quit []
i-garrison has joined #dri-devel
tzimmermann has joined #dri-devel
mauld has quit [Remote host closed the connection]
mauld has joined #dri-devel
mszyprow has joined #dri-devel
vjaquez has quit [Remote host closed the connection]
youmukonpaku1337 has joined #dri-devel
fab has quit [Quit: fab]
vjaquez has joined #dri-devel
crabbedhaloablut has joined #dri-devel
An0num0us has joined #dri-devel
<itoral> karolherbst: jasuarez told me you were having some trouble with fences in v3d, maybe I can assist you with that
frieder has joined #dri-devel
fab has joined #dri-devel
TMM has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
TMM has joined #dri-devel
mripard has joined #dri-devel
sghuge has quit [Remote host closed the connection]
sghuge has joined #dri-devel
donaldrobson_ has joined #dri-devel
donaldrobson has quit [Ping timeout: 480 seconds]
youmukonpaku1337 has quit [Remote host closed the connection]
youmukonpaku1337 has joined #dri-devel
lplc has joined #dri-devel
elongbug has joined #dri-devel
lynxeye has joined #dri-devel
flynnjiang has joined #dri-devel
swalker__ has joined #dri-devel
swalker_ has joined #dri-devel
swalker_ is now known as Guest2000
rasterman has joined #dri-devel
swalker__ has quit [Ping timeout: 480 seconds]
vliaskov has joined #dri-devel
<airlied> alyssa: okay I've slogged the functions MR to a place of great happiness :-) I'd appreciate when you next have time!
youmukonpaku1337 has quit [Remote host closed the connection]
youmukonpaku1337 has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
<pq> gildekel, Weston is unfortunate in that respect: it does not handle "link-status" property at all. It does have "HOTPLUG" uevent handling.
Leopold_ has quit [Remote host closed the connection]
<pq> gildekel, I don't know what would happen in Weston on link training failure. OTOH, it does handle connectors appearing and disappearing, but that's not what you're asking about.
rgallaispou has quit [Quit: Leaving.]
<pq> gildekel, I can't see Weston code reacting to mode list pruning in any way. It might be simply not implemented, so I'd guess it'd malfunction anyway already, so your kernel changes can't make it worse.
milek7 has quit [Remote host closed the connection]
Leopold_ has joined #dri-devel
rgallaispou has joined #dri-devel
Ahuj has joined #dri-devel
youmukon1 has joined #dri-devel
youmukonpaku1337 has quit [Ping timeout: 480 seconds]
fab has quit [Read error: No route to host]
fab has joined #dri-devel
donaldrobson_ has quit [Ping timeout: 480 seconds]
donaldrobson has joined #dri-devel
YuGiOhJCJ has quit [Remote host closed the connection]
YuGiOhJCJ has joined #dri-devel
youmukonpaku1337 has joined #dri-devel
An0num0us has quit [Ping timeout: 480 seconds]
youmukon1 has quit [Read error: Connection reset by peer]
milek7 has joined #dri-devel
itoral has quit [Quit: Leaving]
fab has quit [Ping timeout: 480 seconds]
dos1 has quit [Ping timeout: 480 seconds]
luc has quit [Remote host closed the connection]
dos1 has joined #dri-devel
youmukonpaku1337 has quit [Remote host closed the connection]
youmukonpaku1337 has joined #dri-devel
<emersion> pq, most likely a monitor would go black and weston would continue to use it as-if nothing happened
<emersion> pq, doesn't weston reload the modelist on HOTPLUG=1?
<emersion> (wlroots doesn't, it only does when link-status=bad)
<emersion> pq, a link-status property change results in HOTPLUG=1 uevent
<emersion> with also CONNECTOR and PROPERTY set
<emersion> in the uevent
<emersion> so if you don't read CONNECTOR/PROPERTY, it just looks like a normal HOTPLUG=1 uevent
<emersion> the kernel uses HOTPLUG=1 as a generic way of informing userspace that something in the KMS state changed
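A minimal sketch of the distinction emersion describes, assuming the uevent has already been read into a dict of string properties (for example from a udev monitor). The function name and return shape are hypothetical; the only keys taken from the discussion are HOTPLUG, CONNECTOR and PROPERTY, and the IDs are assumed to be decimal object IDs.

```python
def classify_kms_uevent(props):
    """Classify a DRM uevent from its key/value properties.

    Returns (kind, connector_id, property_id); the IDs are None when the
    event does not name a specific connector property.
    """
    if props.get("HOTPLUG") != "1":
        return ("not-hotplug", None, None)
    if "CONNECTOR" in props and "PROPERTY" in props:
        # Targeted event: one property (e.g. link-status) of one connector changed.
        return ("property-change", int(props["CONNECTOR"]), int(props["PROPERTY"]))
    # Generic event: something in the KMS state changed, re-probe everything.
    return ("generic-hotplug", None, None)
```

A compositor that ignores CONNECTOR/PROPERTY effectively always takes the generic path, which is why a link-status change looks like any other hotplug to it.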
youmukonpaku1337 has quit [Remote host closed the connection]
youmukonpaku1337 has joined #dri-devel
Haaninjo has quit [Quit: Ex-Chat]
<karolherbst> alyssa: obviously
<karolherbst> it might be what we have to use until aco can deal with function calls :D
<karolherbst> doing that in zink should be trivial
An0num0us has joined #dri-devel
swalker_ has joined #dri-devel
swalker_ is now known as Guest2011
Guest2011 has quit [Remote host closed the connection]
Guest2000 has quit [Remote host closed the connection]
youmukonpaku1337 has quit [Remote host closed the connection]
youmukonpaku1337 has joined #dri-devel
<pq> emersion, no, Weston does things backwards: it updates the mode list only after the frontend has already picked a mode, and that only happens when the frontend inits or changes the mode. *facepalm*
<pq> Weston does update pretty much everything *else* on hotplug event
<pq> and communicates the change to the frontend, so the frontend can choose what to do
<pq> just... not the modes
<pq> so, nothing to worry about from kernel side, Weston is already broken there
DottorLeo has joined #dri-devel
<DottorLeo> hi!
<DottorLeo> has anyone successfully used mesa on a phone? what is the best chipset supported by Mesa?
youmukonpaku1337 has quit [Remote host closed the connection]
youmukonpaku1337 has joined #dri-devel
mauld has quit [Ping timeout: 480 seconds]
<javierm> tzimmermann: I see that drivers/video/fbdev/au1100fb.c also has some logic when !defined(CONFIG_FRAMEBUFFER_CONSOLE) && defined(CONFIG_LOGO)
Company has joined #dri-devel
<javierm> tzimmermann: in its au1100fb_drv_remove() callback, it does a au1100fb_fb_blank(VESA_POWERDOWN, &fbdev->info)
<javierm> not sure why, because that driver doesn't show the logo like drivers/video/fbdev/au1200fb.c does
<javierm> tzimmermann: but maybe you drop that too in your series?
<javierm> tzimmermann: and drivers/video/fbdev/xilinxfb.c has the same logic in xilinxfb_release() but also doesn't show the logo, maybe a copy&paste thing between these old drivers?
youmukonpaku1337 has quit [Remote host closed the connection]
youmukonpaku1337 has joined #dri-devel
<tzimmermann> javierm, my cleanups were specifically for fb_prepare_logo() and fb_show_logo(), so that fb_logo.c can be made optional easily. can these other cleanups be sent out separately?
<javierm> tzimmermann: yes, sure
<javierm> I was mentioning just in case you missed those
<tzimmermann> why these drivers do this is not so clear to me. that code should be in the code, if any
<tzimmermann> javierm, the definition of the blanking constants is kinda confusing https://elixir.bootlin.com/linux/latest/source/include/uapi/linux/fb.h#L303
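For readers without the header at hand, here is a sketch of why the blanking constants are easy to mix up. The values are transcribed from memory of include/uapi/linux/fb.h and should be re-checked against the header; the SYNC_STATE table is an illustrative addition, not part of the kernel API.

```python
# VESA_* values, as recalled from include/uapi/linux/fb.h.
VESA_NO_BLANKING = 0
VESA_VSYNC_SUSPEND = 1
VESA_HSYNC_SUSPEND = 2
VESA_POWERDOWN = 3

# FB_BLANK_* values are defined relative to the VESA ones, shifted by one
# for every state except "unblank".
FB_BLANK_UNBLANK = VESA_NO_BLANKING
FB_BLANK_NORMAL = VESA_NO_BLANKING + 1
FB_BLANK_VSYNC_SUSPEND = VESA_VSYNC_SUSPEND + 1
FB_BLANK_HSYNC_SUSPEND = VESA_HSYNC_SUSPEND + 1
FB_BLANK_POWERDOWN = VESA_POWERDOWN + 1

# (hsync_on, vsync_on) for each FB_BLANK_* level.
SYNC_STATE = {
    FB_BLANK_UNBLANK: (True, True),
    FB_BLANK_NORMAL: (True, True),
    FB_BLANK_VSYNC_SUSPEND: (True, False),
    FB_BLANK_HSYNC_SUSPEND: (False, True),
    FB_BLANK_POWERDOWN: (False, False),
}
```

Because of the off-by-one, the same integer means different things in the two families: 3 is VESA_POWERDOWN but also FB_BLANK_HSYNC_SUSPEND.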
youmukonpaku1337 has quit [Remote host closed the connection]
youmukonpaku1337 has joined #dri-devel
<pq> tzimmermann, I can guess what those mean in the signal, but I don't know why the H and V suspend modes exist.
<javierm> tzimmermann: yeah, me neither, but it's weird that they do it on their remove/release but nothing on driver probe
<javierm> that's why I think it's some kind of leftover or wrong copy&paste
itoral has joined #dri-devel
<tzimmermann> pq, i think this corresponds to DPMS levels. wasn't it such that the old CRTs would resume faster/slower on different levels?
<pq> the monitor of the original IBM PC comes to mind, which would let smoke out if HSYNC pin was left in a wrong state for a too long moment.
<pq> tzimmermann, yes, kind of.
<pq> I presume that if you keep the syncs flying, the CRT circuitry stays alive and warm, and the high voltage remains.
<pq> if you stop HSYNC, you lose high voltage, because usually the flyback transformer is driven directly from that, without an oscillator of its own.
<pq> getting the high voltage back may take a moment, but what takes a lot longer is if the CRT filament cools down.
<tzimmermann> "which would let smoke out if HSYNC pin was left in a wrong state" that's hilarious!
<pq> I don't what controls the CRT heater.
mauld has joined #dri-devel
<pq> *I don't know
<tzimmermann> thanks for the HW details
<pq> vacuum tubes! \o/
ap51 has joined #dri-devel
youmukonpaku1337 has quit [Remote host closed the connection]
youmukonpaku1337 has joined #dri-devel
<pq> filament... I mean the cathode
<ccr> similar kind of issue existed with certain Commodore PET models. you could, by manipulating some video register(s), cause damage to the CRT monitor.
<pq> yeah, the IBM PC monitor was something. That was before VGA. With VGA, it was still possible to let the smoke out of some monitors by sending a too high hsync frequency.
<pq> precisely because hsync was used to drive the flyback transistor, and too high freq. caused it to melt
<javierm> pq: I remember that a DPMS change caused a noisy click on my old CRT monitor
<javierm> in fact, every mode set was noisy IIRC
<pq> I guess relays, maybe protecting the above mentioned components. :-)
<pq> javierm, or was that accompanied with a vibrating image that then settled down?
<javierm> pq: I don't remember vibrating image, only the noise
<pq> ok
<javierm> unless the settle down was too quick for my eye to notice :)
<pq> degaussing was a fun feature of CRT monitors too
<javierm> of SW causing things to melt :)
<pq> isn't Asahi also needing to deal with explicit power control to avoid melting those speakers too?
<tzimmermann> javierm, "Is this howto useful? I think it is. Cause if you have device broken in some way and you want to get it replaced you can just run it and hope for replacement instead of repair." :D
<vsyrjala> iirc n900 had a pulseaudio filter to protect the speakers from unwanted frequencies. good luck if you wanted to not use pulse :/
<karolherbst> itoral: yeah.. so trying to bring up rusticl on v3d, but the compute job I'm enqueuing is never signaling the fence or something... I honestly don't exactly know what's happening, that's my patch so far: https://gitlab.freedesktop.org/karolherbst/mesa/-/commit/c3d472346704d1a95609f992128dc18a4352919e
youmukonpaku1337 has quit [Remote host closed the connection]
<karolherbst> I'm probably just missing something.. dunno
youmukonpaku1337 has joined #dri-devel
<karolherbst> I could also make a gallium trace if you think that would help you understand what's going on
<itoral> so what is happening exactly? you submit a compute job and it timeouts?
tristianc6704 has quit [Ping timeout: 480 seconds]
* ccr shakes fist at pulseaudio
<karolherbst> itoral: yes
<youmukonpaku1337> pulse audio more like pulse midio
<itoral> karolherbst: what do you see in dmesg?
<karolherbst> itoral: nothing
<itoral> nothing? uh, that is very weird
<karolherbst> I know..
<itoral> are you sure the job is being submitted?
<karolherbst> I suspect something with buffer tracking is off.. or context stuff
<karolherbst> yes
<karolherbst> and the ioctl doesn't return an error
<itoral> ugh, that is so weird... could you send an e-mail to me with instructions on how to replicate the issue? (assuming it is not too much work)
<itoral> in fact, if you create an issue in gitlab with the details it would be even better
<itoral> I will then try to replicate it, and if I can, I figure I should be able to find what is happening
<karolherbst> I suspect it's something pipe_context related
<karolherbst> ehh wait...
<karolherbst> I booted it up again and now it works...
<itoral> uh, maybe the gpu was in a silly state :-/
<karolherbst> yeah....
<karolherbst> I think there were MMU faults in the dmesg, but they were triggered by other jobs? Maybe they messed up the GPU state I guess..
<itoral> yes, that could be
<karolherbst> v3d fec00000.v3d: MMU error from client L2T (0) at 0x3400, write violation, pte invalid
<karolherbst> v3d fec00000.v3d: MMU error from client L2T (0) at 0xa00, write violation, pte invalid
<karolherbst> that's what I had seen, but if one job triggering those messes up all the future ones that's kinda a problem for CTS runs :'(
<karolherbst> maybe it's also something else
<itoral> there is usually one of those on every boot... but you should only see that one
<karolherbst> this boot I see none
<itoral> if you are seeing more than that, and in particular, if you are seeing those when you run your workload then there is something wrong
<karolherbst> now I have to deal with other errors like "unknown NIR ALU inst: div 64 %76 = u2u64 %75" :'(
<karolherbst> yeah..
<karolherbst> I try to track down what's causing those errors
<itoral> ugh, we don't support 64bit alu :-(
<karolherbst> right...
<karolherbst> probably needs more int64 lowering
<karolherbst> but then it requires packing
<itoral> yep
<karolherbst> also.. the hw doesn't support fma?
<itoral> nope :-(
<karolherbst> :'(
<itoral> yeah...
<karolherbst> CL requires _precise_ fma :')
<karolherbst> so fma will be lowered to software
<itoral> right
<karolherbst> though I wonder if we can improve the lowering in libclc :D
<karolherbst> anyway..
<karolherbst> u2u64 lowering needs pack_64_2x32_split
<karolherbst> int64 support is optional though, and I don't know why emulated fma even needs u2u64...
sarahwalker has joined #dri-devel
YuGiOhJCJ has quit [Quit: YuGiOhJCJ]
<karolherbst> uhh.. the lowering stores the mantissa as a long
karolherbst is now known as karolherbst_
<karolherbst_> pain
karolherbst_ is now known as karolherbst
<itoral> there is a lower_pack_64_2x32_split right?
tristan has joined #dri-devel
<karolherbst> ohh indeed
tristan is now known as Guest2017
youmukonpaku1337 has quit [Remote host closed the connection]
youmukonpaku1337 has joined #dri-devel
<karolherbst> itoral: it stopped working :') and nothing in dmesg
ap51 has quit [Ping timeout: 480 seconds]
<karolherbst> ehh wait.. that's just the nir opts looping infinitely.. nvm
<karolherbst> why is int64 lowering so cursed
<itoral> ouch!
<karolherbst> added lower_int64 into my opt loop, because opt_algebraic's pack lowering adds u2u64
<karolherbst> and uhm.. it doesn't stop now
<karolherbst> :D
<karolherbst> ahh yes
<karolherbst> nir_pack_64_2x32_split lowering uses u2u64 and u2u64 lowering uses nir_pack_64_2x32_split
<karolherbst> :')
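The loop karolherbst hit can be illustrated with a toy rewriter. This is only a sketch: the rule table is a hypothetical stand-in for the two lowerings he names, not how NIR passes are actually structured.

```python
# Toy model: each "lowering" rewrites one opcode in terms of another.
CYCLIC_RULES = {
    "pack_64_2x32_split": "u2u64",   # pack lowering emits u2u64
    "u2u64": "pack_64_2x32_split",   # u2u64 lowering emits pack
}


def lower_to_fixed_point(op, rules):
    """Apply rules until no rule matches; raise if an opcode repeats."""
    trail = [op]
    while op in rules:
        op = rules[op]
        if op in trail:
            raise RuntimeError("lowering cycle: " + " -> ".join(trail + [op]))
        trail.append(op)
    return op
```

An opt loop that keeps running "until nothing changes" has no such cycle check, so two lowerings that each produce the other's input never reach a fixed point.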
<itoral> oh boy :)
<karolherbst> do you think lower_pack_64_2x32_split can be easily implemented though?
<karolherbst> ehhh
<karolherbst> pack_64_2x32 I mean
milek7 has quit [Remote host closed the connection]
<itoral> but that would require that we are actually able to have 64bit defs around, no?
<itoral> I mean, we would need an ssa that can reference an actual 64bit value
<karolherbst> not necessarily
<karolherbst> the lowering is literally doing `nir_pack_64_2x32_split(b, x32, nir_imm_int(b, 0))`
<karolherbst> could just.. uhm.. treat it as 32 bit :D
<itoral> oh
<karolherbst> I guess it would get messy with load/stores
<itoral> well... sure, we can do that XD
<karolherbst> and how that 64 bit value is used
<karolherbst> but I think when translating from nir that nonsense _could_ be handled in a hacky way
youmukon1 has joined #dri-devel
<karolherbst> other int64 lowering isn't as nice though
<karolherbst> just the conversion ones just place a 0 in the upper bits
<karolherbst> maybe we should just have alternative nir lowering
<karolherbst> `(('pack_64_2x32_split', a, b), ('ior', ('u2u64', a), ('ishl', ('u2u64', b), 32)), 'options->lower_pack_64_2x32_split')` mhh
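As a sanity check of the rule quoted above, the identity can be verified on plain integers. This is a sketch with Python ints standing in for NIR values; the helper names mirror the opcodes but are otherwise hypothetical.

```python
MASK32 = 0xFFFFFFFF


def u2u64(x32):
    """Zero-extend a 32-bit value: the upper 32 bits are 0."""
    return x32 & MASK32


def pack_64_2x32_split(lo, hi):
    """Reference semantics: lo in bits 0..31, hi in bits 32..63."""
    return ((hi & MASK32) << 32) | (lo & MASK32)


def pack_via_rule(a, b):
    """The proposed lowering: ior(u2u64(a), ishl(u2u64(b), 32))."""
    return u2u64(a) | (u2u64(b) << 32)
```

It also shows the converse direction mentioned earlier: packing a value with a zero upper half is the same as u2u64, which is exactly why the two lowerings can feed each other.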
<karolherbst> itoral: the thing is... u2u64 is also trivial to implement on 32 bit hardware
<karolherbst> but yeah... I suspect for it to not be very cursed, the backend IR would need some support for 64 bit values
<karolherbst> which it probably should have for vectorized load/stores anyway
<karolherbst> unless you don't have vectorized loads/stores
<karolherbst> or I just ignore fma for now...
youmukonpaku1337 has quit [Ping timeout: 480 seconds]
<itoral> maybe we would like a lowering that works in terms of uvec2 instead of u64
<itoral> that's what we do for the vulkan device address extension
<karolherbst> mhh.. maybe
<itoral> which expects device addresses to be 64-bit
<karolherbst> translating u2u64 to a vec2 shouldn't be too hard
<karolherbst> or rather
ap51 has joined #dri-devel
<karolherbst> ehh.. I guess it depends more on where that 64 bit value is coming from
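A rough sketch of the uvec2 idea itoral raises: carry 64-bit values as (lo, hi) pairs of 32-bit words so no real 64-bit register is needed. The function names are hypothetical, and the add is included only to show that arithmetic remains possible with 32-bit operations plus a carry.

```python
MASK32 = 0xFFFFFFFF


def u2u64_vec2(x32):
    """u2u64 on a 32-bit value: low word is x, high word is 0."""
    return (x32 & MASK32, 0)


def iadd64_vec2(a, b):
    """64-bit add on (lo, hi) pairs using only 32-bit wide results."""
    lo = (a[0] + b[0]) & MASK32
    carry = 1 if lo < a[0] else 0  # unsigned wraparound means a carry out
    hi = (a[1] + b[1] + carry) & MASK32
    return (lo, hi)


def to_int(v):
    """Helper for checking: reassemble the pair into one integer."""
    return (v[1] << 32) | v[0]
```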
<alyssa> airlied: :+1:
<karolherbst> yeah.. anyway, if not doing fma stuff things seem to work
<karolherbst> - random compiler crashes :')
<itoral> cool
<itoral> hahaha
<karolherbst> _but_ basic kernels seem to work
<alyssa> ...does videocore deserve CL
<karolherbst> no, but there is high demand
<karolherbst> it has 32 bit pointers :')
<karolherbst> itoral: the bigger problem is 8/16 bit handling which is mandatory in CL
<karolherbst> at least for ints
<karolherbst> let's see if I can trigger this MMU fault or if my proper implementation of set_global_binding fixed that one
youmukonpaku1337 has joined #dri-devel
<itoral> v3d doesn't support native 8bit/16bit int alu, but I guess it can be done in 32-bit?
<karolherbst> yeah
<karolherbst> more or less
<karolherbst> it gets a bit fishy around load/stores but I think we have lowering in place for most of it
<karolherbst> `nir_lower_bit_size`
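The general idea behind doing 8-bit integer math on 32-bit ALUs can be sketched as follows. This is a simplified model in plain Python, not the actual nir_lower_bit_size implementation: compute at 32 bits, then truncate or sign-extend back to the narrow width.

```python
def iadd8_via_32(a, b):
    """8-bit add done with a 32-bit add, then truncated to 8 bits."""
    wide = (a + b) & 0xFFFFFFFF
    return wide & 0xFF


def sign_extend8(x):
    """Interpret the low 8 bits of x as a signed 8-bit integer."""
    x &= 0xFF
    return x - 0x100 if x & 0x80 else x


def ushr8_via_32(a, shift):
    """Unsigned 8-bit shift right: the input must be zero-extended first."""
    return ((a & 0xFF) >> shift) & 0xFF
```

Adds wrap correctly after truncation, but operations such as shifts and comparisons depend on how the upper 24 bits were filled, which is where the extra care around loads and stores comes from.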
<itoral> those MMU errors are a real issue
<itoral> probably some out-of-bounds access
<karolherbst> mhhh
tristianc6704 has joined #dri-devel
<itoral> [ 1091.642212] v3d fec00000.v3d: MMU error from client L2T (0) at 0x0, write violation, pte invalid
<itoral> that one is pretty clear, seems to be trying to write something at a NULL address
<karolherbst> mhhh
youmukon1 has quit [Ping timeout: 480 seconds]
<karolherbst> could be something in my set_global_binding impl
<karolherbst> the model CL has with memory is kinda annoying
<karolherbst> but I'm also not sure the way I've implemented load/store global is actually correct
<karolherbst> but it's just doing 32 bit addresses anyway
<itoral> mmm... I think nir_intrinsic_load_global takes a 64-bit value no?
<itoral> that's why we added nir_intrinsic_load_global_2x32
<itoral> (which is what we use in vulkan for device addresses)
<karolherbst> ahh
<karolherbst> it's shared memory stuff
<karolherbst> itoral: not necessarily
<karolherbst> it can take a 32 bit one, but that depends on the API
<karolherbst> CL supports multiple pointer sizes and the device simply reports what it supports
<karolherbst> so if the runtime claims 32 bit pointers, all pointers are 32 bit
<itoral> aha
<karolherbst> anyway..
<karolherbst> seeing those mmu errors when using shared memory
<karolherbst> which makes sense
<karolherbst> because...
<karolherbst> uhm
<karolherbst> pipe_grid_info::variable_shared_mem
<karolherbst> so CL has a cursed model of shared memory
<karolherbst> you have 1. in kernel declared static shared memory blocks
<karolherbst> and 2. you can pass arbitrary sized shared memory blocks as kernel parameters into the runtime
<karolherbst> so you have a static part (declared in nir_shader) + a variable part (set via pipe_grid_info::variable_shared_mem at launch_grid time)
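One way a driver might combine the two parts at launch time, as a sketch only: the function name, the alignment value, and the layout (variable block placed after the static one) are assumptions for illustration, not what v3d or rusticl actually do.

```python
def shared_mem_layout(static_size, variable_size, align=16):
    """Return (variable_offset, total_size) for a kernel launch.

    static_size: bytes declared in the shader itself.
    variable_size: bytes requested at launch time
                   (pipe_grid_info::variable_shared_mem in the discussion).
    align: assumed alignment of the variable block (hypothetical).
    """
    variable_offset = (static_size + align - 1) // align * align
    return variable_offset, variable_offset + variable_size
```

The case to watch is static_size == 0: the total is still non-zero when variable_size is, so an allocation decision based only on the shader's declared size would hand out no memory at all.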
<karolherbst> I guess I just need to support that one then :')
<itoral> oh, I see
<karolherbst> I've added a shader_info::cs::has_variable_shared_mem so backend compilers can deal with some of it
<karolherbst> or drivers in general
<karolherbst> luckily we know if a shader has variable shared mem at compile time, so drivers can deal with that in some proper way
<karolherbst> ohh
<karolherbst> I suspect that null pointer is when it's all variable and no shared memory block is allocated
<karolherbst> maybe
<karolherbst> mhhh
milek7 has joined #dri-devel
<karolherbst> I guess some of the shared_size handling in the compiler needs to be fixed as well
<itoral> gotta go now but if you hit anything on v3d where you need assistance feel free to ping me or open an issue and I'll try to help
<karolherbst> okay, cool
milek7 has quit [Remote host closed the connection]
itoral has quit [Quit: Leaving]
<karolherbst> nice... I've added _broken_ variable_shared_size handling, but at least some/all of the MMU faults are gone
fab has joined #dri-devel
paulk-bis has quit []
paulk has joined #dri-devel
youmukonpaku1337 has quit [Remote host closed the connection]
elongbug has quit [Read error: Connection reset by peer]
youmukonpaku1337 has joined #dri-devel
elongbug has joined #dri-devel
youmukonpaku1337 has quit [Remote host closed the connection]
youmukonpaku1337 has joined #dri-devel
yyds has quit [Remote host closed the connection]
yyds has joined #dri-devel
DottorLeo has quit [Quit: Konversation terminated!]
kts has joined #dri-devel
kts_ has joined #dri-devel
jdavies has joined #dri-devel
jdavies is now known as Guest2029
kts has quit [Ping timeout: 480 seconds]
<austriancoder> which existing nir pass would do this simple transformation?
<austriancoder> 32x4 %5 = load_const (0x00000000, 0x00000000, 0x00000000, 0x00000000) = (0.000000, 0.000000, 0.000000, 0.000000)
<austriancoder> from:
<austriancoder> @store_reg (%5 (0x0, 0x0, 0x0, 0x0), %12) (base=0, wrmask=xyzw, legacy_fsat=0)
<austriancoder> to:
<austriancoder> @store_reg (%5.xxxx (0x0, 0x0, 0x0, 0x0), %12) (base=0, wrmask=xyzw, legacy_fsat=0)
<austriancoder> 32 %5 = load_const (0x00000000) = (0.000000)
<austriancoder> I think my irc client has messed up the last messages :( https://www.irccloud.com/pastebin/kY2ts6HY/
elongbug_ has joined #dri-devel
bmodem has quit [Ping timeout: 480 seconds]
yyds has quit [Remote host closed the connection]
An0num0us has quit [Ping timeout: 480 seconds]
Guest2017 has quit [Ping timeout: 480 seconds]
yyds has joined #dri-devel
elongbug has quit [Ping timeout: 480 seconds]
youmukon1 has joined #dri-devel
ap51 has quit [Ping timeout: 480 seconds]
youmukonpaku1337 has quit [Read error: Connection reset by peer]
<karolherbst> austriancoder: sounds like a job for nir_opt_reuse_constants, but I don't think nir_instr_set_add_or_rewrite (which it uses, and also used by opt_cse) seem to look into vectors
<austriancoder> karolherbst: or maybe nir_opt_shrink_vectors .. which I try to hack to support the new nir register thing
<karolherbst> mhhh
<karolherbst> I _think_ it could rather be a mix of two passes
<karolherbst> reuse_constants to use the .xxxx swizzle
<karolherbst> and shrink vectors could ditch the unused ones
kts_ has quit []
pekkari has joined #dri-devel
<karolherbst> dunno though, it just kinda feels like a cse problem
<karolherbst> and cse is currently based on entire ssa values afaik
<karolherbst> I can also see that in the future we should be able to cse parts of vectors, what if you have xy and zw being equal in a vec4?
<austriancoder> karolherbst: in the future .. are you working on something?
<karolherbst> I'm not
<karolherbst> I was just thinking out loud :D
<austriancoder> :)
itsmeluigi has joined #dri-devel
Ahuj has quit [Ping timeout: 480 seconds]
mszyprow has quit [Ping timeout: 480 seconds]
frieder has quit [Remote host closed the connection]
bmodem has joined #dri-devel
<gildekel> emersion:
<gildekel> `7:53 PM <emersion> gildekel: yeah i think 0 modes is a kernel bug`
<gildekel> I have seen 0-mode connectors in the past, which did seem weird to me. But I am proposing that connectors in such a bad state should be pruned by userspace. How does sway handle these cases?
<gildekel>
<gildekel> pq:
<gildekel> Oh boy - that's a 5 year old bug.. heh
kzd has joined #dri-devel
<emersion> pretty sure sway misbehaves
<emersion> when the kernel exposes a 0-mode connector
<gildekel> Well, complete link-training failures on SST sources should be fairly uncommon, as the link-training fallback logic usually salvages the process
<gildekel> however, link-training fallback is currently not implemented for MST in i915, which may cause userspaces to hit this case more often
<gildekel> So, if my series is approved, you may be running into more modeless connectors in a bad state
mareko_ is now known as mareko
<emersion> imho still a kernel bug when a 0-mode connector is exposed
<emersion> sima: ^
<emersion> when i say "kernel mode", i mean that sway should not fix it
<emersion> s/mode/bug/
<gildekel> Hmm.. I am not entirely convinced that's a bug. How would you otherwise signal userspace that that connector is in a bad state?
<gildekel> you can't just prune it in DRM
rsalvaterra_ has quit []
<gildekel> userspace would be left confused about why the connector is completely missing. A connector with 0 modes has a state that userspace can parse
<gildekel> inform end users
youmukonpaku1337 has joined #dri-devel
<emersion> you mark it as disconnected, if it really isn't usable, gildekel
<gildekel> That's also an option I am suggesting, as an alternative. But that's still not an accurate state for the connector, as it is actually connected. There was successful communication, DPCD reads, modes, link-training even
<gildekel> If it is marked as disconnected, again, userspace may be left wondering how come a connected display is ignored without sufficient signal.
youmukon1 has quit [Read error: Connection reset by peer]
yuq825 has quit [Remote host closed the connection]
<zamundaaa[m]> Yeah it would be nice to be able to show the user that the display is connected and just not working. But if we don't get more information than "it doesn't work" then it's probably not too useful vs just not detecting the display in the first place
<zamundaaa[m]> Because "it doesn't work" is something the user will notice by themselves already
An0num0us has joined #dri-devel
youmukon1 has joined #dri-devel
<gildekel> Agreed, but seeing a connected display misbehaving is better than seeing nothing. Also, DRM defines link-training failures to be signaled to userspace via uevents and a connector prop "link-status" to be BAD
<karolherbst> also kernel changing behavior even if userspace is buggy is still a kernel regression
rsalvaterra has joined #dri-devel
rgallaispou has quit [Read error: Connection reset by peer]
mauld has quit [Remote host closed the connection]
<gildekel> No arguments there, karolherbst. But the current state of i915 is that there's a gap when link-training completely fails. The failed attempt does not issue a uevent to userspace, and, at least ChromeOS, is left thinking that the modeset was successful and the display is operational.
<karolherbst> right...
<karolherbst> I think that goes back to the point if userspace can even do anything with that information
<karolherbst> can it recover? probably not
<zamundaaa[m]> Perhaps a solution could be to add a new connection state? DRM_MODE_CONNECTED_LINK_FAILURE or something like that?
<karolherbst> can the kernel tell the user what to do to fix this situation or is it simply broken no matter what?
<gildekel> Well, if you get a connected connector with no modes, you could potentially signal to the user that the connector failed to link-train
<karolherbst> what does this tell the user even?
<gildekel> and suggest to replace the cable, or try a simpler display set up
<karolherbst> what should the user do with this information
<karolherbst> okay
<karolherbst> but "disconnected" on a connected display kinda means "your cable might be bonkers" already
<karolherbst> _but_
youmukonpaku1337 has quit [Ping timeout: 480 seconds]
<karolherbst> I think there is value to tell the user that the cable might be bonkers
youmukonpaku1337 has joined #dri-devel
<gildekel> hey, displays are hard. We face the same difficulty in ChromeOS. We are thinking to show a bubble with a link to possible troubleshooting around displays
<karolherbst> right
<gildekel> it could be anything from bad cable to incompatibility, to driver bugs.
<karolherbst> okay
<karolherbst> but does the way the kernel reports this matter here?
<gildekel> It does, I think. If a connector comes back simply disconnected, then how do you signal userspace something went wrong vs. nothing was detected on the link?
<karolherbst> not disagreeing with your idea to report 0 modes, it's just icky if it breaks userspace
<gildekel> It's ok, this discussion is healthy, and exactly what I was hoping for
<gildekel> I am trying to make all our lives better.. not harder
<karolherbst> yeah...
<gildekel> At the end of the day, I want to be able to know in ChromeOS why a connector that succeeded to modeset is not coming up. That's where it all started. I call those "zombie" displays
<gildekel> I realized it is because i915 stops issuing uevents to users once RBR x 1 Lane fails
<karolherbst> it's just that because this is uapi territory, not breaking userspace is just more important than a clean UAPI
<gildekel> Your userspace is already broken
<gildekel> you see zombie displays, at best
youmukon1 has quit [Read error: Connection reset by peer]
<gildekel> if you're running i915, that is
<karolherbst> fair
<gildekel> and btw, this change is currently affecting i915 alone
<karolherbst> I guess it depends on how bad the regression is. turning not seeing anything on the display into a compositor crash might be reason enough
<karolherbst> if it's just the compositor doing something silly vs not doing something at all it might not matter
<gildekel> if you do not ignore 0 mode connectors, then, depending on how you choose modes, you'll hopefully just send a disable modeset request..?
<karolherbst> at least if the display is still black, the user could e.g. just check the cables (which they probably will) and the system might recover
<gildekel> Sure, that's true. But wouldn't a better user experience be that userspace signals the user that the display is in trouble?
<karolherbst> if the display is black, what can you even signal to the user?
<gildekel> via parsing the state of a connected connector in link-status=bad, as DRM dictates?
<karolherbst> though you still have your primary one...
<karolherbst> gildekel: oh.. I meant in case all connectors are bad or something, but I think this issue is about MST specifically, but then again, what if the MST display is the only display connected anyway
<gildekel> `10:54 AM <karolherbst> if the display is black, what can you even signal to the user?`
<gildekel> Unfortunately, that's a case that can't be alleviated by anything we do
<gildekel> if the only display a user has doesn't come up...
<gildekel> no amount of signaling can help
<gildekel> unless we emit audio
<gildekel> but there's no end to this
<karolherbst> but yeah... I think _if_ marking this state in a special way helps userspace to report something more meaningful to the user of the system then that's reason enough to be more explicit unless there will be regressions
<gildekel> Agreed. I personally believe that a controlled state is better than undefined behavior, which is "zombie" displays. Also don't forget that you produce frames in these scenarios.
<karolherbst> right...
<gildekel> At least in ChromeOS we do. Frames, mouse warping, layouts, the whole thing.
<daniels> I think a pragmatic compromise would be a tweak on what Karol suggested: for current userspace, expose as disconnected, but for userspace which opts in with a client cap, expose as connected-but-useless
<daniels> the fact that auxch works isn’t much comfort to userspace; any userspace that’s weird enough to care about the distinction can opt in to receive the new status
jewins has joined #dri-devel
pekkari has quit [Remote host closed the connection]
kts has joined #dri-devel
mauld has joined #dri-devel
<gildekel> Alright. Obviously there's some discomfort around the suggested change. I'll think this through some more. Thanks for the input all. Thanks for the suggestion daniels
<daniels> np!
<daniels> your usecase makes total sense, I think it just needs finesse to keep other userspace doing the right thing is all
<gildekel> I completely understand. That's why I initiated the discussion. I don't want to break anyone's stuff.
tzimmermann has quit [Quit: Leaving]
<gildekel> My view is that the solution solidifies/extends DRM specs around complete link-status failures for both SST and MST cases, but I also agree that it's not worth regressions.
alyssa has left #dri-devel [#dri-devel]
An0num0us has quit [Ping timeout: 480 seconds]
Duke`` has joined #dri-devel
pekkari has joined #dri-devel
pekkari has quit [Remote host closed the connection]
yyds has quit [Remote host closed the connection]
<daniels> you can add it to the list of things we’d do differently if we were greenfielding it, rather than something that was mostly bashed together at GUADEC 2007 ;)
<daniels> *than building on top of
pekkari has joined #dri-devel
<seanpaul_> how about adding a new value to the link-status property which is "terminal" or something
<seanpaul_> then we don't need a new connector status and _maybe_ don't need a new client cap
<daniels> so the connector status is disconnected but the link status is terminal?
<seanpaul_> connector would be connected
<seanpaul_> modes could be pruned or not (sway should be resilient in both cases, but meh)
<daniels> mm, but then you’d still have userspace blithely trying to light up a display which can never be used
<seanpaul_> link-status == bad typically means "try another modeset", whereas link-status == terminal means don't bother
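[editor's sketch] The distinction seanpaul_ draws here can be written down as a small userspace decision table. This is only an illustration under stated assumptions: "Good" and "Bad" are the link-status values discussed in this log, "terminal" is the value being proposed and is not known to exist in any kernel, and every identifier below (toy_link_status, toy_link_action_for, and so on) is made up for the example.

```c
/* GOOD and BAD mirror the two link-status values discussed in the log;
 * TERMINAL is the value proposed above and is hypothetical. */
enum toy_link_status {
    TOY_LINK_GOOD,
    TOY_LINK_BAD,
    TOY_LINK_TERMINAL, /* assumption: proposed value, not real uapi */
};

enum toy_link_action {
    TOY_ACTION_NONE,    /* link is healthy, keep scanning out */
    TOY_ACTION_RETRY,   /* re-probe modes and try another modeset */
    TOY_ACTION_GIVE_UP, /* stop driving the output and tell the user */
};

static enum toy_link_action toy_link_action_for(enum toy_link_status status)
{
    switch (status) {
    case TOY_LINK_GOOD:
        return TOY_ACTION_NONE;
    case TOY_LINK_BAD:
        return TOY_ACTION_RETRY;
    case TOY_LINK_TERMINAL:
        return TOY_ACTION_GIVE_UP;
    }
    /* Unknown value from a newer kernel: fall back to the existing "bad"
     * handling instead of aborting. */
    return TOY_ACTION_RETRY;
}
```

Falling back to the retry path for unknown values is a design choice of this sketch, picked so that an unexpected value never turns into a compositor exit.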
<zamundaaa[m]> seanpaul_: Old userspace would just overwrite that link-status value with "Good"
<daniels> ‘connected’ = ‘pixels will appear somewhere’
<seanpaul_> zamundaaa[m]: that's probably ok, they're broken anyways
<seanpaul_> daniels: i don't think that's necessarily true, connectors are connected before modeset
<seanpaul_> we have this link-status property, it seems wasteful to introduce an entirely new signal which means almost the same thing
pekkari has quit [Quit: Konversation terminated!]
<daniels> right, but once you do a modeset, you can reasonably expect that they’ll probably work
<daniels> like, semantically to userspace, the expectation is ‘this is a connector you can and probably should light up’, not ‘there’s a cable plugged in but it will never work’
<seanpaul_> i guess my point is that we already have an exception mechanism to that reasonable expectation, so why not use that
<sima> daniels, seanpaul_ gildekel I think emersion 's proposal of just marking terminally fubar connectors as disconnected makes the most sense
camus1 has joined #dri-devel
<sima> userspace can handle that, and most userspace will handle that without falling over
<sima> plus if you magically recover the link, you can change to connected again and throw an uevent out to userspace
<sima> so I'm not sure why we need to add another awkward corner case here
<sima> since connected + 0 modes very much means there's a working screen there, we just don't know how to drive it
<sima> such screens do exist (or at least have, on vga, way back)
<gildekel> sima:
<gildekel> `userspace can handle that, and most userspace will handle that without falling over`
<gildekel> But that would mean we lose all potential signal that a connector is in a bad state vs. not connected..
<gildekel> I am not completely opposed to this solution, to be clear
<sima> gildekel, why do you care?
<gildekel> Because I would like to be able to provide some information to the user that a display is connected, but there is a connectivity issue
<gildekel> I would like to provide feedback, in any form
<sima> like from a user pov, what's the difference between a badly plugged in cable and a shit cable?
<dottedmag> Maybe add a property to a connector saying "why this thing is disconnected"? With absence of value meaning "dunno, probably no cable"
<sima> in both cases they think it should work, but it doesnt
<sima> and in both cases they'll figure out that something is shit, and any attempts at further debugging feel a bit silly to me
<sima> like maybe the connector on the board is dented, and no amount of cable replacement is going to fix anything
<gildekel> In one case, they are left on their own. In the other, the OS validates their circumstances and provides some feedback
<gildekel> I see that as a better user experience.
camus has quit [Ping timeout: 480 seconds]
<sima> gildekel, yeah but what are the users going to do?
<sima> you can't tell them whether it's the laptop, cable, or sink that's busted
<sima> just "something is wrong"
<sima> which ... they know, it doesn't work
<zamundaaa[m]> sima: you can give them a hint, a list of things to do
<gildekel> ^
<zamundaaa[m]> For users that don't know anything about computers that's pretty helpful
<gildekel> In ChromeOS, we plan on providing a link to a Display Troubleshoot page
<gildekel> Displays are hard. period. We can provide basic education, where things can go wrong, cable management, etc.
<gildekel> vs. leave them on their own to open yet another bug in which "my displays don't turn on"
<gildekel> and call us all monkeys with keyboards on reddit
<gildekel> ^ (real story)
<sima> yeah that part is unfortunately not optional :-/
<sima> but yeah I still think what we should do is 1. handle this within current kms semantics, i.e. set the connector to disconnected or something if it's terminally busted
<sima> 2. do some extension on top of that, with the userspace glue to handle it, but that extension needs to extend, not change the rules
<sima> maybe if we set it to disconnected we can abuse link-status=terminal, but that feels a bit risky
<gildekel> Well, in that case, how about zamundaaa[m]'s original suggestion to add a connector state? why isn't that sufficient? It'll be in terminal state + link-status bad after the final link training attempt. It should be sufficient to say that if a connector != CONNECTED then userspaces ignore it.. no?
<gildekel> And then, should a userspace care, it can parse connectors in terminal state for extra signal
mszyprow has joined #dri-devel
<sima> gildekel, that's 1&2 together
<sima> and I'm not super keen on auditing everything whether it copes with a new connector state
<seanpaul_> sima: can you elaborate on "risky"? it actually might work even better this way since you can mark the base connector as terminal for MST cases
mszyprow has quit [Ping timeout: 480 seconds]
<sima> that's even more risky than extending link-status, because almost every kms userspace looks at connector state
<seanpaul_> it's already disconnected
<dottedmag> default: fprintf(stderr, "Unknown connector state"); exit(1);
<dottedmag> this kind of risky?
<sima> yup
<seanpaul_> sima: to clarify, i was asking about why link-status extension was risky
<seanpaul_> (especially if the connector is disconnected)
<sima> seanpaul_, well same, but since the users of that are a lot more limited it's probably going to be fine, plus if it's already disconnected even more chances it's going to be fine
<sima> but fundamentally the no regression rule means that userspace is right, no matter how stupid
<seanpaul_> ok, understood, was curious if there was anything beyond that
<sima> and upgrading a "my external screen doesn't work" bug to a "my compositor just died" bug isn't good
<sima> seanpaul_, nah just general uapi paranoia
<gildekel> Agreed. I can work with a disconnected connector with link-status=terminal
<seanpaul_> perhaps i should tighten the straps on my tinfoil hat
<gildekel> you guys get a tinfoil hat? :o
<gildekel> sima: I'll add you to future revisions of the series, if you don't mind.
<zamundaaa[m]> I agree that link-status terminal + disconnected connector would be a good solution
<sima> gildekel, little wrapper with docs and all (plus extending uapi docs for the properties too ofc) would be good
<sima> since to avoid races you have to set to disconnected before updating link-status, then uevent
Jeremy_Rand_Talos__ has quit [Remote host closed the connection]
<sima> and we don't want to give drivers any other way to get to link-status=terminal I think
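[editor's sketch] The ordering sima asks for (disconnect first, then update link-status, then notify, all behind a single wrapper) can be modelled with a toy connector that records the order of the steps. This is not kernel code: the struct, the enums, and wrap_connector_set_terminal() are hypothetical names invented for the example, and the "uevent" is only a trace entry.

```c
#include <assert.h>
#include <stddef.h>

enum wrap_status { WRAP_CONNECTED, WRAP_DISCONNECTED };
enum wrap_link { WRAP_LINK_GOOD, WRAP_LINK_BAD, WRAP_LINK_TERMINAL /* hypothetical */ };
enum wrap_step { WRAP_STEP_STATUS, WRAP_STEP_LINK, WRAP_STEP_UEVENT };

struct wrap_connector {
    enum wrap_status status;
    enum wrap_link link;
    enum wrap_step trace[3]; /* records the order in which steps ran */
    size_t trace_len;
};

static void wrap_trace(struct wrap_connector *c, enum wrap_step step)
{
    /* Drop entries once the trace is full rather than overflowing it. */
    if (c->trace_len < sizeof(c->trace) / sizeof(c->trace[0]))
        c->trace[c->trace_len++] = step;
}

/* Intended as the only entry point that may set WRAP_LINK_TERMINAL. */
static void wrap_connector_set_terminal(struct wrap_connector *c)
{
    assert(c != NULL);

    /* 1. Disconnect first, so nobody observes connected + terminal. */
    c->status = WRAP_DISCONNECTED;
    wrap_trace(c, WRAP_STEP_STATUS);

    /* 2. Only then flip link-status. */
    c->link = WRAP_LINK_TERMINAL;
    wrap_trace(c, WRAP_STEP_LINK);

    /* 3. Finally notify userspace (a real driver would send a uevent). */
    wrap_trace(c, WRAP_STEP_UEVENT);
}
```

Keeping the three steps inside one function is what makes the ordering hard to get wrong in a driver; the sketch leaves out locking entirely.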
Jeremy_Rand_Talos__ has joined #dri-devel
<gildekel> aye aye, Captain :)
<gildekel> We already take a similar approach when we set the link status to bad in i915, especially with the recursive function to set downstream MST ports to BAD as well
<gildekel> so I feel like this would make sense.
<seanpaul_> i assume for the MST case we're going to mark link-status terminal on the base connector and leave everything else as-is?
<gildekel> I think we should mark all MST topology as terminal as well
<gildekel> Userspaces tend to abstract these details
<gildekel> Or, rather, let me speak for ChromeOS
<gildekel> Alternatively, we can mark all downstream ports as BAD, but why...?
<seanpaul_> hmm, but usually if all connectors are disconnected from an MST branch, they are destroyed
<seanpaul_> so that would have the unintended consequence of leaving a fully disconnected MST topology
<gildekel> Does the clean up occur independently of a re-probe? because otherwise, a failed link-training event would change the status of the connectors, and, I would assume, the follow-up uevent will trigger the cleanup
An0num0us has joined #dri-devel
<seanpaul_> > a failed link-training event would change the status of the connectors
<seanpaul_> i don't think it does/would
<gildekel> Well, it doesn't, currently. But it will if my series is accepted.
<gildekel> A part of the change is to modify the state of all downstream MST ports after a failed link-train
<seanpaul_> no, all status would stay the same, base would stay disconnected and sinks would stay connected
<gildekel> Not sure I am following.
<seanpaul_> link training would fail on the base connector
<gildekel> correct, and the new state would be propagated to the downstream ports as well.
Jeremy_Rand_Talos__ has quit [Remote host closed the connection]
<gildekel> that's a part of the work we have to do after the base connector fails.
<seanpaul_> but those connectors have connected status
<seanpaul_> so you would end up with link-status terminal and status_connected
Jeremy_Rand_Talos__ has joined #dri-devel
<seanpaul_> so i think what you want to do is leave the downstream ports as CONNECTED, leave the base connector as DISCONNECTED and just update link-status on the base connector to terminal
<gildekel> I must be misunderstanding you somehow. Here's what I plan to do:
<gildekel> 1) Base connector fails link-training
<gildekel> 2) Modify its status to disconnected, mark it as terminal
<gildekel> 3) If connector is MST, recursively mark its downstream ports to disconnected + terminal
<gildekel> 4) send uevent
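[editor's sketch] The four steps above, as gildekel describes them, amount to a recursive walk over the downstream ports followed by a single notification. The model below is a toy under stated assumptions: the tree layout, MST_MAX_DOWNSTREAM, and all function names are hypothetical, the terminal link state is the proposed one, and the uevent is represented by a counter.

```c
#include <assert.h>
#include <stddef.h>

#define MST_MAX_DOWNSTREAM 4

enum mst_status { MST_CONNECTED, MST_DISCONNECTED };
enum mst_link { MST_LINK_GOOD, MST_LINK_BAD, MST_LINK_TERMINAL /* hypothetical */ };

struct mst_port {
    enum mst_status status;
    enum mst_link link;
    struct mst_port *downstream[MST_MAX_DOWNSTREAM];
    size_t num_downstream;
};

/* Steps 2 and 3: mark this port, then everything below it. */
static void mst_mark_terminal(struct mst_port *port)
{
    assert(port != NULL);
    port->status = MST_DISCONNECTED;
    port->link = MST_LINK_TERMINAL;
    for (size_t i = 0; i < port->num_downstream; i++)
        mst_mark_terminal(port->downstream[i]);
}

/* Steps 1 to 4 for a base connector whose link training failed.
 * *uevents counts notifications; a real driver would send a uevent here. */
static void mst_link_training_failed(struct mst_port *base, unsigned int *uevents)
{
    assert(uevents != NULL);
    mst_mark_terminal(base);
    (*uevents)++; /* one notification after the whole tree is updated */
}
```

Whether downstream ports should be touched at all is exactly what the rest of this conversation disputes, so the sketch shows the plan as stated rather than an agreed design.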
<seanpaul_> re: 2) it's already disconnected
youmukonpaku1337 has quit [Ping timeout: 480 seconds]
<gildekel> It's not.. not at this point..
<gildekel> This is during link-training.. the connector comes in in a "good state" to link-training. Connector is connected with link-status=GOOD by kernel or userspace.
<gildekel> We can sync offline if you want
<seanpaul_> hmm, when does the base connector transition to disconnected in the successful case?
<gildekel> Oh. I see what you mean.
<gildekel> That's a good point. But shouldn't really matter.
<seanpaul_> re: 3) we currently don't support keeping the MST topology alive when all sinks are disconnected, so this would change current behavior
<gildekel> But then we're back to square one. We have connected connectors producing zombie displays..
<gildekel> on MST
<seanpaul_> you would have to inspect the base connector's link-status and do the recursion in userspace
<gildekel> why not just let the connectors die out, and have the base connector signal that?
mauld has quit [Ping timeout: 480 seconds]
<seanpaul_> which, conceptually kind of makes sense b/c the link-status between source & mst branch device is bad, but the link in between branch devices may not be
<gildekel> Ok. So do we at least change the link-status to terminal/bad on the downstream ports?
<gildekel> I would assume we d.
<gildekel> do*
<seanpaul_> i'd say no
donaldrobson has quit [Ping timeout: 480 seconds]
<gildekel> But from this source's point of view, these connectors are unusable.
<gildekel> doesn't matter if another potential source can use them
kzd has quit [Ping timeout: 480 seconds]
sukrutb_ has joined #dri-devel
vliaskov has quit [Remote host closed the connection]
junaid has joined #dri-devel
mszyprow has joined #dri-devel
ngcortes has joined #dri-devel
<zamundaaa[m]> <seanpaul_> "you would have to inspect the..." <- That would not be backwards compatible with current userspace
<gildekel> I believe he was suggesting a solution to proper pruning of the misbehaving connectors in userspace. If left un-handled, then it should produce the same behavior in which the connectors appear to be connected and operational.
<gildekel> ...a solution for* proper pruning...
<zamundaaa[m]> Sure, I guess that would kinda-ish maybe be fine. But it would be quite different from how userspace operates right now
junaid has quit [Remote host closed the connection]
<zamundaaa[m]> by which I mean that most userspace completely ignores that MST is a thing at all. It's abstracted away by the kernel, and all we get as information about it at all is the mst path property
<gildekel> correct, that I agree. That will not change.
<gildekel> The current behavior is that after a failed link-training, all (base) connectors are left connected, marked as link-status=BAD, and no more uevents are issued. So userspace thinks the last modeset succeeded.
<gildekel> This applies to the MST connectors as well
<gildekel> If we modify only the base connector to be in terminal state (as it's already marked disconnected by the MST topology manager), and leave the MST connectors as they are (connected and link-status good), then userspace sees them as always
<gildekel> the difference is that userspaces that wish to parse the new state, can look up the base connector of the MST ports, and see that it's in a bad state, and prune the MST connectors
<gildekel> Otherwise, you'll have your zombie displays, as always, as the MST connectors are marked connected and ready
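[editor's sketch] The userspace-side pruning gildekel outlines (look up the base connector of an MST port and ignore the port if the base is in the terminal state) could look roughly like this. It assumes userspace has already resolved each connector's base into a table index, for instance from the MST path property mentioned above; the table layout, the names, and the terminal value are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum us_link { US_LINK_GOOD, US_LINK_BAD, US_LINK_TERMINAL /* hypothetical */ };

struct us_connector {
    int base;        /* index of the base connector, or -1 if there is none */
    bool connected;
    enum us_link link;
};

/* A connector is worth lighting up if it is connected and neither it nor
 * any connector above it reports the terminal link state. */
static bool us_connector_usable(const struct us_connector *conns, size_t count, size_t idx)
{
    assert(conns != NULL);
    if (idx >= count || !conns[idx].connected)
        return false;

    /* Walk from the connector up through its bases. The hop limit keeps a
     * malformed (cyclic) table from looping forever. */
    size_t cur = idx;
    for (size_t hops = 0; hops < count; hops++) {
        if (conns[cur].link == US_LINK_TERMINAL)
            return false;
        int base = conns[cur].base;
        if (base < 0 || (size_t)base >= count)
            break;
        cur = (size_t)base;
    }
    return true;
}
```

A userspace that never runs this check keeps treating the MST connectors as connected, which matches the unchanged "zombie display" behaviour described in the line above.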
<zamundaaa[m]> I don't think that's a good solution, it's inconsistent between different connectors
<zamundaaa[m]> If you need to keep MST downstream ports as connected for kernel-internal reasons, then keep that mess in the kernel. If you want all userspace to do a recursion on the connectors and mark them as disconnected, then why not do that in the kernel when sending connector information to userspace?
<gildekel> To clarify, I am with you on this. I would rather mark the downstream connectors as disconnected/terminal
<gildekel> but there seems to be more disagreement here. And justly so, because doing that _will_ change current behavior in userspace
mszyprow has quit [Ping timeout: 480 seconds]
tertl8 has joined #dri-devel
<zamundaaa[m]> What behavior would it change? If you're referring to MST connectors being marked as disconnected instead of removed, that's not a change that can break userspace (which isn't already fundamentally broken)
<gildekel> They will not be marked as disconnected, but removed entirely.. since MST support removes the topology entirely if all downstream connectors are marked as disconnected
<gildekel> so you go from having MST connectors showing up, to having no MST connectors
<gildekel> that's the change. Whether or not it's safe or acceptable is a different discussion
<zamundaaa[m]> But that's only in the kernel. I don't particularly care about what you do in there - if you have to keep connectors marked as connected internally, then do that
sarahwalker has quit [Remote host closed the connection]
<zamundaaa[m]> What I'm saying is that when userspace calls drmModeGetConnector, you should - inside the kernel - check if link-status is terminal, and set the connection status to disconnected for userspace
<zamundaaa[m]> Then you can simply mark all the affected MST connectors as connected + link-status=terminal inside the kernel, and userspace will see disconnected + link-status=terminal
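[editor's sketch] zamundaaa[m]'s suggestion is a translation at the point where connector state is reported to userspace: the internal status stays whatever the kernel needs, and a terminal link-status turns the reported status into disconnected. A minimal model, with hypothetical names and the proposed terminal value, none of which are taken from the kernel:

```c
enum rep_status { REP_CONNECTED, REP_DISCONNECTED, REP_UNKNOWN };
enum rep_link { REP_LINK_GOOD, REP_LINK_BAD, REP_LINK_TERMINAL /* hypothetical */ };

/* Status to report to userspace for a connector whose internal
 * bookkeeping says `internal` and whose link-status is `link`. */
static enum rep_status rep_status_for_userspace(enum rep_status internal, enum rep_link link)
{
    /* A terminally failed link is always presented as disconnected,
     * even if the connector is kept connected internally. */
    if (link == REP_LINK_TERMINAL)
        return REP_DISCONNECTED;
    return internal;
}
```

The appeal of this shape is that every connector behaves the same from userspace's point of view, regardless of its role in the MST topology.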
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
Guest2029 has quit [Ping timeout: 480 seconds]
gouchi has joined #dri-devel
lynxeye has quit [Quit: Leaving.]
Haaninjo has joined #dri-devel
mauld has joined #dri-devel
<gildekel> That's a possibility, but sounds like debugging hell
ngcortes has quit [Ping timeout: 480 seconds]
oneforall2 has quit [Ping timeout: 480 seconds]
mszyprow has joined #dri-devel
bmodem has quit [Ping timeout: 480 seconds]
oneforall2 has joined #dri-devel
Mangix has quit [Read error: Connection reset by peer]
Mangix has joined #dri-devel
guru_ has joined #dri-devel
sima has quit [Ping timeout: 480 seconds]
Danct12 has quit [Remote host closed the connection]
neniagh has quit []
oneforall2 has quit [Ping timeout: 480 seconds]
milek7 has joined #dri-devel
Danct12 has joined #dri-devel
Mangix has quit [Ping timeout: 480 seconds]
Mangix has joined #dri-devel
<Kayden> hitting some test failures with an MR on virpipe. anyone know how to reproduce those? looks like something with virgl_test_server and llvmpipe?
neniagh has joined #dri-devel
<Kayden> hm I guess I just run the server and then LIBGL_ALWAYS_SOFTWARE=1 GALLIUM_DRIVER=virpipe program, huh
tintou has joined #dri-devel
<tintou> yes
<Kayden> tintou: thanks!
<tintou> To run the test server as in the CI, you can look at https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/.gitlab-ci/deqp-runner.sh?ref_type=heads#L137 to give the right arguments and environment variables
<Kayden> how do I get it building virgl_test_server? -Dgallium-drivers=virgl isn't sufficient it seems
<tintou> that's in the virglrenderer project https://gitlab.freedesktop.org/virgl/virglrenderer
<Kayden> oh, or that's...not part of mesa, got it
fab has quit [Quit: fab]
mszyprow has quit [Ping timeout: 480 seconds]
<Kayden> virglrenderer is what translates the shaders?
<Kayden> looks like it, yep
<Kayden> ah, from TGSI, so there's ntt too
kzd has joined #dri-devel
<Kayden> now to figure out if this is a virglrenderer bug or a mesa glsl language frontend bug
oneforall2 has joined #dri-devel
guru_ has quit [Ping timeout: 480 seconds]
Duke`` has quit [Ping timeout: 480 seconds]
<Kayden> arg, this is confusing.
<Kayden> so on my barrier optimization MR, I'm getting rid of unnecessary barrier modes
<Kayden> which causes virglrenderer to emit memoryBarrierBuffer() and memoryBarrierAtomicCounter() instead of the full memoryBarrier(). sensible
<Kayden> but it's doing #version 140 #extension GL_ARB_shader_storage_buffer_object : require
oneforall2 has quit [Remote host closed the connection]
<Kayden> mesa only provides those functions in GLSL 4.30 or ESSL 3.10 or if #extension GL_ARB_compute_shader : enable
oneforall2 has joined #dri-devel
<Kayden> GL_ARB_compute_shader, however, does not mention a #extension directive
<airlied> I assume that means all compute shaders should have them
<Kayden> right. however, this is a vertex shader
<airlied> okay that seems like a problem then :-)
<Kayden> The functions memoryBarrierShared() and groupMemoryBarrier() are available only in compute shaders; the other functions are available in all shader types.
<Kayden> so they should be there. but when
<Kayden> there's some thought that maybe it's just adding them as an interaction with the other specs...except...memoryBarrier() is provided by ARB_shader_image_load_store, not ARB_shader_storage_buffer_object
<airlied> I'd probably just fix virgl on the mesa side by sending more barrier modes than necessary and file a bug to get virglrenderer fixed
<Kayden> I guess virglrenderer ought to be doing an #extension GL_ARB_compute_shader : require when using memory barriers?
<Kayden> yeah, was going to say, not sure how to synchronize changes in the two projects
<Kayden> so right now I'm just enabling the pass for all drivers in st/glsl
<airlied> I don't think the GL_ARB_compute_shader makes sense at all in mes
<airlied> mesa
<Kayden> oh?
mauld has quit [Ping timeout: 480 seconds]
<airlied> since it doesn't look specified
<Kayden> yeah.
<airlied> ssbo spec says "Additionally, the shading language provides the memoryBarrier() function to control the relative order of memory accesses within individual shader invocations and provides various memory qualifiers controlling how the memory corresponding to individual variables is accessed.
<airlied> "
<airlied> then never mentions it again
<airlied> it's only then mentioned in the image one, which seems like what virglrenderer should be emitting then
<Kayden> yeah, that's where the original memoryBarrier is defined
<Kayden> makes me think that mesa ought to be adding the memoryBarrierShared/AtomicCounter/Image/Buffer() variants if ARB_compute_shader is supported at all, without an #extension directive, whenever memoryBarrier() is available
<Kayden> ah
<Kayden> right, okay, so here's the thing, I guess
<Kayden> it wasn't possible to create these situations outside of compute shaders before my pass.
<Kayden> because you didn't have memoryBarrier*() subvariants outside of compute shaders, or without turning them on yourself somehow
<Kayden> hm. I guess you could turn them on and do it though
<Kayden> vrend_shader.c does do ctx->shader_req_bits |= SHADER_REQ_IMAGE_LOAD_STORE; for the full barrier but nothing for the others
ngcortes has joined #dri-devel
<zmike> is that actually affecting anything? the scopes below should be all the filtering needed
rasterman has quit [Quit: Gettin' stinky!]
<Kayden> yeah, dEQP-GLES31.functional.image_load_store.2d.qualifiers.volatile_r32f was failing on zink+lavapipe without the fix, with https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24842/
idr has joined #dri-devel
<Kayden> was getting these barriers: http://whitecape.org/paste/lvp-diff.txt
<zmike> 🤔
<zmike> where are the NONE scope barriers coming from?
<Kayden> those are just memoryBarrier() I believe
<Kayden> yeah, test has memoryBarrier(); barrier();
<Kayden> in a compute shader
<Kayden> there's no SSBO or shared or global access, only images
<zmike> ah okay so it's a OpMemoryBarrier with scope=NONE?
<Kayden> but the barrier() still ought to synchronize the invocations I think
<Kayden> well, it's GLSL
<zmike> seems legit
<zmike> take my rb
<zamundaaa[m]> gildekel: you could always add a CAP that exposes the "real" state to userspace. Please just don't make connectors behave differently depending on their role in the MST topology
<Kayden> zmike: thank you!
gouchi has quit [Remote host closed the connection]
<Kayden> robclark: I'm hitting a freedreno failure in https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24842 (barrier mode optimizations) in piglit's spec/arb_compute_shader/execution/simple-barrier-atomics. not sure how best to debug this since I don't have an a6xx handy
<Kayden> guessing I deleted some barriers that you were relying on
<Kayden> or the atomic counters aren't being accessed via derefs when nir_opt_barrier_modes() is called so it's failing to see them. the place I call the pass seems to be working for other drivers though
mauld has joined #dri-devel
<airlied> Kayden: I'm a bit worried about emitting nir barriers on shader stages that haven't been traditionally used to seeing them
<airlied> though it's likely the problem is going to be mostly sw/virgl where we don't really have specified paths
<Kayden> airlied: seems like other than the extension directive thing, it ought to work
<Kayden> I guess I could make the pass optional via a nir_shader_compiler_options flag
<robclark> Kayden: I'll look in a few.. but you could probably use drm-shim... also I'm a big fan of asserts when it comes to what form a shader is in (since otherwise it is pretty much a mess knowing the right time and order for passes)
<idr> I just read issue 13 in the ARB_compute_shader spec, now I'm re-reading the IRC log...
<Kayden> robclark: oh can I? that'd be awesome
<robclark> well, assuming diffing nir_print and/or IR3_SHADER_DEBUG=disasm output is enough to spot the issue..
<Kayden> probably would be
<idr> Kayden: Based on my reading of issue 13, I think the "bug" is in virgl, but it occurs because it's getting something it doesn't expect.
<idr> Outside of a compute shader, a memory barrier of any sort should get turned into memoryBarrier().
mszyprow has joined #dri-devel
<Kayden> idr: why?
<Kayden> oh, you mean, when virgl translates back to GLSL.
<Kayden> because the sub-functions don't exist there
<Kayden> right
<idr> Right.
<idr> I was going to double-check what GLSL 4.30 says. That may affect things too.
<idr> If it's converting to GLSL 4.30, then... maybe.
<Kayden> it's using 1.40 + #extensions
<idr> I looked at the 4.60 spec because it's what was handy... memoryBarrierAtomicCounter, memoryBarrierBuffer, and memoryBarrierImage exist in all stages there, so I assume that's 4.30+ behavior.
<Kayden> robclark: thanks for the drm-shim pointer, I can run the shaders now and I see what's going on :)
<robclark> \o/
<Kayden> idr: Yeah. I think that's the intention
<Kayden> idr: so it's really just the undefined mess of #extension enabling in pre-4.30
An0num0us has quit [Ping timeout: 480 seconds]
<idr> Which I interpret as "those new 4.30 functions only exist in compute shaders."
<Kayden> *nods*
<airlied> yeah I think just force virgl to emit full barriers on non-compute might be the best plan
<Kayden> yeah
<airlied> we can't fix virglrenderer to fix this bug, it would need feature flags etc and new mesa needs to run on old virglrenderer
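[editor's sketch] The rule idr and airlied converge on (outside compute shaders, any memory barrier becomes the full memoryBarrier(); the narrower functions are only used in compute) can be expressed as a small selection function. This assumes pre-4.30 GLSL output as in the #version 140 case above; the enums and function names are hypothetical and are not taken from mesa or virglrenderer. Only the GLSL function names come from the discussion.

```c
#include <stdbool.h>

enum toy_barrier_mode {
    TOY_BARRIER_BUFFER,
    TOY_BARRIER_ATOMIC_COUNTER,
    TOY_BARRIER_IMAGE,
    TOY_BARRIER_SHARED,
};

enum toy_glsl_barrier {
    TOY_GLSL_MEMORY_BARRIER,
    TOY_GLSL_MEMORY_BARRIER_BUFFER,
    TOY_GLSL_MEMORY_BARRIER_ATOMIC_COUNTER,
    TOY_GLSL_MEMORY_BARRIER_IMAGE,
    TOY_GLSL_MEMORY_BARRIER_SHARED,
};

/* Pick the GLSL barrier to emit for one barrier mode. */
static enum toy_glsl_barrier toy_pick_barrier(bool is_compute, enum toy_barrier_mode mode)
{
    /* Non-compute stages: the narrower variants may not be declared,
     * so always widen to the full barrier. */
    if (!is_compute)
        return TOY_GLSL_MEMORY_BARRIER;

    switch (mode) {
    case TOY_BARRIER_BUFFER:
        return TOY_GLSL_MEMORY_BARRIER_BUFFER;
    case TOY_BARRIER_ATOMIC_COUNTER:
        return TOY_GLSL_MEMORY_BARRIER_ATOMIC_COUNTER;
    case TOY_BARRIER_IMAGE:
        return TOY_GLSL_MEMORY_BARRIER_IMAGE;
    case TOY_BARRIER_SHARED:
        return TOY_GLSL_MEMORY_BARRIER_SHARED;
    }
    return TOY_GLSL_MEMORY_BARRIER; /* unknown mode: widen */
}

/* GLSL spelling of each choice. */
static const char *toy_glsl_barrier_name(enum toy_glsl_barrier barrier)
{
    switch (barrier) {
    case TOY_GLSL_MEMORY_BARRIER_BUFFER:
        return "memoryBarrierBuffer()";
    case TOY_GLSL_MEMORY_BARRIER_ATOMIC_COUNTER:
        return "memoryBarrierAtomicCounter()";
    case TOY_GLSL_MEMORY_BARRIER_IMAGE:
        return "memoryBarrierImage()";
    case TOY_GLSL_MEMORY_BARRIER_SHARED:
        return "memoryBarrierShared()";
    case TOY_GLSL_MEMORY_BARRIER:
        break;
    }
    return "memoryBarrier()";
}
```

Widening is always safe for correctness since the full barrier covers every narrower one; the cost is only the lost optimization on non-compute stages. As noted earlier in the log, memoryBarrier() itself comes from ARB_shader_image_load_store, so the emitter still has to request that.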
<airlied> anyone know about the v3d cpu tasks and why they aren't just compute shaders?
alyssa has joined #dri-devel
<alyssa> idr: IIRC when I reworked barriers, I noticed there was virglrenderer brokenness worked around in nir-to-tgsi
<alyssa> I have paged out all details but it's worth looking in ntt
<Kayden> alyssa: yeah, I could either hack around it in nir_to_tgsi, or in a virgl renderer pass to put them back, or just add a nir_shader_compiler_options flag to avoid calling my pass on non-compute for virgl
<alyssa> Kayden: what i meant is, I think there's already brokenness here
<alyssa> may or may not be my fault
<alyssa> may or may not affect your pass
<Kayden> okay :)
<Kayden> apparently my atomic counter handling is busted, too. glsl_to_nir uses nir_var_mem_ssbo as the mode for those. but the nir_deref_var is nir_var_uniform with glsl_type atomic_uint
<Kayden> and my pass is currently running before nir_lower_atomics_to_ssbo
a-865 has quit [Ping timeout: 480 seconds]
itsmeluigi has quit [Quit: Konversation terminated!]
youmukonpaku1337 has joined #dri-devel
youmukonpaku1337 has quit []
youmukonpaku1337 has joined #dri-devel
crabbedhaloablut has quit []
<alyssa> i kinda want to delete nir atomic counters
<Kayden> would be a fan
<Kayden> I guess it would impact r600
<idr> I think that's the only hardware that ever did anything special for atomic counters. :(
<Kayden> yeah
<idr> I have some vague recollection of warts on the spec because of that.
youmukon1 has joined #dri-devel
<idr> It wasn't us for once!
youmukonpaku1337 has quit []
<alyssa> idr: lol
<alyssa> idr: I can blame your employer for spending months of my life on geometry shaders though, right?
youmukon1 has quit []
youmukonpaku1337 has joined #dri-devel
<alyssa> (-:
a-865 has joined #dri-devel
<idr> Geometry shaders were for sure a group effort.
<idr> Honestly... I'd blame Microsoft for requiring GS and fp64.
<idr> Khronos was just keeping up with the Joneses.
<airlied> the fp64 requirement was one of dumbest self owns in graphics :-P
<airlied> pretty sure dx never actually required it, GL just had to one up them
<karolherbst> yeah.. who actually thought that was a good idea
<karolherbst> though without it, nobody would have implemented fp64 probably...
<karolherbst> or it would have been some enterprise only feature only exposed on 5 gpus
<airlied> which is exactly how it should have been :-)
<karolherbst> :D
<idr> MS must have required it, or it never would have ended up in Intel GPUs.
<Kayden> hmm, FEATURE_LEVEL_11 with D3D11_FEATURE_DOUBLES
* Kayden not adept at cross-referencing MS docs yet
<Kayden> but yeah unfortunate for sure
<alyssa> there was a brief, terrible time when Mali GPUs had native hardware for fp64
<alyssa> not just fp64
<alyssa> fp64vec2!
<alyssa> So you can add two doubles in one instruction! :-D
youmukon1 has joined #dri-devel
youmukon1 has quit []
<airlied> Kayden: yes so it was optional in d3d11, gl4.0 should have just left it as an extension
<airlied> or adopted some sort of capabilities for exts
youmukon1 has joined #dri-devel
youmukonpaku1337 has quit [Read error: Connection reset by peer]
mszyprow has quit [Ping timeout: 480 seconds]
<karolherbst> alyssa: but why...
<alyssa> karolherbst: the same genius 128-bit ALU that brought you native vec16 instructions
<karolherbst> I mean.. I kinda see the point of a 128 bit alu, but native fp64?
xypron has quit [Remote host closed the connection]
<karolherbst> I wonder if I finally have a CTS run on v3d without crashing the GPU
youmukon1 has quit [Read error: Connection reset by peer]
youmukonpaku1337 has joined #dri-devel
youmukon1 has joined #dri-devel
youmukonpaku1337 has quit [Read error: Connection reset by peer]
columbarius has joined #dri-devel
co1umbarius has quit [Ping timeout: 480 seconds]
<Kayden> huh, this is new
<Kayden> went to go test my updated MR !24842 to make sure I fixed the virpipe and freedreno regressions, and... https://gitlab.freedesktop.org/mesa/mesa/-/pipelines/979688 says "You are not authorized to run this manual job" on any of the container steps. I used to click the play button there to cause it to cascade through the other jobs and actually test stuff
<Kayden> did something change? am I supposed to be doing this differently now?
<Kayden> oh.
<Kayden> wasn't signed in, heh
<Kayden> (signed in twice today but various browser tabs are having...problems. *shrug*)
simon-perretta-img_ has joined #dri-devel
simon-perretta-img has quit [Ping timeout: 480 seconds]
YuGiOhJCJ has joined #dri-devel
<DemiMarie> Could one implement Venus on top of WebGL?
<DemiMarie> airlied: hard disagree on fp64, you kind of need it for many scientific computing workloads
<airlied> exactly so we don't need it :-)
<airlied> like intel even dropped it from their hw going forward
<HdkR> I like the NVIDIA approach. One fp64 pipeline per SM for compatibility, if you need real perf then buy the compute focused cards :D
simon-perretta-img_ has quit [Ping timeout: 480 seconds]
alanc has quit [Remote host closed the connection]
Haaninjo has quit [Quit: Ex-Chat]
heat has joined #dri-devel
<DemiMarie> airlied HdkR: who is the “we” in “so we don’t need it”?
<airlied> 99% of graphics/compute developers, the GL4 API mandating it etc
<airlied> like it was fine as an extension for specialist hardware like dx11 did it
<karolherbst> fp64 is unusable on any desktop GPU anyway
youmukonpaku1337 has joined #dri-devel
youmukon1 has quit [Ping timeout: 480 seconds]
kts has quit [Ping timeout: 480 seconds]