ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
<Kayden> I'd say we could move the problem up a level and see all slices, but the fundamental API call is glCompressedTexImage2D
<Kayden> which only supplies a single miplevel
<Kayden> so we just can't know
nchery is now known as Guest2808
nchery has joined #dri-devel
<Kayden> I guess if we wanted to do something out of spec, we could probably assume that all miplevels are likely to either have alpha, or not have alpha, since each miplevel is likely to be minified versions of the same thing. hopefully they would supply either level 0 first or a similarly not-miptail level which still has some resolution
<Kayden> alternatively we could do DXT1 first and then realloc and do surgery to expand to DXT5 the moment we see alpha
Guest2808 has quit [Ping timeout: 480 seconds]
<nchery> Kayden: ah right..
<anholt> anyone else have a driver that can automatically have implicit derivs choose LOD 0 in the VS/CS? https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16156
<Kayden> we could also play tricks with the cache, and always transcode to BC7/DXT5 first, but save whether the texture had alpha at any point. next load of the image we could hit the cache and say "y'know what, let's reencode it a second time to DXT1"
<nchery> we'd need to group all miplevels belonging to a single texture in the cache to know if that's safe
<Kayden> yep
columbarius has joined #dri-devel
<nchery> how will the cache know it's seen all the miplevels of a texture?
co1umbarius has quit [Ping timeout: 480 seconds]
lanodan has joined #dri-devel
<Kayden> it seems like we'll have to do something in st_finalize_texture
icecream95 has joined #dri-devel
iive has quit [Ping timeout: 480 seconds]
camus1 has joined #dri-devel
camus has quit [Read error: Connection reset by peer]
<graphitemaster> jenatali, why would you say that, any particular reason why?
<jenatali> Time spent optimizing. Zink had had tc for years before I got around to hooking it up
lyudess has joined #dri-devel
bbrezill1 has joined #dri-devel
<graphitemaster> very interesting, it's my understanding microsoft plans to ship gl on d3d12 as the de facto gl driver at some point, where i suppose everyone but nv can forget about gl
<graphitemaster> would be kind of funny if zink is somehow better at that task
Lyude has quit [Ping timeout: 480 seconds]
rkanwal has quit [Ping timeout: 480 seconds]
ngcortes has joined #dri-devel
LexSfX has quit [Ping timeout: 480 seconds]
heat has quit [Remote host closed the connection]
LexSfX has joined #dri-devel
slattann has joined #dri-devel
Company has quit [Quit: Leaving]
zf has joined #dri-devel
FireBurn has joined #dri-devel
lemonzest has quit [Quit: WeeChat 3.4]
<jenatali> After reaching viability (gl3.3), GLon12 has been pretty much just a portion of my time. It'll get there eventually
mhenning has quit [Quit: mhenning]
ngcortes has quit [Remote host closed the connection]
mhenning has joined #dri-devel
mhenning has quit [Quit: mhenning]
sdutt has joined #dri-devel
slattann has quit []
Wally has joined #dri-devel
Duke`` has joined #dri-devel
<Wally> Where would vkCreateSwapchainKHR be defined?
<HdkR> `vulkan/vulkan_core.h` ?
<Wally> rephrase, how would wsi add functionality to vkCreateSwapchainKHR
<HdkR> Extending what the function accepts in the VkSwapchainCreateInfoKHR* argument
<HdkR> Specifically pNext to another struct type
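A minimal sketch of the pNext mechanism HdkR describes, assuming stand-in types: every `demo_*` name below is hypothetical and only mirrors the shape of the `sType`/`pNext` header that Vulkan structs share. It is not the Mesa WSI code. The entry point reads the core create-info, then walks the chain looking for an extension struct it recognizes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for Vulkan's structure-type enum and base header;
 * real code would use VkStructureType and VkBaseInStructure from
 * vulkan/vulkan_core.h. */
enum demo_stype {
   DEMO_STYPE_SWAPCHAIN_CREATE_INFO = 1,
   DEMO_STYPE_EXTRA_INFO = 2,
};

struct demo_base_in {
   enum demo_stype sType;
   const void *pNext;
};

/* Core create-info: starts with the same two fields as demo_base_in. */
struct demo_swapchain_create_info {
   enum demo_stype sType;
   const void *pNext;
   unsigned min_image_count;
};

/* Hypothetical extension struct that a caller chains through pNext. */
struct demo_extra_info {
   enum demo_stype sType;
   const void *pNext;
   int flag;
};

/* Walk the chain and return the first struct of the wanted type, or NULL. */
static const void *
demo_find_in_chain(const void *start, enum demo_stype wanted)
{
   for (const struct demo_base_in *s = start; s != NULL; s = s->pNext) {
      if (s->sType == wanted)
         return s;
   }
   return NULL;
}

/* What an entry point could do: check the core struct, then look for extras. */
static int
demo_create_swapchain(const struct demo_swapchain_create_info *info)
{
   assert(info->sType == DEMO_STYPE_SWAPCHAIN_CREATE_INFO);
   const struct demo_extra_info *extra =
      demo_find_in_chain(info->pNext, DEMO_STYPE_EXTRA_INFO);
   return extra ? extra->flag : -1; /* -1: extension struct not supplied */
}
```

The function signature never changes; new behaviour is opted into by chaining another struct, and callers that chain nothing get the default path.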
<Sachiel> generally, you can grep for the function names without vk and you should find the entry points for each driver/common code
<Wally> Sachiel: was doing that, grepped for the wrong things :(
tzimmermann has joined #dri-devel
jewins has quit [Ping timeout: 480 seconds]
itoral has joined #dri-devel
slattann has joined #dri-devel
<airlied> karolherbst: yeah radeonsi support is going to be a bit of a pita no matter what
<airlied> I can get radeonsi/llvm backend to work without images if jekstrand relents on letting some horror patches in
<airlied> or I can try and move the big rock that is adding aco support to radeonsi in a semi-clean fashion at least for compute shaders
shankaru has joined #dri-devel
itoral has quit [Remote host closed the connection]
<Wally> Sorry to bother you guys again, how did wsi_swapchain_to_handle(wsi_swapchain) get generated as found in wsi_common.c? I couldn't grep for it and Elixir didn't show it (I'm guessing it's generated by a script)
Duke`` has quit [Ping timeout: 480 seconds]
<airlied> VK_DEFINE_NONDISP_HANDLE_CASTS(wsi_swapchain, base, VkSwapchainKHR, VK_OBJECT_TYPE_SWAPCHAIN_KHR)
<ishitatsuyuki> if you own a license of CLion, it should be able to navigate to it
<ishitatsuyuki> probably VSCode too since most of the navigation is handled by libclang anyway
mvlad has joined #dri-devel
<Wally> airlied: Sorry, but that didn't help me find a function declaration or definition for wsi_swapchain_to_handle :(
<ishitatsuyuki> Wally: You should see the definition of the macro
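A rough illustration of why grep comes up empty, assuming a simplified macro: `DEMO_DEFINE_HANDLE_CASTS` and the `demo_*` types are hypothetical stand-ins, not the real `VK_DEFINE_NONDISP_HANDLE_CASTS` (which takes more arguments). The preprocessor pastes the type name into the function names, so the full identifier never appears literally in the source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical handle and object types standing in for VkSwapchainKHR and
 * struct wsi_swapchain. */
typedef uint64_t demo_handle;

struct demo_swapchain {
   int image_count;
};

/* Simplified stand-in for a cast-generating macro: the ## operator pastes
 * the type name into the function names, so "demo_swapchain_to_handle" is
 * only ever spelled out by the preprocessor. */
#define DEMO_DEFINE_HANDLE_CASTS(type, handle_type)                  \
   static inline struct type *                                       \
   type##_from_handle(handle_type h)                                 \
   {                                                                 \
      return (struct type *)(uintptr_t)h;                            \
   }                                                                 \
   static inline handle_type                                         \
   type##_to_handle(struct type *obj)                                \
   {                                                                 \
      return (handle_type)(uintptr_t)obj;                            \
   }

DEMO_DEFINE_HANDLE_CASTS(demo_swapchain, demo_handle)

/* Round-trip helper: convert to a handle and back, then read a field. */
static inline int
demo_roundtrip_image_count(struct demo_swapchain *chain)
{
   demo_handle h = demo_swapchain_to_handle(chain);
   assert(h != 0);
   return demo_swapchain_from_handle(h)->image_count;
}
```

Searching for the suffix (`_to_handle`) or for the macro invocation that names the type is usually what turns up the definition.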
eukara has joined #dri-devel
<ishitatsuyuki> Also, if possible, clone the source and index it with an IDE on your computer
<Wally> ishitatsuyuki: Sorry, I don't use IDEs much, how do I generate an index in VS Code?
<ishitatsuyuki> I use compilation database
<ishitatsuyuki> Basically you need to generate a build directory with `meson build`, then you point VSCode to the compile_commands.json generated in build dir
<ishitatsuyuki> you might want to disable LLVM support when configuring or it pulls in a bunch of annoying to set up dependencies
<ishitatsuyuki> llvm can be disabled with `-Dllvm=false` passed to meson
<Wally> ishitatsuyuki: How do I point VSCode to the compile_commands.json generated in build dir?
<ishitatsuyuki> Wally: the canonical way is to modify c_cpp_properties.json, the extension probably has some configuration GUI too but I forgot how to open it
itoral has joined #dri-devel
<Wally> ide moment
vyivel has quit [Read error: Connection reset by peer]
vyivel has joined #dri-devel
lemonzest has joined #dri-devel
<ishitatsuyuki> kinda annoying, yeah, but once you get it to work it's indispensable
frieder has joined #dri-devel
<Wally> ishitatsuyuki: I had to boot up weston for VsCode you meanie!
<Wally> *weeps*
<ishitatsuyuki> huh, just use X
<Wally> I just use tty
<Wally> X is bloat
<Sachiel> vim + ALE + ccls
<Wally> Sachiel: emacs + mpv + ani-cli
<Sachiel> I guess there must be something to use lsp with emacs too
danvet has joined #dri-devel
<ishitatsuyuki> the problem with C/C++ is that the open source LSP options are less than ideal
<Sachiel> ccls works really well for me
<Wally> Sachiel: Christ Community Lutheran School?
<HdkR> nvim > vim
<ishitatsuyuki> bruh just google it
<Wally> sed + nano > vim
<ishitatsuyuki> ok, enough hot takes, this is getting a bit boring
<Wally> ishitatsuyuki: it's getting boring because we had the exact same irc messages every week for the last 40+ years
shankaru has quit [Quit: Leaving.]
<ishitatsuyuki> so?
<Wally> I see your point
frieder has quit [Remote host closed the connection]
<ishitatsuyuki> ok
shankaru has joined #dri-devel
<Wally> ishitatsuyuki: Was that for the proprietary C++ extension?
<ishitatsuyuki> yes
<Wally> derp, was using Vs Codium
<ishitatsuyuki> it's up to you if you want to set up it using some OSS alternative but I have no idea how those are configured
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
<Wally> Welp, seems like a bunch of configuration :( Not what I want to do; I realized that I don't need to care about that so...I won't...sorry for taking your time
Wally has quit [Remote host closed the connection]
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
tobiasjakobi has joined #dri-devel
tobiasjakobi has quit []
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
tursulin has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
icecream95 has quit [Ping timeout: 480 seconds]
Daanct12 has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
iive has joined #dri-devel
thellstrom has joined #dri-devel
nchery is now known as Guest2841
nchery has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
thellstrom has quit [Ping timeout: 480 seconds]
Guest2841 has quit [Ping timeout: 480 seconds]
jkrzyszt has joined #dri-devel
zorowk has joined #dri-devel
zorowk has quit []
lynxeye has joined #dri-devel
rasterman has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
<karolherbst> airlied: yeah.. might make sense to use it for compute, or only CL or something
lemonzest has quit [Remote host closed the connection]
YuGiOhJCJ has joined #dri-devel
<airlied> karolherbst: even that is quite a mountain of work
<airlied> my initial hacks were enough to run a very simple CL kernel, but going further will need a lot of refactoring
sdutt has quit [Ping timeout: 480 seconds]
hikiko has joined #dri-devel
<karolherbst> airlied: what's the biggest issue? Just something gallium expects, or something else?
rkanwal has joined #dri-devel
rasterman has quit [Quit: Gettin' stinky!]
tzimmermann has quit [Remote host closed the connection]
rasterman has joined #dri-devel
columbarius has quit [Remote host closed the connection]
tzimmermann has joined #dri-devel
tzimmermann has quit [Remote host closed the connection]
<airlied> karolherbst: for non aco llvm has an abi to launch compute
<karolherbst> airlied: ohh.. and radeonsi depends on abi?
<karolherbst> mhhhh
Company has joined #dri-devel
<karolherbst> well.. that's all very annoying.. guess we have to keep clover around for r600/radeonsi then :(
aravind has joined #dri-devel
<karolherbst> soo.. let's run rusticl on top of nouveau and see what happens
tzimmermann has joined #dri-devel
<karolherbst> EMBEDDED_PROFILE yeah well.. need to work out more images, but okay, at least it loads :)
<karolherbst> guess that didn't work out so well
columbarius has joined #dri-devel
itoral has quit [Remote host closed the connection]
YuGiOhJCJ has quit [Remote host closed the connection]
YuGiOhJCJ has joined #dri-devel
maxzor has joined #dri-devel
ella-0 has joined #dri-devel
ella-0_ has quit [Read error: Connection reset by peer]
kchibisov has quit [Read error: No route to host]
kchibisov has joined #dri-devel
tzimmermann has quit [Remote host closed the connection]
tzimmermann has joined #dri-devel
adjtm has quit [Quit: Leaving]
Daanct12 has quit [Remote host closed the connection]
reduz has joined #dri-devel
rkanwal has quit [Quit: rkanwal]
rkanwal has joined #dri-devel
<alyssa> airlied: ACO+rusticl the future is now? :-p
thellstrom has joined #dri-devel
<karolherbst> ehhh.. libdrm_nouveau is so broken :( honestly, I am soo close to just write a new driver from scratch
<karolherbst> (before rusticl people would see it as an empty threat, an expression of the highest degree of annoyance, but now people have to wonder: is karolherbst serious? Is karolherbst joking? Let's see in 4 weeks)
YuGiOhJCJ has quit [Quit: YuGiOhJCJ]
<alyssa> Lol
shankaru has quit [Quit: Leaving.]
gawin has joined #dri-devel
heat has joined #dri-devel
shankaru has joined #dri-devel
* tomeu sets the stopwatch
<bcheng> Other than carefully seeing what atomic properties are being set, are there any tricks to figure out why drmModeAtomicCommit returns EINVAL?
<pq> bcheng, drm.debug kernel module option set to a suitable mask should tell in the logs. You can change that via sysfs.
<krh> karolherbst: we always wonder if you're joking :)
sdutt has joined #dri-devel
sdutt has quit []
sdutt has joined #dri-devel
rasterman- has joined #dri-devel
rasterman has quit [Read error: Connection reset by peer]
shankaru has quit []
nchery has quit [Read error: Connection reset by peer]
<danylo> I have a weird issue with non-immediate present mode (on Turnip): I have a board without any display attached, any non-immediate mode causes vk apps to run at 1fps due to `xcb_wait_for_special_event` taking ~1000ms to complete. I wouldn't care if I didn't have to run vk traces in non-immediate mode...
<pq> That sounds like the fallback 1 Hz timer in Xorg which makes sure that completely off-screen apps do not get completely stuck when they want to throttle to refresh rate.
<danylo> Huh, sounds plausible, what could I do with it?
<pq> I dunno... plug in a monitor, or don't throttle to refresh rate? Or stop using Xorg.
<danylo> I cannot plug in a monitor and cannot stop using Xorg =)
<danylo> So it leaves me with not throttling the framerate
<pq> Is using Xorg specifically a requirement, or do you only need any X11 server?
<MrCooper> danylo: MESA_VK_WSI_PRESENT_MODE=immediate
<danylo> MrCooper: that helps, unless gfxreconstruct trace HAS to use non-immediate mode
<MrCooper> has to why?
<danylo> pq: that board is purely for dev purposes, so it's not a requirement, it's just last time I tried using wayland - it wasn't a good experience
<MrCooper> you'd likely hit the same issue with Xwayland as well
<pq> even with a specifically headless Wayland compositor?
<MrCooper> it depends on how the compositor drives its frame events
<danylo> MrCooper: I recorded a trace on Android, where the game seems to be bent on using non-immediate mode. Replaying the trace with forced imm mode causes hangs.
<danylo> But on second thought, is it ok that changing present mode leads to a hang...
<MrCooper> have you tried MESA_VK_WSI_PRESENT_MODE=relaxed instead?
alyssa has left #dri-devel [#dri-devel]
<danylo> with relaxed - still 1fps
<MrCooper> was afraid so
<flto> danylo: have you tried weston with headless backend, --use-gl and xwayland? (if the problem is not having a display, this usually works)
<danylo> flto: I did not. One of the issues with wayland for me was vnc access, or anything comparable
<daniels> rdp has been supported out of the box for quite a while on Weston; Mutter supports at least RDP and possibly also VNC
<karolherbst> uhhh... function calling... :(
<pq> weston/rdp is software-rendered only though, for now. weston/headless has a 16 ms frame cycle and would allow hw rendering.
<pq> danylo, do you need both remote access and headless operation simultaneously?
thellstrom has quit [Ping timeout: 480 seconds]
<pq> hmm, weston/headless + remoting plugin... might get you at least video out, but no input.
<danylo> yes, I have to see what renders there
<karolherbst> maybe I should fix nir passes first, might be a good first step
<pq> oh wait, remoting plugin requires drm-backend, meh
<pq> weston/drm + remoting...? maybe for the adventurous
<danylo> =)
<pq> drm-backend should be fine without connected monitors, and remoting plugin creates a virtual output you can "watch" with gst
<pq> setting that up might be a little bit magic, but I think the runes were somewhere
<MrCooper> pq: will drm-backend send frame events without connected monitors though?
<pq> MrCooper, the virtual output created by remoting plugin will.
<pq> I think
<pq> well, it wouldn't work otherwise, so
<MrCooper> yeah, makes sense
<pq> I may have a weston.ini somewhere doing the sending, if anyone's interested
<jekstrand> dj-death: Sorry this silly RT MR is taking so many rounds of review. I'm getting happier and happier with it with every round but I keep finding things.
jewins has joined #dri-devel
<jekstrand> mattst88, cmarcelo: Would one of you care to review https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16060 quick?
<jekstrand> It fixes bugs for Google.
<bnieuwenhuizen> jekstrand: done
<karolherbst> jekstrand: mhh.. so nir_shader_gather_info requires inlined functions, but is there actually a good reason for that?
<karolherbst> I assume we could just follow the calltree starting from the entry point
<jekstrand> karolherbst: If you have multiple entrypoints, it'd be problematic, but with a single entrypoint, just crawl all the functions.
Haaninjo has joined #dri-devel
Akari` has quit []
<karolherbst> nir_shader_gather_info requires an entry point anyway (if the passed one is NULL, the entry point will be used). but yeah, that would be my solution as well. I am so happy that not even CL allows recursion
ced117 has quit [Ping timeout: 480 seconds]
nchery has joined #dri-devel
i-garrison has quit [Ping timeout: 480 seconds]
<jekstrand> karolherbst: I wouldn't even bother making it recursive. Just check that only one thing is flagged entrypoint, and walk all the functions.
<jekstrand> If someone left dead functions lying around, that's their fault.
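A sketch of the non-recursive approach jekstrand suggests, under loud assumptions: `demo_shader` and `demo_function` are hypothetical, much-simplified stand-ins and not the NIR data structures, and "instruction count" is just a placeholder for whatever info is being gathered. The point is only the shape: verify there is exactly one entrypoint, then visit every function without following calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a shader and its functions. */
struct demo_function {
   bool is_entrypoint;
   unsigned num_instrs;
};

struct demo_shader {
   const struct demo_function *functions;
   size_t num_functions;
};

/* Gather a whole-shader statistic by visiting every function in declaration
 * order; no call graph is followed. Returns false, leaving *total_instrs
 * untouched, unless exactly one function is flagged as the entrypoint. */
static bool
demo_gather_info(const struct demo_shader *shader, unsigned *total_instrs)
{
   assert(shader != NULL && total_instrs != NULL);

   size_t entrypoints = 0;
   unsigned total = 0;
   for (size_t i = 0; i < shader->num_functions; i++) {
      const struct demo_function *func = &shader->functions[i];
      if (func->is_entrypoint)
         entrypoints++;
      /* Dead functions are counted too; removing them is the caller's job. */
      total += func->num_instrs;
   }

   if (entrypoints != 1)
      return false;

   *total_instrs = total;
   return true;
}
```

Because dead functions are included, the result can over-report compared with a true call-graph walk, which matches the "that's their fault" trade-off above.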
lemonzest has joined #dri-devel
tzimmermann has quit [Quit: Leaving]
aravind has quit [Ping timeout: 480 seconds]
<karolherbst> jekstrand: yeah..
adjtm has joined #dri-devel
ybogdano has joined #dri-devel
<karolherbst> okay.. let's see what breaks inlining after the first optimization loop :)
Akari has joined #dri-devel
<karolherbst> not too terrible: "Pass 2246 Fails 1 Crashes 103"
<karolherbst> okay... images are busted
<karolherbst> ohhhh noooo
<jenatali> Yeah images are going to be hard with function calls...
<karolherbst> well.. the crash I am seeing is more basic than that
<karolherbst> nir_deref_instr_get_variable returns NULL
<karolherbst> ahh
i-garrison has joined #dri-devel
<karolherbst> because it is a cast now
<karolherbst> vec1 64 ssa_19 = deref_cast (vimage2D *)ssa_1 (image vimage2D) /* ptr_stride=0, align_mul=0, align_offset=0 */
slattann has quit [Quit: Leaving.]
<karolherbst> okay.. mhh
<karolherbst> right.. because those were func args
<karolherbst> and I didn't run opt_deref after inlining
<karolherbst> guess I have to run the opt loop once after inlining
<jekstrand> Yeah, opt_deref after inlining is important
<karolherbst> but cool that nothing else broke
<jekstrand> :+1:
<karolherbst> well.. I also didn't move inlining as the very last step, but surviving the first opt loop is kind of a good sign :)
<jekstrand> \o/
<karolherbst> I am more worried what it does to my argument handling though.. like DCEing dead inputs and all this stuff
<karolherbst> jekstrand: we might want to add a nir_opt_dead_args or something
<karolherbst> or make opt_dce handle that for us
<karolherbst> dunno.. maybe it already exists
<jekstrand> karolherbst: for deleting unused function parameters?
<karolherbst> yes
<jekstrand> Yeah, we probably need some function optimizations
<jekstrand> dead_functions would be one, dead_parameters might be another
alatiera has left #dri-devel [The Lounge - https://thelounge.chat]
<karolherbst> mhh, yeah maybe
<karolherbst> my hope would be that dead_cf kills dead functions
<jekstrand> It doesn't today
<jekstrand> We could put killing dead functions there but I don't know that we want to
<jekstrand> If you're compiling a library, you don't want to kill dead functions
<jekstrand> You want to do that after full linking.
<karolherbst> mhh, true, although then for libs we need to know what gets exported and what not anyway, no?
<karolherbst> but yeah.. I guess we can skip some stuff until full linking
<jekstrand> I think SPIR-V has decorations for things that are exported but we do nothing with it.
<karolherbst> I am not linking at a nir level atm anyway
<karolherbst> not sure if we even want to besides libclc
Duke`` has joined #dri-devel
<jekstrand> Yeah, but we want to be able to run dead_cf on libclc if we want
<karolherbst> right
<jekstrand> Or maybe we don't care? In any case, I don't think it's similar enough to the rest of dead_cf that it needs to go in there.
jkrzyszt has quit [Ping timeout: 480 seconds]
HankB has quit [Remote host closed the connection]
HankB has joined #dri-devel
<karolherbst> mhh, nir_lower_readonly_images_to_tex
<karolherbst> ehh no, that one seems fine, it's nir_lower_cl_images which breaks things
shankaru has joined #dri-devel
<karolherbst> jekstrand: do we have a pass which turns sampler/texture offsets into constants?
<karolherbst> because I think our best path forward here is to lower derefs to offsets and hope it all pans out in the end
<jekstrand> karolherbst: not yet. I've been meaning to add that to constant folding
* jekstrand should type that up
<karolherbst> okay
<zmike> dcbaker: do you prefer multiple MRs for the staging branch which match up 1:1 with the original MRs in main or is one giant MR better?
<zmike> I have a ton of kopper backports pending
<jekstrand> daniels: This is happening because Marge is now using "merge when pipeline succeeds". If marge times out, CI is left running with the MR still marked "merge when pipeline succeeds" and may merge after marge has "forgotten" about it.
<dcbaker> zmike: personally, 1 giant MR is easier to manage
<dcbaker> but if you prefer to make multiple MRs that's fine
<dcbaker> so no strong preference :)
<zmike> 1 is easier for me too
<daniels> jekstrand: there's something more exotic going on which I think is related to the -perf jobs
<daniels> jekstrand: marge only whacks merge when she observes the pipeline having succeeded
<daniels> so the pipeline was successful, then became not-yet-successful
<daniels> so yeah, there's some weird race going on there
<daniels> (marge hasn't changed in ages)
<jekstrand> daniels: Thought you might like to be aware anyway. :)
<jekstrand> daniels: I don't understand the black magic that is Marge. I keep my sorcery skills to NIR.
<daniels> ha, thanks
heat has quit [Read error: No route to host]
heat has joined #dri-devel
<karolherbst> jekstrand: you wanted to remove sampler/texture offsets?
<jekstrand> I wanted to remove the index+offset
<jekstrand> And just have an index source
<karolherbst> ahh
<karolherbst> well
<karolherbst> right..
<karolherbst> I guess one can just check if it's a constant in the backend
<karolherbst> okay.. now I just need to fix nir_lower_cl_images :)
<jenatali> jekstrand: For our backend at least, it's handy to have a constant to use as a base, which is separate from the dynamic index
<jenatali> That's actually how DXIL is
<karolherbst> I think we can also do it in hw, but if you already have an indirect it really doesn't matter I think
<karolherbst> I guess it would save one iadd
<karolherbst> but.. don't we have a helper for that as well?
<jekstrand> jenatali: Yeah, it's annoyingly useful. :-/
<jekstrand> We used to depend on it in i965 but a) that driver doesn't exist anymore and b) we got rid of that garbage from the Intel back-end even before we killed i965.
<jenatali> If it disappeared, we'd have to reverse-engineer / guess a constant base from the alu ops that construct the dynamic index
<jenatali> And... yeah that's just fragile
<jekstrand> Or start everything at 0
<karolherbst> this base + offset thing is such a common pattern, we might be able to solve this by having a nir helper reading that out for offset (like texture/samplers/ubos/whatever...)
<jenatali> Can't
<jekstrand> Which isn't great
<jekstrand> Can't?
<jenatali> DXIL is typed. If the source shader declares sampler2D[5] and sampler3D[2], and you want to dynamically index those 3D samplers, we need the base to point to 5 (the beginning of that array)
<jekstrand> Ah
<anholt> virgl (aka ntt) also needs base+offset for anything that gets array indexed basically.
<jekstrand> That's annoying but GLSL is typed like that too.
ybogdano has quit [Ping timeout: 480 seconds]
<jekstrand> Old Intel hardware (gen4, maybe 5?) even requires the type in the hardware instruction. Fun, fun!
<karolherbst> ahh nooo..
<karolherbst> nir_lower_cl_images doesn't follow calls
<karolherbst> oopsi
<karolherbst> should be easy to fix
<jekstrand> What do you mean "doesn't follow calls"?
<karolherbst> it starts with the entry point and that's it
<karolherbst> jekstrand: what's the proper stuff to use to iterate through the entire shader or do I have to parse call instructions?
<jekstrand> nir_foreach_function()
<karolherbst> ahh right
<jekstrand> Or use nir_shader_instructions_pass()
<karolherbst> for whatever reason the pass iterates in reversed order
<jekstrand> Hrm... I probably had a reason for that
<karolherbst> I am sure you had
<jekstrand> Right. I wanted to be able to crawl up deref chains a bit
* jekstrand looks
<jekstrand> Yup, that's the reason.
<jekstrand> And... it looks like I did it because we don't have constant folding. (-:
<karolherbst> :D
<karolherbst> how convenient
<jekstrand> Yeah, I think we can run it forwards now that we can constant-fold.
LexSfX has quit [Ping timeout: 480 seconds]
<karolherbst> yeah, I'll play around with it :)
<jekstrand> The only reason I can see for going backwards was so that the nir_deref_type_var special case for setting texture_index directly would work.
LexSfX has joined #dri-devel
ybogdano has joined #dri-devel
<jekstrand> Also, we can seriously simplify nir_lower_samplers once we can fold.
<karolherbst> yep
alyssa has joined #dri-devel
<alyssa> Kayden: you're my goto gallium expert-- is there a good reason to use hash tables for shader keys?
<alyssa> panfrost currently has a dynarray of variants with open coded key states, with a `bool variant_matches_state` check and a `copy_state_to_new_variant` helper
<alyssa> kinda brittle, but going to a hash table would bump complexity quite a bit.
<alyssa> I guess hash tables have better asymptotic perf on average, but probably a large constant factor...
<jekstrand> alyssa: I guess it depends on how big your key is and how many variants you expect.
<karolherbst> jekstrand: ehh... nir_rewrite_image_intrinsic requires an image var, doesn't it? :(
<karolherbst> now that's annoying
<karolherbst> not that the code matters.. maybe I just check if there is a var? but... mhh
<karolherbst> we'll end up with PIPE_FORMAT_NONE no matter what
<karolherbst> mhh, but it's also needed for the access
mareko has quit [Read error: Connection reset by peer]
<jekstrand> karolherbst: Does it? I guess it does. I had a version that didn't.
mareko has joined #dri-devel
<karolherbst> yeah.. it uses the var to update format and access of the intrinsic
Akari has quit [Read error: Connection reset by peer]
<jekstrand> You can wrap all that stuff in `if (var)`. I think we can provide at least the format in spirv_to_nir so we don't actually need the var.
<karolherbst> yeah, that's what I ended up doing
<jekstrand> Because the format is part of the type in SPIR-V and we have the type when we process the SPIR-V instruction
<karolherbst> I think we might want to assert on KERNEL
Akari has joined #dri-devel
<karolherbst> and allow no var only then
<Kayden> alyssa: No, I think hash tables are probably a bad idea.
<Kayden> alyssa: It might make sense if you have a -lot- of variants
<Kayden> the hash table lookup was a very noticeable part of the CPU overhead in iris, though
<jekstrand> karolherbst: Maybe?
<karolherbst> jekstrand: anyway, seems to work :)
<jekstrand> karolherbst: Really, it's fine whenever it comes from SPIR-V.
<karolherbst> now I just need to verify if your MR fixes things up
<jekstrand> It's just GLSL that has it on the variable
<jekstrand> And maybe TGSI
<karolherbst> ahh
<jekstrand> And those will always give us derefs
<Kayden> alyssa: with modern intel GPUs having a lot less NOS than the old ones, and tiny shader keys, I moved iris to use a list of variants, and just linearly walk and memcmp the keys. the first one is the precompile where we guess the most common key
<Kayden> alyssa: because I figure that in most cases, the variant list is 1 long, sometimes 2 long, and very rarely 3 long
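A minimal sketch of the list-plus-memcmp scheme Kayden describes, assuming a made-up key layout: all `demo_*` names and fields are hypothetical, not iris or panfrost code, and "compiling" is reduced to handing out an id. The key is built only from fixed-width integers with no padding, since memcmp on a struct with uninitialized padding bytes could report false mismatches.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_VARIANTS 8

/* Hypothetical shader key: two 32-bit fields, so there are no padding bytes
 * and memcmp compares exactly the state that matters. */
struct demo_shader_key {
   uint32_t sample_mask;
   uint32_t flags;
};

struct demo_variant {
   struct demo_shader_key key;
   int compiled_id; /* placeholder for the compiled binary */
};

struct demo_shader {
   struct demo_variant variants[DEMO_MAX_VARIANTS];
   unsigned num_variants;
   int next_id;
};

/* Linearly walk the variant list and memcmp the keys; on a miss, "compile"
 * a new variant and append it. Returns the id of the matching variant. */
static int
demo_get_variant(struct demo_shader *shader, const struct demo_shader_key *key)
{
   for (unsigned i = 0; i < shader->num_variants; i++) {
      if (memcmp(&shader->variants[i].key, key, sizeof(*key)) == 0)
         return shader->variants[i].compiled_id;
   }

   assert(shader->num_variants < DEMO_MAX_VARIANTS);
   struct demo_variant *v = &shader->variants[shader->num_variants++];
   v->key = *key;
   v->compiled_id = shader->next_id++;
   return v->compiled_id;
}
```

With one to three variants per shader, the walk touches a handful of bytes, which is the case where a hash table's fixed cost is hard to win back.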
<jekstrand> crocus, though, probably wants a hash table. :sob:
<karolherbst> jekstrand: yeah, and your MR seems to work out nicely :)
<jenatali> I don't suppose anybody ever built a tool for filtering down GALLIUM_REFCNT_LOG logs to leaked objects...
<zmike> oof those are the worst
<jenatali> I've got a slow buffer leak in real-world apps that doesn't reproduce in toy apps like gears :(
<karolherbst> :(
<jekstrand> karolherbst: \o/
<Kayden> that is a really handy tool, but no, I don't think we had a tool to filter it down
<zmike> tried testing with tc disabled?
<Kayden> also, at least when I last used it, there was something weird where it was showing a double ref at the start so nearly everything was unbalanced
<zmike> most of my buffer leaks were related to tc replacement handling
<jenatali> zmike: Not yet, that's a good idea
<jenatali> It's also broken now that mareko (I think) started using some high values for refcounts to temporarily allow underflow. I was hitting near-infinite loops on just printing AddRef when those objects were first seen :)
<Kayden> but if you ignored that, it worked
<Kayden> ouch
<zmike> otherwise my strategy for that has been to tag each buffer with a unique id, determine when it gets allocated with static counters, then manually set debug flags for introspection in my driver
<zmike> the log is kinda useless unless you're assuming there's a bug outside your driver, which is probably not the case
<jenatali> I don't think that's really true, I think the log would be useful if I was able to quickly pare it down to just the objects that are still alive at the end of the trace
pcercuei has quit [Quit: brb]
<karolherbst> nice, no regressions :)
<karolherbst> let's see if luxmark still runs
pcercuei has joined #dri-devel
<karolherbst> yes it does :) nice
<daniels> jekstrand: it's vanishingly unlikely that anyone else will ever hit that race, but https://gitlab.freedesktop.org/gfx-ci/mesa-performance-tracking/-/merge_requests/27 fixes it anyway
<karolherbst> jekstrand: wow.. if I do inlining right before handing it into the driver, perf drops a lot
<karolherbst> guess I'll need to figure out what pass is so critical or breaking perf
<karolherbst> or maybe it was ms teams .. :D
<karolherbst> nope.. things get slower for real
<alyssa> Kayden: ahhh, fair enough
<alyssa> will stick to the dynarray (don't feel like poking more bears than needed) but group together the key properly so I can do a memcmp
<alyssa> which is probably less brittle too
<karolherbst> jekstrand: uhh... you won't like what I figured out.. soo.. it's lower_io which causes problems :(
<karolherbst> I assume scratch space explodes
<karolherbst> but maybe it makes sense to inline before that
<karolherbst> and make the inliner aware of costs of not inlining certain stuff
<karolherbst> but besides that, I think all the passes I am using atm can deal with non inlined kernels :)
<jekstrand> lower_explicit_io should be fine. lower_io not so much
<karolherbst> ohh.. I am not saying those are not fine, they just have to deal with tons of function_temp memory
<karolherbst> and moving that into scratch
<jekstrand> lower_io shouldn't be doing that
<karolherbst> nir_lower_vars_to_explicit_types I think is doing the scratch stuff
<karolherbst> could be that only nir_lower_vars_to_explicit_types is the problem here
<jekstrand> shouldn't be
<karolherbst> I am sure it is though.. but let me try
<karolherbst> those luxmark kernels have like 64 kernel args
<karolherbst> and 50 of them beind dead
<karolherbst> *being
<karolherbst> jekstrand: yeah okay.. inlining after nir_lower_explicit_io is causing those perf issues :(
<karolherbst> and the one I identified only lowers mem_shared, function_temp and uniform
<jekstrand> karolherbst: That's not entirely surprising. We can't do load/store elimination after lower_explicit_io
<karolherbst> yeah..
<karolherbst> hence me wondering if we should run those passes only after whatever we decide to inline later
<jekstrand> Also, if you lower_explicit_io before inlining, all return parameters will turn into scratch
<jekstrand> And possibly some input parameters depending on $STUFF
<karolherbst> yeah...
<alyssa> karolherbst: with rusticl do drivers still have to ingest NIR_SERIALIZED?
<karolherbst> guess the order is then 1. "general stuff and lowering" 2. inlining 3. lower_io 4. int64 lowering 5. another opt_loop 6. finalize_nir
<karolherbst> alyssa: nope
<karolherbst> they just need to advertise it because most disable/enable CL based on that
<alyssa> sweet. so I can delete that code once rusticl is in and I drop clover support? :-p
<karolherbst> more or less, yes
* alyssa throws a-b over the wall :-p
<karolherbst> I think we need another way of enabling/disabling CL, but yeah
<karolherbst> alyssa: there is a reason why clinfo only needs 1/3 of the time it needs with clover :D
<alyssa> lol
<karolherbst> but maybe that's also because I cache stuff
<karolherbst> still don't cache spir-v linking because that feels pointless
<jessica_24> hey mattrope, can you follow up on this (https://patchwork.freedesktop.org/patch/482987/?series=102966&rev=2#comment_869298)?
tjmercier has quit [Remote host closed the connection]
maxzor has quit [Ping timeout: 480 seconds]
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
<karolherbst> okay, no regressions with late inlining :)
<karolherbst> soo.. let's see if I am able to kill dead function args
<airlied> karolherbst: any idea if the spir-v has inline hints in it?
<karolherbst> airlied: it has
<airlied> then maybe early inline with hints and late inline all the things
<karolherbst> yeah.. that might be an idea
<karolherbst> but for now I am just ignoring partial inlining, because backends have to support calling functions first anyway :)
<airlied> oops
<airlied> those are probably the only generic ones I had from before
<karolherbst> right... sure, those will come in handy later, but for now we need to get non inlined kernels into drivers, and have them handle those
<airlied> yeah making llvmpipe handle functions got messy around intrinsics
<karolherbst> airlied: which ones?
<airlied> anything that accessed system values
<karolherbst> that shouldn't be an issue as those are lowered away in my rusticl model
<airlied> or stuff passed into the top level kernel implicitly
<karolherbst> passed implicitly?
coke has joined #dri-devel
<airlied> args that aren't in the CL function
<karolherbst> you mean like the printf buffer and stuff?
<airlied> yeah and/or grid/block sizes etc
<karolherbst> not seeing why that should be a problem?
<airlied> because if you access them down 5 functions deep it needs to be able to pull them from the input args buffer
<karolherbst> it doesn't
<karolherbst> maybe in clover it does
<airlied> how do you pass them in rusticl then?
<karolherbst> I load the uniform?
<airlied> well then the backend still needs to pass down the ptrs to the ubos
<airlied> there's lots of implicit state
<karolherbst> ? you just get a load_kernel_input
<airlied> now what happens if that gets lowered by the backend to a global ptr
<airlied> or maybe a local ptr
<karolherbst> it should probably have a driver ubo with the pointer to read from
<karolherbst> but yeah.. I guess if you don't have ubos at all you are kind of screwed
<airlied> so llvmpipe passes all of the ptrs to all of the resources into the toplevel function
<airlied> there's no "global" state
<karolherbst> yeah.....
<karolherbst> airlied: I guess one needs to back propagate that then
lynxeye has quit [Quit: Leaving.]
<airlied> it might not be as bad for on-gpu drivers, but I think it would be
<karolherbst> or just add it to every function and DCE those args again
<airlied> since at least on amd you pass in a few ptrs to the shader via user args registers
<airlied> which won't be valid by the time you read them a few fns down
<karolherbst> well on nv you just have your driver ubo with all the random stuff
<airlied> yeah a ubo on amd is a descriptor stored in registers
<karolherbst> ahh
<karolherbst> guess they aren't that big then?
<karolherbst> oh wait.. the descriptor..
<karolherbst> uhhh
<karolherbst> okay, that's annoying
<karolherbst> I never thought any of that is an issue, because on nv hw that's all very trivial as you don't have to pass descriptors around
alyssa has left #dri-devel [#dri-devel]
<karolherbst> okay... let's see what happens if I don't inline before finalize_nir :O
<karolherbst> probably bad things
<karolherbst> "Assertion `!"Invalid instruction type in GCM"' failed." :3
<karolherbst> probably a call
<karolherbst> yeah...
<karolherbst> ehh
<karolherbst> opt_dce kills func args :(
<karolherbst> ohh wait no
<karolherbst> mhh
<karolherbst> call test9 ssa_1
<karolherbst> vec1 32 ssa_0 = load_const (0x00000000 = 0.000000)
<karolherbst> vec1 64 ssa_1 = intrinsic load_kernel_input (ssa_0) (base=0, range=0, align_mul=256, align_offset=0)
<karolherbst> dunno, but that looks wrong to me :)
<karolherbst> yeah..
<karolherbst> that's gcm
<karolherbst> okay, I did something wrong
<karolherbst> jekstrand: any idea how to deal with call inside opt_gcm?
<karolherbst> I mean.. I know what it shouldn't do, but I don't know how to make gcm not mess it up
<karolherbst> ehh.. I had to pin it.. okay
<jekstrand> Yup
<jekstrand> pin it
<karolherbst> okay... that works surprisingly well.. I inline after finalize_nir and I still didn't get a fail :)
<karolherbst> things are getting quite slow though
<karolherbst> let's do a benchmark and then deal with DCEing args
<karolherbst> 1020 :'( my points
icecream95 has joined #dri-devel
<karolherbst> but at least "Pass 2343 Fails 6 Crashes 0" :)
<karolherbst> and all fails are expected
<jekstrand> :D
<jenatali> Woo I wrote a script for the refcount logs and it tells me I have no leaks...
<karolherbst> jenatali: try valgrind
<jenatali> zmike: Had to disable tc though because otherwise the log was racy with stacks getting intertwined between threads
<jenatali> So either I code-inspection fixed my leak by pure guessing (going to revert and double-check) or tc has a leak
<zmike> or your handling of tc buffer replacement has a leak
<jenatali> True. I don't think it does but maybe
<jenatali> I think the leak would be much bigger
<zmike> I've had very small leaks from it in the past
<jenatali> Cool I'll take a much closer look then, thanks
<jenatali> Looks right to me
<karolherbst> yeah well... I didn't do anything to those yet :D
<karolherbst> it was already like that when I got here
ybogdano has quit [Ping timeout: 480 seconds]
ngcortes has joined #dri-devel
<karolherbst> jekstrand: mhhh.. do you think it's better to move params inside nir_call_instr or to build a new one?
<karolherbst> or well.. what would you prefer
lyudess has quit []
Lyude has joined #dri-devel
<jenatali> zmike: I'm not seeing docs for what the expected refcount behavior is for tc buffer replacement. Am I supposed to add/release a ref?
<zmike> jenatali: you're supposed to transfer backing for buffer src into buffer dst
<zmike> the resource refs themselves remain unchanged
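A toy model of the buffer replacement rule described above: the backing storage moves from src to dst, while the refcounts on the two resources stay as they were. The struct and function names are hypothetical and only counters are tracked (nothing is freed); this is a sketch of the rule as stated in the channel, not of any driver's callback.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types: a resource owns one reference to its backing storage. */
struct backing {
   int refcount;
};

struct buffer {
   int refcount;            /* references held on the resource itself */
   struct backing *storage; /* what actually holds the data */
};

static void
replace_buffer_storage(struct buffer *dst, struct buffer *src)
{
   assert(dst != NULL && src != NULL);
   assert(dst->storage != NULL && src->storage != NULL);

   /* Take the new storage reference first so dst == src stays valid. */
   src->storage->refcount++;
   dst->storage->refcount--;
   dst->storage = src->storage;

   /* dst->refcount and src->refcount are deliberately left untouched. */
}
```

In this model a leak would show up as a storage refcount that never returns to zero, which is a different counter from the resource refcount a caller might be tempted to adjust.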
<jenatali> Right okay
<jekstrand> karolherbst: Don't know what you're talking about
<karolherbst> jekstrand: like if I nuke params from functions, I also have to fix up all call instructions
rasterman- has quit [Remote host closed the connection]
<jekstrand> Yes
<karolherbst> so there are two ways I think we could do it: 1. move params inside nir_call_instr or 2. replace the existing one with a new one
<jekstrand> karolherbst: Or nir_instr_rewrite_src as needed and then whack num_params to a lower number.
<jekstrand> You're shrinking so you don't actually need to re-alloc
<karolherbst> I know, but it's not a list of pointers, but an array of nir_src objects :)
<jekstrand> Yup
<jekstrand> So you have to be careful with it
<jekstrand> If it's easier to just replace it, then do that.
<karolherbst> ohhh.. you mean to use nir_instr_rewrite_src to just "move" sources
<jekstrand> Yup
<karolherbst> I see
<jekstrand> See nir_tex_instr_remove_src
<karolherbst> yeah that would be a nice method of fixing up the existing call instruction
<jekstrand> Yeah, something like this should do it:
<jekstrand> for (unsigned i = 0, j = 0; i < call->num_params; i++) {
<jekstrand> I'm going to paste it
karolherbst has quit [Remote host closed the connection]
karolherbst has joined #dri-devel
<karolherbst> cool, thanks
<jekstrand> Oh, then call->num_params = j
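The in-place shrink discussed above can be sketched as a two-index compaction. This is a stand-alone model, not NIR code: the `call` struct, `MAX_PARAMS`, and the `dead` mask are hypothetical, and plain ints stand in for `nir_src` entries, which in NIR would have to be moved with nir_instr_rewrite_src rather than by assignment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PARAMS 8

/* Hypothetical stand-in for a call instruction with an inline param array. */
struct call {
   unsigned num_params;
   int params[MAX_PARAMS];
};

/* Drop every parameter whose dead[] entry is set, keeping the rest in order.
 * i walks the old slots, j is the next free slot in the compacted array. */
static void
call_remove_dead_params(struct call *call, const bool *dead)
{
   assert(call != NULL && dead != NULL);
   assert(call->num_params <= MAX_PARAMS);

   unsigned j = 0;
   for (unsigned i = 0; i < call->num_params; i++) {
      if (dead[i])
         continue;
      if (i != j)
         call->params[j] = call->params[i];
      j++;
   }

   /* The array only shrinks, so no reallocation is needed. */
   call->num_params = j;
}
```

Note that j is declared outside the loop so its final value is still in scope when the count is updated.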
Haaninjo has quit [Quit: Ex-Chat]
ybogdano has joined #dri-devel
ced117 has joined #dri-devel
camus1 has quit [Remote host closed the connection]
camus has joined #dri-devel
HankB_ has joined #dri-devel
HankB has quit [Ping timeout: 480 seconds]
Duke`` has quit [Ping timeout: 480 seconds]
<karolherbst> do we have something like sorted lists in mesa? Would make the impl a little easier
ybogdano has quit [Ping timeout: 480 seconds]
<FLHerne> karolherbst: there's a prio queue in gallium/drivers/r600/sb/sb_shader.h:142 but it's a C++ template
<karolherbst> mhhh
<FLHerne> I'm surprised there isn't something like that in util/
<karolherbst> we also have a rb_tree, but I can't modify the structs I want to sort
<karolherbst> maybe I do a stupid wrapper node, but.....
mvlad has quit [Remote host closed the connection]
<mdnavare> vsyrjala: jani: The drm_atomic_check_only warning of adding CRTC not allowed without modesets is hurting us - I had fixed that for bigjoiner where we steal a CRTC so I add requested crtc only if enabled, but now it's hurting us for MST case.
<mdnavare> Lyude: Also any thoughts from you here
<mdnavare> Lyude: vsyrjala: adding CRTC not allowed without modesets: requested 0x4, affected 0xc
<mdnavare> <4> [106.294403] WARNING: CPU: 8 PID: 1275 at drivers/gpu/drm/drm_atomic.c:1339 drm_atomic_check_only+0x7d9/0x8e0
ybogdano has joined #dri-devel
<Lyude> mdnavare: mind giving more context, what issues is this causing exactly?
<Lyude> mdnavare: also btw, if you're looking at this because you're trying to add additional modesetting deps, probably worth noting I'm nearing completion on this https://gitlab.freedesktop.org/lyudess/linux/-/commits/wip/mst-atomic-only-v1 one of the things included there is adding a bunch more CRTC tracking through the mst state object (to protect against the potential of SST/MST commits using the same physical DP connector from racing with each other)
<Lyude> (I had hoped I'd have this done by now but it ended up taking a bit more work to get things working nicely, I'm nearly there though and have been spending most of my time these last few weeks working on this)
<mdnavare> Thanks Lyudess , I am still analyzing the logs but I see that it calls drm_atomic_set_property and then that calls drm_atomic_check_only but it requests modeset on CRTC 1 for Pipe A and gets affected crtc macro as 0xf
zackr has joined #dri-devel
<jekstrand> karolherbst: qsort()?
<karolherbst> yeah... probably
<Lyude> mdnavare: hm, well maybe you'll end up figuring something else out from the logs. Just a wild guess from the context I do have though, it sounds like that either something's accidentally pulling something in for a modeset when it shouldn't be, or that whatever property you're trying to change might need its own private state object of sorts or something along those lines so it doesn't need to pull in the CRTC
<Lyude> (I have definitely called drm_*_get_foo_state() instead of drm_*_get_new_foo_state() before, so someone mistakenly pulling something in seems somewhat plausible)
<jenatali> Hm, I think there's some bad interaction here between tc and the "private refcount" stuff that's trying to avoid atomics
<jenatali> Seeing a buffer that ends up with 19 refs after the private refs get removed, but if I turn off tc, I don't see that behavior
heat has quit [Read error: Connection reset by peer]
heat has joined #dri-devel
<karolherbst> okay nice... qsort works
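One way to use qsort() when the structs themselves cannot be modified or reordered is to sort an array of pointers to them. A minimal sketch, assuming a hypothetical `arg_info` struct sorted by offset; the names are illustrative and not taken from the code being discussed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-argument record that must stay untouched. */
struct arg_info {
   unsigned index;
   unsigned offset;
};

/* qsort hands the comparator pointers to the array elements, and the
 * elements here are themselves pointers, hence the double indirection. */
static int
cmp_arg_offset(const void *a, const void *b)
{
   const struct arg_info *x = *(const struct arg_info *const *)a;
   const struct arg_info *y = *(const struct arg_info *const *)b;

   /* Avoids the overflow that "x->offset - y->offset" could hit. */
   return (x->offset > y->offset) - (x->offset < y->offset);
}

/* Sorts the pointer array by ascending offset; the pointed-to structs
 * are never written. */
static void
sort_args_by_offset(const struct arg_info **ptrs, size_t count)
{
   assert(ptrs != NULL || count == 0);
   if (count > 1)
      qsort(ptrs, count, sizeof(*ptrs), cmp_arg_offset);
}
```

This gives the effect of a wrapper node without allocating one per element: the pointer array is the only extra storage.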
<karolherbst> collecting info seems to work, now I just need to drop those args
lstrano has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
<jenatali> Aha, looks like take_index_buffer_ownership is a bit busted
lstrano has joined #dri-devel
<jenatali> Ahhhh it's falling over in primconvert
<jenatali> Which isn't dropping the ref that take_index_buffer_ownership says needs to be dropped
<jenatali> zmike: ^^
<zmike> jenatali: 🤔
<zmike> seems like it should be simple to fix then? can just add an unref after the index buffer gets read back
<jenatali> Yep
<jenatali> Working on a patch. Hopefully that's the only leak I was hitting
<zmike> so this is what microsoft employees do while on vacation 🤔
<jenatali> I'm not on vacation anymore
<zmike> oh
<zmike> incredible
<zmike> welcome back
<zmike> we missed you
<jenatali> Thanks, good to be back
<zmike> dcbaker: I put up the staging MR for kopper backports; gonna have another couple still before the final 22.1, but this is the last "big" one
<dcbaker> zmike: perfect, I've sent them to marge
<zmike> \o/
<dcbaker> I don't know if you noticed but the wgl backport is failing CI
<zmike> I think it was just flakes?
<zmike> some jobs taking too long
<zmike> not anything the MR is causing
<dcbaker> I kicked the jobs that failed, let's see what happens
<zmike> ah I just reassigned
<daniels> if it doesn’t need a rebase, reassign will insta-fail whilst the jobs are red
<zmike> yeah but there's another job going in first anyway
<daniels> I shouldn’t have doubted you
<zmike> it's ok I doubt me too
pcercuei has quit [Quit: dodo]
morphis has quit [Ping timeout: 480 seconds]
morphis has joined #dri-devel
heat_ has joined #dri-devel
heat has quit [Remote host closed the connection]
tursulin has quit [Read error: Connection reset by peer]
<anholt> do we have a thing that tells us the float/int/uintness of spirv alu op operands?
danvet has quit [Ping timeout: 480 seconds]
<jekstrand> anholt: Not automagic, I don't think.
<jekstrand> anholt: The metadata might be in the SPIR-V json file
<anholt> going to start with a bit of hand-written stuff and see if this approach is garbage, first.
<anholt> (looking at !6346)
<jekstrand> anholt: That's an ACO MR
<anholt> oops, issue number
slattann has joined #dri-devel
<jekstrand> right
<karolherbst> jekstrand: ... what should I do with like call instructions which get all their params removed? simply remove them? :D
<karolherbst> I can see how that can cause some weirdo problems
<jekstrand> karolherbst: They may still have side-effects
<jekstrand> karolherbst: But if the callee is empty, you can remove it.
<karolherbst> yeah well.. nir_validate doesn't like to call functions without args though
<karolherbst> should I just keep a fake one then, or...
<jekstrand> fix nir_validate
<jekstrand> I'm surprised that hasn't blown up yet, TBH.
<karolherbst> okay
<karolherbst> uhhhh.... I hate C
<jenatali> Woo, clean refcount logs
<karolherbst> unsigned j = 0; for (unsigned i = 0, j = 0;.....)
<karolherbst> .....
<karolherbst> ....
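The line quoted above is a classic C shadowing trap: `for (unsigned i = 0, j = 0; ...)` declares a new `j` scoped to the loop, so the outer `j` is never updated. A minimal stand-alone demonstration, with hypothetical function names and plain ints in place of call parameters:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Compacts v[] in place, keeping entries whose keep[] flag is set.
 * BUG: the loop header declares a second j that shadows the outer one,
 * so the returned count is always 0 even though v[] was compacted. */
static unsigned
compact_buggy(int *v, unsigned n, const bool *keep)
{
   assert(v != NULL && keep != NULL);
   unsigned j = 0;
   for (unsigned i = 0, j = 0; i < n; i++) {
      if (keep[i])
         v[j++] = v[i];
   }
   return j; /* outer j, still 0 */
}

/* Same loop with only i declared in the header, so j survives the loop. */
static unsigned
compact_fixed(int *v, unsigned n, const bool *keep)
{
   assert(v != NULL && keep != NULL);
   unsigned j = 0;
   for (unsigned i = 0; i < n; i++) {
      if (keep[i])
         v[j++] = v[i];
   }
   return j;
}
```

Building with -Wshadow makes GCC and Clang flag the buggy variant at compile time.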
<jenatali> Kayden: I ended up writing a tool and it turned out super useful so I included it in https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16182
<Kayden> jenatali: awesome!
<karolherbst> jekstrand: huh.. maybe it validated on something else... guess I was wrong
<Kayden> jenatali: that does indeed look super useful
<jenatali> :)
<karolherbst> it works \o/
heat_ has quit [Remote host closed the connection]
heat_ has joined #dri-devel
nchery is now known as Guest2913
nchery has joined #dri-devel
<karolherbst> nice nice nice :)
<karolherbst> and now my kernel input DCE works again
<karolherbst> that one kernel be like: uniforms: 288 -> uniforms: 64
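A rough sketch of why dropping dead kernel inputs shrinks the uniform footprint: once unused arguments are known, the live ones can be repacked at new offsets. Everything here is an assumption for illustration (the `kernel_arg` struct, power-of-two sizes, natural alignment); it is not how rusticl lays out kernel inputs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical kernel argument record. */
struct kernel_arg {
   unsigned size;   /* bytes, assumed to be a power of two */
   bool used;       /* false once the argument is known to be dead */
   unsigned offset; /* byte offset in the input buffer */
};

static unsigned
align_up(unsigned value, unsigned alignment)
{
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Assigns fresh offsets to live arguments only and returns the total
 * number of input bytes still needed. Dead arguments are left untouched. */
static unsigned
repack_kernel_args(struct kernel_arg *args, unsigned count)
{
   assert(args != NULL || count == 0);

   unsigned total = 0;
   for (unsigned i = 0; i < count; i++) {
      if (!args[i].used)
         continue;
      assert(args[i].size != 0 && (args[i].size & (args[i].size - 1)) == 0);
      total = align_up(total, args[i].size);
      args[i].offset = total;
      total += args[i].size;
   }
   return total;
}
```

Any load that reads a live argument would need its offset rewritten to match, which is the same fix-up problem as the call parameters above.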
Guest2913 has quit [Ping timeout: 480 seconds]