ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
ngcortes has quit [Remote host closed the connection]
tursulin has quit [Read error: Connection reset by peer]
Lightkey has quit [Ping timeout: 480 seconds]
Lightkey has joined #dri-devel
bryanv has joined #dri-devel
bryanv has quit []
camus has joined #dri-devel
mbrost has quit [Read error: Connection reset by peer]
mbrost has joined #dri-devel
sdutt_ has joined #dri-devel
jewins has quit [Remote host closed the connection]
jewins has joined #dri-devel
camus1 has quit [Ping timeout: 480 seconds]
sdutt has quit [Ping timeout: 480 seconds]
boistordu has joined #dri-devel
boistordu_old has quit [Ping timeout: 480 seconds]
camus1 has joined #dri-devel
camus has quit [Remote host closed the connection]
<airlied> okay reproduced the container qemu crash locally but now have to work out how to debug/fix it
Company has quit [Quit: Leaving]
tarceri_ has quit []
tarceri has joined #dri-devel
<tarceri> anholt_: this is normally handled on a per-driver basis, by handling flags and passing them into disk_cache_create()
camus has joined #dri-devel
camus1 has quit [Read error: Connection reset by peer]
Bennett has quit [Remote host closed the connection]
aravind has joined #dri-devel
camus1 has joined #dri-devel
camus has quit [Remote host closed the connection]
<mareko> anholt_: you might have to add the flag into the shader cache key
punit has joined #dri-devel
sdutt_ has quit [Remote host closed the connection]
sdutt_ has joined #dri-devel
<mareko> Kayden: see this mesa-dev thread from 2016 for some of the reasons why we use pb_slab: [Mesa-dev] [PATCH 00/14] radeon/winsyses: sub-allocation for small buffers
<mareko> one thing that is different today compared to that thread is that we use pb_slab for textures now too
aravind has quit [Ping timeout: 480 seconds]
aravind has joined #dri-devel
jewins has quit [Read error: Connection reset by peer]
Duke`` has joined #dri-devel
<Kayden> mareko: thanks! I'm still not entirely sure what all the pb_* framework really buys you
<Kayden> the approach seems good, it just seems like a lot of boilerplate for not very much functionality
<Kayden> i.e. if the common code wants to do anything with the base objects, it can't, so it resorts to vtbl functions for pretty much everything
itoral has joined #dri-devel
mattrope has quit [Remote host closed the connection]
sdutt_ has quit [Ping timeout: 480 seconds]
pnowack has joined #dri-devel
pnowack has quit [Remote host closed the connection]
pnowack has joined #dri-devel
dv_ has quit [Ping timeout: 480 seconds]
<airlied> weird, seems like ppc64le f16 loads end up trying to indirect callout to something
<airlied> have to dream up some sort of workaround for llvmpipe
dv_ has joined #dri-devel
danvet has joined #dri-devel
Duke`` has quit [Ping timeout: 480 seconds]
tobiasjakobi has joined #dri-devel
aravind has quit [Remote host closed the connection]
lemonzest has joined #dri-devel
tobiasjakobi has quit [Remote host closed the connection]
rpigott has quit [Remote host closed the connection]
rpigott has joined #dri-devel
frieder has joined #dri-devel
Erandir has quit [Remote host closed the connection]
Erandir has joined #dri-devel
aravind has joined #dri-devel
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
idr has quit [Ping timeout: 480 seconds]
pochu has joined #dri-devel
aravind has quit [Ping timeout: 480 seconds]
aravind has joined #dri-devel
YuGiOhJCJ has joined #dri-devel
airlied_ has joined #dri-devel
airlied has quit [Ping timeout: 480 seconds]
rasterman has joined #dri-devel
thellstrom has joined #dri-devel
airlied has joined #dri-devel
thellstrom1 has joined #dri-devel
airlied_ has quit [Ping timeout: 480 seconds]
thellstrom has quit [Ping timeout: 480 seconds]
mbrost_ has joined #dri-devel
lynxeye has joined #dri-devel
thellstrom1 has quit [Remote host closed the connection]
thellstrom has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
agd5f_ has joined #dri-devel
agd5f has quit [Ping timeout: 480 seconds]
tursulin has joined #dri-devel
<glennk> airlied, qemu ppc64le?
Lucretia has joined #dri-devel
<MrCooper> glennk: <airlied> ugggh ppc64le in CI crashes that aren't happening on real ppc64le hw I tested on ftl
thellstrom has quit [Remote host closed the connection]
thellstrom has joined #dri-devel
camus has joined #dri-devel
<MrCooper> CI runs ppc64le binaries on x86 via qemu
pcercuei has joined #dri-devel
thellstrom has quit []
camus1 has quit [Read error: Connection reset by peer]
<glennk> not sure how good its floating point emulation is, in particular half precision floats
<MrCooper> yeah, we've hit qemu bugs in CI before
<HdkR> qemu bugs are...fun
<glennk> had a few of those over the years, my favorite was arm qadd typoed in qemu as add A,A
xexaxo has joined #dri-devel
xexaxo_ has quit [Ping timeout: 480 seconds]
mlankhorst has joined #dri-devel
lemonzest has quit [Quit: Quitting]
lemonzest has joined #dri-devel
Surkow|laptop is now known as Surkow
nchery has quit [Remote host closed the connection]
muhomor has joined #dri-devel
muhomor has quit [Remote host closed the connection]
Hi-Angel has joined #dri-devel
muhomor has joined #dri-devel
iive has joined #dri-devel
muhomor has quit []
phomes has quit [Quit: Page closed]
muhomor has joined #dri-devel
muhomor has quit []
muhomor has joined #dri-devel
muhomor has quit []
muhomor has joined #dri-devel
muhomor has quit []
muhomor has joined #dri-devel
elongbug has quit [Ping timeout: 480 seconds]
camus1 has joined #dri-devel
camus has quit [Remote host closed the connection]
pochu has quit [Ping timeout: 480 seconds]
camus has joined #dri-devel
zf has quit [Ping timeout: 480 seconds]
camus1 has quit [Read error: Connection reset by peer]
<mareko> Kayden: we only use pb_slab and pb_cache, we don't care about the rest
<mareko> Kayden: if you don't layer buffers like we do (pb_buffer* inside pipe_resource) or track buffer busyness for each suballocation, then it might not make sense
pochu has joined #dri-devel
elongbug has joined #dri-devel
<mareko> we might consider forking pb_buffer and removing vtbl
zf has joined #dri-devel
camus1 has joined #dri-devel
camus has quit [Ping timeout: 480 seconds]
Namarrgon has quit [Ping timeout: 480 seconds]
Namarrgon has joined #dri-devel
cbaylis has joined #dri-devel
xexaxo has quit [Ping timeout: 480 seconds]
xexaxo has joined #dri-devel
zf has quit [Ping timeout: 480 seconds]
muhomor has quit [Remote host closed the connection]
muhomor has joined #dri-devel
itoral has quit [Remote host closed the connection]
camus has joined #dri-devel
zf has joined #dri-devel
sdutt has joined #dri-devel
sdutt has quit []
sdutt has joined #dri-devel
camus1 has quit [Ping timeout: 480 seconds]
mbrost_ has quit [Ping timeout: 480 seconds]
jewins has joined #dri-devel
pochu has quit [Ping timeout: 480 seconds]
zf has quit [Ping timeout: 480 seconds]
robertfoss_ has joined #dri-devel
robertfoss_ has quit [Remote host closed the connection]
vivijim has joined #dri-devel
pochu has joined #dri-devel
aravind has quit []
YuGiOhJCJ has quit [Quit: YuGiOhJCJ]
pochu has quit [Ping timeout: 480 seconds]
frieder has quit [Remote host closed the connection]
pochu has joined #dri-devel
<zmike> danylo: you motivated me to track down that crash finally
<danylo> =)
<zmike> ZINK_DESCRIPTORS=lazy should work fine for the time being until that's fixed
<danylo> yes, the crash is gone with lazy descriptors
<danylo> zmike: There is a validation error with the same trace:
<danylo> ``` The pStrides[1] (0) parameter in the last call to vkCmdBindVertexBuffers2EXT is less than the extent of the binding for attribute 1 (16).```
<zmike> yeah ignore that, I'm working on it
<zmike> it's a tricky one
<danylo> could it cause issues?
<danylo> since the issue with the trace is vertex explosion
<zmike> it's possible I suppose, but I haven't seen it cause issues in practice, so it'd have to be a hw thing
<danylo> ok, thanks
mattrope has joined #dri-devel
<zmike> the trace seems fine on radv
ngcortes has joined #dri-devel
ngcortes has quit [Remote host closed the connection]
lemonzest has quit [Ping timeout: 480 seconds]
elongbug has quit [Ping timeout: 480 seconds]
Duke`` has joined #dri-devel
mbrost has joined #dri-devel
flacks has quit [Ping timeout: 480 seconds]
pekkari has joined #dri-devel
<zmike> any gallium experts know what the trick to sampling from a GL_ARB_texture_cube_map_array texture is?
<zmike> thought I had it working but maybe not...
flacks has joined #dri-devel
<zmike> ah, can they not be sampled as 2D_ARRAY?
flacks has quit [Ping timeout: 480 seconds]
lynxeye has quit [Quit: Leaving.]
camus1 has joined #dri-devel
pochu has quit [Ping timeout: 480 seconds]
flacks has joined #dri-devel
camus has quit [Ping timeout: 480 seconds]
nchery has joined #dri-devel
morenonatural has joined #dri-devel
pochu has joined #dri-devel
pochu has quit [Ping timeout: 480 seconds]
morenonatural has left #dri-devel [#dri-devel]
morenitonatural has joined #dri-devel
flacks has quit [Quit: Quitter]
morenitonatural has left #dri-devel [#dri-devel]
morenonatural has joined #dri-devel
<sravn> mripard: Started to look at your nice patch-set. devm_drm_of_get_next() I like but a more descriptive name would be nice - like devm_drm_of_get_bridge()
morenonatural has quit []
gouchi has joined #dri-devel
morenonatural has joined #dri-devel
flacks has joined #dri-devel
camus has joined #dri-devel
camus1 has quit [Ping timeout: 480 seconds]
imirkin_ has joined #dri-devel
<imirkin_> zmike: what's your question about cube map arrays?
<zmike> I don't know that I know what I need to ask, so probably it'll get answered in the course of this rfc I'm putting up
<imirkin_> you can use GL_ARB_texture_view to sample them as 2d arrays, but at the gallium level that requires you to have PIPE_CAP_SAMPLER_TARGET or something
<imirkin_> zmike: ok
<imirkin_> are you having a problem of some sort, or just trying to sort things out in your head?
<zmike> problem
<imirkin_> which is?
<zmike> I'm typing it in this MR, hold on
<imirkin_> ah ok
<imirkin_> leave a link when you're done
<imirkin_> sampling from a cube map is a way of having an array of cubes. so if you do it as 2d, you lose the cube-ness when sampling.
<imirkin_> er, make that "sampling from a cube map array is ..."
glisse has quit [Read error: Connection reset by peer]
mareko has quit [Write error: connection closed]
mslusarz has quit [Remote host closed the connection]
dri-logger has quit [Write error: connection closed]
flacks_ has joined #dri-devel
flacks has quit [Ping timeout: 480 seconds]
lemonzest has joined #dri-devel
ngcortes has joined #dri-devel
morenonatural has quit [Ping timeout: 480 seconds]
cmarcelo has joined #dri-devel
<cmarcelo> cwabbott: pendingchaos_ could one of you take a look at https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9938?
<zmike> pepp: this is the MR^
<imirkin_> zmike: will have a look
<zmike> no rush
dri-logger has joined #dri-devel
<imirkin_> zmike: so ... with images, there are (effectively) no cube arrays. so the "2d array" bit really oughtn't matter
<imirkin_> without looking at it closer, my guess is that the problem has absolutely nothing to do with cube arrays
<imirkin_> although .. hm. will have to check how the pbo code works
<imirkin_> does it do texelFetch or imageLoad?
<zmike> txf
<imirkin_> zmike: where do you convert gallium sampler to something vulkan-y?
<zmike> ?
<zmike> I don't
<imirkin_> pipe_sampler_view -> something
<zmike> zink can't use this right now
<zmike> if that's what you mean
<imirkin_> can't use what?
idr has joined #dri-devel
<zmike> the MR
<imirkin_> forget the MR
<imirkin_> in zink in general, where do you convert pipe_sampler_view into something vulkan can understand?
<zmike> uhhh zink_create_sampler_view?
<imirkin_> right. what file might that be in?
<zmike> zink_context.c
<imirkin_> aha ok
<imirkin_> thanks
<zmike> not sure why you're looking at zink for this though?
<zmike> or is this an unrelated query
<imirkin_> oh, this is "in general"?
<zmike> ah
<imirkin_> i assumed the failure was on zink
<zmike> no, like I said, zink can't use this
<imirkin_> where do you see the failure?
<zmike> iris and radeonsi
<imirkin_> hrmph ok
<imirkin_> they can't BOTH be wrong :)
<zmike> no, I imagine they're not
<imirkin_> lesseee here
<zmike> I thought I had this case passing when I did the original implementation, but I'm too tired to try and bisect it across 4 months of mesa
<zmike> and/or maybe the test just wasn't running back then
heat has joined #dri-devel
<zmike> anyway gotta run for a bit, will be transient for the next while
<imirkin_> zmike: yeah, there's some slightly dodgy things going on here.
<imirkin_> zmike: i don't think glsl_get_sampler_coordinate_components does what you want it to, in your case
<imirkin_> i'm not 100% sure how the nir sampling stuff deals with array indices
Peste_Bubonica has joined #dri-devel
<zmike> I tried padding it out to 4 coords but that didn't help
<zmike> using a cube array for the sampler
Hi-Angel has quit [Quit: Konversation terminated!]
Hi-Angel has joined #dri-devel
<imirkin_> zmike: no, you would never have 4 coords there
<imirkin_> you'd have 3 coords
<imirkin_> but like ...
<imirkin_> for 2d array you only have 2 coords
<imirkin_> which seems like it might not be enough
<imirkin_> unless the array index goes somewhere else
Peste_Bubonica has quit [Ping timeout: 480 seconds]
Peste_Bubonica has joined #dri-devel
<imirkin_> i also don't immediately see where you're iterating over all the layers
<imirkin_> (or how you indicate which layer you want)
<zmike> it's passed in the constant buffer
<imirkin_> ok, so one invocation only does 1 2d face at a time?
<zmike> I think so? not at my pc now to check
<zmike> sounds right tho
<imirkin_> ok, that makes more sense
<imirkin_> basically you need to unify how you treat CUBE/2D_ARRAY/CUBE_ARRAY... txf is a bit different than regular tex() in that regard
<imirkin_> i'm not familiar enough with nir to know precisely how that's represented unfortunately
mareko has joined #dri-devel
muhomor has quit [Remote host closed the connection]
<zmike> hm I thought I'd done that
<zmike> guess not
<imirkin_> in sampler_type_for_target, you return diff things for cube/2d array
riku has joined #dri-devel
<imirkin_> and glsl_get_sampler_coordinate_components(2D) = 2, (CUBE) = 3
<zmike> right but it's all flattened to 2d array by then
<zmike> so it's not actually getting anything else
Hi-Angel has quit [Ping timeout: 480 seconds]
<imirkin_> ah i see, so it is.
glisse has joined #dri-devel
<imirkin_> zmike: ah i see. grid[2] = depth
<imirkin_> so yeah. i just don't think you're ever passing in the array component.
<imirkin_> and the fact that it's working for 2d arrays and cubes seems surprising.
<zmike> hmm will have to check when I get back, but pretty sure I am
<imirkin_> i'm just reading the code, not actually testing
<imirkin_> so easily could be that some bit is working out differently than i expect
<imirkin_> but if nothing else, it's worth a closer look
<zmike> I haven't really looked at it in about 3 months, so I don't remember the exact details
<zmike> always good to have a starting point
Company has joined #dri-devel
riku has quit [Ping timeout: 480 seconds]
<zmike> imirkin: when you say "the array component" what exactly are you referring to
mlankhorst has quit [Ping timeout: 480 seconds]
riku has joined #dri-devel
<imirkin_> zmike: the array index
<imirkin_> like in a 2d_array thing, what is often referred to as "z"
<zmike> okay, trying to disambiguate with z there since there's a lot of conflation
<zmike> my brain
<imirkin_> so when you're fetching from a 2d array (or cube or cube array), you have to indicate which 2d image you want to fetch from
<zmike> yea
<imirkin_> for texelFetch, a cube is no different than a 2d array
<imirkin_> since there is no face selection
<zmike> trying to remember
<zmike> I think in nir this is just the z coord component
<imirkin_> as opposed to with texture() where you select which face of the cube to sample from based on the spherical coordinates you give it
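A minimal sketch of the point above: for texelFetch, a cube (or cube array) behaves like a plain 2D array, so the caller has to supply the layer explicitly. This assumes the usual layer-face layout of six consecutive 2D layers per cube; the function names are hypothetical and are not Mesa or NIR API.

```python
FACES_PER_CUBE = 6  # +X, -X, +Y, -Y, +Z, -Z stored as consecutive 2D layers


def cube_to_layer(cube_index: int, face: int) -> int:
    """Flatten (cube, face) into the layer index of the equivalent 2D array."""
    if not 0 <= face < FACES_PER_CUBE:
        raise ValueError("face must be in [0, 6)")
    if cube_index < 0:
        raise ValueError("cube_index must be non-negative")
    return cube_index * FACES_PER_CUBE + face


def texel_fetch_coord(x: int, y: int, cube_index: int, face: int):
    """Build the 3-component integer coordinate a 2D-array fetch needs."""
    # There is no face selection from a direction vector here, unlike
    # texture(); the third component is simply the flattened layer.
    return (x, y, cube_to_layer(cube_index, face))
```

A plain cube is the cube_index = 0 case, which is why cube, 2D array and cube array can share one code path for fetches.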
riku has quit [Ping timeout: 480 seconds]
pekkari has quit [Quit: Konversation terminated!]
pnowack has quit [Quit: pnowack]
riku has joined #dri-devel
<imirkin_> zmike: anyways, you clamp the coordinates to the number of coord_components, which doesn't work with array textures since the function to determine it doesn't receive the arrayed thing
<imirkin_> zmike: ohhh, wait. there's a glsl_sampler_type thing at the end. i think it'll all work if you just map cube/cube_array to 2D, and ensure that cube is treated as is_array = true.
<zmike> I'm already doing that though...
<imirkin_> no
<imirkin_> sampler_type_for_target
<imirkin_> erm
<imirkin_> but by then it's already 2D_ARRAY isn't it
<imirkin_> sigh
<imirkin_> and the cube stuff is just there to confuse me
<imirkin_> which it has done quite well
<zmike> would you believe me if I said I wrote this whole thing in about a week of getting 3-4 hours of sleep each night?
<imirkin_> hopefully you were out partying most of the day
<imirkin_> :)
<zmike> no, just insomnia
<zmike> partying would have been better but also there's been a pandemic so
<imirkin_> virtual party?
<imirkin_> VR party?
<zmike> smart
Hi-Angel has joined #dri-devel
mslusarz has joined #dri-devel
mslusarz has quit []
mslusarz has joined #dri-devel
pnowack has joined #dri-devel
DPA has quit [Quit: ZNC 1.8.2+deb2~bpo10+1 - https://znc.in]
<karolherbst> I've heard there is something like VR chat :D
Koniiiik has joined #dri-devel
<karolherbst> zmike: I basically wrote the entire nir stuff for nouveau in a week....
<Koniiiik> Can someone point me in the right direction to generate an API trace for a mesa bug report?
<Koniiiik> Thanks, imirkin_!
<karolherbst> maybe two...
<zmike> you know this pain well then
<imirkin_> Koniiiik: if you're doing stuff with wine or steam, check out the wiki there. it's not trivially obvious, sometimes
<karolherbst> zmike: well.. it was christmas
<karolherbst> but yeah
<karolherbst> sometimes you are just writing that stuff
<Koniiiik> imirkin_: Nah, this is a native Linux application.
<imirkin_> Koniiiik: also make sure that whatever the problem is reproduces when replaying the trace
<Koniiiik> imirkin_: Hm, before I go ahead with this, is a trace of any value if the problem is a segfault?
sdutt has quit [Remote host closed the connection]
<imirkin_> Koniiiik: it can be
<Koniiiik> Okay then!
<imirkin_> ideally it should be obvious what's going on from the backtrace itself
<karolherbst> as long as it segfaults replaying
<imirkin_> but if it's not, a trace will allow a developer to hit the segfault as well, and figure out wtf is going on
<imirkin_> right, if replaying the trace doesn't segfault, then it's of limited use
<Koniiiik> Ah, understood.
DPA has joined #dri-devel
lemonzest has quit [Quit: Quitting]
<airlied> glennk, MrCooper : oh I managed to reproduce on a real ppc64le, not sure how I messed up first time
<airlied> anholt_, MrCooper : so aniso texturing blows some of the virgl traces out past the 300 piglit timeout limit, increasing the timeout to 400 lets them finish, but I could also just comment out those traces (some are already), any opinions?
Duke`` has quit [Ping timeout: 480 seconds]
<imirkin_> airlied: do you have anything clever to say about https://gitlab.freedesktop.org/mesa/mesa/-/issues/4763 ?
camus1 has joined #dri-devel
Daanct12 has joined #dri-devel
<idr> Are NIR optimization passes expected to clear pass_flags when they're done or before they start?
<idr> (I'm assuming the latter.)
<pendingchaos> I think before they start
<pendingchaos> I don't remember seeing any pass clear pass_flags when it's done
<idr> That's what I thought. Thanks. :)
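A toy illustration of the convention agreed on above, where a pass clears the scratch flags before it starts and does not rely on the previous pass having cleaned up. The Instr class and mark_pass function are hypothetical stand-ins, not NIR API.

```python
class Instr:
    """Stand-in for an IR instruction with a per-pass scratch field."""

    def __init__(self, name, pass_flags=0):
        self.name = name
        self.pass_flags = pass_flags


def mark_pass(instrs, predicate):
    """Flag the instructions matching predicate and return their names."""
    # Clear first: whatever an earlier pass left behind is stale.
    for instr in instrs:
        instr.pass_flags = 0
    for instr in instrs:
        if predicate(instr):
            instr.pass_flags = 1
    # Flags are deliberately left set on exit; the next pass clears them.
    return [instr.name for instr in instrs if instr.pass_flags]
```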
camus has quit [Ping timeout: 480 seconds]
Danct12 has quit [Ping timeout: 480 seconds]
Daanct12 is now known as Danct12
<airlied> imirkin_: nope, might take a closer look later
<imirkin_> airlied: ok. i'm guessing ajax is out for a bit?
reductum has quit [Quit: WeeChat 2.8]
<airlied> he is around, but his focus is moving about
sdutt has joined #dri-devel
nchery has quit [Remote host closed the connection]
<cmarcelo> pendingchaos: I'm still missing something... seems to me we can't rely on !ACCESS_COHERENT to decide whether a load (for SSBO/global) can be ignored (and a loop with such load removed). i.e. patch should go as-is.
<cbaylis> is there a simple example of a igt test to copy? I tried copying tests/core_getversion.c but if I add "igt_assert_eq(1, 2);" it still passes
<jekstrand> That seems.... wrong.
<cbaylis> I was surprised too
<imirkin_> so 1 == 2 ... as we all know ...
<imirkin_> does it take a bool or something?
<jekstrand> zmike: For the subgroups MR: Is it even still needed? Or can you set .ballot_bit_size=64 .ballot_components=1 and walk away?
<zmike> uhhhh
<jekstrand> No, igt_assert_eq(1, 2) should definitely fail
<zmike> I guess I'll check that tomorrow 🤔
<jekstrand> cbaylis: I usually copy gem_basic or gem_exec_basic when making new i915 tests.
<jekstrand> cbaylis: There's basically nothing interesting that's i915-specific in gem_basic.c
<danvet> cbaylis, can you pastebin your test that passes with that?
<jekstrand> cbaylis: Is your igt_assert() inside an igt_subtest?
<danvet> jekstrand, without arguments we run them all by default
<danvet> plus we should have asserts in place if you hand-roll everything and forget some setup pieces
<danvet> there's even some tests for this stuff in igt/lib/tests
<danvet> so if there's a funny hole, we should probably cover it
<jekstrand> zmike: With cwabbott's subgroup changes for freedreno, it may "just work".
orbea has quit [Ping timeout: 480 seconds]
<zmike> that would be nice
<danvet> jekstrand, also no igt_subtests in igt_simple_main
<cbaylis> jekstrand: danvet: https://pastebin.com/RVMMwUDY
orbea has joined #dri-devel
<danvet> cbaylis, for minimalist useful example gem_basic.c indeed looks better
<cbaylis> ok, I'll try modifying gem_basic.c
<danvet> cbaylis, just tried that, splats like expected
<danvet> backtrace and everything included
pochu has joined #dri-devel
<danvet> cbaylis, tbh we have a complete lack of tests for basic drm_ioctl tests
<danvet> so since you're looking and maybe if you're bored, add subtests for other gotchas that might be worth testing in there
<danvet> and then call that core_ioctl or so
pochu has quit [Ping timeout: 480 seconds]
<cbaylis> danvet: I get the same as you if I run the binary directly, but via meson: https://pastebin.com/rWiqEuq2
<danvet> cbaylis, meson doesn't run igt tests
<danvet> or I'm confused
<danvet> oh
<danvet> this validates that you didn't screw up some of the igt things we can't easily check otherwise
<danvet> so it runs your test
<danvet> but in special modes that CI needs to enumerate tests and stuff like that
<danvet> so this tells you "your test is looking good"
<danvet> it does _not_ run the test itself
Ben has joined #dri-devel
<danvet> for that there's an igt runner somewhere in igt which our CI uses to actually run the tests
<danvet> cbaylis, outside of CI just run the resulting binary directly
<danvet> also --help for common options
Ben is now known as Guest1543
<cbaylis> ok, thanks!
<cbaylis> test systems, always a learning curve :)
<zmike> jekstrand: yea no, that still explodes vtn
<zmike> so "just work" is sadly not something that is happening on this occasion
<jekstrand> zmike: Ok. I believe you. I'm just confused as to why yet again
mbrost has quit [Ping timeout: 480 seconds]
<zmike> OpGroupNonUniformBallot must return a uvec4
<zmike> this is that same thing that came up in the original review
<zmike> it's just a stupid circular rewrite
<jekstrand> Yeah
<jekstrand> But now I'm stuck trying to remember why we need to make changes.
<jekstrand> nir_lower_subgroups *should* make all nir_intrinsic_ballot() have ballot_bit_size and ballot_components
<jekstrand> At least I thought it did. (-:
<zmike> because I need 32bit/4component from spirv, but then the rest of the shader is still using 64bit/1component from glsl
gouchi has quit [Remote host closed the connection]
<zmike> I need both
<zmike> not just one or the other
* jekstrand is so confused
<jekstrand> I know I grokked this at one point in time in the past. Honest, I did!
<zmike> it's like running nir to spirv to do glsl spirv -> vk spirv
<zmike> it feels very stupid
<zmike> and is
<zmike> but it still has to happen
<zmike> I can put up a branch with all the changes for you to experience it with your own body if you must
<jekstrand> zmike: For reviews of this sort, don't you think an out-of-body experience is better?
<zmike> oh I thought that was just me
mbrost has joined #dri-devel
<anholt_> airlied: oof. I would comment them out, though it's unfortunate
camus has joined #dri-devel
camus1 has quit [Ping timeout: 480 seconds]
<zmike> jekstrand: so I don't want to rush you if your trip is still grooving, but I'm starting to come down from mine, and it still looks like I'm gonna need this pass
<jekstrand> zmike: lol. Give me a few more minutes. Trying to rev an i915 patch set ATM.
<zmike> no, no, like I said, you just keep on keepin on
<jekstrand> zmike: If you set a breakpoint in the handling of nir_intrinsic_ballot, what are intrin->dest.ssa.bit_size/num_components?
<zmike> jekstrand: like I said, it's 64/1
<jekstrand> zmike: So it sounds to me like we just need to adjust uint_to_ballot_type to also support down-casting
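A rough sketch of what converting between ballot shapes amounts to, for example 64-bit x 1 component to 32-bit x 4 components as discussed above. This is not the actual uint_to_ballot_type code; repack_ballot is a hypothetical name and the sketch only shows the bit shuffling, assuming component 0 holds the lowest invocations.

```python
def repack_ballot(components, src_bits, dst_bits, dst_components):
    """Repack a ballot bitmask into components of a different width."""
    src_mask = (1 << src_bits) - 1
    dst_mask = (1 << dst_bits) - 1

    # Concatenate the source components into one wide integer,
    # with component 0 in the low bits.
    value = 0
    for i, comp in enumerate(components):
        value |= (comp & src_mask) << (i * src_bits)

    # Slice the wide integer into destination components. Bits that do not
    # fit are dropped (down-cast); missing high bits are zero (up-cast).
    return [(value >> (i * dst_bits)) & dst_mask for i in range(dst_components)]
```

With this shape, the GLSL-side 64/1 ballot and the SPIR-V-side 32/4 uvec4 are two views of the same bits, so the conversion can happen implicitly wherever the types disagree.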
<zmike> I don't know what you mean by that, but is it really going to be less complex than my MR?
reductum has joined #dri-devel
ced117_ has quit [Ping timeout: 480 seconds]
Guest1543 has quit [Ping timeout: 480 seconds]
<zmike> ah, you mean just do it implicitly?
nchery has joined #dri-devel
<zmike> 🤔
Company has quit [Ping timeout: 480 seconds]
<zmike> yup, that'll do
<jekstrand> :D
<jekstrand> See, I said it would "just work". :P
<zmike> ehhhhh I dunno about that one
<zmike> Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
rasterman has quit [Quit: Gettin' stinky!]
<jekstrand> I'll assign marge tomorrow or once my Jenkins comes back. If I forget, remind me.
<zmike> 👍
yoslin has quit [Quit: WeeChat 3.2]
nchery has quit [Quit: Page closed]
<zmike> thanks for helping get that out finally too
<zmike> one of my oldest patches
<jekstrand> sorry it took so long
<jekstrand> But I like the fairly general thing we have now a lot better than the zink-specific solutions
zf has joined #dri-devel
<zmike> sure
<zmike> subgroups isn't exactly super high priority
Bennett has joined #dri-devel
ngcortes has quit [Ping timeout: 480 seconds]
alyssa has joined #dri-devel
<alyssa> jekstrand: Is NIR supposed to CSE load_global when there aren't any stores in between?
<alyssa> I have an SSBO `layout(std430, binding = 0) buffer Inputs { vec4 foobar; }`
<alyssa> and then a shader `foobar = foobar * foobar + foobar;`
<alyssa> and that compiles to 3 identical loads, instead of just 1
<alyssa> although this is on the standalone compiler so maybe I'm missing an opt pass that st/mesa calls
<pendingchaos> nir_opt_cse doesn't do that
<pendingchaos> nir_opt_copy_prop_vars and nir_opt_load_store_vectorize can though
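A toy version of the optimization being asked about: repeated loads of the same location collapse into one as long as no store comes in between. This is only a sketch on a made-up tuple IR, not how nir_opt_copy_prop_vars is implemented, and it ignores volatile or coherent access and barriers, which a real pass has to respect.

```python
def eliminate_redundant_loads(instrs):
    """Forward earlier loads of the same address until a store intervenes.

    Instructions are tuples: ("load", dest, addr), ("store", addr, src),
    or (op, dest, src0, src1, ...) for ALU operations.
    """
    available = {}     # addr -> name of the value already loaded from it
    replacements = {}  # removed load dest -> surviving load dest
    out = []
    for ins in instrs:
        op = ins[0]
        if op == "load":
            _, dest, addr = ins
            if addr in available:
                replacements[dest] = available[addr]
                continue  # drop the duplicate load
            available[addr] = dest
            out.append(ins)
        elif op == "store":
            _, addr, src = ins
            # Conservative: any store may alias any address.
            available.clear()
            out.append(("store", addr, replacements.get(src, src)))
        else:
            op, dest, *srcs = ins
            out.append((op, dest, *[replacements.get(s, s) for s in srcs]))
    return out
```

Fed the three-load form of `foobar = foobar * foobar + foobar`, this keeps a single load and rewrites the ALU sources to use it.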
yoslin has joined #dri-devel
<alyssa> got it
<alyssa> thanks
ced117 has joined #dri-devel
cbaylis has quit [Ping timeout: 480 seconds]
iive has quit []
heat has quit [Remote host closed the connection]
<alyssa> pendingchaos: yep, I was missing a call to nir_opt_copy_prop_vars, thanks 👍
danvet has quit [Ping timeout: 480 seconds]
camus1 has joined #dri-devel
camus has quit [Ping timeout: 480 seconds]
* alyssa is bringing up support for a new instruction set in her NIR backend
<alyssa> Any guesses which instruction set? 😋
<imirkin_> x86
<imirkin_> intel 4004!
<daniels> Morello
<alyssa> imirkin_: is the winner, 4004
flto has quit [Ping timeout: 480 seconds]
<imirkin_> w00t! what do i win?
<alyssa> fish
<imirkin_> a brand new 4004 as my desktop? :)
* alyssa slaps around imirkin_ with a large trout
flto has joined #dri-devel
<imirkin_> Max. CPU clock rate: 740-750 kHz
<jekstrand> alyssa: It will cse if you use vars
<imirkin_> that's basically the speed we get on nvidia anyways...
<alyssa> jekstrand: Yeah.. I'm wondering if we need a helper to call all the common frontend NIR passes in the right order
<alyssa> since all the standalone compilers are cargoculting them and getting them subtly wrong compared to mesa/st, never mind the vk drivers..
<jekstrand> alyssa: *sigh* Maybe we should.
<jekstrand> I've been trying to avoid it but things may be getting out of hand....
<alyssa> jekstrand: Why avoid it?
<alyssa> (Next bug on that list, "(foobar * -foobar) + foobar" is getting compiled as "fmul + fsub" instead of "fneg + ffma", in what subtle way did I screw up this time..)
<jekstrand> zmike: Jenkins is happy with my subgroup MR. Let's give it 24 hours or so in case cwabbott or someone else wants to NAK or suggest a different approach. Then feel free to marge (or I will)
<zmike> jekstrand: 👍
<jekstrand> alyssa: My approach with NIR has always been "here is a toolbox for you to build a compiler" rather than "here is a compiler for you to plug into".
<jekstrand> Not every optimization or lowering makes sense for everybody and different drivers may want different heuristics.
<alyssa> jekstrand: If that approach were respected, mesa/st would do no optimization whatsoever, and almost no lowerings, and our compilers would consume the garbage that glsl emits
<alyssa> that mesa/st does a ton of generic stuff, and each vk driver (not the compilers hooked up to the vk driver) does a ton of stuff, says there is a common subset? 🤷
<jekstrand> alyssa: Yeah. I was pretty annoyed when that much common stuff was added to st/mesa. :P
<alyssa> (Maybe you disagree with mesa/st running algebraic opts)
<alyssa> --ah. k.
<jekstrand> alyssa: But I may be swinging around.
<jekstrand> alyssa: I just don't want more structs of doom if we can avoid them.
<alyssa> Nod.
<jekstrand> Like I think you could pull most of brw_nir_optimize() out into a helper and it would be fine.
pcercuei has quit [Quit: dodo]
<jekstrand> There's a weird lower_flrp thing in there you might want to pull and the strange opt_peephole_select for vec4 tessellation stuff.
<jekstrand> And the loop_indirect_mask
<alyssa> can lower_flrp burn in a fire
<jekstrand> So maybe it wouldn't be a struct of doom
mbrost has quit [Ping timeout: 480 seconds]
<alyssa> jekstrand: relatedly, I hate that mesa/st calls nir_opt_peephole_select
<jekstrand> I'm inclined to say maybe it can. It helped a small portion of shader-db by an average of 3.5 instructions per shader.
<jekstrand> alyssa: See! Common optimizations are bad! >:-P
<idr> ouch
<alyssa> jekstrand: nir_opt_peephole_select is an exception to the rule
<alyssa> in that it is inherently a _heuristic_ that depends entirely on hw behaviour
<jekstrand> alyssa: Everything's a heuristic but I do get what you mean.
<jekstrand> That one's particularly bad
<alyssa> dead code elimination is not a heuristic
<alyssa> :-p
<idr> I guess it's time to bash idr code again.
<jekstrand> I'd be fine with keeping nir_lower_flrp and just pulling it out of brw_nir_optimize() too.
<jekstrand> We have code in there to ensure that it only runs once but it's in the loop which is a bit weird.
<idr> That seemed the least painful way to get it properly ordered with other passes.
<idr> Like opt_algebraic.
<jekstrand> Do you remember what the ordering constraints were?
<jekstrand> For instance, could we shut off the opt_algebraic lowering and run it after the first full brw_nir_optimize?
<idr> Most passes should happen both before and after.
<jekstrand> Yeah, but brw_nir_optimize gets called a minimum of 3 times in your average compile.
<idr> Because it creates a bunch of new arithmetic that can be algebraic, dead code, CSE, etc. optimized.
<idr> I guess I put it in the place that seemed most logical, and I made it only run once as an optimization.
<idr> At the time, there didn't appear to be any value in doing differently.
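A small sketch of the structure being described: an optimization loop that runs until no pass reports progress, with a lowering step that sits inside the loop but is guarded so it only ever runs once across calls. Everything here is hypothetical (the shader is just a list of opcode names and the flrp expansion is a stand-in); it is not brw_nir_optimize().

```python
def remove_dead(shader):
    """Toy pass: drop placeholder 'dead' ops. Returns True on progress."""
    before = len(shader)
    shader[:] = [op for op in shader if op != "dead"]
    return len(shader) != before


def lower_flrp(shader):
    """Toy lowering: expand each 'flrp' into a stand-in fsub + ffma pair."""
    if "flrp" not in shader:
        return False
    lowered = []
    for op in shader:
        lowered.extend(["fsub", "ffma"] if op == "flrp" else [op])
    shader[:] = lowered
    return True


def optimize(shader, state):
    """Run passes to a fixed point; state['lowered'] persists across calls."""
    progress = True
    while progress:
        progress = remove_dead(shader)
        if not state["lowered"]:
            # Inside the loop so it is ordered relative to the other passes,
            # but guarded so repeated optimize() calls do not redo it.
            state["lowered"] = True
            progress |= lower_flrp(shader)
```

The guard is what makes calling optimize() several times per compile cheap, and it is also why a flrp introduced after the first call would not be lowered by later calls.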
<jekstrand> That's fair. Thinking about how to make a generic opt loop changes that calculus a bit, I think.
mbrost has joined #dri-devel
<idr> Other Gallium drivers use nir_lower_flrp too.
* zmike sweats nervously
<jekstrand> I'm not saying they shouldn't. Just that it seems like something we could say "call yourself" to avoid an options struct of doom on nir_opt_common().
<idr> I feel like that ship has already sailed. :(
<idr> There is already a huge pile of .lower_foo (including .lower_flrp##) that both trigger the lowering passes and prevent other (usually algebraic) optimizations.
<idr> I don't love it either.
<jekstrand> What can I say? It's one of my favorite windmills. :)
<alyssa> I'm always good to NAK new .lower_foo options ...
<idr> It seems like there's not very much (practical) middle ground between "it's a toolbox, do it yourself" and "check all the customization boxes."
<idr> It's a very, very steep s-curve.
Hi-Angel has quit [Ping timeout: 480 seconds]
<mattst88> I don't think I comprehend what's the practical difference
<jekstrand> I guess we already have nir_shader_compiler_options::lower_flrpN so maybe we should just use that?
<jekstrand> Some days, I kind-of wonder if we don't want to just have a bitset of what nir_op you support and call it a day.
<idr> mattst88: One bit of shared code with a billion options to control it vs. copy-and-paste-and-modify for every driver.
<jekstrand> I think if you find yourself changing the flow for different drivers, it'd be a problem
<jekstrand> But if everyone wants the same passes in the same order just with slightly different knobs, knobs are probably ok.
<idr> jekstrand: I seem to recall that we already have that too.
<idr> Except the different drivers are Iris vs. i965 and ANV.
<jekstrand> idr: I think there is a pass somewhere that does that, yes. (-:
<jekstrand> idr: We don't change brw_nir_optimize() per-driver.
<idr> No, but we do end up going through shared st code that does some optimizations in some different orders.
<idr> ...
<jekstrand> *sigh* Yeah
<idr> by virtue of not calling some that brw_nir_optimize does.
<mattst88> welp, nir had a good run
<mattst88> time for nnir
<zmike> too many other letters
<zmike> how about just nnnnnnnnn
<jekstrand> "better IR". It leaves all the others out in the cold...
<alyssa> jekstrand: already use BIR for bifrost
<alyssa> mattst88: Humans clearly suck at coming up with IRs so we're now presenting nnir, which is powered by neural networks
<mattst88> n²ir
<mattst88> alyssa: I'm sold
<jekstrand> Now, not only does no one know how the whole compiler works, no one even knows how the patches are written!
<alyssa> jekstrand: Of course we know it's written by GitHub Copilot trained on LLVM.
<jekstrand> lol
<mattst88> monkeys banging on keyboards, using github copilot
<ccr> :O
<jekstrand> alyssa: So, we need to write a JIT to produce NIR to run NN in shaders to produce a compiler to compile shaders....
<alyssa> jekstrand: yo, i heard you like shaders..
<ccr> add shaders and STIR
<jekstrand> alyssa: While we're at it, can we have the NN convert Mesa to rust?
<alyssa> jekstrand: I was just going to use Microsoft Pilot for that...
<mattst88> toml \o/
tursulin has quit [Read error: Connection reset by peer]
<mattst88> wow, that's cool. so you can run deqp and piglit in the same deqp-runner invocation?
<jekstrand> neato