ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
ngcortes has quit [Remote host closed the connection]
tursulin has quit [Read error: Connection reset by peer]
Lightkey has quit [Ping timeout: 480 seconds]
Lightkey has joined #dri-devel
bryanv has joined #dri-devel
bryanv has quit []
camus has joined #dri-devel
mbrost has quit [Read error: Connection reset by peer]
mbrost has joined #dri-devel
sdutt_ has joined #dri-devel
jewins has quit [Remote host closed the connection]
jewins has joined #dri-devel
camus1 has quit [Ping timeout: 480 seconds]
sdutt has quit [Ping timeout: 480 seconds]
boistordu has joined #dri-devel
boistordu_old has quit [Ping timeout: 480 seconds]
camus1 has joined #dri-devel
camus has quit [Remote host closed the connection]
<airlied>
okay reproduced the container qemu crash locally but now have to work out how to debug/fix it
Company has quit [Quit: Leaving]
tarceri_ has quit []
tarceri has joined #dri-devel
<tarceri>
anholt_: this is normally handled on a per-driver basis, by handling flags and passing them into disk_cache_create()
camus has joined #dri-devel
camus1 has quit [Read error: Connection reset by peer]
Bennett has quit [Remote host closed the connection]
aravind has joined #dri-devel
camus1 has joined #dri-devel
camus has quit [Remote host closed the connection]
<mareko>
anholt_: you might have to add the flag into the shader cache key
punit has joined #dri-devel
sdutt_ has quit [Remote host closed the connection]
sdutt_ has joined #dri-devel
<mareko>
Kayden: see this mesa-dev thread from 2016 for some of the reasons why we use pb_slab: [Mesa-dev] [PATCH 00/14] radeon/winsyses: sub-allocation for small buffers
<mareko>
one thing that is different today compared to that thread is that we use pb_slab for textures now too
aravind has quit [Ping timeout: 480 seconds]
aravind has joined #dri-devel
jewins has quit [Read error: Connection reset by peer]
Duke`` has joined #dri-devel
<Kayden>
mareko: thanks! I'm still not entirely sure what all the pb_* framework really buys you
<Kayden>
the approach seems good, it just seems like a lot of boilerplate for not very much functionality
<Kayden>
i.e. if the common code wants to do anything with the base objects, it can't, so it resorts to vtbl functions for pretty much everything
itoral has joined #dri-devel
mattrope has quit [Remote host closed the connection]
sdutt_ has quit [Ping timeout: 480 seconds]
pnowack has joined #dri-devel
pnowack has quit [Remote host closed the connection]
pnowack has joined #dri-devel
dv_ has quit [Ping timeout: 480 seconds]
<airlied>
weird, seems like ppc64le f16 loads end up trying to indirect callout to something
<airlied>
have to dream up some sort of workaround for llvmpipe
dv_ has joined #dri-devel
danvet has joined #dri-devel
Duke`` has quit [Ping timeout: 480 seconds]
tobiasjakobi has joined #dri-devel
aravind has quit [Remote host closed the connection]
lemonzest has joined #dri-devel
tobiasjakobi has quit [Remote host closed the connection]
rpigott has quit [Remote host closed the connection]
rpigott has joined #dri-devel
frieder has joined #dri-devel
Erandir has quit [Remote host closed the connection]
Erandir has joined #dri-devel
aravind has joined #dri-devel
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
idr has quit [Ping timeout: 480 seconds]
pochu has joined #dri-devel
aravind has quit [Ping timeout: 480 seconds]
aravind has joined #dri-devel
YuGiOhJCJ has joined #dri-devel
airlied_ has joined #dri-devel
airlied has quit [Ping timeout: 480 seconds]
rasterman has joined #dri-devel
thellstrom has joined #dri-devel
airlied has joined #dri-devel
thellstrom1 has joined #dri-devel
airlied_ has quit [Ping timeout: 480 seconds]
thellstrom has quit [Ping timeout: 480 seconds]
mbrost_ has joined #dri-devel
lynxeye has joined #dri-devel
thellstrom1 has quit [Remote host closed the connection]
thellstrom has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
agd5f_ has joined #dri-devel
agd5f has quit [Ping timeout: 480 seconds]
tursulin has joined #dri-devel
<glennk>
airlied, qemu ppc64le?
Lucretia has joined #dri-devel
<MrCooper>
glennk: <airlied> ugggh ppc64le in CI crashes that aren't happening on real ppc64le hw I tested on ftl
thellstrom has quit [Remote host closed the connection]
thellstrom has joined #dri-devel
camus has joined #dri-devel
<MrCooper>
CI runs ppc64le binaries on x86 via qemu
pcercuei has joined #dri-devel
thellstrom has quit []
camus1 has quit [Read error: Connection reset by peer]
<glennk>
not sure how good its floating point emulation is, in particular half precision floats
<MrCooper>
yeah, we've hit qemu bugs in CI before
<HdkR>
qemu bugs are...fun
<glennk>
had a few of those over the years, my favorite was arm qadd typo:ed in qemu as add A,A
xexaxo has joined #dri-devel
xexaxo_ has quit [Ping timeout: 480 seconds]
mlankhorst has joined #dri-devel
lemonzest has quit [Quit: Quitting]
lemonzest has joined #dri-devel
Surkow|laptop is now known as Surkow
nchery has quit [Remote host closed the connection]
muhomor has joined #dri-devel
muhomor has quit [Remote host closed the connection]
Hi-Angel has joined #dri-devel
muhomor has joined #dri-devel
iive has joined #dri-devel
muhomor has quit []
phomes has quit [Quit: Page closed]
muhomor has joined #dri-devel
muhomor has quit []
muhomor has joined #dri-devel
muhomor has quit []
muhomor has joined #dri-devel
muhomor has quit []
muhomor has joined #dri-devel
elongbug has quit [Ping timeout: 480 seconds]
camus1 has joined #dri-devel
camus has quit [Remote host closed the connection]
pochu has quit [Ping timeout: 480 seconds]
camus has joined #dri-devel
zf has quit [Ping timeout: 480 seconds]
camus1 has quit [Read error: Connection reset by peer]
<mareko>
Kayden: we only use pb_slab and pb_cache, we don't care about the rest
<mareko>
Kayden: if you don't layer buffers like we do (pb_buffer* inside pipe_resource) or track buffer busyness for each suballocation, then it might not make sense
pochu has joined #dri-devel
elongbug has joined #dri-devel
<mareko>
we might consider forking pb_buffer and removing vtbl
zf has joined #dri-devel
camus1 has joined #dri-devel
camus has quit [Ping timeout: 480 seconds]
Namarrgon has quit [Ping timeout: 480 seconds]
Namarrgon has joined #dri-devel
cbaylis has joined #dri-devel
xexaxo has quit [Ping timeout: 480 seconds]
xexaxo has joined #dri-devel
zf has quit [Ping timeout: 480 seconds]
muhomor has quit [Remote host closed the connection]
muhomor has joined #dri-devel
itoral has quit [Remote host closed the connection]
camus has joined #dri-devel
zf has joined #dri-devel
sdutt has joined #dri-devel
sdutt has quit []
sdutt has joined #dri-devel
camus1 has quit [Ping timeout: 480 seconds]
mbrost_ has quit [Ping timeout: 480 seconds]
jewins has joined #dri-devel
pochu has quit [Ping timeout: 480 seconds]
zf has quit [Ping timeout: 480 seconds]
robertfoss_ has joined #dri-devel
robertfoss_ has quit [Remote host closed the connection]
vivijim has joined #dri-devel
pochu has joined #dri-devel
aravind has quit []
YuGiOhJCJ has quit [Quit: YuGiOhJCJ]
pochu has quit [Ping timeout: 480 seconds]
frieder has quit [Remote host closed the connection]
pochu has joined #dri-devel
<zmike>
danylo: you motivated me to track down that crash finally
<danylo>
=)
<zmike>
ZINK_DESCRIPTORS=lazy should work fine for the time being until that's fixed
<danylo>
yes, the crash is gone with lazy descriptors
<danylo>
zmike: There is a validation error with the same trace:
<danylo>
``` The pStrides[1] (0) parameter in the last call to vkCmdBindVertexBuffers2EXT is less than the extent of the binding for attribute 1 (16).```
<zmike>
yeah ignore that, I'm working on it
<zmike>
it's a tricky one
<danylo>
could it cause issues?
<danylo>
since the issue with the trace is vertex explosion
<zmike>
it's possible I suppose, but I haven't seen it cause issues in practice, so it'd have to be a hw thing
<danylo>
ok, thanks
mattrope has joined #dri-devel
<zmike>
the trace seems fine on radv
ngcortes has joined #dri-devel
ngcortes has quit [Remote host closed the connection]
lemonzest has quit [Ping timeout: 480 seconds]
elongbug has quit [Ping timeout: 480 seconds]
Duke`` has joined #dri-devel
mbrost has joined #dri-devel
flacks has quit [Ping timeout: 480 seconds]
pekkari has joined #dri-devel
<zmike>
any gallium experts know what the trick to sampling from a GL_ARB_texture_cube_map_array texture is?
<zmike>
thought I had it working but maybe not...
flacks has joined #dri-devel
<zmike>
ah, can they not be sampled as 2D_ARRAY?
flacks has quit [Ping timeout: 480 seconds]
lynxeye has quit [Quit: Leaving.]
camus1 has joined #dri-devel
pochu has quit [Ping timeout: 480 seconds]
flacks has joined #dri-devel
camus has quit [Ping timeout: 480 seconds]
nchery has joined #dri-devel
morenonatural has joined #dri-devel
pochu has joined #dri-devel
pochu has quit [Ping timeout: 480 seconds]
morenonatural has left #dri-devel [#dri-devel]
morenitonatural has joined #dri-devel
flacks has quit [Quit: Quitter]
morenitonatural has left #dri-devel [#dri-devel]
morenonatural has joined #dri-devel
<sravn>
mripard: Started to look at your nice patch-set. devm_drm_of_get_next() I like but a more descriptive name would be nice - like devm_drm_of_get_bridge()
morenonatural has quit []
gouchi has joined #dri-devel
morenonatural has joined #dri-devel
flacks has joined #dri-devel
camus has joined #dri-devel
camus1 has quit [Ping timeout: 480 seconds]
imirkin_ has joined #dri-devel
<imirkin_>
zmike: what's your question about cube map arrays?
<zmike>
I don't know that I know what I need to ask, so probably it'll get answered in the course of this rfc I'm putting up
<imirkin_>
you can use GL_ARB_texture_view to sample them as 2d arrays, but at the gallium level that requires you to have PIPE_CAP_SAMPLER_TARGET or something
<imirkin_>
zmike: ok
<imirkin_>
are you having a problem of some sort, or just trying to sort things out in your head?
<zmike>
problem
<imirkin_>
which is?
<zmike>
I'm typing it in this MR, hold on
<imirkin_>
ah ok
<imirkin_>
leave a link when you're done
<imirkin_>
sampling from a cube map is a way of having an array of cubes. so if you do it as 2d, you lose the cube-ness when sampling.
<imirkin_>
er, make that "sampling from a cube map array is ..."
glisse has quit [Read error: Connection reset by peer]
mareko has quit [Write error: connection closed]
mslusarz has quit [Remote host closed the connection]
dri-logger has quit [Write error: connection closed]
flacks_ has joined #dri-devel
flacks has quit [Ping timeout: 480 seconds]
lemonzest has joined #dri-devel
ngcortes has joined #dri-devel
morenonatural has quit [Ping timeout: 480 seconds]
<imirkin_>
zmike: so ... with images, there are (effectively) no cube arrays. so the "2d array" bit really oughtn't matter
<imirkin_>
without looking at it closer, my guess is that the problem has absolutely nothing to do with cube arrays
<imirkin_>
although .. hm. will have to check how the pbo code works
<imirkin_>
does it do texelFetch or imageLoad?
<zmike>
txf
<imirkin_>
zmike: where do you convert gallium sampler to something vulkan-y?
<zmike>
?
<zmike>
I don't
<imirkin_>
pipe_sampler_view -> something
<zmike>
zink can't use this right now
<zmike>
if that's what you mean
<imirkin_>
can't use what?
idr has joined #dri-devel
<zmike>
the MR
<imirkin_>
forget the MR
<imirkin_>
in zink in general, where do you convert pipe_sampler_view into something vulkan can understand?
<zmike>
uhhh zink_create_sampler_view?
<imirkin_>
right. what file might that be in?
<zmike>
zink_context.c
<imirkin_>
aha ok
<imirkin_>
thanks
<zmike>
not sure why you're looking at zink for this though?
<zmike>
or is this an unrelated query
<imirkin_>
oh, this is "in general"?
<zmike>
ah
<imirkin_>
i assumed the failure was on zink
<zmike>
no, like I said, zink can't use this
<imirkin_>
where do you see the failure?
<zmike>
iris and radeonsi
<imirkin_>
hrmph ok
<imirkin_>
they can't BOTH be wrong :)
<zmike>
no, I imagine they're not
<imirkin_>
lesseee here
<zmike>
I thought I had this case passing when I did the original implementation, but I'm too tired to try and bisect it across 4 months of mesa
<zmike>
and/or maybe the test just wasn't running back then
heat has joined #dri-devel
<zmike>
anyway gotta run for a bit, will be transient for the next while
<imirkin_>
zmike: yeah, there's some slightly dodgy things going on here.
<imirkin_>
zmike: i don't think glsl_get_sampler_coordinate_components does what you want it to, in your case
<imirkin_>
i'm not 100% sure how the nir sampling stuff deals with array indices
Peste_Bubonica has joined #dri-devel
<zmike>
I tried padding it out to 4 coords but that didn't help
<zmike>
using a cube array for the sampler
Hi-Angel has quit [Quit: Konversation terminated!]
Hi-Angel has joined #dri-devel
<imirkin_>
zmike: no, you would never have 4 coords there
<imirkin_>
you'd have 3 coords
<imirkin_>
but like ...
<imirkin_>
for 2d array you only have 2 coords
<imirkin_>
which seems like it might not be enough
<imirkin_>
unless the array index goes somewhere else
Peste_Bubonica has quit [Ping timeout: 480 seconds]
Peste_Bubonica has joined #dri-devel
<imirkin_>
i also don't immediately see where you're iterating over all the layers
<imirkin_>
(or how you indicate which layer you want)
<zmike>
it's passed in the constant buffer
<imirkin_>
ok, so one invocation only does 1 2d face at a time?
<zmike>
I think so? not at my pc now to check
<zmike>
sounds right tho
<imirkin_>
ok, that makes more sense
<imirkin_>
basically you need to unify how you treat CUBE/2D_ARRAY/CUBE_ARRAY... txf is a bit different than regular tex() in that regard
<imirkin_>
i'm not familiar enough with nir to know precisely how that's represented unfortunately
mareko has joined #dri-devel
muhomor has quit [Remote host closed the connection]
<zmike>
hm I thought I'd done that
<zmike>
guess not
<imirkin_>
in sampler_type_for_target, you return diff things for cube/2d array
riku has joined #dri-devel
<imirkin_>
and glsl_get_sampler_coordinate_components(2D) = 2, (CUBE) = 3
<zmike>
right but it's all flattened to 2d array by then
<zmike>
so it's not actually getting anything else
Hi-Angel has quit [Ping timeout: 480 seconds]
<imirkin_>
ah i see, so it is.
glisse has joined #dri-devel
<imirkin_>
zmike: ah i see. grid[2] = depth
<imirkin_>
so yeah. i just don't think you're ever passing in the array component.
<imirkin_>
and the fact that it's working for 2d arrays and cubes seems surprising.
<zmike>
hmm will have to check when I get back, but pretty sure I am
<imirkin_>
i'm just reading the code, not actually testing
<imirkin_>
so easily could be that some bit is working out differently than i expect
<imirkin_>
but if nothing else, it's worth a closer look
<zmike>
I haven't really looked at it in about 3 months, so I don't remember the exact details
<zmike>
always good to have a starting point
Company has joined #dri-devel
riku has quit [Ping timeout: 480 seconds]
<zmike>
imirkin: when you say "the array component" what exactly are you referring to
mlankhorst has quit [Ping timeout: 480 seconds]
riku has joined #dri-devel
<imirkin_>
zmike: the array index
<imirkin_>
like in a 2d_array thing, what is often referred to as "z"
<zmike>
okay, trying to disambiguate with z there since there's a lot of conflation
<zmike>
my brain
<imirkin_>
so when you're fetching from a 2d array (or cube or cube array), you have to indicate which 2d image you want to fetch from
<zmike>
yea
<imirkin_>
for texelFetch, a cube is no different than a 2d array
<imirkin_>
since there is no face selection
<zmike>
trying to remember
<zmike>
I think in nir this is just the z coord component
<imirkin_>
as opposed to with texture() where you select which face of the cube to sample from based on the spherical coordinates you give it
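The distinction imirkin_ draws here can be sketched in plain C. This is only an illustration, not Mesa or NIR code: the function names are hypothetical, the face order is the usual GL one (+X, -X, +Y, -Y, +Z, -Z), and the tie-breaking between equal axes is an assumption of the sketch. With texture() the face comes from the major axis of the direction vector; with texelFetch() there is no face selection, so the caller simply names a 2D layer, which for a cube array is cube index times six plus face.

```c
#include <math.h>

/* Face order follows the GL cube map convention: +X, -X, +Y, -Y, +Z, -Z. */
enum cube_face {
    FACE_POS_X, FACE_NEG_X,
    FACE_POS_Y, FACE_NEG_Y,
    FACE_POS_Z, FACE_NEG_Z
};

/* texture()-style sampling: the face is derived from the major axis of the
 * direction vector. Ties go to X, then Y (an assumption of this sketch). */
static int face_from_direction(float x, float y, float z)
{
    float ax = fabsf(x), ay = fabsf(y), az = fabsf(z);

    if (ax >= ay && ax >= az)
        return x >= 0.0f ? FACE_POS_X : FACE_NEG_X;
    if (ay >= az)
        return y >= 0.0f ? FACE_POS_Y : FACE_NEG_Y;
    return z >= 0.0f ? FACE_POS_Z : FACE_NEG_Z;
}

/* texelFetch()-style access: no face selection happens. The caller names the
 * 2D layer directly, exactly as it would for a 2D array. */
static int layer_for_texel_fetch(int cube_index, int face)
{
    return cube_index * 6 + face;
}
```

Seen this way, a plain cube is just the cube_index == 0 case of a six-layer 2D array, which is why the fetch path can treat CUBE, 2D_ARRAY and CUBE_ARRAY uniformly as long as the layer index is actually passed along.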
riku has quit [Ping timeout: 480 seconds]
pekkari has quit [Quit: Konversation terminated!]
pnowack has quit [Quit: pnowack]
riku has joined #dri-devel
<imirkin_>
zmike: anyways, you clamp the coordinates to the number of coord_components, which doesn't work with array textures since the function to determine it doesn't receive the arrayed thing
<imirkin_>
zmike: ohhh, wait. there's a glsl_sampler_type thing at the end. i think it'll all work if you just map cube/cube_array to 2D, and ensure that cube is treated as is_array = true.
<zmike>
I'm already doing that though...
<imirkin_>
no
<imirkin_>
sampler_type_for_target
<imirkin_>
erm
<imirkin_>
but by then it's already 2D_ARRAY isn't it
<imirkin_>
sigh
<imirkin_>
and the cube stuff is just there to confuse me
<imirkin_>
which it has done quite well
<zmike>
would you believe me if I said I wrote this whole thing in about a week of getting 3-4 hours of sleep each night?
<imirkin_>
hopefully you were out partying most of the day
<imirkin_>
:)
<zmike>
no, just insomnia
<zmike>
partying would have been better but also there's been a pandemic so
<imirkin_>
virtual party?
<imirkin_>
VR party?
<zmike>
smart
Hi-Angel has joined #dri-devel
mslusarz has joined #dri-devel
mslusarz has quit []
mslusarz has joined #dri-devel
pnowack has joined #dri-devel
DPA has quit [Quit: ZNC 1.8.2+deb2~bpo10+1 - https://znc.in]
<karolherbst>
I've heard there is something like VR chat :D
Koniiiik has joined #dri-devel
<karolherbst>
zmike: I basically wrote the entire nir stuff for nouveau in a week....
<Koniiiik>
Can someone point me in the right direction to generate an API trace for a mesa bug report?
<imirkin_>
Koniiiik: if you're doing stuff with wine or steam, check out the wiki there. it's not trivially obvious, sometimes
<karolherbst>
zmike: well.. it was christmas
<karolherbst>
but yeah
<karolherbst>
sometimes you are just writing that stuff
<Koniiiik>
imirkin_: Nah, this is a native Linux application.
<imirkin_>
Koniiiik: also make sure that whatever problem reproduces when replaying the trace
<Koniiiik>
imirkin_: Hm, before I go ahead with this, is a trace of any value if the problem is a segfault?
sdutt has quit [Remote host closed the connection]
<imirkin_>
Koniiiik: it can be
<Koniiiik>
Okay then!
<imirkin_>
ideally it should be obvious what's going on from the backtrace itself
<karolherbst>
as long as it segfaults replaying
<imirkin_>
but if it's not, a trace will allow a developer to hit the segfault as well, and figure out wtf is going on
<imirkin_>
right, if replaying the trace doesn't segfault, then it's of limited use
<Koniiiik>
Ah, understood.
DPA has joined #dri-devel
lemonzest has quit [Quit: Quitting]
<airlied>
glennk, MrCooper : oh I managed to reproduce on a real ppc64le, not sure how I messed up first time
<airlied>
anholt_, MrCooper : so aniso texturing blows some of the virgl traces out past the 300 piglit timeout limit, increasing the timeout to 400 lets them finish, but I could also just comment out those traces (some are already), any opinions?
<idr>
Are NIR optimization passes expected to clear pass_flags when they're done or before they start?
<idr>
(I'm assuming the latter.)
<pendingchaos>
I think before they start
<pendingchaos>
I don't remember seeing any pass clear pass_flags when it's done
<idr>
That's what I thought. Thanks. :)
camus has quit [Ping timeout: 480 seconds]
Danct12 has quit [Ping timeout: 480 seconds]
Daanct12 is now known as Danct12
<airlied>
imirkin_: nope, might take a closer look later
<imirkin_>
airlied: ok. i'm guessing ajax is out for a bit?
reductum has quit [Quit: WeeChat 2.8]
<airlied>
he is around, but his focus is moving about
sdutt has joined #dri-devel
nchery has quit [Remote host closed the connection]
<cmarcelo>
pendingchaos: I'm still missing something... seems to me we can't rely on !ACCESS_COHERENT to decide whether a load (for SSBO/global) can be ignored (and a loop with such load removed). i.e. patch should go as-is.
<cbaylis>
is there a simple example of a igt test to copy? I tried copying tests/core_getversion.c but if I add "igt_assert_eq(1, 2);" it still passes
<jekstrand>
That seems.... wrong.
<cbaylis>
I was surprised too
<imirkin_>
so 1 == 2 ... as we all know ...
<imirkin_>
does it take a bool or something?
<jekstrand>
zmike: For the subgroups MR: Is it even still needed? Or can you set .ballot_bit_size=64 .ballot_components=1 and walk away?
<zmike>
uhhhh
<jekstrand>
No, igt_assert_eq(1, 2) should definitely fail
<zmike>
I guess I'll check that tomorrow 🤔
<jekstrand>
cbaylis: I usually copy gem_basic or gem_exec_basic when making new i915 tests.
<jekstrand>
cbaylis: There's basically nothing interesting that's i915-specific in gem_basic.c
<danvet>
cbaylis, can you pastebin your test that passes with that?
<jekstrand>
cbaylis: Is your igt_assert() inside an igt_subtest?
<danvet>
jekstrand, without arguments we run them all by default
<danvet>
plus we should have asserts in place if you hand-roll everything and forget some setup pieces
<danvet>
there's even some tests for this stuff in igt/lib/tests
<danvet>
so if there's a funny hole, we should probably cover it
<jekstrand>
zmike: With cwabbott's subgroup changes for freedreno, it may "just work".
orbea has quit [Ping timeout: 480 seconds]
<zmike>
that would be nice
<danvet>
jekstrand, also no igt_subtests in igt_simple_main
<danvet>
this validates that you didn't screw up some of the igt things we can't easily check otherwise
<danvet>
so it runs your test
<danvet>
but in special modes that CI needs to enumerate tests and stuff like that
<danvet>
so this tells you "your test is looking good"
<danvet>
it does _not_ run the test itself
Ben has joined #dri-devel
<danvet>
for that there's an igt runner somewhere in igt which our CI uses to actually run the tests
<danvet>
cbaylis, outside of CI just run the resulting binary directly
<danvet>
also --help for common options
Ben is now known as Guest1543
<cbaylis>
ok, thanks!
<cbaylis>
test systems, always a learning curve :)
<zmike>
jekstrand: yea no, that still explodes vtn
<zmike>
so "just work" is sadly not something that is happening on this occasion
<jekstrand>
zmike: Ok. I believe you. I'm just confused as to why yet again
mbrost has quit [Ping timeout: 480 seconds]
<zmike>
OpGroupNonUniformBallot must return a uvec4
<zmike>
this is that same thing that came up in the original review
<zmike>
it's just a stupid circular rewrite
<jekstrand>
Yeah
<jekstrand>
But now I'm stuck trying to remember why we need to make changes.
<jekstrand>
nir_lower_subgroups *should* make all nir_intrinsic_ballot() have ballot_bit_size and ballot_components
<jekstrand>
At least I thought it did. (-:
<zmike>
because I need 32bit/4component from spirv, but then the rest of the shader is still using 64bit/1component from glsl
gouchi has quit [Remote host closed the connection]
<zmike>
I need both
<zmike>
not just one or the other
* jekstrand
is so confused
<jekstrand>
I know I grokked this at one point in time in the past. Honest, I did!
<zmike>
it's like running nir to spirv to do glsl spirv -> vk spirv
<zmike>
it feels very stupid
<zmike>
and is
<zmike>
but it still has to happen
<zmike>
I can put up a branch with all the changes for you to experience it with your own body if you must
<jekstrand>
zmike: For reviews of this sort, don't you think an out-of-body experience is better?
<zmike>
oh I thought that was just me
mbrost has joined #dri-devel
<anholt_>
airlied: oof. I would comment them out, though it's unfortunate
camus has joined #dri-devel
camus1 has quit [Ping timeout: 480 seconds]
<zmike>
jekstrand: so I don't want to rush you if your trip is still grooving, but I'm starting to come down from mine, and it still looks like I'm gonna need this pass
<jekstrand>
zmike: lol. Give me a few more minutes. Trying to rev an i915 patch set ATM.
<zmike>
no, no, like I said, you just keep on keepin on
<jekstrand>
zmike: If you set a breakpoint in the handling of nir_intrinsic_ballot, what are intrin->dest.ssa.bit_size/num_components?
<zmike>
jekstrand: like I said, it's 64/1
<jekstrand>
zmike: So it sounds to me like we just need to adjust uint_to_ballot_type to also support down-casting
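The two ballot shapes being discussed (64-bit/1-component on the GLSL side, 32-bit/4-component for OpGroupNonUniformBallot) carry the same bits for subgroups of up to 64 invocations. A minimal sketch of the conversion in plain C follows; it is not the NIR uint_to_ballot_type code, and the struct and function names are hypothetical. It assumes the SPIR-V convention that the first invocation sits in the lowest bit of the first component.

```c
#include <stdint.h>

/* Stand-in for a SPIR-V uvec4 ballot result (hypothetical type name). */
struct uvec4 {
    uint32_t x, y, z, w;
};

/* "Down-cast" a 64-bit, 1-component ballot into 32-bit, 4-component form.
 * Invocation 0 lands in bit 0 of x; invocations 32..63 land in y. */
static struct uvec4 ballot64_to_uvec4(uint64_t ballot)
{
    struct uvec4 v = {
        (uint32_t)(ballot & 0xffffffffu),
        (uint32_t)(ballot >> 32),
        0u,
        0u
    };
    return v;
}

/* The reverse direction, valid only while z and w are zero, i.e. for
 * subgroups of at most 64 invocations. */
static uint64_t uvec4_to_ballot64(struct uvec4 v)
{
    return (uint64_t)v.x | ((uint64_t)v.y << 32);
}
```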
<zmike>
I don't know what you mean by that, but is it really going to be less complex than my MR?
<jekstrand>
I'll assign marge tomorrow or once my Jenkins comes back. If I forget, remind me.
<zmike>
👍
yoslin has quit [Quit: WeeChat 3.2]
nchery has quit [Quit: Page closed]
<zmike>
thanks for helping get that out finally too
<zmike>
one of my oldest patches
<jekstrand>
sorry it took so long
<jekstrand>
But I like the fairly general thing we have now a lot better than the zink-specific solutions
zf has joined #dri-devel
<zmike>
sure
<zmike>
subgroups isn't exactly super high priority
Bennett has joined #dri-devel
ngcortes has quit [Ping timeout: 480 seconds]
alyssa has joined #dri-devel
<alyssa>
jekstrand: Is NIR supposed to CSE load_global when there aren't any stores in between?
<alyssa>
I have an SSBO `layout(std430, binding = 0) buffer Inputs { vec4 foobar; }`
<alyssa>
and then a shader `foobar = foobar * foobar + foobar;`
<alyssa>
and that compiles to 3 identical loads, instead of just 1
<alyssa>
although this is on the standalone compiler so maybe I'm missing an opt pass that st/mesa calls
<pendingchaos>
nir_opt_cse doesn't do that
<pendingchaos>
nir_opt_copy_prop_vars and nir_opt_load_store_vectorize can though
yoslin has joined #dri-devel
<alyssa>
got it
<alyssa>
thanks
ced117 has joined #dri-devel
cbaylis has quit [Ping timeout: 480 seconds]
iive has quit []
heat has quit [Remote host closed the connection]
<alyssa>
pendingchaos: yep, I was missing a call to nir_opt_copy_prop_vars, thanks 👍
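The effect alyssa was after (three identical loads collapsing to one when no store sits between them) can be shown with a toy model. This is not how nir_opt_copy_prop_vars is implemented; the instruction struct, the integer "addresses" and the function name are all made up for illustration, and any store is treated conservatively as invalidating every earlier load.

```c
#include <stddef.h>

enum op { OP_LOAD, OP_STORE };

struct instr {
    enum op op;
    int addr;      /* toy address; a real IR tracks derefs or offsets */
    int reuse_of;  /* for loads: index of an earlier identical load, or -1 */
};

/* Marks each load that can reuse an earlier load of the same address.
 * Any store conservatively invalidates all loads seen so far.
 * Returns the number of loads that still have to be issued. */
static int forward_loads(struct instr *prog, size_t n)
{
    int issued = 0;
    size_t window_start = 0; /* first instruction after the last store */

    for (size_t i = 0; i < n; i++) {
        if (prog[i].op == OP_STORE) {
            window_start = i + 1;
            continue;
        }

        prog[i].reuse_of = -1;
        for (size_t j = window_start; j < i; j++) {
            if (prog[j].op == OP_LOAD && prog[j].addr == prog[i].addr) {
                prog[i].reuse_of = (int)j;
                break;
            }
        }
        if (prog[i].reuse_of < 0)
            issued++;
    }
    return issued;
}
```

For the `foobar = foobar * foobar + foobar` shader this corresponds to three loads of one address followed by a single store, so only the first load is issued and the other two reuse it.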
danvet has quit [Ping timeout: 480 seconds]
camus1 has joined #dri-devel
camus has quit [Ping timeout: 480 seconds]
* alyssa
is bringing up support for a new instruction set in her NIR backend
<alyssa>
Any guesses which instruction set? 😋
<imirkin_>
x86
<imirkin_>
intel 4004!
<daniels>
Morello
<alyssa>
imirkin_: is the winner, 4004
flto has quit [Ping timeout: 480 seconds]
<imirkin_>
w00t! what do i win?
<alyssa>
fish
<imirkin_>
a brand new 4004 as my desktop? :)
* alyssa
slaps around imirkin_ with a large trout
flto has joined #dri-devel
<imirkin_>
Max. CPU clock rate 740-750 kHz
<jekstrand>
alyssa: It will cse if you use vars
<imirkin_>
that's basically the speed we get on nvidia anyways...
<alyssa>
jekstrand: Yeah.. I'm wondering if we need a helper to call all the common frontend NIR passes in the right order
<alyssa>
since all the standalone compilers are cargoculting them and getting them subtly wrong compared to mesa/st, never mind the vk drivers..
<jekstrand>
alyssa: *sigh* Maybe we should.
<jekstrand>
I've been trying to avoid it but things may be getting out of hand....
<alyssa>
jekstrand: Why avoid it?
<alyssa>
(Next bug on that list, "(foobar * -foobar) + foobar" is getting compiled as "fmul + fsub" instead of "fneg + ffma", in what subtle way did I screw up this time..)
<jekstrand>
zmike: Jenkins is happy with my subgroup MR. Let's give it 24 hours or so in case cwabbott or someone else wants to NAK or suggest a different approach. Then feel free to marge (or I will)
<zmike>
jekstrand: 👍
<jekstrand>
alyssa: My approach with NIR has always been "here is a toolbox for you to build a compiler" rather than "here is a compiler for you to plug into".
<jekstrand>
Not every optimization or lowering makes sense for everybody and different drivers may want different heuristics.
<alyssa>
jekstrand: If that approach were respected, mesa/st would do no optimization whatsoever, and almost no lowerings, and our compilers would consume the garbage that glsl emits
<alyssa>
that mesa/st does a ton of generic stuff, and each vk driver (not the compilers hooked up to the vk driver) does a ton of stuff, says there is a common subset? 🤷
<jekstrand>
alyssa: Yeah. I was pretty annoyed when that much common stuff was added to st/mesa. :P
<alyssa>
(Maybe you disagree with mesa/st running algebraic opts)
<alyssa>
--ah. k.
<jekstrand>
alyssa: But I may be swinging around.
<jekstrand>
alyssa: I just don't want more structs of doom if we can avoid them.
<alyssa>
Nod.
<jekstrand>
Like I think you could pull most of brw_nir_optimize() out into a helper and it would be fine.
pcercuei has quit [Quit: dodo]
<jekstrand>
There's a weird lower_flrp thing in there you might want to pull and the strange opt_peephole_select for vec4 tessellation stuff.
<jekstrand>
And the loop_indirect_mask
<alyssa>
can lower_flrp burn in a fire
<jekstrand>
So maybe it wouldn't be a struct of doom
mbrost has quit [Ping timeout: 480 seconds]
<alyssa>
jekstrand: relatedly, I hate that mesa/st calls nir_opt_peephole_select
<jekstrand>
I'm inclined to say maybe it can. It helped a small portion of shader-db by an average of 3.5 instructions per shader.
<jekstrand>
alyssa: See! Common optimizations are bad! >:-P
<idr>
ouch
<alyssa>
jekstrand: nir_opt_peephole_select is an exception to the rule
<alyssa>
in that it is inherently a _heuristic_ that depends entirely on hw behaviour
<jekstrand>
alyssa: Everything's a heuristic but I do get what you mean.
<jekstrand>
That one's particularly bad
<alyssa>
dead code elimination is not a heuristic
<alyssa>
:-p
<idr>
I guess it's time to bash idr code again.
<jekstrand>
I'd be fine with keeping nir_lower_flrp and just pulling it out of brw_nir_optimize() too.
<jekstrand>
We have code in there to ensure that it only runs once but it's in the loop which is a bit weird.
<idr>
That seemed the least painful way to get it properly ordered with other passes.
<idr>
Like opt_algebraic.
<jekstrand>
Do you remember what the ordering constraints were?
<jekstrand>
For instance, could we shut off the opt_algebraic lowering and run it after the first full brw_nir_optimize?
<idr>
Most passes should happen both before and after.
<jekstrand>
Yeah, but brw_nir_optimize gets called a minimum of 3 times in your average compile.
<idr>
Because it creates a bunch of new arithmetic that can be algebraic, dead code, CSE, etc. optimized.
<idr>
I guess I put it in the place that seemed most logical, and I made it only run once as an optimization.
<idr>
At the time, there didn't appear to be any value in doing differently.
<jekstrand>
That's fair. Thinking about how to make a generic opt loop changes that calculus a bit, I think.
mbrost has joined #dri-devel
<idr>
Other Gallium drivers use nir_lower_flrp too.
* zmike
sweats nervously
<jekstrand>
I'm not saying they shouldn't. Just that it seems like something we could say "call yourself" to avoid an options struct of doom on nir_opt_common().
<idr>
I feel like that ship has already sailed. :(
<idr>
There are already a huge pile of .lower_foo (including .lower_flrp##) that both trigger the lowering passes and prevent other (usually algebraic) optimizations.
<idr>
I don't love it either.
<jekstrand>
What can I say? It's one of my favorite windmills. :)
<alyssa>
I'm always good to NAK new .lower_foo options ...
<idr>
It seems like there's not very much (practical) middle ground between "it's a toolbox, do it yourself" and "check all the customization boxes."
<idr>
It's a very, very steep s-curve.
Hi-Angel has quit [Ping timeout: 480 seconds]
<mattst88>
I don't think I comprehend what's the practical difference
<jekstrand>
I guess we already have nir_shader_compiler_options::lower_flrpN so maybe we should just use that?
<jekstrand>
Some days, I kind-of wonder if we don't want to just have a bitset of what nir_op you support and call it a day.
<idr>
mattst88: One bit of shared code with a billion options to control it vs. copy-and-paste-and-modify for every driver.
<jekstrand>
I think if you find yourself changing the flow for different drivers, it'd be a problem
<jekstrand>
But if everyone wants the same passes in the same order just with slightly different knobs, knobs are probably ok.
<idr>
jekstrand: I seem to recall that we already have that too.
<idr>
Except the different drivers are Iris vs. i965 and ANV.
<jekstrand>
idr: I think there is a pass somewhere that does that, yes. (-:
<jekstrand>
idr: We don't change brw_nir_optimize() per-driver.
<idr>
No, but we do end up going through shared st code that does some optimizations in some different orders.
<idr>
...
<jekstrand>
*sigh* Yeah
<idr>
by virtue of not calling some that brw_nir_optimize does.
<mattst88>
welp, nir had a good run
<mattst88>
time for nnir
<zmike>
too many other letters
<zmike>
how about just nnnnnnnnn
<jekstrand>
"better IR". It leaves all the others out in the cold...
<alyssa>
jekstrand: already use BIR for bifrost
<alyssa>
mattst88: Humans clearly suck at coming up with IRs so we're now presenting nnir, which is powered by neural networks
<mattst88>
n²ir
<mattst88>
alyssa: I'm sold
<jekstrand>
Now, not only does no one know how the whole compiler works, no one even knows how the patches are written!
<alyssa>
jekstrand: Of course we know it's written by GitHub Copilot trained on LLVM.
<jekstrand>
lol
<mattst88>
monkeys banging on keyboards, using github copilot
<ccr>
:O
<jekstrand>
alyssa: So, we need to write a JIT to produce NIR to run NN in shaders to produce a compiler to compile shaders....
<alyssa>
jekstrand: yo, i heard you like shaders..
<ccr>
add shaders and STIR
<jekstrand>
alyssa: While we're at it, can we have the NN convert Mesa to rust?
<alyssa>
jekstrand: I was just going to use Microsoft Pilot for that...