<dschuermann>
airlied: there are two options: either clover can pass max_workgroup_size in case of variable, or it already calculates a suitable max depending on the required shared memory (otherwise, aco will have to do that)
sarnex has quit [Quit: Quit]
sarnex has joined #dri-devel
airlied has quit [Remote host closed the connection]
sdutt has quit [Read error: Connection reset by peer]
airlied has joined #dri-devel
dj-death has joined #dri-devel
mvlad has joined #dri-devel
Duke`` has quit [Ping timeout: 480 seconds]
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
fxkamd has quit []
lemonzest has joined #dri-devel
shankaru has quit [Quit: Leaving.]
<neonking>
hello everyone o/ i just made a mesa 22.0.1 build with libglvnd enabled but i'm not really getting the point of libglvnd
<neonking>
as i am running on BSD, i'm not expecting to run binary proprietary drivers, so is it really worth it ?
tzimmermann has joined #dri-devel
<neonking>
also, the point of my mesa build was to make a wayland-enabled build so i disabled x11 related options, which makes me wonder even more whether libglvnd is necessary||useful ?
jkrzyszt has joined #dri-devel
nchery has quit [Read error: Connection reset by peer]
JohnnyonF has joined #dri-devel
i-garrison has quit []
i-garrison has joined #dri-devel
jfalempe has joined #dri-devel
shankaru has joined #dri-devel
JohnnyonFlame has quit [Ping timeout: 480 seconds]
nchery has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
jkrzyszt has quit [Remote host closed the connection]
ahajda has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
dj-death has quit [Ping timeout: 480 seconds]
rpigott has quit [Ping timeout: 480 seconds]
rasterman has joined #dri-devel
Gorg has quit [Read error: Connection reset by peer]
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
rpigott has joined #dri-devel
jkrzyszt has joined #dri-devel
maxzor has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
shankaru has quit [Quit: Leaving.]
pcercuei has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
thellstrom has joined #dri-devel
apinheiro has joined #dri-devel
Daanct12 has quit [Quit: Leaving]
dj-death has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
dviola has joined #dri-devel
i-garrison has quit [Read error: Connection reset by peer]
i-garrison has joined #dri-devel
shankaru has joined #dri-devel
<bbrezillon>
kusma, jenatali: any objection to merging !15911? I have a bunch of other MRs depending on this one...
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
<jenatali>
bbrezillon: ugh notifications got disabled for me on that one. I can re-review in a few hours when I'm awake for real
itoral has quit [Remote host closed the connection]
<kusma>
bbrezillon: looks reasonable enough to me...
itoral has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
<bbrezillon>
jenatali: np, it can wait a few more hours ;-)
<bbrezillon>
kusma: thanks
rkanwal has joined #dri-devel
maxzor has quit [Ping timeout: 480 seconds]
jimjams has quit [Quit: Connection closed for inactivity]
flacks has quit [Quit: Quitter]
flacks has joined #dri-devel
icecream95 has quit [Ping timeout: 480 seconds]
pallavim has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
mclasen has joined #dri-devel
jkrzyszt has quit [Remote host closed the connection]
alatiera5 has quit []
alatiera5 has joined #dri-devel
jkrzyszt has joined #dri-devel
alatiera5 is now known as alatiera
<HdkR>
Anyone know if there were any DRM ioctl additions between kernel 5.16 and current 5.18-rc3? I've been busy this last month and haven't had the time to look.
<HdkR>
I'm sort of guessing virtio had some changes, but other than that?
shankaru has quit [Quit: Leaving.]
Company has joined #dri-devel
The_Company has joined #dri-devel
The_Company has quit []
Company has quit []
Company has joined #dri-devel
shankaru has joined #dri-devel
slattann has quit [Ping timeout: 480 seconds]
slattann has joined #dri-devel
rgallaispou1 has joined #dri-devel
rgallaispou has quit [Remote host closed the connection]
shankaru has quit []
shankaru has joined #dri-devel
MajorBiscuit has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
lemonzest has quit [Quit: WeeChat 3.4]
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
FireBurn has quit [Quit: Konversation terminated!]
<karolherbst>
jekstrand: yeah... dunno if 26 fails are better or not as with 128 read only images you get a full profile and some things might get tested in more depth
<karolherbst>
but if you have something which works, I can take a look and check
itoral has quit [Remote host closed the connection]
<karolherbst>
airlied: nice
itoral has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
<alyssa>
Dumb floating point question
<alyssa>
Is f2f16(f_32(f2f32(x), f2f32(y))) necessarily equal to f_16(x, y)?
<alyssa>
(where f is fsub in this case but I don't know if that matters)
<alyssa>
tarceri_: ^ investigating the bump shaders
<karolherbst>
alyssa: it's not, question is, if it matters
<karolherbst>
like imagine x and y being in this "normal in f32 but subnormal in f16 domain"
<karolherbst>
alyssa: f_f32 stands for any float op, correct?
<daniels>
jenatali: hmm, I'm not sure when this started happening, but windows-vs2019 builds are now pulling & building zlib patches with every build, rather than it being available system-wide ... is that new or am I just hallucinating?
<karolherbst>
anyway.. precision matters for floats and stuff can be weird. but GL generally doesn't care
<jenatali>
daniels: not new AFAIK
<karolherbst>
things can get even weirder if you have 1 / 0.00000000x (which is 0.0 in f16, but not 0 in f32 and the fp16 ops generates NaN)
<daniels>
jenatali: fair enough
<jenatali>
daniels: we could (and maybe should) add it to the container rather than pulling it at build time
itoral has quit []
<daniels>
jenatali: I only noticed because pulling it appears to be heroically slow
<daniels>
like, about as long as the entire Mesa compilation
<daniels>
(could just be a one-off)
<alyssa>
karolherbst: aaah, right..
<jenatali>
Ooh fun
<karolherbst>
alyssa: but we do a lot of optimizations with floats where the result does change, but things are fine, because people still see the "correct" thing
<alyssa>
yeah but some of these could cause unnoticeable differences that would cause a trace job to fail, yeah?
<karolherbst>
that's why all of this is quite hard to CI, because things do change, and people have to check if anything broke
<karolherbst>
alyssa: sure, that's why people check the results
<jenatali>
daniels: of course adding it to the container requires rebuilding llvm currently, which is also heroically slow
<alyssa>
tarceri_: ^^ I think you should update the checksum, then
<alyssa>
Slight scheduling differences lead to differences in FP16 vs FP32 use in that shader, and apparently that's not ok
<karolherbst>
uhh
<karolherbst>
scheduling as in "it affects future opts"?
<karolherbst>
because normally scheduling itself shouldn't affect that stuff
<karolherbst>
or at least I would assume it doesn't
<alyssa>
karolherbst: I thought the hw had a "fp16 or fp32" slot
<alyssa>
actually it's an fp32-only slot, that lets you fold in f2f32 on the inputs and f2f16 on the output for free
<karolherbst>
ahh
<karolherbst>
"fun"
<alyssa>
which means the scheduler is turning "fsub16(x, y)" into "f2f16(fsub32(f2f32(x), f2f32(y)))" and while that's legal in GLES it's not necessarily the same
<karolherbst>
yeah..
<karolherbst>
for GL that's probably totally fine
<karolherbst>
not sure about CL :) but you can also just say you support round to nearest but none of that denormal business
<karolherbst>
not sure how well that would work for the fp16 ext though
<alyssa>
if we wanted conformant CL on this chip (we don't), we'd disable this optimization
<alyssa>
or rather, gate it on !exact and pipe through exact into the IR
<karolherbst>
yeah.. probably
<alyssa>
^backend IR
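A minimal sketch of the difference discussed above, assuming a native fp16 subtract that flushes denormals to zero while the fp32 path keeps them. That flush behaviour is an assumption made for illustration, not a statement about the Midgard hardware, and the helper names (`fsub16_ftz`, `fsub_via_f32`, `flush16`) are hypothetical. Python's `struct` formats `e` and `f` are used only to round to binary16 and binary32.

```python
import math
import struct

F16_MIN_NORMAL = 2.0 ** -14  # smallest normal binary16 magnitude


def f2f16(x: float) -> float:
    """Round x to the nearest binary16 value, keeping denormals."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def f2f32(x: float) -> float:
    """Round x to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def flush16(x: float) -> float:
    """Assumed flush-to-zero model: fp16 denormals become a signed zero."""
    return math.copysign(0.0, x) if abs(x) < F16_MIN_NORMAL else x


def fsub16_ftz(x: float, y: float) -> float:
    """Hypothetical native fp16 subtract that flushes denormal inputs and results."""
    return flush16(f2f16(flush16(x) - flush16(y)))


def fsub_via_f32(x: float, y: float) -> float:
    """f2f16(fsub32(f2f32(x), f2f32(y))) with denormals preserved throughout."""
    return f2f16(f2f32(f2f32(x) - f2f32(y)))


# Both inputs are denormal in fp16 but normal in fp32.
x, y = 2.0 ** -15, 2.0 ** -16
print(fsub16_ftz(x, y), fsub_via_f32(x, y))
```

Under these assumptions the flushing fp16 path yields zero for this input pair while the widened path yields 2^-16, so a scheduler choice between the two forms can change the rendered bits even though both results are acceptable for GLES.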
<alyssa>
tomeu: It does raise some questions about the validity of tracie on t860
<tomeu>
well, right now what it does is to warn you of a possibly unintended change in rendering
<tomeu>
it cannot figure by itself if the change is "good" or "bad"
<alyssa>
sure, I know what the behaviour is, but it raises questions about whether it's acceptable to have midgard+tracie+fp16 in pre-merge CI
<tomeu>
we will need a morality ISA extension in CPUs before then
<alyssa>
(if correct changes to common NIR passes can change things in such a way that the scheduler does something slightly different and then the checksum changes imperceptibly)
<tomeu>
well, how does the value of the capacity to know of unintended changes in rendering compare to the hassle of reacting to them?
<daniels>
well, we could force it to be no-fp16, but then it's only testing things people don't run
<alyssa>
tomeu: TBD.
<alyssa>
This particular issue wasn't on my radar when we started doing trace testing, and in all that time, this is the first time it's caused a possible issue
<alyssa>
So maybe it's irrelevant
<alyssa>
Maybe we want fuzzy image comparisons (like dEQP) instead of straight up checksums, though...?
<tomeu>
hard to say, up to the maintainer of the driver I would say
<tomeu>
we can do fuzzy without much trouble, but then we are only shifting the problem slightly, not solving the fundamental one
<alyssa>
(In the case under discussion, tracie already says 0 failed pixels and it's unclear what changed at all to my human eyes.)
<tomeu>
but if we can reduce the hassle while maintaining the value, then it could be a good thing to do
<tomeu>
yeah, sorry, that's a bug I should fix
<alyssa>
what bug..?
<tomeu>
the JS that compares the image has a tolerance value, but the job fails without any tolerance
<alyssa>
Ahh, right
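To illustrate the checksum versus fuzzy comparison trade-off, here is a small sketch. It is not tracie's code: the RGBA8 byte layout, the SHA-256 hash, and the function names are all assumptions chosen for the example.

```python
import hashlib


def checksum(pixels: bytes) -> str:
    """Exact comparison: any single-bit difference changes the digest."""
    return hashlib.sha256(pixels).hexdigest()


def count_failed_pixels(ref: bytes, out: bytes, tolerance: int, channels: int = 4) -> int:
    """Fuzzy comparison: a pixel fails only if some channel differs by more than tolerance."""
    if len(ref) != len(out):
        raise ValueError("images must have the same size")
    failed = 0
    for i in range(0, len(ref), channels):
        if any(abs(ref[i + c] - out[i + c]) > tolerance for c in range(channels)):
            failed += 1
    return failed


# Two RGBA8 pixels; the second image differs by one step in one channel.
ref = bytes([10, 20, 30, 255, 200, 100, 50, 255])
out = bytes([10, 21, 30, 255, 200, 100, 50, 255])

print(checksum(ref) == checksum(out))               # exact match?
print(count_failed_pixels(ref, out, tolerance=2))   # failures with tolerance
print(count_failed_pixels(ref, out, tolerance=0))   # failures without tolerance
```

With a tolerance the one-step difference counts as zero failed pixels, while the digest comparison still reports a mismatch, which is the situation described in the conversation.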
tony[m]12 has joined #dri-devel
Daaanct12 is now known as Danct12
sdutt has joined #dri-devel
tony[m]12 has left #dri-devel [#dri-devel]
sdutt has quit []
sdutt has joined #dri-devel
rgallaispou has joined #dri-devel
<neonking>
gentle ping about some pointer to libglvnd role ?
<neonking>
maybe i'm asking the wrong place?
rgallaispou1 has quit [Ping timeout: 480 seconds]
agd5f has quit [Read error: Connection reset by peer]
agd5f has joined #dri-devel
shankaru has quit []
tlwoerner_ has quit []
tlwoerner has joined #dri-devel
Emmy_ has quit [Remote host closed the connection]
Emmy_ has joined #dri-devel
tonyk has quit []
tonyk has joined #dri-devel
alyssa has left #dri-devel [#dri-devel]
<marex>
robertfoss: thanks for the lt9211 review
<robertfoss>
marex: no worries, sorry about being a bit slow to get started.
<marex>
robertfoss: no worries
<robertfoss>
marex: it's in better shape than most drivers :)
<marex>
MESA-LOADER: failed to open swrast: /usr/lib/dri/swrast_dri.so: cannot open shared object file: Permission denied (search paths /usr/lib/dri, suffix _dri)
<marex>
what ?
<marex>
chromium what kind of oddity is this new
<marex>
*now
Haaninjo has joined #dri-devel
neonking has quit [Remote host closed the connection]
lemonzest has joined #dri-devel
alarumbe has joined #dri-devel
jewins has joined #dri-devel
neonking has joined #dri-devel
markyacoub has left #dri-devel [#dri-devel]
ella-0_ has joined #dri-devel
<jekstrand>
airlied: \o/
ella-0 has quit [Read error: Connection reset by peer]
<ajax>
neonking: the other thing glvnd (will) get used for is maintaining the drivers for pre-dx9 hardware like first-gen radeons and stuff
<ajax>
neonking: but yeah, if you're building your own everything anyway then glvnd doesn't win you anything, you can safely build without it
<ajax>
neonking: that said, i'm pretty sure you can get nvidia drivers for freebsd still, so if that's the bsd in question and you're planning on sharing this build and/or its recipe with others, maybe keep it
<neonking>
ajax, thanks for the detailed answer ! I'm running OpenBSD here, so unfortunately proprietary drivers aren't a thing. Trying to get a wayland enabled mesa here :)
fxkamd has joined #dri-devel
<jekstrand>
karolherbst: I got ahold of a friend on the CL team at Intel who's going to see if their driver does something funny for those normalized sampling tests.
<karolherbst>
cool :)
<karolherbst>
jekstrand: do you have a branch with all those sampler/image changes btw? I want to clean up my branch a little
alyssa has joined #dri-devel
<alyssa>
jekstrand: I kind of want to make nir_block_worklist generic and stick it in util/ (or compiler/util/)
<jekstrand>
karolherbst: rusticl/wip
<alyssa>
I keep open coding work list data structures, wrongly.
<alyssa>
it just treats nir_block as a black box *except* for ->index
<alyssa>
but maybe some macro abomination can work around that
<alyssa>
and then just `#define nir_block_worklist_push_head(w, block) u_block_worklist_push_head(w, block, nir_block, index)`
<alyssa>
again like we do with util/list
<alyssa>
that data structure makes liveness analysis a lot nicer, for example
<alyssa>
my drivers use sets which is dubious and bit me last night, ir3 is lazy about progress and thus does way more work than it should,
<jekstrand>
alyssa: I guess I'm fine with it.
<jekstrand>
alyssa: I'd call it u_worklist, not u_block_worklist, though. No reason why it needs to be blocks. :)
<alyssa>
lima/ppir is lazy like ir3
<alyssa>
v3d ""
<jekstrand>
alyssa: The only question I have is if we should assume a uint32 index or if we should have a u_worklist_link and then have nir_block have a `union { struct u_worklist_link worklist_link; uint32_t index; };`
<alyssa>
honestly dunno what aco does I can't read aco code
khfeng has quit [Ping timeout: 480 seconds]
<alyssa>
I think intel is lazy, at least I don't see an explicit worklist
<alyssa>
...point is, lots of drivers would benefit from u_worklist :-p
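A rough sketch of why a worklist keyed only on a block index helps liveness analysis. This is an illustration in Python, not Mesa's `nir_block_worklist` or the proposed `u_worklist`; the `Worklist` class, the dictionary block layout, and the `liveness` function are hypothetical. The only thing the worklist needs from a block is its index, which it uses to avoid queuing the same block twice.

```python
from collections import deque


class Worklist:
    """FIFO of block indices plus a 'present' flag per block to avoid duplicates."""

    def __init__(self, num_blocks: int):
        self.present = [False] * num_blocks
        self.queue = deque()

    def push_tail(self, index: int) -> None:
        if not self.present[index]:
            self.present[index] = True
            self.queue.append(index)

    def pop_head(self) -> int:
        index = self.queue.popleft()
        self.present[index] = False
        return index

    def is_empty(self) -> bool:
        return not self.queue


def liveness(blocks):
    """Backward liveness. 'use' holds values read before any write in the block."""
    n = len(blocks)
    preds = [[] for _ in range(n)]
    for i, block in enumerate(blocks):
        for s in block["succs"]:
            preds[s].append(i)

    live_in = [set() for _ in range(n)]
    worklist = Worklist(n)
    for i in reversed(range(n)):  # seed in reverse order for a backward analysis
        worklist.push_tail(i)

    while not worklist.is_empty():
        i = worklist.pop_head()
        live_out = set()
        for s in blocks[i]["succs"]:
            live_out |= live_in[s]
        new_in = blocks[i]["use"] | (live_out - blocks[i]["defs"])
        if new_in != live_in[i]:
            live_in[i] = new_in
            for p in preds[i]:  # only predecessors can be affected
                worklist.push_tail(p)
    return live_in


# B0 defines a; B1 reads a, writes b and loops on itself; B2 reads b.
cfg = [
    {"use": set(), "defs": {"a"}, "succs": [1]},
    {"use": {"a"}, "defs": {"b"}, "succs": [1, 2]},
    {"use": {"b"}, "defs": set(), "succs": []},
]
print(liveness(cfg))
```

Only blocks whose successors actually changed are revisited, which is the progress tracking that a "lazy" loop over all blocks until nothing changes does not have.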
<dschuermann>
we usually just use an index and have the blocks strictly enumerated
* alyssa
blinks
<dschuermann>
and all blocks are in an std::vector with program->blocks[block_idx] ;)
<alyssa>
sure... what do you use for a worklist though? a BITSET?
<dschuermann>
no, just the index. updating the index becomes something like worklist = std::max(worklist, preds[i])
<alyssa>
so with complex CF you reprocess some blocks unnecessarily?
<dschuermann>
I think in loops, we could skip nested cf as long as phis stay untouched, but otherwise... meh
<dschuermann>
maybe I can change it to idom on second iterations
<alyssa>
NIR's algorithm seems pretty reasonable
neonking has quit [Remote host closed the connection]
<dschuermann>
I think with a second index, you can entirely remove bitset or set. just keep track of what has yet to be visited
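One possible reading of the single-index scheme described above, sketched in Python rather than taken from aco: blocks are assumed to be numbered in program order, and the "worklist" is a single integer meaning that every block below it is still pending. The function name and block layout are hypothetical.

```python
def liveness_single_index(blocks):
    """Backward liveness where the worklist is just the highest pending block index + 1."""
    n = len(blocks)
    preds = [[] for _ in range(n)]
    for i, block in enumerate(blocks):
        for s in block["succs"]:
            preds[s].append(i)

    live_in = [set() for _ in range(n)]
    visits = [0] * n
    worklist = n  # every block with index < worklist still needs processing
    while worklist > 0:
        worklist -= 1
        i = worklist
        visits[i] += 1
        live_out = set()
        for s in blocks[i]["succs"]:
            live_out |= live_in[s]
        new_in = blocks[i]["use"] | (live_out - blocks[i]["defs"])
        if new_in != live_in[i]:
            live_in[i] = new_in
            for p in preds[i]:
                # Only a back edge (p >= i) actually raises the index.
                worklist = max(worklist, p + 1)
    return live_in, visits


# B0 defines a; B1..B3 form a loop (B3 jumps back to B1); B4 reads a.
cfg = [
    {"use": set(), "defs": {"a"}, "succs": [1]},
    {"use": set(), "defs": set(), "succs": [2]},
    {"use": set(), "defs": set(), "succs": [3]},
    {"use": set(), "defs": set(), "succs": [1, 4]},
    {"use": {"a"}, "defs": set(), "succs": []},
]
live_in, visits = liveness_single_index(cfg)
print(live_in, visits)
```

In this sketch a change at the loop header raises the index back to the latch, so every block in between is visited again whether or not its inputs changed. That is the reprocessing cost raised in the question about complex control flow, traded for not needing a set or bitset at all.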
<karolherbst>
alyssa: why did I read the CL spec :p
<jekstrand>
alyssa: Because it's useful, sadly.
<karolherbst>
it's annoying that most of the ieee stuff even has a good reason for being like that
<karolherbst>
jekstrand: we might need to revisit adding a "CL vs GL precision rules" flag or something :( it's just annoying that CL allows you to optimize according to weaker precision
<karolherbst>
well.. if you specify the compiler flag
<karolherbst>
there is also like "-cl-no-signed-zeros" :(
elongbug__ has quit [Read error: Connection reset by peer]
<karolherbst>
so you can even decide what opts to turn on
elongbug__ has joined #dri-devel
Guest2307 has quit [Ping timeout: 480 seconds]
i-garrison has quit [Remote host closed the connection]
i-garrison has joined #dri-devel
<cwabbott>
karolherbst: you know we have the same thing in vulkan, right?
<tango_>
FP contraction should be a different thing from inf/nan/signed-zero handling though, isn't it?
<tango_>
(and from rounding modes)
<karolherbst>
yeah.. should be
<karolherbst>
I need to take a look at all the details here at some point and do the right thing (tm)
<jekstrand>
Yes, but those tests trick us with redistribution rules.
<alyssa>
jekstrand: the other question with u_worklist is that it's designed for IRs that store blocks as a linked list
<karolherbst>
oh wow.. now I crashed my machine with -j4
<alyssa>
for backends that don't do much CF manipulation, it's more efficient to store as an array, in which case some of the complexity of nir_worklist wouldn't be needed
<alyssa>
(Namely, storing pointers to the blocks, as opposed to just a queue of integer indices)
<alyssa>
(and in that case there's no need for any macros/templating/etc)
elongbug_ has joined #dri-devel
<karolherbst>
jekstrand: I just hope that this clamping bug isn't like terribly annoying to fix
<alyssa>
OTOH, there's not much overhead from doing it nir_worklist style even if you have an array of blocks
<karolherbst>
like "you have to calculate coords in the kernel" kind of annoying
<karolherbst>
that's something I found inside Intel's stack
<karolherbst>
might be a good hint on what might go wrong :)
<karolherbst>
like maybe the filtering filter is always low for mesa
<karolherbst>
but besides that there doesn't seem to be anything they are doing... unless I missed it
<jekstrand>
karolherbst: Pretty sure iris sets to always high. :-/
<karolherbst>
ahh :(
<karolherbst>
maybe we have to set low in some cases
<karolherbst>
:D
<karolherbst>
seems like a debug var though
<karolherbst>
:(
<jekstrand>
We set it to FULL. I'll play with others.
tzimmermann has quit [Quit: Leaving]
<jekstrand>
karolherbst: Yeah, all modes there fail
<karolherbst>
:(
gouchi has joined #dri-devel
<karolherbst>
jekstrand: is anything of that implemented inside shaders? like coords adjustments? Or is that all the hw?
<jekstrand>
Should all be HW
<karolherbst>
mhh
<jekstrand>
OpenCL meeting is going on right now. I'm hoping to hear from bashbaugh after it's over.
<karolherbst>
cool
aravind has quit [Ping timeout: 480 seconds]
<cwabbott>
karolherbst: looks like contraction is per-kernel and zero/inf/nan is per-instruction
<karolherbst>
yeah.. possibly
<cwabbott>
except the intel extension allows contraction per-instruction, I think
<cwabbott>
that's annoyingly different to how nir works (which is derived from vulkan float_controls)
<karolherbst>
it's a bit hard to care about per instruction zero/inf/nan if nothing uses it though :) Although I would be interested in how that looks in the spirv tbh
<karolherbst>
like if you do the global falg
<karolherbst>
*flag
<cwabbott>
right, sounds like you still need to conservatively derive the global flag from the per-instruction flag
<cwabbott>
or sth like that
dllud has quit [Ping timeout: 480 seconds]
<karolherbst>
yeah.. dunno
<alyssa>
me: Now that I have u_worklist in bifrost, I can make this data flow pass so much simpler
<alyssa>
also me: Wait this pass doesn't need a worklist at all, just a trivial recursion
<alyssa>
Good work Alyssa from 2020
dllud has joined #dri-devel
stuartsummers has quit []
shankaru has quit [Quit: Leaving.]
MajorBiscuit has joined #dri-devel
mclasen has quit [Remote host closed the connection]
dllud has quit [Ping timeout: 480 seconds]
mi6x3m has joined #dri-devel
<mi6x3m>
hey friends, can anyone tell me where libGL is linked against libgallium-dri?
<mi6x3m>
or does it happen during runtime?
<anholt>
dri drivers are dlopened by GL at runtime.
dllud has joined #dri-devel
<mi6x3m>
seems to be different for gallium because it statically links all drivers
<mi6x3m>
or does it just delegate?
<anholt>
what I said is true, and all the gallium drivers are linked together into a single dri driver to save disk space.
<mi6x3m>
ok, so it all goes through loader.c as the old drivers?
<jekstrand>
karolherbst: That's what I did for panfrost and, given that iris uses memcpy for buffer_subdata, I don't figure it's worse. So maybe all that blorp code I wrote isn't really needed.
<jekstrand>
Still need to figure out clear_texture on panfrost but that probably does need to happen on the GPU.
Kayden has quit [Quit: reboot]
eletrotupi has quit [Remote host closed the connection]
clever has quit [Ping timeout: 480 seconds]
<mi6x3m>
anholt, I see it now, libgallium_dri is linked to lib<driver>_dri.so
<mi6x3m>
in install_megadrivers.py
<mi6x3m>
so when you load lib<driver>_dri.so you always load libgallium_dri and you get the respective config
mbrost has joined #dri-devel
eletrotupi has joined #dri-devel
<karolherbst>
jekstrand: cool
LexSfX has quit [Ping timeout: 480 seconds]
LexSfX has joined #dri-devel
mclasen has joined #dri-devel
Kayden has joined #dri-devel
<anholt>
whee, looks like my MR is going to time out on windows again.
shankaru has joined #dri-devel
dllud has quit [Read error: Connection reset by peer]
ybogdano has quit [Ping timeout: 480 seconds]
clever has joined #dri-devel
<alyssa>
wee :(
<karolherbst>
jekstrand: ohh.. I found something interesting in the compute runtime..
<Kayden>
Yeah, I've built IGC with upstream LLVM...11. It's been a while since I've looked at it.
<Kayden>
IIRC they have a patched version that tunes some thresholds, e.g. for copy propagation passes, away from CPUs with limited registers and toward GPUs with more, and things like that, but the patches aren't necessary to run
<airlied>
so all vals < 0 get rounded to -1.0?
<jekstrand>
airlied: Apparently
<jekstrand>
I can wire that up in iris but eewww
<karolherbst>
mhhh
<jekstrand>
I think iris is about to grow a new magic system value. Kayden, I'm sorry.
<karolherbst>
LOL
<karolherbst>
jekstrand: btw.. for CL runtimes we declare the COMPUTE_ONLY screen flag or whatever it's called
<karolherbst>
not sure if that's useful here or if we have to change the sampler API
<karolherbst>
probably don't want that for GL
<karolherbst>
or vk
<jekstrand>
Yeah, ver much no.
<jekstrand>
*very
<jekstrand>
But, also, kind-of meh. It's only for rectangle textures.
<jekstrand>
hrm...
<jekstrand>
Yeah, I think we only want this for CL
iive has joined #dri-devel
<jekstrand>
I may be able to key it off SHADER_STAGE_KERNEL
<karolherbst>
probably
<karolherbst>
yeah.. sounds like the better idea
<Kayden>
I'm not finding where this is actually applied
<jenatali>
Huh. Yeah nevermind. Guess I'd just assumed it said zero when I hooked it up
<karolherbst>
maybe it's phrased like "OOB image accesses return the color 0,0,0,0" or something
<karolherbst>
ahh
<airlied>
like it seems like GL should be able to hit that case as well
shankaru has quit []
<karolherbst>
the OpenCL C spec has more to say
<karolherbst>
airlied: GL probably doesn't care
<karolherbst>
Kayden, jenatali: "CLK_ADDRESS_CLAMP - out-of-range image coordinates will return a border color [66]" 66: "This is similar to the GL_ADDRESS_CLAMP_TO_BORDER addressing mode." :D
<airlied>
ah yeah clc already gets that right I think
<karolherbst>
it does
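A small sketch of the addressing behaviour quoted from the spec, using integer texel coordinates on a list-of-rows image. The function names are hypothetical and the default border value of (0, 0, 0, 0) is an assumption for the example; the spec ties the actual border color to the image format.

```python
BORDER = (0.0, 0.0, 0.0, 0.0)  # assumed border color for this example


def read_clamp(image, x, y, border=BORDER):
    """CLK_ADDRESS_CLAMP-like: out-of-range coordinates return the border color."""
    height, width = len(image), len(image[0])
    if 0 <= x < width and 0 <= y < height:
        return image[y][x]
    return border


def read_clamp_to_edge(image, x, y):
    """CLK_ADDRESS_CLAMP_TO_EDGE-like: coordinates are clamped to the nearest texel."""
    height, width = len(image), len(image[0])
    cx = min(max(x, 0), width - 1)
    cy = min(max(y, 0), height - 1)
    return image[cy][cx]


# A 2x2 image of RGBA tuples.
image = [
    [(1.0, 0.0, 0.0, 1.0), (0.0, 1.0, 0.0, 1.0)],
    [(0.0, 0.0, 1.0, 1.0), (1.0, 1.0, 1.0, 1.0)],
]
print(read_clamp(image, -1, 0))
print(read_clamp_to_edge(image, -1, 0))
```

The two modes agree for in-range coordinates and differ only outside the image, where the first returns the border color and the second repeats the edge texel.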
<karolherbst>
so now how to fix test_compiler features_macro ...
<karolherbst>
I fixed the header, but now it fails for fp64 :(
<alyssa>
panfrost has a lot of code to allow configuring the stride of linear images explicitly in vulkan... I can't find where Vulkan allows this though..?
<karolherbst>
"error: options cl_khr_fp64 and __opencl_c_fp64 are set to different values" :(
<karolherbst>
airlied: is there an API to disable CL exts in clang?
ybogdano has quit [Ping timeout: 480 seconds]
<karolherbst>
I don't even know what enables cl_khr_fp64
<airlied>
karolherbst: I think that's part of the pain of -base.h stuff
<karolherbst>
it's not
<karolherbst>
well.. fp64 isn't
<karolherbst>
for the other fails I fixed the header
<airlied>
I've vague memories of tracing through that one, but no ideas of what it was now
<karolherbst>
but something enables the fp64 ext and it's not the header
<karolherbst>
and I checked with clinfo that I don't advertise the ext either