ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
<karolherbst>
ohh no.. I found the bug with sub buffers
TJ_Mercier has joined #dri-devel
nchery is now known as Guest1724
nchery has joined #dri-devel
mbrost_ has quit [Ping timeout: 480 seconds]
Guest1724 has quit [Ping timeout: 480 seconds]
khfeng has joined #dri-devel
ybogdano has quit [Ping timeout: 480 seconds]
rkanwal has quit [Ping timeout: 480 seconds]
mbrost has joined #dri-devel
iive has quit []
co1umbarius has joined #dri-devel
mhenning has joined #dri-devel
columbarius has quit [Ping timeout: 480 seconds]
<karolherbst>
sub buffers fixed :)
apinheiro has quit [Ping timeout: 480 seconds]
danct12_ has joined #dri-devel
heat has joined #dri-devel
danct12_ has quit []
danct12_ has joined #dri-devel
danct12_ has quit []
Daanct12 has joined #dri-devel
nchery has quit [Read error: Connection reset by peer]
nchery has joined #dri-devel
<jekstrand>
karolherbst: \o/
nchery has quit [Read error: Connection reset by peer]
rpigott has quit [Remote host closed the connection]
nchery has joined #dri-devel
rpigott has joined #dri-devel
sdutt has quit [Read error: Connection reset by peer]
maxzor has quit [Ping timeout: 480 seconds]
<karolherbst>
jekstrand: soo.. slowly I am running out of stuff to fix :)
mbrost has quit [Read error: Connection reset by peer]
<karolherbst>
two things which are hit by a lot of tests: 1. host ptr support for array and 3d images 2. use_host_ptr sometimes fails on weird alignments
shankaru has quit [Read error: Connection reset by peer]
abhinav__ has joined #dri-devel
abhinav__5 has joined #dri-devel
abhinav__ has quit []
abhinav__5 has quit []
abhinav__5 has joined #dri-devel
abhinav__ has joined #dri-devel
paulk has joined #dri-devel
MajorBiscuit has joined #dri-devel
Major_Biscuit has joined #dri-devel
<airlied>
karolherbst, jekstrand : okay I've rebased my amd nir compute backend support, prelim nir/clover patches are in 15876
<airlied>
no images 9 out of 95 basic passing
MajorBiscuit has quit [Ping timeout: 480 seconds]
rpigott has quit [Read error: Connection reset by peer]
jkrzyszt has joined #dri-devel
rpigott has joined #dri-devel
OftenTimeConsuming has quit [Remote host closed the connection]
OftenTimeConsuming has joined #dri-devel
pnowack has joined #dri-devel
JohnnyonFlame has quit [Read error: Network is unreachable]
JohnnyonFlame has joined #dri-devel
<airlied>
dolphin: drm-next is pushed out now
<dolphin>
airlied: thanks, will send early -fixes PR tomorrow due to easter holidays
lynxeye has joined #dri-devel
<dolphin>
last week there were none picked up, this week there is one patch
maxzor has joined #dri-devel
apinheiro has joined #dri-devel
<remexre>
does anyone know what's blocking VK_EXT_acquire_drm_display support on more drivers? I haven't really dug into the mesa code before, but from c8ed5ac206a7 and 2fe2eb1911f4 it *looks* pretty trivial
<remexre>
(and if I'm not missing something and it is pretty trivial, is this something that'd be a good first commit?)
shashanks has joined #dri-devel
shashank_s has joined #dri-devel
shashank_sharma has quit [Ping timeout: 480 seconds]
shashanks has quit [Ping timeout: 480 seconds]
<emersion>
yes it should be pretty trivial to wire up
<emersion>
yup, would be a good first contribution
<emersion>
feel free to CC me for a review
<remexre>
okay, thanks!
mvlad has joined #dri-devel
seanpaul has quit [Ping timeout: 480 seconds]
seanpaul has joined #dri-devel
rasterman has joined #dri-devel
i-garrison has quit [Read error: Connection reset by peer]
i-garrison has joined #dri-devel
Company has joined #dri-devel
rkanwal has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
shashank_s has quit [Ping timeout: 480 seconds]
mclasen has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
Kayden has quit [Read error: Connection reset by peer]
Kayden has joined #dri-devel
itoral has quit [Remote host closed the connection]
flacks has quit [Quit: Quitter]
flacks has joined #dri-devel
devilhorns has joined #dri-devel
libv has quit [Read error: Connection reset by peer]
<daniels>
marex: I'm not involved at all with bridge stuff; pinchartl is the one who mostly deals with that
shashank_sharma has joined #dri-devel
shashank_s has joined #dri-devel
shashanks has quit [Ping timeout: 480 seconds]
shashank_sharma has quit [Ping timeout: 480 seconds]
devilhorns has quit [Remote host closed the connection]
devilhorns has joined #dri-devel
MajorBiscuit has joined #dri-devel
Major_Biscuit has quit [Ping timeout: 480 seconds]
Major_Biscuit has joined #dri-devel
MajorBiscuit has quit [Ping timeout: 480 seconds]
tango_ has quit [Ping timeout: 480 seconds]
tango_ has joined #dri-devel
sdutt has joined #dri-devel
<zmike>
MrCooper: looking into that log error, it seems like I'd need to programmatically #define the string name in meson
<zmike>
but my meson-fu only extends to static #defines
<MrCooper>
not sure offhand how best to deal with that, sorry
<zmike>
yeah I'm writing it into the ticket
<zmike>
maybe dcbaker can rescue us
Danct12 has quit [Remote host closed the connection]
Danct12 has joined #dri-devel
pcercuei has joined #dri-devel
jewins has joined #dri-devel
apinheiro has quit [Ping timeout: 480 seconds]
mbrost has joined #dri-devel
maxzor has quit [Ping timeout: 480 seconds]
Major_Biscuit has quit [Ping timeout: 480 seconds]
Net147 has quit [Ping timeout: 480 seconds]
ramaling has quit []
ramaling has joined #dri-devel
ella-0 has joined #dri-devel
nchery has quit [Read error: Connection reset by peer]
ella-0_ has quit [Remote host closed the connection]
Net147 has joined #dri-devel
<jekstrand>
airlied: Have you looked at hooking up ACO?
maxzor has joined #dri-devel
ramaling has quit []
ramaling has joined #dri-devel
pcercuei_ has joined #dri-devel
pcercuei has quit [Read error: Connection reset by peer]
Major_Biscuit has joined #dri-devel
pnowack has quit [Quit: pnowack]
<jekstrand>
Kayden: Were you going to do any more review on !15829? You left some detailed comments and I think I addressed them.
<karolherbst>
jekstrand: what's the deal with the binding table in iris? Does it contain like everything? Because I have tests crashing on that having too many entries for the tests using 32 images
<jekstrand>
karolherbst: Yeah, we're going to need to do something about iris' binding tables. I can work on that today, if you'd like.
<jekstrand>
It'll mean some driver surgery but I don't think it'll be that bad.
<jekstrand>
Basically, I need to separate samplers and textures and make it look at variables or something instead of relying on textures_used.
<karolherbst>
mhh, I think it might make more sense to fix that stuff for llvmpipe first (because we have to bump mesa core limits anyway)
<jekstrand>
But I'm not sure how much gallium deletes before we get into iris.
<jekstrand>
But iris needs to be the premier CL driver. :P
<karolherbst>
:D
<karolherbst>
sure
<karolherbst>
but llvmpipe has that stuff split already
<karolherbst>
so I'd just make mesa able to support 128 textures (or samplers?) and then we can fix iris
mdroper has joined #dri-devel
<karolherbst>
there are two issues which cause way more fails/crashes anyway: use_host_ptr on arrays + 3D and use_host_ptr failing on weirdly aligned ptrs for buffers and 1d images
<karolherbst>
like if the alignment is 0x1
<karolherbst>
those two things would probably kill ~50% of the fails I still have
<jekstrand>
karolherbst: Yeah, I'm going to try and look at those today too.
<karolherbst>
cool
<jekstrand>
karolherbst: RE: weirdly aligned pointers, Intel HW can't handle writes to images which aren't pixel-aligned. Texturing should work, though.
<karolherbst>
mhh, okay, let me see what the pointer looks like for the 1d image cases
Major_Biscuit has quit [Ping timeout: 480 seconds]
ramaling has quit [Quit: leaving]
<karolherbst>
ohh wait, no, I mistook that one issue for image_buffers not really supported atm
<karolherbst>
so I think use_ptr on 1D images are fine
yogesh_mohan has quit [Ping timeout: 480 seconds]
<karolherbst>
but iris seems to suffer from the same issues llvmpipe does for texture clamping :(
<jekstrand>
karolherbst: I'm going to post the list of failing tests once this CTS run finishes. It's 76% done
<karolherbst>
I can do that as well :p
<karolherbst>
it's already done here
<jekstrand>
Oh, ok then.
<jekstrand>
I've also got a list of open iris MRs on there.
<karolherbst>
yeah. I usually just pull in those commits into my branch
<karolherbst>
list added
shashank_s has quit [Ping timeout: 480 seconds]
<jekstrand>
Thanks!
<karolherbst>
given that's a full CL 3.0 run the list isn't all that huge anymore :)
<karolherbst>
printf is annoying to fix
<jekstrand>
:D
<karolherbst>
so is this linker bug
<karolherbst>
ignoring those two things, it's really not much left anymore
<jekstrand>
One of these days, I need to make nir_lower_conversions stop lowering all the things. Intel can do a bunch of it in HW.
<karolherbst>
yeah, nvidia can do like everything in hardware as long as you don't step over 32 bit
nchery has joined #dri-devel
<karolherbst>
if you step over 32 you need to do two conversions
<jekstrand>
Right
<jekstrand>
Intel has a similar issue
<karolherbst>
so 64 -> 8 would be 64 -> 32 -> 8
<karolherbst>
okay
<jekstrand>
We can't do 64 -> 8
<jekstrand>
We might be able to do 64 -> 16 but I'm not sure
<karolherbst>
yeah, not sure about that either
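The two-step conversion mentioned above can be sketched as follows. This is only an illustration for unsigned integer narrowing, where going through 32 bits gives the same result as a direct 64 -> 8 truncation; the function names are made up for this sketch and are not NIR or hardware opcodes, and float conversions (where an intermediate rounding step can matter) are not covered.

```python
def truncate(value: int, bits: int) -> int:
    """Keep only the low `bits` bits, like an unsigned narrowing conversion."""
    return value & ((1 << bits) - 1)


def u64_to_u8_direct(value: int) -> int:
    # What a single 64 -> 8 conversion would produce, if the HW had one.
    return truncate(value, 8)


def u64_to_u8_two_step(value: int) -> int:
    # Hypothetical lowering: 64 -> 32 first, then 32 -> 8.
    return truncate(truncate(value, 32), 8)
```

For integer truncation both paths agree because the low 8 bits survive the intermediate 32-bit step untouched.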
<jekstrand>
Well, not on TGL since there is no 64... :sob:
<karolherbst>
maybe we got the same IP blocks :P
<karolherbst>
another thing is fma
<karolherbst>
but...
<karolherbst>
fma is a mess in mesa, so
<jekstrand>
Yeah...
<jekstrand>
That's on the list of things we need to fix for real one of these days.
<karolherbst>
most comes from the assumption that glsl would know fma, but it doesn't
<jekstrand>
Yeah, GLSL is the mess. CL is pretty sane, IMO.
<karolherbst>
yep
<karolherbst>
having two ops is the only sane path here
<jekstrand>
I think so
<karolherbst>
everything else is just hacks over hacks
<jekstrand>
There may also be some need somewhere for something like "exact" but "ieee-correct". We might be able to do it with that, maybe.
Duke`` has joined #dri-devel
<karolherbst>
mhh
<karolherbst>
yeah exact doesn't mean ieee correct
<karolherbst>
again, glsl doesn't know fma, so an "exact" thing just means to stay consistent
<karolherbst>
if your hw can't do fma, then exact is meaningless
<jekstrand>
Yup
<jekstrand>
Not quite meaningless.
<karolherbst>
why not?
<karolherbst>
sure, for other ops
<jekstrand>
Exact means "don't change the value when you optimize" That has a meaning regardless of whether fma is high-precision or not.
<jekstrand>
But, yeah, I think we want separate fma and fmad.
<jekstrand>
*ffma
<karolherbst>
ohh sure, but you can split/merge it all you want
<jekstrand>
yup
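To make the difference between the two ops concrete, here is a small sketch of a fused versus an unfused multiply-add on doubles. It emulates the fused result by computing exactly with `Fraction` and rounding once at the end; the function names are illustrative only and do not correspond to actual NIR opcodes.

```python
from fractions import Fraction


def mul_add_unfused(a: float, b: float, c: float) -> float:
    # The product is rounded to a double before the add: two roundings.
    return (a * b) + c


def mul_add_fused(a: float, b: float, c: float) -> float:
    # Emulated fused op: exact product and sum, then a single rounding.
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


# Inputs chosen so the exact product is 1 - 2**-60, which rounds to 1.0.
a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0
```

With these inputs the unfused form loses the small term entirely, while the fused form keeps it, which is why merging or splitting the two silently changes results.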
<dj-death>
oh crap
<dj-death>
a nir_opt_if() bug :(
<karolherbst>
the best ones
<jekstrand>
dj-death: Ouch
<dj-death>
where is my hack that prints out what nir line modified an instruction
Net147 has quit [Ping timeout: 480 seconds]
<karolherbst>
mhh I think I have to modify clc :(
<jekstrand>
karolherbst: Mind fixing some other stuff while you're at it?
Net147 has joined #dri-devel
<karolherbst>
jenatali: if somebody at MS has some internal CL 3.0 support patches to wire up OpenCLExtensionsAsWritten, now would be the best time to show them :p
<karolherbst>
jekstrand: what exactly?
<jenatali>
Hm?
<karolherbst>
jenatali: yeah so.. we have to disable/enable opencl c features via c->getTargetOpts().OpenCLExtensionsAsWritten
<jekstrand>
karolherbst: generic variants of arithmetic built-ins and wait_group_events
<karolherbst>
ohh
<karolherbst>
I meant src/compiler/clc
<jekstrand>
Oh
<karolherbst>
not libclc
* jekstrand
had his hopes up
<karolherbst>
:D
ybogdano has joined #dri-devel
<karolherbst>
jenatali: I guess we can just disable everything?
<swick>
ajax: I heard you use silverblue. Any tips and tricks for kernel development? I'm not really happy with rebuilding a new rpm for every change.
<jenatali>
Yeah, my thinking was to disable everything in the clc code that the LLVM/SPIR-V/NIR compiler stack doesn't support, and then leave it up to the frontend to disable stuff that the compiler can support but the frontend doesn't
<karolherbst>
at some point we have to pass in lists with supported things, but atm I don't support anything anyway
<karolherbst>
I guess bad things can happen indeed
<karolherbst>
emit_state as well
<karolherbst>
yeah.. I will try to tackle multithreading stuff over the weekend, this might get a bit annoying as I'd have to start to wrap more of our stuff
<karolherbst>
I turned on Send for Event task once and got ~300 compiler errors :(
<jekstrand>
Ok, according to the little atomic I added, we really are burning 4GB
<jekstrand>
Oof
<jekstrand>
Seems a bit insane
<karolherbst>
:(
<jekstrand>
That's a lot of quite big shaders
<karolherbst>
well
<karolherbst>
it's clc stuff
<karolherbst>
although long math shouldn't cause such huge shaders
<jekstrand>
Have you looked at those kernels? You can't get much simpler.
<karolherbst>
yeah...
<karolherbst>
I don't think we leak though
<karolherbst>
I see my mem consumption jump around a little
<karolherbst>
but it never exceeds a certain value
<jekstrand>
Oh, I'm now very sure something's leaking. :)
<jekstrand>
Something around kernels, specifically. I'm just not sure what yet.
<karolherbst>
I doubt it
<karolherbst>
the memory gets freed
gouchi has joined #dri-devel
<karolherbst>
it just uses a lot of memory here
<karolherbst>
but it does run
<karolherbst>
jekstrand: it probably just has a huge buffer for the results
<jekstrand>
That could be
<jekstrand>
No, that's not it
<jekstrand>
It's burning 4GB of shader memory
<karolherbst>
I see the memory consumption drop by a lot after a subtest finished
<karolherbst>
mhh
* jekstrand
wonders if we're leaking contexts
jessica_24 has joined #dri-devel
mszyprow has quit [Ping timeout: 480 seconds]
<jekstrand>
Or if it's creating a context per test submission, effectively.
<karolherbst>
I see multiple annoying races though
<karolherbst>
soo
<karolherbst>
you know you can't trust reporting, right?
paulk has joined #dri-devel
<karolherbst>
like if the runtime races on the "allocated" value then that can just lead to broken values down the line
<karolherbst>
vma.c also races.. uhh
<dcbaker>
zmike: which issue?
<jekstrand>
vma.c doesn't race. It's protected by a mutex
<karolherbst>
yeah well
<karolherbst>
it does
<jekstrand>
according to what?
* dcbaker
is excited for problems that are not dead firewalls
<karolherbst>
tsan at least reports a few things
<jekstrand>
Looks like the test creates 2037 contexts before dying
<karolherbst>
mhh
<karolherbst>
that would be annoying
<zmike>
dcbaker: daniels beat you to it
<karolherbst>
it could be that we reference the context somewhere we shouldn't? mhh
<zmike>
get back to that dead firewall
<karolherbst>
jekstrand: context as in pipe_context?
<jekstrand>
karolherbst: I'm checking now to see if it cleans them up
<jekstrand>
karolherbst: Yes, pipe_context
<jekstrand>
Hrm... It says it destroys 2027 of them
<karolherbst>
sounds sane
<karolherbst>
running with tsan makes stuff not crash here, so I'd assume we have some ugly races here and there
<jekstrand>
So I should only have 10 instances of my 6MB scratch buffer live at any given time
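The create/destroy bookkeeping behind those numbers can be sketched like this. It is a hypothetical counter in the spirit of "the little atomic I added", not the actual instrumentation; the class and method names are assumptions, and a lock stands in for an atomic so that concurrent updates don't corrupt the count.

```python
import threading


class LiveCounter:
    """Hypothetical debug counter for objects such as pipe_contexts."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.created = 0
        self.destroyed = 0

    def on_create(self) -> None:
        with self._lock:
            self.created += 1

    def on_destroy(self) -> None:
        with self._lock:
            self.destroyed += 1

    def live(self) -> int:
        # Objects created but not yet destroyed.
        with self._lock:
            return self.created - self.destroyed
```

With 2037 creations and 2027 destructions, `live()` reports 10, so ten 6MB scratch buffers would account for roughly 60MB, far from the 4GB observed.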
<jekstrand>
Uh...Yeah, iris is leaking scratch surfaces for some reason
<jekstrand>
awesome.
<karolherbst>
jekstrand: btw, is iris_bufmgr.c protected?
<karolherbst>
although looks like it
<karolherbst>
weird
<karolherbst>
the tooling around figuring out races is really terribly bad anyway :(
<karolherbst>
valgrinds solution to not knowing that atomics don't race is "just annotate your code, duh"
alyssa has joined #dri-devel
<dcbaker>
zmike: I finally got it working. Spent from 7:30 yesterday till about 22:30 but finally got it working again... mostly
<dcbaker>
still haven't figured out if it was the CPU or the motherboard that died
<zmike>
dcbaker: yikes, that sounds awful
<jekstrand>
karolherbst:
<jekstrand>
long_math passed
<jekstrand>
PASSED sub-test.
<jekstrand>
PASSED test.
<dcbaker>
yeah, kernel panic within 5 minutes of booting
<alyssa>
that gets a fake (mali) render node and then loads the m1 driver as a fake software rast
<alyssa>
which tricks shader-db into using the agx compiler
<alyssa>
despite no render nodes whatsoever
<alyssa>
wait, no, now it's running panfrost.
<alyssa>
gah
TJ_Mercier has quit [Remote host closed the connection]
* alyssa
supposes she should start stubbing a DRM interface for Asahi, now that work is happening on the kernel
frieder has quit [Remote host closed the connection]
jkrzyszt has quit [Ping timeout: 480 seconds]
* alyssa
copies piles of code from panfrost
iive has joined #dri-devel
pendingchaos_ has joined #dri-devel
pendingchaos has quit [Ping timeout: 480 seconds]
Peste_Bubonica has joined #dri-devel
devilhorns has quit []
rpigott has joined #dri-devel
mbrost has quit []
<Kayden>
jekstrand: part of what I was trying to get at with my comment on !15829 is that if OpenCL is using uclz or ifind_msb on non-32-bit datatypes...it may be broken. at least it would be on 64-bit types
<Kayden>
jekstrand: because the nir constant expression handling hardcodes 31
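The kind of problem described here can be sketched as below. This is an assumed shape of the bug for illustration only, not the actual NIR constant-folding code: a most-significant-bit search that starts at a hardcoded bit 31 never sees bits 32..63 of a 64-bit value. Both function names are hypothetical.

```python
def find_msb_hardcoded(value: int) -> int:
    # Starts at bit 31 regardless of the operand width (the assumed bug).
    for bit in range(31, -1, -1):
        if value & (1 << bit):
            return bit
    return -1


def find_msb(value: int, bit_size: int) -> int:
    # Width-aware variant: start from the top bit of the actual type.
    for bit in range(bit_size - 1, -1, -1):
        if value & (1 << bit):
            return bit
    return -1
```

For 32-bit inputs the two agree; for a 64-bit input with only bit 40 set, the hardcoded version reports "no bit set".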
<alyssa>
Ruh roh
<alyssa>
Kayden: good chance to fix int8/int16 too ;)
<Kayden>
those might work
<alyssa>
the lack of an in-tree agx disassembler is really starting to bite
<Kayden>
jekstrand: posted R-b for patches 2-3, I'm not planning to review patch 1, feel free to land it
<alyssa>
am I going to take the isaspec plunge? or open code yet another disasm? idk
<karolherbst>
alyssa: I'd use isaspec
<jekstrand>
Kayden: uh... WAT?
minecrell has quit [Quit: :( ]
<jekstrand>
So it does...
<jekstrand>
Kayden: That's easily fixed
<dj-death>
found the culprit
<dj-death>
opt_if_rewrite_uniform_uses()
<alyssa>
karolherbst: do we have proof of it working for !freedreno yet?
<jekstrand>
Kayden: We also appear to not have int64 lowering for them. :-/
<jekstrand>
idk why the CL tests didn't pick up on that.
<karolherbst>
alyssa: you could be that proof
<karolherbst>
dunno if anybody else is using it, but _if_ I write a new compiler, I'd use them
<karolherbst>
*that
minecrell has joined #dri-devel
<alyssa>
it's overwhelming
<alyssa>
and agx has some weird encoding details to it
<karolherbst>
jekstrand: I guess we could just do a u2u for cl then?
<jekstrand>
Yeah, maybe. Though the lowering for that one is tricky.
<karolherbst>
right...
<karolherbst>
what was our conclusion about textures_used though? airlied said something we probably don't have to bump it, but?
<jekstrand>
I don't know. I've been putting out other fires.
<karolherbst>
smart
<karolherbst>
I'll just reap airlied branches and cherry-pick until it works
<dj-death>
aaaa, looks like the block_index is not preserved on resume shaders
<dj-death>
but why is it not rebuilt?
<jekstrand>
karolherbst: Ok, now that I've sorted out that BO leak, maybe images are next. Where should I start?
<karolherbst>
host ptrs on arrays + 3d?
<jekstrand>
which test hits that?
<jekstrand>
The image_streams test fails seem to be CLAMP_TO_EDGE with normalized coords
ngcortes has joined #dri-devel
<karolherbst>
./build/test_conformance/images/kernel_read_write/test_image_streams write 3D CL_MEM_USE_HOST_PTR
<karolherbst>
no idea why textures work though
<karolherbst>
I mean read images
<jekstrand>
karolherbst: I'm not sure how to even implement USE_HOST_PTR for 3D images reliably. Intel has all sorts of restrictions there.
<karolherbst>
:(
<karolherbst>
what's the problem?
<jekstrand>
Maybe not if it's linear?
* jekstrand
looks
<karolherbst>
jekstrand: also.. is 1d/2d array an issue?
<karolherbst>
although... mhh disabling writes to 3d images also sucks
<karolherbst>
anyway.. seems like the CTS doesn't use host_ptr for the read image tests at all
<karolherbst>
jekstrand: mem_host_flags mem_host_read_only_image also fails, but I think it's the same issue, just different
<jekstrand>
Yeah, any array is going to be an issue
<karolherbst>
ehh no, mem_host_read_only_image is a different fail
<karolherbst>
that's I think alignment stuff
<karolherbst>
yeah.. something weird is happening there
<karolherbst>
isl_calc_row_pitch fails
<karolherbst>
min_row_pitch_B is 1024, but surf_info->row_pitch_B is 800
ybogdano has quit [Ping timeout: 480 seconds]
<karolherbst>
"api/test_api get_image1d_info" hits this issue as well
<jekstrand>
karolherbst: I don't see how we can implement USE_HOST_PTR for 3D or array images on iris. We either need to lower to basically a buffer or we need to have a shadow copy that the GPU accesses.
<karolherbst>
okay
<jekstrand>
It's literally impossible to program the hardware with a slice pitch that's not a multiple of 4 rows
<karolherbst>
uhh
<jekstrand>
idk how the Intel CL driver does this
<karolherbst>
yeah, dunno either
<karolherbst>
let me check something...
TJ_Mercier has joined #dri-devel
<karolherbst>
it's so strange
<karolherbst>
so there is a query to get the required alignment, but only for 2D images created from a buffer object.. it's so wild
TJ_Mercier has quit []
tjmercier has joined #dri-devel
<jekstrand>
Yeah
tjmercier_ has joined #dri-devel
tjmercier_ has left #dri-devel [#dri-devel]
<jekstrand>
karolherbst: Hrm... I wonder...
tjmercier has quit []
<jekstrand>
OpenCL implementations are allowed to cache the buffer contents pointed to by host_ptr in device memory. This cached copy can be used when kernels are executed on a device.
<karolherbst>
yeah, so that just means we don't have to immediately reflect changes to the host
<jekstrand>
So I think you're still required to map/unmap around CPU usage, it's just that the client basically gave you the map they want.
<karolherbst>
and contents are only in sync at synchronization points
<karolherbst>
yeah, you have to sync explicitly
<jekstrand>
Well, that's the bit that's not clear
<jekstrand>
When are they supposed to be synchronized.
<jekstrand>
The naive answer would be map/unmap
<karolherbst>
when all maps are dropped
<karolherbst>
it's written in the spec actually
<jekstrand>
Ok
<karolherbst>
or ehh.. when you map
<karolherbst>
"5.5.3. Accessing mapped regions of a memory object"
<karolherbst>
"If a memory object is currently mapped for writing .." and so on
<karolherbst>
those two sections implicitly require this
<jekstrand>
ok
<karolherbst>
okay, so we have to do shadow buffering.. shouldn't be too painful
<karolherbst>
I can hack something up
<jekstrand>
I just pushed to my branch again with an iris commit which makes it gently return NULL if it can't create the resource instead of asserting.
<karolherbst>
jekstrand: so what you should change is to just fail from_user if it's not possible to use
<jekstrand>
So you can go ahead and try to create it with a host pointer and then fall back.
<karolherbst>
ahh, okay
<karolherbst>
cool
<karolherbst>
that's what clover is already doing, but in the past we failed for any non page aligned pointer :)
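The "try the host pointer, then fall back" flow with a shadow copy can be sketched as follows. Everything here is hypothetical: `try_import` stands in for a driver hook that returns `None` when the pointer can't be used directly, and syncing at map/unmap is just one reading of the spec sections quoted above, not the actual rusticl or clover implementation.

```python
class ShadowedImage:
    """Image whose device storage is a shadow copy of the host pointer."""

    def __init__(self, host_ptr: bytearray) -> None:
        self.host = host_ptr               # memory owned by the client
        self.device = bytearray(host_ptr)  # shadow copy the GPU would use

    def kernel_write(self, offset: int, data: bytes) -> None:
        # Stand-in for a kernel writing to the device-side copy.
        self.device[offset:offset + len(data)] = data

    def map(self) -> bytearray:
        # Sync point: make device contents visible through the host pointer.
        self.host[:] = self.device
        return self.host

    def unmap(self) -> None:
        # Sync point: push CPU-side changes back to the device copy.
        self.device[:] = self.host


def create_image(host_ptr: bytearray, try_import):
    # Prefer a direct import; fall back to shadowing if the driver refuses.
    imported = try_import(host_ptr)
    if imported is not None:
        return imported
    return ShadowedImage(host_ptr)
```

Between sync points the host memory and the device copy are allowed to diverge, which matches the "cached copy" wording quoted from the spec.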
<jekstrand>
OpenCL is starting to get stupid....
<karolherbst>
yes...
<jekstrand>
karolherbst: Sure. We can do a lot better than that, fortunately. :)
<karolherbst>
those are the things which make you "ahh, that's why clover was doing it like that"
<karolherbst>
jekstrand: k, mind looking into that 1d fail though with the non matching pitch?
<karolherbst>
dunno what's up there
<karolherbst>
but might be fixable?
<karolherbst>
"api/test_api get_image1d_info" is probably the easiest to hit it
<jekstrand>
sure
ybogdano has joined #dri-devel
<dj-death>
does adding a nir_push_if() in a shader preserve the dominance metadata?
<jekstrand>
no
Haaninjo has joined #dri-devel
<jekstrand>
If you alter control-flow at all, throw away all metadata
<karolherbst>
random thought: until today I am actually surprised how well the structurizer holds up.
<jekstrand>
Did it break today?
<karolherbst>
nope
<jekstrand>
There were a couple bugs in the original implementation but I got them sorted as part of the ray-tracing work. Those kernels are pretty brutal.
<jekstrand>
karolherbst: Looks like SKL+ requires 1D images to have a stride aligned to 64 pixels. That's why it's failing.
<karolherbst>
ahh, I guess I missed the fun
<karolherbst>
okay
<karolherbst>
makes kind of sense
<karolherbst>
I am sure shadow buffering fixes that as well, but maybe we can be a bit better here
<jekstrand>
So those will hit the same fall-back shadow path
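The pitch mismatch reported earlier (800 versus 1024 bytes) is consistent with this 64-pixel rule if one assumes a 200-pixel-wide image with 4 bytes per pixel; those two figures are assumptions for this sketch and were not stated in the discussion. The helper names are hypothetical, not isl functions.

```python
def align_up(value: int, alignment: int) -> int:
    """Round `value` up to the next multiple of `alignment`."""
    return (value + alignment - 1) // alignment * alignment


def min_row_pitch_bytes(width_px: int, bytes_per_px: int, align_px: int = 64) -> int:
    # Stride must cover the width rounded up to `align_px` pixels.
    return align_up(width_px, align_px) * bytes_per_px


def host_pitch_usable(row_pitch_bytes: int, width_px: int, bytes_per_px: int) -> bool:
    # A client-provided pitch below the minimum can't be used directly.
    return row_pitch_bytes >= min_row_pitch_bytes(width_px, bytes_per_px)
```

Under those assumptions a tightly packed 800-byte row is below the 1024-byte minimum, so the image would take the shadow fallback path.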
<karolherbst>
then I guess there isn't much left to do actually
<karolherbst>
fill_buffer fails due to blorp hitting asserts
<karolherbst>
I think because the fill data is bigger than this 16 bs value? something like that
<karolherbst>
"buffers buffer_fill_float" e.g.
<jekstrand>
Yup
<karolherbst>
but I guess we want an accelerated kernel there anyway
<jekstrand>
I need to implement fill_buffer for realz
<jekstrand>
Or we need the state tracker to do something
<karolherbst>
so two projects for me: 1. accelerated fill_buffer 2. shadow buffers
<karolherbst>
yeah.. something
<karolherbst>
doesn't have to be CL specific
<jekstrand>
I can implement it in blorp or in iris. I just have to figure out how I want to go about it.
<karolherbst>
I still need to find somebody to fix spirv-link :D
<dj-death>
jekstrand: done
<jekstrand>
dj-death: Thanks!
<karolherbst>
jekstrand: ohh.. and we have to think about what to do about those clamp fails
<jekstrand>
karolherbst: Yeah, let me take a peek
<jekstrand>
Those seem super weird
* karolherbst
kicks off another CTS run
<karolherbst>
jekstrand: from what I know, most/all of these fails are in the "precision" area, probably weird rounding in some places or something
<karolherbst>
but dunno what we can actually do about it, if the hardware is not good enough
<karolherbst>
we could probably also cheat and check what the intel stack is doing, but wouldn't surprise me if that falls inside llvm land
<airlied>
jekstrand: aco isn't hooked up to radeonsi at all yet
<karolherbst>
airlied: ahh the person I waited for :D
<karolherbst>
airlied: soo.. opencl-c-base.h + all those feature defines to enable/disable compiler features, how does that stuff work?
<karolherbst>
I noticed that compilation speed skyrockets once I use base.h, but it didn't help me fix the compiler features_macro test
<airlied>
karolherbst: it works badly
<airlied>
using the base + generated stuff probably is useable in llvm14
<airlied>
tbh for CL3.0 I expect there is still cleanup to be done in both methods even now
<karolherbst>
ahh yeah...
<karolherbst>
sad
<airlied>
at least now I have llvm commit rights, so I can speed up landing the fixes
<airlied>
karolherbst, jekstrand : if you can look at 15876 and make sure you are comfortable with it (acks please), I think for radeonsi rusticl will need to handle the last two bits
<jekstrand>
airlied: I'm very confused by the need to unlower GLOBAL_GROUP_SIZE. Does radeon not have workgroup_size and num_workgroups?
<airlied>
jekstrand: the LLVM backend has a fixed API, it doesn't have those
<airlied>
and yes that is at least what ROCm was doing 10 months ago, I should recheck if they ever worked out how dumb it was
<jekstrand>
airlied: Ok, so at least it's consistently stupid
<karolherbst>
jekstrand: for details, read up on the clover llvm backend :p
shashank_sharma has joined #dri-devel
<jekstrand>
I think, with that, we have an almost complete combinatorial set of possible lowerings of global/local_size/index.
<karolherbst>
:D
<airlied>
gotta catch em all
gawin has joined #dri-devel
<jekstrand>
I kind of want an interface that's just "tell me what system values you have and I'll sort it out"
<airlied>
jekstrand: yeah I fell over the using the compute system values flags yesterday, but since lots of things call it from various places it can't really do negative flags
<airlied>
hence why I stuck it into options
shashanks has quit [Ping timeout: 480 seconds]
<airlied>
jekstrand: the kernel args backend work is also a horrid thing
<jekstrand>
airlied: I really hate that whole MR.
<jekstrand>
Maybe ACO would be easier. :P
<airlied>
I'd have to trick dschuermann then :-P
<karolherbst>
:D
<karolherbst>
something we could change in rusticl to make it less annoying?
mbrost has quit [Remote host closed the connection]
mbrost has joined #dri-devel
<airlied>
karolherbst: nope, the annoying is the LLVM/amdgpu ABI
<airlied>
I think those two things that need changing are just: don't lower the kernel args, and lower work_dim support
<airlied>
karolherbst: where is your cts launcher again?
<jekstrand>
Doesn't do anything. It's not capable of using the newly compiled shader. Need to figure that out next.
<alyssa>
;_D
<jekstrand>
dcbaker: Can you make waffle releases?
<jekstrand>
Or maybe jljusten?
mszyprow has quit [Ping timeout: 480 seconds]
icecream95 has joined #dri-devel
maxzor has quit [Ping timeout: 480 seconds]
sneil has quit [Remote host closed the connection]
sneil has joined #dri-devel
<jljusten>
jekstrand: yes, in theory. :) xexaxo has been doing most of the maintenance of waffle for quite some time though.
<jekstrand>
jljusten: I've been bugging him to make a release for a while and nothing happens. :-(
<jekstrand>
jljusten: So I was hoping someone else would maybe make it happen?
<jekstrand>
We've already missed f36 :(
<jekstrand>
I fixed a bug affecting radeonsi 5 months ago and it's not in a release yet. :-(
<jljusten>
jekstrand: hmm, maybe if I trick dcbaker into working on it, then I can get the Mesa 22.1 branch point delayed. :) (still waiting on i915 for dg2...)
<jekstrand>
jljusten: From what I've heard, we probably don't want to wait on i915 DG2. :-(
<jekstrand>
But I suppose you're supposed to know more about that than me. :)
<alyssa>
jekstrand: why is i915 for dg2 a thing
* jekstrand
no longer has to justify Intel's decisions. :P
<jljusten>
jekstrand: it seems there is a chance for drm-next to have it merged by the end of April
<jljusten>
alyssa: we need a couple i915 query items to be defined
* jekstrand
is getting very confused by this driver<->compiler "interface"
<dcbaker>
jekstrand: I can't for some reason, I think only Emil and jljusten can
<jljusten>
jekstrand: I think dcbaker will try to work on it
<jekstrand>
jljusten: Works for me. As long as one happens, I don't care who types the "bump the version" commits and does the tagging.