coldfeet has quit [Remote host closed the connection]
vliaskov has joined #dri-devel
jkrzyszt has joined #dri-devel
apinheiro has joined #dri-devel
yyds has quit [Remote host closed the connection]
yyds has joined #dri-devel
Duke`` has quit [Ping timeout: 480 seconds]
mvlad has joined #dri-devel
<karolherbst>
pendingchaos: mhhh.. looks like the ISA has v_mul_hi_i32_i24_e32 and v_mul_hi_u32_u24_e32, but LLVM selects the former.. but I also don't see a way through the LLVM API to make sure the unsigned version to be picked... but I also don't know if that one would fix the issue anyway
<karolherbst>
but it looks like this bug exists for quite some time, but I don't know if it's technically a LLVM bug or a nir to llvm one
<karolherbst>
nir to llvm just does a 64 bit mul and shifts the result
<pendingchaos>
v_mul_hi_u32_u24_e32 can't be used because 0xff803fe1 is not an unsigned 24-bit integer
<karolherbst>
ohhh....
Duke`` has joined #dri-devel
<karolherbst>
sadly llvm only has mul, no umul/imul split
<karolherbst>
what would I need to do to emit v_mul_hi_u32 directly?
<karolherbst>
mhh I guess one of the llvm.amdgcn. intrinsics if one for that one exists...
<pendingchaos>
there would be no difference between umul and imul
<pendingchaos>
this is probably a llvm bug, if it's turning a mul(zext(a), zext(b))>>32 into v_mul_hi_i32_i24
<karolherbst>
yeah.. that seems what's happening here
<karolherbst>
.... rocm has the same bug
<karolherbst>
Intel's CL stack behaves like the reporter expects it (different than rocm)
<karolherbst>
as a workaround we could make nir_opt_idiv_const to not emit negative numbers...
u-amarsh04 has quit []
u-amarsh04 has joined #dri-devel
<karolherbst>
yeah... that works
<karolherbst>
llvm emits v_mul_hi_u32 then
<karolherbst>
just need to avoid numbers which can be interpreted as negative signed integers
warpme has quit []
<karolherbst>
though this emits an additional uadd_sat
<mareko>
and please a description of the problem that I can send to the LLVM team?
<karolherbst>
yeah, looks similar to what I got
<karolherbst>
using those inputs yields the wrong result doing mul_hi: 67451429, 684514641, 694514641, 794514641, 894514641, 99451464, 123343442, 23111252, 412341412, 354325253
<karolherbst>
wait.. I can even write a CL file whith shows the bug with rocm
<mareko>
karolherbst: you can keep pendingchaos's suggestion, but replace block_range_analysis with ac_build_optimization_barrier
<karolherbst>
okay, cool
<mareko>
the effect should be identical
<pendingchaos>
block_range_analysis is more optimizable
<pendingchaos>
ac_build_optimization_barrier can't be CSE'd and claims to have side effects
<karolherbst>
ehh wait, I misunderstood what you said
bnieuwenhuizen_ has joined #dri-devel
<mareko>
ok
<pendingchaos>
my suggestion doesn't need to change emit_imul_high()
<karolherbst>
yeah.. I've put the asm in the MR
<karolherbst>
it's a signficiant difference
<mareko>
you can keep pendingchaos's suggestion as-is
<karolherbst>
pendingchaos: any thoughts on using LLVMIsConstant on both sources? not really sure what's the idea behind the else there
<mareko>
pendingchaos: you said "gfx9 and later don't seem to have the v_mul_hi_i32_i24 bug for some reason", is that true given that karolherbst is on rdna2?
<pendingchaos>
the inline assembly needs to be used for at least one source, and using it for constant ones prevents the constant from being combined into the instruction
<karolherbst>
okay
<pendingchaos>
apparently the gfx9 and later thing isn't true, since it happens for karolherbst on rdna2
<pendingchaos>
not sure what I'm missing in my testing
<mareko>
ok
<karolherbst>
pendingchaos: did you test with the division or the optimized umul_high pattern?
<karolherbst>
and what did you test? Anyway, ROCm doesn't show the bug with the divisions, only with using umul_high directly here
bnieuwenhuizen has quit [Ping timeout: 480 seconds]
riteo has quit [Ping timeout: 480 seconds]
<karolherbst>
mhhh yeah.. that seems to emit v_mul_hi_u32 v0, 0xff803fe1, v0 ; D56A0000 000200FF FF803FE1 here
<karolherbst>
LLVM: %12 = mul nuw nsw i64 %11, 4286595041
<karolherbst>
maybe llvm takes more into account or something...
sima has quit [Ping timeout: 480 seconds]
<karolherbst>
I haven't tried to figure out where the optimization gets applied inside LLVM, but a quick git grep didn't show anything (but I might have used the wrong things to search for)
davispuh has joined #dri-devel
feaneron has joined #dri-devel
<karolherbst>
mhh.. okay.. ac_build_optimization_barrier works as well, I just modified the code incorrectly
<karolherbst>
(I accidentally swaped the condition in regards to src0 and src1
feaneron has quit []
kts has quit [Quit: Konversation terminated!]
Dark-Show has quit [Quit: Leaving]
rasterman has quit [Quit: Gettin' stinky!]
kts has joined #dri-devel
sima has joined #dri-devel
fireburn has joined #dri-devel
<fireburn>
Hey, a commit in the last 24hrs has broken Chromium Vulkan EGL, I'm just bisecting now
<karolherbst>
pendingchaos: sooo, it seems like I can prevent llvm to make that opt by doing "unsigned int tmp1 = A[0];" instead of "unsigned int tmp1 = A[get_global_id(0)];" 🙃
<pendingchaos>
it uses s_mul_hi_u32?
<karolherbst>
yeah
user has joined #dri-devel
<pendingchaos>
well, that's just because the source is no longer divergent
user is now known as Guest1240
<karolherbst>
I guess there is no s_mul_hi_i32_i24_e32 then?
<pendingchaos>
no
<pendingchaos>
s_mul_hi_u32 is as fast as any other SALU
<karolherbst>
ahh...
<karolherbst>
anyway, your suggestion prevents the use of s_mul_hi_u32 if it's convergent
<pendingchaos>
yeah, I don't know how to prevent that without the divergence analysis idea mentioned in the MR
<karolherbst>
though LLVM could choose to still decide differently than nir's analysis would guess, no?
dv_ has joined #dri-devel
bolson has joined #dri-devel
LeviYun has quit [Remote host closed the connection]
<karolherbst>
yeah well.. soo mesa doens't implement cl_khr_gl_msaa_sharing ...
<karolherbst>
alyssa: do you need it fir something?
<alyssa>
gpu crimes yeah
<karolherbst>
mhh
<karolherbst>
this ext depends on cl_khr_gl_depth_images
<alyssa>
i'll just backdoor it with a vendor intrinsic because I do not want to think about llvm right now
<karolherbst>
which... is something I wanted to add at some point as well
<karolherbst>
yeah, fair
<karolherbst>
depth_images are a bit of a pain to add
<alyssa>
trying to get my decompression kernel to pass GL CTS
<karolherbst>
but I can prioritize it a bit higher if others want to use it as well
<karolherbst>
though I think it's mostly API stuff I need to add here, but the spirv/nir side might also need a bit of work
<alyssa>
yeah I'll backdoor for now and we can port to the standard syntax when the rusticl side is there
<karolherbst>
okay, cool
<karolherbst>
soo.. three features: depth/stencil images, cl_khr_gl_depth_images and cl_khr_gl_msaa_sharing
epoch101 has quit [Ping timeout: 480 seconds]
tzimmermann has quit [Quit: Leaving]
riteo has joined #dri-devel
feaneron has joined #dri-devel
<alyssa>
k
<alyssa>
oops
feaneron has quit []
mbrost has joined #dri-devel
epoch101 has joined #dri-devel
feaneron has joined #dri-devel
jenatali has joined #dri-devel
<jenatali>
karolherbst: Feel free to ping me for reviews on the spirv/nir stuff. Not sure if/when I'd get to plumbing it myself but I'd like to make sure it's workable
<karolherbst>
jenatali: for depth images and co?
<jenatali>
Yeha
<karolherbst>
will do
Guest1240 has quit [Ping timeout: 480 seconds]
fab has joined #dri-devel
user has joined #dri-devel
user is now known as Guest1253
<fireburn>
@zmike device_select: shortcut EnumeratePhysicalDevice* for count-only calls
fab has quit [Ping timeout: 480 seconds]
<zmike>
FireBurn: okay, just make a MR to revert that then
<karolherbst>
oneforall2: yeah.. ultimately this seems to be a meson bug, but I've added that workaround for now
gouchi has joined #dri-devel
gouchi has quit [Remote host closed the connection]
gouchi has joined #dri-devel
sassefa has joined #dri-devel
gouchi has quit []
sima has quit [Ping timeout: 480 seconds]
<oneforall2>
1 thing I dislike about meson is it doesn't like to work with flags to well like LDFLAGS to pick the right lib dir 32bit or 64
<oneforall2>
like right now :)
coldfeet has quit [Remote host closed the connection]
alanc has quit [Remote host closed the connection]
YuGiOhJCJ has quit [Remote host closed the connection]
YuGiOhJCJ has joined #dri-devel
alanc has joined #dri-devel
<alyssa>
everybody needs opencl
* alyssa
laughs villainously
feaneron has quit [Ping timeout: 480 seconds]
* urja
shudders
<ccr>
the knights who say "opencl"
feaneron has joined #dri-devel
feaneron has quit [Quit: feaneron]
feaneron has joined #dri-devel
<Company>
I don't understand this whole dri2_query_image()/resource_get_handle() thing but I think llvmpipe doesn't dup() the fds it returns from eglExportDMABUFImageMESA() - or it dups them and returns the wrong one?
<karolherbst>
jenatali: looks like we are already good in regards to depth images
<airlied>
but as you mentioned previously it already has a long lived dup
<airlied>
maybe we should drop the other one
<Company>
what happened 12 hours ago is that our sysadmins finally exposed /dev/udmabuf into our CI
<Company>
and suddenly both Vulkan and GL llvmpipe feel the full force of our dmabuf import/exports
<Company>
because they finally hit those codepaths
<Company>
and it seems that is not a very common thing that people do - export and import dmabufs from software renderers ;)
<karolherbst>
jenatali: I think msaa will need some work because that actually has its own GLSL_SAMPER_DIM type, but depth images are just like plain ones
<jenatali>
👍
<karolherbst>
clEnqueueFillImage needs fixing, because the pixel size is one, not four, but... that's just API stuff
<Company>
airlied: that works for avoiding the double close but runs into VK_ERROR_INVALID_DRM_FORMAT_MODIFIER_PLANE_LAYOUT_EXT when importing
<Company>
(with AMD, lvp Vulkan doesn't complain but also doesn't show the texture)
vignesh has quit [Quit: Connection closed for inactivity]
digetx is now known as Guest1265
digetx has joined #dri-devel
Guest1265 has quit [Ping timeout: 480 seconds]
<airlied>
Company: no idea on that, would probably require debugging :-)
<Company>
yeah, I suppose I get to file a bunch of issues
<Company>
once I've figured out how to unblock GTK's CI
<airlied>
rm /dev/udmabuf :-P
<airlied>
or maybe chmod
<Company>
I worked hard to get it turned on
<DemiMarie>
Company: do you use containers or ephemeral VMs for CI?
<airlied>
where are you importing into?
rasterman has quit [Quit: Gettin' stinky!]
<Company>
DemiMarie: I use whatever the gnome sysadmins give me - and that's some docker image
<Company>
airlied: the current test I'm debugging is GL content in Vulkan, ie llvmpipe is exporting, lvp is importing - and when I'm running it locally I also run the test against my AMD (usually because I forget the env vars)
<Company>
airlied: and the reproducer above does the same thing
<airlied>
Company: not sure how importing into amd should work there
<Company>
I have no idea if it should - it might be perfectly fine that it's complaining
<Company>
no idea what the state of udmabuf is
<Company>
or if llvmpipe creates dmabufs that AMD likes with its stride requirements
<Company>
the worse problem is that lvp imports it but then doesn't show anything
epoch101_ has joined #dri-devel
epoch101 has quit [Ping timeout: 480 seconds]
YuGiOhJCJ has quit [Quit: YuGiOhJCJ]
sukuna has joined #dri-devel
jkrzyszt_ has joined #dri-devel
jkrzyszt has quit [Remote host closed the connection]
sukuna1 has joined #dri-devel
<airlied>
Company: yeah definitely should file that one and see where it's going wrong
sukuna has quit [Ping timeout: 480 seconds]
bolson has quit [Ping timeout: 480 seconds]
mvlad has quit [Remote host closed the connection]
digetx has quit [Read error: Connection reset by peer]
digetx has joined #dri-devel
DarkShadow4444 has joined #dri-devel
heat is now known as Guest1273
heat has joined #dri-devel
Guest1273 has quit [Read error: Connection reset by peer]
ryanneph has joined #dri-devel
digetx has quit [Read error: Connection reset by peer]
digetx has joined #dri-devel
DarkShadow44 has quit [Ping timeout: 480 seconds]
ciwoudofontara^ has joined #dri-devel
nerdopolis has quit [Remote host closed the connection]
nerdopolis has joined #dri-devel
rgallaispou has joined #dri-devel
sukuna1 has quit [Ping timeout: 480 seconds]
soul has quit []
glennk has quit [Ping timeout: 480 seconds]
<Company>
airlied: with udmabuf, does one have to keep the memfd around as long as the dmabuf_fd exists or can I close() it? It's not documented anywhere
Haaninjo has quit [Quit: Ex-Chat]
alyssa has quit [Quit: alyssa]
vliaskov has quit [Ping timeout: 480 seconds]
jkrzyszt_ has quit [Remote host closed the connection]
jkrzyszt_ has joined #dri-devel
pcercuei has quit [Quit: dodo]
mbrost has quit [Ping timeout: 480 seconds]
orbea has quit [Quit: You defeated orbea! 2383232 XP gained!]
orbea has joined #dri-devel
soreau has quit [Ping timeout: 480 seconds]
sukuna has joined #dri-devel
soreau has joined #dri-devel
ryanneph has quit [Remote host closed the connection]
Guest1253 has quit [Remote host closed the connection]