ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
<jekstrand> Ok, done with Twitter story time. Let's get this thing working.
nchery is now known as Guest2451
nchery has joined #dri-devel
<jekstrand> karolherbst: CL_DEVICE_NOT_FOUND?!?
<karolherbst> what are you doing?
<jekstrand> Not setting the CL device type explicitly, apparently
<karolherbst> ohhhh...
<karolherbst> ffs
<karolherbst> CL_DEVICE_TYPE=CL_DEVICE_TYPE_GPU
<jekstrand> hehe
<jekstrand> Yeah, I figured that out. :)
<karolherbst> until I clean up that fix
icecream95 has joined #dri-devel
<jekstrand> karolherbst: Are you sure CLAMP_TO_EDGE fails?
<karolherbst> CLAMP
<karolherbst> see above
Guest2451 has quit [Ping timeout: 480 seconds]
<jekstrand> Oh
slattann has joined #dri-devel
* karolherbst should use less as and more .into()
ybogdano has quit [Ping timeout: 480 seconds]
<karolherbst> okay.. conversions has already been running for nearly 3 hours and it is still only using 172MB :)
<jekstrand> \o/
<karolherbst> but I really have to offload callbacks
<karolherbst> GPU is like 42% busy
<karolherbst> but gen12 is really something else
<jekstrand> karolherbst: Are these write_image tests that are failing?
<karolherbst> same as the other one
<karolherbst> ./build/test_conformance/images/kernel_read_write/test_image_streams read 1D CL_R CL_FLOAT CL_FILTER_NEAREST CL_ADDRESS_CLAMP UNNORMALIZED
<jekstrand> karolherbst: That passes here
<karolherbst> huh
<karolherbst> indeed
<karolherbst> seems like gen12 doesn't need it anymore
<jekstrand> I don't know that it ever did
<karolherbst> it fails on my laptop
<karolherbst> and intel's runtime does have code for that specific case
<jekstrand> It's 3D images that are busted
<jekstrand> And it's writes
<jekstrand> idk why
<karolherbst> no, it's also for other image types in there
co1umbarius has joined #dri-devel
<karolherbst> like check for __builtin_IB_get_snap_wa_reqd
<jekstrand> karolherbst: I know about that workaround. I don't think it's needed
<karolherbst> well
<karolherbst> it does fit the case
<jekstrand> which case?
<karolherbst> coord slightly below 0
<karolherbst> CL_FILTER_NEAREST and CL_ADDRESS_CLAMP
columbarius has quit [Ping timeout: 480 seconds]
<karolherbst> getSnapWaValue inside compute-runtime
<karolherbst> but it doesn't seem like they skip it for gen12
<jekstrand> I don't want to put in horrible workarounds that aren't actually needed.
<karolherbst> anyway, it's true for NEAREST + CLAMP
<karolherbst> jekstrand: well.. it does seem to be required on my laptop
<jekstrand> karolherbst: What test is failing? The one you gave me passes.
<karolherbst> it fails on my laptop ;)
<jekstrand> Unless you're saying it fails on gen9 and passes on gen12
<karolherbst> yes
<karolherbst> that's what I am saying
<jekstrand> That's... unfortunate.
<karolherbst> very much so
<karolherbst> but yeah
<karolherbst> it passes on my desktop
* karolherbst restarts CTS or something
<karolherbst> huh
<karolherbst> I found some weird case which is failing
<karolherbst> ahh yeah.. 3D
<karolherbst> let's see....
<karolherbst> jekstrand: I think it's the same thing
<jekstrand> karolherbst: rusticl/wip
<jekstrand> karolherbst: You need the top patch and the "stop setting the cursor..." one
<jekstrand> karolherbst: 3D writes are failing on TGL. I'll look into that tomorrow. I need to call it quits for the evening.
alatiera8 has joined #dri-devel
<karolherbst> yeah... writes look really messed up
alatiera has quit [Ping timeout: 480 seconds]
ngcortes has joined #dri-devel
<karolherbst> jekstrand: it passes now :)
<karolherbst> had to fix your patch a little though
<jekstrand> karolherbst: fine with me. You could check the mip filter too but I think that's covered by !normalized
<karolherbst> yeah, probably
<karolherbst> currently running the entire thing on gen9 here, let's see if 3d images work
<karolherbst> seems like 3D is busted here the same way as on gen12 :)
<karolherbst> jekstrand: for 3d images intel inserts an as_int4(Texel) thing
<karolherbst> for writes
<karolherbst> but there is more stuff
<karolherbst> sounds annoying as well :(
LexSfX has quit [Ping timeout: 480 seconds]
LexSfX has joined #dri-devel
slattann has quit [Quit: Leaving.]
mhenning has joined #dri-devel
rasterman has quit [Quit: Gettin' stinky!]
mclasen has quit [Ping timeout: 480 seconds]
bond has joined #dri-devel
bond is now known as Guest2456
Guest2456 has quit []
loalmtdea^ has joined #dri-devel
lemonzest has quit [Quit: WeeChat 3.4]
heat_ has quit [Remote host closed the connection]
ngcortes has quit [Remote host closed the connection]
khfeng has joined #dri-devel
Company has quit [Quit: Leaving]
mhenning has quit [Quit: mhenning]
ella-0_ has joined #dri-devel
ella-0 has quit [Read error: Connection reset by peer]
robink has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
robink has joined #dri-devel
elongbug__ has quit [Read error: Connection reset by peer]
Duke`` has joined #dri-devel
<emersion> mattst88: you're the first person I've seen to insta-merge their own stuff
<mattst88> emersion: did you look at the changes?
<emersion> yes
sdutt has quit [Read error: Connection reset by peer]
<mattst88> then I think maybe you could imagine why not waiting a day for review would be okay
lemonzest has joined #dri-devel
slattann has joined #dri-devel
<airlied> anholt: is the g41 offline/
<airlied> ?
danvet has joined #dri-devel
Duke`` has quit [Ping timeout: 480 seconds]
maxzor has joined #dri-devel
<Kayden> I've definitely insta-merged my own stuff
mvlad has joined #dri-devel
cheako has quit [Quit: Connection closed for inactivity]
ahajda has joined #dri-devel
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
<emersion> it's bad practice and doesn't provide any value
mi6x3m has joined #dri-devel
<mi6x3m> hey, does anyone know of a site aggregating FPS scores for mesa?
<mi6x3m> I'm trying to find out whether the DRI GLX i915 driver is fast enough
<mi6x3m> I have an i7 hd graphics 4000 with 1300x700 pixels
loalmtdea^ has quit [Remote host closed the connection]
alatiera8 is now known as alatiera
<mattst88> learned a new word today: jobsworth
Major_Biscuit has joined #dri-devel
tzimmermann_ has quit []
tzimmermann has joined #dri-devel
jkrzyszt has joined #dri-devel
MajorBiscuit has joined #dri-devel
Major_Biscuit has quit [Ping timeout: 480 seconds]
<karolherbst> what the "(21-Apr 09:08:41) PASSED Conversions : (33930s, test 39/57)"
<karolherbst> at least I designed stuff correctly for math_brute_force which really keeps the GPU close to 100% load
<karolherbst> jekstrand: btw, I think this snap workaround is only needed for float coords as far as I can tell. Not sure how to express that inside nir though.
<karolherbst> and I think iris_get_name races :)
<karolherbst> I see some fail in math brute force and this spawns multiple threads I think
<karolherbst> (I had to fix something similar for nouveau in the past)
itoral has joined #dri-devel
<mi6x3m> is it expected that 32 bit mesa is slower than 64 bit?
<karolherbst> mi6x3m: since when is 32 bit expected to be faster than 64 bit?
<karolherbst> and slower in what
<karolherbst> GPU perf? yeah.. that shouldn't happen. CPU overhead? that might actually be simply the case
<karolherbst> oh wow.. the fract test only made the GPU like 2% busy
<mi6x3m> karolherbst, glxgears gives me 15FPS fullscreen for 32 bit self-compiled i915 version
<mi6x3m> it gives me solid 60 for 64 bit preinstalled mesa with i915
<karolherbst> mi6x3m: make sure it actually uses hw rendering
<mi6x3m> karolherbst, how to make sure?
<karolherbst> check with a 32 bit glxinfo
<karolherbst> ohh wait
<mi6x3m> libGL: using driver i915 for 4
<karolherbst> there is glxgears -info as well
<mi6x3m> GL_RENDERER = softpipe
<mi6x3m> ehm
<karolherbst> there you go
<mi6x3m> wait what
<mi6x3m> but why omg
<mi6x3m> thanks lemme check it out
<karolherbst> probably set the wrong build options or something
<mi6x3m> i mean it reports i dont have crocus but i thought it falls back to i915
<mi6x3m> karolherbst, thanks let me see what i can do
<mi6x3m> karolherbst, that was it, thanks :)
<karolherbst> np
apinheiro has joined #dri-devel
pcercuei has joined #dri-devel
Haaninjo has joined #dri-devel
Haaninjo has quit [Remote host closed the connection]
mi6x3m has quit [Quit: Leaving]
rasterman has joined #dri-devel
thellstrom has joined #dri-devel
thellstrom has quit [Ping timeout: 480 seconds]
rkanwal has joined #dri-devel
Company has joined #dri-devel
khfeng has quit [Remote host closed the connection]
khfeng has joined #dri-devel
Daanct12 has joined #dri-devel
icecream95 has quit [Ping timeout: 480 seconds]
<karolherbst> "(21-Apr 12:41:54) Testing complete. 4 failures for 57 tests."
<karolherbst> it just took 10 hours
<karolherbst> * (20-Apr 22:47:03) Test Images (Kernel) ==> FAILED: 1
<karolherbst> * (20-Apr 22:47:56) Test Images (Kernel pitch) ==> FAILED: 1
<karolherbst> * (20-Apr 22:49:53) Test Images (Kernel max size) ==> FAILED: 1
<karolherbst> * (21-Apr 12:41:24) Test Half Ops ==> FAILED: 1
<karolherbst> ehhh "Log: input.cl:4:11: error: implicit declaration of function 'vload_half' is invalid in OpenCL" for test_half
<karolherbst> I should add it to the clc test runner
<karolherbst> ahh right.. parsing tests out of that one was annoying
flacks has quit [Quit: Quitter]
flacks has joined #dri-devel
garrison has joined #dri-devel
i-garrison has quit [Ping timeout: 480 seconds]
Daanct12 has quit [Remote host closed the connection]
maxzor has quit [Ping timeout: 480 seconds]
Daanct12 has joined #dri-devel
mclasen has joined #dri-devel
nchery has quit [Read error: Connection reset by peer]
cheako has joined #dri-devel
nchery has joined #dri-devel
maxzor has joined #dri-devel
itoral has quit [Remote host closed the connection]
konstantin has joined #dri-devel
Daanct12 has quit [Quit: Leaving]
MajorBiscuit has quit [Quit: WeeChat 3.4]
konstantin has quit [Remote host closed the connection]
apinheiro has quit [Ping timeout: 480 seconds]
sdutt has joined #dri-devel
jewins has joined #dri-devel
gawin has joined #dri-devel
fxkamd has joined #dri-devel
dliviu has quit [Read error: No route to host]
dliviu has joined #dri-devel
fxkamd has quit []
khfeng has quit [Remote host closed the connection]
maxzor has quit [Ping timeout: 480 seconds]
khfeng has joined #dri-devel
camus has quit [Remote host closed the connection]
fxkamd has joined #dri-devel
khfeng has quit [Remote host closed the connection]
camus has joined #dri-devel
gawin has quit [Ping timeout: 480 seconds]
MajorBiscuit has joined #dri-devel
lemonzest has quit [Quit: WeeChat 3.4]
camus has quit []
<jekstrand> karolherbst: Not my bug. :P
<karolherbst> oh no :D
<karolherbst> but I think there was something else broken.. let me check with that
<jekstrand> PASSED 21 of 21 sub-tests.
<jekstrand> That's all the 3D tests
<karolherbst> sure, but I meant like on gen9
<jekstrand> Oh, well that's entirely possible
<karolherbst> I am not sure what to do about the fp16 test though.. it fails to compile, but we also don't advertise support for fp16, so I think we could just patch it out
<jekstrand> karolherbst: weird
<jekstrand> karolherbst: That's on Gen9?
<karolherbst> yeah
<jekstrand> I don't have one of those. :-/
<karolherbst> I think I saw hacks in intels stack though
<jekstrand> nah. RGBA32F images should "just work"
<karolherbst> yeah.. well, but they do adjustments
<jekstrand> Oh, that's an unnormalized read... Yeah, that's not a well-tested path.
<karolherbst> mhh, maybe intel doesn't advertise support for float coords on 3d image writes
<karolherbst> 3d image writes are entirely optional anyway
maxzor has joined #dri-devel
<karolherbst> ahh no, it's hidden behind macro magic
<karolherbst> as_uint4(Texel) and as_int4(Texel) further below
<karolherbst> I am a bit curious about those
<karolherbst> mhh, but those are int coords as well
<karolherbst> yeah.. maybe only int coords are supported on intel
<karolherbst> let's see
<karolherbst> ehh wait..
<karolherbst> what am I talking about, it fails for read 3d :)
<jekstrand> We should probably fix that first. :)
<karolherbst> okay.. float results
<karolherbst> interesting they convert int coords to float
camus has joined #dri-devel
<karolherbst> but doesn't matter
<karolherbst> the case above fails
<jekstrand> karolherbst: We do too, in spirv_to_nir
<karolherbst> ahh
<jekstrand> karolherbst: Can you do INTEL_DEBUG=bat and paste it?
<karolherbst> mhh, still seeing some fails
<karolherbst> let's see
<karolherbst> (on gen12 I mean)
<karolherbst> jekstrand: yeah.. same fail on gen 12
<karolherbst> test_image_streams read 3D CL_RGBA CL_FLOAT CL_FILTER_LINEAR CL_ADDRESS_CLAMP_TO_EDGE UNNORMALIZED
<karolherbst> maybe something with slices, as I see other issues with 2darray as well
<karolherbst> "Sample 31109" mhhh, sounds like something something annoying
<karolherbst> same thing just NORMALIZED fails as well
sdutt has quit []
sdutt has joined #dri-devel
<karolherbst> ohhhhhhh
<karolherbst> nooooooo
<karolherbst> jekstrand: (5.41167e+09,0.0140819,1.43328e-38,2.65733e+35) vs (5.41167e+09,0.0140819,0,2.65733e+35)
<karolherbst> (5.41167e+09,0.0140819,1.43328e-38,2.65733e+35)
<karolherbst> (5.41167e+09,0.0140819,0, 2.65733e+35)
<karolherbst> 0 vs 1.43328e-38
<karolherbst> which is a normal value
<karolherbst> just very small
<karolherbst> which is a little weird as it seems like we store that directly.. well.. at least within the nir shader
<karolherbst> but given that it's CL_FILTER_LINEAR
slattann has quit []
<karolherbst> let's see...
heat has joined #dri-devel
<jekstrand> karolherbst: You think we're flushing?
<karolherbst> no
<karolherbst> filtering is different
<karolherbst> not sure if that matters though
<karolherbst> iris uses MAPFILTER_ANISOTROPIC
<karolherbst> intels CL runtime uses MAPFILTER_LINEAR
<karolherbst> but that doesn't seem to change anything yet
<karolherbst> maybe that's not it though
* karolherbst scratches head
<karolherbst> jekstrand: maybe we do flush...
<karolherbst> I can't read intel's ISA thing anyway
<karolherbst> but it doesn't look like anything is done with the values
<karolherbst> jekstrand: yeah soo.. no idea what's up. I tried to match what's in the CL runtime and it still fails
Duke`` has joined #dri-devel
ybogdano has joined #dri-devel
<jekstrand> karolherbst: The result of the sample operation goes straight into the store_global so there's no flushing unless the sampler is flushing internally
<karolherbst> yeah...
<karolherbst> that's why I was fiddling with the sampler state
<karolherbst> let me break it to see if anything changes anyway
<jekstrand> karolherbst: It's linear filtering so it may be flushing somewhere inside the sampler
<karolherbst> yes
<jessica_24> hey mattrope, I'm working on debugging some issues with the IGT kms_universal_plane and I'm having trouble understanding your comment here (https://gitlab.freedesktop.org/drm/igt-gpu-tools/-/blob/master/tests/kms_universal_plane.c#L145)
<karolherbst> and intels CL runtime uses different filtering
<karolherbst> but that's not it
<jessica_24> From my understanding, drm_plane_init will convert the boolean into an enum then call drm_universal_plane_init (https://gitlab.freedesktop.org/drm/msm/-/blob/msm-next/drivers/gpu/drm/drm_plane.c#L504)
<jekstrand> What do you mean uses different filtering?
<jessica_24> so even if plane_init is called instead of universal_plane_init, there shouldn't be a problem since plane_init will pass everything to universal_plane_init
<karolherbst> iris: MAPFILTER_ANISOTROPIC
<karolherbst> intel: MAPFILTER_LINEAR
<jessica_24> Also, I was curious why you added an assert for num_plane == 1 since it's possible to have more than 1 primary plane.
<jessica_24> basically, I think I'm missing some context for this part of the code. Can you elaborate more on why this check was added?
<jekstrand> karolherbst: Oh, we very much don't want anisotropic for CL, I don't think.
<karolherbst> yeah, but it doesn't fix this test sadly
<karolherbst> maybe I missed some toggles
<karolherbst> phhhh
<karolherbst> ohhh
<karolherbst> evil
<jekstrand> ?
<karolherbst> MipModeFilter uses different constants?
<jekstrand> Yes
<karolherbst> okay, doesn't fix it either
<jekstrand> karolherbst: Are you setting pipe_sampler_state::max_anisotropy = 0?
cef is now known as Guest2495
<karolherbst> yes
<jekstrand> Ok, that should get rid of any aniso
cef has joined #dri-devel
<jekstrand> mip filter shouldn't matter because LOD=0 explicitly
<karolherbst> yeah...
<karolherbst> I just know how to break it for real, but everything else doesn't change anything at all
<karolherbst> maybe there is a magic bit somewhere else
jkrzyszt has quit [Ping timeout: 480 seconds]
Guest2495 has quit [Ping timeout: 480 seconds]
<karolherbst> yeah.. sooo.. no idea :)
<mattrope> jessica_24: That test was written way back when the kernel was first getting support for multiple plane types (about 8 years ago). Way back then the signatures of the two functions were pretty much identical aside from drm_plane_init having a bool where drm_universal_plane_init had an enum. There were a couple cases where we accidentally typed drm_plane_init out of habit where we meant drm_universal_plane_init, causing primary planes
<mattrope> to show up as overlays or vice versa. The signatures are quite different now (e.g., the old drm_plane_init() doesn't take modifier lists) so it's not really a concern these days like it was back then.
mclasen has quit [Ping timeout: 480 seconds]
<mattrope> jessica_24: Also, you can only have one primary plane per CRTC (since "primary" is really just supposed to be an indication of which plane some of the other legacy ioctls like SET_CRTC will operate on).
<ajax> does anyone actually understand the vulkan wsi code or is this one of those chernobyl type deals where you get in and get out as fast as possible
<jekstrand> karolherbst: You could try playing with the LOD pre-clamp mode
<jekstrand> shouldn't make a difference but idk
<jekstrand> karolherbst: You could also fiddle with the filter quality
gawin has joined #dri-devel
<ajax> in particular why does the list of VkImages on a swapchain live in the backend private struct instead of on the wsi_swapchain itself
<jekstrand> karolherbst: The lower qualities cause linear filtering to "snap" to pixel centers when they get close where "close" is closer the higher the quality.
<jekstrand> hrm... Actually that just controls snapping between mipmaps but we already have lod = 0 so it shouldn't matter either
lemonzest has joined #dri-devel
Duke`` has quit [Ping timeout: 480 seconds]
ybogdano has quit [Ping timeout: 480 seconds]
gawin has quit [Ping timeout: 480 seconds]
Duke`` has joined #dri-devel
<abhinav__> mattrope thanks for the info, agreed that one CRTC can have only one primary plane. but this check is adding up all the primary planes of the display https://gitlab.freedesktop.org/drm/igt-gpu-tools/-/blob/master/tests/kms_universal_plane.c#L153
<abhinav__> so lets say there are two pipes for the same display, each pipe had one primary plane, then this adds up to 2
<abhinav__> but actually there was only one primary plane per pipe
<mattrope> abhinav__: That function is operating on a single pipe only.
<mattrope> The function will be called multiple times, once for each pipe, and each time we'll start counting from scratch (and should only find one primary).
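The per-pipe counting mattrope describes can be pictured with a toy model. This is only a sketch: the `PlaneType`, `Pipe`, `primaries_on_pipe` and `primaries_per_pipe` names are made up for illustration and are not the IGT or kernel definitions. The point is that the count restarts for every pipe, so two pipes with one primary each give [1, 1], not 2.

```rust
/// Toy stand-ins for the DRM plane types; not the real kernel or IGT enums.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum PlaneType {
    Primary,
    Overlay,
    Cursor,
}

/// Hypothetical model of a pipe: just the list of planes usable on it.
struct Pipe {
    planes: Vec<PlaneType>,
}

/// Count primaries on ONE pipe, starting from zero on every call.
fn primaries_on_pipe(pipe: &Pipe) -> usize {
    pipe.planes
        .iter()
        .filter(|&&p| p == PlaneType::Primary)
        .count()
}

/// Run the per-pipe check over a whole display. The result is one count
/// per pipe, never a sum across pipes.
fn primaries_per_pipe(pipes: &[Pipe]) -> Vec<usize> {
    pipes.iter().map(primaries_on_pipe).collect()
}
```

In this model a pipe that lists two planes of type Primary would report 2, which is the situation a "exactly one primary" assertion would trip over.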
ybogdano has joined #dri-devel
tobiasjakobi has joined #dri-devel
tobiasjakobi has quit [Remote host closed the connection]
<abhinav__> mattrope understood it now. so having a single primary plane is only for legacy drivers? because msm driver has been marking 2 primary planes per pipe and IGT does handle that already because it still marks only one of them as the plane_primary https://gitlab.freedesktop.org/drm/igt-gpu-tools/-/blob/master/lib/igt_kms.c#L2308
<abhinav__> so do we really need this check anymore?
<jekstrand> karolherbst: V
<mattrope> abhinav__: That sounds like it might be a bug in MSM. "Primary" and "cursor" aren't really supposed to be a description of hardware capabilities these days, but rather an indicator of which plane gets updated by some of the other old pre-atomic ioctls. E.g., if you call the SET_CRTC ioctl and provide a framebuffer, that shows up on one specific plane (which is the one that should be marked as PLANE_PRIMARY).
<mattrope> abhinav__: Even if your hardware has multiple fully-featured planes on a pipe, the others should still just be classified as generic OVERLAY type.
<mattrope> abhinav__: DRM properties associated with the planes (rather than the type) are how you figure out what the plane really is/isn't capable of doing.
<abhinav__> mattrope thanks for the info, i wasn't aware that having more than one primary plane is incorrect, as it has been working alright. robclark, any comments on this? seems like https://gitlab.freedesktop.org/drm/msm/-/blob/msm-next/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c#L736 can mark more than one plane as primary ... do we need to protect this
ybogdano has quit [Ping timeout: 480 seconds]
<mattrope> abhinav__: I'm not really familiar with the hardware, but it's possible that platform has planes that can migrate between pipes. In that case it may be marking an equivalent # of planes as PRIMARY to the number of pipes that exist with the expectation that each one will get used for one specific pipe.
<mattrope> That's different than the Intel world where every plane is tied to a specific pipe and can't migrate.
<abhinav__> mattrope yes in this hardware, planes can migrate between pipes
MajorBiscuit has quit [Quit: WeeChat 3.4]
<abhinav__> yes and thats exactly whats happening here
<abhinav__> both planes are marked as primary because they can be shared
slattann has joined #dri-devel
<abhinav__> in that case to accommodate this, can we now get rid of the assert and perhaps replace it with an igt_warn?
<karolherbst> jekstrand: yay
ybogdano has joined #dri-devel
mclasen has joined #dri-devel
<mattrope> abhinav__: In theory having multiple planes marked as PRIMARY that can be used on a single pipe could confuse userspace since it won't know what to expect if legacy ioctls are used. But depending on the platform, there may never have been any userspace that wasn't 100% atomic from day 1 and it won't really matter if all of the logic is based on the capabilities determined from DRM properties. It's probably more of an issue for drivers
<mattrope> like i915 that have been around forever and need to maintain ABI compatibility for pre-atomic, mid-atomic-transition, and post-atomic userspace.
<mattrope> abhinav__: So if msm is already doing that and it isn't causing problems, then we can probably demote the IGT failure to a warning.
LexSfX has quit [Ping timeout: 480 seconds]
LexSfX has joined #dri-devel
<abhinav__> mattrope yes thanks, there are no issues with msm driver with having multiple primary planes, its only this igt sub-test which throws the assert ..... we will post a change to make it a warning .... will copy you on the change
<pzanoni> jekstrand, Kayden: I'll submit the updated version of the locking fix today
<jekstrand> pzanoni: :+1:
tzimmermann has quit [Quit: Leaving]
<karolherbst> jekstrand: where can I change CS shader metadata?
<daniels> ajax: no particularly good reason, no
<karolherbst> like INTERFACE_DESCRIPTOR_DATA has this DenormMode field I set to SetByKernel, but that also means the kernel itself has a field for it
<karolherbst> just to rule out it's really not a denorm issue
<karolherbst> the sampled values are -3.065715e-38 and 1.493025e-37 though :/ so highly unlikely
<daniels> ajax: it does mean that you can just allocate a single array of them and always refer to them that way in the backend; if you stored the base in the common struct, then you'd either need the backend doing a separate alloc for its private data if the frontend allocated an array of base, or the frontend allocating an array of pointers if the backend allocated an array of full images
<karolherbst> but who knows.. maybe something does flush that to 0 somewhere internally
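The claim that these values are normal rather than denormal is easy to check: the smallest positive normal f32 is about 1.17549435e-38, and all three values quoted in this discussion sit above that in magnitude. A minimal sketch, with `describe` being a helper name invented here:

```rust
use std::num::FpCategory;

/// Name the IEEE 754 category of a single-precision value.
fn describe(x: f32) -> &'static str {
    match x.classify() {
        FpCategory::Zero => "zero",
        FpCategory::Subnormal => "subnormal",
        FpCategory::Normal => "normal",
        FpCategory::Infinite => "infinite",
        FpCategory::Nan => "nan",
    }
}
```

Calling `describe` on 1.43328e-38, -3.065715e-38 and 1.493025e-37 reports "normal" for each, so a plain flush-to-zero of denormal inputs would not by itself explain a 0 result. That says nothing about intermediate products inside a filter, which can be smaller than the inputs.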
<daniels> ajax: but I guess that could be solved with something devPrivates-ish where you just tell it how much private storage you need tacked on
<ajax> daniels: i think my real question is more like why are the fence and blit semas arrays on the wsi_swapchain, instead of members of the wsi_image they're 1:1 with
<ajax> seeing image_index everywhere makes me itchy
frankbinns has quit [Remote host closed the connection]
<daniels> ajax: my impression is 'that's how Dave typed it up at one point and no-one was much interested in changing it'
<daniels> but there might be something more subtle I'm missing
<jekstrand> karolherbst: What do you mean?
<karolherbst> jekstrand: I am trying to understand what I need to do, so that "SetByKernel" has any effect
<jekstrand> karolherbst: That's tricky
<jekstrand> I had some hacks yesterday but I deleted them.
<karolherbst> ahh
<jekstrand> karolherbst: But, also, nothing is going to flush if it goes straight from the texture op to the write without any non-trivial ALU in between.
<karolherbst> yeah..
<jekstrand> Unless that bit also controls the sampler.
<jekstrand> But I kind-of doubt that.
<karolherbst> yeah....
<jekstrand> Then again, I really don't know. I guess maybe it could? Seems unlikely.
<karolherbst> I have no idea
<jekstrand> I doubt those bits are passed onto the sampler
<karolherbst> but it's also unlikely because all those values are normals
<jekstrand> Yeah
<karolherbst> but still.. why does it become 0.0
<jekstrand> idk
garrison has quit []
i-garrison has joined #dri-devel
rasterman has quit [Quit: Gettin' stinky!]
slattann has quit []
<anholt> imirkin: so, virgl's going in today, which leaves nouveau NIR before !8044 can land. I have an rb from karolherbst. I'd love to get an ack from you as well on the decision to do NIR backend instead of NTT.
<karolherbst> jekstrand: yeah... it makes like no sense at all..
<karolherbst> I have seriously no idea what's wrong here
<jekstrand> karolherbst: Neither do I. I may have a gen9 machine but it's my wife's laptop and has Windows on it. Not sure if it's got a big enough disk to really dual-boot.
<karolherbst> jekstrand: this happens on gen12 as well
<jekstrand> karolherbst: It does?
<karolherbst> yes
<jekstrand> Ok, then I can maybe debug
<karolherbst> jekstrand: ./build/test_conformance/images/kernel_read_write/test_image_streams read 3D CL_RGBA CL_FLOAT CL_FILTER_LINEAR CL_ADDRESS_CLAMP_TO_EDGE UNNORMALIZED
<karolherbst> same error on gen12
<karolherbst> same sample and everything :)
<jekstrand> woo
<karolherbst> mhhh
<jekstrand> karolherbst: Fails on the Intel driver too. :) Same failure mode.
<karolherbst> huh?
<karolherbst> oh wow
<karolherbst> but a later sample
<karolherbst> but yeah.. more or less the same issue
<karolherbst> mhh
<karolherbst> maybe the CTS doesn't care if one or two of those fail...
<jekstrand> idk
<karolherbst> only one way to find out
<karolherbst> I _think_ I saw the CTS just ignoring some fails
<karolherbst> sooo
<karolherbst> this one has to be it then
<karolherbst> half will fail, but why are they running fp16 tests if we don't support the ext :P
<karolherbst> well.. we still have to fix llvm, but
<karolherbst> then I can continue hacking on making everything Send
<jekstrand> I poked bashbaugh about the 3D fail on the official driver.
<karolherbst> in 10 hours we know if the CTS cares or not :)
<jekstrand> hehe
<karolherbst> yeah... but I work on speeding it up
<karolherbst> throwing callbacks into a threadpool should help
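The idea of offloading callbacks can be sketched with nothing but the standard library. This is an assumption-laden illustration, not rusticl's design: `CallbackQueue` is a made-up name, and it uses a single worker thread fed by a channel rather than a real pool. The submitting thread hands the callback over and returns immediately.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread::{self, JoinHandle};

/// A callback that is allowed to run on another thread.
type Callback = Box<dyn FnOnce() + Send + 'static>;

/// Hypothetical single-worker callback queue (illustration only).
struct CallbackQueue {
    sender: Sender<Callback>,
    worker: JoinHandle<()>,
}

impl CallbackQueue {
    fn new() -> Self {
        let (sender, receiver) = channel::<Callback>();
        // The worker drains callbacks until every sender has been dropped.
        let worker = thread::spawn(move || {
            for cb in receiver {
                cb();
            }
        });
        CallbackQueue { sender, worker }
    }

    /// Queue a callback and return without waiting for it to run.
    fn submit<F: FnOnce() + Send + 'static>(&self, f: F) {
        // Sending only fails if the worker thread has already exited.
        let _ = self.sender.send(Box::new(f));
    }

    /// Close the queue and wait for all pending callbacks to finish.
    fn finish(self) {
        drop(self.sender);
        self.worker.join().unwrap();
    }
}
```

The `Send` bound on `Callback` is what forces everything captured by a callback to be safe to move across threads.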
<jekstrand> I asked if they have a waiver. They might.
<karolherbst> potentially
<karolherbst> still a bit odd that they fail at a different sample
<jekstrand> Or they might have a CL_NO_REALLY_PASS_THE_CTS environment variable with a horrible workaround. :)
<karolherbst> :D
<jekstrand> We'll see what Ben says
<karolherbst> "error: aborting due to 246 previous errors" :'(
<jekstrand> :(
<karolherbst> Send in Rust really has strict requirements
<karolherbst> so any C pointer is !Send and !Sync
<karolherbst> cxx::UniquePtr is Send and Sync
<karolherbst> oh well
<karolherbst> I added a SafeCPtr to wrap those thingies
<karolherbst> but I might need more wrappers
<karolherbst> like pointers we've got from the client can't be just waived as safe
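A wrapper of the kind mentioned here might look roughly like the following. This is a guess at the general shape, not the actual SafeCPtr from rusticl: `SendPtr` and `bump_on_worker` are names invented for this sketch. Raw pointers are neither Send nor Sync, so the wrapper has to opt in with an `unsafe impl`, which moves the burden of proof onto whoever constructs it.

```rust
use std::thread;

/// Hypothetical wrapper that lets a raw pointer cross a thread boundary.
struct SendPtr<T>(*mut T);

// SAFETY: this is a promise made by the author, not something the compiler
// checks. It only holds if the pointee outlives every use on the other
// thread and is not accessed concurrently.
unsafe impl<T: Send> Send for SendPtr<T> {}

impl<T> SendPtr<T> {
    fn get(&self) -> *mut T {
        self.0
    }
}

/// Increment `value` from a worker thread and wait for it to finish.
fn bump_on_worker(value: &mut u32) {
    let ptr = SendPtr(value as *mut u32);
    let handle = thread::spawn(move || {
        // SAFETY: the caller blocks in join() below, so `value` is still
        // alive and nobody else touches it while this runs.
        unsafe { *ptr.get() += 1 };
    });
    handle.join().unwrap();
}
```

Going through `get()` instead of `ptr.0` inside the closure matters: with edition 2021 closures, touching the field directly would capture the bare raw pointer rather than the wrapper, and the closure would stop being Send. A pointer handed in by the client has no such guarantee behind it, which is why it cannot simply be wrapped and declared safe.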
<jekstrand> Now that Arc::new_cyclic exists, I'm mildly tempted to go back to wrappers.
<jekstrand> Not sure how tempted, though... (-:
<karolherbst> question is, which problem would it solve?
<jekstrand> If we did that, then most of our "real" types wouldn't have to be c-repr
<karolherbst> ahh yeah.. probably
alanc has quit [Remote host closed the connection]
mclasen has quit [Ping timeout: 480 seconds]
<jekstrand> karolherbst: The official driver also fails some sRGB 3D texture tests
<karolherbst> "fun"
ybogdano has quit [Ping timeout: 480 seconds]
ybogdano has joined #dri-devel
alanc has joined #dri-devel
<karolherbst> guess it doesn't matter
<karolherbst> ¯\_(ツ)_/¯
<jekstrand> karolherbst: A little disconcerting but ok
<karolherbst> they might run with a special seed or some special config
<karolherbst> jekstrand: ohhh...
<karolherbst> LINEAR isn't tested at all
<jekstrand> srsly?
<karolherbst> yeah
<jekstrand> :facepalm:
<karolherbst> let's see if the spec says anything about that
<karolherbst> at least the CL C spec describes how CLK_FILTER_LINEAR is calculated
JohnnyonF has joined #dri-devel
<karolherbst> so it's using frac() mhh
<karolherbst> jekstrand: "For all other sampler combinations of normalized or unnormalized coordinates, filter and addressing modes, the relative error or precision of the addressing mode calculations and the image filter operation are not defined by this revision of the OpenCL specification." :D :D :D
<karolherbst> "If the sampler is specified as using unnormalized coordinates (floating-point or integer coordinates), filter mode set to CLK_FILTER_NEAREST and addressing mode set to one of the following modes - CLK_ADDRESS_NONE, CLK_ADDRESS_CLAMP_TO_EDGE or CLK_ADDRESS_CLAMP, the location of the image element in the image given by (i,j,k) will be computed without any loss of precision."
<karolherbst> sooo...
<karolherbst> either you use unnormalized coords and NEAREST or.... well... you can get literally anything
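For reference, the linear filter the CL C spec describes for a 1D image takes the two texels around u - 0.5 and blends them with weight a = frac(u - 0.5). The sketch below follows that formula as I read it, for unnormalized coordinates with CLAMP_TO_EDGE addressing. `clamp_to_edge` and `sample_linear_1d` are names invented here, and since the spec leaves the precision of this path undefined, hardware results are not required to match it bit for bit.

```rust
/// CLAMP_TO_EDGE addressing: clamp a texel index into [0, len - 1].
fn clamp_to_edge(i: i64, len: usize) -> usize {
    i.clamp(0, len as i64 - 1) as usize
}

/// 1D linear filtering of `texels` at unnormalized coordinate `u`.
fn sample_linear_1d(texels: &[f32], u: f32) -> f32 {
    assert!(!texels.is_empty());
    let shifted = u - 0.5;
    let base = shifted.floor();
    // a = frac(u - 0.5), the blend weight of the second texel.
    let a = shifted - base;
    let i0 = clamp_to_edge(base as i64, texels.len());
    let i1 = clamp_to_edge(base as i64 + 1, texels.len());
    (1.0 - a) * texels[i0] + a * texels[i1]
}
```

Sampling exactly between two texel centers (u = 1.0 on a two-texel image) gives the average of both, while sampling at a texel center (u = 0.5) returns that texel unchanged.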
Haaninjo has joined #dri-devel
<karolherbst> jekstrand: anyway, I think we would pass conformance :)
<jekstrand> \o/
<jekstrand> and also :facepalm:
<karolherbst> here is the issue about fp16: https://github.com/KhronosGroup/OpenCL-CTS/issues/17
<karolherbst> worst case we just add support for fp16
<karolherbst> let me hack it in and see what happens
<jekstrand> karolherbst: Yeah, iris should be able to handle fp16
JohnnyonFlame has quit [Ping timeout: 480 seconds]
<karolherbst> jekstrand: iris doesn't enable FP16 support :)
<jekstrand> karolherbst: Oh...
<jekstrand> karolherbst: You can throw in a hack patch. It should work
<jekstrand> idk if we want to enable it for GL, though.
<karolherbst> yeah, seems like it
<karolherbst> mhh
<karolherbst> why not?
<karolherbst> it's a shader_cap in gallium, so we could enable it only for compute shaders
<jekstrand> Yeah, we could do that.
<karolherbst> mhh some fails
<jekstrand> Why not? Because that might get us mediump with weird perf
<karolherbst> ehh
<karolherbst> rte fails
<karolherbst> I lost interest, at this point patching the CTS would be simpler :D
<karolherbst> huh.. lp fails the same way.. maybe that's easily fixable
<karolherbst> ahh probably not
<karolherbst> I think we only support rtz/rtn and rtp in mesa/nir?
rasterman has joined #dri-devel
<zmike> jenatali: maybe you want to ack this https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16085
RSpliet has quit [Ping timeout: 480 seconds]
<jekstrand> karolherbst: for fp16, you mean?
RSpliet has joined #dri-devel
<karolherbst> jekstrand: yes
<jekstrand> karolherbst: Then let's try to fix the CTS
<karolherbst> I already wrote the fix :)
<jekstrand> I can look at it but it may not be easy
<jekstrand> cool
<karolherbst> extension check is trivial
<karolherbst> you just need to find all the places to stick that in :)
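A minimal sketch of the kind of extension check mentioned above, assuming the device's extension list is available as the space-separated string that CL_DEVICE_EXTENSIONS returns. The CTS itself is C++ and this is not karolherbst's patch; `has_extension` and `should_run_fp16_test` are hypothetical names used only for illustration.

```rust
/// Returns true if `name` appears as a whole token in a space-separated
/// extension string (the format used by CL_DEVICE_EXTENSIONS).
fn has_extension(extensions: &str, name: &str) -> bool {
    extensions.split_whitespace().any(|ext| ext == name)
}

/// Hypothetical gate: only run half-float tests when the device
/// advertises cl_khr_fp16.
fn should_run_fp16_test(device_extensions: &str) -> bool {
    has_extension(device_extensions, "cl_khr_fp16")
}
```

Matching whole tokens rather than using a substring search avoids false positives from longer extension names that merely start with the same text.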
<jenatali> zmike: Yep, r-b
<zmike> jenatali: can I also assume you're good with !16068 ?
<zmike> I don't know that there's any other wgl stakeholders
<karolherbst> jekstrand: I think I'll mod my runner then and split LINEAR and NEAREST
<jenatali> The other wgl stakeholders are vmware, if you wanted to ping them
<jenatali> Lemme take a closer look
mclasen has joined #dri-devel
mclasen has quit [Remote host closed the connection]
<zmike> it's more or less mimicking everything in glx/egl
mclasen has joined #dri-devel
<jekstrand> karolherbst: sounds reasonable.
MajorBiscuit has joined #dri-devel
* karolherbst kicks another CTS run
iive has joined #dri-devel
gouchi has joined #dri-devel
apinheiro has joined #dri-devel
mvlad has quit [Remote host closed the connection]
pcercuei has quit [Quit: brb]
pcercuei has joined #dri-devel
mclasen has quit []
mclasen has joined #dri-devel
Haaninjo has quit [Quit: Ex-Chat]
maxzor has quit [Ping timeout: 480 seconds]
ngcortes has joined #dri-devel
gawin has joined #dri-devel
pcercuei has quit [Quit: brb]
pcercuei has joined #dri-devel
<karolherbst> why is wrapping C callbacks such a painful thing in rust :(
pcercuei has quit []
<jekstrand> karolherbst: Because they're unsafe and scary and stuff
* jekstrand builds rusticl on his raspberry pi
<karolherbst> yeah.. that's not really my problem though, I want to wrap all CL callback things with a struct, and that's annoying
<karolherbst> ohhh.. wait
<karolherbst> maybe I can cheat
pcercuei has joined #dri-devel
<karolherbst> nice
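One way the "wrap all CL callback things with a struct" idea could look, as a rough sketch only. This is not rusticl's actual code: the `Callback` type, its methods, and the simplified single-argument signature are assumptions for illustration (real CL callbacks take extra arguments besides the user data pointer).

```rust
use std::ffi::c_void;

/// Simplified stand-in for a CL notification callback: a C function
/// that receives only the opaque user data pointer.
type RawCallback = unsafe extern "C" fn(user_data: *mut c_void);

/// Hypothetical wrapper that keeps the function pointer and its user
/// data together so they cannot be separated by accident.
struct Callback {
    func: RawCallback,
    user_data: *mut c_void,
}

impl Callback {
    /// C APIs pass a nullable function pointer; `None` means the client
    /// did not register a callback at all.
    fn new(func: Option<RawCallback>, user_data: *mut c_void) -> Option<Self> {
        func.map(|func| Callback { func, user_data })
    }

    /// Invokes the wrapped callback with its stored user data.
    fn call(&self) {
        // SAFETY: whoever constructed this wrapper promised that `func`
        // may be called with `user_data` for as long as the wrapper lives.
        unsafe { (self.func)(self.user_data) }
    }
}

/// Example C-ABI callback: increments the u32 behind `user_data`.
unsafe extern "C" fn bump(user_data: *mut c_void) {
    // SAFETY: only ever registered with a pointer to a live u32.
    unsafe { *(user_data as *mut u32) += 1 };
}
```

Keeping the single `unsafe` call inside `call` means the rest of the code can pass `Callback` values around without repeating the safety argument at every use site.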
gawin has quit [Ping timeout: 480 seconds]
pcercuei has quit [Quit: brb]
pcercuei has joined #dri-devel
<jekstrand> karolherbst: Does OpenCL require that if you store a pointer to a global in another global, that it works in the next kernel?
<karolherbst> jekstrand: good question
<karolherbst> but I don't think so
<karolherbst> well
<karolherbst> not always at least
<karolherbst> with SVM of course that has to work, but I think the client is responsible for making sure shit doesn't break
maxzor has joined #dri-devel
<karolherbst> like you can't use that ptr once the client maps it for writing or something
<jekstrand> of course
MajorBiscuit has quit [Ping timeout: 480 seconds]
<karolherbst> I just don't know if buffers need to have stable addresses across the entire lifetime
<karolherbst> _but_ given that you have to you know, specify buffers on kernels, I think it only works with bound buffers
<karolherbst> for lower SVM levels you also have to specify _all_ indirectly used SVM buffers
<karolherbst> unless you do system SVM
MajorBiscuit has joined #dri-devel
MajorBiscuit has quit []
<karolherbst> jekstrand: any specific reason you are asking?
<jekstrand> Thinking about haswell
<karolherbst> can the GPU address change between invocations?
Duke`` has quit [Ping timeout: 480 seconds]
<karolherbst> anyway.. I think as long as you don't do SVM things can crash if you use pointers across different kernels
<karolherbst> or even invocations
gouchi has quit [Remote host closed the connection]
<jekstrand> ugh...
<jekstrand> Went to create a LLVM code review account and it sent the verification e-mail to my intel address. :-(
<karolherbst> :(
<jekstrand> Yup, so now I'm locked out forever, probably. :-/
<karolherbst> noooooo :(
<jekstrand> sent an e-mail to the llvm-admin list. Hopefully someone can delete it and let me try again.
craftyguy has quit [Quit: craftyguy]
<karolherbst> "error: aborting due to 178 previous errors" at least some progress
craftyguy has joined #dri-devel
gawin has joined #dri-devel
danvet has quit [Ping timeout: 480 seconds]
rkanwal has quit [Quit: rkanwal]
ahajda has quit [Remote host closed the connection]
ahajda has joined #dri-devel
ahajda has quit [Remote host closed the connection]
ahajda has joined #dri-devel
pcercuei has quit [Quit: brb]
mclasen has quit []
mclasen has joined #dri-devel
rasterman has quit [Quit: Gettin' stinky!]
Kayden has quit [Quit: to office]
nchery has quit [Ping timeout: 480 seconds]
mclasen has quit [Read error: Connection reset by peer]
<daniels> jekstrand: I'm pretty sure that storing a pointer into a global (passed only in f1) into a global (passed in both f1 and f2) that you later dereference from f2, is not even a little bit valid
mclasen has joined #dri-devel
<karolherbst> daniels: it's totally valid with SVM
pcercuei has joined #dri-devel
<daniels> right
<daniels> but not without
<karolherbst> yeah, most likely not
<karolherbst> but it could be valid if you make sure all used buffers are bound to the invocation
zf has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
icecream95 has joined #dri-devel
ybogdano has quit [Ping timeout: 480 seconds]
zf has joined #dri-devel
ahajda has quit [Read error: Connection reset by peer]
<karolherbst> daniels: ehh.. also what about ctors in CL 2.0?
pcercuei has quit [Quit: brb]
<karolherbst> and pointers to in kernel global memory
<daniels> karolherbst: the world I inhabit is complex enough without OpenCL C++
<karolherbst> :)
<karolherbst> but in kernel globals are part of OpenCL C
<karolherbst> and you could store a pointer to that in kernel buffer, because it should kind of live across invocations, no?
pcercuei has joined #dri-devel
<karolherbst> but I guess it would be safer to just store offsets
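The "just store offsets" idea can be illustrated with a small host-side model, assuming a buffer's device address may differ from one kernel invocation to the next. `BoundBuffer`, `to_offset`, and `to_address` are hypothetical names; nothing here is taken from Mesa or the OpenCL spec.

```rust
/// Hypothetical model of a buffer as bound for one kernel invocation:
/// its device base address and its size in bytes.
struct BoundBuffer {
    base: u64,
    size: u64,
}

/// Converts an absolute address into an offset within `buf`, or `None`
/// if the address does not fall inside the buffer.
fn to_offset(buf: &BoundBuffer, addr: u64) -> Option<u64> {
    addr.checked_sub(buf.base).filter(|off| *off < buf.size)
}

/// Rebuilds an absolute address from a stored offset using the base
/// address the buffer has in the current invocation.
fn to_address(buf: &BoundBuffer, offset: u64) -> Option<u64> {
    if offset < buf.size {
        buf.base.checked_add(offset)
    } else {
        None
    }
}
```

An offset saved during one invocation stays meaningful even if the same buffer is bound at a different base address later, which a raw pointer would not.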
<daniels> should it live across invocations ... ?
<daniels> I don't remember anything from the spec which indicated it should at all (bar SVM ofc)
<karolherbst> daniels: "Variables at program scope or static or extern variables inside functions can be declared in global address space if the __opencl_c_program_scope_global_variables feature is supported. These variables in the global address space have the same lifetime as the program, and their values persist between calls to any of the kernels in the program. They are not shared across devices and have distinct storage."
<daniels> ah right, I don't think I went quite that high in CL versions :)
<karolherbst> from the spec even: "global float4 *color; // An array of float4 elements"
<karolherbst> so... what ptr can you store in that if not a pointer from a buffer?
<karolherbst> question is... is that just a bad example, or are buffer addresses expected to live across invocations
<daniels> that makes sense for what you've said about program scope / static / extern
<karolherbst> or is that just for variables in global scope, which... makes that pointless
<daniels> so those would have to persist across program lifetime, rather than kernel lifetime
<karolherbst> ohh wait... no, we have to assign storage to those vars and init them with 0.. right
<karolherbst> you don't assign a pointer to them
<daniels> you don't assign pointers _to_ them, but you can store addresses _of_ them?
<karolherbst> I think so
<karolherbst> one example is doing so even
<daniels> yeah, that's also the most open reading of the spec, and CL requires nothing if not very open reading of the spec
<karolherbst> global int baz; // OK.
<karolherbst> global int *constant ptr = &baz; // OK.
<karolherbst> like that
<karolherbst> all of that is just super wild
<daniels> tbf that whole spec is just 'what if ... ?' 'yeah cool why not'
<karolherbst> :)
<karolherbst> that's CL 2.0, yes
<daniels> I stopped at 1.2
<karolherbst> I didn't
<karolherbst> :D
<karolherbst> and I think we'll curse mesa with a compliant CL 3.0 impl
zf has quit []
<daniels> some of the reason was time (finite), some of the reason was DXIL, some of the reason was my extremely limited sanity
<karolherbst> yeah...
<daniels> 3.0 is the best version though
<daniels> 'remember 2.0? yeah, you don't need any of that, unless you really want it'
<karolherbst> I can see that even getting to 1.2 is hard if you have to use some graphics API
ybogdano has joined #dri-devel
<karolherbst> it requires some stuff
<karolherbst> but not the annoying bits
<daniels> eh, 1.2 was on DX compute, and that was ... I'm not going to say 'easy', but it was certainly doable
<karolherbst> oh sure
pcercuei has quit [Quit: dodo]
<daniels> most of the friction was around memory access models
<daniels> 2.x though ...
<karolherbst> but after that it becomes painful
<karolherbst> CL 2.0 wouldn't be such a pain if it didn't add like completely insane APIs and require them
<karolherbst> sure _some_ want to enqueue more stuff from within kernels, but most don't
<daniels> (well ok, most of the friction was reverse-engineering the constraints of a fork of LLVM 3.8 bitcode, but most of the _rest_ of the friction was memory access models)
zf has joined #dri-devel
mclasen has quit [Remote host closed the connection]
<karolherbst> not sure how brutal it would be to implement enqueue_kernel
<karolherbst> does gl/vk allow this stuff?
<karolherbst> *Do
<daniels> yeah, the GPU-side enqueue/copy/wait/etc is pretty punchy
neonking has quit [Read error: Connection reset by peer]
<karolherbst> pipes sound annoying as hell as well
<karolherbst> let's do unix pipes, but on the GPU
<karolherbst> and not just 1:1, but like n:m
mclasen has joined #dri-devel
<daniels> more or less annoying than printf?
<karolherbst> printf is nothing in comparison
<daniels> sweet
<karolherbst> pipes are like real unix pipes, just across kernels
<HdkR> unix pipes on the GPU? Now you're talking to me like DirectStorage.
<karolherbst> "reserve_read_pipe" what the hell even
<daniels> just wait until you can set a p2pdma endpoint as a kernel arg
<daniels> (wait, this channel is publicly logged. shouldn't give anyone ideas.)
<karolherbst> daniels: I am sure you can do this shit with SVM
<karolherbst> too late
<karolherbst> cl_mesa_linux_p2pdma
<daniels> luckily for me I'm just a lowly winsys guy
<HdkR> Lucky for me I'm just a lowly CPU emulation guy
<karolherbst> ....
neonking has joined #dri-devel
<karolherbst> I won't be able to escape, will I?
mclasen has quit [Remote host closed the connection]
<daniels> karolherbst: if you need to ask ...
<daniels> but you have my sympathies
<bnieuwenhuizen> could always choose not to implement
<karolherbst> let me finish CL 3.0 first :P
<DrNick> they released the DirectStorage to the public and it didn't contain anything interesting
<karolherbst> well
<karolherbst> that's just memory
<karolherbst> you get a pointer, you can read from it
<karolherbst> big deal
<HdkR> DrNick: The biggest deal is that it is an easy to use, clean, and nice API
<HdkR> Not like it would expose implementation details of GPU decompression routines (Which isn't there yet anyway)
<DrNick> yeah, that was the nothing interesting part
<DrNick> just a wrapper around IoRing and D3D 12
gawin has quit [Ping timeout: 480 seconds]
<jekstrand> daniels: I'd feel more comfortable if I could find something in the spec.
<jekstrand> But I'm not an OpenCL spec expert. Maybe I should print it out and start reading before bed. /o\
<karolherbst> jekstrand: noooooo
alarumbe has quit [Read error: No route to host]
<daniels> jekstrand: cool, see you at Christmas
<jekstrand> Ugh... fedora 36 has LLVM 14 but SPIRV-LLVMTranslator 13. :(
<karolherbst> daniels: it's not that huge
<karolherbst> jekstrand: :)
<jekstrand> 1.2 is only 380 pages.
<airlied> jekstrand: I should rebuild that then
<jekstrand> airlied: Yes, yes you should. :P
<jekstrand> Ugh... new GCC with LOTS of warnings
alarumbe has joined #dri-devel
heat has quit [Remote host closed the connection]
heat has joined #dri-devel
<jekstrand> And now rusticl won't build... *sigh*
<jekstrand> /usr/lib64/clang/14.0.0/include/emmintrin.h:2378:10: error: invalid conversion between vector type '__m128i' (vector of 2 'long long' values) and integer type 'int' of different size
Kayden has joined #dri-devel
morphis has quit [Ping timeout: 480 seconds]
<karolherbst> jekstrand: fun
morphis has joined #dri-devel
* jekstrand installs bindgen w/ cargo
<jekstrand> That fixed it
<dcbaker> PSA: I'll be cutting a new 22.0 release later tonight. The staging/22.0 branch has a lot of manual backports if anyone cares to look what's there and what I did
apinheiro has quit [Quit: Leaving]
<kisak> dcbaker: thanks!
sdutt has quit [Remote host closed the connection]
mdnavare has quit [Remote host closed the connection]
mdnavare has joined #dri-devel
sdutt has joined #dri-devel
ramaling has quit [Remote host closed the connection]
ramaling has joined #dri-devel
sarnex has quit [Quit: Quit]
jewins has quit [Remote host closed the connection]
jewins has joined #dri-devel
<airlied> mlankhorst: should I be getting drm-misc-fixes from you or someone else?
<airlied> mripard: ^ maybe you know
<airlied> jekstrand: f36 update filed
sarnex has joined #dri-devel
iive has quit []