ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
iive has quit []
nchery is now known as Guest3020
nchery has joined #dri-devel
FireBurn has joined #dri-devel
FireBurn has quit [Quit: Konversation terminated!]
ahajda has quit [Read error: Connection reset by peer]
Guest3020 has quit [Ping timeout: 480 seconds]
jernej has joined #dri-devel
co1umbarius has joined #dri-devel
<anholt> karolherbst: sounds like I need to do deqp-runner for CL CTS.
columbarius has quit [Ping timeout: 480 seconds]
<karolherbst> anholt: we only use piglit
<anholt> right, but piglit doesn't have things like flake lists
<karolherbst> yeah :(
<karolherbst> but it's not a flake
<karolherbst> it just takes a long time
<karolherbst> which is kind of a flake, but not a real one
<karolherbst> anholt: but I already have a weird runner for the CL CTS
<karolherbst> it just doesn't really support parsing lists of what's expected to pass/fail/flake
jernej has quit [Quit: Free ZNC ~ Powered by LunarBNC: https://LunarBNC.net]
jernej has joined #dri-devel
<anholt> right, and deqp-runner would get you all that.
<karolherbst> it's terrible because the CTS is terrible
<airlied> CL CTS is so inconsistent within itself
<karolherbst> anholt: problem is: the CL CTS doesn't give you list of tests :)
<karolherbst> most of the code there really just deals with figuring out what subtests we've got
FireBurn has joined #dri-devel
<karolherbst> airlied: you know what? I'll just disable fp64 support, that should make llvmpipe pass most of the stuff as well :D
<karolherbst> and I think all crashes in JIT code are fp64 kernels anyway
<karolherbst> but probably a bit more will fail.. oh well
<airlied> yeah fine with that
<karolherbst> okay.... mhhh
<karolherbst> either I found a resource leak in iris or llvmpipe is buggy
nvishwa1 has joined #dri-devel
<karolherbst> mhhh...
<karolherbst> after creating a new resource, do I have to pipe_resource_reference it or not?
<airlied> nope, it should have a ref count of 1
<karolherbst> okay.. then llvmpipe is buggy
<karolherbst> so I only call pipe_resource_reference once when dropping my rust wrapper, which should be the correct thing
<karolherbst> but llvmpipe crashes this way. when I increase the ref with pipe_resource_reference after creating the resource, it doesn't crash
<karolherbst> mhh something is up...
<karolherbst> maybe I forget to ref it somewhere else
<karolherbst> airlied: it crashes inside this lpr->list thing
<karolherbst> I guess with refing I never called into llvmpipe_resource_destroy
<airlied> weird, no idea where the refcounts could be off there on the driver side
<airlied> it's a pretty well smashed path
<karolherbst> mhh, I think the refcount is fine, but there is some weird corruption going on
<karolherbst> p lpr->list
<karolherbst> $20 = {prev = 0x61500000ca70, next = 0x7ffff341a6b0 <resource_list+496>}
<karolherbst> mhhh
<airlied> that does look back
<airlied> bad
<karolherbst> yeah...
<karolherbst> let me see if the resources I get are all fine actually
<airlied> uggh that debug resource list is horrible, but does have a mutex
<karolherbst> maybe it's something with user pointers
<airlied> ah there is a bug there
<karolherbst> :)
<airlied> the user memory path never adds to the resource list
<karolherbst> I just wanted to say that literally nothing tests this :D
<airlied> at least nothing built with debug turned on :-P
<karolherbst> probably :)
<airlied> karolherbst: 16207
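A rough sketch of the reference-counting convention discussed above, modelled in Python rather than Mesa's C. It assumes only what was said in the channel: a freshly created resource starts with a reference count of 1, and dropping the last reference destroys it. `Resource`, `resource_reference` and the `destroyed` flag are hypothetical stand-ins for illustration, not the Gallium API.

```python
class Resource:
    """Hypothetical stand-in for a refcounted driver resource."""

    def __init__(self):
        self.refcount = 1       # creation hands the caller one reference
        self.destroyed = False


def resource_reference(slot, src):
    """Make slot[0] refer to src and adjust both reference counts.

    `slot` is a one-element list standing in for a C pointer-to-pointer.
    This models the idiom only; it is not the real helper.
    """
    old = slot[0]
    if old is src:
        return
    if src is not None:
        src.refcount += 1
    if old is not None:
        old.refcount -= 1
        if old.refcount == 0:
            old.destroyed = True    # the driver's destroy hook would run here
    slot[0] = src


def demo():
    res = Resource()
    holder = [res]                      # the wrapper owns the creation reference
    resource_reference(holder, None)    # a single unref when the wrapper drops
    return res
```

In this model, taking an extra reference right after creation means the single unref on drop never reaches zero, which would be consistent with the crash disappearing simply because the destroy path was never entered.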
<karolherbst> airlied: btw.. what's your view on using an ubo for kernel inputs?
<karolherbst> does it make things for llvmpipe more complicated or doesn't it matter?
<airlied> karolherbst: I think it's okay for llvmpipe, impossible for radeonsi/llvm
<airlied> well not impossible, but very messy
<karolherbst> mhhh
<karolherbst> how are they handling uniforms?
<karolherbst> because normally the uniforms are getting pushed into a driver via a constbuf anyway, no?
<airlied> well I suppose if the NIR can keep the function params definitions then it's fine
mbrost has joined #dri-devel
<airlied> the problem is on the compiler side
<airlied> though one workaround I may need is to add extra args to the kernel inputs buffer
<karolherbst> mhhhh
<airlied> which is easier if I can see it
<karolherbst> yeah.. I don't delete the args
<karolherbst> well.. at least not the used ones
<karolherbst> :D
<karolherbst> airlied: but why would it matter?
<karolherbst> atm I played around by creating a constant_buffer with a user_buffer
<karolherbst> ohh.. compiler.. mhh
<karolherbst> annoying
<karolherbst> anyway, the info should be all there
<airlied> karolherbst: to build the llvm compute kernel, you have to have the function signature
<karolherbst> right....
<airlied> after lowering that isn't available anymore
<karolherbst> I thought you wanted to use aco? :P
<karolherbst> or is that also using llvm?
<karolherbst> (never checked)
<airlied> nope this problem is just llvm
<airlied> using aco is a very large project, I'd like to get some baseline working with llvm to compare against
<karolherbst> right...
<airlied> but yeah I'm back and forth on what is the best answer here
<karolherbst> anyway.. you get the uniforms
<airlied> maybe rusticl should do its thing and then I can fix radeonsi/aco up for that
<airlied> and just baseline using clover hacks for now
<karolherbst> like this
<karolherbst> uhm...
<karolherbst> why aren't we const folding u2u(0) to 0? something isn't right here...
<karolherbst> airlied: yeah.. well.. your choice. If something works with clover, it should also work with rusticl.. more or less. At least nothing from the explicit expectation changes
<karolherbst> airlied: image_size is busted
<karolherbst> airlied: so it gets the coords via load_input, llvmpipe replaces it with 0 somewhere
<karolherbst> but between the load and the image_size is a u2u
<karolherbst> so it can't read out the const value
<airlied> shouldn't that u2u disappear? :-P
<karolherbst> well.. yeah? :P
<karolherbst> but I am a little confused....
<karolherbst> what's the first arg to image_size anyway?
<airlied> isn't it just the image?
<airlied> texture size takes an lod
<karolherbst> ahh
<karolherbst> airlied: I guess this is fallout from that: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16205
<karolherbst> I insert that u2u so that... some drivers deal with it :D
<karolherbst> but I think llvmpipe might want to run constant_folding after inserting inputs? dunno
<karolherbst> don't know when that happens, except after I'm finished messing with it
<karolherbst> maybe we want to opt u2u(load_input) to load_input ?
<airlied> seems like it should be easy to convert that to a load input of a different size, but it might be a bit tricky I suppose
<karolherbst> something isn't right though...
<karolherbst> yeah soo...
<karolherbst> the first arg is the image
<karolherbst> the second one is the lod
<karolherbst> the lod is a plain 0 32 bit
<karolherbst> it's the image which is messed up
<karolherbst> huh?
slattann has joined #dri-devel
<karolherbst> ahh.. I had some WIP non inlining stuff still not reverted
<karolherbst> now stuff is better :)
<karolherbst> btw, who do I need to ping to get i915 locking fixes in :D
<airlied> did you cc #intel-gfx?
<airlied> or rather intel-gfx@
<karolherbst> I did
<karolherbst> even created a bug and everything
<karolherbst> but it's also only a week, soo...
<karolherbst> but I kind of assumed a memory corruption bug and...
<airlied> thellstrom or mlankhorst might be a good place to start
<karolherbst> okay.. btw, the patch is here: https://patchwork.freedesktop.org/patch/482687/ (if either of those see the messages here)
<karolherbst> "Pass 2298 Fails 32 Crashes 22" not too bad
<karolherbst> airlied: okay.. soo.. crashes are all images related, fails are mostly images + contractions
<karolherbst> contractions: 5) Error for float kernel1: -0x1.f6b356p-51 * -0x1.d947f8p-90 + -0x0p+0 = *0x0p+0 vs. 0x1.d08p-140
<karolherbst> mhh
<karolherbst> ohhh
<karolherbst> that's FTZ
<karolherbst> okay...
<karolherbst> hum
<karolherbst> airlied: soo.. should llvmpipe flush denorms to zero for compute or not
<karolherbst> ?
<karolherbst> it doesn't seem to adjust the fpstate for compute
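A small sketch that reproduces the two values in the contractions error line, assuming round-to-nearest float32 arithmetic. `to_f32` and `flush_to_zero` are illustrative helpers, not anything from the CTS or llvmpipe. The product of two float32 values fits exactly in a double, so rounding it once to float32 gives the IEEE result; flushing that denormal gives the other value.

```python
import math
import struct

F32_MIN_NORMAL = 2.0 ** -126    # smallest normal float32 magnitude


def to_f32(x):
    """Round a Python float (double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def flush_to_zero(x):
    """Model of FTZ on a float32 result: denormals become a signed zero."""
    if x != 0.0 and abs(x) < F32_MIN_NORMAL:
        return math.copysign(0.0, x)
    return x


a = float.fromhex("-0x1.f6b356p-51")
b = float.fromhex("-0x1.d947f8p-90")
c = float.fromhex("-0x0p+0")

exact = a * b + c           # exact in double: a 48-bit product plus zero
ieee = to_f32(exact)        # denormal-preserving float32 result
ftz = flush_to_zero(ieee)   # result with denormals flushed

print(ieee.hex(), ftz.hex())
```

Both inputs are normal float32 values; only the result falls below 2^-126, so the difference comes entirely from how the output denormal is handled.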
ngcortes has quit [Ping timeout: 480 seconds]
ybogdano has quit [Ping timeout: 480 seconds]
mbrost_ has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
heat_ has joined #dri-devel
heat has quit [Remote host closed the connection]
heat_ has quit [Read error: Connection reset by peer]
heat has joined #dri-devel
Company has quit [Quit: Leaving]
mbrost_ has quit [Remote host closed the connection]
mbrost has joined #dri-devel
mdroper has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
minecrell has quit [Quit: Ping timeout (120 seconds)]
minecrell has joined #dri-devel
mbrost has joined #dri-devel
mbrost_ has joined #dri-devel
mbrost has quit [Read error: Connection reset by peer]
shankaru has joined #dri-devel
heat has quit [Ping timeout: 480 seconds]
<airlied> karolherbst: no idea what the fpstate is set for, would have to dig into it
LexSfX has joined #dri-devel
mbrost_ has quit [Remote host closed the connection]
mbrost_ has joined #dri-devel
i-garrison has quit [Ping timeout: 480 seconds]
i-garrison has joined #dri-devel
Duke`` has joined #dri-devel
mbrost_ has quit [Remote host closed the connection]
mattst88 has quit [Ping timeout: 480 seconds]
sdutt_ has joined #dri-devel
sdutt has quit [Read error: Connection reset by peer]
mdroper has quit [Read error: Connection reset by peer]
nvishwa1_ has joined #dri-devel
nvishwa1 has quit [Read error: Connection reset by peer]
Duke`` has quit [Ping timeout: 480 seconds]
DanaG has joined #dri-devel
<DanaG> Weird, my radeon E9173 fails to start any displays or anything. Could it be a damaged GPU? https://dpaste.com//BH7WB9CPB
mvlad has joined #dri-devel
LexSfX has quit []
<FLHerne> DanaG: see after "Call Trace:", something blows up in amdgpu during device init
nchery has quit [Ping timeout: 480 seconds]
<FLHerne> More likely to be a kernel than hardware problem IMO
<FLHerne> (although I'm sure broken hardware *could* throw the driver off)
<DanaG> It died in the same place on my aarch64 board, too.
<DanaG> Well, not necessarily the same place, but the same message (big difference)
itoral has joined #dri-devel
LexSfX has joined #dri-devel
Daanct12 has joined #dri-devel
<FLHerne> What kernel are you using?
<DanaG> Here's a new paste with drm.debug=0x1fff: http://sprunge.us/ja6VLM
<DanaG> It's ubuntu 5.15.0-27-generic
tzimmermann has joined #dri-devel
<DanaG> I've always found it odd that there are "warnings" with no message, just a backtrace. "There's a problem at this address in the neighborhood." But what is the problem? <no answer>
<DanaG> WARNING: CPU: 3 PID: 3101 at drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_aux.c:395 dce_aux_transfer_raw+0x28d/0x2f0 [amdgpu]
tursulin has joined #dri-devel
YuGiOhJCJ has joined #dri-devel
<FLHerne> Did it work before with that kernel, or what did you have before?
<DanaG> I think the only thing that changed is that I tried the GPU in an HP T730 thin client's PCIe slot, and I'm not sure how many watts that slot can supply, but it's not a high-wattage card.
<DanaG> And I had a previous WX 4100 go broken when I tried it in that slot, but couldn't be sure it wasn't a Supermicro board I had at the time that did it.
icecream95 has quit [Remote host closed the connection]
<FLHerne> All PCIe x16 slots should support 75W by spec
<FLHerne> I guess an OEM thin client might not bother
<FLHerne> but I really doubt underpowering a card would permanently damage it anyway
<FLHerne> I'd try some other kernels and/or file a bug
<FLHerne> 5.15 (and even some backports to 5.14) have definitely had regressions on certain hardware
<DanaG> I can try booting an older one.
danvet has joined #dri-devel
<DanaG> I should also note that I have a displayport KVM switch, which can be finicky (even though it's an optical displayport cable on the output). Actually, let me try direct connecting it...
ahajda has joined #dri-devel
slattann has quit []
<DanaG> I'll also check if it works properly in Windows.
nchery has joined #dri-devel
<FLHerne> those both sound like good ideas
LexSfX has quit [Ping timeout: 480 seconds]
LexSfX has joined #dri-devel
adjtm has quit [Quit: Leaving]
adjtm has joined #dri-devel
LexSfX has quit []
adjtm has quit []
<HdkR> `UBSAN: shift-out-of-bounds in /build/linux-HMZHpV/linux-5.15.0/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c:601:30` The UBSAN in amdgpu is pretty questionable as well
<HdkR> Could be UBSAN is tweaking some behaviour just enough that it is breaking things and needs to not be enabled :)
<DanaG> Interesting, I disabled the EFI CSM, and now I at least get a framebuffer at boot, but still get the flip timeout. Now going to try old kernel.
icecream95 has joined #dri-devel
LexSfX has joined #dri-devel
<DanaG> 5.13 kernel: same timeout.
<javierm> tzimmermann: answered to the list. What you mention was my first thought as well but then realized that wasn't the correct thing to do
jkrzyszt has joined #dri-devel
shankaru has quit [Read error: Connection reset by peer]
<DanaG> Driver seems to work okay in Windows, at least under a quick test. When the WX 4100 failed, it was dying in Windows too, but this is better at least.
<DanaG> I'm pondering getting a W6400 for this machine to mess with KVM. Has that GPU had the dang PCI reset problems fixed?
<DanaG> I do have IOMMU and ACS and ARI enabled, I wonder if any of the settings in that group are relevant? I recall seeing Navi get a quirk to disable ARI.
<DanaG> This isn't a Navi, though.
rasterman has joined #dri-devel
<DanaG> Looks like amdgpu.dc=0 makes it not die. I guess it's bugreport time, indeed. But first, just time for sleep.
frieder has joined #dri-devel
<tzimmermann> javierm, the way dp_aux_chardev and dp_cec work is a bit messy. but we cannot really do much about it ATM
slattann has joined #dri-devel
<javierm> tzimmermann: yeah, I've sent yet another proposal in the thread
<tzimmermann> javierm, one alternative is to provide empty stubs of drm_dp_dpcd_write()/read()/etc if DISPLAY_DP_HELPER has been disabled. that would resolve the linker error at least
<DanaG> With dc disabled, I see [drm:amdgpu_atombios_dp_process_aux_ch.constprop.0 [amdgpu]] dp_aux_ch flags not zero
lynxeye has joined #dri-devel
<DanaG> dmesg with dc=0 (back to drm.debug=0x10e): http://sprunge.us/T2aA69
<javierm> tzimmermann: indeed, I can include that in v3 if you don't agree with the latest option I mentioned
<javierm> I would just kill all these user configurable options, I think that made more sense before you made the split but have little value nowadays
<DanaG> That reminds me, I had hangs on aarch64 on my wx 4100, until I switched from active DP-HDMI to passive DP-HDMI (target device is pikvm so doesn't need active).
<DanaG> Backtrace mentioned CEC.
<DanaG> I don't have that cec backtrace handy, though.
<emersion> pq: has anyone replied to "It's bit moot to e.g. render everything in electrical 10 bit RGB, if the link is just going to squash that into electrical 8 bit RGB, right?"
DanaG has quit [Remote host closed the connection]
<pq> emersion, nope
MrCooper_ is now known as MrCooper
<emersion> okay. wondering if that would be sane default behavior
<pq> emersion, but MrCooper_ did point out that electrical 8 bpc FB may not be a reason to turn link bpc down to 8 too, because the KMS color pipeline can have more precision.
<MrCooper> on a different (though somewhat related) topic: with temporal dithering, the effective observable bpc can be higher than the HW bpc, right?
<pq> that's the idea, I believe, yes
<pq> or even spatial dithering - you rarely look at individual pixels
<MrCooper> then requiring a minimum HW bpc might artificially exclude some scenarios which would actually work as intended
<pq> probably, as long as we have no idea if dithering is there or not
<MrCooper> swick: ^ so user space actually can't know the required minimum HW bpc
<MrCooper> I guess drivers could be allowed to take dithering into account for the minimum bpc
<MrCooper> or maybe dithering should be explicitly controlled as well (e.g. apparently some people physically cannot bear temporal dithering)
<pq> I would want explicit control and knowledge of dithering.
lemonzest has joined #dri-devel
gouchi has joined #dri-devel
gouchi has quit []
rasterman has quit [Quit: Gettin' stinky!]
pcercuei has joined #dri-devel
digetx has quit [Read error: Connection reset by peer]
<mripard> danvet bbrezillon : do you know what this FIXME references: https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/vc4/vc4_crtc.c#L839 ? did the generic async-page-flip turn out to be asynchronous plane updates, or something else?
digetx has joined #dri-devel
aravind has joined #dri-devel
rasterman has joined #dri-devel
<javierm> tzimmermann: funny that the first local patch I had here to just bypass the link error was "depends on DRM && DRM_DISPLAY_HELPER && DRM_DISPLAY_DP_HELPER"
<javierm> tzimmermann: but that didn't feel quite right to me for the reasons I mentioned in the thread. I didn't expect this change to be that controversial :)
rasterman has quit [Remote host closed the connection]
rasterman has joined #dri-devel
<danvet> mripard, no, never got wired through
<danvet> but probably a good idea to do so?
<danvet> the rough plan was to reuse the async cursor flip stuff of some sorts
<danvet> and let drivers figure out which exact kind of async we really need
<danvet> but the idea of flip_done was that that would hide this all sufficiently well
Daanct12 has quit [Remote host closed the connection]
sdutt_ has quit [Ping timeout: 480 seconds]
gawin has joined #dri-devel
rkanwal has joined #dri-devel
icecream95 has quit [Ping timeout: 480 seconds]
<javierm> tzimmermann: sure. Btw, Dan's report about patch 1/5 missing a mutex_unlock(&info->mm_lock) before return seems correct
<tzimmermann> javierm, yes. i forgot that mutex_unlock.
<karolherbst> jekstrand, airlied: mhh.. I think we have to disable user resources for non 1D images as we currently have no way of specifying custom slices :(
thellstrom has joined #dri-devel
nvishwa1_ has quit [Ping timeout: 480 seconds]
<karolherbst> but fixing the interfaces... uhh... maybe I should read up on how the GL stuff works for using that
Daanct12 has joined #dri-devel
<karolherbst> mhh, I guess from a GL perspective it was only valid for buffers anyway
<karolherbst> mhh.. and textures? oh wow
<karolherbst> so the problem is, that llvmpipe just calculates its own pitch/slice and that breaks stuff
ella-0 has joined #dri-devel
ella-0_ has quit [Read error: Connection reset by peer]
rasterman has quit [Quit: Gettin' stinky!]
rasterman has joined #dri-devel
frieder has quit [Remote host closed the connection]
<tzimmermann> danvet, async cursor updates always acquire plane->mutex and crtc->mutex, right?
rasterman has quit [Quit: Gettin' stinky!]
<tzimmermann> and whenever a plane updates, either sync or async, the commit acquires crtc->mutex?
Daanct12 has quit [Quit: Leaving]
rasterman has joined #dri-devel
<karolherbst> jekstrand: ... treating llvmpipe as a GPU makes the image stuff pass :(
<karolherbst> maybe we can get a waiver for CPU impls
<karolherbst> at least I'd try
<danvet> tzimmermann, yeah the locking is the same
<tzimmermann> danvet, ah, thanks. related question: concurrent updates to planes of the same crtc will never interfere, because they are serialized via crtc->mutex?
<danvet> not quite
<danvet> on the sw state, yes
<danvet> on the hw state, nonblocking updates are pushed through without holding any locks
<danvet> and ordering is ensured by waiting for/signalling drm_crtc_commit appropriately
<danvet> which makes this all a bit more complicated
<tzimmermann> that is basically what happens in commit_tail, right?
<tzimmermann> the hw-state update
<tzimmermann> danvet, but two concurrent hw-state updates are serialized by DRM's atomic helpers, right?
<danvet> yeah should be
<danvet> the real fun only starts when you have cross crtc state
<danvet> hence the epic discussions recently with Lyude on dp mst state
<danvet> lynxeye, random thought really, but have you looked at moving etnaviv over to shmem helpers?
<tzimmermann> danvet, that's luckily not the case
<tzimmermann> danvet, i'm still having that ast bug where mouse movement interferes with modesetting
<danvet> yeah as long as any hw is strictly attached to either crtc or connector you should be fine with atomic helpers
<danvet> even when the connector moves around, it keeps track of that stuff and should order it all
<tzimmermann> and i'm looking for ways they could overlap
<danvet> tzimmermann, is that with my patch to nuke legacy cursor already applied?
<tzimmermann> danvet, no without the patch
<tzimmermann> but i cannot even reproduce it. the reporter fixed it by repeatedly setting I/O registers
<tzimmermann> maybe the HW is too slow to catch up with the rest
<tzimmermann> it doesn't look like the correct fix
<danvet> uh yeah that's pretty horrible
<danvet> I would try with the legacy cursor patch applied, I think it can cause stuff like this
<danvet> if it's something else then I guess a spinlock around all the indexed register writes is what's needed
<tzimmermann> ok, i'll try the patch
<tzimmermann> if the atomic updates are serialized, maybe the ast HW simply needs time after a full modeswitch before it accepts new commands; just guessing here.
<danvet> hm yeah maybe that could be it too
<danvet> that it's not a race, but the hw being a bit slow, and you actually have to hammer the index reg until it goes through
<danvet> I guess that could be the 3rd option really
<danvet> and for that case ofc no spinlock needed
<danvet> I guess we could test this by adding some tracing to the index_reg functions?
<danvet> if they run concurrently, then there's an sw bug
<danvet> if they never run concurrently, then probably a hw issue
soreau has quit [Read error: Connection reset by peer]
<tzimmermann> thanks for confirming
<danvet> like just do an atomic_inc/dec around the code and complain if it's ever elevated
<danvet> and test with that
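The tracing idea above can be sketched as follows, in Python rather than kernel C. A counter is bumped on entry to the guarded code and dropped on exit; if it is ever above one on entry, two callers overlapped. `OverlapDetector` and `write_indexed_reg` are hypothetical names, and a lock stands in for the kernel's atomic increment and decrement.

```python
import threading


class OverlapDetector:
    """Counts overlapping entries into a guarded section."""

    def __init__(self):
        self._lock = threading.Lock()   # stands in for atomic inc/dec
        self._active = 0
        self.overlaps = 0

    def __enter__(self):
        with self._lock:
            self._active += 1
            if self._active > 1:        # someone else is already inside
                self.overlaps += 1
        return self

    def __exit__(self, *exc):
        with self._lock:
            self._active -= 1
        return False


detector = OverlapDetector()


def write_indexed_reg(regs, index, value):
    """Hypothetical index/data register write wrapped in the detector."""
    with detector:
        regs["index"] = index
        regs["data"] = value
```

If `detector.overlaps` stays at zero under the reproducer, the writes never ran concurrently, which in the terms of the discussion would point away from a software race and toward the hardware.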
soreau has joined #dri-devel
<tzimmermann> good idea
<tzimmermann> thank god ast HW is all fully documented with example code and reference drivers directly from the manufacturer /sarcasm
<lynxeye> danvet: Yea, I've had that on my list for a while, but didn't really get around to having an opinion yet due to other things having higher prio.
<danvet> lynxeye, I think once the shrinker stuff that's in the works has landed, there's really not anything left for shmem helpers
<danvet> so might actually be good to have etnaviv using it, to make sure that stuff all fits
itoral has quit []
<lynxeye> danvet: agreed. The shrinker patches bumped this up a bit on my prio list, but still not at the top.
<danvet> yeah makes sense, just wanted to make sure you've seen this
<danvet> I'm expecting the shrinker patches to take some time still anyway, the locking is a bit of a mess
YuGiOhJCJ has quit [Quit: YuGiOhJCJ]
<danvet> sravn_, I'm assuming you're also pushing the fbcon patch you acked?
slattann has quit [Ping timeout: 480 seconds]
<swick> pq: MrCooper: it's not only dithering, it's the complete color pipeline we would have to know about to get the effective bpc
<swick> we would have to know the precision before and after each block and every time a block goes from a higher to lower bpc there could be dithering
<swick> so either user space has to know about all that stuff or min_bpc must be the effective minimum bpc
<pq> swick, IOW, would you reject an atomic commit if min_bpc happens to be larger than, say, CTM block precision?
<pq> that's an interesting idea, but I wonder what that means for discoverability of working configurations...
<pq> it seems every new KMS property adds a new dimension to the combinatorial explosion
<swick> pq: if the CTM block is used in a way that the precision is below min_bpc then yes
<pq> At least with the color pipeline we have the option to not use any of it. Link bpc we cannot ignore.
<swick> it could be passthrough and retain the full pipe precision or dithering the CTM result could increase the effective bpc
<swick> yeah
<pq> I'm kinda hoping I could just ignore the precisions of the color pipeline hardware blocks, until the "libcamera for KMS" exists.
Company has joined #dri-devel
<swick> honestly if drivers ignore all of that in the beginning I wouldn't be too mad
sh-zam has joined #dri-devel
heat has joined #dri-devel
<swick> I would assume that most hardware is designed to provide the precision it can drive the display at, too
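A toy model of the "effective minimum bpc" idea from this exchange, deliberately ignoring dithering. It assumes the effective precision is the lowest precision of any block actually in use, capped by the link bpc, and that bypassed blocks are passthrough. The block names and bit depths are invented for illustration, and `commit_allowed` is a hypothetical check, not a KMS interface.

```python
def effective_bpc(link_bpc, blocks):
    """Lowest precision along the pipeline, ignoring any dithering.

    `blocks` is a list of (name, bpc, in_use) tuples; blocks that are
    bypassed are treated as passthrough and do not limit precision.
    """
    used = [bpc for _name, bpc, in_use in blocks if in_use]
    return min(used + [link_bpc])


def commit_allowed(min_bpc, link_bpc, blocks):
    """Hypothetical check: reject when the effective bpc is below min_bpc."""
    return effective_bpc(link_bpc, blocks) >= min_bpc


pipeline = [
    ("degamma", 12, True),
    ("ctm", 8, True),       # made-up low-precision block
    ("gamma", 12, True),
]
```

With these made-up numbers, a 10 bpc link and an active 8-bit CTM give an effective 8 bpc, so a `min_bpc` of 10 would be rejected; bypassing the CTM raises the effective value back to 10.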
sh_zam has quit [Ping timeout: 480 seconds]
fxkamd has joined #dri-devel
jewins has joined #dri-devel
mdroper has joined #dri-devel
<rgallaispou> Hi. I'm playing with kms_color to test the gamma property. After the test ends, the pointer associated with the last gamma lut passed to the kernel is still in use when wayland-weston is started. Is my driver broken or is it standard behavior to keep the gamma lut ?
<pq> rgallaispou, it's standard KMS behavior. Weston lacks resetting most KMS properties.
<pq> these properties in particular are in my todo to fix in Weston
<pq> I've had fun with fbdev in HDR mode...
<rgallaispou> pq: okay it's good to know
nchery has quit [Read error: Connection reset by peer]
<rgallaispou> pq: yes, I can imagine why
<rgallaispou> Thanks anyway :)
<marex> sigh, seems like powervr driver is effectively back to being dead
<ajax> oh?
<marex> is there any activity ?
<ajax> !16040 was last touched on monday. i've gone six months without touching some of my MRs.
<marex> every time I tried to bring it up on HW with powervr I have here, I found the kernel driver is outdated, specific firmware does not work or is unavailable or I have the wrong version with no way of getting the right version ... userspace at least compiles
Haaninjo has joined #dri-devel
<kj> The 1.17 update is stuck in an internal process. I was going to ping people tomorrow since I haven't received approval from all the necessary people
<marex> that doesn't help me, the hardware I have ships with blob built for API 1.16 . Until there is easy support for different APIs , the powervr stuff is unusable except for one specific SoC model
<kj> There should be a 1.17 firmware binary released soon so that might be of some use
<marex> kj: and that works on all SoCs and thus powervr revisions ?
<marex> kj: or is this one specific binary for one specific SoC again ?
<ajax> do we actually do anything with the semaphore passed to vkAcquireNextImageKHR ?
<mlankhorst> karolherbst: Currently not doing much on locking, patch you linked seems sane, hence I'm worried that it probably breaks. ;)
<ajax> :q
<kj> marex: It likely wouldn't work on all platforms out there but it might be worth trying out. I think it might just work however we still need to upstream the firmware binary and the pvrsrvkm kernel module + the mesa side changes which are stuck in the internal process
nchery has joined #dri-devel
<ajax> getting the distinct impression that nobody understands wsi
<karolherbst> mlankhorst: :D
<karolherbst> mlankhorst: if it breaks something that would worry me even more
<mlankhorst> You haven't written many locking patches to i915 then.
<karolherbst> I didn't, but the code looks obviously wrong though
<karolherbst> well, the current one that is
<emersion> ajax: /me adds to WSI related quote list
<marex> kj: is there a tree with up-to-date kernel driver ?
<kj> The 1.17 should be released soon too but it hasn't yet been put through our internal process
<kj> The new kernel module requires https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15507 . I'm not sure if the MR needs rebasing or not but there are a bunch of changes to that which need releasing and Frank's off till Monday
<ajax> okay, got it. what we do with that semaphore is we require that the backend ANI actually acquire an image synchronously, and then we signal the sema on our way out
<ajax> this seems: bad
sdutt has joined #dri-devel
<ajax> i mean, not wrong, but also not good
thellstrom has quit [Ping timeout: 480 seconds]
sdutt has quit []
sdutt has joined #dri-devel
<marex> kj: so yes, that kernel driver is based on some 6 months old kernel version, ancient
thellstrom has joined #dri-devel
gawin has quit [Ping timeout: 480 seconds]
<marex> I'll just stop here
<melissawen> mripard, danvet, related to the previous discussion on async flip (I guess): I see in drm_mode_atomic_ioctl we are aborting when there is an ASYNC_FLIP flag and there is also a comment in crtc->async_flip that `It's not wired up for the atomic IOCTL itself yet`
<melissawen> so how do we usually handle async page flip in an atomic context? we just don't do it right now?
<melissawen> are there any drivers doing it in a custom implementation, for example?
<melissawen> I'm looking into this topic and ended up quite confused on what we can currently do or not...
iive has joined #dri-devel
<mlankhorst> modifying a single fb is usually the most that is allowed async
jkrzyszt has quit [Ping timeout: 480 seconds]
mdroper has quit [Read error: Connection reset by peer]
pcercuei has quit [Quit: brb]
pcercuei has joined #dri-devel
aravind has quit [Ping timeout: 480 seconds]
nvishwa1 has joined #dri-devel
tzimmermann has quit [Quit: Leaving]
thellstrom1 has joined #dri-devel
rgallaispou has left #dri-devel [#dri-devel]
thellstrom has quit [Ping timeout: 480 seconds]
sdutt has quit [Remote host closed the connection]
sdutt has joined #dri-devel
mbrost has joined #dri-devel
Duke`` has joined #dri-devel
DanaG has joined #dri-devel
<Lyude> danvet: re cross-crtc state: honestly now that I have this working with my mst stuff it's nowhere near as complicated as it seems: https://gitlab.freedesktop.org/lyudess/linux/-/blob/wip/mst-atomic-only-v1/drivers/gpu/drm/dp/drm_dp_mst_topology.c#L4436 basically just build up a bitmask of every CRTC you're involving and then do something in setup_commit() like I did there to
<Lyude> actually retrieve the drm_crtc_commits
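The "bitmask of every CRTC you're involving" step can be sketched like this, in Python for brevity. It assumes each CRTC has a small integer index and contributes the bit `1 << index`; `build_mask` and `crtcs_in_mask` are illustrative helpers, not functions from the linked branch.

```python
def crtc_mask(index):
    """Bit for one CRTC index, i.e. 1 << index."""
    return 1 << index


def build_mask(indices):
    """OR together the bits of every CRTC touched by the update."""
    mask = 0
    for i in indices:
        mask |= crtc_mask(i)
    return mask


def crtcs_in_mask(mask):
    """Yield each CRTC index set in the mask, lowest first."""
    index = 0
    while mask:
        if mask & 1:
            yield index
        mask >>= 1
        index += 1
```

At commit setup time, walking the mask gives the set of CRTCs whose pending commits need to be looked up, with duplicates already collapsed.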
ybogdano has joined #dri-devel
<Lyude> even managed to get it so that we can change payload->vcpi_start_slot during the atomic commit like I needed
<danvet> Lyude, hm why do we only copy stuff over that late?
<Lyude> danvet: because we're not holding any locks by the time we start committing the state potentially in non-blocking modesetting, right?
<danvet> or can we compute the vcpi slots only when we do the actual commit and not precompute things?
<Lyude> yeah you can't precompute them
<Lyude> because you'd need to know what order the driver is bringing up the payloads in order to do that
<danvet> ah ok
<danvet> hm
<danvet> I guess it's not super atomic-y though, but should work
<danvet> deserves a huge comment for that state that it doesn't work quite like the others
<Lyude> yeah, luckily the thing about start slots is they're totally irrelevant to any actual state computation - so the values there also don't really matter until commit time
<danvet> yeah I think just a comment that we use this as scratch pad and that it works like hw state essentially - i.e. races are prevented by careful ordering, not locking
<danvet> and also a comment that the drm_crtc_commit completion provides the necessary cpu memory barriers for this to work correctly
<danvet> since strictly speaking you're doing a fancy lockless thing here
<Lyude> mhm - gotcha, was planning on doing something like that once I start cleaning this series up
sravn_ has quit []
<anholt> jekstrand: were you objecting to making tex->txl lowering optional? or just asking for explanation
ybogdano has quit [Ping timeout: 480 seconds]
slattann has joined #dri-devel
mbrost_ has joined #dri-devel
<jekstrand> It still feels kinda bogus to me.
<anholt> frontend shading language doesn't have txl(shadowcube) or txl(shadow2darray)
<jekstrand> anholt: It's an obviously correct transform that most hardware needs.
<anholt> but NIR insists on making those, because...?
<jekstrand> Ugh...
mbrost has quit [Read error: Connection reset by peer]
<anholt> so then the drivers have to back out the lowering they didn't want
<jekstrand> which drivers is this causing a problem for? virgl, maybe, I guess.
<anholt> nir_lower_tex has a bunch of options, and then there's this one non-optional thing it does, too.
<karolherbst> jekstrand: nv50
<karolherbst> we don't have lod sources for those
<anholt> nouveau's the one that didn't have a workaround for NIR yet.
* jekstrand wonders if NV hardware does the right thing there automatically or if shadow sampling in vertex shaders is just busted and no one cares.
<anholt> in ntt (virgl) and radeonsi we recognize that NIR did the thing we didn't want and back it out.
<karolherbst> jekstrand: we just don't have it on nv50
<karolherbst> the hw is.. broken :)
<jekstrand> Bingo!
<karolherbst> so we can only pass 4 sources into tex
<karolherbst> and most of them are filled up with coords
<anholt> jekstrand: uh, do you positively know that VS texturing on nv50 doesn't set lod to 0?
<karolherbst> and then you add shadow and you got 4
<jekstrand> anholt: I have no idea. But I'd kind-of like someone to prove that it does before they say the workaround isn't needed.
<anholt> given that nvc0 got a knob for lod 0, but the knob isn't on nv50, I would easily believe that it's a shader stage knob on older, then an instruction knob later when they realize the same HW bits would shave instructions in the FS.
<karolherbst> we can't encode the lod for shadowcube or shadow2darray
<karolherbst> on nv50 that is
<karolherbst> there is nothing
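The source-count arithmetic karolherbst is describing can be sketched as below. Only the limit of 4 sources comes from the discussion; the enum and function names are hypothetical, and the way sources are counted (coordinates, array layer, shadow comparator, LOD, one slot each) is a simplifying assumption.

```c
#include <assert.h>
#include <stdbool.h>

#define NV50_MAX_TEX_SRCS 4 /* limit stated in the discussion above */

/* Hypothetical, reduced set of sampler dimensionalities. */
enum tex_dim { DIM_2D, DIM_CUBE };

/* Number of coordinate components, excluding the array layer. */
static unsigned coord_count(enum tex_dim dim)
{
    assert(dim == DIM_2D || dim == DIM_CUBE);
    return dim == DIM_CUBE ? 3 : 2;
}

/* Total sources, assuming one slot each for layer, comparator and LOD. */
static unsigned tex_src_count(enum tex_dim dim, bool is_array,
                              bool is_shadow, bool has_lod)
{
    return coord_count(dim) + (is_array ? 1 : 0) +
           (is_shadow ? 1 : 0) + (has_lod ? 1 : 0);
}

static bool fits_nv50(enum tex_dim dim, bool is_array,
                      bool is_shadow, bool has_lod)
{
    return tex_src_count(dim, is_array, is_shadow, has_lod) <=
           NV50_MAX_TEX_SRCS;
}
```

Under these assumptions shadowcube and shadow2darray already use all four slots without a LOD, so an explicit LOD source has nowhere to go, while a plain 2D shadow lookup still has one slot free.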
<jekstrand> I'm not arguing that you can't encode it
<jekstrand> I'm asking if VS texturing works
<jekstrand> Maybe it does by automagic
<karolherbst> I think it doesn't :)
<karolherbst> imirkin said something about parts being broken, but nvidia exposing that anyway and fails to compile
<karolherbst> something like that
<jekstrand> :facepalm:
<DanaG> I'm curious, how hard would it be for somebody to add DRI_PRIME support to the `ast` driver? I'd like to be able to DRI_PRIME offload, or use xrandr offload to render on the Radeon and dump into the ASPEED's framebuffer. If making it not bog down the Radeon would require skipping 3/4 of the frames, or make it allow horrible tearing, that would be fine.
<anholt> karolherbst: that was about bias.
<karolherbst> ahh
<karolherbst> so the lod works, but the bias is dropped, right?
<jekstrand> Maybe it's just too many years at Intel but I don't assume hardware does things right.
<karolherbst> anyway.. we have 4 sources and have to do something
<jekstrand> I assume someone noticed VS texturing was broken on nv50 and went "ugh... we need to force LOD to 0, let's add an instruction bit"
<karolherbst> on later gens we can encode up to 8 sources
<karolherbst> but we do have a special zero lod flag as well
<anholt> jekstrand: I've never understood. What is the argument for why NIR *should* turn tex into txl on VS on all drivers?
<anholt> like, we don't force many lowering passes on everyone, why this one?
<jekstrand> anholt: Yeah, it's weird. I'm not opposed to making it optional, in principle. I'm opposed to saying it breaks nv50 when it may actually be sort-of fixing nv50 and we're all too lazy to care and figure out why.
<anholt> it's valid, but also why not turn tex into txb(bias=0) in the FS? that's also legal.
<karolherbst> I think it would be fine to force this lowering, but then we have to explicitly say : tex won't reach the driver or something
Emmy_ has quit [Remote host closed the connection]
<anholt> and my argument is: you shouldn't be adding tex srcs when you don't have to. and nir-to-tgsi wishes you wouldn't.
<jekstrand> I also don't like instructions having subtly different behavior per-stage unless they're stage-specific.
sdutt has quit [Remote host closed the connection]
sdutt has joined #dri-devel
<jenatali> FWIW, D3D only has shadow tex with implicit mip, and shadow txl for level 0, and for VS only the latter is allowed. There's no arbitrary level txl
<anholt> (nir-to-tgsi has to back it out, because then virgl would need to recognize txl(shadowcube/2darray, lod=0) and turn it back into tex since txl on those samplers is illegal!)
<jekstrand> Why does texturing behave a little differently in 2/3 of the stages? Uh... we didn't think about it when designing GL until it was too late?
<karolherbst> well back then you only had two stages, no?
<karolherbst> or was there a time with just one stage even? :D
<jekstrand> There have always been >= 2
slattann has quit [Read error: Connection reset by peer]
<jenatali> Isn't it more like 1/6 stages? FS is the only one that does implicit LOD
<jekstrand> jenatali: Compute too, with some extensions.
Emmy_ has joined #dri-devel
<jenatali> Ah right, I forgot about those extensions
<anholt> we definitely had one stage in the fragment program days.
<karolherbst> couldn't we make those instructions behave the same inside nir?
<jekstrand> anholt: right...
<karolherbst> sure.. glsl is stupid, but why should we carry the stupid over
<jekstrand> karolherbst: That's my argument. :)
<karolherbst> yeah...
<jekstrand> If VIRGL wants to translate back to GLSL, it's got to deal with GLSL being stupid. Same for Zink.
<jekstrand> Maybe we want a txz opcode which is txl with lod0?
<karolherbst> I am not opposed to check for a zero lod inside codegen, especially as we do have that special lz flag and could make use of it, but...
<karolherbst> jekstrand: maybe
<anholt> jekstrand: TGSI's got one of those.
<karolherbst> tex_lz?
<anholt> yep
<jekstrand> anholt: Are those allowed for shadow and cube?
<anholt> nvc0 and radeonsi have it in hw.
<anholt> yes
<jekstrand> Then why not translate nir_texop_txl with lod=0 to tex_lz?
<anholt> jekstrand: in ntt? because the ntt-consuming drivers don't have support for it, it's a cap.
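The decision being argued over here can be sketched as a small selection function. This is an illustration only, not code from nir_lower_tex or nir-to-tgsi: every name below is hypothetical, and the back-out to plain tex assumes the instruction sits in a stage where tex already samples level 0 (the VS case under discussion).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical opcode and sampler enums, for illustration only. */
enum tex_op { OP_TEX, OP_TXL, OP_TEX_LZ };
enum sampler_kind {
    SAMPLER_2D,
    SAMPLER_SHADOW_2D,
    SAMPLER_SHADOW_2D_ARRAY,
    SAMPLER_SHADOW_CUBE,
};

/* Per the discussion, the frontend language has no txl for these two. */
static bool frontend_lacks_txl(enum sampler_kind kind)
{
    return kind == SAMPLER_SHADOW_2D_ARRAY || kind == SAMPLER_SHADOW_CUBE;
}

/*
 * Pick what to emit for an incoming tex or txl instruction.
 * `has_tex_lz_cap` models the driver cap anholt mentions.
 */
static enum tex_op select_backend_op(enum tex_op op, enum sampler_kind kind,
                                     bool lod_is_const_zero,
                                     bool has_tex_lz_cap)
{
    assert(op == OP_TEX || op == OP_TXL);

    if (op != OP_TXL || !lod_is_const_zero)
        return op;              /* nothing to fix up */
    if (has_tex_lz_cap)
        return OP_TEX_LZ;       /* driver can express "LOD zero" directly */
    if (frontend_lacks_txl(kind))
        return OP_TEX;          /* back out the lowering */
    return OP_TXL;
}
```

The point of the cap check is the one anholt makes: without it, a translator would emit an opcode that the consuming drivers do not support.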
<jekstrand> oh
<anholt> the reason to fix nir_lower_tex was because if we can get a decision for where to fix NIR's undesired lowering, we can get nouveau onto NIR on all chipsets and not go the ntt route for it.
<anholt> and then I can finally land !8044
<jekstrand> Yeah, I know.
<karolherbst> well.. we could handle that in some way inside from_nir, but... I also kind of prefer to move lowering from codegen into nir :)
<jekstrand> And I really want to land 8044
<karolherbst> but...
<karolherbst> maybe it does make sense to add a tex_lz? but that's going to be messy
<anholt> right. so karolherbst wants the fixup not in nir frontend. imirkin wants the fixup not in the backend, because it's not legal in the shading language. jekstrand wants the fixup not in nir because VS tex instead of txl is silly. I want someone to budge because I just want to move on with my life.
<jekstrand> anholt: I know
* jekstrand kind-of wants to replace the entire nouveau compiler. :-P
* karolherbst wants the same
<jekstrand> But that's not going to happen today. :)
rasterman has quit [Quit: Gettin' stinky!]
<karolherbst> yeah... so my reason for moving stuff into nir is, so that codegen shrinks
<karolherbst> if we can rely on using nir, I can throw out quite a lot of code
<jekstrand> I guess it's probably fine. I really don't like "nouveau is sloppy" to be the reason we carry tech debt in NIR.
<jekstrand> But I can get over it. glsl_to_tgsi is more tech debt than this tiny bit of lowering.
<karolherbst> well.. if we are expected to fix stuff up when going from nir to codegen, that's fine by me
<jekstrand> So we're still going the right direction.
<karolherbst> I just don't want to add more stuff into codegen's lowering
<anholt> jekstrand: when I look at doing this in NIR, it feels like paying down tech debt because so many drivers have to reverse the undesired thing nir adds.
<jekstrand> anholt: You say "so many" and it's really just 2 AFAICT.
<anholt> (though, that argument doesn't hold so much because radeonsi and nvc0 would want to recognize lod==zero anyway since they can do tex_lz on all stages)
<jekstrand> radeonsi isn't undoing anything.
ybogdano has joined #dri-devel
<jekstrand> In fact, with the NIR lowering, radeonsi doesn't need to be doing its stage check which is the point.
<jekstrand> And neither does nouveau except for one piece of hardware we aren't sure works.
<jekstrand> But whatever, I don't want to keep arguing.
<jekstrand> As long as we can come up with a better name for the bit (I suggested one), I guess I'm fine adding it.
aravind has joined #dri-devel
slattann has joined #dri-devel
<ajax> jekstrand: want to pick your brain about !4037 if you have a minute
<jekstrand> ajax: Sure, what about it?
<ajax> hah, race condition, i just posted a comment on the mr
<jekstrand> ajax: Feel free to pull the first 6 into a different MR and land them.
neonking has quit [Ping timeout: 480 seconds]
<ajax> hmm. could really use a way to tunnel arbitrary events back through an XGE channel.
<jenatali> zmike: https://gitlab.freedesktop.org/mesa/mesa/-/jobs/21908000 - that a flake or did I somehow break it?
<zmike> fucking hell what now
<zmike> it's a flake
aravind has quit [Ping timeout: 480 seconds]
<zmike> ajax: really really really need that xlib change ^^^^^^
<ajax> you know what especially sucks
<ajax> i probably want to fix that in glvnd's frontend too
<ajax> let's see if i remember how to do x module releases
slattann has quit []
<ajax> ugh okay.
<ajax> jekstrand: so the irritating thing here is, we're using x11/present in a totally cromulent way, we're passing in the idle fence and waiting for it before we hand it back from ANI.
<jekstrand> yup
<ajax> the bug is that the way present releases the pixmap is it calls into the ddx to flush rendering, and glamor treats that "flush" literally as glFlush instead of anything stronger.
<ajax> so, iiuc, all xserver has done is submit commands to the device, it hasn't waited for their completion and it can't guarantee those commands get submitted before the client takes back over
<ajax> (assuming single-queue to the hardware from the kernel, but with no particular ordering among drm clients)
ngcortes has joined #dri-devel
<anholt> this sounds correct to me -- at the time we wrote that, everyone had implicit sync, and glFlush() got you to the kernel.
<jekstrand> ajax: Yup. That's why we still need implicit sync
neonking has joined #dri-devel
sdutt has quit [Read error: Connection reset by peer]
stuart has joined #dri-devel
rasterman has joined #dri-devel
<DanaG> One other idea for the DRI_PRIME with ASPEED: have the ASPEED pull from the Radeon, instead of having the Radeon push to the ASPEED.
<ajax> DanaG: i didn't think aspeed chips had enough dma to do that
DanaG has quit [Remote host closed the connection]
<ajax> jekstrand: is ARB_sync explicit enough here? if glamor did glFenceSync and waited for it to pass before emitting the present-idle event, good enough?
<ajax> is there any benefit to that over just glFinish, too
<jekstrand> ajax: Both would stall in X and kill any pipelining
<ajax> i don't need to stall, i have a main loop and i can ClientWaitSync(timeout=0) just fine.
gawin has joined #dri-devel
<jekstrand> I don't mean it stalls X, I mean we have to do a full round-trip through userspace and possibly wait inside the client (with whatever implications that has) before work gets submitted.
<jekstrand> With the likely solution being num_buffers++
<ajax> i don't follow "before work gets submitted" here. the idle fence wouldn't get touched until after the gl fence passed, and we (wsi) wait on the idle fence before ANI will give it back to the app
<jekstrand> With the way things are today, as soon as X has submitted its compositing job or blit, it can flush, hand us back the buffer, and we can hand it off to the app so they can start rendering to it.
<ajax> yes i am suggesting glamor would be better
<ajax> can easily be, small numbers of lines of code.
<jekstrand> If we glFinish or wait on a fence in X11, regardless of where that wait happens, the client can't even start building command buffers to render until after the GPU is done with that buffer.
eukara has quit []
YuGiOhJCJ has joined #dri-devel
<ajax> walk me through that? what part of the buffer state is so mutable while it's busy that you can't even start?
eukara has joined #dri-devel
<ajax> it's not going to change size
<ajax> it's probably not getting memmoved anywhere
<jekstrand> The app can't start until it gets it back from ANI because it doesn't know which image ANI is going to return next.
<jekstrand> So we want to be able to return it from ANI the moment X knows a buffer is going to be free
<jekstrand> That's the whole reason ANI comes with fences and semaphores.
<ajax> (thinking)
<ajax> that sounds a bit like you want to virtualize the mapping between VkImages and X Pixmaps?
<jekstrand> You might want me to. :P
<jekstrand> That's something that has been discussed but no.
<jekstrand> We want to know the actual BO that's going to become free
<jekstrand> And have a fence/semaphore
<jekstrand> that tell you when it's actually free
<jekstrand> The theory being that there's no point in the app trying to race with X on its current composite anyway.
<jekstrand> (Unless that app is a VR thing but those are crazy and special and don't want X in the way to begin with)
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
<ajax> fine.
<ajax> how much of a stall ReadPixels
<ajax> is ReadPixels, excuse me
<ajax> there _has_ to be some way to learn that a particular command buffer has been completed without stalling the whole chip
<ajax> how else do you ever reclaim old command bufs
<ajax> enh. i guess the ring buffer just tells you when a command goes into the gpu, not when it finishes.
<airlied> karolherbst: oh you have to specify row/image strides on user buffers? seems dangerous :-P
<karolherbst> airlied: well... the user provides the buffer
<karolherbst> anyway.. CL allows custom strides, so...
Haaninjo has quit [Quit: Ex-Chat]
<nchery> dcbaker: did you see my ping about a backport for 22.0.3 https://gitlab.freedesktop.org/mesa/mesa/-/issues/6350#note_1355006 ?
<airlied> karolherbst: does the gallium interface allow that?
<ajax> jekstrand: is it actually that hard to predict which image will be returned next? it's the one with the lowest sbc.
<karolherbst> nope
sravn has joined #dri-devel
<airlied> karolherbst: did clover deal with it somehow?
<karolherbst> airlied: not at all
<karolherbst> I think it relies on resource_from_user_memory to fail
<karolherbst> the GL interface also doesn't allow custom strides which is a bit ... strange
<airlied> karolherbst: ah so cl_image_desc is the thing that specs it?
<karolherbst> yeah
<karolherbst> AMD_pinned_memory is the GL extension btw
<airlied> should check the vulkan ext
<airlied> just for completeness
<HdkR> pinned_memory is such a wacky extension
<karolherbst> it looks broken
<airlied> karolherbst: pinned memory seems buffer only
<karolherbst> so you specify the host ptr and say how big the texture is, but... not a word about strides?!?
<karolherbst> could be
<karolherbst> but
<karolherbst> you can read pixels out of it
<karolherbst> boxed
<airlied> you use the gl packing to do that then
<HdkR> SSBOs, UBOS, pixels. Dolphin-emu should abuse all of those with pinned_memory
<karolherbst> but yeah.. looks like the underlying memory needs to be a plain buffer
<jekstrand> ajax: It's impossible for the app to predict.
<jekstrand> ajax: The driver might be able to predict it, sure, but that doesn't solve any problems.
<karolherbst> airlied: yeah
<ajax> it does if it means ANI can return a promise and the fences/semas actually work
<ajax> right?
<jekstrand> ajax: But the only way to actually provide that promise is if the driver then stalls later because we can't actually pass the fence we get from X11 off to the kernel.
<jekstrand> Most drivers do use submit threads these days (or can) so we could, in theory, do it.
<ajax> unless we fix x11
<jekstrand> But oof
<ajax> which i keep telling you i know how to do
<jekstrand> What are we going to fix in x11?
<karolherbst> airlied: anyway.. the thing is, CL requires custom pitches and gallium doesn't allow it :(
<jekstrand> I'm still unclear on that
<karolherbst> it's fine for buffers, because it doesn't matter
<jekstrand> The only "fix" for x11 is if it starts using syncobj instead of shmfence
<airlied> karolherbst: guess you get to fixing gallium then :)
<ajax> or, for its shmfences to reflect explicit sync instead of implicit
<karolherbst> airlied: it seems that way :)
<jekstrand> ajax: Then we're back to driver threads
<ajax> i mean i have one of those for wsi for fifo modes anyway...
abws has joined #dri-devel
<jekstrand> Unless it's an explicit sync primitive I can hand off to the kernel as part of my exec, we have to manage the whole VkQueue with a driver thread and wait for the fence in the driver before we submit to X11
<karolherbst> airlied: but actually, I think it's time to upstream some patches atm, so I am a bit reluctant to add more stuff at this point :D
<karolherbst> the MR is way too big already anyway
<jekstrand> At which point X11 not being able to use kernel sync primitives is impacting driver design. Please, no.
<jekstrand> (Impacting far deeper than a bit of annoyance in the WSI code, that is.)
<karolherbst> jekstrand: ohh btw.. would you mind if I fold your rusticl reference stuff into the "initial" commit? I'd like to clean it all up and would rather prefer to fix the existing commits than to add a bunch of new ones :D
<jekstrand> karolherbst: fine with me
<karolherbst> okay, cool
<airlied> karolherbst: do both :-)
<airlied> create some upstreaming MRs, hack away while reviewers hold back your brilliant code :-P
<karolherbst> :D
<karolherbst> well..
<karolherbst> I don't want people to review patches if I change stuff 100 patches later
<airlied> karolherbst: oh the upstreaming MR should be cleanly rewritten usually
<karolherbst> so I want to move all "fixups" into the original code
<karolherbst> yeah
<karolherbst> that's my plan for now :P
<airlied> hopefully next week I can get back to trying to get any images working
<karolherbst> airlied: if you want to help, we track MRs here: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6311
<karolherbst> :D
<airlied> karolherbst: question on rusticl when mapping resources do you use the appropriate buffer/texture map functions?
<karolherbst> airlied: yes
<airlied> my feeling is clover is willing to use buffer map on texture resources
<karolherbst> it does
<airlied> and that might be the cause of some of my issues
<karolherbst> possible
<karolherbst> every one of those are... annoying issues
soreau has quit [Read error: Connection reset by peer]
<karolherbst> but nothing really broken
<karolherbst> just... annoying
<karolherbst> allocations fail, because the first try fails and then the CTS is broken checking the second
soreau has joined #dri-devel
<karolherbst> basic is alignment of long16
<karolherbst> contractions is... FTZ
<karolherbst> images_image_streams is just clamping being not correct, but works if llvmpipe is exposed as a GPU....
<karolherbst> min_max_write_image_args is... crashing because I don't know, but I'd wait until jekstrand MRs with bumping limits lands
* airlied just marvels at CTS sometimes
<karolherbst> anyway.. for the image fails we could probably get a waiver
<karolherbst> for contractions we have to figure out if llvmpipe supports denorms or not
<karolherbst> there is one issue though which is driving me nuts: luxmark v3.1 crashes with llvmpipe in JIT code, and I never figured out why
<airlied> without tweaking fpstate I've no real idea :-P
<karolherbst> I suspect something is weird with nir passes, but...
<karolherbst> airlied: yeah...
<karolherbst> I honestly have no idea what would be the best approach regarding fpstate
<karolherbst> anyway.. I think I'll get an AMD GPU, so I might be able to try out some things later
* airlied will try and get back to digging a bit next week, this week has been short and written off elsewhere :-P
<karolherbst> sure
mvlad has quit [Remote host closed the connection]
lynxeye has quit [Quit: Leaving.]
ngcortes has quit [Ping timeout: 480 seconds]
famfo_znc has joined #dri-devel
famfo has quit [Ping timeout: 480 seconds]
ngcortes has joined #dri-devel
famfo_znc has quit []
<karolherbst> jekstrand: anything preventing this MR from landing? https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15673
<jekstrand> karolherbst: nope
rasterman has quit [Quit: Gettin' stinky!]
lemonzest has quit [Quit: WeeChat 3.4]
famfo has joined #dri-devel
YuGiOhJCJ has quit [Quit: YuGiOhJCJ]
<karolherbst> airlied: ohh, if you have a little time, this is something you could think about: https://gitlab.freedesktop.org/karolherbst/mesa/-/commit/1edb8e2797f9420f3d5080670476fc48cf1d43cc
<karolherbst> but yeah.. it sucks
<karolherbst> or maybe we could leave it in late, but also call it inside lp_get_disk_shader_cache if necessary?
<karolherbst> anyway, I need it for caching the libclc
ybogdano has quit [Ping timeout: 480 seconds]
mbrost has joined #dri-devel
mbrost_ has quit [Read error: Connection reset by peer]
Duke`` has quit [Ping timeout: 480 seconds]
GeorgesStavracasfeaneron[m] has joined #dri-devel
<zmike> dcbaker: do you prefer to look over backport MRs or should I just marge them?
frieder has joined #dri-devel
<jenatali> Georges Stavracas (feaneron): Yes, but you're not registered correctly, so only other people who are connected via Matrix can see your messages
frieder has quit [Remote host closed the connection]
famfo has quit [Ping timeout: 480 seconds]
famfo has joined #dri-devel
<airlied> karolherbst: yeah that patch would be fine to init it early I think
<karolherbst> airlied: ohh.. how easy would you think would it be to wire up function calling support in llvmpipe?
<airlied> karolherbst: I had it mostly working, except for implicit args
<karolherbst> airlied: well... creating the screen also ends up spawning all the threads sadly
<airlied> and flow control :-P
<karolherbst> ahh
<airlied> karolherbst: ah yeah that bit is messy isn't it
<karolherbst> it is, hence I asked you :P
<karolherbst> the thing is...
<karolherbst> luxmark crashes in JIT code :(
<airlied> yeah I might go dig into that one next week
<karolherbst> cool
<airlied> sigsegv or sigbus?
<airlied> or worse SIGILL
<airlied> ?
<karolherbst> huh.. I think it's a sigsegv, but let me check
<airlied> usually it's just some calculation going into memory it shouldn't
<karolherbst> thing is.. it _sometimes_ doesn't crash
<karolherbst> but then the rendering is all wrong
<karolherbst> like it uses uninit values or something
<karolherbst> ehh I think it only does this inside gdb
<karolherbst> anyway, it's a segfault
<karolherbst> mhh.. let me try to lower undefs to 0. I think I already tried, but you never know
<jenatali> Oh no... a real world app using glProgramStringARB...
<karolherbst> I am sure they have a good reason for it
<karolherbst> "using assembly, because it's faster"
<jenatali> No, I seriously doubt it
<bnieuwenhuizen> you'd be surprised what some people are capable of
<karolherbst> ehh wait.. glProgramStringARB is something else
<jenatali> And Mesa's rejecting their program, woo
<karolherbst> good
<karolherbst> glProgramStringARB really sounds ugly
<karolherbst> wait.. so glProgramStringARB "simply" replaces the current glsl source code?
<jenatali> Looks like they have a comment after END and it seems Mesa's parser doesn't like that
<karolherbst> or is that ARB shader stuff?
<jenatali> This... is not a thing I thought I'd be debugging today
<karolherbst> (I am too young to know this)
<jenatali> karolherbst: It's an assembly shader
ngcortes has quit [Ping timeout: 480 seconds]
<karolherbst> ahh okay, so my first guess was indeed correct
<karolherbst> I am sure they do it because of performance
<karolherbst> as everybody knows, assembly is faster
<jenatali> Uh huh
gawin has quit [Ping timeout: 480 seconds]
<karolherbst> heh nice.. it still crashes with undef to zero
famfo_znc has joined #dri-devel
famfo has quit [Ping timeout: 480 seconds]
rasterman has joined #dri-devel
gawin has joined #dri-devel
<jenatali> Ugh... I think it looks like Mesa's lexer for ARB programs doesn't like that the last comment line in this program doesn't have a newline
<HdkR> Oh no, ARB
danvet has quit [Ping timeout: 480 seconds]
icecream95 has joined #dri-devel
<karolherbst> jenatali: well.. I'd say it's an application bug then :)
<karolherbst> case closed
<jenatali> But it works on all Windows GL drivers...
<karolherbst> ehh.. the spec disagrees with me, I say the spec is wrong
<karolherbst> "Comments begin with the character "#" and are terminated by a newline, a carriage return, or the end of the program array." :(
<HdkR> I think someone actually complained about this a year or so ago
<Sachiel> not ending a text file with a newline should always be a bug
ngcortes has joined #dri-devel
<karolherbst> Sachiel: it's windows
<karolherbst> they seem to like that
<karolherbst> :P
ybogdano has joined #dri-devel
<karolherbst> jenatali: workaround: insert a new line at the end of the program :)
<jenatali> I'm tempted
<karolherbst> I am sure we copy the string anyway, so we can also just add it :D
<jenatali> Yeah actually that's probably the easiest thing...
<HdkR> `/* The newline of shame gets added to all ARB programs as a workaround */`
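A minimal sketch of the workaround being discussed: copy the program string and always append a newline, so that a final `#` comment is terminated before the lexer hits end of input. This is not the code from any Mesa merge request; `copy_with_trailing_newline` is a hypothetical helper, and it assumes the caller passes the program as a pointer plus length, as glProgramStringARB does.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a NUL-terminated heap copy of `src` (length `len`, not
 * necessarily NUL-terminated) with one '\n' appended, or NULL on
 * allocation failure. The caller frees the result.
 */
static char *copy_with_trailing_newline(const char *src, size_t len)
{
    assert(src != NULL || len == 0);

    /* +1 for the added newline, +1 for the terminating NUL. */
    char *dst = malloc(len + 2);
    if (!dst)
        return NULL;

    if (len)
        memcpy(dst, src, len);
    dst[len] = '\n';
    dst[len + 1] = '\0';
    return dst;
}
```

Appending unconditionally keeps the helper trivial; a program that already ends in a newline just gains one extra blank line, which the grammar quoted above treats as whitespace.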
ahajda_ has joined #dri-devel
ahajda has quit [Read error: Connection reset by peer]
<jenatali> Yeah
<karolherbst> I wouldn't be surprised if that's even better from a perf perspective
<jenatali> Otherwise apparently the only other option is for EOF to generate a different character that can be matched against? WTF flex?
<karolherbst> the alternative is to make the parser/lexer more complicated
<jenatali> Yeah
maxzor has joined #dri-devel
pcercuei has quit [Quit: dodo]
mbrost has quit [Ping timeout: 480 seconds]
* jekstrand realizes he knows way too much about NaN and begins questioning life choices
<jenatali> There we go, fixed by !16230, as ugly as it is
<HdkR> ah, probably fixes a bunch of things compiling to ARB with CG?
illwieckz has joined #dri-devel
<HdkR> Probably also fixes dolphin-emu 2.0 era OpenGL then :P
<jenatali> Probably?
<jenatali> Our compat folks just flagged this particular app because it doesn't need GL to render, but if you add in a Mesa GL impl, then it stops rendering
<jenatali> So it's technically a regression by adding GL support
tursulin has quit [Read error: Connection reset by peer]
<jenatali> Anyways, reviews or acks welcome. I have no idea if that code has an owner these days
morphis has quit [Ping timeout: 480 seconds]
morphis has joined #dri-devel
maxzor has quit [Ping timeout: 480 seconds]
<airlied> karolherbst: you can't create a context before libclc? though that would be pretty ugly
<dcbaker> zmike: just let me know that you’ve merged stuff so I don’t force push over it
<airlied> the other option would be to add a flag to screen creation I suppose
<karolherbst> airlied: the problem isn't that I can't, the thing is, I don't want to :P
<karolherbst> I kind of load the libclc when I create the device struct
<karolherbst> and that's like really really early
<airlied> plumbing a flag through would also be messy
<karolherbst> yeah...
ahajda_ has quit []
<zmike> dcbaker: kk
<airlied> karolherbst: the other option could be to delay disk cache thread init
<FireBurn> airlied: Have you thought about https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15331
<FireBurn> A solution will be needed for 22.1.0
<airlied> FireBurn: jekstrand, bnieuwenhuizen : can someone just ack the fix, it's a clear regression
<airlied> if you want to figure out wtf is going wrong feel free, but I've burned a fair bit of time failing there
<FireBurn> Cheers
iive has quit []
illwieckz has quit [Remote host closed the connection]
<jekstrand> airlied: Yeah, let's land that. I hate it but idk what's going on and don't have time to burn figuring it out
<bnieuwenhuizen> FireBurn: the bug has "Issue starting Horizon Zero Dawn", what are your symptoms?
rkanwal has quit [Ping timeout: 480 seconds]
<bnieuwenhuizen> is it a crash?
<bnieuwenhuizen> if so it would be very helpful if we could get a backtrace
<FireBurn> No it doesn't crash
<FireBurn> Let me just revert the fix locally, it was a month ago and things are a bit fuzzy
illwieckz has joined #dri-devel
<FireBurn> I get a "Unfortunately the game has crashed. Do you want to help us fix the issue by sending a crash report? Yes / No"
<airlied> so one of those crashes we can't get a backtrace from because the game steals it :-P
<airlied> karolherbst: uggh late init the threads is messy as well
YuGiOhJCJ has joined #dri-devel
<bnieuwenhuizen> FireBurn: is this the only game where this happens?
<bnieuwenhuizen> or does it happen for all vulkan games?
<FireBurn> Someone else reported other games that no longer worked
<FireBurn> Rise of the Tomb Raider loads
<bnieuwenhuizen> hmm, looks like it might be correlated with vkd3d-proton
<FireBurn> Detroit: Become Human appears to load and it's vkd3d-proton
<FireBurn> Does that game dump help?
<bnieuwenhuizen> no :(
<airlied> karolherbst: clover manages its own clc disk cache, maybe that is an option?
<karolherbst> airlied: the way clc loads the libclc is device specific
<karolherbst> also, we run optimizations
<karolherbst> and I don't think rusticl should try to know how to cache a device specific nir :)
<FireBurn> Ah detroit crashed after the shader compile at first start up
<karolherbst> or well.. we don't run optimizations yet, but we want to
<airlied> karolherbst: clover creates a disk cache per device
<karolherbst> airlied: that sounds horrible
<karolherbst> and also very buggy
<karolherbst> what's the proper key?
<karolherbst> how do we make sure we don't load the wrong nir for a different device?
<airlied> yeah it does sound a bit broken, granted it doesn't change the nir per device
<karolherbst> yeah...
<karolherbst> anyway, it's the driver's responsibility to get me the properly configured disk_cache
<karolherbst> airlied: maybe we should skip loading llvmpipe if we have a hw device?
<karolherbst> or only load llvmpipe if the hw device fails to load
<airlied> well the workaround is for vulkan
<karolherbst> I could do that from within rusticl
<airlied> where you don't have that option
<karolherbst> ohh :(
ybogdano has quit [Remote host closed the connection]
<karolherbst> annoying...
<airlied> so you end up with pipe screens in the vulkan instance, and all the resources associated with that
<bnieuwenhuizen> FireBurn: want to try https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16232 ? (<--- airlied for review?)
<karolherbst> airlied: yeah... uhm...
<karolherbst> airlied: ohhhh... wait.. I have an idea
<karolherbst> but I don't like it
<karolherbst> airlied: after adding that, I also started to create a helper context per device
<airlied> bnieuwenhuizen: I think we tried that one, but maybe we didn't
<karolherbst> airlied: so I think I could get around by loading the libclc after I created the helper context
<airlied> karolherbst: yeah if you are creating a helper context, then just do disk cache later
<airlied> if you are considering removing helper context then it's something we need to dig into deeper
<karolherbst> I am not considering it, because I need it
<karolherbst> not for much, but...
<FireBurn> bnieuwenhuizen: Compiling now
<karolherbst> airlied: we have to be able to upload data to pipe_resources without a CL queue
<karolherbst> soo.. no way around having such a helper context really
<karolherbst> (also I use it for async maps)
icecream95 has quit [Ping timeout: 480 seconds]