#dri-devel on 2022-04-28 — irc logs at oftc.irclog.whitequark.org

2022-03-22 11:57 ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar

00:00 iive has quit []

00:02 nchery is now known as Guest3020

00:02 nchery has joined #dri-devel

00:02 FireBurn has joined #dri-devel

00:07 FireBurn has quit [Quit: Konversation terminated!]

00:07 ahajda has quit [Read error: Connection reset by peer]

00:08 Guest3020 has quit [Ping timeout: 480 seconds]

00:11 jernej has joined #dri-devel

00:13 co1umbarius has joined #dri-devel

00:13 <anholt> karolherbst: sounds like I need to do deqp-runner for CL CTS.

00:14 columbarius has quit [Ping timeout: 480 seconds]

00:15 <karolherbst> anholt: we only use piglit

00:15 <anholt> right, but piglit doesn't have things like flake lists

00:15 <karolherbst> yeah :(

00:15 <karolherbst> but it's not a flake

00:15 <karolherbst> it just takes a long time

00:15 <karolherbst> which is kind of a flake, but not a real one

00:16 <karolherbst> anholt: but I already have a weird runner for the CL CTS

00:16 <karolherbst> it just doesn't really support parsing lists of what's expected to pass/fail/flake

00:16 jernej has quit [Quit: Free ZNC ~ Powered by LunarBNC: https://LunarBNC.net]

00:16 jernej has joined #dri-devel

00:17 <karolherbst> anholt: https://gitlab.freedesktop.org/karolherbst/opencl_cts_runner/-/blob/master/clctsrunner.py

00:17 <anholt> right, and deqp-runner would get you all that.

00:17 <karolherbst> it's terrible because the CTS is terrible

00:17 <airlied> CL CTS is so inconsistent within itself

00:17 <karolherbst> anholt: problem is: the CL CTS doesn't give you list of tests :)

00:18 <karolherbst> most of the code there really just deals with figuring out what subtests we've got

00:23 FireBurn has joined #dri-devel

00:23 <karolherbst> airlied: you know what? I'll just disable fp64 support, that should make llvmpipe pass most of the stuff as well :D

00:23 <karolherbst> and I think all crashes in JIT code are fp64 kernels anyway

00:24 <karolherbst> but probably a bit more will fail.. oh well

00:25 <airlied> yeah fine with that

00:25 <karolherbst> okay.... mhhh

00:26 <karolherbst> either I found a resource leak in iris or llvmpipe is buggy

00:27 nvishwa1 has joined #dri-devel

00:27 <karolherbst> mhhh...

00:30 <karolherbst> after creating a new resource, do I have to pipe_resource_reference it or not?

00:31 <airlied> nope, it should have a ref count of 1

00:31 <karolherbst> okay.. then llvmpipe is buggy

00:32 <karolherbst> so I only call pipe_resource_reference once when dropping my rust wrapper, which should be the correct thing

00:32 <karolherbst> but llvmpipe crashes this way. when I increase the ref with pipe_resource_reference after creating the resource, it doesn't crash

00:32 <karolherbst> mhh something is up...

00:33 <karolherbst> maybe I forget to ref it somewhere else

00:33 <karolherbst> airlied: it crashes inside this lpr->list thing

00:34 <karolherbst> I guess with refing I never called into llvmpipe_resource_destroy
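
A minimal sketch of the refcount rule discussed above, assuming a toy resource type: creation hands back one reference, the last unref runs the destroy path, and an extra ref simply keeps the destroy path from running. All toy_* names are hypothetical and are not Mesa's pipe_resource API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted resource; not Mesa's pipe_resource. */
struct toy_resource {
    int refcount;
};

/* Counts how often the destroy path actually ran. */
static int toy_destroy_calls;

/* Creation hands back exactly one reference, owned by the caller. */
static struct toy_resource *toy_resource_create(void)
{
    struct toy_resource *res = malloc(sizeof(*res));
    if (res)
        res->refcount = 1;
    return res;
}

/* Take an additional reference. */
static void toy_resource_ref(struct toy_resource *res)
{
    res->refcount++;
}

/* Drop one reference; returns 1 if this call destroyed the resource. */
static int toy_resource_unref(struct toy_resource *res)
{
    if (--res->refcount > 0)
        return 0;
    toy_destroy_calls++;
    free(res);
    return 1;
}
```

With this model, a single unref after create reaches the destroy path, while create + ref + unref does not, which would match a crash that only shows up once destroy is actually called.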

00:37 <airlied> weird, no idea where the refcounts could be off there on the driver side

00:37 <airlied> it's a pretty well smashed path

00:37 <karolherbst> mhh, I think the refcount is fine, but there is some weird corruption going on

00:37 <karolherbst> p lpr->list

00:37 <karolherbst> $20 = {prev = 0x61500000ca70, next = 0x7ffff341a6b0 <resource_list+496>}

00:37 <karolherbst> mhhh

00:38 <airlied> that does look back

00:38 <airlied> bad

00:38 <karolherbst> yeah...

00:38 <karolherbst> let me see if the resources I get are all fine actually

00:40 <airlied> uggh that debug resource list is horrible, but does have a mutex

00:40 <karolherbst> maybe it's something with user pointers

00:41 <airlied> ah there is a bug there

00:41 <karolherbst> :)

00:41 <airlied> the user memory path never adds the resource to the list

00:42 <karolherbst> I just wanted to say that literally nothing tests this :D

00:42 <airlied> at least nothing built with debug turned on :-P

00:42 <karolherbst> probably :)

00:45 <airlied> karolherbst: 16207

00:46 <karolherbst> airlied: btw.. what's your view on using an ubo for kernel inputs?

00:46 <karolherbst> does it make things for llvmpipe more complicated or doesn't it matter?

00:47 <airlied> karolherbst: I think it's okay for llvmpipe, impossible for radeonsi/llvm

00:47 <airlied> well not impossible, but very messy

00:47 <karolherbst> mhhh

00:47 <karolherbst> how are they handling uniforms?

00:48 <karolherbst> because normally the uniforms are getting pushed into a driver via a constbuf anyway, no?

00:48 <airlied> well I suppose if the NIR can keep the function params definitions then it's fine

00:48 mbrost has joined #dri-devel

00:48 <airlied> the problem is on the compiler side

00:49 <airlied> though one workaround I may need is to add extra args to the kernel inputs buffer

00:49 <karolherbst> mhhhh

00:49 <airlied> which is easier if I can see it

00:49 <karolherbst> yeah.. I don't delete the args

00:49 <karolherbst> well.. at least not the used ones

00:49 <karolherbst> :D

00:50 <karolherbst> airlied: but why would it matter?

00:50 <karolherbst> atm I played around by creating a constant_buffer with a user_buffer

00:51 <karolherbst> ohh.. compiler.. mhh

00:51 <karolherbst> annoying

00:51 <karolherbst> anyway, the info should be all there

00:51 <airlied> karolherbst: to build the llvm compute kernel, you have to have the function signature

00:51 <karolherbst> right....

00:51 <airlied> after lowering that isn't available anymore

00:51 <karolherbst> I thought you wanted to use aco? :P

00:51 <karolherbst> or is that also using llvm?

00:52 <karolherbst> (never checked)

00:52 <airlied> nope this problem is just llvm

00:52 <airlied> using aco is a very large project, I'd like to get some baseline working with llvm to compare against

00:52 <karolherbst> right...

00:52 <airlied> but yeah I'm back and forth on what is the best answer here

00:52 <karolherbst> anyway.. you get the uniforms

00:52 <airlied> maybe rusticl should do its thing and then I can fix radeonsi/aco up for that

00:52 <airlied> and just baseline using clover hacks for now

00:53 <karolherbst> airlied: https://gist.githubusercontent.com/karolherbst/b384131a6596f6077e51b1bcb27e3592/raw/febd9cc7057e12ca430c633308033bee98f7b485/gistfile1.txt

00:53 <karolherbst> like this

00:57 <karolherbst> uhm...

00:58 <karolherbst> why aren't we const folding u2u(0) to 0? something isn't right here...

00:59 <karolherbst> airlied: yeah.. well.. your choice. If something works with clover, it should also work with rusticl.. more or less. At least nothing from the explicit expectation changes

01:06 <karolherbst> airlied: image_size is busted

01:07 <karolherbst> airlied: so it gets the coords via load_input, llvmpipe replaces it with 0 somewhere

01:07 <karolherbst> but between the load and the image_size is a u2u

01:07 <karolherbst> so it can't read out the const value

01:09 <airlied> shouldn't that u2u disappear? :-P

01:09 <karolherbst> well.. yeah? :P

01:09 <karolherbst> but I am a little confused....

01:10 <karolherbst> what's the first arg to image_size anyway?

01:12 <airlied> isn't it just the image?

01:12 <airlied> texture size takes an lod

01:12 <karolherbst> ahh

01:12 <karolherbst> airlied: I guess this is fallout from that: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16205

01:12 <karolherbst> I insert that u2u so that... some drivers deal with it :D

01:13 <karolherbst> but I think llvmpipe might want to run constant_folding after inserting inputs? dunno

01:13 <karolherbst> don't know when that happens, except after I'm finished messing with it

01:14 <karolherbst> maybe we want to opt u2u(load_input) to load_input ?

01:14 <airlied> seems like it should be easy to convert that to a load input of a different size, but it might be a bit tricky I suppose

01:15 <karolherbst> something isn't right though...

01:16 <karolherbst> yeah soo...

01:16 <karolherbst> the first arg is the image

01:16 <karolherbst> the second one is the lod

01:16 <karolherbst> the lod is a plain 0 32 bit

01:16 <karolherbst> it's the image which is messed up

01:17 <karolherbst> huh?

01:20 slattann has joined #dri-devel

01:21 <karolherbst> ahh.. I had some WIP non inlining stuff still not reverted

01:27 <karolherbst> now stuff is better :)

01:32 <karolherbst> btw, who do I need to ping to get i915 locking fixes in :D

01:33 <airlied> did you cc #intel-gfx?

01:33 <airlied> or rather intel-gfx@

01:34 <karolherbst> I did

01:34 <karolherbst> even created a bug and everything

01:34 <karolherbst> but it's also only a week, soo...

01:34 <karolherbst> but I kind of assumed a memory corruption bug and...

01:35 <airlied> thellstrom or mlankhorst might be a good place to start

01:36 <karolherbst> okay.. btw, the patch is here: https://patchwork.freedesktop.org/patch/482687/ (if either of those see the messages here)

01:41 <karolherbst> "Pass 2298 Fails 32 Crashes 22" not too bad

01:43 <karolherbst> airlied: okay.. soo.. crashes are all images related, fails are mostly images + contractions

01:44 <karolherbst> contractions: 5) Error for float kernel1: -0x1.f6b356p-51 * -0x1.d947f8p-90 + -0x0p+0 = *0x0p+0 vs. 0x1.d08p-140

01:44 <karolherbst> mhh

01:44 <karolherbst> ohhh

01:45 <karolherbst> that's FTZ

01:45 <karolherbst> okay...

01:45 <karolherbst> hum

01:49 <karolherbst> airlied: soo.. should llvmpipe flush denorms to zero for compute or not

01:50 <karolherbst> ?

01:50 <karolherbst> it doesn't seem to adjust the fpstate for compute
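
A small sketch of why that contractions line points at FTZ, assuming IEEE-754 single precision without flush-to-zero: the product of the two operands from the CTS message is a subnormal float, so an implementation that flushes denormals returns 0 where the reference expects 0x1.d08p-140. The function names are illustrative only.

```c
#include <assert.h>
#include <float.h>

/*
 * Multiply the two operands from the failing CTS line in double precision
 * (the product of two 24-bit significands is exact in a double), add the
 * -0.0 addend, then round once to float. No flush-to-zero is applied here.
 */
static float contraction_reference(void)
{
    const float a = -0x1.f6b356p-51f;
    const float b = -0x1.d947f8p-90f;
    const float c = -0.0f;
    return (float)((double)a * (double)b + (double)c);
}

/* A float is subnormal when it is non-zero but below the smallest normal. */
static int is_subnormal(float x)
{
    float mag = x < 0.0f ? -x : x;
    return mag > 0.0f && mag < FLT_MIN;
}
```

Here is_subnormal(contraction_reference()) holds, so the result lies below FLT_MIN and a denormal-flushing device would report zero instead.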

01:51 ngcortes has quit [Ping timeout: 480 seconds]

01:54 ybogdano has quit [Ping timeout: 480 seconds]

02:08 mbrost_ has joined #dri-devel

02:11 mbrost has quit [Ping timeout: 480 seconds]

02:19 heat_ has joined #dri-devel

02:19 heat has quit [Remote host closed the connection]

02:26 heat_ has quit [Read error: Connection reset by peer]

02:26 heat has joined #dri-devel

02:29 Company has quit [Quit: Leaving]

02:30 mbrost_ has quit [Remote host closed the connection]

02:47 mbrost has joined #dri-devel

03:01 mdroper has joined #dri-devel

03:12 mbrost has quit [Ping timeout: 480 seconds]

03:19 minecrell has quit [Quit: Ping timeout (120 seconds)]

03:19 minecrell has joined #dri-devel

03:22 mbrost has joined #dri-devel

03:32 mbrost_ has joined #dri-devel

03:32 mbrost has quit [Read error: Connection reset by peer]

03:32 shankaru has joined #dri-devel

03:35 heat has quit [Ping timeout: 480 seconds]

03:43 <airlied> karolherbst: no idea what the fpstate is set for, would have to dig into it

03:48 LexSfX has joined #dri-devel

04:19 mbrost_ has quit [Remote host closed the connection]

04:19 mbrost_ has joined #dri-devel

04:20 i-garrison has quit [Ping timeout: 480 seconds]

04:22 i-garrison has joined #dri-devel

04:41 Duke`` has joined #dri-devel

04:58 mbrost_ has quit [Remote host closed the connection]

05:25 mattst88 has quit [Ping timeout: 480 seconds]

05:30 sdutt_ has joined #dri-devel

05:30 sdutt has quit [Read error: Connection reset by peer]

05:37 mdroper has quit [Read error: Connection reset by peer]

05:57 nvishwa1_ has joined #dri-devel

05:57 nvishwa1 has quit [Read error: Connection reset by peer]

05:58 Duke`` has quit [Ping timeout: 480 seconds]

06:02 DanaG has joined #dri-devel

06:04 <DanaG> Weird, my radeon E9173 fails to start any displays or anything. Could it be a damaged GPU? https://dpaste.com//BH7WB9CPB

06:31 mvlad has joined #dri-devel

06:43 LexSfX has quit []

06:48 <FLHerne> DanaG: see after "Call Trace:", something blows up in amdgpu during device init

06:48 nchery has quit [Ping timeout: 480 seconds]

06:49 <FLHerne> More likely to be a kernel than hardware problem IMO

06:49 <FLHerne> (although I'm sure broken hardware *could* throw the driver off)

06:50 <DanaG> It died in the same place on my aarch64 board, too.

06:50 <DanaG> Well, not necessarily the same place, but the same message (big difference)

06:52 itoral has joined #dri-devel

06:52 LexSfX has joined #dri-devel

06:53 Daanct12 has joined #dri-devel

06:54 <FLHerne> What kernel are you using?

06:55 <DanaG> Here's a new paste with drm.debug=0x1fff: http://sprunge.us/ja6VLM

06:56 <DanaG> It's ubuntu 5.15.0-27-generic

06:56 tzimmermann has joined #dri-devel

07:01 <DanaG> I've always found it odd that there are "warnings" with no message, just a backtrace. "There's a problem at this address in the neighborhood." But what is the problem? <no answer>

07:01 <DanaG> WARNING: CPU: 3 PID: 3101 at drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dce_aux.c:395 dce_aux_transfer_raw+0x28d/0x2f0 [amdgpu]

07:02 tursulin has joined #dri-devel

07:02 YuGiOhJCJ has joined #dri-devel

07:08 <FLHerne> Did it work before with that kernel, or what did you have before?

07:09 <FLHerne> sounds similar to https://gitlab.freedesktop.org/drm/amd/-/issues/1862

07:09 <DanaG> I think the only thing that changed is that I tried the GPU in an HP T730 thin client's PCIe slot, and I'm not sure how many watts that slot can supply, but it's not a high-wattage card.

07:10 <DanaG> And I had a previous WX 4100 go broken when I tried it in that slot, but couldn't be sure it wasn't a Supermicro board I had at the time that did it.

07:10 icecream95 has quit [Remote host closed the connection]

07:10 <FLHerne> All PCIe x16 slots should support 75W by spec

07:11 <FLHerne> I guess an OEM thin client might not bother

07:11 <FLHerne> but I really doubt underpowering a card would permanently damage it anyway

07:12 <FLHerne> I'd try some other kernels and/or file a bug

07:12 <FLHerne> 5.15 (and even some backports to 5.14) have definitely had regressions on certain hardware

07:13 <DanaG> I can try booting an older one.

07:13 danvet has joined #dri-devel

07:13 <DanaG> I should also note that I have a displayport KVM switch, which can be finicky (even though it's an optical displayport cable on the output). Actually, let me try direct connecting it...

07:14 ahajda has joined #dri-devel

07:16 slattann has quit []

07:16 <DanaG> I'll also check if it works properly in Windows.

07:17 nchery has joined #dri-devel

07:19 <FLHerne> those both sound like good ideas

07:20 LexSfX has quit [Ping timeout: 480 seconds]

07:20 LexSfX has joined #dri-devel

07:20 adjtm has quit [Quit: Leaving]

07:21 adjtm has joined #dri-devel

07:22 LexSfX has quit []

07:23 adjtm has quit []

07:23 <HdkR> `UBSAN: shift-out-of-bounds in /build/linux-HMZHpV/linux-5.15.0/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c:601:30` The UBSAN in amdgpu is pretty questionable as well

07:24 <HdkR> Could be UBSAN is tweaking some behaviour just enough that it is breaking things and needs to not be enabled :)

07:24 <DanaG> Interesting, I disabled the EFI CSM, and now I at least get a framebuffer at boot, but still get the flip timeout. Now going to try old kernel.

07:24 icecream95 has joined #dri-devel

07:25 LexSfX has joined #dri-devel

07:26 <DanaG> 5.13 kernel: same timeout.

07:27 <javierm> tzimmermann: answered to the list. What you mention was my first thought as well but then realized that wasn't the correct thing to do

07:27 jkrzyszt has joined #dri-devel

07:30 shankaru has quit [Read error: Connection reset by peer]

07:31 <DanaG> Driver seems to work okay in Windows, at least under a quick test. When the WX 4100 failed, it was dying in Windows too, but this is better at least.

07:32 <DanaG> I'm pondering getting a W6400 for this machine to mess with KVM. Has that GPU had the dang PCI reset problems fixed?

07:38 <DanaG> I do have IOMMU and ACS and ARI enabled, I wonder if any of the settings in that group are relevant? I recall seeing Navi get a quirk to disable ARI.

07:40 <DanaG> This isn't a Navi, though.

07:43 rasterman has joined #dri-devel

07:44 <DanaG> Looks like amdgpu.dc=0 makes it not die. I guess it's bugreport time, indeed. But first, just time for sleep.

07:49 frieder has joined #dri-devel

07:52 <tzimmermann> javierm, the way dp_aux_chardev and dp_cec work is a bit messy. but we cannot really do much about it ATM

07:52 slattann has joined #dri-devel

07:53 <javierm> tzimmermann: yeah, I've sent yet another proposal in the thread

07:54 <tzimmermann> javierm, one alternative is to provide empty stubs of drm_dp_dpcd_write()/read()/etc if DISPLAY_DP_HELPER has been disabled. that would resolve the linker error at least
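
A rough sketch of the stub idea, assuming a hypothetical TOY_DP_HELPER_ENABLED switch in place of the real Kconfig symbol and a toy function in place of the real DPCD helpers: when the helper is not built, a static inline stub keeps callers linking and reports failure. The choice of -ENODEV as the stub's return value is an assumption, not something stated in the discussion.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Opaque, hypothetical stand-in for the AUX channel type. */
struct toy_dp_aux;

/* Define TOY_DP_HELPER_ENABLED to model the helper being built in. */
#ifdef TOY_DP_HELPER_ENABLED
ssize_t toy_dpcd_write(struct toy_dp_aux *aux, unsigned int offset,
                       void *buffer, size_t size);
#else
/* Stub: callers still compile and link, but every transfer fails. */
static inline ssize_t toy_dpcd_write(struct toy_dp_aux *aux,
                                     unsigned int offset,
                                     void *buffer, size_t size)
{
    (void)aux;
    (void)offset;
    (void)buffer;
    (void)size;
    return -ENODEV;
}
#endif
```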

07:54 <DanaG> With dc disabled, I see [drm:amdgpu_atombios_dp_process_aux_ch.constprop.0 [amdgpu]] dp_aux_ch flags not zero

07:55 lynxeye has joined #dri-devel

07:56 <DanaG> dmesg with dc=0 (back to drm.debug=0x10e): http://sprunge.us/T2aA69

07:56 <javierm> tzimmermann: indeed, I can include that in v3 if you don't agree with the latest option I mentioned

07:57 <javierm> I would just kill all these user configurable options, I think that made more sense before you made the split but have little value nowadays

07:58 <DanaG> That reminds me, I had hangs on aarch64 on my wx 4100, until I switched from active DP-HDMI to passive DP-HDMI (target device is pikvm so doesn't need active).

07:58 <DanaG> Backtrace mentioned CEC.

07:59 <DanaG> I don't have that cec backtrace handy, though.

08:00 <emersion> pq: has anyone replied to "It's bit moot to e.g. render everything in electrical 10 bit RGB, if the link is just going to squash that into electrical 8 bit RGB, right?"

08:03 DanaG has quit [Remote host closed the connection]

08:05 <pq> emersion, nope

08:05 MrCooper_ is now known as MrCooper

08:05 <emersion> okay. wondering if that would be sane default behavior

08:06 <pq> emersion, but MrCooper_ did point out that electrical 8 bpc FB may not be a reason to turn link bpc down to 8 too, because the KMS color pipeline can have more precision.

08:08 <MrCooper> on a different (though somewhat related) topic: with temporal dithering, the effective observable bpc can be higher than the HW bpc, right?

08:08 <pq> that's the idea, I believe, yes

08:09 <pq> or even spatial dithering - you rarely look at individual pixels

08:09 <MrCooper> then requiring a minimum HW bpc might artificially exclude some scenarios which would actually work as intended

08:09 <pq> probably, as long as we have no idea if dithering is there or not

08:11 <MrCooper> swick: ^ so user space actually can't know the required minimum HW bpc

08:13 <MrCooper> I guess drivers could be allowed to take dithering into account for the minimum bpc

08:14 <MrCooper> or maybe dithering should be explicitly controlled as well (e.g. apparently some people physically cannot bear temporal dithering)

08:17 <pq> I would want explicit control and knowledge of dithering.

08:18 lemonzest has joined #dri-devel

08:33 gouchi has joined #dri-devel

08:34 gouchi has quit []

08:35 rasterman has quit [Quit: Gettin' stinky!]

08:37 pcercuei has joined #dri-devel

08:39 digetx has quit [Read error: Connection reset by peer]

08:39 <mripard> danvet bbrezillon : do you know what this FIXME references: https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/vc4/vc4_crtc.c#L839 ? did the generic async-page-flip turn out to be asynchronous plane updates, or something else?

08:39 digetx has joined #dri-devel

08:53 aravind has joined #dri-devel

09:00 rasterman has joined #dri-devel

09:27 <javierm> tzimmermann: funny that the first local patch I had here to just bypass the link error was "depends on DRM && DRM_DISPLAY_HELPER && DRM_DISPLAY_DP_HELPER"

09:28 <javierm> tzimmermann: but that didn't feel quite right to me for the reasons I mentioned in the thread. I didn't expect this change to be that controversial :)

09:39 rasterman has quit [Remote host closed the connection]

09:39 rasterman has joined #dri-devel

09:42 <danvet> mripard, no, never got wired through

09:43 <danvet> but probably a good idea to do so?

09:43 <danvet> the rough plan was to reuse the async cursor flip stuff of some sorts

09:43 <danvet> and let drivers figure out which exact kind of async we really need

09:44 <danvet> but the idea of flip_done was that that would hide this all sufficiently well

10:04 Daanct12 has quit [Remote host closed the connection]

10:13 sdutt_ has quit [Ping timeout: 480 seconds]

10:22 gawin has joined #dri-devel

10:40 <tzimmermann> javierm, may i ask you for another review of https://lore.kernel.org/dri-devel/9acf4bb3-e765-c1dd-bc75-05e9c7a0430f@suse.de/T/#m0fbd0ea36fecc5ba5880896c610bed02494deab2 patches 4+5 are new

10:41 rkanwal has joined #dri-devel

10:42 icecream95 has quit [Ping timeout: 480 seconds]

10:43 <javierm> tzimmermann: sure. Btw, Dan's report about patch 1/5 missing a mutex_unlock(&info->mm_lock) before return seems correct

10:43 <tzimmermann> javierm, yes. i forgot that mutex_unlock.

10:45 <karolherbst> jekstrand, airlied: mhh.. I think we have to disable user resources for non 1D images as we currently have no way of specifying custom slices :(

10:45 thellstrom has joined #dri-devel

10:47 nvishwa1_ has quit [Ping timeout: 480 seconds]

10:55 <karolherbst> but fixing the interfaces... uhh... maybe I should read up on how the GL stuff works for using that

10:57 Daanct12 has joined #dri-devel

10:58 <karolherbst> mhh, I guess from a GL perspective it was only valid for buffers anyway

10:58 <karolherbst> mhh.. and textures? oh wow

10:58 <karolherbst> so the problem is, that llvmpipe just calculates its own pitch/slice and that breaks stuff

10:59 ella-0 has joined #dri-devel

11:01 ella-0_ has quit [Read error: Connection reset by peer]

11:02 rasterman has quit [Quit: Gettin' stinky!]

11:02 rasterman has joined #dri-devel

11:08 frieder has quit [Remote host closed the connection]

11:28 <tzimmermann> danvet, async cursor updates always acquire plane->mutex and crtc->mutex, right?

11:28 rasterman has quit [Quit: Gettin' stinky!]

11:29 <tzimmermann> and whenever a plane updates, either sync or async, the commit acquires crtc->mutex?

11:29 Daanct12 has quit [Quit: Leaving]

11:39 rasterman has joined #dri-devel

11:39 <karolherbst> jekstrand: ... treating llvmpipe as a GPU makes the image stuff pass :(

11:42 <karolherbst> maybe we can get a waiver for CPU impls

11:42 <karolherbst> at least I'd try

11:58 <danvet> tzimmermann, yeah the locking is the same

11:59 <tzimmermann> danvet, ah, thanks. related question: concurrent updates to planes of the same crtc will never interfere, because they are serialized via crtc->mutex?

11:59 <danvet> not quite

12:00 <danvet> on the sw state, yes

12:00 <danvet> on the hw state, nonblocking updates are pushed through without holding any locks

12:00 <danvet> and ordering is ensured by waiting for/signalling drm_crtc_commit appropriately

12:00 <danvet> which makes this all a bit more complicated

12:02 <tzimmermann> that is basically what happens in commit_tail, right?

12:02 <tzimmermann> the hw-state update

12:04 <tzimmermann> danvet, but two concurrent hw-state updates are serialized by DRM's atomic helpers, right?

12:04 <danvet> yeah should be

12:04 <danvet> the real fun only starts when you have cross crtc state

12:04 <danvet> hence the epic discussions recently with Lyude on dp mst state

12:05 <danvet> lynxeye, random thought really, but have you looked at moving etnaviv over to shmem helpers?

12:05 <tzimmermann> danvet, that's luckily not the case

12:05 <tzimmermann> danvet, i'm still having that ast bug where mouse movement interferes with modesetting

12:05 <danvet> yeah as long as any hw is strictly attached to either crtc or connector you should be fine with atomic helpers

12:05 <danvet> even when the connector moves around, it keeps track of that stuff and should order it all

12:06 <tzimmermann> and i'm looking for ways they could overlap

12:06 <danvet> tzimmermann, is that with my patch to nuke legacy cursor already applied?

12:06 <tzimmermann> danvet, no without the patch

12:07 <tzimmermann> but i cannot even reproduce it. the reporter fixed it by repeatedly setting I/O registers

12:08 <tzimmermann> maybe the HW is too slow to catch up with the rest

12:09 <tzimmermann> here's the patch: https://lore.kernel.org/dri-devel/20210917072226.17357-1-kuohsiang_chou@aspeedtech.com/#t

12:09 <tzimmermann> it doesn't look like the correct fix

12:10 <danvet> uh yeah that's pretty horrible

12:10 <danvet> I would try with the legacy cursor patch applied, I think it can cause stuff like this

12:11 <danvet> if it's something else then I guess a spinlock around all the indexed register writes is what's needed

12:11 <tzimmermann> ok, i'll try the patch

12:12 <tzimmermann> if the atomic updates are serialized, maybe the ast HW simply needs time after a full modeswitch before it accepts new commands; just guessing here.

12:12 <danvet> hm yeah maybe that could be it too

12:12 <danvet> that it's not a race, but the hw being a bit slow, and you actually have to hammer the index reg until it goes through

12:13 <danvet> I guess that could be the 3rd option really

12:13 <danvet> and for that case ofc no spinlock needed

12:13 <danvet> I guess we could test this by adding some tracing to the index_reg functions?

12:13 <danvet> if they run concurrently, then there's an sw bug

12:13 <danvet> if they never run concurrently, then probably a hw issue

12:13 soreau has quit [Read error: Connection reset by peer]

12:14 <tzimmermann> thanks for confirming

12:14 <danvet> like just do an atomic_inc/dec around the code and complain if it's ever elevated

12:14 <danvet> and test with that
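
A minimal sketch of the atomic_inc/dec check suggested above, written with C11 atomics in user space rather than the kernel's atomic_t. The toy_* names and the index/data port layout are assumptions for illustration and are not taken from the ast driver.

```c
#include <assert.h>
#include <stdatomic.h>

/* Number of callers currently inside the register-write section. */
static atomic_int toy_in_flight;
/* How often a caller entered while another one was still inside. */
static atomic_int toy_overlaps_seen;

/* Hypothetical stand-in for an indexed register write (index, then data). */
static void toy_indexed_write(volatile unsigned char *index_port,
                              volatile unsigned char *data_port,
                              unsigned char index, unsigned char value)
{
    /* atomic_fetch_add returns the old value: non-zero means overlap. */
    if (atomic_fetch_add(&toy_in_flight, 1) != 0)
        atomic_fetch_add(&toy_overlaps_seen, 1);

    *index_port = index;
    *data_port = value;

    atomic_fetch_sub(&toy_in_flight, 1);
}
```

If toy_overlaps_seen ever becomes non-zero under load, two writers ran concurrently and the problem is on the software side; if it stays at zero while the glitch still reproduces, slow hardware becomes the more likely explanation.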

12:14 soreau has joined #dri-devel

12:14 <tzimmermann> good idea

12:15 <tzimmermann> thank god ast HW is all fully documented with example code and reference drivers directly from the manufacturer /sarcasm

12:15 <lynxeye> danvet: Yea, I've had that on my list for a while, but didn't really get around to having an opinion yet due to other things having higher prio.

12:19 <danvet> lynxeye, I think once the shrinker stuff that's in the works has landed, there's really not anything left for shmem helpers

12:19 <danvet> so might actually be good to have etnaviv using it, to make sure that stuff all fits

12:21 itoral has quit []

12:21 <lynxeye> danvet: agreed. The shrinker patches bumped this up a bit on my prio list, but still not at the top.

12:27 <danvet> yeah makes sense, just wanted to make sure you've seen this

12:27 <danvet> I'm expecting the shrinker patches to take some time still anyway, the locking is a bit of a mess

12:30 YuGiOhJCJ has quit [Quit: YuGiOhJCJ]

12:31 <danvet> sravn_, I'm assuming you're also pushing the fbcon patch you acked?

12:46 slattann has quit [Ping timeout: 480 seconds]

13:13 <swick> pq: MrCooper: it's not only dithering, it's the complete color pipeline we would have to know about to get the effective bpc

13:14 <swick> we would have to know the precision before and after each block and every time a block goes from a higher to lower bpc there could be dithering

13:15 <swick> so either user space has to know about all that stuff or min_bpc must be the effective minimum bpc

13:16 <pq> swick, IOW, would you reject an atomic commit if min_bpc happens to be larger than, say, CTM block precision?

13:18 <pq> that's an interesting idea, but I wonder what that means for discoverability of working configurations...

13:18 <pq> it seems every new KMS property adds a new dimension to the combinatorial explosion

13:20 <swick> pq: if the CTM block is used in a way that the precision is below min_bpc then yes

13:20 <pq> At least with the color pipeline we have the option to not use any of it. Link bpc we cannot ignore.

13:21 <swick> it could be passthrough and retain the full pipe precision or dithering the CTM result could increase the effective bpc

13:21 <swick> yeah

13:25 <pq> I'm kinda hoping I could just ignore the precisions of the color pipeline hardware blocks, until the "libcamera for KMS" exists.

13:30 Company has joined #dri-devel

13:31 <swick> honestly if drivers ignore all of that in the beginning I wouldn't be too mad

13:31 sh-zam has joined #dri-devel

13:31 heat has joined #dri-devel

13:32 <swick> I would assume that most hardware is designed to provide the precision it can drive the display at, too

13:38 sh_zam has quit [Ping timeout: 480 seconds]

13:43 fxkamd has joined #dri-devel

13:47 jewins has joined #dri-devel

13:54 mdroper has joined #dri-devel

13:55 <rgallaispou> Hi. I'm playing with kms_color to test the gamma property. After the test ends, the pointer associated with the last gamma lut passed to the kernel is still in use when wayland-weston is started. Is my driver broken or it is a standard behavior to keep the gamma lut ?

13:57 <pq> rgallaispou, it's standard KMS behavior. Weston lacks resetting most KMS properties.

13:58 <pq> these properties in particular are in my todo to fix in Weston

13:58 <pq> I've had fun with fbdev in HDR mode...

14:01 <rgallaispou> pq: okay it's good to know

14:01 nchery has quit [Read error: Connection reset by peer]

14:02 <rgallaispou> pq: yes, I can imagine why

14:02 <rgallaispou> Thanks anyway :)

14:03 <marex> sigh, seems like powervr driver is effectively back to being dead

14:04 <ajax> oh?

14:04 <marex> is there any activity ?

14:05 <ajax> !16040 was last touched on monday. i've gone six months without touching some of my MRs.

14:05 <marex> every time I tried to bring it up on HW with powervr I have here, I found the kernel driver is outdated, specific firmware does not work or is unavailable or I have the wrong version with no way of getting the right version ... userspace at least compiles

14:11 Haaninjo has joined #dri-devel

14:14 <kj> The 1.17 update is stuck in an internal process. I was going to ping people tomorrow since I haven't received approval from all the necessary people

14:15 <marex> that doesn't help me, the hardware I have ships with blob built for API 1.16 . Until there is easy support for different APIs , the powervr stuff is unusable except for one specific SoC model

14:17 <kj> There should be a 1.17 firmware binary released soon so that might be of some use

14:17 <marex> kj: and that works on all SoCs and thus powervr revisions ?

14:17 <marex> kj: or is this one specific binary for one specific SoC again ?

14:19 <ajax> do we actually do anything with the semaphore passed to vkAcquireNextImageKHR ?

14:20 <mlankhorst> karolherbst: Currently not doing much on locking, patch you linked seems sane, hence I'm worried that it probably breaks. ;)

14:21 <ajax> :q

14:22 <kj> marex: It likely wouldn't work on all platforms out there but it might be worth trying out. I think it might just work however we still need to upstream the firmware binary and the pvrsrvkm kernel module + the mesa side changes which are stuck in the internal process

14:25 nchery has joined #dri-devel

14:34 <ajax> getting the distinct impression that nobody understands wsi

14:35 <karolherbst> mlankhorst: :D

14:35 <karolherbst> mlankhorst: if it breaks something that would worry me even more

14:35 <mlankhorst> You haven't written many locking patches to i915 then.

14:36 <karolherbst> I didn't, but the code looks obviously wrong though

14:36 <karolherbst> well, the current one that is

14:37 <emersion> ajax: /me adds to WSI related quote list

14:38 <marex> kj: is there a tree with up-to-date kernel driver ?

14:40 <kj> For the downstream kernel 1.14 this should be it https://gitlab.freedesktop.org/frankbinns/powervr/-/tree/ddk/1.14@6193520. For the new one it's https://gitlab.freedesktop.org/frankbinns/powervr/-/tree/powervr-next

14:41 <kj> The 1.17 should be released soon too but it hasn't yet been put through our internal process

14:43 <kj> The new kernel module requires https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15507 . I'm not sure if the MR needs rebasing or not but there are a bunch of changes to that which need releasing and Frank's off till Monday

14:47 <ajax> okay, got it. what we do with that semaphore is we require that the backend ANI actually acquire an image synchronously, and then we signal the sema on our way out

14:48 <ajax> this seems: bad

14:49 sdutt has joined #dri-devel

14:51 <ajax> i mean, not wrong, but also not good
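
A minimal sketch of the behaviour ajax describes, as a toy model rather than Mesa's WSI code: the backend acquire blocks until an image is idle, and only then is the semaphore signaled on the way out. `ToySwapchain`, `backend_acquire` and the use of `threading.Event` as a stand-in for a semaphore are all assumptions made for illustration.

```python
import threading


class ToySwapchain:
    """Toy model only; names here are hypothetical, not Mesa's WSI API."""

    def __init__(self, image_count):
        # One "idle" event per image; set means nothing else is using the image.
        self.idle = [threading.Event() for _ in range(image_count)]
        for event in self.idle:
            event.set()
        self.next_index = 0

    def backend_acquire(self):
        # The synchronous part: block until the chosen image is idle.
        index = self.next_index
        self.idle[index].wait()
        self.idle[index].clear()
        self.next_index = (index + 1) % len(self.idle)
        return index

    def acquire_next_image(self, semaphore):
        index = self.backend_acquire()  # the wait already happened here
        semaphore.set()                 # so the semaphore is signaled on the way out
        return index


swapchain = ToySwapchain(image_count=3)
sema = threading.Event()
image = swapchain.acquire_next_image(sema)
print(image, sema.is_set())
```

In this model the semaphore never carries any pending work: by the time the caller sees it, it is already signaled, which matches the "not wrong, but also not good" assessment.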

14:52 thellstrom has quit [Ping timeout: 480 seconds]

14:54 sdutt has quit []

14:54 sdutt has joined #dri-devel

14:55 <marex> kj: so yes, that kernel driver is based on some 6 months old kernel version, ancient

14:57 thellstrom has joined #dri-devel

14:58 gawin has quit [Ping timeout: 480 seconds]

14:58 <marex> I'll just stop here

15:01 <melissawen> mripard, danvet, related to the previous discussion on async flip (I guess): I see in drm_mode_atomic_ioctl we are aborting when there is an ASYNC_FLIP flag and there is also a comment in crtc->async_flip that `It's not wired up for the atomic IOCTL itself yet`

15:01 <melissawen> so how do we usually handle async page flip in an atomic context? we just don't do it right now?

15:01 <melissawen> are there any drivers doing it in a custom implementation, for example?

15:02 <melissawen> I'm looking into this topic and ended up quite confused on what we can currently do or not...

15:03 iive has joined #dri-devel

15:07 <mlankhorst> modifying a single fb is usually the most that is allowed async

15:15 jkrzyszt has quit [Ping timeout: 480 seconds]

15:15 mdroper has quit [Read error: Connection reset by peer]

15:19 pcercuei has quit [Quit: brb]

15:31 pcercuei has joined #dri-devel

15:43 aravind has quit [Ping timeout: 480 seconds]

15:44 nvishwa1 has joined #dri-devel

15:45 tzimmermann has quit [Quit: Leaving]

15:55 thellstrom1 has joined #dri-devel

15:59 rgallaispou has left #dri-devel [#dri-devel]

16:00 thellstrom has quit [Ping timeout: 480 seconds]

16:03 sdutt has quit [Remote host closed the connection]

16:03 sdutt has joined #dri-devel

16:05 mbrost has joined #dri-devel

16:20 Duke`` has joined #dri-devel

16:20 DanaG has joined #dri-devel

16:26 <Lyude> danvet: re cross-crtc state: honestly now that I have this working with my mst stuff it's nowhere near as complicated as it seems: https://gitlab.freedesktop.org/lyudess/linux/-/blob/wip/mst-atomic-only-v1/drivers/gpu/drm/dp/drm_dp_mst_topology.c#L4436 basically just build up a bitmask of every CRTC you're involving and then do something in setup_commit() like I did there to

16:26 <Lyude> actually retrieve the drm_crtc_commits

16:27 ybogdano has joined #dri-devel

16:28 <Lyude> even managed to get it so that we can change payload->vcpi_start_slot during the atomic commit like I needed

16:29 <danvet> Lyude, hm why do we only copy stuff over that late?

16:30 <Lyude> danvet: because we're not holding any locks by the time we start committing the state potentially in non-blocking modesetting, right?

16:30 <danvet> or can we compute the vcpi slots only when we do the actual commit and not precompute things?

16:30 <Lyude> yeah you can't precompute them

16:30 <Lyude> because you'd need to know what order the driver is bringing up the payloads in order to do that

16:31 <danvet> ah ok

16:31 <danvet> hm

16:31 <danvet> I guess it's not super atomic-y though, but should work

16:32 <danvet> deserves a huge comment for that state that it doesn't work quite like the others

16:32 <Lyude> yeah, luckily the thing about start slots is they're totally irrelevant to any actual state computation - so the values there also don't really matter until commit time

16:33 <danvet> yeah I think just a comment that we use this as scratch pad and that it works like hw state essentially - i.e. races are prevented by careful ordering, not locking

16:33 <danvet> and also a comment that the drm_crtc_commit completion provides the necessary cpu memory barriers for this to work correctly

16:33 <danvet> since strictly speaking you're doing a fancy lockless thing here

16:34 <Lyude> mhm - gotcha, was planning on doing something like that once I start cleaning this series up
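
A small sketch of the bitmask idea Lyude mentions above: collect the index of every CRTC involved into one integer, then walk the set bits at commit time to look up the per-CRTC commit objects. The helper names and the `commits` dictionary are hypothetical stand-ins, not the kernel's data structures.

```python
def crtc_mask(crtc_indices):
    """Build a bitmask with one bit set per involved CRTC index."""
    mask = 0
    for index in crtc_indices:
        mask |= 1 << index
    return mask


def crtcs_in_mask(mask):
    """Yield the CRTC indices whose bits are set, lowest first."""
    index = 0
    while mask:
        if mask & 1:
            yield index
        mask >>= 1
        index += 1


# Hypothetical per-CRTC commit objects keyed by CRTC index.
commits = {0: "commit-for-crtc-0", 2: "commit-for-crtc-2", 3: "commit-for-crtc-3"}

mask = crtc_mask([0, 2])
involved = [commits[i] for i in crtcs_in_mask(mask)]
print(bin(mask), involved)
```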

16:36 sravn_ has quit []

16:48 <anholt> jekstrand: were you objecting to making tex->txl lowering optional? or just asking for explanation

16:52 ybogdano has quit [Ping timeout: 480 seconds]

16:53 slattann has joined #dri-devel

16:54 mbrost_ has joined #dri-devel

16:55 <jekstrand> It still feels kinda bogus to me.

16:55 <anholt> frontend shading language doesn't have txl(shadowcube) or txl(shadow2darray)

16:55 <jekstrand> anholt: It's an obviously correct transform that most hardware needs.

16:56 <anholt> but NIR insists on making those, because...?

16:56 <jekstrand> Ugh...

16:56 mbrost has quit [Read error: Connection reset by peer]

16:56 <anholt> so then the drivers have to back out the lowering they didn't want

16:56 <jekstrand> which drivers is this causing a problem for? virgl, maybe, I guess.

16:56 <anholt> nir_lower_tex has a bunch of options, and then there's this one non-optional thing it does, too.

16:56 <karolherbst> jekstrand: nv50

16:57 <karolherbst> we don't have lod sources for those

16:57 <anholt> nouveau's the one that didn't have a workaround for NIR yet.

16:57 * jekstrand wonders if NV hardware does the right thing there automatically or if shadow sampling in vertex shaders is just busted and no one cares.

16:57 <anholt> in ntt (virgl) and radeonsi we recognize that NIR did the thing we didn't want and back it out.

16:57 <karolherbst> jekstrand: we just don't have it on nv50

16:57 <karolherbst> the hw is.. broken :)

16:57 <jekstrand> Bingo!

16:57 <karolherbst> so we can only pass 4 sources into tex

16:58 <karolherbst> and most of them are filled up with coords

16:58 <anholt> jekstrand: uh, do you positively know that VS texturing on nv50 doesn't set lod to 0?

16:58 <karolherbst> and then you add shadow and you got 4

16:58 <jekstrand> anholt: I have no idea. But I'd kind-of like someone to prove that it does before they say the workaround isn't needed.

16:58 <anholt> given that nvc0 got a knob for lod 0, but the knob isn't on nv50, I would easily believe that it's a shader stage knob on older, then an instruction knob later when they realize the same HW bits would shave instructions in the FS.

16:58 <karolherbst> we can't encode the lod for shadowcube or shadow2darray

16:58 <karolherbst> on nv50 that is

16:59 <karolherbst> there is nothing

16:59 <jekstrand> I'm not arguing that you can't encode it

16:59 <jekstrand> I'm asking if VS texturing works

16:59 <jekstrand> Maybe it does by automagic

16:59 <karolherbst> I think it doesn't :)

16:59 <karolherbst> imirkin said something about parts being broken, but nvidia exposes that anyway and it fails to compile

16:59 <karolherbst> something like that

16:59 <jekstrand> :facepalm:

16:59 <DanaG> I'm curious, how hard would it be for somebody to add DRI_PRIME support to the `ast` driver? I'd like to be able to DRI_PRIME offload, or use xrandr offload to render on the Radeon and dump into the ASPEED's framebuffer. If making it not bog down the Radeon would require skipping 3/4 of the frames, or make it allow horrible tearing, that would be fine.

17:00 <anholt> karolherbst: that was about bias.

17:00 <karolherbst> ahh

17:00 <karolherbst> so the lod works, but the bias is dropped, right?

17:00 <jekstrand> Maybe it's just too many years at Intel but I don't assume hardware does things right.

17:00 <karolherbst> anyway.. we have 4 sources and have to do something

17:00 <jekstrand> I assume someone noticed VS texturing was broken on nv50 and went "ugh... we need to force LOD to 0, let's add an instruction bit"

17:01 <karolherbst> on later gens we can encode up to 8 sources

17:01 <karolherbst> but we do have a special zero lod flag as well

17:01 <anholt> jekstrand: I've never understood. What is the argument for why NIR *should* turn tex into txl on VS on all drivers?

17:01 <anholt> like, we don't force many lowering passes on everyone, why this one?

17:02 <jekstrand> anholt: Yeah, it's weird. I'm not opposed to making it optional, in principle. I'm opposed to saying it breaks nv50 when it may actually be sort-of fixing nv50 and we're all too lazy to care and figure out why.

17:02 <anholt> it's valid, but also why not turn tex into txb(bias=0) in the FS? that's also legal.

17:03 <karolherbst> I think it would be fine to force this lowering, but then we have to explicitly say : tex won't reach the driver or something

17:03 Emmy_ has quit [Remote host closed the connection]

17:03 <anholt> and my argument is: you shouldn't be adding tex srcs when you don't have to. and nir-to-tgsi wishes you wouldn't.

17:03 <jekstrand> I also don't like instructions having subtly different behavior per-stage unless they're stage-specific.

17:04 sdutt has quit [Remote host closed the connection]

17:04 sdutt has joined #dri-devel

17:04 <jenatali> FWIW, D3D only has shadow tex with implicit mip, and shadow txl for level 0, and for VS only the latter is allowed. There's no arbitrary level txl

17:04 <anholt> (nir-to-tgsi has to back it out, because then virgl would need to recognize txl(shadowcube/2darray, lod=0) and turn it back into tex since txl on those samplers is illegal!)

17:04 <jekstrand> Why does texturing behave a little differently in 2/3 of the stages? Uh... we didn't think about it when designing GL until it was too late?

17:05 <karolherbst> well back then you only had two stages, no?

17:05 <karolherbst> or was there a time with just one stage even? :D

17:05 <jekstrand> There have always been >= 2

17:05 slattann has quit [Read error: Connection reset by peer]

17:05 <jenatali> Isn't it more like 1/6 stages? FS is the only one that does implicit LOD

17:05 <jekstrand> jenatali: Compute too, with some extensions.

17:06 Emmy_ has joined #dri-devel

17:06 <jenatali> Ah right, I forgot about those extensions

17:06 <anholt> we definitely had one stage in the fragment program days.

17:06 <karolherbst> couldn't we make those instruction to behave the same inside nir?

17:06 <jekstrand> anholt: right...

17:06 <karolherbst> sure.. glsl is stupid, but why should we carry the stupid over

17:06 <jekstrand> karolherbst: That's my argument. :)

17:07 <karolherbst> yeah...

17:07 <jekstrand> If VIRGL wants to translate back to GLSL, it's got to deal with GLSL being stupid. Same for Zink.

17:07 <jekstrand> Maybe we want a txz opcode which is txl with lod0?

17:07 <karolherbst> I am not opposed to check for a zero lod inside codegen, especially as we do have that special lz flag and could make use of it, but...

17:07 <karolherbst> jekstrand: maybe

17:08 <anholt> jekstrand: TGSI's got one of those.

17:08 <karolherbst> tex_lz?

17:08 <anholt> yep

17:08 <jekstrand> anholt: Are those allowed for shadow and cube?

17:08 <anholt> nvc0 and radeonsi have it in hw.

17:08 <anholt> yes

17:08 <jekstrand> Then why not translate nir_texop_txl with lod=0 to tex_lz?

17:09 <anholt> jekstrand: in ntt? because the ntt-consuming drivers don't have support for it, it's a cap.

17:09 <jekstrand> oh

17:10 <anholt> the reason to fix nir_lower_tex was because if we can get a decision for where to fix NIR's undesired lowering, we can get nouveau onto NIR on all chipsets and not go the ntt route for it.

17:10 <anholt> and then I can finally land !8044

17:11 <jekstrand> Yeah, I know.

17:11 <karolherbst> well.. we could handle that in some way inside from_nir, but... I also kind of prefer to move lowering from codegen into nir :)

17:11 <jekstrand> And I really want to land 8044

17:11 <karolherbst> but...

17:12 <karolherbst> maybe it does make sense to add a tex_lz? but that's going to be messy

17:12 <anholt> right. so karolherbst wants the fixup not in nir frontend. imirkin wants the fixup not in the backend, because it's not legal in the shading language. jekstrand wants the fixup not in nir because VS tex instead of txl is silly. I want someone to budge because I just want to move on with my life.

17:12 <jekstrand> anholt: I know

17:13 * jekstrand kind-of wants to replace the entire nouveau compiler. :-P

17:13 * karolherbst wants the same

17:13 <jekstrand> But that's not going to happen today. :)

17:13 rasterman has quit [Quit: Gettin' stinky!]

17:13 <karolherbst> yeah... so my reason for moving stuff into nir is, so that codegen shrinks

17:14 <karolherbst> if we can rely on using nir, I can throw out quite a lot of code

17:14 <jekstrand> I guess it's probably fine. I really don't like "nouveau is sloppy" to be the reason we carry tech debt in NIR.

17:14 <jekstrand> But I can get over it. glsl_to_tgsi is more tech debt than this tiny bit of lowering.

17:14 <karolherbst> well.. if we are expected to fix stuff up when going from nir to codegen, that's fine by me

17:14 <jekstrand> So we're still going the right direction.

17:14 <karolherbst> I just don't want to add more stuff into codegens lowering

17:14 <anholt> jekstrand: when I look at doing this in NIR, it feels like paying down tech debt because so many drivers have to reverse the undesired thing nir adds.

17:15 <jekstrand> anholt: You say "so many" and it's really just 2 AFAICT.

17:15 <anholt> (though, that argument doesn't hold so much because radeonsi and nvc0 would want to recognize lod==zero anyway since they can do tex_lz on all stages)

17:15 <jekstrand> radeonsi isn't undoing anything.

17:15 ybogdano has joined #dri-devel

17:15 <jekstrand> In fact, with the NIR lowering, radeonsi doesn't need to be doing its stage check which is the point.

17:16 <jekstrand> And neither does nouveau except for one piece of hardware we aren't sure works.

17:16 <jekstrand> But whatever, I don't want to keep arguing.

17:16 <jekstrand> As long as we can come up with a better name for the bit (I suggested one), I guess I'm fine adding it.
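
To make the disputed transform concrete, here is a toy model of a tex to txl(lod=0) rewrite gated behind an option. It is only a sketch: `TexInstr`, `lower_tex` and the `lower_implicit_lod_to_txl` flag are invented names, not the nir_lower_tex interface, and the model treats only the fragment stage as having implicit LOD (ignoring the compute extensions jekstrand mentions).

```python
from dataclasses import dataclass, field


@dataclass
class TexInstr:
    op: str        # "tex" (implicit LOD) or "txl" (explicit LOD)
    sampler: str   # e.g. "2d", "shadowcube", "shadow2darray"
    srcs: dict = field(default_factory=dict)


def lower_tex(instr, stage, lower_implicit_lod_to_txl):
    """Toy pass: outside the fragment stage, rewrite tex as txl(lod=0) if asked to."""
    if lower_implicit_lod_to_txl and stage != "fragment" and instr.op == "tex":
        # The rewrite adds one more source, which is what nv50 has no room for.
        return TexInstr("txl", instr.sampler, {**instr.srcs, "lod": 0.0})
    return instr


vs_tex = TexInstr("tex", "shadowcube", {"coord": "v0", "comparator": "v1"})

forced = lower_tex(vs_tex, "vertex", lower_implicit_lod_to_txl=True)
optional = lower_tex(vs_tex, "vertex", lower_implicit_lod_to_txl=False)
print(forced.op, sorted(forced.srcs))
print(optional.op, sorted(optional.srcs))
```

With the flag off, the instruction reaches the backend unchanged, so a driver that cannot encode an LOD source for these sampler types has nothing to back out.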

17:20 aravind has joined #dri-devel

17:24 slattann has joined #dri-devel

17:27 <ajax> jekstrand: want to pick your brain about !4037 if you have a minute

17:34 <jekstrand> ajax: Sure, what about it?

17:35 <ajax> hah, race condition, i just posted a comment on the mr

17:36 <jekstrand> ajax: Feel free to pull the first 6 into a different MR and land them.

17:42 neonking has quit [Ping timeout: 480 seconds]

17:44 <ajax> hmm. could really use a way to tunnel arbitrary events back through an XGE channel.

17:46 <jenatali> zmike: https://gitlab.freedesktop.org/mesa/mesa/-/jobs/21908000 - that a flake or did I somehow break it?

17:46 <zmike> fucking hell what now

17:46 <zmike> it's a flake

17:46 aravind has quit [Ping timeout: 480 seconds]

17:48 <zmike> ajax: really really really need that xlib change ^^^^^^

17:49 <ajax> you know what especially sucks

17:49 <ajax> i probably want to fix that in glvnd's frontend too

17:50 <ajax> let's see if i remember how to do x module releases

17:51 slattann has quit []

17:56 <ajax> ugh okay.

17:57 <ajax> jekstrand: so the irritating thing here is, we're using x11/present in a totally cromulent way, we're passing in the idle fence and waiting for it before we hand it back from ANI.

17:58 <jekstrand> yup

17:59 <ajax> the bug is that the way present releases the pixmap is it calls into the ddx to flush rendering, and glamor treats that "flush" literally as glFlush instead of anything stronger.

18:01 <ajax> so, iiuc, all xserver has done is submit commands to the device, it hasn't waited for their completion and it can't guarantee those commands get submitted before the client takes back over

18:01 <ajax> (assuming single-queue to the hardware from the kernel, but with no particular ordering among drm clients)

18:01 ngcortes has joined #dri-devel

18:01 <anholt> this sounds correct to me -- at the time we wrote that, everyone had implicit sync, and glFlush() got you to the kernel.

18:02 <jekstrand> ajax: Yup. That's why we still need implicit sync

18:03 neonking has joined #dri-devel

18:04 sdutt has quit [Read error: Connection reset by peer]

18:08 stuart has joined #dri-devel

18:08 rasterman has joined #dri-devel

18:11 <DanaG> One other idea for the DRI_PRIME with ASPEED: have the ASPEED pull from the Radeon, instead of having the Radeon push to the ASPEED.

18:12 <ajax> DanaG: i didn't think aspeed chips had enough dma to do that

18:14 DanaG has quit [Remote host closed the connection]

18:14 <ajax> jekstrand: is ARB_sync explicit enough here? if glamor did glFenceSync and waited for it to pass before emitting the present-idle event, good enough?

18:14 <ajax> is there any benefit to that over just glFinish, too

18:17 <jekstrand> ajax: Both would stall in X and kill any pipelining

18:17 <ajax> i don't need to stall, i have a main loop and i can ClientWaitSync(timeout=0) just fine.

18:18 gawin has joined #dri-devel

18:18 <jekstrand> I don't mean it stalls X, I mean we have to do a full round-trip through userspace and possibly wait inside the client (with whatever implications that has) before work gets submitted.

18:19 <jekstrand> With the likely solution being num_buffers++

18:20 <ajax> i don't follow "before work gets submitted" here. the idle fence wouldn't get touched until after the gl fence passed, and we (wsi) wait on the idle fence before ANI will give it back to the app

18:21 <jekstrand> With the way things are today, as soon as X has submitted its compositing job or blit, it can flush, hand us back the buffer, and we can hand it off to the app so they can start rendering to it.

18:21 <ajax> yes i am suggesting glamor would be better

18:22 <ajax> can easily be, small numbers of lines of code.

18:22 <jekstrand> If we glFinish or wait on a fence in X11, regardless of where that wait happens, the client can't even start building command buffers to render until after the GPU is done with that buffer.

18:22 eukara has quit []

18:23 YuGiOhJCJ has joined #dri-devel

18:24 <ajax> walk me through that? what part of the buffer state is so mutable while it's busy that you can't even start?

18:24 eukara has joined #dri-devel

18:24 <ajax> it's not going to change size

18:24 <ajax> it's probably not getting memmoved anywhere

18:24 <jekstrand> The app can't start until it gets it back from ANI because it doesn't know which image ANI is going to return next.

18:26 <jekstrand> So we want to be able to return it from ANI the moment X knows a buffer is going to be free

18:26 <jekstrand> That's the whole reason ANI comes with fences and semaphores.

18:28 <ajax> (thinking)

18:29 <ajax> that sounds a bit like you want to virtualize the mapping between VkImages and X Pixmaps?

18:29 <jekstrand> You might want me to. :P

18:30 <jekstrand> That's something that has been discussed but no.

18:30 <jekstrand> We want to know the actual BO that's going to become free

18:30 <jekstrand> And have a fence/semaphore

18:30 <jekstrand> that tell you when it's actually free

18:30 <jekstrand> The theory being that there's no point in the app trying to race with X on its current composite anyway.

18:31 <jekstrand> (Unless that app is a VR thing but those are crazy and special and don't want X in the way to begin with)

18:37 alanc has quit [Remote host closed the connection]

18:38 alanc has joined #dri-devel

18:38 <ajax> fine.

18:38 <ajax> how much of a stall ReadPixels

18:38 <ajax> is ReadPixels, excuse me

18:41 <ajax> there _has_ to be some way to learn that a particular command buffer has been completed without stalling the whole chip

18:42 <ajax> how else do you ever reclaim old command bufs

18:43 <ajax> enh. i guess the ring buffer just tells you when a command goes into the gpu, not when it finishes.

18:43 <airlied> karolherbst: oh you have to specify row/image strides on user buffers? seems dangerous :-P

18:43 <karolherbst> airlied: well... the user provides the buffer

18:44 <karolherbst> anyway.. CL allows custom strides, so...

18:45 Haaninjo has quit [Quit: Ex-Chat]

18:45 <nchery> dcbaker: did you see my ping about a backport for 22.0.3 https://gitlab.freedesktop.org/mesa/mesa/-/issues/6350#note_1355006 ?

18:55 <airlied> karolherbst: does the gallium interface allow that?

18:55 <ajax> jekstrand: is it actually that hard to predict which image will be returned next? it's the one with the lowest sbc.

18:55 <karolherbst> nope

18:57 sravn has joined #dri-devel

18:57 <airlied> karolherbst: did clover deal with it somehow?

18:57 <karolherbst> airlied: not at all

18:58 <karolherbst> I think it relies on resource_from_user_memory to fail

18:58 <karolherbst> the GL interface also doesn't allow custom strides which is a bit ... strange

18:58 <airlied> karolherbst: ah so cl_image_desc is the thing that specs it?

18:58 <karolherbst> yeah

18:59 <karolherbst> AMD_pinned_memory is the GL extension btw

18:59 <airlied> should check the vulkan ext

18:59 <airlied> just for completeness

19:00 <HdkR> pinned_memory is such a wacky extension

19:00 <karolherbst> it looks broken

19:00 <airlied> karolherbst: pinned memory seems buffer only

19:00 <karolherbst> so you specify the host ptr and say how big the texture is, but... not a word about strides?!?

19:00 <karolherbst> could be

19:00 <karolherbst> but

19:00 <karolherbst> you can read pixels out of it

19:00 <karolherbst> boxed

19:01 <airlied> you use the gl packing to do that then

19:01 <HdkR> SSBOs, UBOS, pixels. Dolphin-emu should abuse all of those with pinned_memory

19:01 <karolherbst> but yeah.. looks like the underlying memory needs to be a plain buffer

19:01 <jekstrand> ajax: It's impossible for the app to predict.

19:01 <jekstrand> ajax: The driver might be able to predict it, sure, but that doesn't solve any problems.

19:01 <karolherbst> airlied: yeah

19:01 <ajax> it does if it means ANI can return a promise and the fences/semas actually work

19:01 <ajax> right?

19:02 <jekstrand> ajax: But the only way to actually provide that promise is if the driver then stalls later because we can't actually pass the fence we get from X11 off to the kernel.

19:02 <jekstrand> Most drivers do use submit threads these days (or can) so we could, in theory, do it.

19:02 <ajax> unless we fix x11

19:02 <jekstrand> But oof

19:02 <ajax> which i keep telling you i know how to do

19:03 <jekstrand> What are we going to fix in x11?

19:03 <karolherbst> airlied: anyway.. the thing is, CL requires custom pitches and gallium doesn't allow it :(

19:03 <jekstrand> I'm still unclear on that

19:03 <karolherbst> it's fine for buffers, because it doesn't matter

19:03 <jekstrand> The only "fix" for x11 is if it starts using syncobj instead of shmfence

19:03 <airlied> karolherbst: guess you get to fixing gallium then :)

19:03 <ajax> or, for its shmfences to reflect explicit sync instead of implicit

19:03 <karolherbst> airlied: it seems that way :)
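
For readers unfamiliar with why the stride matters for user-provided image memory, a short sketch of the addressing arithmetic: with a custom row pitch, rows are no longer packed back to back, so the same pixel lands at a different byte offset. The function and the numbers are illustrative assumptions; only the notion of a row pitch and slice pitch comes from the discussion of cl_image_desc above.

```python
def pixel_offset(x, y, z, elem_size, row_pitch, slice_pitch):
    """Byte offset of pixel (x, y, z) in a linear image with explicit pitches."""
    return z * slice_pitch + y * row_pitch + x * elem_size


width, elem_size = 5, 4            # 5 pixels per row, 4 bytes per pixel
tight_pitch = width * elem_size    # rows packed with no padding
padded_pitch = 32                  # user chose to pad each row to 32 bytes

tight = pixel_offset(2, 3, 0, elem_size, tight_pitch, 0)
padded = pixel_offset(2, 3, 0, elem_size, padded_pitch, 0)
print(tight, padded)
```

An interface that only accepts a pointer and dimensions has to assume the tight layout, which is why this does not matter for plain buffers but does for images.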

19:04 <jekstrand> ajax: Then we're back to driver threads

19:04 <ajax> i mean i have one of those for wsi for fifo modes anyway...

19:04 abws has joined #dri-devel

19:05 <jekstrand> Unless it's an explicit sync primitive I can hand off to the kernel as part of my exec, we have to manage the whole VkQueue with a driver thread and wait for the fence in the driver before we submit to X11

19:05 <karolherbst> airlied: but actually, I think it's time to upstream some patches atm, so I am a bit reluctant to add more stuff at this point :D

19:05 <karolherbst> the MR is way too big already anyway

19:05 <jekstrand> At which point X11 not being able to use kernel sync primitives is impacting driver design. Please, no.

19:06 <jekstrand> (Impacting far deeper than a bit of annoyance in the WSI code, that is.)

19:06 <karolherbst> jekstrand: ohh btw.. would you mind if I fold your rusticl reference stuff into the "initial" commit? I'd like to clean it all up and would rather prefer to fix the existing commits than to add a bunch of new ones :D

19:06 <jekstrand> karolherbst: fine with me

19:06 <karolherbst> okay, cool

19:08 <airlied> karolherbst: do both :-)

19:08 <airlied> create some upstreaming MRs, hack away while reviewers hold back your brilliant code :-P

19:08 <karolherbst> :D

19:08 <karolherbst> well..

19:08 <karolherbst> I don't want people to review patches if I change stuff 100 patches later

19:09 <airlied> karolherbst: oh the upstreaming MR should be cleanly rewritten usually

19:09 <karolherbst> so I want to move all "fixups" into the original code

19:09 <karolherbst> yeah

19:09 <karolherbst> that's my plan for now :P

19:09 <airlied> hopefully next week I can get back to trying to get any images working

19:09 <karolherbst> airlied: if you want to help, we track MRs here: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6311

19:09 <karolherbst> :D

19:10 <airlied> karolherbst: question on rusticl when mapping resources do you use the appropriate buffer/texture map functions?

19:10 <karolherbst> airlied: yes

19:10 <airlied> my feeling is clover is willing to use buffer map on texture resources

19:10 <karolherbst> it does

19:10 <airlied> and that might be the cause of some of my issues

19:10 <karolherbst> possible

19:11 <karolherbst> airlied: current fails with llvmpipe: https://gist.githubusercontent.com/karolherbst/58dac1bfb08482cc30c3d2eb98d79254/raw/990e64f7dac6ce03cb2f72613cbd80feabbebb19/gistfile1.txt

19:11 <karolherbst> every one of those are... annoying issues

19:11 soreau has quit [Read error: Connection reset by peer]

19:11 <karolherbst> but nothing really broken

19:11 <karolherbst> just... annoying

19:11 <karolherbst> allocations fail, because the first try fails and then the CTS is broken checking the second

19:12 soreau has joined #dri-devel

19:12 <karolherbst> basic is alignment of long16

19:12 <karolherbst> contractions is... FTZ

19:12 <karolherbst> images_image_streams is just clamping being not correct, but works if llvmpipe is exposed as a GPU....

19:13 <karolherbst> min_max_write_image_args is... crashing because I don't know, but I'd wait until jekstrand's MR bumping limits lands

19:13 * airlied just marvels at CTS sometimes

19:13 <karolherbst> anyway.. for the image fails we could probably get a waiver

19:13 <karolherbst> for contractions we have to figure out if llvmpipe supports denorms or not

19:14 <karolherbst> there is one issue though which is driving me nuts: luxmark v3.1 crashes with llvmpipe in JIT code, and I never figured out why

19:14 <airlied> without tweaking fpstate I've no real idea :-P

19:14 <karolherbst> I suspect something is weird with nir passes, but...

19:14 <karolherbst> airlied: yeah...

19:15 <karolherbst> I honestly have no idea what would be the best approach regarding fpstate

19:15 <karolherbst> anyway.. I think I'll get an AMD GPU, so I might be able to try out some things later

19:16 * airlied will try and get back to digging a bit next week, this week has been short and written off elsewhere :-P

19:16 <karolherbst> sure

19:18 mvlad has quit [Remote host closed the connection]

19:30 lynxeye has quit [Quit: Leaving.]

19:59 ngcortes has quit [Ping timeout: 480 seconds]

19:59 famfo_znc has joined #dri-devel

20:01 famfo has quit [Ping timeout: 480 seconds]

20:07 ngcortes has joined #dri-devel

20:09 famfo_znc has quit []

20:12 <karolherbst> jekstrand: anything preventing this MR from landing? https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15673

20:12 <jekstrand> karolherbst: nope

20:14 rasterman has quit [Quit: Gettin' stinky!]

20:18 lemonzest has quit [Quit: WeeChat 3.4]

20:22 famfo has joined #dri-devel

20:26 YuGiOhJCJ has quit [Quit: YuGiOhJCJ]

20:36 <karolherbst> airlied: ohh, if you have a little time, this is something you could think about: https://gitlab.freedesktop.org/karolherbst/mesa/-/commit/1edb8e2797f9420f3d5080670476fc48cf1d43cc

20:36 <karolherbst> but yeah.. it sucks

20:37 <karolherbst> or maybe we could leave it in late, but also call it inside lp_get_disk_shader_cache if necessary?

20:38 <karolherbst> anyway, I need it for caching the libclc

20:49 ybogdano has quit [Ping timeout: 480 seconds]

21:01 mbrost has joined #dri-devel

21:01 mbrost_ has quit [Read error: Connection reset by peer]

21:10 Duke`` has quit [Ping timeout: 480 seconds]

21:11 GeorgesStavracasfeaneron[m] has joined #dri-devel

21:11 <zmike> dcbaker: do you prefer to look over backport MRs or should I just marge them?

21:12 frieder has joined #dri-devel

21:14 <jenatali> Georges Stavracas (feaneron): Yes, but you're not registered correctly, so only other people who are connected via Matrix can see your messages

21:18 frieder has quit [Remote host closed the connection]

21:33 famfo has quit [Ping timeout: 480 seconds]

21:34 famfo has joined #dri-devel

21:37 <airlied> karolherbst: yeah that patch would be fine to init it early I think

21:37 <karolherbst> airlied: ohh.. how easy do you think it would be to wire up function calling support in llvmpipe?

21:37 <airlied> karolherbst: I had it mostly working, except for implicit args

21:37 <karolherbst> airlied: well... creating the screen also ends up spawning all the threads sadly

21:37 <airlied> and flow control :-P

21:38 <karolherbst> ahh

21:38 <airlied> karolherbst: ah yeah that bit is messy isn't it

21:38 <karolherbst> it is, hence I asked you :P

21:38 <airlied> https://gitlab.freedesktop.org/airlied/mesa/-/tree/llvmpipe-cl-funcs

21:38 <karolherbst> the thing is...

21:38 <karolherbst> luxmark crashes in JIT code :(

21:39 <airlied> yeah I might go dig into that one next week

21:39 <karolherbst> cool

21:39 <airlied> sigsegv or sigbus?

21:39 <airlied> or worse SIGILL

21:39 <airlied> ?

21:39 <karolherbst> huh.. I think it's a sigsegv, but let me check

21:39 <airlied> usually it's just some calculation going into memory it shouldn't

21:40 <karolherbst> thing is.. it _sometimes_ doesn't crash

21:40 <karolherbst> but then the rendering is all wrong

21:40 <karolherbst> like it uses uninit values or something

21:41 <karolherbst> ehh I think it only does this inside gdb

21:41 <karolherbst> anyway, it's a segfault

21:42 <karolherbst> mhh.. let me try to lower undefs to 0. I think I already tried, but you never know

21:42 <jenatali> Oh no... a real world app using glProgramStringARB...

21:43 <karolherbst> I am sure they have a good reason for it

21:43 <karolherbst> "using assembly, because it's faster"

21:43 <jenatali> No, I seriously doubt it

21:43 <bnieuwenhuizen> you'd be surprised what some people are capable of

21:43 <karolherbst> ehh wait.. glProgramStringARB is something else

21:44 <jenatali> And Mesa's rejecting their program, woo

21:44 <karolherbst> good

21:44 <karolherbst> glProgramStringARB really sounds ugly

21:44 <karolherbst> wait.. so glProgramStringARB "simply" replaces the current glsl source code?

21:45 <jenatali> Looks like they have a comment after END and it seems Mesa's parser doesn't like that

21:46 <karolherbst> or is that ARB shader stuff?

21:46 <jenatali> This... is not a thing I thought I'd be debugging today

21:46 <karolherbst> (I am too young to know this)

21:46 <jenatali> karolherbst: It's an assembly shader

21:46 ngcortes has quit [Ping timeout: 480 seconds]

21:46 <karolherbst> ahh okay, so my first guess was indeed correct

21:46 <karolherbst> I am sure they do it because of performance

21:46 <karolherbst> as everybody knows, assembly is faster

21:47 <jenatali> Uh huh

21:47 gawin has quit [Ping timeout: 480 seconds]

21:52 <karolherbst> heh nice.. it still crashes with undef to zero

21:53 famfo_znc has joined #dri-devel

21:58 famfo has quit [Ping timeout: 480 seconds]

22:06 rasterman has joined #dri-devel

22:06 gawin has joined #dri-devel

22:10 <jenatali> Ugh... I think it looks like Mesa's lexer for ARB programs doesn't like that the last comment line in this program doesn't have a newline

22:12 <HdkR> Oh no, ARB

22:14 danvet has quit [Ping timeout: 480 seconds]

22:17 icecream95 has joined #dri-devel

22:17 <karolherbst> jenatali: well.. I'd say it's an application bug then :)

22:17 <karolherbst> case closed

22:17 <jenatali> But it works on all Windows GL drivers...

22:18 <karolherbst> ehh.. the spec disagrees with me, I say the spec is wrong

22:18 <karolherbst> "Comments begin with the character "#" and are terminated by a newline, a carriage return, or the end of the program array." :(

22:19 <HdkR> I think someone actually complained about this a year or so ago

22:19 <Sachiel> not ending a text file with a newline should always be a bug

22:19 ngcortes has joined #dri-devel

22:19 <karolherbst> Sachiel: it's windows

22:20 <karolherbst> they seem to like that

22:20 <karolherbst> :P

22:20 ybogdano has joined #dri-devel

22:21 <karolherbst> jenatali: workaround: insert a new line at the end of the program :)

22:21 <jenatali> I'm tempted

22:21 <karolherbst> I am sure we copy the string anyway, so we can also just add it :D

22:21 <jenatali> Yeah actually that's probably the easiest thing...

22:22 <HdkR> `/* The newline of shame gets added to all ARB programs as a workaround */`

22:23 ahajda_ has joined #dri-devel

22:23 ahajda has quit [Read error: Connection reset by peer]

22:23 <jenatali> Yeah

22:23 <karolherbst> I wouldn't be surprised if that's even better from a perf perspective

22:23 <jenatali> Otherwise apparently the only other option is for EOF to generate a different character that can be matched against? WTF flex?

22:23 <karolherbst> the alternative is to make the parser/lexer more complicated

22:24 <jenatali> Yeah
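
[Note: the workaround discussed above (append a newline to the ARB program text when it is copied, so that a trailing "#" comment is always terminated before the lexer sees end of input) might look roughly like the sketch below. This is an illustration only, not Mesa's actual code. The helper name copy_program_with_newline and its parameters are hypothetical, and it assumes the caller passes the program bytes and their length, as glProgramStringARB does.]

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not a Mesa function): copy an ARB program string
 * of `len` bytes into a fresh buffer and append '\n' plus a NUL
 * terminator, so a final comment line without a newline still ends in
 * one. Returns NULL on allocation failure; the caller frees the result.
 */
char *copy_program_with_newline(const char *str, size_t len)
{
    /* Room for the original bytes, the extra newline, and the NUL. */
    char *copy = malloc(len + 2);
    if (copy == NULL)
        return NULL;

    memcpy(copy, str, len);
    copy[len] = '\n';
    copy[len + 1] = '\0';
    return copy;
}
```

[A lexer fed the returned buffer would then see len + 1 characters. The extra newline is harmless when the program already ended in one, since it is only whitespace after END.]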

22:42 maxzor has joined #dri-devel

22:46 pcercuei has quit [Quit: dodo]

22:54 mbrost has quit [Ping timeout: 480 seconds]

22:56 * jekstrand realizes he knows way too much about NaN and begins questioning life choices

22:56 <jenatali> There we go, fixed by !16230, as ugly as it is

22:58 <HdkR> ah, probably fixes a bunch of things compiling to ARB with CG?

22:59 illwieckz has joined #dri-devel

22:59 <HdkR> Probably also fixes dolphin-emu 2.0 era OpenGL then :P

23:01 <jenatali> Probably?

23:01 <jenatali> Our compat folks just flagged this particular app because it doesn't need GL to render, but if you add in a Mesa GL impl, then it stops rendering

23:02 <jenatali> So it's technically a regression by adding GL support

23:06 tursulin has quit [Read error: Connection reset by peer]

23:11 <jenatali> Anyways, reviews or acks welcome. I have no idea if that code has an owner these days

23:11 morphis has quit [Ping timeout: 480 seconds]

23:11 morphis has joined #dri-devel

23:12 maxzor has quit [Ping timeout: 480 seconds]

23:13 <airlied> karolherbst: you can't create a context before libclc? though that would be pretty ugly

23:14 <dcbaker> zmike: just let me know that you’ve merged stuff so I don’t force push over it

23:14 <airlied> the other option would be to add a flag to screen creation I suppose

23:14 <karolherbst> airlied: the problem isn't that I can't, the thing is, I don't want to :P

23:14 <karolherbst> I kind of load the libclc when I create the device struct

23:15 <karolherbst> and that's like really really early

23:17 <airlied> plumbing a flag through would also be messy

23:17 <karolherbst> yeah...

23:17 ahajda_ has quit []

23:18 <zmike> dcbaker: kk

23:19 <airlied> karolherbst: the other option could be to delay disk cache thread init

23:20 <FireBurn> airlied: Have you thought about https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15331

23:21 <FireBurn> A solution will be needed for 22.1.0

23:22 <airlied> FireBurn: jekstrand, bnieuwenhuizen: can someone just ack the fix, it's a clear regression

23:22 <airlied> if you want to figure out wtf is going wrong feel free, but I've burned a fair bit of time failing there

23:24 <FireBurn> Cheers

23:26 iive has quit []

23:27 illwieckz has quit [Remote host closed the connection]

23:28 <jekstrand> airlied: Yeah, let's land that. I hate it but idk what's going on and don't have time to burn figuring it out

23:29 <bnieuwenhuizen> FireBurn: the bug has "Issue starting Horizon Zero Dawn", what are your symptoms?

23:29 rkanwal has quit [Ping timeout: 480 seconds]

23:29 <bnieuwenhuizen> is it a crash?

23:29 <bnieuwenhuizen> if so it would be very helpful if we could get a backtrace

23:30 <FireBurn> No it doesn't crash

23:31 <FireBurn> Let me just revert the fix locally, it was a month ago and things are a bit fuzzy

23:36 illwieckz has joined #dri-devel

23:40 <FireBurn> I get a "Unfortunately the game has crashed. Do you want to help us fix the issue by sending a crash report? Yes / No"

23:40 <airlied> so one of those crashes we can't get a backtrace from because the game steals it :-P

23:41 <airlied> karolherbst: uggh late init the threads is messy as well

23:41 YuGiOhJCJ has joined #dri-devel

23:42 <bnieuwenhuizen> FireBurn: is this the only game where this happens?

23:42 <bnieuwenhuizen> or does it happen for all vulkan games?

23:44 <FireBurn> https://gitlab.freedesktop.org/mesa/mesa/-/issues/6123

23:44 <FireBurn> Someone else reported other games that no longer worked

23:44 <FireBurn> Rise of the Tomb Raider loads

23:45 <bnieuwenhuizen> hmm, looks like it might be correlated with vkd3d-proton

23:45 <FireBurn> Detroit: Become Human appears to load and it's vkd3d-proton

23:47 <FireBurn> Does that game dump help?

23:48 <bnieuwenhuizen> no :(

23:48 <airlied> karolherbst: clover manages its own clc disk cache, maybe that is an option?

23:49 <karolherbst> airlied: the way clc loads the libclc is device specific

23:49 <karolherbst> also, we run optimizations

23:49 <karolherbst> and I don't think rusticl should try to know how to cache a device specific nir :)

23:49 <FireBurn> Ah detroit crashed after the shader compile at first start up

23:50 <karolherbst> or well.. we don't run optimizations yet, but we want to

23:50 <airlied> karolherbst: clover creates a disk cache per device

23:50 <karolherbst> airlied: that sounds horrible

23:51 <karolherbst> and also very buggy

23:51 <karolherbst> what's the proper key?

23:51 <karolherbst> how do we make sure we don't load the wrong nir for a different device?

23:51 <airlied> yeah it does sound a bit broken, granted it doesn't change the nir per device

23:51 <karolherbst> yeah...

23:52 <karolherbst> anyway, it's the driver's responsibility to get me the properly configured disk_cache

23:53 <karolherbst> airlied: maybe we should skip loading llvmpipe if we have a hw device?

23:53 <karolherbst> or only load llvmpipe if the hw device fails to load

23:54 <airlied> well the workaround is for vulkan

23:54 <karolherbst> I could do that from within rusticl

23:54 <airlied> where you don't have that option

23:54 <karolherbst> ohh :(

23:54 ybogdano has quit [Remote host closed the connection]

23:54 <karolherbst> annoying...

23:54 <airlied> so you end up with pipe screens in the vulkan instance, and all the resources associated with that

23:55 <bnieuwenhuizen> FireBurn: want to try https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16232 ? (<--- airlied for review?)

23:56 <karolherbst> airlied: yeah... uhm...

23:56 <karolherbst> airlied: ohhhh... wait.. I have an idea

23:56 <karolherbst> but I don't like it

23:56 <karolherbst> airlied: after adding that, I also started to create a helper context per device

23:57 <airlied> bnieuwenhuizen: I think we tried that one, but maybe we didn't

23:57 <karolherbst> airlied: so I think I could get around by loading the libclc after I created the helper context

23:57 <airlied> karolherbst: yeah if you are creating a helper context, then just do disk cache later

23:58 <airlied> if you are considering removing helper context then it's something we need to dig into deeper

23:58 <karolherbst> I am not considering it, because I need it

23:58 <karolherbst> not for much, but...

23:58 <FireBurn> bnieuwenhuizen: Compiling now

23:58 <karolherbst> airlied: we have to be able to upload data to pipe_resources without a CL queue

23:59 <karolherbst> soo.. no way around having such a helper context really

23:59 <karolherbst> (also I use it for async maps)

23:59 icecream95 has quit [Ping timeout: 480 seconds]