ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
<daniels>
karolherbst: pre-armv8, unaligned access would trap for _all_ memory; it's a kernel tweak to decide if it passes on a SIGBUS or if it transparently fixes it up and continues
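For reference, a minimal C sketch of the trap daniels describes (assumptions: a 32-bit pre-ARMv8 target, and that the kernel's alignment-trap policy is the usual /proc/cpu/alignment knob on ARM Linux):

    /* The direct dereference below is the potentially-trapping case: on
     * pre-ARMv8 it may raise SIGBUS, be fixed up transparently by the
     * kernel, or be logged, depending on the alignment-trap setting.
     */
    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        unsigned char buf[8] = {1, 2, 3, 4, 5, 6, 7, 8};

        /* Unaligned load: buf + 1 is not 4-byte aligned. */
        uint32_t risky = *(uint32_t *)(buf + 1);

        /* Portable alternative: memcpy lets the compiler emit safe accesses. */
        uint32_t safe;
        memcpy(&safe, buf + 1, sizeof(safe));

        printf("%" PRIu32 " %" PRIu32 "\n", risky, safe);
        return 0;
    }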
<karolherbst>
:D
<karolherbst>
daniels: I guess broken aarch64 platforms could have the same thing?
<karolherbst>
you have dts for all of that anyway, so just declare it inside one or something :P
<HdkR>
unaligned device memory fixup on AArch64 sounds horrifying
<karolherbst>
well..
<karolherbst>
don't build broken hardware then?
<jcdutton>
daniels, how would we test the SIGBUS transparent fixes feature?
<jekstrand>
karolherbst: That sounds good enough for now
jcdutton has quit [Quit: Leaving]
pcercuei has quit [Quit: dodo]
iive has quit []
tursulin has quit [Read error: Connection reset by peer]
tarceri_ has quit []
tarceri has joined #dri-devel
ngcortes has quit [Remote host closed the connection]
dllud_ has joined #dri-devel
dllud has quit [Read error: Connection reset by peer]
columbarius has joined #dri-devel
co1umbarius has quit [Ping timeout: 480 seconds]
JohnnyonFlame has quit [Ping timeout: 480 seconds]
mbrost has quit [Ping timeout: 480 seconds]
shankaru has joined #dri-devel
mbrost has joined #dri-devel
mclasen has quit []
mclasen has joined #dri-devel
mripard_ has joined #dri-devel
saurabhg has joined #dri-devel
saurabhg has quit []
jewins1 has quit [Remote host closed the connection]
mripard has quit [Ping timeout: 480 seconds]
JohnnyonFlame has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
<mareko>
what is virglrenderer?
mclasen has quit []
mclasen has joined #dri-devel
shankaru has quit [Quit: Leaving.]
heat has joined #dri-devel
shankaru has joined #dri-devel
wens has joined #dri-devel
JohnnyonFlame has quit [Read error: Connection reset by peer]
<airlied>
plugs into qemu or crosvm and translates gallium into GL (or in later bits vulkan into vulkan)
ybogdano has quit [Ping timeout: 480 seconds]
mattrope has quit [Read error: Connection reset by peer]
heat_ has joined #dri-devel
heat has quit [Read error: Connection reset by peer]
YuGiOhJCJ has quit [Quit: YuGiOhJCJ]
aravind has joined #dri-devel
kts has quit [Quit: Konversation terminated!]
Daanct12 has quit [Remote host closed the connection]
prahal has quit [Ping timeout: 480 seconds]
heat has joined #dri-devel
heat_ has quit [Read error: No route to host]
danvet has joined #dri-devel
itoral has joined #dri-devel
nchery has quit [Remote host closed the connection]
nchery has joined #dri-devel
eukara_ has quit []
Duke`` has quit [Ping timeout: 480 seconds]
pnowack has joined #dri-devel
heat has quit [Ping timeout: 480 seconds]
soreau has quit [Read error: No route to host]
soreau has joined #dri-devel
Daanct12 has joined #dri-devel
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
shankaru has quit [Quit: Leaving.]
ahajda has joined #dri-devel
frieder has joined #dri-devel
itoral has quit [Remote host closed the connection]
kts has joined #dri-devel
shankaru has joined #dri-devel
jkrzyszt has joined #dri-devel
kts_ has joined #dri-devel
kts has quit [Ping timeout: 480 seconds]
itoral has joined #dri-devel
MajorBiscuit has joined #dri-devel
MajorBiscuit has quit []
kts_ has quit []
nchery has quit [Read error: Connection reset by peer]
MajorBiscuit has joined #dri-devel
illwieckz has quit [Ping timeout: 480 seconds]
tursulin has joined #dri-devel
ella-0_ has quit [Read error: Connection reset by peer]
<pq>
agd5f, the only use case I could understand from the email discussions was that you want to kill processes in response to the GPU reset event.
<pq>
agd5f, in my opinion, killing processes is a harmful thing to do, and a step in the wrong direction if we want to make systems recover from a reset.
<pq>
This was just the final straw for me.
<danvet>
hm which discussion?
<danvet>
sounds like something I should look at :-)
<pq>
Dri-devel is consuming my time and sanity, but I get nothing from it. I think it's time to stop this addiction. I have disabled mail delivery, but CC should still work. It's time for me to leave, but I will come back if I have to.
<pq>
This way I'm less likely to butt in to topics I have nothing to do with.
sdutt has quit [Ping timeout: 480 seconds]
pq has left #dri-devel [Goodbye.]
lynxeye has joined #dri-devel
vnayana_ has quit [Ping timeout: 480 seconds]
anholt has quit [Ping timeout: 480 seconds]
anholt has joined #dri-devel
jbarnes has quit [Remote host closed the connection]
Daanct12 has quit [Remote host closed the connection]
<MrCooper>
danvet: FYI, amdgpu doesn't always nuke everything for GPU reset; I've hit a few cases where mutter continued as if nothing happened after amdgpu reset just one engine (though Firefox responded to the corresponding robustness notification by dropping to SW rendering :)
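A minimal sketch of how a GL client like Firefox might react to the robustness notification MrCooper mentions, assuming a context created with robustness enabled and headers/loader exposing GL 4.5 (KHR_robustness); fallback_to_sw_rendering() is a hypothetical placeholder, not a real API:

    #include <GL/gl.h>

    void fallback_to_sw_rendering(void);   /* hypothetical application hook */

    void check_for_gpu_reset(void)
    {
        /* Returns GL_NO_ERROR normally; after a reset it reports
         * GL_GUILTY_CONTEXT_RESET, GL_INNOCENT_CONTEXT_RESET or
         * GL_UNKNOWN_CONTEXT_RESET until the context is recreated. */
        GLenum status = glGetGraphicsResetStatus();

        if (status != GL_NO_ERROR)
            fallback_to_sw_rendering();
    }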
pcercuei has joined #dri-devel
<danvet>
javierm, do you plan to apply the patches from geert?
<danvet>
maybe also volunteer him for commit rights in a bit
<danvet>
MrCooper, yeah there's an "unblock stuck threads in all CUs" first step iirc
<danvet>
I thought I mentioned that in my mail
<danvet>
but after that it's the "nuke the chip" step
<danvet>
which is awkward because the locking against kms is pain
<danvet>
and the locking against fbcon is impossible :-/
<danvet>
intel hw is pretty nice, since you have per block reset
<danvet>
where block = engine, except for compute engines because they all share the same EUs, so it's hard to reset them in isolation
<danvet>
pjakobsson, nah I simply missed those when scrolling through
<danvet>
I tried to look for anything like that, I guess I wasn't awake enough yet
<danvet>
and yeah maybe for next time around split these up more, the commit message at least sounds a bit like multiple things smashed into one
<pjakobsson>
danvet, ok thanks
sdutt has quit []
sdutt has joined #dri-devel
Company has joined #dri-devel
kts has joined #dri-devel
mclasen has quit []
mclasen has joined #dri-devel
shankaru has joined #dri-devel
Haaninjo has joined #dri-devel
nchery has joined #dri-devel
<graphitemaster>
So on the topic of DirectStorage from yesterday, what is the equivalent in Linux?
<imirkin>
p2p dma
<imirkin>
no userspace-focused APIs afaik
<imirkin>
more like allowing dmabufs to be passed around
<imirkin>
and making things Just Work (tm)?
<graphitemaster>
Seems like something that should be added to io_uring
<imirkin>
i guess i don't know too much about io_uring, but seems entirely unrelated to me
<imirkin>
that one's about cpu <-> device interactions, basically
<graphitemaster>
The thing is the NVMe device is not a regular mapped drive on the CPU side, it's just a flat store of bytes; the GPU side just reads directly from it as if the thing were a big chunk of RAM
<graphitemaster>
But you still need to write to the drive as if it was a RAM device (that's how the API works anyways)
<graphitemaster>
So I assume that would map best to io_uring, rather than read/write which requires a working VFS implementation for the kernel.
<graphitemaster>
It's also just unusually weird if you had to use pwrite/pread giving it pointer _addresses_ reinterpreted as offsets. I suppose you could mmap, but mmap is over a _file_
<graphitemaster>
Maybe the device driver just exposes a DirectStorage drive as a singular file, humm..
<imirkin>
there's also something called HMM
<imirkin>
and afaik that's meant to work with dax pages?
<graphitemaster>
I forgot about HMM - in either case we need some working user-space API :P
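To ground the io_uring idea, here is a minimal liburing sketch of a plain read submission as it exists today; it reads NVMe data into ordinary host memory, since the GPU peer-to-peer destination being discussed has no userspace API yet. The device path is only an example:

    #include <fcntl.h>
    #include <liburing.h>
    #include <stdio.h>

    int main(void)
    {
        struct io_uring ring;
        struct io_uring_sqe *sqe;
        struct io_uring_cqe *cqe;
        static char buf[4096];

        int fd = open("/dev/nvme0n1", O_RDONLY);     /* example device node */
        if (fd < 0 || io_uring_queue_init(8, &ring, 0) < 0)
            return 1;

        sqe = io_uring_get_sqe(&ring);
        io_uring_prep_read(sqe, fd, buf, sizeof(buf), 0);   /* offset 0 */
        io_uring_submit(&ring);

        if (io_uring_wait_cqe(&ring, &cqe) == 0) {
            printf("read returned %d bytes\n", cqe->res);
            io_uring_cqe_seen(&ring, cqe);
        }

        io_uring_queue_exit(&ring);
        return 0;
    }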
<graphitemaster>
They're also planning on bringing DirectStorage to WebGPU
<graphitemaster>
So now my webbrowser can fill an SSD from Javascript :P
<imirkin>
woohoo!
MajorBiscuit has joined #dri-devel
kts has quit [Quit: Konversation terminated!]
cheako is now known as cheakoirccloud
cheakoirccloud has quit []
cheakoirccloud has joined #dri-devel
cheakoirccloud has left #dri-devel [#dri-devel]
cheako has joined #dri-devel
<anholt>
tomeu: does the current lava stuff really not have a "regex for lines looking like this and restart the job"?
<anholt>
should I be able to regex on "for line in logs:"?
<tomeu>
we have that in lava, but it's more involved than that
<tomeu>
it basically detects in what phase of the job it failed, and depending on which it was, will fail the job with an infrastructure error
<anholt>
if this was bare-metal, I would have just added the regexes and been done, but I don't see how to do that for maintaining lava boards.
<tomeu>
well, it's hard to keep the total pipeline runtime down if we have so many restarts, we have seen that being a problem already
<anholt>
ok, but is someone working on fixing the underlying issue on your end in that case?
<anholt>
it's failing multiple marges per day currently.
<anholt>
if you don't have someone fixing the underlying thing, then we at least need restarts.
<anholt>
or we have to give up on using the lava lab for our coverage.
eukara has joined #dri-devel
<tomeu>
yes, we have one person in the lab that is going to be actively looking for these issues (starting from tomorrow)
<daniels>
it's failing a bit less often than a630, but close
<daniels>
(that's raw numbers, not filtered for legit vs. spurious)
<daniels>
anholt: turns out I don't understand 2022 Python, so gallo has the stuff I was doing to change log parsing (so we can regex, and get positive confirmation of starting deqp rather than negative confirmation of specific fail regex, and so it's less hokey, and also way less noisy), and is merging it into the version he has in tree now which has tests etc
jkrzyszt has quit [Ping timeout: 480 seconds]
tobiasjakobi has joined #dri-devel
shankaru has quit [Quit: Leaving.]
tobiasjakobi has quit []
Duke`` has joined #dri-devel
MajorBiscuit has quit [Ping timeout: 480 seconds]
airlied has quit [Remote host closed the connection]
airlied has joined #dri-devel
frieder has quit [Remote host closed the connection]
<agd5f>
graphitemaster, both devices need to support the same p2p API in the kernel (GPUs support dma-buf, nvme supports pci p2p), so you'd need to convert one or the other to a common p2p API. Then you need some userspace API to actually utilize the p2p transfer.
<karolherbst>
airlied: did you write some patches for nir_op_ball_iequal8 and the likes?
<karolherbst>
jekstrand: I think I am almost done :D
<airlied>
karolherbst: thought there was a lowering pass
<karolherbst>
airlied: there is, just not for 8 and 16
<graphitemaster>
agd5f, Well I hope someone is looking into it. I sense a shift in gaming where DirectStorage goes from being an optional thing to a requirement in the next 2 years or so.
<karolherbst>
I was just hoping you already have a patch for it
<jekstrand>
karolherbst: done?
<karolherbst>
jekstrand: yeah.. so maybe 2 or 3 things and all the tests pass, at least using a CL 1.0 device
<jekstrand>
Nice!
<jekstrand>
karolherbst: Images too?
<karolherbst>
ahh nope
<karolherbst>
I am not sure if I want to do images or 1.2 next
<karolherbst>
probably 1.2
<karolherbst>
images is probably just wiring up more gallium interfaces really
<karolherbst>
I think
<karolherbst>
and using whatever lowering pass we have in clc
mhenning has joined #dri-devel
<karolherbst>
1. work offsets 2. linking with external kernels 3. some alu precision stuff
<karolherbst>
that's what needs to be done for 1.0 without images I think
<karolherbst>
there are some weirdo sub buffer fails as well... not sure what the problem is though
<airlied>
karolherbst: does it reproduce under clover?
<jekstrand>
I *think* you're supposed to always run it if there are side-effects
<karolherbst>
okay... get_global_offset
<jekstrand>
But I could be wrong there. I'm not especially familiar with that particular corner.
<alyssa>
It would be convenient if the spec didn't say anything about the value of counts[] for that shader
<alyssa>
but hey
<alyssa>
not sure what I would even grep for
<imirkin>
alyssa: side-effects
<imirkin>
and/or "side effects"
<imirkin>
you can also look at e.g. the atomic counters spec (i know this isn't about counters, but it's the same idea)
<alyssa>
imirkin: There's text for helper invocations but nothing else in ESSL 3.2
* alyssa
tries ES3.2 itself
<alyssa>
"The repeatability requirement doesn't apply when using shaders containing side effects... because these memory operations are not guaranteed to be processed in a defined order"
* jekstrand
can actually hear his desktop right now
<anholt>
alyssa: yes, legal to run VSes multiple times.
<imirkin>
alyssa: that's for the r600-style apply-everything-at-the-end types of setups
<alyssa>
imirkin: ah
<imirkin>
and then the exact ordering of blocks can end up being different
<alyssa>
spec seems silent here
<anholt>
GLES was explicit about this related to atomics, GL wasn't, but it's been treated as a bug when GL tests relied on executing once.
<alyssa>
anholt: that helps, thank you :)
<alyssa>
(atomics and also SSBOs/images/etc I would hope?)
<anholt>
right
<alyssa>
does the side effect have to happen for vertices corresponding to primitives that will be culled?
<alyssa>
(Practically: do the side effects have to happen in the binning shader?)
ella-0 has joined #dri-devel
<graphitemaster>
Why might an implementation want to execute the VS multiple times?
<imirkin>
graphitemaster: binning
<alyssa>
graphitemaster: and if re-running the VS is faster than looking up the result in memory
<alyssa>
(Might hold for funny caching setups, etc)
<imirkin>
graphitemaster: or more importantly, tiling. binning is just an optimization for tiling.
<imirkin>
basically all the mobile gpu's only have a teensy little bit of fast ram that's good for blending and such ops
<imirkin>
so a FB is cut up into tiles, and you run the full pipeline once for each tile
<imirkin>
but then you run multiple draws in a single tile
froz has joined #dri-devel
<imirkin>
so you save on moving things in/out of fast memory
rkanwal has joined #dri-devel
<imirkin>
in exchange for running the vertex stages a bunch of times
froz has quit []
<alyssa>
to add, new Mali supports two (well three) geometry flows
<imirkin>
(and binning is an opt which helps avoid running the _whole_ vertex pipeline, and instead only do the vertices which will rasterize to polygons that matter)
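A rough, purely illustrative C sketch of the tiler flow imirkin outlines (not any driver's real code): every draw is replayed for every tile, which is why a vertex shader with side effects can run more than once per vertex.

    #include <stdio.h>

    struct tile { int x, y; };
    struct draw { int first_vertex, vertex_count; };

    /* Stubs standing in for the real hardware stages. */
    static void run_vertex_stage(const struct draw *d, const struct tile *t)
    {
        printf("VS: draw starting at vertex %d on tile (%d,%d)\n",
               d->first_vertex, t->x, t->y);
    }

    static void run_raster_and_fragment(const struct draw *d, const struct tile *t)
    {
        (void)d; (void)t;
    }

    static void render_frame(const struct tile *tiles, int n_tiles,
                             const struct draw *draws, int n_draws)
    {
        for (int t = 0; t < n_tiles; t++) {          /* once per tile...     */
            for (int d = 0; d < n_draws; d++) {      /* ...replay every draw */
                run_vertex_stage(&draws[d], &tiles[t]);   /* the VS reruns here */
                run_raster_and_fragment(&draws[d], &tiles[t]);
            }
            /* the small on-chip tile buffer is written back to the FB here */
        }
    }

    int main(void)
    {
        struct tile tiles[] = { {0, 0}, {1, 0} };
        struct draw draws[] = { {0, 3} };
        render_frame(tiles, 2, draws, 1);
        return 0;
    }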
<alyssa>
a binning flow, and a non-binning one
<imirkin>
ah yeah, binning is optional on adreno too
<alyssa>
(The non-binning flow is legacy)
<alyssa>
Panfrost is conservative and only uses binning when there are no side effects / XFB
<alyssa>
but maybe that's overkill
<alyssa>
I am currently wiring up the legacy path for valhall and wondering if I should just, not
agd5f_ has joined #dri-devel
<alyssa>
I suppose it helps for that random big GL game that was written without tilers in mind and depends on the exactly-one-invocation behaviour
<alyssa>
if we end up with drirc entries for that
<HdkR>
every big GL game that is written without tilers in mind*
<alyssa>
HdkR: ...and depends on the subtle SSBO behaviour
<HdkR>
Of course :D
<imirkin>
alyssa: freedreno does XFB in the binning pass ;)
<imirkin>
[when there's no hw xfb supported, that is]
<alyssa>
imirkin: So I here
<alyssa>
hear
agd5f has quit [Ping timeout: 480 seconds]
* alyssa
would be somewhat interested in a nir_lower_xfb
<alyssa>
using mareko 's new store_output stuff
* alyssa
might even write one depending on her motivation levels
<airlied>
karolherbst: 15433 has the clover fix if you have a minute
<imirkin>
alyssa: i thought the freedreno thing was a common pass .... maybe not
<alyssa>
imirkin: will see
<imirkin>
it does rely on some special inputs
<alyssa>
I don't think it became possible to lower XFB purely generically until a few weeks ago
<imirkin>
might not be extremely generic. but it's def an nir pass.
<alyssa>
Anyway, doing "hardware" XFB made sense for Malis that didn't do binning, it saved on memory b/w
<Lynne>
airlied: "can't use VK_IMAGE_CREATE_DISJOINT_BIT because VK_FORMAT_G8_B8R8_2PLANE_420_UNORM doesn't support VK_FORMAT_FEATURE_DISJOINT_BIT"
<imirkin>
should always do hardware xfb. a3xx just doesn't have it. and i'm still RE'ing the finer details on a4xx.
<alyssa>
we don't have real hw xfb
<alyssa>
it's just, on older malis the varyings all got written to driver-allocated memory anyway
<Lynne>
we need to be able to convert between planar images (for codecs) and standalone images (what we normally use everywhere), so not being able to merge images shuts us down on writing a hwaccel
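A minimal sketch of the capability check behind the error Lynne quotes, assuming a VkPhysicalDevice is already in hand: query the format's features and only set VK_IMAGE_CREATE_DISJOINT_BIT if VK_FORMAT_FEATURE_DISJOINT_BIT is advertised.

    #include <stdbool.h>
    #include <vulkan/vulkan.h>

    /* True if the NV12-style multiplane format can be created disjoint. */
    bool supports_disjoint_nv12(VkPhysicalDevice phys)
    {
        VkFormatProperties props;
        vkGetPhysicalDeviceFormatProperties(phys,
                                            VK_FORMAT_G8_B8R8_2PLANE_420_UNORM,
                                            &props);
        return (props.optimalTilingFeatures & VK_FORMAT_FEATURE_DISJOINT_BIT) != 0;
    }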
<alyssa>
so we can play games to use the same buffer for xfb and for internal varying use
<airlied>
Lynne: I don't think the hw can support disjoint for hw operations though
<airlied>
Lynne: the decode hw has to have the planes in the same BO allocation
<Lynne>
for intel yes, but pretty sure AMD wants separate planes
<Lynne>
at least in vaapi land
<airlied>
nope
<airlied>
you can have separate planes but they must be in the same object allocation
<Lynne>
oh, is that why radv hard-requires dedicated allocation?
<airlied>
oh maybe I'm confusing intel and amd here, let me dig a bit more
<alyssa>
imirkin: anyway, would need to benchmark but I suspect on newer Mali lowering XFB would be a win for perf
<airlied>
Lynne: oh looks like you might be right, I should probably allow disjoint on amd then
<alyssa>
since Arm seemed to stop caring about perf of the legacy (non-binning) path
lynxeye has quit []
<imirkin>
alyssa: ok. i prefer the hw approach since it avoids ... annoyances. but if it's missing, then can't do much about that
<alyssa>
imirkin: nod. I guess what I'm saying is, they're both software paths, just with different costs.
<imirkin>
ah yeah. then it's annoying.
<airlied>
Lynne: looks like radv has no disjoint support at all yet, so will have to think about it a bit
<airlied>
bnieuwenhuizen: ^ any ideas?
<alyssa>
imirkin: and given we're failing a big pile of piglits (despite passing deqp-gles), I am not convinced the code is correct either :v
<imirkin>
alyssa: well, watch out with piglits... a lot of them assume correct GL_QUADS handling
<imirkin>
which you might not have, nor care to have
<alyssa>
We have real QUADS
<bnieuwenhuizen>
airlied: it is annoying to reference multiple BOs in descriptor sets for a single binding without making it huge for everything
agd5f_ has quit []
gouchi has joined #dri-devel
agd5f has joined #dri-devel
<agd5f>
graphitemaster, I wouldn't hold your breath considering how much of a challenge it was to get basic p2p support upstream.
<agd5f>
plus I think only AMD and Intel CPUs support p2p DMA. There is no PCIe spec mechanism to determine whether it's supported or not
<agd5f>
so the kernel uses a whitelist
<bnieuwenhuizen>
Lynne: not sure about imported images, but in radeonsi VA, all allocated images currently consist of a single BO
<Lynne>
airlied: thanks, also look into updating your radv video branch to the newest headers
<karolherbst>
I am not sure if I should blame rust or something else on that uvec4x0a16B
<jekstrand>
karolherbst: There are passes you may be running which turn vec3 into vec4
<karolherbst>
ohh that's not what I mean. I am just mostly confused on how that type is written out
<alyssa>
karolherbst: memory corruption? :-p
<karolherbst>
I have no idea
<jekstrand>
It's a uvec4 with an explicit alignment of 16B
<karolherbst>
ahh
<alyssa>
jekstrand: where does this notation come from
<jekstrand>
And a stride of 0 because stride doesn't matter for vectors
<jekstrand>
alyssa: Uh... I came up with it?
<karolherbst>
jekstrand: I am wondering though why microsoft lowest vec3 to vec4
<jekstrand>
nir_print.c
<karolherbst>
*lowers
<jekstrand>
karolherbst: It gets rid of most struct holes so memcpy optimizations work.
<jekstrand>
That's one reason, anyway
<karolherbst>
ohh
<karolherbst>
okay.. then I guess I can keep it anyway and still load internal vec3 uniforms
<deathmist>
are there docs on running dEQP? I know it's set up for CI but I'd like to help out with FD540 and run it locally
<mattst88>
deathmist: not really any docs, but this is how I run it: https://dpaste.org/jk1L
rkanwal has quit []
rkanwal has joined #dri-devel
<deathmist>
mattst88: thanks, I'll test it after building a debug-optimized mesa
<mattst88>
yw!
pnowack has quit [Quit: pnowack]
<HdkR>
So there are currently two RISCV GPU projects in the world. LibreSoC and RV64X. I believe both projects have been in here before? Anyone have any information as to which one is active?
heat has quit [Read error: No route to host]
<airlied>
HdkR: LibreSOC is no longer risc-v based
heat has joined #dri-devel
<bnieuwenhuizen>
also don't forget llvmpipe
<HdkR>
ah. That's good to know
<HdkR>
Did LibreSoC move over to POWER or something?
<alyssa>
hnnnn
<bnieuwenhuizen>
IIRC yes
<alyssa>
HdkR: are you trying to nerdsnipe me
<HdkR>
Nah, I'm just trying to find some RISCV CPU designers to poke at :P
<HdkR>
Prodding someone at SiFive may be useful as well
<bnieuwenhuizen>
someone was also on mesa-dev@ asking about stuff while apparently working on RISC-V extensions for GPUs
<alyssa>
if you all aren't trying to nerdsnipe me you're doing a terrible job
<bnieuwenhuizen>
alyssa: what is there to be nerdsniped by?
<HdkR>
alyssa: So you're saying you DO want to work on x86 emulation
<airlied>
HdkR: pixilica is the name of one company involved
<bnieuwenhuizen>
AFAICT these projects are all terrible at actually delivering a GPU, so no driver to write
<airlied>
tbh the idea of a risc-v gpu is woefully bad
<HdkR>
airlied: I love the name
<airlied>
just write a GPU ISA, it's not like you need to standardise it
<airlied>
you are going to get patent sued to oblivion anyways
<HdkR>
haha
<karolherbst>
jekstrand: oh wow... I am surprised how far I came without caring about alignment inside the input buffer at all...
<alyssa>
bnieuwenhuizen: open hw gpu
<jekstrand>
karolherbst: hehe
* airlied
would love to have a super fast/massive FPGA, open source tooling, and 5 years of funding, and then 10 years in court
<jekstrand>
zmike: Results for my dynamic rendering branch are the same as your sync2 branch that it's based on top of. I'd say that means it's good.
<zmike>
jekstrand: weird
<zmike>
but not the first time I've seen different systems/llvms give wildly different results
<airlied>
jekstrand: what are the crashes?
<jekstrand>
zmike: This is also on whatever CTS branch I'm on
<airlied>
alyssa: you can file you lower to scalar dubiousness in an issue if you want :P
illwieckz has quit [Ping timeout: 480 seconds]
<Lynne>
airlied: fixed all validation issues, only the dedicated allocation/single->multiplane image issue remains
<Lynne>
IIRC nvidia didn't require dedicated allocation, so I'll see if it works tomorrow
<airlied>
Lynne: oh we might not require dedicated alloc either actually
<airlied>
though other vendors might, so no harm in supporting it
illwieckz has joined #dri-devel
<Lynne>
oh, right, requiresDedicatedAllocation is 0
Duke`` has quit [Ping timeout: 480 seconds]
mvlad has quit [Remote host closed the connection]
* karolherbst
hates regressions
camus1 has joined #dri-devel
rkanwal has quit [Ping timeout: 480 seconds]
camus has quit [Ping timeout: 480 seconds]
<karolherbst>
jekstrand: but if I compare how it was to support CL on top of nir in the beginning with the situation now, I have to say that's a lot more pleasant today. It's nice when stuff just works (tm)
<ajax>
that can't be the intended sort order, right?
gouchi has quit [Remote host closed the connection]
<karolherbst>
ahh cl_khr_fp64 is required for CL 1.2 nice
<karolherbst>
but I think I start reporting 3.0 once that's enabled :)
<jekstrand>
:)
<imirkin>
ajax: going in order of performance...
fxkamd has quit []
<karolherbst>
"Device Version OpenCL 3.0" :3
tobiasjakobi has joined #dri-devel
tobiasjakobi has quit []
<airlied>
karolherbst: yeah backporting shouldn't be horrible
* karolherbst
switching to 3.0 now
<airlied>
karolherbst: though for CL3.0 there were also a lot of header file changes that I'm not sure are in llvm-13
<karolherbst>
ahh..
<airlied>
karolherbst: it might be an idea to avoid using the opencl-c.h in favour of the new thing
<airlied>
but I think the new thing is only in pretty new llvm
<Lynne>
airlied: just pulled from your radv-vulkan-video-prelim-decode branch, now everything segfaults, starting from radv_GetPhysicalDeviceVideoCapabilitiesKHR
<karolherbst>
airlied: yeah.. but we can figure out those details later
<airlied>
karolherbst: tstellar had some llvm nightly copr I think now for fedora
<karolherbst>
airlied: but that's all handled inside clc anyway, no?
<airlied>
Lynne: that seems suboptimal, let me give check it here
<airlied>
karolherbst: no opencl-c.h comes from clang
<karolherbst>
sure, but clc handles including that
<karolherbst>
I don't mean libclc
<karolherbst>
I mean src/compiler/clc
<airlied>
yes clc likely needs those fixes for 3.0 support
<airlied>
Lynne: still passes cts tests here, I should clone your work and look
ahajda has quit [Quit: Going offline, see ya! (www.adiirc.com)]
<airlied>
Lynne: I think the gstreamer folks hit the same issue with disjoint
<airlied>
"This is because the Vulkan format output by the driver’s decoder device is VK_FORMAT_G8_B8R8_2PLANE_420_UNORM, which is NV12 crammed in a single image, while for GstVulkan a NV12 frame is a buffer with two images, one per component. "
<Lynne>
hmm, I just remembered we have single-buffer specialcase for intel hardware
<airlied>
yeah intel hw definitely can't support it at all
<graphitemaster>
I'm having the strangest behavior here, glClearNamedFramebufferfv is being affected by a stale bound shader
<airlied>
Lynne: one thing radv does very wrong at the moment is DPB allocation, hoping to try and make it spec compliant
<airlied>
unfortunately it breaks the nvidia player if I do
<Lynne>
break it all the way.
<Lynne>
that mass of C++ object-orientated code deserves all it gets
<airlied>
yeah it's a horror show to even hack the fixes into it
<Lynne>
a quick ffmpeg command to test my branch: "./ffmpeg_g -init_hw_device "vulkan=vk:0,debug=1" -hwaccel vulkan -hwaccel_output_format vulkan -i test.mkv -frames:v 1 -loglevel verbose -f null -"
<Lynne>
it should autodetect vulkan on ./configure
<Lynne>
for the test video, anything remotely standard h264 should work
ybogdano has quit [Ping timeout: 480 seconds]
frankbinns has quit [Remote host closed the connection]
rasterman has quit [Quit: Gettin' stinky!]
<Lynne>
pushed a change to make it always use contiguous memory for decoding
<Lynne>
didn't keep the old repo around, so can't test -_-
rgallaispou1 has quit [Read error: Connection reset by peer]
<airlied>
Lynne: you have to pass a VIDEO_DECODE_H264_CAPABILITIES_EXT in pNext to that function I think
<karolherbst>
CL 3.0 conformance is looking good as well :)
<karolherbst>
~250 fails
rgallaispou has joined #dri-devel
<airlied>
Lynne: though I better go read the spec to confirm
<airlied>
but I think what I have passed CTS tests
<Lynne>
aaah, that
<Lynne>
the validation layer complained about it, so I removed it