#dri-devel on 2022-04-14 — irc logs at oftc.irclog.whitequark.org

2022-03-22 11:57 ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar

00:01 nchery is now known as Guest1875

00:01 nchery has joined #dri-devel

00:03 <karolherbst> wow rust even specifies true == 1

00:04 <airlied> jekstrand: functions won't start themselves :-P

00:04 * airlied started hacking it up in llvmpipe but it was too much for my brain at the time

00:05 <airlied> esp around passing implicit kernel args or context to each fn call

00:08 Guest1875 has quit [Ping timeout: 480 seconds]

00:10 <karolherbst> okay.. that CTS test is broken :)

00:13 <karolherbst> okay

00:13 khfeng has joined #dri-devel

00:13 <karolherbst> I think I will fix that image_format and image_order stuff next

00:13 <karolherbst> I just have no good idea on how to do it...

00:15 <karolherbst> I kind of hate what I've done for clover

00:15 <airlied> I also think in theory amd could do some of that on the hw side

00:15 <karolherbst> fun

00:16 <karolherbst> airlied: the thing is just, we have to map to those silly CL values

00:17 <karolherbst> something something was very ugly there

00:17 <airlied> yeah it's messy, and I'm fine with just lowering to const buffers

00:17 <airlied> since I doubt it's very used

00:18 <karolherbst> no

00:18 <karolherbst> ugly in a "llvm does magic" sense

00:18 <karolherbst> so we get magic + 0x000010d0 and stuff in the shader

00:18 <airlied> ah yeah clang magic

00:18 <karolherbst> airlied: just look at that: https://gist.githubusercontent.com/karolherbst/16a8edfc2bbb16112c8f454b70a092a6/raw/70e7b5a5aa8978cb2aba78ba7fe848b320407fe3/gistfile1.txt

00:19 <karolherbst> I don't even know why they think it was a good idea?

00:20 <airlied> karolherbst: something with spir-v as well

00:20 <karolherbst> could be

00:20 <airlied> I think spir-v defines them at 0 base

00:20 <airlied> then has to add them to get CL

00:21 <karolherbst> I don't think so

00:21 <karolherbst> I had to add a isub to make it pass tests

00:22 <karolherbst> airlied: yeah.. so OpenCL C is 0 based

00:22 <karolherbst> i think?

00:23 <karolherbst> ohh wait no.. so something adds the add, and if I push the CL value in, I have to isub the base again..

00:23 <karolherbst> right

00:23 <karolherbst> that's how it was

00:24 <karolherbst> airlied: ahh yeah.. seems like you are right

00:24 <karolherbst> spir-v is indeed 0 based

00:24 <karolherbst> annoying

00:25 <karolherbst> oh well

00:25 <karolherbst> it's constant folded anyway
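
The base offset discussed above can be sketched in C. SPIR-V numbers image channel orders and channel data types from 0, while the OpenCL API enums start at CL_R (0x10B0) and CL_SNORM_INT8 (0x10D0); the latter matches the 0x000010d0 constant seen in the shader. The helper names are hypothetical and only illustrate the add/isub pair, not the translator's or Mesa's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* First channel-order and channel-data-type values from the OpenCL headers. */
#define CL_R          0x10B0
#define CL_SNORM_INT8 0x10D0

/* Hypothetical helpers: SPIR-V counts from 0, the CL API from the bases above. */
static uint32_t spirv_order_to_cl(uint32_t spirv_order)
{
    return spirv_order + CL_R;          /* the "add" inserted by the translator */
}

static uint32_t cl_order_to_spirv(uint32_t cl_order)
{
    assert(cl_order >= CL_R);
    return cl_order - CL_R;             /* the "isub" needed when pushing CL values */
}

static uint32_t spirv_type_to_cl(uint32_t spirv_type)
{
    return spirv_type + CL_SNORM_INT8;
}

static uint32_t cl_type_to_spirv(uint32_t cl_type)
{
    assert(cl_type >= CL_SNORM_INT8);
    return cl_type - CL_SNORM_INT8;
}
```

Since both bases are compile-time constants, an add followed by the matching subtract folds away, which is the point made in the last message.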

00:26 <karolherbst> I will sleep over it and come up with a good solution

00:28 mclasen has quit [Ping timeout: 480 seconds]

00:30 <airlied> karolherbst: I think the conversion code is in the translator

00:33 <airlied> SPIRVToOCLBase::visitCallSPIRVImageQueryBuiltIn

00:34 <airlied> OCLToSPIRVBase::visitCallGetImageChannel

00:37 co1umbarius has joined #dri-devel

00:38 columbarius has quit [Ping timeout: 480 seconds]

00:48 <karolherbst> yeah.. but it does seem to do the correct thing given the spir-v spec

00:56 Ristovski has quit [Ping timeout: 480 seconds]

00:58 tales__ has quit []

00:58 tales__ has joined #dri-devel

00:59 Ristovski has joined #dri-devel

01:01 tales__ has left #dri-devel [#dri-devel]

01:04 <jekstrand> airlied: I'm starting to think that functions may be something I need to start on sooner rather than later.

01:04 <jekstrand> Because if I don't, someone else will, badly, and then I'll get to clean up the mess. :-/

01:04 <jekstrand> Or maybe that's just too pessimistic of me.

01:04 <jekstrand> idk

01:05 <jekstrand> The first step, though, is to fix various optimization passes so they're valid to run pre-lowering. Lots of stuff assumes everything is inlined.

01:05 <jekstrand> Even if it has nothing to do with functions, lots of passes make implicit assumptions the author didn't realize.

01:06 <jekstrand> Maybe I should dig into that with the Intel compiler and see where it goes.

01:06 <jekstrand> I'll probably stick with compute for now. Other stages have lots of I/O which makes assuming inlining is really convenient.

01:07 tales-aparecida has joined #dri-devel

01:07 <icecream95> jekstrand: r u saying that my implementation of functions is going to be bad??

01:07 <jekstrand> icecream95: Not you in particular. I just tend to assume that people will forget things or not know how NIR works somewhere or not realize that they're running optimizations that aren't function-safe or whatever.

01:07 <jekstrand> It's not an insult.

01:08 <icecream95> jekstrand: I'm going to try to keep to the backend side, so hopefully will not make too many conflicting changes to NIR passes

01:08 <jekstrand> Yeah, the back-end is where a lot of the pain is.

01:09 <jekstrand> Retrofitting the Intel back-end will be... entertaining.

01:09 <jekstrand> Not especially looking forward to that TBH

01:09 <jekstrand> But I should be able to pull something off.

01:10 <icecream95> But first I need to bisect an issue around shadow samplers..

01:14 <icecream95> Let me guess.. GL_CLAMP emulation?

01:15 <karolherbst> icecream95: shadow samplers?

01:15 <icecream95> karolherbst: This isn't CL

01:15 <karolherbst> ahh

01:16 <karolherbst> jekstrand: I am wondering if we should just emit two arrays for the format+order thing at index it with the image index...

01:16 <karolherbst> *and

01:16 <karolherbst> I don't really like the approach I had for clover, where I was adding uniform variables for each image

01:23 <icecream95> zmike: lower_tex_to_txd seems to miss a number of fields that should be copied, such as is_shadow

01:24 <zmike> better copy them then

01:30 <zmike> the txb one probably does too

01:30 <airlied> jekstrand: I did have it pass functions through to llvmpipe and not have nir explode

01:32 <airlied> https://gitlab.freedesktop.org/airlied/mesa/-/commits/llvmpipe-cl-funcs/

01:34 <airlied> jekstrand: only 3-4 spirv/nir patches to set things up, it at least passed some basic tests :-P

01:38 <karolherbst> airlied: I'd throw luxmark at it, but llvmpipe kind of crashes with it :(

01:39 <airlied> that branch does the wrong thing in lots of places before that :)

01:39 <karolherbst> :D

01:39 <airlied> mixing llvmpipe simd flow control with LLVM flow control is tricky

01:39 <airlied> got to at least pass the current exec_mask into all functions

01:40 <karolherbst> something something is very wrong with buffers and I have no clue what

01:40 <airlied> then I went down the hole of passing in lots of things into every function

01:40 <karolherbst> ohh I have an idea what's broken

01:40 <airlied> and then I think I was like wtf global variables could save me

01:40 <airlied> then I realised I had other problems with global vars so just ran away

01:41 <karolherbst> oops

01:42 <karolherbst> airlied: https://i.imgur.com/Rdzoo8K.png :(

01:44 <karolherbst> I hope it has nothing to do with alignment

01:46 <karolherbst> airlied: what's that "Treating load_kernel_arg in control flow as uniform, results may be incorrect" message btw?

01:47 <karolherbst> ohhh

01:47 <airlied> load_kernel_arg expects the argument to be uniform across the wave

01:47 <karolherbst> yeah mhh

01:48 <airlied> and only picks the first active wave to use to load it
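
A toy model of the behaviour described above, written as plain C rather than llvmpipe code: a value that is assumed uniform is read from the first active lane of the execution mask and broadcast to every lane. The lane count, the mask layout and the function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 8

/* Index of the lowest set bit in exec_mask; the mask must not be empty. */
static unsigned first_active_lane(uint32_t exec_mask)
{
    assert(exec_mask != 0);
    unsigned lane = 0;
    while (!(exec_mask & (1u << lane)))
        lane++;
    return lane;
}

/* Treat a per-lane value as uniform: take the first active lane's copy
 * and broadcast it. If the lanes really differed, the result is wrong
 * for every lane except that one, hence the warning. */
static void load_as_uniform(const uint32_t per_lane[LANES], uint32_t exec_mask,
                            uint32_t out[LANES])
{
    uint32_t value = per_lane[first_active_lane(exec_mask)];
    for (unsigned i = 0; i < LANES; i++)
        out[i] = value;
}
```

With a constant offset such as the load_const 0x20 mentioned a few lines later, every lane holds the same value, so picking the first active lane is harmless.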

01:48 <airlied> not sure if kernel arguments can be dynamically indexed, don't think you can

01:48 <karolherbst> yeah.. no idea

01:48 <karolherbst> I will take a look at the test hitting this

01:49 <karolherbst> okay, it's a load_const (0x00000020 = 0.000000) :)

01:50 <airlied> yeah so that should be fine

01:50 * airlied isn't sure if you can have an array of kernel arguments :-P

01:50 <karolherbst> you can

01:50 <karolherbst> so, arrays are of course illegal as kernel args

01:50 <karolherbst> but structs aren't

01:51 <karolherbst> so you just wrap your array with a struct and indirectly access it
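
The struct trick can be illustrated with a plain C analogue. This is not OpenCL C (the `__kernel` qualifier is left out so it compiles as ordinary C), and the type and function names are hypothetical: an array cannot be passed by value, but a struct containing one can, and the callee may then index it with a runtime value.

```c
#include <assert.h>
#include <stddef.h>

#define WRAPPED_LEN 4

/* Hypothetical wrapper: the array travels by value inside the struct. */
struct wrapped_array {
    int values[WRAPPED_LEN];
};

/* Stand-in for a kernel body; idx plays the role of a runtime-computed,
 * non-constant index into the by-value argument. */
static int read_indirect(struct wrapped_array arg, size_t idx)
{
    assert(idx < WRAPPED_LEN);
    return arg.values[idx];
}
```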

01:52 sadlerap1 has quit []

01:52 <karolherbst> but "allocations image2d_write" and "allocations image2d_read" fail and this worries me a bit

01:52 sadlerap has joined #dri-devel

01:53 <airlied> dcbaker: 3bbd404457e6e3278afd78f6721be9e174c6b777 still seems to be missing from 22.0 staging

01:54 <karolherbst> but that test is a little insane

01:54 <karolherbst> "Pixel 9440, 5, component 0, expected 47200, got 3873333467."

01:54 <karolherbst> it's always failing at that one pixel though

01:54 <karolherbst> maybe we overflow somewhere?

02:03 ngcortes has quit [Ping timeout: 480 seconds]

02:05 OftenTimeConsuming has quit [Remote host closed the connection]

02:06 OftenTimeConsuming has joined #dri-devel

02:08 rkanwal has quit [Ping timeout: 480 seconds]

02:23 <karolherbst> "kernel compilation time: 175809ms"

02:23 <karolherbst> I think we can do better

02:26 ybogdano has quit [Ping timeout: 480 seconds]

03:06 aravind has joined #dri-devel

03:06 Daanct12 has joined #dri-devel

03:14 mbrost has joined #dri-devel

03:15 mbrost has quit []

03:27 h0tc0d3 has quit [Remote host closed the connection]

03:29 Daanct12 has quit [Quit: Leaving]

03:29 mbrost has joined #dri-devel

03:30 Daanct12 has joined #dri-devel

03:30 h0tc0d3 has joined #dri-devel

03:31 Daanct12 has quit [Remote host closed the connection]

03:32 Daanct12 has joined #dri-devel

03:33 Daanct12 has quit []

03:35 Daanct12 has joined #dri-devel

03:36 Daanct12 has quit [Remote host closed the connection]

03:38 <jekstrand> airlied: I'm less worried about exploding than I am about silently and subtly optimizing something wrong and you have no clue why.

03:38 <jekstrand> airlied: Lots of stuff doesn't actually think through global variables very well, for instance.

03:39 <jekstrand> We tend to assume they're just like locals (global as in nir_var_shader_temp)

03:40 <jekstrand> We also used to have a bunch of metadata problems where it would or wouldn't get invalidated on a per-shader basis when it should have been per-function. I think that one's mostly sorted now.

03:46 <airlied> jekstrand: hopefully a full cl CTS would show up any major insanity, but sounds like a lot of auditing

03:46 <jekstrand> airlied: What CTS? OpenCL? Nah. it's not that complex.

03:46 <jekstrand> I think some of it will be shown by optimizing libclc more before we inline it all.

03:47 <jekstrand> And successfully running luxmark would give me some confidence.

03:47 <jekstrand> The Vulkan CTS might have enough function stuff going on; not sure.

03:47 <jekstrand> It either doesn't use functions at all or uses them for lots of stupid.

03:48 <Sachiel> the graphicsfuzz tests have plenty of those

03:48 mbrost has quit [Remote host closed the connection]

03:49 mbrost has joined #dri-devel

03:53 Company has quit [Quit: Leaving]

03:54 ppascher has joined #dri-devel

03:58 <jekstrand> Yeah, I think if I could run the full Vulkan CTS with zero inlining for compute shaders, I'd have a reasonable level of confidence that it was working.

04:01 neonking has quit [Ping timeout: 480 seconds]

04:04 Daanct12 has joined #dri-devel

04:05 tales-aparecida has quit [Remote host closed the connection]

04:09 shankaru has joined #dri-devel

04:20 tales_ has quit []

04:23 mbrost has quit [Remote host closed the connection]

04:23 mbrost has joined #dri-devel

04:32 Duke`` has joined #dri-devel

04:32 mbrost has quit [Read error: Connection reset by peer]

04:35 mhenning has quit [Remote host closed the connection]

05:02 Administrator has joined #dri-devel

05:05 Administrator has quit [Remote host closed the connection]

05:07 <airlied> robclark: can you put a small bit more summary info in fixes pull requests :-)

05:12 rgallaispou1 has joined #dri-devel

05:12 <robclark> airlied: last -fixes is "fail less at system suspend plus misc small fixes"?

05:14 <airlied> cool, I just stuck in some guess work anyways :-P

05:16 <robclark> system suspend is, tbh, something I wonder about with other drivers.. we've been finding some fun corner cases that we wouldn't have seen without umm.. crowd sourced debugging (ie. digging through crash reports from field)

05:17 rgallaispou has quit [Ping timeout: 480 seconds]

05:17 rgallaispou1 has quit [Read error: Connection reset by peer]

05:17 rgallaispou has joined #dri-devel

05:26 lemonzest has joined #dri-devel

05:32 OftenTimeConsuming is now known as Guest1894

05:32 sdutt has quit [Read error: Connection reset by peer]

05:32 Guest1894 has quit [Remote host closed the connection]

05:32 OftenTimeConsuming has joined #dri-devel

05:34 Daanct12 has quit [Ping timeout: 480 seconds]

05:36 mszyprow has joined #dri-devel

05:41 shankaru has quit [Quit: Leaving.]

05:43 shankaru has joined #dri-devel

05:45 mszyprow has quit [Ping timeout: 480 seconds]

05:56 pnowack has joined #dri-devel

05:56 pnowack has quit [Remote host closed the connection]

05:57 pnowack has joined #dri-devel

06:02 mszyprow has joined #dri-devel

06:04 flto_ has joined #dri-devel

06:09 flto has quit [Ping timeout: 480 seconds]

06:11 flto has joined #dri-devel

06:15 flto_ has quit [Ping timeout: 480 seconds]

06:16 danvet has joined #dri-devel

06:26 Duke`` has quit [Ping timeout: 480 seconds]

06:31 sumoon has joined #dri-devel

06:36 paulk1 has quit [Ping timeout: 480 seconds]

06:49 frieder has joined #dri-devel

06:51 abhinav__ has quit [Quit: The Lounge - https://thelounge.chat]

06:51 jessica_24 has quit [Quit: The Lounge - https://thelounge.chat]

06:51 jessica_240 has quit []

06:52 jessica_24 has joined #dri-devel

06:58 mvlad has joined #dri-devel

07:03 neonking has joined #dri-devel

07:04 paulk1 has joined #dri-devel

07:09 maxzor has joined #dri-devel

07:16 fxkamd has quit []

07:19 jkrzyszt has joined #dri-devel

07:32 camus has joined #dri-devel

07:36 danvet has quit [Ping timeout: 480 seconds]

07:41 eukara has quit []

07:41 eukara has joined #dri-devel

07:43 oneforall2 has quit [Ping timeout: 480 seconds]

07:43 nchery has quit [Read error: Connection reset by peer]

07:49 lynxeye has joined #dri-devel

07:52 Haaninjo has joined #dri-devel

08:07 jewins has quit [Ping timeout: 480 seconds]

08:10 ppascher has quit [Ping timeout: 480 seconds]

08:30 vyivel has quit [Remote host closed the connection]

08:30 bl4ckb0ne has quit [Remote host closed the connection]

08:30 emersion has quit [Remote host closed the connection]

08:33 bl4ckb0ne has joined #dri-devel

08:33 emersion has joined #dri-devel

08:36 vyivel has joined #dri-devel

08:48 tobiasjakobi has joined #dri-devel

08:48 tobiasjakobi has quit []

09:00 pnowack has quit [Quit: pnowack]

09:24 jljusten has quit [Quit: WeeChat 3.4]

09:28 jljusten has joined #dri-devel

09:31 pnowack has joined #dri-devel

09:36 jkrzyszt has quit [Remote host closed the connection]

09:38 maxzor has quit [Ping timeout: 480 seconds]

09:43 dliviu has joined #dri-devel

09:43 pallavim has joined #dri-devel

09:46 nashpa has quit [Ping timeout: 480 seconds]

10:08 icecream95 has quit [Ping timeout: 480 seconds]

10:10 natto has quit [Ping timeout: 480 seconds]

10:18 mclasen has joined #dri-devel

10:21 jkrzyszt has joined #dri-devel

10:37 flacks has quit [Quit: Quitter]

10:40 flacks has joined #dri-devel

10:51 <karolherbst> dcbaker: I triggered an annoying bug. If rustc gets updated on the system in the meantime, meson doesn't recompile stuff. Not sure if I reported it in the past or not

10:53 <karolherbst> jekstrand: https://blog.rust-lang.org/2022/04/07/Rust-1.60.0.html#stabilized-apis there is some nice stuff un it :)

10:53 <karolherbst> *in

10:53 <karolherbst> like Arc::new_cyclic

10:54 <karolherbst> not sure if we still need it though?

11:10 rkanwal has joined #dri-devel

11:20 nchery has joined #dri-devel

11:34 Company has joined #dri-devel

11:34 ROw has joined #dri-devel

11:37 SR_71 has quit [Ping timeout: 480 seconds]

11:40 shankaru has quit [Quit: Leaving.]

11:49 mclasen has quit [Ping timeout: 480 seconds]

11:56 <jekstrand> karolherbst: ! I like! That's exactly what I wanted.

11:56 neonking has quit [Ping timeout: 480 seconds]

11:57 natto has joined #dri-devel

12:03 mclasen has joined #dri-devel

12:12 pcercuei has joined #dri-devel

12:14 * jekstrand starts a CL CTS run on panfrost

12:14 <jekstrand> Without clear_buffer or clear_texture, it's gonna be a bit busted but better than nothing, I guess.

12:26 karolherbst has quit [Ping timeout: 480 seconds]

12:27 karolherbst has joined #dri-devel

12:37 LexSfX has quit [Remote host closed the connection]

12:38 LexSfX has joined #dri-devel

12:44 neonking has joined #dri-devel

12:46 <karolherbst> jekstrand: we kind of need a better solution for those static inlines :(

12:46 <jekstrand> karolherbst: context?

12:46 <karolherbst> Also I kind of plan to port over to rust 2018, just don't know if I want to fix up history or do one mega commit

12:46 <karolherbst> jekstrand: like bindgen won't generate bindings for static inline functions

12:47 <jekstrand> karolherbst: right

12:50 ppascher has joined #dri-devel

12:50 Haaninjo has quit [Read error: Connection reset by peer]

12:51 Haaninjo has joined #dri-devel

12:52 Jasprose has joined #dri-devel

12:52 shankaru has joined #dri-devel

12:53 <dv_> does anyone know if it is valid to dup() the FD of an open dma-heap device node?

12:56 <karolherbst> jekstrand: I am seriously thinking about just emitting two u16 arrays for format+order and always put those into the input buffer.. because handling indirects any other way would be brutal

12:57 <jekstrand> karolherbst: I guess. I didn't think the input-per-image was too bad

12:57 <jekstrand> Depends on when you want to do the lowering, I guess.

12:57 <karolherbst> yeah, it's not, it just gets complicated in terms of DCE and what if you have an indirect

12:58 <karolherbst> anyway.. it's also just 32 bits per image argument
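
A minimal sketch of the two-array idea, assuming a fixed upper bound on image arguments and hypothetical names throughout; it is not the rusticl implementation. Each image contributes one u16 format and one u16 order, so 32 bits per image argument, and a query with an indirect image index becomes an ordinary array load.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_IMAGES 8   /* assumed bound, for illustration only */

/* Hypothetical layout of the extra data placed in the kernel input buffer. */
struct image_info_tables {
    uint16_t format[MAX_IMAGES];  /* channel data type per image */
    uint16_t order[MAX_IMAGES];   /* channel order per image */
};

static void set_image_info(struct image_info_tables *t, unsigned image_index,
                           uint16_t format, uint16_t order)
{
    assert(image_index < MAX_IMAGES);
    t->format[image_index] = format;
    t->order[image_index] = order;
}

/* The index may be a runtime value; no per-image variable is needed. */
static uint16_t query_format(const struct image_info_tables *t, unsigned image_index)
{
    assert(image_index < MAX_IMAGES);
    return t->format[image_index];
}

static uint16_t query_order(const struct image_info_tables *t, unsigned image_index)
{
    assert(image_index < MAX_IMAGES);
    return t->order[image_index];
}
```

The CL enum values (for example CL_RGBA = 0x10B5 and CL_UNORM_INT8 = 0x10D2) fit in 16 bits, which is what makes the u16 element type workable.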

13:10 * jekstrand decides to let Fedora download LLVM debug symbols this time

13:11 <karolherbst> big mistake :P

13:11 <karolherbst> nir_load_deref(nir_build_deref_array(nir_load_var)) is what I need to do, right?

13:11 <jekstrand> yup

13:13 <karolherbst> ehh s/nir_load_var/nir_build_deref_var/

13:13 <jekstrand> yeah

13:16 paulk1 has quit []

13:19 <karolherbst> nnoooo.. those #undefs in nir_builder.h are killing my wrapper :D

13:20 <jekstrand> Why are you trying to write NIR passes in Rust?

13:20 <jekstrand> You're asking for pain

13:20 <karolherbst> because I am not doing much

13:22 <karolherbst> but yeah.. maybe I should move the pass into C code and see how I deal with sharing data

13:28 <jekstrand> *sigh* Who thought vec4 for compute was a good idea? Apparently, Arm did...

13:28 <imirkin> it's 4x faster

13:28 <karolherbst> :) it's 4 times as fast as scalar, everybody nows that

13:28 <imirkin> lol

13:28 <karolherbst> *knows

13:28 <imirkin> and two people can't both be wrong

13:29 <karolherbst> :D

13:29 mszyprow has quit [Ping timeout: 480 seconds]

13:30 <jekstrand> The annoying thing is that, if we want this mess to actually work, we either need to teach the bifrost compiler vec8 and vec16 or we need to make nir_lower_alu_to_scalar a generic narrowing pass that takes a maximum width or a callback or something.

13:30 Net147 has quit [Quit: Quit]

13:31 Net147 has joined #dri-devel

13:33 jewins has joined #dri-devel

13:34 <karolherbst> jekstrand: I'd guess the latter is better

13:34 <jekstrand> karolherbst: Yeah, I'm looking into that.

13:36 <jekstrand> karolherbst: I don't think it should be that hard to make it narrow instead of always scalarize

13:36 <karolherbst> shouldn't
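
The difference between scalarizing and narrowing comes down to how a wide vector is cut up. The sketch below is a standalone illustration with a hypothetical helper, not NIR code: it splits a vector of `num_components` into pieces of at most `max_width` components, and `max_width == 1` gives plain scalarization.

```c
#include <assert.h>

/* Hypothetical helper: write the width of each piece to widths[] and
 * return the number of pieces. */
static unsigned split_vector_width(unsigned num_components, unsigned max_width,
                                   unsigned widths[], unsigned max_pieces)
{
    assert(max_width > 0);
    unsigned count = 0;
    while (num_components > 0) {
        /* Take a full-width piece while possible, then the remainder. */
        unsigned w = num_components < max_width ? num_components : max_width;
        assert(count < max_pieces);
        widths[count++] = w;
        num_components -= w;
    }
    return count;
}
```

Under this model a vec16 operation on a vec4 backend becomes four vec4 operations instead of sixteen scalar ones.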

14:01 maxzor has joined #dri-devel

14:03 sdutt has joined #dri-devel

14:07 fxkamd has joined #dri-devel

14:21 tlwoerner has quit [Read error: Connection reset by peer]

14:27 flto has quit [Ping timeout: 480 seconds]

14:33 ella-0 has joined #dri-devel

14:33 <karolherbst> soo.. kernel side is all done for format and order :) now I just have to upload the values

14:34 flto has joined #dri-devel

14:36 ella-0_ has quit [Remote host closed the connection]

14:39 alyssa has joined #dri-devel

14:40 <alyssa> I find myself wanting to write developer docs for panfrost

14:40 <alyssa> The obvious place is https://docs.mesa3d.org/drivers/panfrost.html but I don't know how appropriate it is,

14:40 <alyssa> the info there is all targeted at end users

14:40 <alyssa> (How to build and run panfrost, not how to hack on it)

14:44 <karolherbst> https://gist.githubusercontent.com/karolherbst/b384131a6596f6077e51b1bcb27e3592/raw/febd9cc7057e12ca430c633308033bee98f7b485/gistfile1.txt :)

14:44 <alyssa> karolherbst: u64vec4x0a32B

14:44 <alyssa> the heck?

14:44 <karolherbst> yes...

14:44 <karolherbst> soo

14:44 <karolherbst> that's explicit stride stuff or something

14:45 <karolherbst> I thought it was some memory corruption somewhere, but Jason said that's how it's supposed to look

14:48 <karolherbst> PASSED 42 of 42 sub-tests. :)

14:50 <karolherbst> jenatali: do you have any data on what image formats/types applications are most interested in?

14:51 <karolherbst> uhhh "Returned array size did not validate (expected 53, got 0)" :(

14:52 <jenatali> karolherbst: no, not really. I'd assume the normal 8bpc unorm stuff

14:53 <karolherbst> yeah... I mean. CL already specifies what's required, I am just wondering what I should care about on top from the start

14:53 <karolherbst> and the 8bpc unorm stuff is already included in that afaik

14:53 <karolherbst> but like.. only CL_R, not CL_A

14:54 <karolherbst> although as long as stuff passes I can also just expose as much as possible

14:55 <jenatali> Yeah I doubt apps really care for much beyond the required

14:55 <karolherbst> I just don't have a nice way of declaring the CL -> pipe mappings, so every combination is a new entry :(

14:55 <jenatali> My read on how CL was designed was that the speccers just looked at what they *could* do without asking what people want

14:55 <karolherbst> ohh that's for sure

14:55 <jenatali> How else do you end up with CL2.x that nobody uses

14:55 <karolherbst> but some might want 2 channel images

14:56 <karolherbst> which are purely optional

14:56 <karolherbst> anyway, if you don't have any data on that, then I guess we have to see what people complain about :)

14:56 <jenatali> Oh didn't realize. I just hooked up whatever D3D supports, which covered all the required, I didn't look at which ones of them were optional

14:57 <karolherbst> yeah right..

14:57 mclasen has quit []

14:57 <karolherbst> in C you can also just do loops and macro magic

14:57 <karolherbst> rust macros can't create new tokens :(

14:57 <karolherbst> so you can't just concat names
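
For comparison, this is roughly what the C "macro magic" looks like: the preprocessor's `##` operator pastes tokens together, so one macro can derive all three identifiers of a mapping entry from an order and a type. The enums below are hypothetical stand-ins, not the real CL or gallium definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the real CL or gallium enums. */
enum demo_order { DEMO_CL_R, DEMO_CL_RGBA };
enum demo_type { DEMO_CL_UNORM_INT8, DEMO_CL_SNORM_INT8 };
enum demo_pipe_format {
    DEMO_PIPE_FORMAT_NONE,
    DEMO_PIPE_FORMAT_R_UNORM_INT8,
    DEMO_PIPE_FORMAT_R_SNORM_INT8,
    DEMO_PIPE_FORMAT_RGBA_UNORM_INT8,
    DEMO_PIPE_FORMAT_RGBA_SNORM_INT8,
};

struct demo_mapping {
    enum demo_order order;
    enum demo_type type;
    enum demo_pipe_format pipe;
};

/* ## concatenates tokens, so ENTRY(R, UNORM_INT8) names DEMO_CL_R,
 * DEMO_CL_UNORM_INT8 and DEMO_PIPE_FORMAT_R_UNORM_INT8. */
#define ENTRY(order, type) \
    { DEMO_CL_##order, DEMO_CL_##type, DEMO_PIPE_FORMAT_##order##_##type }

static const struct demo_mapping demo_table[] = {
    ENTRY(R, UNORM_INT8),
    ENTRY(R, SNORM_INT8),
    ENTRY(RGBA, UNORM_INT8),
    ENTRY(RGBA, SNORM_INT8),
};

static_assert(sizeof(demo_table) / sizeof(demo_table[0]) == 4,
              "one entry per order/type pair");

static enum demo_pipe_format demo_lookup(enum demo_order order, enum demo_type type)
{
    for (size_t i = 0; i < sizeof(demo_table) / sizeof(demo_table[0]); i++) {
        if (demo_table[i].order == order && demo_table[i].type == type)
            return demo_table[i].pipe;
    }
    return DEMO_PIPE_FORMAT_NONE;
}
```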

14:57 mclasen has joined #dri-devel

14:58 <karolherbst> I did this for clover at some point: https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/gallium/frontends/clover/core/format.cpp#L29

14:58 <linkmauve> std::concat_idents!() is annoyingly nightly-only, but there is the concat-idents crate which makes it usable on stable.

14:58 <karolherbst> linkmauve: right.. which comes back to the issue that we can't use external crates with meson yet :)

14:59 <karolherbst> so I just ignore those issues unless it's important

14:59 <linkmauve> For now you could perhaps copy its code?

14:59 <karolherbst> it's not as simple

15:00 <karolherbst> although meson does support proc macros now I think :D

15:00 <linkmauve> Right, it depends on syn.

15:00 soreau has quit [Read error: No route to host]

15:00 khfeng has quit [Ping timeout: 480 seconds]

15:00 <karolherbst> proc macro support will help us with some stuff though

15:01 <karolherbst> maybe I hack something up

15:02 soreau has joined #dri-devel

15:05 <karolherbst> jekstrand: I think image_size is busted for array images :( maybe I do something wrongly, but it does work for read images.. mhh

15:08 pjakobsson has quit [Remote host closed the connection]

15:11 pjakobsson has joined #dri-devel

15:12 anarsoul has quit [Ping timeout: 480 seconds]

15:12 anarsoul has joined #dri-devel

15:14 <karolherbst> mhhhhh

15:14 <karolherbst> vec2 32 con ssa_3 = intrinsic image_size (ssa_0, ssa_0) (image_dim=1D /*0*/, image_array=true /*1*/, format=none /*0*/, access=8)

15:14 <karolherbst> vec2 64 con ssa_4 = intrinsic image_size (ssa_0, ssa_0) (image_dim=1D /*0*/, image_array=true /*1*/, format=none /*0*/, access=8)

15:16 <jekstrand> karolherbst: Do we have a pass for scalarizing I/O? nir_intrinsic_load_global and friends?

15:16 <karolherbst> uhm...

15:16 <karolherbst> I think so...

15:17 <karolherbst> if not in tree, there should be an MR

15:17 <karolherbst> I think I saw something at some point somewhere

15:19 <karolherbst> jekstrand: nir_opt_load_store_vectorize.c?

15:19 <karolherbst> ehh wait

15:19 <karolherbst> scalarizing, not vectorizing

15:19 <karolherbst> :(

15:19 <jekstrand> karolherbst: Yeah, that one might be able to scalarize too. I can't remember.

15:20 <karolherbst> jekstrand: nir_lower_io_to_scalar.c

15:20 shankaru has quit []

15:21 <karolherbst> but I guess it only does input atm

15:21 <karolherbst> I doubt it's hard to add support for global there

15:21 * jekstrand will type something mali-specific for now

15:22 <karolherbst> nooo.. I broke stuff :(

15:22 HankB__ has quit [Remote host closed the connection]

15:23 HankB__ has joined #dri-devel

15:29 sdutt has quit []

15:29 sdutt has joined #dri-devel

15:31 jkrzyszt_ has joined #dri-devel

15:33 ybogdano has joined #dri-devel

15:34 <karolherbst> ehh.. crap

15:35 <karolherbst> we have separate numbering for readonly and writeonly images

15:37 jkrzyszt has quit [Ping timeout: 480 seconds]

15:47 mclasen has quit [Ping timeout: 480 seconds]

15:48 kmn has quit [Quit: Leaving.]

15:49 Duke`` has joined #dri-devel

15:52 mbrost has joined #dri-devel

15:57 ybogdano has quit [Ping timeout: 480 seconds]

15:58 <jekstrand> uh... Why am I getting 64-bit immediates in this shader?!?

15:58 <karolherbst> jekstrand: you don't want them?

15:59 <jekstrand> The panfrost compiler doesn't seem to think so. (-:

15:59 <karolherbst> well that's just sad

15:59 <karolherbst> the hw is 64 bit though, no?

15:59 <jekstrand> The panfrost compiler also thinks it's lowering 64-bit stuff away and that's clearly not happening. :-/

16:00 <jekstrand> Ooh, because I added it! Drp.

16:00 <karolherbst> :D

16:00 * jekstrand needs to lower harder

16:00 <karolherbst> I think I overengineered again

16:01 <karolherbst> "An image type cannot be used to declare a variable, a structure or union field, an array of images, a pointer to an image, or the return type of a function."

16:01 <karolherbst> that makes things simple

16:01 <jekstrand> :)

16:01 Peuc has joined #dri-devel

16:02 <karolherbst> I still keep the array as this makes it easier in rusticl, but still :D I wanted to figure out how to properly solve the issue of indirects at readonly and writeonly images, but guess the spec solves that for me

16:06 <alyssa> jekstrand: sounds like you're having fun

16:07 <jekstrand> alyssa: more or less. :)

16:07 <karolherbst> ehh

16:07 <karolherbst> I think image_deref_format lowering is broken

16:07 lemonzest has quit [Quit: WeeChat 3.4]

16:07 <karolherbst> ehh maybe not

16:07 nchery has quit [Ping timeout: 480 seconds]

16:07 <karolherbst> I can rely on the access thing, no?

16:12 <jekstrand> I've got test_buffers buffer_map_write_float not dying, but it fails random test. :-/

16:13 <karolherbst> mhhh

16:13 <karolherbst> let me check what was a good test to start all of this

16:14 abhinav__ has joined #dri-devel

16:14 <karolherbst> jekstrand: does allocations buffer work?

16:15 <jekstrand> It's definitely allocating successfully

16:15 <karolherbst> I think that one also maps and verifies content

16:15 <karolherbst> it's doing weird things, but it's not as huge as the buffers tests

16:16 * karolherbst kicks off another CTS run

16:17 <karolherbst> alyssa: I got to say, that's one of the most fun projects I've been working on for quite some time :D

16:19 ybogdano has joined #dri-devel

16:20 <jekstrand> karolherbst: Maybe it is mapping fail. I'm seeing all zeros

16:20 <jekstrand> And it doesn't seem fully deterministic

16:20 <jekstrand> Unless someone's flushing these here denorms. :-/

16:21 * jekstrand tries the int test

16:22 <jekstrand> Yeah, either something's going really wrong launching my kernel or this map is bad.

16:22 <jekstrand> But not all my maps are bad

16:22 <jekstrand> But I don't know panfrost well enough to know which to distrust more. :-(

16:23 Jasprose has quit [Remote host closed the connection]

16:26 lemonzest has joined #dri-devel

16:27 <jekstrand> karolherbst: test_allocations buffer fails :-/

16:28 nchery has joined #dri-devel

16:32 <jekstrand> Oh, test_allocations buffer is hanging. Or, at least, timing out.

16:33 <jekstrand> I guess that poor little GPU doesn't want to checksum 512MB that fast.

16:36 anholt has quit [Remote host closed the connection]

16:40 mclasen has joined #dri-devel

16:43 <kisak> dcbaker: if you have a spare moment, can you check that the (staging/)mesa 22.1 branch went live, and the 22.1-branchpoint tag?

16:44 <dcbaker> kisak: they haven't yet, I ran out of time waiting for marge to merge the version bump last night so I'm making them right now

16:44 <karolherbst> jekstrand: mhh, could be some flushing issue

16:44 <karolherbst> or fencing

16:45 <kisak> thanks, I was just checking if it fell into the abyss by accident

16:45 <dcbaker> just the CI abyss :)

16:45 <karolherbst> "Pass 2122 Fails 24 Crashes 30 Timeouts 0" :)

16:45 <alyssa> :D

16:45 <alyssa> iris?

16:46 <kisak> over here, I completely missed that llvm 14.0.0 was released until there was 14.0.1 news

16:46 <jekstrand> karolherbst: I typed up a u_default_clear_buffer() helper for panfrost. Maybe we should do that for iris too?

16:46 <karolherbst> alyssa: yes

16:46 <karolherbst> jekstrand: maybe

16:46 <karolherbst> jekstrand: soo.. image_size is somewhat broken with iris

16:46 <jekstrand> karolherbst: It's identical to buffer_subdata except with the repeat
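
"buffer_subdata with the repeat" can be modelled on plain memory as follows. This is a sketch under assumptions, not the u_default_clear_buffer() helper itself: the function name is hypothetical, it works on a CPU pointer instead of a pipe resource, and it requires the size to be a multiple of the pattern size.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model: fill size bytes at dst by repeating a pattern of
 * pattern_size bytes. A single copy of the pattern would be the plain
 * subdata case; the loop is the "repeat". */
static void clear_with_pattern(unsigned char *dst, size_t size,
                               const void *pattern, size_t pattern_size)
{
    assert(pattern_size > 0);
    assert(size % pattern_size == 0);
    for (size_t offset = 0; offset < size; offset += pattern_size)
        memcpy(dst + offset, pattern, pattern_size);
}
```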

16:47 <jekstrand> karolherbst: That's entirely possible.

16:47 <jekstrand> karolherbst: I did think it worked, though.

16:47 <karolherbst> but only for 1d and 2d arrays

16:47 <karolherbst> and only for images

16:47 <karolherbst> those tests pass on llvmpipe

16:47 <jekstrand> hrm

16:47 anholt has joined #dri-devel

16:49 <dcbaker> kisak: branches and tags are up, the release is cutting right now

16:50 <dcbaker> karolherbst: that's an interesting issue... we don't track that explicitly for C-like languages either, I suspect that as a side effect that a major gcc bump also changes some headers so ninja decides to rebuild everything because the headers have changed...

16:50 <karolherbst> jekstrand: heh.. maybe I messed up.. on ADL-S it asserts

16:50 <karolherbst> dcbaker: potentially

16:51 <karolherbst> but gcc doesn't have this strong version check on object files

16:51 <jekstrand> modulo CL/Vulkan/GL differences, image_size should work. Vulkan tests it.

16:51 <karolherbst> rustc bails if a dep is compiled with a different compiler

16:51 <karolherbst> or well.. different version at least

16:52 <karolherbst> I see a crash inside brw_nir_clamp_image_1d_2d_array_sizes here

16:52 <karolherbst> I am sure I messed it up for good

16:54 Haaninjo has quit [Ping timeout: 480 seconds]

16:56 Haaninjo has joined #dri-devel

16:57 frieder has quit [Remote host closed the connection]

16:58 <karolherbst> jekstrand: what is a bit odd is that I get two image_size ops, one 32 and the other 64 bit

16:58 <jekstrand> that is odd

16:58 <jekstrand> Why is there a 64-bit one?

16:58 <karolherbst> because of the spir-v

16:59 <karolherbst> jekstrand: ahh.. I know

16:59 <karolherbst> get_image_* funcs return int

16:59 <karolherbst> get_image_array_size returns... size_t

16:59 <karolherbst> because.. you know

16:59 <jekstrand> Of course it does!

17:00 <karolherbst> this makes sense, because cl_image_desc.image_width is size_t

17:00 <karolherbst> (and the others)

17:00 <karolherbst> honestly...

17:00 <jekstrand> Yeah, we need to turn that into 32-bit in NIR somewhere.

17:01 <karolherbst> guess when handling OpImageQuerySizeLod

17:01 <karolherbst> (and OpImageQuerySize)

17:01 <jekstrand> Yeah, that would work.

17:01 <jekstrand> Or as some bit of lowering somewhere.

17:02 <jekstrand> Though spirv_to_nir seems as good a place as any for now.

17:02 <jekstrand> I can't envision us caring about 64-bit image dimensions any time soon
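
A minimal sketch of the narrowing discussed above, written in Python rather than Mesa's C. It assumes the backend only implements a 32-bit size query and that the frontend converts the result to whatever width the SPIR-V result type asks for. `hw_image_size` and `query_image_size` are hypothetical names, not Mesa functions.

```python
def hw_image_size(image):
    """Stand-in for a backend that only implements a 32-bit size query."""
    return [dim & 0xFFFFFFFF for dim in image["size"]]


def query_image_size(image, dest_bits):
    """Query at 32 bits, then convert to the requested result width.

    dest_bits is 64 for get_image_array_size (size_t) and 32 for the
    get_image_* functions that return int.
    """
    if dest_bits not in (32, 64):
        raise ValueError("unsupported result width")
    mask = (1 << dest_bits) - 1
    # Zero-extension is a no-op on Python ints; the mask only documents
    # the width the caller sees.
    return [dim & mask for dim in hw_image_size(image)]


image = {"size": [640, 480, 6]}  # width, height, array layers (made up)
print(query_image_size(image, 32))
print(query_image_size(image, 64))
```

The point of doing it this way is that the backend never sees a 64-bit image_size op; only the final conversion depends on the caller's type.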

17:02 <karolherbst> CL doesn't care anyway

17:02 <karolherbst> the API allows it, but...

17:02 <karolherbst> but maybe they thought allowing that on arrays makes sense because....?

17:03 rkanwal has quit [Ping timeout: 480 seconds]

17:09 <karolherbst> jekstrand: vtn_handle_image is kind of a messy hell, isn't it? :D

17:10 <jekstrand> A bit

17:13 <karolherbst> it passes now :)

17:13 <jekstrand> :)

17:14 <jekstrand> I'm starting to think something is wrong with panfrost compute

17:14 <jekstrand> Kernels aren't launching right or something

17:14 * jekstrand runs test_basic

17:14 <karolherbst> but everything else works?

17:15 <jekstrand> hard to tell

17:15 <jekstrand> fpmath fails. :-/

17:15 <karolherbst> :(

17:15 <karolherbst> ohhhhh

17:15 <karolherbst> I think I know what's up

17:15 <jekstrand> sometimes

17:15 <karolherbst> weird

17:16 <karolherbst> you are aware that I still use this input buffer thing? :P

17:16 <jekstrand> what input buffer thing?

17:16 <jekstrand> Oh, the grid inputs?

17:16 <jekstrand> Right...

17:16 <karolherbst> yeah...

17:16 <jekstrand> Those might not be hooked up. :)

17:16 <karolherbst> :)

17:16 <jekstrand> but, wait... If they weren't, it'd be crashing on the NIR intrinsic, right?

17:16 <karolherbst> anyway, my QuerySizeLod hack: https://gitlab.freedesktop.org/karolherbst/mesa/-/commit/df17c3e977a86cacef92e1661946a0c5f47c2f7a

17:17 <karolherbst> jekstrand: yeah...

17:17 <karolherbst> I guess somebody already did something there

17:17 <karolherbst> but maybe there is a sync issue or whatever weird stuff is going

17:17 <karolherbst> but I'd assume the kernel to just crash on hw anyway then

17:19 <karolherbst> jekstrand: are the int or conversion tests running? those are usually pretty trivial

17:20 <karolherbst> mhh.. I hope my USE_HOST_PTR emulation isn't broken, but I did do a run on iris with always using the shadow buffers and that worked fine

17:20 <karolherbst> the CTS kind of uses USE_HOST_PTR all over the place though

17:26 <karolherbst> jekstrand: didn't you have a patch for math_brute_force isnormal somewhere? or was that airlied?

17:26 <jekstrand> karolherbst: I've not touched that one

17:28 <karolherbst> I think I will go down the spirv-link hell...

17:28 <karolherbst> that's like 1 fail and 11 crashes

17:28 aravind has quit [Ping timeout: 480 seconds]

17:29 <karolherbst> but I think I will add a workaround to vtn so we don't depend on a fixed one

17:30 <karolherbst> shouldn't be too hard

17:33 <zmike> dcbaker: nice work on the release

17:34 <dcbaker> thanks!

17:34 <dcbaker> btw, could you look at the top commit of the staging/22.0 branch? I had to do some manual fixups on that

17:34 <zmike> looking

17:35 <zmike> did you get that llvmpipe patch into the 22.0 branch?

17:35 <dcbaker> I'

17:35 <dcbaker> what's in the staging is what's in right now

17:35 <dcbaker> I'm trying to work through my backlog of patches right now

17:35 <dcbaker> I unfortunately have a lot of them

17:36 <zmike> dcbaker: it looks like it hasn't landed then

17:36 <zmike> please make sure the next 22.0 release doesn't go out without "gallivm/sample: detect if rho is inf or nan and flush to zero"

17:36 <zmike> this is needed for conformance submissions

17:37 <zmike> and yeah that fixup looks 👍

17:38 <dcbaker> cool, I'll get the gallivm/sample patch in next

17:42 MajorBiscuit has joined #dri-devel

17:45 tjmercier has joined #dri-devel

17:45 <jekstrand> karolherbst: Can I specify a list of tests to run?

17:46 <karolherbst> jekstrand: yeah, kind of

17:46 <karolherbst> -i buffers

17:46 <karolherbst> not sure if I implemented subtests

17:47 <jekstrand> Ok, if I run fpmath_float fpmath_float2 fpmath_float4, it fails in FP_ADD float4.

17:47 <karolherbst> :(

17:47 <jekstrand> But they all pass individually

17:47 <jekstrand> So state's getting messed up somewhere

17:47 <karolherbst> oh no

17:47 <jekstrand> Uh oh... Now they all passed

17:48 <karolherbst> sounds like memory corruptions or something

17:48 <jekstrand> Quite possibly

17:48 * jekstrand runs with valgrind

17:49 <jekstrand> Valgrind on Arm... wah wah...

17:50 <daniels> it works fine

17:51 <jekstrand> Oh, I'm sure it works correctly. You just have to wait for it.

17:52 <HdkR> jekstrand: Time for an M1Ultra? :P

17:52 <jekstrand> HdkR: I keep telling people I'll buy an M1 once someone finishes writing the GPU kernel driver for it.

17:53 <jekstrand> And, no, I'm not going to sign up for that.

17:53 <jekstrand> Nor am I making any promises about signing up once there's a kernel driver.

17:54 <jekstrand> But it's not compelling so long as the options are MacOS vs. llvmpipe.

17:54 <HdkR> It's true, even a VM is a bit of a pain

17:55 <jekstrand> Once drm_agx.ko is alive and well, then it might be a compelling platform to hack on.

17:56 <karolherbst> mhhh

17:56 <karolherbst> annoying

17:57 * jekstrand should probably use ubsan... it's faster.

17:58 <jannau> jekstrand: https://www.youtube.com/AsahiLina and I took over Alyssa's driver for the annoying display controller

18:00 <jekstrand> karolherbst: ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

18:00 <jekstrand> karolherbst: :'(

18:00 <karolherbst> :(

18:01 <jekstrand> karolherbst: Are you not enabling disk cache for clc loading?

18:02 <karolherbst> I do

18:02 <jekstrand> hrm...

18:02 <karolherbst> why?

18:02 <jekstrand> I'm seeing SPIR-V warnings on every test startup

18:02 <karolherbst> weird

18:03 <karolherbst> I am using the drivers disk_cache in case that makes a difference

18:03 <jekstrand> karolherbst: Looks like panfrost isn't giving you a disk cache

18:03 <karolherbst> I am very sad about this

18:03 <jekstrand> Yeah, panfrost doesn't disk cache. :-(

18:04 <jekstrand> This is sad

18:04 <dcbaker> pepp: I'm looking at "3c3a8f853d gallium/tc: zero alloc transfers" for 22.0, but I'm not sure it applies, since the tc storage PR isn't in 22.0. Should I pull that series, or just forget about that patch?

18:04 <jekstrand> It doesn't take that long to build libclc but it's still a tad annoying

18:05 <karolherbst> yeah...

18:06 <karolherbst> I don't particularly like the way we convert libclc to nir, but it is a device specific thing, so I don't want to use a rusticl internal disk_cache for that where I am sure we wouldn't mess it up

18:06 <jekstrand> Running fpmath_float fpmath_float2 fpmath_float4 seems to fail about 1 in 5 or maybe a little less often.

18:06 <karolherbst> but I do plan to wire up OpenCL C to spir-v caching at some point

18:06 <jekstrand> yeah

18:09 tlwoerner has joined #dri-devel

18:10 <karolherbst> uhh.. I honestly have no good idea on how to fix this spirv-link stuff inside mesa :( My rough plan was to check if all needed variables were processed in vtn_handle_entry_point when calling another spir-v entrypoint, but ugh...

18:12 <karolherbst> OpVariables are already processed at this point

18:13 <karolherbst> mhhh, maybe not?

18:13 <karolherbst> ahh no, it already is

18:16 <karolherbst> guess fixing it inside spirv-link is the easier path (I hope)

18:20 garrison has joined #dri-devel

18:20 i-garrison has quit [Read error: Connection reset by peer]

18:23 * jekstrand wonders if there's something funny with synchronization and compute jobs

18:26 MajorBiscuit has quit [Ping timeout: 480 seconds]

18:27 oneforall2 has joined #dri-devel

18:30 * jekstrand is skeptical of panfrost_fence_finish

18:32 <jekstrand> Nah. It's fine. A little weird but fine.

18:48 <pepp> dcbaker: the bug predates the tc storage MR but it wasn't visible because the only app using this feature was viewperf. I think that pulling the 2 commits from !15298 makes sense

18:55 gawin has joined #dri-devel

18:57 MajorBiscuit has joined #dri-devel

19:09 pallavim_ has joined #dri-devel

19:10 lemonzest has quit [Quit: WeeChat 3.4]

19:15 pallavim has quit [Ping timeout: 480 seconds]

19:18 oilofparaf has joined #dri-devel

19:19 oilofparaf has left #dri-devel [#dri-devel]

19:21 ngcortes has joined #dri-devel

19:21 ds` has quit [Quit: ...]

19:21 ds` has joined #dri-devel

19:23 pallavim has joined #dri-devel

19:23 <dcbaker> pepp: sounds good, thanks

19:24 jkrzyszt_ has quit [Ping timeout: 480 seconds]

19:29 pallavim_ has quit [Ping timeout: 480 seconds]

19:32 mszyprow has joined #dri-devel

19:42 <karolherbst> jekstrand: does it work with clover?

19:42 <jekstrand> karolherbst: Haven't tried.

19:42 <karolherbst> ehh wait.. then you'd have to do this nir serialized stuff :)

19:43 <karolherbst> :(

19:43 <jekstrand> Yeah

19:43 <jekstrand> And I think I'm hitting a panfrost bug somewhere.

19:43 <karolherbst> potentially

19:43 <jekstrand> None of what I'm seeing looks like a rusticl bug

19:43 <karolherbst> or just some implicit gallium requirement I am not following

19:43 <jekstrand> Not with how well things are working on iris.

19:43 <karolherbst> yeah.. probably

19:44 <alyssa> jekstrand: is panfrost supposed to use a disk cache nobody told me

19:44 <karolherbst> could be that I don't set correct MEM_FLAGS or something weird, or panfrost doesn't sync on the correct combination or other random things :/

19:44 <jekstrand> alyssa: "supposed" is a strong word. I'd generally recommend it.

19:44 <alyssa> What does common do for the driver and what does the driver have to do?

19:44 <alyssa> and are there docs anywhere?

19:44 <karolherbst> alyssa: you just serialize your shader and cache it

19:45 <karolherbst> that's essentially it

19:45 <jekstrand> alyssa: src/util/disk_cache.h. It's better documented than most of Mesa. :-/

19:46 <anholt> jekstrand: that doesn't help make sense of how it fits in gallium drivers, unfortunately.

19:46 <jekstrand> anholt: :-/

19:46 <karolherbst> alyssa: you can also look at how we added support for nouveau for that, it's all in one MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4264

19:46 <alyssa> * WARNING: 3rd party applications might be reading the cache item metadata. * Do not change these values without making the change widely known. * Please contact Valve developers and make them aware of this change.

19:46 <karolherbst> last patch is the driver stuff

19:46 <alyssa> this does not inspire confidence

19:46 <alyssa> anholt: ^^ that

19:46 <anholt> alyssa: first step is you hook up the screen disk_cache. you have to do it in the driver, because you have to mix in your driver config knobs that might affect frontend shader compiles to the build id.

19:47 <alyssa> hrumble

19:47 <karolherbst> alyssa: that's just the metadata we add to cached entries internally

19:47 <karolherbst> alyssa: steam makes use of those

19:47 <anholt> with the screen stuff hooked up, mesa/st gets to cache the output of the frontend compiler->nir path.

19:48 * alyssa reads the reference implementation (v3d_disk_cache)

19:48 <karolherbst> yeah.. first step is just initing the cache and configure a proper hash key thingy

19:48 <karolherbst> alyssa: very very high level overview is.. store your compiler result in an uint8_t array and have a function doing the reverse :)

19:48 <anholt> alyssa: sp-disk-cache of my tree has a trivial version of doing it.

19:48 <jekstrand> Hrm... The pandecode.dump for all three of my compute dispatches has the same Push uniforms pointer. That seems a bit fishy.

19:49 <jekstrand> I would expect there to be a bit of ring buffering going on

19:49 <alyssa> anholt: karolherbst: ack, thank you

19:50 <alyssa> jekstrand: no ring buffer, all transient memory (ie. allocated off the batch) is freed when the batch is freed

19:50 <alyssa> and then the BO is returned to the BO cache

19:50 <jekstrand> alyssa: So I may just be getting the same BO over and over again?

19:50 <jekstrand> Hrm...

19:50 <alyssa> and compute jobs get put in their own batch right now for simplicity (this could be optimized)

19:50 <karolherbst> what Mark and I have done for nouveau was to 1. split structs into input and output ones, 2. write a hash function for the input, 3. write serialize/deserialize, 4. hook it up
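
The four steps karolherbst lists can be sketched roughly as follows. This is an illustration in Python, not the nouveau or `src/util/disk_cache.h` code: the field names, the `cache_key`/`serialize`/`deserialize`/`compile_cached` helpers, and the dict used as a cache are all assumptions made for the example.

```python
import hashlib
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class ShaderInput:
    """Everything that influences compilation (hypothetical fields)."""
    source: bytes
    chipset: int
    opt_level: int


@dataclass(frozen=True)
class ShaderOutput:
    """What the compiler produced (hypothetical fields)."""
    code: bytes
    num_regs: int


def cache_key(inp: ShaderInput) -> str:
    # Hash every input field; a missed field means stale cache hits.
    h = hashlib.sha1()
    h.update(struct.pack("<II", inp.chipset, inp.opt_level))
    h.update(inp.source)
    return h.hexdigest()


def serialize(out: ShaderOutput) -> bytes:
    # Fixed header (register count, code length) followed by the code.
    return struct.pack("<II", out.num_regs, len(out.code)) + out.code


def deserialize(blob: bytes) -> ShaderOutput:
    num_regs, size = struct.unpack_from("<II", blob, 0)
    return ShaderOutput(code=blob[8:8 + size], num_regs=num_regs)


def compile_cached(inp, compile_fn, cache):
    """Look up the serialized result by key; compile only on a miss."""
    key = cache_key(inp)
    blob = cache.get(key)
    if blob is None:
        blob = serialize(compile_fn(inp))
        cache[key] = blob
    return deserialize(blob)
```

Splitting inputs from outputs first is what makes the hash well defined: only `ShaderInput` is hashed, and only `ShaderOutput` is ever written to the cache.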

19:50 <alyssa> yes, if you're submitting the same stuff over and over

19:50 mclasen has quit [Ping timeout: 480 seconds]

19:50 <jekstrand> alyssa: Ok, that makes sense then.
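
A toy model of why every dispatch can show the same push-uniform pointer: if transient memory is freed with its batch and goes back to a cache that hands out freed buffers first, identical submissions keep getting the same address. `BoCache` and `run_batch` are hypothetical and much simpler than panfrost's real BO cache.

```python
class BoCache:
    """Toy buffer-object cache: freed BOs are handed out again first."""

    def __init__(self):
        self._free = {}           # size -> list of free addresses
        self._next_addr = 0x1000  # made-up starting address

    def alloc(self, size):
        free = self._free.get(size)
        if free:
            return free.pop()
        addr = self._next_addr
        self._next_addr += size
        return addr

    def release(self, addr, size):
        self._free.setdefault(size, []).append(addr)


def run_batch(cache, size=4096):
    """Allocate transient memory, 'submit', then free it with the batch."""
    addr = cache.alloc(size)
    cache.release(addr, size)
    return addr


cache = BoCache()
print([hex(run_batch(cache)) for _ in range(3)])
```

In this model the three batches all report the same address, which is harmless only as long as each batch has finished with the buffer before the next one writes to it.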

19:51 mbrost has quit [Ping timeout: 480 seconds]

19:51 <karolherbst> mhhh

19:51 <karolherbst> that's ehhh.

19:51 <karolherbst> wait..

19:51 <alyssa> yes, this means that there are false deps between batches.

19:51 <jekstrand> woo

19:51 <alyssa> might be leaving some perf on the table. don't know.

19:52 <karolherbst> yeah I guess that can break rusticl then :)

19:52 <karolherbst> jekstrand: I could imagine that calls to create_compute_state could overwrite the content of the bo without the hardware having executed stuff, no?

19:52 <karolherbst> but mhh

19:53 <karolherbst> yeah I guess this can happen if there is no sync point in between

19:54 <karolherbst> jekstrand: you could try a ctx.flush().wait(); after launch_grid and see if that changes anything?

19:54 <alyssa> shouldn't happen, it's serialized in the kernel

19:54 <alyssa> there is a flush after launch_grid in panfrost

19:54 <karolherbst> yeah.. but I also unbind the entire state directly after launch_grid...

19:54 <jekstrand> PAN_MESA_DEBUG=sync seems to make the fails go away

19:54 <karolherbst> so wait might be important

19:55 <karolherbst> I copied clover design here, which I still don't like :)

19:55 <alyssa> jekstrand: That's... odd.

19:55 <jekstrand> I'm starting to think our clEnqueueReadBuffer is racing with the kernel

19:56 ybogdano has quit [Ping timeout: 480 seconds]

19:56 <alyssa> Plausible

19:56 <alyssa> depending on what transfer flags you use, gallium can optimize away your correctness ;)

19:57 <karolherbst> MAP_READ_WRITE

19:57 <jekstrand> pan_set_global_binding is calling panfrost_batch_write_rsrc so it should be getting a write fence set on it in the kernel ioctl

19:57 <jekstrand> Let me double-check that

19:57 <karolherbst> at some point I will optimize setting all those flags :)

19:57 <alyssa> karolherbst: sync? unsync? etc

19:58 <karolherbst> potentially PIPE_MAP_UNSYNCHRONIZED on non blocking maps

19:58 <karolherbst> but we do flush and wait later

19:59 <karolherbst> but clEnqueueReadBuffer is synced

19:59 <karolherbst> so it's just READ_WRITE

19:59 <alyssa> ok..

19:59 <karolherbst> should be just READ really, but...

19:59 <karolherbst> I don't fight those bugs yet

19:59 <karolherbst> *want to

20:01 mvlad has quit [Remote host closed the connection]

20:04 <karolherbst> " SPIRV-Headers was not found - please checkout a copy under external/." I already don't want to :D

20:05 <jekstrand> karolherbst: What are you building now?

20:05 <karolherbst> spirv-tools

20:05 <karolherbst> the linker is buggy :(

20:05 <jekstrand> :(

20:05 <karolherbst> yeah...

20:05 <karolherbst> I thought we could work around it in mesa, but... it's complicated

20:06 mclasen has joined #dri-devel

20:11 mbrost has joined #dri-devel

20:14 columbarius has joined #dri-devel

20:16 co1umbarius has quit [Ping timeout: 480 seconds]

20:18 nchery has quit [Ping timeout: 480 seconds]

20:25 nchery has joined #dri-devel

20:28 <karolherbst> " error: ‘spvValidate’ is not a member of ‘spvtools’; did you mean ‘spvValidate’?"

20:29 <karolherbst> it tries to tell me I shouldn't use the namespace thingy

20:29 <jekstrand> Ugh... Why don't GDB watchpoints work on this box?!?

20:30 <jekstrand> Or maybe they do and they're just that slow? idk.

20:30 <karolherbst> watchpoints are this kind of thing I set 5 times until I set them correctly

20:30 fw400 has joined #dri-devel

20:30 <karolherbst> but yeah

20:30 <karolherbst> watchpoints make things slow :)

20:30 fw400 has left #dri-devel [#dri-devel]

20:31 Duke`` has quit [Ping timeout: 480 seconds]

20:31 <jekstrand> someone is removing BOs from my batch

20:31 <jekstrand> I think that's why it's failing

20:32 <karolherbst> jekstrand: did you try flush().wait() after launch_grid and/or disable unbinding all the stuff?

20:32 <jekstrand> karolherbst: unbinding doesn't matter

20:32 <karolherbst> I hope you are right :)

20:32 <jekstrand> karolherbst: set_global_binding with resources == NULL is a no-op on panfrost

20:32 <karolherbst> ahh

20:37 <alyssa> ..how would BOs get removed from a batch

20:37 <jekstrand> idk

20:37 <jekstrand> I tried to set a watchpoint on batch->num_bos but GDB hates me

20:37 <alyssa> there's no api for that, for good reason

20:40 <karolherbst> jekstrand: watch -l &batch->num_bos ?

20:40 ybogdano has joined #dri-devel

20:40 pallavim has quit [Read error: Connection reset by peer]

20:40 <karolherbst> (at least I think this is the correct way of using watch)

20:41 <jekstrand> karolherbst: Yeah, I tried that. GDB hates me. :P

20:41 <karolherbst> annoying

20:42 <karolherbst> maybe it doesn't get written

20:42 <karolherbst> ahh check_interface_variable :)

20:43 <jekstrand> Found it!

20:43 <karolherbst> there it is

20:43 <karolherbst> jekstrand: \o/

20:43 <karolherbst> so, is it my fault or someone else's?

20:43 <jekstrand> Someone else

20:43 <karolherbst> yes

20:46 <alyssa> Is the someone else me

20:46 <jekstrand> Maybe

20:46 <alyssa> Shoot

20:46 <jekstrand> The someone else is whoever hooked up set_global_binding

20:46 <alyssa> Yeah that sounds like me

20:47 maxzor has quit [Ping timeout: 480 seconds]

20:48 <alyssa> git blame says Icecream95 actually, but I'll take the blame anyway if you want :p

20:48 <jekstrand> karolherbst: What's the cap for max global bindings?

20:48 <karolherbst> uhhh, there is one?

20:48 <alyssa> + /* The handle points to uint32_t, but space is allocated for 64 bits */

20:48 <alyssa> does rusticl not emulate that particular clover quirk? :p

20:48 h0tc0d3 has quit [Remote host closed the connection]

20:48 <karolherbst> alyssa: i do the same

20:48 <alyssa> ack

20:48 <karolherbst> but it doesn't support subbuffers :p

20:48 <jekstrand> alyssa: It'll be obvious as soon as I send the patch. :)

20:49 <karolherbst> jekstrand: btw, you might want to wire in support for sub buffers

20:49 <alyssa> jekstrand: :D

20:49 <karolherbst> jekstrand: that's the kind of terrible interface set_global_bindings is: https://gitlab.freedesktop.org/karolherbst/mesa/-/commit/d9b1a592fdd4ef734f77eb7395e7a589e9df38dc

20:50 <karolherbst> for subbuffers I have to offset with the offset.. obviously

20:50 <jekstrand> karolherbst: I don't understand what that does

20:50 maxzor has joined #dri-devel

20:51 <jekstrand> karolherbst: Is the handle an input also?

20:51 <karolherbst> jekstrand: reads out handles, adds the resource address, writes it back

20:51 <karolherbst> yes....

20:51 <jekstrand> Oh, that's truly horrible

20:51 <karolherbst> it is

20:51 <karolherbst> but it's required for subbuffers

20:51 <karolherbst> that was the reason those tests failed ...

20:51 <karolherbst> there isn't really a better way of doing it except fixing up the offset inside rusticl

20:51 <karolherbst> but clover adds the offset this way
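
The in/out handle behaviour described above ("reads out handles, adds the resource address, writes it back") can be sketched like this. It is a Python model under stated assumptions: each handle is an 8-byte slot whose low 32 bits hold an offset on input, and the whole slot holds a 64-bit address on output. The function name mirrors the gallium hook, but the signature and the addresses are made up for the example.

```python
import struct


def set_global_binding(resource_addrs, handles):
    """Patch each handle slot in place: slot = base address + old offset.

    Each handle is a writable 8-byte buffer. On input its first 4 bytes
    hold an offset into the resource (nonzero for sub-buffers); on output
    all 8 bytes hold the full 64-bit address.
    """
    for base, handle in zip(resource_addrs, handles):
        (offset,) = struct.unpack_from("<I", handle, 0)
        struct.pack_into("<Q", handle, 0, base + offset)


# A sub-buffer starting 256 bytes into a resource at a made-up address.
handle = bytearray(8)
struct.pack_into("<I", handle, 0, 256)
set_global_binding([0x7F0000000000], [handle])
print(hex(struct.unpack("<Q", handle)[0]))
```

The awkward part is that the same memory is both input and output, so a caller that forgets to pre-fill the offset silently binds the start of the parent buffer instead of the sub-buffer.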

20:52 h0tc0d3 has joined #dri-devel

20:52 <karolherbst> I wouldn't mind replacing it with something new :)

20:53 <karolherbst> or maybe we also change what clover is doing...

20:54 mbrost has quit [Ping timeout: 480 seconds]

20:58 rasterman has joined #dri-devel

20:59 maxzor has quit [Ping timeout: 480 seconds]

21:09 <jekstrand> alyssa: https://gitlab.freedesktop.org/jekstrand/mesa/-/commit/a3e6c75486b6d552dfb4c7c79782355e78942cb0

21:15 <h0tc0d3> Does anyone know why the release schedule is not working? https://docs.mesa3d.org/release-calendar.html

21:15 <h0tc0d3> The plan was to have a corrective release of 22.0.2, but there is none.

21:15 <airlied> I think dcbaker is working on it at present

21:15 * jekstrand kicks off a new full CTS run on panfrost

21:16 <jekstrand> This one should be much less full of random fail

21:16 <h0tc0d3> The 2 week release schedule hasn't been working for over a month now.

21:19 * jekstrand goes to find a nap while he waits for his panfrost CTS run. It'll probably take an hour or more.

21:19 <karolherbst> my name shows up more often than I'd like to in those spirv repos :D

21:20 <karolherbst> at least now I have a plan on fixing this linker bug

21:20 <jekstrand> karolherbst: \o/

21:22 maxzor has joined #dri-devel

21:23 <alyssa> jekstrand: ...so what we had before was more of a set_local_binding? :p

21:23 lynxeye has quit [Quit: Leaving.]

21:23 Major_Biscuit has joined #dri-devel

21:25 MajorBiscuit has quit [Ping timeout: 480 seconds]

21:28 <airlied> robclark: oh also in case you are wondering, agd5f's PRs also just say Fixes but provide a list in the signed tag

21:28 <airlied> I'm also happy with that :-)

21:31 freem_ has joined #dri-devel

21:32 mbrost has joined #dri-devel

21:34 <robclark> airlied: I guess I should figure out how to do the signed tag thing

21:39 dj-death has quit [Ping timeout: 480 seconds]

21:43 Lucretia-backup has joined #dri-devel

21:47 Lucretia has quit [Ping timeout: 480 seconds]

21:50 Major_Biscuit has quit [Ping timeout: 480 seconds]

21:57 ybogdano has quit [Ping timeout: 480 seconds]

21:58 nchery has quit [Ping timeout: 480 seconds]

21:58 garrison has quit [Read error: Connection reset by peer]

21:59 garrison has joined #dri-devel

22:01 <karolherbst> jekstrand: https://github.com/karolherbst/SPIRV-Tools/commit/5579468d3a62de298406c9017fd2c4f46e023bf7 :D

22:02 <karolherbst> I was so close to reimplementing that stuff, but then I saw that this pass doesn't remove, but just recalculates

22:02 mszyprow has quit [Ping timeout: 480 seconds]

22:09 * karolherbst kicks off another run

22:16 ybogdano has joined #dri-devel

22:17 h0tc0d3 has quit [Quit: Leaving]

22:31 <jekstrand> Pass 1364 Fails 346 Crashes 466 Timeouts 0

22:37 nchery has joined #dri-devel

22:40 <karolherbst> jekstrand: that's not too bad

22:40 <karolherbst> meanwhile: Pass 2137 Fails 22 Crashes 17 Timeouts 0 :3

22:43 <karolherbst> jekstrand: the list becomes very short

22:43 <karolherbst> jekstrand: what's the biggest problem with panfrost though?

22:43 <karolherbst> anything standing out or just random stuff all over the place

22:44 <alyssa> broken compiler stuff I would wager? lot of code paths not previously exercised

22:44 <jekstrand> alyssa: Yeah, just made it lower 8-bit integers

22:44 <jekstrand> Something's wrong with vec3

22:45 <karolherbst> vec3?

22:45 <jekstrand> Error for vector size 3 found at 0x00000001: *0x01 vs 0x00

22:45 <jekstrand> Input value: 0x01 (convert_char3( char3 )) *** convert_charn( charn ) FAILED **

22:45 <karolherbst> I thought we lower that

22:45 <karolherbst> ehh

22:45 mbrost_ has joined #dri-devel

22:45 gawin has quit [Ping timeout: 480 seconds]

22:46 <karolherbst> jekstrand: btw, did you figure out why clamping is broken?

22:47 <jekstrand> karolherbst: Nope

22:47 <jekstrand> Those tests are evil

22:48 mbrost has quit [Read error: Connection reset by peer]

22:48 <karolherbst> yeah...

22:48 <jenatali> Yep

22:49 <karolherbst> jenatali: I will figure out the divide fails btw

22:49 <karolherbst> eh

22:49 <karolherbst> jekstrand: ^^

22:49 <karolherbst> jenatali: btw.. I think you could move to CL 3.0 by now :D

22:50 <jenatali> With LLVM 14 or 15, right?

22:50 <karolherbst> I have a fix for that linking issue

22:50 <karolherbst> but yeah.. it might require newer llvm as well

22:50 <jenatali> Yeah. I'll get around to it... eventually...

22:50 <jenatali> I hope

22:50 <karolherbst> I don't think it's much work tbh

22:51 <jenatali> Yeah but it's still a big context switch from my current priorities

22:51 <karolherbst> just some new APIs, but nothing really new

22:51 <karolherbst> I see

22:51 <jenatali> I've implemented CL3.0, but only exposing CL C 1.2

22:51 <jenatali> So the APIs are done

22:51 <alyssa> rusticl dx12 when

22:51 <karolherbst> alyssa: I was already wondering if I should push it through zink

22:51 * karolherbst hides

22:51 <jenatali> It's not out of the question

22:52 <alyssa> jenatali: The question was for jekstrand, unless you said yes ;-p

22:52 <jenatali> Just, I don't have nearly enough time in the day, and my personal time for this kind of stuff has completely evaporated

22:52 <karolherbst> but I don't think it would be a very good fit tbh

22:53 <karolherbst> it's so gallium specific

22:53 <jekstrand> TBH, once you've got the Mesa compiler, gallium doesn't buy you that much

22:53 <jenatali> ^^

22:53 <karolherbst> sure

22:53 <karolherbst> but most of the code isn't interfacing with the mesa compiler

22:53 <jenatali> The CL API surface area is so tiny compared to the compiler infrastructure

22:53 <karolherbst> yeah

22:54 <karolherbst> but what I meant is, if rusticl were to run on dx12, we'd essentially write it from scratch as most things would have to change...

22:54 <karolherbst> maybe the api validation could stay

22:54 * karolherbst steals scale_fdiv

22:54 <jenatali> Oh I assumed alyssa meant on the d3d12 gallium backend. I don't see any point giving rusticl a direct DX12 backend

22:55 <karolherbst> jenatali: how do you want to implement set_global_bindings

22:55 <karolherbst> I am not doing globals to ssbo lowering

22:55 <jenatali> Yeah I'd want them as ssbos

22:56 <karolherbst> I guess we would have to take some of the passes from src/microsoft and have some flips

22:56 <karolherbst> maybe that could work, but...

22:56 <jenatali> Yeah. Or else move stuff from the frontend into the backend

22:56 <jenatali> But doesn't seem worth it. We're happy with our frontend for now

22:56 <karolherbst> yeah, and I don't want to bother backends with random stuff

22:57 <karolherbst> basing it on top of zink on the other hand....

22:57 <alyssa> jenatali: That is what I meant yes

22:57 gawin has joined #dri-devel

22:57 <karolherbst> I see we'd need some vulkan extensions

22:57 <jenatali> karolherbst: I assume zink would also want it lowered to ssbo

22:57 <karolherbst> but honestly.. we could just allow kernel spir-vs, do the 64 bit buffer stuff and that would be it

22:57 <jenatali> Or I guess you could do it with the BDA extensions

22:58 <karolherbst> or new extensions

22:58 <karolherbst> just push in the spir-v kernels

22:58 <jenatali> But also then you're just rewriting clspv :)

22:58 <jenatali> Er, clvk?

22:58 <karolherbst> I wouldn't

22:58 <alyssa> consonants sure do

22:58 <karolherbst> their idea was to not fix vulkan

22:58 <jekstrand> Yeah, you'd just emit a pile of bda

22:59 <karolherbst> ohh, so bdas are good enough for CL buffers?

22:59 <jenatali> "Fix..." their idea was to fix CL by making it work on Vulkan :)

22:59 <karolherbst> :D

22:59 <karolherbst> yeah well

22:59 <karolherbst> does anybody use it?

23:00 <jekstrand> anybody use what?

23:00 <karolherbst> clvk

23:00 <jenatali> CL's already pretty niche I feel like. I know they had a target use for clvk but I don't remember what it is. And I doubt they have many more customers

23:00 <jekstrand> I don't know about clvk but there's at least one major app doing serious compute work shipping on clspv

23:00 <karolherbst> honestly.. I think using zink is probably the best option here and just add a new extension for spirv kernels

23:00 <karolherbst> ahh

23:01 <karolherbst> jekstrand: but clspv is just kernel spirv to vulkan spirv, right?

23:01 <jenatali> Pretty much

23:01 <jekstrand> yup

23:01 <karolherbst> yeah I guess if that fits your use case

23:02 <karolherbst> anyway, I don't have any plans with rusticl anyway, I just wanted to learn rust :D

23:03 * jekstrand wants Mesa to have a competent compute story

23:03 * jekstrand still isn't quite sure what story that will be

23:04 <karolherbst> yeah...

23:05 <karolherbst> I think a CL stack at least as good as intels or AMDs would be a good starting point

23:05 <karolherbst> (making it pointless to install theirs, that is)

23:07 <jekstrand> alyssa: What splits vec3 loads/stores into scalar?

23:07 gawin has quit [Ping timeout: 480 seconds]

23:07 <jekstrand> Maybe LLVM is doing that?

23:07 <jekstrand> that's believable

23:10 <karolherbst> mhh

23:11 <karolherbst> scale_fdiv doesn't fix ERROR: divide: -16777216.000000 ulp error at {-0x1.fffffep+127, -0x1.fffffep+127}: *0x1p+0 vs. 0x0p+0 (0x00000000) at index: 198 :(

23:11 <karolherbst> but that's a subnormal, isn't it?

23:12 <jekstrand> I think that's just 1 vs 0

23:12 <karolherbst> yeah, but the inputs

23:12 <karolherbst> -0x1.fffffep+127 / -0x1.fffffep+127

23:12 <jekstrand> We may need to use the actual fdiv opcode

23:13 <karolherbst> ohh, you don't?

23:13 <jekstrand> Nope

23:13 <jekstrand> We don't for GL

23:13 <karolherbst> how can I flip it?

23:13 <jekstrand> We do mul+rcp

23:13 mbrost_ has quit [Remote host closed the connection]

23:13 <karolherbst> ahh

23:13 <jenatali> karolherbst: We don't support denormals, we always flush them

23:13 <karolherbst> yeah, that won't work

23:13 <jekstrand> It'll require some compiler work.

23:13 mbrost_ has joined #dri-devel

23:13 <karolherbst> okay

23:13 <jekstrand> Not too much but more than zero

23:13 <karolherbst> right

23:13 <karolherbst> yeah, we want real fdiv for CL :)
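
One way a mul+rcp lowering can turn the CTS case above into 0 instead of 1, sketched in Python with float32 emulated through `struct`. The assumption (not confirmed in the log) is that the reciprocal is flushed to zero when it is subnormal: 1 / 0x1.fffffep+127 is smaller than the smallest normal float32, so the flushed reciprocal is zero and the product is zero. `f32`, `flush_denorm`, `div_exact` and `div_mul_rcp` are names invented for this example.

```python
import math
import struct

FLT_MIN_NORMAL = 2.0 ** -126  # smallest normal float32


def f32(x):
    """Round a Python float to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def flush_denorm(x):
    """Flush float32 subnormals to zero, keeping the sign."""
    if x != 0.0 and abs(x) < FLT_MIN_NORMAL:
        return math.copysign(0.0, x)
    return x


def div_exact(a, b):
    """A real division, rounded to float32."""
    return f32(a / b)


def div_mul_rcp(a, b):
    """a * rcp(b), with subnormal intermediates flushed to zero."""
    rcp = flush_denorm(f32(1.0 / b))
    return flush_denorm(f32(a * rcp))


x = float.fromhex("-0x1.fffffep+127")  # the input from the CTS failure
print(div_exact(x, x))    # 1.0
print(div_mul_rcp(x, x))  # 0.0
```

A real fdiv never materializes the tiny reciprocal, which is why it gets this case right regardless of how subnormals are handled.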

23:14 <jekstrand> We don't for GL because you can often CSE the RCP

23:14 <karolherbst> now what's up with "mem_host_flags mem_host_write_only_image"

23:14 <karolherbst> sure

23:14 <karolherbst> and a real fdiv is slow

23:15 <jekstrand> idk that it's that much slower than rcp

23:15 <jekstrand> But it's slower than fmul

23:15 <jekstrand> By a lot

23:15 <karolherbst> yeah

23:15 <karolherbst> not that luxmark perf tanks when we start using it :D

23:17 * jekstrand hates this kernel

23:17 <jekstrand> llvm turns a very simple char3 load/store into a giant pile of garbage

23:17 <karolherbst> classic llvm

23:18 h0tc0d3 has joined #dri-devel

23:18 <karolherbst> why though?

23:18 <jekstrand> idk

23:18 <jekstrand> Why does LLVM do anything it does?!?

23:18 <karolherbst> maybe it's not aligned

23:18 <karolherbst> because llvm is a smart compiler always doing the right thing, everybody knows that

23:19 <karolherbst> ehhh the remaining fails are painful

23:20 <jekstrand> Yeah, LLVM is definitely checking for alignment and emitting 64-bit load/store if it can.

23:20 <karolherbst> can we disable that?

23:20 <jekstrand> I think it decided this test was some poor soul's hand-rolled memcpy. :joy:

23:21 <karolherbst> basically I just want llvm to give us the plain thing, no idiotic postprocessing :D

23:21 <karolherbst> lol

23:21 <karolherbst> we do pass -O0 into llvm

23:22 <jekstrand> Maybe we need to pass -O-0?

23:22 <jekstrand> In any case, panfrost should be able to compile this even if it is stupid

23:22 mbrost_ has quit [Ping timeout: 480 seconds]

23:24 <karolherbst> I still don't want llvm to do silly things :D

23:24 * jekstrand views that as inevitable

23:24 <karolherbst> jekstrand: which test is it btw?

23:24 <jekstrand> test_conversions char_char

23:24 <jekstrand> char3 case, to be particular

23:25 <karolherbst> what the...

23:25 <jekstrand> int_int also fails

23:25 <jekstrand> So it's not an 8-bit problem

23:26 <karolherbst> ahh no

23:26 <karolherbst> it's not llvm

23:26 <karolherbst> it's the CTS

23:26 <jekstrand> wha?

23:26 <karolherbst> the CTS special cased vec3 and uses vloadn

23:26 <jekstrand> Of course it did...

23:26 <jekstrand> So it's probably a vloadn problem

23:26 <karolherbst> potentially

23:27 morphis has quit [Ping timeout: 480 seconds]

23:27 <karolherbst> vloadn doesn't guarantee alignment

23:27 * jekstrand looks at vloadn

23:27 <jekstrand> right

23:27 morphis has joined #dri-devel

23:27 <karolherbst> well besides what the base type needs
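
A small Python model of the packed addressing that `vload3`/`vstore3` use: element `offset` lives at `p + offset * 3`, so consecutive char3 values sit 3 bytes apart and only the base type's alignment holds. The helper functions here model the OpenCL built-ins on a byte buffer and are not the Mesa or libclc implementations.

```python
def vload3(offset, buf):
    """Model of OpenCL vload3: read 3 packed elements at index offset * 3."""
    start = offset * 3
    return tuple(buf[start:start + 3])


def vstore3(vec, offset, buf):
    """Model of OpenCL vstore3: write 3 packed elements at index offset * 3."""
    start = offset * 3
    buf[start:start + 3] = bytes(vec)


src = bytes(range(12))  # four packed char3 values
dst = bytearray(12)
for i in range(4):
    vstore3(vload3(i, src), i, dst)

# Byte address of each char3: most are not 4-byte aligned.
print([i * 3 for i in range(4)])  # [0, 3, 6, 9]
```

This is why a backend cannot treat a char3 access from `vloadn` as a single aligned 32-bit load and mask off the top byte.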

23:27 <jekstrand> Do we implement vloadn ourselves or use libclc?

23:28 <jekstrand> we do it ourselves

23:28 <karolherbst> yeah

23:29 <karolherbst> not sure if libclc has an impl

23:29 <jekstrand> it appears to

23:30 <jekstrand> Ugh... Yeah, this is all in the CTS test

23:30 <jekstrand> Ok, this makes a lot more sense

23:31 <jekstrand> ironically, bifrost seems to have load/store_i24 opcodes. :-/

23:33 <jekstrand> I wonder if get_global_size is just wrong

23:34 h0tc0d3 has quit [Remote host closed the connection]

23:35 mbrost_ has joined #dri-devel

23:39 neonking has quit [Ping timeout: 480 seconds]

23:39 maxzor has quit [Ping timeout: 480 seconds]

23:44 <karolherbst> jekstrand: ohh so you include bound checks?

23:44 <karolherbst> eh wait

23:45 <karolherbst> get_global_size is this CL thing :D

23:47 maxzor has joined #dri-devel

23:49 Haaninjo has quit [Quit: Ex-Chat]

23:50 alanc has quit [Remote host closed the connection]

23:51 alanc has joined #dri-devel

23:51 <karolherbst> soo.. why is api clone_kernel crashing...

23:51 khfeng has joined #dri-devel

23:53 <karolherbst> jekstrand: I can't use nir_opt_dead_write_vars before inlining, can I?

23:54 <karolherbst> it asserts on vec1 64 ssa_5 = deref_var &copy_in (function_temp struct.structArg)

23:54 ybogdano has quit [Ping timeout: 480 seconds]

23:54 <karolherbst> ehh intrinsic copy_deref (ssa_5, ssa_4) (dst_access=0, src_access=0)

23:55 <karolherbst> wait, it's not inlining, but I have to split

23:55 rasterman has quit [Quit: Gettin' stinky!]