#dri-devel on 2022-04-12 — irc logs at oftc.irclog.whitequark.org

2022-03-22 11:57 ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar

00:00 <karolherbst> ohh no.. I found the bug with sub buffers

00:03 TJ_Mercier has joined #dri-devel

00:04 nchery is now known as Guest1724

00:04 nchery has joined #dri-devel

00:05 mbrost_ has quit [Ping timeout: 480 seconds]

00:07 Guest1724 has quit [Ping timeout: 480 seconds]

00:26 khfeng has joined #dri-devel

00:26 ybogdano has quit [Ping timeout: 480 seconds]

00:33 rkanwal has quit [Ping timeout: 480 seconds]

00:35 mbrost has joined #dri-devel

00:35 iive has quit []

00:39 co1umbarius has joined #dri-devel

00:40 mhenning has joined #dri-devel

00:41 columbarius has quit [Ping timeout: 480 seconds]

00:44 <karolherbst> sub buffers fixed :)

01:01 apinheiro has quit [Ping timeout: 480 seconds]

01:03 danct12_ has joined #dri-devel

01:06 heat has joined #dri-devel

01:09 danct12_ has quit []

01:11 danct12_ has joined #dri-devel

01:18 danct12_ has quit []

01:18 Daanct12 has joined #dri-devel

01:25 nchery has quit [Read error: Connection reset by peer]

01:31 nchery has joined #dri-devel

01:41 <jekstrand> karolherbst: \o/

01:52 nchery has quit [Read error: Connection reset by peer]

01:54 rpigott has quit [Remote host closed the connection]

01:55 nchery has joined #dri-devel

01:55 rpigott has joined #dri-devel

01:56 sdutt has quit [Read error: Connection reset by peer]

02:01 maxzor has quit [Ping timeout: 480 seconds]

02:10 <karolherbst> jekstrand: soo.. slowly I am running out of stuff to fix :)

02:10 mbrost has quit [Read error: Connection reset by peer]

02:12 <karolherbst> two things which are hit by a lot of tests: 1. host ptr support for array and 3d images 2. use_host_ptr sometimes fails on weird alignments

02:13 <karolherbst> ohh and blorp

02:13 <karolherbst> buffers buffer_fill_float

02:13 <karolherbst> buffers buffer_copy

02:13 <karolherbst> and

02:14 <karolherbst> images/clGetInfo/test_cl_get_info 1Darray

02:14 <karolherbst> in case you've got some time to look into those issues (it's all iris related afaik)

02:22 Daanct12 has quit [Remote host closed the connection]

02:32 mclasen has quit [Ping timeout: 480 seconds]

02:34 tarceri_ has joined #dri-devel

02:35 tarceri has quit [Remote host closed the connection]

02:35 Danct12 has quit [Remote host closed the connection]

02:35 Net147 has quit [Quit: Quit]

02:35 Net147 has joined #dri-devel

02:35 Danct12 has joined #dri-devel

02:42 ella-0_ has joined #dri-devel

02:46 ella-0 has quit [Read error: Connection reset by peer]

02:48 JohnnyonF has joined #dri-devel

02:50 JohnnyonFlame has quit [Ping timeout: 480 seconds]

03:01 heat_ has joined #dri-devel

03:07 heat has quit [Ping timeout: 480 seconds]

03:08 aravind has joined #dri-devel

03:21 Akari has quit [Ping timeout: 480 seconds]

03:25 JohnnyonF has quit [Read error: Connection reset by peer]

03:26 JohnnyonFlame has joined #dri-devel

03:26 Akari has joined #dri-devel

03:28 Daanct12 has joined #dri-devel

03:31 xperia64 has quit [Remote host closed the connection]

03:31 Daanct12 has quit [Remote host closed the connection]

03:32 xperia64 has joined #dri-devel

03:37 aravind has quit [Ping timeout: 480 seconds]

03:38 aravind has joined #dri-devel

03:41 Daanct12 has joined #dri-devel

03:43 JohnnyonF has joined #dri-devel

03:44 Daanct12 has quit []

03:49 Johnny has joined #dri-devel

03:49 mdroper has quit [Read error: Connection reset by peer]

03:50 JohnnyonFlame has quit [Ping timeout: 480 seconds]

03:52 JohnnyonF has quit [Ping timeout: 480 seconds]

03:52 mhenning has quit [Quit: mhenning]

03:55 Johnny has quit [Read error: Connection reset by peer]

03:55 JohnnyonFlame has joined #dri-devel

04:09 heat_ has quit [Remote host closed the connection]

04:10 heat has joined #dri-devel

04:21 Duke`` has joined #dri-devel

04:46 HankB__ has quit [Remote host closed the connection]

04:46 HankB__ has joined #dri-devel

04:48 shankaru has joined #dri-devel

04:53 khfeng has quit [Remote host closed the connection]

04:53 khfeng has joined #dri-devel

05:10 jewins has quit [Read error: Connection reset by peer]

05:17 heat has quit [Ping timeout: 480 seconds]

05:27 lemonzest has quit [Quit: WeeChat 3.4]

05:34 itoral has joined #dri-devel

05:34 CME has quit [Ping timeout: 480 seconds]

05:35 <airlied> jekstrand: it's hard to add a new lowering option to nir_lower_compute_system_values for something that is currently always lowered

05:35 lemonzest has joined #dri-devel

05:36 * airlied isn't sure how to stop it lowering something

05:37 * airlied wants it to stop lowering SYSTEM_VALUE_GLOBAL_GROUP_SIZE

05:38 <airlied> maybe if I add a compiler option alongside it

05:41 itoral has quit [Remote host closed the connection]

05:41 itoral has joined #dri-devel

05:46 CME has joined #dri-devel

05:48 Duke`` has quit [Ping timeout: 480 seconds]

05:48 <dolphin> airlied: any ETA for drm-misc-next pull to drm-next? there's a patch in there that drm-intel-gt-next depends on

05:51 <airlied> dolphin: oh let me run it now

05:52 <dolphin> thanks, that'll unblock the DG2 stuff once I backmerge it then :)

05:53 danvet has joined #dri-devel

05:55 ahajda has joined #dri-devel

05:58 itoral has quit [Remote host closed the connection]

05:59 itoral has joined #dri-devel

06:00 idr has quit [Quit: Leaving]

06:02 mszyprow has joined #dri-devel

06:21 mripard[m] has quit []

06:21 mripard[m] has joined #dri-devel

06:48 frieder has joined #dri-devel

06:52 OftenTimeConsuming has quit [Remote host closed the connection]

06:52 OftenTimeConsuming has joined #dri-devel

07:03 OftenTimeConsuming has quit [Remote host closed the connection]

07:03 OftenTimeConsuming has joined #dri-devel

07:06 Guest1711 has quit []

07:06 jessica_24 has quit []

07:06 abhinav__ has quit [Quit: The Lounge - https://thelounge.chat]

07:06 shankaru has quit [Read error: Connection reset by peer]

07:07 abhinav__ has joined #dri-devel

07:07 abhinav__5 has joined #dri-devel

07:07 abhinav__ has quit []

07:07 abhinav__5 has quit []

07:07 abhinav__5 has joined #dri-devel

07:07 abhinav__ has joined #dri-devel

07:12 paulk has joined #dri-devel

07:12 MajorBiscuit has joined #dri-devel

07:14 Major_Biscuit has joined #dri-devel

07:14 <airlied> karolherbst, jekstrand : okay I've rebased my amd nir compute backend support, prelim nir/clover patches are in 15876

07:17 <airlied> no images 9 out of 95 basic passing

07:21 MajorBiscuit has quit [Ping timeout: 480 seconds]

07:25 rpigott has quit [Read error: Connection reset by peer]

07:25 jkrzyszt has joined #dri-devel

07:26 rpigott has joined #dri-devel

07:32 OftenTimeConsuming has quit [Remote host closed the connection]

07:32 OftenTimeConsuming has joined #dri-devel

07:32 pnowack has joined #dri-devel

07:45 JohnnyonFlame has quit [Read error: Network is unreachable]

07:45 JohnnyonFlame has joined #dri-devel

07:54 <airlied> dolphin: drm-next is pushed out now

07:55 <dolphin> airlied: thanks, will send early -fixes PR tomorrow due to easter holidays

07:55 lynxeye has joined #dri-devel

07:55 <dolphin> last week there were none picked up, this week there is one patch

08:08 maxzor has joined #dri-devel

08:14 apinheiro has joined #dri-devel

08:26 <remexre> does anyone know what's blocking VK_EXT_acquire_drm_display support on more drivers? I haven't really dug into the mesa code before, but from c8ed5ac206a7 and 2fe2eb1911f4 it *looks* pretty trivial

08:26 <remexre> (and if I'm not missing something and it is pretty trivial, is this something that'd be a good first commit?)

08:27 shashanks has joined #dri-devel

08:28 shashank_s has joined #dri-devel

08:31 shashank_sharma has quit [Ping timeout: 480 seconds]

08:35 shashanks has quit [Ping timeout: 480 seconds]

08:37 <emersion> yes it should be pretty trivial to wire up

08:37 <emersion> yup, would be a good first contribution

08:37 <emersion> feel free to CC me for a review

08:38 <remexre> okay, thanks!

08:38 mvlad has joined #dri-devel

08:40 seanpaul has quit [Ping timeout: 480 seconds]

08:41 seanpaul has joined #dri-devel

08:44 rasterman has joined #dri-devel

08:54 i-garrison has quit [Read error: Connection reset by peer]

08:54 i-garrison has joined #dri-devel

09:10 Company has joined #dri-devel

09:18 rkanwal has joined #dri-devel

09:30 itoral has quit [Remote host closed the connection]

09:31 itoral has joined #dri-devel

09:35 itoral has quit [Remote host closed the connection]

09:35 itoral has joined #dri-devel

09:39 shashank_s has quit [Ping timeout: 480 seconds]

09:40 mclasen has joined #dri-devel

09:52 itoral has quit [Remote host closed the connection]

09:53 itoral has joined #dri-devel

10:02 Kayden has quit [Read error: Connection reset by peer]

10:02 Kayden has joined #dri-devel

10:12 itoral has quit [Remote host closed the connection]

10:16 flacks has quit [Quit: Quitter]

10:19 flacks has joined #dri-devel

10:24 devilhorns has joined #dri-devel

10:37 libv has quit [Read error: Connection reset by peer]

10:59 libv has joined #dri-devel

11:07 anujp has quit [Ping timeout: 480 seconds]

11:35 mripard[m] has quit []

11:35 mripard[m] has joined #dri-devel

11:37 shashanks has joined #dri-devel

11:40 mripard has joined #dri-devel

11:41 mripard has quit []

11:42 mripard has joined #dri-devel

11:44 mripard has quit []

11:47 mripard[m] has quit []

11:54 ramaling has quit []

11:54 ramaling has joined #dri-devel

12:04 mripard has joined #dri-devel

12:04 <marex> daniels: do you think I can already pick https://patchwork.freedesktop.org/patch/480551/ and https://patchwork.freedesktop.org/patch/480471/ ?

12:07 <marex> lynxeye: ^

12:08 <daniels> marex: I'm not involved at all with bridge stuff; pinchartl is the one who mostly deals with that

12:15 shashank_sharma has joined #dri-devel

12:16 shashank_s has joined #dri-devel

12:18 shashanks has quit [Ping timeout: 480 seconds]

12:24 shashank_sharma has quit [Ping timeout: 480 seconds]

12:25 devilhorns has quit [Remote host closed the connection]

12:25 devilhorns has joined #dri-devel

12:40 MajorBiscuit has joined #dri-devel

12:43 Major_Biscuit has quit [Ping timeout: 480 seconds]

12:47 Major_Biscuit has joined #dri-devel

12:49 MajorBiscuit has quit [Ping timeout: 480 seconds]

12:54 tango_ has quit [Ping timeout: 480 seconds]

12:58 tango_ has joined #dri-devel

13:00 sdutt has joined #dri-devel

13:04 <zmike> MrCooper: looking into that log error, it seems like I'd need to programmatically #define the string name in meson

13:04 <zmike> but my meson-fu only extends to static #defines

13:06 <MrCooper> not sure offhand how best to deal with that, sorry

13:06 <zmike> yeah I'm writing it into the ticket

13:07 <zmike> maybe dcbaker can rescue us

13:09 Danct12 has quit [Remote host closed the connection]

13:10 Danct12 has joined #dri-devel

13:55 pcercuei has joined #dri-devel

13:56 jewins has joined #dri-devel

14:17 apinheiro has quit [Ping timeout: 480 seconds]

14:18 mbrost has joined #dri-devel

14:21 maxzor has quit [Ping timeout: 480 seconds]

14:27 Major_Biscuit has quit [Ping timeout: 480 seconds]

14:28 Net147 has quit [Ping timeout: 480 seconds]

14:29 ramaling has quit []

14:29 ramaling has joined #dri-devel

14:31 ella-0 has joined #dri-devel

14:31 nchery has quit [Read error: Connection reset by peer]

14:35 ella-0_ has quit [Remote host closed the connection]

14:35 Net147 has joined #dri-devel

14:37 <jekstrand> airlied: Have you looked at hooking up ACO?

14:41 maxzor has joined #dri-devel

14:44 ramaling has quit []

14:45 ramaling has joined #dri-devel

14:48 pcercuei_ has joined #dri-devel

14:48 pcercuei has quit [Read error: Connection reset by peer]

14:49 Major_Biscuit has joined #dri-devel

14:51 pnowack has quit [Quit: pnowack]

14:53 <jekstrand> Kayden: Were you going to do any more review on !15829? You left some detailed comments and I think I addressed them.

14:55 <karolherbst> jekstrand: what's the deal with the binding table in iris? Does it contain like everything? Because I have tests crashing on that having too many entries for the tests using 32 images

14:56 khfeng has quit [Ping timeout: 480 seconds]

14:57 <karolherbst> mhh iris also sets sampler_views == shader_images == texture_samplers :/

14:57 <karolherbst> uhm.. max_*

15:00 <jekstrand> karolherbst: Yeah, we're going to need to do something about iris' binding tables. I can work on that today, if you'd like.

15:01 <jekstrand> It'll mean some driver surgery but I don't think it'll be that bad.

15:01 <jekstrand> Basically, I need to separate samplers and textures and make it look at variables or something instead of relying on textures_used.

15:01 <karolherbst> mhh, I think it might make more sense to fix that stuff for llvmpipe first (because we have to bump mesa core limits anyway)

15:01 <jekstrand> But I'm not sure how much gallium deletes before we get into iris.

15:02 <jekstrand> But iris needs to be the premier CL driver. :P

15:02 <karolherbst> :D

15:02 <karolherbst> sure

15:02 <karolherbst> but llvmpipe has that stuff split already

15:02 <karolherbst> so I'd just make mesa able to support 128 textures (or samplers?) and then we can fix iris

15:02 mdroper has joined #dri-devel

15:03 <karolherbst> there are two issues which cause way more fails/crashes anyway: use_host_ptr on arrays + 3D and use_host_ptr failing on weirdly aligned ptrs for buffers and 1d images

15:03 <karolherbst> like if the alignment is 0x1

15:03 <karolherbst> those two things would probably kill ~50% of the fails I still have

15:07 <jekstrand> karolherbst: Yeah, I'm going to try and look at those today too.

15:07 <karolherbst> cool

15:07 <jekstrand> karolherbst: RE: weirdly aligned pointers, Intel HW can't handle writes to images which aren't pixel-aligned. Texturing should work, though.

15:08 <karolherbst> mhh, okay, let me see what the pointer looks like for the 1d image cases

15:08 Major_Biscuit has quit [Ping timeout: 480 seconds]

15:08 ramaling has quit [Quit: leaving]

15:09 <karolherbst> ohh wait, no, I mistook that one issue for image_buffers not really being supported atm

15:09 <karolherbst> so I think use_ptr on 1D images is fine

15:09 yogesh_mohan has quit [Ping timeout: 480 seconds]

15:10 <karolherbst> but iris seems to suffer from the same issues llvmpipe does for texture clamping :(

15:10 <jekstrand> what issue would that be?

15:11 <karolherbst> jekstrand: for llvmpipe I had to do this e.g.: https://gitlab.freedesktop.org/karolherbst/mesa/-/commit/933d1dc735d898df53e088946b8b660560f8a3cd

15:11 <karolherbst> some minor precision problems

15:11 <karolherbst> -1 gets turned into something very close to -1 instead and things fall apart

15:11 <karolherbst> stuff like this

15:12 <karolherbst> iris might suffer from something different, but similar

15:13 <jekstrand> karolherbst: Ugh...

15:13 <jekstrand> I hope not

15:13 <jekstrand> That's not something we can just fix

15:14 <karolherbst> https://gist.github.com/karolherbst/49ebb0d4c4a261d8ed8291c4789009b2

15:15 <karolherbst> could be something different here though

15:15 <karolherbst> but..

15:15 <karolherbst> you see this very very small y coord?

15:16 <karolherbst> it's OOB, but it still reads at 46,0 instead

15:16 <karolherbst> CLAMP is PIPE_TEX_WRAP_CLAMP_TO_BORDER

15:17 <karolherbst> mhh, the second fail seems like random precision stuff

15:17 <karolherbst> reading at 0,2 instead of 0,3

15:18 * jekstrand kicks off a full wimpy CTS run

15:18 <jekstrand> I really need a dedicated test machine. Oh, well.

15:18 Akari has quit [Ping timeout: 480 seconds]

15:18 <karolherbst> you don't? :(

15:19 <jekstrand> No. Intel took all my hardware back when I quit. I just have my collabora-issued XPS 13.

15:19 <karolherbst> on my desktop it takes like 7 minutes on iris, so if you want me to run stuff :P

15:19 <jekstrand> My fancy new desktop doesn't have an Intel GPU.

15:19 <daniels> run it on Panfrost

15:19 <jekstrand> daniels: :P

15:19 <jekstrand> daniels: I do intend to get panfrost working. I think that's next after iris. I'll let airlied deal with radeonsi for now.

15:20 * karolherbst buries itself into the chaos compiler features_macro is

15:21 anujp has joined #dri-devel

15:22 ramaling has joined #dri-devel

15:22 Akari has joined #dri-devel

15:25 maxzor has quit [Ping timeout: 480 seconds]

15:26 <bnieuwenhuizen> jekstrand: random Q on compute, do y'all have plans for how to deal with large compute kernels

15:26 <bnieuwenhuizen> wrt not inlining everything and such

15:27 <karolherbst> I think we have ideas

15:27 <karolherbst> not sure if we can call them plans

15:29 <jekstrand> bnieuwenhuizen: We plan to make a plan. :P

15:32 <karolherbst> ehh

15:32 <karolherbst> __opencl_c_program_scope_global_variables ...

15:32 <karolherbst> Feature status: API - not supported, compiler - supported

15:32 <karolherbst> __opencl_c_program_scope_global_variables - failed

15:32 <karolherbst> how can I tell clang to just not do that? :D

15:32 Akari has quit [Ping timeout: 480 seconds]

15:32 <karolherbst> ohh

15:33 <karolherbst> -__opencl_c_program_scope_global_variables ?

15:33 <karolherbst> okay...

15:33 <karolherbst> it's a llvm-13+ feature

15:33 <karolherbst> guess we require llvm-13 then

15:34 Akari has joined #dri-devel

15:39 <jekstrand> karolherbst: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6311

15:39 <jekstrand> karolherbst: I'm going to post the list of failing tests once this CTS run finishes. It's 76% done

15:39 <karolherbst> I can do that as well :p

15:39 <karolherbst> it's already done here

15:39 <jekstrand> Oh, ok then.

15:39 <jekstrand> I've also got a list of open iris MRs on there.

15:40 <karolherbst> yeah. I usually just pull in those commits into my branch

15:40 <karolherbst> list added

15:40 shashank_s has quit [Ping timeout: 480 seconds]

15:40 <jekstrand> Thanks!

15:40 <karolherbst> given that's a full CL 3.0 run the list isn't all that huge anymore :)

15:41 <karolherbst> printf is annoying to fix

15:41 <jekstrand> :D

15:41 <karolherbst> so is this linker bug

15:41 <karolherbst> ignoring those two things, it's really not much left anymore

15:41 <jekstrand> One of these days, I need to make nir_lower_conversions stop lowering all the things. Intel can do a bunch of it in HW.

15:41 <karolherbst> yeah, nvidia can do like everything in hardware as long as you don't step over 32 bit

15:41 nchery has joined #dri-devel

15:41 <karolherbst> if you step over 32 you need to do two conversions

15:42 <jekstrand> Right

15:42 <jekstrand> Intel has a similar issue

15:42 <karolherbst> so 64 -> 8 would be 64 -> 32 -> 8

15:42 <karolherbst> okay

15:42 <jekstrand> We can't do 64 -> 8

15:42 <jekstrand> We might be able to do 64 -> 16 but I'm not sure

15:42 <karolherbst> yeah, not sure about that either

15:42 <jekstrand> Well, not on TGL since there is no 64... :sob:

15:42 <karolherbst> maybe we got the same IP blocks :P
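A minimal sketch of the two-step narrowing described above (64 -> 32 -> 8), assuming plain unsigned integer truncation with no saturation. The helper names are hypothetical and this is not NIR or driver code.

```python
def truncate(value: int, bits: int) -> int:
    """Keep only the low `bits` bits, like an unsigned narrowing conversion."""
    return value & ((1 << bits) - 1)


def narrow_64_to_8_direct(value: int) -> int:
    # One conversion straight from 64 bits to 8 bits.
    return truncate(truncate(value, 64), 8)


def narrow_64_to_8_two_step(value: int) -> int:
    # 64 -> 32 -> 8, for hardware that cannot cross the 32-bit boundary
    # and reach 8 bits in a single conversion.
    return truncate(truncate(truncate(value, 64), 32), 8)


samples = [0, 0xFF, 0x1234_5678_9ABC_DEF0, 2**64 - 1]
for v in samples:
    assert narrow_64_to_8_direct(v) == narrow_64_to_8_two_step(v)
```

For integer truncation the intermediate step is harmless. Floating-point conversions can round twice when split this way, so the same equivalence is not assumed there.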

15:42 <karolherbst> another thing is fma

15:43 <karolherbst> but...

15:43 <karolherbst> fma is a mess in mesa, so

15:43 <jekstrand> Yeah...

15:43 <jekstrand> That's on the list of things we need to fix for real one of these days.

15:43 <karolherbst> most comes from the assumption that glsl would know fma, but it doesn't

15:44 <jekstrand> Yeah, GLSL is the mess. CL is pretty sane, IMO.

15:44 <karolherbst> yep

15:44 <karolherbst> having two ops is the only sane path here

15:44 <jekstrand> I think so

15:44 <karolherbst> everything else is just hacks over hacks

15:45 <jekstrand> There may also be some need somewhere for something like "exact" but "ieee-correct". We might be able to do it with that, maybe.

15:45 Duke`` has joined #dri-devel

15:45 <karolherbst> mhh

15:45 <karolherbst> yeah exact doesn't mean ieee correct

15:46 <karolherbst> again, glsl doesn't know fma, so an "exact" thing just means to stay consistent

15:46 <karolherbst> if your hw can't do fma, then exact is meaningless

15:46 <jekstrand> Yup

15:46 <jekstrand> Not quite meaningless.

15:47 <karolherbst> why not?

15:47 <karolherbst> sure, for other ops

15:47 <jekstrand> Exact means "don't change the value when you optimize" That has a meaning regardless of whether fma is high-precision or not.

15:47 <jekstrand> But, yeah, I think we want separate fma and fmad.

15:47 <jekstrand> *ffma

15:47 <karolherbst> ohh sure, but you can split/merge it all you want

15:47 <jekstrand> yup
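The reason two separate ops matter is that a fused multiply-add rounds once, while a multiply followed by an add rounds twice. A minimal sketch of that difference for finite doubles, emulating the single rounding with exact rational arithmetic; the function names are hypothetical and this does not model NIR's opcodes or its "exact" flag.

```python
from fractions import Fraction


def unfused_mad(a: float, b: float, c: float) -> float:
    # Rounds after the multiply and again after the add.
    return a * b + c


def fused_fma(a: float, b: float, c: float) -> float:
    # Computes a*b+c exactly, then rounds once to the nearest double.
    # Only valid for finite inputs (Fraction rejects inf and NaN).
    return float(Fraction(a) * Fraction(b) + Fraction(c))


a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

# The exact product is 1 - 2**-60, which rounds to 1.0 as a double,
# so the unfused form loses the small term entirely.
assert unfused_mad(a, b, c) == 0.0
assert fused_fma(a, b, c) == -(2.0**-60)
```

Splitting a fused op or merging an unfused pair changes the result in cases like this, which is why an optimizer has to know which of the two the source actually asked for.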

15:48 <dj-death> oh crap

15:48 <dj-death> a nir_opt_if() bug :(

15:48 <karolherbst> the best ones

15:49 <jekstrand> dj-death: Ouch

15:49 <dj-death> where is my hack that prints out what nir line modified an instruction

15:49 Net147 has quit [Ping timeout: 480 seconds]

15:50 <karolherbst> mhh I think I have to modify clc :(

15:51 <jekstrand> karolherbst: Mind fixing some other stuff while you're at it?

15:51 Net147 has joined #dri-devel

15:51 <karolherbst> jenatali: if somebody at MS has some internal CL 3.0 support patches to wire up OpenCLExtensionsAsWritten, now would be the best time to show them :p

15:52 <karolherbst> jekstrand: what exactly?

15:52 <jenatali> Hm?

15:52 <karolherbst> jenatali: yeah so.. we have to disable/enable opencl c features via c->getTargetOpts().OpenCLExtensionsAsWritten

15:53 <jekstrand> karolherbst: generic variants of arithmetic built-ins and wait_group_events

15:53 <karolherbst> ohh

15:53 <karolherbst> I meant src/compiler/clc

15:53 <jekstrand> Oh

15:53 <karolherbst> not libclc

15:53 * jekstrand had his hopes up

15:53 <karolherbst> :D

15:53 ybogdano has joined #dri-devel

15:53 <karolherbst> jenatali: I guess we can just disable everything?

15:54 <swick> ajax: I heard you use silverblue. Any tips and tricks for kernel development? I'm not really happy with rebuilding a new rpm for every change.

15:54 <karolherbst> jenatali: clover is doing this, and I plan to do just the same until somebody cares about those https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/gallium/frontends/clover/llvm/invocation.cpp#L255-264

15:55 <jenatali> Yeah, my thinking was to disable everything in the clc code that the LLVM/SPIR-V/NIR compiler stack doesn't support, and then leave it up to the frontend to disable stuff that the compiler can support but the frontend doesn't

15:55 <karolherbst> at some point we have to pass in lists with supported things, but atm I don't support anything anyway
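The layering described above can be sketched as a set intersection: a feature stays enabled only if both the compiler stack and the frontend support it. This is a rough illustration, not the clc code. The "+name"/"-name" string form is an assumption based on the `-__opencl_c_program_scope_global_variables` guess earlier in the log, and the two support sets below are hypothetical.

```python
def feature_flags(all_features, compiler_supported, frontend_supported):
    """Return one '+name' or '-name' string per known feature."""
    enabled = set(compiler_supported) & set(frontend_supported)
    return [("+" if name in enabled else "-") + name
            for name in sorted(all_features)]


# OpenCL C 3.0 feature macro names; which ones each layer supports is made up.
all_features = [
    "__opencl_c_fp64",
    "__opencl_c_images",
    "__opencl_c_program_scope_global_variables",
]
compiler_supported = set(all_features)       # hypothetical: stack handles all
frontend_supported = {"__opencl_c_images"}   # hypothetical: frontend handles one

flags = feature_flags(all_features, compiler_supported, frontend_supported)
for flag in flags:
    print(flag)
```

Passing an empty frontend set reproduces the "disable everything" starting point mentioned in the discussion.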

15:57 shashanks has joined #dri-devel

15:57 <karolherbst> I also need to talk with airlied about https://gitlab.freedesktop.org/airlied/mesa/-/commit/9ba15005d74a8d71482cf8f408d6ba8e26368880

15:59 * jekstrand thinks he may have found a bug in util/vma.h :-(

16:00 aravind has quit []

16:01 <dj-death> jekstrand: ffs :)

16:02 <jekstrand> dj-death: I'm still not 100% sure. It may be an iris bug

16:03 <dj-death> yeah my nir_opt_if() bug is "if ptr == 0" turned into "if false"

16:03 <dj-death> I'm pretty sure this is wrong

16:03 <dj-death> especially when ptr was loaded from an ssbo

16:06 <karolherbst> oh no :(

16:06 <karolherbst> dj-death: we did change something in this regard.. let me find it

16:07 <karolherbst> ohh it wasn't merged yet?

16:07 <karolherbst> nvm then

16:07 paulk has quit [Ping timeout: 480 seconds]

16:08 <jekstrand> dj-death: oof

16:10 <karolherbst> heck.. nobody tests this llvm stuff, do they?

16:15 <jekstrand> Ok, iris isn't returning VA to the wrong vma allocator

16:15 <jekstrand> Maybe we really are leaking BOs somewhere?

16:15 <karolherbst> jekstrand: what's the issue you are seeing?

16:15 <jekstrand> That seems more likely than iris not having VA space

16:15 <jekstrand> karolherbst: it's this long_math test

16:16 <jekstrand> It runs out of shader memory

16:16 <jekstrand> But it looks like it's freeing BOs.

16:16 <karolherbst> yeah.. well.. that works for me

16:16 <jekstrand> on iris?

16:16 <karolherbst> yeah

16:16 <karolherbst> uhm..

16:16 <karolherbst> wimpy or real?

16:16 <jekstrand> real

16:17 <karolherbst> okay, let me run it for real

16:18 <karolherbst> 2000% CPU time although that runs on the GPU :(

16:19 <karolherbst> jekstrand: yeah so uhm.. how long do I have to wait?

16:19 <jekstrand> a little while

16:19 <jekstrand> Not terribly long

16:19 <karolherbst> llvm probably races here or something...

16:19 <jekstrand> karolherbst: I've got a patch for that

16:20 <karolherbst> now it's doing stuff with CL_TEST_SINGLE_THREADED=1 :)

16:20 <jekstrand> "compiler/clc: Only initialize LLVM once"

16:20 <jekstrand> in my rusticl/wip branch

16:20 <karolherbst> ohh

16:20 <karolherbst> let me take that one then

16:20 * jekstrand should post that one

16:21 <karolherbst> "ERROR: clGetDeviceInfo CL_DEVICE_NAME failed! (CL_SUCCESS from /data/git/OpenCL-CTS/test_common/harness/kernelHelpers.cpp:892)" ehhh

16:21 <karolherbst> yeah well

16:21 <karolherbst> I had to fix that for nouveau as well :P

16:22 <karolherbst> jekstrand: it looks like stuff is just racy as hell everywhere

16:23 <jekstrand> karolherbst: Isn't rust supposed to fix that by design? :P

16:23 <karolherbst> sure, but only if I'd use Send everywhere

16:23 <karolherbst> which.. I don't :(

16:23 <karolherbst> shame on me

16:23 <karolherbst> so I move things between threads without them being strictly Send, so...

16:23 <karolherbst> I have ideas on how to fix it, but pointers...

16:25 <jekstrand> Bah... I have at most 5447 BOs allocated. Each of them would have to be 1M or larger in order to run out of VA

16:25 <karolherbst> well.. it's CL

16:25 <jekstrand> But they're all shaders

16:25 <karolherbst> races inside ../src/gallium/drivers/iris/iris_program_cache.c :)

16:25 <jekstrand> 1M is a really big shader

16:26 <karolherbst> yeah.. not quite sure what races yet exactly, but there seem to be things

16:26 <karolherbst> #1 bo_close ../src/gallium/drivers/iris/iris_bufmgr.c:1347 (libRusticlOpenCL.so.1+0xc3ce0a)

16:26 <karolherbst> so if you race inside bo_close...

16:26 <karolherbst> I guess bad things can happen indeed

16:27 <karolherbst> emit_state as well

16:27 <karolherbst> yeah.. I will try to tackle multithreading stuff over the weekend, this might get a bit annoying as I'd have to start to wrap more of our stuff

16:28 <karolherbst> I turned on Send for Event task once and got ~300 compiler errors :(

16:29 <jekstrand> Ok, according to the little atomic I added, we really are burning 4GB

16:29 <jekstrand> Oof

16:29 <jekstrand> Seems a bit insane

16:29 <karolherbst> :(

16:29 <jekstrand> That's a lot of quite big shaders
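A quick check of the arithmetic behind "each of them would have to be 1M or larger", assuming the "4GB" figure means 4 GiB and that all of it is spread evenly over the 5447 BOs quoted above. Both are simplifying assumptions, not measurements.

```python
bo_count = 5447                # peak number of live BOs quoted in the log
budget_bytes = 4 * 1024**3     # assumption: "4GB" read as 4 GiB

# Average size each BO would need for the whole budget to be consumed.
average_bytes = budget_bytes / bo_count
average_kib = average_bytes / 1024

print(f"average BO size: {average_kib:.0f} KiB")
```

Under these assumptions the average comes out somewhat below 1 MiB per BO, which is still far larger than a simple kernel's shader binary would be expected to need.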

16:29 <karolherbst> well

16:29 <karolherbst> it's clc stuff

16:30 <karolherbst> although long math shouldn't cause such huge shaders

16:30 <jekstrand> Have you looked at those kernels? You can't get much simpler.

16:30 <karolherbst> yeah...

16:30 <karolherbst> I don't think we leak though

16:30 <karolherbst> I see my mem consumption jump around a little

16:31 <karolherbst> but it never exceeds a certain value

16:31 <jekstrand> Oh, I'm now very sure something's leaking. :)

16:31 <jekstrand> Something around kernels, specifically. I'm just not sure what yet.

16:31 <karolherbst> I doubt it

16:31 <karolherbst> the memory gets freed

16:31 gouchi has joined #dri-devel

16:32 <karolherbst> it just uses a lot of memory here

16:32 <karolherbst> but it does run

16:33 <karolherbst> jekstrand: it probably just has a huge buffer for the results

16:33 <jekstrand> That could be

16:33 <jekstrand> No, that's not it

16:33 <jekstrand> It's burning 4GB of shader memory

16:33 <karolherbst> I see the memory consumption drop by a lot after a subtest finished

16:33 <karolherbst> mhh

16:35 * jekstrand wonders if we're leaking contexts

16:35 jessica_24 has joined #dri-devel

16:36 mszyprow has quit [Ping timeout: 480 seconds]

16:36 <jekstrand> Or if it's creating a context per test submission, effectively.

16:36 <karolherbst> I see multiple annoying races though

16:36 <karolherbst> soo

16:36 <karolherbst> you know you can't trust reporting, right?

16:36 paulk has joined #dri-devel

16:36 <karolherbst> like if the runtime races on the "allocated" value then that can just lead to broken values down the line

16:37 <karolherbst> vma.c also races.. uhh

16:37 <dcbaker> zmike: which issue?

16:37 <jekstrand> vma.c doesn't race. It's protected by a mutex

16:37 <karolherbst> yeah well

16:37 <karolherbst> it does

16:37 <jekstrand> according to what?

16:37 * dcbaker is excited for problems that are not dead firewalls

16:38 <karolherbst> tsan at least reports a few things

16:38 <jekstrand> Looks like the test creates 2037 contexts before dying

16:39 <karolherbst> mhh

16:39 <karolherbst> that would be annoying

16:39 <zmike> dcbaker: daniels beat you to it

16:39 <karolherbst> it could be that we reference the context somewhere we shouldn't? mhh

16:39 <zmike> get back to that dead firewall

16:39 <karolherbst> jekstrand: context as in pipe_context?

16:40 <jekstrand> karolherbst: I'm checking now to see if it cleans them up

16:40 <jekstrand> karolherbst: Yes, pipe_context

16:40 <jekstrand> Hrm... It says it destroys 2027 of them

16:41 <karolherbst> sounds sane

16:41 <karolherbst> running with tsan makes stuff not crash here, so I'd assume we have some ugly races here and there

16:41 <jekstrand> So I should only have 10 instances of my 6MB scratch buffer live at any given time

16:42 <jekstrand> Uh...Yeah, iris is leaking scratch surfaces for some reason

16:42 <jekstrand> awesome.

16:42 <karolherbst> jekstrand: btw, is iris_bufmgr.c protected?

16:43 <karolherbst> although looks like it

16:43 <karolherbst> weird

16:43 <karolherbst> the tooling around figuring out races is really terribly bad anyway :(

16:44 <karolherbst> valgrind's solution to not knowing that atomics don't race is "just annotate your code, duh"

16:45 alyssa has joined #dri-devel

16:49 <dcbaker> zmike: I finally got it working. Spent from 7:30 yesterday till about 22:30 but finally got it working again... mostly

16:49 <dcbaker> still haven't figured out if it was the CPU or the motherboard that died

16:49 <zmike> dcbaker: yikes, that sounds awful

16:49 <jekstrand> karolherbst:

16:49 <jekstrand> long_math passed

16:49 <jekstrand> PASSED sub-test.

16:49 <jekstrand> PASSED test.

16:49 <dcbaker> yeah, kernel panic within 5 minutes of booting

16:49 <karolherbst> yay

16:50 <dcbaker> memtest works fine for hours

16:50 <karolherbst> jekstrand: what did you change?

16:51 <jekstrand> karolherbst: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15897

16:51 <karolherbst> oops

16:52 <jekstrand> I can now delete all my printf's and go back to debugging interesting things.

16:52 <karolherbst> :D

16:52 <karolherbst> I'll let airlied handle compiler features_macro

16:52 <karolherbst> this stuff is just broken in llvm

16:53 <karolherbst> and I know that airlied already knows how to fix that

16:53 <karolherbst> using opencl-h-base.h makes compiling _fast_ though

16:54 <alyssa> raise your hand if you want to ignore real work and add an SSA RA for an Apple GPU compiler :p

16:54 * alyssa raises hand

16:55 <karolherbst> fun stuff is the only real work and I refuse to accept any other pov here

16:55 <alyssa> :D

16:55 <zmike> totally me but I'm already booked for another 6 months solid on this deadlift platform

16:55 <alyssa> zmike: switch to 2012 and then you'll have opengl drivers and won't need zink anymore

16:56 <karolherbst> a gl driver is probably more work than zink in 2012

16:56 <alyssa> yeah but somebody else did the work :p

16:56 <zmike> alyssa: but then I'll just be in a squat rack on a different continent

16:57 <alyssa> let's see, I wonder if I have a way to test the M1 Linux GPU driver on my M1 Linux

16:57 <alyssa> maybe drm-shim? no, wait, that won't work there's no drm interface implemented yet

16:57 * alyssa sweats

16:57 <karolherbst> alyssa: I guess you need to write m1kms then

16:57 <karolherbst> ehh

16:57 <karolherbst> m1drm

16:58 <alyssa> already wrote m1kms that's how I have nice 4K software rendering :-p

16:58 <karolherbst> :D

16:58 <karolherbst> I guess on the m1 software rendering isn't all too slow

16:59 rpigott has quit [Ping timeout: 480 seconds]

16:59 <alyssa> AGX_FAKE_DEVICE=1, that was the one

17:02 <alyssa> aha, magic combination:

17:02 <alyssa> gallium-drivers=panfrost,swrast,asahi

17:02 <alyssa> tools=panfrost,drm-shim

17:02 <alyssa> AGX_FAKE_DEVICE=1 LIBGL_DRIVERS_PATH=~/lib/dri/ LD_PRELOAD=~/mesa/build/src/panfrost/drm-shim/libpanfrost_noop_drm_shim.so

17:03 <alyssa> that gets a fake (mali) render node and then loads the m1 driver as a fake software rast

17:03 <alyssa> which tricks shader-db into using the agx compiler

17:03 <alyssa> despite no render nodes whatsoever

17:03 <alyssa> wait, no, now it's running panfrost.

17:03 <alyssa> gah

17:04 TJ_Mercier has quit [Remote host closed the connection]

17:07 * alyssa supposes she should start stubbing a DRM interface for Asahi, now that work is happening on the kernel

17:07 frieder has quit [Remote host closed the connection]

17:08 jkrzyszt has quit [Ping timeout: 480 seconds]

17:18 * alyssa copies piles of code from panfrost

17:21 iive has joined #dri-devel

17:21 pendingchaos_ has joined #dri-devel

17:24 pendingchaos has quit [Ping timeout: 480 seconds]

17:26 Peste_Bubonica has joined #dri-devel

17:28 devilhorns has quit []

17:34 rpigott has joined #dri-devel

17:34 mbrost has quit []

17:36 <Kayden> jekstrand: part of what I was trying to get at with my comment on !15829 is that if OpenCL is using uclz or ifind_msb on non-32-bit datatypes...it may be broken. at least it would be on 64-bit types

17:36 <Kayden> jekstrand: because the nir constant expression handling hardcodes 31
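
A minimal sketch of the bit-size problem Kayden describes, written in Python rather than in NIR's own C. The function names here are hypothetical and only model the idea: a count-leading-zeros fold that starts scanning at bit 31 cannot see bits 32..63 of a 64-bit value, while a version parameterized on the bit size can.

```python
def uclz(value: int, bit_size: int) -> int:
    """Count leading zeros of an unsigned value that is bit_size bits wide."""
    value &= (1 << bit_size) - 1          # keep only the bits that exist
    return bit_size - value.bit_length()  # bit_length() is 0 for value == 0


def uclz_hardcoded_32(value: int) -> int:
    """Hypothetical fold that always scans down from bit 31 (assumed shape of the bug)."""
    for bit in range(31, -1, -1):
        if value & (1 << bit):
            return 31 - bit
    return 32  # no bit found in the low 32 bits
```

For a 64-bit input such as `1 << 40`, the bit-size-aware version reports 23 leading zeros, while the hardcoded one never looks above bit 31 and reports 32.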

17:36 <alyssa> Ruh roh

17:36 <alyssa> Kayden: good chance to fix int8/int16 too ;)

17:37 <Kayden> those might work

17:38 <alyssa> the lack of an in-tree agx disassembler is really starting to bite

17:38 <Kayden> jekstrand: posted R-b for patches 2-3, I'm not planning to review patch 1, feel free to land it

17:39 <alyssa> am I going to take the isaspec plunge? or open code yet another disasm? idk

17:39 <karolherbst> alyssa: I'd use isaspec

17:39 <jekstrand> Kayden: uh... WAT?

17:39 minecrell has quit [Quit: :( ]

17:40 <jekstrand> So it does...

17:40 <jekstrand> Kayden: That's easily fixed

17:41 <dj-death> found the culprit

17:41 <dj-death> opt_if_rewrite_uniform_uses()

17:41 <alyssa> karolherbst: do we have proof of it working for !freedreno yet?

17:41 <jekstrand> Kayden: We also appear to not have int64 lowering for them. :-/

17:41 <jekstrand> idk why the CL tests didn't pick up on that.

17:42 <karolherbst> alyssa: you could be that proof

17:42 <karolherbst> dunno if anybody else is using it, but _if_ I write a new compiler, I'd use them

17:43 <karolherbst> *that

17:43 minecrell has joined #dri-devel

17:44 <alyssa> it's overwhelming

17:46 <alyssa> and agx has some weird encoding details to it

17:49 pendingchaos_ is now known as pendingchaos

17:51 <alyssa> karolherbst: https://rosenzweig.io/0001-HACK-agx-Call-out-to-the-python-disassembler.patch

17:51 <alyssa> do you like it?

17:51 <karolherbst> uhhh

17:51 <karolherbst> :D

17:52 <karolherbst> what's that shm/dump stuff anyway?

17:52 * jekstrand wonders why we have a different clz for OpenCL

17:52 <alyssa> write to a file

17:52 <alyssa> jekstrand: what

17:52 <karolherbst> jekstrand: you don't want to know why

17:52 <Kayden> there did seem to be some redundancy in those opcodes

17:53 <jekstrand> Kayden: Uh oh...

17:53 <jekstrand> karolherbst rather

17:54 <karolherbst> there was a reason but I forgot.. something something precision

17:54 <karolherbst> maybe clz is one of the non broken one, I forgot

17:54 <alyssa> clz.. precision..?

17:55 <karolherbst> yeah.. dunno

17:55 <karolherbst> I think it was the case for others, clz might actually be fine

17:55 <karolherbst> let's see

17:55 <karolherbst> maybe because it was hardcoded to 32 bit and I didn't want to mess with it?

17:56 <karolherbst> ehh

17:56 <karolherbst> I can put the blame on airlied

17:57 <karolherbst> and jekstrand :D

17:57 <jekstrand> What the?!? Switching nir_clz_u to nir_uclz causes SPIR-V parsing to start failing.

17:57 <karolherbst> fun

17:57 <jekstrand> Oh... I need a u2u on it

17:58 mbrost has joined #dri-devel

17:58 <karolherbst> oh wow.. it grew "93 files changed, 14976 insertions(+), 314 deletions(-)"

17:58 <karolherbst> ehh wait

17:59 <karolherbst> I checked in random files

17:59 <karolherbst> oops

17:59 <jekstrand> Oh, bother...

17:59 <jekstrand> Yeah, it's 32-bit-only

18:00 <karolherbst> " 89 files changed, 13138 insertions(+), 313 deletions(-)" still massive

18:00 <karolherbst> who wants to review?

18:00 <alyssa> karolherbst: still smaller than powervr

18:01 <karolherbst> uhhh "BITSET_DECLARE(textures_used, 128);" uhhh

18:01 <karolherbst> I am doing it

18:02 <karolherbst> jekstrand: I guess we could just do a u2u for cl then?

18:02 <jekstrand> Yeah, maybe. Though the lowering for that one is tricky.

18:03 <karolherbst> right...

18:04 <karolherbst> what was our conclusion about textures_used though? airlied said something we probably don't have to bump it, but?

18:04 <jekstrand> I don't know. I've been putting out other fires.

18:04 <karolherbst> smart

18:04 <karolherbst> I'll just reap airlied branches and cherry-pick until it works

18:05 <dj-death> aaaa, looks like the block_index is not preserved on resume shaders

18:05 <dj-death> but why is it not rebuilt?

18:06 <jekstrand> karolherbst: Ok, now that I've sorted out that BO leak, maybe images are next. Where should I start?

18:06 <karolherbst> host ptrs on arrays + 3d?

18:06 <jekstrand> which test hits that?

18:08 <jekstrand> The image_streams test fails seem to be CLAMP_TO_EDGE with normalized coords

18:08 ngcortes has joined #dri-devel

18:08 <karolherbst> ./build/test_conformance/images/kernel_read_write/test_image_streams write 3D CL_MEM_USE_HOST_PTR

18:09 <karolherbst> no idea why textures work though

18:10 <karolherbst> I mean read images

18:12 <jekstrand> karolherbst: I'm not sure how to even implement USE_HOST_PTR for 3D images reliably. Intel has all sorts of restrictions there.

18:12 <karolherbst> :(

18:12 <karolherbst> what's the problem?

18:12 <jekstrand> Maybe not if it's linear?

18:12 * jekstrand looks

18:13 <karolherbst> jekstrand: also.. is 1d/2d array an issue?

18:13 <karolherbst> although... mhh disabling writes to 3d images also suck

18:13 <karolherbst> anyway.. seems like the CTS doesn't use host_ptr for the read image tests at all

18:14 <karolherbst> jekstrand: mem_host_flags mem_host_read_only_image also fails, but I think it's the same issue, just different

18:16 <jekstrand> Yeah, any array is going to be an issue

18:17 <karolherbst> ehh no, mem_host_read_only_image is a different fail

18:17 <karolherbst> that's I think alignment stuff

18:18 <karolherbst> yeah.. something weird is happening there

18:18 <karolherbst> isl_calc_row_pitch fails

18:18 <karolherbst> min_row_pitch_B is 1024, but surf_info->row_pitch_B is 800

18:20 ybogdano has quit [Ping timeout: 480 seconds]

18:21 <karolherbst> "api/test_api get_image1d_info" hits this issue as well

18:23 <jekstrand> karolherbst: I don't see how we can implement USE_HOST_PTR for 3D or array images on iris. We either need to lower to basically a buffer or we need to have a shadow copy that the GPU accesses.

18:24 <karolherbst> okay

18:24 <jekstrand> It's literally impossible to program the hardware with a slice pitch that's not a multiple of 4 rows
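
A small sketch of that constraint, assuming a tightly packed host image where the slice pitch is exactly `height` rows. The function names and the default of 4 rows are taken from the statement above as an illustration, not from the iris or isl source.

```python
def packed_slice_is_programmable(height_rows: int, row_multiple: int = 4) -> bool:
    """True if a tightly packed slice (height_rows rows) meets the row-multiple rule."""
    return height_rows % row_multiple == 0


def padded_height(height_rows: int, row_multiple: int = 4) -> int:
    """Rows per slice the hardware layout would need; the host layout has no such padding."""
    return -(-height_rows // row_multiple) * row_multiple
```

A 6-row slice packed by the application would need to be 8 rows apart for the hardware, which is why the user pointer cannot simply be wrapped in that case.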

18:24 <karolherbst> uhh

18:24 <jekstrand> idk how the Intel CL driver does this

18:24 <karolherbst> yeah, dunno either

18:25 <karolherbst> let me check something...

18:25 TJ_Mercier has joined #dri-devel

18:26 <karolherbst> it's so strange

18:26 <karolherbst> so there is a query to get the required alignment, but only for 2D images created from a buffer object.. it's so wild

18:27 TJ_Mercier has quit []

18:27 tjmercier has joined #dri-devel

18:28 <jekstrand> Yeah

18:28 tjmercier_ has joined #dri-devel

18:28 tjmercier_ has left #dri-devel [#dri-devel]

18:29 <jekstrand> karolherbst: Hrm... I wonder...

18:29 tjmercier has quit []

18:31 <jekstrand> OpenCL implementations are allowed to cache the buffer contents pointed to by host_ptr in device memory. This cached copy can be used when kernels are executed on a device.

18:32 <karolherbst> yeah, so that just means we don't have to immediately reflect changes to the host

18:33 <jekstrand> So I think you're still required to map/unmap around CPU usage, it's just that the client basically gave you the map they want.

18:33 <karolherbst> and contents are only at sync at synchronization points

18:33 <karolherbst> yeah, you have to sync explicitly

18:33 <jekstrand> Well, that's the bit that's not clear

18:33 <jekstrand> When are they supposed to be synchronized.

18:33 <jekstrand> The naive answer would be map/unmap

18:33 <karolherbst> when all maps are dropped

18:33 <karolherbst> it's written in the spec actually

18:34 <jekstrand> Ok

18:34 <karolherbst> or ehh.. when you map

18:34 <karolherbst> "5.5.3. Accessing mapped regions of a memory object"

18:34 <karolherbst> "If a memory object is currently mapped for writing .." and so on

18:34 <karolherbst> those two sections implicitly require this

18:35 <jekstrand> ok

18:35 <karolherbst> okay, so we have to do shadow buffering.. shouldn't be too painful
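
A minimal sketch of the shadow-buffering idea, assuming the synchronization points are map and unmap as discussed above. `ShadowedBuffer` and its methods are hypothetical names for illustration; this is not rusticl code, and a `bytearray` stands in for the GPU resource.

```python
class ShadowedBuffer:
    """Hypothetical shadow copy for a CL_MEM_USE_HOST_PTR memory object."""

    def __init__(self, host_mem: bytearray):
        self.host_mem = host_mem                # memory owned by the application
        self.device_copy = bytearray(host_mem)  # stand-in for the device resource
        self.map_count = 0

    def map(self) -> bytearray:
        # First map: make device-side results visible in the host pointer.
        if self.map_count == 0:
            self.host_mem[:] = self.device_copy
        self.map_count += 1
        return self.host_mem

    def unmap(self) -> None:
        assert self.map_count > 0, "unmap without a matching map"
        self.map_count -= 1
        # Last map dropped: push host-side writes back to the device copy.
        if self.map_count == 0:
            self.device_copy[:] = self.host_mem

    def run_kernel(self, kernel) -> None:
        # Kernels only ever touch the device copy, never the host pointer.
        kernel(self.device_copy)
```

Between a kernel run and the next map, the host pointer still holds stale data, which is the caching behaviour the quoted spec text allows.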

18:35 <karolherbst> I can hack something up

18:35 <jekstrand> I just pushed to my branch again with an iris commit which makes it gently return NULL if it can't create the resource instead of asserting.

18:36 <karolherbst> jekstrand: so what you should change is to just fail from_user if it's not possible to use

18:36 <jekstrand> So you can go ahead and try to create it with a host pointer and then fall back.

18:36 <karolherbst> ahh, okay

18:36 <karolherbst> cool

18:36 <karolherbst> that's what clover is already doing, but in the past we failed for any non page aligned pointer :)

18:36 <jekstrand> OpenCL is starting to get stupid....

18:36 <karolherbst> yes...

18:36 <jekstrand> karolherbst: Sure. We can do a lot better than that, fortunately. :)

18:36 <karolherbst> those are the things which make you "ahh, that's why clover was doing it like that"

18:37 <karolherbst> jekstrand: k, mind looking into that 1d fail though with the non matching pitch?

18:37 <karolherbst> dunno what's up there

18:38 <karolherbst> but might be fixable?

18:38 <karolherbst> "api/test_api get_image1d_info" is probably the easiest to hit it

18:38 <jekstrand> sure

18:42 ybogdano has joined #dri-devel

18:44 <dj-death> does adding a nir_push_if() in a shader preserve the dominance metadata?

18:44 <jekstrand> no

18:44 Haaninjo has joined #dri-devel

18:45 <jekstrand> If you alter control-flow at all, throw away all metadata
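
The rule can be sketched as a toy model in Python. The `Function` class and the metadata names below are hypothetical and only loosely modeled on the idea of per-function metadata validity; they are not the NIR API.

```python
class Function:
    """Toy model: tracks which analyses are currently valid for a function."""

    def __init__(self):
        self.valid = set()

    def require(self, *names):
        """Ensure the analyses are valid; return the ones that had to be recomputed."""
        recomputed = set(names) - self.valid
        self.valid |= recomputed
        return recomputed

    def preserve(self, *names):
        """Called at the end of a pass: everything not listed becomes invalid."""
        self.valid &= set(names)


def pass_that_adds_an_if(func: Function) -> None:
    # ... the pass would insert a new if/else here ...
    # Control flow changed, so nothing is preserved.
    func.preserve()
```

A later pass that calls `require("dominance")` then triggers a rebuild instead of reading stale block indices or dominance information.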

18:46 <karolherbst> random thought: until today I am actually surprised how well the structurizer holds up.

18:49 <jekstrand> Did it break today?

18:49 <karolherbst> nope

18:51 <jekstrand> There were a couple bugs in the original implementation but I got them sorted as part of the ray-tracing work. Those kernels are pretty brutal.

18:52 <jekstrand> karolherbst: Looks like SKL+ requires 1D images to have a stride aligned to 64 pixels. That's why it's failing.
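
The 800 versus 1024 numbers from earlier are consistent with that alignment rule, as this sketch shows. It assumes the failing image was 200 pixels wide at 4 bytes per pixel; the log does not state either value, they are just one combination that reproduces both numbers.

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def min_row_pitch_bytes(width_px: int, bytes_per_px: int, align_px: int = 64) -> int:
    """Smallest row pitch when the stride must be a multiple of align_px pixels."""
    return align_up(width_px, align_px) * bytes_per_px


# Assumed example: 200 px * 4 B = 800 B as packed by the application,
# but 200 px aligns up to 256 px, i.e. 1024 B required.
packed_pitch = 200 * 4
required_pitch = min_row_pitch_bytes(200, 4)
```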

18:52 <karolherbst> ahh, I guess I missed the fun

18:52 <karolherbst> okay

18:52 <karolherbst> makes kind of sense

18:52 <karolherbst> I am sure shadow buffering fixes that as well, but maybe we can be a bit better here

18:52 <jekstrand> So those will hit the same fall-back shadow path

18:53 <karolherbst> then I guess there isn't much left to do actually

18:54 <jekstrand> dj-death: Not sure if you saw it but https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15811 could use your eyes quick. In particular, I have no idea what alignment OA requires.

18:54 <karolherbst> fill_buffer fails due to blorp hitting asserts

18:54 <karolherbst> I think because the fill data is bigger than this 16 bs value? something like that

18:55 <karolherbst> "buffers buffer_fill_float" e.g.
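
For reference, the semantics a generic fill path has to provide can be sketched on the CPU like this. The function is a hypothetical illustration operating on a `bytearray`, not blorp or iris code; it assumes, as OpenCL's fill operation requires, that offset and size are multiples of the pattern size, and the pattern may be wider than 16 bytes (a 16-component float vector is 64 bytes).

```python
def fill_buffer(buf: bytearray, pattern: bytes, offset: int, size: int) -> None:
    """Repeat pattern over buf[offset:offset + size]."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    if offset % len(pattern) or size % len(pattern):
        raise ValueError("offset and size must be multiples of the pattern size")
    if offset < 0 or offset + size > len(buf):
        raise ValueError("fill range is out of bounds")
    # Same-length slice assignment, so the buffer size never changes.
    buf[offset:offset + size] = pattern * (size // len(pattern))
```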

18:55 <jekstrand> Yup

18:55 <karolherbst> but I guess we want an accelerated kernel there anyway

18:55 <jekstrand> I need to implement fill_buffer for realz

18:55 <jekstrand> Or we need the state tracker to do something

18:55 <karolherbst> so two projects for me: 1. accelerated fill_buffer 2. shadow buffers

18:55 <karolherbst> yeah.. something

18:55 <karolherbst> doesn't have to be CL specific

18:56 <jekstrand> I can implement it in blorp or in iris. I just have to figure out how I want to go about it.

18:56 <karolherbst> I still need to find somebody to fix spirv-link :D

18:56 <dj-death> jekstrand: done

18:57 <jekstrand> dj-death: Thanks!

18:58 <karolherbst> jekstrand: ohh.. and we have to think about what to do about those clamp fails

18:58 <jekstrand> karolherbst: Yeah, let me take a peek

18:58 <jekstrand> Those seem super weird

18:59 * karolherbst kicks off another CTS run

19:00 <karolherbst> jekstrand: from what I know, most/all of these fails are in the "precision" area, probably weird rounding in some places or something

19:00 <karolherbst> but dunno what we can actually do about it, if the hardware is not good enough

19:01 <karolherbst> we could probably also cheat and check what the intel stack is doing, but wouldn't surprise me if that falls inside llvm land

19:03 <airlied> jekstrand: aco isn't hooked up to radeonsi at all yet

19:03 <karolherbst> airlied: ahh the person I waited for :D

19:04 <karolherbst> airlied: soo.. opencl-c-base.h + all those feature defines to enable/disable compiler features, how does that stuff work?

19:04 <karolherbst> I noticed that compilation speed skyrocket once I use base.h, but it didn't help me fix the compiler features_macro test

19:04 <airlied> karolherbst: it works badly

19:05 <airlied> using the base + generated stuff probably is useable in llvm14

19:05 <airlied> tbh for CL3.0 I expect there is still cleanup to be done in both methods even now

19:06 <karolherbst> ahh yeah...

19:06 <karolherbst> sad

19:06 <airlied> at least now I have llvm commit rights, so I can speed up landing the fixes

19:07 <airlied> karolherbst, jekstrand : if you can look at 15876 and make sure you are comfortable with it (acks please), I think for radeonsi rusticl will need to handle the last two bits

19:11 <jekstrand> airlied: I'm very confused by the need to unlower GLOBAL_GROUP_SIZE. Does radeon not have workgroup_size and num_workgroups?

19:11 <airlied> jekstrand: the LLVM backend has a fixed API, it doesn't have those

19:11 <airlied> it does have global group size

19:11 <jekstrand> Ugh

19:12 <airlied> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7231/diffs?commit_id=1cbce6b2b3a26992999bf9547ab76a7632ee5c6a just to make you feel worse

19:13 <airlied> and yes that is at least what ROCm was doing 10 months ago, I should recheck if they ever worked out how dumb it was

19:14 <jekstrand> airlied: Ok, so at least it's consistently stupid

19:14 <karolherbst> jekstrand: for details, read up on the clover llvm backend :p

19:14 shashank_sharma has joined #dri-devel

19:15 <jekstrand> I think, with that, we have an almost complete combinatorial set of possible lowerings of global/local_size/index.

19:15 <karolherbst> :D

19:16 <airlied> gotta catch em all

19:17 gawin has joined #dri-devel

19:18 <jekstrand> I kind of want an interface that's just "tell me what system values you have and I'll sort it out"
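
The identities such an interface would lean on are the standard OpenCL ones, sketched here for a single dimension. The helper names are hypothetical, and the sketch assumes uniform work-groups (the global size divides evenly by the local size).

```python
def global_id(group_id: int, local_size: int, local_id: int, global_offset: int = 0) -> int:
    """get_global_id() expressed through group id, local size and local id."""
    return group_id * local_size + local_id + global_offset


def global_size(num_groups: int, local_size: int) -> int:
    """get_global_size() for a backend that only provides the number of groups."""
    return num_groups * local_size


def num_groups(global_size_: int, local_size: int) -> int:
    """get_num_groups() for a backend that only provides the global size."""
    if global_size_ % local_size:
        raise ValueError("sketch assumes uniform work-groups")
    return global_size_ // local_size
```

Which direction gets used depends on which of these values the backend's ABI actually provides, which is the combinatorial mess being complained about.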

19:19 <airlied> jekstrand: yeah I fell over using the compute system values flags yesterday, but since lots of things call it from various places it can't really do negative flags

19:19 <airlied> hence why I stuck it into options

19:20 shashanks has quit [Ping timeout: 480 seconds]

19:29 <airlied> jekstrand: the kernel args backend work is also a horrid thing

19:31 <jekstrand> airlied: I really hate that whole MR.

19:31 <jekstrand> Maybe ACO would be easier. :P

19:32 <airlied> I'd have to trick dschuermann then :-P

19:37 <karolherbst> :D

19:37 <karolherbst> something we could change in rusticl to make it less annoying?

19:38 mbrost has quit [Remote host closed the connection]

19:38 mbrost has joined #dri-devel

19:39 <airlied> karolherbst: nope, the annoying is the LLVM/amdgpu ABI

19:40 <airlied> I think those two things that need changing are just don't lower the kernel args, and lower work_dim support

19:44 <airlied> karolherbst: where is your cts launcher again?

19:45 <karolherbst> airlied: https://gitlab.freedesktop.org/karolherbst/opencl_cts_runner.git

19:51 <airlied> karolherbst: my cts build names all the cts test binaries conformance_test_ instead of test_

19:51 <airlied> not sure when that changed

19:52 <karolherbst> strange

19:52 <karolherbst> airlied: use this instead then: https://github.com/karolherbst/OpenCL-CTS/commits/master

19:52 <karolherbst> I have some fixes to make them return proper error codes anyway

19:57 nchery has quit [Read error: Connection reset by peer]

19:58 nchery has joined #dri-devel

19:58 Duke`` has quit [Ping timeout: 480 seconds]

20:01 * jekstrand drags back up !7939 and takes a look

20:09 <airlied> jekstrand: there's still a lot of work, though hooking up kernel should be simpler

20:09 <airlied> but then the backend isn't kernel ready

20:10 <jekstrand> airlied: It's probably not that far off

20:11 <jekstrand> VK_KHR_buffer_device_address is like half of what you need.

20:11 <airlied> jekstrand: it wasn't that far off a year ago either

20:11 <jekstrand> airlied: I've been meaning to learn a bit about AMD hardware. :)

20:11 <airlied> hence why I've left the amd/nir patch sit around for a year

20:12 <airlied> I'd rather not leave it sit around for another year, if I can get images to work it has value now

20:12 <jekstrand> Yeah

20:12 * airlied has no idea if I can get images to work though, the ABI might defeat me :-P

20:12 <jekstrand> I just hate having to design around AMD's LLVM ABI. :-(

20:12 <jekstrand> I'd rather make ACO work

20:13 <karolherbst> how hard would it be to add another ABI which is essentially a "no ABI"?

20:13 <karolherbst> although I guess there are a few things which needs to be defined :(

20:14 <karolherbst> anyway.. shadow buffers...

20:14 shashank_sharma has quit [Read error: Connection reset by peer]

20:14 <jekstrand> Ugh... need newer libdrm

20:14 shashank_sharma has joined #dri-devel

20:15 <DrNick> need to be able to stack meson devenvs

20:16 <karolherbst> ehh wait.. I just need to keep a pointer with data

20:16 * jekstrand should probably update to new fedora

20:16 <karolherbst> 36?

20:17 <karolherbst> is that already out?

20:17 <karolherbst> jekstrand: but you can also just upgrade one package

20:17 <airlied> pretty sure I've built it for f35 already

20:17 <airlied> you can just use the f35 package on f34 if you are still there

20:17 <karolherbst> sudo dnf install fedora-repos-rawhide -y

20:18 <jekstrand> karolherbst: Yeah, I can. But I'm gonna need to bump to 36 before too long anyway

20:18 <karolherbst> sudo dnf --disablerepo=* --enablerepo=rawhide

20:18 <jekstrand> But maybe not today

20:18 <alyssa> ACO+rusticl, the future is now?

20:18 <jekstrand> I pulled drm from 36

20:18 <karolherbst> k

20:18 <karolherbst> I also have to update at some point

20:18 <karolherbst> alyssa: everything+rusticl you mean

20:18 mvlad has quit [Remote host closed the connection]

20:19 <karolherbst> there are two things we already planned to make it easier for other drivers: 1. drop this input pointer thing 2. clear_buffer emulation

20:19 <karolherbst> ehh and 3. shadow buffers for host_ptrs

20:19 <karolherbst> with that it's really just set_global_bindings + some caps, no?

20:20 * karolherbst hates the set_global_bindings interface though

20:20 <jekstrand> How do I tell meson to nuke its cached versions of what I have on the system and fully reconfigure?

20:21 <alyssa> jekstrand: rm -rf build

20:21 <alyssa> ? :p

20:21 <karolherbst> configure --clearcache?

20:22 <airlied> meson configure --clearcache is the magic

20:23 <jekstrand> Thanks!

20:23 <karolherbst> uhhh

20:23 <karolherbst> now I just noticed how big of a thing this shadow buffer stuff is

20:24 <jekstrand> karolherbst: Yeah, it's going to be a real PITA

20:24 <karolherbst> cl_mems are not just one pipe_resource but a dev -> pipe_resource mapping thing

20:24 <karolherbst> I always forget this

20:24 <karolherbst> so I need to manage that per device
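
A rough sketch of that per-device bookkeeping, combined with the "try the host pointer, then fall back" approach from earlier. `Device`, `MemObject`, and their methods are hypothetical stand-ins, not the rusticl or gallium interfaces; a `bytearray` plays the role of a pipe_resource.

```python
class Device:
    """Hypothetical device that may or may not be able to wrap user memory."""

    def __init__(self, name: str, can_wrap_user_memory: bool):
        self.name = name
        self.can_wrap_user_memory = can_wrap_user_memory

    def resource_from_user(self, host_mem: bytearray):
        """Return a resource aliasing host_mem, or None if the layout is unsupported."""
        return host_mem if self.can_wrap_user_memory else None

    def resource_create(self, size: int) -> bytearray:
        return bytearray(size)


class MemObject:
    """One memory object, one resource per device, plus a record of shadow copies."""

    def __init__(self, host_mem: bytearray, devices):
        self.host_mem = host_mem
        self.resources = {}
        self.shadowed = set()  # devices whose resource is a copy, tracked by us
        for dev in devices:
            res = dev.resource_from_user(host_mem)
            if res is None:
                # Fall back to a device-local copy seeded from the host data.
                res = dev.resource_create(len(host_mem))
                res[:] = host_mem
                self.shadowed.add(dev)
            self.resources[dev] = res
```

Keeping the `shadowed` set on the memory object reflects that the resource itself does not say whether it is backed by user memory, so the frontend has to remember it.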

20:24 <karolherbst> my interfaces don't work like that :D

20:24 <karolherbst> ehh, I have an idea

20:26 <airlied> the horrors of clover come back

20:26 <karolherbst> :D

20:26 <karolherbst> sadly

20:26 <karolherbst> airlied: btw.. I have multiple tests crashing inside llvmpipe shaders

20:27 <karolherbst> in case you want to have some fun

20:27 <jekstrand> Ugh... Glancing at this radeonsi ACO patch, it has RADV data structures in it. :(

20:27 <karolherbst> or are you ignoring lp now and just go with radeonsi:P

20:27 <airlied> jekstrand: yeah the driver/compiler interface is a bit mushy

20:27 <karolherbst> we do cheat a little, but iris has a better passrate than lp and that confuses me

20:28 <airlied> karolherbst: crashing inside shaders usually means you passed the wrong garbage in

20:28 <karolherbst> well..

20:28 <airlied> so it ends up reading from an ssbo that doesn't exist or something

20:28 <karolherbst> yeah... I guess?

20:28 <karolherbst> but how would I debug that

20:28 <karolherbst> it doesn't crash with iris is what confuses me

20:29 <airlied> karolherbst: you should talk to zmike about that confusion

20:29 <zmike> what

20:29 <karolherbst> :D

20:29 <karolherbst> you mean stuff which works with zink but not with the native GL driver?

20:30 <airlied> karolherbst: no things zink crashes on that work with other vulkan drivers

20:30 <karolherbst> ahh

20:30 * karolherbst goes back into this shadow buffer cave

20:30 <zmike> 🤔

20:30 <airlied> just because iris doesn't die horrible, doesn't mean llvmpipe is wrong :-P

20:30 <karolherbst> I am sure llvmpipe is wrong

20:31 <airlied> karolherbst: if it crashes with clover get back to me :-P

20:31 <karolherbst> anyway, I have no idea on how to even debug this, hence me asking

20:31 <karolherbst> airlied: it does

20:31 <karolherbst> otherwise I would have fixed it already

20:31 <airlied> ah send me the test name or file an issue

20:31 <alyssa> airlied: so i'm thinking of adding a new software rasterizer to mesa

20:31 <zmike> no

20:31 <alyssa> airlied: it'll have way fewer features than llvmpipe

20:31 <airlied> you generally debug adding lp_build_print_value everywhere

20:32 <alyssa> orders of magnitude slower, too

20:32 <airlied> alyssa: we have one called softpipe :-P

20:32 <karolherbst> airlied: ./build/test_conformance/math_brute_force/test_bruteforce cosh -w

20:32 <airlied> we already invented the wheel

20:32 <alyssa> no no no this one will be slower than softpipe i promise

20:32 <karolherbst> have fun with printf

20:32 * airlied stays away from trig functions

20:32 <airlied> esp ones with h on the end

20:32 <alyssa> it'll need an extra build dependency though, not sure that's ok

20:32 <airlied> I don't remember maths class teaching about those

20:32 lemonzest has quit [Quit: WeeChat 3.4]

20:32 <karolherbst> guess why I am asking questions on how to debug this stuff :P

20:33 <jekstrand> How do I fetch a commit in gitlab which is clearly in the UI but not in any branch I know of?

20:33 <airlied> alyssa: is this going to be a gfx pipeline in a compute shader?

20:33 <alyssa> hyperbolic trig functions were covered in uni math classes

20:33 <alyssa> airlied: no the dependency is verilator ;p

20:33 <jekstrand> I can make a tag but I don't really want to do that.

20:33 <airlied> jekstrand: add .patch to the end? and download it

20:33 <karolherbst> jekstrand: just fetch on the commit

20:33 apinheiro has joined #dri-devel

20:33 <jekstrand> karolherbst: Oh, I didn't know you could do that. Neat!

20:34 <karolherbst> I didn't know that either

20:34 <karolherbst> I guessed

20:34 <karolherbst> but that sounds like something which git supports :D

20:34 <airlied> karolherbst: okay I got clctsrunner going, I had some ancient cts tree

20:34 <alyssa> .patch is great, how do you think my email gitlab workflow goes :p

20:34 <airlied> karolherbst: but yeah to debug llvmpipe you add printfs to the llvm generated code, there is no other good way

20:35 <karolherbst> but fun that you can actually fetch by commit :)

20:35 <karolherbst> but git has terrible magic

20:35 <airlied> dang it gpu hang

20:35 <karolherbst> jekstrand: btw, MRs also have refs on gitlab in case you didn't know. You can even configure git to auto fetch all Ms

20:36 <karolherbst> *MRs

20:36 * karolherbst guess he knows why his .git dirs are so huge

20:37 <karolherbst> wow.. my kernels .git is like 6GB

20:40 <alyssa> oof

20:46 <DrNick> eh, origin, stable and tip are 5 GB

20:47 <karolherbst> mhh

20:48 lynxeye has quit [Quit: Leaving.]

20:48 <jekstrand> Ugh... Trying to rebase across "radeonsi: switch to 3-spaces style"

20:48 <karolherbst> ...

20:48 <karolherbst> the best commits

20:48 <alyssa> jekstrand: this is what I'm scared of for panfrost

20:49 <karolherbst> never ever change it

20:49 <jekstrand> No, I'm not rebasing across that

20:49 <jekstrand> hrm...

20:49 <alyssa> it is an awful mix of 3- and 8- depending on how old the code is

20:49 <alyssa> jekstrand: there's a clever way to do it, aco had a script iirc

20:50 <karolherbst> just make it weird for everybody and choose the GNU code style

20:51 * alyssa trying to figure out p_split vs p_extract etc

20:52 <DrNick> syntax aware whitespace-agnostic patching

20:52 <alyssa> the way this is handled in ir3 is really weird

20:53 tales_ has joined #dri-devel

20:54 <airlied> jekstrand: wasn't that MR merged? or are there other patches?

20:54 <jekstrand> airlied: There are other patches. I'm attempting a rebase now

20:55 <karolherbst> is there a way on a pipe_resource to know if it's using user_memory or not?

20:56 <karolherbst> I guess not, but..

20:56 <jekstrand> karolherbst: No

20:56 danvet has quit [Ping timeout: 480 seconds]

20:56 <karolherbst> k.. then I need to track that myself

20:58 * alyssa wonders how split of a phi works

21:00 gouchi has quit [Remote host closed the connection]

21:01 <alyssa> oh ACO just does p_extract ok

21:06 shashank_sharma has quit [Ping timeout: 480 seconds]

21:13 ybogdano has quit [Ping timeout: 480 seconds]

21:20 gawin has quit [Ping timeout: 480 seconds]

21:21 mszyprow has joined #dri-devel

21:23 ngcortes has quit [Ping timeout: 480 seconds]

21:24 ahajda has quit [Remote host closed the connection]

21:32 maxzor has joined #dri-devel

21:32 gawin has joined #dri-devel

21:32 <FLHerne> jekstrand: git rebase --ignore-whitespace ?

21:33 <FLHerne> it ignores the whitespace

21:36 ngcortes has joined #dri-devel

21:36 ybogdano has joined #dri-devel

21:46 <jekstrand> airlied: Got it rebased and building: https://gitlab.freedesktop.org/jekstrand/mesa/-/commits/radeonsi/aco

21:47 <jekstrand> Doesn't do anything. It's not capable of using the newly compiled shader. Need to figure that out next.

21:48 <alyssa> ;_D

21:49 <jekstrand> dcbaker: Can you make waffle releases?

21:50 <jekstrand> Or maybe jljusten?

21:50 mszyprow has quit [Ping timeout: 480 seconds]

21:57 icecream95 has joined #dri-devel

21:58 maxzor has quit [Ping timeout: 480 seconds]

21:59 sneil has quit [Remote host closed the connection]

22:00 sneil has joined #dri-devel

22:01 <jljusten> jekstrand: yes, in theory. :) xexaxo has been doing most of the maintenance of waffle for quite some time though.

22:02 <jekstrand> jljusten: I've been bugging him to make a release for a while and nothing happens. :-(

22:02 <jekstrand> jljusten: So I was hoping someone else would maybe make it happen?

22:02 <jekstrand> We've already missed f36 :(

22:03 <jekstrand> I fixed a bug affecting radeonsi 5 months ago and it's not in a release yet. :-(

22:05 <jljusten> jekstrand: hmm, maybe if I trick dcbaker into working on it, then I can get the Mesa 22.1 branch point delayed. :) (still waiting on i915 for dg2...)

22:06 <jekstrand> jljusten: From what I've heard, we probably don't want to wait on i915 DG2. :-(

22:06 <jekstrand> But I suppose you're supposed to know more about that than me. :)

22:06 <alyssa> jekstrand: why is i915 for dg2 a thing

22:07 * jekstrand no longer has to justify Intel's decisions. :P

22:07 <jljusten> jekstrand: it seems there is chance for drm-next to have it merged by the end of April

22:08 <jljusten> alyssa: we need a couple i915 query items to be defined

22:21 * jekstrand is getting very confused by this driver<->compiler "interface"

22:21 <dcbaker> jekstrand: I can't for some reason, I think only Emil and jljusten can

22:34 <jljusten> jekstrand: I think dcbaker will try to work on it

22:34 <jekstrand> jljusten: Works for me. As long as one happens, I don't care who types the "bump the version" commits and does the tagging.

22:36 <dcbaker> jljusten: any problems with me merging: https://gitlab.freedesktop.org/mesa/waffle/-/merge_requests/81

22:36 <karolherbst> ahhhh...

22:38 <jekstrand> dcbaker: I don't see a good reason why not. It's been 5 years since I've heard a peep about NaCL

22:38 <jekstrand> dcbaker: It'd be nice to get an ack from chadv or someone else at Google but I don't see a problem with it.

22:38 <dcbaker> okay, merged

22:38 <dcbaker> that sounded like a second ack

22:39 <robclark> jekstrand, dcbaker: NaCL is still alive and kicking for another couple years

22:39 <robclark> (annoyingly)

22:39 <dcbaker> does it matter for waffle though?

22:40 <jekstrand> robclark: That's very sad

22:40 <dcbaker> and does the code even work anymore?

22:40 <robclark> what it has to do with waffle, I have no idea

22:40 <dcbaker> though I did merge the waffle MR already...

22:40 <karolherbst> jekstrand: "FAILED 42 of 105 sub-tests." -> "FAILED 16 of 105 sub-tests." :3

22:40 <jekstrand> karolherbst: \o/

22:40 <karolherbst> something up with 3D and no idea what

22:40 <jekstrand> karolherbst: Getting shadows working?

22:41 <karolherbst> yeah

22:41 <jekstrand> Oh, I was going to look at CLAMP

22:41 <jekstrand> I should do that

22:41 <dcbaker> jljusten, jekstrand I vote for the keithp strategy, "let's delete it and see if anyone complains"

22:41 <karolherbst> jekstrand: good point

22:42 <karolherbst> with shadow buffers 3D goes from 21 of 21 fails down to just 15

22:42 <karolherbst> ohh.. resource creation fails

22:42 <karolherbst> nvm then

22:42 <karolherbst> something is still not working alright, but not sure what

22:42 <jekstrand> karolherbst: Any easy way to run just a clamp test?

22:42 <karolherbst> specify the options

22:43 <karolherbst> you can pass in format + target and stuff

22:43 <karolherbst> just find a combination which then just runs one thing

22:43 <karolherbst> read int CL_R ... something or so

22:44 <karolherbst> jekstrand: there is just one ugly thing: non blocking maps :'(

22:44 <karolherbst> I need to rewrite my code again

22:45 <jekstrand> :'(

22:46 <karolherbst> ohh.. I have an idea

22:46 sneil_ has joined #dri-devel

22:47 sneil has quit [Read error: Connection reset by peer]

22:47 <jekstrand> karolherbst: I'm very confused by this test. It seems to be rounding funny

22:47 <karolherbst> yeah.. it takes time to get used to

22:48 <karolherbst> but it's doing the correct thing

22:50 zf has quit [Quit: No Ping reply in 180 seconds.]

22:51 zf has joined #dri-devel

22:53 Haaninjo has quit [Quit: Ex-Chat]

22:55 <jekstrand> karolherbst: It's doing NEAREST and sampling exactly on the edge between pixels

22:55 <jekstrand> :-/

22:55 <karolherbst> yep

22:56 <karolherbst> this is the area where gl doesn't care :P

22:57 <karolherbst> worst case we have to add a flag in the sampler_view with "super precise clamping"

22:57 <karolherbst> ehh sampler

22:57 <karolherbst> one day I will always use the right term

22:57 <jekstrand> I don't think this is fundamentally a clamping issue

22:57 <jekstrand> Maybe it is?

22:57 <karolherbst> dunno

22:57 <jekstrand> I guess clamping might affect it somehow

22:57 <karolherbst> what's the Intel CL stack doing?

22:58 <jekstrand> idk

22:58 <karolherbst> jekstrand: ohh, right, I also saw issues on non clamped values

22:58 <jekstrand> I don't think I even have it installed. :-/

22:58 <karolherbst> I do

22:58 <karolherbst> ehh

22:58 <karolherbst> how slow

22:58 <jekstrand> I don't think the Mesa dumping tools will work with the CL driver

22:58 <karolherbst> probably not

22:58 <karolherbst> but it does seem to do the right thing

22:59 <karolherbst> woah... they even support CL_LUMINANCE, crazy shit

22:59 <jekstrand> CL_LUMINANCE? Ok, I'm done with this API...

22:59 <karolherbst> there is also CL_DEPTH

23:00 <karolherbst> you know... there is this gl sharing extension

23:00 <jekstrand> :facepalm:

23:00 <karolherbst> jekstrand: it's optional though.. CL_LUMINANCE that is

23:00 <karolherbst> I expose only the required formats in rusticl for now

23:00 <karolherbst> but all of them

23:00 shashanks has joined #dri-devel

23:00 <karolherbst> things like CL_A or CL_RG/CL_RA are potential no-brainers to add later

23:01 <karolherbst> there is CL_INTENSITY as well :)

23:01 <karolherbst> I think the ones I love the most are the x ones

23:01 <karolherbst> CL_sRGBx for extra nightmares

23:02 mbrost has quit [Ping timeout: 480 seconds]

23:02 <karolherbst> jekstrand: CL 2.0 even requires CL_DEPTH

23:03 * jekstrand clones the compute-runtime repo

23:03 <karolherbst> but because they knew how insane that was, it's only required for 2D images

23:03 <karolherbst> but I am surprised how my debug build runs faster through those tests than Intel's stack

23:04 <karolherbst> jekstrand: we need to get function calling working for... reasons

23:04 nchery has quit [Ping timeout: 480 seconds]

23:09 mbrost has joined #dri-devel

23:11 pcercuei_ has quit []

23:13 h0tc0d3 has joined #dri-devel

23:14 <jekstrand> karolherbst: Yeah, functions are a big project

23:14 mhenning has joined #dri-devel

23:15 * jekstrand wonders if the intel driver is doing something clever here. :-/

23:15 <jekstrand> Are we supposed to add 0.5 / texSize() or something?

23:15 <karolherbst> no clue

23:20 nchery has joined #dri-devel

23:25 rkanwal has quit [Ping timeout: 480 seconds]

23:26 <anholt> CLAMP -- do you mean GL_CLAMP?

23:27 <jekstrand> CLAMP_TO_EDGE

23:27 <anholt> oh, good

23:28 <jenatali> jekstrand: Not sure if https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/microsoft/clc/clc_compiler.c#L386 is helpful to you?

23:28 <jenatali> For non-normalized coords, they're supposed to translate to a coord at the pixel center

23:29 <jekstrand> jenatali: This is for the normalized case

23:29 <jenatali> Oh ok, nevermind :)

23:29 morphis has quit [Ping timeout: 480 seconds]

23:29 morphis has joined #dri-devel

23:36 * karolherbst kicks off the CTS and hopes nothing breaks

23:40 <zmike> is someone slamming iris jobs on ci or something?

23:40 <jekstrand> I've run a few but I wouldn't say "slamming"

23:40 <zmike> just had a merge fail and like half the iris jobs were still going

23:41 <jekstrand> :-/

23:43 khfeng has joined #dri-devel

23:49 rasterman has quit [Quit: Gettin' stinky!]

23:49 alanc has quit [Remote host closed the connection]

23:50 alanc has joined #dri-devel

23:55 <bl4ckb0ne> dcbaker: cheers, i completely forgot about that waffle patch

23:58 gawin has quit [Ping timeout: 480 seconds]