ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
<graphitemaster> Everything being the same old single GDDR / HBM is so ridiculous.
<mareko> is that a joke? :)
<graphitemaster> It's not a joke, no. I want two separate types of memory and two separate cache systems with explicit intrinsics in my shading language to flush and synchronize the caches
<graphitemaster> shared memory doesn't count, it's too small and it's not programmable.
<graphitemaster> Also it eats away at cache
<graphitemaster> (programmable as in I can upload data to it from the CPU like any other resource)
<graphitemaster> AMD has ubiquitous tiled resources which basically live in their own world and have a different page size and everything
<graphitemaster> So it's not an insane ask.
<mattst88> is https://pastebin.com/HtT7yUcK a bit surprising to anyone?
<anholt> not surprising to me (unfortunately)
<anholt> but, also, are you trying to do the string table thing for perf metrics, by chance?
<imirkin> mattst88: maor const
<imirkin> mattst88: try static const char *const season[]
<mattst88> ugh. not a huge deal, as the next step was to replace the pointers with just an index into the table
<mattst88> anholt: yeah
<anholt> mattst88: that won't help you, anyway, because your [0] = are relocs, and your season[2] is a reloc
<mattst88> imirkin: wow, that helps with gcc, but not clang (!!!)
ybogdano has quit [Ping timeout: 480 seconds]
<mattst88> anholt: yeah, I was just trying to make it a clean intermediate step
<anholt> I see
<imirkin> mattst88: hmmm ... i guess yeah, unclear if there's a way to say that an array's pointers are immutable
<imirkin> (or maybe that's not what it's even complaining about? dunno)
<imirkin> mattst88: also -std=c11 could help? or hurt :)
<mattst88> imirkin: yeah, still doesn't work with clang :(
<imirkin> is the clang error the same?
<mattst88> yeah
<mattst88> > t.c:12:43: error: initializer element is not a compile-time constant
<gawin> just use constexpr from c++ /s
<anholt> if it's your intermediate step, just drop the consts and move on, C is not your friend here.
<alyssa> mareko: Mali does GS+XFB with no assistance from the hardware
<mattst88> anholt: yeah, that's the plan
<alyssa> well, technically there's some special formats for loading attribute data with funny topologies (patches or adjacency or something)
<alyssa> other than that none, it's 1000s of lines of assembly in a half dozen compute kernels that monkey patch the command stream
<mattst88> well, except I already made the intel_perf_query_counter data into static const buffers, so no can do there.
<mattst88> oh wow, I can just drop the const
nchery has quit [Ping timeout: 480 seconds]
ngcortes has joined #dri-devel
<jenatali> FWIW it wouldn't work with MSVC either I think
<mattst88> thanks
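A minimal C sketch of the initializer problem discussed above; the real code is in the pastebin, so the array name is taken from the chat and everything else is made up:

```c
/* Illustrative only -- names loosely based on the chat, not the pastebin. */
static const char *const season[] = { "winter", "spring", "summer", "autumn" };

/* This is the kind of line that trips the error: the value of season[2] is a
 * pointer that only gets fixed up by the dynamic linker (a relocation), so it
 * is not a compile-time constant.  With the array fully const, GCC folds it
 * anyway; Clang (and, per jenatali, likely MSVC) rejects it. */
/* static const char *picked = season[2]; */

/* An address constant or a plain index keeps the initializer constant: */
static const char *const *picked_addr = &season[2];
static const unsigned picked_index = 2;
```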
<alyssa> jekstrand: oh gosh is "v2f32" llvm syntax
<alyssa> is Mali assembly syntax just llvm syntax
<alyssa> what have i done
<jekstrand> alyssa: Maybe?
<anholt> alyssa: got started on your reimplementing llvm project, clearly.
<alyssa> anholt: Noooo
<ccr> anholt, imirkin, https://reviews.llvm.org/D76096
<ccr> eh, I meant mattst88 ^
<mattst88> ccr: oh, interesting. thanks for the link
<bylaws> alyssa: if you set the DF_0_GLOBAL flag on a .so and then dlopen it, it'll act as LD_PRELOAD for all subsequently loaded libs
fxkamd has quit []
ppascher has joined #dri-devel
<alyssa> Wild
<DrNick> you mean DF_1_GLOBAL?
LexSfX has joined #dri-devel
mszyprow_ has quit [Ping timeout: 480 seconds]
* alyssa is going to /part because overstimulated
<alyssa> over in #panfrost if you need me
alyssa has left #dri-devel [#dri-devel]
<DrNick> glibc's dlfcn.h says #define DF_1_GLOBAL 0x00000002 /* Set RTLD_GLOBAL for this object. */ but doesn't actually implement it, Solaris has the define but documents it as unused, apparently it came from the *BSDs
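At runtime, the effect DF_1_GLOBAL is meant to have is the same as loading the object with RTLD_GLOBAL; a small sketch (the library name is made up):

```c
#include <dlfcn.h>
#include <stdio.h>

int main(void)
{
    /* libhook.so is a placeholder.  Symbols from it become visible to every
     * object loaded after this point, which is the LD_PRELOAD-like behaviour
     * described above. */
    void *handle = dlopen("libhook.so", RTLD_NOW | RTLD_GLOBAL);
    if (!handle) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return 1;
    }
    dlclose(handle);
    return 0;
}
```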
tursulin has quit [Read error: Connection reset by peer]
<jekstrand> Ok, I take it all back. Docker is for the birds!
<anholt> I don't use docker much for my actual dev, but I'm curious: what trouble did you get in?
<jekstrand> Oh, running things under qemu is taking forever
<anholt> oh, yeah. do not recommend.
linearcannon has quit [Read error: Connection reset by peer]
<jekstrand> I thought I had a pretty good setup with running under qemu and then kicking off to icecream with real x86_64 binaries of the aarch64 compiler for the actual building.
<jekstrand> But qemu can't even run fast enough to keep icecream fed.
<anholt> just use a meson cross file?
<anholt> not sure why you'd run any of your build stuff under qemu. unit tests run under qemu and that's hard enough.
<jekstrand> If I were on debian, that'd be easy. :-/
<jekstrand> Fedora's cross-build support is horrible
<anholt> oh.
<anholt> at that point you're stuck with using a sysroot to your arm64 chroot.
<jekstrand> Yeah, I played with sysroot a bit but I've not figured out how to get it to stop messing up stdc++ includes
<jekstrand> My pi4, on the other hand, doesn't seem to have much trouble at all keeping icecream full.
<jekstrand> Maybe that's what I do?
<jekstrand> Except I/O sucks on the pi
<jekstrand> because it's an sd card
<imirkin> what's icecream btw?
<imirkin> (doesn't feel like it'd be easy to search for that ...)
<imirkin> thanks
<jekstrand> distcc but better
<imirkin> low bar
<imirkin> anyways, neat
<jekstrand> In particular, it can cross-build
<jekstrand> as in I'm building on my pi4 with a bunch of the build jobs happening on an i9
<imirkin> right
<imirkin> i do the same with my laptop
<imirkin> (but it's same arch)
iive has quit []
karolherbst has quit [Read error: Connection reset by peer]
karolherbst has joined #dri-devel
The_Company has quit []
columbarius has joined #dri-devel
co1umbarius has quit [Ping timeout: 480 seconds]
gawin has quit [Ping timeout: 480 seconds]
alatiera5 has joined #dri-devel
alatiera has quit [Ping timeout: 480 seconds]
<graphitemaster> does glsl have any occupancy query functions
<graphitemaster> stuff like cudaOccupancyMaxActiveBlocksPerMultiprocessor
agx_ has joined #dri-devel
agx has quit [Read error: Connection reset by peer]
<idr> That looks like an API query for how much parallelism a shader will have.
<idr> I don't know of anything like that in GL or Vulkan.
<imirkin> iirc there are nv exts to expose some of that stuff
<imirkin> ah hm. i was probably thinking of NV_shader_thread_(group|shuffle). but that's something else.
<graphitemaster> Interesting how AMD implements it for HIP: https://github.com/ROCm-Developer-Tools/hipamd/blob/develop/src/hip_platform.cpp#L317
<graphitemaster> Sadly does not apply to NV. Though I wonder how tricky it would be to change it to support both.
mclasen has quit [Ping timeout: 480 seconds]
mclasen has joined #dri-devel
linearcannon has joined #dri-devel
ngcortes has quit [Ping timeout: 480 seconds]
agx_ has quit [Read error: Connection reset by peer]
agx has joined #dri-devel
pzanoni` has joined #dri-devel
jewins1 has joined #dri-devel
ramaling_ has joined #dri-devel
jewins has quit [Ping timeout: 480 seconds]
pzanoni has quit [Ping timeout: 480 seconds]
mattrope has quit [Ping timeout: 480 seconds]
ramaling has quit [Ping timeout: 480 seconds]
mattrope has joined #dri-devel
<jekstrand> imirkin, graphitemaster: There's an NV Vulkan spec for it: VK_NV_shader_sm_builtins
<jekstrand> VkPhysicalDeviceShaderSMBuiltinsPropertiesNV::shaderWarpsPerSM
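A sketch of querying those limits through the extension jekstrand names, assuming the (NVIDIA-only) VK_NV_shader_sm_builtins extension is exposed by the device:

```c
#include <stdio.h>
#include <vulkan/vulkan.h>

/* Assumes VK_NV_shader_sm_builtins has been enabled/advertised. */
static void print_sm_builtins(VkPhysicalDevice pdev)
{
    VkPhysicalDeviceShaderSMBuiltinsPropertiesNV sm = {
        .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SHADER_SM_BUILTINS_PROPERTIES_NV,
    };
    VkPhysicalDeviceProperties2 props = {
        .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROPERTIES_2,
        .pNext = &sm,
    };

    vkGetPhysicalDeviceProperties2(pdev, &props);
    printf("SMs: %u, warps per SM: %u\n",
           sm.shaderSMCount, sm.shaderWarpsPerSM);
}
```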
shankaru has joined #dri-devel
idr has quit [Quit: Leaving]
<HdkR> Uh oh. I have a new panel and it doesn't work with amdgpu
<jekstrand> Why does the raspberry pi 4 not have an NVMe disk? SD cards suck.
<HdkR> IO is hard for ARM :D
<HdkR> Should have bought a Xavier if you want /too/ much IO
jewins1 has quit [Ping timeout: 480 seconds]
<jekstrand> This VIM3 has an NVME and it seems to work. It takes like 5 minutes to boot but then it seems to have enough I/O
<jekstrand> Not gonna win any awards but it's non-terrible
ramaling has joined #dri-devel
pzanoni has joined #dri-devel
<austriancoder> jekstrand: why not use a Debian container with meson cross files? Works superbly here on my fedora installation.
<jekstrand> Can you then run those binaries on a fedora system?
pzanoni` has quit [Ping timeout: 480 seconds]
ramaling_ has quit [Ping timeout: 480 seconds]
<jekstrand> I'm not against containers, so long as I don't have to qemu them because qemu sucks a whole lot more than I remember.
<jekstrand> Then again, last time I did serious qemu, I was comparing it to a 1st gen beagle board so...
<airlied> the last time I did series qemu I wrote a virt gpu
Wally has joined #dri-devel
<airlied> serious doh
<jekstrand> parallel qemu is better. :P
<airlied> jekstrand: if you'd just bought an M1 like alyssa suggested you'd have saved more time :-P
<jekstrand> airlied: Probably.
<airlied> and you'd also be able to distract yourself by implementing a vulkan driver for apple, or zink on moltenvk!
<jekstrand> lol
<jekstrand> In all seriousness, they're not that expensive....
<jekstrand> But then I'd insist on running Fedora on it and I'd spend like a week figuring out how to do that. (-:
<airlied> like it could be worse you could be trying to use android on arm
<jekstrand> I'm not using android
tzimmermann has joined #dri-devel
mattrope has quit [Remote host closed the connection]
* jekstrand has a panvk build \o/
<jekstrand> And a CTS to go with it
shankaru has quit [Ping timeout: 480 seconds]
Wally has quit [Quit: Page closed]
sdutt_ has joined #dri-devel
sdutt has quit [Read error: Connection reset by peer]
itoral has joined #dri-devel
Duke`` has joined #dri-devel
mclasen has quit [Ping timeout: 480 seconds]
mszyprow_ has joined #dri-devel
ahajda has joined #dri-devel
Duke`` has quit [Ping timeout: 480 seconds]
danvet has joined #dri-devel
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
dllud has joined #dri-devel
dllud_ has quit [Read error: Connection reset by peer]
<daniels> jekstrand: unfortunately it’s eMMC rather than NVMe
mlankhorst has joined #dri-devel
Company has joined #dri-devel
MajorBiscuit has joined #dri-devel
mvlad has joined #dri-devel
tursulin has joined #dri-devel
sdutt_ has quit [Ping timeout: 480 seconds]
<daniels> but yeah, RPi is notoriously not blessed with I/O
<javierm> daniels, jekstrand: what I do is to install the rpi4 edk2 firmware on a uSD and an EFI install on a USB3 disk
<pepp> Kayden: probably next week
pcercuei has joined #dri-devel
JohnnyonF has joined #dri-devel
<Kayden> okay, thanks!
JohnnyonFlame has quit [Ping timeout: 480 seconds]
<daniels> javierm: EFI! Red Hat's changed you :(
<dolphin> at least a couple of months ago that EFI firmware worked really poorly on the rpi4
<dolphin> the boot time increased a *lot* and it was unreliable in making a successful boot :/
<javierm> daniels :D
Lucretia has quit []
Lucretia has joined #dri-devel
rasterman has joined #dri-devel
kuter has joined #dri-devel
<kuter> window help
<kuter> exit
kuter has quit []
kuter has joined #dri-devel
kuter has quit []
kts has joined #dri-devel
kts has quit [Quit: Konversation terminated!]
kts has joined #dri-devel
sagar__ has quit [Remote host closed the connection]
kts has quit []
sagar__ has joined #dri-devel
Haaninjo has joined #dri-devel
boistordu has joined #dri-devel
flacks has quit [Quit: Quitter]
flacks has joined #dri-devel
<javierm> pinchartl: I agree with you but don't think that making the arm drivers honour the nomodeset param makes things worse than the status quo
elongbug has joined #dri-devel
<pinchartl> looks like useless code to me :-)
<javierm> pinchartl: at least users could have a known way to disable the drm drivers rather than figuring out if it is built-in and having to use initcall_blacklist=rcar_du_init or modprobe.blacklist, etc
<pinchartl> having per-subsystem ways to disable drivers doesn't sound like the best idea though
itoral has quit []
<javierm> pinchartl: fair
<pinchartl> I'm not strictly opposed to that series, but I doubt it will be useful in most drivers
<javierm> it's surprising how the meaning of nomodeset changed over time. It started as a way to force text mode in vgacon and nowadays it is used by gdm to decide if the wayland session should be disabled
<pq> There is something really strange in that gdm.rules file anyway, as if physical seats didn't exist as a concept.
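A rough sketch of what the series under discussion amounts to in a driver's probe path, assuming a drm_firmware_drivers_only()-style helper that reports whether nomodeset was given is what the driver keys off:

```c
#include <drm/drm_drv.h>
#include <linux/platform_device.h>

static int example_drm_probe(struct platform_device *pdev)
{
	/* Assumes drm_firmware_drivers_only() checks the nomodeset param: one
	 * well-known switch instead of per-driver initcall_blacklist /
	 * modprobe.blacklist tricks. */
	if (drm_firmware_drivers_only())
		return -ENODEV;

	/* normal device setup elided */
	return 0;
}
```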
Lucretia has quit []
Lucretia has joined #dri-devel
tobiasjakobi has joined #dri-devel
mclasen has joined #dri-devel
pnowack has joined #dri-devel
devilhorns has joined #dri-devel
<javierm> danvet, tzimmermann, pq: another thing that could ease writing new tiny drm drivers is to have a drivers/gpu/drm/tiny/tiny-skeleton.c, similar to drivers/usb/usb-skeleton.c for usb
<tzimmermann> javierm, how well does this work?
<javierm> tzimmermann: I don't know, never wrote a usb driver :)
<javierm> but when writing drivers the first thing I do is to copy one existing that is as similar as possible to my HW, so having a template would be useful
<tzimmermann> javierm, i never tried such templates in practice.
<tzimmermann> i do the same and try to follow the execution flow of an existing driver. that's not possible with the skeleton drivers, so i never had much use for them
<javierm> tzimmermann: yes, but at least you can codify good practices and conventions there. So people could start from there and fill in the callbacks rather than removing / modifying an existing driver
<javierm> but maybe you are right and there's not much use for those in practice
<tzimmermann> javierm, i really don't know
<graphitemaster> jekstrand, Seems like it doesn't offer much for an already compiled program though. You still have to guess the number of vgprs and sgprs used to compute occupancy by hand, unlike the HIP and Cuda functions.
<graphitemaster> It's not impossible, the binary shaders returned from both NV and AMD have a program header that describes it. So I can do it by hand and likely will do it by hand.
<tzimmermann> javierm, i think, what i would find useful is a document on the reimplementation of tiny/cirrus.c. it would walk newbies through that driver's code and explain what the individual functions do and how they work together. cirrus is for qemu, so it's easy to tinker with it
<javierm> tzimmermann: that's a good idea too
nchery has joined #dri-devel
ahajda_ has joined #dri-devel
ahajda has quit [Read error: Connection reset by peer]
Haaninjo has quit [Quit: Ex-Chat]
Peste_Bubonica has joined #dri-devel
MrCooper has joined #dri-devel
MrCooper has quit [Remote host closed the connection]
MrCooper has joined #dri-devel
fxkamd has joined #dri-devel
<danvet> javierm, we have the drm_device skeleton in the docs
<danvet> I think maybe a simple pipe skeleton might be useful
<danvet> but the trouble with display chips is they are all wildly different
<danvet> like even within simple
<danvet> and you drop easily out of simple (e.g. as soon as you have planes)
<danvet> so I think examples are good, but maybe we need more composable examples ...
<tzimmermann> yes, that's the basic driver layout
<javierm> danvet: ah, cool. I missed that
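For reference, a rough shape such a tiny-skeleton.c could take, built on the drm_simple_display_pipe helpers tiny drivers used around this time; names are made up, and hardware programming, connector/panel wiring, error paths and module registration are all elided:

```c
/* Illustrative skeleton only -- not an existing in-tree driver. */
#include <drm/drm_atomic_helper.h>
#include <drm/drm_drv.h>
#include <drm/drm_fb_helper.h>
#include <drm/drm_fourcc.h>
#include <drm/drm_gem_cma_helper.h>
#include <drm/drm_gem_framebuffer_helper.h>
#include <drm/drm_mode_config.h>
#include <drm/drm_simple_kms_helper.h>
#include <linux/platform_device.h>

struct skel_device {
	struct drm_device drm;
	struct drm_simple_display_pipe pipe;
};

static void skel_pipe_enable(struct drm_simple_display_pipe *pipe,
			     struct drm_crtc_state *crtc_state,
			     struct drm_plane_state *plane_state)
{
	/* program the controller and start scanning out plane_state->fb */
}

static void skel_pipe_disable(struct drm_simple_display_pipe *pipe)
{
	/* stop scanout */
}

static void skel_pipe_update(struct drm_simple_display_pipe *pipe,
			     struct drm_plane_state *old_state)
{
	/* flush damage / point the hardware at the new framebuffer */
}

static const struct drm_simple_display_pipe_funcs skel_pipe_funcs = {
	.enable = skel_pipe_enable,
	.disable = skel_pipe_disable,
	.update = skel_pipe_update,
};

static const struct drm_mode_config_funcs skel_mode_config_funcs = {
	.fb_create = drm_gem_fb_create,
	.atomic_check = drm_atomic_helper_check,
	.atomic_commit = drm_atomic_helper_commit,
};

DEFINE_DRM_GEM_CMA_FOPS(skel_fops);

static const struct drm_driver skel_driver = {
	.driver_features = DRIVER_GEM | DRIVER_MODESET | DRIVER_ATOMIC,
	.fops = &skel_fops,
	.name = "tiny-skeleton",
	DRM_GEM_CMA_DRIVER_OPS,
};

static const uint32_t skel_formats[] = { DRM_FORMAT_XRGB8888 };

static int skel_probe(struct platform_device *pdev)
{
	struct skel_device *skel;
	int ret;

	skel = devm_drm_dev_alloc(&pdev->dev, &skel_driver,
				  struct skel_device, drm);
	if (IS_ERR(skel))
		return PTR_ERR(skel);

	ret = drmm_mode_config_init(&skel->drm);
	if (ret)
		return ret;

	skel->drm.mode_config.max_width = 4096;
	skel->drm.mode_config.max_height = 4096;
	skel->drm.mode_config.funcs = &skel_mode_config_funcs;

	/* a real driver passes its connector here or attaches a bridge */
	ret = drm_simple_display_pipe_init(&skel->drm, &skel->pipe,
					   &skel_pipe_funcs, skel_formats,
					   ARRAY_SIZE(skel_formats), NULL, NULL);
	if (ret)
		return ret;

	drm_mode_config_reset(&skel->drm);

	ret = drm_dev_register(&skel->drm, 0);
	if (ret)
		return ret;

	drm_fbdev_generic_setup(&skel->drm, 0);
	return 0;
}
/* platform_driver / module boilerplate elided */
```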
hikiko has joined #dri-devel
<hikiko> hello! I was looking at my MRs and there's this one: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/211 for DRM that fixes the kms-steal-crtc test. I've never tested it properly because I lack the hardware (I based my fix on a compiler warning about the size of an input parameter in util_fill_pattern). Would anyone mind taking a look at it as it's currently rotting? :D
<hikiko> (it's for drm not mesa)
mattrope has joined #dri-devel
jewins has joined #dri-devel
sdutt has joined #dri-devel
<jekstrand> javierm: Yeah, that might be better for the pi. I don't have any particularly good USB media at the moment, though. What do you use?
craft has joined #dri-devel
craft has quit [Remote host closed the connection]
craft has joined #dri-devel
<javierm> and write to the USB media using the arm-image-installer script, because it allows me to set an ssh key, add a console cmdline param, etc
<javierm> i.e: sudo arm-image-installer --image=Fedora-Workstation-35-1.2.aarch64.raw.xz --target=none --media=/dev/sdb --addconsole --addkey=id_rsa.pub --norootpass --resizefs
tobiasjakobi has quit [Remote host closed the connection]
Company has quit [Ping timeout: 480 seconds]
Company has joined #dri-devel
<javierm> jekstrand: oh, you meant the USB drive ? just a cheap sandisk 64 GiB USB3 stick. It's still way faster and more reliable than a SD card :)
rpigott has quit [Read error: Connection reset by peer]
rpigott has joined #dri-devel
lemonzest has quit [Quit: WeeChat 3.4]
<jekstrand> javierm: Oh, well I do have a couple of those and can easily buy more. :)
craft has quit [Remote host closed the connection]
craft has joined #dri-devel
lemonzest has joined #dri-devel
<tzimmermann> danvet, do you have further comments on the virtfb thing?
<danvet> virtfb thing?
<danvet> oh right
<tzimmermann> that flag
<danvet> let me grab the latest
<danvet> tzimmermann, I was blind
<danvet> I'll reply and explain :-)
<tzimmermann> ok
<tzimmermann> there's no hurry
* danvet was indeed missing something
<danvet> anyway sent it out, contains r-b with some bikeshed requests for your consideration
<tzimmermann> saw it. thanks a lot
<graphitemaster> Does anyone know where I can find out the binary format of shaders on NV - like when you dump them you seem to get the NVfp/NVvp stuff as plain text, but there's a program header on there too and I was just wondering if that has been reverse engineered / documented somewhere, maybe in nouveau (unlikely though), envytools, anything?
<daniels> graphitemaster: -> #nouveau
<graphitemaster> Thanks
mareko has quit [Read error: Connection reset by peer]
mslusarz has quit [Read error: Connection reset by peer]
marcheu has quit [Read error: Connection reset by peer]
gouchi has joined #dri-devel
dri-logger has quit [Read error: Connection reset by peer]
glisse has quit [Read error: Connection reset by peer]
Duke`` has joined #dri-devel
mszyprow_ has quit [Ping timeout: 480 seconds]
mslusarz has joined #dri-devel
dri-logger has joined #dri-devel
JohnnyonFlame has joined #dri-devel
JohnnyonF has quit [Ping timeout: 480 seconds]
alatiera5 is now known as alatiera
sdutt has quit []
sdutt has joined #dri-devel
mszyprow_ has joined #dri-devel
<austriancoder> Why does st/main not skip triangle draws when FRONT_AND_BACK culling is enabled? d3d12 and svga are doing this in draw_vbo and etnaviv will soon too
mszyprow_ has quit [Ping timeout: 480 seconds]
<jekstrand> How insane would it be to have a vk_descriptor_set_layout base struct?
<jekstrand> bnieuwenhuizen, dj-death: ^^
<zmike> containing what
<jekstrand> descriptor types if nothing else
<jekstrand> I'm looking at how tractable it would be to do common vk_descriptor_update_template
<jekstrand> And I at least need to know the type for each thing.
<jekstrand> Actually... I don't. That's already in the VkDescriptorUpdateTemplateEntry. \o/
<jekstrand> But, also common descriptor set stuff could maybe be useful one day.
mbrost has joined #dri-devel
tzimmermann has quit [Quit: Leaving]
craft has quit [Remote host closed the connection]
ybogdano has joined #dri-devel
Milardo has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
MajorBiscuit has quit [Quit: WeeChat 3.3]
<jenatali> austriancoder: I think technically front and back culling is different in that it's supposed to run the VS for side effects and be reflected in stats
<bnieuwenhuizen> jekstrand: I think Josh actually has a JIT for the template thing lying around somewhere ... Not sure if we still want to move to that in RADV
<jekstrand> a JIT?
<bnieuwenhuizen> A JIT compiler creating x86 code for the templatew updates
<jekstrand> Oof
<jekstrand> Is that necessary?
<jekstrand> Genuine question.
<jekstrand> I don't know how hot templates actually get in apps.
<bnieuwenhuizen> AFAIU there have been complaints about template update perf in general and it is a significant cost in dxvk since all descriptor updates go through the template path
<bnieuwenhuizen> though the gains weren't super great IIRC, hence I'm not sure if that was still going forward
<zmike> templates are extremely hot
<zmike> basemark, for example, is like 90% template updating
<jekstrand> Good, then I'll have something to benchmark. :)
<jekstrand> I noticed panvk doesn't have templates yet and I'd rather implement it generic than implement it in panvk if we can do so and have it fast.
<austriancoder> jenatali: makes sense somehow. my hardware has no way to set front and back culling register wise. the blob driver also skips triangle draws when FRONT_AND_BACK culling is enabled.. so I will do the same in draw_vbo
<jenatali> Makes sense. I think I probably need to switch us to disabling rasterization instead of dropping the draw. Xfb/ssbo stuff should still happen I think
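A hedged sketch of that draw_vbo early-out, loosely modelled on the description above; the helper name is made up, and jenatali's caveat (XFB/SSBO side effects and statistics) still applies:

```c
#include <stdbool.h>
#include "pipe/p_state.h"
#include "util/u_prim.h"

/* Hypothetical helper: true when face culling would discard every primitive
 * of this draw.  Only triangle topologies are affected by face culling;
 * points and lines must still be drawn. */
static bool
cull_discards_all(const struct pipe_rasterizer_state *rast,
                  const struct pipe_draw_info *info)
{
   return rast->cull_face == PIPE_FACE_FRONT_AND_BACK &&
          u_reduced_prim(info->mode) == PIPE_PRIM_TRIANGLES;
}
```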
ybogdano has quit [Remote host closed the connection]
<jekstrand> Bah... My beautiful plan for fast updates requires a lock. :-(
<jekstrand> daaaaang... vk-gl-cts is 6.9G when built. My poor VIM3 only has 12G of eMMC
<jekstrand> I have room for one mesa build and one vk-gl-cts build and no room for test results. :-/
<jekstrand> wait, no. It should have 32G of eMMC
<lstrano> jekstrand: dnf install fuse-sshfs
mvlad has quit [Quit: Leaving]
<jekstrand> Yeah, I just need to grow the partition
* jekstrand figures out how to do that
<jekstrand> Ah, much better.
<jekstrand> Now deqp-runner won't crash from running out of disk space. :)
mlankhorst has quit [Ping timeout: 480 seconds]
lstrano_ has quit []
lstrano_ has joined #dri-devel
<jekstrand> Ok, I give up, maybe. I don't have a plan for fast template updates that doesn't involve piles of allocation or a lock in the VkDescriptorUpdateTemplate.
lstrano_ has quit []
<jekstrand> I guess that's why we have the extension in the first place. :-/
<jekstrand> Maybe the data structure can at least be common?
<jekstrand> Save a bit of copy+pasta?
<jekstrand> Yeah, I think that's the new plan.
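A sketch of what that shared data structure could look like: a flattened copy of the template entries that a driver walks at update time. The struct and helper are hypothetical, not an existing Mesa API; only the VkDescriptorUpdateTemplateEntry offset/stride semantics come from the spec:

```c
#include <vulkan/vulkan.h>

/* Hypothetical common representation of a descriptor update template. */
struct vk_template_entry {
   VkDescriptorType type;
   uint32_t binding;
   uint32_t array_element;
   uint32_t count;
   size_t offset;   /* into the application's pData */
   size_t stride;   /* between consecutive descriptors in pData */
};

static void
template_for_each(const struct vk_template_entry *entries, uint32_t num_entries,
                  const void *pData,
                  void (*write_one)(uint32_t binding, uint32_t elem,
                                    VkDescriptorType type, const void *src))
{
   for (uint32_t e = 0; e < num_entries; e++) {
      for (uint32_t i = 0; i < entries[e].count; i++) {
         /* src points at a VkDescriptorImageInfo, VkDescriptorBufferInfo or
          * VkBufferView, depending on the descriptor type. */
         const char *src = (const char *)pData +
                           entries[e].offset + i * entries[e].stride;
         write_one(entries[e].binding, entries[e].array_element + i,
                   entries[e].type, src);
      }
   }
}
```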
<cwabbott> jekstrand: robclark: I took a bit of a look at pipeline cache in ir3, and it's gonna require a lot of rewriting
<jekstrand> cwabbott: I saw your post. :-/
<cwabbott> I'm still trying to figure out whether we can use the common vk cache infrastructure without depending on it in ir3
mszyprow_ has joined #dri-devel
<cwabbott> but even if we did, we have all this caching stuff in ir3 and I guess we'd have to pull out the bits we need and "shut it off" in ir3
<cwabbott> not great
<robclark> cwabbott: fwiw, the situation where the lack of a pipeline cache became fairly noticeable was with disk_cache disabled.. but not sure that matters from PoV of "how this all fits together".. and it is easy enough to disable the ir3 disk cache
<robclark> but I'll need to sit down and look at the vk pipeline cache helper mr to have a more intelligent response
iive has joined #dri-devel
<cwabbott> robclark: fwiw, from freedreno's point of view, I think you can think of the pipeline cache as a weird combination of ir3_program_state cache and variant cache
<cwabbott> well that, plus it's also used for caching NIR shaders (i.e. stuff that mesa/st does in gallium land)
Milardo has quit []
<cwabbott> it has an interface similar to ir3_cache if you squint hard enough, except that you can also serialize and deserialize it
<cwabbott> so it's more used for caching shader binaries rather than cmdstream
<robclark> not sure if it is a dumb idea or not, but I suppose if you serialize the nir as well you could create a new ir3_shader and then deserialize the ir3_shader_variant from that?
ybogdano has joined #dri-devel
<cwabbott> I guess, but that's more overhead than just sharing the ir3_shader_variant
<cwabbott> we'll probably be caching the ir3_shader
<cwabbott> at which point, I guess we technically wouldn't need to reference count the variants?
<cwabbott> just stop serializing/deserializing them, and have serializing/deserializing the shader also serialize the variants it owns
<cwabbott> and not even have the variants themselves in the vk pipeline cache
mszyprow_ has quit [Ping timeout: 480 seconds]
<cwabbott> yeah, that might be a workable plan
mclasen has quit [Ping timeout: 480 seconds]
mclasen has joined #dri-devel
devilhorns has quit []
<robclark> cwabbott: I started down that path before, but the thing is new variants can show up on any draw
ahajda_ has quit [Read error: Connection reset by peer]
<cwabbott> robclark: not with vulkan
<robclark> which is why I eventually gave up and made the variant the thing that gets cached/serialized
<robclark> right, I meant w/ gl
<cwabbott> so caching the shader might be the answer for vulkan
<cwabbott> although, there is the disk_cache thingy as a backup
<cwabbott> which might screw up that plan
<cwabbott> turns out lots of games don't actually use the pipeline cache and we have to provide a default one
<robclark> it would be trivial enough to add an arg to ir3_compiler to disable its in-built disk_cache, if that simplifies things for vk
<cwabbott> except that the default one can't just dump the entire state after compiles are done, like the app is supposed to do
<cwabbott> because the driver obviously doesn't know when compiles are done
<robclark> hmm, I thought the default pipeline cache thing solved that.. but tbh I've not had a chance to look at that yet
pnowack has quit [Ping timeout: 480 seconds]
<agd5f> Lyude, I think Greg's comments are short sighted. If the patch is valid, I don't see a reason to not commit it
<Lyude> I'm fine with that, I'm mainly just wondering if linus will be unhappy with something like that or not
<Lyude> ( @ agd5f )
<agd5f> we got some fixes from umn folks and I'm not planning to revert them. The ones we got were valid bug fixes at least.
<agd5f> I don't have the time to retype valid patches. That also has ethical implications
<cwabbott> robclark: the default pipeline cache is just disk_cache
<cwabbott> which has the same problem of new variants showing up after caching the shader, if you cache the ir3_shader
mclasen has quit [Ping timeout: 480 seconds]
<robclark> cwabbott: I suppose we could keep the existing path, but also add a path that serializes the ir3_shader and its (presumably) single variant perhaps?
<cwabbott> robclark: with the pipeline cache a ir3_shader can have multiple variants
<cwabbott> assuming we get a cache hit
<robclark> hmm
<cwabbott> atm there's only one variant
<cwabbott> so yeah, maybe caching the variant is still best, dunno
<cwabbott> I think that requires partially extracting the vulkan cache code into src/util and making it derive from vk_pipeline_cache_object
pnowack has joined #dri-devel
Haaninjo has joined #dri-devel
<cwabbott> yeah, the more I think about it the less sure I am that it's viable
<cwabbott> disk_cache doesn't handle "updating" something with the same hash, it assumes things being cached are immutable
ngcortes has joined #dri-devel
<cwabbott> but vk_pipeline_cache would probably get messed up if we change the hash of something suddenly
<robclark> can the pipeline cache cache arbitrary things (ie both shader and variant)?
<cwabbott> yes, it can
<cwabbott> other drivers have multiple levels of cache
<cwabbott> so, a cache for spirv->nir
<cwabbott> plus an overall cache that caches all the binaries (variants) given the nir shaders + other state
<cwabbott> plus a post-linking cache (similar to existing ir3 cache)
<cwabbott> you can be as creative as you want with the combinations
<robclark> I guess you can cache ir3_shader and ir3_shader_variant separately.. although tbh if you cache the spirv->nir you don't really need to cache the shader itself
<cwabbott> well, there are a bunch of lowering steps between spirv->nir and the shader
<robclark> well, cache spirv->lowering->nir then?
<robclark> or is tu handling some variant stuff outside of ir3?
<cwabbott> tu does have some variant-like stuff
<cwabbott> for example multi-pos output depends on the multiview mask which we get from the subpass
<cwabbott> the multi-pos lowering happens in tu_create_shader
Lucretia has quit []
<robclark> so mesa/st does some variant stuff, and has its own variant key.. so I guess you could have tu_variant_key and have three levels of caching ;-)
<cwabbott> like I said, we can get as creative as we want :)
<cwabbott> but I think we do need to cache variants unfortunately
<cwabbott> I mean, we have to cache ir3_shader_variants "ourself"
<robclark> it would be easy enough to expose the variant key and serialization stuff so it could be re-used directly from tu
Lucretia has joined #dri-devel
<jenatali> Is there a ready-made pass that lowers varying doubles into 2x uints, and deals with loads/stores appropriately?
<anholt> jenatali: feel free to steal any part of nir_to_tgsi_lower_64bit_intrinsic()
<jenatali> Cool
<jekstrand> jenatali: nir_lower_io_lower_64bit_to_32
<jekstrand> automagic!
<jenatali> There it is, that's what I'm looking for, thanks
<anholt> ooh!
Daanct12 has joined #dri-devel
<jenatali> I assumed there must've been one, given all the restrictions I saw in the spec about that
<anholt> that said, with how much NTT I've got waiting for review right now, probably not going to bother cleaning up
<jekstrand> :-/
<jenatali> Looking closer I'm not dealing with varyings yet, but I'll need that soon enough probably
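Wiring that option up is essentially a one-liner in the driver's I/O lowering; a sketch, where type_size_vec4 stands in for whatever slot-counting callback the driver already passes to nir_lower_io:

```c
#include "nir.h"

/* type_size_vec4 is a placeholder for the driver's existing callback. */
static void
lower_64bit_varyings(nir_shader *s,
                     int (*type_size_vec4)(const struct glsl_type *, bool))
{
   /* Splits 64-bit varying loads/stores into pairs of 32-bit ones as part
    * of the regular I/O lowering. */
   NIR_PASS_V(s, nir_lower_io, nir_var_shader_in | nir_var_shader_out,
              type_size_vec4, nir_lower_io_lower_64bit_to_32);
}
```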
<jekstrand> Sorry. Some of that's waiting on me. :-|
<jekstrand> Hopefully, I'll be able to start chipping away at the review backlog next week. This week's been burned doing Collabora new hire stuff and farting around trying to figure out a good aarch64 setup.
<jekstrand> And writing a blog post
<jenatali> I'm 2/3 of the remaining extensions down to get GL4.0 :) just need to finish plumbing fp64 which should be easy enough given that there's full software support if I need it
Danct12 has quit [Ping timeout: 480 seconds]
<imirkin> jenatali: tess and gpu_shader5 are generally the hard ones in that list. you're in the home stretch
<jenatali> Yep!
<jenatali> The transform feedback 2 and 3 ones gave me a bit of a headache, especially since xfb3 actually started using multiple GS streams that I thought I'd already done with gpu_shader5
<imirkin> hehe
<jenatali> I'm pretty sure there's no piglits for positive tests for indexed queries though
<jenatali> Since I didn't implement those and I'm not seeing failures
<imirkin> uhm
<imirkin> there def are
<jenatali> Maybe they're just not in the quick_gl or quick_shader passes, hm
<imirkin> not sure precisely what quick_gl does
<zmike> use the full gpu profile
<zmike> there's definitely tests
<jenatali> Ack, I'll dig harder
<zmike> no I mean literally the profile is named 'gpu'
<imirkin> perhaps they fail for other reasons ;)
<jenatali> Oh I see. But no they're not failing for other reasons, unless their skip conditions are just completely broken :P
<imirkin> jenatali: so there's at least arb_transform_feedback_overflow_query
<imirkin> which definitely tests for it. but perhaps you don't have that ext?
<jenatali> Right
<zmike> isn't that 4.4 or something?
<imirkin> jenatali: arb_gpu_shader5/execution/xfb-streams.c
<jenatali> Huh, that's passing for me...
* jenatali sees why
<imirkin> for (i = 0; i < STREAMS; i++) {
<imirkin> glBeginQueryIndexed(GL_PRIMITIVES_GENERATED, i, queries[i]);
<imirkin> perhaps the test is easy to pass. dunno :)
<jenatali> If the shader writes 1 primitive to each stream then sure I'd pass it, right now all of those would just be stream 0 queries lol
<jenatali> Annnnd yep that's what it does lol
ngcortes has quit [Ping timeout: 480 seconds]
<imirkin> not a great test then :)
<zmike> I think the enhanced layouts tests do more with streams
<jenatali> I'll go ahead and hook up the index and maybe I'll just pass those tests when I get to them. Or when I dig into CTS for these new features
<imirkin> there are def CTS tests for this stuff too
<jenatali> Yeah I'd assumed so
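The indexed-query pattern the piglit test exercises, in a minimal form (query objects, the multi-stream GS program, and GL loader setup are assumed to exist already):

```c
#include <epoxy/gl.h>

static void draw_with_stream_queries(const GLuint *queries, int streams)
{
   /* One GL_PRIMITIVES_GENERATED query per vertex stream. */
   for (int i = 0; i < streams; i++)
      glBeginQueryIndexed(GL_PRIMITIVES_GENERATED, i, queries[i]);

   glDrawArrays(GL_POINTS, 0, 4);

   for (int i = 0; i < streams; i++)
      glEndQueryIndexed(GL_PRIMITIVES_GENERATED, i);

   /* As noted above, a driver that treats every stream as stream 0 still
    * passes when the GS emits exactly one primitive per stream. */
}
```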
ngcortes has joined #dri-devel
mlankhorst has joined #dri-devel
Peste_Bubonica has quit [Quit: Leaving]
<jekstrand> Ugh... Looks like the Vulkan CTS is broken for 1.0. :-/
<jekstrand> Specifically, stuff calling GetPhysicalDeviceProperties2 unconditionally...
<imirkin> should add an option to drivers to expose the minimum stuff?
<imirkin> (to help test cts ;) )
<jekstrand> Yeah...
<imirkin> or maybe fuzz it
<jekstrand> Or I can just enable VK_KHR_get_physical_device_properties2 in panvk and forget about it.
<imirkin> hehehe
<imirkin> i wonder which will take longer
<imirkin> adding a switch to the driver, or fixing cts
<jekstrand> Nothing takes longer than fixing dEQP bugs
nico_ has joined #dri-devel
nico_ has left #dri-devel [#dri-devel]
<anarsoul> jekstrand: I just finished reading https://www.jlekstrand.net/jason/blog/2022/01/in-defense-of-nir/ - that's a really nice post :)
Haaninjo has quit [Quit: Ex-Chat]
Duke`` has quit [Ping timeout: 480 seconds]
<graphitemaster> Yeah, a really nice post.
nsneck has joined #dri-devel
ngcortes has quit [Read error: Connection reset by peer]
LexSfX has quit []
nsneck has quit [Quit: bye]
nsneck has joined #dri-devel
hch12907 has joined #dri-devel
LexSfX has joined #dri-devel
hch12907_ has quit [Ping timeout: 480 seconds]
hch12907_ has joined #dri-devel
hch12907 has quit [Ping timeout: 480 seconds]
jfalempe has quit [Quit: Leaving]
hch12907 has joined #dri-devel
<jenatali> Ugh. Why don't we have double-precision ffract
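One way to lower it from ops that do exist, since fract(x) == x - floor(x); a nir_builder sketch, not an existing pass:

```c
#include "nir_builder.h"

/* x is assumed to be a 64-bit float value. */
static nir_ssa_def *
lower_dfract(nir_builder *b, nir_ssa_def *x)
{
   return nir_fsub(b, x, nir_ffloor(b, x));
}
```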
<FLHerne> jekstrand: The existence of that post makes me worry Intel want to ditch NIR and do some nonsensical over-the-wall thing with IBC :-(
<FLHerne> also that you've left
gouchi has quit [Remote host closed the connection]
hch12907_ has quit [Ping timeout: 480 seconds]
hch12907_ has joined #dri-devel
hch12907 has quit [Ping timeout: 480 seconds]
moony has joined #dri-devel
hch12907 has joined #dri-devel
tjaalton has joined #dri-devel
iive has quit [Ping timeout: 480 seconds]
urja has joined #dri-devel
hch12907_ has quit [Ping timeout: 480 seconds]
<anholt> argh, wtf. skqp built for amd64: runs vk backend tests fine. skqp built for arm64 with the same flags: opens libvulkan, but doesn't even log anything under VK_LOADER_DEBUG=all, never gets to the driver, acts as if the tests don't exist.
<anholt> gagallo7[m]: any idea?
iive has joined #dri-devel
<anholt> oh. well done, skia. opens libvulkan.so instead of libvulkan.so.1 because library abis are for chumps. arm system didn't have libvulkan-dev, so the bare .so link was missing.
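The kind of fallback that avoids this class of surprise: try the versioned SONAME first and only fall back to the bare .so, which exists only when the -dev package is installed; a small sketch:

```c
#include <dlfcn.h>
#include <stddef.h>

static void *open_vulkan_loader(void)
{
   void *lib = dlopen("libvulkan.so.1", RTLD_NOW | RTLD_LOCAL);
   if (!lib)
      lib = dlopen("libvulkan.so", RTLD_NOW | RTLD_LOCAL);
   return lib; /* NULL if neither is present */
}
```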
mszyprow_ has joined #dri-devel
<HdkR> Ouch
ngcortes has joined #dri-devel
nchery has quit [Ping timeout: 480 seconds]
mlankhorst has quit [Ping timeout: 480 seconds]
hch12907_ has joined #dri-devel