#dri-devel on 2022-02-02 — irc logs at oftc.irclog.whitequark.org

2021-07-26 22:56 ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar

00:03 tjaalton_ has quit [Remote host closed the connection]

00:09 tpalli has joined #dri-devel

00:13 tjaalton has joined #dri-devel

00:13 ngcortes has joined #dri-devel

00:14 pnowack has quit [Quit: pnowack]

00:16 ngcortes has quit [Remote host closed the connection]

00:18 iive has quit []

00:23 ybogdano has joined #dri-devel

00:23 tursulin has quit [Read error: Connection reset by peer]

00:41 Company has quit [Quit: Leaving]

00:47 <anholt> airlied: thanks. just realized I don't have an ack for the uprev commit, either :/

00:51 <airlied> anholt: ab for that

01:17 <jenatali> Looks like I have to completely rewrite how we produce DXIL signatures, that'll be fun

01:17 cphealy has quit [Read error: Connection reset by peer]

01:21 ybogdano has quit [Ping timeout: 480 seconds]

01:23 <kisak> must be time to switch over to cursive

01:26 <HdkR> Cursive is great for only signatures. I can just scribble on something and no one can duplicate it, not even myself.

01:27 rasterman has quit [Quit: Gettin' stinky!]

01:28 Lucretia-backup has quit []

01:28 Lucretia has joined #dri-devel

01:29 oneforall2 has quit [Quit: Leaving]

01:31 nchery has joined #dri-devel

01:36 oneforall2 has joined #dri-devel

01:40 shsharma has joined #dri-devel

01:47 sdutt has quit [Ping timeout: 480 seconds]

01:49 shsharma has quit [Ping timeout: 480 seconds]

01:50 co1umbarius has joined #dri-devel

01:52 columbarius has quit [Ping timeout: 480 seconds]

02:02 nchery has quit [Ping timeout: 480 seconds]

02:04 dv_ has joined #dri-devel

02:04 gawin has quit [Ping timeout: 480 seconds]

02:07 sdutt has joined #dri-devel

02:20 nsneck_ has joined #dri-devel

02:27 nsneck has quit [Ping timeout: 480 seconds]

02:43 fxkamd has quit []

02:48 nchery has joined #dri-devel

02:48 kts_ has quit [Ping timeout: 480 seconds]

02:53 bnieuwenhuizen_ has quit []

02:53 bnieuwenhuizen has joined #dri-devel

02:56 nchery has quit [Ping timeout: 480 seconds]

03:06 sdutt has quit [Ping timeout: 480 seconds]

03:15 <graphitemaster> I assume when NV lists "Maximum number of 32-bit registers per SM" and then lists something like 64 K here, they do not mean 64,000 32-bit registers, but 64 KiB worth of 32-bit registers

03:15 <graphitemaster> https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#compute-capabilities

03:16 <graphitemaster> The other fields use KB so maybe they do mean thousands (K)

03:16 <graphitemaster> But they're powers of two so I'd assume they mean K as in 1024

03:16 <graphitemaster> What a shitty table

03:22 <imirkin> graphitemaster: yes. "shared" memory in GL compute parlance is split between registers used for shaders and for the super-shared "registers" across the SM

03:26 <graphitemaster> Yeah I know that

03:27 <graphitemaster> I just want to know if 64K 32-bit registers means 64 * 1024 * 4 bytes or 64 * 1024 bytes or 64 * 1000 * 4 bytes or 64 * 1000 bytes XD

03:38 nchery has joined #dri-devel

03:41 <imirkin> afaik 64kb worth of registers. i.e. 65536 / 4 32-bit registers.
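For reference, the four readings graphitemaster lists work out as follows. This is plain arithmetic only, a minimal sketch that does not settle which reading NVIDIA's table intends; the dictionary labels are made up for illustration.

```python
# Size in bytes of "64 K 32-bit registers" under each candidate reading.
BYTES_PER_REGISTER = 4  # one 32-bit register occupies 4 bytes

readings = {
    "64 * 1024 registers": 64 * 1024 * BYTES_PER_REGISTER,
    "64 * 1024 bytes": 64 * 1024,
    "64 * 1000 registers": 64 * 1000 * BYTES_PER_REGISTER,
    "64 * 1000 bytes": 64 * 1000,
}

for name, size in readings.items():
    # Show both the byte size and how many 32-bit registers fit in it.
    print(f"{name}: {size} bytes, {size // BYTES_PER_REGISTER} registers")
```

The "64 * 1024 bytes" row corresponds to the "65536 / 4 32-bit registers" reading given above.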

03:42 <imirkin> although turing+ introduces "uniform" registers, not sure how those are accounted for exactly

03:43 cworth has joined #dri-devel

03:48 sdutt has joined #dri-devel

03:59 cworth has quit [Ping timeout: 480 seconds]

04:10 kts has joined #dri-devel

04:15 <mareko> airlied: I didn't actually finish FP16/mediump support - tess and GS varyings and transform feedback were the main missing items, and then some failing tests I think

04:16 <mareko> the reason I stopped is that I didn't see any perf improvement in a GLES benchmark we cared about

04:22 kts has quit [Ping timeout: 480 seconds]

04:27 <HdkR> rc1 tomorrow right?

04:50 cphealy has joined #dri-devel

04:51 benettig has joined #dri-devel

04:55 jewins has quit [Ping timeout: 480 seconds]

04:57 <cheako> I've uploaded my API dump, the most interesting frame is 4113

04:57 <cheako> https://drive.google.com/drive/folders/1u5dTh1esfT-22M3q9-hL5glf_xf1eUac?usp=sharing

04:58 <cheako> I think there are a growing number of allocatememory calls when compared to freememory.

05:00 <cheako> It also seems like the engine is overzealous when re-creating the swapchain, I don't think it's beneficial to free all those images.

05:04 cworth has joined #dri-devel

05:12 mbrost has quit [Ping timeout: 480 seconds]

05:16 cworth has quit [Ping timeout: 480 seconds]

05:34 Duke`` has joined #dri-devel

05:46 maxzor has joined #dri-devel

06:05 sdutt has quit [Read error: Connection reset by peer]

06:19 itoral has joined #dri-devel

06:23 itoral_ has joined #dri-devel

06:29 itoral has quit [Ping timeout: 480 seconds]

06:53 lemonzest has joined #dri-devel

06:54 Duke`` has quit [Ping timeout: 480 seconds]

06:57 frieder has joined #dri-devel

06:58 frieder has quit []

06:58 frieder has joined #dri-devel

07:11 mattrope has quit [Read error: Connection reset by peer]

07:14 mlankhorst has joined #dri-devel

07:25 mattrope has joined #dri-devel

07:32 maxzor has quit [Remote host closed the connection]

07:34 danvet has joined #dri-devel

07:35 MajorBiscuit has joined #dri-devel

07:52 JohnnyonF has quit [Read error: Connection reset by peer]

07:52 JohnnyonFlame has joined #dri-devel

08:04 <javierm> dianders: I see... maybe danvet has some suggestions for how to expose it through debugfs?

08:07 * danvet has no scrollback

08:08 <danvet> not sure about no ideas, just started the first coffee :-)

08:08 <danvet> what is it?

08:08 tzimmermann has joined #dri-devel

08:09 <javierm> danvet: https://paste.centos.org/view/raw/21caa33c

08:09 <javierm> danvet: I suggested dianders that maybe drm_debugfs_create_files() could fit his use case

08:09 <danvet> uh my little comment about moving the debug/testing thing for panels to debugfs?

08:10 <javierm> danvet: yeah

08:10 <danvet> yes and no

08:10 <danvet> so first thing is that drm_debugfs_create_files is the wrong interface

08:10 <danvet> because you can only use it after drm_dev_register

08:11 <danvet> we had a few internships that tried to tackle it, but didn't go anywhere yet in upstream

08:11 <danvet> essentially the idea is to have an add_file interface you can call anytime before registration time

08:11 <danvet> I can dig out the patches we've gotten thus far, or melissawen

08:11 mattrope has quit [Ping timeout: 480 seconds]

08:12 <danvet> ideally we'd then do the same for connector

08:12 <javierm> danvet: correct, and the problem he had with that interface is that you can't get a drm dev from a struct drm_panel

08:12 <danvet> and drm_panel would add those files somewhere

08:12 <danvet> nah drm_device is the wrong thing anyway

08:12 <danvet> we should add it to the connector

08:13 <danvet> which has even fewer interfaces readymade for drivers, but at least no bad ones to misguide you :-/

08:13 <javierm> danvet: I see

08:13 <danvet> https://dri.freedesktop.org/docs/drm/gpu/todo.html#clean-up-the-debugfs-support

08:13 <danvet> so I think the perfect world solution here would be to merge that

08:14 <danvet> then extend it for the subdirectories we have (well at least connector)

08:14 <danvet> then glue drm_panel in there somewhere

08:14 <danvet> unfortunately there's a lot of handwaving involved at this point

08:15 <javierm> danvet: thanks for your thoughts. It's too late in dianders' timezone but he could read your comments when online again

08:15 <danvet> and even the debugfs rfc for drm_device is rather far from working state

08:15 <danvet> but also doing this properly, i.e. allowing drivers to register debugfs files at any time before registration

08:15 <danvet> and not having to care about how/when they show up

08:15 <danvet> would be really nice in general

08:16 <danvet> it's essentially how sysfs works too, and that's just the right model

08:16 <danvet> with attributes and attribute groups and all that
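The "add files any time before registration, let the core decide when they show up" model danvet describes can be sketched as a toy in Python. This is not the DRM or debugfs API: the `Device` class, `add_debugfs_file`, `register`, and the `connector-eDP-1` name are all hypothetical and only illustrate the deferred-registration idea.

```python
class Device:
    """Toy stand-in for an object that owns a debugfs directory (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.registered = False
        self._pending = []  # files requested before registration
        self.files = {}     # files that are actually visible

    def add_debugfs_file(self, filename, show):
        # Drivers may call this at any point before registration;
        # nothing becomes visible yet.
        if self.registered:
            raise RuntimeError("files must be added before registration")
        self._pending.append((filename, show))

    def register(self):
        # At registration time the core materializes everything queued so far.
        for filename, show in self._pending:
            self.files[f"{self.name}/{filename}"] = show
        self._pending.clear()
        self.registered = True


dev = Device("connector-eDP-1")
dev.add_debugfs_file("panel_info", lambda: "128x64")
visible_before = dict(dev.files)  # still empty: nothing shows up early
dev.register()
print(sorted(dev.files))
```

The driver-facing call never needs to know when or where the file appears, which is the property the discussion is after.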

08:16 <javierm> danvet: I was about to mention that. That this is another thing that should be handled by the driver core model

08:16 <javierm> the use case is not DRM specific really

08:16 <danvet> javierm, so maybe the compromise between the total hack and fixing the world is to

08:17 <danvet> 1) get the drm_device infra in place roughly

08:17 <danvet> 2) add a list of drm_panel_info (NULL terminated array maybe) which drm_panel could add

08:17 <danvet> or a callback

08:17 <danvet> or something like that

08:17 <danvet> javierm, driver core model handles it for sysfs

08:17 <danvet> but debugfs is just "roll your own"

08:17 <javierm> danvet: yes I know, but not for debugfs

08:18 <javierm> danvet: exactly

08:18 <danvet> but that's an even bigger discussion

08:18 <danvet> e.g. like with drm_managed maybe if we roll something in drm first

08:18 <danvet> but yeah maybe that should also mean the infrastructure shouldn't be too drm_device specific, but attach to a more abstract struct underneath

08:19 <danvet> which we could then embed into drm_device, drm_crtc, drm_connector

08:19 <danvet> at this point it's probably 1-2 quarters of work to roll this out for someone who's good and knows drm :-/

08:19 <javierm> danvet: yeah...

08:20 <javierm> danvet: speaking about rabbit holes, should I drop the patch that adds the I2C connector type ?

08:20 <javierm> I didn't think that would have been controversial

08:20 <danvet> :-)

08:20 maxzor has joined #dri-devel

08:20 <danvet> it's a bikeshed I wouldn't load up on a tiny driver for sure ...

08:21 <javierm> danvet: yep, I will just use the Unknown type. I just thought that was an opportunity to add an I2C one because I saw one for SPI

08:22 <javierm> but a panel makes much more sense to avoid user-space having to catch up all the time and I2C/SPI doesn't tell much to the user really

08:23 <javierm> danvet: thanks again. I promise to not ask more questions now and allow you to get coffee :)

08:24 <danvet> yeah I'm tempted to just mass-rename them except they're uapi for existing drivers because we suck

08:24 <danvet> which is also why to this day i915 lies to you about connectors and stuff

08:24 <danvet> except if I missed something big time

08:25 <danvet> ideally the naming would reflect a lot more what the user sees

08:25 <danvet> but in reality we're driver developers, so they reflect what the hw looks like from the inside :-(

08:25 <emersion> there's the subconnector property

08:26 <emersion> if you want to expose both the inside and the outside

08:33 pnowack has joined #dri-devel

09:20 Company has joined #dri-devel

09:23 <javierm> danvet: btw, https://javierm.fedorapeople.org/ssd1307-doom.jpeg :P

09:24 <danvet> epic

09:24 <danvet> emersion, subconnector was used by only some drivers for selecting the transport mode

09:24 <danvet> like dvi-d vs dvi-i or different tv-out modes

09:25 <emersion> amdgpu uses it to indicate the "real" connector

09:25 <danvet> oh lol that stuff changed then I guess

09:25 <emersion> e.g. laptop has DP → HDMI bridge

09:25 pcercuei has joined #dri-devel

09:25 <emersion> connector is DP, subconnector is HDMI

09:25 <danvet> ah right yeah that makes some sense

09:25 <danvet> it's still a mess

09:26 <emersion> another example on my machine:

09:26 <emersion> Type: DisplayPort

09:26 <emersion> "subconnector" (immutable): enum {Unknown, VGA, DVI-D, HDMI, DP, Wireless, Native} = DVI-D

09:36 itoral_ has quit [Remote host closed the connection]

09:37 itoral has joined #dri-devel

09:38 itoral has quit [Remote host closed the connection]

09:39 itoral has joined #dri-devel

09:42 itoral has quit [Remote host closed the connection]

09:42 itoral has joined #dri-devel

09:47 mvlad has joined #dri-devel

09:48 <pq> javierm, awesome doom! What's the window system stack like in that picture? fbdev or KMS? Xorg or something else?

09:48 nchery is now known as Guest1496

09:48 nchery has joined #dri-devel

09:49 itoral has quit [Remote host closed the connection]

09:50 nchery is now known as Guest1497

09:50 nchery has joined #dri-devel

09:50 itoral has joined #dri-devel

09:51 itoral has quit [Remote host closed the connection]

09:52 itoral has joined #dri-devel

09:52 tursulin has joined #dri-devel

09:54 itoral has quit [Remote host closed the connection]

09:55 itoral has joined #dri-devel

09:55 <javierm> pq: it's just fbdev. I also cheated because Doom complains that it can't run there due to the minimum resolution being 320x200 and the panel only having 128x64

09:55 Guest1496 has quit [Ping timeout: 480 seconds]

09:55 <javierm> pq: so what I did is to run on fb0 and then grab each frame, transform and write into fb1
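The log does not say which transform was used between fb0 and fb1. As a minimal sketch, assuming an 8-bit grayscale source frame and a 1-bit-per-pixel target, a nearest-neighbour downscale from 320x200 to 128x64 with a fixed threshold could look like this. The function name, the threshold value, and the in-memory frame layout are assumptions; reading from and writing to the actual framebuffer devices is left out.

```python
def downscale_to_mono(frame, src_w, src_h, dst_w, dst_h, threshold=128):
    """Nearest-neighbour downscale of a grayscale frame to 0/1 pixels.

    frame is a list of src_h rows, each a list of src_w values in 0..255.
    Returns dst_h rows of dst_w values, each 0 or 1.
    """
    out = []
    for y in range(dst_h):
        src_y = y * src_h // dst_h  # source row sampled for this output row
        row = []
        for x in range(dst_w):
            src_x = x * src_w // dst_w  # source column for this output pixel
            row.append(1 if frame[src_y][src_x] >= threshold else 0)
        out.append(row)
    return out


# Synthetic test frame: a horizontal gradient from black to white.
frame = [[(x * 255) // 319 for x in range(320)] for _ in range(200)]
small = downscale_to_mono(frame, 320, 200, 128, 64)
print(len(small[0]), "x", len(small))
```

Nearest-neighbour sampling drops most source pixels, so thin details can vanish; averaging each source block before thresholding would be a gentler alternative.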

09:56 <pq> ahaha, but hmm...

09:56 Guest1497 has quit [Ping timeout: 480 seconds]

09:57 <pq> I wonder if there is some way to trick weston/drm - Xwayland - doom/x11 to let doom run in 320x200 and then Weston scales it down to "fullscreen"

09:58 itoral has quit [Remote host closed the connection]

09:58 itoral has joined #dri-devel

09:59 <pq> Xwayland would have to advertise 320x200 video mode as a possibility to Doom, then Doom does a modeset with RandR, and Xwayland emulates that with wp_viewport without actually changing the mode.

09:59 <pq> But I suspect Xwayland might refuse to advertise modes larger than what the Wayland compositor has.

10:00 mszyprow_ has joined #dri-devel

10:00 itoral has quit [Remote host closed the connection]

10:00 <javierm> pq: doom is able to upscale, the problem is with down scaling below 320x200

10:00 itoral has joined #dri-devel

10:01 <javierm> with bigger resolution it scales and you just get blurry images

10:01 <emersion> gamescope could be used to fake a large screen but… o_O

10:01 <javierm> pq: I tried to hack the original doom source to remove that constraint, but there seemed to be too many assumptions about 320x200

10:02 <pq> javierm, yes, exactly; Xwayland would fake 320x200 mode to Doom while actually running in 128x64 Wayland desktop.

10:02 <pq> emersion, oh, that sounds better

10:02 <javierm> pq: ah, got it now. haha, that would be interesting to explore

10:03 <javierm> pq: I also considered porting https://github.com/daveruiz/doom-nano to linux

10:03 <pq> javierm, the more standard and unpatched components used, the more impressive it is :-D

10:03 <javierm> pq: 100%

10:03 itoral has quit [Remote host closed the connection]

10:04 <javierm> pq: so then I just gave up and did the hack I mentioned

10:04 itoral has joined #dri-devel

10:04 <javierm> but didn't consider xwayland lying and advertising a 320x200 resolution

10:05 <pq> Xwayland is good at lying about video modes I think, but it might refuse to do it with modes larger than actual, I'm not sure.

10:06 <pq> the usual use case is upscaling and not downscaling, indeed

10:06 <pq> although... hmm.

10:06 <javierm> yeah. I never had this much fun working on a driver :)

10:07 Ziemas has joined #dri-devel

10:08 mszyprow_ has quit [Ping timeout: 480 seconds]

10:11 itoral has quit [Remote host closed the connection]

10:11 itoral has joined #dri-devel

10:15 itoral has quit [Remote host closed the connection]

10:16 itoral has joined #dri-devel

10:22 <linkmauve> Weston itself can already fake a different mode, if the hardware supports plane scaling (i915 does), in an [output] section set mode=320x200 and the primary plane for this output will be 320×200, and then scaled to the actual monitor’s size.

10:22 <linkmauve> This is what I use to test hidpi for instance.

10:23 <pq> linkmauve, uhh... how does that work? 'mode' is supposed to change the video mode, not only the FB size.

10:25 <linkmauve> It is a video mode change, to userland it looks like the screen grew more pixels than it actually has.

10:25 <pq> I would be surprised if ssd1307 driver supported any other video modes than the native one, let alone mode bigger than the native one.

10:25 <pq> linkmauve, so you get the big video mode on HDMI or whatever, but then the monitor scales it down?

10:26 <pq> but then plane scaling is not at play there

10:27 <pq> not that ssd1307 supported plane scaling to begin with, I believe

10:28 itoral has quit [Remote host closed the connection]

10:28 <linkmauve> Just tested with my 1366×768 LVDS, the connector stays in 1366×768, but its crtc says Mode: 1920x1080@59.96 userdef nhsync pvsync

10:29 <linkmauve> I’m pretty sure LVDS doesn’t support scaling at all, so it’d be in the encoder maybe?

10:29 itoral has joined #dri-devel

10:29 <pq> where do you see connector mode?

10:30 <linkmauve> Modes

10:30 <linkmauve> Queried it with drm_info, it says:

10:30 <linkmauve> └───1366x768@60.11 preferred driver nhsync nvsync

10:30 <javierm> pq: correct, the driver only supports its native mode. Which in fact is hardcoded in the Device Tree

10:30 <pq> hmm, so you are using a custom modeline, and then the *kernel driver* decides to fake it, because it knows the panel can only run at a single specific mode.

10:31 <pq> linkmauve, Modes on what object?

10:31 <linkmauve> On the CRTC.

10:31 <pq> then you are *

10:31 <linkmauve> And correct about the custom modeline.

10:31 <javierm> pq: I guess that could expose a 320x200 mode to user-space and then scale down in the driver. Just like we do with the pixel format :P

10:33 <linkmauve> pq, here is the entire drm_mode output: https://linkmauve.fr/files/custom-mode.txt

10:34 <linkmauve> javierm, I believe that’s what i915 does here, but at scanout instead of in software.

10:34 <pq> linkmauve, ah, connector does list modes by EDID, but it won't show what's happening on the cable.

10:36 <javierm> linkmauve: I see. Interesting

10:36 itoral has quit [Remote host closed the connection]

10:36 itoral has joined #dri-devel

10:37 <pq> linkmauve, ok. What you see is not Weston at all. It is the kernel driver allowing an unlisted mode and magically deciding to scale it instead of actually using it.

10:38 <pq> or maybe the panel has a scaler

10:38 <linkmauve> That’s what I thought as well.

10:38 <vsyrjala> it's in the pipe (~=crtc)

10:39 <vsyrjala> with intel hw that is

10:39 <pq> This won't work on a panel that does not have a scaler and a driver that does not scale.

10:40 <pq> IOW, Weston is not faking. The driver is.

10:40 <linkmauve> Right.

10:40 <pq> Weston *could* fake modes with a primary plane that supports scaling, but that's not implemented.

10:42 <pq> Weston could also fake with renderer scaling, but that's not implemented either, though daniels has a MR for something like that.

10:42 <pq> but it's only for upscaling I think

10:48 itoral has quit [Remote host closed the connection]

10:49 itoral has joined #dri-devel

10:49 heat has joined #dri-devel

10:50 <linkmauve> Oh, that’d be interesting. ^^

10:51 <javierm> pq, linkmauve: I never thought that sharing a picture of doom would spark a conversation like this

10:52 <javierm> it seems gfx land is full or rabbit holes

10:52 <javierm> *of

10:59 itoral has quit [Remote host closed the connection]

10:59 itoral has joined #dri-devel

11:05 heat has quit [Ping timeout: 480 seconds]

11:08 <danvet> tomba, pinchartl since I think you're the last two who actively cared about fbdev, feel like taking a look at "[PATCH 01/21] MAINTAINERS: Add entry for fbdev core" ?

11:12 <daniels> pq: yeah, renderer-follows-scale only does upscale (i.e. renderer buffer is an integer multiple smaller than the display mode), but it could reasonably easily be adapted

11:14 <daniels> I've been waiting for some of Derek's region work to coalesce before I do tho, because that will make things far less error-prone ...

11:15 rasterman has joined #dri-devel

11:25 flacks has quit [Quit: Quitter]

11:26 <pq> javierm, haha, well, isn't userspace almost always the biggest hurdle in KMS development? :-)

11:26 flacks has joined #dri-devel

11:28 <javierm> pq :) I'm just joking. Learning a lot from these discussions since I'm a newbie in gfx

11:58 tobiasjakobi has joined #dri-devel

12:10 thellstrom has joined #dri-devel

12:12 <danvet> demarchi, btw want drm-misc commit rights so it's simpler to push the all over refactorings you're doing?

12:18 frieder_ has joined #dri-devel

12:18 frieder has quit [Read error: Connection reset by peer]

12:29 devilhorns has joined #dri-devel

12:47 thellstrom has quit [Ping timeout: 480 seconds]

12:52 thellstrom has joined #dri-devel

13:03 gawin has joined #dri-devel

13:07 heat has joined #dri-devel

13:31 itoral has quit [Remote host closed the connection]

13:44 <graphitemaster> Conceptually is there really anything different about SSBOs and UBOs when it comes to the driver?

13:44 <graphitemaster> I would assume drivers still attempt to marshal small SSBOs as uniforms

14:00 <cheako> I've uploaded my API dump, the most interesting frame is 4113 https://drive.google.com/drive/folders/1u5dTh1esfT-22M3q9-hL5glf_xf1eUac?usp=sharing I think there are a growing number of allocatememory calls when compared to freememory. It also seems like the engine is overzealous when re-creating the swapchain, I don't think it's beneficial to free all those images after receiving a VK_SUBOPTIMAL_KHR. I assume something like the

14:00 <cheako> awk https://pastebin.com/65fhiHGR would show an even number of allocs-free, but instead this value increases. Validation layers don't show it.
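The contents of the linked awk script are not reproduced in the log. As a rough sketch of the same kind of check, assuming a text dump with one API call per line, the balance of `vkAllocateMemory` against `vkFreeMemory` calls could be counted like this. The sample lines and the helper name are made up for illustration; a real API dump may use a different layout.

```python
import re
from collections import Counter


def outstanding_allocations(dump_lines):
    """Return (number of vkAllocateMemory calls) minus (vkFreeMemory calls)."""
    counts = Counter()
    for line in dump_lines:
        match = re.search(r"\bvk(AllocateMemory|FreeMemory)\b", line)
        if match:
            counts[match.group(1)] += 1
    return counts["AllocateMemory"] - counts["FreeMemory"]


# Hypothetical dump excerpt; the argument text is illustrative only.
sample = [
    "vkAllocateMemory(device, pAllocateInfo, pAllocator, pMemory)",
    "vkAllocateMemory(device, pAllocateInfo, pAllocator, pMemory)",
    "vkFreeMemory(device, memory, pAllocator)",
    "vkQueuePresentKHR(queue, pPresentInfo)",
]
print(outstanding_allocations(sample))
```

A balance that keeps growing from frame to frame would be consistent with the leak suspected above, though a raw call count alone cannot tell a leak from allocations that are simply held for a long time.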

14:01 kts has joined #dri-devel

14:31 jewins has joined #dri-devel

14:34 sdutt has joined #dri-devel

14:35 JohnnyonFlame has quit [Ping timeout: 480 seconds]

14:48 off^ has quit [Remote host closed the connection]

14:51 <dianders> javierm / danvet: Thanks for your thoughts on my debugfs woes. At least now I know I'm not crazy and didn't just miss something super obvious. ;-) I'm not convinced I'll manage to find time to do anything quite as major as being discussed. What about something even simpler that moves us in a baby step? Add a debugfs API to "drm_panel" for panels to use. This can be the right API for panels to use long run but for now it doesn't go

14:51 <dianders> further than the panel. Create a top-level "panel" dir in debugfs and then a sub-dir for each panel device under there owned by each panel. Since debugfs isn't ABI this could move later if/when we find a way to hook this into drm more correctly.

14:55 <danvet> dianders, yeah, but maybe we should put the panel under the existing connector debugfs directory

14:55 <danvet> that should be doable I think

14:56 <danvet> dianders, also maybe an option is that the info file also points at the parent object

14:56 <danvet> but that can be added later on

14:56 <danvet> since it's probably better to make the interfaces specific

14:56 <dianders> danvet: The existing connector directory = something like /sys/kernel/debug/dri/1/eDP-1 ?

14:57 <danvet> iow some drm_panel_add_debugfs() and then some magic to make it show up

14:57 <danvet> yeah

14:57 <danvet> maybe that's already too much screaming, dunno

14:57 <dianders> danvet: I just couldn't figure out how to do that. The panel stuff is pretty disconnected from everything else DRM. Even more so now that there's no more drm_panel_attach...

14:57 <danvet> as long as panel drivers only ever call drm_panel_add_debugfs then it shouldn't matter much how we wire it up like you say

14:58 <danvet> dianders, we can't set it up from the panel bridge helpers?

14:58 <javierm> the good thing is what dianders mentioned, that debug is not an ABI so could be done incrementally

14:58 <danvet> like worst case it needs a bunch of calls to a drm_panel_actually_create_the_debugfs_stuff_now(panel, connector);

14:58 <dianders> danvet: I don't think we're guaranteed that the panel bridge helpers are used in any particular case, are we?

14:58 jfalempe has quit []

14:58 <danvet> not the greatest, but also not the worst

14:58 <danvet> dianders, yeah but if we take care of the 90% case and encourage people to use more of the standard stuff

14:59 <danvet> and allow the other 10% to work too

14:59 <danvet> then that's good enough I think

15:00 <dianders> danvet: OK. I'll try to find some time later today or tomorrow and dig to see if one of those two options works. Either trigger it off the panel bridge helpers or find a magic place to call drm_panel_actually_create_the_debugfs_stuff_now(). Gotta run for the moment. Thanks!

15:00 kts has quit [Quit: Konversation terminated!]

15:02 jfalempe has joined #dri-devel

15:02 jfalempe has quit []

15:10 FireBurn has joined #dri-devel

15:11 FireBurn has quit []

15:12 FireBurn has joined #dri-devel

15:15 jfalempe has joined #dri-devel

15:20 FireBurn has quit [Remote host closed the connection]

15:21 FireBurn has joined #dri-devel

15:24 fxkamd has joined #dri-devel

15:26 Haaninjo has joined #dri-devel

15:29 gawin has quit [Ping timeout: 480 seconds]

15:37 <jenatali> Any piglit experts want to help me figure out how https://jenatali.pages.freedesktop.org/-/mesa/-/jobs/18355086/artifacts/summary/results/spec@arb_gpu_shader_fp64@execution@built-in-functions@fs-frexp-dvec4-only-exponent.html is supposed to work? Looks like the GLSL 1.40 directive is forcing Mesa to consider this a compat 3.1 shader, and the extension table doesn't consider fp64 supported until compat 3.2

15:37 <jenatali> imirkin: Since you're usually helpful for this kind of thing ^^

15:40 mattrope has joined #dri-devel

15:44 <imirkin> jenatali: the GLSL 1.40 directive has nothing to do with mesa

15:44 <imirkin> it tells piglit that it wants at least GLSL 1.40

15:44 <imirkin> which in turn means GL 3.1+

15:44 <jenatali> It causes a #version 140 to be added into the shader

15:44 <imirkin> sure

15:44 <imirkin> is that a problem?

15:45 <jenatali> That causes Mesa to reject the fp64 extension, as far as I can tell

15:45 <imirkin> let's seeee...

15:45 <jenatali> Since that extension isn't supported until compat 3.2

15:46 <imirkin> huh. that's weird.

15:46 <imirkin> ok, so the difference is that you must be getting a compat context, whereas on linux you probably end up with a core context

15:46 <jenatali> I guess the shader doesn't explicitly ask for compat...

15:46 <imirkin> the ext, in mesa, requires core or GL compat 3.2+. which is _weird_

15:46 <jenatali> Yeah exactly. Guess I'll try to figure out why that's happening

15:47 <imirkin> so basically the platform GL context creation/picking logic is making slightly different decisions

15:47 <imirkin> BUT

15:48 <imirkin> i see no reason why the ext won't work with a GL 3.1 compat context

15:49 <imirkin> tarceri: 9f77a9729eb62d - do you remember why you picked GL 3.2 as the compat cut-off?

15:55 <FireBurn> I'm seeing messages like "mesa: for the --simplifycfg-sink-common option: may only occur zero or one times!" being generated

15:55 <FireBurn> I think that was fixed ages ago by Marek, should I be worried that it's back?

15:58 <FireBurn> https://gitlab.freedesktop.org/mesa/mesa/-/commit/18b12bf53351e1a902dc1f2e527a94ec8d8f3eff

16:02 <pendingchaos> anyone know what exactly are the rules for opaque types like OpTypeImage in SPIR-V?

16:02 <pendingchaos> obviously they can't be in ssbos (not because the spec disallows it, but because it doesn't say how that would work),

16:02 <pendingchaos> but what about function/private variables? this is mostly disallowed in glsl, but I can't find anything in the spir-v spec about this

16:04 <jenatali> pendingchaos: You can pass them as function args for sure. I believe function-local are disallowed except for SampledImage constructed from image + sampler

16:06 <jenatali> imirkin: Found the difference. The dri frontend treats 3.1 as core, but the wgl frontend treats 3.1 as compat

16:08 <imirkin> jenatali: so with 3.1, there is no core / compat

16:08 <imirkin> it's all 3.1

16:08 <imirkin> but you might expose GL_ARB_compatibility. or you might not.

16:08 <jenatali> With 3.1 it's up to the implementation whether it returns the compat extension (i.e. a compat context) or not, yeah

16:08 <imirkin> core / compat separation came with 3.2

16:08 <imirkin> BUT

16:08 <imirkin> when you ask for some context on linux (e.g. 3.1)

16:08 <imirkin> the driver will just give you e.g. GL 4.6

16:08 <imirkin> perhaps that's not the case on windows?

16:09 <jenatali> Windows: https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/gallium/frontends/wgl/stw_context.c#L246

16:10 <jenatali> Linux: https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/gallium/frontends/dri/dri_util.c#L389

16:11 <jenatali> Ehh only if max compat is < 3.1, so maybe that's not what I was looking for

16:12 <imirkin> the diff might also be in what piglit (or waffle) requests

16:13 <imirkin> jenatali: what do you get when you run "bin/glinfo" in piglit?

16:13 <imirkin> do you get GL 3.3? or some earlier version?

16:13 <jenatali> Well I'm testing the last extension for 4.0, so I'm getting 4.0

16:13 <imirkin> when you run that piglit glinfo thing, right?

16:13 <jenatali> Yep

16:13 <imirkin> hm ok

16:14 <imirkin> iirc that doesn't ask for anything too special as far as context creation goes

16:14 <imirkin> does it list GL_ARB_gpu_shader_fp64?

16:14 <jenatali> Yep

16:14 <imirkin> what if you add GL >= 3.1 to the shader_test?

16:15 <imirkin> (to the [require] section)

16:15 <jenatali> Ah, ok I found the real logic I was looking for, it's in glx

16:15 <jenatali> https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/glx/dri_common.c#L464 - defaults to core if no profile bit is in the attribs
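The interaction being debugged can be modelled as a toy. This is only a restatement of what is said in the chat (the extension is exposed for core or for compat 3.2+, and the default profile differs between the glx and wgl paths), not Mesa's actual code; the function names and the frontend table are hypothetical.

```python
def fp64_exposed(profile, version):
    """Rule as described in the chat: core, or compat at 3.2 or later."""
    return profile == "core" or (profile == "compat" and version >= (3, 2))


def pick_profile(frontend, requested=None):
    """Pick a profile when the application did not request one explicitly."""
    if requested is not None:
        return requested
    # Assumed defaults, taken from the discussion: glx falls back to core,
    # while the wgl path ended up treating 3.1 as compat.
    defaults = {"glx": "core", "wgl": "compat"}
    return defaults[frontend]


# A "#version 140" shader corresponds to GL 3.1.
for frontend in ("glx", "wgl"):
    profile = pick_profile(frontend)
    print(frontend, profile, fp64_exposed(profile, (3, 1)))
```

Under this model the same 3.1-level test is accepted on one frontend and rejected on the other, which matches the symptom reported for the Windows run.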

16:16 <imirkin> oh right. well for a long time we didn't support compat >= 3.1

16:17 <imirkin> anyways, tbh i think it'd be totally fine to reduce the GL requirement to 3.1 for that ext

16:17 Duke`` has joined #dri-devel

16:17 <imirkin> it was probably made 3.2 to continue not exposing it on intel compat contexts

16:17 <imirkin> since intel didn't support compat in 3.1+

16:17 <imirkin> but that's changed now too

16:17 <imirkin> (i think)

16:19 <jenatali> imirkin: https://www.khronos.org/registry/OpenGL/extensions/ARB/ARB_gpu_shader_fp64.txt: "OpenGL 3.2 and GLSL 1.50 are required."

16:19 <jenatali> The test is wrong, it needs to require GLSL 1.5, not 1.4

16:19 <imirkin> yeah, but we tend to relax things a bit

16:19 <imirkin> oh hm

16:19 <imirkin> the other tests want GLSL 1.50 too

16:19 <imirkin> e.g. fs-frexp-dvec4.shader_test

16:19 <imirkin> so that one's just the odd man out

16:19 <imirkin> i guess it's fine then?

16:20 <jenatali> Yeah. Seems that way. Fun

16:20 <imirkin> a few of them have 1.40

16:20 <imirkin> but most have 1.50+

16:21 <jenatali> Yep, I'll prep a test PR

16:23 <jenatali> Thanks for the help, as usual :)

16:24 <imirkin> yw

16:30 sdutt has quit []

16:30 sdutt has joined #dri-devel

16:31 gawin has joined #dri-devel

16:34 <gawin> can someone assign Marge? (mr already has rb from Emma) thx https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13226

16:35 <imirkin> gawin: i don't see a record of that in the MR comments

16:35 <gawin> resolved thread

16:36 <imirkin> the one i started. heh. didn't expect anholt's comment in there :)

16:37 <imirkin> gawin: done

16:37 <gawin> in the end I used your suggestion, if suggested by two people then probably it's objectively better.

16:37 <gawin> thanks

16:41 <imirkin> gawin: the do { } while (foo = foo->next) thing is a pretty common pattern

16:41 <imirkin> gawin: fwiw i'm always wary of for loops with weird "next" code

16:42 <imirkin> but obv from a runtime perspective, it's all the same

16:46 <gawin> for me assignments in c are tricky (as quiet casting in background can happen, especially if macros are used)

16:49 maxzor has quit [Ping timeout: 480 seconds]

17:09 <sravn> dianders, danvet: We could add the debugfs stuff in the pnale bridge, I think that would be a better fit than adding it to drm_panel. This would also fit the idea that we should (maybe) move over to always use a bridge between the display driver and the panel.

17:09 <sravn> I wanted to type panel bridge

17:11 <sravn> Lots of things have happened while I was away so maybe the bridge ideas have changed since, but I liked how things became simpler with the display driver <=> bridge <=> panel model

17:26 i-garrison has quit []

17:26 i-garrison has joined #dri-devel

17:29 maxzor has joined #dri-devel

17:30 <demarchi> danvet: that would be helpful

17:31 nchery is now known as Guest1528

17:31 nchery has joined #dri-devel

17:34 ybogdano has joined #dri-devel

17:34 gouchi has joined #dri-devel

17:39 Guest1528 has quit [Ping timeout: 480 seconds]

17:40 <anholt> austriancoder: looks like gc2000 is out of disk space. have you set up a docker-gc cronjob?

17:41 MajorBiscuit has quit [Ping timeout: 480 seconds]

18:01 <jenatali> imirkin: Any thoughts on this one? https://jenatali.pages.freedesktop.org/-/mesa/-/jobs/18355086/artifacts/summary/results/spec@arb_gpu_shader_fp64@execution@fs-indirect-temp-double-dst.html - looks like "pick" and "pick2" are getting optimized out somehow

18:02 <jenatali> I suspect the test is missing some other extension requirement that adds support for temp indexing

18:02 <imirkin> no

18:02 <imirkin> that's allowed in GLSL 1.10 iirc

18:03 * jenatali sighs

18:03 <imirkin> w.t.f.

18:03 <imirkin> that ... should work

18:03 <imirkin> surprise surprise - i get "pass"

18:04 <imirkin> maybe check the nir that it gets compiled to? dunno

18:04 <jenatali> Yeah, passes on softpipe on Windows too, so not a platform difference, just a driver difference

18:04 <imirkin> perhaps the nir linker screws up?

18:04 Daanct12 has joined #dri-devel

18:04 <imirkin> do you have some sort of uniform specialization thing?

18:05 mclasen has joined #dri-devel

18:05 <imirkin> perhaps you have a pass which, uh, "optimizes" this sort of thing out?

18:05 <jenatali> This "fail to find" is coming from mesa/main, does that rely on which linker is being used?

18:05 <imirkin> sorta like gcc 2.96 did to my for loops

18:05 <imirkin> well, it relies on the shader contents

18:05 <imirkin> like let's say "pick" were _really_ unused

18:06 <imirkin> then this would be expected -- it wouldn't have a location assigned to it

18:06 <imirkin> so something somewhere is dropping the usage of "pick"

18:06 <imirkin> but i dunno how deep in the pipeline that dropping has to be

18:06 <imirkin> the driver normally can do whatever

18:07 <imirkin> so i suspect the nir linker is the "last spot" which can affect this

18:07 <imirkin> can you use the regular linker to see if everything works?

18:07 <jenatali> I'll keep digging, starting there, yeah

18:10 cworth has joined #dri-devel

18:10 Danct12 has quit [Ping timeout: 480 seconds]

18:10 devilhorns has quit [Remote host closed the connection]

18:11 <jenatali> Yup, the variables are gone by the time it gets to the driver

18:11 devilhorns has joined #dri-devel

18:11 rellla has joined #dri-devel

18:12 <imirkin> iirc there's some uniform specialization thing available

18:12 <imirkin> i have no information beyond its potential existence

18:13 JohnStultz[m] has joined #dri-devel

18:14 <jenatali> Looks like it's getting nuked from nir_remove_dead_variables somehow

18:14 <imirkin> yeah. so just like gcc 2.96 was optimizing my loops then -- "for () { do stuff }" -- we don't _really_ need that, the program will run much faster without it.

18:15 <imirkin> (i'm definitely not still bitter about that, 20+ years later)

18:15 <jenatali> Er, to be more specific, nir_remove_dead_variables is killing the function_temp array somehow, and once that's gone, pick and pick2 become useless

18:16 <jenatali> Does anybody else use the NIR linker? I searched and I only saw zink with failures for this test in the logs

18:16 <imirkin> without answering that, i can say that it's fairly new

18:18 <ccr> it takes no time to execute if there's no code to run!

18:18 <imirkin> exactly.

18:18 <imirkin> it _did_ run a lot faster

18:19 <austriancoder> anholt: should be working again.. yeah time for a docker-gc job.

18:19 <imirkin> the more annoying thing was that gcc 2.96 wasn't a real release - it was just a CVS checkout RH made at some random point.

18:19 <imirkin> and shipped with RH 6 or so?

18:20 <ccr> something like that

18:20 * ccr gets flashbacks from gcc vs egcs "wars"

18:21 <jenatali> Yeah ok, vars_to_ssa is seeing that the load is undef if pick and pick2 aren't the same, and therefore the compiler assumes that they have to be the same, which means that the temp array is considered unused, and therefore it and the uniforms are all dead

18:21 <imirkin> i see no problem with that logic.

18:21 gawin has quit [Ping timeout: 480 seconds]

18:21 <jenatali> Yep. Test isn't sufficiently smart to circumvent a smart compiler

18:22 mbrost has joined #dri-devel

18:24 <jenatali> Yup. Initializing the array fixes the test
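
A rough C analogy of what jenatali describes, under the assumption that the test writes one element of a temporary array at one dynamic index and reads it back at another. The piglit shader itself is not reproduced in this log, so `pick_element` and its parameters are hypothetical. If the array were left uninitialized, a read at any index other than the written one would yield an indeterminate value, so a compiler could treat the two indices as equal and drop the array. Initializing it makes every read defined, so both indices stay live.

```c
#include <assert.h>

/* Write `value` at write_idx, then read back at read_idx.
 * Hypothetical helper for illustration only. */
static double pick_element(int write_idx, int read_idx, double value)
{
    /* Initialized array: reading any slot is well defined, so the
     * result really depends on both indices. */
    double tmp[4] = { 0.0, 0.0, 0.0, 0.0 };

    assert(write_idx >= 0 && write_idx < 4);
    assert(read_idx >= 0 && read_idx < 4);

    tmp[write_idx] = value;
    return tmp[read_idx];
}
```

With the initializer removed, only the `write_idx == read_idx` case has a defined result, which mirrors the reasoning that left "pick" and "pick2" without uniform locations.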

18:24 Daanct12 is now known as Danct12

18:27 frieder_ has quit [Remote host closed the connection]

18:30 cworth has quit [Ping timeout: 480 seconds]

18:32 <imirkin> jenatali: tbh i dunno what level of intelligence is "allowed" for optimizing such things out

18:32 <jenatali> Yeah

18:32 <imirkin> (from the uniform API)

18:32 <cheako> does overlay layer depend on internals of mesa?

18:32 <imirkin> although afaik it's "any", so this could be fine

18:33 <cheako> I'm looking to write a vulkan layer.

18:39 <pendingchaos> cheako: I think it's supposed to work with non-mesa drivers, if that's what you mean

18:39 <pendingchaos> it uses stuff in src/util/ and src/vulkan/util though

18:47 <zmike> can I get a quick rb on https://gitlab.freedesktop.org/mesa/piglit/-/merge_requests/634

18:50 <cheako> I can't figure out where it's populating vtable, does it peek inside the opaque vkinstance handle?

18:54 <danvet> mlankhorst, mripard_ tzimmermann ack for adding demarchi in drm-misc?

18:54 glLiquidAcidARB has joined #dri-devel

18:56 <zmike> imirkin: this is actually the worst

18:56 <imirkin> :(

18:57 <imirkin> sorry. i shouldn't have looked, and just let someone else slap a r-b on there

18:57 <imirkin> what you don't know can't hurt you!

18:57 <zmike> there's a gtf test that does the same thing

18:57 <imirkin> (esp when it relates to xfb of gl_PointSize)

18:57 <zmike> so in any case it's probably relevant

18:58 <zmike> I think I need a spec bug

18:58 <imirkin> why doesn't this Just Work btw? does vk do the GLES thing of requiring emitting a gl_PointSize?

18:58 <zmike> yes

18:58 <zmike> and according to opengl (core) spec, the pointsize value that's used depends on the state of that enum

18:59 <imirkin> ah. and so then you helpfully stick in gl_PointSize = glPointSize() value. but that doesn't gel with the contents of the shader...

18:59 <zmike> but it doesn't explicitly say whether that behavior also affects xfb

18:59 <imirkin> i would assume that it'd be undefined to xfb gl_PointSize without setting it in the shader

18:59 <imirkin> i _sorta_ assume that without that value set, you're expected to xfb the thing in the shader anyways, even though the glPointSize is different

19:00 <imirkin> but ... not 100% sure

19:00 <imirkin> i'd just export 2 things

19:00 <imirkin> (in the presence of xfb)

19:00 <zmike> actually the worst

19:00 <imirkin> xfbgl_PointSize + realgl_PointSize :)

19:00 ybogdano has quit [Remote host closed the connection]

19:00 tobiasjakobi has quit [Ping timeout: 480 seconds]

19:00 <imirkin> i.e. aka a ton of work for a corner case no one will hit

19:01 <zmike> I'm hitting it :/

19:01 <imirkin> right. i mean "no actual users"

19:03 cworth has joined #dri-devel

19:04 <demarchi> danvet: mlankhorst mripard_ tzimmermann just Cc'ed you all in https://gitlab.freedesktop.org/freedesktop/freedesktop/-/issues/416

19:07 <airlied> zmike: yes xfb should handle the shader one only

19:08 <zmike> I'm already deep into an even bigger disaster

19:09 gouchi has quit [Remote host closed the connection]

19:10 nchery has quit [Ping timeout: 480 seconds]

19:12 nchery has joined #dri-devel

19:18 devilhorns has quit [Remote host closed the connection]

19:22 <airlied> zmike: I think we should probably disable using FP16_CONSTBUF

19:22 <zmike> airlied: I think the problem is not also using PIPE_SHADER_CAP_GLSL_16BIT_CONSTS?

19:23 <zmike> or at least that's what radeonsi does and things seem fine there

19:23 alanc has quit [Remote host closed the connection]

19:24 alanc has joined #dri-devel

19:24 <airlied> zmike: I think glGetUniform is broken on radeonsi as well

19:25 <zmike> oof

19:25 <airlied> just nobody has tested GTF

19:25 <zmike> isn't radeonsi conformant?

19:25 <airlied> doesn't mean anyone runs the test suite

19:25 <airlied> not sure when gl4.6 was last submitted, I should test it here I suppose

19:25 <zmike> but surely it would have had to run at some point

19:26 <zmike> alright, well if that's what we gotta do then I guess that's what we gotta do

19:27 <imirkin> zmike: potentially at a point priot to the 16-bit consts thing. that's fairly new.

19:27 <imirkin> prior*

19:32 <airlied> and mareko did say they never finished fp16/mediump off properly

19:32 <zmike> cool cool cool

19:33 ngcortes has joined #dri-devel

19:35 gawin has joined #dri-devel

19:55 <zmike> airlied: here we go https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14835

19:55 <zmike> and now I'm making more awfulness for xfb

19:55 <zmike> hooray

19:56 cworth has quit [Ping timeout: 480 seconds]

20:15 gawin has quit [Ping timeout: 480 seconds]

20:19 cworth has joined #dri-devel

20:23 fxkamd has quit []

20:24 <mareko> what is GTF?

20:24 <airlied> mareko: kc-cts closed source conformance bits

20:25 fxkamd has joined #dri-devel

20:25 <mareko> thanks

20:38 glLiquidAcidARB has quit [Remote host closed the connection]

20:40 tzimmermann has quit [Quit: Leaving]

20:46 i-garrison has quit []

20:47 <mlankhorst> demarchi: don't have i915 commit rights yet btw?

20:47 <jenatali> Branch point is today, right?

20:48 i-garrison has joined #dri-devel

20:48 <airlied> zmike: care to add llvmpipe to it so I can ack it easier :-P

20:49 <zmike> k

20:49 <zmike> airlied: you mean the label in gitlab or ?

20:50 <zmike> llvmpipe doesn't export the cap

20:51 oneforall2 has quit [Quit: Leaving]

20:52 <Sachiel> jenatali: yes

20:52 <jenatali> Sad face, just too slow to get GL4 into it

20:53 <airlied> zmike: yes it does, but it's hidden in gallivm

20:53 <zmike> oh

20:53 <zmike> obviously

20:53 <Sachiel> cram it all in now, then bugfix your way through the next couple of weeks

20:55 <zmike> airlied: k

21:00 mbrost_ has joined #dri-devel

21:04 <jenatali> Alright, only 2 (hopefully easy) extensions away from GL4.2 now, cool

21:05 mclasen has quit [Ping timeout: 480 seconds]

21:05 <ccr> \:D/

21:07 mbrost has quit [Ping timeout: 480 seconds]

21:08 oneforall2 has joined #dri-devel

21:10 nchery is now known as Guest1551

21:10 nchery has joined #dri-devel

21:14 <airlied> jenatali: people only care about compute shaders :-P

21:15 <jenatali> Got those!

21:17 Guest1551 has quit [Ping timeout: 480 seconds]

21:22 ybogdano has joined #dri-devel

21:23 alyssa has joined #dri-devel

21:23 <alyssa> Mali has lots of data structures where different words are interpreted different ways depending on a "type" word at the start.

21:24 <alyssa> (Think: a "depth" field only existing in a Texture descriptor if it's a 3D texture)

21:24 <alyssa> GenXML can handle this by overlaying all the possible combinations, but this makes for verbose dumps and seems a bit unsafe.

21:25 crabbedhaloablut has joined #dri-devel

21:25 <alyssa> Considering augmenting our XML with `if="mode=3D"` type properties

21:25 <alyssa> but not sure if there's an established way to handle this in existing GenXML's

21:25 <alyssa> Kayden: anholt: ^ not sure if Intel or Broadcom hit this in their respective GenXMLs

21:26 <imirkin> there's definitely some funny business in the media things on intel. i don't think it was handled particularly well in the genxml though.

21:27 <alyssa> *nod*

21:27 <imirkin> iirc there are just a bunch of structs for the "class"-based things, and then the main struct just has a bag of bits without any info on how to interpret it

21:27 <alyssa> *nod*
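
A minimal sketch of the type-dependent field alyssa describes, written as hand-rolled C rather than generated GenXML output. The word layout, `enum tex_dim`, `struct tex_desc` and `unpack_tex_desc` are all made up for illustration and do not reflect a real Mali descriptor. The point is only that a "depth" field gets decoded when the leading type word says the texture is 3D, which is roughly what an `if="mode=3D"` property would express.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical encoding of the "type" word. */
enum tex_dim { TEX_DIM_2D = 2, TEX_DIM_3D = 3 };

/* Hypothetical unpacked descriptor. */
struct tex_desc {
    enum tex_dim dim;
    uint32_t width;
    uint32_t height;
    bool has_depth;  /* true only when the depth word is meaningful */
    uint32_t depth;
};

static struct tex_desc unpack_tex_desc(const uint32_t words[4])
{
    struct tex_desc d = {0};

    assert(words != NULL);

    d.dim = (enum tex_dim)(words[0] & 0x3);
    d.width = words[1];
    d.height = words[2];

    /* Conditional field: word 3 is only interpreted as depth for 3D. */
    d.has_depth = (d.dim == TEX_DIM_3D);
    if (d.has_depth)
        d.depth = words[3];

    return d;
}
```

A dump routine built on this would print depth only when `has_depth` is set, avoiding the verbose overlay of every possible interpretation.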

21:28 <imirkin> airlied: did you never land the crocus media thing?

21:29 <airlied> imirkin: no I had some license rabbit holes to descend and got distracted

21:29 <imirkin> delightful

21:29 <airlied> yeah turns out the gpu vendors don't pay the license fees you'd expect

21:29 <imirkin> alyssa: look in e.g. gen6.xml at INTERFACE_DESCRIPTOR_DATA

21:30 <imirkin> alyssa: and then the thing i add here: https://gitlab.freedesktop.org/airlied/mesa/-/commit/87c882a4cc95163e06c12331211ad0102b9e85a8#f10de9973eabc1fbe0353fbbfd324db7845c468a

21:30 <alyssa> ah reading Intel XML is weird

21:30 <imirkin> oh, but there i force the type

21:30 <imirkin> but in reality it can be several things

21:30 <alyssa> subtly different from the Panfrost XML >.<

21:30 <imirkin> we were just lazy and only did one :)

21:31 <imirkin> (hard work sometimes pays off later, but laziness always pays off now)

21:31 <alyssa> mmh

21:32 <alyssa> meh, guess I'll add a vendor extension to Panfrost GenXML :sweat:

21:32 <alyssa> far from the first *sweats*

21:32 mvlad has quit [Remote host closed the connection]

21:32 <imirkin> in rnndb, we sometimes use variants

21:33 <imirkin> to separate things out

21:33 <imirkin> but then something has to define what variant it is

21:37 cphealy has quit [Ping timeout: 480 seconds]

21:38 <demarchi> mlankhorst: I do

21:40 maxzor_ has joined #dri-devel

21:41 <bl4ckb0ne> has there been any "break" between vk 1.2 and 1.3?

21:41 <airlied> shouldn't be

21:41 <bl4ckb0ne> my loader is 1.3.204 but my physical device is 1.2.195

21:41 <bl4ckb0ne> im getting "Invalid physicalDevice" from the loader when calling a function

21:43 <bl4ckb0ne> im suspecting https://github.com/KhronosGroup/Vulkan-Loader/blob/b21acf16e7f703fed6e6604c9dffdc36e162b2c8/loader/loader.h#L38

21:45 <bl4ckb0ne> well, vkcube seems to be working fine

21:47 <bl4ckb0ne> the ext in question is VK_EXT_acquire_drm_display, im on polaris10

21:48 mdnavare has quit [Read error: Connection reset by peer]

21:48 mdnavare has joined #dri-devel

21:51 shoragan has quit [Ping timeout: 480 seconds]

21:58 <airlied> alyssa: the cap breaks glGetUniform, and Khronos guidance so far is that other impls don't do that

22:03 maxzor_ has quit [Remote host closed the connection]

22:03 maxzor_ has joined #dri-devel

22:03 JohnnyonFlame has joined #dri-devel

22:06 <alyssa> airlied: grumble... what's involved in fixing the cap?

22:07 <airlied> fixing the glGetUniform path I suppose

22:11 <alyssa> Thanks for volunteering C:

22:13 <imirkin> alyssa: that was a "note to self" presumably? :)

22:19 baryluk has joined #dri-devel

22:30 Koniiiik has joined #dri-devel

22:37 rcf has quit [Quit: WeeChat 3.2.1]

22:38 Duke`` has quit [Ping timeout: 480 seconds]

22:38 danvet has quit [Ping timeout: 480 seconds]

22:44 rcf has joined #dri-devel

22:47 <graphitemaster> I really dislike AMD's approach to open source. "Open" stuff which is just repos of docs and no code or worse - precompiled binaries and no code - this is actually worse than NVs not giving anything *shrug*

22:48 <graphitemaster> Like this is insane: https://github.com/GPUOpen-Tools/radeon_gpu_analyzer/tree/master/source/utils/shader_analysis

22:48 pcercuei has quit [Quit: brb]

22:49 <graphitemaster> And the actual blog post has a link to it

22:49 <graphitemaster> https://gpuopen.com/learn/live-vgpr-analysis-radeon-gpu-analyzer/ and it 404s too

22:50 pcercuei has joined #dri-devel

22:51 <imirkin> yeah, links in github tend to go stale pretty quickly

22:51 <imirkin> i recently reported such a thing to another company (in an unrelated field)

22:51 <imirkin> to my *vast* surprise they fixed it not by updating the github link but by making the original link work. i guess too many things had the old link :)

22:51 nchery has quit [Ping timeout: 480 seconds]

22:52 maxzor_ has quit [Remote host closed the connection]

22:52 <graphitemaster> I was able to find more information of what I needed in the Cuda documentation than what I'm able to find in AMDs OpenGPU docs and "open" code.

22:52 maxzor_ has joined #dri-devel

22:52 <graphitemaster> I'd consider that a bit of a soft failure since NV has a reputation for being closed as hell lol.

22:53 <imirkin> nvidia has always had excellent developer docs for cuda & co

22:53 <imirkin> (ok, i dunno about literally always ... were those docs really that great for cuda 1.0? probably not. but for a long time)

22:53 <graphitemaster> Bonus points you can actually navigate the cuda docs in a terminal webbrowser

22:54 <imirkin> (they certainly beat trying to do compute in a vertex shader, irrespective of the amount of docs...)

22:54 <graphitemaster> These gpu open docs need 50 js files to load the damn nav bar

22:54 pcercuei has quit []

22:56 <imirkin> yeah, i always try my best to minimize that for any web stuff i do

22:56 <imirkin> but it seems like most people don't give a shit

22:57 cphealy has joined #dri-devel

22:57 <graphitemaster> I think Fabien Sanglard has the best website I've ever visited: https://fabiensanglard.net/

22:57 <graphitemaster> That thing is so fast

22:58 <graphitemaster> Like damn this loads so nice https://fabiensanglard.net/cuda/index.html

22:58 degasus has joined #dri-devel

22:59 degasus is now known as Guest1567

23:03 nchery has joined #dri-devel

23:05 Guest1567 is now known as degasus

23:06 <graphitemaster> Anyways I need to compute occupancy on AMD GPUs and I can't find any information :3

23:07 gawin has joined #dri-devel

23:07 javierm has joined #dri-devel

23:08 lplc has quit [Ping timeout: 480 seconds]

23:11 dj-death has joined #dri-devel

23:11 <pendingchaos> graphitemaster: https://gitlab.freedesktop.org/mesa/mesa/-/blob/366d83a30ec6f1033ef262ec309e72cce6d3cdf7/src/amd/vulkan/radv_shader.c#L2192

23:11 <pendingchaos> radeon_info is initialized in ac_gpu_info.c

23:11 <pendingchaos> and for GFX10+ with wave64, we multiply it by 2 so it can be compared with wave32

23:11 <pendingchaos> not sure if the fragment shader lds stuff is correct

23:11 <pendingchaos> result is waves per simd
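
A simplified sketch of a "waves per SIMD" estimate, limited here to the VGPR constraint. This is not the radv code linked above: `struct simd_limits`, `waves_per_simd` and every number a caller passes in are placeholders, and a real calculation would also account for SGPRs, LDS usage and the wave32/wave64 scaling pendingchaos mentions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-SIMD limits; real values come from the driver's
 * GPU info and differ between hardware generations. */
struct simd_limits {
    unsigned max_waves;      /* hardware cap on resident waves per SIMD */
    unsigned physical_vgprs; /* VGPRs available to each lane on one SIMD */
    unsigned vgpr_granule;   /* VGPR allocation granularity */
};

/* Round v up to the next multiple of a. */
static unsigned align_up(unsigned v, unsigned a)
{
    return (v + a - 1) / a * a;
}

/* Estimate resident waves per SIMD from VGPR usage alone. */
static unsigned waves_per_simd(const struct simd_limits *hw, unsigned vgprs_used)
{
    assert(hw != NULL && hw->vgpr_granule > 0);

    unsigned waves = hw->max_waves;

    if (vgprs_used > 0) {
        /* Registers are handed out in granules, so round up first. */
        unsigned alloc = align_up(vgprs_used, hw->vgpr_granule);
        unsigned by_vgpr = hw->physical_vgprs / alloc;

        if (by_vgpr < waves)
            waves = by_vgpr;
    }

    return waves;
}
```

The result is in the same unit pendingchaos gives, waves per SIMD, so a runtime kernel picker could compare candidates by this number as long as the limits are filled in from a trusted source.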

23:19 kgz has joined #dri-devel

23:19 Prf_Jakob has joined #dri-devel

23:20 <graphitemaster> pendingchaos, I assume waves in this means wavefronts (i.e groups of 64 threads) - just checking to make sure, no one is consistent with the terminology - I've seen waves used to refer to individual threads in a wavefront which is really confusing

23:21 <HdkR> Waves being used to refer to individual threads is just wrong

23:21 maxzor is now known as Guest1570

23:21 maxzor_ is now known as maxzor

23:21 <maxzor> Maybe all that's needed is Fabien Sanglard starting writing blog posts about AMD stuff

23:23 <maxzor> graphitemaster, there are various levels of open-source, piles of code without comments nor design considerations, nor any educated engineer making any significant documentation effort is a ring1 layer?

23:24 <maxzor> repos of doc that are left to rot are something - https://github.com/RadeonOpenCompute/ROCm_Documentation

23:25 mlankhorst has quit [Ping timeout: 480 seconds]

23:25 <pendingchaos> waves as in subgroups (warps with nv)

23:25 <pendingchaos> not 64 invocations/threads/lanes in the case of wave32

23:25 <graphitemaster> I was looking at this earlier

23:25 <graphitemaster> https://github.com/GPUOpen-Tools/radeon_compute_profiler/blob/v5.6/docs/source/occupancy.rst#kernel-occupancy-for-amd-radeon-hd-7000-series-or-newer-based-on-graphics-core-next-architecture

23:26 <graphitemaster> The formulas are low res PNGs on dark GH background, unreadable

23:26 <graphitemaster> But yeah these docs are yikes, I should check the ROC one

23:28 <graphitemaster> There's also this thing https://github.com/ROCm-Developer-Tools/hipamd/blob/develop/src/hip_platform.cpp#L317

23:28 lplc has joined #dri-devel

23:29 <graphitemaster> Let's see how good these actually are, going to play around

23:29 <maxzor> is the rocm-smi python wrapper around the c api not enough for your need?

23:33 <graphitemaster> Not to use at runtime in a shipped product to dynamically pick kernels and adjust local size

23:34 <graphitemaster> I really dislike how this information is presented by both AMD and NV as static developer profiling markers and not actually exposed as information in APIs for writing code.

23:34 <graphitemaster> I mean aside from cuda which has all this stuff

23:40 pcercuei has joined #dri-devel

23:41 pnowack has quit [Quit: pnowack]

23:53 JohnnyonFlame has quit [Ping timeout: 480 seconds]

23:54 unidan has joined #dri-devel

23:55 oneforall2 has quit [Quit: Leaving]

23:58 cworth has quit [Ping timeout: 480 seconds]

23:58 rasterman has quit [Quit: Gettin' stinky!]

23:59 pcercuei has quit [Quit: dodo]