#dri-devel on 2021-09-10 — irc logs at oftc.irclog.whitequark.org

2021-07-26 22:56 ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar

00:02 tursulin has quit [Read error: Connection reset by peer]

00:06 mbrost has quit [Ping timeout: 480 seconds]

00:07 dllud_ has joined #dri-devel

00:07 dllud has quit [Read error: Connection reset by peer]

00:14 <idr> So... the finalize_nir hook is called twice.

00:14 <idr> This is a bummer because many of the lowering passes that we run, only need to be run once.

00:14 columbarius has joined #dri-devel

00:14 <idr> And nir_lower_gs_intrinsics only can be run once. :(

00:16 bluebugs has joined #dri-devel

00:16 <idr> It looks like the only other Gallium driver that uses that pass is zink.

00:16 co1umbarius has quit [Ping timeout: 480 seconds]

00:16 <idr> zmike: ^^^ How does zink not end up with multiple set_vertex_and_primitive_count in geometry shaders?

00:17 <idr> With the last set_vertex_and_primitive_count setting everything to zero. :(
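
The double-run symptom idr describes can be illustrated with a toy model. This is only a sketch inferred from the messages above, not Mesa code: `toy_shader`, `toy_lower_gs_intrinsics` and the `gs_intrinsics_lowered` flag are hypothetical names. The first run rewrites every vertex emit and appends the real count; a second run finds nothing left to rewrite and appends a count of zero. A run-once flag is one possible guard.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy instruction set; names are illustrative, not NIR intrinsics. */
enum toy_op { OP_EMIT_VERTEX, OP_EMIT_VERTEX_WITH_COUNTER, OP_SET_VERTEX_COUNT };

struct toy_instr { enum toy_op op; int value; };

#define TOY_MAX_INSTRS 16

struct toy_shader {
    struct toy_instr instrs[TOY_MAX_INSTRS];
    size_t num_instrs;
    bool gs_intrinsics_lowered; /* hypothetical run-once marker */
};

/* Unguarded toy pass: rewrites emits, then appends the count it saw.
 * Returns false if there is no room for the appended instruction. */
static bool toy_lower_gs_intrinsics(struct toy_shader *s)
{
    int count = 0;

    for (size_t i = 0; i < s->num_instrs; i++) {
        if (s->instrs[i].op == OP_EMIT_VERTEX) {
            s->instrs[i].op = OP_EMIT_VERTEX_WITH_COUNTER;
            s->instrs[i].value = count++;
        }
    }
    if (s->num_instrs >= TOY_MAX_INSTRS)
        return false;
    s->instrs[s->num_instrs].op = OP_SET_VERTEX_COUNT;
    s->instrs[s->num_instrs].value = count; /* zero on a second run */
    s->num_instrs++;
    return true;
}

/* Guarded variant: the pass body runs at most once per shader.
 * Returns false when the pass was skipped or could not append. */
static bool toy_lower_gs_intrinsics_once(struct toy_shader *s)
{
    if (s->gs_intrinsics_lowered)
        return false;
    s->gs_intrinsics_lowered = true;
    return toy_lower_gs_intrinsics(s);
}
```

Calling the unguarded pass twice on a shader with two emits leaves two count instructions, the last one holding zero, which matches the behaviour reported above. Whether a flag like this is acceptable in a real driver is a separate design question.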

00:17 <zmike> uhhh

00:17 mbrost has joined #dri-devel

00:17 <zmike> probably don't call that in finalize?

00:17 <zmike> 🤷‍♂️

00:17 <idr> zink does.

00:17 <zmike> does it?

00:17 <zmike> huh

00:18 <idr> Without massively restructuring a pile of our shared code, I don't have that luxury.

00:18 * zmike is better than he thought

00:18 <zmike> dunno

00:18 <zmike> I guess I'm just that good

00:19 <idr> Uh... ok.

00:19 <zmike> (I really have no idea)

00:21 <zmike> if you end up figuring it out, I'm interested since this was always one of those things that "just works"

00:30 Hi-Angel has quit [Ping timeout: 480 seconds]

00:35 <idr> zmike: I am not sure how it could work. :( The output of NIR_PRINT=1 on any geometry shader test could enlighten things.

00:35 <idr> I've been using glsl-fs-fogscale.

00:36 <idr> I have a couple commits that fix the problem I was encountering with that test and finalize_nir hooked up.

00:39 <zmike> maybe it's just not finalizing twice for me?

00:39 <imirkin_> does OpenCL have a "ssbo" equivalent? something where you'd do untyped stores to a resource with a base address?

00:40 <imirkin_> i know you can pass in pointers to stuff, but that generally gets implemented as general writes to VA, not the ssbo-type writes

00:41 <samueldr> I'm observing something that makes no sense; related to that divide by zero... https://github.com/torvalds/linux/blob/master/drivers/gpu/drm/sun4i/sun4i_dotclock.c#L93-L94

00:41 <samueldr> ideal is 1073742035; in clk_hw_round_rate, `rate` (the second param) ends up being 0

00:42 <imirkin_> samueldr: 32-bit platform?

00:42 <samueldr> yeah

00:42 <imirkin_> i think that's a known issue

00:42 <imirkin_> where rounding fails

00:42 <imirkin_> at least i def remember seeing stuff about that

00:42 <imirkin_> where if the clock values are too high (and 1GHz is certainly pretty high) you get messed up results

00:42 <samueldr> but how would that value become zero at the function call "boundary" ?

00:44 <imirkin_> samueldr: it wouldn't

00:44 thellstrom has quit [Read error: Connection reset by peer]

00:45 <imirkin_> i would suggest that your analysis is incorrect somehow

00:45 <imirkin_> i.e. the value isn't 0 like you think it is

00:45 thellstrom has joined #dri-devel

00:45 <samueldr> I added if (WARN(rate == 0, "clk_hw_round_rate, rate is zero."))

00:45 <imirkin_> where

00:45 <samueldr> to clk_hw_round_rate; and it hits

00:46 <samueldr> and using %ld to printk() the value

00:46 <imirkin_> so "ideal" is 0 then?

00:46 <samueldr> ideal = 1073742035

00:46 <imirkin_> do you print that before the warn?

00:46 <airlied> imirkin_: some thing lower opencl globals to ssbos

00:46 <airlied> things

00:46 <imirkin_> airlied: that's a driver decision, right?

00:46 <airlied> imirkin_: yes

00:47 <imirkin_> airlied: basically i have an opencl impl that i'm trying to get to use ssbo's

00:47 <imirkin_> i've thus far been unsuccessful, but i also don't actually know CL

00:48 <imirkin_> airlied: the situation is that on a4xx, the blob that came with my board only has ES 3.0. the later blobs that come with various android phones don't seem to execute successfully on the board.

00:48 <imirkin_> (and my attempts to do what those later blobs do have also not resulted in working ssbo's)

00:48 <samueldr> imirkin_: I print `ideal` before the function call; in the function I print `rate` before the warn

00:49 <imirkin_> samueldr: surprising :)

00:49 <airlied> imirkin_: yeah can't think of a way to make that happen from CL API

00:49 <samueldr> imirkin_: indeed!

00:49 <imirkin_> samueldr: one might even say ... not ideal

00:49 <samueldr> I'm not discounting that my instrumentation could be flawed, but at a glance it doesn't look wrong

00:49 <imirkin_> samueldr: put up a "git diff" somewhere?

00:50 <samueldr> yeah, I was actually thinking about that

00:50 <airlied> imirkin_: can you lower ssbos to globals? :_

00:50 <airlied> :-)

00:50 <imirkin_> airlied: yea

00:50 <imirkin_> airlied: i might just do that...

00:50 nchery has quit [Quit: Leaving]

00:50 <imirkin_> airlied: but there's a lot of overhead to doing that

00:50 <imirkin_> airlied: most importantly, bounds checking

00:50 <airlied> yeah sounds like bounds checking in the shader might be all the hw can do

00:51 <imirkin_> well ... it's the same descriptors as for images

00:51 <imirkin_> (which i can definitely get CL to make use of)

00:51 <imirkin_> and the "newer" blobs also use them

00:51 dhwohrom^ has quit [Remote host closed the connection]

00:52 <samueldr> https://gist.github.com/samueldr/07562d3d91c013bf2b0237e001374240 (though there is a yet untested change building from %d -> %lld for ideal)

00:53 <imirkin_> samueldr: well, long == int, right (on 32-bit)

00:54 <samueldr> it was a warning from the compiler

00:54 <samueldr> so I wanted to be thorough

00:54 <samueldr> right, so ideal is zero

00:54 <samueldr> now that I've re-instrumented with %lld
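
The instrumentation pitfall here is a 64-bit value passed to a 32-bit conversion such as %d or %ld on a 32-bit platform, where the format and the argument no longer agree and the printed number cannot be trusted. A minimal userspace sketch of the safe pattern follows; `format_rate` is a hypothetical helper. In kernel code the equivalent would be %llu with a u64 argument.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format a 64-bit rate with a conversion that matches its width on
 * every platform. PRIu64 expands to the right length modifier. */
static int format_rate(char *buf, size_t len, uint64_t ideal)
{
    return snprintf(buf, len, "ideal = %" PRIu64, ideal);
}
```

Building with format warnings enabled (as samueldr's compiler did) catches the mismatch at compile time.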

00:55 * samueldr hates C :)

00:55 mbrost has quit [Ping timeout: 480 seconds]

00:56 <samueldr> footguns at reach all the time, but that's not for this channel

00:57 <jenatali> imirkin_: CLOn12 lowers global mem to SSBOs

00:57 <imirkin_> jenatali: the situation is i'm tracing what a driver is doing

00:57 <imirkin_> jenatali: so i want to force that driver to use ssbo's somehow

00:58 <jenatali> Ohh

00:58 <jenatali> Yeah, no, the CL API doesn't have that, it's just memory

00:59 <imirkin_> well, the CL api has images

00:59 <imirkin_> which will require the driver to use some sort of descriptor

00:59 <imirkin_> is there a way to do untyped writes to images?

01:00 <jenatali> Uh... nope, don't think so

01:00 <imirkin_> e.g. to have a RGBA8 image, but just write U32's to it

01:00 <jenatali> You can create buffer image views, but that's still typed

01:10 <airlied> jenatali: I did some light clover on zink on vulkan hacking last week, had to do that lowering :-P

01:11 bluebugs has quit [Ping timeout: 480 seconds]

01:13 <imirkin_> right ok

01:13 <imirkin_> sorta what i expected, but thanks for confirming

01:17 <jenatali> airlied: Sharing lowering passes or doing something different?

01:18 <jenatali> I know a bunch of the stuff in CLOn12 is very hacked together before I really knew anything about how to write proper lowering passes so probably not easy to share right now

01:21 <samueldr> would a change similar to this be appropriate? https://gist.github.com/9ec964726e9e8ef59614e4c242185f49

01:21 <samueldr> correct, maybe not though

01:22 <samueldr> I can't say it makes things work, but it makes things less wrong on my already shaky foundations

01:22 <samueldr> (continuing assuming tcon0 actually works, since right now writing to the framebuffer through DRM doesn't hang userspace for a while like it did when it was failing)

01:25 <airlied> jenatali: I cut-n-paste yours but it needs adapting, due to other lowering

01:28 rgallaispou1 has joined #dri-devel

01:32 rgallaispou has quit [Ping timeout: 480 seconds]

01:32 soreau has quit [Remote host closed the connection]

01:33 soreau has joined #dri-devel

01:37 pnowack has quit [Quit: pnowack]

02:11 boistordu has joined #dri-devel

02:17 boistordu_ex has quit [Ping timeout: 480 seconds]

02:17 Bennett has quit []

02:34 dfip^ has joined #dri-devel

02:40 <marex> samueldr: how do you reach ideal == 0 ? u64 ideal = (u64)rate * i; so either i == 0, which means dclk_min_div == 0 or rate == 0 , which one is it ?

03:11 * samueldr looks back

03:14 <samueldr> marex: happens in this order: https://gist.github.com/samueldr/3b7233e3d81c7be9b1adf54a86ad45dd

03:20 idr has quit [Quit: Leaving]

03:54 <pinchartl> samueldr: looks fishy

03:54 <samueldr> (to be 100% clear I'm out of my depth with clock stuff :))

03:55 <samueldr> though it was stated earlier that on 32 bit (allwinner) platforms things are screwy

03:55 <pinchartl> the problem seems to be that dclk_min_div is 0

03:55 <pinchartl> that's not right

03:55 <pinchartl> a clock divisor of 0 isn't valid

03:55 shadeslayer has quit [Quit: Ping timeout (120 seconds)]

03:56 shadeslayer has joined #dri-devel

03:56 <pinchartl> that's what you need to fix, not hack around it by a check in the loop

03:56 <samueldr> when I said I'm out of my depth with clock stuff, it means I have no idea where to start really looking into that right now

03:57 <pinchartl> it happens to be in clock-related code, but the problem isn't specific to clocks

03:57 <pinchartl> it's just C

03:57 <pinchartl> look at where dclk_min_div comes from, and figure out why it's 0

03:57 <samueldr> in a pretty big codebase, with a lot of background knowledge about hardware

03:57 <pinchartl> and then fix that
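
A small sketch of the arithmetic under discussion, assuming only the expression marex quoted (`u64 ideal = (u64)rate * i;`). The function names and the sample rate are made up for illustration and are not taken from the sun4i driver. It shows that a divider of 0 forces `ideal` to 0, and that an early range check makes a missing minimum divider visible at its source rather than inside the loop.

```c
#include <stdint.h>

/* Same shape as the quoted expression: widen first, then multiply,
 * so the product cannot wrap at 32 bits. */
static uint64_t toy_ideal_rate(uint32_t rate, uint32_t divider)
{
    return (uint64_t)rate * divider;
}

/* Hypothetical sanity check for a divider range read from driver data.
 * Returns 0 when the range is usable, -1 otherwise. */
static int toy_check_div_range(uint32_t min_div, uint32_t max_div)
{
    if (min_div == 0 || min_div > max_div)
        return -1;
    return 0;
}
```

With an arbitrary rate of 148500000 Hz, a divider of 40 gives 5940000000, which needs more than 32 bits, while a divider of 0 gives 0 regardless of the rate.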

03:58 <samueldr> I'm not saying I won't, but that "just" is probably the most loaded word to use

04:00 <samueldr> but yes, as I said when I shared that initially, it's unlikely to be the *solution*, but I wanted to move forward with other things; the division by zero broke so many things that maybe working around would have pointed me to the actual issue

04:01 bluebugs has joined #dri-devel

04:03 <samueldr> but, with that said, thank you pinchartl, as *in this case* it looks trivial enough, and that nudge into looking at dclk_min_div helped

04:03 <samueldr> (forward port of the older patch set did not include what must be a new property)

04:06 bl4ckb0ne has quit [Ping timeout: 480 seconds]

04:06 bl4ckb0ne has joined #dri-devel

04:09 NiksDev has joined #dri-devel

04:12 dviola has quit [Quit: WeeChat 3.2]

04:16 kem has quit [Ping timeout: 480 seconds]

04:20 NiksDev has quit [Remote host closed the connection]

04:21 NiksDev has joined #dri-devel

05:04 dllud has joined #dri-devel

05:04 dllud_ has quit [Read error: Connection reset by peer]

05:06 dllud_ has joined #dri-devel

05:06 dllud has quit [Read error: Connection reset by peer]

05:22 Duke`` has joined #dri-devel

05:32 sdutt_ has joined #dri-devel

05:32 sdutt has quit [Read error: Connection reset by peer]

05:42 itoral has joined #dri-devel

05:50 danvet has joined #dri-devel

05:53 mattrope has quit [Ping timeout: 480 seconds]

05:58 Hi-Angel has joined #dri-devel

05:58 mlankhorst has joined #dri-devel

06:08 Guest6808 has quit []

06:09 thelounge63 has joined #dri-devel

06:13 Company has quit [Quit: Leaving]

06:31 itoral has quit [Remote host closed the connection]

06:41 alanc has quit [Remote host closed the connection]

06:41 alanc has joined #dri-devel

06:55 lemonzest has joined #dri-devel

06:58 <airlied> jenatali: the entrypoint vs call MR is finally ready for merging :-P

07:04 tursulin has joined #dri-devel

07:11 rasterman has joined #dri-devel

07:20 jkrzyszt has joined #dri-devel

07:24 pnowack has joined #dri-devel

07:26 <pq> ajax, yes, I know alpha is always linear. My puzzle is, that when anything does premult, they multiply alpha into non-linear RGB values. That is the exact wrong thing to do if your aim is to filter or blend correctly, that is, in light-linear values to better match physics.

07:31 thellstrom1 has joined #dri-devel

07:36 <pq> mareko, ahh, that's what you meant with associativity. Ok. I never thought of that case, because it requires an extra temporary render target to hold (a o b).

07:36 thellstrom has quit [Ping timeout: 480 seconds]

07:38 <pq> swick, for normal blend the order *does* matter. If your stack is bg, a, b, then blending b into bg before a into bg would be wrong.
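
The ordering and associativity points can be checked numerically with the standard "over" operator on premultiplied values. This is a single-channel sketch; `struct px` and `over` are illustrative names, and the sample pixels are arbitrary.

```c
/* One premultiplied colour channel plus alpha. */
struct px { double c; double a; };

/* Porter-Duff "over" for premultiplied values: src on top of dst. */
static struct px over(struct px src, struct px dst)
{
    struct px out;

    out.c = src.c + (1.0 - src.a) * dst.c;
    out.a = src.a + (1.0 - src.a) * dst.a;
    return out;
}
```

With bg = {0.2, 1.0}, a = {0.5, 0.5} and b = {0.0, 0.5}, the stack order bg, a, b gives over(b, over(a, bg)).c = 0.3, while blending b into bg first gives over(a, over(b, bg)).c = 0.55. Regrouping without reordering, over(over(b, a), bg), gives 0.3 again, at the cost of the temporary that holds over(b, a).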

07:44 LaydLis21F has joined #dri-devel

07:45 LaydLis21F has quit []

07:47 <pq> Perhaps the assumption I got wrong is how premult is done in apps and toolkits; do they actually multiply alpha into light-linear RGB values and *then* convert to the sRGB non-linear encoding for 8 bpc?

07:49 <pq> That's an important question for Wayland, which is premult by default, given that there are two different ways of doing premult.

07:49 <pq> and we never wrote down which premult it is
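
The two premult conventions differ in the light they represent. The sketch below uses a pure gamma 2.0 curve as a stand-in for the sRGB transfer function; that simplification is an assumption made to keep the arithmetic obvious, and the function names are illustrative.

```c
/* Stand-in decoding curve (gamma 2.0), NOT the real sRGB function. */
static double decode_gamma2(double encoded)
{
    return encoded * encoded;
}

/* Alpha multiplied into the non-linear value: the light-linear value
 * a consumer gets by decoding the stored premultiplied sample. */
static double linear_of_premult_encoded(double encoded, double alpha)
{
    return decode_gamma2(alpha * encoded);
}

/* Alpha multiplied into the light-linear value. */
static double linear_of_premult_linear(double encoded, double alpha)
{
    return alpha * decode_gamma2(encoded);
}
```

For an encoded value of 0.8 at alpha 0.5, the first convention decodes to 0.16 and the second to 0.32, so a producer and a consumer that disagree on the convention are off by a factor of alpha under this curve.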

07:51 <pq> A different matter is whether compositors convert to light-linear representation before blending or not. I've been assuming that most (all?) do not.

07:52 <pq> For premult the big question is: which way does Cairo do it?

07:57 sdutt_ has quit [Ping timeout: 480 seconds]

08:09 Ahuj has joined #dri-devel

08:21 kem has joined #dri-devel

08:30 pcercuei has joined #dri-devel

08:51 YuGiOhJCJ has joined #dri-devel

09:06 jernej has quit [Quit: Free ZNC ~ Powered by LunarBNC: https://LunarBNC.net]

09:27 xexaxo has joined #dri-devel

09:27 jhli has quit [Remote host closed the connection]

09:33 xexaxo_ has quit [Ping timeout: 480 seconds]

09:38 quasselcore_ has joined #dri-devel

09:39 xexaxo_ has joined #dri-devel

09:44 quasselcore has quit [Ping timeout: 480 seconds]

09:46 xexaxo has quit [Ping timeout: 480 seconds]

09:47 jernej has joined #dri-devel

09:48 macromorgan has quit [Remote host closed the connection]

09:52 xexaxo has joined #dri-devel

09:58 xexaxo_ has quit [Ping timeout: 480 seconds]

10:10 camus1 has joined #dri-devel

10:10 camus has quit [Read error: Connection reset by peer]

10:19 dllud has joined #dri-devel

10:19 dllud_ has quit [Read error: Connection reset by peer]

10:45 Net147_ has joined #dri-devel

10:51 Net147 has quit [Ping timeout: 480 seconds]

11:03 Net147 has joined #dri-devel

11:03 Net147_ has quit [Remote host closed the connection]

11:11 Net147_ has joined #dri-devel

11:13 Net147 has quit [Read error: Connection reset by peer]

11:15 Net147 has joined #dri-devel

11:19 Net147_ has quit [Read error: Network is unreachable]

11:19 Net147 has quit [Read error: Connection reset by peer]

11:20 Net147 has joined #dri-devel

11:48 Peste_Bubonica has joined #dri-devel

11:49 JohnnyonFlame has quit [Ping timeout: 480 seconds]

12:08 Duke`` has quit []

12:10 yoslin has quit [Remote host closed the connection]

12:16 f11f12 has joined #dri-devel

12:17 yoslin has joined #dri-devel

12:20 pcercuei has quit [Quit: brb]

12:22 jkrzyszt has quit [Remote host closed the connection]

12:26 pcercuei has joined #dri-devel

12:27 bluebugs has quit [Ping timeout: 480 seconds]

12:30 vivijim has joined #dri-devel

12:47 YuGiOhJCJ has quit [Quit: YuGiOhJCJ]

12:50 dviola has joined #dri-devel

13:02 iive has joined #dri-devel

13:04 hansg has joined #dri-devel

13:14 <mripard> sravn: if you have a bit of time, could you review https://lore.kernel.org/dri-devel/20210830094910.150713-1-maxime@cerno.tech/ ?

13:15 jewins has joined #dri-devel

13:19 <marex> there is one thing I don't quite understand, maybe someone can give me a hint ...

13:21 <marex> with latest mesa (although it goes back quite a bit) with etnaviv (might be irrelevant), qt5 with eglfs backend does eglChooseConfig(... EGL_SURFACE_TYPE, EGL_PBUFFER_BIT ...) and returns zero available configurations

13:21 <marex> but eglChooseConfig(... EGL_SURFACE_TYPE, EGL_WINDOW_BIT ...) returns 4 configurations

13:21 <marex> both return EGL_SUCCESS , i.e. it is not an error

13:21 <marex> why is it not possible to allocate the pbuffer ?

13:24 macromorgan has joined #dri-devel

13:24 <MrCooper> marex: may depend on the platform, GBM & Wayland platforms don't seem to support PBuffers (or pixmaps)

13:25 <marex> MrCooper: got any details on the latter part, starting with GBM .... ?

13:25 <zmike> emersion: so you're saying the kms stuff can just be removed entirely since that should only hit gbm?

13:25 <marex> that sounds odd, why wouldnt they ?

13:25 <MrCooper> marex: don't know why, just that they currently don't according to eglinfo

13:26 <emersion> zmike: yea

13:26 <zmike> cool

13:26 <emersion> zmike: assuming resource_get_param is implemented

13:26 <zmike> it is

13:26 <emersion> cool

13:26 <emersion> then yeah i'd just fail on KMS queries

13:26 <zmike> sgtm

13:26 <emersion> supporting KMS handles would be a bit annoying

13:27 Company has joined #dri-devel

13:33 <marex> MrCooper: oh, I should check eglinfo indeed, thanks for the hint

13:34 <marex> I still wonder why the pbuffer support is missing, it just doesn't make sense to me

13:36 <pq> marex, probably no-one bothered to wire it up and no-one asked for it. It's not the first time with pbuffers.

13:37 <pq> marex, people tend to use FBOs and textures/render targets nowadays instead, or surfaceless EGL platform if offscreen-only is your aim.

13:37 <marex> pq: it is for offscreen, yes

13:38 <pq> funnily enough, surfaceless EGL platform supports pbuffers :-)

13:38 <marex> pq: what am I looking for in mesa (I presume this is mesa thing, where else would it be) ?

13:38 <marex> hm

13:38 <pq> a moment...

13:39 <marex> src/egl/drivers/dri2/platform_surfaceless.c no ?

13:39 <marex> except I need it for somewhere else

13:39 <pq> EGL_PLATFORM_SURFACELESS_MESA and eglGetPlatformDisplay(EXT)()

13:40 co1umbarius has joined #dri-devel

13:40 <pq> checking that the needed EGL extensions are actually advertised for this is a bit of a chore

13:40 <marex> pq: wait a moment ...

13:40 <marex> that suggestion sounds familiar

13:41 <marex> pq: https://bugs.freedesktop.org/show_bug.cgi?id=108967 this

13:41 <marex> pq: specifically comment 6

13:41 columbarius has quit [Ping timeout: 480 seconds]

13:42 <pq> marex, surfaceless context is a different thing from surfaceless platform. I am referring to the platform.

13:45 <pq> marex, here's an example of the supported platform check: https://gitlab.freedesktop.org/wayland/weston/-/blob/a2a8d382e38fdee65adb91ce7b4e6fb280389c8f/libweston/renderer-gl/egl-glue.c#L511
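
Part of the chore pq mentions is matching an extension name as a whole token in a space-separated string, since a plain substring search also matches longer names. A self-contained sketch follows; `has_extension` is a hypothetical helper, not the Weston function linked above. In real use the string would presumably come from eglQueryString(), and the names to look for would be EGL_EXT_platform_base and EGL_MESA_platform_surfaceless.

```c
#include <stdbool.h>
#include <string.h>

/* Return true only if name appears in extensions as a complete,
 * space-delimited token. */
static bool has_extension(const char *extensions, const char *name)
{
    size_t len;
    const char *p = extensions;

    if (!extensions || !name)
        return false;
    len = strlen(name);
    if (len == 0)
        return false;

    while ((p = strstr(p, name)) != NULL) {
        bool starts = (p == extensions) || (p[-1] == ' ');
        bool ends = (p[len] == ' ') || (p[len] == '\0');

        if (starts && ends)
            return true;
        p += len; /* skip this partial match and keep looking */
    }
    return false;
}
```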

13:46 <marex> pq: so if I grep right, this is not mesa driver specific , is it ?

13:47 <pq> it's not driver-specific I guess, or I don't know why it would be

13:47 f11f12 has quit [Quit: Leaving]

13:48 <marex> ah, qt5 turns off surfaceless for mesa https://code.qt.io/cgit/qt/qtbase.git/tree/src/platformsupport/eglconvenience/qeglpbuffer.cpp?h=5.15.2#n73

13:49 <pq> marex, it talks about surfaceless context, not platform.

13:49 <marex> pq: I think I am confused, just like comment 6 above

13:50 <marex> pq: why would I care about surfaceless platform with qt5 ?

13:50 <pq> yes, surfaceless context and surfaceless EGL platform are completely different things

13:51 <pq> marex, because you said you wanted pbuffers for off-screen-only rendering.

13:51 <pq> I have no idea what qt5 is about.

13:51 <pq> EGL platform is "which window system do you connect to and how", where surfaceless means "none, no display".

13:51 <marex> pq: they request a bunch of pbuffers for offscreen rendering as far as I can tell

13:52 <pq> surfaceless context extension means: you can eglMakeCurrent with EGL_NO_SURFACE.

13:52 <marex> ah

13:53 <pq> to confuse everything, you don't have to rely on surfaceless context extension on surfaceless EGL platform, because there you can create a pbuffer surface. :-)

13:54 <marex> pq: I spent most of yesterday digging through that qt5 code and mesa, trying to wrap my head around what's going on there, I think I am already positively confused :)

13:58 <pq> marex, I suppose you *don't* want off-screen *only* rendering, but both off-screen and on-screen, since you're talking about qt?

13:58 <pq> in that case, sorry for mentioning surfaceless EGL platform. It does off-screen only.

13:59 <marex> pq: well, can't you render off-screen and then have scanout engine display the result ?

13:59 <pq> sure, that's called on-screen :-)

13:59 <marex> oh, well ...

13:59 <pq> ...with double buffering

14:00 <marex> pq: just for clarity, the surfaceless platform is mostly for testing, I recall seeing it used with dEQP

14:00 <marex> right ?

14:00 <pq> but yes, you could use purely off-screen setup, and import a display-capable buffer for rendering

14:01 <pq> well... testing is one, GPGPU also, not sure if wlroots uses it with the above mentioned buffer imports

14:02 <marex> hmmm, maybe I just need to add .create_pbuffer_surface in https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/egl/drivers/dri2/platform_drm.c#L671 ?

14:10 <marex> https://lists.freedesktop.org/archives/mesa-dev/2015-June/086741.html there already is a patch no less

14:17 mattrope has joined #dri-devel

14:24 tobiasjakobi has joined #dri-devel

14:27 tobiasjakobi has quit [Remote host closed the connection]

14:30 nchery has joined #dri-devel

14:31 sdutt has joined #dri-devel

14:34 azorshi has joined #dri-devel

14:47 Ahuj has quit [Ping timeout: 480 seconds]

14:49 <marex> pq: https://gitlab.freedesktop.org/marex/mesa/-/commit/010aa326109a9d51d5b0b4aac4dd25e0d3e62341 this maybe ?

14:49 <marex> I am about to test it

15:07 mlankhorst has quit [Ping timeout: 480 seconds]

15:10 Peste_Bubonica has quit [Quit: Leaving]

15:11 bluebugs has joined #dri-devel

15:19 <sravn> mripard: yep, I will take a look tonight or sunday

15:22 bluebugs has quit [Read error: Connection reset by peer]

15:22 bluebugs has joined #dri-devel

15:23 Duke`` has joined #dri-devel

15:27 <mripard> sravn: awesome, thanks

15:31 bluebugs has quit [Quit: Leaving]

15:31 Daanct12 has joined #dri-devel

15:32 Daanct12 has quit []

15:33 sdutt has quit []

15:33 sdutt has joined #dri-devel

15:34 JohnnyonFlame has joined #dri-devel

15:35 cedric has joined #dri-devel

15:35 Anorelsan has joined #dri-devel

15:35 cedric has quit []

15:37 Danct12 has quit [Ping timeout: 480 seconds]

15:40 cedric has joined #dri-devel

15:45 rgallaispou1 has quit [Read error: Connection reset by peer]

15:48 mlankhorst has joined #dri-devel

15:51 camus1 has quit []

15:55 cedric is now known as bluebugs

16:19 camus has joined #dri-devel

16:32 <imirkin_> how does one read back the contents of an image in OpenCL?

16:32 <imirkin_> (i.e. in the C program)

16:32 Danct12 has joined #dri-devel

16:32 <imirkin_> someone needs to register docs.cl

16:33 <imirkin_> oh. it's already registered. but not nearly as useful as docs.gl

16:33 <jenatali> imirkin_: clReadImage?

16:33 <jenatali> Oh, clEnqueueReadImage

16:33 <imirkin_> thanks

16:34 <imirkin_> ok cool, that looks perfect

16:41 nsneck has joined #dri-devel

16:41 macromorgan has quit [Quit: Leaving]

16:47 macromorgan has joined #dri-devel

17:06 mbrost has joined #dri-devel

17:09 gouchi has joined #dri-devel

17:27 <jekstrand> General question: I just finished making my Ray-tracing pt. 2 slide deck for XDC next week. It's 46 very dense slides. No way it's going to fit in 45 minutes. What would people be more interested in: Shader compiling for ray-tracing pipelines or BVH building?

17:27 <jekstrand> My current thought is see how far I get and maybe organize a BVH building break-out

17:27 <jekstrand> Because I think it'd be good to have back-and-forth there.

17:27 <jekstrand> Thoughts?

17:27 <jekstrand> bnieuwenhuizen: ^^

17:28 <imirkin_> i'd definitely be interested in a good talk about BVH, whether it's yours or someone else's

17:28 <jekstrand> I'm not going to talk about actual BVH algorithms, FYI

17:29 <imirkin_> (although full disclosure, i will not be attending in any capacity. but i do watch the videos afterwards.)

17:29 <jekstrand> Just about the annoying complexities of launching piles of OpenCL kernels to do it

17:29 <imirkin_> ah

17:30 macromorgan is now known as Guest6898

17:30 macromorgan has joined #dri-devel

17:31 Guest6898 has quit [Read error: Connection reset by peer]

17:36 dllud_ has joined #dri-devel

17:36 dllud has quit [Read error: Connection reset by peer]

17:42 <daniels> jasuarez: rpi farm seems v unhappy

17:43 ngcortes has joined #dri-devel

17:43 <jasuarez> I'll check

17:43 <jasuarez> Thx

17:48 hfink has quit [Remote host closed the connection]

17:48 seanpaul has quit [Read error: Connection reset by peer]

17:49 CosmicPenguin has quit [Remote host closed the connection]

17:49 ezequielg has quit [Remote host closed the connection]

17:49 zmike has quit [Remote host closed the connection]

17:50 hfink has joined #dri-devel

17:50 CosmicPenguin has joined #dri-devel

17:50 seanpaul has joined #dri-devel

17:50 ezequielg has joined #dri-devel

17:50 zmike has joined #dri-devel

17:57 <jasuarez> daniels: it's up now

17:57 <jasuarez> i've restarted some of the broadcom jobs, they seemed stuck

17:58 <daniels> jasuarez: thankyou!

17:58 <jasuarez> 👍️

18:11 gouchi has quit [Remote host closed the connection]

18:23 Anorelsan has quit [Quit: Leaving]

18:34 alyssa has joined #dri-devel

18:34 * alyssa mumbles

18:34 <alyssa> Xorg is giving me an atomic_flush before i've signaled the previous vblank event... DCP does not like.

18:34 <alyssa> (wayland and simple-fb are fine)

18:36 imirkin_ has quit [Quit: Leaving]

18:42 pnowack has quit [Quit: pnowack]

18:42 <alyssa> or maybe not that. something is definitely broken in how I'm handling vblanks

18:48 mlankhorst has quit [Ping timeout: 480 seconds]

18:50 idr has joined #dri-devel

18:54 <alyssa> I guess there's no in-tree precedent for this firmware-managed vblank nonsense.

18:55 <airlied> alyssa: nouveau might

18:55 <airlied> but not sure if it still has vbl irqs

18:55 <alyssa> 👀

18:59 <danvet> alyssa, if you do delay the vblank event until dcp sends you one then atomic helpers should hold up the next hw commit

18:59 <danvet> since uapi is that you can get EBUSY up to that vblank

18:59 <danvet> most hw is fairly unhappy with this

18:59 <danvet> s/vblank event/crtc_state->event completion/

19:00 <danvet> which exists everywhere, even without vblank

19:00 <alyssa> right, that's what I think I've done and which works for weston but not apparently X

19:01 <alyssa> although .. what I'm doing is still abusing the semantics of vblank a lot.

19:02 <alyssa> e.g.: if enable_vblank() is called and then nothing else happens (no atomic_flush), no vblank will come in

19:04 <alyssa> although err atomic_flush isn't what I really want is it now. ugh

19:05 <alyssa> the call ordering in atomic_helper_commit_tail doesn't match that. Uh

19:05 <alyssa> (committing modeset enables *after* committing planes)

19:05 <alyssa> though commit_tail_rpm swaps that

19:06 <alyssa> oh just need a non-default commit function uh ok

19:07 <alyssa> right and I'm using the rpm version, so that's fine

19:08 ngcortes has quit [Ping timeout: 480 seconds]

19:09 <danvet> alyssa, I'm wondering whether it wouldn't be simpler to start out without announcing vblank support

19:09 <danvet> and only handing the crtc_state->event through your vblank machinery

19:09 <danvet> and maybe it might be easier to just not bother with hw vblank

19:09 <danvet> but fake it with hrtimer

19:10 <danvet> and then if the display is entirely idle, fire a vblank event once per second for clock drift

19:10 <alyssa> I mean, it's fake in either case ... my understanding is that I /needed/ vblank to get the EBUSY stuff

19:10 <danvet> assuming dcp didn't go into self-refresh :-)

19:10 <danvet> nah

19:10 <danvet> you need to handle crtc_state->event

19:10 <danvet> because that one is uapi (and the atomic helper keels over if you don't)

19:10 <danvet> but no vblank required

19:11 <alyssa> Hmm okay

19:11 <danvet> only thing required is that you get that out the door when convenient

19:11 <danvet> accurate timestamp optional (it'll just pick current time if there's none)
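
The EBUSY behaviour danvet describes can be pictured with a toy state machine. This is a userspace model only, with hypothetical names (`toy_crtc`, `toy_commit`, `toy_complete_event`); it does not use the DRM API and it models only a non-blocking commit. A new commit is refused while the previous commit's event has not been completed, whatever the source of that completion (hardware vblank, firmware callback, or a timer).

```c
#include <errno.h>
#include <stdbool.h>

/* Toy CRTC: tracks whether the last commit's event is still pending. */
struct toy_crtc {
    bool event_pending;
    unsigned int commits;
};

/* Non-blocking commit: refused with -EBUSY until the previous
 * commit's event has been sent. */
static int toy_commit(struct toy_crtc *crtc)
{
    if (crtc->event_pending)
        return -EBUSY;
    crtc->event_pending = true;
    crtc->commits++;
    return 0;
}

/* Called when the display side reports that the update landed. */
static void toy_complete_event(struct toy_crtc *crtc)
{
    crtc->event_pending = false;
}
```

In this model, if the completion never arrives, every later commit fails, which is why the event has to be sent eventually even when no real vblank interrupt exists.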

19:19 <alyssa> giving a try to handling state->event without advertising vblanks

19:26 ngcortes has joined #dri-devel

19:27 lemonzest has quit [Quit: WeeChat 3.2]

19:27 dllud has joined #dri-devel

19:27 dllud_ has quit [Read error: Connection reset by peer]

19:29 dllud_ has joined #dri-devel

19:29 dllud has quit [Read error: Connection reset by peer]

19:36 <alyssa> oh and ... i'm also trying to swap 0 layers ... that can't possibly work

19:36 <alyssa> sigh

19:55 ngcortes has quit [Ping timeout: 480 seconds]

20:05 <anholt_> oof. virgl really wants a base value on load_ubo.

20:06 <anholt_> (it requires that the ubo array's base index has to be in the TGSI register's index, not its indirect offset)

20:09 * alyssa can't tell if she's making positive or negative progress anymore

20:13 <alyssa> (git bisect 1, alyssa 0)

20:17 ppascher has joined #dri-devel

20:21 <alyssa> aggressive use of git bisect got me to Xorg again, weee

20:22 <airlied> anholt_: isn't that how ubos work in TGSI anyways?

20:22 <airlied> the indirect offset should only be used for indirect UBO accesses

20:22 <airlied> which requires GL4.0 / ARB_gpu_shader5 on the host

20:23 <alyssa> modeset is still broken. come on, this was working recently ago! :p

20:27 <anholt_> airlied: well, by tgsi spec it seems like it ought to be just as happy with the base offset in the indirect vs in the index field of the register.

20:28 <anholt_> but virgl ends up trying to deref vsconst[indirect] instead of vsubo1[indirect] because the "1" was added to the user's indirect offset.

20:29 Company has quit [Read error: No route to host]

20:30 Company has joined #dri-devel

20:35 rasterman has quit [Quit: Gettin' stinky!]

20:37 gpoo has quit [Ping timeout: 480 seconds]

20:38 <alyssa> ok modesetting working again. sort of.

20:52 <clever> ive been booting linux on an arm board, and i'm trying to add a framebuffer to it now

20:52 <alyssa> clever: i'm so sorry

20:52 <clever> ive added a `compatible = "simple-framebuffer";` to the DT, but as soon as i turn that on, the SD card stops working

20:53 <clever> alyssa: getting an image out on video is already working

20:53 <clever> the problem is getting linux to put an image at that addr in ram

20:54 <alyssa> simple-framebuffer is as simple as it gets

20:54 <alyssa> so if that doesn't work ... it's not the gfx stack's fault ...

20:54 <alyssa> most likely

20:54 <clever> what could i look into, to debug this more?

20:55 <alyssa> dunno -- but if a trivial gfx thing is breaking your SD card you have far bigger issues than gfx

20:55 quasselcore_ has quit [Read error: Connection reset by peer]

20:56 <clever> my DTB patching is also extremely flakey

20:57 <clever> https://github.com/librerpi/lk-overlay/blob/master/app/linux-bootloader/loader.c#L133-L139

20:57 <clever> alyssa: this should be disabling the simple-framebuffer, but it doesnt do anything

20:57 loki_val is now known as crabbedhaloablut

20:57 <marex> clever: doesnt upstream already contain all the RPi graphics hardware init stuff ?

20:57 <clever> marex: upstream is closed-source, ive replaced it with open source code

20:58 <clever> and now i have to re-implement nearly every feature

20:58 <marex> clever: I meant linux upstream

20:58 <clever> marex: there is an undocumented register, that essentially firewalls off the gfx hardware, and the arm core cant configure any of it

20:58 <clever> so the gfx init within linux just fails

20:59 <clever> i have already brought the gfx hw online with my own code, and it is running a full demo with shaders, in parallel to linux booting

20:59 <clever> one of the layers on the screen, is just whatever happens to be at +128mb into ram

20:59 <clever> reg = <0x8000000 40000>;

20:59 <clever> and simple-framebuffer then tells linux to use that
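
The reg property quoted above can be sanity-checked with a little arithmetic. This is only a sketch under stated assumptions: it assumes one address cell and one size cell, and the 640x480 16bpp mode is a made-up example, not a value from this discussion. Device tree source follows C integer syntax, so a cell written without a 0x prefix is read as decimal.

```python
# Sketch only: the resolution and pixel format used below are assumptions,
# not values taken from the log.

FB_BASE = 0x8000000  # first reg cell as quoted in the log
FB_SIZE = 40000      # second reg cell as quoted; no 0x prefix, so it is decimal

MIB = 1024 * 1024


def required_fb_bytes(width, height, bytes_per_pixel, stride=None):
    """Smallest region, in bytes, that holds one full frame."""
    if stride is None:
        # Assume tightly packed rows when no stride is given.
        stride = width * bytes_per_pixel
    return stride * height


def fits(width, height, bytes_per_pixel, region_size=FB_SIZE):
    """True if one frame of the given mode fits in region_size bytes."""
    return required_fb_bytes(width, height, bytes_per_pixel) <= region_size


if __name__ == "__main__":
    print("base is at +%d MiB" % (FB_BASE // MIB))
    print("640x480 at 16bpp needs", required_fb_bytes(640, 480, 2), "bytes")
    print("fits in the quoted region:", fits(640, 480, 2))
```

Under these assumptions the base address matches the +128mb offset mentioned above, while a 40000 byte region is much smaller than one frame of the example mode.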

21:00 vivijim has quit [Ping timeout: 480 seconds]

21:00 <clever> oh

21:00 <clever> https://github.com/librerpi/lk-overlay/blob/master/platform/bcm28xx/arm/arm.c#L78-L83

21:00 <clever> there is also a custom MMU, between the arm core, and the ram

21:00 <clever> i didnt map +128mb

21:01 <clever> so when the arm tries to write to +128mb, it gets re-routed to $undefined

21:01 <clever> and who knows what happens!!

21:02 gpoo has joined #dri-devel

21:02 <marex> clever: cant linux manage all that just like on other hardware with (io)mmu ?

21:02 <marex> it seems to me various GPUs also have their own MMUs and Linux handles that

21:03 <clever> marex: this MMU can only be written to by the VPU core, and is meant to do things at a management engine level

21:03 <alyssa> clever: then... write it from the VPU...

21:03 <clever> alyssa: yep, already patched it to map the 128mb offset for the framebuffer

21:03 <alyssa> (in general I don't get the appeal of doing everything on the arm core. it's an uphill battle.)

21:04 <clever> in the case of gfx, not using the arm would add latency to pageflips

21:04 <clever> even with things all on the vpu, i'm seeing a stable tear about 20 pixels from the top of the screen

21:05 <clever> i see a flashing cursor!!!

21:05 <clever> ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00007f00 ]---

21:06 <clever> it stopped flashing, lol

21:06 <clever> Kernel panic - not syncing: Attempted to kill init! exitcode=0x00007f00

21:08 <alyssa> again... why can't you use mainline vc4?

21:08 <clever> alyssa: it fails with "async external abort" if it writes to even 1 control register

21:08 <alyssa> why?

21:08 <clever> undocumented firewall, that wont let the arm manage things

21:08 <clever> it has to be turned off by the firmware first, and i dont know how

21:08 <alyssa> so you're missing a power gate or a clock gate? sounds like one or two registers probably?

21:09 <alyssa> lot easier than reimplementing the entire gfx stack at any rate

21:09 <clever> definitely not power or clock gating

21:09 <clever> its rendering gfx right now

21:09 <clever> alyssa: https://www.youtube.com/watch?v=GHDh9RYg6WI

21:09 <clever> that looks like "on" to me

21:12 gouchi has joined #dri-devel

21:18 hansg has quit [Remote host closed the connection]

21:19 <clever> alyssa: just a bit triggered, because every RPF engineer ive asked for help, says clock/power gating, and then never answers again

21:20 <clever> and that above video, clearly shows its on

21:21 Duke`` has quit [Ping timeout: 480 seconds]

21:22 dviola has quit [Ping timeout: 480 seconds]

21:23 stuartsummers has joined #dri-devel

21:25 co1umbarius has quit [Ping timeout: 480 seconds]

21:30 danvet has quit [Ping timeout: 480 seconds]

21:48 gouchi has quit [Remote host closed the connection]

21:50 JohnnyonFlame has quit [Read error: Connection reset by peer]

21:54 <alyssa> easily could be multiple independent gates

21:54 <alyssa> or dependent gates or...

21:54 <alyssa> I don't know the clock topology.

21:55 <alyssa> I'm also not an RPF engineer

21:55 <clever> alyssa: if i write to a register from the VPU, and then read it from the ARM, i can see the value

21:55 <clever> that implies that there are no gates blocking it

21:56 <clever> if i write from the VPU, then write the identical value from the ARM, it gets an async abort

21:56 <clever> that implies that only writes are blocked

21:57 <marex> clever: what is VPU btw ?

21:57 <marex> Video Processing Unit ?

21:58 <clever> marex: the VPU is a dual-core processor with scalar and vector opcodes

21:58 <clever> it lacks an MMU, but has a form of MPU (step out of lines, and you fault, but the addresses must be identity mapped)

21:58 <marex> so custom ISA ?

21:58 <marex> or some small ARM coprocessor ?

21:58 alyssa has left #dri-devel [#dri-devel]

21:58 <clever> its a variant of a synopsys DSP, which is ARC based

21:59 <clever> but it doesnt match any of the existing ARC toolchains

21:59 <clever> somebody modified the ISA heavily

21:59 <clever> https://github.com/itszor/vc4-toolchain gives you a binutils + gcc that can target it

21:59 <marex> oh

21:59 <marex> clever: the async abort could be multiple things, ranging from what alyssa said, IP is gated off (clock, reset, etc), but also things like wrong write width

21:59 <marex> i.e. the write might need to be 8/16/32bit and no other

22:00 <clever> marex: always doing 32bit writes

22:00 <marex> or yes, it is blocked because secure/non-secure setting of the ARM core

22:00 <clever> 8bit reads malfunction in all kinds of fun ways

22:00 <clever> when running under the closed firmware, you can just mmap /dev/mem, and poke the HVS control registers, directly from userland

22:00 <clever> that implies zero security is being enforced, in that state

22:01 <clever> there is undocumented hardware, that allows you to block ALL MMIO from userland, reads just return null

22:01 <clever> but i have found that and disabled it

22:01 <clever> and i believe i tested HVS writes from arm secure mode, and they failed

22:03 <clever> something is selectively blocking HVS writes, but allowing SD/uart/i2c writes

22:03 <marex> well if it is disabled, why do you get aborts ? :)

22:03 <clever> there appear to be at least 2 different security features

22:03 <clever> 1 to stop userland from doing mmio, even if it gets mapped somehow

22:03 <clever> a 2nd to stop even kernel mode from touching certain dangerous hardware

22:04 <marex> clever: well ... what happens if you try the access from e.g. u-boot ?

22:04 <clever> i'm using a custom bootloader, and it fails even from the first few opcodes run on the arm

22:04 <clever> but when using the closed firmware, it doesnt fail

22:04 <marex> that should run in secure mode (on rpi2 and 3 anyway)

22:05 <marex> clever: the closed firmware is running on ARM or on the companion core ?

22:05 <clever> the closed firmware runs on the VPU

22:05 <marex> can you e.g. run the closed firmware in QEMU and have it poke registers via JTAG , thus find out what it writes and reads and when etc ?

22:06 <clever> nope

22:06 <clever> qemu doesnt support the VPU ISA (yet, its on my todo list)

22:06 <clever> the jtag for the VPU is also not publicly documented

22:06 <marex> qemu does support arc :)

22:06 <clever> but somebody at broadcom heavily modified this arc variant

22:06 <clever> it disagrees with the arc isa in a number of ways

22:06 <clever> https://github.com/raspberrypi/tools/blob/master/armstubs/armstub7.S

22:07 <clever> when booting normally, the closed firmware will drop a compiled copy of armstub7.S at arm physical 0 (the reset vector)

22:07 <clever> it will then patch lines 194/195, with the addresses for the linux kernel and dtb files

22:07 <clever> armstub7.S is responsible for setting things like CNTFRQ, and dropping out of secure mode, then running linux

22:08 <clever> if youre using uboot or tianocore, then youve replaced armstub7.S with that

22:08 <clever> linux expects CNTFRQ to contain the freq of the arm timer, but the register defaults to 0, and linux trusts it way too much, leading to a divide by zero!
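
The CNTFRQ failure described above can be illustrated with a toy conversion. This is a hedged sketch, not the kernel's code: `ticks_to_ns` and `checked_ticks_to_ns` are hypothetical helpers, and the 19.2 MHz figure is only an example frequency.

```python
# Sketch only: hypothetical helpers, not taken from Linux or armstub7.S.

NSEC_PER_SEC = 1_000_000_000


def ticks_to_ns(ticks, cntfrq_hz):
    """Convert timer ticks to nanoseconds, trusting the advertised frequency."""
    # With cntfrq_hz == 0 (the reset default mentioned above) this divides by zero.
    return ticks * NSEC_PER_SEC // cntfrq_hz


def checked_ticks_to_ns(ticks, cntfrq_hz):
    """Same conversion, but reject an unprogrammed frequency register."""
    if cntfrq_hz == 0:
        raise ValueError("CNTFRQ is 0; boot code has to program it first")
    return ticks_to_ns(ticks, cntfrq_hz)


if __name__ == "__main__":
    example_hz = 19_200_000  # example value only
    print(ticks_to_ns(example_hz, example_hz))
    try:
        ticks_to_ns(1, 0)
    except ZeroDivisionError as err:
        print("unprogrammed CNTFRQ:", err)
```

The point of the checked variant is just that the failure becomes an explicit error instead of an arithmetic fault; the real fix is still to program the register in the boot stub.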

22:09 <clever> thats one of the many things ive discovered have to be configured

22:11 <marex> ACK

22:11 <clever> the FPU must also be enabled (armstub7.S has examples) and not flagged for trapping

22:12 <clever> if you fail to do so, linux just assumes you dont have an FPU

22:12 <clever> and the instant userland touches the FPU, you get SIGILL

22:12 <clever> my latest iteration on the bootloader uses lazy context switching on the FPU, so it had been configured to trap all access

22:12 <clever> took a few days to figure out

22:17 NiksDev has quit [Ping timeout: 480 seconds]

22:24 <clever> marex: i think i see the cause of the latest failure, color_imageblit() is rendering the penguin logo to the framebuffer, i dont think it checks if it fits...

22:24 <clever> so if the framebuffer is too small, it will overflow, and fault

22:25 <clever> CONFIG_LOGO=n, ok, there goes that theory...

22:25 <clever> why is it even running this then?

22:30 <clever> ah

22:30 <clever> this appears to be the font rendering, for the console
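
The overflow theory above (a blit that does not check whether it fits) can be sketched as a bounds check on a linear framebuffer. This is an illustration under assumptions, not what the kernel's blit code does: `blit_fits` is a hypothetical helper, and the 8x16 glyph and 1280 byte stride are example values.

```python
# Sketch only: hypothetical helper with example geometry, not kernel code.


def blit_fits(fb_size, stride, bytes_per_pixel, x, y, w, h):
    """True if a w*h rectangle at (x, y) stays inside fb_size bytes."""
    if w <= 0 or h <= 0 or x < 0 or y < 0:
        return False
    # The rectangle must not run past the end of a row.
    if (x + w) * bytes_per_pixel > stride:
        return False
    # Offset one past the last byte touched on the last row of the rectangle.
    end = (y + h - 1) * stride + (x + w) * bytes_per_pixel
    return end <= fb_size


if __name__ == "__main__":
    fb_size = 40000  # region size as quoted earlier in the log
    stride = 1280    # example: 640 pixels at 2 bytes each
    for row in range(3):
        ok = blit_fits(fb_size, stride, 2, 0, row * 16, 8, 16)
        print("text row", row, "fits:", ok)
```

With these example numbers only the first two rows of 16 pixel tall glyphs land inside the region; anything drawn lower would write past its end.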

22:48 i-garrison has quit []

22:57 ngcortes has joined #dri-devel

22:59 Surkow|laptop has quit [Remote host closed the connection]

23:01 Surkow|laptop has joined #dri-devel

23:01 co1umbarius has joined #dri-devel

23:07 tursulin has quit [Remote host closed the connection]

23:15 Hi-Angel has quit [Ping timeout: 480 seconds]

23:16 iive has quit []

23:30 thellstrom has joined #dri-devel

23:30 thellstrom1 has quit [Read error: Connection reset by peer]

23:59 hanetzer has joined #dri-devel

23:59 pcercuei has quit [Quit: dodo]

23:59 <hanetzer> I need to find a bar.