ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
yogesh_mohan has quit [Ping timeout: 480 seconds]
<Kayden> okay, that time mingw failed for....reasons
pcercuei has quit [Quit: dodo]
<Kayden> the build job failed, but the build actually succeeded
<Kayden> so...just trying again.
<daniels> Summary of Failures:
<daniels> 30/35 mesa:compiler+glsl / glsl optimization FAIL 18.72s (exit status 1)
<Kayden> thanks, missed that
<Kayden> just means more things flaking though
<daniels> Unexpected output on stderr: 0019:err:service:process_send_command service protocol error - failed to read pipe r = 0 count = 0!
<daniels> remove_continue_at_end_of_loop: FAIL
<daniels> yep, it does
<Kayden> ah :(
<Kayden> EOF would be an unexpected GLSL result, yeah. :D
<Kayden> I had just reassigned marge because I figured I was no longer at the front of the queue, so it would rebase and re-run tests anyway
<Kayden> but I was actually at the front of the queue, so it auto-failed again, reusing the same test results
<Kayden> so I hit re-run on the mingw job
<Kayden> (figured re-running that job would be a waste if it was rebasing anyway...)
<Kayden> usually the right thing to do, but not this time, hehe
<daniels> yeah :)
<Kayden> magic
<anholt> Kayden: thanks for sorting out an intel flake marker!
<Kayden> heh, least I can do
<Kayden> just need to learn the system a bit better
<anholt> all: vulkan cts uprev MR is up at !14920. would love for anyone to review it and push it along, I'm feeling more under the weather again and might be done working for a bit. (woo covid)
<Kayden> ! :(
<Kayden> hope you feel better soon
<anholt> Kayden: the iris egl crashes would also be interesting to look into, if you had some bandwidth.
<anholt> also would love to get asan into the intel pipeline.
<Kayden> I haven't had a lot of luck looking into those in the past :(
<Kayden> asan would be great, haven't kept up with that at all
maxzor_ has quit [Ping timeout: 480 seconds]
<Kayden> anholt: what do you need others to do on the cts uprev MR?
<Kayden> definitely happy to see the uprev happen
<airlied> anholt: any idea how much asan is CTS vs driver?
<Kayden> surprised to see anv crashes on some tests, I know we recently passed 1.3 CTS on those platforms, but maybe some patches got lost in the shuffle
<Sachiel> is it 1.3.0 or 1.3.1? There were new things going in on 1.3.1 that we are not passing
<Kayden> ah, that's why
<Kayden> (it's 1.3.1)
<Sachiel> we also have some failures from a cts fix from dj-death that didn't make it to 1.3.1
<cheako> Is there anything I can do to classify semaphores? I'm thinking about looking at the least significant bits of the value Vulkan uses, to try to catch an alignment issue... but I don't think the address, if that number even is an address, of the object affects a semaphore's performance.
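If one wanted to poke at handle bits the way cheako describes, it is only a few lines of C. A minimal sketch, with the loud caveat that VkSemaphore is a non-dispatchable handle, an opaque 64-bit value the spec never promises is an address, so whatever shows up in the low bits is driver-specific trivia at best:

    #include <stdint.h>
    #include <stdio.h>
    #include <vulkan/vulkan.h>

    /* Hypothetical helper: dump the low bits of a semaphore handle.
     * The handle is opaque and not guaranteed to be an address, so
     * this is for poking around only, not a supported query. */
    static void dump_semaphore_bits(VkSemaphore sem)
    {
        uint64_t v = (uint64_t)sem;
        printf("semaphore %#llx, low 6 bits: %#llx\n",
               (unsigned long long)v, (unsigned long long)(v & 0x3f));
    }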
tursulin has quit [Read error: Connection reset by peer]
ybogdano has joined #dri-devel
mbrost has joined #dri-devel
dllud_ has quit [Read error: Connection reset by peer]
dllud has joined #dri-devel
dllud_ has joined #dri-devel
dllud has quit [Read error: Connection reset by peer]
mbrost has quit [Read error: Connection reset by peer]
iive has quit []
mbrost has joined #dri-devel
<Kayden> YES
<Kayden> multiple things marged
<zmike> Kayden: if you get time can you check out !14878? you were the one who made the pass run originally
Company has quit [Quit: Leaving]
agd5f_ has joined #dri-devel
agd5f has quit [Ping timeout: 480 seconds]
tarceri has joined #dri-devel
<cheako> https://gitlab.com/cheako/vk-layer-cache/-/blob/master/src/lib.rs#L103-117 I know it's because I'm doing something... and the api-dump layer is covering for me. I just don't know what. When I follow this code with api-dump all is well, otherwise this is an endless loop.
<zmike> I think you probably want either a vulkan channel or a rust channel
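The code behind that link isn't reproduced here, but a classic source of endless loops around Vulkan enumeration-style calls is retrying on VK_INCOMPLETE without growing the buffer. A hedged C sketch of the two-call idiom, using vkGetSwapchainImagesKHR as a stand-in for whichever entry point the layer actually intercepts (that this matches cheako's bug is an assumption; allocation error handling elided):

    #include <stdlib.h>
    #include <vulkan/vulkan.h>

    /* Standard two-call enumeration: query the count fresh on every
     * retry and grow the array to match, so a VK_INCOMPLETE result
     * can't loop forever with a too-small buffer. */
    static VkResult get_swapchain_images(VkDevice dev, VkSwapchainKHR sc,
                                         VkImage **images, uint32_t *count)
    {
        VkResult res;
        do {
            res = vkGetSwapchainImagesKHR(dev, sc, count, NULL);
            if (res != VK_SUCCESS)
                return res;
            *images = realloc(*images, *count * sizeof(VkImage));
            res = vkGetSwapchainImagesKHR(dev, sc, count, *images);
        } while (res == VK_INCOMPLETE);
        return res;
    }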
ngcortes has quit [Remote host closed the connection]
mbrost has quit [Ping timeout: 480 seconds]
co1umbarius has joined #dri-devel
columbarius has quit [Ping timeout: 480 seconds]
Kat_Witten has joined #dri-devel
hch12907__ has quit [Ping timeout: 480 seconds]
Kat_Witten has quit []
<imirkin> zmike: scary to think i would have been the last to touch it. it was _ages_ ago
<zmike> imirkin: yeah this has the feel of code that everyone's too afraid to tip over
<imirkin> anyways, i'll try to sort something out tonight. maybe i can convince nouveau to feed it through the translate thing.
<zmike> nice, thanks
<imirkin> we do it for the R64_* formats, so R32_USCALED shouldn't be too big a lift
<imirkin> (p.s. whoever thought the R64_* formats were a good idea ... i disagree.)
<zmike> at the least it'd be cool to not lose the lower bits there
<zmike> need those to pass cts
<imirkin> well, should be straightforward to make it just work properly
<zmike> yeah
<imirkin> but it's not completely trivially obvious, so i def want a way to repro first
<zmike> nice
<imirkin> (i know, testing fixes before pushing is so passe, but i guess i'm old-fashioned)
<zmike> well you could just use zink on that test like I suggested
<zmike> but disabling format support on any driver will also do it
<imirkin> yeah, but i don't have anything zink-capable here. the gen9 thing is at work. i can try to use it, but might be easier to do it all local
<zmike> oof how's Debian sarge treating you?
<imirkin> if that was directed at me, i don't use debian
<zmike> probably also wouldn't be using sarge since that was like 20 years ago
<imirkin> ah ok
<imirkin> the name sounded mildly familiar
jljusten has quit [Remote host closed the connection]
<imirkin> zmike: `Result vector is equal to [ 0, 4 ], but [ 1, 5 ] was expected.` -- that's a lot like the failure you're seeing right?
<zmike> yup
<zmike> that's the exact one
<imirkin> excellent
<imirkin> now to go digging in the intel intrinsics guide
<imirkin> i know the op is there, but who knows what it's called
<imirkin> heh. AVX512F adds unsigned -> float conversion. good thing every cpu has that...
yogesh_mohan has joined #dri-devel
mbrost has joined #dri-devel
<HdkR> AVX512 fills a bunch of holes that the previous vector extensions missed. x86 really needs an extension like AVX3 that just brings those in without caring about 512bit registers :|
m has joined #dri-devel
<HdkR> AVX-maintenance3
ybogdano has quit [Ping timeout: 480 seconds]
maxzor_ has joined #dri-devel
<imirkin> wtf is the difference between "pand" and "andps" and "andpd"?
<imirkin> it's doing a bitwise and ... who cares if it's 4x32, 2x64, or 1x128?
hch12907__ has joined #dri-devel
<imirkin> ok. so "subtle shit i don't care about"
<HdkR> pretty much
<imirkin> "i just want USCALED to be not-broken, i don't care if it loses a cycle here and there"
<imirkin> ;)
hch12907_ has joined #dri-devel
<DrNick> that subtle shit you don't care about is also 14 years old and probably wrong by now
<imirkin> that ... doesn't make me care more
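For context on the AVX512F remark above: before vcvtudq2ps existed, the usual SSE2 workaround for unsigned-to-float was to split each 32-bit lane into 16-bit halves, convert both with the signed convert (exact, since each half fits in 16 bits), and recombine. A sketch of that idiom, not necessarily what gallium's translate module ends up emitting:

    #include <emmintrin.h> /* SSE2 */

    /* Convert 4 unsigned 32-bit lanes to float without AVX512F:
     * v == hi * 65536 + lo; both halves convert exactly via the
     * signed int->float instruction, and the final multiply-add
     * rounds once. */
    static __m128 cvt_u32_to_f32(__m128i v)
    {
        __m128i lo = _mm_and_si128(v, _mm_set1_epi32(0xffff));
        __m128i hi = _mm_srli_epi32(v, 16);
        return _mm_add_ps(_mm_mul_ps(_mm_cvtepi32_ps(hi),
                                     _mm_set1_ps(65536.0f)),
                          _mm_cvtepi32_ps(lo));
    }

The single rounding in the final add is as good as the native instruction for most purposes, which is why losing a cycle here and there, as imirkin puts it, is the only real cost.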
fxkamd has joined #dri-devel
hch12907__ has quit [Ping timeout: 480 seconds]
mripard_ has joined #dri-devel
hch12907__ has joined #dri-devel
mripard has quit [Ping timeout: 480 seconds]
hch12907_ has quit [Ping timeout: 480 seconds]
dllud has joined #dri-devel
dllud_ has quit [Read error: Connection reset by peer]
jljusten has joined #dri-devel
mclasen has quit [Ping timeout: 480 seconds]
<mareko> uscaled = uint + shader code
<mareko> r11g11b10f = uint + shader code, etc.
TheComputerGuy has joined #dri-devel
<imirkin> mareko: meh, well translate should handle it. i have it working, i think
<imirkin> could also teach translate to "return false" in that case
<imirkin> fwiw nvidia hw handles uscaled directly
<imirkin> (but doesn't handle fixed / R64 vertex formats)
<graphitemaster> Does mesa / the linux graphics stack have a device database with gpu hardware info? Like, given say a model name, query the number of compute units or memory bandwidth it has. Or is it still very much just whatever the hardware reports, with no info for anything but the hardware you actually have and are using?
<airlied> graphitemaster: the latter
<graphitemaster> *cries*
<jekstrand> It's really hard to make a database of that information. On Intel, for instance, chip name doesn't have anything to do with memory bandwidth.
<jekstrand> That's all about what you've got plugged in
<jekstrand> On discrete, you can get the same series card with different memory configs depending on card manufacturer
<graphitemaster> memory bandwidth was just an example, I'd rather more just know the number of compute units and "lanes" per simd or whatever something has
<jekstrand> Number of compute units may be easier but we don't have a database for it.
<HdkR> It's impossible to maintain too, since your bottlenecks can change so dramatically over time. Compute or BW might not even be the bottleneck if your shader uses some edge case feature that's slow. :P
<jekstrand> For Intel, you can look at intel_device_info.c
<graphitemaster> I don't need it to be accurate - I was just hoping to have at least some information to compare against
<graphitemaster> Even a window would be fine
<graphitemaster> Like oh, it runs somewhere between 20 GiB/s and 64 GiB/s is already pretty decent
<HdkR> And then you get something like RDNA2 where your VRAM BW is quite low, but it punches above its weight due to cache :D
<graphitemaster> This is actually surprisingly not that difficult to do with NV - AMD is a mess, Intel seems the worst :D
<airlied> wikipedia :-P
<HdkR> and then Nvidia throws a curveball like the GTX 970 at you
<jekstrand> Also, your SIMD width on Intel is variable. (-:
<graphitemaster> My main issue with AMD is how they ran backwards in time around the GCN era: the RX 520, 530, and 540 GPUs all came out the same day, one is GCN 1, another GCN 3, and the third GCN 4 - what is that mess? Also fuckin' HD 7000 series GPUs that are TeraScale 2 instead of GCN. They went forward on GCN, then back to TeraScale, then back to GCN again (four times by my count)
<jekstrand> You can always look at https://browser.geekbench.com/v5/compute and hope it corresponds to something useful. :-/
<imirkin> graphitemaster: model numbers are marketing names. you can't trust them for anything.
<imirkin> NVIDIA GT 630 could be fermi, kepler, or kepler2
<graphitemaster> tl;dr it all went to shit in 2006 XD
<airlied> graphitemaster: amd often had the low end of a series be an older series
<graphitemaster> Yeah, I did not know that until the past few days as I've been trying to write up my own database
<airlied> marketing names are marketing
<graphitemaster> Made me lose a bit of respect for AMD because it's actually confusing
<airlied> wikipedia is the answer
<graphitemaster> And dishonest.
<jekstrand> And if it's Intel, there are at least 4 different marketing names for each GPU and none of them correlate to any of the others cleanly!
<graphitemaster> I wrote in work chat
<graphitemaster> "I seriously think I'm going to find some job recruitment for data analyst at the CIA if I keep going down the AMD product naming and chronology Wikipedia."
<airlied> yeah I think all of them make up marketing names
<jekstrand> NVIDIA is the only one who's actually gotten marketing rightish, IMO.
<airlied> as imirkin pointed out above, even they do it wrong sometimes
* HdkR pushes Nvidia's mobile stack under a rug quick.
<vsyrjala> except for selling ancient gpus under new names. at least that's what i read somewhere
<jekstrand> HdkR: Well, yeah, no one wants to look at that anyway. :P
<imirkin> jekstrand: they might get marketing right, but the marketing names have no connection to generation :)
<airlied> maxwell and maxwell2
<graphitemaster> The Wikipedia tables for AMD are also quite wrong in some areas, which I made no attempt to fix because it's inline HTML + CSS within a tiny 400px box on Wikipedia and frankly I'm not a web developer
<jekstrand> imirkin: Maybe I've just not paid enough attention and got fooled like the rest of the world.
<graphitemaster> And if it was hard for me to figure it out, it should be hard for others too XD
<imirkin> jekstrand: exactly. they got marketing right :)
<jekstrand> imirkin: Yup!
<imirkin> on average, x50 and up is reliably a fixed generation
<imirkin> but below x50, it's a mix
<graphitemaster> I still can't find any consistent information on wave32 / wave64 for AMD GPUs either.
<jekstrand> imirkin: In that case, maybe Intel got marketing right too if the objective is to maximize your confusion * sales.
<imirkin> i.e. 650/660/670/680 are all going to be kepler, no matter what
<imirkin> but 640 and below can be whatever
<graphitemaster> NV has made a couple miffs
<imirkin> and like HdkR mentions, on mobile it's actually even more confused
<jekstrand> Oh, I assume mobile is a disaster always
<graphitemaster> As someone pointed out to me earlier: the 8800 GT came in 320MiB and 640MiB flavors; the 8800 Ultra came in a 768MiB flavor and was also released as the 8800 GTX; the 8800 GTS 512MiB, however, was a GF9 part. So the GTS 512MiB was faster than the GT 640MiB and the GTX 768MiB (aka Ultra 768MiB) and came out later than all of those.
<jekstrand> It's an axiom of mobile that everything is custom and weird and nothing makes sense.
<imirkin> like a GTX 750 is maxwell, but GTX 750M can be kepler, among other things
<jekstrand> :facepalm:
<imirkin> and a bunch of the 8xx series are kepler too
<imirkin> or a mix
<graphitemaster> What is the benefit of doing this? Is it just to get rid of stock or something lol.
<imirkin> graphitemaster: bigger numbers = more sales
<imirkin> nobody wants a 6xx when the latest is 8xx
<graphitemaster> Honestly seems like false advertising to me.
<vsyrjala> "it's just a number"
<graphitemaster> When you get the new iPhone it's always a new iPhone, not the old iPhone with a model number change
<jekstrand> If you have a process bump, often the older process is cheaper and has higher throughput
<jekstrand> So, if you can keep selling the niceish old thing as a less nice new thing, why not?
<graphitemaster> imirkin, :D
<imirkin> AMD does this a lot too
<imirkin> even like AMD HD 6540 is TeraScale
<graphitemaster> To AMD's credit, you can dump their product specifications as JSON or XML via the button in the top right corner beside the search bar: https://www.amd.com/en/products/specifications/graphics
<jekstrand> Intel definitely did with 10th generation i-series.
<graphitemaster> But it's all scuffed.
<graphitemaster> I tried unscuffing it but got nowhere
<graphitemaster> I also found errors even in their own data
<jekstrand> And then there's where you release exactly the same hardware and drop a few "go slow" loops from the driver when you detect that PCI ID. :-D
* maxzor notes that AMD wikipedia pages need love indeed
shankaru has joined #dri-devel
<graphitemaster> Add a compute unit count to the table if someone is going to fix the AMD tables on Wikipedia.
<graphitemaster> The unified shader cores thing doesn't always divide by 64 or whatever into CUs, as I found
<maxzor> it's already there?
<graphitemaster> Only on the newer cards
<maxzor> This page is actually one of the best amd pages https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units
TheComputerGuy has quit []
<graphitemaster> The RX 6000 config table confuses me
<graphitemaster> Since they define `e` as "unified shaders : texture mapping units : render output units"
<graphitemaster> But those have 4 elements instead of 3
<graphitemaster> My guess is ray accelerators is the last column but those numbers do not match the ones on AMDs site
<graphitemaster> It's extra confusing because X1000 has 4 columns too and that was in 2005, way before ray accelerators were a thing
hch12907__ has quit [Ping timeout: 480 seconds]
<graphitemaster> I guess ascribing any consistency between separate table generations is a mistake
<maxzor> eh, it makes maintenance easier, because wikitext is already a horror, but with templates and tables this comes close to hell https://en.wikipedia.org/wiki/User:Maxorazon/sandbox/AMG_GPU_features_transclusion_example
<graphitemaster> Right
<graphitemaster> It would be even easier if this was just all in a JSON document and you ran a script to generate the wikitext :P
fxkamd has quit []
<maxzor> yea <3
soreau has quit [Read error: Connection reset by peer]
soreau has joined #dri-devel
<maxzor> AMD ProRender already has graphics/compute interop and ray-tracing support. Blender is not even working yet with HIP HAHAhahaaaa
<maxzor> Btw the article about the graphics pipeline is in a dire state too. https://en.wikipedia.org/wiki/Graphics_pipeline
<maxzor> A few years ago I was considering giving up on Wikipedia, given how bad it is that the knowledge corpus lives in such a horrendous markup language
<maxzor> But it's all we have...
<maxzor> and lately the work on parsoid gave me hope of disenclaving the petabytes of prose
<graphitemaster> That's not very inclusive language :|
<airlied> yeah please don't be posting that sort of crap in here
<maxzor> ¯\_(ツ)_/¯ it is appropriate in this specific context
hch12907_ has joined #dri-devel
<DrNick> my favorite thing about AMD wikipedia articles is how every one of them is sure to mention that it supports HyperZ in the first paragraph
hch12907__ has joined #dri-devel
hch12907 has joined #dri-devel
hch12907_ has quit [Ping timeout: 480 seconds]
hch12907__ has quit [Ping timeout: 480 seconds]
hch12907_ has joined #dri-devel
hch12907 has quit [Ping timeout: 480 seconds]
hch12907_ has quit [Ping timeout: 480 seconds]
mattrope has quit [Remote host closed the connection]
Duke`` has joined #dri-devel
lemonzest has quit [Quit: WeeChat 3.4]
<dcbaker> airlied: do you want me to pull the crocus CI into 22.0, or just drop changes to the CI files and leave it off?
<airlied> dcbaker: probably fine to just leave it off for 22.0
lemonzest has joined #dri-devel
<dcbaker> sounds good, I've pushed one patch now with the CI file modifications dropped
blue_penquin has quit [Server closed connection]
blue_penquin has joined #dri-devel
hch12907_ has joined #dri-devel
m has quit []
jewins has quit [Read error: Connection reset by peer]
hch12907__ has joined #dri-devel
mmind00 has quit [Server closed connection]
mmind00 has joined #dri-devel
rpigott has quit [Remote host closed the connection]
i-garrison has quit []
i-garrison has joined #dri-devel
hch12907_ has quit [Ping timeout: 480 seconds]
i-garrison has quit []
i-garrison has joined #dri-devel
itoral has joined #dri-devel
danvet has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
libv_ has joined #dri-devel
libv has quit [Ping timeout: 480 seconds]
Duke`` has quit [Ping timeout: 480 seconds]
hch12907__ has quit [Ping timeout: 480 seconds]
sdutt has quit [Read error: Connection reset by peer]
mlankhorst has joined #dri-devel
Wallbraker[m] has quit [Server closed connection]
Wallbraker[m] has joined #dri-devel
maxzor_ has quit []
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
frieder has joined #dri-devel
mvlad has joined #dri-devel
ppascher has quit [Quit: Gateway shutdown]
libv_ is now known as libv
tobiasjakobi has joined #dri-devel
tobiasjakobi has quit [Remote host closed the connection]
hch12907__ has joined #dri-devel
ppascher has joined #dri-devel
hch12907_ has joined #dri-devel
hch12907 has joined #dri-devel
hch12907__ has quit [Ping timeout: 480 seconds]
hch12907__ has joined #dri-devel
hch12907_ has quit [Ping timeout: 480 seconds]
hch12907 has quit [Ping timeout: 480 seconds]
hch12907_ has joined #dri-devel
jfalempe has quit []
sigmaris has quit [Server closed connection]
sigmaris has joined #dri-devel
tomeu829 has quit []
tomeu has joined #dri-devel
tzimmermann has joined #dri-devel
hch12907__ has quit [Ping timeout: 480 seconds]
jfalempe has joined #dri-devel
tursulin has joined #dri-devel
gio has quit [Server closed connection]
gio has joined #dri-devel
MajorBiscuit has joined #dri-devel
Guest2141 has left #dri-devel [#dri-devel]
pepp has joined #dri-devel
<javierm> tzimmermann: hi, could you please elaborate on the "So the copying could start and end in the middle of bytes" you mentioned in https://www.spinics.net/lists/dri-devel/msg332221.html ?
<javierm> tzimmermann: I'm going through the feedback I got on v2 to prepare v3 and that's the only comment that isn't clear to me
Arsen has quit [Server closed connection]
Arsen has joined #dri-devel
Arsen is now known as Guest2209
famfo has quit [Server closed connection]
famfo has joined #dri-devel
hch12907_ has quit [Ping timeout: 480 seconds]
<hakzsam> anholt: actually, I still have a few Missing tests around with 0.12.0 and 1.3.1.0, like https://pastebin.com/raw/nEU9Z4ZV
mszyprow has joined #dri-devel
<tzimmermann> javierm, those damage rectangles are given in scanline and pixel coordinates. pixels can be in the middle of a single byte. imagine you want to convert pixels 5 to 11 of a scanline. if one of your formats is 1-bit mono, the first and final pixels would be located in the middle of a byte
dviola has joined #dri-devel
<hakzsam> anholt: something like: ERROR - Failure getting run results: No results parsed. Is your caselist out of sync with your deqp binary? (See "/mnt/mesa/../../mnt/results/cts/c152.r1.log")
<javierm> tzimmermann: I still don't get it... sorry. How can pixels be in the middle of a single byte if gray8 is 1 byte per pixel?
<javierm> tzimmermann: if gray8 -> mono conversion is done for pixels 5 to 11, then the bits in the packed mono may start in the middle of a byte, but that shouldn't be a problem
<javierm> one question is whether it's allowed to convert for scanlines < 8, probably that won't be supported?
<tzimmermann> javierm, np: 1-bit mono has 1 bit per pixel; so 1 byte contains 8 bits/pixels
<tzimmermann> right?
<javierm> tzimmermann: correct
<tzimmermann> for example, bit/pixel 5 is in the middle of the byte
<javierm> for mono yes
<tzimmermann> in the conversion function, if you only have the dst pointer, it can only refer to pixels 0, 8, 16, 24, etc. if you want to refer to pixel 5 you need dst pointing to pixel 0 and an offset into (*dst) so you know that you have to start at bit 5
<tzimmermann> regarding scanlines < 8: 'scanlines' refers to the vertical y axis. so it's not a problem
<tzimmermann> pixels per scanline refers to the horizontal x axis.
hch12907 has joined #dri-devel
<javierm> tzimmermann: ah, Ok. I meant pixels per scanline. That is, you can't convert fewer than 8 pixels, since the minimum you can get for mono is 8 pixels packed in one byte
<tzimmermann> exactly.
<javierm> tzimmermann: but now I understand what you mean. If the source isn't 8 pixel aligned, then it's a problem
<pq> YUV images with odd dimensions are a thing, even if chroma is 2x2 sub-sampled, so...
<tzimmermann> javierm, for a full solution, we'd need a bit-offset into *src and *dst to know where we start converting
<pq> I don't see why mono images should be limited to dimensions multiple of 8 either.
<tzimmermann> pq, it shouldn't. we're talking about limitations of the current api
<pq> uapi?
<tzimmermann> pq, no. internal
<pq> ok, good
<tzimmermann> javierm, for the usecase of your driver, i'd simply put in a check to ensure that everything is aligned properly. and extend the damage area in your driver
<tzimmermann> i.e., make x1 a bit lower and x2 a bit higher, if necessary
hch12907_ has joined #dri-devel
hch12907__ has joined #dri-devel
<javierm> tzimmermann: I would prefer to extend the API as needed, since I'm less interested in supporting this particular display than in making sure that DRM is suitable for porting the others
<tzimmermann> i don't think so
<tzimmermann> javierm, even better :)
<javierm> I mean, that was the motivation to buy this panel and start experimenting :)
<javierm> tzimmermann: so this offset calculation should just be internal to drm_fb_*_to_mono() right ?
<javierm> drivers that call the helpers to convert to mono shouldn't be aware of this
hch12907 has quit [Ping timeout: 480 seconds]
dos1 has quit [Server closed connection]
dos1 has joined #dri-devel
<tzimmermann> javierm, if you want to do the full thing, simply add parameters like dbuf_bit and sbuf_bit to the per-line helper
hch12907_ has quit [Ping timeout: 480 seconds]
hch12907_ has joined #dri-devel
hch12907__ has quit [Ping timeout: 480 seconds]
<javierm> tzimmermann: yes, only the drm_fb_gray8_to_mono_reversed_line() should be aware of this to shift the bits in the dest bytes as needed
<javierm> but it's more tricky because this may only apply to the first and last dst bytes, and not the ones in the middle, if the run is 8-pixel aligned
<javierm> tzimmermann: anyway, thanks a lot for pointing this out. I hadn't noticed that assumption before
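To make the bit-offset idea concrete, here is a sketch of what a per-line gray8-to-mono helper with a destination bit offset could look like. It is modeled on the drm_fb_gray8_to_mono_reversed_line() helper discussed above, but the dst_bit_offset parameter and the bit ordering are assumptions for illustration, not the actual kernel API:

    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical per-line conversion that can start at an arbitrary
     * bit within the first destination byte, so damage rectangles no
     * longer need to be 8-pixel aligned. MSB-first bit order is an
     * assumption; the real helper's order may differ. */
    static void gray8_to_mono_line(uint8_t *dst, unsigned dst_bit_offset,
                                   const uint8_t *src, size_t pixels)
    {
        for (size_t x = 0; x < pixels; x++) {
            unsigned bit = dst_bit_offset + x;
            uint8_t mask = 1u << (7 - (bit % 8));

            if (src[x] & 0x80) /* threshold at 50% gray */
                dst[bit / 8] |= mask;
            else
                dst[bit / 8] &= ~mask;
        }
    }

Only the first and last bytes of a run can be partially covered; a real implementation would special-case them and keep a byte-at-a-time fast path for the aligned middle, as javierm notes above.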
mstoeckl has quit [Server closed connection]
mstoeckl has joined #dri-devel
narmstrong has quit [Read error: Connection reset by peer]
narmstrong has joined #dri-devel
hwentlan____ has quit [Read error: Connection reset by peer]
hwentlan____ has joined #dri-devel
pinchartl has quit [Server closed connection]
pinchartl has joined #dri-devel
rasterman has joined #dri-devel
vup has quit [Server closed connection]
vup has joined #dri-devel
pq has quit [Server closed connection]
pq has joined #dri-devel
iokill has quit [Server closed connection]
iokill has joined #dri-devel
lkw has joined #dri-devel
<romangg> Do you think this could be a viable path forward or would you rather go into a different direction?
<romangg> There is a bit of "vagueness" introduced by Mesa then doing basically the same as some Wayland compositors: delaying the commits based on client processing time but if mesa has not as tight limits that would still work out fine I assume.
<romangg> It would be nice to also measure the gpu time though.
Company has joined #dri-devel
pcercuei has joined #dri-devel
<MrCooper> romangg: yeah, measuring CPU time only may not work well in GPU limited scenarios; other than that, no particular thoughts offhand
itoral has quit [Remote host closed the connection]
<ishitatsuyuki> romangg: I'm generally not in favor of trying to be "smart" in the drivers — it can potentially benefit a lot of applications, yes, but it can also unintentionally break application assumptions as well
<ishitatsuyuki> similarly, delaying commits in compositors is a good idea if and only if you have full control over what you render. any assumption over the app's rendering model does not work well
<ishitatsuyuki> trying to optimize for games often breaks GUI workloads, because the assumption that apps render in a busy loop with constant workload doesn't hold for GUI applications that incrementally render and skip no-change frames
itoral has joined #dri-devel
flacks has quit [Quit: Quitter]
jernej has quit [Server closed connection]
jernej_ has joined #dri-devel
<romangg> ishitatsuyuki: The current patch at least (thanks to MrCooper's first draft) only tries to be smart when the client saturates the pipeline with no buffer free anymore. So GUI apps should not be impacted.
<romangg> For Wayland compositors it's nowadays the standard model I would say to try to delay the presentation. Weston does it too.
<ishitatsuyuki> I'm not criticizing the idea of repaint scheduling itself, but in particular trying to do it dynamically
flacks has joined #dri-devel
itoral has quit [Remote host closed the connection]
<ishitatsuyuki> but well, it's also largely affected by implicit sync being used everywhere, so that one needs some fix
itoral has joined #dri-devel
<ishitatsuyuki> doing repaint scheduling at a fixed margin proportional to refresh period would likely give less headaches
itoral has quit [Remote host closed the connection]
mszyprow has quit [Remote host closed the connection]
mszyprow has joined #dri-devel
<emersion> that's what weston does
itoral has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
<ishitatsuyuki> > With the patch we give the client less than a frame to submit the next one.
<ishitatsuyuki> This assumption/behavior is problematic because it breaks CPU/GPU pipelining. Latency optimization should only go as far as it does not reduce throughput
itoral has quit [Read error: Connection reset by peer]
<romangg> emersion: Ah right, I thought it also does some dynamic adjustment to client performance. wlroots is also currently doing it this way, right? Plus some configuration value to adjust the delay.
itoral has joined #dri-devel
mclasen has joined #dri-devel
<romangg> ishitatsuyuki: Why does it break CPU/GPU pipelining to give the client less than a frame to submit the next one?
<romangg> When the client is slower I assume, right?
<romangg> But only then. If the client has like 500fps on a 60hz display for example there shouldn't be a problem. Or is there still one?
<ishitatsuyuki> romangg: because the optimal case is where both the CPU and GPU take a frame's worth of time to do their job?
<ishitatsuyuki> I still haven't understood what you are trying to measure
<ishitatsuyuki> measuring is very hard, and can easily run into feedback loops or other instabilities
<ishitatsuyuki> (in general)
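For reference, the fixed-margin scheduling ishitatsuyuki and emersion refer to (and that weston implements) amounts to something like the sketch below; all names here are hypothetical:

    #include <stdint.h>

    /* Start the compositor's repaint a fixed fraction of the refresh
     * period before the predicted vblank, rather than dynamically
     * chasing a measured client frame time. */
    static uint64_t next_repaint_time_ns(uint64_t predicted_vblank_ns,
                                         uint64_t refresh_period_ns)
    {
        uint64_t margin_ns = refresh_period_ns / 4; /* e.g. 25% */
        return predicted_vblank_ns - margin_ns;
    }

Because the margin depends only on the refresh period, there is no feedback between measured client timing and the schedule, which is exactly the instability ishitatsuyuki is worried about.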
pjakobsson has joined #dri-devel
<daniels> Kayden: thanks for the flake updates!
Stary has quit [Server closed connection]
Stary has joined #dri-devel
nchery is now known as Guest2225
nchery has joined #dri-devel
Guest2225 has quit [Ping timeout: 480 seconds]
reactormonk[m] has quit [Server closed connection]
reactormonk[m] has joined #dri-devel
<romangg> ishitatsuyuki: My goal would be to measure how long the client/system needs overall to process a frame, so that we can delay to a point close to vblank for reduced latency. At the moment there's a latency of up to two frames for saturating clients, from processing until presentation on screen.
<romangg> MrCooper, ishitatsuyuki: As a compromise, would an option be fine to let the compositor (gamescope) switch the delay/block for one frame on or off?
<romangg> Then I'll throw the logic to measure the client processing time out again.
mriesch has quit [Server closed connection]
mriesch has joined #dri-devel
<ishitatsuyuki> I guess making it opt-in would be better, but I'm still not convinced that it's the right approach
<ishitatsuyuki> if it's a game, it's probably much better to set a frame limit instead
<romangg> ishitatsuyuki: Who would set such a frame limit?
devilhorns has joined #dri-devel
<romangg> I think the original problem was about games that limit their frames "stupidly" by just using up all available buffers and then waiting for one to become available again.
<tpalli> piglit ci 'tox' pipelines are failing because of some python thing, example log: https://gitlab.freedesktop.org/tpalli/piglit/-/jobs/18555627
q66_ has quit [Server closed connection]
q66 has joined #dri-devel
<daniels> tpalli: success with pytest 6.2.5 https://gitlab.freedesktop.org/mesa/piglit/-/jobs/18424010
<daniels> and yours has pytest 7.0.0
<daniels> so I'm going to guess at some kind of API change which isn't handled
CounterPillow_ has quit [Server closed connection]
<daniels> looks like pytest is installed from Debian, so you'd either have to adapt to handle pytest 7.x, or use pip to install it and force 6.x
CounterPillow has joined #dri-devel
hakzsam has quit [Server closed connection]
hakzsam has joined #dri-devel
<romangg> MrCooper: Imo from an architectural viewpoint it makes sense to put some more intelligence in the driver if the client is stupid (as your check on the buffers being depleted ensures). But as said, if you think it's not worth it or too risky, I would go for a more lowkey solution with a switch for the compositor to toggle.
linkmauve has joined #dri-devel
Duke`` has joined #dri-devel
<tpalli> daniels ok, sounds like some fixes needed for piglit to adapt
<pq> romangg, the only thing I've read is this IRC discussion FWIW, but I'd be wary of optimizing for naive apps due to the risk of also preventing apps from ever becoming smarter. IOW smart apps would need a way to promise they are smart.
<daniels> tpalli: yeah, though I've never used pytest so have nfi what they are ...
<romangg> pq: I definitely don't want to prevent apps from becoming smarter haha. :D Do you mean with the promise like a way they can tell the driver to not do any optimizations?
<tpalli> daniels heh same here
<pq> something like that, a way to disable driver magic that could get in the way
<pq> of course, then you will have apps that disable driver magic and are still naive...
<romangg> I thought it's enough to check if the client depletes all buffers. Could this happen also for "smart" apps?
<pq> hard to say... what's the limit, 4?
<romangg> Depends on the gpu I guess. But yea, I think a swap chain usually provides 4 images.
<romangg> There is also a check for buffer age. If the buffer age is queried at any point in time, no optimization takes place, because that defines a client to be "smart" I guess. :D
<pq> smart apps may decide to want to minimize latency or maximize throughput, and if they maximize throughput (without wasting drawn frames), they need to be able to fully occupy the whole pipeline from app to compositor to KMS to scanout.
<pq> and that might take all 4 buffers?
<pq> more, if they use Wayland sub-surfaces or other future Wayland surface sync extensions
<romangg> I don't understand how that maximizes throughput or what it means actually. Should I google it or is it quick to explain?
<pq> the time for a single frame to go through the pipeline may take longer than a single refresh period, so to be able to present something new on every refresh, you have to have buffers in flight at each step of the pipeline simultaneously.
<pq> that's what I mean with maximizing throughput - not that each frame could be started and finished within one refresh period of time
<pq> I mean, there are apps that want FIFO instead of mailbox model, too.
<romangg> ty. right, I always found FIFO to be weird. Why would you want to present something possibly outdated?
<pq> I dunno. I just recall that people make noise about it.
shankaru has quit [Quit: Leaving.]
<pq> video playback?
<romangg> right, but you could do that also with mailbox or immediate if you listen for the vsync. Maybe it's easier with fifo if you know the refresh rate and then just commit buffers according to that in advance.
<romangg> or just something legacy clients liked to do before mailbox was a thing.
<pq> I suppose people are accustomed to the old-school swapinterval kind of scheduling.
maxzor_ has joined #dri-devel
<clever> ive recently made my own 2d gui framework, and i just have a function that requests an update of the screen, and defers the actual update until the next vsync, plus one that can block a thread until vsync
<clever> so at the start of a frame, the pageflip occurs, and all threads unblock, and you have until the end of the frame to post another pageflip request
<clever> but i can see how that would cause problems: if you take 1.1 frames to render, it will go down to half the vsync rate
<clever> a bounded fifo, with vsync based draining would eliminate that
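clever's "flip at frame start, block until vsync" model maps directly onto the KMS page-flip event loop. A minimal sketch with the real libdrm calls (setup and error handling elided); as described above, if rendering takes 1.1 refresh periods, every flip misses its vblank and the loop settles at half the refresh rate:

    #include <stdint.h>
    #include <xf86drm.h>
    #include <xf86drmMode.h>

    static void flip_handler(int fd, unsigned frame, unsigned sec,
                             unsigned usec, void *data)
    {
        *(int *)data = 0; /* the flip landed at vblank */
    }

    void run_loop(int fd, uint32_t crtc_id, uint32_t fb_ids[2])
    {
        drmEventContext ev = {
            .version = 2,
            .page_flip_handler = flip_handler,
        };
        int front = 0, pending;

        for (;;) {
            /* render the next frame into fb_ids[!front] here */
            drmModePageFlip(fd, crtc_id, fb_ids[!front],
                            DRM_MODE_PAGE_FLIP_EVENT, &pending);
            pending = 1;
            while (pending)
                drmHandleEvent(fd, &ev); /* blocks until the flip */
            front = !front;
        }
    }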
<Company> that might be preferable though
<Company> 30fps often looks smoother than 60fps with dropped frames
<clever> yeah, but when you start to go down to 15fps or 7.5fps, dropped frames may be preferable
<Company> though for gui frameworks, the time to render is usually not constant between frames
<clever> for a gui framework, id be more inclined to re-render based on a dirty callback, and post an update to take place at the next vsync
<Company> there's bursts (like when scrolling or opening a sidebar or something big like that) and then there's lots of very little work inbetween
<Company> depends very much on the kind of gui
<Company> note: I'm a core GTK developer
<clever> most of my work has been baremetal, implementing rpi drivers, and RE'ing the linux kms source
<Company> some guis are very complicated with very few changes (think a file manager that just updates the bg of the file the mouse is hovering on)
<clever> Company: i can do https://www.youtube.com/watch?v=JFmCin3EJIs and https://www.youtube.com/watch?v=GHDh9RYg6WI baremetal, with basically zero cpu usage, all locked to vsync
<Company> or a loading bar progressing
<Company> in that case dirty-area based updates are very useful
<Company> and then there's game-based guis where you have 5 buttons and they are all bouncing around
<clever> yeah
<Company> in that case full redraws are perfectly fine
<clever> one game i was playing with recently, had a full 3d render of a forest in the background of the menu, with branches blowing in the wind
<clever> but, that was a very handy reference, as the quality took a massive nose-dive when i messed with LOD settings
<clever> so i could see the impact of my choices, without having to launch into the game itself
robertfoss has joined #dri-devel
zackr has quit [Remote host closed the connection]
<clever> Company: of note, the 2d composition engine on the rpi is surprisingly powerful, so you could make each window in xorg its own 2d bitmap (xorg already does that when composition is enabled), and then just have the hw combine them for you
<clever> but there is a limit of about 290 layers, and if you waste bandwidth by covering a pixel multiple times, the hw can be overloaded and not render right
<Company> yeah, and all of that runs into the problem where you have no idea what applications are gonna try
<clever> yeah
<clever> i think in the really old days, each ui element may have been its own XImage
<clever> but with modern toolkits like gtk, the whole window is one image, and it just renders client-side
<Company> GTK2 had one X window per element with user interaction
<clever> ahh
<Company> a bunch of them did input-only windows
<clever> so in theory, each element with user interaction could be its own double-buffered bitmap
<Company> somewhere during GTK 2.X those windows were emulated client-side
<clever> and the hw could composite things
<Company> and from GTK 3.X onwards it was all client-side
<clever> ahhh
<Company> GTK 4 has to be all client-side because it does rotations and scaling and clipping with rounded corners and all of these fancy modern things
<Company> that X can't do
<Company> also blurs and such
<clever> yeah, the 2d unit on the rpi only has limited flips, and rounded corners are via per-pixel alpha
<clever> scaling is possible, but i dont see any blur options yet
<clever> 90 degree rotations (axis swaps) are via a secondary stage, which heavily complicates things if you want only certain elements rotated
<Company> also, you want an animated rotation, where you smoothly animate from 0deg to 90deg
<clever> but everybody just ignores the 2d core on most gpu's, and throws moar opengl at the problem
<clever> which can do everything
<pq> clever, we had a raspberry pi backend for Weston several years ago, which off-loaded every surface (window) to the (proprietary) 2D compositing API. Today, the same could happen on the standard Weston DRM-backend, provided all apps use dmabuf. And I very much remember the bandwidth overload problems.
<Company> yeah
<clever> pq: yeah, you would need to detect occluded layers and slice them up, or use the writeback port to pre-render it to a secondary buffer
<clever> the only real reason i can see to use the 2d core, is that it may result in lower memory bandwidth, and reduced power draw
<clever> although, memory bandwidth usage depends on how often you're redrawing the whole scene
<clever> if you're directly using the 2d composition hw, then the hw will dynamically fetch every bitmap as it races ahead of the (virtual) electron beam scanning out the image
<clever> if you're using the writeback port, then you can do that without having to race, but it writes back to ram (more bandwidth), and then you still need to render that as before (more bandwidth)
<clever> and if you're using the 3d core, i guess the memory usage will be the same, possibly a bit lower with the texture caches
<clever> but texture format changes may add more...
<danvet> javierm, [PATCH 08/21] fbcon: Use delayed work for cursor <- do you have time to look at this?
<clever> pq: oh, but i can see a limitation your dispmanx code may have had!
<pq> oh right, that's what it was called :-)
<clever> pq: vc_dispmanx_resource_write_data() copies image data from linux userland to a gpu buffer
<clever> but, you can avoid that memcpy
<pq> I can? On RPi 2 and many years old firmware?
<javierm> danvet: sure. I'm in a meeting now but will look at it once it finishes
<clever> pq: yep, even on a pi1
<clever> pq: this mailbox call returns the handle for the image buffer behind a dispmanx resource, you then pass that to mem-lock (another mailbox), and you get its current physical address, mmap /dev/mem, write directly to it
itoral has quit [Remote host closed the connection]
<pq> yuck, physical address
<pq> weston ran as a normal user :-)
<pq> no root privs
<clever> yeah
<pq> so no phys addresses either
<clever> the modern drm/kms api's hide that from you
<daniels> that DispManX call also didn't exist at the time
<clever> daniels: it was added before the pi2 came out i believe
<danvet> javierm, well no rush, just figured since you've said you plan to look at it
<daniels> yeah ... 2 years after the backend was merged
<clever> ahh
<daniels> but admittedly 2 years before it was deleted
<clever> in my case, i was using that call in a custom 3d driver
<daniels> (we asked them for that exact call and it got put on Dom's to-do list, but it always takes a while)
<daniels> in any case, like pq says, we're using the Mesa GLES driver and the KMS display driver on that platform now
<clever> yep
<clever> the kms driver gives you the same features with a proper standardized api
<pq> oh, does it allow creating a KMS-accessible bo from user pointer?
camus1 has joined #dri-devel
<clever> pq: the scanout hw can only read the lower 1gig of ram, and it must be contiguous, so its best to have the kernel allocate the memory for you
<pq> hmm, no, that's not what mem-lock thing allowed either, is it?
<clever> but if you can then mmap that dma_buf, problem solved
<clever> no
<clever> the firmware has a relocatable heap, so it can defrag the free space
<clever> mem-lock locks an object, and returns whatever its physical addr currently is
<pq> clever, so it would not have allowed us to avoid vc_dispmanx_resource_write_data() copy.
<clever> mem-unlock unlocks it, and the object will now randomly move about on its own
pcercuei has quit [Ping timeout: 480 seconds]
camus has quit [Ping timeout: 480 seconds]
<clever> correct, it needs root (or a kernel driver) to bypass that
<clever> the fkms driver is the closest thing there, it wraps the dispmanx api under drm/kms
<pq> Wayland buffers are client allocated, and wl_shm specifically is just any memory accessible by mmapping an fd, like a file.
<clever> in my custom v3d driver, i had the option to mmap /dev/v3d, a character device
<clever> and it could have been modified so that you can pass the handle off to a client via a unix socket, and not give them any powerful perms
<pq> yeah, but you still need the buffer allocated in a special way. That's in Wayland client-side code.
<clever> but if you're using opengl to consume the client buffers, some other factors come into play
<pq> software-rendered clients may literally create a file on tmpfs to mmap, or memfd.
<clever> the pi4 3d core has an extra MMU between its 32bit space and the 64bit space
<clever> so up to 4gig of ram can be mapped to the 3d core at once
<clever> in theory, that would give the 3d core direct access to the same pages as a wayland client
FireBurn has quit [Remote host closed the connection]
mlankhorst has quit [Ping timeout: 480 seconds]
agd5f_ has quit [Read error: Connection reset by peer]
maxzor_ has quit [Ping timeout: 480 seconds]
rgallaispou has joined #dri-devel
agd5f has joined #dri-devel
<danvet> sravn, thx for your comments on my fbcon series, I hope I got all your questions
sdutt has joined #dri-devel
shankaru has joined #dri-devel
mbrost has joined #dri-devel
fxkamd has joined #dri-devel
vup has quit []
vup has joined #dri-devel
pcercuei has joined #dri-devel
gawin has joined #dri-devel
mvlad has quit [Read error: Connection reset by peer]
nchery is now known as Guest2244
Guest2244 has quit [Read error: Connection reset by peer]
nchery has joined #dri-devel
mlankhorst has joined #dri-devel
jewins has joined #dri-devel
<alyssa> Ray tracing stuff landing, good for you
* alyssa cries in mali
<alyssa> I guess mali v11 might have raytracing?
<alyssa> that's only like 3 years away from Mesa support? :-p
Guest2209 has quit []
maxzor_ has joined #dri-devel
Arsen has joined #dri-devel
maxzor_ has quit [Remote host closed the connection]
yshui` has quit [Server closed connection]
yshui` has joined #dri-devel
mvlad has joined #dri-devel
mattrope has joined #dri-devel
<jekstrand> alyssa: We'll get to it.
<jekstrand> But someone's got to go first
<pq> who are you planning to eliminate? :-O
<dj-death> alyssa: it's not all of it and not usable yet (but we'll get there...)
<alyssa> dj-death: good luck :)
gawin has quit [Ping timeout: 480 seconds]
devilhorns has quit [Remote host closed the connection]
gawin has joined #dri-devel
Haaninjo has joined #dri-devel
dnkl has left #dri-devel [#dri-devel]
mszyprow has quit [Ping timeout: 480 seconds]
PiGLDN[m] has quit [Server closed connection]
PiGLDN[m] has joined #dri-devel
<jenatali> Corentin Noël: FYI, looks like you're not registered on IRC, meaning only people connected via Matrix can see your messages
iive has joined #dri-devel
<cheako> I fixed it by re-ordering some things, don't know why that worked.
tintou has quit []
tintou has joined #dri-devel
<danvet> tzimmermann, mlankhorst just heads up, but I think it'd be good to backmerge -rc4 to sync up the fbcon stuff into both drm-misc-fixes & -next
<danvet> airlied, ^^ also need backmerge for drm-next then
<danvet> there's still a fbcon fix pending now for -rc4, so can't yet do it
benettig has quit []
benettig has joined #dri-devel
tintou has quit []
tintou has joined #dri-devel
tintou has quit []
tintou has joined #dri-devel
<tintou> daniels told me that the += 9 was because, for some hardware, VAR0-7 was reserved for point co-ordinate information. I have a piglit test that is actually using GL_MAX_VARYING_FLOATS variables, so 24-31 are too much there. I wonder if GL_MAX_VARYING_FLOATS should have been decreased by 9 in this case (which collides with the documentation requiring at least 32) or if the test was actually missing a check in some way
tzimmermann has quit [Quit: Leaving]
ella-0_ has joined #dri-devel
ella-0 has quit [Read error: Connection reset by peer]
<zmike> jenatali: ab
<jenatali> Thanks
<alyssa> not really sure how to tag the MR ... it has a behaviour change for the laundry list of drivers that didn't set info.internal correctly
<anholt> tintou: at that point I think you're stuck with "expose support for gl texcoord, do your own allocation of texcoord to unused varying slots at backend codegen time"
<jenatali> FWIW that's what I did recently
<alyssa> on the other hand, only the drivers that set internal=true care about the value
<anholt> rbed, should probably give others a chance to look, though :)
frieder has quit [Ping timeout: 480 seconds]
<alyssa> sure, thanks :)
<alyssa> I think the only behaviour change is whether NIR_PRINT=1 prints meta shaders for e.g. v3dv
<zmike> (NIR_DEBUG=print)
<alyssa> tells you how often i use it
<anholt> tintou: we could also maybe do something with using texcoord declarations (or not) in linked shaders to skip/reduce the location fixup in the common case. dunno.
<jenatali> Yeah, unless people added new dependencies on internal vs not
<alyssa> zmike: I always do {ISA}_MESA_DEBUG=shaders which dumps optimized NIR and backend IR and disassembly
<jenatali> I added that to be able to print CL shaders without dumping all of libclc, which is an insane amount of code...
<alyssa> don't usually care about pass-by-pass NIR dumps
<alyssa> jenatali: We check it in panfrost
<jenatali> Ah cool :)
<alyssa> the compute shader used for indirect draws is massive, for ex
<alyssa> BIFROST_MESA_DEBUG=shaders won't dump it
<alyssa> BIFROST_MESA_DEBUG=shaders,internal will
* alyssa tries to run deqp-vk
<alyssa> I suppose LIBGL_DRIVERS_PATH doesn't work
<anholt> I hear meson devenv -C build is supposed to help now.
jhli has quit [Quit: ZNC 1.8.2 - https://znc.in]
<alyssa> VK_ICD_FILENAMES=~/mesa/build/src/panfrost/vulkan/panfrost_devenv_icd.aarch64.json gets further
<cheako> Testing today and semaphore caching makes no difference. I'm moving on to caching imageviews, with a goal of caching the whole image stack eventually.
<alyssa> oh right this one is 100% my fault
jhli has joined #dri-devel
rpigott has joined #dri-devel
ybogdano has joined #dri-devel
<shadeslayer> <anholt> "I hear meson devenv -C build..." <- Oh nice, didn't know about that one
<anholt> very recent
<anholt> (2f916f2be6ef4f6ffcbcd7edbcee06546d0da519)
ngcortes has joined #dri-devel
MajorBiscuit has quit [Ping timeout: 480 seconds]
ngcortes has quit [Read error: Connection reset by peer]
mattst88 has quit [Ping timeout: 480 seconds]
ngcortes has joined #dri-devel
gouchi has joined #dri-devel
mattst88 has joined #dri-devel
<sravn> danvet: I think I used up my share of "asking the professor questions" in the fbcon series. Looks good. I skipped a few patches in the hope someone else will find time to dive into them
<danvet> sravn, you mean anything from you I should review in return?
<sravn> danvet: Hee, I am already behind processing feedback. dianders and pinchartl gave a lot of good feedback that I need to address. I am going to introduce a drm_bridge_helper in my re-spin of that series.
<sravn> If someone came in and swept all the panel patches that would be good, but I think that's something for the weekend to take a peek at. They have piled up, it seems
rgallaispou has quit [Quit: Leaving.]
shankaru has quit [Quit: Leaving.]
<danvet> javierm, I'm prepping a new version of the series anyway, so pls look at that one
<javierm> danvet: Ok, sorry for not reviewing yet but got distracted by other stuff
<danvet> javierm, hey no worries, I've been like weeks late on average with my promised reviews
<danvet> last few months :-/
turol has joined #dri-devel
mattrope has quit [Read error: Connection reset by peer]
HdkR has joined #dri-devel
mattrope has joined #dri-devel
Plagman has joined #dri-devel
mlankhorst has quit [Ping timeout: 480 seconds]
romangg has joined #dri-devel
pcercuei has quit [Ping timeout: 480 seconds]
mlankhorst has joined #dri-devel
milek7 has joined #dri-devel
<danvet> sravn, hah I won't edit the TODO, because the olpc_dcon TODO in staging already explains that you should use drm self-refresh helpers here :-)
fahien has joined #dri-devel
maxzor has quit [Remote host closed the connection]
oneforall2 has quit [Quit: Leaving]
mceier has joined #dri-devel
jani has joined #dri-devel
Lynne has joined #dri-devel
oneforall2 has joined #dri-devel
<demarchi> mlankhorst: could we have a drm-misc-next pull to drm-next before next week? I have some patches to land on drm-intel-next that depend on some merged in drm-misc-next
vsyrjala has joined #dri-devel
<demarchi> and next week airlied plans to backmerge rc4 and that could go to drm-intel-next
ngcortes has quit [Ping timeout: 480 seconds]
<anholt> zmike: is there a more limited subset of piglit that might cover what you need for the no-timelines knob?
<zmike> anholt: not really? the point is just to run everything to make sure I didn't break vk 1.0 support in some subtle way
<anholt> ok. given the name I was thinking maybe there was just something fence-related we could target.
<zmike> sadly not
<zmike> it ends up affecting pretty much the whole driver
danvet has quit [Ping timeout: 480 seconds]
ngcortes has joined #dri-devel
lkw has quit [Quit: leaving]
cleverca22[m] has joined #dri-devel
<anholt> oh no. I forgot to add prefix support to piglit. guess we're going to have to have a total of 2 ci jobs for zink until I sort that out.
gawin has quit [Ping timeout: 480 seconds]
nchery has quit [Remote host closed the connection]
chema has joined #dri-devel
nchery has joined #dri-devel
rodrigovivi is now known as vivijim
Duke`` has quit [Ping timeout: 480 seconds]
vivijim is now known as rodrigovivi
MatrixTravelerbot[m] has joined #dri-devel
DrNick has joined #dri-devel
DrNick is now known as Guest15
heftig has joined #dri-devel
Haaninjo has quit [Quit: Ex-Chat]
gawin has joined #dri-devel
bylaws has joined #dri-devel
ella-0[m] has joined #dri-devel
gouchi has quit [Remote host closed the connection]
<jenatali> If I have a GL texture, write to it on one context on one thread, and read from it on another context on another thread... is that well-defined?
<jenatali> I'm realizing our driver has some state that's global per-resource, but needs to be per-context per-resource
<jekstrand> You might need a glFlush() in there. Not sure.
* jekstrand isn't a GL expert
<jekstrand> Kayden: ^^
<zmike> yes, we had issues with that last year during the conversion to async gallium flushing
<zmike> the flush guarantees that only one context is truly active at a time
<jenatali> I see
tomba has joined #dri-devel
<jenatali> I don't suppose you have any spec wording on that somewhere?
mvlad has quit [Remote host closed the connection]
<zmike> you could probably find it if you looked through git log to find when async flushes got reverted
<zmike> we all had to put on our big brain hats that day
<jenatali> Thanks, I'll check it out
anujp has quit [Ping timeout: 480 seconds]
<jenatali> zmike: Think I found it (https://gitlab.freedesktop.org/mesa/mesa/-/commit/5066839ffdbeac5b8d24f83e7c55cb20545cd48b) - that's about 2 contexts on a single thread alternating. I'm more interested about 2 contexts on separate threads
<jenatali> Unless you were talking about something else?
<zmike> jenatali: oh maybe I was just thinking about it wrong
<zmike> it's late in the day for me and I've been deep in llvm for about 7 hours :/
<jenatali> Heh no worries. Thanks anyway :)
<Kayden> jekstrand: There are Rules(TM)
<Kayden> The rules make no sense
<Kayden> Quick rules 101:
<Kayden> If two contexts simultaneously access a texture or object
<Kayden> changes may or may not show up from one in the other
<Kayden> at object bind time, any changes made in other contexts show up
<Kayden> so, you bind a texture -> it has shown up
<Kayden> flushes happen sometimes
<Kayden> but...yeah.
<Kayden> it's...very very confusing
<anholt> tomeu: anyone on hand that could help me decipher virgl crosvm fails on some new runners? https://gitlab.freedesktop.org/anholt/mesa/-/jobs/18590463
<Kayden> if you want to actually cooperatively use things from multiple contexts you might want to just use glFenceSync
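A minimal sketch of the glFenceSync pattern Kayden suggests, assuming a GL 3.2+ header set (shown here with libepoxy) and that making the right context current on each thread is handled elsewhere:

    #include <epoxy/gl.h>

    /* Producer context/thread: write the texture, then fence and
     * flush so the fence actually reaches the GPU and becomes
     * visible to other contexts. */
    GLsync publish_texture(void)
    {
        /* ... glTexSubImage2D / draws into the shared texture ... */
        GLsync fence = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);
        glFlush(); /* required before another context may wait on it */
        return fence;
    }

    /* Consumer context/thread: make the GPU wait on the fence, then
     * rebind so the "changes show up at bind time" rule above kicks in. */
    void consume_texture(GLsync fence, GLuint tex)
    {
        glWaitSync(fence, 0, GL_TIMEOUT_IGNORED);
        glBindTexture(GL_TEXTURE_2D, tex);
        /* ... sample from tex ... */
        glDeleteSync(fence);
    }

glWaitSync stalls the consumer's GPU queue without blocking the CPU; glClientWaitSync would block the calling thread instead.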
<daniels> anholt: super helpful stdout. taking a wild stab at it, you haven't got devices = ["/dev/kvm"] in your runner def
<anholt> at least I think I do. the changes definitely took when I added that and privileged = true, otherwise networking stuff killed it early
<anholt> "Variables passed through:" seems like it's from the guest
<anholt> err, nope.
<daniels> anholt: does nested KVM ... work?
<daniels> anholt: you need to have a separate entitlement for that on GCE for some reason
<anholt> let me go poke around
<daniels> it's got a lot better now
<jenatali> Kayden: Thanks. That helps somewhat
<jenatali> I assume that mesa/main or mesa/st doesn't insert context flushes automatically when tracking these hazards, right? :)
<jenatali> Just debating if I should try to implicitly synchronize multiple contexts in my backend (like I do in our 11on12 layer for graphics+video) or if I should just assume someone else is going to at least flush them in the right order
<anholt> daniels: bummer. that didn't seem to do it, but maybe I need to figure out how to invoke qemu on my own without bothering gitlab about it.
mbrost has quit [Ping timeout: 480 seconds]
<jenatali> So implicit flushes don't need to happen. App has to explicitly flush and then rebind
lemonzest has quit [Quit: WeeChat 3.4]
gawin has quit [Read error: Connection reset by peer]
kallisti5[m] has joined #dri-devel
<Kayden> jenatali: Yeah, I think that's right
rasterman has quit [Quit: Gettin' stinky!]