ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
<jekstrand> karolherbst: Ok, I've typed an abomination and it works for clamp but apparently repeat was failing too
<jekstrand> I think we've got something wrong with state setup
<karolherbst> mhh maybe
<jekstrand> Because I definitely don't see a workaround for repeat
<karolherbst> what does the failure look like?
* airlied reaches the gpu crash and corrupted git tree part of development :-P
<karolherbst> airlied: first time?
<jekstrand> airlied: Achievement unlocked!
<karolherbst> jekstrand: although I am mildly sure that I do the correct thing from a gallium API perspective
<airlied> karolherbst: in this project :-P
<karolherbst> :D
<jekstrand> karolherbst: I don't doubt that you do
<karolherbst> ahh so no CL CTS run this night sadly :( unless you find a quick fix for repeat :D
<jekstrand> No, I need to be done for the day.
<karolherbst> jekstrand: is it just repeat or mirrored_repeat as well, or just the latter?
<karolherbst> ohhhhhhh
<karolherbst> wait
<karolherbst> what's LEGACY_CLK_ADDRESS_REPEAT
<karolherbst> ahh no, that's just a stupid name
<jekstrand> mirrored is busted too
<karolherbst> yeah... okay.. I don't see any workarounds either
nchery has quit [Ping timeout: 480 seconds]
<karolherbst> "samplerState->setLodPreclampMode(SAMPLER_STATE::LOD_PRECLAMP_MODE::LOD_PRECLAMP_MODE_OGL)" mhhh
<karolherbst> jekstrand: there actually is something
<karolherbst> ehhh.. wait.. only for xe
nchery has joined #dri-devel
<karolherbst> oh well.. tomorrow then
<karolherbst> jekstrand: if you push your stuff I might take a look tomorrow and try to match Intel's sampler setup
<karolherbst> but REPEAT looks like the same rounding issue anyway... but maybe you fixed that locally
khfeng has joined #dri-devel
columbarius has joined #dri-devel
co1umbarius has quit [Ping timeout: 480 seconds]
<karolherbst> "Loaded tests from test_conformance/opencl_conformance_tests_full.csv, total of 57 tests selected to run:" :3
<karolherbst> "test_preprocessor_line_error may report spurious ERRORS in the conformance log." fun...
OftenTimeConsuming has quit [Ping timeout: 480 seconds]
<karolherbst> okay, well. tomorrow I will know more
alyssa has left #dri-devel [#dri-devel]
pallavim_ has quit [Ping timeout: 480 seconds]
<karolherbst> I have questions
<karolherbst> ==> FAILED 252 of 945 sub-tests.
<karolherbst> PASSED test.
rkanwal has quit [Quit: rkanwal]
sdutt has quit [Ping timeout: 480 seconds]
* icecream95 wishes that Mesa had a debug flag to make mediump always fp16 rather than changing precision depending on the day of the week
<karolherbst> "ERROR: test file (/data/git/OpenCL-CTS/build/test_conformance/gl/test_gl) does not exist. Failing test." how rude
thellstrom1 has joined #dri-devel
thellstrom has quit [Read error: Connection reset by peer]
OftenTimeConsuming has joined #dri-devel
elongbug_ has quit [Ping timeout: 480 seconds]
sdutt has joined #dri-devel
Daanct12 has joined #dri-devel
ybogdano has quit [Ping timeout: 480 seconds]
lemonzest has quit [Quit: WeeChat 3.4]
elongbug has joined #dri-devel
JohnnyonFlame has joined #dri-devel
JohnnyonF has quit [Ping timeout: 480 seconds]
Company has quit [Read error: Connection reset by peer]
heat_ has quit [Remote host closed the connection]
heat_ has joined #dri-devel
dviola has quit [Ping timeout: 480 seconds]
dviola has joined #dri-devel
sdutt has quit [Ping timeout: 480 seconds]
jewins has quit [Ping timeout: 480 seconds]
shankaru has joined #dri-devel
eukara has quit [Ping timeout: 480 seconds]
nchery is now known as Guest2350
nchery has joined #dri-devel
sdutt has joined #dri-devel
karolherbst has quit [Read error: Connection reset by peer]
karolherbst has joined #dri-devel
eukara has joined #dri-devel
Guest2350 has quit [Ping timeout: 480 seconds]
eukara_ has joined #dri-devel
shankaru has quit []
frankbinns1 has joined #dri-devel
eukara has quit [Ping timeout: 480 seconds]
frankbinns has quit [Ping timeout: 480 seconds]
heat_ has quit [Ping timeout: 480 seconds]
sdutt has quit []
sdutt has joined #dri-devel
danvet has joined #dri-devel
ngcortes has quit [Ping timeout: 480 seconds]
LexSfX has quit [Ping timeout: 480 seconds]
LexSfX has joined #dri-devel
dviola has quit [Ping timeout: 480 seconds]
aravind has joined #dri-devel
nchery has quit [Ping timeout: 480 seconds]
Daanct12 has quit [Remote host closed the connection]
khfeng has quit [Ping timeout: 480 seconds]
Daanct12 has joined #dri-devel
illwieckz has quit [Read error: No route to host]
illwieckz has joined #dri-devel
thellstrom has joined #dri-devel
thellstrom1 has quit [Ping timeout: 480 seconds]
sdutt_ has joined #dri-devel
sdutt has quit [Read error: Connection reset by peer]
pallavim has joined #dri-devel
Duke`` has joined #dri-devel
mvlad has joined #dri-devel
tzimmermann has joined #dri-devel
<imirkin> anyone with khronos pull who cares about refpages ... https://github.com/KhronosGroup/OpenGL-Refpages/pull/115
itoral has joined #dri-devel
vyivel has quit [Read error: Connection reset by peer]
vyivel has joined #dri-devel
Duke`` has quit [Ping timeout: 480 seconds]
khfeng has joined #dri-devel
sarnex has quit [Quit: Quit]
sarnex has joined #dri-devel
Duke`` has joined #dri-devel
garrison has joined #dri-devel
i-garrison has quit [Read error: Connection reset by peer]
khfeng has quit [Ping timeout: 480 seconds]
khfeng has joined #dri-devel
aravind has quit [Ping timeout: 480 seconds]
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
pallavim has quit [Ping timeout: 480 seconds]
itoral has quit [Remote host closed the connection]
pallavim has joined #dri-devel
Duke`` has quit [Ping timeout: 480 seconds]
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
YuGiOhJCJ has joined #dri-devel
aravind has joined #dri-devel
gouchi has joined #dri-devel
gouchi has quit [Remote host closed the connection]
agd5f has quit [Ping timeout: 480 seconds]
dliviu has quit []
jkrzyszt has joined #dri-devel
glennk has quit [Ping timeout: 480 seconds]
MajorBiscuit has joined #dri-devel
Major_Biscuit has joined #dri-devel
Company has joined #dri-devel
MajorBiscuit has quit [Ping timeout: 480 seconds]
glennk has joined #dri-devel
<javierm> tzimmermann: should I post the rebased version mentioning your in-flight series or just wait for yours to land first and then post ?
<tzimmermann> javierm, if you ack the 2nd patch of my series, i could land this now
<tzimmermann> rob reviewed the first patch
<javierm> tzimmermann: ah, I thought I had already given you my ack for that when discussing, but now I see that it only has Suggested-by
<javierm> let me do that now then
<tzimmermann> javierm, i replaced the patch that you acked with something else.
<tzimmermann> it's now what you proposed, more or less
<javierm> tzimmermann: yes I know, but I thought I had said that if you did that you were free to add my R-b :)
<tzimmermann> oh, i missed that
<tzimmermann> sry
<javierm> tzimmermann: no worries at all, maybe I'm the one misremembering then
ahajda has joined #dri-devel
<javierm> tzimmermann: btw, hope you are ok that I kept your A-b in https://lists.freedesktop.org/archives/dri-devel/2022-April/351925.html
<javierm> I just did the changes you and others suggested and added a few more talks/articles to the list
maxzor has joined #dri-devel
itoral has joined #dri-devel
nchery has joined #dri-devel
<cwabbott> anholt: on the last nouveau nir fix, I took the time to fix it properly 2 years ago in RA and sent a patch but no one actually committed it for... reasons?
<cwabbott> I guess no one had the expertise and time to actually understand it or something, but it only took me a day to write it so it shouldn't actually take that long to review and commit
<cwabbott> oh wow, it was actually 4 years ago
rasterman has joined #dri-devel
<Venemo> does NIR allow some intrinsic sources to be NULL, when they are unneeded?
thellstrom has quit [Ping timeout: 480 seconds]
<tzimmermann> javierm, sure, keep the a-b
<tzimmermann> i've now landed the of-device patches
pcercuei has joined #dri-devel
<javierm> tzimmermann: great, thanks a lot
glennk has quit [Ping timeout: 480 seconds]
glennk has joined #dri-devel
Guest2377 has joined #dri-devel
apinheiro has joined #dri-devel
maxzor has quit [Ping timeout: 480 seconds]
Guest2377 has quit [Remote host closed the connection]
thellstrom has joined #dri-devel
pallavim has quit [Read error: Connection reset by peer]
anarsoul has quit [Ping timeout: 480 seconds]
anarsoul has joined #dri-devel
rkanwal has joined #dri-devel
Daanct12 has quit [Quit: Leaving]
itoral has quit [Remote host closed the connection]
guru_ has joined #dri-devel
itoral has joined #dri-devel
oneforall2 has quit [Ping timeout: 480 seconds]
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
dliviu has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
ced117 has quit [Ping timeout: 480 seconds]
dutty has joined #dri-devel
<karolherbst> ehhh
<karolherbst> use_pitches messes things up.. I should add it to the things we test
<dutty> hi!
itoral has quit [Remote host closed the connection]
ced117 has joined #dri-devel
<dutty> I have a question, hopefully you can help me! I've built osmesa+llvmpipe on macOS, it's reported via glGetString as "llvmpipe (LLVM 6.0.1, 256 bits)" however, when I try to build a core context for 4.0 it fails, I only get a 3.3 context. On Windows I'm using prebuilt binaries from https://github.com/pal1000/mesa-dist-win and it seems to work fine
<dutty> My meson config is "meson builddir -Dosmesa=true -Dgallium-drivers=swrast -Dglx=disabled -Dgles1=disabled -Dgles2=disabled -Dshared-glapi=enabled -Dllvm=enabled -Dshared-llvm=disabled -Dprefix=$PWD/builddir/install"
<dutty> Is there anything specific that I need to do to build llvmpipe with 4.x support?
sdutt_ has quit [Ping timeout: 480 seconds]
itoral has joined #dri-devel
<MrCooper> dutty: LLVM 6 might be too old for newer OpenGL
<dutty> That's good to know, thanks MrCooper! I'll try a newer llvm version
<karolherbst> jekstrand: nooo.. there are more fails for subtests I didn't catch with my runner :( some of the image tests fail with "use_pitches"
<Venemo> using LLVM 6 at this point is archeology
<karolherbst> yeah.. don't use llvm6 if you can use something newer
frankbinns1 has quit []
frankbinns has joined #dri-devel
<karolherbst> seems like the other patches also add some basic things, but sooner or later we might just want to rely on llvm
<dutty> ...using GPU='llvmpipe (LLVM 14.0.1, 256 bits)'
<dutty> ...vendor: Mesa/X.org
<dutty> ...version: 4.5 (Core Profile) Mesa 22.0.1
<dutty> Awesome guys, LLVM 14.0.1 did the trick!
maxzor has joined #dri-devel
<dutty> It would be super helpful for newcomers if the docs for building llvmpipe were updated.
<dutty> Thanks for helping out here!
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
itoral has quit [Remote host closed the connection]
aravind has quit [Read error: Connection reset by peer]
maxzor has quit [Ping timeout: 480 seconds]
mbrost has joined #dri-devel
bond has joined #dri-devel
bond is now known as Guest2416
<dutty> Once again, thank you for the help. Have a good day, folks!
dutty has quit []
* karolherbst kicks another CL CTS run
mbrost has quit [Ping timeout: 480 seconds]
Thymo has quit [Quit: ZNC - http://znc.in]
<karolherbst> jekstrand: mhhh.. there is one perf issue I am not really sure how to solve. So the client can install event callbacks once a certain status is reached, but atm I am calling it directly from the worker thread. As those can be quite expensive it can also end up stalling the entire pipeline (happens in the CTS). Adding another thread for handling those callbacks could be one idea (one for the entire runtime, not per queue)
<karolherbst> jenatali: ^^ did you do anything about that?
YuGiOhJCJ has quit [Quit: YuGiOhJCJ]
<karolherbst> the spec at least allows calling the cb async
<karolherbst> (there are other cbs as well, so maybe it also makes sense to use it for all kinds of callbacks)
<karolherbst> it does slow down the CTS quite a bit, so that's the annoying part
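The callback-thread idea above can be sketched in Rust roughly as follows. This is only a minimal illustration, assuming one runtime-wide thread fed through a channel; `CallbackThread`, `enqueue`, and `finish` are hypothetical names and not rusticl's actual API.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread::{self, JoinHandle};

/// A boxed client callback; it must be `Send` to cross the thread boundary.
type Callback = Box<dyn FnOnce() + Send + 'static>;

/// Hypothetical runtime-wide callback thread (one for the runtime, not per queue).
pub struct CallbackThread {
    tx: Option<Sender<Callback>>,
    handle: Option<JoinHandle<()>>,
}

impl CallbackThread {
    pub fn new() -> Self {
        let (tx, rx) = channel::<Callback>();
        // The thread runs callbacks in FIFO order until every sender is dropped.
        let handle = thread::spawn(move || {
            for cb in rx {
                cb();
            }
        });
        Self { tx: Some(tx), handle: Some(handle) }
    }

    /// Called from a queue worker: hands the callback off and returns
    /// immediately, so an expensive callback no longer stalls the worker.
    pub fn enqueue<F: FnOnce() + Send + 'static>(&self, f: F) {
        if let Some(tx) = &self.tx {
            // Sending only fails if the callback thread is already gone.
            let _ = tx.send(Box::new(f));
        }
    }

    /// Closes the channel and waits for all pending callbacks to finish.
    pub fn finish(mut self) {
        drop(self.tx.take());
        if let Some(handle) = self.handle.take() {
            handle.join().expect("callback thread panicked");
        }
    }
}
```

A worker would call `enqueue` when an event reaches the requested status and carry on; ordering between callbacks is preserved because a single thread drains the channel.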
<karolherbst> mhh.. I also have an issue where I don't set a perfect local size if the global size is odd
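One simple way to pick a local size that always divides the global size can be sketched as below. This is an assumption-laden illustration, not what rusticl does: `pick_local_size` is a hypothetical helper that only looks at one dimension and takes the device limit as a plain parameter.

```rust
/// Returns the largest value <= `max_local` that evenly divides `global`.
/// Hypothetical helper: one dimension only, no hardware preferences considered.
pub fn pick_local_size(global: usize, max_local: usize) -> usize {
    (1..=max_local.min(global))
        .rev()
        .find(|d| global % d == 0)
        // Only reached when `global` or `max_local` is 0; fall back to 1.
        .unwrap_or(1)
}
```

For an odd global size such as 945 and a limit of 256 this yields 189; for a prime global size it degenerates to 1, which is legal but far from ideal for occupancy.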
garrison has quit []
i-garrison has joined #dri-devel
kj has quit [Remote host closed the connection]
kj has joined #dri-devel
ramacassis[m] has joined #dri-devel
JohnnyonF has joined #dri-devel
elongbug_ has joined #dri-devel
elongbug has quit [Read error: Connection reset by peer]
elongbug__ has joined #dri-devel
JohnnyonFlame has quit [Ping timeout: 480 seconds]
agd5f has joined #dri-devel
elongbug_ has quit [Ping timeout: 480 seconds]
Thymo has joined #dri-devel
macromorgan is now known as Guest2420
macromorgan has joined #dri-devel
Guest2420 has quit [Ping timeout: 480 seconds]
h0tc0d3 has joined #dri-devel
thellstrom1 has joined #dri-devel
thellstrom has quit [Read error: Connection reset by peer]
h0tc0d3 has quit [Quit: Leaving]
<jenatali> karolherbst: I didn't. I didn't really dig into CTS on perf that much, was just focused on making it work
Guest2416 has quit []
<karolherbst> okay
tagr has joined #dri-devel
jewins has joined #dri-devel
JohnnyonF has quit [Read error: Connection reset by peer]
thellstrom1 has quit [Ping timeout: 480 seconds]
JohnnyonFlame has joined #dri-devel
sdutt has joined #dri-devel
JohnnyonFlame has quit [Read error: Connection reset by peer]
<karolherbst> soo.. what's the deal with pipe_grid_info.grid. do all values have to be pot?
JohnnyonFlame has joined #dri-devel
lemonzest has joined #dri-devel
<karolherbst> ehh wait
<karolherbst> no, I just made a mistake
iive has joined #dri-devel
<danvet> robclark, is the dma-fence deadline stuff stalled or am I blind and it landed?
<robclark> danvet: oh, I kinda forgot about it and got distracted on other things
<robclark> iirc a bit of bikeshed about the ioctl but otherwise folks seemed fine with it
<robclark> danvet: unrelated, a-b for landing https://patchwork.freedesktop.org/series/99724/ via msm-next? The few drm core patches at the start are all r-b'd
<danvet> robclark, ack
<robclark> thx.. abhinav__ jfyi ^^^
Haaninjo has joined #dri-devel
maxzor has joined #dri-devel
icecream95 has quit [Ping timeout: 480 seconds]
nchery has quit [Quit: Leaving]
sdutt has quit []
sdutt has joined #dri-devel
nchery has joined #dri-devel
Thymo_ has joined #dri-devel
Thymo has quit [Ping timeout: 480 seconds]
ella-0 has joined #dri-devel
gpiccoli has quit [Quit: Bears...Beets...Battlestar Galactica]
ella-0_ has quit [Read error: Connection reset by peer]
jewins has quit [Quit: jewins]
gpiccoli has joined #dri-devel
heat_ has joined #dri-devel
<karolherbst> does anything in GL actually require last_block, or is it just drivers implementing it via the helpers that use it?
<karolherbst> doesn't look like there is any CAP for that
khfeng has quit [Remote host closed the connection]
khfeng has joined #dri-devel
<kusma> bbrezillon: I tried something similar, but I couldn't convince myself that it was an improvement. I think I'm just as unsure here.
<jenatali> bbrezillon: That doesn't work for command lists. Command lists use modifiable VTables. Doing a snapshot at initialization is broken
<karolherbst> the heck..
<karolherbst> the CTS enqueues kernels from event callbacks
<bbrezillon> jenatali: crap
<jenatali> :)
<kusma> Oh, I see. This was about more than just wrapping this a bit more... bespoke.
<bbrezillon> I need to have access to ID3D12Device10 features, when those are available
<jenatali> Yeah. I'll say it's clever, but fragile
<bbrezillon> just wanted to avoid storing a iface version in the wrapper
<kusma> bbrezillon: So, I've been calling ID3D12GraphicsCommandList1_Foo() functions on ID3D12GraphicsCommandList7 objects without problems...
<kusma> Not sure if it's a good idea, but it seems to work?
<jenatali> Yeah that works
<jenatali> On the C++ side, the interfaces use inheritance so they can seamlessly cast from higher to lower
<bbrezillon> yeah, I know...
<kusma> Anyway, if we want to shorten these, maybe we just generate some wrapper functions with a python script or something?
<kusma> Or does that require us to parse the IDL files or something boring like that?
<bbrezillon> so, the solution is to have dev[1,X] if we want to support multiple versions?
<jenatali> You could either store them side-by-side, or you can throw type safety out the window and do a union. I'd store them side-by-side just so you can do null checks to see if the higher interface is available
<bbrezillon> with the union, you don't know which one is supported
<kusma> bbrezillon: you can store the version as an integer
<bbrezillon> unless we also store the iface version somewhere
<kusma> But not sure if that's a win
<jenatali> Yep
<bbrezillon> well, that's a win if we support more than 2 different versions :)
<jenatali> True
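The side-by-side option discussed above can be modelled in a language-neutral way. The sketch below uses Rust with stand-in types, since the real code would be C against the D3D12 headers; `BaseDevice`, `NewDevice`, and `DeviceWrapper` are hypothetical names, and the `Option` plays the role of the null check on the higher interface.

```rust
/// Stand-in for the baseline interface (think ID3D12Device1).
pub struct BaseDevice;
/// Stand-in for the newer interface (think ID3D12Device10).
pub struct NewDevice;

impl BaseDevice {
    pub fn create_resource(&self) -> &'static str {
        "legacy path"
    }
}

impl NewDevice {
    pub fn create_resource(&self) -> &'static str {
        "new path"
    }
}

/// Both interfaces stored side by side; `newer` is `None` when the runtime
/// does not expose the higher interface, so no separate version field is needed.
pub struct DeviceWrapper {
    base: BaseDevice,
    newer: Option<NewDevice>,
}

impl DeviceWrapper {
    /// `has_newer` stands in for the result of querying the newer interface.
    pub fn new(has_newer: bool) -> Self {
        Self {
            base: BaseDevice,
            newer: if has_newer { Some(NewDevice) } else { None },
        }
    }

    /// Uses the newer interface when present, otherwise falls back.
    pub fn create_resource(&self) -> &'static str {
        match &self.newer {
            Some(dev) => dev.create_resource(),
            None => self.base.create_resource(),
        }
    }
}
```

With a union plus an integer version the same dispatch would compare the stored version instead of checking for presence, which only starts to pay off once more than two interface versions are in play.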
<kusma> I doubt we'll require that many versions. We're going to add a hard requirement on advanced barriers soon, and that's going to require a recent-ish runtime.
<kusma> And I believe we can use later interfaces as long as the runtime is recent enough...?
<jenatali> Right
<kusma> (just not call the functions that require specific flags etc)
<bbrezillon> so dev1 and dev10 without a union?
<bbrezillon> or should I just require dev10 from the beginning?
<bbrezillon> I'm tempted to just call QueryInterface() where we need the new iface for now
maxzor has quit [Ping timeout: 480 seconds]
<jenatali> Keep in mind that a QI also needs a release afterwards, but yeah that sounds fine
<jekstrand> karolherbst: Yeah, once the SPIR-V back-end is upstream, we can probably drop spirv-llvm-translator
<jekstrand> karolherbst: I think I've got a plan for integer textures. It's a horrible plan but it's a plan and it should work.
<karolherbst> ahh.. I need to solve the Send situation :(
<karolherbst> jekstrand: cool
<karolherbst> I am already running the real CTS, but "conversions" is soooo slow, but speeding up isn't easy either, because Rust is in the way :)
<jekstrand> karolherbst: Not sure on callbacks. There's a part of me that thinks having a callback thread isn't a terrible idea but I've not thought through it that much.
Duke`` has joined #dri-devel
<karolherbst> the CL spec explicitly says that callbacks can be called async
<karolherbst> I just profiled conversions
<jekstrand> karolherbst: I need to go read some IMG code before I dive into CL today so I don't forget about it again but I should be available soonish.
<karolherbst> 30% of the time is just spent on calling callbacks or creating the compute state :(
<karolherbst> and the CPU and GPU just stall each other
<karolherbst> but before I can send random things into threads, I really have to figure out the Send situation, otherwise it's just unsafe madness we already have today
<karolherbst> the event work items aren't Send, but I do send them, which.... is okay as I was careful not to mess it up, but rustc can help by telling us what's not safe :)
<karolherbst> like it can shout at us if we pass in an Arc Ref instead of a clone
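The point about passing an Arc reference versus a clone can be illustrated with a small sketch. It assumes a made-up `Event` type holding a status behind a mutex; `complete_on_worker` is a hypothetical helper, not rusticl code.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Made-up event type; `Mutex<i32>` is `Send + Sync`, so `Arc<Event>` is `Send`.
pub struct Event {
    pub status: Mutex<i32>,
}

/// Marks the event complete (status 0) on another thread and returns the
/// status observed afterwards.
pub fn complete_on_worker(ev: &Arc<Event>) -> i32 {
    // The thread gets its own clone. Moving the borrowed `ev` into the
    // closure instead would be rejected by rustc, because the borrow is
    // not `'static` and `thread::spawn` requires that.
    let ev_for_thread = Arc::clone(ev);
    let handle = thread::spawn(move || {
        *ev_for_thread.status.lock().unwrap() = 0;
    });
    handle.join().expect("worker panicked");
    let status = *ev.status.lock().unwrap();
    status
}
```

If `Event` held something that is not `Send` (an `Rc` or a raw pointer, for instance), the `thread::spawn` call would fail to compile as well, which is the kind of help from rustc that an `unsafe impl Send` gives up.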
maxzor has joined #dri-devel
nchery has quit [Ping timeout: 480 seconds]
maxzor has quit [Ping timeout: 480 seconds]
tobiasjakobi has joined #dri-devel
tobiasjakobi has quit []
<anholt> all this ntt work is making me wonder what the mtbf is for swapping pcie cards on a motherboard.
<anholt> cwabbott: make an MR for review?
ybogdano has joined #dri-devel
nchery has joined #dri-devel
gouchi has joined #dri-devel
<robclark> danvet, daniels, ping on https://patchwork.freedesktop.org/series/94350/
<robclark> would be useful because minigbm/gralloc (ie. we need a way to describe things that are tiled but not necessarily ubwc)
guru_ has quit []
maxzor has joined #dri-devel
<bnieuwenhuizen> eobclark: wrt the modifier implementation being platform specific, AFAIU waypipe will copy over dmabufs modifier and all
<bnieuwenhuizen> robclark: ^
<MrCooper> that's kind of silly though, as there's no generic, well-performing way to get the raw tiled data
<robclark> it should perhaps not do that
<robclark> so kinda willing to call that waypipe's problem
<bnieuwenhuizen> (not sure if there was consensus about needing to keep compat with that)
<bnieuwenhuizen> and agree it is stupid :)
<danvet> robclark, for fourcc generally just a mesa+kernel ack and good
<danvet> which I guess would be someone from qcom here maybe?
* danvet dunno
jkrzyszt has quit [Ping timeout: 480 seconds]
<danvet> robclark, bikeshed: you have them out of order, 3 before 2 :-)
<danvet> robclark, also no idea what this has to do with ubwc? seems like bog standard tiled format modifiers?
<robclark> 3 is the "common" one, 2 is normally just used for GMEM.. maybe that was my reasoning
<robclark> maybe abhinav__ could ack.. I _think_ display can support tiled but not ubwc
<robclark> the only thing it has to do with ubwc is that it is _not_ ubwc
<danvet> in general fourcc.h is explicitly for userspace/mesa internal stuff too, if that was a concern
<robclark> ubwc implies tiled
<danvet> oh ubwc isn't some arm caching mode?
<robclark> it is like ccs/afbc.. a bandwidth compressed format
<danvet> ah ok, then sounds all good to me
<robclark> so far we mostly use ubwc for anything allocated by minigbm/gralloc so we hadn't bothered "officially" defining a public modifier for the tiled-but-not-ubwc case
<robclark> fwiw, these fourcc would replace the FD_FORMAT_MOD_QCOM_TILED hack internal in mesa
<danvet> yeah add them, that's what this stuff is for
<danvet> I think both vk and gl specs explicitly mention upstream drm_fourcc.h as registry
<danvet> and the deal was that we explicitly include anything that uses that (or even internally if it makes more sense)
<danvet> so "kernel must have a use for it" does not apply to that file
<danvet> also not really the strict "must be open userspace"
<robclark> ok, sgtm.. mind pulling that into drm-misc or a-b for going thru msm-next? (I can change the ordering if you want)
maxzor has quit [Ping timeout: 480 seconds]
khfeng has quit [Ping timeout: 480 seconds]
ybogdano has quit [Read error: Connection reset by peer]
mi6x3m has joined #dri-devel
<mi6x3m> hey, can anyone help me find where the defines like GALLIUM_I915 are set?
<mi6x3m> in meson somewhere?
<mi6x3m> found it thanks
mi6x3m has quit []
oneforall2 has joined #dri-devel
ybogdano has joined #dri-devel
maxzor has joined #dri-devel
<ajax> has anyone tried to hook up uvesafb to simpledrm?
OftenTimeConsuming has quit [Remote host closed the connection]
OftenTimeConsuming has joined #dri-devel
ybogdano has quit [Ping timeout: 480 seconds]
LexSfX has quit [Ping timeout: 480 seconds]
LexSfX has joined #dri-devel
<jekstrand> karolherbst: Ok, I think I've done enough code review that I can return to committing grave sins in shaders. :)
ybogdano has joined #dri-devel
anarsoul|2 has joined #dri-devel
anarsoul has quit [Remote host closed the connection]
anarsoul|2 has quit [Remote host closed the connection]
anarsoul has joined #dri-devel
<karolherbst> jekstrand: yay
<karolherbst> boah.... my CTS run aborts all the time as my ssh session dies
<karolherbst> and screen just killed the session as well
Major_Biscuit has quit [Ping timeout: 480 seconds]
<karolherbst> but at least no new issue :)
<jekstrand> :)
<karolherbst> but conversions takes a ___loooong____
<karolherbst> time
<jekstrand> They take a LONG_MAX time? :P
<karolherbst> probably
<karolherbst> I started the run like 8 hours ago
<jekstrand> :(
<jenatali> Sounds about right
<karolherbst> yeah.. I just don't use SSH anymore for the CTS run then :)
<karolherbst> ohh wait
<karolherbst> I know what killed it
<karolherbst> systemd-oomd
<karolherbst> mhhh
<karolherbst> guess I have to figure out if conversions just eats way too much RAM
ngcortes has joined #dri-devel
* jekstrand hates these CTS tests
<karolherbst> yeah...
<jekstrand> starting to wonder if we should just make everything unorm
<karolherbst> jekstrand: what do you mean?
<jekstrand> karolherbst: The unorm/snorm tests pass
<jekstrand> it's just integer formats that are broken AFAICT
<jekstrand> Or maybe the integer tests are just pickier
<karolherbst> right.. but those are required afaik
<karolherbst> "Table 19. Minimum list of required image formats for reading or writing"
* jekstrand installs the Intel driver
* jekstrand has shader dumps from the intel driver. \o/
<jekstrand> Ok, maybe this will tell me something.
<karolherbst> yay
<jekstrand> Or maybe not. We'll see.
Major_Biscuit has joined #dri-devel
<jekstrand> No shader workarounds
<jekstrand> bah
<karolherbst> jekstrand: I thought intel converts the coords
<karolherbst> anyway.. wanted to check how they set up the samplers, as that looked a bit different than what happens in mesa
<jekstrand> karolherbst: We found code for it but not in this test
<jekstrand> Yeah
<karolherbst> maybe do the same there and see if at least that works?
* jekstrand wonders if intel_dump_gpu will work
HankB has quit [Remote host closed the connection]
HankB has joined #dri-devel
<jekstrand> Nope. The CL driver juggles too many contexts. :-/
<jekstrand> me tried to fix that many moons ago
<karolherbst> :(
<karolherbst> ahh yeah.. we leak memory :)
<karolherbst> or well.. use more over time
<karolherbst> "definitely lost: 1,615,036 bytes in 38,508 blocks" :)
<anholt> coool. windows is 9 minutes into a build and hasn't even made it past meson configure. https://gitlab.freedesktop.org/mesa/mesa/-/jobs/21513323
<zmike> sounds like windows to me
<karolherbst> anholt: it looks like it downloads stuff forever :P
<karolherbst> jekstrand: okay.. yeah.. so we are leaking like 4.5MB per conversion test case.. I can see how that can end up using tons of RAM :)
nchery has quit [Quit: Leaving]
Major_Biscuit has quit [Ping timeout: 480 seconds]
<jekstrand> :)
Major_Biscuit has joined #dri-devel
mvlad has quit [Remote host closed the connection]
<jenatali> anholt: Ouch... yeah sounds like this machine is having some problems. I think daniels is trying to help stand up a new one, but was having problems getting a license. We're trying to unblock him on that front
<jenatali> Hopefully it all clears up soon
<zmike> jenatali: heya any idea about games that run on windows and use GL?
<zmike> looking for some examples
<jenatali> zmike: Minecraft?
<zmike> hm maybe
<zmike> any others come to mind?
<karolherbst> zmike: blizzard games usually allowed to use GL
<jenatali> (Java edition specifically)
<ajax> doom3
<karolherbst> some source based games as well
<jenatali> Yeah DOOM 3 and DOOM 2016
<daniels> jenatali, anholt: yeah I'm just going to nuke Windows until it's improved
<karolherbst> like the CSS and CS 1.6 I think :P
<anholt> daniels: thanks.
<jenatali> zmike: Looking for any particular GL version?
<anholt> it's that or we would need to up marge's timeout to a few hours.
<zmike> no just any
<karolherbst> jekstrand: ehh.. I think we leak all over the place :(
maxzor has quit [Ping timeout: 480 seconds]
<ajax> doom3 is basically the first gl2 game for linux iirc?
<ajax> but i hear it ran on windows too
<jenatali> It does
<karolherbst> mhh.. somehow I leak fences as well
<jenatali> That was one of our demos when we got it running on our Qualcomm systems with GLOn12
<FLHerne> zmike: You can get OpenTTD to use its GL renderer on Windows I think
<FLHerne> although it defaults to some DX thing
<FLHerne> not that that reflects performance of... any game with a sane rendering pipeline
<zmike> jenatali: a lot of those are only GL on non-windows platforms
<FLHerne> I assume 0ad is OpenGL everywhere
<karolherbst> ahh yeah.. fence is the biggest issue here
<karolherbst> anything else is just "possibly"
<jenatali> Ah, good to know
<jenatali> zmike: Talos Principle comes to mind, unless they deleted their GL renderer after adding Vulkan
<karolherbst> huh..
<airlied> anholt: daniels mentioned zlib dl being slow
<jekstrand> Well, I'm stumped...
<karolherbst> why does the fence have a ref count of 2?
<anholt> airlied: opened an issue, hopefully someone will sort it out before trying to turn windows back on
<jenatali> zmike: There's also GfxBench and other cross-platform benchmarks that have GL paths
<daniels> airlied: zlib isn't even close to the problem
<karolherbst> ohhh no
<karolherbst> I ref the fence even though I don't have to
Major_Biscuit has quit [Ping timeout: 480 seconds]
<jekstrand> Kayden: Care to go on a chicken bit hunt? Looking for a context bit that affects sampler coordinate precision.
<jekstrand> Hrm... There's a bit in the SAMPLER_MODE register.
<karolherbst> "definitely lost: 5,588 bytes in 4 blocks" yeah okay.. that's much better :)
<zmike> 🤔
<karolherbst> looks like we are working on stuff
<karolherbst> nir_deserialize leaks memory, fun
<karolherbst> and somehow I ended up leaking the kernel object :O
<jekstrand> srsly?
<karolherbst> jekstrand: yeah
<jekstrand> nir_deserialize shouldn't leak
<jekstrand> oof
<karolherbst> it's like 10kb
<jekstrand> Also, SAMPLER_MODE::PLFloatToFixPrecisionFixDisable seems to do nothing. :-/
<karolherbst> huh
<karolherbst> we leak a variable
<jekstrand> weird
<karolherbst> very
<karolherbst> ahh
<karolherbst> ehh no, wrong file
<karolherbst> strange
<karolherbst> I also can't just ask what kind of var that is
<karolherbst> it gets added via exec_list_push_tail, so that would be very strange
<karolherbst> maybe it's ralloc which is leaking
nchery has joined #dri-devel
<karolherbst> ehh
<karolherbst> no idea
<karolherbst> I assume it's because the CTS doesn't dealloc the kernel and the stored nir is just not tidied up
Major_Biscuit has joined #dri-devel
<karolherbst> why isn't alyssa here? :D
rkanwal has quit [Remote host closed the connection]
rkanwal has joined #dri-devel
maxzor has joined #dri-devel
kathleen__ has joined #dri-devel
gouchi has quit [Remote host closed the connection]
tjmercier has joined #dri-devel
Major_Biscuit has quit [Ping timeout: 480 seconds]
rkanwal has quit [Ping timeout: 480 seconds]
Duke`` has quit [Ping timeout: 480 seconds]
* jekstrand really should stop staring at this sampler bug. Intel no longer pays him.
<karolherbst> jekstrand: that bad?
<jekstrand> karolherbst: I've made almost no progress except figuring out with high certainty that the Intel CL driver isn't doing shader hacks.
<karolherbst> yeah... well.. you can push what you got and I can take a look as well
<karolherbst> given that's like the last thing we need to fix...
<jekstrand> we're using different denorm modes...
Akari` has joined #dri-devel
<karolherbst> jekstrand: like the value pushed to the hardware is different, but the enum names are essentially the same?
Akari has quit [Ping timeout: 480 seconds]
<jekstrand> No, like we're flushing and they're not
<karolherbst> ahh
<karolherbst> ohh, you said "denorm", right
<karolherbst> weird that it matters though, but guess that makes somewhat sense
<karolherbst> but weird that denorms matter for coords even if we don't advertise support for denorms at all
<jekstrand> I don't know that they do
maxzor has quit [Ping timeout: 480 seconds]
<karolherbst> kind of depends what coords you got and what the test expects
<jekstrand> Ok, doesn't seem to make a difference. :-/
<karolherbst> yeah...
<karolherbst> I usually blame denorms as well, as the p notation has high exponents, but it's binary
<karolherbst> and then it's just a normal value
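To make the remark about "p" notation concrete: in hex-float output the exponent after `p` is a power of two, so a value like 0x1p-20 is 2^-20, which looks tiny but is an ordinary normal f32. Denormals only start below 2^-126. The sketch below checks this; the specific values are made-up examples, not taken from a CTS log, and `is_denorm` and `flush_denorm` are hypothetical helpers.

```rust
use std::num::FpCategory;

/// True if `x` is a subnormal (denormal) f32.
pub fn is_denorm(x: f32) -> bool {
    x.classify() == FpCategory::Subnormal
}

/// Models flush-to-zero: denormals become a signed zero, everything else
/// passes through unchanged.
pub fn flush_denorm(x: f32) -> f32 {
    if is_denorm(x) {
        0.0f32.copysign(x)
    } else {
        x
    }
}
```

Here `1.0 / (1 << 20) as f32` (that is, 2^-20) is classified as normal and survives flushing, while `f32::MIN_POSITIVE / 4.0` (2^-128) is a denormal and gets flushed to zero.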
<marex> robertfoss: so, I just tested lvds dual-link with lt9211, works perfectly :)
<jekstrand> karolherbst: Fiddling with the AddressMinFilterRoundingEnables does change things.
<jekstrand> But not enough to fix it
<karolherbst> well, the question is, does it get just slightly more/less accurate or does the category of the error change?
<jekstrand> It gets to a higher pixel before it finds an error so I guess more accurate?
<karolherbst> what are the failing coords?
<jekstrand> I'm looking at 1D RGBA INT8 CLAMP_TO_EDGE
<karolherbst> sure, but at what coords does it fail?
<jekstrand> Without changing that, I get a fail at pixel 7
<jekstrand> With it, I get a fail at 86
<zmike> mareko: I think https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15504 should be good to go now
<karolherbst> jekstrand: mhh, so that kind of sounds like a precision error, _but_ it can be something else still
<karolherbst> kind of depends where it finds the correct value
<jekstrand> karolherbst: Oh, it's definitely a precision issue
<karolherbst> jekstrand: keep in mind, that they have another thing going on for that combination
<karolherbst> 'Sampler::isTransformable'
<jekstrand> Annoyingly, the state dump I got from my Intel friend is on Gen9 and we're both on 12.
<jekstrand> karolherbst: ???
* jekstrand gresp
<karolherbst> CLAMP_TO_EDGE + NEAREST + not normalized
<karolherbst> ohh, well I guess, that's normalized here?
<jekstrand> yeah
<karolherbst> mhh
<karolherbst> but I think I saw something somewhere.. maybe different file
* jekstrand still suspects chickens
<karolherbst> jekstrand: do you convert the coords to float?
<jekstrand> karolherbst: No, there's zero conversion. They're loaded from the buffer as float and go straight into the sampler
<karolherbst> ahh.. okay
<karolherbst> strange..
<karolherbst> but right.. we do have float coords
<karolherbst> jekstrand: what did you have to change to get the test a little further?
<jekstrand> AddressMinFilterRoundingEnable
<karolherbst> ohh, can I just set it? I didn't find it via grep
<jekstrand> it adjusts when in the calculation things get converted to fixed-point
<jekstrand> iris_state.c:2089
danvet has quit [Ping timeout: 480 seconds]
<karolherbst> yeah, I just wasn't aware that field exists
<karolherbst> ahh
<karolherbst> I think I just messed up git grep :)
<mattst88> emersion: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/231#note_1347250 -- seriously? I wouldn't even bother getting a review on something like this for mesa
<karolherbst> now I just need to find the place where intel set that stuff :D
<jekstrand> I really wish I had something other than just TGL right now
<karolherbst> I do
<jekstrand> Does it fail the same way on both?
<karolherbst> yes
<jekstrand> :(
<karolherbst> CometLake-H GT2 here
<jekstrand> That shoots another theory
<karolherbst> jekstrand: anyway, flipping that breaks another test anyway
<jekstrand> Can you run that test with INTEL_DEBUG=bat on your cometlake and paste the result
<karolherbst> in mesa?
<jekstrand> yeah
<jekstrand> Thanks! I've got a dump from a SKL run and it'll be easier to compare gen9 with gen9
<karolherbst> yeah
<karolherbst> makes sense
<jekstrand> karolherbst: You're missing a patch. I just pushed my branch. Grab the top one. Then samplers will start showing up in batch dumps again.
<jekstrand> karolherbst: Still not seeing samplers. :-/
<karolherbst> ehh..
<karolherbst> jekstrand: yeah.. no idea, I grabbed your patch and nothing changes even after recompiling
<jekstrand> :(
apinheiro has quit [Quit: Leaving]
<jekstrand> karolherbst: I think I may have figured it out and I really don't like it
<karolherbst> oh no, but that kind of sounds like you really figured it out if you don't like it :D
<jekstrand> The difference between us and the Intel driver isn't the results being returned by the sampler. It's the results from the CTS's SW sampling.
<jekstrand> I think we're changing the CPU rounding mode out from under them.
<karolherbst> uhhhh....
<karolherbst> _maybe_
<karolherbst> I saw that happening compiling with Ofast, but not with like debug or something
<karolherbst> does iris change the fp mode in any way?
<jekstrand> Not intentionally
<jekstrand> But who knows what all we call
<karolherbst> yeah.. but I don't think that's it though.. let me try
<karolherbst> mhh
<karolherbst> nope
<karolherbst> that's not it
<karolherbst> the CTS has code to change the rounding mode though
<karolherbst> jekstrand: FlushToZero
<karolherbst> which they call inside conversions
<karolherbst> ohh wait.. I think they stopped doing it
<karolherbst> contractions still do
<karolherbst> jekstrand: btw.. this test passes on iris
<karolherbst> ehh
<karolherbst> llvmpipe
<jekstrand> Yeah, so it could be iris messing something up
<jekstrand> what I know is that resultPtr is the same between the two drivers. :-/
<karolherbst> mhhh odd
* jekstrand builds llvmpipe
<karolherbst> maybe the GPU changes something on the CPU
<karolherbst> which.. would be super odd, but...
<jekstrand> karolherbst: How do I select between iris and llvmpipe?
<karolherbst> LP_CL=1 CL_DEVICE_TYPE=CL_DEVICE_TYPE_CPU
Haaninjo has quit [Remote host closed the connection]
<jekstrand> Ok, this is fun... llvmpipe has the same expect as iris and iris has the same actual as the Intel CL driver
<jekstrand> So llvmpipe's flipping things too, it's just consistent because it's also using the CPU for sampling.
<jekstrand> Fun!
<karolherbst> oh wow
<karolherbst> but yeah.. kind of makes sense
<karolherbst> question is.. what changes it
<jekstrand> That's the question
<jekstrand> This seems like something mattst88 might kno1
<karolherbst> maybe the intel runtime changes stuff...
<jekstrand> *know
<karolherbst> I disabled the call to _mm_setcsr here
<karolherbst> not sure if there is another way of changing it
<jekstrand> I guess it's possible the Intel driver is setting it...
<karolherbst> that would be like cheating :P
<karolherbst> jekstrand: but guess what
<jekstrand> ?
<karolherbst> mhh they have references, but I don't know if they use it
<karolherbst> mhh they change it in the compiler, but change it back
<karolherbst> mhh, let me check something
* jekstrand tries setting a watchpoint on $MXCSR
<karolherbst> jekstrand: sooo.. intels stack does expose other rounding modes
<karolherbst> let's see...
<karolherbst> we round to 6, but CTS expects 7
<karolherbst> we expose NEAREST
<karolherbst> maybe let's expose ROUND to zero
<karolherbst> yeah.. the CTS doesn't care
<karolherbst> soo anyway.. the coord is 0x1.5c9882p-6
<jekstrand> could also be rust, I suppose.
<karolherbst> ohh... maybe
ahajda has quit [Quit: Going offline, see ya! (www.adiirc.com)]
<karolherbst> so rust seems to round to nearest
<jekstrand> The precision flag in mxcsr changes between the start of main() and when we go to do tests
<karolherbst> can you set a watchpoint or something? :D
<karolherbst> seems like it's possible, but just super slow?
<jekstrand> idk
<karolherbst> I'll let it run for a while and see if it gets somewhere
<karolherbst> ahh yeah.. llvm loading
<jekstrand> Ok, so PE is still unset when we get to iris_screen_create
<karolherbst> good point
<karolherbst> it's still slow though :(
<jekstrand> It changes somewhere between iris_screen_create and test_read_image_1D
<karolherbst> yeah... I hope my watchpoint triggers
<karolherbst> it's still inside iris_screen_create here :(
<karolherbst> jekstrand: it changes inside iris_screen_create
<jekstrand> oh?
<jekstrand> That's progress!
<karolherbst> yeah
<karolherbst> finish inside it and then boom
<karolherbst> okay
<karolherbst> I think I am close
<jekstrand> iris_get_default_l3_config
<karolherbst> yeah
<karolherbst> oh no....
<karolherbst> it's some super annoying magic
<karolherbst> third call to intel_diff_l3_weights or something
<karolherbst> ehh second
<karolherbst> ahh
<karolherbst> maybe now I can set a watch point :)
<karolherbst> bingo
<karolherbst> Watchpoint 3: $mxcsr
<karolherbst> Old value = [ IM DM ZM OM UM PM ]
<karolherbst> New value = [ PE IM DM ZM OM UM PM ]
<karolherbst> :)
<jekstrand> where?
<karolherbst> #0 0x00007ffff713a372 in norm_l3_weights (w=...) at ../src/intel/common/intel_l3_config.c:204
<jekstrand> A division operation
<jekstrand> great
<karolherbst> yeah...
<karolherbst> maybe intel compiles their stack with some magic compilation flag
<karolherbst> ehh they sure do
<jekstrand> karolherbst: Oh, that's because PE is an exception flag
<karolherbst> ahh
<karolherbst> okay, so.. doesn't matter
<jekstrand> Yeah, I think so
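A minimal sketch of why the watchpoint hit is harmless, assuming the standard x86 MXCSR bit layout. The helper `decode_mxcsr` is hypothetical, and the raw values 0x1F80 and 0x1FA0 are inferred from the flag names gdb printed above rather than copied from the session. PE (bit 5) is a sticky status flag that an inexact operation such as a division sets; the rounding-control field (bits 13-14) and the FTZ/DAZ bits are what would actually change computed results.

```python
# MXCSR bit layout (x86 SSE control/status register).
STATUS_FLAGS = {0: "IE", 1: "DE", 2: "ZE", 3: "OE", 4: "UE", 5: "PE"}
ROUNDING_MODES = {0: "nearest", 1: "down", 2: "up", 3: "toward zero"}

def decode_mxcsr(value):
    """Hypothetical helper: split an MXCSR value into the fields that matter here."""
    return {
        "status": [name for bit, name in STATUS_FLAGS.items() if value & (1 << bit)],
        "daz": bool(value & (1 << 6)),
        "rounding": ROUNDING_MODES[(value >> 13) & 0x3],
        "ftz": bool(value & (1 << 15)),
    }

before = decode_mxcsr(0x1F80)  # assumed value for [ IM DM ZM OM UM PM ]
after = decode_mxcsr(0x1FA0)   # assumed value for [ PE IM DM ZM OM UM PM ]
print(before)
print(after)
```

Only the `status` list differs between the two decodes; the rounding mode and the flush-to-zero bits are identical.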
<jekstrand> I'm going to run in GDB with the Intel driver and see what $mxcsr looks like
<karolherbst> looks the same to me?
<jekstrand> Yeah
<jekstrand> So why is it computing different values?!?
<karolherbst> the code might tell us
OftenTimeConsuming has quit [Remote host closed the connection]
OftenTimeConsuming has joined #dri-devel
morphis has quit [Ping timeout: 480 seconds]
<karolherbst> yeah.. this looks odd, but I am sure there is a good reason
morphis has joined #dri-devel
tzimmermann_ has joined #dri-devel
pcercuei has quit [Quit: dodo]
tzimmermann has quit [Ping timeout: 480 seconds]
<karolherbst> jekstrand: nope
<karolherbst> intel does calculate different values
<karolherbst> for the same sample
<karolherbst> intel:
<karolherbst> (gdb) p expected
<karolherbst> $8 = {103, 61, -27, -40}
<karolherbst> (gdb) p *(int[4]*)resultPtr
<karolherbst> $9 = {103, 61, -27, -40}
<karolherbst> rusticl/iris:
<karolherbst> (gdb) p expected
<karolherbst> $10 = {9, -92, 116, 108}
<karolherbst> (gdb) p *(int[4]*)resultPtr
<karolherbst> $11 = {103, 61, -27, -40}
<karolherbst> let me get the float coords though
CATS has quit [Ping timeout: 480 seconds]
<jekstrand> karolherbst: Intel vs. rusticl do different things in get_integer_coords_offset
<bnieuwenhuizen> jekstrand: did you have branches lying around for the import/export a fence into/from a dma-buf stuff? (hoping to avoid having to apply a bunch of downloaded patches to inevitably the wrong kernel branch)
CATS has joined #dri-devel
<jekstrand> bnieuwenhuizen: Not recent
<karolherbst> jekstrand: because the input is different
<bnieuwenhuizen> I'll take not recent :)
<jekstrand> bnieuwenhuizen: This is all I've got: https://gitlab.freedesktop.org/jekstrand/linux/-/branches/stale
<jekstrand> bnieuwenhuizen: I don't see it
<jekstrand> :-(
<jekstrand> karolherbst: There's an xAddressOffset on Intel
<karolherbst> OHHHH
<karolherbst> for fucks sake
<karolherbst> I found it
<jekstrand> karolherbst: oh?
<karolherbst> AHHHHH
* jekstrand is waiting
<karolherbst> AHHHH
iive has quit []
<karolherbst> I'll push in a moment
* karolherbst throws his laptop against the wall
<Kayden> so we didn't need a chicken bit after all?
<jekstrand> karolherbst: WTH?!?
<karolherbst> jekstrand: yeah.. the CTS uses a different rounding mode for !GPU devices :)
<jekstrand> FFS
<karolherbst> I totally forgot about that
<karolherbst> you know what got me to think about that?
<karolherbst> "gDeviceType != CL_DEVICE_TYPE_GPU" :)
<karolherbst> it's all over the place
* karolherbst reads the spec
<karolherbst> "The device type is purely informational and has no semantic meaning." :)
<karolherbst> sure
<jekstrand> hehe
<karolherbst> "Some devices may be more than one type. For example, a CL_DEVICE_TYPE_CPU device may also be a CL_DEVICE_TYPE_GPU device"
<jekstrand> karolherbst: That's also what they say about the game executable name and the driver vendor string. :)
<karolherbst> jekstrand: best part
<karolherbst> "One device in the platform should be a CL_DEVICE_TYPE_DEFAULT device" :)
<karolherbst> :) :) :)
<karolherbst> ffs
<karolherbst> "I've done nothing wrong" or something
<jekstrand> karolherbst: Of course...
<karolherbst> but luxmark can't deal with default either
<karolherbst> lists the device as "UNKNOWN" instead of gpu
<karolherbst> maybe we just don't set it then...
<karolherbst> intel does return a default device then...
<karolherbst> maybe I just mask it out when the device type gets returned
<karolherbst> but still honor it for getting devices of a specific type
<karolherbst> how idiotic
<karolherbst> maybe I remove default handling, because this gets complicated with multiple devices anyway
<karolherbst> jekstrand: you have a patch for CL_ADDRESS_CLAMP, right?
<karolherbst> this min/max clamping whatever thing
<karolherbst> maybe I'll fix the CTS and let it do a & instead of ==
ngcortes has quit [Ping timeout: 480 seconds]
<karolherbst> it's clearly a CTS bug :P
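A small sketch of the `&` versus `==` problem, using the bit values from the OpenCL headers. The function names are hypothetical and this is not the CTS code itself, only an illustration of how an equality test misclassifies a GPU that also reports the DEFAULT bit.

```python
# cl_device_type is a bitfield; these values come from the OpenCL headers (cl.h).
CL_DEVICE_TYPE_DEFAULT = 1 << 0
CL_DEVICE_TYPE_CPU = 1 << 1
CL_DEVICE_TYPE_GPU = 1 << 2

def is_gpu_by_equality(device_type):
    """Mirrors a `gDeviceType == CL_DEVICE_TYPE_GPU` style check."""
    return device_type == CL_DEVICE_TYPE_GPU

def is_gpu_by_mask(device_type):
    """Bitmask check: true whenever the GPU bit is set, whatever else is set."""
    return (device_type & CL_DEVICE_TYPE_GPU) != 0

# A device that reports itself as both GPU and DEFAULT.
reported = CL_DEVICE_TYPE_GPU | CL_DEVICE_TYPE_DEFAULT
print(is_gpu_by_equality(reported))  # equality treats it as a non-GPU device
print(is_gpu_by_mask(reported))      # the mask still sees the GPU bit
```

Masking out the DEFAULT bit before returning the device type, as suggested above, would make both checks agree for this device.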
<jekstrand> karolherbst: I don't think we need anything for clamp
<karolherbst> we do
<karolherbst> this check against < 0.0
Emmy_ has quit [Remote host closed the connection]
<karolherbst> with the magic input value
<jekstrand> karolherbst: Ok... I'm confused. I'll look in a minute. I'm telling Twitter a story right now. :)
<karolherbst> :D
<karolherbst> [CL_R CL_FLOAT 1] - CL_FILTER_NEAREST - CL_ADDRESS_CLAMP - UNNORMALIZED
<karolherbst> FAILED norm_offsets: 0:
<karolherbst> Sample 9: coord {-0.000000(-0x1p-24)} did not validate!
<karolherbst> Expected (0,0,0,1),
<karolherbst> got (-1.27092e-24,0,0,1), error of 0
<karolherbst> this one
<karolherbst> this annoying __builtin_IB_get_snap_wa_reqd business
<jekstrand> Yeah, I'll look in a minute
<jekstrand> I've got hacks that do all that. I just need to revert my more hacks.
<karolherbst> yay
Emmy_ has joined #dri-devel
<karolherbst> yeah.. soo now 2 api tests fail :)
<karolherbst> ahh no
<karolherbst> guess we don't need to return a DEFAULT device after all
cheako has joined #dri-devel