<anarsoul>
I don't see any rationale behind allocating vec3 at the very end
<jekstrand>
Yeah, seems silly to me too
<Kayden>
Seems fine to me
<Kayden>
unless that causes vec2 to be double-parked
<Kayden>
double-parking vec2s would be bad
<anarsoul>
how is it different with double-parking vec3s?
<Kayden>
if you had a vec4, a vec3, and 2 vec2's, if you got.... xyzw | xyz x | y xy _
<Kayden>
instead of xyzw | xy xy | xyz _
<Kayden>
but I'm not sure it actually does that
<Kayden>
I think it tries to avoid double-parking if it can
<anarsoul>
hold on
<Kayden>
if it came up with xyzw | xyz _ | xy xy that would be just as good :)
<anarsoul>
wouldn't it be xyzw | xyz- | xy xy?
<Kayden>
I think so?
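To make the "double-parking" worry above concrete, here is a minimal sketch that packs variables into vec4 slots in a given order and counts how many end up straddling two slots. It is only a toy model of the idea, not the actual packing pass being discussed; the function names and the "place at the next free component" rule are assumptions made for illustration.

```python
SLOT_WIDTH = 4  # components per vec4 slot


def pack_in_order(sizes):
    """Place each variable at the next free component, in the order given.

    Returns a list of (first_component, size) pairs. This is a toy rule,
    not what the real pass does.
    """
    placements = []
    offset = 0
    for size in sizes:
        placements.append((offset, size))
        offset += size
    return placements


def count_double_parked(placements):
    """Count variables whose components straddle two vec4 slots."""
    straddling = 0
    for first, size in placements:
        last = first + size - 1
        if first // SLOT_WIDTH != last // SLOT_WIDTH:
            straddling += 1
    return straddling


# vec4, vec3, vec2, vec2  ->  xyzw | xyz x | y xy _
bad = count_double_parked(pack_in_order([4, 3, 2, 2]))
# vec4, vec2, vec2, vec3  ->  xyzw | xy xy | xyz _
good = count_double_parked(pack_in_order([4, 2, 2, 3]))
```

Both orders use three slots; the only difference in this model is whether one of the vec2s gets split across a slot boundary.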
itoral has joined #dri-devel
<Kayden>
oh hmm it advances slots if the packing order is different
<Kayden>
so...it doesn't pack <vec2, scalar, scalar> into the same slot?
<anarsoul>
let me try
<Kayden>
it doesn't look like it would, but I thought it did, and it probably ought to
* Kayden
hasn't seriously read that code in 5 years
kem has quit [Ping timeout: 480 seconds]
<Kayden>
anarsoul: I think your change is good and you're right, there's no particular reason to have vec3 at the end
<Kayden>
it seems like it could do a better job packing some things
<Kayden>
err....the pass could do a better job than it is (regardless of your change)
<Kayden>
hmm....the comment does mention that vec3 is explicitly at the end so that others aren't at risk of being double parked
javierm has quit [Quit: leaving]
<anarsoul>
Kayden: I'll look into the code tomorrow night
javierm has joined #dri-devel
<anarsoul>
off to bed now :)
mattrope has quit [Read error: Connection reset by peer]
<Kayden>
have a good night!
kem has joined #dri-devel
Duke`` has quit [Ping timeout: 480 seconds]
slattann has quit []
Company has quit [Read error: Connection reset by peer]
slattann has joined #dri-devel
Hi-Angel has joined #dri-devel
flto has quit [Ping timeout: 480 seconds]
flto has joined #dri-devel
flto_ has joined #dri-devel
flto has quit [Remote host closed the connection]
pnowack has joined #dri-devel
tomeu has joined #dri-devel
tzimmermann has joined #dri-devel
slattann has quit []
slattann has joined #dri-devel
slattann1 has joined #dri-devel
danvet has joined #dri-devel
slattann has quit [Ping timeout: 480 seconds]
slattann has joined #dri-devel
slattann1 has quit [Ping timeout: 480 seconds]
<demarchi>
I'm seeing a ton of warnings on amdgpu in drm-tip... is it just me?
slattann1 has joined #dri-devel
slattann has quit [Ping timeout: 480 seconds]
samuelig has joined #dri-devel
rasterman has joined #dri-devel
slattann1 has quit [Ping timeout: 480 seconds]
<MrCooper>
Kayden: "dmabuf export on buffers that aren't marked SHARED" isn't expected to work, that's what SHARED is about
<MrCooper>
emersion: mutter still disables modifiers by default on amdgpu as well, probably until dma-buf hints
<Kayden>
MrCooper: I wish that were true. It seems that SHARED is flagged for buffers which we know or expect will be shared, but the API lets you spontaneously export whatever you want.
slattann has joined #dri-devel
<Kayden>
MrCooper: In particular, Piglit's bin/ext_image_dma_buf_import-export-tex hits this case
<Kayden>
I'm not really sure how you think that should be fixed
<Kayden>
the sequence of events is... glTexStorage2D() - creates texture and allocates storage - eglCreateImageKHR - I suppose this could mark it shared and convert it there? - eglExportDMABUFImageMESA
<Kayden>
we can't know it's shared at step 1, because we're just allocating an ordinary texture
<Kayden>
step 2, it's already allocated wrongly for sharing
<Kayden>
step 3, it needs to be different
<MrCooper>
right, similar issue with other bind flags
<MrCooper>
some kind of transition at 2 maybe indeed
<Kayden>
hmm. yeah, that would definitely be nicer
<Kayden>
when something is made into an EGLImage it's almost certainly going to be shared
slattann has quit []
aissen_ has quit []
tursulin has joined #dri-devel
<Kayden>
It looks like that eglCreateImageKHR ends up in dri2_create_from_texture(), which checks if EGL_MESA_image_dma_buf_export is possible, and calls pipe_context::flush_resource()
<Kayden>
so flush_resource() could flag PIPE_BIND_SHARED if it isn't already and transition out of non-exportable forms
<Kayden>
that seems like a much nicer solution
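The three-step sequence and the proposed fix can be modelled roughly as below. This is a hedged sketch of the idea only: `Resource`, `BIND_SHARED`, `exportable_layout` and the function bodies are hypothetical stand-ins, not Gallium's real structures or radeonsi/iris behaviour. The point is just that the transition happens at the EGLImage step, so the later export finds the resource already in a shareable state.

```python
from dataclasses import dataclass

BIND_SHARED = 1 << 0  # stand-in for PIPE_BIND_SHARED; the value is made up


@dataclass
class Resource:
    bind: int = 0
    # Hypothetical flag: False models a layout that cannot be exported as-is.
    exportable_layout: bool = False


def tex_storage():
    """Step 1: an ordinary texture, allocated with no idea it will be shared."""
    return Resource()


def flush_resource(res):
    """Step 2 hook (reached from EGLImage creation in this model):
    flag the resource as shared and move it out of non-exportable forms."""
    res.bind |= BIND_SHARED
    if not res.exportable_layout:
        # A real driver would resolve or reallocate here; we only flip a flag.
        res.exportable_layout = True
    return res


def export_dmabuf(res):
    """Step 3: export only works on resources in a shareable state."""
    if not (res.bind & BIND_SHARED) or not res.exportable_layout:
        raise RuntimeError("resource is not in an exportable state")
    return "dmabuf-placeholder"
```

In this model, calling `export_dmabuf()` straight after `tex_storage()` fails, while inserting `flush_resource()` in between makes it succeed.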
<Kayden>
I wonder why radeonsi doesn't do that..
<MrCooper>
not sure offhand
gawin has joined #dri-devel
idr has quit [Ping timeout: 480 seconds]
rasterman has quit [Quit: Gettin' stinky!]
<emersion>
MrCooper: yea josh told me
<emersion>
i don't understand why
<emersion>
it's not like disabling modifiers will make direct scanout work
<emersion>
if anything, it'd be the contrary: only advertise scanout-capable modifiers as supported by the compositor
<emersion>
(what we do i gamescope)
<emersion>
in*
<emersion>
cc jadahl
<Kayden>
if I recall correctly, on...Intel DG1?...if compositors don't do modifiers, then Mesa starts throwing them linear buffers
<emersion>
that's a good thing :)
<emersion>
but it would also be nice if the display hw wouldn't just freak out when y-tiling is used :P
<Kayden>
but even if there are problems with some of the crazy ones, it would be nice to at least have modifier support but restricted to say, I915_FORMAT_MOD_X_TILED at least
<Kayden>
yeah, absolutely :(
<emersion>
y-tiling is the only reason why we have to tell users to disable modifiers
pcercuei has joined #dri-devel
lynxeye has joined #dri-devel
<Kayden>
presumably that includes Y_TILED_CCS, Y_TILED_GEN12_RC_CCS, Y_TILED_GEN12_RC_CCS_CC too.
<jadahl>
emersion: disabling modifiers make it work with xwayland :P (unredirect)
<emersion>
jadahl: shouldn't be the case
<jadahl>
really need to get to implementing that brute force intel thing to enable it for intel
<emersion>
xwayland used to allocate with SCANOUT, but doesn't anymore
<jadahl>
maybe it doesn't anymore then
<emersion>
and also that's a bad excuse for disabling modifiers :)
<jadahl>
heh, I don't disagree :P
<jadahl>
the best excuse is breaking multi head
<emersion>
yeah
<jadahl>
such head ache :|
<emersion>
that one i can get behind
<emersion>
yes ;_;
<Kayden>
that does sound painful :/
<emersion>
downgrading from y-tiled to x-tiled isn't too hard when a single CRTC is involved
<emersion>
but on hotplug if you need to downgrade *other* CRTCs then it's just a huge mess
<emersion>
i wish we had a DRM_CAP_INTEL_PLEASE_NO_BLACK_SCREENS
<jadahl>
my plan has been and still is to allocate all the things on hotplug, TEST_ONLY to see, goto 1 with another modifier for CRTC 1, goto 1 with CRTC 2, or etc
<emersion>
yeah, that's the only way to do it…
<jadahl>
the nice thing is that monitors take forever to turn on/off anyway, so if it takes some milliseconds so be it
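The brute-force plan described here (try a modifier per CRTC, do a TEST_ONLY commit, fall back to another modifier and repeat) could look roughly like the sketch below. `test_only_commit` is a hypothetical callback standing in for an atomic commit with the TEST_ONLY flag, and the modifier names and the "at most one Y-tiled plane" rule are invented purely so the example has something to reject.

```python
from itertools import product


def pick_modifiers(crtcs, candidates, test_only_commit):
    """Return the first {crtc: modifier} assignment the test commit accepts.

    `candidates` is ordered best-first, so the all-best combination is
    tried first and later combinations are progressively worse.
    """
    for combo in product(candidates, repeat=len(crtcs)):
        assignment = dict(zip(crtcs, combo))
        if test_only_commit(assignment):
            return assignment
    return None


def fake_test_only_commit(assignment):
    """Made-up hardware limit: at most one Y-tiled plane at a time."""
    return list(assignment.values()).count("Y_TILED") <= 1


chosen = pick_modifiers(
    ["crtc-1", "crtc-2"],
    ["Y_TILED", "X_TILED", "LINEAR"],
    fake_test_only_commit,
)
```

The search is exponential in the number of CRTCs, which matches the "brute force" label; with a handful of CRTCs and modifiers that is still a small number of test commits.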
slattann has joined #dri-devel
<MrCooper>
emersion: enabling modifiers breaks direct scanout for DRI3 clients (which allocate buffers themselves, not Xwayland)
<emersion>
on DRI3
<emersion>
oh,*
<emersion>
well why don't you just only advertise scanout modifiers if you want it so badly?
rbrune has joined #dri-devel
<MrCooper>
because that would anger daniels ;) I mean would hurt some embedded platforms
yoslin_ has quit []
vivijim has joined #dri-devel
yoslin has joined #dri-devel
<MrCooper>
dma-buf hints to the rescue
<lynxeye>
MrCooper: Not using direct scanout will also hurt some embedded platforms. ;) But yea, dma-buf hints is the only way to solve things generically, without a truckload of assumptions and heuristics.
slattann has quit [Ping timeout: 480 seconds]
slattann has joined #dri-devel
frieder has joined #dri-devel
<pq>
anholt_, FWIW, Weston should be importing YUV to EGL, provided you can find a Wayland client producing the kind of dmabufs you want to test with. But Weston will also fall back to hand-rolled import+conversion if direct EGL import fails.
rasterman has joined #dri-devel
Ahuj has joined #dri-devel
thellstrom has joined #dri-devel
dliviu has quit [Ping timeout: 480 seconds]
<pq>
Kayden, I thought allocating with GL and dmabuf-exporting that was totally a "doctor, it hurts" scenario?
<Kayden>
it certainly doesn't seem common, we have exactly 1 test case in all of piglit and the CTS that hits this
hansg has joined #dri-devel
thellstrom1 has quit [Ping timeout: 480 seconds]
dliviu has joined #dri-devel
jessica_24 has quit [Quit: Connection closed for inactivity]
JohnnyonF has joined #dri-devel
oneforall2 has quit [Quit: Leaving]
JohnnyonFlame has quit [Ping timeout: 480 seconds]
<tzimmermann>
who needs drm_fbdev_overalloc?
<danvet>
tzimmermann, page flipping on fbdev
<tzimmermann>
danvet: indeed, but it's optional AFAIK
<MrCooper>
also non-primary display with larger vertical resolution than primary one
<danvet>
yeah we don't want to waste memory
<danvet>
at least not by default
<danvet>
MrCooper, only if you plug it in later on
<MrCooper>
which happens by default with my external monitor connected via USB-C (DisplayPort alt mode)
<danvet>
plus in theory we could fix that by dynamically allocating
<danvet>
for drivers that use the generic fbdev stuff it should be pretty easy, since there we also intercept mmap
<tzimmermann>
people trigger this bug with simpledrm. the allocated BO cannot be larger than the screen size. so with overalloc, the height check in drm_internal_framebuffer_create() fails
<tzimmermann>
long story short: no console
aissen has joined #dri-devel
slattann has quit [Ping timeout: 480 seconds]
frieder has quit [Remote host closed the connection]
<danvet>
^^ would be great if one of you two can fix this
<danvet>
I gtg now for a bit, I'll try and do the drm-fixes pull this evening
<tzimmermann>
danvet, some people configure overalloc and see the console fail
rgallaispou has joined #dri-devel
<danvet>
tzimmermann, yeah, don't do that?
<danvet>
or maybe we can hack up simpledrm to not overalloc, dunno
<danvet>
also I thought the overalloc code should fall back to not-overallocated maybe?
<tzimmermann>
but it should work. simpledrm uses shmem. there's no good reason why overalloc would break
<tzimmermann>
i added the not-overalloc workaround while testing, but i think the problem is in the semantics of max_height
<tzimmermann>
which brings me to my question
<danvet>
maybe, but I really gtg now
<tzimmermann>
what are the semantics of mode_config.max_height
<tzimmermann>
no problem
tzimmermann has quit [Remote host closed the connection]
tzimmermann has joined #dri-devel
tzimmermann has quit [Remote host closed the connection]
tzimmermann has joined #dri-devel
slattann1 has joined #dri-devel
slattann has quit [Ping timeout: 480 seconds]
<MrCooper>
tzimmermann: sounds like maybe the overalloc should be clamped to the maximum possible instead of failing (as well as possibly fixing the maximum)?
thellstrom has quit [Quit: thellstrom]
mlankhorst has joined #dri-devel
hansg has quit [Remote host closed the connection]
<tzimmermann>
MrCooper, i thought the same. but the maximum is not clearly defined. the core/helpers expect virtual resolutions, while drivers seem to be setting physical resolutions. i guess we should clarify the docs
<tzimmermann>
well, at least i now have an idea of what to do about this
flacks has quit [Quit: Quitter]
flacks has joined #dri-devel
<vsyrjala>
mode_config.max_{width,height} is the max fb size like the docs say. if drivers want to limit the max display mode dimensions they can do it in .mode_valid/etc.
<daniels>
Kayden: the answer to all your eglExportDMABUFImageMESA problems is just to not use it tbqh
adjtm has quit [Ping timeout: 480 seconds]
kts_ has joined #dri-devel
kts has quit [Ping timeout: 480 seconds]
elongbug has joined #dri-devel
Company has joined #dri-devel
thellstrom has joined #dri-devel
kts_ has quit []
kts has joined #dri-devel
slattann1 has quit []
adjtm has joined #dri-devel
Viciouss7 has quit []
Viciouss has joined #dri-devel
muhomor has quit [Ping timeout: 480 seconds]
kts has quit [Quit: Konversation terminated!]
itoral has quit []
Hi-Angel has quit [Ping timeout: 480 seconds]
thellstrom has quit [Quit: thellstrom]
Hi-Angel has joined #dri-devel
thellstrom has joined #dri-devel
rbrune has quit [Ping timeout: 480 seconds]
Hi-Angel has quit [Ping timeout: 480 seconds]
Hi-Angel has joined #dri-devel
hansg has joined #dri-devel
Peste_Bubonica has joined #dri-devel
muhomor has joined #dri-devel
<tzimmermann>
vsyrjala, so it's the maximum size of the virtual screen
muhomor has quit [Remote host closed the connection]
<tzimmermann>
most drivers seem to treat it like the physical size
<vsyrjala>
should fix them then i guess :) this was discussed a while back iirc, and we updated the docs a bit at the time
<vsyrjala>
i think someone suggested adding another set of things for the other limits, but imo that's a bit pointless as you can do that in the driver .mode_valid hook
<vsyrjala>
also often the timings have much more complicated limits than just two simple max values
<tzimmermann>
vsyrjala, that makes some sense
<vsyrjala>
eg. intel_mode_valid() checks for a lot of other limits
muhomor has joined #dri-devel
Peste_Bubonica has quit [Remote host closed the connection]
Hi-Angel has quit [Quit: Konversation terminated!]
Hi-Angel has joined #dri-devel
Peste_Bubonica has quit [Quit: Leaving]
JohnnyonF has quit [Ping timeout: 480 seconds]
sarnex has quit [Read error: Connection reset by peer]
sarnex has joined #dri-devel
<pinchartl>
has anyone given a thought about how the move to Bazel in Android could affect (in a positive way) building Mesa for AOSP ?
<pinchartl>
(my hopes are close to none based on previous experience with AOSP, I'd like to have a good surprise for once)
<bnieuwenhuizen>
wait, it is Bazel now?
<ddevault>
if you value your sanity you will steer well clear of bazel
<pinchartl>
they're switching from Soong with blueprint to Bazel with starlark as far as I can tell
<ddevault>
the day mesa is built with bazel is the day mesa makes an enemy out of every linux distro
<pinchartl>
ddevault: is it worse than Soong ?
<ddevault>
I am not familiar with soong
<ddevault>
but I am familiar with bazel
<pinchartl>
the concepts are similar as far as I can tell
<ddevault>
frightening.
<vsyrjala>
the faq doesn't address meson vs bazel. i guess not enough people asked that question
<ddevault>
the effort required to maintain a functional bazel system is on par with the effort expended for the rest of the project it builds
<pinchartl>
it should be meson + bazel
<pinchartl>
as in being able to integrate a meson-based project into the overall build system of AOSP
<vsyrjala>
well they address make/ninja vs. bazel. not make/ninja+bazel either
<bnieuwenhuizen>
I found bazel quite nice to use actually as long as you don't have too many config needs. That said we're not in charge of making build decisions for Android here
<pinchartl>
if we were, the end result would be much saner ;-)
<pinchartl>
the conclusion, when looking at integrating with their current build system, was along the lines of "no way". I was wondering if the new one could be better
<danvet>
tzimmermann, MrCooper yeah I agree with clamping
<danvet>
mode_config.max_h/w are enforced hints for kms users, as in your drm_fb shouldn't be bigger
<danvet>
but also it's ofc a bit silly, since you can always hide a much bigger buffer and then trim it with stride and offset
<danvet>
but since fbdev emulation is the kms users here, clamping the overallocation to the mode_config limits sounds like the right thing to do
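The clamping idea amounts to a one-line calculation. The sketch below assumes the overalloc setting is a percentage applied to the mode height; the function name is made up and this is not the actual kernel code.

```python
def fbdev_surface_height(mode_height, overalloc_percent, max_height):
    """Scale the fbdev surface height by the overalloc percentage, then
    clamp it to the framebuffer limit instead of failing outright."""
    requested = mode_height * overalloc_percent // 100
    return min(requested, max_height)


# Limit equals the screen size: overalloc is silently clamped away.
clamped = fbdev_surface_height(768, 200, 768)
# Roomier limit: the full 200% overallocation is kept.
unclamped = fbdev_surface_height(768, 200, 4096)
```

With the clamp, the case where max_height equals the screen height degrades to no overallocation rather than no console.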
<danvet>
tzimmermann, mripard, mlankhorst I guess I don't get a respinned drm-misc-fixes?
<danvet>
some of the patches in there are almost a month old by now, that's not good for committed -fixes
rgallaispou has quit [Read error: Connection reset by peer]
mattrope has joined #dri-devel
adjtm has quit [Remote host closed the connection]
adjtm has joined #dri-devel
gawin has quit [Ping timeout: 480 seconds]
Ahuj has quit [Ping timeout: 480 seconds]
Duke`` has joined #dri-devel
<robclark>
pinchartl: jstultz might know something about android/AOSP build plans?
<pinchartl>
when we discussed the topic last week during LPC, I think he shared my despair :-)
mlankhorst has quit [Ping timeout: 480 seconds]
gawin has joined #dri-devel
jessica_24 has joined #dri-devel
nchery has joined #dri-devel
nchery has quit [Remote host closed the connection]
kmn has quit [Quit: Leaving.]
pushqrdx has joined #dri-devel
<pushqrdx>
something weird suddenly happened and i don't have enough graphics programming knowledge to explain it, for some reason even though i don't have any compositor running and i am on modesetting/mesa driver which usually tears like crazy in firefox
<pushqrdx>
suddenly firefox stopped tearing lol
nchery has joined #dri-devel
<pushqrdx>
been like that for several hours already no tearing at all
<pushqrdx>
usually tearing might stop for a few moments because by luck scrolling is aligned with vblank but not for hours
<vsyrjala>
is firefox fullscreen?
<pushqrdx>
no
gawin has quit [Ping timeout: 480 seconds]
tzimmermann has quit [Quit: Leaving]
kts has joined #dri-devel
kts has quit []
kts has joined #dri-devel
anujp has quit [Ping timeout: 480 seconds]
idr has joined #dri-devel
tobiasjakobi has joined #dri-devel
tobiasjakobi has quit [Remote host closed the connection]
<FLHerne>
pinchartl: At least people have *heard* of Bazel
<FLHerne>
that's got to help somehow
<pinchartl>
depends, if hearing of Bazel makes people flee, it may not help :-)
<dcbaker>
pinchartl: I know I joke about Google inventing a new build system every week, but...
<dcbaker>
Bazel is another one of those "Designed to solve only Google's problems" build systems
<dcbaker>
options are hard
<dcbaker>
but if you happen to have a cluster of 1000000000 cpus, it can make use of them :)
<pinchartl>
:-)
<pinchartl>
I expect both Soong and Bazel to have similar issues, as they're designed to address similar use cases
macromorgan has quit [Read error: Connection reset by peer]
<pinchartl>
but I was wondering if, by any chance, Bazel could also happen to help our problems. not by design of course, just by chance :-)
slattann has joined #dri-devel
macromorgan has joined #dri-devel
<dcbaker>
yeah, but Bazel is written in Java :)
<dcbaker>
I think it leaves us exactly where we were before, we either need to teach Bazel to understand meson, or meson to output bazel
<dcbaker>
pinchartl: and I feel even less motivated this go around than I did last time :/
<danvet>
robclark, huh that sounds very busted
<robclark>
danvet: thanks to finch experiment (basically a type of A/B experiment) I was testing with SkiaRenderer (which doesn't have this issue) instead of legacy GLRenderer (which does).. otherwise I would have noticed the problem sooner
<danvet>
robclark, I'm pretty sure iris uses one hw context per gl context, and we do let them free-float
<pinchartl>
dcbaker: I can't blame you. I thought bazel had support for python extensions though. maybe I'm mistaken
<danvet>
robclark, imo this is a gl spec question
<danvet>
or maybe a mesa-should-quirk-stuff question
<robclark>
then it is probably broken w/ GLRenderer if implicit sync is disabled ;-)
<robclark>
and I agree, it is a bad assumption by GLRenderer, but I don't think that is going to get fixed at this point
<danvet>
Kayden, ^^
<dcbaker>
pinchartl: It might. But Soong had support for Go extensions, and when I asked about that they said "Oh, only Google gets to use those", essentially
<danvet>
robclark, yeah still, hurting vk and everyone because of a single gl app seems wrong
kokoz has quit []
gitautas has joined #dri-devel
<danvet>
robclark, or is the hilarious problem that drm/sched conversion now breaks userspace?
<robclark>
I'm not really completely convinced that having a single "ring" per priority per process level hurts vk or anyone..
<robclark>
right
<gitautas>
Hi! I'm trying to develop an ultra-low-latency program that captures the framebuffer and encodes it at the hardware level, without copying the frames over to system memory. My questions are: is there some way to capture the framebuffer on AMD cards? I know that there is an API called RapidFire, but that seems to be Windows-only. Another one is whether there's some kind of abstraction for NvFBC and NvENC? I have an MVP program that works but debugging
<gitautas>
maintaining it is a complete pain.
gouchi has joined #dri-devel
<ajax>
robclark: by "some apps" you mean opengl explicitly says using MakeCurrent like that is sufficient for synchronization
<ajax>
(unless you turn it off, and by all means please do, but MakeCurrent implies glFlush)
<pinchartl>
dcbaker: right, the same probably applies to bazel extensions. android is such a joke...
<pinchartl>
maybe they should attend some "build your community the right way" workshops, there's lots of them these days
<dcbaker>
I see the same problem around Intel, people who don't understand F/OSS and are only solving their own problems
<pinchartl>
or maybe it's just me who has missed the "disregard your community, you're way better than them" workshops
<dcbaker>
It's a very corporate mentality
<pinchartl>
I think it's particularly present in the Android team
<pinchartl>
there's a lack of humility
ybogdano has joined #dri-devel
alyssa has joined #dri-devel
<dcbaker>
:/ yeah
<alyssa>
cwabbott: Super happy to see the preamble stuff reusable across drivers
<alyssa>
jekstrand: How reusable is NIR constant folding?
<jekstrand>
alyssa: What do you mean?
<alyssa>
I know anholt_ has talked about doing a whole NIR interpreter, which would be Work™, but what about just the subset needed for uniform-constant ALU?
<dcbaker>
pinchartl: on the awesome side though, scipy is moving to meson :)
<robclark>
ajax: well, not 100% sure what GL says, but if we have multiple sched-entities per priority level per process, then doing kernel ioctl in the correct order is not sufficient to enforce execution on the GPU in that same order (unless you rely on implicit sync)
<alyssa>
Once nir_opt_preamble lands, can we have a code path using nir_opt_preamble for only ALU and load_uniform, and then evaluate the preamble in CPU?
<alyssa>
and maybe do that in a driver-generic way?
<robclark>
pinchartl: curious if android is planning to build kernel with bazel? :-P
<alyssa>
I guess that can't really be driver-generic due to load_preamble/store_preamble being needed.
<alyssa>
For AGX, I don't care, Apple is happy to spawn a preamble shader to save a single ALU op
<jekstrand>
alyssa: Should be doable.
<alyssa>
For Mali, there's some fixed overhead to doing a preamble so for simple cases it's probably faster to just do it on the CPU
<jekstrand>
alyssa: The core of constant folding is nir_constant_expressions.c/h which has a helper that takes an op, a set of nir_const_value arrays, bit size, and number of destination components.
* alyssa
nods
<jekstrand>
nir_opt_constant_folding just calls that
<alyssa>
(For Mali, optimal is probably "small amounts of ALU, do it on the CPU. large amounts of ALU, or a texture load or something goofy, punt to hw")
<jekstrand>
Or, for that matter, you could replace all the load_uniform with load_const and fold the shader.
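The "replace load_uniform with load_const and fold" approach can be illustrated on a toy expression tree. This is a sketch under heavy assumptions: the tuple-based IR and the `fold` function are invented for the example and have nothing to do with NIR's real data structures or with nir_constant_expressions.c; only the opcode-style names are borrowed.

```python
import math

# Tiny op table; the real opcode set and its corner cases are far larger.
OPS = {
    "fadd": lambda a, b: a + b,
    "fmul": lambda a, b: a * b,
    "fsqrt": lambda a: math.sqrt(a),
}


def fold(expr, uniforms):
    """Substitute known uniform values and fold constant-only subtrees.

    Expressions are tuples: ("const", v), ("load_uniform", name),
    ("load_input", name), or (op, arg, ...).
    """
    kind = expr[0]
    if kind == "const" or kind == "load_input":
        return expr
    if kind == "load_uniform":
        return ("const", uniforms[expr[1]])
    args = [fold(arg, uniforms) for arg in expr[1:]]
    if all(arg[0] == "const" for arg in args):
        return ("const", OPS[kind](*(arg[1] for arg in args)))
    return (kind, *args)


# input * sqrt(uniform): only the sqrt can be evaluated ahead of time.
shader = ("fmul", ("load_input", "v"), ("fsqrt", ("load_uniform", "u")))
folded = fold(shader, {"u": 9.0})
```

The per-input multiply stays in the shader, while the uniform-only subtree collapses to a constant on the CPU.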
<alyssa>
too expensive
<pinchartl>
robclark: I'd love to see them trying :-)
<jekstrand>
alyssa: Sure
<alyssa>
(I mean. Is it? Hm. Maybe not. Doing NIR ops at draw time seems like a bad idea but..)
<pinchartl>
dcbaker: we're trying to move v4l-utils from autoconf to meson. there's lots of resistance from the V4L2 maintainer, but on the technical side, it's so much better
iive has joined #dri-devel
<jekstrand>
alyssa: When anholt_ was talking about an interpreter, the idea was to "compile" the NIR down to something a bit faster.
<jekstrand>
I don't remember exactly what we decided was reasonable
<dcbaker>
pinchartl: that will be awesome when it's done :)
<alyssa>
jekstrand: Yeah, for sure...
<jekstrand>
But, even if nir_constant_expressions.c doesn't do what you want, you've got the expressions in nir_opcodes.py. You can code-gen something else.
* alyssa
nods
<jekstrand>
Though I don't recommend that because, seriously, it's complicated.
<jekstrand>
To get all the corners right, in any case.
<alyssa>
It's a shame there's not a "easy" way to JIT nir to something reasonable.
<alyssa>
I guess abusing llvmpipe's compute support is an option but...
<alyssa>
that seems, err, heavy-handed.
<alyssa>
(and doesn't help anholt_ )
<jekstrand>
I don't know what the current best practices are for non-JIT interpreters
<alyssa>
(...actually now I'm wondering what "execute preamble shaders with llvmpipe" would look like. I am, horrifyingly, not disgusted by the idea.)
<alyssa>
I mean we have a whole software rasterizer infrastructure right there ....
<jekstrand>
heh
<jekstrand>
But then you'd have to load up LLVM. :P
<jekstrand>
Your chromebook might not have enough disk for that
<alyssa>
Ruuuude :-p
<pinchartl>
:-)
<zmike>
dcbaker: I thought cpp_args was a thing? is it not a thing?
<jekstrand>
Hrm.... I bet we could amortize the cost of "pure" C evaluation if we did it wide....
<robclark>
alyssa: I guess more of the CPU overhead would be just having to do CPU readback of constbuf.. the things you would actually be evaluating at draw time should generally be pretty straightforward expressions
<dcbaker>
Meson uses "cpp" for "C++"
<jekstrand>
*sigh*
<alyssa>
robclark: that's only high overhead for UBOs or tc
<dcbaker>
zmike: don't ask me why
heat has joined #dri-devel
<zmike>
dcbaker: right, right, I had forgotten this detail
<jekstrand>
dcbaker: Why? (You didn't say *I* couldn't ask. :P)
<dcbaker>
historical happenstance
<robclark>
alyssa: but you probably want TC ;-)
<dcbaker>
lol
<pinchartl>
dcbaker: while you're here, why does meson not support local variables in meson.build files ?
<alyssa>
robclark: eventually
<zmike>
dcbaker: while you're here, I don't suppose you've had a chance to fix that asan thing?
<zmike>
hahahah
<zmike>
quick everyone ask your meson questions!
<pinchartl>
zmike: :-)
<dcbaker>
lol
<jekstrand>
dcbaker: While you're here.... Nah, I've got nothin'. :)
<pinchartl>
I have multiple variables with the same name in different files, it works fine, but I'm concerned that one day I'll use one of those in a file before it's (re)defined and will introduce a bug
<jekstrand>
pinchartl: Yeah, I have that fear sometimes too
<dcbaker>
pinchartl: I think mostly because no one's ever written the code
<dcbaker>
there's been plenty of requests for it
<pinchartl>
it's not a *big* deal, just a nice to have
<jekstrand>
Even if it were something as simple as _foo is local
<jekstrand>
or a keyword "local foo = blarg"
<dcbaker>
but meson's architecture kinda assumes all variables have global scope, so it would be pretty non-trivial
<dcbaker>
zmike: not yet
<dcbaker>
it looks reasonable, I just need to figure out how long the argument has existed so we don't inject it in invalid cases
<zmike>
makes sense
<dcbaker>
pinchartl: I could solve it more easily with Meson++ because that uses a flat IR so file boundaries are easy to see and local variables could be folded away really quickly :)
<HdkR>
Interpreters you say? I know about interpreters. Why need a NIR interpreter?
<alyssa>
HdkR: silly shaders doing arithmetic on uniform expressions,
nsneck has joined #dri-devel
<alyssa>
can hoist it to a dedicated preamble shader / preshader / pilot shader,
<alyssa>
but might be faster to just evaluate on the CPU in some cases
<jekstrand>
HdkR: Also, so we can convert softpipe to NIR and finally delete TGSI.
<jekstrand>
nirpipe. :D
<alyssa>
(Things like `varying * sqrt(uniform)`)
<jekstrand>
But, also, CPU folding of uniforms.
<alyssa>
jekstrand: not to be confused with nerdpipe
<jekstrand>
So, when I get distracted by NIR refactoring, does that mean I've been NIRsniped?
<robclark>
we can't completely delete TGSI.. but maybe we can move it into virgl
<robclark>
:-P
<dcbaker>
alyssa: is it really that different?
<jekstrand>
I'd be ok with that resolution. :)
<alyssa>
jekstrand: Yes
<HdkR>
I can definitely understand hoisting to a pilot shader. But what hardware would it actually be faster to interpret on?
<alyssa>
HdkR: possibly Mali in certain circumstances but haven't benchmarked
<ajax>
robclark: i mean... internally mesa _isn't_ issuing fences around context binds... but it certainly could
<HdkR>
alyssa: Interesting...
<alyssa>
since spawning a compute shader has some overhead
<jekstrand>
HdkR: The problem isn't typically that it's faster to do it on the CPU but that doing it on the GPU is tricky and can introduce stalls.
<alyssa>
jekstrand: The opposite is also true though :-p
<alyssa>
hence, both :-p
<robclark>
ajax: true.. but also kernel broke userspace
<jekstrand>
Oh, doing it on the CPU can definitely introduce stalls. :)
<HdkR>
So the CPU needs to be fast enough, and also the GPU needs to be bad at pilot shaders? :P
<jekstrand>
Or not have pilot shaders
<alyssa>
HdkR: Literally describing MT8183 right there :-p
<HdkR>
jekstrand: That's the nvidia approach. Just make uniform math happen in the same shader :)
<robclark>
for adreno, it's a preamble, so really same shader
<HdkR>
32x speedup baby~
<HdkR>
I like the Adreno approach
<HdkR>
Sounds like there are multiple needs for a lightweight JIT with basic passes though. Something slightly smarter than just code emission
<ajax>
in fact
<ajax>
now that i've said that i want to make zink do it
<HdkR>
Poor zmike :D
<zmike>
what
<zmike>
ajax: let's focus up here buddy, one thing at a time
* zmike
starts rewriting lavapipe as soon as he says that
anujp has joined #dri-devel
<ajax>
i literally started typing "focus, adam" into irc, then switched machines so i could take the laptop out on the porch, and came back to being told to focus.
<zmike>
hahahah
<zmike>
watches synchronized
<ajax>
i want a lot of things though. i want a gles2 driver for the verité, if that's any indicator of the relative position of my head and the clouds
* zmike
takes a long blink
<alyssa>
ajax: I dunno if it makes sense to do preamble stuff in zink + gl drivers or vk drivers + gl drivers - zink or everywherez
<ajax>
i meant the "soften makecurrent's implicit flush into fencing" thing, not preambling
<alyssa>
oh
<alyssa>
also worth discussing preambling though
<zmike>
no
<zmike>
stop nerdsniping.
<alyssa>
zmike: discussing as in like
<HdkR>
I can amble on for a while, probably good to get some pre-amble stretches
<alyssa>
"does it make sense to make it the vk drivers problem so zmike doesn't have to" :P
<robclark>
it's already the vk driver's problem ;-)
<robclark>
so zmike is off the hook
pushqrdx has quit [Remote host closed the connection]
<Kayden>
robclark: i965 and iris have always used separate HW contexts per GL context...since Mesa was back at GL 3.0. Contexts aren't synchronized with one another beyond MakeCurrent implying glFlush. That's the app's job with fences.
<Kayden>
iris is weird in that it uses two HW contexts per GL context - one for render, one for compute, and lets work go out-of-order a bit there, but implicitly synchronized for data dependencies, so it should be transparent to the app
<robclark>
Kayden: the question is where drm_sched_entity fits in on the kernel side.. if you have multiple sched-entity there is nothing ensuring that execution on gpu is same order as flush to kernel
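The ordering concern can be shown with a toy scheduler model. Everything here is an assumption for illustration: the round-robin loop, the `pick_order` argument and the job names are invented, and the real drm_sched picks entities by its own rules. The sketch only shows that per-context queues let execution order differ from ioctl order, while a single shared queue cannot.

```python
from collections import deque


def run(submissions, entity_of, pick_order):
    """Execute jobs and return them in execution order.

    submissions: list of (context, job) in the order the ioctls were made.
    entity_of:   maps a context to the scheduler entity (queue) it uses.
    pick_order:  order in which this toy scheduler visits the entities.
    """
    queues = {}
    for ctx, job in submissions:
        queues.setdefault(entity_of(ctx), deque()).append(job)
    # Guard against an entity that would never be serviced.
    assert set(queues) <= set(pick_order)

    executed = []
    while any(queues.values()):
        for entity in pick_order:
            if queues.get(entity):
                executed.append(queues[entity].popleft())
    return executed


ioctl_order = [("ctxA", "draw"), ("ctxB", "readback")]

# One entity per context: the scheduler may happen to visit ctxB first.
per_context = run(ioctl_order, lambda ctx: ctx, ["ctxB", "ctxA"])
# One entity per process: a single FIFO preserves ioctl order.
per_process = run(ioctl_order, lambda ctx: "proc", ["proc"])
```

Without a fence or implicit sync between the two jobs, nothing in the per-context model forces "draw" to run before "readback".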
<Kayden>
(eventually when we finally get the fabled compute command streamer we've been promised they'll be able to run in *parallel* which should be really fun...)
<Kayden>
I'm pretty sure drm_sched_entity is tied to HW context in i915, but I haven't honestly read that code
ybogdano has quit [Ping timeout: 480 seconds]
<robclark>
initially on kernel side, sched-entity was mapped 1:1 with submitqueue (kernel side counterpart to userspace context)
<Kayden>
so yeah, it sounds like that would break on i965/iris
<Kayden>
that really is a flawed assumption and a broken app
<Kayden>
codifying that into kernel behavior seems like it will come back to bite you
<Kayden>
as far as I know anyway...
<Kayden>
if anything it seems like the kind of thing that might be better as a mesa driconf
<Kayden>
(which sounds a bit awful)
<robclark>
I think ajax's assessment was gl says MakeCurrent() is a sufficient barrier for multi-ctx rendering on a single thread.. possibly mesa should be creating fences for this.. but OTOH kernel broke userspace
<Kayden>
that...may be true
<Kayden>
your patch says "one sched entity per *process* per priority" though?
<Kayden>
so you're not allowing scheduling between contexts even if they're always running in separate threads?
<Kayden>
that would definitely be oversynchronizing
<robclark>
there isn't really a good way to differentiate between multiple threads vs single thread.. OTOH things get serialized when they are written into the ring
slattann has joined #dri-devel
slattann has quit []
<ajax>
actually i'm not completely convinced of what i said
<ajax>
MakeCurrent is Flush but it's not Finish
<robclark>
the question is, whether you can assume any ordering of what happens after Flush
X-Scale has joined #dri-devel
<ajax>
so... command submission might be ordered between the two ctxs, but there's no requirement that they complete in order, or be externally visible after flushing
<robclark>
that would lean towards "app bug"
<ajax>
if they went down the same command queue it'd have the effect of a finish, because the readback command would issue after the draws flushed.
danvet has quit [Ping timeout: 480 seconds]
<ajax>
but that's implementation detail, i think
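The Flush-versus-Finish point above can be illustrated with a toy model. This is only a sketch under stated assumptions: `Queue`, `run`, and the job durations are hypothetical names and numbers invented for illustration, not Mesa or kernel code. It assumes each queue runs its jobs strictly in order, and that separate queues run independently of one another.

```python
class Queue:
    """Toy in-order queue: jobs execute one after another (hypothetical model)."""

    def __init__(self):
        self.busy_until = 0

    def submit(self, now, duration):
        # A job starts once the queue is free and returns its completion time.
        start = max(now, self.busy_until)
        self.busy_until = start + duration
        return self.busy_until


def run(shared_queue):
    draw_q = Queue()
    # With a shared queue both contexts feed the same in-order stream.
    read_q = draw_q if shared_queue else Queue()
    # Context A: the implicit flush at MakeCurrent submits a long draw at t=0.
    draw_done = draw_q.submit(now=0, duration=5)
    # Context B: becomes current and submits a short readback right afterwards.
    read_done = read_q.submit(now=0, duration=1)
    return draw_done, read_done


print(run(shared_queue=True))   # readback completes after the draw
print(run(shared_queue=False))  # readback can complete before the draw
```

In this model, submission order is identical in both cases; only the shared queue happens to give Finish-like behavior, which matches the "implementation detail" reading above.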
<robclark>
at any rate.. I think it is a kernel-broke-userspace.. if we want to re-allow multiple ctx's in same process to complete out-of-order, I think we need to introduce a new flag in submitqueue-create ioctl so new-userspace can opt-in
pushqrdx has joined #dri-devel
pushqrdx has quit []
<ajax>
i mean... i kind of want to enable that, but i also much prefer everyone turning the implicit flush off and using real sync apis like a grown up
<zmike>
seems like it's from the generated code and not actually anything related to lavapipe though?
<pushqrdx>
what are the odds that by luck firefox is still somehow synchronized with vblank and doesn't tear for literally hours without using a compositor and on modesetting/intel driver
shashank_sharma has quit [Ping timeout: 480 seconds]
gawin has quit [Quit: Konversation terminated!]
heat has quit [Ping timeout: 480 seconds]
<Company>
how expensive is eglCreateWindowSurface()?
<Company>
will I be unhappy if I call it every frame? every few seconds? more than once per window?
elongbug has quit [Ping timeout: 480 seconds]
<ajax>
approximately the cost of allocating the default framebuffer attachments
<Company>
context: We're wondering about switching from U8 to FP16 if somebody suddenly opens a HDR wide gamut image in eog or whatever
<Company>
so that's doable, we could just live without application-level API and dynamically switch as needed
<ajax>
it's a one-time thing though, really. you do it when the underlying winsys window changes, and the pixel format is immutable for a given window
<Company>
oh
<Company>
so we have to decide on U8 vs FP16 once and then stick to it?
mctom has joined #dri-devel
<ajax>
for a given EGLSurface, yes. if you want to change then you need to work out with your presentation manager how to convey the idea that the fp16 window replaces the u8 one
<ajax>
not egl's job
<Company>
I'd just eglDestroySurface() the old one
<ajax>
or: would only be if someone decided to reflect the wayland api up as like eglHigherDefsPlzWL(dpy, oldsurf, newsurf)
<Company>
and eglCreateWindowSurface() a new one
<ajax>
yeah
<Company>
in my mind it's just the buffer manager for the wl_surface
<Company>
so it's like creating a new shm_pool or whatever
<jekstrand>
zmike: Uh.... weird.
<jekstrand>
zmike: Why is that test even running in CI? I wouldn't think that extension was exposed in a CI container.
<zmike>
no idea, but it should be notsupported I think
<zmike>
and yet
danvet has joined #dri-devel
<clever>
where can i find more information on the TFU hw and UIF format used on at least the bcm2711?
<alyssa>
robclark: giggle, fair enough
<alyssa>
I would've hoped vk apps were less dumb about this
mctom has quit []
<ajax>
jekstrand: given we're +1 thread for wsi already, is there any reason not to use two? it's really ugly for that one thread to manage two queues because there's two unrelated ways it can block so there's no good way to shut it down
<jekstrand>
ajax: I see no reason why not, assuming we have a good reason for two.
<ajax>
but you have to, because it allocates, because xcb is an unrelenting fount of joy
<jekstrand>
ajax: We're already at "WSI is hard; let's spawn threads" so meh
<ajax>
no no. it's the right approach.
<jekstrand>
Also, I don't think you need the "cb" in that statement. :P
<ajax>
fair
* ajax
cracks knuckles, red bull
<jekstrand>
But, yeah, if we need two threads, that's fine. We're already spawning behind the app's back so I see no reason why 2 is worse than 1.
* jekstrand
has the nagging feeling that he's going to be asked to review something before too long
<ajax>
i should be so lucky
danvet has quit [Ping timeout: 480 seconds]
<jekstrand>
Better idea: Burn a few of my now infinite "owe you one" points to make daniels review it. :P
<txenoo>
clever, TFU is the V3D Texture Formatting Unit, it is mainly used to convert different formats into UIF formatted textures.
<clever>
txenoo: and what exactly is UIF? google cant find much
<txenoo>
TFU main usage in V3D is to convert linear textures to UIF format, so the GPU can sample from them.
<clever>
ive not seen it called UIF in the old docs, the new docs are entirely absent, and the old v3d lacked a TFU
<clever>
just trying to confirm, is UIF the same as what is described on page 105?
<txenoo>
No it is a different format. I think that the layout is more complex, at least the code that handles it. There is code in mesa to upload linear textures using this UIF format.
<clever>
ah
<clever>
so there are 2 different tiled formats at play
<clever>
old v3d (pi0 to pi3) uses whats in this pdf
<jenatali>
Does hardware care about a distinction between shadow samplers and non-shadow samplers? Looks like vtn never labels samplers as shadow samplers. I'm wondering if there's already a pass to add that info into sampler types
<clever>
and new v3d (bcm2711) uses a different one called UIF
<clever>
and new v3d, has a dedicated TFU unit, to handle conversion, including yuv inputs
<alyssa>
jenatali: isn't shadow <===> used with a comparison?
mlankhorst has joined #dri-devel
<anholt_>
clever: there's C code in Mesa for it.
<jenatali>
alyssa: Yeah
<jenatali>
DXIL requires the sampler variable declaration to match the usage
<clever>
src/broadcom/vulkan/v3d_tiling.c: * UIF is the general VC5 tiling layout shared across 3D, media, and scanout.
<clever>
anholt_: ah, i can just read a file like this for more info then
<alyssa>
jenatali: so no, hardware wouldn't care since it'd be embedded in the sampler instruction, there aren't declarations.
<jenatali>
alyssa: Then why is it in the sampler properties in the glsl_type at all?
<clever>
anholt_: i'm also having some trouble getting the vec online with bcm2711, i can feed it the needed 108mhz clock, and get it generating a valid ntsc signal
<clever>
anholt_: but the muxing around the hvs/pv isnt entirely clear to me, and it only ever generates a solid black image
<anholt_>
all of my knowledge of rpi clocks is in the linux kernel clock driver.
<clever>
anholt_: i think the clock problem is entirely solved
<clever>
i can measure the hsync/vsync rate with a scope, and confirm that it is clocked correctly
<clever>
its just the mux to route things thru the hvs/pv/vec pipeline, that i havent solved
<anholt_>
similarly, all my knowledge of tv output on rpi is in the vc4 vec code in the linux kernel.
<clever>
anholt_: there are some things entirely absent from the linux source, like the arm core being unable to even touch HVS registers until something happens
<clever>
it reminds me of how the DSI drivers cant touch DSI regs, and have to cheat via the dma or mailbox
<alyssa>
by logical deduction, anholt_ has no knowledge of this.
<clever>
or isnt at liberty to say
<anholt_>
given that linux touches those regs, if you find you can't touch those regs, I would suspect that you haven't powered the block on.
<txenoo>
clever, UIF cannot be used for scanout.
<clever>
and only what is in the source, has been approved for release
<clever>
anholt_: that is what every engineer has said, and that is totally wrong
<anholt_>
I really don't have any secrets here.
<clever>
anholt_: given that i have full xfce running on ntsc out, its definitely bloody on
<anholt_>
maybe there's some axi bridge flag somewhere for hvs from arm, I dunno.
<clever>
yeah
<clever>
thats my theory
<clever>
and every RPF guy i have talked to, said power domains, and then fell silent
<clever>
but the arm cant touch HVS, so no pageflips, just a dumb static framebuffer
<clever>
txenoo: does the bcm2711 v3d output uif or linear for the final render?
<clever>
vc4 v3d had: linear raster, t-format, and lt-format
slattann has quit []
jekstrand has quit [Ping timeout: 480 seconds]
<jstultz>
robclark: pinchartl: I unfortunately don't know about plans. When I heard about the bazel changes (not knowing anything about bazel) I wondered aloud if it might help with the external integration issues, but the response from google devs was approximately "not likely"
<agd5f>
clever, on older AMD GPUs, interlaced timings didn't work with tiled surfaces. You had to use linear. maybe you have a similar limitation?
<clever>
agd5f: the ntsc problem is present when using linear images, i'm just also researching the v3d in parallel
<agd5f>
ok
<clever>
agd5f: https://www.youtube.com/watch?v=u7DzPvkzEGA this would be an interlaced scanout, with 20 moving framebuffers (all being downscaled), plus one full-screen bg layer
<clever>
the glitching happens when there is too much input data on the same scanline, and it cant generate data in time
<agd5f>
sounds like a watermark or line buffer issue
<clever>
yeah, i believe the glitching at 20 is a memory bandwidth issue
<clever>
its down-scaling on the fly, so it has to fetch a lot more image data than you're seeing
<clever>
and there is no cache, so it has to re-fetch it for every sprite on the scanline
<clever>
pre-scaling in ram would make it perform better, but my main goal is to push it over the edge, to learn where the edge is
<jstultz>
pinchartl: and while I don't have any love for the android build system, i'd also not paint a whole team with the big "lack of humility" brush, as that's a common problem i see with individuals everywhere. :)
jekstrand has joined #dri-devel
<txenoo>
clever, the display driver needs to be fed linear raster, v3d can render to UIF or linear, but if it goes to display, linear needs to be used.
<clever>
ah, same as old v3d
<clever>
i assume rendering to a tiled format, is only for when you want to re-use it as a texture in a second pass
<clever>
a cheap way to implement a security camera in a video game, for ex
<txenoo>
clever, yes
<clever>
ive also seen cases of game console emulators not handling this well
<clever>
one n64 game (mario kart i think) had a jumbotron on the course, that showed the main camera render
<clever>
but for cases like the v3d, that means you have to render the scene twice, once to linear, and once to uif
<clever>
so some emulators have a flag to disable that feature
<clever>
i'll have to re-read all of the hvs code, and find the answer to my muxing problem
gawin has joined #dri-devel
hansg has quit [Quit: Leaving]
<daniels>
jekstrand: I got your notification but the message isn’t showing - what am I reviewing for you after I’m done with zmike?
<pinchartl>
jstultz: fair point, I accept the criticism ;-)
markus has quit [Ping timeout: 480 seconds]
markus has joined #dri-devel
<pushqrdx>
if someone wanted to implement a tearfree option (outside the compositor) at a lower level, where would be the best place for that, i am trying to understand the graphic stack on linux
<pushqrdx>
and i wonder why doesn't modesetting and/or mesa provide something like triple buffering or vsync
Bennett has joined #dri-devel
<pushqrdx>
tbh i am a bit confused as to where mesa stands in the stack, is it the same thing as modesetting
<ajax>
no
<pushqrdx>
modesetting is part of x then? or is the kernel module
<ajax>
there's a generic driver for X named modesetting, which more or less works with every kernel driver that implements a feature set that is also named modesetting.
<ajax>
'kms' tends to mean the kernel part specifically
<pushqrdx>
so modesetting can be thought of as an interface, and the modesetting driver for X is just an implementation of that for xorg, modesetting the interface is what talks to the kernel side of things then
<clever>
pushqrdx: my understanding is that modesetting and kms, is an api to do a collection of tasks
<clever>
1: configure the resolution of a video output port
<clever>
2: allocate memory for framebuffers
<clever>
3: configure what xy coord a framebuffer is rendered at
<clever>
4: perform atomic flips between framebuffers
<clever>
opengl drivers are sometimes implemented as a kms device, that can only do 3d render, and spits out a framebuffer when done
<clever>
and the framebuffer handles can be exchanged between devices, so you can display it on a given output
<clever>
but that only holds true for full screen rendering, when you have complete control of the video output
<clever>
X11/wayland complicates matters
<jekstrand>
daniels: ajax rewriting the Vulkan X11 WSI code. :)
Duke`` has quit [Ping timeout: 480 seconds]
<pushqrdx>
so if we think about the route a draw call takes from an xclient to hardware is this somewhat accurate? xclient -> xserver -> xserver-driver(xf86-xxx) -> mesa -> drm -> kernel -> hardware
gawin has quit [Ping timeout: 480 seconds]
ybogdano has joined #dri-devel
<bnieuwenhuizen>
pushqrdx: draw calls themselves go xclient -> mesa -> libdrm (for some drivers) -> kernel -> HW
<bnieuwenhuizen>
assuming you don't use indirect rendering (and you mostly shouldn't)
<bnieuwenhuizen>
then to display the output you're mostly right
<pushqrdx>
bnieuwenhuizen oh so mesa cuts down on the trip to xorg?
<daniels>
jekstrand: I think that might push you into negative credit tbh
<bnieuwenhuizen>
basically, assuming you're using GL/vulkan, only the final buffer from e.g. glXSwapBuffers gets communicated to X
<jekstrand>
daniels: I fear we may be experiencing some pretty serious credit inflation then...
<jekstrand>
Back in my day, credit was worth something... You could get a whole patch series review for half a credit.
<jekstrand>
</old man voice>
pushqrdx_ has joined #dri-devel
rasterman has quit [Quit: Gettin' stinky!]
<ngcortes>
heads up folks, mesa ci will be down this afternoon while we replace a failing disk. Things should be back online by tomorrow. we'll keep everybody posted.
pushqrdx has quit [Ping timeout: 480 seconds]
gawin has joined #dri-devel
<robclark>
all of mesa-CI or just certain runners? In latter case you should push an MR to disable the necessary jobs
vivijim has quit [Ping timeout: 480 seconds]
<anholt_>
robclark: ngcortes's thing is intel's private CI.
<robclark>
ahh, ok
pnowack has quit [Quit: pnowack]
<daniels>
phew …
<daniels>
ngcortes: would be helpful if you said ‘intel CI’ in future!
<ngcortes>
anholt_, daniels my bad! I did mean the intel mesa ci
<daniels>
jekstrand: back in my day people were thrilled when their winsys gained the ability to understand hotplug …
<daniels>
ngcortes: no problem, good luck with the firefighting!
<ngcortes>
daniels, thanks!
Hi-Angel has quit [Ping timeout: 480 seconds]
pushqrdx has joined #dri-devel
gouchi has quit [Remote host closed the connection]
pushqrdx_ has quit [Ping timeout: 480 seconds]
<ajax>
i think i might hate xcb
<Venemo>
do you hate it less than xlib?
<ajax>
not sure
<ajax>
a few years ago the answer would have been "no" quite quickly though
<Venemo>
really? I thought xcb was made to be saner than xlib
<HdkR>
I've still yet to find the documentation on the xcb extensions. I have no idea where distros build their libraries from
<HdkR>
Definitely isn't in the primary libxcb repo :|
gitautas has quit [Remote host closed the connection]
mlankhorst has quit [Ping timeout: 480 seconds]
nchery is now known as Guest1538
nchery has joined #dri-devel
Guest1538 has quit [Read error: Connection reset by peer]
ybogdano has quit [Ping timeout: 480 seconds]
angerctl has quit [Quit: reboot]
Namarrgon has joined #dri-devel
iive has quit []
Namarrgon has quit [Remote host closed the connection]
Namarrgon has joined #dri-devel
ybogdano has joined #dri-devel
<ngcortes>
anyway, intel mesa ci should be back online by tomorrow. I've taken it offline while the new disk syncs (ci is super slow during this process so I took it offline to prevent it from bottlenecking)
<jekstrand>
dcbaker: It'd be nice if you could give it a look. I think it's mostly fine except a bunch of the generators are still in the util/ directory but run from the runtime/ directory.
<jekstrand>
I don't remember what the BKM for cross-directory python deps is or I'd have moved the generators too.
<jekstrand>
Once we've got that, then I can move things like anv_shader_compile_to_nir into common code (with a better name, of course)
JohnnyonFlame has joined #dri-devel
aswar002 has quit [Quit: No Ping reply in 180 seconds.]
mdnavare has quit [Remote host closed the connection]
aswar002 has joined #dri-devel
<jekstrand>
I'm contemplating whether or not it's practical to have a fully shared vk_pipeline_cache implementation. That seems like it'd be a good idea. Just have to figure out how it'd work.
mdnavare has joined #dri-devel
<jekstrand>
Probably something that caches blobs?
unerlige has quit [Remote host closed the connection]
unerlige has joined #dri-devel
<jekstrand>
RADV has a hand-rolled hash table
X-Scale` has joined #dri-devel
<jekstrand>
airlied: Why'd you hand-roll the hash table in radv_pipeline_cache? It seems everyone has copied+pasted your hand-rolled hash table instead of using struct hash_table like ANV does. :-/
<jekstrand>
If there's a good reason for it, I'm all ears.
<HdkR>
Mesa uses xxhash for its hash table now right?
<jekstrand>
Ugh... There's so much pipeline cache copy+pasta... :-(
<jekstrand>
HdkR: I think so.
<jekstrand>
HdkR: Yes, we do
<jekstrand>
But not for the hash set. ¯\_(ツ)_/¯
<HdkR>
Could have been before Anthony replaced the hash with xxhash, so it wasn't performing as well as it could? :P
X-Scale has quit [Ping timeout: 480 seconds]
<HdkR>
Looks like it was only a year ago now
<jekstrand>
HdkR: Or it could be that ANV originally had that hash table krh hand-rolled and then radv copied+pasted it before we switched to struct hash_table. :P
<HdkR>
That too
<jekstrand>
And everyone else copied+pasted from RADV
<HdkR>
That's definitely the simpler answer
<bnieuwenhuizen>
jekstrand: wondering if that had anything to do with trying to use Vulkan allocators?
<bnieuwenhuizen>
I mean that is a typical NIH cause
<bnieuwenhuizen>
then again it seems to use plain malloc in radeonsi so maybe not :P
<bnieuwenhuizen>
in radv*
<jekstrand>
According to 10f9901bcef7724cb72fb2fe7e3dd8d6660d2f34, it was because krh wanted to store the shader binaries themselves in the pipeline cache like we did in the i965 shader cache.
<bnieuwenhuizen>
wdym store the binaries themselves?
<jekstrand>
I mean the pipeline cache contains a BO full of shader binaries and you execute directly from the cache.
<jekstrand>
But then we realized that you can't actually do that because clients are allowed to delete the pipeline cache after calling vkCreateGraphicsPipelines().
<bnieuwenhuizen>
ah
<jekstrand>
That patch has a bugzilla link.... Tells you how old it is. :D
<bnieuwenhuizen>
want me to write a patch to convert radv or is a mass patch incoming?
<jekstrand>
bnieuwenhuizen: Once !13156 lands, I'm thinking of making a common vk_pipeline_cache implementation and then switching all the drivers over to it.
<jekstrand>
If I do it right, everyone will also get NIR caching and disk caching for free.
<jekstrand>
And the pipeline_cache_control extension, maybe? Whichever one it is that lets you control locking.
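The "something that caches blobs" idea above could look roughly like the following. This is a minimal sketch, not the eventual vk_pipeline_cache design: `BlobCache`, `key_for`, `lookup`, and `insert` are hypothetical names, and hashing the inputs with SHA-1 is an assumption of this example. The cache hands out immutable blobs that a pipeline keeps its own reference to, so destroying the cache after pipeline creation does not invalidate the pipeline.

```python
import hashlib


class BlobCache:
    """Toy cache mapping a hash of the compile inputs to an opaque blob."""

    def __init__(self):
        self._blobs = {}

    @staticmethod
    def key_for(*parts):
        # Length-prefix each part so (b"ab", b"c") and (b"a", b"bc") differ.
        h = hashlib.sha1()
        for part in parts:
            h.update(len(part).to_bytes(8, "little"))
            h.update(part)
        return h.digest()

    def lookup(self, key):
        # Returns the cached blob, or None on a miss.
        return self._blobs.get(key)

    def insert(self, key, blob):
        # Keep an immutable snapshot; the first insert for a key wins.
        return self._blobs.setdefault(key, bytes(blob))


cache = BlobCache()
key = BlobCache.key_for(b"shader source", b"compile options")
if cache.lookup(key) is None:
    binary = cache.insert(key, bytearray(b"compiled binary"))
del cache  # the pipeline's own reference to the blob stays valid
print(len(binary))
```

Because the stored values are opaque bytes, the same structure could in principle hold serialized NIR or final binaries, which is the property that makes sharing one implementation across drivers plausible.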
ybogdano has quit [Ping timeout: 480 seconds]
ngcortes has quit [Remote host closed the connection]
pcercuei has quit [Quit: dodo]
tursulin has quit [Read error: Connection reset by peer]