<jekstrand>
airlied: Feel free to push and marge if you want. Or I'll be back in the US morning and can do it then.
<airlied>
jekstrand: thanks! I might just add a fixes line just in case pointing to the error rework
<jekstrand>
fine with me
camus1 has joined #dri-devel
camus has quit [Ping timeout: 480 seconds]
mclasen has quit [Ping timeout: 480 seconds]
aravind has joined #dri-devel
karolherbst has quit [Remote host closed the connection]
danvet has joined #dri-devel
camus1 has quit []
camus has joined #dri-devel
Duke`` has joined #dri-devel
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
jewins has quit [Ping timeout: 480 seconds]
camus has quit [Remote host closed the connection]
camus has joined #dri-devel
pnowack has joined #dri-devel
sarnex_ has joined #dri-devel
sarnex has quit [Read error: Connection reset by peer]
aravind has quit [Read error: Connection reset by peer]
fluix has quit [Remote host closed the connection]
fluix has joined #dri-devel
kts has quit [Quit: Konversation terminated!]
aravind has joined #dri-devel
flto has quit [Ping timeout: 480 seconds]
flto has joined #dri-devel
tzimmermann has joined #dri-devel
tursulin has joined #dri-devel
gouchi has joined #dri-devel
rasterman has joined #dri-devel
<pq>
Company, color space is mostly about gamut while EDR is mostly about dynamic range, so the two are mostly orthogonal.
<pq>
Company, all the dynamic range parts of the protocol are still very much WIP, as you might be able to tell from the color-and-hdr docs.
<pq>
Company, there are actually three different approaches to dynamic range: a) EDR, b) ICC adding new tags to ICCv4 for it, c) the industry standard "static HDR metadata", d) ICC max (but no libs to use that), e) other industry standards. Ok, a lot more than three.
<pq>
Company, it's still a completely open question on which of those dynamic range descriptions need to be supported in the protocol explicitly. Maybe all.
<pq>
Company, ICCmax has the problem of no FOSS implementations that I know of. The ICCv4 tags are not there yet I think? But both could be supported without many changes to the protocol spec.
<pq>
Company, the bigger problem is what to do when there are multiple dynamic range "definitions" delivered via protocol. All of these overlap somehow, but they do not all carry the same pieces of information.
<pq>
Company, I'm working on an introduction document that might disambiguate things a bit.
<pq>
Company, with the standard ICCv4 profiles the dynamic range is essentially unknown. AFAIU it has no information at all about how the dynamic range in one profile might relate to the dynamic range in another profile, and the profile connecting space has no notion of it either.
tzimmermann_ has joined #dri-devel
tzimmermann has quit [Remote host closed the connection]
<pq>
Company, now that I think of it, I guess when using ICCv4 profiles people have just calibrated the dynamic range "away". The appropriate range depends on the viewing environment too.
sarnex has joined #dri-devel
Lucretia has joined #dri-devel
sarnex_ has quit [Read error: Connection reset by peer]
Duke`` has quit []
Duke`` has joined #dri-devel
gawin has joined #dri-devel
pushqrdx has quit [Read error: Connection reset by peer]
pushqrdx has joined #dri-devel
gawin has quit [Ping timeout: 480 seconds]
camus1 has joined #dri-devel
camus1 has quit [Remote host closed the connection]
camus has quit [Read error: Connection reset by peer]
camus has joined #dri-devel
pcercuei has joined #dri-devel
fxkamd has joined #dri-devel
camus has quit [Remote host closed the connection]
aperezdc has joined #dri-devel
camus has joined #dri-devel
kts has joined #dri-devel
<Company>
pq: the main thing I'm trying to understand - especially because the web has no API for that either afaics - is how that would have an effect on public application APIs/behavior
<Company>
but I still don't get how that is orthogonal to color spaces - isn't a larger dynamic range equivalent to a wider gamut?
mclasen has joined #dri-devel
YuGiOhJCJ has joined #dri-devel
ccr_ has joined #dri-devel
ccr_ has quit []
<pq>
Company, usually gamut refers to range of colors as in chromaticity. Dynamic range is about brightness of any color, not about which colors you can display.
ccr_ has joined #dri-devel
ccr_ has quit []
<pq>
Company, as in, you could have a dim red laser to show. Your monitor could show the brightness of it, but not the true (physical) color of it.
<pq>
iow, dynamic range is enough, but gamut is not, in that case.
<Company>
i always thought that's part of luminosity and included in color spaces
<pq>
Company, OTOH, if you define gamut such that it also includes dynamic range, then it includes dynamic range. I'm not sure which way is "better", but I often need to talk about color gamut irrespective of brightness (dynamic range).
<pq>
Company, that's because the term "color space" is casually used for almost everything, and most of those things are not even color spaces.
<pq>
my intro doc should help to disambiguate these
<pq>
...which is still WIP
<pq>
when people are talking about monitors, then "wide color gamut" very much refers to the chromaticity range and not the (luminosity) dynamic range.
MrCooper has quit [Remote host closed the connection]
MrCooper has joined #dri-devel
pushqrdx_ has joined #dri-devel
<Company>
but don't color spaces - like XYZ - include luminosity?
<pq>
it is represented in the coordinate system, yes
<pq>
maybe I should use more "color gamut" than "color space", too
<Company>
i mean, all those diagrams about gamut are 2D graphics, so I always just assumed the missing 3rd dimension was the luminosity
<pq>
however, while luminance may be part of the color space coordinate system, it might be normalized somehow, which makes it relative rather than absolute. That makes it hard to know what the actual luminance or dynamic range is.
<vsyrjala>
was just about to say the Y is just a relative luminance (ie. not measured in nits)
<pq>
It might be ambiguous whether "gamut" is 2D or 3D, but the term "color volume" explicitly includes the effect of luminance as well, and that leads to very interesting shapes.
<Company>
okay, that was helpful
<Company>
now I at least understand the problem
<mareko>
tarceri: if I wanted to reorder uniforms in the parameter list based on where they are used in the shader, where would I do it?
<pq>
Company, the problem with 2D gamut diagrams is exactly that they ignore the effect of luminance on the gamut of the display. The only color a display can show at its peak luminance is white.
<pq>
Company, if you add color saturation in any direction to the peak white, then you must reduce the intensity of at least one of the RGB components, which means your luminance goes down when your saturation increases until you hit the edge of the gamut.
<pq>
*color gamut
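A minimal sketch of pq's point about peak white, assuming linear-light RGB and the BT.709/sRGB luminance weights (the helper names are hypothetical, not from any library). Starting at peak white, the only way to add saturation is to lower some channel, so relative luminance drops:

```python
# BT.709 / sRGB luminance weights for *linear* RGB components.
WEIGHTS = (0.2126, 0.7152, 0.0722)

def relative_luminance(rgb):
    """Relative luminance Y of a linear RGB triple (1.0 = display peak white)."""
    return sum(w * c for w, c in zip(WEIGHTS, rgb))

def saturate_towards_red(amount):
    """Hypothetical helper: start from peak white and lower G and B by `amount` (0..1).

    The display is already at its maximum, so R cannot be raised; saturation
    can only come from reducing the other channels.
    """
    return (1.0, 1.0 - amount, 1.0 - amount)

for amount in (0.0, 0.25, 0.5, 0.75, 1.0):
    rgb = saturate_towards_red(amount)
    print(f"{rgb} -> relative Y = {relative_luminance(rgb):.4f}")
```

Note that Y here is relative (a fraction of the display's peak), not an absolute value in nits, which is the ambiguity vsyrjala mentions above.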
YuGiOhJCJ has quit [Quit: YuGiOhJCJ]
<pq>
hmm, that's actually a way to understand scRGB's negative color channel values, as a means to "exceed" your color gamut
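A rough sketch of how negative channel values arise, assuming the commonly quoted XYZ (D65) to linear sRGB matrix rounded to four decimals and the BT.2020 green primary chromaticity (0.170, 0.797) as the out-of-gamut color. The function names are made up for illustration, and scRGB encoding details are not modeled:

```python
# XYZ (D65) -> linear sRGB matrix, rounded to 4 decimals.
XYZ_TO_LINEAR_SRGB = (
    ( 3.2406, -1.5372, -0.4986),
    (-0.9689,  1.8758,  0.0415),
    ( 0.0557, -0.2040,  1.0570),
)

def xyY_to_XYZ(x, y, Y):
    """Convert chromaticity (x, y) plus relative luminance Y to XYZ."""
    return (x / y * Y, Y, (1.0 - x - y) / y * Y)

def xyz_to_linear_srgb(xyz):
    """Apply the 3x3 matrix; the result is deliberately NOT clamped to 0..1."""
    return tuple(sum(m * v for m, v in zip(row, xyz)) for row in XYZ_TO_LINEAR_SRGB)

# A green more saturated than sRGB can show, at half of the relative luminance.
green = xyz_to_linear_srgb(xyY_to_XYZ(0.170, 0.797, 0.5))
print(green)  # red and blue components come out negative
```

The negative red and blue values say "subtract some red and blue light", which a display limited to the sRGB primaries cannot do, but a format that allows negative channels can still carry the color.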
Peste_Bubonica has joined #dri-devel
<Company>
yeah, ultimately it's all 3 dimensions, it just matters a lot how you define the values of those dimensions
columbarius has quit [Ping timeout: 480 seconds]
<pq>
That's what display color management is about essentially :-)
columbarius has joined #dri-devel
tobiasjakobi has joined #dri-devel
tobiasjakobi has quit [Remote host closed the connection]
vivijim has joined #dri-devel
jewins has joined #dri-devel
kts has quit [Quit: Konversation terminated!]
camus1 has joined #dri-devel
camus has quit [Remote host closed the connection]
camus has joined #dri-devel
camus1 has quit [Ping timeout: 480 seconds]
sdutt has joined #dri-devel
kts has joined #dri-devel
camus1 has joined #dri-devel
camus has quit [Ping timeout: 480 seconds]
mareko has quit [Remote host closed the connection]
mareko has joined #dri-devel
karolherbst has joined #dri-devel
nchery has joined #dri-devel
fxkamd has quit [Remote host closed the connection]
fxkamd has joined #dri-devel
nchery is now known as Guest4617
nchery has joined #dri-devel
Guest4617 has quit [Read error: Connection reset by peer]
<sylware>
all right, let me plug this patch in my distro
gawin has joined #dri-devel
<sylware>
bnieuwenhuizen: sorry, have to manually apply the patch myself, probably a busybox diff bug
<sylware>
testing...
pnowack has left #dri-devel [#dri-devel]
pnowack has joined #dri-devel
<MrCooper>
when returning -ERESTARTSYS from atomic_check, is the driver supposed to restore the original atomic state passed in, or does it need to handle any modifications in the next atomic_check call?
<sylware>
bnieuwenhuizen: fixed with your patch, I can no longer reproduce it with dota2 or with SOR4
<bnieuwenhuizen>
thanks for the confirmation
<sylware>
:)
<sylware>
so, navi10 XLE has l2 coherent cache
<bnieuwenhuizen>
basically it is not coherent if some of the cache slices have been disabled
<bnieuwenhuizen>
which is the case for 5600 XT AFAIU
<sylware>
oh, then rad_info.tcc_rb_non_coherent is true for navi10 XLE
<sylware>
AMD did publish docs with such information??
<bnieuwenhuizen>
not really
<bnieuwenhuizen>
though we have other driver code to read :) (say radeonsi or AMDVLK)
oneforall2 has quit [Remote host closed the connection]
<sylware>
oh ok, I see
oneforall2 has joined #dri-devel
<sylware>
in such massive drivers, finding the l2 cache bug is literally finding a needle in a haystack
<sylware>
btw, are fma hw instructions that significant for improved shader performance? Because if they are, they should be added to spirv.
<pendingchaos>
they are significant, and they're in spirv
unerlige1 has left #dri-devel [#dri-devel]
unerlige has joined #dri-devel
<sylware>
pendingchaos, oh, khronos did add them then... missed that.
<sylware>
the last spirv spec I read was a year and a half ago, easily.
slattann has joined #dri-devel
<pendingchaos>
I think spirv always had it
<pendingchaos>
as part of GLSL.std.450
<sylware>
oh, this is not core spirv then, you need to add GLSL extensions??? weird.
<pendingchaos>
maybe OpenCL and GLSL fma can behave a bit differently
<jenatali>
IIRC the CL fma has higher precision requirements
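To illustrate why fused and unfused multiply-add can give different results, here is a small sketch that emulates a correctly rounded fma with exact rational arithmetic. It does not model the precision rules of GLSL.std.450 or OpenCL; it only shows the one-rounding versus two-roundings difference, and the function names are made up:

```python
from fractions import Fraction

def mul_then_add(a, b, c):
    """Unfused: the product is rounded to double, then the sum is rounded again."""
    return a * b + c

def fused_multiply_add(a, b, c):
    """Emulated fma: compute a*b+c exactly, then round once to double."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)

a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

# The exact product is 1 - 2**-60, which rounds to 1.0 as a double.
print(mul_then_add(a, b, c))        # 0.0
print(fused_multiply_add(a, b, c))  # -2**-60, about -8.67e-19
```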
gawin has quit [Ping timeout: 480 seconds]
<jenatali>
Has anyone ever tried or had interest in using va/vdpau on swrast? We're interested in seeing if we can get some video decode accel in the d3d12 gallium driver for WSL, but since we're not a dri driver, initializing seems like it'll be a bit tricky without a swrast path
<jenatali>
and/or would there be objections if we built a swrast fallback?
<imirkin>
jenatali: there is (was) a shader-based MPEG1/2 decoder
<imirkin>
such a shader could run on swrast as well as any other hardware
<imirkin>
check vl_*_mpeg* somewhere in gallium/vl
sylware has quit [Remote host closed the connection]
<jenatali>
imirkin: Ack, but it still seems like the only way to get at that would be to have a DRM-based driver so vl_winsys_dri[3] or vl_winsys_drm can be used, right?
<jenatali>
At least, without building a swrast-based init path
<imirkin>
jenatali: at least the video decoding aspect of it is done :p
<imirkin>
you just have to hook up winsys gunk
<jenatali>
Or a dxg-specific path, but I'm not sure we want / are ready to try doing anything like that
<jenatali>
imirkin: Well, we've got codec support in D3D that we want to hook stuff up to. We don't need shader decoders. It's just the plumbing that's tricky
<imirkin>
ah yeah
<imirkin>
then your d3d12 gallium driver just needs to hook up the vl entrypoints
<imirkin>
(btw, you probably know this, but it's hard to beat CPU decoding of MPEG1/2 on modern CPUs ... just moving the data to the GPU and back makes it almost a loss)
<imirkin>
actually, that's only true if the CPU has to handle VLD. if the GPU supports VLD then it's probably a small win.
<jenatali>
Not sure where you're getting MPEG1/2 from. I think we're primarily interested in H264 here
<imirkin>
jenatali: well the software shader is for mpeg1/2 only
<imirkin>
jenatali: also ... ATSC is where you'd get MPEG2 from
<imirkin>
the reason i originally got involved with nouveau was that my CPU-du-jour couldn't decode MPEG2 from an ATSC stream in realtime :)
<imirkin>
(or DVB-T in europe)
<jenatali>
imirkin: The problem with just "hooking up the vl entrypoints" is that our gallium driver can only init as swrast currently, since we don't have devices in /dev/dri
<jenatali>
But vl can't init on swrast like GL/CL can
<imirkin>
jenatali: right, so you'd have to do some hook-up
<jenatali>
Yeah
<imirkin>
you're likely to run into some additional turds in there since vl was really for linux-only, given that it was only for drm drivers, ever.
<jenatali>
We're only interested in Linux here
Anorelsan has joined #dri-devel
<jenatali>
WSL, specifically
<imirkin>
ah right. i still don't have a great mental model of what that is. is that like using qemu-kvm to boot a linux kernel + execute a linux binary, but the windows equivalent of qemu-kvm?
<jenatali>
Yeah basically
<jenatali>
It's a Hyper-V Linux VM
<imirkin>
right. so you're using hyperv to pass data back and forth (as opposed to e.g. virtio with a native linux setup)
<jenatali>
Yep, "hvsocket" is the transport mechanism
<jenatali>
And for graphics specifically, we have a socket between the kernel on the guest, using our libdxgkrnl Linux driver, to the kernel on the host, into the WDDM DxgKrnl component
<imirkin>
i assume you've looked at how virgl works and how it solves some of these problems?
<jenatali>
Well, virgl is a DRM device
<imirkin>
no law that says you can't make a DRM device.
<imirkin>
i realize that's a bigger thing
aravind has quit [Ping timeout: 480 seconds]
<jenatali>
True. I wonder if we'll have to go there long-term
camus has joined #dri-devel
camus1 has quit [Ping timeout: 480 seconds]
<airlied>
jenatali: yeah nobody has touched vl sw paths, but adding them shouldn't be too horrible
<jenatali>
Cool
<airlied>
mapping vaapi to d3d12 might be more fun :-p
<airlied>
though i haven't looked at d3d12
<jenatali>
If we implement it at the gallium layer it doesn't look too bad
<jenatali>
airlied: Have you looked at Vk video?
<airlied>
yes debugging it at the moment
<jenatali>
It's pretty similar
<jenatali>
I could see a vaapi -> zink path too
* zmike
runs away screaming
<airlied>
so for h264 vaapi does slice decoding
<airlied>
but vulkan video is picture based
<airlied>
currently working out what that means for layering
<imirkin>
video is soul-sucking in my experience. esp working with undocumented hw decoders...
<imirkin>
so many options
<imirkin>
so few of them work :)
<imirkin>
i have zero appetite to work on video stuff again as a result =/
kenjigashu has joined #dri-devel
kenjigashu has quit [Remote host closed the connection]
<airlied>
imirkin: yeah at least the amd decoder fw interface seems very set on vaapi like behaviour
<daniels>
gstreamer has common code (between at least d3d+v4l2) for stateless codecs which might be useful as a reference, but beware LGPL
<airlied>
getting vk video on it is very challenging
slattann has quit []
<imirkin>
airlied: the thing which ultimately defeated me was some sort of internal constraint which causes some videos to decode some motion wrong. never figured out how to sort it out. i literally feed the same stuff in as the blob, yet it all works for them. and fails across multiple decoder generations with nouveau.
<imirkin>
feels like overflowing some buffer, but ... who knows.
sravn has joined #dri-devel
slattann has joined #dri-devel
<airlied>
imirkin: yeah the opaque fw interfaces do little to encourage debugging
<imirkin>
but much to encourage crying
<bnieuwenhuizen>
my reason to not jump head first into any video stuff for radv
slattann has quit []
<airlied>
already hit the problem of the fw api being non vulkan like
<airlied>
recording to a single cmd buffer not going so well
<jenatali>
Yeah... this all sounds familiar, though at least we have engineers at the respective companies to talk to instead of RE
<airlied>
yeah amd are helping just a matter of how much it takes :-)
<imirkin>
aka sitting with popcorn and laughing at the poor fool trying to make it all work? :p
camus1 has joined #dri-devel
camus has quit [Read error: Connection reset by peer]
JohnnyonFlame has joined #dri-devel
ybogdano has joined #dri-devel
camus has joined #dri-devel
tzimmermann_ has quit []
camus1 has quit [Ping timeout: 480 seconds]
<jekstrand>
bnieuwenhuizen: I finally got back to the pipeline cache patches and updated them
kts has quit [Ping timeout: 480 seconds]
<jekstrand>
bnieuwenhuizen: Double-deserialize is gone.
<jekstrand>
bnieuwenhuizen: There are still some design decisions around just how automatic we want to make things like disk caching, deciding on cached object types, etc.
<jekstrand>
I think what I've got now is good enough to move forward but there's certainly some room to play in the design space.
<bnieuwenhuizen>
perfect is the enemy of the good here. Let's dedupe and then improve
<jekstrand>
cool
<jekstrand>
In that case, give it a read and see what you think of the new state of things. I'll rebase the last patch into the rest if you like it.
<mareko>
airlied: do you think that the vaapi-like fw interface will make vulkan video slower/inefficient?
moben[m] has joined #dri-devel
<airlied>
mareko: the current fw interface might make it unimplementable without violating vulkan impl rules
* airlied
isn't sure how set in stone the fw apis are, or who can rev them
<airlied>
mareko: the main problem is the fw api takes 3 commands create session, decode, and destroy session
<airlied>
but vulkan works on recording command buffers, so there is no good place to create/destroy the session
<bnieuwenhuizen>
airlied: doesn't the vk api have a session concept?
<jekstrand>
It does
<jekstrand>
The vulkan video API is weird
<jekstrand>
It's got session and session parameter objects
<jekstrand>
I've not torn it apart in enough detail to know how they all play together for sure, but they're there.
<bnieuwenhuizen>
airlied: if those don't work for you then I think that is feedback the WG would be interested in :)
<bnieuwenhuizen>
or maybe we're late, who knows, not like I follow it
<LaserEyess>
maybe I'm being naive, but didn't the vulkan video people work with hw manufacturers to make sure the APIs were implementable?
<jekstrand>
I'm pretty sure AMD has been in the discussions. I'd be very surprised if it's unimplementable for them.
<jekstrand>
whether or not it requires rev'ing the FW API is another question.
<airlied>
jekstrand: different parts of AMD :-)
<LaserEyess>
sounds like they just didn't make it easy then
<bnieuwenhuizen>
jekstrand: there is implementable and there is "implementable without modifying the fw that we don't have the source for"
<airlied>
bnieuwenhuizen: it has the concepts, but you have to submit cmd buffers internally
<airlied>
which isn't very vulkan like
<airlied>
and in fact I can't make it work either
<airlied>
the closest I've gotten is making a cmdbuffer with create, decode, destroy
<airlied>
that decodes the I frames
<airlied>
hacking things to submit create and destroy at other times hasn't gotten me anywhere
<jekstrand>
airlied: Yes, they're different parts of AMD but I really don't think their khronos reps are that incompetent.
<jekstrand>
But I could easily believe it requires new FW
<airlied>
jekstrand: yeah like the video rep from AMD likely knows how the hw works, but is probably not caring about what current FW does
<airlied>
whereas I'm a bit more limited :-P
<jekstrand>
Oh, for sure. :)
<jekstrand>
And they really don't care about you, just so we're all clear. :P
* airlied
has already found a couple of bugs in the h264 decode api, just by knowing nothing
<jekstrand>
lol
<jekstrand>
file them, I guess
<airlied>
oh already done
kts has joined #dri-devel
<airlied>
jekstrand: I guess I get to try anv next :-P
<jekstrand>
go for it!
<agd5f>
We don't use different firmwares that I am aware of. Doing so would break existing APIs (VAAPI, DXVA, etc.)
<airlied>
yeah it seems more likely it would need a fw switch to just make it work a bit different
<airlied>
agd5f: I have to look at d3d12 yet, it might be it's like vulkan and it's just some magic you haven't found out
kts has quit [Quit: Konversation terminated!]
<airlied>
the vulkan model would be to keep all state in memory, and have create/destroy not do much except reload from memory ctx
<airlied>
yeah d3d12 looks mostly similar to vk
<airlied>
the other place the fw api differs is it has one internal memory allocation for DPB management, but the APIs provide DPB buffers explicitly
<airlied>
you can just ignore the API provided ones but it wastes memory
<airlied>
so there might just be new fw interfaces you haven't documented in public yet
<airlied>
I think once Leo gets some time to dig into things, we'll have some clearer picture
kenjigashu has joined #dri-devel
kenjigashu has quit [Remote host closed the connection]
kenjigashu has joined #dri-devel
kenjigashu has quit []
<jekstrand>
bnieuwenhuizen: Also, sorry for the pester, but if you could give my sync MR a read sometime soonish, that'd be really nice. :)
pushqrdx_ has quit []
pushqrdx has joined #dri-devel
Anorelsan has quit [Quit: Leaving]
elongbug has joined #dri-devel
lemonzest has quit [Quit: WeeChat 3.2]
camus1 has joined #dri-devel
camus has quit [Read error: Connection reset by peer]
bbrezillon has quit [Ping timeout: 480 seconds]
mripard has quit [Ping timeout: 480 seconds]
gruetzkopf has joined #dri-devel
gruetzkopf is now known as Guest4637
craftyguy has quit [Read error: Connection reset by peer]
craftyguy has joined #dri-devel
kenjigashu has joined #dri-devel
sravn has quit []
kenjigashu has quit [Remote host closed the connection]
Duke`` has quit [Ping timeout: 480 seconds]
gouchi has quit [Remote host closed the connection]
agx has quit [Read error: Connection reset by peer]
agx has joined #dri-devel
kenjigashu has joined #dri-devel
gawin has joined #dri-devel
Peste_Bubonica has quit [Remote host closed the connection]
<gawin>
I've been wondering again about hardware limitations. As NIR is probably building a graph and then searching for the shortest path(?), maybe it'd be a good approach to add an option to set a specific cost per instruction. (or is something like this already possible?)
JohnnyonFlame has quit [Ping timeout: 480 seconds]
<gawin>
imho it should help with legacy hardware, hardware which wasn't designed in a "standard" way (new macs?) or hardware which has its own pitfalls (for example some instructions cause cache misses and it's better to avoid them)
<anholt>
gawin: our algebraic optimizations at the nir level are fairly ad-hoc and just "this is what has worked out well with current hw"
<anholt>
if you want to do some instruction selection to try to come up with a cost-minimized set of instructions, you may want to look at https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/629 -- but I never finished it off to show what it could do.
<anholt>
most gpu hardware is pretty simple and pretty orthogonal, so there's often not as much that one can do with instruction selection compared to say x86.
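A toy sketch of what "cost-minimized instruction selection" with a per-hardware cost table could look like. This is plain bottom-up tree tiling over a made-up expression format, not NOLTIS and not anything from NIR or the linked merge request; the `select` function, the opcode names and the cost numbers are all hypothetical:

```python
def select(expr, costs):
    """Return (total_cost, instruction_list) for the cheapest cover of expr.

    expr is either a variable name (str) or a tuple (op, lhs, rhs)
    with op in {"add", "mul"}. costs maps opcode name -> cost.
    """
    if isinstance(expr, str):
        return 0, []  # leaves (inputs) are free
    op, lhs, rhs = expr
    lc, li = select(lhs, costs)
    rc, ri = select(rhs, costs)
    best = (lc + rc + costs[op], li + ri + [op])
    if op == "add":
        # Try covering add(mul(a, b), c) with a single ffma tile.
        for mul_side, other in ((lhs, rhs), (rhs, lhs)):
            if isinstance(mul_side, tuple) and mul_side[0] == "mul":
                ac, ai = select(mul_side[1], costs)
                bc, bi = select(mul_side[2], costs)
                oc, oi = select(other, costs)
                cand = (ac + bc + oc + costs["ffma"], ai + bi + oi + ["ffma"])
                if cand[0] < best[0]:
                    best = cand
    return best

expr = ("add", ("mul", "a", "b"), "c")
print(select(expr, {"add": 1, "mul": 1, "ffma": 1}))  # (1, ['ffma'])
print(select(expr, {"add": 1, "mul": 1, "ffma": 4}))  # (2, ['mul', 'add'])
```

Changing only the cost table flips the choice between a fused and an unfused sequence, which is the kind of per-hardware knob gawin is asking about. Real shaders are DAGs rather than trees, which is where this simple approach stops being sufficient.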
kenjigashu has quit []
danvet has quit [Ping timeout: 480 seconds]
<FLHerne>
I spotted https://egraphs-good.github.io/ a while ago, it seems like a conceptually-more-elegant solution than NOLTIS
<FLHerne>
but NOLTIS has the big advantage that anholt already mostly wrote it :p
<anholt>
yeah, if we had rust in mesa then egg is really what we want (assuming it is fast enough)
<anholt>
*what we want for algebraic
<gawin>
I agree that in most cases it works fine, though it's possible to unknowingly hurt performance of less common platforms
<gawin>
also possibility of comparing cost of accurate and fast path could be nice
<FLHerne>
it's a tool to use e-graphs _out of band_ to automagically generate rewrite rules like the ones opt_algebraic has a ton of
<FLHerne>
but opt_algebraic has all these weird conditions so it's probably hard to model
<gawin>
anholt: do you perhaps remember what was missing?
<anholt>
sorry?
<anholt>
not sure what you're asking about
<anholt>
oh, by finished it off I mean made a backend justify its existence
<anholt>
nir-to-tgsi-noltis was my latest branch using it
<gawin>
so basically the "todo" is debugging on some hardware?
<anholt>
I have converted backends, and I think the code works. however, the stuff I made the backends do with it was equivalent to what the backends did already
<anholt>
didn't get around to making it do something more interesting like ffma regrouping or just-one-uniform-per-instruction copy propagation.