ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
<zmike> if talking about 64bit io then it seems like it'd be good to add some alignment validation to nir_validate
kzd has quit [Quit: kzd]
<mareko> zmike: clip/cull indexing
<zmike> ah
<zmike> yes, well, there's tricks to work around it
<mareko> we have lots of piglit failures for 64-bit IO already
<zmike> just seems inconsistent that they're needed
<zmike> 64bit io is cancer
<HdkR> How soon until smooth interpolation is allowed for 64-bit IO
<mareko> never
<mareko> also it's possible today
<HdkR> Just do the interpolation manually using the exposed barycentrics? :)
<mareko> there is more to it, you also need explicit loads with a vertex index in FS
<mareko> and disable the vector subtract for P1 and P2 inputs
<HdkR> So you're saying we need a new extension to let the driver do it all for you right? :)
<mareko> no, Vulkan can do it already, hakzsam implemented custom interpolation not so long ago
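The manual interpolation mentioned above can be sketched in plain C. This is only an illustration under stated assumptions: `interp_f64` is a made-up helper, not a Mesa or Vulkan function, and it assumes that `i` and `j` are the barycentric weights of vertices 1 and 2 and that the three inputs are the raw per-vertex values (no pre-subtracted P1 - P0 or P2 - P0 deltas). Real hardware and APIs may order the weights differently.

```c
#include <assert.h>

static_assert(sizeof(double) == 8, "this sketch assumes 64-bit doubles");

/*
 * Hypothetical helper: interpolate one 64-bit attribute from its three raw
 * per-vertex values p0, p1, p2.
 *
 * i and j weight vertices 1 and 2; vertex 0 implicitly gets (1 - i - j).
 * The subtraction happens here, in the shader-side math, which is why the
 * inputs must not already be pre-subtracted deltas.
 */
static double interp_f64(double p0, double p1, double p2, double i, double j)
{
    return p0 + i * (p1 - p0) + j * (p2 - p0);
}
```

With weights (0, 0), (1, 0) and (0, 1) the helper returns p0, p1 and p2 respectively, which is a quick sanity check on the weight convention assumed here.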
heat has joined #dri-devel
<zmike> mareko: what's left with your linker thingy? do you plan to merge it in this release cycle?
<mareko> I could if I don't implement the remaining stuff
<zmike> ah
RSpliet has joined #dri-devel
JohnnyonFlame has quit [Ping timeout: 480 seconds]
RSpliet has quit [Read error: Connection reset by peer]
RSpliet has joined #dri-devel
kzd has joined #dri-devel
oneforall2 has quit [Remote host closed the connection]
oneforall2 has joined #dri-devel
heat has quit [Remote host closed the connection]
tristan has joined #dri-devel
tristan is now known as Guest7738
pcercuei_ has quit []
MoeIcenowy has quit [Quit: ZNC 1.8.2 - https://znc.in]
MoeIcenowy has joined #dri-devel
columbarius has joined #dri-devel
co1umbarius has quit [Ping timeout: 480 seconds]
yuq825 has joined #dri-devel
alyssa has joined #dri-devel
<alyssa> mareko: i'm also curious, is the linker meant to handle cases like `v_color = a_color;` where a_color turns out to be 3-channels? (and then in a monolithic pipeline, the linker is able to drop the w output and replace with a constant 1.0 in the FS?)
<alyssa> I *think* that will Just Work if the backend driver calls the i/o linker after lowering vertex inputs in NIR
<alyssa> (should be straightforward in RADV with monolithic pipelines, for ex)
<alyssa> but I notice the proposal has the linker called in the GLSL compiler and not the backend driver, so I wasn't sure if there's a benefit/requirement to call it early
<alyssa> (long before the vertex format key could be applied)
benjaminl has quit [Ping timeout: 480 seconds]
<alyssa> (I also don't know if radeonsi would use it this way... specializing to both the vertex formats AND the linked fragment shader seems undesirable if it's not already a monolithic pipeline bundled up nicely in VK.)
yyds has joined #dri-devel
Guest7738 has quit [Remote host closed the connection]
tristan_ has joined #dri-devel
heat has joined #dri-devel
TMM has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
TMM has joined #dri-devel
tristan_ has quit [Remote host closed the connection]
tristan_ has joined #dri-devel
YuGiOhJCJ has joined #dri-devel
JohnnyonFlame has joined #dri-devel
mbrost has joined #dri-devel
godvino has joined #dri-devel
krushia has quit [Ping timeout: 480 seconds]
tristan_ has quit [Remote host closed the connection]
tristan_ has joined #dri-devel
Danct12 has quit [Remote host closed the connection]
Danct12 has joined #dri-devel
JohnnyonFlame has quit [Read error: Connection reset by peer]
Danct12 is now known as Guest7752
Danct12 has joined #dri-devel
crabbedhaloablut has joined #dri-devel
tristan_ has quit [Remote host closed the connection]
tristan_ has joined #dri-devel
heat_ has joined #dri-devel
heat has quit [Read error: No route to host]
tristan_ has quit [Remote host closed the connection]
tristan_ has joined #dri-devel
Danct12 has quit [Quit: WeeChat 4.0.2]
Haaninjo has joined #dri-devel
Haaninjo has quit [Remote host closed the connection]
Danct12 has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
JohnnyonFlame has joined #dri-devel
tristan_ has quit [Remote host closed the connection]
tristan_ has joined #dri-devel
godvino has quit [Ping timeout: 480 seconds]
tristan_ has quit [Remote host closed the connection]
tristan_ has joined #dri-devel
aravind has joined #dri-devel
heat_ has quit [Ping timeout: 480 seconds]
tristan_ has quit [Remote host closed the connection]
tristan_ has joined #dri-devel
Duke`` has joined #dri-devel
bgs has joined #dri-devel
JohnnyonFlame has quit [Read error: Connection reset by peer]
pcercuei has joined #dri-devel
Duke`` has quit [Ping timeout: 480 seconds]
Company has quit [Quit: Leaving]
tristan_ has quit [Ping timeout: 480 seconds]
bmodem has joined #dri-devel
junaid has joined #dri-devel
itoral has joined #dri-devel
ayaka_ has joined #dri-devel
<ayaka_> Do we still use that master/client auth thing in drm?
bmodem has quit [Quit: bmodem]
bmodem has joined #dri-devel
rasterman has joined #dri-devel
sima has joined #dri-devel
bgs has quit [Remote host closed the connection]
<mripard> jani: hey, do you know who's in charge of maintaining dim these days? I've had a PR stuck for months on gitlab
junaid_ has joined #dri-devel
jkrzyszt_ has joined #dri-devel
Zopolis4 has joined #dri-devel
vliaskov_ has joined #dri-devel
junaid_ has quit [Remote host closed the connection]
junaid has quit [Remote host closed the connection]
aravind has quit [Ping timeout: 480 seconds]
aravind has joined #dri-devel
MoeIcenowy has quit [Quit: ZNC 1.8.2 - https://znc.in]
MoeIcenowy has joined #dri-devel
<rasterman> hmmm libdrm design question...
vliaskov__ has joined #dri-devel
<rasterman> __u64 user_data (used in several "public" facing structs)
<rasterman> the way it's actually used at least in various places is to stuff pointers into...
<rasterman> this is of course wrong... but it's done. was/is user_data actually meant to be able to carry a pointer?
<airlied> why is it wrong?
<rasterman> what if my pointers are > 64bit?
<rasterman> what if they are actually special types that are not just plain integers...
<airlied> then you get to define a whole new ABI
<rasterman> ie i have to use a void * or a uintptr_t
<airlied> it can't store "pointers" because they change size
<airlied> so it breaks all sorts of things
<airlied> like 32-bit apps on 64-bit kernels
<rasterman> well in effect i am defining a new abi...
<rasterman> as it's a new architecture...
<rasterman> so i'm wondering if the right solution is a bit of ifdeffing there so essentially
<airlied> if you never want to run any "different" pointer size or emulate another arch that might work
<rasterman> #if (UINTPTR_MAX > 0xffffffff)
<rasterman> uintptr_t user_data;
<rasterman> #else
<rasterman> ... current code
<airlied> you mean 0xffffffffffffffffULL
<rasterman> but really questioning if the INTENT was that user_data can hold a ptr
<rasterman> yeah - thus "essentially"
<airlied> most __u64 are ptrs, but not all
<rasterman> you get the idea :)
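The ifdef being proposed here, with the 64-bit limit airlied points out, might look roughly like the sketch below. All `example_` names are invented for illustration and are not libdrm or kernel identifiers. Whether `UINTPTR_MAX` really exceeds 64 bits on a capability target is an assumption of this sketch; such a target might need a different feature test.

```c
#include <assert.h>
#include <stdint.h>

#if UINTPTR_MAX > 0xffffffffffffffffULL
/* Pointers do not fit in 64 bits: carry them in a pointer-capable type. */
typedef uintptr_t example_user_data_t;
#else
/* Existing ABI: a fixed 64-bit integer, same size for 32- and 64-bit apps. */
typedef uint64_t example_user_data_t;
#endif

/* Hypothetical event struct, shaped like an event header plus user_data. */
struct example_event {
    uint32_t type;
    uint32_t length;
    example_user_data_t user_data;
};

static_assert(sizeof(example_user_data_t) >= sizeof(void *),
              "user_data must be wide enough to carry a pointer");

/* Stash a pointer in user_data. */
static inline void example_event_set_ptr(struct example_event *ev, void *ptr)
{
    ev->user_data = (example_user_data_t)(uintptr_t)ptr;
}

/* Recover the pointer stored by example_event_set_ptr(). */
static inline void *example_event_get_ptr(const struct example_event *ev)
{
    return (void *)(uintptr_t)ev->user_data;
}
```

The cost of this approach is the one raised in the discussion: the struct layout now depends on the target, so any place that must serve two pointer sizes at once needs a translation layer.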
<rasterman> well depends... inside kernel - yes. at the kernel/user boundary - yeah (ioctls and so on)
<rasterman> but once you get a bit higher up ... it's questionable
<rasterman> thus i ask :)
<rasterman> what the *intent* is
<rasterman> to me the intent was to carry at least 64bits of int or a ptr
<airlied> some of them apis might have _ptr in them
<airlied> where it expects a user ptr
<airlied> some of them might not
<airlied> it would have to be a case by case trawl through include/uapi/drm/*
<rasterman> yeah
<rasterman> but this is libdrm...
<rasterman> so its fully up in userspace
<airlied> it's not really, libdrm just calls into the kernel ioctls
<rasterman> one step beyond the kernel boundary
<rasterman> sure
<airlied> you'd have to fix both
<airlied> and libdrm mostly just reproduces kernel interfaces
vliaskov_ has quit [Ping timeout: 482 seconds]
<airlied> any include like drm_mode.h etc are kernel interfaces really
<rasterman> but struct drm_event_vblank is what it exposes directly to userspace beyond
<rasterman> for example
<airlied> that's a kernel interface
sghuge has quit [Remote host closed the connection]
<airlied> libdrm reproduces a bunch of kernel interfaces, but most of the definitions will come from the kernel
sghuge has joined #dri-devel
<rasterman> so i'm wondering if i get to now redefine user_data to be what i think it should be (ie uintptr_t)
<airlied> no you can't use uintptr_t
<airlied> because it changes size
<rasterman> i get to redefine kernel abi too :)
<airlied> the kernel ioctl compat layer was how we dealt with 32-on-64 problems back in the old days
<rasterman> unfortunately i have to use an actual ptr type
<airlied> when a bunch of APIs had 32-bit ptrs in them
<rasterman> or something capable of storing a ptr
<rasterman> i can't just use any old integer of N bits
<airlied> then you likely want to define a new "maxsizedptrstorage"
<airlied> that covers all existing ptrs and your new ones
<airlied> then define brand new structs for every thing that takes a ptr
<rasterman> well i can always ifdef as above
<rasterman> same on the kernel side
<rasterman> as compat32 now is actually good old 64bit
<rasterman> and the new native abi is ... well new.
<airlied> why do you have to use a ptr type though? since the kernel should never dereference these directly
<airlied> they have to be cast to void __user * and use copy_to/from_user
<rasterman> it's a capability architecture
<airlied> it might be an idea just to find all the really ptrs and give them a new type that is __u64 on normal arches
<airlied> then just have a special case for that type
<rasterman> ptrs are actually capabilities. specially flagged with ecc data + cpu instructions to define them as such
<rasterman> they are 128bit capabilities + 1 bit of extra separate metadata to flag that as a capability - thus you can't just use any old XXX bits of data to transport pointers. even if you don't ref/deref. you have to "transport" them
apinheiro has joined #dri-devel
<airlied> I assume someone has already done a bunch of the kernel interfaces
<rasterman> yes. me :)
<rasterman> or well specifically ... i fixed the compat32 for now
<airlied> I remember some of the CHERI stuff was discussing this before
<rasterman> yeah - this is just that.
<rasterman> but now linux - not bsd.
<rasterman> so i fixed up the compat32 bits that all assumed compat abi == 32bit
<rasterman> all of that is working just fine now (dpu+gpu)
<airlied> so yeah I think define a new type like __userptr, make it __u64, hunt them all down, add an arch specific __userptr for your arch
<airlied> port igt-gpu-tools to get some test coverage at least
<airlied> I'd say about half the __u64 are userptrs in disguise
<rasterman> yeah - i'm sure they are
<airlied> we seem to use u64_to_user_ptr in a lot of places so might help point out some of them
<rasterman> thus really figuring out intent...
<rasterman> ie is this someone who just abused the abi realizing it could store a ptr
<rasterman> or is it that it was intended ...
<airlied> if there are any real ptrs they are errors or old legacy designs
sgruszka has joined #dri-devel
<airlied> u64_to_user_ptr seems to have pretty good coverage over what is a ptr
<rasterman> there's a bit of that floating about too... luckily i've managed to ignore the legacy stuff :)
mvchtz has quit [Ping timeout: 480 seconds]
<airlied> how does 64-bit userspace work on 129-bit ptrs kernel? just doesn't do magic ptrs?
flto_ has joined #dri-devel
frankbinns has quit [Remote host closed the connection]
<rasterman> yeah
<rasterman> the kernel just sees 64bit userspace ADDRESSES ... which fit in any kernel-side 129 bit capability (remember the 129th bit is sideband data so it's just 128bits in memory where the ptr would normally be)
<rasterman> so a lot of kernel code has moved to need uintptr_t for a lot of stuff
<rasterman> you have to hunt them down and follow the breadcrumbs
<airlied> I think once you find them all and fix u64_to_user_ptr to be whatever you need, things should be mostly ptr based inside drm
<rasterman> that's going to be fun. anyway - was just checking that things like user_data even tho they are 64bit types were intended to also hold ptrs... just was wondering who i should blame :)
<rasterman> ie what was the "contract"
flto has quit [Ping timeout: 480 seconds]
<airlied> yeah the important thing is it shouldn't change size across arches or when emulating one arch on another
MajorBiscuit has joined #dri-devel
<rasterman> yup
lemonzest has quit [Quit: WeeChat 3.6]
<airlied> "The best workaround is to use __u64 in place of pointers, which requires a cast to uintptr_t in user space, and the use of u64_to_user_ptr() in the kernel to convert it back into a user pointer."
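The userspace half of the quoted workaround can be sketched as follows. The struct and the `example_` helpers are hypothetical; only the pattern (a fixed-width `__u64` field, reached through a `uintptr_t` cast) comes from the quote. The kernel half, `u64_to_user_ptr()` followed by `copy_to/from_user`, is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ioctl argument block; only the pattern matters. */
struct example_read_args {
    uint64_t buf_ptr;   /* user pointer carried as a fixed 64-bit integer */
    uint32_t buf_len;
    uint32_t flags;
};

/* The layout does not depend on the pointer size of the calling process. */
static_assert(sizeof(struct example_read_args) == 16, "ABI size must not vary");
static_assert(offsetof(struct example_read_args, buf_len) == 8,
              "buf_len must follow the 64-bit pointer field directly");

/* Userspace side: pointer -> uintptr_t -> 64-bit integer. */
static inline uint64_t example_ptr_to_u64(const void *p)
{
    return (uint64_t)(uintptr_t)p;
}

/* Reverse direction, useful for values the kernel hands back (user_data). */
static inline void *example_u64_to_ptr(uint64_t v)
{
    return (void *)(uintptr_t)v;
}
```

Because every field has a fixed width, a 32-bit process and a 64-bit process fill in the same bytes, which is the property that lets one kernel entry point serve both without translation.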
<rasterman> that's the "tricky" bit. i have to have all usual 64bit abi to stay as is with existing 64bit binaries. they have to "just work"™
<airlied> yeah for that it sounds like you'd need a compat ioctl layer
<rasterman> well we have one :)
<rasterman> compat32 ... :)
<airlied> initial 64-bit ioctl time was a horrible one
lemonzest has joined #dri-devel
<airlied> and there's a lot more abi now
<rasterman> well ok i've abused compat32 at the now compat64 layer
<rasterman> err as the compat64
MajorBiscuit has quit []
frieder has joined #dri-devel
MajorBiscuit has joined #dri-devel
<MrCooper> ayaka_: /dev/dri/card* do, /dev/dri/render* don't though
<ayaka_> MrCooper, yes, I know only a few driver support render only ioctl().I mean the DRM_IOCTL_AUTH_MAGIC
<MrCooper> I know
<ayaka_> in current atomic request way, I didn't know how a client work here
<ayaka_> All I know is a wayland compositor like weston takes the ownership of the master node, then every access must through wayland protocol
<MrCooper> Wayland clients can also use /dev/dri/card*, but then they have to get their file description authenticated by the compositor
frankbinns has joined #dri-devel
<ayaka_> that means a wayland client could do something likes create gem object once that client gains the auth?
<MrCooper> yes, couldn't draw anything otherwise
<ayaka_> I have not seen an application do this. Why we need that, does any GPU userland driver use this mechanism?
<emersion> card+auth is legacy
<emersion> the render node should be used instead
mvlad has joined #dri-devel
<ayaka_> emersion, but render node can't even create a buffer in generic ioctl(), if an application want to do off screen render, its gpu can't even create a command buffer
<emersion> you can do off-screen rendering with render nods
<emersion> nodes*
<emersion> wlroots does it
<emersion> there is no generic API to allocate a buffer and render to it with the GPU
<emersion> you need to use GBM and GL/Vulkan
<ayaka_> maybe you are the right to ask. We are expanding v4l2 uAPI, someone suggest it could lead to V4L3. One thing we are trying to address is the memory allocation
<ayaka_> you see, drm could allocate memory from create_dumb with the help of a userspace library that could tell the size requirement or custom ioctl()
<ayaka_> also gbm could be a fine wrapper
<emersion> create_dumb is not a generic allocation ioctl
<ayaka_> but v4l2 must support allocate memory from a device's memory space(although it could be system memory)
<emersion> it's for a single purpose: allocating a buffer which can be software rendered into and scanned out by KMS
<ayaka_> it is widely used in embedded device
<emersion> dumb buffers are not for GPU rendering, not for video decoding, and not for usage outside of KMS
<emersion> yeah, it's widely *ab*used :)
<ayaka_> that is not what I could stop. We could still solve this for v4l2 or v4l3
tursulin has joined #dri-devel
<emersion> i don't know anything about v4l2, i only know about DRM
<emersion> also note, i'm not interested in downstream hacks done in embedded companies -- i only care about upstream
<ayaka_> I am thinking two things, 1. supporting import buffer with fb_id with an device id for v4l2 2. a command allocation hints between v4l2 and drm
<emersion> no, KMS FB ID is not suitable for cross device sharing
<emersion> it's for KMS only
<emersion> for cross device buffer sharing, there is DMA-BUF
<ayaka_> because more and more pixel formats are compressed, it is impossible for the other device to use
<emersion> the "command allocation hints" is something we've been talking for ages indeed
<ayaka_> I know a dma-buf could work but I wish a id for present a frame(all its planes)
<ayaka_> emersion, sorry, a common allocation hints
<emersion> yes
<emersion> the format modifiers does half the work
<ayaka_> likes which planes and planes would be CMA
donaldrobson has joined #dri-devel
<emersion> it allows you to describe tiling and compression and layout
<emersion> the other half (e.g. placement) is not solved, and needs something new
<ayaka_> well it is still not enough
<ayaka_> for example the same tile format, video device only support 64 alignment while display supports 64, 128 bytes alignment
swalker_ has joined #dri-devel
<emersion> yes, that's part of "the other half" indeed
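One way to picture the alignment part of "the other half" is as an intersection of what each device accepts. The sketch below is purely hypothetical: no such mask exists in DRM or V4L2 today, and picking the smallest common alignment is just one possible policy.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical representation: bit n set means "a pitch alignment of
 * (1 << n) bytes is supported by this device".
 */
typedef uint32_t example_align_mask_t;

#define EXAMPLE_ALIGN_LOG2(n) ((example_align_mask_t)1u << (n))

/* Smallest alignment (in bytes) accepted by both devices, or 0 if none. */
static uint32_t example_common_alignment(example_align_mask_t a,
                                         example_align_mask_t b)
{
    example_align_mask_t common = a & b;

    if (common == 0)
        return 0;
    /* Isolate the lowest set bit; bit n has the value (1 << n) bytes. */
    return common & (0u - common);
}

/* Round value up to a power-of-two alignment. */
static uint32_t example_align_up(uint32_t value, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (value + align - 1) & ~(align - 1);
}
```

For the case described above, a video device accepting only 64 bytes and a display accepting 64 or 128 bytes, the intersection leaves 64 as the only usable pitch alignment.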
swalker_ is now known as Guest7768
djbw has quit [Remote host closed the connection]
<emersion> ayaka_: the latest work on this was https://lpc.events/event/9/contributions/615/
<ayaka_> also modifiers are still not enough. For example, we have 2 planes of compressed graphics, 2 planes of compressed meta data. While we could store the (un)compression options?
<emersion> you need to design your modifier bits properly
swalker__ has joined #dri-devel
RSpliet has quit [Ping timeout: 480 seconds]
<ayaka_> emersion, as I said, 4x64bits are not enough
<emersion> it's 1x64bits
<jani> mripard: that would still be me and sima *blush*
bmodem has quit [Ping timeout: 480 seconds]
<emersion> usually you start with lots of bits but in practice you're really interested in some combinations
<ayaka_> emersion, you could see the synaptics's modifier, they need 8x32bits for storing the compression options a plane
<emersion> if they need more bits, maybe they can use the metadata plane to store these
<ayaka_> besides the pixel format which is fixed
<ayaka_> meta data plane is for dma, while those options are written to registers directly
<emersion> are really all combinations of all options used in practice?
<ayaka_> we won't need cpu to access the meta data plane
<emersion> anyways, you should write your concerns to the dri-devel ML
<ayaka_> yes and it is a dynamic values
<emersion> drivers have claimed that 64 bits weren't enough before, and it turned out they were enough :P
<sima> jani, mripard I guess I'm missing context? what do I need to blush about?
Guest7768 has quit [Ping timeout: 480 seconds]
<emersion> and if you want to do cross-device buffer sharing in v4l2/3, you'll need to add an import IOCTL which takes a pixel format, a modifier, and multiple DMA-BUFs
<emersion> and if you want to fix the allocation problem, then we need to do a lot more work than that
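The import ioctl emersion describes would need an argument block carrying a pixel format, a modifier, and one DMA-BUF per plane. The struct below is a made-up sketch of such a block, not an existing V4L2 or DRM interface; the field names, the four-plane limit, and the helper functions are all assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_MAX_PLANES 4

/* Pack four characters into a FourCC code, least significant byte first. */
#define EXAMPLE_FOURCC(a, b, c, d) \
    ((uint32_t)(a) | ((uint32_t)(b) << 8) | ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* Hypothetical import arguments: fixed-width fields only, no raw pointers. */
struct example_import_buffer {
    uint32_t pixel_format;                   /* FourCC code */
    uint32_t num_planes;
    uint64_t modifier;                       /* one modifier for the buffer */
    int32_t  dmabuf_fds[EXAMPLE_MAX_PLANES]; /* one DMA-BUF fd per plane */
    uint32_t offsets[EXAMPLE_MAX_PLANES];
    uint32_t pitches[EXAMPLE_MAX_PLANES];
};

static_assert(sizeof(struct example_import_buffer) == 64,
              "layout must not depend on the architecture");

static void example_import_init(struct example_import_buffer *args,
                                uint32_t pixel_format, uint64_t modifier)
{
    memset(args, 0, sizeof(*args));
    args->pixel_format = pixel_format;
    args->modifier = modifier;
    for (int i = 0; i < EXAMPLE_MAX_PLANES; i++)
        args->dmabuf_fds[i] = -1;            /* mark unused plane slots */
}

/* Append one plane; returns 0 on success, -1 if all slots are taken. */
static int example_import_add_plane(struct example_import_buffer *args,
                                    int32_t fd, uint32_t offset, uint32_t pitch)
{
    if (args->num_planes >= EXAMPLE_MAX_PLANES)
        return -1;
    args->dmabuf_fds[args->num_planes] = fd;
    args->offsets[args->num_planes] = offset;
    args->pitches[args->num_planes] = pitch;
    args->num_planes++;
    return 0;
}
```

A caller would fill the block once per buffer, for example two planes for an NV12 frame, and hand it to the import call. Nothing in it says how the memory was allocated, which is the separate problem the second message refers to.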
<ayaka_> Also a 32 bytes parameters set would come with a compression meta data buffer.
<sima> ayaka_, yeah probably a dri-devel mail with all the info you need and your format description and everything that you'd ideally want is the best way to go
<ayaka_> sima, I did
<sima> thus far everyone who screamed that they need kilobytes of metadata eventually figured out that 56 bits are enough
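The 56 bits come from the modifier layout: the top 8 bits of the 64-bit value identify the vendor and the remaining 56 are the vendor's to define. The sketch below mirrors that split in standalone C. The vendor id `0xf0` and the payload fields (tile mode, compression flag, compression scheme) are invented for illustration and do not describe any real vendor's modifiers.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_VENDOR_ID      0xf0u                 /* made-up vendor id */
#define EXAMPLE_MOD_VALUE_MASK 0x00ffffffffffffffULL /* low 56 bits */

/* Combine an 8-bit vendor id with a 56-bit vendor-defined value. */
static inline uint64_t example_mod_code(uint8_t vendor, uint64_t value)
{
    return ((uint64_t)vendor << 56) | (value & EXAMPLE_MOD_VALUE_MASK);
}

/*
 * Hypothetical payload layout:
 *   bits 0-3  tile mode
 *   bit  4    compression enabled
 *   bits 5-7  compression scheme
 */
static inline uint64_t example_make_modifier(unsigned tile, unsigned compressed,
                                             unsigned scheme)
{
    assert(tile < 16 && compressed < 2 && scheme < 8);
    return example_mod_code(EXAMPLE_VENDOR_ID,
                            (uint64_t)tile |
                            ((uint64_t)compressed << 4) |
                            ((uint64_t)scheme << 5));
}

static inline uint8_t example_mod_vendor(uint64_t mod)
{
    return (uint8_t)(mod >> 56);
}

static inline unsigned example_mod_tile(uint64_t mod)
{
    return (unsigned)(mod & 0xf);
}
```

Only properties that stay constant for the lifetime of the buffer belong in such a value. The practical approach suggested in the discussion is to enumerate the combinations that are actually used rather than reserving bits for every possible option.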
<ayaka_> for each of DRM-FORMAT-MOD-SYNA-V4H1-64L4-COMPRESSED frame, they would have a different compression options set
<emersion> +1 for an exhaustive documentation of all of the metadata you need and the reasons why you need it
<emersion> good, sima is already on the case :)
<sima> emersion, thx a lot :-)
<ayaka_> emersion, I don't know what those 8 bytes means actually, I just know they are dynamic compression values, are written to registers directly
<jani> sima: <mripard> jani: hey, do you know who's in charge of maintaining dim these days? I've had a PR stuck for months on gitlab
<sima> worst case we need a big table with the actually needed stuff, enumerated
<sima> jani, sounds like mripard volunteered :-)
<emersion> ayaka_: can you find out what they mean?
<jani> sima: :D
<emersion> where do they come from? who picks them?
<ayaka_> because they are 8 registers that you should read for a plane, the compression algorithm is not public
<emersion> but the values you write to the registers, where are they coming from?
<emersion> the algorithm is not public, but are the parameters for the algorithm public?
<ayaka_> from the video decoder hardware
<emersion> designing modifiers blindfolded doesn't sound like a very good idea to me
<emersion> i think we'll need more info here
<sima> javierm, I have vague recollections that we've finally added a generic "does this driver even support this format/modifier" test to addfb, but I can't find it?
<sima> am I dreaming?
<ayaka_> all I tell is if you miss one of that 8 set, the plane can't be uncompressed properly
<emersion> sima, that rings a bell
<mort_> robclark: drm is leaking like a sieve in 6.4.7 as well, here's an excerpt from the kmemleak output: https://p.mort.coffee/REk
<sima> emersion, iirc we've had to add it to the gem helpers because the generic code couldn't check things at the right place ...
<sima> but I seem to be extremely dense at git grep this morning
dos1 has quit [Ping timeout: 480 seconds]
<ayaka_> what is why I am talking about a buffer sharing mechanism for the devices from the same vendor
<emersion> sima, c91acda3a380bcaf41b67c8fbab668ef8ddf91c3
<sima> emersion, thanks a lot, you're awesome
<emersion> np <3
<sima> ayaka_, link to your mail because a quick search for modifier didn't yield much?
<emersion> ayaka_: what you're talking about seems a lot like what we had before explicit modifiers, and we're trying to move away from that
<sima> aside from I got sidetracked on other modifier stuff :-)
<ayaka_> emersion, maybe we could simplify this problem, for example, an compression pixel format with dynamic HDR metadata
<sima> yeah vendor specific hidden metadata is considered uncool these days
<emersion> i mean, your "same vendor import/export magic" stuff
<MrCooper> ayaka_: a modifier is a constant attribute of a buffer; something which changes every frame can't be part of the modifier but has to be in a plane instead
<sima> also this ^^
<ayaka_> MrCooper, yes, but it would lead to the performance issue
<MrCooper> well, there's no choice
<ayaka_> well, how to pass HDR metadata (it is not static, regard it as dolby vision), with a pixel format that has used 4 planes
<MrCooper> HDR metadata needs a separate channel I think, e.g. a Wayland protocol or KMS properties
<emersion> yeah
<emersion> note, the kernel only supports HDR static metadata so far
<emersion> but dynamic metadata would be somewhat similar
<ayaka_> kms properties may not be too bad, while it would be a problem to track which metadata belongs to which frame
dos1 has joined #dri-devel
<MrCooper> not with atomic KMS
<ayaka_> when you have lots of buffer instead of two
<MrCooper> just change the FB and metadata in the same atomic commit
mvchtz has joined #dri-devel
<ayaka_> well, I think the idea that sharing a whole frame between v4l2 and drm is not acceptable here. I would just implement it in the vendor kernel
<ayaka_> let the other vendors choose what they want
<emersion> why not do it the upstream way?
<ayaka_> I am not trying to hide any information here. KMS property is not too bad, unless you need to copy it from vdec hardware to kernel, kernel to userspace, then userspace to kernel, kernel to display hardware
<jani> mripard: merged now
<emersion> is the dynamic metadata that much data?
<ayaka_> in my plan, vdec to kernel(sharing drm frame structure) then kernel to display is enough
<emersion> static metadata is like a struct with 6 fields
<ayaka_> not that much, 64bytes for a 4 planes(2 graphics planes) frame
<emersion> that sounds very cheap
<emersion> trying to avoid the copies here sounds like a case of premature optimization
<ayaka_> while, you can't imagine how many buffers we would have
frieder has quit [Ping timeout: 480 seconds]
<emersion> you will have one new metadata blob per frame
<emersion> copying 64 bytes is trivial compared to all of the other stuff you need to do to display a frame
<ayaka_> in my ugly way, the userspace application didn't need to do any modifier
<ayaka_> do any modification
<emersion> your way may works well for your specific use-case with your specific hardware, but it doesn't work outside of this narrow scope
<emersion> may work*
<ayaka_> likes Gstreamer(although it doesn't drm modifier now) would work fine without set the properties for a pixel format
<ayaka_> well you see, in this tile and compression time, pixel format can't be used outside the same vendor
<emersion> as i said, "for your specific use-case with your specific hardware"
<ayaka_> also userspace access to it is not necessary even it is not a secure buffer
<ayaka_> I believe many vendor would do the same things
<emersion> i said before that i'm not interested in helping downstream vendor hacks
<ayaka_> I am trying not to do so
<emersion> do you understand that we can't add a new uAPI which only works for your specific case?
<ayaka_> Android's GKI won't affect the DRM interface, because drm always could have a userspace which only that should follow a standard
<mripard> jani: awesome, thanks :)
<ayaka_> we don't need drm uAPI to fix for GKI
<ayaka_> we could leave that frame buffer sharing thing aside
<ayaka_> I think I could still work on that Allocation Constraints
<emersion> yeah, we do need to work on that
RSpliet has joined #dri-devel
rgallaispou has joined #dri-devel
frieder has joined #dri-devel
<karolherbst> if I want to get a commit from (without cc stable or fixes or anything, just mentioning it fixes something in the commit description) drm-misc-next into kernel stable trees as fast as possible, what would be the proper steps?
<emersion> why not cc stable?
<karolherbst> because apparently some devs rely on the stable bot script to be smart enough
<karolherbst> anyway, it already got pushed to drm-misc-next, just wondering what's the proper path from there
<ayaka_> emersion, my case is little different, it is more about iommu. And for a frame buffer, the graphics plane could use iommu while the metadata could never do that
<ayaka_> while plane 0 and plane 1 could be contiguous, plane 1 and plane 2 could be not
<karolherbst> well.. it hasn't been merged into Linus' tree yet
RSpliet has quit [Ping timeout: 480 seconds]
<karolherbst> and I kinda don't want to wait those 6 weeks
<sima> karolherbst, cherry-pick to drm-misc-fixes with a sha1 reference to the -next commit (so that people aren't too confused why a commit shows up twice) and add the cc: stable there
<emersion> sounds like it should've been pushed to drm-misc-fixes instead
<karolherbst> sima: thanks
<_jannau__> karolherbst: that's not compatible with the stable tree
<sima> emersion, time machines still don't exist yet, hindsight and all that :-)
<emersion> lol
<sima> karolherbst, dim cite $sha1 or you'll piss off some checkers for the sha1 reference
<karolherbst> yeah.. I have my own `git fixes` alias which I think is doing exactly the same thing or something
<sima> karolherbst, it's a bit a fallout from the linux kernel's funky process of making the release branches the main tree :-/
<sima> karolherbst, with sha1 reference I meant the sha1 of the commit in -next
<sima> so that people realize the duplicated commit was intentional, not a mistake
<ayaka_> but which DMA-heap it should allocate from may not be the part of api
<karolherbst> yeah, I know
<karolherbst> ohhh
<karolherbst> but yeah
<sima> gregkh will still get grumpy, but we do this often enough in drm that it's not a big deal
<sima> karolherbst, for the other thing you have dim fixes $broken_sha1
<karolherbst> but you can also just use cherry-pick -x, no?
* ccr . o O ( is gregkh ever un-grumpy? )
<sima> which also tries to guesstimate whether you need cc: stable
<sima> karolherbst, yeah but that only adds the sha1, not with the proper lkml approved commit citation format
<karolherbst> ahh
<sima> should abbrev the sha1 and add the commit title, so that it's less ugly and easier for random downstream trees to find what commit sha1 that patch is in their tree
cmichael has joined #dri-devel
<daniels> ayaka_: 'unimaginably large buffers' still isn't enough reason to avoid 64b copies; if you have 100 buffers in flight, then the 100x64-byte cost really does vanish into the line noise of the 100*1920*1080*1.5 you're already moving around per frame
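The arithmetic behind that comparison is easy to check. The sketch below assumes the 1.5 factor means 1.5 bytes per pixel (an 8-bit 4:2:0 layout such as NV12) and uses the 100 buffers and 64 bytes from the message; the function names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes in one frame at 1.5 bytes per pixel (assumed 8-bit 4:2:0). */
static uint64_t example_frame_bytes(uint64_t width, uint64_t height)
{
    return width * height * 3 / 2;
}

/* How many times larger the pixel traffic is than the metadata traffic. */
static uint64_t example_pixel_to_metadata_ratio(uint64_t buffers,
                                                uint64_t width, uint64_t height,
                                                uint64_t metadata_per_buffer)
{
    uint64_t pixel_bytes = buffers * example_frame_bytes(width, height);
    uint64_t metadata_bytes = buffers * metadata_per_buffer;

    assert(metadata_bytes != 0);
    return pixel_bytes / metadata_bytes;
}
```

For 100 buffers of 1920x1080 this gives 311,040,000 bytes of pixel data against 6,400 bytes of metadata, a factor of 48,600, so the copies account for roughly 0.002% of the data moved.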
<karolherbst> sima: seems like most commits just have a (cherry picked from commit $full_hash) thing... would be a bit confusing to do it differently than anybody else. How did you mean to integrate the `dim cite` output into the commit message?
<sima> karolherbst, just replace the full length sha1 with the output of dim cite
<sima> and yeah maybe dim cherry-pick needs some fixing?
<sima> jani, rodrigovivi tursulin dolphin ^^ since that just used by drm-intel
<karolherbst> my point was rather, if you do `git log | grep "cherry picked"` you almost only see the full length one :D
<karolherbst> but yeah.. if we want the layout to be different I guess dim should be updated there
RSpliet has joined #dri-devel
<ayaka_> daniels, I would leave this alone. You could regard this as application don't want to add a vendor specific branch
<ayaka_> that is not the intel case, that they are hiding modifier in the kernel, while intel would tell you you are using nv12 pixel format
<daniels> intel don't hide modifiers
<ayaka_> besides, it won't be against GKI, as long as I add the whole buffer allocation and attach api to the v4l2 or possible v4l3. When we get rid of M-variant pixel format we need a method to allocate multiple planes buffer in a call
<ayaka_> From my experience with its gen 9 gpu, the modifier is not visual between vaapi and drm
<ayaka_> until gpu gen 11th, it changes
<sima> karolherbst, they're pretty much all drm-intel-fixes cherry-picks, which are done with dim
<sima> everyone else carefully rebases trees so that fixes never show up in -next
<emersion> oh. i finally managed to do a rst cross-document section reference
<sima> ayaka_, intel libva was seriously asleep at the wheel wrt modifier support
tristan has joined #dri-devel
<sima> they only realized that they have to fix things asap when intel dgpu support happened
<emersion> libva doesn't fully support modifiers yet even
<sima> and when they tried to add some more modifiers to the old legacy implicit thing
<sima> emersion, yeah it's a dumpster fire :-/
tristan is now known as Guest7771
<emersion> just use vulkan ^^
<sima> ayaka_, we've also fully sunset that implicit path on latest gpus (finally, took way too long!)
<sima> emersion, yeah probably the answer
<sima> otoh thinking of intel's libva team contributing to mesa vk ...
<ayaka_> sima, there are more vendor drivers you don't know, realtek is one
<ayaka_> if it is not GKI, I won't need to develop that v4l2
<emersion> aha
<emersion> who would be the maintainer even? airlied?
<karolherbst> sima: I see
<karolherbst> still a bit hesitant of doing things differently than all the others though :D
<daniels> ayaka_: yes, there are a lot of drivers who will need to make the effort to do it properly if they want to become part of upstream
<daniels> we've been here before with graphics - ADF was an Android alternative to KMS which tried to bypass uAPI requirements and let vendors just freeform stuff anything they wanted to in there - it's dead now
<jani> karolherbst: are there any non-intel "(cherry picked from ...)" in the logs though?
<karolherbst> yeah
<karolherbst> but it feels like 95% are from intel
<jani> karolherbst: we only use that for drm-intel-next -> drm-intel-fixes/drm-intel-next-fixes
<jani> it's an artefact of us always applying all the patches to drm-intel-next (or drm-intel-gt-next), and cherry-picking the fixes from there
<jani> gregkh has been grumpy about it, but nobody's ever outright told us not to do it either. it just helps a lot with the committer model
<karolherbst> is greg grumpy about doing that or about the format used?
<karolherbst> anyway, if the format should be changed we kinda should do it in dim I guess
<karolherbst> and I'd rather not add another style scripts might have to deal with if we aren't going to change dim as well
<jani> I don't think there were ever any complaints about the format git cherry-pick -x produces. it was mostly about the fact that the referenced sha1's don't exist in upstream kernel yet, only in our branches and linux-next. they'll only make it to upstream kernel after the merge window
<karolherbst> yeah... I can see that causing some confusion
<karolherbst> some throw in the branch from where that commit comes from
<karolherbst> so I think that would make sense to include
<karolherbst> e.g. `(cherry picked from commit 1f682dc9fb3790aa7ec27d3d122ff32b1eda1365 in wireless-next)`
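A minimal sketch of how a script might cope with both forms mentioned here: the plain note that `git cherry-pick -x` writes, and the variant with a trailing ` in <branch>`. The regex and the `parse_cherry_pick` helper are assumptions made up for illustration, not something dim or any kernel script is known to use.

```python
import re

# Matches "(cherry picked from commit <sha>)" as written by `git cherry-pick -x`,
# plus the optional " in <branch>" suffix quoted above.
CHERRY_RE = re.compile(
    r"^\(cherry picked from commit (?P<sha>[0-9a-f]{7,40})"
    r"(?: in (?P<branch>[^\s)]+))?\)$"
)

def parse_cherry_pick(line):
    """Return (sha, branch_or_None), or None if the line is not a cherry-pick note."""
    m = CHERRY_RE.match(line.strip())
    if m is None:
        return None
    return m.group("sha"), m.group("branch")
```

A parser written this way accepts either style, so adding the branch name would not by itself break tooling that only needs the sha.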
<jani> right
<jani> idk a lot of the time it just feels like flying below the radar is the best option, not make a fuss about it :p
<karolherbst> yeah.. so using a different format all of a sudden feels like doing the exact opposite :P
<jani> heh
<jani> there's also the never ending debates about Link: etc
<karolherbst> I wish all that stuff would be way more consistent across subsystems
<ayaka_> daniels, you can see many vendor drivers don't use drm at all
<karolherbst> what is there to debate about Link:?
<karolherbst> :D
<karolherbst> shouldn't have called it Link if you don't want random Links
<jani> well, some say it's just wrong to use it like dim does, i.e. adding a Link: back to the patch
<jani> some add Link: liberally to just about anything that looks like a link, but *also* to non-URLs
<karolherbst> as long as Linus merges it it can't be wrong (or something) :P
<jani> see the bit about staying below the radar ;)
<daniels> ayaka_: sure, they don't have to use DRM/KMS, but then they don't get in mainline kernels, so they're not our problem
<ayaka_> I just want to say I am not the worst guy as a developer for the vendor. The only barrier is the Android's GKI
<ayaka_> all I could do is offering a not too vendor specified method that could draw them back to use drm
<ayaka_> if there is not better option, I just leave it alone. We could work on what we could make progress
<sima> ayaka_, does android's gki allow you to shovel drm modifiers through at least, or not even that?
<ayaka_> sima, I could say even what intel does is allowed
<sima> intel doesn't do the implicit thing anymore
<sima> and we're pretty much removing it everywhere else too for new platforms for existing drivers
<ayaka_> why would that cause a problem? it didn't break the code that all drm drivers are sharing
<sima> and new drivers in general don't get it
<sima> ayaka_, the goal of upstream is to build an ecosystem
<ayaka_> what would cause a problem?
<sima> every vendor doing their own thing hurts that pretty fundamentally, so step-by-step we're replacing these vendor tricks
<sima> and you could argue that the drm design for formats/modifiers is bad and a pain, but it's kinda 10 years too late for that argument
<sima> so unless it's a case of "it cannot work" we'll keep with the current thing
<ayaka_> yes, I know. also we could only have 4 planes
<sima> because rolling out these ecosystem changes takes decades
<sima> allowing more planes is a fairly minor change, I /think/ all the vk/gl extensions would allow that already
<emersion> sima, i assume this patch needs at least one ack from a drm person? if so, any chance you could ack? https://patchwork.freedesktop.org/patch/547819/
<ayaka_> well, that could help, we could hide those cpu cache data in plane 4 and plane 5
<ayaka_> but I don't think this option would be available until next EGL spec update
<emersion> also, if anyone is up for doc review, this one still needs attention: https://patchwork.freedesktop.org/patch/547783/
<sima> emersion, a-b: me on both
<emersion> ty
<sima> ayaka_, yeah if we have to rev a bunch of extensions then that's a bit annoying
<emersion> lol patchwork added a literal "Acked-by: me" line
<sima> :-)
<ayaka_> I believe I would still need to deal with those old code in next 5 years. Besides, we still need to solve the allocation problem at least
<sima> ayaka_, another trick that we iirc used for afbc is that sometimes the planes have a fixed layout
<sima> like nv12
<sima> and so logically it's multiple planes, but you only need one plane slot to describe the buffer
<sima> since I think afbc had the "we need more than 4 planes" issue too
<sima> ayaka_, which allocation problem?
<ayaka_> unfortunately, we can't
<emersion> sima, the unix allocator
<ayaka_> two planes need to allocate from the shm not dma
<emersion> placement, alignment, etc
<ayaka_> sima, even for a NV12 vendor tiled here, there is a padding line between y and uv plane
Guest7771 has quit [Read error: Connection reset by peer]
<ayaka_> also the address must be page aligned
heat_ has joined #dri-devel
<sima> yeah that's not nv12 anymore ...
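A rough sketch of why a fixed layout needs only one plane slot, and why the vendor variant described above stops being plain NV12. It assumes 8-bit NV12 with even dimensions; `pad_lines` and `plane_align` are hypothetical knobs standing in for the padding line and page alignment mentioned, not parameters of any real API.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment

def nv12_layout(width, height, stride=None, pad_lines=0, plane_align=1):
    """Return (y_offset, uv_offset, total_size) in bytes for an 8-bit NV12 buffer.

    pad_lines and plane_align are hypothetical knobs standing in for the
    vendor-specific padding and alignment mentioned above.
    """
    if stride is None:
        stride = width  # one byte per luma sample, no row padding
    y_size = stride * height
    # Chroma plane: interleaved Cb/Cr, half the rows, same stride as luma.
    uv_rows = (height + 1) // 2
    uv_offset = align_up(y_size + pad_lines * stride, plane_align)
    total = uv_offset + stride * uv_rows
    return 0, uv_offset, total
```

With the defaults, the chroma offset follows entirely from width, height and stride, so a single plane description is enough. Once padding or alignment is added, the offset can no longer be derived without extra information.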
<ayaka_> I forgot to mention the secure session id property, although it is not a part of pixel format like the (un)compression options
<sima> uh those are a mess, we only have I think 2 drivers that support secure buffers
<sima> and only very limited use-cases
<sima> (in upstream)
<sima> so that stuff is handled as part of the per-vendor gem render uapi right now
<ayaka_> it is thing we used to encrypt and decrypt the memory context to prevent memory frozen attacking
<sima> yeah i915.ko and amdgpu.ko support memory encryption like that too
<ayaka_> but you could still use the generic render api
<sima> but those buffers aren't shareable
<sima> there's no generic drm render api
<sima> mripard, most of the basic plumbing is (drm) infoframes helpers already
<ayaka_> why secure buffer is not shareable, it is ok for the user to know its fb_id
<ayaka_> that would be useful for audio and video sync
<sima> so not sure how much more you'd want to extract from i915, at least without 1-2 more drivers to show what's actually common and what not
<sima> ayaka_, shareable across drm drivers I mean
<ayaka_> so if we could solve this cross sharing fb_id, you could make a function generic
<ayaka_> into the sunlight
<sima> fb_id ... what do you mean with that one?
<ayaka_> well, both intel and amd are vaapi
<ayaka_> the fb_id is what drm present a whole frame buffer(with all its plane)
<sima> afaik the secure buffer stuff only works with egl extensions (unless you grab some proprietary vaapi with some vendor extensions)
<sima> fb_id aren't shareable at all
<ayaka_> nope, it could work with v4l2 stateful api, also stateless if they listen to my design first
<emersion> fb_id is tied to KMS, it's not a good basis for cross device sharing
<sima> emersion, I'd be impressed if you manage to :-)
<ayaka_> I know your point here. what I am trying to share is the per-vendor struct that drm uses to present a framebuffer
<emersion> IOW, one may want to do cross-device sharing without KMS involved
<sima> ayaka_, ah yeah that makes sense, since these drm fb metadata pieces are also used by vk/gl extensions
<emersion> sima, i mean, it would always be _possible_ to add a KMS FB file descriptor
<ayaka_> I know fb_id is only unique to a device not cross the drm device
<sima> unfortunately neither vaapi nor v4l understand them fully, and the patches to fix that have been stuck for years :-(
<sima> emersion, could just share the drm kms fd and yolo ...
<emersion> aha
<emersion> who cares about races
tristan has joined #dri-devel
<sima> but more seriously, drm fb is meant to be a pure metadata container, there's really no point in sharing that object itself :-)
tristan is now known as Guest7774
<sima> ayaka_, another extension idea would be to add properties to drm_fb, so that you could add all kinds of extensions
<emersion> i think ayaka's point is that there would be value in a share-able metadata container, that way it's easier to add new metadata fields without plumbing the world
<sima> like entire drm blobs for big amounts of metadata
<daniels> (unless your goal is to create a side-channel you can stuff tons of opaque data into - a design we've consistently rejected in the past)
<daniels> heh
<sima> doesn't solve the issue of how to pass it around in userspace
<ayaka_> sima, yes, we have talked about this, too many copies
<emersion> but i don't know if it's a good or bad idea
<sima> ayaka_, I don't buy that, unless you can show the overhead
<ayaka_> and that lead to vendor branch code as that
<sima> like atomic ioctl is extremely non-optimized, and thus far no one cared
<emersion> :)
<emersion> hm, well…
<emersion> i do care ;_;
<emersion> the core DRM part seems fine
<emersion> the driver-specific part causes issues
<sima> emersion, which parts?
<emersion> like, i miss frames on amdgpu if i try to do a few test commits
<sima> uh yeah that's a bit much overhead
<ayaka_> blob id is an option, but that can't be shared
<sima> emersion, for simple updates iirc i915 gets a few k commits/s or so
<emersion> their bw computation code takes ages in some cases
<sima> emersion, hwentlan_ ^^ I guess you know?
<emersion> iirc vsyrjala had a benchmark patch for i915
<ayaka_> and I don't like the idea that we need to call ioctl() first to get its size then let the kernel fill the buffer in second ioctl()
<emersion> hm i'm pretty sure i had an issue for this, but can't find it back
<ayaka_> let's back to the secure buffer case, that may be a good case
<sima> emersion, amdgpu is also pretty bad because they still have a big split between drm and dc data structures
<sima> so for a _lot_ of things they need to grab all the states and recompute everything
<emersion> oh yeah
<sima> which is not going to be great, but also just a bit a result of their currently still too big impedance mismatch between drm and dc
<ayaka_> I allocate a framebuffer, it could contain several planes, but people can't access a plane independently, besides I don't want people to know its physical address (It doesn't)
<emersion> clearly we need more abstraction layers to make the thing easier to understand :P
<emersion> i mean, i understand why it's like this but…
<sima> emersion, nah just moving more of the dc state into drm states
<ayaka_> but for a secure buffer, it could be a buffer id that sharing between REE and TEE, that only TEE know where the buffer is
<sima> like was done more on the object side of things
<sima> essentially demidlayer it, so that the dependent state recomputation can be partial
<emersion> maybe if dc had more of a collection of helpers design, rather than midlayer… it would blend better
<sima> yeah
<sima> but it's a ton of work to get there
<emersion> indeed
<sima> also I think i915 does a lot of tricks of only validating against current fifo settings
<sima> and recompute optimal ones only later or when really needed
<sima> that keeps cascading computations in atomic_check at bay
frieder has quit [Ping timeout: 480 seconds]
<ayaka_> btw, I didn't get what those dma fence or sync object are used for in drm, that is because drm doesn't have a queue(ping-pong is not a queue) and gpu(display) could scan out a buffer while the render is also updating it?
<emersion> it's for explicit sync
<emersion> once again removing implicit stuff
<emersion> explicit sync is a requirement for Vulkan and Android
<ayaka_> yes, I know android has this before vulkan in egl. I just don't get it why we need a sync point here that our v4l2 driver didn't really care
<ayaka_> I know what sync event is used for opencl because parallel group working
<mort_> https://p.mort.coffee/npO.diff this is one way to fix a kernel memory leak...
frieder has joined #dri-devel
bmodem has joined #dri-devel
<emersion> ayaka_: GPU rendering is asynchronous
<emersion> i don't know about video en/decoding
<ayaka_> video have the similar idea like video slice or tiles
<ayaka_> we could only decode like only few top lines to damage an area
<ayaka_> that is maybe useful for high frequent rate display and video
<emersion> by "asynchronous", i mean that submitting a GPU command buffer does not wait for completion
<emersion> so, if you draw, then read, the draw might not be complete by the time you read
<ayaka_> yes, I know, if we wait until it finishes, userspace needs to do a flush
<emersion> no, a flush is not enough
<emersion> a flush only ensures that the GPU command buffer has been submitted to the hw
<ayaka_> there are front and back surface in a EGL surface
<emersion> glFinish() will wait for completion
<emersion> glFlush() will not
<ayaka_> yes glFinish() cost a lot
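The flush/finish distinction above can be illustrated with a toy model. This is only a sketch: `ToyGpu` and its methods are invented for illustration and do not correspond to any GL or driver API; the "hardware" advances only when `hw_step()` is called.

```python
from collections import deque

class ToyGpu:
    """Toy model: commands are recorded, submitted, and executed in separate steps."""

    def __init__(self):
        self.recorded = []        # built by the application, not yet submitted
        self.submitted = deque()  # handed to the "hardware", not yet executed
        self.completed = []       # executed

    def draw(self, name):
        self.recorded.append(name)

    def flush(self):
        """Submit recorded commands; returns without waiting for execution."""
        self.submitted.extend(self.recorded)
        self.recorded.clear()

    def hw_step(self):
        """The hardware executes one submitted command at its own pace."""
        if self.submitted:
            self.completed.append(self.submitted.popleft())

    def finish(self):
        """Submit everything and block until the hardware has executed it all."""
        self.flush()
        while self.submitted:
            self.hw_step()
```

After `flush()` the commands are queued but `completed` is still empty, which is why reading back right after a flush may see stale data; only `finish()` (or a fence) guarantees completion.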
Net147 has quit [Quit: Quit]
Net147 has joined #dri-devel
<ayaka_> I may need to look into its api. I was wondering when we get a front surface, then commit it to kms, if that front surface is not complete, what should the driver do?
<ayaka_> using the fb_id in the previous atomic commit?
Guest7774 has quit [Ping timeout: 480 seconds]
<emersion> if the buffer submitted to KMS is not ready at vblank time, the previous frame is re-used, yes
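A small simulation of that behaviour, as a sketch under simplifying assumptions: time is in arbitrary units, frames are queued strictly in order, and `scanout_sequence` is a made-up helper, not a KMS interface.

```python
def scanout_sequence(ready_times, vblank_period, num_vblanks):
    """Return the index of the frame shown at each vblank (None before the first).

    ready_times[i] is when frame i finishes rendering; frames are assumed to be
    queued in order, one at a time.
    """
    shown = []
    current = None   # frame currently being scanned out
    pending = 0      # next frame waiting for its rendering to complete
    for n in range(1, num_vblanks + 1):
        now = n * vblank_period
        if pending < len(ready_times) and ready_times[pending] <= now:
            current = pending   # rendering done: flip to the new frame
            pending += 1
        shown.append(current)   # otherwise the previous frame is shown again
    return shown
```

For example, with frames ready at times 10, 40 and 45 and a vblank every 16 units, frame 0 is shown twice because frame 1 misses the second vblank.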
f11f12 has joined #dri-devel
flto_ has quit []
flto has joined #dri-devel
pochu_ has joined #dri-devel
pochu has quit [Ping timeout: 480 seconds]
donaldrobson has quit [Ping timeout: 480 seconds]
donaldrobson has joined #dri-devel
<mripard> sima: it looks like I can't find a way to express my point on that either, so I'll just stop. I don't want to create any unnecessary frustration or tension there, and the discussion is way out of scope for that particular patch series
<sima> mripard, I think I get your point, I just don't agree
<sima> getting hdmi is a big dumpster fire and took years for i915
<sima> but throwing in the towel and declaring the entire issue a userspace problem doesn't help, because userspace is in an even worse position to handle things correctly
<mripard> that's not what I was saying
dviola has quit [Quit: WeeChat 4.0.2]
<Venemo> anholt: how do I tell deqp-runner how many threads I want it to use?
<sima> mripard, I thought this was specifically about i915 having hdmi infoframe code that no one else does?
<mripard> I guess what I was saying is that "well, i915 is fixed so we should just ignore it" is kind of throwing the towel as well
<pendingchaos> Venemo: "-j n" or "--jobs n"
<Venemo> thanks
<mripard> anyway, yes, it's a dumpster fire, it will probably take a long time to fix for all the other drivers as well, and you made it clear that margins are not the right solution and we should fix every thing else
<mripard> so I guess we agree on the most important things there
<mripard> the rest are technicalities
<sima> mripard, definitely didn't want to get "i915 is done, it's all good" across
<sima> just wanted to highlight an example of where hdmi drivers probably would need to be, since the hardest part was figuring out what's all needed
<sima> replicating and extracting more helpers should be much easier
kts has joined #dri-devel
<sima> mripard, I was honestly surprised that aside of i915 no one else seems to use these infoframe helpers fully
yyds has quit [Remote host closed the connection]
itoral has quit [Quit: Leaving]
JohnnyonFlame has joined #dri-devel
JohnnyonFlame has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
JohnnyonFlame has joined #dri-devel
kts has quit [Quit: Leaving]
digetx has quit [Ping timeout: 480 seconds]
digetx has joined #dri-devel
<dliviu> hello, drm-misc-next maintainership question: I have a patch that did not Cc dri-devel, only sima and airlied directly. Noticed when trying to apply with dim. What is the protocol? Should I ask to resend?
Ahuj has joined #dri-devel
Danct12 has quit [Quit: WeeChat 4.0.2]
YuGiOhJCJ has quit [Ping timeout: 480 seconds]
tristan has joined #dri-devel
tristan is now known as Guest7783
<alyssa> jenatali: windows runners were down yesterday (IDK if they still are) so couldn't test windows with nir 2.0 MR, you might want to give that a smoke test
<jenatali> alyssa: Pretty sure I clicked play on the Windows jobs on at least one pipeline
<alyssa> :+1:
YuGiOhJCJ has joined #dri-devel
Company has joined #dri-devel
<alyssa> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24432 really earns its "all jobs" CI pipelines :P
<jenatali> Ugh, the D3D jobs depend on clang-format now?
<jenatali> That's a huge pain
Armada has quit [Remote host closed the connection]
Armada has joined #dri-devel
<jenatali> I need to undo that, there's no good reason to trigger a Linux container just to run Windows jobs and it means I can't run just the jobs I want with one click from the UI anymore
<daniels> jenatali: fwiw, .gitlab-ci/bin/ci_run_n_monitor.py --target dozen-deqp is what you want
<jenatali> Yeah but as I've said, Windows doesn't have a way of storing tokens for that script which makes it a huge hassle every time I want to use it
<jenatali> Which I guess I could try to fix that instead, but also I don't think there's value in depending on clang-format for the Windows jobs
* alyssa regrets putting clang-format in CI
<jenatali> Unless there was a separate clang-format job that ran on the Windows runner, which I also don't think is valuable
godvino has joined #dri-devel
frankbinns has quit [Remote host closed the connection]
godvino has quit [Quit: WeeChat 3.6]
vliaskov__ has quit [Ping timeout: 480 seconds]
Guest7783 has quit [Ping timeout: 480 seconds]
yuq825 has left #dri-devel [#dri-devel]
YuGiOhJCJ has quit [Quit: YuGiOhJCJ]
<javierm> sima: sorry, I missed your message before because I was on PTO, but I see that emersion already answered
<hwentlan_> sima, emersion, yes, I'm aware
heat_ has quit [Read error: Connection reset by peer]
heat has joined #dri-devel
jkrzyszt_ has quit [Ping timeout: 480 seconds]
tristan__ has joined #dri-devel
Haaninjo has joined #dri-devel
<MrCooper> emersion sima hwentlan_: there's https://gitlab.freedesktop.org/drm/amd/-/issues/1740 but it was timeout-closed by Mario
<emersion> ah yes that one
<hwentlan_> yeah, that ticket is still valid
<emersion> i really don't like that auto-close policy
<emersion> i spent a lot of time collecting info about bugs and then it all goes to the void
<hwentlan_> but the reason most things takes long is because they go to our DML (display mode lib) for bandwidth computations which is massive and therefore slow
<hwentlan_> it's not an easy problem to solve
<emersion> yeah…
<sima> yeah 2ms for 5 test_only atomic calls is a bit much ...
<sima> well it just seems to be one that's really bad
<emersion> yeah, really depends what you test for
<hwentlan_> is this mostly about enabling planes?
donaldrobson_ has joined #dri-devel
donaldrobson has quit [Ping timeout: 480 seconds]
heat has quit [Remote host closed the connection]
<hwentlan_> one could probably pre-compute certain common scenarios and cache them, but it wouldn't be pretty and would be sub-optimal since you can't pre-compute everything
heat has joined #dri-devel
<sima> hwentlan_, yeah a bit of grepping in the attached dmesg and it's only plane changes nothing else
<sima> also only one crtc
<sima> so should be a substantial subset of the overall atomic state and I have no idea how you managed to burn down over a ms on computing stuff with that
<MrCooper> https://gitlab.freedesktop.org/drm/amd/-/issues/2186 seems like higher priority though; can make moving the mouse cursor very painful on my gaming rig
<sima> but iirc amdgpu dc is pretty aggressive in escalating to "grab all states, recompute everything"
<sima> the 0.1ms is more in line with "atomic ioctl is just not very fast" stuff I think
f11f12 has quit [Quit: Leaving]
TMM has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
TMM has joined #dri-devel
<sima> for that I'd expect that profiling is needed to make sure you knock out the right inefficiencies, since they're absolutely everywhere
<sima> maybe with vkms with some planes and just pushing through test_only commits as fast as possible
fxkamd has joined #dri-devel
<robclark> mort_: hmm, well apq8016 has legacy cursor.. and is using disp/mdp instead of disp/dpu.. could be something related to one of those two things? Maybe try sw cursor to rule out something cursor related?
<hwentlan_> MrCooper, I agree
heat_ has joined #dri-devel
heat has quit [Remote host closed the connection]
sgruszka has quit [Ping timeout: 480 seconds]
angerctl has joined #dri-devel
<mort_> robclark: I don't know if I'm configuring X correctly .. but I tried to enable swcursor, and it doesn't seem to have stopped the leak
<mort_> I created a /usr/share/X11/xorg.conf.d/20-swcursor.conf with: Section "Device" \n\t Identifier "Card0" \n\t Option "SWCursor" "true" \n EndSection, which should turn on software cursor from what I can tell
Namarrgon has quit [Ping timeout: 480 seconds]
<robclark> if you enable some drm.debug traces, you should see mouse cursor (while nothing else is changing on screen) generate drm traces if hw cursor is used, but not otherwise.. or otherwise maybe check /proc/interrupts (hw cursor updates will enable vblank irq)
<MrCooper> I'd check the Xorg log file for whether the option is actually taking effect
<robclark> ahh, yeah, that might be easier
<mort_> the log does contain '(**) modeset(0): Option "SWcursor" "true"' so yeah, that's taking effect
rasterman has quit [Quit: Gettin' stinky!]
cmichael has quit [Ping timeout: 480 seconds]
fxkamd has quit []
fxkamd has joined #dri-devel
<robclark> mort_: ok.. then in theory you would see this with kmscube? That might be an easier/simpler thing to debug.. plus it can use either legacy pageflip like xf86-video-modesetting and atomic ioctl
<mort_> robclark: kmscube says "failed to set mode: permission denied", or, if I run it with -A, it says "failed to commit: permission denied"
<mort_> that's running without either a window manager or a compositor, if that matters
<robclark> kill xorg first, if you haven't
<mort_> ohh I assumed this was an X application
<robclark> nope
<javierm> mort_: no it's just a kms app. https://gitlab.freedesktop.org/daniels/kms-quads is another nice KMS test app
<mort_> it's leaking both by default and with -A
<robclark> hmm, ok
<mort_> as a hack, could I increment the crtc refcount in the alloc and then just plain kfree it in atomic_state_default_release, or is it meant to still be used after the release
frieder has quit [Remote host closed the connection]
<robclark> it is needed after the release if someone is waiting on one of the completions
tristan__ has quit [Remote host closed the connection]
tristan__ has joined #dri-devel
<robclark> looking at the kmemleak hexdumps.. it is leaking a single reference (the kref starts at the 9th byte)
<mort_> I have also found with my prints that the leaked commits end up with a refcount of 1
yyds has joined #dri-devel
<mort_> robclark: I am happy to help debug this if you want, and help debugging it would be appreciated. However, until and unless we figure it out, I will work around it by having a pool of allocated commit objects which I cycle through and re-use, so I don't think I will work on this myself outside of getting information for you
<sima> ... catching up: are we leaking struct drm_atomic_state?
<mort_> yeah
<robclark> sima: https://p.mort.coffee/REk fwiw
<mort_> with the msm drm driver, on my hardware (but not robclark's), on recent kernels (both 6.1.34 and 6.4.7 are tested), a drm_crtc_commit structure is leaked from drm_atomic_helper_setup_commit every flip
MrCooper has quit [Remote host closed the connection]
tristan__ has quit [Ping timeout: 480 seconds]
<sima> hm yeah that's leaking like a sieve :-/
<robclark> I'm not seeing anything else leaked, so it isn't like we are leaking crtc state or something
<mort_> yeah
<sima> well anything in there is still referenced I assume
<robclark> could be mdp5 vs dpu.. but driver itself doesn't touch the commit obj
<sima> it's a bit of work, but might be worth it to wire up ref_tracker.h support
<sima> ofc the thing is gloriously undocumented :-(
Duke`` has joined #dri-devel
MrCooper has joined #dri-devel
ndufresne has joined #dri-devel
<mort_> sima: see https://p.mort.coffee/DVX, I manually traced inc/dec :p
<dliviu> asking again: is it OK to submit a patch in drm-misc-next that doesn't have a link to patchwork because dri-devel was never Cc-ed?
<sima> oh it's struct drm_crtc_commit
<mripard> we had a similar issue for vc4 at some point, but I can't retell how we fixed it up https://github.com/raspberrypi/linux/issues/4474
<mripard> hopefully it will help :)
<mripard> looking at the offending patch, I think we were doing a get in setup_commit and one in destroy_state
<mripard> but we were also doing a get in duplicate_state
<mripard> so the refcounting was off
bmodem has quit [Ping timeout: 480 seconds]
MajorBiscuit has quit [Ping timeout: 480 seconds]
kzd has quit [Ping timeout: 480 seconds]
<sima> mort_, robclark I think I got it
MrCooper has quit [Quit: Leaving]
<mort_> sima: ooo, interesting, do tell
<sima> broken since years it seems
<sima> 2017 is when we added refcount for the plane_state->commit pointer
jfalempe has quit [Quit: Leaving]
<sima> mort_, testing would be much appreciated, if it checks out I'll bake it into a proper patch
<sima> took me a while to load all the stuff into my brain again, since for a simple flip we have like 5 references floating around ...
<mort_> fwiw, this patch also "fixes" (or, well, works around) it: https://p.mort.coffee/P4n.diff, I'm running that right now and it's not leaking yet graphics is working
<mort_> but I will check out that patch, it's a much more proper fix
<sima> mort_, yeah that's just very dangerous duct-tape
<mort_> indeed
<sima> I'm pretty sure I've found it, because reconstructing the refcount leak with your log pointed me at exactly the code that was buggy
<sima> mdp5_plane_destroy_state wasn't dropping the refcount for plane_state->commit like it should
<sima> and that's been broken since 21a01abbe32a roughly
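The kind of leak described here can be modelled in a few lines. This is a sketch only: `Commit`, `PlaneState` and the functions below are hypothetical stand-ins, not the kernel structures or the actual mdp5 code.

```python
class Commit:
    """Stand-in for a refcounted commit object (hypothetical, not the kernel struct)."""
    def __init__(self):
        self.refcount = 1   # reference held by the creator
        self.freed = False

    def get(self):
        self.refcount += 1
        return self

    def put(self):
        self.refcount -= 1
        if self.refcount == 0:
            self.freed = True

class PlaneState:
    def __init__(self):
        self.commit = None

def setup_commit(plane_state, commit):
    plane_state.commit = commit.get()   # plane state takes its own reference

def destroy_state_buggy(plane_state):
    plane_state.commit = None           # pointer dropped, reference never released

def destroy_state_fixed(plane_state):
    if plane_state.commit is not None:
        plane_state.commit.put()        # release the plane state's reference
        plane_state.commit = None

def run_flip(destroy_state):
    commit = Commit()
    state = PlaneState()
    setup_commit(state, commit)
    destroy_state(state)
    commit.put()                        # creator drops its reference
    return commit
```

With the buggy destroy path the object ends each flip with a refcount of 1 and is never freed, which matches the refcount of 1 seen on the leaked commits earlier in the discussion.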
MrCooper has joined #dri-devel
<mort_> the fix looks very logical
FireBurn has quit [Quit: Konversation terminated!]
yyds has quit [Remote host closed the connection]
benjaminl has joined #dri-devel
<robclark> sima: oh, good catch, looks like mdp5 missed conversion as more stuff got added to plane state
<daniels> emersion: so apparently I don't know how to work git-send-email anymore, but thanks for the necromancy https://lists.freedesktop.org/archives/dri-devel/2023-August/417333.html (cc sima, anyone else interested in dmabuf/uapi)
<emersion> ohhhhhhhhh
rgallaispou has left #dri-devel [#dri-devel]
<emersion> <3
<emersion> i've linked that doc patch *so many times*
<daniels> I'd like to say two years is a new low, but it's probably not :\
<sima> emersion, you'll do the honors of applying it?
<emersion> i'll read it up and, yeah, apply it
<javierm> daniels: the glossary section is very useful!
<emersion> ah the missing cc's
Ahuj has quit [Ping timeout: 480 seconds]
<daniels> javierm: you can thank pq for that! I just copy & pasted and added like two things
<javierm> daniels: cool, thanks pq :)
<daniels> emersion: yeah, turns out if you use git-format-patch so you can write revision notes, then use git-send-email to send those, the Cc in your cover letter only applies to the 0/n, not to the individual patches
<daniels> it also turns out that it won't cc everyone on the thread if you didn't bother telling it to
<emersion> git send-email --annotate
<emersion> in DRM also we put the CC list in the commit message
<javierm> daniels: I've moved to patman almost a decade ago and never looked back
<emersion> git send-email will add everybody automatically
<emersion> but yeah, should really write a git-send-email alternative which doesn't suck
<sima> apparently it's called b4 or something
<javierm> emersion, sima: did you ever try patman? https://u-boot.readthedocs.io/en/latest/develop/patman.html
<_jannau__> another one? patman and b4 are already much better
<emersion> b4 seemed really tied to the kernel workflow last time i looked
<sima> emersion, I think the only thing we might want to add in a follow-up is that for non-linear format everyone needs to use the stride computation like drm has for that format/modifier combo, or things break
<sima> that's one thing that has led to really long threads in the past
<emersion> i just want a stateful tool which remembers version/cc/etc depending on the branch i'm in
<emersion> ah, yeah, good point
<daniels> javierm: I've moved to GitLab
<daniels> emersion: from what I've seen of patman, it's the closest thing indeed to that
<daniels> emersion: TIL --annotate!
<_jannau__> b4 depends on email based workflow since it keeps the metadata in an empty commit
<sima> s/impliied/implied/
<sima> emersion, ^^
<emersion> i'll need to check these tools out again
<javierm> daniels: I'm part of the patman evangelism strike force :P
<javierm> having all the patches metadata in the commit messages and just run patman is amazing. And you even have a --dry-run option
<daniels> yeah, patman is definitely the right thing for email workflows
djbw has joined #dri-devel
<sima> emersion, a-b: me on both, just finished reading
<emersion> cool
<sima> Care and attention should be taken to ensure that
<sima> + zero as a default uninitialized value signals no modifier.
<sima> ^^ maybe add ", and must not be accidentally mixed up with DRM_FORMAT_MOD_LINEAR, which equals zero"
<sima> but fine either way
<emersion> hm, zero does not mean "no modifier" then?
<emersion> also zero is not "uninitialized"
<sima> I guess it depends, some interfaces guarantee that no modifier is signalled with MOD_INVALID
<sima> some have an out-of-band flag (like addfb2/getfb2)
<emersion> yeah
<sima> I thought the note is just to make sure that you don't accidentally mix things up since it's confusing
<ayaka> I watched XDC 2022 | Explicit Synchronization for Linux Display Servers
<sima> maybe we should clarify this in a follow up
bgs has joined #dri-devel
<daniels> that's ... not what I meant to write
<ayaka> I think the sync point is only used to signal an event like a frame being rendered (by the GPU), it can't be used to signal that a partial render is done?
<daniels> missing some words I think
<daniels> something like 'Care and attention should be taken to ensure that zero (as a default uninitialized value or boolean comparison) is **not** confused with no modifier.'
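The pitfall can be shown concretely. The two constants below carry the values defined in drm_fourcc.h; the `describe_modifier*` helpers are invented for illustration and model an interface where DRM_FORMAT_MOD_INVALID signals "no modifier" (interfaces with an out-of-band flag would look different).

```python
DRM_FORMAT_MOD_LINEAR = 0
DRM_FORMAT_MOD_INVALID = 0x00FFFFFFFFFFFFFF

def describe_modifier_buggy(modifier=0):
    # Bug: treats zero as "nothing here", but zero is DRM_FORMAT_MOD_LINEAR.
    if not modifier:
        return "no modifier"
    return f"modifier {modifier:#x}"

def describe_modifier(modifier=DRM_FORMAT_MOD_INVALID):
    # The sentinel is compared explicitly; zero stays a real modifier.
    if modifier == DRM_FORMAT_MOD_INVALID:
        return "no modifier"
    if modifier == DRM_FORMAT_MOD_LINEAR:
        return "linear"
    return f"modifier {modifier:#x}"
```

The buggy variant silently reports an explicitly linear buffer as having no modifier, which is the kind of confusion the note warns about.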
<daniels> total sense inversion ffs
<ayaka> besides why we use IN_FENCE_FD? should we send the framebuffer to an atomic request only after the userspace have been notified?
<emersion> i'd use some other word than "uninitialized" i think
<daniels> ('don't be confused! do the opposite of what you should!')
<daniels> emersion: just 'default' or?
<emersion> "uninitialized" in C means that the contents can be anything
<daniels> ayaka: in gfx, fences are only ever used to signal full completion of a frame. there would be no point in giving a Wayland server a fence which signaled when half the frame was complete
<ayaka> daniels, then why amd is trying to implement async page flip
<daniels> ayaka: they're totally different things
<ayaka> async page flip is scan out the rest of part with a new frame
<daniels> emersion: I guess I meant 'explicitly initialised but to all-zero as a default value that's totally safe as a "nothing here" sentinel for everything except FDs and modifiers'
<daniels> emersion: but that seems overly wordy
<daniels> ayaka: yes, which is different to beginning to scan out something before anything's been rendered to it
<emersion> "zero as a default value when omitted"?
<ayaka> then why I can't flush the top part with a new frame for example I want to display a large graphics while its bottom is not decode yet
<emersion> or just "as a default value"?
tobiasjakobi has joined #dri-devel
<emersion> but yeah, i'm nitpicking here :P
donaldrobson_ has quit [Remote host closed the connection]
<daniels> emersion: yeah, I think just 'as a default value' is probably easiest?
<emersion> wfm
<daniels> or 'default or initial'
<daniels> wanna just fix up locally or should I resend?
<emersion> yeah, initial sounds good to me
<emersion> i can fix up locally
<daniels> ayaka: you can if you want, it's just that very few people want that
<daniels> emersion: thanks!
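The wording above is easier to follow with concrete values. A minimal sketch, assuming the patch under discussion concerns DRM format modifiers: the constants mirror `fourcc_mod_code()`, `DRM_FORMAT_MOD_LINEAR` and `DRM_FORMAT_MOD_INVALID` from `drm_fourcc.h`, while the two `describe_modifier*` helpers are hypothetical and exist only for illustration.

```python
# Constants mirror drm_fourcc.h; the helper functions below are hypothetical.
DRM_FORMAT_MOD_VENDOR_NONE = 0
DRM_FORMAT_RESERVED = (1 << 56) - 1


def fourcc_mod_code(vendor: int, val: int) -> int:
    """Pack a vendor byte and a 56-bit value into one 64-bit modifier."""
    return (vendor << 56) | (val & 0x00FFFFFFFFFFFFFF)


# Zero is a real modifier (linear layout), not "nothing here".
DRM_FORMAT_MOD_LINEAR = fourcc_mod_code(DRM_FORMAT_MOD_VENDOR_NONE, 0)
# "No modifier" has its own dedicated, non-zero sentinel.
DRM_FORMAT_MOD_INVALID = fourcc_mod_code(DRM_FORMAT_MOD_VENDOR_NONE,
                                         DRM_FORMAT_RESERVED)


def describe_modifier_buggy(modifier: int) -> str:
    # Bug: a boolean test treats zero as absent, but zero means linear.
    if not modifier:
        return "no modifier"
    return f"modifier {modifier:#018x}"


def describe_modifier(modifier: int) -> str:
    # Compare against the explicit sentinel instead of testing for zero.
    if modifier == DRM_FORMAT_MOD_INVALID:
        return "no modifier"
    if modifier == DRM_FORMAT_MOD_LINEAR:
        return "linear"
    return f"modifier {modifier:#018x}"
```

A zero-initialised struct field therefore reads as "linear" rather than "absent", which is the confusion the corrected sentence warns about.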
tobiasjakobi has quit []
<ayaka> daniels, here is the case: a platform is designed for 4K decoding and display, but for an 8K display its performance may not be enough. So you can't wait for a frame to be finished before displaying it
junaid has joined #dri-devel
<daniels> sure, in that case don't bother fencing, just display whatever's around at the time, and then the user will see half and half
<ayaka> you must start scanning it out when, for example, half of it is done
<daniels> ok, so then use fences to fence when half is done, or a third is done, or whatever
<daniels> that's something you'd have to put in your driver, but I don't imagine 'signal fence when you've done half the work' would get accepted as uAPI upstream, so it's just downstream hacks in which case you can do whatever you want to
<ayaka> well, that is what opencl could do. I am thinking about allowing multiple fences to be created in video4linux, because I didn't find any video4linux driver that uses fences
<ayaka> its stateless decoder could support decoding a slice, but it holds the buffer until the whole frame is done
idr has quit [Ping timeout: 480 seconds]
swalker__ has quit [Remote host closed the connection]
<ayaka> also I don't get why the kms driver itself would consume IN_FENCE, rather than userspace waiting for the notification and then submitting the commit
<ayaka> if the fence doesn't arrive, that commit would not be submitted; if the next commit came, would the previous commit be discarded?
<daniels> no, you can't queue multiple submits
Zopolis4 has quit [Quit: Connection closed for inactivity]
rasterman has joined #dri-devel
<emersion> hm, sorry, no time to finish this up tonight
<daniels> emersion: I'm pretty sure it can survive another day :P
<ayaka> daniels, yes, I forgot I should wait for the event or the out fence
vliaskov has joined #dri-devel
<ayaka> but why would we let the kernel wait for the fence rather than userspace?
<daniels> because it avoids spurious wakeups and unnecessary queues with relatively deep/complex chains of operations
aravind has quit [Ping timeout: 480 seconds]
<ayaka> daniels, I think I need some documentation to understand how these fences (in and out) work with egl
<ayaka> from kmscube, it would create an in fence for each cube frame and an out fence for each scan out
<daniels> in-fences are waited for before rendering begins; out-fences are signaled when rendering completes
<daniels> if you search around, there are a few presentations on how explicit synchronisation works
<ayaka> I can understand what the in fence and out fence are used for from the drm doc
<ayaka> but I'm just wondering why we can't reuse those fences; we just need the out fence to make the gpu unlock the front buffer
<ayaka> while the in fence is for the display to wait until the gpu has completed its front buffer
<daniels> how would you reuse fences? fences refer to one specific point in time
<daniels> the in-fence kmscube passes to KMS refers to the completion of exactly one drawing command made by the GPU
<ayaka> because creating a fence has a cost
<daniels> it doesn't change in time to be 'the completion of whatever the latest rendering is'
<ayaka> I am thinking of the long-pending kmssink in gstreamer
<daniels> yes, so in its render callback it would have to accept a dma-fence with each frame
<ayaka> we don't need an in fence here, but could we re-use the out fence here?
<ayaka> we would just create two out fences here
<daniels> no, because fences always refer to one specific point in time
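The "one specific point in time" argument can be made concrete with a toy model. This is a minimal sketch in plain Python, not the kernel's dma-fence or sync_file API: the `Fence` class and `render_frame()` are hypothetical names, and a `threading.Event` stands in for the one-shot signaled state.

```python
import threading


class Fence:
    """Hypothetical one-shot fence tied to a single piece of work."""

    def __init__(self, label: str) -> None:
        self.label = label
        self._event = threading.Event()

    def signal(self) -> None:
        # Once signaled, a fence never goes back to unsignaled.
        self._event.set()

    def is_signaled(self) -> bool:
        return self._event.is_set()

    def wait(self, timeout=None) -> bool:
        return self._event.wait(timeout)


def render_frame(n: int) -> Fence:
    """Pretend to submit frame n and return the fence for exactly that frame."""
    return Fence(f"frame {n} rendered")


fence0 = render_frame(0)
fence0.signal()            # frame 0 has finished rendering

fence1 = render_frame(1)   # frame 1 is still in flight

# Reusing fence0 for frame 1 would report "ready" while frame 1 is not.
stale_says_ready = fence0.is_signaled()
fresh_says_ready = fence1.is_signaled()
```

Because a signaled fence stays signaled, it cannot later stand for "whatever the latest rendering is"; each frame needs its own fence.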
apinheiro has quit [Quit: Leaving]
<sima> robclark, ah right, the vma tracking locking fun was in the context of rpm I think ...
<sima> mort_, have a testing verdict on the patch already?
<ayaka> daniels, then should we drop the only poll drm event way and switch to the out fence way? Creating and destroying something frequently doesn't sound like a good idea
<robclark> vma lock was completely uninvolved with that since it isn't used for reclaim.. only to give userspace an error if it tried to change the VA of an in-use vma... the rpm/qos hell is still there
<robclark> since that is about the obj resv / reclaim vs rpm/qos locking
<ayaka> s/only/old/
<daniels> ayaka: this is something that happens once every 16ms and is a very lightweight operation. I have no idea why you think it's a bottleneck.
alyssa has left #dri-devel [#dri-devel]
<zmike> eric_engestrom: any idea what's going on with https://gitlab.freedesktop.org/zmike/mesa/-/jobs/46637627
<zmike> I retried a couple times and it seems broken
<eric_engestrom> zmike: I've seen a bunch of 500 & 503 on the gitlab registry this afternoon, I'm guessing this is why it's failing
<eric_engestrom> I don't know any more than that though, ask the admins on #freedesktop
<ayaka> in 60fps mode it is true
<eric_engestrom> also, it might be related to the windows runners being overwhelmed, possibly someone doing something heavy
<daniels> looking at that job log, it's trying to pull x86_64-test_base and failing
<daniels> looking at the x86_64-test_base job log, the job claimed to succeed, but the container push failed https://gitlab.freedesktop.org/zmike/mesa/-/jobs/46634148
<daniels> so, retry that, then retry the other one
<zmike> k
sukrutb has joined #dri-devel
Kayden has quit [Read error: Connection reset by peer]
K`den has joined #dri-devel
K`den is now known as Kayden
<ayaka> daniels, now I am thinking the software fence is not fast enough, and I am wondering whether anyone offers a hardware fence. Besides, ping-pong and page flip might not be enough; we had better prepare a queue so the display can scan frames out in order in such high-fps cases
<ayaka> the cpu is not real-time enough for such a task
<daniels> I haven't seen a system with a combination of such a high frame rate and such a low-end CPU that it couldn't service the IRQs quickly enough to do pageflips
<daniels> but if that's a problem you're seeing, that's something you'll be solving I guess
<ayaka> the cpu is not that low end; it is quad arm a55 cores
<ayaka> but the cpu has more work to do; also we don't use irq but a message box now
<daniels> A55s can do pageflips
<ayaka> as I said, cpu is occupied by the audio
<daniels> well, if you manage to get yourself into a situation where you can't schedule enough time for one ioctl every 16ms (or 8ms or whatever), then that sounds quite bad, but also not something upstream's going to design for
<daniels> tbh it sounds like a case of trying to optimise problems which don't exist; you're much better off doing actual real-world measurements before guessing
<ayaka> Currently, it doesn't run drm. But we have to use a queue (msg box) here to implement that refresh rate
greenjustin_ has joined #dri-devel
<ayaka> the decoder can work at exactly that speed in 4k mode, which leaves us not much time to wait. Also we have to count the cost of REE and TEE context switching
<ayaka> the point is whether the upstream method should do such a thing frequently. Or why would we invent poll() that yields the thread
greenjustin has quit [Ping timeout: 480 seconds]
_whitelogger has joined #dri-devel
sukrutb has quit [Remote host closed the connection]
benjamin1 has joined #dri-devel
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
pochu_ has quit [Ping timeout: 480 seconds]
idr has joined #dri-devel
benjaminl has quit [Ping timeout: 480 seconds]
ngcortes has joined #dri-devel
mvlad has quit [Remote host closed the connection]
Kayden has quit [Quit: -> JF]
<mort_> sima: I just installed a kernel with the patch, and so far it looks good! kmalloc-256 has been at 1380K for a while now, nothing else looks suspicious
<mort_> I'll run a few memory logging tools I have overnight to see if I encounter anything weird but it's definitely not leaking a commit per flip anymore
tursulin has quit [Ping timeout: 480 seconds]
<sima> mort_, can I have some email for reported/tested-by credits?
<mort_> sima: the best one to use is probably dorum@noisolation.com
oneforall2 has quit [Remote host closed the connection]
sghuge has quit [Remote host closed the connection]
idr has quit [Remote host closed the connection]
sghuge has joined #dri-devel
idr has joined #dri-devel
oneforall2 has joined #dri-devel
Kayden has joined #dri-devel
<mort_> hardware is a snapdragon 410 if that's relevant
fee1dead has joined #dri-devel
ngcortes has quit [Ping timeout: 480 seconds]
benjaminl has joined #dri-devel
<fee1dead> Hi all. I'm having some trouble with a Turing card and the new NVK Vulkan driver. My small Vulkan triangle example is crashing Sway. I know it is experimental and not even merged, so maybe I should just wait? Wanted to know if it is a normal outcome or worth reporting, and also asking for help if it is the latter. Thanks in advance!
benjamin1 has quit [Ping timeout: 480 seconds]
crabbedhaloablut has quit []
bgs has quit [Remote host closed the connection]
<airlied> fee1dead: probably at the just wait stage
<fee1dead> Will do.
danylo has quit [Ping timeout: 480 seconds]
rasterman has quit [Quit: Gettin' stinky!]
ngcortes has joined #dri-devel
junaid has quit [Remote host closed the connection]
<sima> robclark, just sent out the mdp5 leak fix, I'm assuming you'll apply somewhere?
<robclark> yeah.. it shows up in freedreno patchworks so one of us will grab it
<airlied> 90/92 sessions passed, conformance test FAILED so close, gl4.6 llvmpipe run, one ballot test in two configs
alyssa has joined #dri-devel
<alyssa> Anyone know of test coverage for perspective correct interpolateAtOffset?
<pendingchaos> the vulkan coverage probably isn't very good?
<pendingchaos> IIRC RADV's is at least somewhat incorrect
<alyssa> I'm not seeing anything in either gles or vk cts
<alyssa> I also don't have a VK driver that's able to run the vk tests regardless, so really hoping for GL CTS or piglit coverage :p
<alyssa> unfortunately the gles cts tests all pass even if I do perspective-incorrect interpolation
<alyssa> Oh I know how to test this, I can just forcibly lower all interpolation and then see what breaks
rgallaispou has joined #dri-devel
ced117 has quit [Ping timeout: 480 seconds]
<zmike> piglit?
ohmadcs^ has quit [Remote host closed the connection]
ohmadcs^ has joined #dri-devel
<alyssa> worth a shot but I don't see coverage
<zmike> I know there's interpolateatoffset coverage
Duke`` has quit [Ping timeout: 480 seconds]
<alyssa> yes, but nothing that specifically tests for perspective correction
<alyssa> i.e. with gl_Position.w set to anything other than 1
<zmike> shader_runner to the rescue
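The reason existing tests pass with perspective-incorrect interpolation can be shown numerically. A minimal sketch, assuming the standard formulation in which screen-space barycentrics weight each attribute divided by its vertex's clip-space w; the function names are hypothetical and this is not driver or CTS code.

```python
def affine_interp(bary, attrs):
    """Screen-space linear interpolation that ignores w."""
    return sum(b * a for b, a in zip(bary, attrs))


def perspective_interp(bary, attrs, ws):
    """Perspective-correct interpolation from screen-space barycentrics.

    Each attribute is divided by its vertex's w, interpolated linearly,
    then renormalised by the interpolated 1/w.
    """
    num = sum(b * a / w for b, a, w in zip(bary, attrs, ws))
    den = sum(b / w for b, w in zip(bary, ws))
    return num / den


bary = (0.5, 0.5, 0.0)    # sample point midway between vertices 0 and 1
attrs = (0.0, 1.0, 0.0)   # per-vertex attribute values

affine = affine_interp(bary, attrs)

# With w == 1 everywhere the two methods agree, so a test drawn this way
# cannot tell them apart.
same_w = perspective_interp(bary, attrs, (1.0, 1.0, 1.0))

# With differing w values the results diverge.
mixed_w = perspective_interp(bary, attrs, (1.0, 3.0, 1.0))
```

A shader_runner test that writes a non-unit `gl_Position.w` on at least one vertex would expose the difference that the w == 1 cases hide.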
ced117 has joined #dri-devel
sima has quit [Ping timeout: 480 seconds]
benjamin1 has joined #dri-devel
vliaskov has quit [Remote host closed the connection]
heat_ has quit [Remote host closed the connection]
benjaminl has quit [Ping timeout: 480 seconds]
Haaninjo has quit [Quit: Ex-Chat]
rgallaispou has quit [Quit: WeeChat 4.0.2]
jmondi has quit [Read error: Connection reset by peer]
pcercuei has quit [Quit: dodo]
paulk-bis has joined #dri-devel
paulk has quit [Read error: Connection reset by peer]
fee1dead has quit [Remote host closed the connection]
JohnnyonFlame has quit [Ping timeout: 480 seconds]
kzd has joined #dri-devel
fxkamd has quit []
danylo has joined #dri-devel
<glennk> alyssa, i vaguely remember the ulp error between affine and true perspective being less for the affine than trying to correct using a float division when doing this on evergreen
<alyssa> hum?
djbw has quit [Read error: Connection reset by peer]
<glennk> see tgsi_interp_egcm in r600_shader.c
<glennk> it uses affine interpolation for interpolateAtOffset
glennk has quit [Ping timeout: 480 seconds]
rsripada has quit []
rauji___ has joined #dri-devel
JohnnyonFlame has joined #dri-devel