ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
<alyssa>
airlied: I'm not trying to win the argument. I'm explaining why I can't do this anymore.
<alyssa>
just, physically I mean
<alyssa>
instead of just quietly moving downstream & onto smaller projects
<alyssa>
my body can't deal with this anymore
minecrell has quit [Quit: Ping timeout (120 seconds)]
<alyssa>
i'm not built for high stress environments, which everything in mesa/mesa is for better or worse
<airlied>
do you have some deadlines external to this, adding pressure or something?
<airlied>
like I get the I've got to ship something tomorrow, oh CI screwed me over stressors, but I wasn't aware too many of the main mesa devs were in that sort of work environment
<gfxstrand>
airlied: She's the sole developer of a highly anticipated driver. That's a hell of a lot of pressure.
* gfxstrand
ought to know
<alyssa>
real
<airlied>
gfxstrand: but it's not time dependant, a day, or a week doesn't matter in any way really
minecrell has joined #dri-devel
<gfxstrand>
Technically, no. But there's a lot of emotional weight there.
<gfxstrand>
How many times has someone asked me to fix vkcube with NAK in the last 6 months?
<gfxstrand>
Like, of all of the things in this world, that one literally doesn't matter but people seem to care and they ask.
<gfxstrand>
It results in a lot of pressure, regardless of whether or not there are deadlines.
<gfxstrand>
Because the deadline becomes ASAP, always.
<gfxstrand>
And, yes, there's no particular date by which stuff has to get done but when you feel like you started off 5 years behind, that doesn't much matter.
<jenatali>
alyssa: That tool should be buildable on Linux. But let me send that issue to my team and see if anyone can look
<gfxstrand>
And the world is full of critics but not very full of people offering real help.
<gfxstrand>
Staying sane in the midst of that requires a certain amount of mental/emotional discipline and self-care.
<alyssa>
jenatali: thanks
<Company>
... and we're probably all perfectionist enough to expect too much from ourselves
<jenatali>
If not, yeah disabling the job would be okay until I'm back at the end of the month (assuming the offending MR can't wait)
<gfxstrand>
(Well, there are lots of people who try to offer but trying to offer and able to provide are very different things when it comes to help building a driver stack.)
heat__ has quit [Ping timeout: 480 seconds]
<gfxstrand>
This isn't me complaining. Just saying that "under a lot of pressure" can be true, even without horrible managers breathing down your back. Sometimes a bunch of impatient users is just as bad if not worse.
<gfxstrand>
And then you add on self-expectations and.... Yeah, it gets to be a lot.
<alyssa>
people are expecting dx12 games to run, like, yesterday
<airlied>
yeah maybe we need some positive affirmations for driver developers
<airlied>
"your driver will exist for hopefully 10 years, sprinting now isn't going to make it better"
<airlied>
"ignore all users, don't read the comments"
<gfxstrand>
Easier said than done...
<gfxstrand>
But, yes, it's good to remember those things.
alyssa has quit [Quit: alyssa]
<airlied>
gfxstrand: also after the first couple, I think vkcube was more of a troll, I'm pretty sure I had patches for vkcube working on NAK months ago :-)
<airlied>
so in that case I did help, but you also said you didn't want that help at that exact time
kzd has joined #dri-devel
<airlied>
maybe the XDC CI talks could become a workshop, where people can decide to remove some of the stuff that got added, or isn't adding that much value anymore
<gfxstrand>
I do think we need to take a very hard look at cost vs. benefit of CI jobs.
<gfxstrand>
Where cost isn't just runtime or $$$ spent on runners
<airlied>
yeah the 10min goal for runs is not use if it takes 45mins to start the job
<airlied>
like I think it's possible at the moment to actually merge a core change inside the 60m timeout, which is way better than it was 3-4 weeks ago
<jenatali>
Please do ping me if any of our jobs are (non-infrastructure) being problematic. I haven't been paying attention but since I haven't heard until today I've been assuming we're not causing any issues
<zmike>
or if it's just that runner being way overloaded somehow 🤔
<airlied>
yeah seems like runner overload
<jenatali>
That's just the runner being overloaded, yeah
<jenatali>
If it's consistent we can turn down the fraction but I'd hate to miss out on coverage all the time for the infrequent overload
<zmike>
I don't think the fraction matters since it takes 10-15 mins on a "good" run
<zmike>
but I've never had any other job hit this
<zmike>
so it seems to me something is going on here
<airlied>
there's the same runner used in a good and bad run there as well
<airlied>
so it's not specific to some runner configuration, so just sounds like other jobs are scheduled on the same machine
<zmike>
they're not the same
<zmike>
one is 5, the other is 6
<zmike>
but yes, scheduling may be a factor
<zmike>
we had issues with this before where certain jobs weren't scheduling properly
<airlied>
zmike: the first overrun you posted is 6
<jenatali>
zmike: The Windows runners are shared among all freedesktop projects that use them. Gstreamer often hammers them
<zmike>
🤕
<zmike>
maybe we need a dedicated mesa runner then if we're going to keep adding ci for windows?
<jenatali>
Agreed
aissen has joined #dri-devel
minecrell has quit [Quit: Ping timeout (120 seconds)]
<jenatali>
I don't think we have plans to add more CI at this point
minecrell has joined #dri-devel
Daanct12 has joined #dri-devel
co1umbarius has joined #dri-devel
columbarius has quit [Ping timeout: 480 seconds]
ngcortes_ has quit [Ping timeout: 480 seconds]
ngcortes has quit [Ping timeout: 480 seconds]
yyds has joined #dri-devel
yuq825 has joined #dri-devel
ngcortes_ has joined #dri-devel
ngcortes has joined #dri-devel
aissen has quit [Ping timeout: 480 seconds]
aissen has joined #dri-devel
Leopold_ has quit [Remote host closed the connection]
Leopold_ has joined #dri-devel
Zopolis4 has joined #dri-devel
linyaa has quit []
kts has joined #dri-devel
krumelmonster has quit [Ping timeout: 480 seconds]
ngcortes_ has quit [Ping timeout: 480 seconds]
ngcortes has quit [Ping timeout: 480 seconds]
krumelmonster has joined #dri-devel
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
ran_ has joined #dri-devel
Company has quit [Quit: Leaving]
ran_ has quit [Remote host closed the connection]
ngcortes_ has joined #dri-devel
ngcortes has joined #dri-devel
ngcortes has quit [Remote host closed the connection]
ngcortes_ has quit [Remote host closed the connection]
ngcortes_ has joined #dri-devel
ngcortes has joined #dri-devel
sukrutb has joined #dri-devel
mclasen has quit [Ping timeout: 480 seconds]
Duke`` has joined #dri-devel
JohnnyonFlame has joined #dri-devel
OftenTimeConsuming has quit [Remote host closed the connection]
OftenTimeConsuming has joined #dri-devel
aravind has joined #dri-devel
elongbug_ has quit [Read error: Connection reset by peer]
anholt has quit [Remote host closed the connection]
<airlied>
gfxstrand: still awaiting an answer on copy prop vs cast deref following in 25536
<kode54>
I'm planning to test out 25576
fab has joined #dri-devel
benjaminl has quit [Remote host closed the connection]
benjaminl has joined #dri-devel
benjaminl has quit [Remote host closed the connection]
benjaminl has joined #dri-devel
crabbedhaloablut has joined #dri-devel
kzd has quit [Ping timeout: 480 seconds]
Duke`` has quit [Ping timeout: 480 seconds]
ngcortes_ has quit [Ping timeout: 480 seconds]
ngcortes has quit [Ping timeout: 480 seconds]
tzimmermann has joined #dri-devel
benjaminl has quit [Remote host closed the connection]
benjaminl has joined #dri-devel
sgruszka_ has joined #dri-devel
darkglow has quit [Remote host closed the connection]
darkglow has joined #dri-devel
itoral has joined #dri-devel
rasterman has joined #dri-devel
yyds has quit [Remote host closed the connection]
yyds has joined #dri-devel
glennk has joined #dri-devel
kts has quit [Ping timeout: 480 seconds]
mvlad has joined #dri-devel
sima has joined #dri-devel
daissi has joined #dri-devel
rgallaispou has joined #dri-devel
Dark-Show has joined #dri-devel
jkrzyszt has joined #dri-devel
eukara has quit [Ping timeout: 480 seconds]
Zopolis4 has quit [Quit: Connection closed for inactivity]
adavy has quit [Ping timeout: 480 seconds]
pochu has joined #dri-devel
frieder has joined #dri-devel
pcercuei has joined #dri-devel
<daniels>
well it sure is motivational waking up to a screenful of ‘you’re literally destroying Mesa out of malice’ text. guess I’ll go back to holiday.
ficoPRO10 has joined #dri-devel
lynxeye has joined #dri-devel
<daniels>
gfxstrand: too true - ‘And the world is full of critics but not very full of people offering real help.’
<daniels>
airlied: I don’t think a workshop would be of any value. the people who want to contribute and discuss already do. the rest is just bullshit from people who never look at the stats, engage with the people working on it, or try anything - just demand to permanently remove a single job because they saw a failure once
<daniels>
there’s a discussion to be had about runtime for sure, but that’s mostly about how much testing can really be decimated, and how much more difficult it should be to merge more driver features or new CTS
frieder has quit [Ping timeout: 480 seconds]
<daniels>
both expand runtime, but without a bunch of either manual effort or labour gone into tooling, there’s no ‘why doesn’t the CI team make my driver testing faster’ answer to ‘why are jobs so slow’ - it’s just (continually) trying to fix it after the fact
<daniels>
but this is just as much speaking into the void, so back to enjoying Scotland o/
neniagh_ has quit [Ping timeout: 480 seconds]
frieder has joined #dri-devel
sarahwalker has joined #dri-devel
tursulin has joined #dri-devel
ficoPRO10 has quit [Ping timeout: 480 seconds]
rgallaispou has quit [Remote host closed the connection]
rgallaispou has joined #dri-devel
ficoPRO10 has joined #dri-devel
yyds has quit []
yyds has joined #dri-devel
vyivel has quit [Read error: Connection reset by peer]
vyivel has joined #dri-devel
<pq>
"ignore all users, don't read the comments" is what I do with Wayland and HDR specifically. They haven't really come to my home turf yet either, and I sure won't be looking for comments about my work.
<pq>
Having practically zero presence in social media is good for you.
<ccr>
one can always go and read Phoronix forums for a dose of sane, well articulated and insightful views!
<pq>
yeah... I sometimes go there to check that yep, it's still the same - and I *never* read anything that might touch my work.
<pq>
reddit and whatever I've never even looked at to begin with
<daniels>
well yeah, feedback is good and helpful, but ideally it’s at least one of actionable, insightful, or at least friendly and well-intentioned
<emersion>
the fact that my users are power-users does mitigate somewhat these issues, i must admit
<soreau>
daniels: I hope you're enjoying your vacation . Scotland sounds great!
hansg has joined #dri-devel
vyivel has quit [Read error: Connection reset by peer]
alyssa has joined #dri-devel
cmichael has joined #dri-devel
vyivel has joined #dri-devel
YuGiOhJCJ has quit [Quit: YuGiOhJCJ]
vyivel has quit [Remote host closed the connection]
vyivel has joined #dri-devel
<MrCooper>
mripard: I don't see how the DRM panic handler could "prepare" atomic state; every atomic commit defines the full state for all CRTCs/planes/connectors, so going from the arbitrary state that happens to be active when a panic occurs to the "prepared" state would require going through the full atomic commit machinery, which is likely to run into locking issues if nothing else, in particular if the panic occurs while an atomic commit is being
<kusma>
The nightly CI runs haven't been passing all jobs in like forever...
<kusma>
That's a pretty bad indicator for the health of the CI IMO...
<kusma>
Seems the last three months has been particularly rough. July 16th was the last time it seemed reasonably healthy... Anything notable happened around then?
<MrCooper>
the only important job in scheduled pipelines is the git-archive one, other than that MR pre-merge pipelines are far more important
<kusma>
The nightly runs all jobs, so it's a good indicator for the jobs
<kusma>
Good indicator for the health of the jobs, I mean
<daniels>
kusma: nightly jobs aren’t the same as pre-merge; they’re the only ones that run both full suites that take too long for pre-merge, and also jobs which are just way too unstable for pre-merge
<kusma>
daniels: Sure, but it tells us a lot about the health of things.
<daniels>
so given they have much wider coverage, the expectation is that they fail and we fix it up post-hoc
<kusma>
Sure. But the same jobs have been failing for weeks on end
<daniels>
well yeah, people keep writing buggy code … and also some of those jobs are nightly-only for a reason, because the platform is fundamentally unstable
<daniels>
really? I’ve seen
<kusma>
And we notice some of the rarer of them when we touch on common code.
Lynne has quit [Remote host closed the connection]
<sima>
MrCooper, mripard yeah for panic the only realistic thing is that the panic handler just writes into the buffer that's currently being displayed
<sima>
and praying it's cpu visible
<sima>
everything else needs to take so many locks and touch so much hw state that it's not realistic
<kusma>
I'm not saying the nightlies must pass all the time or bust. But it failing for every single nightly for three months, that can't be the goal here.
<MrCooper>
as jfalempe pointed out, at least amdgpu can write to the FB even if it's not directly CPU accessible, doesn't help for tiling though
<sima>
there was infra for just writing the base address, but unless you exactly match position and size and format it's already too much for most hw/drivers
<kusma>
I've been away from upstream CI for a while, but last week when I merged a branch touching common code for the first time in a while... well, that was pretty rough.
<MrCooper>
that can happen, isn't directly related to the scheduled pipeline though
<sima>
MrCooper, I think we need a "write x/y pixel in this fb" helper and then just add enough helpers for the common formats/tiling layouts
<sima>
until it's good enough
Company has joined #dri-devel
<shadeslayer>
DavidHeidelberg: re Barcelona hack weekend, when is that? ( I live here, so happy to be a fly on the wall )
itoral has quit [Remote host closed the connection]
Daanct12 has quit [Ping timeout: 480 seconds]
Daanct12 has joined #dri-devel
<sima>
pq, glReadPixels is just reading, so as long as the driver has a fallback path to use the gpu to sample the pixel (since cpu mmap isn't guaranteed) it can be made to work
<sima>
but yeah not seeing anything in the spec that would guarantee it does work
<sima>
Company, ^^
<emersion>
pq, i believe EXTERNAL is read-only
<emersion>
pq, EXTERNAL is e.g. used with YUV shader lowering, and that's read-only
vyivel has quit [Remote host closed the connection]
<Company>
yeah, this is about reading only
<kusma>
The GLES 3.2 spec for glReadPixels says "An INVALID_OPERATION error is generated if the read framebuffer is not framebuffer complete."
leandrohrb5 has joined #dri-devel
<emersion>
i don't think EXTERNAL works with FBOs
<kusma>
You can always opt out of supporting it by reporting GL_FRAMEBUFFER_UNSUPPORTED
<kusma>
I mean, maybe technically speaking not (that's *really* just about the formats)...
<pq>
Company, for texturing from. FBOs are about writing into.
<emersion>
TEXTURE_2D != FBO
<Company>
but no idea, it could be that it only applies to formats where eglQueryDmaBufModifiersEXT() returns FALSE for external_only
<kusma>
@pq: No, not only.
<pq>
kusma, yes, you can use an FBO to read from a texture, but doesn't FBO completeness require the attachments to be renderable?
<kusma>
You can bind an FBO to the read-buffer only, for instance
<emersion>
"This extension provides a mechanism for creating EGLImage texture targets from EGLImages."
<Company>
rght, this is important in the ARM link above, too
<emersion>
this extension does not mention FBOs
<Company>
they only bind the read framebuffer
<pq>
where an FBO is bound to should not make any difference to whether the FBO is complete, should it?
<kusma>
@pq: It requires the texture (and all attachments) to be complete... And that requires it to be color-renderable if it's a color format (and similar for depth / stencil)
<pq>
don't you query FBO completeness before binding to anything?
<pq>
kusma, right, and EXTERNAL is not renderable. That's the whole point of EXTERNAL that it does not need to be modifiable at all.
<pq>
Company, interesting
<kusma>
@pq: Renderable is determined by the format, not by the source the image comes from
<Company>
lots of usual formats aren't renderable
<kusma>
There might be some differences between GL and GLES here, BTW
<pq>
kusma, yeah, and we're talking about YUV here, because why else would you ever use EXTERNAL.
<Company>
like, the FLOAT32 or INT16 formats often aren't
<kusma>
@pq: Yeah, those would probably not be renderable, unless the implementation supports an extension that marks them as such.
<pq>
TL;DR: no guarantees at all that what the ARM doc said will work.
<kusma>
The ARM docs doesn't seem to mention YUV?
<Company>
I'm just trying to find the best way to read a dmabuf into an rgb buffer - ideally without havning to support shaders
<kusma>
Company: Could you blit from it?
<emersion>
the safest choice is to render the EXTERNAL texture into a FBO
<emersion>
then glReadPixels() from that FBO
<kusma>
Yeah, that sounds like the easiest option
<Company>
kusma: blitting is from the read framebuffer, so I would assume glReaadPixels() works at that point?
<kusma>
Company: yeah, fair point
<pq>
A rendering pass will also let you do any pixel format (I mean bit depth specifically) conversion you might want. That may even be crucial for glReadPixels performance.
<Company>
so far I'm less concerned with performance and more with "will this work?"
<pq>
but again, I don't think any EGL spec yet guarantees detailed YUV->RGB conversion, so if it converts automatically, you'll get just "something".
<pq>
does "too slow" count as "works"?
<kusma>
Company: considering that external images are meant primarily for texturing, texturing with it would probably be the most robust option, no?
<pq>
there are many ways to do things that would end up too slow, like import to GBM and gbm_bo_map() (did that even map YUV at all? at least it will not convert to RGB)
<kusma>
I think Companysaid that they want RGB output anyway?
<pq>
exactly
<Company>
yeah, I want to end up with RGB output
<pq>
so they want YUV converted to RGB, and doing that on the CPU afterwards is too slow, but still works
<Company>
this is about the GTK fallback path, which is used for corner cases, debugging and such
<Company>
by default this is going to go into a shader just fine
<pq>
just do what emersion said then
<pq>
but if it's for debugging, are you sure you want an automatic YUV->RGB conversion?
<Company>
but someone might want to print a webcam screenshot into a PDF or whatever
<Company>
for now I do, because our infrastructure doesn't handle YUV
<pq>
ok
<Company>
I seem to be dealing with dmabufs instead of implementing color management support
rasterman has quit [Quit: Gettin' stinky!]
alatiera has joined #dri-devel
<pq>
I'm not surprised. The things I've have to do with Weston and still need to do...
frieder has quit [Ping timeout: 480 seconds]
Dark-Show has quit [Quit: Leaving]
<MrCooper>
pq: "why else would you ever use EXTERNAL", e.g. because some modifiers are supported only with EXTERNAL (e.g. the LINEAR one with the nvidia driver)
<pq>
oh, right, I never consider proprietary drivers.
kts has quit [Quit: Konversation terminated!]
<pq>
is that due to not supporting mipmaps?
<pq>
or clamp modes
<MrCooper>
it's just an example, might even be the same with nouveau, since it's due to the HW not supporting linear colorbuffers together with depth/stencil
<pq>
TEXTURE_2D can be read-only too, right?
<emersion>
can it?
<emersion>
i'd always assume write ops to work with TEXTURE_2D
<pq>
yeah, most formats are not renderable, traditionally
<emersion>
by "writable", i mean "upload pixel data"
frieder has joined #dri-devel
<emersion>
not "render to"
<pq>
well, not read-only, because you can write to it with TexImage2D
<pq>
so they need to forbid TexImage2D into a LINEAR...
<pq>
hmm, IOW, if one can import a dmabuf as TEXTURE_2D, you can write into it with TexImage2D? Interesting.
<pq>
I mean, TexSubImage2D
<pq>
TexImage2D would just drop the dmabuf
vyivel has joined #dri-devel
Zopolis4 has joined #dri-devel
<alyssa>
rpi3 seems especially unstble today
<alyssa>
seen them flake for infra several times today cross idffernet mrs
<alyssa>
eric_engestrom: ^
kts has joined #dri-devel
yuq825 has left #dri-devel [#dri-devel]
kts has quit [Quit: Leaving]
sarahwalker has quit [Remote host closed the connection]
glennk has quit [Ping timeout: 480 seconds]
glennk has joined #dri-devel
heat__ has joined #dri-devel
sgruszka has joined #dri-devel
<karolherbst>
can we get https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22580 reviewed and merged? I kinda need it. It's a MR to deal with array_derefs on vecs, which are legal spir-v. Not sure if also some Vulkan applications are running into this or not, but probably not, as nothing is crazy enough to do indirects on vecs besides LLVM
<mripard>
sima: so we should just drop the ball for some workloads and they'll never get to see any panic message?
<sima>
mripard, I think it's a balance between the amount of work and likelihood of having such a case
<mripard>
framebuffer compression is common on ARM
<sima>
and yeah if your display is in a mode where we can't print without a modeset, that's just it
<sima>
mripard, yeah so I guess needs a helper to clear the compression bit and write the raw pixel values
<sima>
same for all the other compressed formats
<karolherbst>
alyssa: maybe in case you are bored enough? Some CL applications are running into this bug (e.g. darktables) and if we use more CL for shaders, it might also happenr randomly
<sima>
but ime trying to flip from compressed to uncompressed in panic is no-go, way too much code you need to run and hw to reprogram
<mripard>
is the buffer/mapping going to be big enough to do so?
<sima>
mripard, if the buffer isn't mapped already there's nothing you can do
<sima>
i.e. if it's not in the kernel linear map
<sima>
or we'd need some really funny panic vmap slot which special tlb flush because you really cannot do an ipi for the normal pte flush flow
ficoPRO10 has quit [Ping timeout: 480 seconds]
<sima>
maybe the kmap_local would work, dunno
<mripard>
I agree, which is why I was saying we should have a separate buffer for that
ficoPRO10 has joined #dri-devel
<sima>
I guess could ask ogness and measure the loudness of the resulting scream
<sima>
mripard, what does that help?
<mripard>
you always have a premapped buffer?
<sima>
yeah
<mripard>
so you know for certain that you have that one you can use
<sima>
but how do you get that thing to be scanned out?
<sima>
you can guarantee that you'll be able to write into a preallocated buffer
<sima>
you can't guarantee the user will see it
<sima>
with the approach of using the buffer the user currently sees, you maximize the latter
<sima>
(not all cases, manual upload that's cpu driven is not going to happen)
<mripard>
you just change the base address of that plane to that buffer's mapping?
<sima>
mripard, also with my proposal in theory the driver could use the premapped approach and do a minimal flip to that, if the hw can do that
<sima>
mripard, yeah, if that's all you do then many drivers/hw can help
<sima>
but the moment you change format it's already much, much worse
<sima>
mripard, it's also not something that we need to do in generic code, since for many gpu the mapping issue isn't one
<sima>
like x86 doesn't need to punch out pages from the kernel mapping, worst you need some clflush
<sima>
or amd with their peek/poke registers
<sima>
(clflush is ok in panic)
<mripard>
with dma-buf and dma-heaps, you just can't control if the buffer is accessible by the CPU, at all
<mripard>
I guess it's something that x86 can ignore, but that's not the case for the rest of the world
<MrCooper>
mripard: "no panic output under some circumstances" is strictly better than "no panic output ever"
<sima>
depends upon the gpu, if you have mmio peek/poke it'll work
<sima>
yeah that too
<javierm>
mripard, sima: but isn't the case anyways with the current VT/fbcon panic messages that those could be lost after a KD_GRAPHICS ?
<sima>
javierm, yeah it's all shunted off because it caused way more trouble than where it helped
<sima>
I think all the panic you currently get is if a) you're on fbcon b) your kms driver is direct scanout
<sima>
there's not a lot of b) left since the helpers switched to shadow buffers for many cases
<javierm>
what I'm trying to say that in practice, the panic handler will be useful for very early panic before the initrd even can start a user-space KMS based console
<javierm>
because anything after that, is just best effort if the fbcon/VT can recover from a switch from VT
<sima>
userspace based console wont print panics either
<sima>
that kills the kernel
<sima>
oopses can get through, if you're lucky
<mripard>
MrCooper: guess I'm just tired to be on the receiving end of "well it works for i915 but we never really bothered with ARM so you'll have to do it all again"
<jfalempe>
javierm, yes that's my primary focus, when you messed up the initrd. so the panic with simpledrm already works for this case.
<sima>
mripard, the thing is I think you can extend it to the arm cases and compression cases
<sima>
but it all gets very hw/driver specific fast
<sima>
and some combos just wont work ever
<sima>
like I see no way to make cpu driven manual upload over usb or spi work, ever
<sima>
so you could also say that no one ever thinks about usb&spi, therefore we need to nuke this entire effort to add a panic handler
<pq>
sima, is it not the normal case to have the current VRAM framebuffer likely not even CPU accessible at all, let alone already mapped? As GPU rendering and KMS display do not normally need it to be. So magic peek/poke it is.
<mripard>
we really can't access USB and SPI atomically
<sima>
pq, there's a lot of cases where it happens to be accessible
<mripard>
for a lot of devices we can switch planes atomically
<mripard>
so that's not really the same situation?
<sima>
well I'm also not understanding why we can't make a panic framework work for that case with a bunch more helpers and stuff?
<sima>
inflicting the atomic switch on everyone is what we had, and it was a good fireworks show
<sima>
it's really hw dependent whether the atomic flip is a good idea or not
<sima>
also you need nmi, not just atomic
kzd has joined #dri-devel
<alyssa>
karolherbst: will look
<karolherbst>
thanks!
<sima>
and nothing except panic code in your driver ever runs in nmi context
<alyssa>
i like excuses to procrsatinate slides
frieder has quit [Ping timeout: 480 seconds]
<karolherbst>
ohh right..s lides
<mripard>
sima: I repeatedly said in my mails that I was fine not going for the atomic commit road but that the get_scanout_buffer interface was too restrictive to expand it to other drivers later on
<mripard>
if we figure out another interface that allows to expand later with helpers, I'm all for it
Haaninjo has joined #dri-devel
<jfalempe>
mripard, what do you need that can't be done with the get_scanout_buffer ?
<sima>
jfalempe, by far not everything can be done with an iosys_map
<sima>
but I think the "I need a preallocated buffer" case should actually fit, it's amd's peek/poke that doesnt
<sima>
jfalempe, essentially for more general case it's the ->panic_draw_xy that noralf linked somewhere in that discussion
heat__ has quit []
<jfalempe>
Yes, I think the driver can pre-allocate a buffer, and use that in the get_scanout_buffer() then it probably wants a scanout_flush() callback to send it the hw ?
heat has joined #dri-devel
<jfalempe>
sima, Yes, I already planned to add this too, but I don't think it's what mripard needs.
<sima>
jfalempe, if it's just a base address flip then that can be done in get_scanout_buffer
<jfalempe>
sima, but if it needs a flush, or dma start, then it should be done after the buffer is written.
Daanct12 has quit [Ping timeout: 480 seconds]
<sima>
jfalempe, I guess there's a bit an argument whether you want a raw data write function or a pixel write function
kzd has quit [Ping timeout: 480 seconds]
<sima>
looks like the current idea is to put format handling into shared code, not sure that'll scale that well for all the driver specific tiling/compression
<sima>
jfalempe, flush after every write
<sima>
speed doesn't matter in panic, you're competing with a serial console
<jfalempe>
yes, we're not looking for high FPS in the panic code ;)
<sima>
at least until we have a driver where a flush is a full screen upload
<sima>
then "flush every pixel" is probably too slow
frieder has joined #dri-devel
<mripard>
sima: what hardware can't turn off compression or change tiling/compression atomically? All the ones I've seen have the registers vblank synchronized so it doesn't really matter
<sima>
mripard, reallocating fifo space tends to kill you
<sima>
or require at least so much code to be run that it's a big no-no for panic context
<mripard>
yeah, ok. so same thing than vc4
<sima>
but again it's highly hw dependent, if you have hw that can always & easily flip to full screen xrgb8888 scanout and nothing will ever break
<sima>
then that's probably a good thing to code up in the ->panic_prep callback (or whatever we paint that bikeshed)
cmichael has quit [Quit: Leaving]
<sima>
so that might need some extensions of the framework
<mripard>
which is kind of why I was suggesting the "prepared commit" approach, even though it's flawed
<sima>
and a driver where this really is better than just painting into the current buffer
<jfalempe>
sima, that can be done in the get_scanout_buffer() before returning the buffer.
<sima>
mripard, the problem ime is that the moment you do this, you tempt people to just call into their modeset code
<sima>
"it worked in testing"
<sima>
been there, done that, ripped it all out
<sima>
so that's why I'm heavily leaning towards a panic framework that's very heavily biased towards "we're sure this wont make anything worse"
<sima>
jfalempe, yeah I guess you could shoehorn that in
ficoPRO10 has quit [Ping timeout: 480 seconds]
<sima>
the trylock should already guarantee that the modeset code isn't touching the hw at least
<sima>
uh
<sima>
jfalempe, the locking in your patch seems entirely absent?
<sima>
that looks not good at all
<mripard>
jfalempe: I guess what I'd really like to see is how that works with a "real", helpers based driver, not a firmware based one.
<jfalempe>
sima, yes, I need to look into that too
<sima>
I thought noralf had coded up something in some really old patches?
<sima>
or at least I typed up some design
<mripard>
(firmware or relying on shadow buffer and conversions)
<sima>
mripard, I'm on board with that
<mripard>
like get_scanout_buffer() operating at the drm_mode_config_funcs level probably doesn't work very well when you have multiple CRTCs and outputs
<sima>
jfalempe, I'd expect you'll end up with higher level functions like the ones I've proposed, and more bits pushed into helpers that drivers call as needed
<sima>
mripard, yeah the discussion I had with noralf years ago covered all that
sre has joined #dri-devel
<jfalempe>
sima, currently I know mostly simpledrm, mgag200 and ast drivers, which are very simple. So that will be hard for me to make it work on more complex drivers.
kzd has joined #dri-devel
<sima>
the real fun is probably a compressed yuv planar buffer on an arm-soc display
<mripard>
jfalempe: I can help with that
<sima>
aside from peek/poke I think that's the most complex one where we can realistically push out pixels
<mripard>
and probably give you a board if you can't get one
<jfalempe>
mripard, thanks
<jfalempe>
yeah I only have a few very outdated arm board here.
Dark-Show has joined #dri-devel
frieder has quit [Ping timeout: 480 seconds]
<robclark>
mripard: re: fbc, there is probably a straightforward way to have no compression.. ie. fill the metadata plane with a value that indicates "no compression" (which might just mean memset it to zero)
<robclark>
mripard, sima: hmm, but crash during protected content scanout might be fun
<pq>
that's just reboot, right? ;-)
<sima>
robclark, yeah just clearing the metadata plane should work for most I think
<sima>
there's some where the metadata is in some special gpu buffer though :-/
<sima>
robclark, and yeah compression might be fun, but iirc on intel you can clear it with a single mmio write and nothing else
<sima>
*encryption I meant
ficoPRO10 has joined #dri-devel
<sima>
and then write unencrypted panic stuff into the same buffer
sarahwalker has joined #dri-devel
<mripard>
sima: the way it's typically done on ARM is that Linux gets a buffer that has been decoded by the firmware and the content can't be accessed by the access level Linux runs with
<sima>
ah yeah that wont work
<sima>
so I guess another case where you need to flip to something else
<robclark>
you might still be able to flip to a linear
sgruszka has left #dri-devel [#dri-devel]
sgruszka has joined #dri-devel
<mripard>
sure, but you can't write into the same buffer
sgruszka has quit [Remote host closed the connection]
vyivel has quit [Remote host closed the connection]
vyivel has joined #dri-devel
junaid has joined #dri-devel
urja has quit [Remote host closed the connection]
urja has joined #dri-devel
rgallaispou has left #dri-devel [#dri-devel]
AnuthaDev has joined #dri-devel
Duke`` has joined #dri-devel
kts has joined #dri-devel
kts has quit []
kts has joined #dri-devel
vyivel has quit [Remote host closed the connection]
junaid has quit [Remote host closed the connection]
janesma has joined #dri-devel
Zopolis4 has quit [Quit: Connection closed for inactivity]
<mdnavare>
airlied: vsyrjala : For the drm_atomic_check_only() warning of "Adding extra CRTC not allowed without modeset" we see that getting trigged when we move the overlay plane playing a youtube video across to the second screen on MST, this needs higher cdclk now and hence all crtcs get added to the state so that they can all be turned off. But we just throw this warning and continue causing a EBUSY err later and visual artifacts like
<mdnavare>
glitches. Can we reject this here instead of just a warning?
<mdnavare>
or perhaps reject in the vendor hook when this cdclk condition arises?
vyivel has joined #dri-devel
<sima>
mdnavare, you need to flag that this requires a modeset in both crtc
<sima>
at least as an interim fix
<sima>
crtc_state->mode_changed = true; when you pull in crtc for cdclk changes
tzimmermann has quit [Quit: Leaving]
hansg has quit [Quit: Leaving]
aravind has quit [Ping timeout: 480 seconds]
<mdnavare>
Thanks sima , I think we do that already but in drm_atomic_check_only it still throws that warning instead of just rejecting it altogether
kts_ has joined #dri-devel
kts_ has quit [Remote host closed the connection]
<sima>
mdnavare, oh right the logic is different, but maybe we need to adjust
<sima>
the driver is supposed to check for drm_atomic_state->allow_modeset and fail the commit outright
<sima>
the idea is that on some hw you can fall back to less optimal configurations that don't require a full commit
<sima>
if the driver only sets ->mode_changed then it cannot differentiate
<sima>
robclark, 82836692d5d74 is very entertaining, silently upgrading ALLOW_MODESET when userspace didn't set it is really not how the uapi is supposed to work
rasterman has joined #dri-devel
<sima>
I think the allow_modeset = true needs to go
<mdnavare>
ok so you are saying that in this case say in i915 where it needs this cdclk change, and when it adds crtc, it can check the drm_atomic_state->allow_modeset and if it is not set then fail it, sima
kts has quit [Ping timeout: 480 seconds]
macromorgan_ has joined #dri-devel
<sima>
robclark, and probably some shuffling to run the ctm pm stuff in the right place ...
vyivel has quit [Ping timeout: 480 seconds]
<vsyrjala>
for i915 what i'd like to have is proper barriers in the commit timeline. that's the reason we add the extra crtcs to the state currently, they don't actually need a commit. but -ENOTIME so haven't tried to cook up anything real
neobrain has quit [Remote host closed the connection]
frieder has joined #dri-devel
vyivel has joined #dri-devel
<mdnavare>
so as an interim solution, can we fail this in i915 intel_atomic_check after checking drm_Atomic_state if allow modeset is not set?
<mdnavare>
vsyrjala: Like sima suggested
macromorgan has quit [Ping timeout: 480 seconds]
sarahwalker has quit [Remote host closed the connection]
<robclark>
sima: but we need to allocate the dspp block.. which we can't do statically (potentially fewer dspp blocks than lm's (aka crtc))
AnuthaDev has quit [Quit: AnuthaDev]
<sima>
robclark, yeah, but drivers aren't allowed to touch allow_modeset
<robclark>
I can ask the compositor folks about whether they can set ALLOW_MODESET when there is a ctm change.. but that isn't how it has worked before and not sure if it will cause problems on other drivers
<sima>
robclark, yeah this is a pretty bad situation
<sima>
robclark, so why can't you fix msm code?
<sima>
assuming you can actually reassign this stuff without full modesets
<sima>
if you need flickering, then this is flat out breaking uapi
<robclark>
I think the allocating dynamically allocated resources is tied to allow_modeset.. but does not mean flicker in this case
anholt has joined #dri-devel
oneforall2 has quit [Remote host closed the connection]
<robclark>
abhinav__ or lumag probably more famliar
oneforall2 has joined #dri-devel
frieder has quit [Remote host closed the connection]
<mdnavare>
robclark: Have you sent a patch for this already to clarify the drm rules around allow mdoeset?
<zamundaaa[m]>
robclark: KWin at least doesn't... but it may if there's a modeset pending because of other stuff already. So the prop changing only with modesets would be pretty annoying
<mdnavare>
robclark: And when you say msm fix or i915 fix for this would be to then reject this in the driver?
<robclark>
yeah, I kinda prefer the current soln that sima doesn't like so it isn't a modeset from uapi PoV
<zamundaaa[m]>
Can you un-set the CTM / set it to identity without a modeset?
alyssa has quit [Quit: alyssa]
<sima>
robclark, the trouble is that there's nothing guaranteeing that this it the only reason you need a modeset
<sima>
so you could end up with flicker that's not expected by userspace
gouchi has joined #dri-devel
<sima>
so yeah probably needs some rework where you run the right additional functions for the case where it's not a modeset, but just a ctm enable change
<mdnavare>
robclark: Currently does msm just set the crtc_state->mode_changed= true w/o letting the userspace know about it? I should check what i915 does when it sees that cdclk needs to be changed and adds all crtcs to the state
<mdnavare>
I guess my question would be should such atomi_check calls be rejected in the vendor specific drivers for these conditions when perhaps its needing to set mode_changed or should this be rejected higher up in drm_atomic_check_only and return an -EINVAL or some error
<mdnavare>
robclark: sima vsyrjala airlied : Any suggestions?
<sima>
robclark, also I don't really buy your "no flickering" claim, setting crtc_state->mode_changed forces a full on->off->on transition of a crtc, and I don't see any special handling of that case in msm code
<sima>
so pretty sure this is just broken
<sima>
at least a few greps for the usual suspects didn't turn up anything interesting
<robclark>
sima: idk but people would be yelling if ctm triggered a flicker so I guess it works somehow
<sima>
robclark, it's only when you enable/disable it, not when you change it
<sima>
and yeah I'm pretty sure it will go on/off/on
<sima>
also, maybe manual upload panels paper over this for the internal output?
<robclark>
hmm, I can't rule out that userspace does modeset initially with identity matrix?
<sima>
yeah
Kayden has quit [Quit: pulseaudio broke again]
<sima>
if you do a modeset to do the ctm_on/off transition, it's fine
<sima>
which would also mean we could silently fix this by just dropping the allow_modeset = true; case
<sima>
anyway I sent out the doc patch with the 2 reasons this is bad in the commit message
Kayden has joined #dri-devel
<sima>
once we have consensus on that one way or the other it should also be clearer what msm needs to do
lynxeye has quit [Quit: Leaving.]
<sima>
robclark, I would be not surprised at all that if you dig into compositor history people do the identity matrix "because it avoids flickering"
<robclark>
actually I don't see where we obviously need allow_modeset.. but I'll defer to abhinav__
<sima>
well you set crtc_state->mode_changed
<sima>
allow_modeset only breaks the uapi
<sima>
it has no impact on what actually happens in the commit or atomic check
junaid has joined #dri-devel
<robclark>
hmm, exiting self refresh also sets allow_modeset
<sima>
yeah we cheat in some places ...
<sima>
probably should do better, and iirc the idea was to just pull these helpers into drivers since it didn't really seem to worth the trouble
<sima>
for self refresh the trouble is that there's not really any other way to make it happen
<sima>
unless you push the responsibility for going into self refresh to userspace
<sima>
robclark, I guess what we could do is make it a per-crtc flag to override the allow_modeset check or something
<sima>
or well add a new one like crtc_state->uapi_silent_modeset
<sima>
and then use that one in self refresh helpers
<robclark>
maybe .. but don't have time right now and wasn't the one that implemented that so not sure offhand what the issue they were hitting was
<robclark>
ask me after xdc ;-)
<robclark>
or abhinav__ when he shows up
vyivel has quit [Read error: Connection reset by peer]
<sima>
hm forgot to cc abhinav__ on the patch
<sima>
but I guess irc backlog
vyivel has joined #dri-devel
<lumag>
sima, robclark: yes, the issue is that we have to reallocate resources if CTM usage changes.
<sima>
lumag, yeah that's what the code comment says
<sima>
the question is whether that results in flicker or not, and whether it has to result in flickering or not
<lumag>
Only a part of LM blocks (=crtc) support CTM, other blocks do not. So if we are dual-headed, turn CTM on the first head, then disable it, turn it on the second head, the second head might not get it.
<lumag>
sima, I think flickering might depend on the rest of the display pipeline. E.g. DSI panel might be quite permissive. DSI->HDMI bridge? Not so sure.
<robclark>
I don't believe re-assigning dspp to crtc in and of itself would flicker, any more than sspp (plane) reassignement would
<lumag>
robclark, we are reassigning LMs, not just DSPP.
<lumag>
And the issue might be in disable_outputs / enable_outputs if I understand sima's concern
j`ey has joined #dri-devel
<robclark>
oh, well.. I think still the same, as long as there is one avail to reassign
<lumag>
robclark, if there is none, we just fail the atomic_check
<sima>
yeah just reassigning limited shared resources shouldn't cause issues
<sima>
you might have sync issues if you push them through as nonblocking commits, maybe it was that?
<lumag>
sima, so what's your concern? disable/enable_outpus?
<lumag>
no, nonblocking commits will not work. We have to wait for vsync
<sima>
you also set ->mode_changed
<sima>
lumag, I meant not blocking the ioctl
<lumag>
sima, what would be the point?
iive has joined #dri-devel
<sima>
and setting ->mode_changed forces a full on->off->on cycle for the crtc
<sima>
at least if I'm not entirely out of the loop how this code works
<sima>
so that's definitely flickering, and then you hide that all by force-setting allow_modeset, which really only userspace is allowed to set
junaid has quit [Remote host closed the connection]
<sima>
lumag, also no idea what you meant with you question
<lumag>
sima, I asked, what would be the point in not blocking the ioctl.
<lumag>
and anyway, if I'm reading drm_atomic_get_crtc_state() comment, we are in a bigger trouble.
<sima>
lumag, it's like half the reason why the atomic ioctl exists
<sima>
lumag, yeah you're probably in big trouble all around
<lumag>
:D
<sima>
DRM_MODE_ATOMIC_NONBLOCK from the uapi header is what I meant with nonblocking
<sima>
so yeah you cannot add random other crtc
<sima>
only if allow_modeset is set
<lumag>
I mean, we might have to add new CRTCs to the state, if we change backing hardware blocks.
<sima>
and you're not allowed to just set that
<lumag>
ye
<lumag>
yes
<sima>
so you cannot do that
<sima>
instead if you have to manage shared resources, you have to put these into separate driver private state structures
<sima>
which you can then add without breaking anything
<sima>
but then you get the fun of having to manage synchronization yourself for nonblocking atomic commits
<sima>
so fixing this will likely require a pile of work
<sima>
also the current "fix" causes flickering, 90% sure on that
<lumag>
we already have a private obj with allocated resources state, but then this state was propagated to drm_crtc_state.
<sima>
yeah propagating is fine
<sima>
adding crtc without ->allow_modeset being set is not fine
<sima>
setting ->allow_modeset is very much not fine
<lumag>
Yes, I got that
<sima>
(I need to fix self refresh helpers I guess)
<lumag>
propagating is not fine, as we then are adding new crtc states to the state
<sima>
oh I meant just propagating the allocation to each crtc
<sima>
so that you know which one is your resource for a specific crtc without having to always get the global state (which isn't great either because it might result in oversync issues)
<sima>
if you propagate more, then that might be not fine
<lumag>
ok, we are propagating in the atomic_mode_set, which is probably fine, as we are touching the flipped state rather than adding stuff to new_state
<lumag>
I think I'd like to catch robclark and abhinav__ during XDC (and you if you have time, ofc) and discuss a part of that. I wanted to rework parts of that resource state machine.
<j`ey>
do you need to be subscribed to dri-devel to send emails?
junaid has joined #dri-devel
pekkari has joined #dri-devel
pekkari has quit []
<sima>
j`ey, shouldn't be needed, just means you need to be manually approved which can cause delays
<abhinav__>
sima sorry i am just catching up on this thread. we need a full modeset when ctm state changes because we reallocate the hw resources of the pipeline feeding one display to another. so lets say we had hardware blocks to just feed one display and then someone turned off the CTM using the display settings on that display, we can re-assign it to the
<abhinav__>
other one which also needs it but didnt get it as we had limited resources. Is your concern why we set mode_changed or more about allow_modeset
<sima>
allow_modeset
<sima>
but you probably need that because you set mode_changed and add crtc
<j`ey>
sima: ok, thanks!
<sima>
you hit the EBUSY uapi check if allow_modeset isn't set by userspace for this case
<abhinav__>
regarding allow_modeset, I have to check about that exactly. but yes most likely that is the reason that usermode did not set allow_modeset for this case
<abhinav__>
so are you suggesting that we keep mode_changed but remove allow_modeset setting and let usermode handle that?
<abhinav__>
for CTM on/off transition cases
<lumag>
abhinav__, I fear that wouldn't work. We might have to crack our atomic_check to force resource reallocation in this case w/o setting crtc->mode_changed.
<sima>
abhinav__, you'll probably have more bugs, see the things that lumag mentioned
<sima>
lumag, ideally yes, but the bigger issue is setting crtc_state->mode_changed on other crtc
<sima>
that's the thing which gets you into trouble with the EBUSY check
<sima>
but yeah ideally you can allocate shared resource for ctm without ->mode_changed
<sima>
note I'm not sure you set ->mode_changed on other crtc, but it sounded like you do from what you described
<abhinav__>
lumag without mode_changed would not be possible. there are display blocks which cannot be re-assigned without a full teardown
<abhinav__>
and we cannot expect usermode to know all those hw blocks
<sima>
abhinav__, yeah but you can't do a full teardown if userspace didn't allow that
<sima>
if allow_modeset is not set by userspace, and you need a modeset that results in flickering and stuff like that, you must fail the atomic ioctl
<sima>
or well, atomic_check callback really from a driver pov
<sima>
probably most fundamental rule of the atomic ioctl is that if ALLOW_MODESET is not set, there's zero flickering
<sima>
and that the driver will try hard to hit the next vblank (but that one is a bit more a best effort thing)
tobiasjakobi has joined #dri-devel
tobiasjakobi has quit []
<mdnavare>
sima: so looking at Danvet's rules , looks like anytime driver needs to override this if allow modeset not set we need to reject that with a -EINVAL and same ind rm_atomic_Check_only. I am gonna send a patch for that
ngcortes_ has joined #dri-devel
ngcortes has joined #dri-devel
Company has quit [Quit: Leaving]
<abhinav__>
sima i see your point but I dont know if this is a generic rule observed by all drivers. I do see multiple drivers forcing mode_changed based on their hw specific conditions. Are you suggesting each of those are accompanied by a corresponding allow_modeset? Regarding flickering, I do get it that its a unexpected behavior for usermode. but it
<abhinav__>
depends on the use-case. if lets say a user toggled a setting , they might actually be fine with one blink or lets say user was using a writeback connector then there wont be any physical display for the flicker. so i am wondering if this is a really a hard expectation for all use-cases
<sima>
abhinav__, setting mode_changed is fine
<sima>
setting allow_modeset is where you break uapi
<sima>
and the thing is, yes sometimes flickering is required by the hw, but the atomic ioctl contract is that userspace, and userspace alone gets to decide whether flickering is ok
<sima>
because like you point out, this dependes upon the use-case, and the kernel has no idea what use-case userspace is aiming for
<sima>
that's why we have the ALLOW_MODESET flag in the uapi
<sima>
wrt writeback, the uapi allows you to have a writeback connector in parallel to a real one
<sima>
so even that one is kinda complicated
<sima>
also ALLOW_MODESET also allows you to impact _other_ crtc
<sima>
so there's cases where you could support a writeback request, but it would impact other crtc
<sima>
again, userspace gets to decide
<sima>
drivers might set mode_changed a few too many times for the writeback case, that might indeed be the case
<sima>
but also, no one then overrules userspace by changing ->allow_modeset
<lumag>
sima, so the expectation is that if we need to perform a modeset, we set the crtc_state->mode_changed, then drm_atomic_check_only() would return -EINVAL if userspace didn't allow full modesetting?
<sima>
yeah
<sima>
there's the hack in self refresh helpers, but I think I know how to fix that one
Duke`` has quit [Ping timeout: 480 seconds]
<abhinav__>
sima ok, i think we need to evaluate the allow_modeset removal in the msm ctm code and if it breaks, we would need compositor to set allow_modeset for ctm transitions. I guess for hardware which doesnt need a full modeset, they wont set mode_changed so no impact
<sima>
abhinav__, yeah if we are breaking current userspace that's going to be tricky
<sima>
I guess worst case we need a Kconfig/modparam to enable the old hack
<lumag>
abhinav__, I wonder how does that work for mali, which just sets crtc_state->mode_changed
<sima>
and ideally the driver does the least amount of ->mode_changed that you can do with a given hw
<zamundaaa[m]>
Setting the CTM requiring a modeset might be acceptable. If unsetting it also requires a modeset, that's not gonna work
<sima>
zamundaaa[m], msm needs both, at least looking at the code
<abhinav__>
yeah we need both
<sima>
I guess what would work is if the driver internally sets an identity matrix if allow_modeset isn't set
<sima>
and only nukes the ctm for real when allow_modeset is set
<sima>
that would at least prevent the mode_changed when disabling the ctm
Duke`` has joined #dri-devel
<lumag>
and if allow_modset is set, this would allow us to play dirty tricks with all CRTCs, correct?
<sima>
zamundaaa[m], but in general there really is a pretty endless list of hw limitations if you look across all drm drivers
<sima>
lumag, yup, that's what I'm trying to clarify in my doc patch
<sima>
a driver can look (as in read, not write) at allow_modeset in atomic_check
<abhinav__>
lumag i guess it all depends on which usermode it was tried with
<sima>
if you have a more "optimal" (whatever that means) path which requires a modeset
<lumag>
abhinav__, yep
<sima>
and a fallback for the case without modeset (could be things like just keeping an identity matrix, or could mean you fail with EINVAL)
<zamundaaa[m]>
sima: sure, but userspace can't work if you require modesets for disabling color blocks
<sima>
zamundaaa[m], if this is a hard requirement, we need it documented and agreed with drivers and have testcases and all that
<sima>
pretty sure there's plenty that break this
<zamundaaa[m]>
We have it as a hard requirement for the new color pipeline system. Idk if it's documented anywhere for the existing stuff
<sima>
malidp is pretty harsh with setting ->mode_changed
<lumag>
abhinav__, One of the issues was that atomic_check(encoder) was not called unless we set mode_changed early. I think we can follow mali's trick more or less.
i-garrison has quit [Remote host closed the connection]
<lumag>
sima, is mali also going to be frowned upon? Then we can't use it as a role model.
<sima>
lumag, as long as your callbacks allow, the atomic helpers are intentionally idempotent
<sima>
so just call them again (like mali does)
i-garrison has joined #dri-devel
<sima>
lumag, no idea, I don't make the uapi rules myself, that's a contract between what drivers can do and what compositors need
<sima>
emersion, ^^
<lumag>
Interesting. IPU also forces mode_changed in significant amount of conditions.
<sima>
yeah plane changes have lots of these conditions
<sima>
like nothing guarantees even that you can go back to a full screen xrgb fb
<zamundaaa[m]>
for generic compositors at least, that definitely sounds very broken
<sima>
zamundaaa[m], I guess for robustness across the board (since this is by for not just a ctm issue) you need a fallback to full screen fb and gpu compositioning and then just mark that driver as "can't use fancy features"
<sima>
zamundaaa[m], display hw is very very funny
<sima>
if you include all the drivers that require a modeset for fb format changes, the list gets very long
<zamundaaa[m]>
Well, I don't think the rules need to be (or can be) that strict for every case. With color management stuff it's just a bit more important because it's so inherently dynamic
<sima>
zamundaaa[m], but this is already the case for planes
<sima>
and I thought for many cases you practically need planes to make use of hw ctm
<sima>
and that would make people who just want to have a static setup rather unhappy
<sima>
zamundaaa[m], also if we require rules like this, it would mean that on some especially funky hw we'd not be able to expose hw features
<lumag>
... and then comes gnome-shell/mutter, telling: ``Oh, so you have universal universal planes? We don't like you, let's segfault!''
<zamundaaa[m]>
sima: For planes and formats my expectations from drivers are:
<zamundaaa[m]>
- if switching to a format works without a modeset, switching back to the previous format also needs to work without a modeset. If the first switch fails without a modeset, that would be fine
<zamundaaa[m]>
- if enabling a plane works without a modeset, disabling it again also has to work without a modeset. Enabling it in the first place may fail
vyivel has quit [Remote host closed the connection]
vyivel has joined #dri-devel
<lumag>
zamundaaa[m], it might be not that simple, unless you disallow other modesets inbetween.
tlwoerner_ has joined #dri-devel
<sima>
zamundaaa[m], yeah I think you're already asking a lot more than drivers/hw can
<lumag>
I mean, the 'enabling' part might have freed hardware resource that got taken by the other output, so 'disabling' = 'switching to previous' might not work anymore.
<sima>
especially when you chain up changes
<sima>
and I think we flat out had asymmetric conditions too
tlwoerner has quit [Ping timeout: 480 seconds]
<sima>
zamundaaa[m], like lumag points out, general the real fun only kicks in with multiple outputs though
<zamundaaa[m]>
how would enabling a plane on one output free up resources on another, when you just have non-blocking commits without allow_modeset?
ficoPRO10 has quit [Ping timeout: 480 seconds]
<sima>
zamundaaa[m], if you cover the underlying plane entirely it doesn't really need anything anymore
<sima>
not sure that's the example lumag is thinking of
<zamundaaa[m]>
sima: well, that sucks... I don't think desktop compositors can work with that. From the moment that a mode is set, until the user manually and intentionally triggers a modeset in some way, there must not be any flicker, ever
<lumag>
zamundaaa[m], I was thinking about switching the format. If we switch from YUV to RGB, this might free CSC blocks, which might be taken by another user. In case of msm it is slightly different path, but the idea is the same.
<karolherbst>
or maybe it should use disk_cache_put_nocopy instead?
<sima>
zamundaaa[m], yeah I fully understand, but I also think the pragmatic reality is that you probably need a fallback
<sima>
so that users can at least file a bug report against their drm display driver that it sucks too much
<sima>
if you just keel over, that's not good
Duke`` has joined #dri-devel
<sima>
like we have a really hard time just making the current atomic check/commit stuff right, guaranteeing certain things around always being able to go back means the testing combinatorial cases go really badly through the roof
ficoPRO10 has joined #dri-devel
<sima>
zamundaaa[m], and if you insist too much that a driver must not require a modeset you just get hacks like the one in msm that sparked this entire discussion, where the driver flat out lies to userspace
<robclark>
it might be possible to guarantee that we can at least always unwind to single fullscreen layer, to cut the testing space.. since that would be good enough for compositor to fallback to gpu
<zamundaaa[m]>
robclark: yeah, that would be enough
fab has quit [Quit: fab]
<lumag>
robclark, well, with multirect in place we can do that now.
<sima>
robclark, once you reallocate fw and scanout bw that's out
<sima>
like 1. full screen rgbx 2. go to full screen yuv 3. enable 2nd output that needs the bw you freed up with the switch to yuv 4. no going back to full screen rgbx on first output
<sima>
and if we just preemptively reserve everything we possible need for full screen rgbx people will get pissed about missing power/feature targets
<sima>
all we can guarantee that if userspace does a TEST_ONLY | ALLOW_MODESET check with full fb rgbx
<lumag>
sima, then it should be limited to: go fullscreen rgbx on a single output.
<sima>
that we can get to that with a full modeset
<robclark>
sure, but compositor could still unwind to _single_ fullscreen layer on _one_ display.. shutting down non-primary outputs is less bad than getting into a point where it can't pageflipe
<lumag>
quite usefull... useless.
<sima>
lumag, that requires a modeset to kill the other output?
<zamundaaa[m]>
robclark: to do that you'd still need a modeset
<lumag>
robclark, then we are also back to the question, what is the primary display
<lumag>
or which one is primary
<sima>
yeah as soon as we allow modeset as an escape hatch there's all kinds of things userspace can check and the kernel will guarantee
<sima>
like with ALLOW_MODESET if a configuration A works, you can _always_ switch to it
jkrzyszt has quit [Remote host closed the connection]
dtmrzgl1 has quit [Remote host closed the connection]
<sima>
it's only for !ALLOW_MODESET where a configuration might be possible, you can't switch to it from the current one without flickering
dtmrzgl has joined #dri-devel
rsripada has quit [Remote host closed the connection]
rsripada has joined #dri-devel
ngcortes_ has quit [Remote host closed the connection]
ngcortes has quit [Remote host closed the connection]
<sima>
and if we'd want to allow userspace to check for the !ALLOW_MODESET case, we'd need to be able to validate entire flip changes
ngcortes_ has joined #dri-devel
mattrope has quit [Remote host closed the connection]
<sima>
so you could check that A->B and B->A works
dolphin has quit [Remote host closed the connection]
dolphin has joined #dri-devel
<sima>
or if you then do B->C you can check that C->A still works
mattrope has joined #dri-devel
aswar002 has quit [Remote host closed the connection]
<sima>
but the current uapi doesn't allow you to do check whether two subsequent flips would work
aswar002 has joined #dri-devel
shankaru has quit [Remote host closed the connection]
fdu has quit [Remote host closed the connection]
fdu has joined #dri-devel
<sima>
*consecutive
shankaru has joined #dri-devel
<sima>
and frankly doing TEST_ONLY for A->B->C scares me a bit ...
cyrinux has quit []
<sima>
(it would mean a rewrite of everything since drivers assume the current/preceeding state is attached to the object)
pzanoni has quit [Remote host closed the connection]
lstrano has quit [Remote host closed the connection]
Ryback_ has quit [Remote host closed the connection]
Ryback_ has joined #dri-devel
pzanoni has joined #dri-devel
lstrano has joined #dri-devel
cyrinux has joined #dri-devel
tursulin has quit [Ping timeout: 480 seconds]
<lumag>
sima, might I ask an unrelated question? We have been trying to get https://patchwork.freedesktop.org/series/120393/ for quite a while. Patches are acked to go via drm-something, but several last revisions went w/o any attention and/or comment.
<lumag>
Is it fine to push them through drm-misc? Should it be taken through drm-intel? Just drop them and drop USB-C DP on msm? :D
<sima>
lumag, if you have an ack from i915 display maintainers (jani or rodrigovivi) and the ack for the usb side then just push into drm-misc
i-garrison has quit []
<sima>
assuming no one is screaming about the concept in general, I have not much useful clue for both bridges and usb-c oob signalling
<lumag>
sima, ack, thank you
i-garrison has joined #dri-devel
<lumag>
sima, you don't want to know that :D
<lumag>
It was more of a question of policy / permission. You answer is fine, we have all the acks.
<sima>
lumag, I guess some acks from bridge folks would be good too, but *shrugs*
<sima>
lumag, but yeah in general if you have a patch set that has multiple different trees because it goes across, just get acks from everyone else and then land in drm-misc
<sima>
or whichever tree makes sense as the main one
<sima>
well there's a lot more reasons for a broken driver, but even for a good driver you just might have really suck hw
<sima>
the one thing that we really should be good at guaranteeing (because the entire atomic framework was built to make it happen reliably) is if a full config passed TEST_ONLY | ALLOW_MODESET
<sima>
we can always commit it
<sima>
only exception could be "ran out of memory meanwhile, sorry"
<sima>
but that's just linux' overcommit
<lumag>
sima, another question: drm-misc-next is still on the v6.5-rc2. Is it fine to backmerge drm/drm-next to update it to 6.6-rc? Otherwise I'm getting a conflict when applying patches.
vyivel has quit [Read error: Connection reset by peer]
<sima>
lumag, ask drm-misc maintainers to do the backmerge for you, and explain the reason they should record in the commit message
<sima>
since merges are generally rare and need some extra care that's not for committers as a rule
<sima>
but yeah generally drm-misc-next shouldn't be stuck on an old -rc for that long ...
<sima>
time to sleep here now, ttyt
crabbedhaloablut has quit []
vyivel has joined #dri-devel
<tleydxdy>
what is DRM_V3D_PERFMON_CREATE, seems v3d is some broadcom gpu? I'm running a vulkan application on amdgpu and I see this pop up in strace
sima has quit [Ping timeout: 480 seconds]
<robclark>
tleydxdy: in cases where same ioctl nr are used by multiple drivers, strace should list all the possible ones (or at least all the ones it knows about)
tlwoerner_ has quit [Ping timeout: 480 seconds]
mvlad has quit [Remote host closed the connection]