ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
oneforall2 has quit [Remote host closed the connection]
oneforall2 has joined #dri-devel
co1umbarius has joined #dri-devel
columbarius has quit [Ping timeout: 480 seconds]
stuarts has quit []
JohnnyonFlame has joined #dri-devel
JohnnyonF has quit [Ping timeout: 480 seconds]
binhani has joined #dri-devel
srslypascal has quit [Remote host closed the connection]
YuGiOhJCJ has joined #dri-devel
diwrrr has joined #dri-devel
Suraj has joined #dri-devel
mohamexiety has joined #dri-devel
binhani has quit [Quit: Konversation terminated!]
nehsou^ has joined #dri-devel
Suraj has quit []
<Venemo>
gfxstrand: considering that we rarely need to prove workgroup-uniformity, the alternate idea is to just do a parent-walk starting from the ssa def and determine it that way. which would probably be much easier.
binhani has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
<diwrrr>
Hi. Does anyone know if it's possible to use VRR in Xorg with PRIME render offloaded applications? I have an Intel iGPU and an RX580, and it's working but only on Wayland, for render offloaded apps. Just wanna know if it's even supported by the modesetting driver, can't find much info anywhere.
mohamexiety has quit []
flibitijibibo has quit [Ping timeout: 480 seconds]
diwrrr has quit [Remote host closed the connection]
diwrrr has joined #dri-devel
oneforall2 has quit [Remote host closed the connection]
ayaka_ has quit [Remote host closed the connection]
ayaka_ has joined #dri-devel
oneforall2 has joined #dri-devel
mbrost has joined #dri-devel
aravind has joined #dri-devel
diwrrr has quit [Quit: Konversation terminated!]
ngcortes has quit [Read error: Connection reset by peer]
Zopolis4 has joined #dri-devel
bmodem has joined #dri-devel
mbrost has quit [Read error: Connection reset by peer]
rmckeever has quit [Quit: Leaving]
elongbug has quit [Read error: Connection reset by peer]
lemonzest has joined #dri-devel
i509vcb has joined #dri-devel
<i509vcb>
In the context of a Vulkan driver that supports VkPhysicalDeviceDrmPropertiesEXT, would the driver be expected to report a primary node or would it only technically be required to report a render node? Technically there could be a card node that is advertised but it wouldn't do much at all then.
binhani has quit [Quit: Konversation terminated!]
srslypascal has joined #dri-devel
<mareko>
the joke that ChatGPT could replace a CEO is a reality now, in China
agd5f has quit [Remote host closed the connection]
agd5f has joined #dri-devel
kts has quit [Quit: Konversation terminated!]
ayaka_ has quit [Remote host closed the connection]
tzimmermann has joined #dri-devel
Zopolis4 has quit []
danvet has joined #dri-devel
mvlad has joined #dri-devel
JohnnyonF has joined #dri-devel
JohnnyonFlame has quit [Ping timeout: 480 seconds]
Mangix_ has joined #dri-devel
Mangix has quit [Ping timeout: 480 seconds]
fxkamd has quit []
frieder has joined #dri-devel
<emersion>
i509vcb: software rendering
Company has quit [Quit: Leaving]
sghuge has quit [Remote host closed the connection]
sghuge has joined #dri-devel
<Lynne>
i509vcb: the render node, no point in exposing the display node, since you couldn't run graphics/compute ops on it
<Lynne>
which are mandatory
digetx has quit [Remote host closed the connection]
digetx has joined #dri-devel
<emersion>
Lynne: swrast can
robobub has joined #dri-devel
robobub has quit [Max SendQ exceeded]
Zopolis4 has joined #dri-devel
robobub has joined #dri-devel
yuq825 has joined #dri-devel
digetx is now known as Guest9108
digetx has joined #dri-devel
Guest9108 has quit [Ping timeout: 480 seconds]
heat has quit [Ping timeout: 480 seconds]
Ahuj has joined #dri-devel
<MrCooper>
hays: make sure CONFIG_FRAMEBUFFER_CONSOLE is enabled in the kernel configuration
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
pivi has joined #dri-devel
pcercuei has joined #dri-devel
<mslusarz>
Venemo, cmarcelo: there are some crazy internal people who force enable nv mesh shader on anv (even though I told them once they should stop doing that), so we have to understand why they do that, but other than that I don't see any problem with removing NV_mesh_shader
<mslusarz>
we have to be careful to not remove all codepaths which are, in theory, nv extension-only, because some of them will still be needed for ext on anv (at least for dg2) :/
<pivi>
hello everybody
<pivi>
is it fine to ask general questions on DRM development here? I am going crazy on how to fix a few drivers and I would need some guidance
<MrCooper>
pivi: sure
apinheiro has joined #dri-devel
lynxeye has joined #dri-devel
godvino has joined #dri-devel
Ahuj_ has joined #dri-devel
tursulin has joined #dri-devel
Ahuj_ has quit [Read error: Connection reset by peer]
<pivi>
I have the following DRM display chain: TIDSS (parallel RGB output) -> DSI Bridge (tc358768) -> LVDS bridge (ti-sn65dsi83) -> LVDS display. Between the TIDSS and the TC358768 I have a MEDIA_BUS_FMT_RGB666_1X18, however the TIDSS receives the LVDS format, e.g. MEDIA_BUS_FMT_RGB888_1X7X4_SPWG. If I manually force the format on the TIDSS it's all good, otherwise, nothing really works
<danvet>
uh
<pivi>
I tried implementing atomic_get_input_bus_fmts on the tc358768, but without much luck so far
<danvet>
so the short answer is that the atomic framework isn't quite ready for this yet
<danvet>
or at least I need to check where we are
<pivi>
so there is no way in the framework to have a bridge with a different media fmt between input/output?
rasterman has joined #dri-devel
<danvet>
pivi, well I need to first read up on where we are, we have some pieces but maybe not all yet
<danvet>
might also be better for bbrezillon
ice99 has joined #dri-devel
Ahuj has quit [Read error: Connection reset by peer]
godvino has quit [Read error: Connection reset by peer]
godvino has joined #dri-devel
<danvet>
pivi, nah from a quick look bbrezillon added it all in 2020 and it should work with that hook
<danvet>
unfortunately there's no overall howto doc with maybe a nice dot graph that shows how it does work
<danvet>
pivi, so where are you struggling?
<pivi>
I defined atomic_get_input_bus_fmts in tc358768 and the TIDSS still does not get it
<pivi>
not sure if there is more that needs to be implemented, the tc358768 does not really implement the whole atomic callbacks
<danvet>
you need them all
<danvet>
it's not documented I guess, there's just a code comment
<pivi>
I would need to move to the atomic API also the TIDSS?
<pivi>
anyway, is it in general appreciated to update DRM drivers to the atomic API?
<danvet>
hm tidss is already atomic?
<danvet>
what you need is that each bridge is supporting the atomic state stuff
<danvet>
because that's where the input/output bus fmt is stored in
<danvet>
i.e. the bridge_funcs->atomic_duplicate/destroy_state hooks
<pivi>
ok, cool. so sn65dsi83 seems fine, tc358768 I can update myself. the TIDSS I am not sure about the current state, nor whether I need to update it (it's not just a bridge)
<danvet>
the kerneldoc in there explains that they're mandatory if you implement any of the other atomic_ hooks
<danvet>
pivi, from a quick look tidss is atomic already
<danvet>
pivi, if you feel like it, might be good to do a patch to add the "To implement this hook, the @atomic_duplicate_state and @atomic_destroy_state hooks must be implemented too" boilerplate at the bottom of each atomic_ hook kerneldoc to help people realize this?
<pivi>
ok, so if this is the case I am confused. because the tc358768 I already did before asking here, but I was not able to get the results I was expecting. (to be clear, while I am familiar with kernel development, I am not with the DRM subsystem, so who knows ;-)
<pivi>
danvet: I can do it, as soon as I get something "working" :-)
<bbrezillon>
pivi: do you have a public branch I can look at?
<danvet>
pivi, thx!
<pivi>
bbrezillon: not yet, I have a mess I am ashamed of :-). In addition to that I need some TI downstream patch on the TIDSS to test on my HW (TI AM62).
<bbrezillon>
it's supposed to be a 2 steps thing => first select the bus formats for all links, store them in the atomic state, and then, when the atomic state is applied, program the bridge/display controller accordingly
<pivi>
bbrezillon: I can prepare something easily, would you prefer this or just an RFC on the mailing list?
<pivi>
bbrezillon: do I need to have all the bridged to implement atomic_get_input_bus_fmts ?
<pivi>
bbrezillon: s/bridged/bridges/
<danvet>
I guess if you really want to go overboard, a bus format negotiation DOC: section with dot graph that explains how it all flows and connects would be perfect
<danvet>
pivi, yup
<danvet>
otherwise the next bridge gets MEDIA_BUS_FMT_FIXED
<danvet>
maybe also something we should add to the docs somewhere ...
<pivi>
danvet: hehe, I feel a little bit bad about me documenting stuff I can barely understand at the moment ;-)
<danvet>
plus a link to drm_atomic_helper_bridge_propagate_bus_fmt as the no-op/pass through helper
<danvet>
pivi, don't be, you're the perfect guinea pig to discover the gaps
<bbrezillon>
for those where you have a choice to do, yes. If the bridge supports just one input or output format, the core can pass the FMT_FIXED for you.
<danvet>
like we try to document stuff, but the devs working on something generally are blind to what is important to note when you have no idea
godvino1 has joined #dri-devel
<danvet>
bbrezillon, I guess making drm_atomic_helper_bridge_propagate_bus_fmt the default is too much risk of breaking stuff?
<javierm>
pivi: IME that's the best moment to post doc patches since a) it's meant exactly for people like you and b) getting review is the best way to make sure that you understand the concepts
<danvet>
^^ this ^^
<pivi>
anyway, I do not understand what's going on. I have implemented atomic_get_input_bus_fmts() on the tc358768, yet the tidss gets neither MEDIA_BUS_FMT_FIXED nor the only option I provided from the tc358768 (RGB666)
<danvet>
yeah it just skips these callbacks if you don't have the other atomic hooks implemented
<pivi>
no, I am sure atomic_get_input_bus_fmts()
<danvet>
pivi, you should get a WARN_ON splat though
<pivi>
danvet: I had all the callback implemented
<danvet>
select_bus_fmt_recursive() <- the one in here
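[editor's sketch] The negotiation danvet and bbrezillon describe walks the chain from the last bridge back to the first, asking each bridge which input formats it accepts for a given output format. Below is a toy Python model of that idea, not kernel code: `Bridge`, `negotiate()`, `has_state` and the format strings are hypothetical stand-ins, the RGB888_1X24 link format between the two bridges is an assumption for illustration, and error handling is much simpler than in the real helper.

```python
# Toy model of DRM bridge bus-format negotiation; NOT kernel code.
# Bridge, negotiate() and has_state are hypothetical stand-ins.
from dataclasses import dataclass
from typing import Callable, Optional

FIXED = "MEDIA_BUS_FMT_FIXED"


@dataclass
class Bridge:
    name: str
    # Stand-in for atomic_get_input_bus_fmts(): output format -> candidate inputs.
    get_input_fmts: Optional[Callable[[str], list]] = None
    # Stand-in for "implements the atomic_duplicate/destroy_state hooks".
    has_state: bool = True
    input_fmt: Optional[str] = None
    output_fmt: Optional[str] = None


def negotiate(chain, idx, out_fmt):
    """Pick formats for chain[0..idx], walking from the last bridge to the first."""
    cur = chain[idx]
    if cur.get_input_fmts is None:
        # No negotiation hook: the previous bridge is asked to output FIXED.
        if idx > 0 and not negotiate(chain, idx - 1, FIXED):
            return False
        if cur.has_state:
            cur.input_fmt, cur.output_fmt = FIXED, out_fmt
        return True
    if not cur.has_state:
        # Hook without bridge state: modelled as a plain failure here.
        return False
    for in_fmt in cur.get_input_fmts(out_fmt):
        # The first bridge takes its first candidate; the others recurse.
        if idx == 0 or negotiate(chain, idx - 1, in_fmt):
            cur.input_fmt, cur.output_fmt = in_fmt, out_fmt
            return True
    return False


# pivi's chain, with an assumed RGB888_1X24 link between the two bridges.
tc358768 = Bridge("tc358768", lambda out: ["MEDIA_BUS_FMT_RGB666_1X18"])
sn65dsi83 = Bridge("sn65dsi83", lambda out: ["MEDIA_BUS_FMT_RGB888_1X24"])
chain = [tc358768, sn65dsi83]
ok = negotiate(chain, len(chain) - 1, "MEDIA_BUS_FMT_RGB888_1X7X4_SPWG")
print(ok, tc358768.input_fmt)
```

In this model the display controller would read `tc358768.input_fmt` to learn what it has to drive, which is the lookup tidss is said to be missing later in the log.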
<pivi>
anyway, I agree with bbrezillon , I should really have code to share, otherwise it's just too difficult
<danvet>
note that tidss needs to dig out the bus fmt from the input fmt for the first bridge's state
<danvet>
it's not in the crtc state tidss has access to
<danvet>
I think at least
<pivi>
danvet: probably this is what is missing
<danvet>
bbrezillon, we don't have a helper for this for the crtc?
elongbug has joined #dri-devel
<danvet>
this = getting the output fmt for the crtc so it matches input_fmt of the first bridge
<danvet>
maybe with a default of looking at display info fmt if there's no bridge chain
<pivi>
danvet, bbrezillon: I'm gonna prepare some branch that I can share here to make this a little bit more concrete.
<danvet>
bbrezillon, or do the crtc drivers just open code the bridge state lookup dance?
<bbrezillon>
danvet: nope, we don't have that, but the first bridge in the chain is usually part of the display controller driver. Besides, there's no concept of output bus format at the CRTC level
<danvet>
bbrezillon, yeah, but for bridges they have both input&output in their state
<bbrezillon>
I'd need to look at it, it's been a long time
<danvet>
so just figured a nice helper might be useful
<danvet>
but if all the drivers just put the fifo stuff into the crtc/planes and the first bridge is the first real output then I guess they don't need that
<danvet>
if the crtc->bridge bus fmt is an internal impl detail
<bbrezillon>
yeah, not sure get_input_fmts() is used on the first bridge, I'd have to check
<danvet>
but maybe for some drivers (tidss?) it's different
<danvet>
drm_crtc_helper_get_output_fmt(crtc, state) or so might be good
<danvet>
state so we get the right one both in check and commit functions
<danvet>
pivi, ^^ you might want to consider adding that
sgruszka has quit [Ping timeout: 480 seconds]
<danvet>
bbrezillon, mxsfb_crtc_atomic_enable and mxsfb_crtc_mode_set_nofb essentially open-code (and in a convoluted way) the helper I have in mind
<danvet>
hm sounds like we even want to provision for a default to handle that case
<danvet>
but essentially the algorithm to compute the crtc output bus flags:
<danvet>
1. look at first bridge's state input_bus_cfg
<danvet>
2. look at first bridge->timings->input_bus_flags old style approach
<danvet>
3. look at connector->display_info.bus_formats[0]
<danvet>
4. some default
<danvet>
pivi, if you also need this in tidss then I think would be good to extract that from mxsfb (the code is a mess) and reuse
<danvet>
4. is some _driver_ default
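[editor's sketch] The four-step lookup danvet lists is a first-match-wins fallback chain. A minimal Python model of just the ordering follows, not kernel code: `pick_crtc_output()` and its parameter names are hypothetical, and each source is reduced to an optional value. Note that step 2 in the chat refers to bus flags rather than a bus format, so a real helper would presumably fill in both; this sketch does not model that.

```python
# Toy model of the lookup order from the chat; NOT kernel code.
# pick_crtc_output() and its parameters are hypothetical stand-ins.
from typing import Optional, Sequence


def pick_crtc_output(first_bridge_state: Optional[str],
                     first_bridge_legacy: Optional[str],
                     display_info_formats: Sequence[str],
                     driver_default: str) -> str:
    # 1. Value negotiated into the first bridge's atomic state.
    if first_bridge_state is not None:
        return first_bridge_state
    # 2. Old-style static information attached to the first bridge.
    if first_bridge_legacy is not None:
        return first_bridge_legacy
    # 3. First entry advertised by the connector's display info.
    if display_info_formats:
        return display_info_formats[0]
    # 4. Whatever the driver picks as its own default.
    return driver_default


print(pick_crtc_output(None, None, ["MEDIA_BUS_FMT_RGB666_1X18"],
                       "MEDIA_BUS_FMT_RGB888_1X24"))
```

Passing the atomic state into such a helper, as suggested in the chat, is what would let it return the same answer in both the check and the commit paths.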
<HdkR>
Did Xe ever fix its struct packing in the uapi with the RFC? I didn't pay attention to it after the first posting.
<danvet>
mlankhorst, ^^
<danvet>
mbrost/thellstrom not here, so this is on you :-)
<bbrezillon>
danvet: I think step 3 only makes sense if there's just one bridge in the chain, but I also don't want to change that :D
<danvet>
bbrezillon, step 3 is for no bridge
<danvet>
but most drivers just avoid that by having a panel bridge so I guess in practice it's really just legacy fallback
devilhorns has joined #dri-devel
<bbrezillon>
uh, you're right
<danvet>
so I think the value in this would just be in kinda making it all official (plus it's what mxsfb has implemented right now)
<danvet>
with maybe a very strong hint in the kerneldoc that bridges really should implement the atomic state stuff
<danvet>
plus display_info is the fallback already in the bridge helpers too, so this would be consistent
Mangix_ has quit []
itoral has quit [Remote host closed the connection]
shashanks has joined #dri-devel
Mangix has joined #dri-devel
pochu has joined #dri-devel
godvino1 has quit []
godvino has joined #dri-devel
vliaskov has joined #dri-devel
ice9 has joined #dri-devel
ice99 has quit [Read error: Connection reset by peer]
ice9 has quit [Read error: Connection reset by peer]
ice9 has joined #dri-devel
godvino1 has joined #dri-devel
godvino has quit [Ping timeout: 480 seconds]
gio has quit [Ping timeout: 480 seconds]
djbw has quit [Read error: Connection reset by peer]
mohamexiety has joined #dri-devel
godvino1 has quit []
aravind has quit [Ping timeout: 480 seconds]
aravind has joined #dri-devel
smilessh has joined #dri-devel
kxkamil has joined #dri-devel
dtmrzgl has joined #dri-devel
_xav_ has quit [Ping timeout: 480 seconds]
_xav_ has joined #dri-devel
kts has joined #dri-devel
APic has quit [Quit: Upgrading GNU Screen]
APic has joined #dri-devel
<zamundaaa[m]>
When allocating memory with VK_EXT_image_drm_format_modifier, is it possible to decide whether or not the resulting buffer should be scanout capable? If so, how?
<dj-death>
we should probably assume that by default
<emersion>
zamundaaa[m]: it is not possible
<emersion>
per the spec
<emersion>
in practice: mesa always allocates scanout capable, nvidia proprietary does not
<emersion>
(given a scanout-capable modifier)
<emersion>
zamundaaa[m]: if you want guaranteed scanout-capable, you need GBM and GBM_BO_USE_SCANOUT
<zamundaaa[m]>
well that's annoying then
<emersion>
why?
<zamundaaa[m]>
The background behind this question is that multi gpu in KWin right now isn't fast on all PCs. That is, on some devices it needs to fall back to CPU copy, which on some devices is very very slow and inefficient
<emersion>
why do you want to allocate with vulkan?
<zamundaaa[m]>
Using EGL to import a dmabuf across GPUs proved to be buggy and doesn't exactly offer a ton of control either
<emersion>
EGL or Vulkan should work the same
<emersion>
IOW, one shouldn't be more buggy than the other
<zamundaaa[m]>
I can't decide if a buffer is accessible from the CPU with EGL
<zamundaaa[m]>
With Vulkan I could allocate buffers where I'm guaranteed that copying it around and rendering both work well
<lumag>
aknautiy_, jani suggested that you might help with it (while he doesn't have time)
<zamundaaa[m]>
emersion: yeah I know that performance wise it's never gonna be the best for both at the same time
<zamundaaa[m]>
but with EGL I'm facing a situation where using modifiers doesn't work because of a RadeonSI bug, and if I force linear then performance is absolutely horrible once I need to blit stuff from the framebuffer (to OpenGL-allocated textures). So I assume Mesa allocates the buffer in system memory, because I can't explain it any other way.
<lumag>
narmstrong, robertfoss: while the rest of the series is not finished with the reviews, Jani wrote that he is fine with merging patches 1-5 through drm-misc. Would that be possible?
<narmstrong>
lumag: yep I saw his reply, I thought an intel guy would apply those
<lumag>
jani, could you please comment ^^
<ccr>
zmike, aye aye
aravind has quit [Ping timeout: 480 seconds]
aravind has joined #dri-devel
<pepp>
zamundaaa[m]: which radeonsi bug? #8431?
robobub has quit []
<zamundaaa[m]>
yeah
godvino has joined #dri-devel
Zopolis4 has quit []
pjakobsson_ has joined #dri-devel
pjakobsson has quit [Ping timeout: 480 seconds]
ahajda has joined #dri-devel
flibitijibibo has joined #dri-devel
godvino has quit [Ping timeout: 480 seconds]
<lumag>
narmstrong, ok, Intel guys will take care of those series.
<danvet>
gfxstrand, robclark syncobj ack or discussion still ongoing?
kts has quit [Quit: Konversation terminated!]
nehsou^ has quit [Remote host closed the connection]
fxkamd has joined #dri-devel
yuq825 has left #dri-devel [#dri-devel]
samuelig has quit [Ping timeout: 480 seconds]
vjaquez has quit [Ping timeout: 480 seconds]
godvino has joined #dri-devel
<danvet>
fyi moved drm-fixes to -rc4 if anyone wants to roll forward
<danvet>
plan to do the same with -next once the syncobj pull is in
<pivi>
danvet, bbrezillon: I was able to make some (_small_) progress on this morning's discussion. Hacking around the tidss code I was able to confirm that indeed my changes on the tc358768 input bus format are correct.
<pivi>
what is still confusing to me is how this tidss driver should be changed. in your example this morning you somehow mixed up bus_flags and bus_format, and I have no idea if this is wanted and I just do not understand it or something else
<pivi>
(for the tc358768 bridge I'll prepare and send a patch in the next couple of days)
jdavies has joined #dri-devel
<pivi>
danvet: I mean this
<pivi>
1. look at first bridge's state input_bus_cfg
<pivi>
2. look at first bridge->timings->input_bus_flags old style approach
<pivi>
3. look at connector->display_info.bus_formats[0]
jdavies is now known as Guest9144
<pivi>
4. some driver default
<danvet>
I just looked at what mxsfb is doing, maybe that doesn't make much sense
<pivi>
this fe141cedc433 ("drm/imx: pd: Use bus format/flags provided by the bridge when available") from bbrezillon looks also interesting
<pivi>
as an example to look at
<danvet>
hm yeah that looks a bit like another copy of the same theme
<danvet>
pivi, and yeah that helper would need to fill out a drm_bus_cfg with both format and flags I guess
<danvet>
I think mxsfb only computed flags for whatever reasons
<pivi>
the current API, atomic_get_input_bus_fmts, is only taking care of bus format. is it supported to have the equivalent for the bus flags or not yet? In my specific case I do not need it, the whole bridge chain is able to do anything and I do not need to restrict/alter it in any way
<gfxstrand>
danvet: I think I'll RB today if robclark responded since I last looked
<danvet>
pivi, it supports filling out drm_bus_cfg I thought, which has both?
<robclark>
ok, I guess I'm sending another revision for a couple more typos
Guest9144 has quit [Ping timeout: 480 seconds]
<bbrezillon>
for tidss to take part in the bus format negotiation, the encoder bit of the tidss driver should implement the drm_bridge interface
<pivi>
from what I can tell it has a `unsigned int *num_input_fmts` output parameter, and that's it.
<danvet>
robclark, I'm fine if you just respin the pr with the ack and fixes
Leopold_ has quit [Ping timeout: 480 seconds]
<danvet>
robclark, if you do maybe also include the igt/userspace links for completeness, since the pr mail is kinda like series cover letter if we do topic branch
<pivi>
bbrezillon: I did hack around something in tidss_encoder_atomic_check() at the moment.
<robclark>
ok, will do.. need to do a couple other things first
<pivi>
it is already going through the bridges, I just take the bus fmt out of the first one and use it. in the immediate it was mainly to prove that my changes on the other bridge were correct
<bbrezillon>
pivi: well, that can work, but that means you don't get to select the input format chosen by the first bridge in the chain
<bbrezillon>
meaning you can end up with something that's not supported by the tidss driver at the time the atomic state is checked
<pivi>
bbrezillon: this fe141cedc433 ("drm/imx: pd: Use bus format/flags provided by the bridge when available") is doing exactly what you wrote here, implementing the drm_bridge interface, right?
<bbrezillon>
I don't remember, let me check
<jani>
lumag: narmstrong: looks like the patches didn't pass our CI (didn't apply). would you mind rebasing and resending patches 1-5 again as a new thread, Cc: intel-gfx, and we can apply with the CI results
godvino has quit [Read error: Connection reset by peer]
fxkamd has quit []
<bbrezillon>
pivi: yep, I think it does
<mslusarz>
Venemo, cmarcelo: I got a confirmation that we are free to remove support for NV_mesh_shader from ANV
<gfxstrand>
danvet, robclark: The problem is that I'm still not 100% seeing how deadlines fit into Vulkan. Like, robclark's patch works and does a thing but is it the right thing? IDK.
<danvet>
gfxstrand, you mean for the entire end-to-end tuning problem from app to hw?
<pivi>
bbrezillon: thanks. I'll focus on getting the tc358768 changes ready first, and afterwards dig into the proper way to improve the tidss driver.
<danvet>
I have vague terrible memories of an endless framerate vulkan extension shed
<robclark>
I think there is some room for vk extension to give app control
<robclark>
but that doesn't exist today
<robclark>
in the mean time we can do what is right 90% of the time and driconf our way around the rest if needed
<Venemo>
mslusarz: great, feel free to open a MR to do so. after both radv and anv are merged we can then remove it from nir and spirv
<lumag>
jani, I tried, they apply on top of drm-tip. Is there any other branch that I should use?
dvereb has joined #dri-devel
<jani>
lumag: huh. let me hit retest and see what happens
<danvet>
robclark, yeah that's my take too, we've had deadlines boosting in some form in i915 since forever, it's clearly useful
<danvet>
but also the larger bikeshed is epic so we just need something which is 80% or so but at least across drivers to move one step forward
<danvet>
I think at least
<danvet>
worst case we'll have a DEADLINE2 ioctl or something :-)
<robclark>
yeah, my thought as well
<robclark>
I don't think we need new ioctl for this.. since the debate is in userspace
<robclark>
maybe for tursulin's deadline scheduler, that is a different flag
frieder has quit [Remote host closed the connection]
<danvet>
gfxstrand, so the ack I'm looking for is not "this is definitely what we'll use in a vk extension" but more "this doesn't look totally wrong and looks good enough to at least move the discussion"
<dvereb>
Good morning. Apologies if I'm in the wrong place. I've encountered a bug using fox-toolkit and their mailing list indicated it appears to be in the mesa library. I'm here to double check where to post it. I've narrowed down one specific line that fixes my issue (a simple boolean flag). It was fixed in 22.3.4, but has since been changed back by 23.0.0. Do I subscribe & post it to the mesa-users mailing list, should I bring it up here, or perhaps something else? Thank you.
<gfxstrand>
danvet: Well, that seems to be a bit of the problem.
<gfxstrand>
danvet: I think the biggest confusion is over what a deadline means and whether or not not having a deadline is a secret third thing that we actually want.
<gfxstrand>
Or should not having a deadline be considered a legacy behavior?
bmodem has quit [Ping timeout: 480 seconds]
<daniels>
life sure would be easier if deadlines were legacy and deprecated, yeah
<danvet>
yeah it's not really a deadline but more a "maybe uncomfortableness will ensue past this time" line
<bnieuwenhuizen>
random Q I have with the entire thing is how does this affect the direction of going UMF?
<danvet>
bnieuwenhuizen, you don't :-)
devilhorns has quit []
<danvet>
the entire umf conundrum is that you get lower latency at the price of less fairness because the scheduler has no idea wtf is going on anymore
<gfxstrand>
bnieuwenhuizen: Yup. That's another question.
<gfxstrand>
Do we want to add more usages of dma_fence as a communication channel.
<gfxstrand>
We tried that with error propagation. It did not go well.
<danvet>
in theory intel hw/fw can schedule away on umf waits
<danvet>
in practice it kills both throughput _and_ latency and so isn't enabled
<bnieuwenhuizen>
side-note, did anyone do anything interesting with the value of the deadline? I thought the series just made Intel boost if there is a deadline?
<gfxstrand>
I assume HW/SW can schedule away on UMF waits. The problem is that there's no communication channel with UMF.
djbw has joined #dri-devel
<danvet>
bnieuwenhuizen, iirc it roughly controls sched order in the same prio class
<gfxstrand>
bnieuwenhuizen: Rob's been doing some power management stuff with freedreno based on it. There was an XDC talk, I think.
<danvet>
but the main one is ramping clocks
<danvet>
bnieuwenhuizen, imo what exactly we do can be changed since it's best effort tuning problem
<danvet>
and those are a lot more lax wrt regressions
<bnieuwenhuizen>
yeah I guess for scheduling that could work. For clocks I'm still like "what should the clock be if we don't know the workload?"
mbrost has joined #dri-devel
<danvet>
bnieuwenhuizen, best I've seen is try to remember the old one and reset that one
<danvet>
per ctx or so
Daaanct12 has joined #dri-devel
<danvet>
that's good for spikey workloads
<danvet>
you could also co to max and then ramp down, but with todays power constrained chips that often ends badly
<danvet>
*go to max
<danvet>
remembering the old freq needed for that ctx is the hard part :-)
<danvet>
the new cpu most recent virtual deadline patches that floated around and lwn covered look really interesting for this
<danvet>
but on the gpu we don't have the infra to get accurate enough timings
<danvet>
plus no useful preempt either
<danvet>
useful as in = doesn't take substantial part of an entire frame, you might as well just wait for the job to finish
alyssa has left #dri-devel [#dri-devel]
mbrost has quit [Read error: Connection reset by peer]
<danvet>
gfxstrand, bnieuwenhuizen robclark do we need to rename it to something like DISCOMFORTLINE to make the concept clear and untangle from actual deadline scheduling?
YuGiOhJCJ has quit [Quit: YuGiOhJCJ]
Daanct12 has quit [Ping timeout: 480 seconds]
<robclark>
s/DEADLINE/PONY/
<robclark>
danvet: anyways, at least the name does change uabi
<danvet>
well uapi but yeah it'd be annoying
<robclark>
I tried to emphasize "hint" everywhere.. that was the best solution to communicate what it does that I came up with
pivi has quit [Quit: thanks you all - talk to you soon again]
mbrost has joined #dri-devel
alyssa has joined #dri-devel
heat has joined #dri-devel
<danvet>
robclark, yeah it really should be clear enough ...
<alyssa>
and then once in a while will send up a big batch of reviewed commits to Marge upstream
<alyssa>
to reduce mesa ci roundtrips
<alyssa>
anything that touches non-vendor code is still going directly to mesa/mesa for review, this is just for purely src/asahi + src/gallium/drivers/asahi changes
<danvet>
this starts to feel a lot like the kernel pr model with local integration trees
<alyssa>
TBD whether this is a materially better or worse workflow
<danvet>
(minus the boutique tree nonsense maybe)
<alyssa>
danvet: it's not ideal, no.
<danvet>
alyssa, oh no judgement meant, just observation and maybe a place for where to steal ideas
<alyssa>
but there's no good option here so I'm trying a bad option in the hopes it's useful.
<danvet>
like "don't make the subtrees too small" for some value of "too small"
<alyssa>
ah, yeah
<alyssa>
I'm doing this at the asahi project level with the idea that it is mostly a closed set of developers and reviewers going around there
<danvet>
yeah it makes some sense, eventually no monotree scales well enough
<alyssa>
yeah
<HdkR>
"Week 48 mega-sync"
<alyssa>
even if our CI were perfect, the serialization of the merge queue is a fundamental limit
<danvet>
*monorepo
<HdkR>
I like it, how soon until drm entirely lives in gitlab for PRs? :)
<alyssa>
spamming the marge queue with a zillion alyssa MRs will only slow down everyone else's merges and frustrate alyssa
<danvet>
monotree with multiple repos is imo better than outright full splitting
<alyssa>
AFAIU, the major pain point with this is that it discourages drive-by reviews
<danvet>
HdkR, I'm pretty sure there's a universe where it already happened :-(
<danvet>
alyssa, yeah it encourages silos, hence the "don't be too small"
<alyssa>
and then by the time people see the upstream batched MR it's too late for making changes
<bnieuwenhuizen>
also starts to introduce rebase conflicts, where whoever is upstreaming is now likely responsible for the rebase pain
<alyssa>
which is why I'm hopeful that by bringing this up here, anybody who cares about reviewing asahi changes can do so
<bnieuwenhuizen>
though I bet that isn't the worst yet
<danvet>
bnieuwenhuizen, which is why linux does real merges for these integration trees
<alyssa>
bnieuwenhuizen: I'm betting on the rebase conflicts being less painful than the merge queue. This is an experiment. If it doesn't work we'll go back to the old way.
<danvet>
not least because a rebase also invalidates all the subtree testing you've done
<danvet>
it's some tradeoff, but generally if you rebase right before merging there's excellent chances git bisect falls over on it
<danvet>
the tradeoff ofc is that if you don't rebase, then there's a chance you might be stuck on an unlucky regression in your baseline
<alyssa>
danvet: re size of tree, the level I would see would be per-hardware vendor
<alyssa>
s/tree/repo
<alyssa>
e.g. both radeonsi and radv would be a repo together
<alyssa>
(even though it's different software teams backing them)
<bnieuwenhuizen>
other side is whether it makes sense to have marge-bot batch MRs together for CI to make the single repo scale better
<alyssa>
that seems harder to do nicely
<bnieuwenhuizen>
but that comes with more human overhead in case the batch fails
<alyssa>
I mean
<alyssa>
I'm all good for people making CI scale better
<alyssa>
This is one approach Lina and I are going to try. If it works, great. If it doesn't, we go back, great.
<alyssa>
I
<danvet>
alyssa, ime 10 people is the lower boundary
<danvet>
plus/minus
<alyssa>
I'm mostly making noise here not to endorse the approach but to let people know that this is where the Asahi commits are going if they're looking for them.
<danvet>
below that you don't have a team and the overhead of the separate tree in terms of paperwork doesn't pay for itself
<danvet>
at least long term, for an experiment smaller might be better
<danvet>
paperwork = rebasing, doing integration mr, handling the inevitable fallout for everyone, rotating that responsibility
<alyssa>
yeah, we'll see how bad that gets
<alyssa>
it doesn't seem scary now but we'll see
<gfxstrand>
robclark, danvet: I don't think the name is the problem.
gouchi has joined #dri-devel
<gfxstrand>
The problem to me is what things with no deadline mean in this context. Is it the same as 0? The same as UINT64_MAX? Some secret 3rd thing?
<gfxstrand>
If it's a secret 3rd thing, is that the thing we actually want for vkWaitForFences()? Or do we want deadline=0? Sure, deadline=0 solves a particular bug because the Intel wait-boosting is funky but is that what we really want?
<gfxstrand>
The reason why I care is because, if deadline=Some(0) isn't what we want but we really want deadline=None (Switching to Rust because it makes more sense), then Mesa as a userspace driver isn't a good idea.
tobiasjakobi has joined #dri-devel
tobiasjakobi has quit [Remote host closed the connection]
<HdkR>
alyssa: But how do you feel about shorter pre-merge CI and longer post-merge CI and regressions get shamed until resolved? :)
<alyssa>
HdkR: this was shot down very hard.
alyssa has left #dri-devel [#dri-devel]
<HdkR>
oop
<tursulin>
gfxstrand: Yeah, and we don't really need deadline=now to solve the clvk wait-boosting problem. Just *some* way to know someone is waiting. Hence I was concerned about the proliferation of "now" for random things.
<tursulin>
what does no deadline hint mean is probably not interesting, just means no one cares
<tursulin>
I do not understand what is the issue with marking waits as deadline=U64_MAX
<gfxstrand>
Same problem as you were raising with deadline=0. Someone may suddenly de-prioritize all Vulkan workloads because they say they don't care how fast they run.
<tursulin>
the only extra part is that it needs extra handling in the kernel and both msm and i915 parts can AFAICT work just the same as proposed
<gfxstrand>
Whatever default we pick will have implications in the future.
<gfxstrand>
At least with deadline=0 we're indicating that someone is actively waiting on it right now and, as far as we can tell, has work to do whenever we get finished.
<tursulin>
gfxstrand: you worry about deadline=UINT_MAX getting de-prioritized relative to deadline=no-deadline?
<gfxstrand>
Sure
<gfxstrand>
Depends on the behavior of no-deadline.
<gfxstrand>
That's why I'm harping on the no-deadline case and nailing down exactly what that means relative to the other two.
<tursulin>
right
<tursulin>
it is probably worth trying to nail that down rather than to give up and immediately talk about adding more deadline flags. It is not likely policy discussion will be easier with more flags..
<tursulin>
I mean more deadline flags if this becomes a problem.
<gfxstrand>
One could make an argument that None should be equivalent to Some(u64::MAX) and that setting a deadline does `deadline = min(deadline, new)`. That seems like reasonable behavior. If that's the behavior we want, I want to make sure we're clear on it.
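The rule gfxstrand describes above can be sketched in a few lines of Rust. This is only an illustration of the proposed semantics, not kernel or Mesa code: the `FenceDeadline` type and its method names are hypothetical, and the deadline is assumed to be an absolute timestamp in arbitrary units.

```rust
/// Hypothetical model of a fence's deadline hint (not a real kernel type).
/// `None` means no hint was ever set on the fence.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FenceDeadline(pub Option<u64>);

impl FenceDeadline {
    /// Effective deadline: "no hint" is treated exactly like `u64::MAX`.
    pub fn effective(self) -> u64 {
        self.0.unwrap_or(u64::MAX)
    }

    /// Apply a new hint: `deadline = min(deadline, new)`.
    /// A later hint can never push an earlier deadline back out.
    pub fn set(&mut self, new: u64) {
        self.0 = Some(self.effective().min(new));
    }
}
```

Under this model, setting `u64::MAX` on a fence with no hint leaves its effective deadline unchanged, which is the equivalence being argued for.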
<tursulin>
From the scheduling point of view that is IMO sane and reasonable. I think only problem arises with the waitboost hack.
<gfxstrand>
What Vulkan does is a separate policy decision, IMO.
<gfxstrand>
It's not really a waitboost hack. It's us providing the most accurate information we know.
<gfxstrand>
And the most accurate information we know is that the client is waiting on us. What will they do when we're done? How important is that work? Will they actually schedule more GPU work right away? We have no way of answering those questions. All we know is that they're sitting there waiting.
<tursulin>
yep, hence "now" is questionable and I wouldn't call it accurate information - we have information someone is waiting sure, but not what the deadline is
<gfxstrand>
But U64_MAX isn't accurate, either. We know that someone's waiting and they probably care when we get done.
<tursulin>
that is also true :)
<tursulin>
well that they are waiting, whether they care when they get done who knows
<gfxstrand>
I said "probably" :)
<bnieuwenhuizen>
some of it might also be a wait before just freeing the memory, which could totally be U64_MAX unless you have memory pressure
<gfxstrand>
The problem we have right now with i915 is that there is a heuristic and it's basically "Things using the old uAPIs win"
<tursulin>
gfxstrand: it's not even that clear cut
<tursulin>
we know it harms some workloads pretty badly
<gfxstrand>
bnieuwenhuizen: Yeah, we could add a VK_SEMAPHORE_WAIT_LOW_PRIORITY flag to communicate that information.
<tursulin>
what people really want is ability to control waitboost per context
<bnieuwenhuizen>
gfxstrand: pls don't make Vulkan more complicated :P
<gfxstrand>
lol
<danvet>
gfxstrand, so in terms of rust FencePrio = None | SomeWaiter | Deadline uint64
<danvet>
which is I think what you're arguing for?
<danvet>
atm we just have None | Some uint64
<danvet>
(yes it's haskell not rust syntax)
<gfxstrand>
danvet: ¯\_(ツ)_/¯
<tursulin>
if we document that no deadline and U64_MAX should be handled/scheduled equally, and we add a wart (documented or not?) of "intercepting" U64_MAX in drivers to apply waitboost also in that case, does that work?
<gfxstrand>
danvet: I guess, over-all, I'm just not sure why Mesa should be the driver for this.
<danvet>
tursulin, atm they're not
<danvet>
at least not in the patches because of boosting
<tursulin>
yes I know it's not like that in the series
<danvet>
gfxstrand, aside from atomic flips the kernel has no clue what's going on
<danvet>
tursulin, I guess you could scale the waitboost somehow with the deadline, dunno
<danvet>
so that UINT_MAX = no boosting
<danvet>
that would ditch the special case of "no deadline ever applied" being different from UINT_MAX
<danvet>
like you get full boost if the deadline is less than 20ms and nothing if it's more than 10s or whatever you feel like
<danvet>
so I'd go more towards "drivers must make sure there's no difference between MAX and no deadline"
<tursulin>
that makes sense but "is not the waitboost you are looking for" :)
<danvet>
yeah it's a different thing
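danvet's idea of scaling the boost with the deadline could look roughly like the sketch below. The 20 ms and 10 s thresholds come from his message ("or whatever you feel like"); the linear ramp in between, the `now_ns` parameter, the nanosecond units, and the function name are all assumptions made for illustration and do not reflect any driver.

```rust
/// Deadlines closer than this get the full boost (20 ms, from the discussion).
const FULL_BOOST_BELOW_NS: u64 = 20_000_000;
/// Deadlines further away than this get no boost (10 s, from the discussion).
const NO_BOOST_ABOVE_NS: u64 = 10_000_000_000;

/// Hypothetical boost fraction in [0.0, 1.0] for a fence.
/// `deadline_ns` is an absolute time; `None` is handled exactly like `u64::MAX`,
/// so "no deadline ever applied" and "deadline = MAX" cannot differ.
pub fn boost_fraction(now_ns: u64, deadline_ns: Option<u64>) -> f64 {
    let deadline = deadline_ns.unwrap_or(u64::MAX);
    // A deadline already in the past counts as zero time remaining.
    let remaining = deadline.saturating_sub(now_ns);
    if remaining <= FULL_BOOST_BELOW_NS {
        1.0
    } else if remaining >= NO_BOOST_ABOVE_NS {
        0.0
    } else {
        // Assumed linear ramp between the two thresholds.
        let span = (NO_BOOST_ABOVE_NS - FULL_BOOST_BELOW_NS) as f64;
        1.0 - (remaining - FULL_BOOST_BELOW_NS) as f64 / span
    }
}
```

As tursulin notes, this is not the i915-style waitboost: a waiter with no real deadline would get no boost here unless it passes a near deadline such as 0.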
<gfxstrand>
danvet: If None and Some(u64::MAX) are different, then wouldn't we want None for all the Mesa things? In that case, why do we have mesa patches?
<danvet>
vk ext to specify the deadline, just kick the can down the road
<tursulin>
series is piggy backing i915 style waitboost on top of deadline hints
<gfxstrand>
This feels a lot like "mesa should do a thing so we have an excuse to land uAPI"
<gfxstrand>
I don't like those.....
<danvet>
gfxstrand, I think because that's not what i915 did
<danvet>
gfxstrand, oh I thought robclark has chromium patches for this stuff
<danvet>
if chromium only uses sync_file imo drop the syncobj patch and move on
<tursulin>
syncobj is critical for clvk
<danvet>
still could split it out if that helps to keep things moving?
<tursulin>
it's what brings the i915 style waitboosting to it
<gfxstrand>
Wait-boosting is critical for clvk
<gfxstrand>
If we want wait-boosting to happen via deadline=Some(0) then sure
<gfxstrand>
But then why are we arguing that vkWaitForFences should be None/Some(u64::MAX)?
<tursulin>
but the deadline is IMO "fake" - what is needed is a side channel to say "someone is waiting"
<danvet>
well that's why I argued for enum FencePrio = { None, SomeWaiter, Deadline(uint64) }
<danvet>
and then I guess internally the kernel can just map that to Deadline(0) until we have more clue
<danvet>
since on the compositor/kms side we do have an actual deadline
<tursulin>
which allows channeling "someone is waiting" data
<tursulin>
and avoids ABI conundrums
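The three-state hint danvet proposes, together with the interim mapping of "someone is waiting" to a deadline of 0, can be written as actual Rust. The variant names follow the chat; the `kernel_deadline` method is a hypothetical name for the internal mapping, and nothing here corresponds to an existing uAPI.

```rust
/// Hypothetical three-state priority hint, as sketched in the discussion.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FencePrio {
    /// Nobody has expressed any interest in when the fence signals.
    None,
    /// Someone is blocked on the fence, but no real deadline is known.
    SomeWaiter,
    /// An actual deadline, e.g. from the compositor/KMS side.
    Deadline(u64),
}

impl FencePrio {
    /// Assumed interim mapping: treat a bare waiter as `Deadline(0)`
    /// until there is a better idea of what it should mean.
    pub fn kernel_deadline(self) -> Option<u64> {
        match self {
            FencePrio::None => None,
            FencePrio::SomeWaiter => Some(0),
            FencePrio::Deadline(t) => Some(t),
        }
    }
}
```

Keeping `SomeWaiter` as its own variant means userspace never has to claim a fake "now" deadline, so the mapping can change later without touching the interface.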
<danvet>
we could also just add the wait flag to syncobj for now if that's all mesa needs
<danvet>
bit funny uapi with deadline on sync_file and waiters-present on syncobj but oh well :-)
Daanct12 has joined #dri-devel
<mareko>
in NIR, it seems that it's possible to have a non-divergent SSA that is in a conditional block executed based on a divergent condition
<robclark>
danvet: yeah, syncobj is the critical thing right now.. the deadline part is meant to also encompass what i915 does on missed vblank deadlines
<danvet>
robclark, so since the cover letter didn't have it, what is the userspace for sync file then?
<robclark>
as far as scheduling.. I'm on the fence about whether it should piggy-back on the same mechanism/flag.. the current thing is meant to be feedback for gpu freq mgmt
<danvet>
and is there another one for syncobj?
<gfxstrand>
mareko: Yes. As long as all active channels agree, it's considered convergent.
<robclark>
right now the sync_file userspace is igt
<danvet>
uh
<robclark>
but also response to compositor folks
<danvet>
well that's not enough for sure for merging
<danvet>
tsk, tsk
<gfxstrand>
mareko: The moment that gets fed through a phi converging different values from divergent control-flow, the result of the phi is divergent.
<robclark>
pointing out that atomic helper vblank isn't enough if userspace is making composition decision
<danvet>
robclark, I guess I was naively assuming chromium compositor was using that
<robclark>
we aren't _yet_
<robclark>
but the need is pretty clear
<danvet>
yeah some mr/pr/changelist/whatever needs to be ready
<danvet>
yeah but we don't conjectured uabi in drm
<robclark>
if you have userspace deciding whether to recomposite with previous frame or new one
<danvet>
(ok sometimes mistakes happen but they really shouldn't)
<danvet>
*merge conjecture uabi
<danvet>
yay another typo, I give up
<gfxstrand>
mareko: The choice as to whether or not you can use an SGPR and how that works with possible conflicts across divergent branches is a problem left for the back-end RA.
<Lynne>
dj-death: nice, I'll test your descriptor buffer patchset with my code on an a750 tonight
Daaanct12 has quit [Ping timeout: 480 seconds]
lynxeye has quit [Quit: Leaving.]
<robclark>
danvet: I do also have a use for the sync_file uapi but because it is from a vm guest it's going to take some virtgpu plumbing to get to point where I can use it
<danvet>
I guess if the plumbing is really this hard we could merge behind modoption like with the atomic ioctl
<danvet>
but maybe kms atomic ioctl boosting is good enough for that?
<danvet>
(essentially what I replied on-list too)
<robclark>
for android we have a similar issue to what gnome-shell/etc have, we have SF waiting for fence in userspace and deciding shortly before vblank which version of a surface to use
<robclark>
I'll catch up on list in a bit. doing three different things at once atm
<mareko>
gfxstrand: it's for new linking opts... if I have a store writing a non-divergent value, the stored value can be divergent if the store is conditional
smilessh has quit [Ping timeout: 480 seconds]
mbrost has quit [Ping timeout: 480 seconds]
<gfxstrand>
mareko: Yes, that tracks. Stores and phis behave similarly in that way.
<gfxstrand>
So a variable value is convergent if all its stores store convergent values in convergent control-flow.
<gfxstrand>
You have to look at both
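The rule for variables stated above fits in a small predicate. This is a toy model, not NIR's divergence analysis: the `Store` struct and its fields are hypothetical stand-ins for "is the stored value divergent" and "does the store sit in divergent control flow".

```rust
/// Hypothetical summary of one store to a variable (not a NIR type).
pub struct Store {
    /// The stored SSA value differs between invocations.
    pub value_divergent: bool,
    /// The store executes under a divergent condition.
    pub in_divergent_control_flow: bool,
}

/// A variable is convergent only if every store writes a convergent value
/// AND is executed in convergent control flow; both must be checked.
pub fn variable_is_convergent(stores: &[Store]) -> bool {
    stores
        .iter()
        .all(|s| !s.value_divergent && !s.in_divergent_control_flow)
}
```

This matches mareko's case: a store of a non-divergent value inside a divergent branch still makes the variable divergent. A variable with no stores is reported as convergent here, which is a choice of this sketch.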
<dj-death>
Lynne: thanks, I haven't tested much with vkd3d-proton
<dj-death>
Lynne: I need to do that
<zamundaaa[m]>
<danvet> "but maybe kms atomic ioctl..." <- At least in the case of KWin, boosting once the atomic commit is submitted would be too late, as I'm currently working towards only ever submitting buffers that are ready to KMS, to allow for cursor updates etc to happen later in the refresh cycle
<danvet>
zamundaaa[m], I know
<danvet>
I'm trying to get this thing unstuck somehow :-)
JohnnyonF has quit [Ping timeout: 480 seconds]
<danvet>
atm everything dies in "this isn't good enough for my use-case"
<danvet>
or in "we didn't fully plumb this through, it's just an igt"
<cwabbott>
gfxstrand: I'm finally looking at the vulkan common state thing, and I think vk_subpass_info isn't enough for us - we need to fill out the full thing
Company has joined #dri-devel
<cwabbott>
we need depth/stencil format, color formats, etc. to precalculate a rendering-bandwidth-per-pixel estimate
<mareko>
gfxstrand: I don't think it's tracked with load_input/store_output
<cwabbott>
I don't think it's useful to have a vk_subpass_info that has stuff that vk_renderpass_state doesn't
<alyssa>
gfxstrand: i think NVK should be renamed NAK
<alyssa>
meaning Not A Kompiler
<gfxstrand>
alyssa: :P
<alyssa>
much more descriptive
gouchi has quit [Remote host closed the connection]
<karolherbst>
too late
<karolherbst>
we already got t shirts
<karolherbst>
(and I have a sticker on my laptop)
<gfxstrand>
I still need to figure out how to get airlied his T-shirt
<karolherbst>
it would be really inconvenient for me to replace it
<zmike>
I have two stickers so it definitely can't be done
<alyssa>
you can just claim the stickers were for NVK all along
<alyssa>
Nvidia Very Kool kompiler
mbrost has joined #dri-devel
<alyssa>
not to be confused with codegen, also known as NVK, meaning Not Very Kool
<gfxstrand>
:P
<gfxstrand>
daniels: Someone's telling me they're unable to fork Mesa. Some "you've reached your project max" message?
<daniels>
gfxstrand: they need to click on the issue link in the banner telling them that they can’t fork until they do
gawin has joined #dri-devel
<gawin>
as CI is disabled anywhere outside of main repo, only way to make a run is to create MR to main repo?
<daniels>
gawin: try now, you should be able to run in your repo
rasterman has quit [Quit: Gettin' stinky!]
<gawin>
seems moving forward, thx
<cwabbott>
gfxstrand: the thing is, I'd have to expand vk_subpass_info until it's basically the same as vk_renderpass_state, at which point what's the reason for a separate type?
<cwabbott>
I'd rather just delete vk_subpass_info
apinheiro has quit [Quit: Leaving]
ahajda has quit [Ping timeout: 480 seconds]
fab has quit [Quit: fab]
ice99 has joined #dri-devel
ice9 has quit [Read error: Connection reset by peer]
ice9 has joined #dri-devel
ice99 has quit [Ping timeout: 480 seconds]
ice99 has joined #dri-devel
ice9 has quit [Ping timeout: 480 seconds]
kzd has quit [Quit: kzd]
kzd has joined #dri-devel
ice9 has joined #dri-devel
ice99 has quit [Ping timeout: 480 seconds]
mbrost has quit [Remote host closed the connection]
mbrost has joined #dri-devel
Guest9052 is now known as ybogdano
rosefromthedead has joined #dri-devel
sarnex has quit [Read error: No route to host]
sarnex has joined #dri-devel
macromorgan has quit [Read error: Connection reset by peer]