ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
Haaninjo has quit [Quit: Ex-Chat]
zzoon_2 has joined #dri-devel
Piraty has quit [Remote host closed the connection]
mbrost_ has quit [Remote host closed the connection]
mbrost_ has joined #dri-devel
pallavim has quit [Ping timeout: 480 seconds]
ngcortes has quit [Read error: Connection reset by peer]
Piraty has joined #dri-devel
mbrost_ has quit [Ping timeout: 480 seconds]
Piraty has quit []
Piraty has joined #dri-devel
co1umbarius has joined #dri-devel
Piraty has quit [Remote host closed the connection]
columbarius has quit [Ping timeout: 480 seconds]
Piraty has joined #dri-devel
Kayden has joined #dri-devel
penguin42 has quit [Remote host closed the connection]
mbrost_ has joined #dri-devel
mbrost_ has quit [Remote host closed the connection]
mbrost_ has joined #dri-devel
mbrost_ has quit [Ping timeout: 480 seconds]
mbrost_ has joined #dri-devel
mbrost_ has quit [Ping timeout: 480 seconds]
mbrost_ has joined #dri-devel
TMM has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
TMM has joined #dri-devel
mbrost_ has quit [Ping timeout: 480 seconds]
mbrost_ has joined #dri-devel
dcl^ has joined #dri-devel
aravind has joined #dri-devel
Guest12549 has quit [Quit: ZNC 1.8.2 - https://znc.in]
peelz has joined #dri-devel
peelz is now known as Guest242
heat has quit [Ping timeout: 480 seconds]
aravind has quit [Ping timeout: 480 seconds]
aravind has joined #dri-devel
alanc-away has quit [Remote host closed the connection]
alanc-away has joined #dri-devel
mbrost_ has quit [Remote host closed the connection]
mbrost_ has joined #dri-devel
mbrost__ has joined #dri-devel
mbrost has joined #dri-devel
mbrost_ has quit [Ping timeout: 480 seconds]
danvet has joined #dri-devel
mbrost__ has quit [Ping timeout: 480 seconds]
orbea has quit [Remote host closed the connection]
orbea has joined #dri-devel
Zopolis4_ has joined #dri-devel
itoral has joined #dri-devel
mbrost has quit [Read error: Connection reset by peer]
Company has quit [Quit: Leaving]
bmodem has joined #dri-devel
mbrost has joined #dri-devel
mbrost_ has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
dcz has joined #dri-devel
tzimmermann has joined #dri-devel
camus1 has quit [Ping timeout: 480 seconds]
mvlad has joined #dri-devel
rasterman has joined #dri-devel
vliaskov has joined #dri-devel
camus has joined #dri-devel
frieder has joined #dri-devel
frieder has quit [Remote host closed the connection]
i-garrison has quit [Ping timeout: 480 seconds]
i-garrison has joined #dri-devel
kzd has quit [Quit: kzd]
jfalempe has joined #dri-devel
sghuge has quit [Remote host closed the connection]
sghuge has joined #dri-devel
pjakobsson has quit [Remote host closed the connection]
frieder has joined #dri-devel
Zopolis4_ has quit [Quit: Connection closed for inactivity]
frieder has quit []
frieder has joined #dri-devel
tursulin has joined #dri-devel
pcercuei has joined #dri-devel
kts has joined #dri-devel
urja has quit [Read error: Connection reset by peer]
urja has joined #dri-devel
Haaninjo has joined #dri-devel
<airlied> Venemo: hey you might know, do mesh shader outputs get fixed-function clipped like any outputs from the vertex stages?
Ziemas has quit [Ping timeout: 480 seconds]
lynxeye has joined #dri-devel
vliaskov has quit [Ping timeout: 480 seconds]
vliaskov has joined #dri-devel
pochu has joined #dri-devel
kts has quit [Quit: Konversation terminated!]
swalker_ has joined #dri-devel
swalker_ is now known as Guest262
swalker__ has joined #dri-devel
<javierm> danvet: I wrote on Friday the patch you suggested for mutter but have some issues with the compat layer, the damaged clips are added to the plane's atomic state before committing
<javierm> danvet: I've added my findings in the MR https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/2979, it would be great if you could take a look when you have some time
<javierm> tzimmermann: probably you can help here too ^
mbrost_ has quit []
Guest262 has quit [Ping timeout: 480 seconds]
elongbug has joined #dri-devel
<bbrezillon> danvet: With drm_sched_resubmit_jobs() being deprecated, I wonder who's supposed to set the job's parent field back to something non-NULL before drm_sched_start() is called (those are set to NULL in drm_sched_stop(), when the fence callback is removed). Are drivers supposed to manually iterate over the pending_list to set it up?
mbrost has joined #dri-devel
mbrost_ has joined #dri-devel
<dj-death> any radv folk willing to glance over the NIR/Spirv bits of KHR_ray_tracing_position_fetch? : https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22734
mbrost has quit [Ping timeout: 480 seconds]
mwalle has joined #dri-devel
devilhorns has joined #dri-devel
thellstrom has joined #dri-devel
gawin has joined #dri-devel
<javierm> emersion: ah, Zack did post a v2 that I missed before https://lore.kernel.org/all/50fd57193508f33a1e559ef74599c9e52764850d.camel@vmware.com/T/
<mareko> airlied: yes
<mareko> airlied: mesh shaders are just like vertex shaders, but are launched as compute shaders that must produce "VS outputs"
<emersion> javierm: ah, so what's missing?
<javierm> emersion: only a single thing that danvet mentioned, I've just asked Zack on the mailing list if he plans to post a v3 or if we could help with that
<javierm> basically not using a DRIVER_VIRTUAL driver cap and instead adding a new plane type
<javierm> a DRM_PLANE_TYPE_VIRTUAL_CURSOR or something like that. Since, other than the cursor, the virtual machine KMS drivers behave according to the uAPI contract
<javierm> emersion: but I'm very happy that this is almost ready to land, I first thought that I would need to implement the whole thing to get virtio-gpu out of the mutter atomic deny list
YuGiOhJCJ has joined #dri-devel
<emersion> hm, not sure a new plane type makes sense
<emersion> you'd also need the not-fun part: igt
_xav_ has quit [Ping timeout: 480 seconds]
_xav_ has joined #dri-devel
djbw has quit [Ping timeout: 480 seconds]
mbrost_ has quit [Ping timeout: 480 seconds]
<danvet> javierm, don't use dirtyfb and page_flip together
<danvet> dirtyfb is for frontbuffer rendering, no page flip
<danvet> and hence the fb check to make sure that you actually report damage for the current fb
<danvet> page_flip is for a new buffer and has an implicit damage of everything
<danvet> if you want both page flip and damage, you must use atomic
<danvet> javierm, I don't have an account there, please add this to the mr ^^^
<danvet> bbrezillon, I'm lost, or well not on top of sched discussions enough to have any idea of what you should do :-/
<danvet> emersion, btw on your hdr hackfest summary, unstable uabi isn't a thing in upstream
<emersion> hm
<danvet> all the "hide it behind knobs" we've done was for uapi we planned to keep forever, but weren't sure we have all the pieces and the kernel+userspace stack was bug-free enough to unleash it on users
<danvet> but uapi you plan to actually break/remove is a completely different kind of thing
<danvet> unless you are very careful with not leaking users and make sure all users can fall back to something you won't get regression reports for
<danvet> it'll be a regression and the experimental uabi becomes rather permanent
<emersion> i'm personally in the "let's not merge vendor specific stuff" camp FWIW
<emersion> but this is something we wanted to discuss on-list
<emersion> maybe it's no big deal if it's stable
<danvet> I'm no fan of vendor kms props either
<emersion> harry seemed okay with maintaining this uAPI forever for this hw
<danvet> your writeup at least sounded like that's the compromise because you can remove it again
<danvet> and there's solid chances that's not how it'll play out
<emersion> yeah… that's how we wanted to compromise
<danvet> yeah if it's for some specific gpu just to get things going it might be ok
zzoon_2 has quit [Ping timeout: 480 seconds]
<danvet> yeah that's not really how it works :-)
<emersion> okay, good to know
<emersion> yup, it would just be for AMD DCN 3
<emersion> but then nothing stops somebody else from wanting to do the same thing
<emersion> or expand to more hw
<emersion> then things get into a meh state
<danvet> yeah I think we need really solid reasons that we really can't get off a good per-plane color mgmt api without going vendor specific at first
<emersion> anyways, this is all mostly to deal with the generic uAPI being stuck
<emersion> i really hope we can un-stuck it now
<emersion> a lot of frustration seems to come from the cost of adding new uAPI
<emersion> (from me first tbh)
<emersion> we've also discussed mandatory vkms impl for new uAPI BTW
<emersion> (which is at odds with the above)
<emersion> (vkms not caught up enough for this to be practical just yet anyways)
<Venemo> airlied: yes, mesh shader output rasterization works exactly the same as any other
<javierm> danvet: ah, I see. So it was my misunderstanding of how the legacy KMS API should be used. Thanks for the clarification, I'll just drop that MR then and instead focus on using atomic for virtio-gpu
<danvet> emersion, yeah I think vkms for blending uapi would be nice, so that you can validate the igts against something everyone can use
<emersion> javierm: nice :)
<javierm> emersion, danvet: it was meant to be a temporary workaround but I understand now that legacy KMS + damage clipping + page flip isn't a supported combination
<javierm> emersion, danvet: about uAPI, I remember that in media/v4l2 at some point new features were merged but just the uAPI not exposed
<danvet> yeah that'd be an option too
<javierm> that seems to be a good compromise to at least experiment and try to find patterns between the different drivers to define a generic uAPI
<emersion> uAPI not exposed?
<emersion> how does that work?
<javierm> emersion: people will need to run a patched kernel but it's better than keeping all the patches out-of-tree for a long time
<emersion> ah, so just a patch to #define ENABLE_WHATEVER basically?
<emersion> that's a bit weird
<javierm> emersion: it is, yeah. But better than getting stuck with a uAPI that was defined too early
<emersion> yea
<ccr> what is needed is a time machine. travel to future to acquire the perfect uAPI and see what mistakes were made, etc.
anholt_ has joined #dri-devel
<bbrezillon> danvet: any idea who I should ask?
<danvet> könig? or whoever marked that function obsolete
<bbrezillon> also tried hooking up native fence support, as you suggested last time, and it's not clear to me when the core can check the FW seqno against the drm_sched_job->parent->seqno
anholt has quit [Ping timeout: 480 seconds]
<bbrezillon> danvet: yep, it's könig, but he doesn't seem to be on IRC
<danvet> yeah könig's not on irc
<bbrezillon> oh well, I'll write an email
invertedoftc096 has quit []
heat has joined #dri-devel
elongbug has quit [Remote host closed the connection]
elongbug has joined #dri-devel
<karolherbst> jenatali: if you have some time, mind reviewing the clc patch in https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22776 ?
<gawin> karolherbst: where should that snippet for nv30 be put? have you sent a patch for the next stable release?
<karolherbst> I have no idea yet
<karolherbst> would have to figure out what the best place for it would be
hansg has joined #dri-devel
<karolherbst> I've asked Ben tho
hansg has quit []
apinheiro has joined #dri-devel
mbrost has joined #dri-devel
mbrost_ has joined #dri-devel
bmodem has quit [Ping timeout: 480 seconds]
mbrost has quit [Ping timeout: 480 seconds]
TMM has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
TMM has joined #dri-devel
apinheiro has quit [Quit: Leaving]
itoral has quit [Remote host closed the connection]
bmodem has joined #dri-devel
bmodem1 has joined #dri-devel
bmodem has quit [Ping timeout: 480 seconds]
dcz has quit [Ping timeout: 480 seconds]
dcz has joined #dri-devel
Jeremy_Rand_Talos__ has quit [Remote host closed the connection]
Jeremy_Rand_Talos__ has joined #dri-devel
dcl^ has quit [Remote host closed the connection]
dcz_ has joined #dri-devel
dcz has quit [Read error: Connection reset by peer]
invertedoftc096 has joined #dri-devel
dcz has joined #dri-devel
dcz_ has quit [Ping timeout: 480 seconds]
heat has quit [Remote host closed the connection]
heat has joined #dri-devel
mbrost_ has quit [Ping timeout: 480 seconds]
mbrost_ has joined #dri-devel
mbrost__ has joined #dri-devel
JohnnyonFlame has quit [Ping timeout: 480 seconds]
stuarts has joined #dri-devel
mbrost_ has quit [Ping timeout: 480 seconds]
rasterman has quit [Quit: Gettin' stinky!]
fxkamd has joined #dri-devel
<swick[m]> emersion: you mean the adaptive sync CTS?
<emersion> yea
<emersion> there are more CTSes but haven't looked at the others
<swick[m]> not being able to read DisplayID from the dedicated bus doesn't seem to be a problem... I've not seen a single monitor exposing data there
greenjustin is now known as Guest304
<emersion> do you have a command to read everything from user-space via I2C?
greenjustin has joined #dri-devel
<emersion> i tried to come up with one but pretty sure i got it wrong
<swick[m]> ugh, on a custom kernel without i2c-dev rn..
heat has quit [Read error: Connection reset by peer]
heat has joined #dri-devel
<swick[m]> I used i2cdetect and i2cget FWIW
<emersion> yea that's what i (tried to) use as well
<swick[m]> and EDDC specifies the addresses
<emersion> i'm trying `i2cget 3 0xA0 0x00 b 32` and it's not working
<emersion> "SET Segment 0, device A0h or A4h, start address 00h, READ 128 bytes"
JohnnyonFlame has joined #dri-devel
<bbrezillon> mbrost__: Do you have a branch with the latest drm_sched-kthread-to-wq patches, or should I assume https://gitlab.freedesktop.org/drm/xe/kernel/-/commit/18b968ab6a98eb1801a385d6a6a82967f3ebb322 is the latest version?
greenjustin has quit [Remote host closed the connection]
greenjustin has joined #dri-devel
<mbrost__> I want to post another, non-RFC, rev of this soon too https://patchwork.freedesktop.org/series/116055/
<mbrost__> The branch I shared should be more or less Xe based on that RFC + feedback
<mbrost__> oh, I might have mis-understood your question... multitasking...
<mbrost__> the version you posted is close, in the branch I shared I allow the user to pass in the run_wq
Company has joined #dri-devel
pallavim has joined #dri-devel
devilhorns has quit [Quit: Leaving]
frieder has quit [Remote host closed the connection]
alanc-away has quit [Remote host closed the connection]
alanc-away has joined #dri-devel
bmodem has joined #dri-devel
kzd has joined #dri-devel
robmur01 has quit [Ping timeout: 480 seconds]
bmodem1 has quit [Ping timeout: 480 seconds]
YuGiOhJCJ has quit [Quit: YuGiOhJCJ]
<swick[m]> emersion: what do you mean with not working? most displays just don't have anything there...
<emersion> i mean that the parameters i give to i2cget are invalid
<emersion> it complains about them being out of range
<swick[m]> there was a gotcha wrt the address but I already forgot
aravind has quit [Ping timeout: 480 seconds]
mszyprow has joined #dri-devel
<swick[m]> emersion: ah yes, the address i2c tools wants is the one without the read/write bit, so A0h right shifted by one
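The address gotcha described here can be sketched in a few lines. This is only an illustration: the helper names are hypothetical, and the 0x08-0x77 bounds are the range i2c-tools accepts by default. The E-DDC spec writes device addresses in 8-bit form (A0h, A4h), which includes the read/write bit, while i2c-tools expects the 7-bit form.

```python
def ddc_to_7bit(addr_8bit):
    """Drop the read/write bit from an 8-bit address as written in E-DDC."""
    if not 0 <= addr_8bit <= 0xFF:
        raise ValueError("expected an 8-bit address")
    return addr_8bit >> 1


def accepted_by_i2c_tools(addr):
    """Default chip-address range accepted by i2cget/i2cdetect."""
    return 0x08 <= addr <= 0x77


# A0h and A4h from the spec, converted to what i2c-tools wants.
for spec_addr in (0xA0, 0xA4):
    print(hex(spec_addr), "->", hex(ddc_to_7bit(spec_addr)))
```

Passing the spec value 0xA0 straight to i2cget falls outside the accepted range, while the shifted value 0x50 is inside it.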
JohnnyonFlame has quit [Ping timeout: 480 seconds]
<swick[m]> we should probably add a script to libdisplay-info to read the EDID and DisplayID blobs directly from i2c
<emersion> that would be handy
<emersion> hm still getting the error
<emersion> Error: Chip address out of range (0x08-0x77)!
<emersion> maybe i'm not understanding what chip vs. data address is?
fxkamd has quit []
lemonzest has quit [Quit: WeeChat 3.6]
<swick[m]> i2cget 3 0x50 0x00 does not work?
<emersion> bleh sorry, i used 0x80 by mistake
<emersion> i get the expected "read failed"
<emersion> hm, but that's the same address for EDID though?
<emersion> the spec says, for EDID:
<emersion> "SET Segment 0, device A0h, start address 00h, READ 128 bytes"
<emersion> and for DisplayID:
<emersion> "SET Segment 0, device A0h or A4h, start address 00h, READ 128 bytes"
<emersion> now i'm confused
<bbrezillon> mbrost__: thanks, I think I had a version with the custom run_wq already
<emersion> i'd like to check that i can grab the EDID at least
robmur01 has joined #dri-devel
<mbrost__> cool, that should be the latest
<swick[m]> emersion: A0h can contain either EDID or DisplayID, A4h can only contain DisplayID afaiu
<emersion> i see
<swick[m]> but A0h should be readable
<mbrost__> I'll probably get all drm sched changes out today on the list
<emersion> hm
CME has quit []
<bbrezillon> mbrost__: drm_sched_main() still has a loop to dequeue jobs until there's a reason to stop dequeuing (dep not signaled, or all job slots filled). I was wondering if it wouldn't be better to dequeue one job at a time (or at least limit the number of jobs you dequeue) so that you don't block other works scheduled on the same run_wq.
<bbrezillon> That's particularly important if we want to use the same ordered (single-threaded) workqueue for both the job dequeueing and tdr, to guarantee that nothing tries to queue stuff to the HW while we're resetting it
bmodem has quit [Ping timeout: 480 seconds]
<bbrezillon> mbrost__: which is basically what ckönig suggested here https://lore.kernel.org/dri-devel/5c4f4e89-6126-7701-2023-2628db1b7caa@amd.com/
lemonzest has joined #dri-devel
mbrost__ has quit [Ping timeout: 480 seconds]
swalker__ has quit [Remote host closed the connection]
<jenatali> Down to 41 CTS fails :)
greenjustin_ has joined #dri-devel
tzimmermann has quit [Quit: Leaving]
greenjustin has quit [Ping timeout: 480 seconds]
kts has joined #dri-devel
mszyprow has quit [Ping timeout: 480 seconds]
mbrost has joined #dri-devel
<mbrost> bbrezillon, that is kinda tricky
<mbrost> give me a few and let's see what it looks like
rsalvaterra has quit []
rsalvaterra has joined #dri-devel
<mbrost> but let's say you use the system_wq, you can have as many cores on the machine running in parallel as long as each item is its own work_struct
<mbrost> the system_wq is just a wrapper for a pool of kthreads I think
<mbrost> so IMO I don't see each work_struct running in a loop while it has work as that big of a deal
tursulin has quit [Ping timeout: 480 seconds]
<mbrost> Another option would be to pass in a unique run_wq to each scheduler, and now you're timeslicing
tlwoerner has quit [Quit: Leaving]
<mbrost> if you have more work_struct than cores...
JohnnyonFlame has joined #dri-devel
<mbrost> or pass in a few ordered wq to many schedulers
tlwoerner has joined #dri-devel
kts has quit [Quit: Konversation terminated!]
<mbrost> but either way let me prototype 1 dequeue per pass of the main loop if you think that would be better, I'm not going to argue this one either way
Guest304 has quit [Remote host closed the connection]
greenjustin_ is now known as greenjustin
<bbrezillon> mbrost: so, multithreaded workqueue improves the situation, yes, but ckönig was actually suggesting using a single-threaded workqueue, so we don't have to call drm_sched_stop/start() to stop/start the drm_sched while we're resetting the GPU
<mbrost> well you can pass in any run wq you want but I'm confused how you could get away with not calling start / stop
<mbrost> the existing code calls start / stop on the kthread
<mbrost> we call start / stop in our reset flows and they're rock solid, esp when compared to the i915
<mbrost> do you mean the same ordered wq as the reset one / tdr?
<bbrezillon> well, if you have a single-threaded wq for your drm_sched workers, and you queue your reset worker to this queue, you're guaranteed that the reset function will only be executed when all drm_sched workers are idle
<bbrezillon> yes, that's what ckönig suggested, if I'm correct
<mbrost> yea that is true
<mbrost> nothing preventing you from doing that now but yea in that case i guess dequeuing 1 item would make a bit more sense
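The ordering argument in this exchange can be illustrated with a toy model. It is a plain-Python sketch, not kernel code: OrderedWorkqueue, simulate and the "timeout fires during the first job" trigger are all hypothetical stand-ins. It only shows why, on a single-threaded queue shared by the scheduler work and the reset work, dequeuing one job per pass lets the reset run sooner than a loop that drains everything.

```python
from collections import deque


class OrderedWorkqueue:
    """Toy single-threaded workqueue: items run one at a time, in FIFO order."""

    def __init__(self):
        self._items = deque()

    def queue(self, fn):
        self._items.append(fn)

    def run(self):
        while self._items:
            self._items.popleft()()


def simulate(jobs, one_job_per_pass):
    """Return the order in which jobs and the reset execute."""
    wq = OrderedWorkqueue()
    pending = deque(jobs)
    trace = []
    reset_queued = False

    def reset_work():
        trace.append("reset")

    def sched_work():
        nonlocal reset_queued
        while pending:
            trace.append(pending.popleft())
            if not reset_queued:
                # Assumption: a timeout fires while the first job is submitted.
                wq.queue(reset_work)
                reset_queued = True
            if one_job_per_pass:
                break
        if pending:
            # More jobs remain: requeue behind whatever is already queued.
            wq.queue(sched_work)

    wq.queue(sched_work)
    wq.run()
    return trace


print(simulate(["job0", "job1", "job2"], one_job_per_pass=True))
print(simulate(["job0", "job1", "job2"], one_job_per_pass=False))
```

In this model the reset runs right after job0 when the worker yields after each job, and only after job2 when the worker loops until its queue is empty. A real driver would also decide what happens to the remaining jobs once the reset has run, which the sketch ignores.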
<mbrost> i don't think I want to design Xe to share a WQ though, calling start / stop is easy enough
<bbrezillon> mbrost: ok. I guess we need a replacement for drm_sched_resubmit_jobs() then...
<bbrezillon> didn't just what you were using in the Xe driver
<bbrezillon> *didn't check
<mbrost> and like I said our reset flows are rock solid, I've written tests to hammer these paths as I was super paranoid about this not working after working on the i915
<mbrost> we just kill the entity scheduler if it hangs
<bbrezillon> drm_sched_stop()
<bbrezillon> I guess
<mbrost> no recovery, I think this is Faith's idea for all VM bind drivers
<mbrost> We have a buy in from our UMDs
<mbrost> 1 sec, i'll point you to our TDR callback
<bbrezillon> you mean remaining jobs are just cancelled?
yussef has joined #dri-devel
<mbrost> yea and entity is banned
<mbrost> that function should be pretty well commented
alanc-away has quit []
alanc has joined #dri-devel
<bbrezillon> hm, okay. So that's for the 'one entity is going crazy, but the GPU and FW-proc are still functional' case
<mbrost> yes
<bbrezillon> we do have global hangs where we need to stop all entities, reset the GPU and FW, and start all entities (and by start, I mean kick the previously active entities at the FW level, and call drm_sched_start())
<mbrost> We also have GT resets (entire GPU is trashed) but in practice that should be impossible or very, very rare
<mbrost> in that case, we try to find the bad entity, ban it, resubmit the others
<bbrezillon> that's where synchronization gets messy, because the reset operation, which is always queued on a workqueue, has to wait for all drm_sched workers to be idle
<mbrost> A GT reset basically should only be triggered by junk hardware or if our KMD has a bug
<mbrost> yea, let me point you to our GT reset code...
<bbrezillon> mbrost: would you mind replying to Christian regarding the drm_sched_{stop,start}()? I feel like your opinions are diverging here, and more and more drivers keep depending on your work ;-)
<mbrost> yea we do have to loop over every entity and stop it
<bbrezillon> yep, that's what we were planning to do initially
<mbrost> there is nothing stopping from the current interfaces for doing it either way
<mbrost> s/for/from
<bbrezillon> it's just that, with drm_sched_resubmit_jobs() being deprecated, I was a bit lost as to what drivers were supposed to do to reassign the drm_sched_job::parent fence
<mbrost> that is news to me it being deprecated
Duke`` has joined #dri-devel
<mbrost> that seems like a unilateral decision, drivers should be allowed to do this either way
<mbrost> is this merged?
<bbrezillon> it's in Linus' tree :)
<mbrost> ugh... I need to pay better attention to the list
<bbrezillon> was merged in 6.3-rc1 apparently
<mbrost> Xe doesn't need most of the nonsense in that function anyways
<mbrost> really all we need is loop over pending jobs and call run_job
<bbrezillon> well, the only thing we'd need in the powervr driver is a loop re-assigning the parent fence, we don't even need to call run_job() (the ringbuf should already be filled and its content preserved, unless I'm mistaken)
<mbrost> ok maybe we just open code the resubmit
<bbrezillon> just felt odd to iterate over the pending_list manually
<mbrost> in Xe the parent fence would be the same anyways
<bbrezillon> but if that's how it's supposed to be done, I'm fine with that
<bbrezillon> yep, same here, we don't re-allocate the fence
<bbrezillon> it's the same object, same seqno
<mbrost> and ref count is still correct too
<bbrezillon> in any case, we should document what drivers are expected to do between the stop and start calls, because it's unclear right now
<bbrezillon> like, the bare minimum is to re-assign parent fences, otherwise pending jobs are dropped when drm_sched_start() is called
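A toy model of the stop / re-assign / start sequence being discussed, assuming the behaviour described in this log: stopping clears each pending job's parent fence, and starting drops jobs whose parent is still unset. The class and function names are hypothetical and this is not the drm_sched API, only a sketch of the bookkeeping a driver would open-code.

```python
class Job:
    def __init__(self, name, hw_fence):
        self.name = name
        self.hw_fence = hw_fence  # driver-side fence object, kept across the reset
        self.parent = hw_fence    # fence the scheduler waits on


class ToyScheduler:
    def __init__(self, jobs):
        self.pending_list = list(jobs)

    def stop(self):
        # Modeled on the log: stopping clears each pending job's parent fence.
        for job in self.pending_list:
            job.parent = None

    def start(self):
        # Modeled on the log: jobs still without a parent are dropped.
        dropped = [job.name for job in self.pending_list if job.parent is None]
        self.pending_list = [job for job in self.pending_list
                             if job.parent is not None]
        return dropped


def reassign_parent_fences(sched):
    # The open-coded loop a driver would run between stop() and start().
    for job in sched.pending_list:
        job.parent = job.hw_fence


sched = ToyScheduler([Job("a", object()), Job("b", object())])
sched.stop()
reassign_parent_fences(sched)
print(sched.start())  # names of dropped jobs
```

Skipping the reassign_parent_fences() call in this model makes start() drop every pending job, which is the failure mode described above.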
<mbrost> again I don't think we should force a driver to do anything
<mbrost> I'm confused by that last statement
<bbrezillon> well, it's not about forcing them to do anything, but if you call drm_sched_start() after having kicked the queue, your driver is likely to end up with use-after-free bugs
<mbrost> in Xe you can run_job as many times as you want you always get the same fence back
<bbrezillon> at least that's what happens if you keep in-flight jobs in some internal list/queue
<mbrost> which is assigned to the parent
<mbrost> it is a fence which looks at a memory location for a value to be greater than or equal to the job's seqno
<bbrezillon> yep, same here: u32 mem object containing the seqno
<bbrezillon> and the dma_fence is just a wrapper around that, with dma_fence::seqno being checked against the FW fence object seqno
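The fence scheme both drivers describe can be sketched as follows. It is a hypothetical Python model, not driver code: SeqnoFence and the one-element list standing in for the shared u32 memory object are illustrative names, and the wraparound-safe comparison is an assumption added here rather than something stated in the discussion.

```python
U32_MASK = 0xFFFFFFFF


class SeqnoFence:
    """Fence that is signaled once the shared counter reaches its seqno."""

    def __init__(self, memory, seqno):
        self.memory = memory          # one-element list standing in for the u32
        self.seqno = seqno & U32_MASK

    def is_signaled(self):
        current = self.memory[0] & U32_MASK
        # "greater than or equal", computed on the 32-bit difference so the
        # result stays correct when the counter wraps (an assumption).
        return ((current - self.seqno) & U32_MASK) < 0x80000000


counter = [4]
fence = SeqnoFence(counter, seqno=5)
print(fence.is_signaled())
counter[0] = 5  # the firmware writes the completed seqno
print(fence.is_signaled())
```

Because the fence only reads the counter, re-running a job with the same seqno yields an equivalent fence, which matches the point that the parent fence is the same object before and after a resubmit.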
<bbrezillon> run_job() does queue things to the ringbuf though
<mbrost> i think I understand what you are getting at
JohnnyonFlame has quit [Ping timeout: 480 seconds]
<bbrezillon> but we can easily add a "submitted" boolean to avoid resubmitting it
rasterman has joined #dri-devel
<bbrezillon> and even if we were actually resubmitting, that's not a big deal, as long as we reset the ringbuf pointer before kicking the queue
<mbrost> we stop the signaling of fences returned from run_job in our reset flows
<mbrost> xe_hw_fence_irq_stop
<mbrost> xe_hw_fence_irq_start
tobiasjakobi has joined #dri-devel
tobiasjakobi has quit []
<mbrost> resets are confusing
<mbrost> if it would help I can post a kernel doc patch explaining how Xe does these and avoids races
<mbrost> I might have already done this at one point...
<bbrezillon> dunno, I really thought everyone was on-board with this drm_sched_resubmit_jobs() deprecation
<bbrezillon> looks like it's not really the case...
<mbrost> If you do a GT reset (think of it as a power cycle) and you don't want to lose all pending work, run_job needs to be called again
<mbrost> on all pending jobs
<mbrost> get rid of drm_sched_resubmit_jobs() fine, Xe will just have a loop where it calls run job
<bbrezillon> then what's the point of deprecating it, if people keep open-coding it...
<mbrost> yea that's what i'm confused about
<bbrezillon> I mean, I can code it differently, and optimize the resubmit logic, but why should I bother, it's not a hot-path anyway
<mbrost> how would a pending job complete after a power cycle unless you stick it back on the hardware
<bbrezillon> tried reading amdgpu code, and I don't quite get how they do it...
mbrost has quit [Ping timeout: 480 seconds]
mbrost has joined #dri-devel
<mbrost> i might have lost a few comments
macromorgan is now known as Guest324
vliaskov has quit [Quit: Leaving]
macromorgan has joined #dri-devel
<mbrost> 'implement
<mbrost> + * recovery after a job timeout.' - yea that isn't how Xe is using it
<mbrost> Xe uses it after the entire device is reset
lynxeye has quit [Quit: Leaving.]
Guest324 has quit [Ping timeout: 480 seconds]
CME has joined #dri-devel
jewins has joined #dri-devel
mvlad has quit [Remote host closed the connection]
djbw has joined #dri-devel
JohnnyonFlame has joined #dri-devel
junaid has joined #dri-devel
camus1 has joined #dri-devel
camus has quit [Ping timeout: 480 seconds]
iive has joined #dri-devel
CME has quit []
CME has joined #dri-devel
mbrost has quit [Remote host closed the connection]
mbrost has joined #dri-devel
elongbug has quit [Remote host closed the connection]
elongbug has joined #dri-devel
mbrost has quit [Remote host closed the connection]
mbrost has joined #dri-devel
gawin has quit [Ping timeout: 480 seconds]
junaid has quit [Remote host closed the connection]
elongbug has quit [Read error: Connection reset by peer]
mbrost has quit [Ping timeout: 480 seconds]
mbrost has joined #dri-devel
gawin has joined #dri-devel
stuarts has quit [Remote host closed the connection]
Duke`` has quit [Ping timeout: 480 seconds]
apinheiro has joined #dri-devel
dcz has quit [Ping timeout: 480 seconds]
<anholt_> Have you wished that piglit didn't take so long? I have an MR for you! https://gitlab.freedesktop.org/mesa/piglit/-/merge_requests/804
<mbrost> Seems to work just fine, it would be fun if we had some benchmarks to see if this made a difference
<jenatali> anholt_: <3
<zmike> anholt++
<anholt_> I swear we need to have static int called = 0; assert(called++ < 1000) in piglit_probe_pixel.
<jenatali> Sounds like a good idea to me
thellstrom has quit [Ping timeout: 480 seconds]
ngcortes has joined #dri-devel
<DemiMarie> danvet emersion: What about hiding the uAPI behind BROKEN?
<DemiMarie> I assume no distro will ship a CONFIG_BROKEN=y kernel, and if they do (or patch out the dependency) they get to keep both pieces.
stuarts has joined #dri-devel
<emersion> DemiMarie: what is CONFIG_BROKEN?
ngcortes has quit [Ping timeout: 480 seconds]
pcercuei has quit [Quit: dodo]
<DemiMarie> emersion: It is a catchall, never-enabled Kconfig entry used to disable broken code (hence the name). If there is a Kconfig that should never actually be set, one can use `depends on BROKEN` to make sure it is in fact never set.
<emersion> i see
dcz has joined #dri-devel
Zopolis4_ has joined #dri-devel
mbrost_ has joined #dri-devel
ngcortes has joined #dri-devel
mszyprow has joined #dri-devel
mbrost__ has joined #dri-devel
konstantin_ has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
gawin has quit [Quit: Konversation terminated!]
konstantin has quit [Ping timeout: 480 seconds]
mbrost_ has quit [Ping timeout: 480 seconds]
rasterman has quit [Quit: Gettin' stinky!]
pallavim has quit [Ping timeout: 480 seconds]
apinheiro has quit [Quit: Leaving]
mszyprow has quit [Ping timeout: 480 seconds]
mbrost__ has quit [Ping timeout: 480 seconds]
i509vcb has joined #dri-devel
pallavim has joined #dri-devel
mbrost has joined #dri-devel
mbrost_ has joined #dri-devel
mbrost_ has quit [Remote host closed the connection]
mbrost_ has joined #dri-devel
iive has quit [Quit: They came for me...]
mbrost has quit [Ping timeout: 480 seconds]
RSpliet has quit [Quit: Bye bye man, bye bye]
RSpliet has joined #dri-devel
danvet has quit [Ping timeout: 480 seconds]
alatiera2 has joined #dri-devel
alatiera has quit [Ping timeout: 480 seconds]
alatiera2 is now known as alatiera
<airlied> zmike: you any clues on the lvp memory model fails that CI is seeing?
jernej has quit [Quit: Free ZNC ~ Powered by LunarBNC: https://LunarBNC.net]
jernej has joined #dri-devel
penguin42 has joined #dri-devel
<penguin42> TToTD: If you hit a PROFILING_INFO_NOT_AVAILABLE it might be because you missed a wait() on an event - it took me a while to figure that out
<penguin42> especially since ROCm didn't complain about it
<zmike> airlied: I had a ticket where I was tracking all the known fails
<zmike> if it's not in there I have no idea
<airlied> ah there was a flake lodged by DavidHeidelberg[m]
<DavidHeidelberg[m]> ?
<airlied> you logged a lavapipe flake a while back, seems to be more prevalent now, will have to make the effort to track it down I suppose
FireBurn has joined #dri-devel
<airlied> zmike: just saw 20520 fell down the crack
<DavidHeidelberg[m]> right
<jenatali> Anybody know who I should ping to get opinions on !22800?
<airlied> jenatali: whoever wrote your qsort :-P
<jenatali> Heh, blame the C standard authors for not requiring it to be stable :P
<zmike> airlied: oh I forgot about that
<zmike> I'm not actually sure it's correct now that I look again
oneforall2 has quit [Remote host closed the connection]
alanc has quit [Remote host closed the connection]
<airlied> Venemo, mareko : do mesh shaders interact with transform feedback?
<airlied> ah d3d12 doesn't seem to support it at least
<jenatali> Right
greenjustin has quit [Ping timeout: 480 seconds]
<jenatali> Huh, actually having working caches seems to have taken ~20 minutes off my CTS run time. I'll take it
<jenatali> Maybe it means I can bump the CI factor from 4 to 3...
oneforall2 has joined #dri-devel
<zmike> airlied: no, there is no xfb with mesh