ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
<Lyude> btw - generally with interrupts for hotplugging it's good to do as little as possible in any kind of handler that could block things, and then schedule a work to do things
<alyssa> Lyude: "message handler calls drm_kms_helper_hotplug_event..." explains the sequence of bad events
<Lyude> yeah - I realized that after I typed it whoops
<alyssa> we got a hotplug message, handling it causes the KMS core to tell us to do a mode set so we fire a message and don't issue a vblank until it's handled
<Lyude> mhm. makes sense
<alyssa> but KMS core hangs waiting for a vblank, in the message handling thread, preventing us from handling the "mode set successful" message
<alyssa> roughly.
<alyssa> not familiar with scheduling work
<alyssa> thx
<Lyude> you want workqueues specifically
<Lyude> I would suggest putting all the hotplugging stuff on its own workqueue, so that you can flush the entire wq before suspend/resume
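The deferral pattern Lyude describes can be sketched as a small single-threaded userspace model. This is not kernel code: in a driver the same roles would be played by `queue_work()` on a dedicated workqueue and `flush_workqueue()` before suspend. The names `work_queue`, `queue_hotplug`, `process_hotplug` and `flush_queue` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAP 16

/* Hypothetical event type; a real driver would carry connector state. */
struct hotplug_event { int connector_id; };

struct work_queue {
    struct hotplug_event pending[QUEUE_CAP];
    size_t head, count;   /* ring buffer: index of oldest item, item count */
    int handled;          /* number of events processed so far */
};

/* Called from the message handler: only records the event, never blocks. */
static int queue_hotplug(struct work_queue *wq, int connector_id)
{
    if (wq->count == QUEUE_CAP)
        return -1; /* queue full; the caller decides how to cope */
    wq->pending[(wq->head + wq->count) % QUEUE_CAP].connector_id = connector_id;
    wq->count++;
    return 0;
}

/* Stand-in for the slow part (mode set, waiting for a vblank, ...). */
static void process_hotplug(struct work_queue *wq, struct hotplug_event ev)
{
    assert(ev.connector_id >= 0);
    wq->handled++;
}

/* Drain everything that is pending, e.g. before suspend. */
static void flush_queue(struct work_queue *wq)
{
    while (wq->count > 0) {
        struct hotplug_event ev = wq->pending[wq->head];
        wq->head = (wq->head + 1) % QUEUE_CAP;
        wq->count--;
        process_hotplug(wq, ev);
    }
}
```

The point of the split is that the handler returns immediately after `queue_hotplug()`, so it stays free to receive the next message while the slow work runs elsewhere.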
<Lyude> btw, feel free to cc me on whatever patches for this you send to the kernel :).
<alyssa> ack
<alyssa> am too tired to parse docs
<alyssa> that's probably my cue to stop hacking
<alyssa> (or the fact i almost mispelled cue as "queue")
dllud has joined #dri-devel
dllud_ has quit [Read error: Connection reset by peer]
dllud_ has joined #dri-devel
dllud has quit [Read error: Connection reset by peer]
columbarius has joined #dri-devel
co1umbarius has quit [Ping timeout: 480 seconds]
Surkow|laptop has quit [Remote host closed the connection]
Surkow|laptop has joined #dri-devel
boistordu_ex has quit [Remote host closed the connection]
pnowack has quit [Quit: pnowack]
boistordu_ex has joined #dri-devel
gpiccoli_ has joined #dri-devel
gpiccoli has quit [Remote host closed the connection]
JohnnyonFlame has joined #dri-devel
<zackr> airlied: not yet. so in general mksstats are enabled in our windows driver because they're really cheap and really the only way of getting guest/host stats (for a complete view of a system). i meant to have someone on the team profile the linux side to make sure there's no performance regressions but we haven't done it yet so atm i'd caution against enabling it by default
<airlied> zackr: cool, it shall remain off!
<airlied> Lyude: ^ fyi
<zackr> <insert whatever emoji cool people use instead of the thumbs up one here>
<zmike> 🏓 mostly
Lucretia has quit [Ping timeout: 480 seconds]
loki_val has joined #dri-devel
crabbedhaloablut has quit [Ping timeout: 480 seconds]
Lucretia has joined #dri-devel
boistordu has joined #dri-devel
<Lyude> airlied: thanks!
nchery has quit [Remote host closed the connection]
boistordu_ex has quit [Ping timeout: 480 seconds]
<Lyude> zackr: feel free to let us know when/if you want it turned on in fedora by default btw
dllud has joined #dri-devel
dllud_ has quit [Remote host closed the connection]
tlwoerner_ has joined #dri-devel
tlwoerner has quit [Ping timeout: 480 seconds]
tlwoerner_ has quit []
nchery has joined #dri-devel
tlwoerner has joined #dri-devel
nchery has quit [Ping timeout: 480 seconds]
anusha has quit []
vbelgaum has quit [Ping timeout: 480 seconds]
Lucretia has quit [Ping timeout: 480 seconds]
Lucretia has joined #dri-devel
slattann has joined #dri-devel
FluffyFoxeh has quit [Quit: Close the World, Open the nExt]
xlei has quit [Ping timeout: 480 seconds]
slattann has quit []
slattann has joined #dri-devel
FluffyFoxeh has joined #dri-devel
slattann has quit []
Lucretia-backup has joined #dri-devel
Lucretia has quit [Ping timeout: 480 seconds]
Duke`` has joined #dri-devel
slattann has joined #dri-devel
vivijim has quit [Remote host closed the connection]
macromorgan has quit [Read error: Connection reset by peer]
mattrope has quit [Read error: Connection reset by peer]
xlei has joined #dri-devel
Lucretia has joined #dri-devel
Lucretia-backup has quit [Read error: Connection reset by peer]
ppascher has quit [Ping timeout: 480 seconds]
sdutt_ has joined #dri-devel
sdutt has quit [Read error: Connection reset by peer]
ppascher has joined #dri-devel
lemonzest has joined #dri-devel
pnowack has joined #dri-devel
itoral has joined #dri-devel
thellstrom has joined #dri-devel
<sven> alyssa: i'm not sure i follow. also, which rtkit/mailbox version are you using?
<sven> alyssa: and we can do multiple messages in rtkit if required. it's just been easier to only deal with a single one for now
<sven> but yeah, in general don't do much work in callbacks
thellstrom1 has quit [Ping timeout: 480 seconds]
Hi-Angel has joined #dri-devel
xlei has quit [Ping timeout: 480 seconds]
slattann has quit []
slattann has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
gpuman has joined #dri-devel
gpuman has quit []
gpuman has joined #dri-devel
mlankhorst has joined #dri-devel
gpuman_ has joined #dri-devel
gpuman has quit [Ping timeout: 480 seconds]
frieder has joined #dri-devel
jkrzyszt has joined #dri-devel
rasterman has joined #dri-devel
itoral has quit [Remote host closed the connection]
sdutt_ has quit [Ping timeout: 480 seconds]
gpuman has joined #dri-devel
gpuman_ has quit [Ping timeout: 480 seconds]
danvet has joined #dri-devel
The_Company has quit []
tursulin has joined #dri-devel
Ahuj has joined #dri-devel
pnowack has quit [Read error: No route to host]
slattann has quit []
i-garrison has quit []
i-garrison has joined #dri-devel
slattann has joined #dri-devel
gpuman_ has joined #dri-devel
gpuman has quit [Ping timeout: 480 seconds]
lemonzest has quit [Ping timeout: 480 seconds]
lemonzest has joined #dri-devel
pcercuei has joined #dri-devel
camus1 has joined #dri-devel
camus has quit [Read error: Connection reset by peer]
gpuman has joined #dri-devel
gpuman_ has quit [Ping timeout: 480 seconds]
agx has quit [Read error: Connection reset by peer]
agd5f has quit [Remote host closed the connection]
agd5f has joined #dri-devel
agx has joined #dri-devel
jbarnes has quit [Read error: Connection reset by peer]
slattann has quit []
karolherbst has quit [Remote host closed the connection]
lemonzest1 has joined #dri-devel
karolherbst has joined #dri-devel
jbarnes has joined #dri-devel
shfil has joined #dri-devel
ZeZu has quit [Ping timeout: 480 seconds]
lemonzest has quit [Ping timeout: 480 seconds]
kurufu has quit [Remote host closed the connection]
kurufu has joined #dri-devel
pzanoni has quit [Ping timeout: 480 seconds]
ZeZu has joined #dri-devel
lemonzest1 has quit []
lemonzest has joined #dri-devel
rsalvaterra_ has quit []
rsalvaterra has joined #dri-devel
dllud_ has joined #dri-devel
dllud has quit [Read error: Connection reset by peer]
dllud has joined #dri-devel
dllud_ has quit [Read error: Connection reset by peer]
slattann has joined #dri-devel
JohnnyonFlame has quit [Remote host closed the connection]
shashanks has quit [Ping timeout: 480 seconds]
shfil is now known as filip
gpuman_ has joined #dri-devel
filip has left #dri-devel [#dri-devel]
filipg has joined #dri-devel
kmn has joined #dri-devel
filipg has quit []
filipg has joined #dri-devel
gpuman has quit [Ping timeout: 480 seconds]
filipg has quit []
gawin has joined #dri-devel
jkrzyszt has quit [Ping timeout: 480 seconds]
<gawin> bonjour, how are legacy amd drivers (r300, r600) being compared/tested with fglrx? passthrough?
<gawin> (though not sure if gert and other guys who are maintaining legacy drivers are now online)
kmn has quit [Remote host closed the connection]
camus1 has quit [Ping timeout: 480 seconds]
camus has joined #dri-devel
jhli has quit [Read error: Connection reset by peer]
gpuman has joined #dri-devel
gpuman_ has quit [Ping timeout: 480 seconds]
i-garrison has quit []
itoral has joined #dri-devel
i-garrison has joined #dri-devel
rsalvaterra_ has joined #dri-devel
cef is now known as Guest7322
cef has joined #dri-devel
rsalvaterra has quit [Ping timeout: 480 seconds]
<Venemo> gawin: I don't think anyone maintains fglrx anymore
<alyssa> sven: nod.
Guest7322 has quit [Ping timeout: 480 seconds]
cef is now known as Guest7323
cef has joined #dri-devel
Guest7323 has quit [Ping timeout: 480 seconds]
cef is now known as Guest7325
cef has joined #dri-devel
Guest7325 has quit [Ping timeout: 480 seconds]
rsalvaterra_ has quit []
rsalvaterra has joined #dri-devel
rgallaispou has left #dri-devel [#dri-devel]
cef is now known as Guest7326
cef has joined #dri-devel
Guest7326 has quit [Ping timeout: 480 seconds]
gpuman_ has joined #dri-devel
gpuman has quit [Ping timeout: 480 seconds]
slattann has quit []
slattann has joined #dri-devel
slattann has quit []
xexaxo has joined #dri-devel
NiksDev2 has joined #dri-devel
rsalvaterra has quit [Quit: Leaving...]
gpuman has joined #dri-devel
NiksDev has quit [Ping timeout: 480 seconds]
gpuman_ has quit [Ping timeout: 480 seconds]
cef is now known as Guest7328
cef has joined #dri-devel
Guest7328 has quit [Ping timeout: 480 seconds]
slattann has joined #dri-devel
<jekstrand> airlied: Looks like it doesn't think it needs a barrier. :)
<jekstrand> airlied: What test is this? And does it expect subgroups?
txenoo has joined #dri-devel
X-Scale has joined #dri-devel
pnowack has joined #dri-devel
xlei has joined #dri-devel
X-Scale` has quit [Ping timeout: 480 seconds]
xexaxo has quit [Remote host closed the connection]
xexaxo has joined #dri-devel
vivijim has joined #dri-devel
slattann has quit []
slattann has joined #dri-devel
gpiccoli_ has left #dri-devel [#dri-devel]
gpiccoli has joined #dri-devel
gpuman_ has joined #dri-devel
gpuman has quit [Ping timeout: 480 seconds]
adjtm has quit [Ping timeout: 480 seconds]
adjtm has joined #dri-devel
slattann has quit []
agd5f has quit [Remote host closed the connection]
agd5f has joined #dri-devel
Bennett has joined #dri-devel
iive has joined #dri-devel
ella-0 has joined #dri-devel
Company has joined #dri-devel
frieder_ has joined #dri-devel
frieder has quit [Ping timeout: 480 seconds]
macromorgan has joined #dri-devel
leandrohrb27 is now known as leandrohrb
ella-0 has quit [Ping timeout: 480 seconds]
gpuman has joined #dri-devel
jernej has quit [Quit: Free ZNC ~ Powered by LunarBNC: https://LunarBNC.net]
gpuman_ has quit [Ping timeout: 480 seconds]
ella-0 has joined #dri-devel
Sumera is now known as Guest7344
jernej has joined #dri-devel
sdutt has joined #dri-devel
gpuman_ has joined #dri-devel
gpuman has quit [Read error: Connection reset by peer]
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
<zackr> Lyude: ah, great to know, thanks!
<zmike> anyone know a test case for EXT_multisampled_render_to_texture?
ella-0 has quit [Ping timeout: 480 seconds]
<zmike> robclark / alyssa I assume one of you knows this since you've both done it
Ahuj has quit [Ping timeout: 480 seconds]
mattrope has joined #dri-devel
<Lyude> I assume there's probably not much info I'll get on this, but does anyone know if the tensor chips in the pixel 6 will be supported by upstream mesa eventually?
<alyssa> zmike: chromium gles webgl aquarium
<zmike> alyssa: I tried that one and it doesn't seem to be using it?
<alyssa> gles
<zmike> 🤔
<alyssa> not gl
<zmike> how do I force chrome to use es?
<alyssa> --use-gl=egl
<zmike> neat
<imirkin> coz egl == gles? sigh
<robmur01> Lyude: internet says it's Mali-G78, so that'll be down to how far alyssa gets with Valhall... :)
<Lyude> robmur01: oooooooh! nice
<alyssa> yep
<HdkR> robmur01: Wait, Internet actually stated it's Valhall? That's nice
<zmike> another dumbguy question: how do I tell which process of chromium is actually trying to do driver stuff?
<robmur01> HdkR: as far as you trust a supposed geekbench report (via GSMArena), at least. It seems to be implied that it's essentially a warmed-over Exynos, so seems reasonable to me
<bnieuwenhuizen> zmike: I mostly try ps aux | grep gpu-process or something like that
<HdkR> robmur01: nice
<bnieuwenhuizen> warmed-over?
nchery has joined #dri-devel
<zmike> bnieuwenhuizen: I don't see any mesa threads in that proc though?
<bnieuwenhuizen> hmm
<zmike> or in any process
<robclark> disable gpu sandbox?
<zmike> is that --no-sandbox ?
<zmike> aha
<zmike> thanks
gpuman has joined #dri-devel
<karolherbst> ref count of 0xab6e5340.. I smell memory corruptions :(
gpuman_ has quit [Ping timeout: 480 seconds]
vbelgaum_ has joined #dri-devel
<alyssa> karolherbst: you should consider unreferencing some things :-p
<karolherbst> alyssa: with that value I think referencing more is actually cheaper
<alyssa> Hm yes
gawin has quit [Quit: Konversation terminated!]
<ajax> did anyone ever bother REing the mga warp microcode? it doesn't look terribly complicated.
<vsyrjala> i haven't heard of anyone doing it. i did think about it for a split second at some point, but wasn't bored enough to actually do it
<karolherbst> :(
gpuman has quit [Remote host closed the connection]
<karolherbst> I guess tegra_set_vertex_buffers needs fixing
gpuman has joined #dri-devel
frieder_ has quit [Ping timeout: 480 seconds]
sdutt has quit []
sdutt has joined #dri-devel
gpuman_ has joined #dri-devel
gpuman has quit [Ping timeout: 480 seconds]
slattann has joined #dri-devel
<karolherbst> ehhh
<karolherbst> util_set_vertex_buffers_count gets called with the dst and src having the same buffer with a refcount of 1
hch12907_ has quit [Remote host closed the connection]
lileo has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
flibitijibibo has joined #dri-devel
<jekstrand> jenatali: Trying to pick up cmarcelo's nir_var_mem_image MR
<jekstrand> jenatali: What storage class does OpenCL use for images?
<jenatali> jekstrand: I believe they're uniform, let me double-check
<jenatali> Actually I think it's UniformConstant
gpuman has joined #dri-devel
rgallaispou has joined #dri-devel
gpuman_ has quit [Ping timeout: 480 seconds]
<jenatali> jenatali: Oh, nevermind, they're Function - it's just the spirv_to_nir entrypoint wrapper that turns them into uniform
<jekstrand> Oh
<jekstrand> Hrm....
* jekstrand will have to think about that
<jenatali> See %24, %input_addr, and %39
<jenatali> IMO it still makes sense to treat the image variable/descriptor as uniform, and only treat data within images as the new var_mem_image
<jekstrand> jenatali: I go back and forth on that
<jekstrand> I don't especially like the way that SPIR-V works.
anusha has joined #dri-devel
<jekstrand> And for UBOs, SSBOs, etc. NIR doesn't follow that model. It does for images and textures as an artifact of history from GLSL
<jenatali> In what way is NIR different?
<jenatali> Oh I guess I see what you're saying, the variable mode is the mode of the data within the variable
<jenatali> Yeah alright
<jekstrand> Yeah
<jekstrand> It doesn't try to distinguish between the handle and the data.
<jekstrand> One thing I've thought about doing for a while now if I ever find the time is to rework image load/store to use a new nir_deref_type_texel
<jekstrand> At that point, the deref chain will go all the way down to actual image data
<jenatali> Ooh, cool
<karolherbst> this entire kernel function thing just makes no sense, people tried to deal with it, and I think that just ends in big tears here :/
* jenatali has to go run a couple errands, will be offline for a bit
<karolherbst> jekstrand: interesting
<jenatali> jekstrand: If you need me to help switch CLOn12's compiler stack over to use the new image mem type, I can do that, just need to budget time for it among all the other things I've got going on :P
<jekstrand> jenatali: I can try to code blind but I'd like at least help testing
mbrost has joined #dri-devel
gpuman_ has joined #dri-devel
gpuman has quit [Ping timeout: 480 seconds]
pzanoni has joined #dri-devel
gouchi has joined #dri-devel
<jekstrand> cmarcelo: Did we never write the pass to lower non-scoped barriers to scoped?
<robclark> karolherbst: fwiw re: emitMOV() / isaspec.. if the instruction encoding is completely different for all the variants, you can just model it as a different instructions.. see `__instruction_case()`...
<karolherbst> robclark: ohh, that's neat
<karolherbst> actually.. this is really helpful
<karolherbst> so we could just model isaspec 100% like nvidia and then just map codegen to that
<jenatali> jekstrand: I can definitely help test
<jenatali> jekstrand: nir_shader_compiler_options has a use_scoped_barrier, dunno if there's a pass that does it, but I believe the spirv parser respects that
ella-0 has joined #dri-devel
<jekstrand> Kayden: Any idea why we're emitting L3 FENCE messages for TCS outputs?
gdevi[m] has joined #dri-devel
<melissawen> on uapi, if the last element in a struct is u32, should I add a u32 pad to align in 64? or is it ok not to add this pad in the end considering that everything before is well arranged?
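As a hedged illustration of the padding question: the kernel's ioctl design guidance recommends padding a struct to a multiple of 64 bits when it contains 64-bit members, because the implicit tail padding otherwise differs between 32-bit and 64-bit ABIs. The struct `example_submit` and the helper `example_validate` below are hypothetical names, not taken from any driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ioctl argument struct, for illustration only. */
struct example_submit {
    uint64_t cmd_ptr;   /* 64-bit member: 8-byte aligned on 64-bit ABIs, 4 on i386 */
    uint32_t flags;
    uint32_t pad;       /* explicit tail padding, expected to be zero */
};

/* With the explicit pad, the layout no longer depends on compiler padding. */
_Static_assert(sizeof(struct example_submit) == 16, "size must not depend on the ABI");
_Static_assert(sizeof(struct example_submit) % 8 == 0, "multiple of 64 bits");
_Static_assert(offsetof(struct example_submit, pad) == 12, "pad directly follows flags");

/* A kernel-side handler would typically reject non-zero padding. */
static int example_validate(const struct example_submit *args)
{
    assert(args != NULL);
    return args->pad == 0 ? 0 : -1;
}
```

Without `pad`, this struct would be 12 bytes on i386 and 16 bytes on x86_64, which is the mismatch the explicit member avoids. A struct made only of 32-bit members does not have this problem.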
gdevi has joined #dri-devel
Peste_Bubonica has joined #dri-devel
<FLHerne> jekstrand: Is that alyssa's thing here? https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9024
<FLHerne> (nir: Lower GLSL styles barriers to scoped)
anon_ has joined #dri-devel
gdevi has quit []
<alyssa> FLHerne: I closed since the intel one was better
gpuman has joined #dri-devel
<alyssa> er
slattann has quit []
quasselcore has quit [Read error: Connection reset by peer]
gpuman_ has quit [Ping timeout: 480 seconds]
gdevi[m] has quit []
gdevi[m] has joined #dri-devel
itoral has quit [Remote host closed the connection]
xexaxo has quit [Ping timeout: 480 seconds]
gdevi[m] has quit []
gdevi[m] has joined #dri-devel
gpuman_ has joined #dri-devel
mlankhorst has quit [Ping timeout: 480 seconds]
ella-0_ has joined #dri-devel
gpuman has quit [Ping timeout: 480 seconds]
ella-0 has quit [Ping timeout: 480 seconds]
ella-0_ has quit []
ella-0 has joined #dri-devel
anusha has quit []
gdevi[m]1 has joined #dri-devel
jljusten has quit [Quit: WeeChat 3.0.1]
gdevi[m] has quit []
gdevi[m]1 has quit []
gdevi[m] has joined #dri-devel
gdevi[m] is now known as gdevi
anusha has joined #dri-devel
jljusten has joined #dri-devel
hch12907 has joined #dri-devel
alyssa has left #dri-devel [#dri-devel]
<jenatali> zmike: gdevi has been looking at the primconvert stuff on our side, and it seems like it might be a bit premature
<zmike> hm?
<jenatali> Since we don't support triangle fans, we need those primconverted, but it seems there's still places that try to call straight to the driver to draw triangle fans without going through vbuf
<zmike> are there?
<zmike> 🤔
<jenatali> Pretty sure we saw quad drawing doing that
<zmike> huh
<jenatali> gdevi can comment if he saw more
<zmike> probably should be fixed then
<jenatali> Is that just a matter of making sure it goes through cso/vbuf?
* jenatali isn't super familiar with mesa/st
<zmike> would depend on the callsite?
anon_ has quit []
<zmike> zink theoretically has the same issue for macos since trifans aren't supported there, but nobody ever tests that
<jekstrand> FLHerne: I guess, maybe?
<jekstrand> I'm going to try and make them all happen
<jenatali> zmike: Yeah looks like u_blitter goes straight to pipe->draw_vbo
<zmike> ah
<zmike> alright, so I guess that'll need to check the supported primtypes cap on create (probably just for trifans tho) and then use primconvert internally
<zmike> should be simple to add in
<jenatali> Eh or just switch to triangle list honestly :P
<jenatali> Or strip
<zmike> or that
<jenatali> But I guess that might have the same problem on other hardware? Does anyone not support strips?
<jenatali> zmike: Huh... blitter_context has a use_index_buffer that seems never used
<jenatali> Oh only by v3d... but that might actually just be the easier thing for us to do
<Venemo> mareko in response to our previous chat about HW edge flags, it seems that Vulkan doesn't have any decomposed primitive type like OpenGL so I think we can get away with always disabling it and never writing them
gpuman has joined #dri-devel
gpuman_ has quit [Ping timeout: 480 seconds]
ella-0 has quit [Ping timeout: 480 seconds]
Hi-Angel has quit [Remote host closed the connection]
<airlied> jekstrand: it was dEQP-VK.spirv_assembly.instruction.spirv1p4.entrypoint.comp_workgroup_entry_point
<jekstrand> airlied: weird
gpuman_ has joined #dri-devel
gpuman has quit [Ping timeout: 480 seconds]
lemonzest has quit [Quit: WeeChat 3.2]
<cmarcelo> jekstrand: the linked MR is the approach I was taking, teach GLSL to output the new barriers (like spirv has a toggle), then we can move each driver.
Hi-Angel has joined #dri-devel
<cmarcelo> jekstrand: re: barrier for TCS, I think that's for synchronizing the output memory (URB / L3)
<jekstrand> cmarcelo: RE TCS: Maybe. I'm not convinced it's needed. We'll see if Jenkins agrees once I get through my other obvious GL bugs. :)
<cmarcelo> jekstrand: I think depending on the patch mode / size, we could drop it (and I'd expect us to already do that...). but don't expect to be able to drop it in general. what if the patch is covered by 2 hw threads?
<jekstrand> cmarcelo: Given that the fence message is sent to the dataport, does it even fence the right thing?
<cmarcelo> jekstrand: my understanding: TCS share memory via output memory (can be read/write), and those ops go through the dataport too.
<jekstrand> cmarcelo: That's quite possible. Our HW is very confusing as far as FENCE is concerned
gpuman has joined #dri-devel
<Kayden> jekstrand: I'm not sure what else we would send, there are only L3 fences and SLM fences
<Kayden> TCS outputs are the original poor man's SLM, multiple threads only write their invocation, but can read any invocation, with barriers to let you sync up
gpuman_ has quit [Ping timeout: 480 seconds]
<Kayden> so for a barrier we want to make sure that all writes are visible to every thread
<jekstrand> Kayden: Yeah, the more I argue about it, the more convinced I am that we probably need a real barrier message.
<jekstrand> I'll put it back before I do my rework of doom. :D
<cmarcelo> jek
<Kayden> obviously for SINGLE_PATCH mode, if it's PATCHLIST_8 or less, we don't need them
<Kayden> I -thought- we got rid of those
<Kayden> but I could be mistaken
<cmarcelo> jekstrand: what you mean a "real" barrier message?
<Kayden> for 8_PATCH mode...we need actual barriers all the time
<jekstrand> cmarcelo: I mean FENCE instead of just a control barrier
<cmarcelo> don't we already do that? the scoped barrier has control and memory semantics... (I'm probably missing some nuance here)
<jekstrand> cmarcelo: We do. I deleted it because I thought maybe it was pointless. :P
<Kayden> we should probably do smarter TCS dispatch
<Kayden> I'm pretty sure on TGL we can skip the TCS and just run VS -> TES when there isn't one, rather than a passthrough copy shader
<jekstrand> Ooh, that'd be nifty
<Kayden> and radeon fuses the VS and TCS when the patch size doesn't change
<Kayden> so you just run them together as one shader dispatch
Hi-Angel has quit [Remote host closed the connection]
<marex> flto: I just ran into the following on a2xx / linux 5.10.y, does it look familiar ?
<marex> [<c056533b>] (schedule_timeout) from [<bf91d44f>] (msm_wait_fence+0x15b/0x254 [msm])
<marex> [<bf91d44f>] (msm_wait_fence [msm]) from [<bf91c7d1>] (msm_ioctl_wait_fence+0x75/0xec [msm])
<marex> [<bf91c7d1>] (msm_ioctl_wait_fence [msm]) from [<c03ef20f>] (drm_ioctl_kernel+0x77/0xa8)
<marex> [<c03ef20f>] (drm_ioctl_kernel) from [<c03ef3cb>] (drm_ioctl+0x18b/0x2f0)
<marex> it wasn't there a few 5.10.y releases ago, this is 5.10.65
<marex> ah, there is also this
<marex> schedule_timeout: wrong timeout value bf942be7
Hi-Angel has joined #dri-devel
gpuman_ has joined #dri-devel
gpuman has quit [Ping timeout: 480 seconds]
ella-0 has joined #dri-devel
gpiccoli has quit [Quit: Bears...Beets...Battlestar Galactica]
gpiccoli has joined #dri-devel
<jekstrand> Kayden, cmarcelo: Ok, I'm really not sure what we're supposed to fence on DG2 then.
<karolherbst> I love debugging weirdo memory corruptions happening through bad ref counting :(
<Kayden> jekstrand: That's a good question :(
<Kayden> it's really only URB access between multiple threads
<Kayden> and it's not in the L3 anymore really
<Kayden> well
<jekstrand> Kayden: Looks like there's a new URB fence message
<Kayden> at least not in the configurable part of the L3...
<jekstrand> Which we're totally not emitting
<jekstrand> jljusten: ^^
<Kayden> there have been URB fence messages in the past too I think?
<Kayden> but yeah, that sounds like exactly what we want
<jekstrand> Kayden: I think in the past it really was a dataport fence
<jekstrand> But when they split everything up for LSC, URB needed its own thing
<jekstrand> I'm going to drop in a TODO for now
<Kayden> yeah
<Kayden> I imagine it's changed and we should switch to that.
Duke`` has quit [Ping timeout: 480 seconds]
gpuman has joined #dri-devel
<jekstrand> I'm now very glad I decided to do a refactor of doom. :D
* jekstrand sings the doom song
gpuman_ has quit [Ping timeout: 480 seconds]
<marex> uh, it is as if WAIT_FENCE ioctl from userspace is filling the ktime_t timeout in msm_wait_fence with nonsense
<flto> marex: someone reported something like that on mailing list.. IIRC the only resolution was it works with newer kernels
<marex> flto: yeah, I saw the report, Otavio never came back
<marex> sucks
<marex> I need this to work, so ... gotta start digging
<marex> flto: I suspect it works with newer kernels because of a61acbbe9cf87 ("drm/msm: Track "seqno" fences by idr")
<marex> that removes the entire fence code path
<marex> flto: one more question ... are all the a3xx and newer freedreno-running SoCs aarch64 or is there some old arm32 ?
<marex> I recall seeing something about 32bit ktime_t in kernel config recently, maybe that is where the problem comes from
<marex> wait a minute, I still have freedreno enabled in libdrm, I likely dont want that
gouchi has quit [Remote host closed the connection]
jhli has joined #dri-devel
moa has joined #dri-devel
<flto> marex: arm32+a3xx is a thing, but its not well supported - AFAIK upstream still requires some hacks to work on those SoCs (no iommu driver).. freedreno CI for a3xx is aarch64
<imirkin> flto: works sort of ok (at least used to)
<imirkin> it does hang a lot
<imirkin> people keep breaking bits of it, becomes very hard to sort out what broke
<marex> ... and ... I, with my limited ability to debug this stuff, run right into it
<marex> very good
<jekstrand> Kayden, jljusten: I've now got patches which add URB fencing, in theory. I'll have jljusten test before we commit to it.
<Company> ajax: do I blame you if llvmpipe can't blend semitransparent khaki onto khaki properly and my testsuite complains about the result not being khaki because it's #F0E58C instead of #F0E68C or does GL allow same color alpha blends to be off-by-one?
<marex> flto: urgh ... it does look like ktime_t corruption in the WAIT_SYNC call, digging, please wait
<marex> I mean, there is this static inline ktime_t to_ktime(struct drm_msm_timespec timeout) and so I added this print
<marex> pr_err("%s[%i] max=%lu sec=%llu nsec=%llu\n", __func__, __LINE__, NSEC_PER_SEC, timeout.tv_sec, timeout.tv_nsec);
<marex> and I get
bluebugs has quit [Ping timeout: 480 seconds]
<marex> [ 19.487743] to_ktime[732] max=1000000000 sec=18446744092 nsec=1192708573
<marex> well that looks like something is casting u64 to u32 or vice versa
<anholt_> Company: I don't think GL would guarantee you that.
<marex> flto: and that is called from
<marex> 878 static int msm_ioctl_wait_fence(struct drm_device *dev, void *data,
<marex> 883 ktime_t timeout = to_ktime(args->timeout);
<marex> there ... there timeout is struct drm_msm_timespec timeout;
<Company> anholt_: i'll just switch to a less weird color then - it works fine on actual GPUs but I don't need to dare the GL gods
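One way an off-by-one like this can arise in 8-bit fixed-point blending is sketched below. This is not a claim about what llvmpipe does: the two helpers are hypothetical, and the alpha value of 128 used in the note afterwards is an arbitrary choice for the demonstration.

```c
#include <assert.h>
#include <stdint.h>

/* Blend with each product truncated to 8 bits before the sum. */
static uint8_t blend_truncating(uint8_t src, uint8_t dst, uint8_t alpha)
{
    unsigned s = (unsigned)src * alpha / 255;
    unsigned d = (unsigned)dst * (255u - alpha) / 255;

    assert(s + d <= 255); /* the two weights sum to at most one */
    return (uint8_t)(s + d);
}

/* Blend with a single rounded division at the end. */
static uint8_t blend_rounded(uint8_t src, uint8_t dst, uint8_t alpha)
{
    unsigned sum = (unsigned)src * alpha + (unsigned)dst * (255u - alpha);

    return (uint8_t)((sum + 127) / 255);
}
```

Blending the green channel 0xE6 onto itself at alpha 128 gives 0xE5 with `blend_truncating` and 0xE6 with `blend_rounded`, so both results are within one step of the mathematically exact value.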
gpuman_ has joined #dri-devel
gpuman has quit [Ping timeout: 480 seconds]
gawin has joined #dri-devel
moa is now known as bluebugs
<minecrell> marex, flto, imirkin: FWIW (didn't really follow the discussion much) I use arm32+a3xx on a weird msm8916 device that can only boot arm32 because of outdated (signed) firmware. Works just fine there, not aware of any problems that don't also happen with aarch64
<minecrell> So the hangs on other arm32+a3xx platforms are probably largely unrelated to arm32
<imirkin> minecrell: yeah, it's all platform-specific
<imirkin> i have an apq8064
<imirkin> works best with some random 3.18ish kernel
thellstrom has quit [Remote host closed the connection]
<imirkin> to which i've lost the source
<marex> minecrell: ahh
thellstrom has joined #dri-devel
<marex> that timespec_t might be 32bit on 32bit system, which might end up converted the wrong way
<marex> but that should be no problem too
vivijim has quit [Ping timeout: 480 seconds]
gawin has quit [Quit: Konversation terminated!]
<minecrell> marex: I searched through the chats in the postmarketOS channels and in March someone posted similar stuff with msm_wait_fence on msm8974 (arm32 + a3xx too, 5.11.4)
<minecrell> dunno if this helps :p
gawin has joined #dri-devel
JohnnyonFlame has joined #dri-devel
<minecrell> don't see much of a discussion though, I'm guessing it happened randomly only or something
<marex> minecrell: I saw the nexus5 report too ;-)
<marex> I can apparently trigger it consistently with qt5 qopenglwidget qtbase example
gpuman has joined #dri-devel
gpuman_ has quit [Ping timeout: 480 seconds]
<karolherbst> uhh.. now I have to understand why this breaks tegra: https://gitlab.freedesktop.org/mesa/mesa/-/commit/7688b8ae980223f094be9c70fe695e2122caf3e3
Hi-Angel has quit [Ping timeout: 480 seconds]
<minecrell> marex: let me know if some additional testing on arm32 and/or arm64 + a3xx could help you, if it's as easy as starting some standard qt5 application I can probably find someone who can run that without too much effort
<marex> Thread 1 "qopenglwidget" hit Breakpoint 1, msm_pipe_wait (pipe=<optimized out>, fence=<optimized out>, timeout=<optimized out>) at ../git/src/freedreno/drm/msm_pipe.c:122
<marex> 122 ret = drmCommandWrite(dev->fd, DRM_MSM_WAIT_FENCE, &req, sizeof(req));
<marex> (gdb) p req
<marex> $1 = {fence = 336, pad = 0, timeout = {tv_sec = 18446744345, tv_nsec = 790802689}, queueid = 3}
<marex> that is a lot of seconds there
<marex> #1 0xb5a9cf14 in fd_fence_finish (pscreen=<optimized out>, pctx=0x0, fence=0x4ee2c0, timeout=18446744073709551615) at ../git/src/gallium/drivers/freedreno/freedreno_fence.c:154
<marex> this is also weird
<marex> notice the timeout value
<marex> that's u64 for -1 , hah
gpuman_ has joined #dri-devel
<marex> oh, that's PIPE_TIMEOUT_INFINITE that gets misconverted then
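The numbers in the log are consistent with a relative timeout of `UINT64_MAX` nanoseconds being added to the current monotonic time without clamping or carrying. A minimal sketch of that arithmetic follows. The struct `ex_timespec` and the function `naive_abs_timeout` are hypothetical stand-ins, not the actual Mesa code, and the current time is passed in as a parameter so that the result is deterministic.

```c
#include <assert.h>
#include <stdint.h>

#define EX_NSEC_PER_SEC 1000000000ull

/* Hypothetical stand-in for the ioctl timespec; field names follow the gdb dump. */
struct ex_timespec {
    int64_t tv_sec;
    int64_t tv_nsec;
};

/* Naive conversion: add the relative timeout to "now", no clamp, no carry. */
static struct ex_timespec naive_abs_timeout(struct ex_timespec now, uint64_t ns)
{
    struct ex_timespec tv;

    assert(now.tv_nsec >= 0 && (uint64_t)now.tv_nsec < EX_NSEC_PER_SEC);
    tv.tv_sec = now.tv_sec + (int64_t)(ns / EX_NSEC_PER_SEC);
    tv.tv_nsec = now.tv_nsec + (int64_t)(ns % EX_NSEC_PER_SEC);
    return tv;
}
```

`UINT64_MAX` nanoseconds is 18446744073 seconds plus 709551615 nanoseconds. Assuming a current time of 19.483156958 s (picked to match the dmesg timestamp above), the sketch yields tv_sec = 18446744092 and tv_nsec = 1192708573, the same pair the pr_err line printed. The nanosecond field exceeds one second, which would explain the kernel rejecting the value.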
gpuman has quit [Ping timeout: 480 seconds]
<marex> maybe all I need to do is call fd_pipe_wait (without timeout) if PIPE_TIMEOUT_INFINITE
<marex> hum
thelounge637 is now known as alatiera
<robclark> marex: no.. that just calls fd_pipe_wait_timeout(timeout=~0)
<marex> oh, dang
<robclark> note etnaviv has an at least slightly fixed version of get_abs_timeout()
Hi-Angel has joined #dri-devel
<robclark> but probably get_abs_timeout() is the place to fix it?
<marex> I was about to ask whether src/freedreno/drm/msm_pipe.c msm_pipe_wait was the right place, but yes, let's take a look
<marex> since the msm_pipe_wait() is where the PIPE_TIMEOUT_INFINITE gets propagated to
<marex> robclark: should we also somehow special-case handle the PIPE_TIMEOUT_INFINITE ?
<robclark> marex: the CPU_PREP ioctl also takes a `struct drm_msm_timespec`.. but get_abs_timeout() is used in both, so I think that would be the place to handle it.. maybe what etnaviv already does to handle rollover is enough?
<marex> the rollover isn't particularly "infinite"
<robclark> well, anything over a few seconds is essentially infinite.. if gpu hasn't finished the kernel will eventually decide the gpu has hung and reset it
JohnnyonF has joined #dri-devel
pcercuei has quit [Quit: dodo]
<robclark> so for infinite, as long as you use something that is sufficiently far out in the future it should be ok, I think
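A hedged sketch of the approach robclark outlines: clamp very large relative timeouts to a value that is "far enough" in the future and normalize the nanosecond field. The one-hour cap, the struct `sketch_timespec` and the function `clamped_abs_timeout` are assumptions made for this illustration. This is neither the etnaviv code nor the eventual freedreno fix. In real code `now` would come from `clock_gettime(CLOCK_MONOTONIC, ...)`; here it is a parameter so the behavior can be checked.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NSEC_PER_SEC 1000000000ull
/* Assumed cap: anything this far out behaves as "infinite" for a GPU wait. */
#define SKETCH_MAX_TIMEOUT_NS (60ull * 60ull * SKETCH_NSEC_PER_SEC)

struct sketch_timespec {
    int64_t tv_sec;
    int64_t tv_nsec;
};

static struct sketch_timespec clamped_abs_timeout(struct sketch_timespec now, uint64_t ns)
{
    struct sketch_timespec tv;

    assert(now.tv_nsec >= 0 && (uint64_t)now.tv_nsec < SKETCH_NSEC_PER_SEC);

    /* Treat huge values such as ~0 as "one hour from now". */
    if (ns > SKETCH_MAX_TIMEOUT_NS)
        ns = SKETCH_MAX_TIMEOUT_NS;

    tv.tv_sec = now.tv_sec + (int64_t)(ns / SKETCH_NSEC_PER_SEC);
    tv.tv_nsec = now.tv_nsec + (int64_t)(ns % SKETCH_NSEC_PER_SEC);

    /* Carry into seconds so tv_nsec stays below one second. */
    if ((uint64_t)tv.tv_nsec >= SKETCH_NSEC_PER_SEC) {
        tv.tv_nsec -= (int64_t)SKETCH_NSEC_PER_SEC;
        tv.tv_sec += 1;
    }
    return tv;
}
```

Because the clamp happens in userspace, this kind of change needs no new UABI and also works on older kernels.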
<marex> well, I won't argue with you about this, I'm not deep enough to make informed decision either way
<marex> I would probably opt for some flag in the ioctl indicating there is infinite timeout
<marex> (in the struct passed along the ioctl I mean)
JohnnyonFlame has quit [Ping timeout: 480 seconds]
<robclark> marex: perhaps, I was mostly thinking of how we could fix this without requiring new UABI or patches that older kernels hitting this issue don't have ;-)
<karolherbst> danvet, airlied, mlankhorst: is the drm-misc-fixes situation resolved now or should I just push to -next for now?
<marex> robclark: indeed
<marex> robclark: I will test this now and open a MR in a bit
<robclark> cool, thx
<marex> robclark: thank you for the input
<robclark> np, sorry I hadn't seen the email thread about that issue
<robclark> good catch in realizing what was going wrong
Hi-Angel has quit [Ping timeout: 480 seconds]
<marex> robclark: there was no email thread, I ran into it when I asked flto about it, so a few hours ago
<danvet> karolherbst, needs more mlankhorst I think
<robclark> ahh, ok, I thought someone mentioned something about a report on list
<karolherbst> danvet: okay.. will try tomorrow then or so
<marex> robclark: ah that, there was an old report, but never followed up on
<robclark> ahh, ok
The_Company has joined #dri-devel
<robclark> oh, I guess I did see it.. and forgot about it
<marex> :)
oneforall2 has quit [Remote host closed the connection]
Company has quit [Ping timeout: 480 seconds]
<marex> robclark: btw is there some process for picking these fixes into older mesa versions, like 21.2.y ?
<marex> robclark: like the Fixes: tag in kernel ?
<robclark> There are a couple options, Fixes or Cc tags is enough if the fix cherry-picks back cleanly
<robclark> otherwise you can open an MR against the 21.2-staging branch
<robclark> (ie. `Cc: mesa-stable`)
<dcbaker> cc: "21.2" mesa-stable
<dcbaker> please :)
oneforall2 has joined #dri-devel
<marex> ah ok
<dcbaker> PSA: I sent an email about patches needing backport to 21.2, there's a bunch of panfrost, some aco, freedreno, zink, intel, and lavapipe off the top of my head
<dcbaker> Fixes is better though if it does fix a particular commit
<robclark> dcbaker: I sent an MR for one of the freedreno patches, and suggested dropping the other (if you didn't see it yet)
<dcbaker> I haven't yet, thanks!
<robclark> np, thx
iive has quit []
<marex> I suspect that timeout fix would be Fixes: f3cc0d27475 ("freedreno: import libdrm_freedreno + redesign submit")
<marex> because that rollover has been unhandled ever since the beginning
<marex> do we still care about fixing it in libdrm too ?
Peste_Bubonica has quit [Quit: Leaving]
rasterman has quit [Quit: Gettin' stinky!]
Surkow|laptop has quit [Remote host closed the connection]
Surkow|laptop has joined #dri-devel
gawin_ has joined #dri-devel
gawin has quit [Read error: Connection reset by peer]
JohnnyonF has quit []
<marex> robclark: uh ... that still fails
<marex> that is odd
alyssa has joined #dri-devel
<marex> well duh, we also have to handle the same for tv->sec
<alyssa> jekstrand: Did I miss ralloc->malloc nir drama?
<karolherbst> okay...
tursulin has quit [Read error: Connection reset by peer]
<karolherbst> I am convinced that util_set_vertex_buffers_mask is busted
<karolherbst> if called with take_ownership == true
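One general way an ownership-transferring setter can go wrong when source and destination alias is sketched below. This is not Mesa's code: `struct buf`, `slot_take_naive` and `slot_take_guarded` are hypothetical, and the sketch assumes the caller hands back the very reference the slot already holds (refcount 1), as in the case karolherbst describes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted buffer; not Mesa's pipe_resource. */
struct buf { int refcount; };

static void buf_unref(struct buf *b)
{
    if (b) {
        assert(b->refcount > 0);
        b->refcount--;   /* a real implementation would free at zero */
    }
}

/* Naive "take ownership": drop the old reference, then adopt the caller's. */
static void slot_take_naive(struct buf **slot, struct buf *incoming)
{
    buf_unref(*slot);
    *slot = incoming;
}

/* Guarded variant: if the slot already holds this very buffer, leave it alone.
 * Only correct under the stated assumption that no extra reference was passed. */
static void slot_take_guarded(struct buf **slot, struct buf *incoming)
{
    if (*slot == incoming)
        return;
    buf_unref(*slot);
    *slot = incoming;
}
```

With the naive version the only reference is dropped before the pointer is stored again, so the slot ends up pointing at a buffer whose count is zero. If the caller did own a separate reference, the guarded variant would leak it, so which behavior is right depends on the caller's contract.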
<idr> alyssa: Seems like piles of things are broken on some older Intel platforms.
<idr> It's only mildly dramatic.
<idr> Would not get any of us on Springer, so... *shrug*
idr has quit [Quit: Leaving]
Bennett has quit [Remote host closed the connection]