yyds_ has quit [Remote host closed the connection]
g0b has quit [Remote host closed the connection]
mripard has quit [Remote host closed the connection]
max_ has joined #dri-devel
max_ has quit [Remote host closed the connection]
max_ has joined #dri-devel
max_ has quit []
mripard has joined #dri-devel
fab has quit [Quit: fab]
doras has joined #dri-devel
pankart[m] has joined #dri-devel
camus1 has joined #dri-devel
camus has quit [Ping timeout: 480 seconds]
Duke`` has quit [Ping timeout: 480 seconds]
<daniels>
emersion: btw, I didn't see what you pushed it as (have been away the last few days), but GPLv2-only is fine by me for the doc patch, or any other prevailing license if that makes life easier
<karolherbst>
though I can't really pinpoint my pain points yet; one of the problems is that values are reloaded all the time
<karolherbst>
though in theory the input addresses can be the same, and I suspect that's where the optimization bails?
<karolherbst>
so because it can't exclude the possibility of those buffers being the same, it has to assume the store voids the content
<karolherbst>
yeah.. changing the global to constant makes it _way_ faster
<karolherbst>
I might just fix the CTS
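A minimal OpenCL C sketch of the aliasing problem described above (hypothetical kernels, not the actual CTS code):

    /* Both pointers are __global, so the compiler cannot prove the store
     * through `out` doesn't hit the memory behind `in` (const only forbids
     * writes through `in` itself). Every store therefore invalidates
     * previously loaded values and forces reloads. */
    __kernel void copy2(__global const float *in, __global float *out) {
        out[0] = in[0];
        out[1] = in[0]; /* in[0] reloaded: out[0] may have aliased it */
    }

    /* __constant memory is a disjoint address space that __global writes
     * cannot alias, so the second load can be CSE'd away and the value
     * kept in a register (restrict on both pointers would also work): */
    __kernel void copy2_const(__constant float *in, __global float *out) {
        out[0] = in[0];
        out[1] = in[0]; /* no reload needed */
    }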
<DavidHeidelberg[m]>
karolherbst: btw, about ubsan: it just means that if we're going to test with ubsan, we have to omit the Nouveau build, but if you don't need it, it's fine
<karolherbst>
not saying that enabling ubsan is bad; it's just annoying when it reports such false positives (well.. not strictly false positives, but the problem is something else)
<karolherbst>
it doesn't complain about any other indexing with an int
<karolherbst>
why with enums?
<karolherbst>
and then also in the wrong way
<DavidHeidelberg[m]>
no idea.. (never used ubsan before).. it's like with every compiler: it catches useful stuff, but it's usually 99.x% false positives :(
<zmike>
my experience with ubsan so far has been 100% false positives and useless reports
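For illustration, the kind of enum-indexing pattern that can trip enum-related ubsan reports; this is an assumed reconstruction, not the actual Nouveau code:

    /* A sentinel enumerator used as an array bound is a common trigger:
     * values equal to the sentinel (or read from uninitialized/memset
     * storage) look like "not a valid value for type 'enum format'" to
     * the sanitizer, even though the indexing itself is in bounds. */
    enum format { FMT_R8, FMT_RG8, FMT_COUNT };

    static int uses[FMT_COUNT];

    int count_use(enum format f) {
        return uses[f]; /* ubsan may flag an invalid enum load here */
    }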
sergi1 has joined #dri-devel
fab has joined #dri-devel
Tooniis[m] has joined #dri-devel
alyssa has left #dri-devel [#dri-devel]
<karolherbst>
gfxstrand: what's the intended way of getting `nir_intrinsic_load_global_constant` from CL constant buffers? let them be emitted as ubos and then use a global ptr format?
<karolherbst>
currently nir_lower_io turns `nir_var_mem_constant` things into normal global memory after lower_explicit_io
<karolherbst>
or maybe I just figure out how to make ubos work, but I wanted to use the mem_constant way for it first anyway
Duke`` has quit [Ping timeout: 480 seconds]
kts has joined #dri-devel
ids1024[m] has joined #dri-devel
greenjustin_ has joined #dri-devel
yshui` has joined #dri-devel
go4godvin has joined #dri-devel
go4godvin is now known as Guest268
digetx is now known as Guest270
digetx has joined #dri-devel
mxlcjckmsmz has quit [Read error: Connection reset by peer]
<dottedmag>
libgbm seems to be pretty vague about what gets returned from gbm_bo_get_fd/gbm_bo_get_handle. Can one assume, when the DRI backend is in use, that the former returns a dmabuf fd and the latter a GEM handle?
<emersion>
it's always the case
<emersion>
no matter the backend
<dottedmag>
aha, thanks
<emersion>
i'd merge a patch making this more obvious
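A minimal sketch of the guarantee emersion states, using the public gbm API (device node path and sizes are arbitrary; error handling omitted):

    #include <fcntl.h>
    #include <stdint.h>
    #include <unistd.h>
    #include <gbm.h>

    int example(void)
    {
        int drm_fd = open("/dev/dri/renderD128", O_RDWR); /* arbitrary node */
        struct gbm_device *gbm = gbm_create_device(drm_fd);
        struct gbm_bo *bo = gbm_bo_create(gbm, 1920, 1080, GBM_FORMAT_XRGB8888,
                                          GBM_BO_USE_RENDERING);

        int dmabuf_fd = gbm_bo_get_fd(bo);        /* dmabuf fd, caller owns it */
        uint32_t gem = gbm_bo_get_handle(bo).u32; /* GEM handle, local to drm_fd */

        close(dmabuf_fd);
        gbm_bo_destroy(bo);
        gbm_device_destroy(gbm);
        close(drm_fd);
        return gem != 0;
    }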
<jani>
sima: no idea at all, and when I look at get_maintainer.pl I don't understand what it does either. (and I may very well have added that option to dim!)
<jani>
sima: I think e.g. dim fixes is overeager at adding Cc's
MayeulC has joined #dri-devel
kzd has joined #dri-devel
<sima>
jani, yeah I think it's bad either way
<sima>
depth 1 limits how many people get added
lynxeye has quit [Quit: Leaving.]
<sima>
but otoh if you have depth and the file is deep in a driver, not even the driver author is cc'ed
<sima>
depth=1 I mean
<sima>
and both amd and i915 are very deep at this point
<sima>
also many others
Haaninjo has joined #dri-devel
samueldr has joined #dri-devel
greenjustin has quit [Ping timeout: 480 seconds]
<linkmauve>
enunes, anarsoul, what would be needed for robustness support in lima? I’m getting a bit tired of phosh encountering the timeout and not being able to recover from that, and I think robustness is the way to go.
nchery is now known as Guest279
nchery has joined #dri-devel
benjaminl has quit [Ping timeout: 480 seconds]
<javierm>
sima: I'm on PTO until next week. But I do have it in my TODO list to take a look
Quinten[m] has joined #dri-devel
enick_185 has joined #dri-devel
shoffmeister[m] has joined #dri-devel
mripard1 has joined #dri-devel
msizanoen[m] has joined #dri-devel
<enunes>
linkmauve: I never looked into robustness. I think someone helping to debug the actual reasons for the pinephone issues first would be more helpful
<linkmauve>
enunes, I can reproduce with any shader taking more than 1s to execute.
<linkmauve>
I then see it timeout in dmesg, and the program will continue executing with the context having been lost.
<enunes>
well, then it hits the job execution timeout, and that's really not normal; from what I understand, in the pinephone issues it's coupled with power management switches going on at the same time
<pq>
linkmauve, would your program use robustness extensions if they were exposed?
<linkmauve>
pq, not yet, but I would add support to it yes.
<pq>
linkmauve, doesn't any existing API call return you errors after the timeout?
<linkmauve>
glGetError() keeps returning 0, and the rest of the GLES 2.0 APIs don’t return any error AFAIK.
kts has joined #dri-devel
<pq>
I was thinking more like EGL.
<pq>
since contexts are an EGL thing, and egl funcs might return errors, like SwapBuffers
<pq>
AFAIU, robustness will only ensure your app doesn't crash, and lets you know about losing the context and whether you were the culprit or not, but the remedy is always to tear everything down and re-create from scratch.
<pq>
so if EGL happened to tell you about it already, maybe you wouldn't need robustness
<linkmauve>
That’s fine I think, as a compositor I’d be fine with recreating everything on crash.
<pq>
There has been some talk about GPU reset handling on dri-devel@ recently, what should happen.
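For reference, a sketch of what robustness use would look like on the client side, assuming EGL_EXT_create_context_robustness and GL_EXT_robustness are exposed:

    #include <EGL/egl.h>
    #include <EGL/eglext.h>
    #include <GLES2/gl2.h>
    #include <GLES2/gl2ext.h>

    /* Passed as the attrib list to eglCreateContext(): ask for a robust
     * context that reports resets. */
    static const EGLint ctx_attribs[] = {
        EGL_CONTEXT_CLIENT_VERSION, 2,
        EGL_CONTEXT_OPENGL_ROBUST_ACCESS_EXT, EGL_TRUE,
        EGL_CONTEXT_OPENGL_RESET_NOTIFICATION_STRATEGY_EXT,
        EGL_LOSE_CONTEXT_ON_RESET_EXT,
        EGL_NONE,
    };

    /* Poll after each frame (fetch the entry point via eglGetProcAddress
     * in real code). Any non-GL_NO_ERROR status means the context is lost;
     * the remedy, as pq says, is to tear everything down and recreate. */
    static void check_reset(void)
    {
        GLenum status = glGetGraphicsResetStatusEXT();
        if (status == GL_GUILTY_CONTEXT_RESET_EXT) {
            /* our own work hit the timeout */
        } else if (status != GL_NO_ERROR) {
            /* innocent or unknown reset: someone else's fault */
        }
    }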
Eighth_Doctor has joined #dri-devel
<enunes>
I think work on robustness is orthogonal to fixing the actual known driver issue. someone still needs to put in the effort to go into the code and debug these issues, which only happen on the pinephone
<enunes>
would be great if someone from the pinephone community stepped up to do that, with our current contributor capacity it might take another while
tak2hu[m] has joined #dri-devel
<enunes>
and I think it needs to be better than those issues with 100 maybe-relevant drive-by comments from people hitting any timeout for any reason
pushqrdx[m] has joined #dri-devel
<linkmauve>
enunes, how would you go at debugging this kind of issue?
<linkmauve>
I have a PinePhone, but I have no access to any hardware debugging tool I think.
<enunes>
enable some power management debugging in the kernel, try to see if the driver is using the subsystem correctly, maybe look at the kernel mali driver to see how they handle it and whether our kernel driver is missing something
<enunes>
there are also issues for frequency scaling and others for device suspend, so it could be any of these
Targetball[m] has joined #dri-devel
knr has joined #dri-devel
devarsht[m] has joined #dri-devel
daniliberman[m] has joined #dri-devel
hansg has joined #dri-devel
<enunes>
maybe, for example, devfreq needs to be guarded in some place where it currently isn't, so it's happening at the wrong moment, etc
<gfxstrand>
karolherbst: If the driver controls lowering (recommended), then they can pick out UBOs while they still have deref chains. Then anything that path can't convert to UBOs, you convert to global.
<gfxstrand>
karolherbst: The other way to do it would be to use a different addr format for constant memory (which is possible since it can't be included in generic) and lower it to something else.
<karolherbst>
gfxstrand: mhh.. yeah, maybe. I just want to emit a load_global_constant, or maybe just turn them all into ubos and let the driver do its thing anyway....
<karolherbst>
I should probably just do the ubo thing right away
<karolherbst>
I think it's fair to use ubos there given what limits other drivers expose
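A hedged sketch of the second option gfxstrand describes; the pass and address-format names are real NIR API, but which formats a driver would actually pick (and the pass ordering) are assumptions:

    #include "nir.h"

    /* Lower constant memory with its own address format so it isn't folded
     * into generic 64-bit global memory: constant buffers picked out as
     * UBOs (while deref chains were still around) get the UBO-style
     * index+offset format, everything global stays 64-bit. Other modes
     * (shared, scratch, ...) omitted for brevity. */
    static void lower_cl_io(nir_shader *s)
    {
        NIR_PASS(_, s, nir_lower_explicit_io, nir_var_mem_constant,
                 nir_address_format_32bit_index_offset);

        NIR_PASS(_, s, nir_lower_explicit_io, nir_var_mem_global,
                 nir_address_format_64bit_global);
    }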
<ids1024[m]>
When buffers are allocated for scanout in Iris, the ISL_SURF_USAGE_DISPLAY_BIT usage is set, and in https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/intel/isl/isl.c#L1727-1749 that adjusts the row pitch to be suitable for scanout, considering the alignment required by Nvidia or AMD. It looks like pitch alignment is also an issue for importing dmabufs on Nvidia and AMD cards, though, not just scanout? On Wayland
<ids1024[m]>
linux-dmabuf-unstable-v1 has a scanout flag, but as far as I can tell there's currently no other way to handle alignment requirements for import on a different GPU.
<emersion>
ids1024[m]: yes.
bgs has joined #dri-devel
<gfxstrand>
bnieuwenhuizen, airlied: What does RADV do if it runs out of command buffer space? Do you hand a pile to the kernel or do some sort of goto/chain thing?
Guest212 is now known as DemiMarie
benjaminl has joined #dri-devel
<anarsoul>
enunes: linkmauve: the timeout issue might well be a bug in the Allwinner A64 arch timer
<anarsoul>
the timer is buggy and experiences time jumps; maybe the workaround in the kernel isn't sufficient to fix it
benjamin1 has joined #dri-devel
<bnieuwenhuizen>
gfxstrand: chain
Dr_Who has joined #dri-devel
<anarsoul>
linkmauve: enunes: to rule it out, you need to add support for A64 into timer-sun4i and switch to timer-sun4i clocksource
<gfxstrand>
bnieuwenhuizen: On all hardware or is it really restricted?
<bnieuwenhuizen>
Except on old HW where we can batch multiple buffers
<gfxstrand>
bnieuwenhuizen: I see a use_ib thing
<bnieuwenhuizen>
Which also works with power-of-two growth
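A conceptual sketch of the two strategies bnieuwenhuizen describes (the helpers alloc_cmdbuf, emit_chain_packet, and CHAIN_PACKET_DW are hypothetical; this is not RADV's actual code):

    #include <stddef.h>
    #include <stdint.h>

    struct cmdbuf {
        uint32_t *ptr;          /* packet stream */
        size_t len, cap;        /* in dwords */
        struct cmdbuf *next;
    };

    /* Modern HW: when the current IB is nearly full, allocate a larger one
     * (power-of-two growth) and emit a chain packet that makes the GPU jump
     * to it, so the kernel still sees a single command buffer. Old HW can't
     * chain, so the pieces are handed to the kernel as a batch instead. */
    static void ensure_space(struct cmdbuf **cur, size_t need)
    {
        struct cmdbuf *cb = *cur;
        if (cb->len + need + CHAIN_PACKET_DW <= cb->cap) /* hypothetical */
            return;
        struct cmdbuf *next = alloc_cmdbuf(cb->cap * 2); /* hypothetical */
        emit_chain_packet(cb, next);                     /* hypothetical */
        cb->next = next;
        *cur = next;
    }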
benjaminl has quit [Ping timeout: 480 seconds]
junaid has joined #dri-devel
cmichael has quit [Quit: Leaving]
kasper93 has quit [Remote host closed the connection]
swalker__ has quit [Remote host closed the connection]
aradhya7 has joined #dri-devel
idr has joined #dri-devel
sgruszka has quit [Remote host closed the connection]
jkrzyszt has quit [Ping timeout: 480 seconds]
lynxeye has quit [Quit: Leaving.]
alyssa has joined #dri-devel
rgallaispou has joined #dri-devel
masush5[m] has joined #dri-devel
<anarsoul>
linkmauve: btw, we have #lima channel to discuss lima issues
rgallaispou has left #dri-devel [WeeChat 4.0.3]
YHNdnzj[moz] has joined #dri-devel
<anarsoul>
linkmauve: it looks like timer-sun4i already supports A64. So I guess you can make sure that it's enabled in the kernel and just try to switch clocksource
greenjustin has joined #dri-devel
benjaminl has joined #dri-devel
benjamin1 has quit [Ping timeout: 480 seconds]
vliaskov has quit [Ping timeout: 480 seconds]
frieder has quit [Remote host closed the connection]
donaldrobson has quit [Ping timeout: 480 seconds]
greenjustin has quit [Ping timeout: 480 seconds]
mripard has quit [Quit: mripard]
Hazematman has joined #dri-devel
Leopold_ has joined #dri-devel
pixelcluster_ has joined #dri-devel
zzxyb[m] has joined #dri-devel
pixelcluster has quit [Ping timeout: 480 seconds]
pixelcluster_ has quit []
pixelcluster has joined #dri-devel
<Venemo>
I can't get over how difficult it is to read NIR 2.0 now
<zmike>
just let your eyes lose focus like you're staring at a magic eye puzzle
<Venemo>
:D
tomba has joined #dri-devel
gdevi has joined #dri-devel
Mis012[m]1 has joined #dri-devel
jasuarez has joined #dri-devel
zzoon has joined #dri-devel
ohadsharabi[m] has joined #dri-devel
aradhya7 has quit [Quit: Connection closed for inactivity]
ajhalaney[m] has joined #dri-devel
kunal_10185[m] has joined #dri-devel
jenatali has joined #dri-devel
<alyssa>
Venemo: what's the problem?:(
<Venemo>
just need to get used to it, I guess
<Venemo>
two things that look weird to me are how the bit sizes are tabulated and how intrinsics without definitions are tabulated
<alyssa>
Oh, you mean the nir_print
<alyssa>
I thought you meant the nir_def
guru_ has joined #dri-devel
oneforall2 has quit [Ping timeout: 480 seconds]
<sima>
airlied, I created topic/drm-ci and asked sfr to add it in case you missed my reply
<sima>
koike, robclark daniels ^^ more acks from (driver) maintainers would be real good
gouchi has joined #dri-devel
junaid has quit [Remote host closed the connection]
guru_ has quit []
oneforall2 has joined #dri-devel
anujp has quit [Ping timeout: 480 seconds]
<robclark>
sima: probably better to ping folks who _haven't_ already acked it... I don't think you can just add my a-b 5 times :-P
anujp has joined #dri-devel
<sima>
robclark, I figured you're most motivated to find those people :-P
<robclark>
I think i915 has the most devices so far, in drm/ci
<Venemo>
alyssa: I mean the result of nir_print_shader yes
anujp has quit [Ping timeout: 480 seconds]
anujp has joined #dri-devel
Surkow|laptop has quit [Ping timeout: 480 seconds]
fab has quit [Quit: fab]
junaid has joined #dri-devel
junaid has quit [Remote host closed the connection]
hansg has quit [Quit: Leaving]
<karolherbst>
gfxstrand: is there a way in vulkan to query how much private memory a pipeline/shader/whatever consumes?
<karolherbst>
I'm sure vulkan does not bother with that information at all
<karolherbst>
just wanted to double check
Surkow|laptop has joined #dri-devel
kzd has quit [Quit: kzd]
sima has quit [Ping timeout: 480 seconds]
<pixelcluster>
karolherbst: unless we're talking raytracing, there is no standardized way, no
<pixelcluster>
private mem doesn't exist as a concept in vulkan, except in vulkan for rt pipeline stacks
<alyssa>
karolherbst: is this for ruzticl?
Haaninjo has quit [Quit: Ex-Chat]
<pixelcluster>
if you know the driver you could perhaps filter it from pipeline executable properties, if it includes that info there
<pixelcluster>
s/in vulkan for rt pipeline stacks/in raytracing for the stacks used in recursive calls/
<karolherbst>
yeah
<karolherbst>
alyssa: ^^
<karolherbst>
pixelcluster: mhhh
<alyssa>
karolherbst: >:)
<karolherbst>
pixelcluster: I think I'll let zink just use nir->scratch_size
<karolherbst>
the good thing is... I can return whatever value
<karolherbst>
this is just a hint in CL
<karolherbst>
no idea what applications would do with that though
<karolherbst>
maybe account for VRAM and see if their stuff fits? dunno
<gfxstrand>
karolherbst: no
<karolherbst>
:')
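Following up on pixelcluster's suggestion above, a sketch of fishing scratch usage out of VK_KHR_pipeline_executable_properties. The entry points and structs are the real extension API; the statistic name is driver-specific and assumed, the entry point would be loaded via vkGetDeviceProcAddr, and the pipeline must have been created with VK_PIPELINE_CREATE_CAPTURE_STATISTICS_BIT_KHR:

    #include <string.h>
    #include <vulkan/vulkan.h>

    uint64_t query_scratch_bytes(VkDevice dev, VkPipeline pipeline)
    {
        VkPipelineExecutableInfoKHR exec = {
            .sType = VK_STRUCTURE_TYPE_PIPELINE_EXECUTABLE_INFO_KHR,
            .pipeline = pipeline,
            .executableIndex = 0, /* assume a single compute executable */
        };

        uint32_t count = 0;
        vkGetPipelineExecutableStatisticsKHR(dev, &exec, &count, NULL);
        if (count > 32)
            count = 32;

        VkPipelineExecutableStatisticKHR stats[32] = {0};
        for (uint32_t i = 0; i < count; i++)
            stats[i].sType = VK_STRUCTURE_TYPE_PIPELINE_EXECUTABLE_STATISTIC_KHR;
        vkGetPipelineExecutableStatisticsKHR(dev, &exec, &count, stats);

        for (uint32_t i = 0; i < count; i++) {
            if (strcmp(stats[i].name, "Scratch size") == 0 && /* assumed name */
                stats[i].format == VK_PIPELINE_EXECUTABLE_STATISTIC_FORMAT_UINT64_KHR)
                return stats[i].value.u64;
        }
        return 0; /* driver didn't report it */
    }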
<karolherbst>
gfxstrand: ohh.. is there a property for compute shaders to know how many threads can be in a block per compiled pipeline/object?
<gfxstrand>
max local workgroup size? Yes.
<karolherbst>
with variable block size that is
<karolherbst>
ahh, cool
<karolherbst>
where could I find that?
<gfxstrand>
VkLimits
<gfxstrand>
Or maybe VkDeviceLimits?
<karolherbst>
not device limits, I need it per compiled shader
<karolherbst>
seems like clvk just returns the device limit...
<karolherbst>
(that's not very helpful)
<karolherbst>
thing is... launching a kernel with the value returned there has to _always_ succeed
<karolherbst>
if the device limit is 1024, but a given compiled shader can only support 512 threads, then it would be a bug to return 1024
<gfxstrand>
If it's possible for a compiled shader to only support 512 threads, then it's a bug to return 1024 in the VkDeviceLimits
<karolherbst>
mhhh
<karolherbst>
the spec wording doesn't make me very confident about that
<gfxstrand>
vkCmdDispatch doesn't have a way to safely fail
<karolherbst>
ahh, fair enough then
<karolherbst>
so drivers have to be very pessimistic about what they report through maxComputeWorkGroupInvocations then
<gfxstrand>
Most drivers don't have Intel's insane flexibility
<karolherbst>
(or lower it)
<karolherbst>
I think it's also a problem on nvidia hardware
<gfxstrand>
Could be
<robclark>
it's a problem on all hw I think ;-)
<karolherbst>
you have 64k GPRs, and you can allocate up to 255 per thread
<karolherbst>
but you can launch 1024 threads
<karolherbst>
so you'd have to report 256 threads max to be safe
<karolherbst>
ehh seems like newer GPUs can even do 2048 or 1536 threads...
<karolherbst>
uhm.. not per block
<karolherbst>
seems like nvidia reports 1024 though...
<karolherbst>
so either they cheat it, or you can kinda deal with nonsense here
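The register-file arithmetic behind those numbers, as a worked example (65536 GPRs per SM and 255 registers per thread are the assumed NVIDIA-style values from above):

    unsigned max_threads_per_block(unsigned regs_per_thread)
    {
        const unsigned gprs_per_sm = 65536;         /* 64k GPRs */
        unsigned t = gprs_per_sm / regs_per_thread; /* 65536 / 255 = 257 */
        return t & ~31u;                            /* warp granularity: 256 */
    }
    /* So a kernel compiled at 255 regs/thread tops out at 256 threads per
     * block even though the device limit says 1024, which is why reporting
     * the device limit as the per-kernel maximum can be a lie. */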
guru_ has joined #dri-devel
Duke`` has quit [Ping timeout: 480 seconds]
oneforall2 has quit [Ping timeout: 480 seconds]
gouchi has quit [Remote host closed the connection]
guru_ has quit [Ping timeout: 480 seconds]
oneforall2 has joined #dri-devel
rgallaispou has joined #dri-devel
psykose has quit [Remote host closed the connection]
<Venemo>
karolherbst: what exactly do you mean by private memory there?
<karolherbst>
memory an implementation will have to allocate to run a certain shader
<karolherbst>
like scratch memory or other funky stuff
<karolherbst>
might even include spilled values and other things
<karolherbst>
essentially any kind of global memory you'd have to allocate to run something
<columbarius>
emersion: thanks
greenjustin has joined #dri-devel
<Venemo>
karolherbst: I don't think these are defined in the API in any way
<Venemo>
they are entirely implementation dependent
<alyssa>
karolherbst: You would presumably need a vk ext for this
crcvxc has joined #dri-devel
<karolherbst>
or I just don't care :P
<karolherbst>
there are more important things to figure out :D
crcvxc_ has joined #dri-devel
oneforall2 has quit [Ping timeout: 480 seconds]
<alyssa>
valid
rgallaispou has quit [Quit: WeeChat 4.0.3]
crcvxc has quit [Ping timeout: 480 seconds]
<karolherbst>
does vulkan have a way to report the native GPU pointer size?
<alyssa>
karolherbst: 64
<karolherbst>
I guess that's fair
<karolherbst>
so if your GPU has 32 bit pointers, you can't do vulkan? :P
<karolherbst>
(though I guess you just zero the high bits)
<alyssa>
:P
<alyssa>
eric_engestrom: We think we've found a Mesa EGL bug (though there's a small chance it's a CTS bug)
<alyssa>
The symptom is that Mesa does not pass CTS if gles1 is disabled at build-time (-Dgles1=disabled).
<alyssa>
Presumably this wasn't caught because the gles1 option defaults to true and people are running the CTS on their development builds, not the wacko space-optimized release builds that system integrators come up with
<alyssa>
(But it looks like at least 1 major distro is unwittingly shipping non-conformant Mesa due to this issue)
<alyssa>
One failing test is `dEQP-EGL.functional.create_context.no_config`
<alyssa>
This test unconditionally creates contexts of all APIs, and then skips based on the error code
<alyssa>
It looks like Mesa is returning the wrong error code for GLES1 on -Dgles1=disabled builds
<alyssa>
causing the gles1 portion of the test to fail rather than be correctly skipped
<alyssa>
apparently other tests like `dEQP-EGL.functional.create_context.rgb888_no_depth_no_stencil` are also failing... I can't tell why, since they don't seem to exercise GLES1, but maybe I'm looking at the wrong CTS source code.
<alyssa>
This is based on debugging on Asahi, but presumably all Mesa drivers are affected
<alyssa>
I'm hopeful this will turn out to be a 1-line Mesa patch... and not a CTS bug...
<alyssa>
Regardless, could you take a look at this tomorrow? Thank you! :-)
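Roughly what the failing test does for the GLES1 case, paraphrased (this is a sketch, not the actual CTS source; dpy is the test's EGLDisplay):

    /* The test binds the ES API, tries to create an ES 1.x context with no
     * config (EGL_KHR_no_config_context), and skips if creation fails with
     * the error code it expects for an unsupported API. The report above is
     * that -Dgles1=disabled builds return a different error code, so the
     * test fails instead of skipping. */
    eglBindAPI(EGL_OPENGL_ES_API);
    static const EGLint attribs[] = { EGL_CONTEXT_CLIENT_VERSION, 1, EGL_NONE };
    EGLContext ctx = eglCreateContext(dpy, EGL_NO_CONFIG_KHR,
                                      EGL_NO_CONTEXT, attribs);
    if (ctx == EGL_NO_CONTEXT) {
        EGLint err = eglGetError(); /* checked against the expected code */
    }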