ella-0_ has quit [Read error: Connection reset by peer]
<mattst88>
anholt: I'm trying to use bloaty, but it's giving me --
<mattst88>
bloaty: Don't know how to parse DWARF form: 31
<anholt>
haven't seen that one
<mattst88>
v1.1 similarly didn't seem to know what to do with the DWARF format I gave it (just using gcc-11), so I tried bloaty from git
<HdkR>
Is this bloaty on libGL or something? I've also never run into that
moa has quit []
<mattst88>
I ran it on iris_dri.so
<HdkR>
Hm, seems fine here on a release build + clang
bluebugs has joined #dri-devel
ahajda has quit []
<anholt>
mattst88: I've been doing mine on debugoptimized, gcc 11.2, lld builds, fwiw.
<mattst88>
okay, I'll give clang and/or lld a try
<anholt>
(do you maybe have some extra -g flags set in a meson native/cross file, perhaps?)
tursulin has quit [Read error: Connection reset by peer]
gawin has joined #dri-devel
* jekstrand
installs bloaty just to see what it is
<gawin>
does someone have a reasonable clang-format config file?
<jekstrand>
For what?
<jekstrand>
I'm pretty sure the only driver in Mesa that has any chance of being clang-format clean would be RADV and only because bnieuwenhuizen recently re-formatted the whole thing.
<bnieuwenhuizen>
"recently" I'm sure we diverged a bit
<bnieuwenhuizen>
need to redo it at some point
<bnieuwenhuizen>
would be easier to put in CI if it wasn't dependent on the clang version as to what the exact formatting is
<HdkR>
jekstrand: bloaty is great, lets you see stats about why your executable is an absolute monster.
<bnieuwenhuizen>
of course I'm pretty sure it'll tell jason that nir algebraic is huge
<mattst88>
anholt: -ggdb3 in my debug config and -fno-omit-frame-pointer in my release config. I'll try removing them
<anholt>
I have no-omit-frame-pointer, but ggdb3 seems likely
<gawin>
jekstrand: not being able to format a file is probably a bad answer?
<heat>
mattst88, I was curious and ran bloaty on one of my local executables, GCC 11.2 and it errors out with "Data is in new DWARF format we don't understand"
<heat>
so you're not alone ;)
iive has quit [Ping timeout: 480 seconds]
<mattst88>
heat: good to know, thanks!
alanc has quit []
co1umbarius has joined #dri-devel
alanc has joined #dri-devel
columbarius has quit [Ping timeout: 480 seconds]
zf_ is now known as zf
ybogdano has quit [Ping timeout: 480 seconds]
Haaninjo has quit [Quit: Ex-Chat]
pnowack has quit [Quit: pnowack]
gawin_ has joined #dri-devel
gawin has quit [Ping timeout: 480 seconds]
utf64 has quit [Ping timeout: 480 seconds]
camus has joined #dri-devel
camus1 has quit [Ping timeout: 480 seconds]
Company has quit [Read error: Connection reset by peer]
<ishitatsuyuki>
Is there a good way to, say, see how much fragmentation is going on in ttm?
gawin_ has quit [Ping timeout: 480 seconds]
Lightsword_ has quit []
Lightsword has joined #dri-devel
mbrost_ has joined #dri-devel
boistordu_old has joined #dri-devel
jewins has quit [Ping timeout: 480 seconds]
aravind has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
boistordu_ex has quit [Ping timeout: 480 seconds]
ngcortes has quit [Ping timeout: 480 seconds]
tarceri has quit [Remote host closed the connection]
tarceri has joined #dri-devel
tarceri_ has joined #dri-devel
tarceri has quit [Read error: Connection reset by peer]
<daniels>
if only there was some easier way to elide that call
<emersion>
text editors are boring, real devs use binary editors
<daniels>
I always thought sway was about choice, but here you are forcing your poor users to manually patch binaries
kts has joined #dri-devel
<HdkR>
Well now I'm thinking about what it will take to calculate function signatures so I can patch them at runtime.
<HdkR>
Probably not too terrible thinking about it. Might be nice
kts_ has joined #dri-devel
kts_ has quit []
kts has quit [Ping timeout: 480 seconds]
jewins has joined #dri-devel
<FLHerne>
emersion: ...wat
<emersion>
time to remove our config file parser. who needs that when you can just dynamically patch the executable's data section
rodrigovivi has joined #dri-devel
<FLHerne>
Hey, if it works for dwm
kts has joined #dri-devel
<GyrosGeier>
why not go the wm2 route and have the compiler parse the config for you at build time?
hch12907 has joined #dri-devel
mattrope has joined #dri-devel
hch12907_ has quit [Ping timeout: 480 seconds]
fxkamd has joined #dri-devel
camus has joined #dri-devel
camus1 has quit [Read error: Connection reset by peer]
airlied_ has quit [Ping timeout: 480 seconds]
airlied has joined #dri-devel
nchery has quit [Ping timeout: 480 seconds]
nchery has joined #dri-devel
sdutt has joined #dri-devel
Major_Biscuit has quit [Ping timeout: 480 seconds]
mbrost has joined #dri-devel
guru_ has quit []
oneforall2 has joined #dri-devel
vivijim has quit [Quit: Coyote finally caught me]
vivijim has joined #dri-devel
<bbrezillon>
is it valid to destroy a VkPipelineLayout while VkPipelines created with this layout are still alive and are being passed to vkCmdBindPipeline() after this destruction?
<bbrezillon>
I'm asking because that's happening in one of the deqp tests I'm looking at
<danylo>
bbrezillon: From the spec: "A VkPipelineLayout object must not be destroyed while any command buffer that uses it is in the recording state." so it should be wrong to destroy it before even binding the pipeline
<zmike>
anyone know offhand what the max vec length in vulkan spirv is?
<jenatali>
And 64bit vals in the next VUID says it has to be 2
<jekstrand>
bbrezillon: Yes, it's valid.
<jekstrand>
bbrezillon: In ANV, we reference count VkPipelineLayout precisely because of annoyances like that.
<bbrezillon>
ok, that's unexpected, but I guess I'll do the same :)
<jekstrand>
bbrezillon: Generally, VkPipelineLayout is a template structure (VkRenderPass is too) where it's expected that everything that takes it copies out whatever data it needs at that time.
<jekstrand>
It gets passed back in by the client at a few key points but the one passed back in may not be exactly the same VkPipelineLayout, just one with identical create parameters.
<jekstrand>
bbrezillon: What do you need to hang onto it for?
<bbrezillon>
the ID3D12RootSignature, but that one is already wrapped with a ComPtr<> so copying it is not a problem
<bbrezillon>
not sure if it's better to copy things around or refcount the pipeline_layout object
<jekstrand>
I don't remember why we refcount. There's one place we need it where it's not passed in but it's been a long time
<bbrezillon>
jekstrand: OOC, what are the vkCmds mentioned in the spec that retain the pipeline layout until the cmd buffer leaves the recording state? I thought they were referring to vkCmdBindPipeline(), but it's obviously not the case.
camus has quit [Remote host closed the connection]
<pH5>
bbrezillon: VK_KHR_maintenance4 explicitly allows applications to destroy the VkPipelineLayout object immediately after use.
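The refcounting approach jekstrand mentions can be sketched roughly as below. This is only an illustration under stated assumptions: every `demo_*` name is hypothetical and none of it is ANV's actual code. The counter is a plain integer for brevity, so the sketch is not thread-safe; a real driver would presumably use atomics.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical driver-side layout object (not ANV's real structure). */
struct demo_pipeline_layout {
    uint32_t ref_cnt;   /* 1 for the API handle, +1 per object holding on to it */
    uint32_t set_count; /* stand-in for whatever data the driver needs later */
};

/* Hypothetical pipeline that keeps its layout alive. */
struct demo_pipeline {
    struct demo_pipeline_layout *layout;
};

/* Create a layout; the initial reference belongs to the API handle. */
static struct demo_pipeline_layout *demo_layout_create(uint32_t set_count)
{
    struct demo_pipeline_layout *layout = calloc(1, sizeof(*layout));
    if (layout == NULL)
        return NULL;
    layout->ref_cnt = 1;
    layout->set_count = set_count;
    return layout;
}

/* Take an extra reference, e.g. when a pipeline is created from the layout. */
static struct demo_pipeline_layout *
demo_layout_ref(struct demo_pipeline_layout *layout)
{
    assert(layout != NULL && layout->ref_cnt > 0);
    layout->ref_cnt++;
    return layout;
}

/* Drop a reference; returns 1 if this call freed the object, 0 otherwise. */
static int demo_layout_unref(struct demo_pipeline_layout *layout)
{
    assert(layout != NULL && layout->ref_cnt > 0);
    if (--layout->ref_cnt > 0)
        return 0;
    free(layout);
    return 1;
}
```

With this shape, the driver's destroy entry point only drops the handle's reference, so a pipeline that took its own reference can still read the layout after the application has destroyed it. The alternative discussed above, copying out the needed data at pipeline creation, avoids the counter entirely.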
hch12907_ has quit [Ping timeout: 480 seconds]
hch12907_ has joined #dri-devel
hch12907 has quit [Ping timeout: 480 seconds]
aravind has quit [Ping timeout: 480 seconds]
vivijim has quit [Quit: leaving]
hch12907 has joined #dri-devel
gouchi has joined #dri-devel
<graphitemaster>
Does anyone know what the performance implications are of ARB_robust_buffer_access_behavior.txt ?
<graphitemaster>
I'm trying to understand the purpose of the extension since it's not very obvious. Robust as a descriptor often has to do with safety. I want to know if it's _faster_ than riddling some shaders with branches to check in the cases an out of bounds could occur.
MrCooper has quit [Remote host closed the connection]
<ajax>
graphitemaster: i think the intent, usually, is that when you see that extension the hardware already enforces the OOB safety so it should be no overhead
<ajax>
obviously not quite so true for like llvmpipe
MrCooper has joined #dri-devel
<graphitemaster>
Like if the extension just makes the shader compiler inject branches everywhere that would make everything slower than selective insertions of branches in the cases I know it could occur. The hope was that the extension enables zero-cost defined OOB in which case it would be faster than ifs.
<emersion>
the ext is useful for other purposes
<graphitemaster>
ajax, You can see my concern though right
<emersion>
a driver which exposes it might support GetGraphicsResetStatus(), but might still have lower perf when robust buffer access is enabled
hch12907_ has quit [Ping timeout: 480 seconds]
<emersion>
e.g. the arm driver docs recommend to enable robust buffer access for debug, but disable for release
<graphitemaster>
The fact it's conflated with the global robustness stuff is a real disappointment.
<emersion>
yeah, should've been 2 separate exts imho
<graphitemaster>
OOB should be a GLSL extension that can be enabled on a shader regardless of a robust context to get defined OOB within that shader.
<graphitemaster>
As per usual, no information online about how much slower a robust context is on different drivers and hardware.
rgallaispou has left #dri-devel [#dri-devel]
heat has joined #dri-devel
hch12907_ has joined #dri-devel
tobiasjakobi has joined #dri-devel
tobiasjakobi has quit [Remote host closed the connection]
ybogdano has joined #dri-devel
hch12907 has quit [Ping timeout: 480 seconds]
<daniels>
emersion: a recommendation which tells you exactly how much support the hardware has for it ...
OftenTimeConsuming has quit [Remote host closed the connection]
<emersion>
ahah yea
mszyprow has quit [Ping timeout: 480 seconds]
Major_Biscuit has quit [Ping timeout: 480 seconds]
ngcortes has joined #dri-devel
<imirkin>
daniels: dunno. adreno doesn't need to jump through too many hoops to limit accesses of images/ssbo's. otoh nvidia has no notion of a "ssbo" out of bounds, so all those checks are done "by hand"
heat has quit [Ping timeout: 480 seconds]
<daniels>
imirkin: well depending on your pov, stealing uniform slots and clamping every image/ssbo access isn't a huge amount of hoops per se, but it sure doesn't look nice in disasm :P
<imirkin>
daniels: well, the point is that it's not done by FF hardware
<imirkin>
i guess on adreno that clamping ain't completely free either, since it's gotta happen _somewhere_
<imirkin>
the costs are just much more opaque, which makes them ok :)
<daniels>
if you can't see a problem, I'm pretty sure it doesn't exist
frieder has quit [Remote host closed the connection]
OftenTimeConsuming has joined #dri-devel
<agd5f>
graphitemaster, it's basically an extension for GPU memory protection. Most GPUs from the last 5-6 years have this
mlankhorst has quit [Ping timeout: 480 seconds]
tzimmermann has quit [Quit: Leaving]
mbrost has quit [Ping timeout: 480 seconds]
mbrost has joined #dri-devel
Haaninjo has joined #dri-devel
<mareko>
r600 onwards has full OOB protection
hch12907 has joined #dri-devel
hch12907_ has quit [Ping timeout: 480 seconds]
<agd5f>
make that 10-12 years then
dviola has quit [Quit: WeeChat 3.4]
<imirkin>
on nvidia, you just access raw addresses, so there's no way to know if it's "in bounds" or out
<graphitemaster>
For texelFetch / imageLoad at least on NV, I seem to always get 0 for OOB
<agd5f>
imirkin, there's no concept of virtual memory?
<imirkin>
agd5f: sure is
<imirkin>
but it's based on pages
<graphitemaster>
I guess if all the major AMD and NV cards can ensure 0 for OOB imageLoad / texelFetch by default, even without a robust context I'm fine with relying on this undefined behavior.
<imirkin>
and if you access an illegal page, you get an exception in the shader
<imirkin>
s/illegal/unmapped/
<graphitemaster>
I'd hate to riddle shaders that are already my bottleneck with if statements for boundary cases *shrug*
<imirkin>
which are something of a pain to handle (we don't in nouveau, i don't think blob does either, generally speaking)
<graphitemaster>
This was somewhat the idea behind border texels back in the early days of textures, except that's only true of samplers.
<imirkin>
(they probably do for cuda and such)
<agd5f>
imirkin, yeah, traps are a pain. We just set page faults to return 0 on reads and writes are dropped
<graphitemaster>
We should just extend the texture border concept to buffers, but let it be programmable in how many texels you get, to a limit. I say a minimum of 4 should be a spec requirement.
<imirkin>
agd5f: but if it's inside the page, the access Just Works
<imirkin>
irrespective of the size of the underlying object
<imirkin>
which need not be page-aligned. and could be sub-allocated.
<agd5f>
sounds similar
bl4ckb0ne_ has left #dri-devel [#dri-devel]
bl4ckb0ne has joined #dri-devel
mszyprow has joined #dri-devel
<imirkin>
so you have to have checks in the shader to ensure you don't go "out of bounds"
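The point about page-granular protection versus per-object bounds can be modelled on the CPU with a small C sketch. It is only an analogy under stated assumptions: the `demo_*` names are hypothetical, the array stands in for one mapped page holding two sub-allocated buffers, and returning 0 for an out-of-bounds read is the behaviour discussed above rather than something this code takes from any driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of a buffer sub-allocated from a larger mapping. */
struct demo_buffer_view {
    const uint32_t *base; /* start of this buffer inside the backing store */
    size_t num_elems;     /* logical size of the buffer, in elements */
};

/*
 * Explicit bounds check "done by hand": out-of-bounds reads yield 0 instead
 * of whatever happens to live next to the buffer in the same page.
 */
static uint32_t demo_robust_load(const struct demo_buffer_view *view,
                                 size_t index)
{
    assert(view != NULL);
    return index < view->num_elems ? view->base[index] : 0u;
}

/*
 * Raw address-based access: the caller must keep the address inside the
 * backing store, but nothing ties it to the logical size of the view.
 */
static uint32_t demo_raw_load(const struct demo_buffer_view *view,
                              size_t index)
{
    assert(view != NULL);
    return view->base[index];
}
```

If two four-element views share one eight-element backing array, `demo_raw_load` on index 4 of the first view silently returns the first element of the second view, because the address is still inside the same mapping. `demo_robust_load` returns 0 for the same index, at the cost of a compare and select on every access.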
agx has quit [Remote host closed the connection]
agx has joined #dri-devel
Duke`` has quit [Ping timeout: 480 seconds]
agx has quit [Read error: Connection reset by peer]
agx has joined #dri-devel
mvlad has quit [Remote host closed the connection]
mbrost has quit [Ping timeout: 480 seconds]
mszyprow_ has joined #dri-devel
anholt has quit [Remote host closed the connection]
anholt has joined #dri-devel
mszyprow has quit [Ping timeout: 480 seconds]
mbrost has joined #dri-devel
<anholt>
sweet. single deqp-runner command now can do softpipe deqp and piglit in 1:50 on my system.
<ajax>
zang
<anholt>
next up: gl extension sanity checking so we can finally retire the old python piglit runner and save something like 15 minutes of runner job time per full pipeline. and maybe finally stop maintaining features.txt.
<ajax>
no new drivers without a working drm shim
eukara has quit [Remote host closed the connection]
eukara has joined #dri-devel
hch12907 has quit [Ping timeout: 480 seconds]
<jenatali>
anholt: The python piglit runner is still the only way to run it on Windows, deqp-runner doesn't work there yet
<jenatali>
Though I think lfrb at Collabora's looking at that...
mszyprow_ has quit [Ping timeout: 480 seconds]
mszyprow_ has joined #dri-devel
gouchi has quit [Remote host closed the connection]