<gfxstrand>
anholt, daniels: What's up with these Intel trace jobs? It's consistently taking 15-20 minutes between lava printing "waiting for job XXXXX to start" and "job XXXXX started". For the GLK jobs, I'm seeing total job times of upwards of 40-45 minutes for a single job. It's destroying CI throughput.
tristianc has joined #dri-devel
<gfxstrand>
(Not always. Sometimes they complete faster than that. But I saw one take 43m)
<anholt>
gfxstrand: unfortunately, lava starts the job on gitlab before a machine is ready. so your job is just stuck behind other jobs in the lava farm. you'd be queued up either way, but lava queuing inside the gitlab job means that it can time out and fail.
<anholt>
so, maybe they're oversubscribed and someone needs to crank down how much testing we do on them
<gfxstrand>
Okay, that makes sense. Feels a bit weird but okay.
<anholt>
or maybe something else has gone wrong. file an issue for it for someone to look into it (not me, I'm out on medical)
<gfxstrand>
kk
tristianc has quit []
tristianc has joined #dri-devel
heat has quit [Read error: Connection reset by peer]
heat has joined #dri-devel
bmodem has joined #dri-devel
YuGiOhJCJ has joined #dri-devel
bmodem has quit [Ping timeout: 480 seconds]
bmodem has joined #dri-devel
sima has joined #dri-devel
bgs has joined #dri-devel
heat has quit [Ping timeout: 480 seconds]
<daniels>
the reason they optimistically start in advance is to reduce latency
<daniels>
there are two reasons that happens anyway. one is that a bunch of machines took a dive and we haven’t fixed them yet because it’s still super early in Cambridge. another is that someone is hammering on CI to test their personal branches and eating all the capacity
RAOF has joined #dri-devel
kzd has quit [Ping timeout: 480 seconds]
bgs has quit [Remote host closed the connection]
sukrutb has joined #dri-devel
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
<daniels>
gfxstrand: also, it makes life vastly easier if you attach links to jobs
Haaninjo has joined #dri-devel
sghuge has quit [Remote host closed the connection]
sghuge has joined #dri-devel
tristan has joined #dri-devel
tristan is now known as Guest8574
<daniels>
emersion: btw, can I help with the modifier doc at all?
junaid has joined #dri-devel
<emersion>
I'll have more time in the next few days
<emersion>
I completely forgot what the comments about it were
Leopold_ has joined #dri-devel
junaid has quit [Remote host closed the connection]
<daniels>
emersion: no prob!
mvlad has joined #dri-devel
pcercuei has joined #dri-devel
<linkmauve>
Hi, when I rmmod amdgpu I get this stack trace in dmesg, it isn’t an issue for me but maybe you’d be interested in it? https://linkmauve.fr/files/journald.log
<linkmauve>
This is only the beginning of the file, at the end my whole CPU crashes and I have yet to figure out why (even though I can reproduce using either amdgpu or i915).
sgruszka has joined #dri-devel
lemonzest has quit [Quit: WeeChat 3.6]
lemonzest has joined #dri-devel
konstantin_ is now known as konstantin
f11f12 has joined #dri-devel
donaldrobson has joined #dri-devel
Ahuj has joined #dri-devel
vliaskov has joined #dri-devel
JohnnyonFlame has quit [Read error: Connection reset by peer]
RSpliet has quit [Quit: Bye bye man, bye bye]
RSpliet has joined #dri-devel
swalker__ has joined #dri-devel
f11f12 has quit [Quit: Leaving]
swalker__ has quit [Ping timeout: 480 seconds]
donaldrobson has quit [Ping timeout: 480 seconds]
donaldrobson has joined #dri-devel
Guest8574 has quit [Ping timeout: 480 seconds]
<cwabbott>
gfxstrand: for NV12, we do the swizzle by composing it with the user swizzle in the texture descriptor (because we can't represent it with the swap, and border colors aren't a thing for YUV formats)
tristan has joined #dri-devel
tristan is now known as Guest8585
Company has joined #dri-devel
<cwabbott>
gfxstrand: afaict we don't use the info in that table at all, because qcom has native formats for all of the ycbcr formats and we don't use any of the lowering code apart from nir_convert_ycbcr_to_rgb()
<cwabbott>
the HW format does return things in a different order, but we handle that ourselves with the texture swizzle
<cwabbott>
so, qcom is already correct and there's nothing you have to do there
<cwabbott>
*turnip is already correct
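A minimal sketch of what "composing it with the user swizzle" can look like, assuming swizzles are 4-tuples of channel selectors. The hardware channel order used here (`HW_FIXUP`) is made up for illustration and is not the real qcom NV12 order; `compose_swizzle` is a hypothetical helper, not a turnip function.

```python
# Swizzle selectors: integers 0-3 pick a source channel, "0"/"1" are constants.
X, Y, Z, W, ZERO, ONE = 0, 1, 2, 3, "0", "1"


def compose_swizzle(first, second):
    """Return the single swizzle equivalent to applying `first`, then `second`."""
    out = []
    for sel in second:
        if sel in (ZERO, ONE):
            # A constant requested by the second swizzle passes through.
            out.append(sel)
        else:
            # Otherwise look up what the first swizzle put in that channel.
            out.append(first[sel])
    return tuple(out)


# Hypothetical fix-up that reorders the hardware result into API order.
HW_FIXUP = (Y, Z, X, ONE)
# Hypothetical user swizzle from the image view (swap first and third channel).
USER = (Z, Y, X, W)

# The composed swizzle is what would be written into the texture descriptor.
composed = compose_swizzle(HW_FIXUP, USER)
print(composed)
```

Because the result is again a plain 4-channel swizzle, the fix-up costs nothing at sampling time; it only changes what gets programmed into the descriptor.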
cmichael has joined #dri-devel
junaid has joined #dri-devel
<MTCoster>
From `docs/vulkan/base-objs.rst`:
<MTCoster>
> We also provide an implementation of
<MTCoster>
vkEnumerateInstanceExtensionProperties() which can be used similarly
<MTCoster>
Is there a reason this isn't exposed as vk_common_*?
<MTCoster>
Ignore me, I'm being dumb
YuGiOhJCJ has quit [Quit: YuGiOhJCJ]
Company has quit [Read error: Connection reset by peer]
<kj>
Is there a mechanism to mask/disable specific Vulkan device extensions at runtime? So even if the driver supports the extensions, it doesn't get advertised
<kj>
Maybe an env variable or a Vulkan layer
<kj>
I don't suppose MESA_EXTENSION_OVERRIDE works with Vulkan. Or does it?
<pixelcluster>
it's a bit tedious/only for temporary debugging, the process would be using vulkaninfo --json to export a profile .json, editing that json manually to remove the extension(s) and then passing it to the profiles layer
<kj>
Thanks. Will give it a go. Looks like it might be what I'm looking for, the "Exclude Device Extensions" setting
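A rough sketch of the manual JSON edit pixelcluster describes, assuming the exported profile keeps extensions in objects keyed `"extensions"` that map extension name to spec version. That layout is an assumption about the profile file, so the helper walks the whole document rather than hard-coding a path. `strip_extensions` is a hypothetical helper, and the sample document is invented for the example.

```python
import json


def strip_extensions(node, unwanted):
    """Recursively delete the named extensions from every "extensions" object."""
    if isinstance(node, dict):
        ext = node.get("extensions")
        if isinstance(ext, dict):
            for name in unwanted:
                ext.pop(name, None)  # ignore names that are not present
        for child in node.values():
            strip_extensions(child, unwanted)
    elif isinstance(node, list):
        for child in node:
            strip_extensions(child, unwanted)
    return node


# Invented stand-in for a file exported with `vulkaninfo --json`.
sample = json.loads("""
{
  "capabilities": {
    "device": {
      "extensions": {
        "VK_KHR_swapchain": 70,
        "VK_EXT_mesh_shader": 1
      }
    }
  }
}
""")

edited = strip_extensions(sample, ["VK_EXT_mesh_shader"])
print(json.dumps(edited, indent=2))
```

In practice the input would be read from the exported file with `json.load` and the result written back before handing it to the profiles layer.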
Guest8585 has quit [Ping timeout: 480 seconds]
donaldrobson has quit [Ping timeout: 480 seconds]
donaldrobson has joined #dri-devel
junaid has quit [Remote host closed the connection]
<dolphin>
airlied, sima: Final drm-intel-gt-next PR sent, just 4 patches and a backmerge of drm-next appeared since previous week.
donaldrobson has quit [Ping timeout: 480 seconds]
donaldrobson has joined #dri-devel
sukrutb has quit [Ping timeout: 480 seconds]
tango_ is now known as Guest8592
tango_ has joined #dri-devel
Guest8592 has quit [Ping timeout: 480 seconds]
MajorBiscuit has joined #dri-devel
donaldrobson has quit [Ping timeout: 480 seconds]
donaldrobson has joined #dri-devel
heat has joined #dri-devel
bmodem has quit [Ping timeout: 480 seconds]
Haaninjo has quit [Quit: Ex-Chat]
fxkamd has joined #dri-devel
vsyrjala has quit [Ping timeout: 480 seconds]
mbrost has joined #dri-devel
agd5f has quit [Read error: Connection reset by peer]
Guest8455 has quit []
agd5f has joined #dri-devel
Danct12 has joined #dri-devel
i509vcb has quit [Quit: Connection closed for inactivity]
kts has joined #dri-devel
kts has quit []
Ahuj has quit [Ping timeout: 480 seconds]
yuq825 has left #dri-devel [#dri-devel]
rauji___ has joined #dri-devel
kzd has joined #dri-devel
kts has joined #dri-devel
lfrb40 has joined #dri-devel
sre5 has joined #dri-devel
agd5f has quit [Read error: Connection reset by peer]
agd5f has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
kts has quit [Ping timeout: 480 seconds]
sgruszka has quit [Remote host closed the connection]
<gfxstrand>
cwabbott: What do you mean by that? What exactly does the hardware magically do?
<gfxstrand>
cwabbott: Is it just that the hardware magically handles multi-plane in a single descriptor and single texture fetch?
<cwabbott>
yup
<gfxstrand>
Okay, that's what Mali has, too.
<gfxstrand>
So we may yet want to generalize some
<cwabbott>
I think it's mostly already generalized?
<gfxstrand>
cwabbott: How does it handle chroma offsets?
<cwabbott>
that's in the descriptor too iirc
<gfxstrand>
Okay
<gfxstrand>
Makes sense
<gfxstrand>
It would have to be, I guess. That or in the sampler.
<alyssa>
AGX has that and also magic formats that do the CSC too
<alyssa>
I don't know if we want to use the latter
<alyssa>
(in any API)
<cwabbott>
yeah, there's CHROMA_MIDPOINT_X and CHROMA_MIDPOINT_Y fields
<cwabbott>
I assume that's what you mean
<daniels>
from an app point of view, having hardware-defined magic inscrutable CSC is no worse than having magic nir_lower_ycbcr-defined magic inscrutable CSC
<daniels>
if they care about the details, they open-code it
junaid has quit [Ping timeout: 480 seconds]
<alyssa>
daniels: yeah just a question of which is less work for the drivers :~)
<gfxstrand>
alyssa: Eventually, it should just be a matter of a flag or two we hand off to the NIR pass
<alyssa>
sure
<alyssa>
the spicy part is that Metal includes non-CSC multiplane formats, but does /not/ include any CSC formats
<alyssa>
public documented Metal, I mean
<alyssa>
the CSC formats are private Apple APIs on macOS for... some reason
<alyssa>
which makes r/e considerably more annoying
<gfxstrand>
I really doubt the CSC really costs much.
<alyssa>
same
<gfxstrand>
So I'd be inclined to treat it like qcom/mali and just use the multi-plane and lower the CSC in NIR.
<gfxstrand>
And probably IMG, too
<alyssa>
yeah, that's where i'm at
<alyssa>
if you're trying to save power, cut the gpu out of the loop, stop worrying about a few FMAs.
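For a sense of how little arithmetic the CSC involves, here is a sketch of one common conversion, full-range BT.601 YCbCr to RGB on normalized values. The choice of BT.601 full range is an assumption for the example; this is not a transcription of what `nir_convert_ycbcr_to_rgb()` emits, which depends on the model and range the application selects.

```python
def ycbcr_to_rgb_bt601_full(y, cb, cr):
    """Convert normalized full-range BT.601 YCbCr (each in [0, 1]) to RGB."""
    # Chroma is stored with a 0.5 bias; remove it first.
    cb -= 0.5
    cr -= 0.5
    # A handful of multiply-adds per pixel.
    r = y + 1.402 * cr
    g = y - 0.344136 * cb - 0.714136 * cr
    b = y + 1.772 * cb

    def clamp(v):
        return min(max(v, 0.0), 1.0)

    return clamp(r), clamp(g), clamp(b)


# Mid-grey has neutral chroma and converts to equal R, G and B.
print(ycbcr_to_rgb_bt601_full(0.5, 0.5, 0.5))
```

The whole conversion is a 3x3 matrix multiply with a bias, which is why it is cheap next to the texture fetches that feed it.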
Leopold__ has joined #dri-devel
simon-perretta-img_ has quit []
simon-perretta-img has joined #dri-devel
OftenTimeConsuming has quit [Ping timeout: 480 seconds]
Leopold_ has quit [Ping timeout: 480 seconds]
OftenTimeConsuming has joined #dri-devel
mbrost has joined #dri-devel
donaldrobson has quit [Ping timeout: 480 seconds]
<simon-perretta-img>
Is there a way to use nir(_opt)_algebraic to build up a vec in the replace expression, with the order of the elements being specified by (const) arguments in the search expression?
<simon-perretta-img>
So for example if we have: testop(value.xy) and testop_split(base, value, elem), and want to fold ('testop_split', ('testop_split', 0, 'val0', 0), 'val1', 1) into ('testop', ('vec2', 'val0', 'val1'))
<simon-perretta-img>
If that sort of op only takes a vec2 then sure, it'd be simple/cheap enough to emit an additional match expression for when the inner testop_split has "val1, 1" and the outer one "val0, 0", but for larger vecs that seems like it might be excessive
<simon-perretta-img>
Hence, is there currently a way to match ('testop_split', ('testop_split', 0, 'val0', 'idx0'), 'val1', 'idx1') and replace it with something like ('testop', ('vec2', 'idx0': 'val0', 'idx1': 'val1')) ?
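To illustrate why one match expression per ordering gets excessive for larger vecs, here is a sketch that generates those expressions as plain Python tuples in the style of the examples above. `testop` and `testop_split` are the placeholder ops from the question; `split_fold_rules` is a hypothetical generator, and the tuples are only built and counted here, not fed to nir_algebraic.

```python
from itertools import permutations


def split_fold_rules(n):
    """Build one (search, replace) pair per ordering of the n split writes."""
    replace = ('testop', (f'vec{n}',) + tuple(f'val{i}' for i in range(n)))
    rules = []
    for order in permutations(range(n)):
        # The innermost split starts from a constant 0 base; each further
        # split wraps the previous expression and writes one more element.
        search = 0
        for elem in order:
            search = ('testop_split', search, f'val{elem}', elem)
        rules.append((search, replace))
    return rules


for n in (2, 3, 4):
    print(n, len(split_fold_rules(n)))
```

The count is n factorial, so a vec2 needs 2 patterns while a vec4 needs 24, all mapping to the same replacement.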
<alyssa>
simon-perretta-img: what problem are you trying to solve?
<simon-perretta-img>
I've got hardware instructions for e.g. pack_unorm_4x8(value.rgba), but also pack_unorm_1x8_split(base, value, elem) - so with base = B and unorm(val) = V, the output with elem = 1 will be 0xBBBBVVBB
<simon-perretta-img>
I can translate both variants into backend instructions, but wanted to see if I could add an algebraic case to fold the 4x 1x8 cases into a single 4x8 case
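A small software model of the two instructions as described, assuming element 0 lands in the low byte and that unorm conversion rounds to nearest. Both are assumptions about the hardware; they are chosen to match the 0xBBBBVVBB example for elem = 1.

```python
def unorm8(value):
    """Convert a float to an 8-bit unorm (clamp to [0, 1], round to nearest)."""
    value = min(max(value, 0.0), 1.0)
    return int(value * 255.0 + 0.5)


def pack_unorm_4x8(vec):
    """Pack four floats into one 32-bit word, element 0 in the low byte."""
    word = 0
    for elem, value in enumerate(vec):
        word |= unorm8(value) << (8 * elem)
    return word


def pack_unorm_1x8_split(base, value, elem):
    """Replace byte `elem` of the 32-bit `base` with unorm8(value)."""
    shift = 8 * elem
    cleared = base & ~(0xFF << shift) & 0xFFFFFFFF
    return cleared | (unorm8(value) << shift)


# elem = 1 overwrites only the second byte of the base.
print(hex(pack_unorm_1x8_split(0xBBBBBBBB, 1.0, 1)))

# A chain of four 1x8 splits from a zero base equals one 4x8 pack.
color = (0.0, 0.25, 0.5, 1.0)
chained = 0
for elem, value in enumerate(color):
    chained = pack_unorm_1x8_split(chained, value, elem)
print(chained == pack_unorm_4x8(color))
```

This equivalence is what the proposed fold relies on: once all four bytes of a zero base have been written, the order of the writes no longer matters.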
<robclark>
daniels: app can't really open code it once modifiers enter the picture.. nv12+ubwc doesn't work with the "lets pretend it is R8+R8G8" thing
<simon-perretta-img>
alyssa: Scalarised fragment shader store_outputs
<simon-perretta-img>
I've just been keeping them as vector store_outputs for now in order to use pack_unorm_4x8, but a backend/hardware requirement is that its input needs to be a contiguous block of 4 regs, so I'm exploring/experimenting using the 1x8 variant
<simon-perretta-img>
So currently for folding them I've written a C pass that walks back the base of each 1x8 chain, etc. but was also hoping there might be a way to do the same with nir_algebraic
<alyssa>
Don't scalarize store_output then
<alyssa>
Problem solved
mbrost_ has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
<alyssa>
"its input needs to be a contiguous block of 4 regs" this really isn't a big deal
<simon-perretta-img>
True, but if I've got a gl_FragColor = vec4(0.0, 0.0, 0.0, 1.0) or similar, that means reserving 4 contiguous temps to use for that vec rather than being able to use 0 temps with the split variants
<simon-perretta-img>
...it would be constant folded
<alyssa>
it's an ALU op, it constant folds
<alyssa>
yes
<alyssa>
:)
<simon-perretta-img>
Missing the forest for the trees :D
<alyssa>
Yes
<alyssa>
simon-perretta-img: also, word of advice: you have infinite ALU
<alyssa>
if you're spending time to "improve performance by reducing ALU", chances are you're wasting your time
<alyssa>
some of us like to do that to relax and/or procrastinate on real work
<alyssa>
but in general ... I don't think I can remember a single ALU saving optimization I've ever done that's actually moved the FPS needle
mbrost has joined #dri-devel
<alyssa>
sure, it saves power, but .. meh
<alyssa>
These are problems to worry about when you have a conformant VK1.3 driver that's running DX12 games at full perf and your biggest problem is some extra moves
<alyssa>
I once added some "important" opt_algebraic rules that reduced the cycle count of a massive shader in glmark2 by 22% on mali-g57
<alyssa>
do you know what happened to fps for that glmark2 scene?
<alyssa>
nothing. nothing happened to it. zero change on mali-g57.
<simon-perretta-img>
For sure, this was more of a "pack_unorm_4x8 and pack_unorm_1x8 both already work, but it's Friday afternoon and I'm curious if I can write something short to fold the latter" kinda thought haha
<alyssa>
because, despite that shader having 1000 instructions of ALU ... even a dinky little Mali can do lots of ALU, that scene was totally bound by memory bandwidth and driver overhead due to constantly glGenerateMipmap'ing
yyds has joined #dri-devel
<alyssa>
simon-perretta-img: Also, it's a lot easier to scalarize than to vectorize, as you're now discovering
<alyssa>
so would make more sense to keep store_output vectorized and always do pack_unorm_4x8 and just have targeted heuristics to break it down to 1x8 when that's actually better
<alyssa>
although I suspect that's "almost never"
<simon-perretta-img>
Yeah, makes sense
mbrost_ has quit [Ping timeout: 480 seconds]
<simon-perretta-img>
Although, part of the reason for investigating the scalar route for fs outputs (and when unpacking for vs inputs) is for funky formats that don't have HW instructions to pack/unpack them in their entirety
mbrost_ has joined #dri-devel
sukrutb has joined #dri-devel
<simon-perretta-img>
Which sure, might be more of a "deal with it when it comes to it" issue when nir_format_convert.h isn't enough
<simon-perretta-img>
But anyway, that's a bit of a different topic haha
<simon-perretta-img>
Thanks for the pointers!
* alyssa
hugs her formatted store
mbrost__ has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
krushia_ has quit []
<daniels>
gfxstrand: the 3 FMAs certainly vanish into line noise compared to dual tex load
<daniels>
robclark: mmm, yeah … I wonder if we want an EGLImage ext for ‘raw values only pls’, so it can still do the nice sampling (the co-issue helps on Mali too), but let the user be in control of CSC
<alyssa>
yuv considered a mistake
<alyssa>
*harmfuk
<alyssa>
**harmful
<robclark>
yeah, an ext that gave you the unconverted yuv could work
sarnex has quit [Read error: Connection reset by peer]
sarnex has joined #dri-devel
<robclark>
alyssa: 1.5 bytes per pixel is a lot less than 4 bytes per pixel
<robclark>
yuv is probably one of the more sane things about video :-P
<robclark>
(or less insane?)
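The 1.5 versus 4 bytes per pixel figure can be checked with a short calculation, assuming 8-bit NV12 (a full-resolution Y plane plus one interleaved CbCr plane at half resolution in each dimension) against 8-bit RGBA. Row padding and alignment are ignored here.

```python
def nv12_bytes(width, height):
    """Bytes for an 8-bit NV12 frame, ignoring stride and alignment padding."""
    luma = width * height
    # One Cb and one Cr byte per 2x2 block of pixels, rounding odd sizes up.
    chroma = 2 * ((width + 1) // 2) * ((height + 1) // 2)
    return luma + chroma


def rgba8_bytes(width, height):
    """Bytes for an 8-bit RGBA frame, ignoring stride and alignment padding."""
    return 4 * width * height


w, h = 1920, 1080
print(nv12_bytes(w, h) / (w * h), rgba8_bytes(w, h) / (w * h))
```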
mbrost_ has quit [Ping timeout: 480 seconds]
<alyssa>
robclark: video considered harmful
<alyssa>
read a book?
<alyssa>
:p
<robclark>
heh, _that_ I would agree with :-P
sarnex has quit [Read error: Connection reset by peer]
sarnex has joined #dri-devel
<alyssa>
got it in 2!
yyds has quit [Remote host closed the connection]
DemiMarie has left #dri-devel [#dri-devel]
Danct12 has quit [Quit: A-lined: User has been AVIVA lined]
gouchi has quit [Quit: Quitte]
Company has joined #dri-devel
mbrost__ has quit [Ping timeout: 480 seconds]
bmodem has quit [Ping timeout: 480 seconds]
Duke`` has joined #dri-devel
oneforall2 has quit [Remote host closed the connection]
Haaninjo has joined #dri-devel
<alyssa>
when did my baby driver get so big? O_O
oneforall2 has joined #dri-devel
<HdkR>
alyssa: It grew up :D
<zmike>
when you played the right music ?
<alyssa>
HdkR: o_O
<HdkR>
The grow up and glow up?
benjaminl has quit [Ping timeout: 480 seconds]
heat has quit [Remote host closed the connection]
heat has joined #dri-devel
Danct12 has joined #dri-devel
mvlad has quit [Remote host closed the connection]
benjaminl has joined #dri-devel
mauld has quit [Ping timeout: 480 seconds]
mauld has joined #dri-devel
Duke`` has quit [Ping timeout: 480 seconds]
mbrost has joined #dri-devel
konstantin_ has joined #dri-devel
konstantin has quit [Ping timeout: 480 seconds]
anujp has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
Haaninjo has quit [Quit: Ex-Chat]
dviola has quit [Ping timeout: 480 seconds]
heat has quit [Remote host closed the connection]
pcercuei has quit [Quit: dodo]
Surkow|laptop has quit [Ping timeout: 480 seconds]