ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
<jekstrand>
Maybe we're losing a @0 = true somehow?
<idr>
jekstrand: In the "pass" code that zmike posted, there is no assignment in either branch of the if-statement.
<idr>
Because an earlier optimization pulls it out.
<idr>
It converts it from 'if (cond) { a = true; discard; } else { a = false; }' to 'if (cond) { discard; } else { } a = cond;'
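[Editor's note: the same transformation written out, GLSL-style; 'a' and 'cond' are just the placeholder names from idr's message, not real shader code.]
    // before: each branch writes 'a'
    if (cond) {
        a = true;     // then-branch
        discard;
    } else {
        a = false;    // else-branch
    }

    // after the earlier optimization: on every path 'a' ends up equal
    // to 'cond', so the assignment is hoisted out of the if
    if (cond) {
        discard;
    }
    a = cond;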
<jekstrand>
idr: Ok, backing up a bit....
<jekstrand>
I believe that the way the Intel drivers are interpreting the SPIR-V they're getting is correct, within the limits of Intel's discard implementation. The problem is a combination of the SPIR-V we're receiving and the "don't kill necessary helpers" discard behavior.
<jekstrand>
So the question is why is Zink moving stuff around and what was it trying to do at the start.
<jekstrand>
Maybe you're already there but I'm trying to get us all on the same page. :)
<jekstrand>
Oh, I see where that store is going....
<jekstrand>
This is gonna get annoying
<idr>
All of the writes to 'a' become the phi. Then the phi gets converted back to stores, but the store in the then-branch is on a different side of the discard.
<idr>
I spent a couple hours stepping through all of these transformations.
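[Editor's note: a rough NIR-style sketch of the phi idr mentions, not actual compiler output; the names are illustrative. nir_convert_from_ssa then rewrites the phi as register stores in the predecessor blocks, which is the shape shown in idr's dump further down.]
    if ssa_cond {
        block block_1:              /* then: discards */
        intrinsic discard () ()
    } else {
        block block_2:              /* else: falls through */
    }
    block block_3:
    ssa_a = phi block_1: ssa_true, block_2: ssa_false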
<jekstrand>
idr: Were you looking at it on iris or with Zink?
<idr>
Both
<jekstrand>
Ok
<idr>
I started with Iris because I was expecting that disabling the optimization would lead to the same hang.
<idr>
It did not... so I dug deeper.
<idr>
It's the extra "full" cycle out of and back into SSA form when the change occurs.
<jekstrand>
Yes
<jekstrand>
Looking at emit_discard() in nir_to_spirv.c I think I have our culprit
<jekstrand>
OpKill in SPIR-V is considered control-flow because it nominally has GLSL discard semantics (in spite of the D3D name)
<jekstrand>
So nir_to_spirv emits the discard and then creates an unreachable block and sticks everything after the discard there.
<jekstrand>
In particular, that's where it's stashing that MOV we need.
<jekstrand>
So, in spirv_to_nir, we throw it away because it doesn't even parse unreachable blocks.
<idr>
Hm...
<jekstrand>
If nir_to_spirv instead emitted an "if (1) OpKill", it could leave the stuff after the discard sort-of reachable.
<jekstrand>
That's not great but it'd hack around the problem.
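[Editor's note: a hand-written SPIR-V sketch of the two shapes being discussed; result IDs and the enclosing function are omitted and the names (%a, %true, %merge, ...) are made up, so treat it as illustration rather than actual nir_to_spirv output. %true is a boolean constant.]
    ; current emission: OpKill ends the block, so whatever follows the
    ; discard has to start a new block that nothing ever branches to
            OpKill
    %dead = OpLabel            ; unreachable; spirv_to_nir never parses it
            OpStore %a %true   ; the MOV we need ends up stranded here
            OpBranch %merge

    ; "if (true) { OpKill }": the tail stays (technically) reachable
            OpSelectionMerge %merge None
            OpBranchConditional %true %kill %merge
    %kill = OpLabel
            OpKill
    %merge = OpLabel
            OpStore %a %true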
<jekstrand>
Really, it needs to use OpTerminate or OpDemoteToHelperInvocation
<idr>
When I was looking at the NIR_DEBUG=print_fs output, I'm 99% sure the assignment of true was still there.
<jekstrand>
It will be before it goes into SPIR-V
<jekstrand>
It's part of the NIR -> SPIR-V pass
<jekstrand>
And it's still sort-of there in the SPIR-V, just in an unreachable block.
<idr>
Let me check again. I was fairly sure this was inside anv.
<jekstrand>
It's not there in the "from SPIR-V" dump from zmike
stuart has quit []
* jekstrand
takes the trash out
* Sachiel
resists
<idr>
Okay... I was partially misremembering.
<idr>
With the breakpoint set in nir_convert_from_ssa and NIR_DEBUG=print_fs, I see that before the last invocation while still in zink, everything looks "fine."
<jekstrand>
Sachiel: What are you worried about? I live in Austin, not Portland. There's plenty of room in the can.
<idr>
After, we have...
<idr>
if ssa_37 {
    block block_1:
    /* preds: block_0 */
    r0 = mov ssa_1
    intrinsic discard () ()
    /* succs: block_3 */
} else {
    block block_2:
    /* preds: block_0 */
    r0 = mov ssa_0
    /* succs: block_3 */
}
<jekstrand>
Ok, yeah, that's what I'd expect
<idr>
And it vanishes, I think before emit_discard() ever happens.
<idr>
Let me double-check that.
<jekstrand>
Unfortunately, we don't have the spirv disassembler hooked up in any convenient way to get a good SPIR-V dump
<zmike>
not true
<jekstrand>
zmike: Oh?
<zmike>
ZINK_DEBUG=spirv
<jekstrand>
Nice!
<zmike>
👍
<idr>
So, that is the final NIR before emit_discard().
<anholt>
zmike: oh, that's great. now I also want it hooked up in vtn too (how did you all live without that?).
<jekstrand>
Yeah, and that NIR is fine with either interpretation of discard. If you kill lanes, they're dead and so r0 doesn't get used from anything that discards. If you don't, it gets written and everyone's happy.
<idr>
Is this something Vulkan validation layers would have caught?
<idr>
(Assuming that the SPIR-V is actually boats.)
<jekstrand>
idr: That's the problem: Nothing is boats!
<jekstrand>
Except Intel discard behavior
<idr>
Eh... code in an unreachable block?
<jekstrand>
unreachable blocks are perfectly fine SPIR-V, they just don't get executed.
<jekstrand>
So I think we've got roughly two ways to fix this and we should probably do both of them:
<idr>
Valid and fine are different. :p
<jekstrand>
The real fix is to emit OpTerminate instead of OpDiscard
<zmike>
anholt: feel free to copy
<jekstrand>
OpKill, rather
<jekstrand>
A partial fix would be to emit "if (true) { OpKill }" for discard so that the stuff after it is still technically reachable.
<idr>
The reason we have this weird behavior is that universally doing either OpKill or OpDemote breaks some OpenGL app.
<jekstrand>
That would get us to where Zink naturally inherits the behavior of the underlying Vulkan driver's discard.
<jekstrand>
idr: Yup. But, since Zink is the GL driver in charge of dealing with silly GL apps, it should be making that call, not the Vulkan driver.
<jekstrand>
And, with VK_EXT_demote_to_helper_invocation, you have both OpTerminate and OpDemoteToHelperInvocation so it can pick which behavior it wants.
<jekstrand>
If the Vulkan driver doesn't support that, "if (true) { OpKill }" SHOULD fall back to native driver behavior without making code after discards dead unless the underlying Vulkan implementation decides to kill everything after the OpKill and dead code it.
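[Editor's note: a rough C-style sketch of the strategy jekstrand outlines; every function and field name below is invented for illustration and is not the actual zink/nir_to_spirv code.]
    /* hypothetical sketch only */
    static void
    emit_discard(struct ntv_context *ctx)
    {
       if (ctx->want_helper_invocations && ctx->have_demote_ext) {
          /* GL wants the helpers kept alive and the Vulkan driver exposes
           * VK_EXT_shader_demote_to_helper_invocation */
          emit_op(ctx, "OpDemoteToHelperInvocation");
       } else if (ctx->have_terminate_invocation) {
          /* plain GLSL discard semantics: the invocation is dead from here */
          emit_op(ctx, "OpTerminateInvocation");
       } else {
          /* fallback: wrap OpKill in "if (true)" so the code after the
           * discard stays reachable and zink inherits the Vulkan driver's
           * native discard behavior */
          emit_if_true(ctx);
          emit_op(ctx, "OpKill");
          emit_endif(ctx);
       }
    }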
<jekstrand>
Let me try to type that up on the issue
<zmike>
I can probably jam it in real quick tomorrow if you make a ticket to remind me of this
columbarius has joined #dri-devel
<zmike>
though I wouldn't say no if some enterprising individual wanted to tackle it themselves
co1umbarius has quit [Ping timeout: 480 seconds]
<jekstrand>
Nope. Because the SPIR-V is 100% valid
<jekstrand>
ugh... commenting on backlog
<jekstrand>
zmike: I'm typing up what needs to be done now in case you're not reading all the IRC and/or putting the pieces together. Then I'm off
<zmike>
what a legend
<zmike>
idr: you too, thanks for investigating
<jekstrand>
zmike: There you go. Wrote you a book.
<jekstrand>
Wow, a no-fault Zink bug report! The only real person to blame is discard(). :D
<zmike>
I think we both get a beer after this one
<jenatali>
Ugh there's Matrix spammers on the bridged version of this channel...
<karolherbst>
jenatali: like troll spammers or real spammers?
<zmike>
yet another reason to use IRC
<jenatali>
Posting spam links with promises of money
<karolherbst>
that crypto scam stuff?
<jenatali>
Something like that
<karolherbst>
if somebody mentions coins, just block them
<karolherbst>
should be easy to automate :P
<karolherbst>
but what are those bots doing, just searching for open matrix instances and joining?
<jenatali>
Seems that way
<karolherbst>
though wondering why nothing ends up on the irc side
<karolherbst>
unless they get blocked here but the bridge doesn't recognize it?
<Sachiel>
not authenticated maybe
<karolherbst>
then why do the messages even show up on the matrix side?
<karolherbst>
I'd expect that the bridge at least enforces the same rules
<zmike>
doubtful
<Sachiel>
to show up here through the bridge, they need to also have a registered account and authenticate to it, that's why you see the one sided conversations sometimes
<Sachiel>
if the bots are just going through the motions for the matrix stuff to work, it won't show up here
<jenatali>
Yeah, they have Matrix accounts but aren't authed with NickServ
<zmike>
the matrix bridge would have to take into account channel modes, which is just way too much effort
<karolherbst>
well....
<karolherbst>
it should though
<jenatali>
It probably could if IRC was less of an ancient technology lol
<karolherbst>
:D
<karolherbst>
I guess you'd need channel-mode-to-whatever mappings for each server
<karolherbst>
but...
<karolherbst>
you could at least support that for some and make that configurable
<karolherbst>
but I also suspect that banned accounts won't be banned on the matrix side...
<airlied>
oh man can't wait until joss finds matrix :-P
<Sachiel>
+b is a fairly standard mode though, so maybe that one is supported just fine
<karolherbst>
how is joss doing btw? long time since hearing from joss
<karolherbst>
Sachiel: one could say the same about requiring authentication to speak inside a channel
apinheiro has quit [Ping timeout: 480 seconds]
<Sachiel>
don't know why I said 'fairly'. +b is one of the modes from the old RFC 1459, so anything not supporting it can't have much claim of supporting IRC. +M is not, maybe it's part of the IRCv3 stuff, or maybe it's server specific, so it's understandable that it's not widely understood
<airlied>
but it might not be what you wanted :-P but now it should operate the same as iris does now and be consistent
<airlied>
anholt: is the hsw runner down?
clararussell[m] has quit [autokilled: This host violated network policy and has been banned. Mail support@oftc.net if you think this is in error. (2022-08-17 04:47:49)]
sdutt_ has quit [Read error: Connection reset by peer]
<anholt>
airlied: back now
<anholt>
its power seems to be getting flaky. will see if I need to pick up another one
aravind has quit [Ping timeout: 480 seconds]
remexre has left #dri-devel [#dri-devel]
danvet has joined #dri-devel
srslypascal is now known as Guest302
srslypascal has joined #dri-devel
Guest302 has quit [Ping timeout: 480 seconds]
kts has joined #dri-devel
rasterman has joined #dri-devel
gawin has joined #dri-devel
Duke`` has joined #dri-devel
fab has joined #dri-devel
aravind has joined #dri-devel
fab has quit [Ping timeout: 480 seconds]
lemonzest has joined #dri-devel
thellstrom has joined #dri-devel
MajorBiscuit has joined #dri-devel
Major_Biscuit has joined #dri-devel
jewins has quit [Ping timeout: 480 seconds]
tzimmermann has joined #dri-devel
MajorBiscuit has quit [Ping timeout: 480 seconds]
<tzimmermann>
mripard, thanks for handling the remaining patches in drm-misc-fixes
tzimmermann has quit [Quit: Leaving]
frieder has joined #dri-devel
fab has joined #dri-devel
mvlad has joined #dri-devel
jkrzyszt has joined #dri-devel
fab has quit [Ping timeout: 480 seconds]
saurabhg has joined #dri-devel
pcercuei has joined #dri-devel
<MrCooper>
I do think there was a missed opportunity to at least try Matrix when we had to migrate away from FreeNode
rgallaispou has joined #dri-devel
<HdkR>
Would be nice if Matrix solved the +M problem
lynxeye has joined #dri-devel
Haaninjo has joined #dri-devel
pcercuei has quit [Quit: brb]
pcercuei has joined #dri-devel
kts has quit [Ping timeout: 480 seconds]
warpme___ has joined #dri-devel
JohnnyonFlame has quit [Ping timeout: 480 seconds]
Akari has quit [Ping timeout: 480 seconds]
apinheiro has joined #dri-devel
Akari has joined #dri-devel
Jeremy_Rand_Talos_ has quit [Remote host closed the connection]
Jeremy_Rand_Talos_ has joined #dri-devel
gawin has quit [Ping timeout: 480 seconds]
saurabhg has quit [Ping timeout: 480 seconds]
camus1 has joined #dri-devel
camus has quit [Ping timeout: 480 seconds]
devilhorns has joined #dri-devel
JohnnyonFlame has joined #dri-devel
tobiasjakobi has joined #dri-devel
tobiasjakobi has quit []
<bylaws>
jekstrand: kinda curious on the need to use a helper shader to clear attachments in NVK, did newer NV HW drop native clears?
kts has joined #dri-devel
gawin has joined #dri-devel
<bylaws>
Also I emailed NV to enquire about missing regs in headers
Guest253 is now known as dreda
srslypascal is now known as Guest328
srslypascal has joined #dri-devel
<bylaws>
They redacted some things in order to try and get them out the door but for non-sensitive regs they could try to negotiate to get them published
srslypascal is now known as Guest329
srslypascal has joined #dri-devel
srslypascal has quit []
Guest328 has quit [Ping timeout: 480 seconds]
Guest329 has quit [Ping timeout: 480 seconds]
srslypascal has joined #dri-devel
thellstrom1 has joined #dri-devel
thellstrom1 has quit []
<lygstate>
airlied: Do you know anything about the llvmpipe/softpipe cpu caps override via LP_FORCE_SSE2 and LP_NATIVE_VECTOR_WIDTH? I am improving on that and need some feedback
apinheiro has quit [Ping timeout: 480 seconds]
thellstrom has quit [Ping timeout: 480 seconds]
saurabhg has joined #dri-devel
bmodem has quit [Ping timeout: 480 seconds]
RSpliet has quit [Quit: Bye bye man, bye bye]
RSpliet has joined #dri-devel
<karolherbst>
ehh is it just me or are some of the CI tests just super unreliable failing due to random timeouts or hw being annoying?
<Ristovski>
airlied: The ones that now show up with the patch all work in a basic test (no more hangs). However, all the ACountersN and NOACountersN are gone, is that expected (does hsw not have those or something?)
<karolherbst>
also, "dEQP-GLES3.functional.draw_buffers_indexed.random.max_required_draw_buffers.4" seems busted on lavapipe
<karolherbst>
but no idea why I am the one hitting it
<Ristovski>
Or wait, I assume the rest might be in INTEL_performance_query now, I was looking at AMD_performance_monitor .-.
<karolherbst>
ehh virgl
<karolherbst>
on llvmpipe
<Ristovski>
Well, the ACountersN/NOACountersN were shown in GALLIUM_HUD as well, which should pick up both?
aravind has quit [Ping timeout: 480 seconds]
YuGiOhJCJ has quit [Quit: YuGiOhJCJ]
f11f12 has joined #dri-devel
rkanwal has joined #dri-devel
Haaninjo has quit [Quit: Ex-Chat]
<zmike>
karolherbst: again, it has to be down to some changes in the build that you did
<zmike>
those tests pass regularly
<karolherbst>
well.. I didn't change anything which should affect this.. sooo
<jekstrand>
Also, not sure how far it goes back, we may need the shaders eventually anyway.
<karolherbst>
the same fails I am also seeing
<karolherbst>
and I am sure this MR doesn't change a thing :)
<jekstrand>
And we'll need shaders for vkCmdBlitImage so we needed the framework anyway
<zmike>
karolherbst: looks like you need the first patch from my mold MR
<zmike>
to fix the fedora-release build
<zmike>
this is so weird though
<karolherbst>
huh...
Duke`` has quit [Ping timeout: 480 seconds]
<karolherbst>
the fedora job passes on my rusticl MR
<karolherbst>
ohh.. because I don't bump that tag...
<karolherbst>
figures
<daniels>
karolherbst: you're now discovering the reason (apart from that it takes ages & even just the networks are flaky) why we don't just install random versions of all our build deps every time :P
<karolherbst>
:D
sdutt has quit []
sdutt has joined #dri-devel
gawin has joined #dri-devel
Duke`` has joined #dri-devel
devilhorns has quit [Remote host closed the connection]
<Frogging101>
Steam says I have 3416MB of shaders pre-cached but my .cache/mesa_shader_cache is only 412M. What is it referring to?
<Ristovski>
Frogging101: It keeps them in a separate `steam_shader_cache` folder iirc, not the default mesa one
heat has joined #dri-devel
<Ristovski>
Frogging101: Ah there we go: ~/.steam/steam/steamapps/shadercache
ppascher has quit [Ping timeout: 480 seconds]
<Frogging101>
So if I wanted a "clean run" with all the shader-related stuttering intact, would I delete both the mesa one and the steam one?
fxkamd has joined #dri-devel
<Ristovski>
The mesa one (~/.cache/mesa_shader_cache) _shouldn't_ interfere with Steam games at all afaik. If you open any of the subfolders in the steams `shadercache` path, you will see it contains the `mesa_shader_cache`. I assume Steam uses MESA_SHADER_CACHE_DIR to override the path as to not interfere with the default one in `~/.cache`.
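[Editor's note: a possible way to do the "clean run" Frogging101 asked about above, assuming the paths Ristovski mentions and that Steam sets MESA_SHADER_CACHE_DIR per game; <appid> is a placeholder for the game's numeric Steam app id.]
    # wipe both caches (close Steam / the game first)
    rm -rf ~/.cache/mesa_shader_cache
    rm -rf ~/.steam/steam/steamapps/shadercache/<appid>

    # or, without deleting anything, point mesa at a throwaway directory
    # via the game's Steam launch options:
    #   MESA_SHADER_CACHE_DIR=/tmp/mesa-clean-cache %command%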
heat has quit [Remote host closed the connection]
<Frogging101>
Halo 2 has some pretty bad stuttering issues. It's not unplayable but it's frustrating. It stutters heavily whenever you enter a new area for the first time. You have to look in all directions to warm up the cache
<Frogging101>
I want to know why this is
devilhorns has joined #dri-devel
gawin has quit [Ping timeout: 480 seconds]
<Ristovski>
Frogging101: Do you have steams "Shader Pre-Caching" enabled?
<Frogging101>
Yes
<Frogging101>
It only happens the first time, and that's why I think it's shader related, because shaders are one thing that is cached
<ishitatsuyuki>
Frogging101, if it's DXVK it has a state cache and it's not shipped by Steam (with the exception of Deck)
<ishitatsuyuki>
basically for DX11 use cases even the driver shader cache isn't good enough and may cause hitches when too many of them are loaded at once
<ishitatsuyuki>
DXVK's solution is to cache the pipelines before compiling and load them at a much earlier time (at startup or at registration, I don't know)
ybogdano has joined #dri-devel
<karolherbst>
zmike, daniels: anyway... any good idea on how we want to handle those? I also see that we have a couple of timeout issues and I think we need a better way of dealing with timeouts specifically
<karolherbst>
I have another MR blocked just because timeouts are flaky
<zmike>
karolherbst: I still don't understand why doing a container update is causing timeouts?
<karolherbst>
because somebody changed something and the container was never updated
<karolherbst>
things just run longer...
<karolherbst>
this happens
<zmike>
I guess we'll never know what the cause was
<karolherbst>
maybe it was 4m 55s, now it's 5m 02s and it's just very stable timing
<karolherbst>
and the timeout is 5m
<zmike>
no, the individual tests are failing
<zmike>
the timeout for them is 60s
<Frogging101>
ishitatsuyuki: So is this not fixable? It is DXVK. It's DX11 I believe
<zmike>
I don't think they were near that before
<karolherbst>
yeah.. dunno
<karolherbst>
something something.. maybe the test changed
<Frogging101>
And I also have to ask the old "WWWD?" (What Would Windows Do)
<ishitatsuyuki>
it *might* be better with graphics pipeline libraries
<ishitatsuyuki>
or worse, it's not clear
<ishitatsuyuki>
I'd suggest testing this again once RADV has fast linking support
evadot has quit [Quit: leaving]
<orbea>
just curious, has anyone looked at the assembly generated by C vs rust? godbolt suggests that rust generates a lot more, would that have any performance impact for gpu drivers? https://godbolt.org/
<orbea>
maybe I'm not understanding something crucial
mclasen has joined #dri-devel
<ishitatsuyuki>
orbea: if you find a bottleneck you can always use the dirtiest hacks to optimize it
<ishitatsuyuki>
meanwhile, Rust has a faster hashmap implementation, a faster quicksort implementation and other general perf goodies
<orbea>
ah, interesting
<orbea>
i suppose the only way to see what impact it makes if any would be to try it and find out
<Frogging101>
ishitatsuyuki: you mean in its standard library?
<orbea>
not sure what the README means by embedded environments not having std
evadot has joined #dri-devel
<linkmauve>
lina, I’m currently working on a prototype display driver in Rust for a platform much simpler than yours, I haven’t touched DRM yet, just getting to feel it without a kernel for now, but the plan is to eventually make it into a proper DRM driver.
<linkmauve>
There is a never-merged fbdev driver from the 2.6.32 era written in C, but I’m not super happy with it.
khfeng has quit [Ping timeout: 480 seconds]
<Frogging101>
ishitatsuyuki: What are graphics pipeline libraries?
<Rayyan>
linkmauve: what's the platform?
bmodem has joined #dri-devel
<linkmauve>
Rayyan, the Nintendo GameCube/Wii.
<linkmauve>
Those two platforms are quite similar, and are minimally supported in mainline already.
<linkmauve>
This doesn’t include display at all atm.
<Rayyan>
nice!
<linkmauve>
In https://github.com/rust-wii/luma I currently have modesetting and scanout of the YUYV final buffer working and tested on hardware, and a WIP of the RGB→YUYV hardware copy.
<linkmauve>
The latter will be useful for DRM as most of userland expect proper RGB buffers.
<linkmauve>
Although the extremely limited framebuffer memory will be an issue for unmodified userland.
<Rayyan>
linkmauve: are you part of the "wii-linux-ngx" project?
<Rayyan>
or is that something else?
<linkmauve>
I am not, I have done minimal work on the Wii side of things so far, mostly focused on mainlining Wii U drivers for now.
sobkas has quit [Remote host closed the connection]
<linkmauve>
I originally wanted to target the Wii U GPU, a r600-derived AMD GPU, but the radeon driver is written expecting a PCI bus and I haven’t figured out how to abstract that away to make it support MMIO registers instead.
<Rayyan>
I remember that the Wii U had a "virtual Wii" mode
<linkmauve>
Rayyan, the Wii U contains almost all of the Wii hardware, but also many newer parts, including a fully modern (for 2012) AMD GPU.
<Frogging101>
can you even unlock the bootloader on that?
<linkmauve>
So for DRM purposes, GameCube and Wii will use one driver, and Wii U will use another (hopefully radeon in the future).
<linkmauve>
Frogging101, you can install a bootloader which will let you dual-boot the proprietary OS or Linux, yes.
<Frogging101>
that's neat
<Frogging101>
how?
<Rayyan>
linkmauve: does the Wii U have any dedicated hardware for running Wii things?
<linkmauve>
Rayyan, yes, it includes almost all of the hardware of the Wii, including its GPU.
<Rayyan>
like how the 3DS also includes an ARM7 core for DS and DSi backwards compatibility
<linkmauve>
Rayyan, actually ARM9, but yes. :)
<linkmauve>
ARM7 is the auxiliary DS CPU for GBA compatibility, used in GBA mode as well as for wifi and audio in DS games.
<linkmauve>
The CPU of the Wii U is an overclocked three-cores version of the Wii CPU with a few additional features, which all can be disabled to enter vWii mode.
<Rayyan>
unless I'm remembering this wrong, the 3ds has ARM7, ARM9 and ARM11
pochu has joined #dri-devel
<linkmauve>
Oh, perhaps it also kept the ARM7, and I’m remembering wrong.
nchery is now known as Guest353
nchery has joined #dri-devel
<Rayyan>
guess I need to consult gbatek again :P
<linkmauve>
:)
<MrCooper>
karolherbst: the tests shouldn't have changed, we use fixed commits of the test suites
<MrCooper>
the Fedora failure looks like Fedora updated to newer meson which no longer tolerates trying to set unknown options
<alyssa>
jekstrand: Maybe a silly question, but what *does* nir_intrinsic_discard do?
<MrCooper>
karolherbst: ah, unless some of those commits were changed, without bumping the corresponding image tags
<alyssa>
demote? terminate? whichever one is cheaper for the hardware? whichever one optimizations want?
<alyssa>
Is there a case for the last one?
<alyssa>
^use
<daniels>
karolherbst: if the tests are flaky and timing out in main, add them to skip, because that should never happen. if the tests are flaky only after updating the container deps, then it's either fix the thing that's making them flaky or also add them to skip
<alyssa>
Why not only have demote/terminate and have a compiler option for the backend preference?
<Frogging101>
ishitatsuyuki: And what is fast linking support?
<alyssa>
jekstrand: The real answer here is that none of the GL drivers implement terminate and given how well the lower_idiv yakshave is going, I don't want to take this one ;)
<jekstrand>
alyssa: radeonsi does
evadot has quit [Quit: leaving]
evadot has joined #dri-devel
<alyssa>
so it does
<karolherbst>
MrCooper: I updated all tags
<MrCooper>
yes, but a previous change which didn't might have affected the tests
<karolherbst>
correct
Votes78 has quit [Ping timeout: 480 seconds]
<karolherbst>
so I'll just add the lavapipe things to skip unless somebody else wants to figure out what's wrong there...
<karolherbst>
the virgl ones seem to be real issues though
<MrCooper>
suggested next steps: narrow down which specific tag bump(s) trigger the failure, then check if those jobs still pass with the commit which previously bumped those tags
bmodem has quit []
<karolherbst>
MrCooper: I am sure that one of those deqp update commits did it though
<karolherbst>
but I also don't want to wait 1-2 hours for each iteration just to find out "software update did it"
<karolherbst>
anyway.. "dEQP-GLES3.functional.draw_buffers_indexed.random.max_required_draw_buffers" fails on virgl
<karolherbst>
so does anybody want to look into that or shall I just add it to known fails?
oneforall2 has quit [Ping timeout: 480 seconds]
FireBurn has quit [Ping timeout: 480 seconds]
<MrCooper>
I'd say you can add it to known fails (assuming it consistently fails)
gawin has joined #dri-devel
Votes78 has joined #dri-devel
rkanwal has quit [Ping timeout: 480 seconds]
gawin has quit [Remote host closed the connection]
<zmike>
karolherbst: for the lavapipe timeouts just add skips for them
bmodem has joined #dri-devel
Votes78 has quit [Ping timeout: 480 seconds]
mbrost has joined #dri-devel
gouchi has joined #dri-devel
Votes78 has joined #dri-devel
idr has joined #dri-devel
devilhorns has quit [Read error: Connection reset by peer]
nchery is now known as Guest363
nchery has joined #dri-devel
Guest363 has quit [Read error: Connection reset by peer]
frieder has quit [Remote host closed the connection]
kts has quit [Ping timeout: 480 seconds]
saurabhg has quit [Ping timeout: 480 seconds]
bmodem has quit [Ping timeout: 480 seconds]
JohnnyonF has joined #dri-devel
JohnnyonFlame has quit [Ping timeout: 480 seconds]
jeeeun841 has quit []
jeeeun841 has joined #dri-devel
sobkas has joined #dri-devel
aravind has quit [Ping timeout: 480 seconds]
jeeeun841 has quit []
jeeeun841 has joined #dri-devel
nchery is now known as Guest366
nchery has joined #dri-devel
Guest366 has quit [Ping timeout: 480 seconds]
gawin has joined #dri-devel
vyivel has quit [Remote host closed the connection]
bl4ckb0ne has quit [Remote host closed the connection]
emersion has quit [Remote host closed the connection]
vyivel has joined #dri-devel
emersion has joined #dri-devel
bl4ckb0ne has joined #dri-devel
lynxeye has quit [Quit: Leaving.]
Major_Biscuit has quit [Ping timeout: 480 seconds]
Votes78 has quit [Ping timeout: 480 seconds]
Votes78 has joined #dri-devel
stuart has joined #dri-devel
alyssa has left #dri-devel [#dri-devel]
_Votes78 has joined #dri-devel
mclasen has quit []
Votes78 has quit [Ping timeout: 480 seconds]
alyssa has joined #dri-devel
<alyssa>
danvet: Is there any prior art in the kernel for multiple compilation?
<Ristovski>
whats multiple compilation?
<alyssa>
Makefile magic to compile the same C file N times with differing -DVERSION= parameter, with the file's entrypoints suffixed by VERSION, and a runtime check / jump table used for code outside the file
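[Editor's note: a minimal plain-GNU-make sketch of the idea alyssa describes (not kbuild syntax); the file, variable, and macro names are all made up for illustration.]
    # compile dcp_core.c once per firmware version, one object per version
    fw_versions := 12_3 13_5
    dcp-objs    := $(foreach v,$(fw_versions),dcp_core_v$(v).o)

    dcp_core_v%.o: dcp_core.c
    	$(CC) $(CFLAGS) -DDCP_FW_VER=$* -c $< -o $@

    # inside dcp_core.c the entrypoints pick up the suffix via token pasting:
    #   #define DCP_PASTE_(n, v) n##_v##v
    #   #define DCP_PASTE(n, v)  DCP_PASTE_(n, v)
    #   #define DCP_FN(n)        DCP_PASTE(n, DCP_FW_VER)
    #   int DCP_FN(dcp_start)(struct dcp *dcp);   /* -> dcp_start_v12_3() */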
gawin has quit [Ping timeout: 480 seconds]
<Ristovski>
ah, I see
<alyssa>
which is less heavy handed than copypasting for every version (the current state of the art in the kernel?)
<alyssa>
also less heavy handed than recompiling the entire driver for each version
<alyssa>
but lower overhead (and in some cases simpler) than run time checks within the file (the other state of the art in the kernel)
<alyssa>
and doesn't require binding a new language just for proc macros to do the same thing without build system changes.
<alyssa>
(and I am referring just to the makefile pieces, no buildtime code generation.)
<alyssa>
(or using a new language just for c++ templates)
<alyssa>
in mesa we compile for each major hardware version, for the kernel I'm thinking more about compiling per major firmware version but it's conceptually the same problem
<alyssa>
Wondering if other drivers already do this (for prior art for how to do the makefiles), or if this has been explicitly NAK'd (link to relevant discussion?), or just never seriously raised because it wasn't a problem until Apple came along and decided to completely break the firmware ABI every 3 minutes
<jannau>
alyssa: I can only find tools/testing/selftests/arm64/bti/Makefile
<jannau>
not a driver
<alyssa>
jannau: it's a start (:
<jannau>
are there indications that multiple compilation with different defines would be rejected in kernel?
<danvet>
alyssa, maybe also a preprocessing thing perhaps?
<alyssa>
jannau: the indication is "nobody has done it before ;)"
<danvet>
like cmd_mkregtable in the radeon Makefile
<alyssa>
Kernel people seem to hate new things ;)
<danvet>
but yeah no idea
Redfoxmoon_ has joined #dri-devel
<alyssa>
the cmd_mkregtable rules are too complicated for me to understand what you're saying here..
Redfoxmoon has quit [Read error: Connection reset by peer]
<danvet>
alyssa, ah I missed that you don't care about buildtime code generation
<danvet>
that exists plenty I think
<alyssa>
OK
<alyssa>
not finding any in grep of drivers/ but maybe I'm grepping badly :)
<danvet>
alyssa, I guess you'd need something which generates the actual makefile for all the variants or something, but I don't think that's supported really either
<alyssa>
or just copypaste the makefile rules?
<danvet>
alyssa, I guess the closest would be to keep the variant list in the makefile as a make variable
<alyssa>
like tools/testing/selftests/arm64/bti/Makefile
<danvet>
and then generate all the patterns in Make with substitution rules or so
<alyssa>
less obtuse than copypasting 1000 lines of driver code coupled to the firmware
<danvet>
and have some build-time generator thing which adjusts the source .c files for each variant or so
<danvet>
like this bti thing does
<alyssa>
"build-time generator thing which adjusts the source .c files for each variant or so"
<danvet>
yeah that should be doable with a agx_genlist = v1, v2, v3
<alyssa>
+1
<danvet>
and then some makefile substitution rules
<alyssa>
(the test case would be dcp which is differently pathological from agx, fwiw)
<airlied>
Ristovski: according to iris exposing those counters was a mistake
<danvet>
I've done some absolute horrors to generate test lists in igt before we switched over to meson this way too
<airlied>
Kayden: ^ you might have more info
<alyssa>
ack
<Ristovski>
airlied: As in, they don't exist on that HW or what?
<alyssa>
danvet: airlied: ^^ In principle is this something you're ok having in the subsystem? and is there any reason to expect torvalds/etc to object?
<alyssa>
(If they don't object to the size of the amdgpu headers .......)
<Ristovski>
amdgpu header sizes really are insane though :P
<danvet>
Ristovski, airlied what counters? (just missing context)
<alyssa>
Ristovski: now if someone would like to explain mesa/src/gallium/drivers/radeonsi/radeon_efc.h to me and why it's 1.5MB
<Ristovski>
danvet: perf counters (GL_AMD_performance_monitor (and INTEL_performance_query I guess)) on crocus
warpme___ has quit []
<danvet>
Ristovski, and it's a mistake to expose them, or to not expose them?
<danvet>
also dj-death for perf counter questions I think
<Ristovski>
Well airlie has a MR that syncs the crocus ones with the iris ones, which also gets rid of a lot of ACounters/NOAcounters ones (those were apparently a mistake I guess?)
<Ristovski>
I am not even sure what those are
<Ristovski>
danvet: This all stems from some of them not working or freezing the application when doing GALLIUM_HUD=some_counter_here with crocus btw, just for context
<idr>
There are a bunch of the counters in Iris that don't exist at all in the hardware supported by Crocus.
<Ristovski>
I guess a good question would be: Is there a doc somewhere that has a list of all the counters per GPU gen?
<idr>
Ristovski: dj-death will know better, but I thought the counters were controlled via the kernel, so it should know?
<idr>
If there's an authoritative list, I don't know of it. Either publicly available or otherwise.
<airlied>
9f19662550aad52f0609e228b9cb803366d98959 is the iris commit
<airlied>
that removed them
JohnnyonF has quit [Ping timeout: 480 seconds]
<Ristovski>
idr: afaik i915 also exposes _a lot_ more of them that are not present in the extensions (haven't tested INTEL_performance_query independently yet but apitrace/GALLIUM_HUD dont pick any up, only the AMD_performance_monitor ones)
<idr>
There are a huge number of counters, and there are a bunch of weird rules about which ones can be used together.
<idr>
It seems like the hardware was more designed to benefit the people making the hardware than to benefit people making software for the hardware. If that makes sense.
<Ristovski>
Yeah, they seem to be segmented into "groups", with a great amount of overlap for some reason
<alyssa>
idr: pikachu.jpg
jannau has quit [Remote host closed the connection]
<Ristovski>
But hmm, if those correspond to "ACounters/NOACounters" I don't see how they are "useless" as per the commit that removed them in iris, are they too debug-y?
<airlied>
they may need to be accessed in a different manner to be useful
<airlied>
lygstate: sroland is the expert, I know how to use them and what they are for
<airlied>
it's annoying they trash the cpu bits
<Ristovski>
I see. There are a number of groups that contain arbitrary performance counters. A counter can be present in several groups, and also absent from some group. If sampling more counters at once is expensive (that is, sampling a large group is more expensive than a small group), then what I assume you would want to do is to find the smallest group that contains the counters you want.
<Ristovski>
Maybe it is not expensive, and it's the bandwidth that's the bottleneck, that would make more sense actually
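[Editor's note: a small C sketch of the "smallest group that contains the counters you want" selection Ristovski describes; the struct layout and names are invented for illustration.]
    struct perf_group {
       const char *name;
       unsigned n_counters;
       const unsigned *counters;   /* counter ids exposed by this group */
    };

    static bool
    group_has_counter(const struct perf_group *g, unsigned id)
    {
       for (unsigned i = 0; i < g->n_counters; i++)
          if (g->counters[i] == id)
             return true;
       return false;
    }

    /* return the smallest group containing every wanted counter, or NULL */
    static const struct perf_group *
    pick_smallest_group(const struct perf_group *groups, unsigned n_groups,
                        const unsigned *wanted, unsigned n_wanted)
    {
       const struct perf_group *best = NULL;
       for (unsigned g = 0; g < n_groups; g++) {
          bool ok = true;
          for (unsigned w = 0; w < n_wanted && ok; w++)
             ok = group_has_counter(&groups[g], wanted[w]);
          if (ok && (!best || groups[g].n_counters < best->n_counters))
             best = &groups[g];
       }
       return best;
    }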
<Ristovski>
airlied: Given you select what counters you want with the OACONTROL register (and assuming crocus didn't do this?), reading from the ones not in the current group might have been what was causing the freezes I reported.
iive has joined #dri-devel
ngcortes has joined #dri-devel
<bylaws>
jekstrand: ahh I see, has stuff changed a lot since Maxwell?
<dj-death>
Ristovski: it's a bit more complicated than that
<dj-death>
Ristovski: A counters are hardcoded counters, they always report the same data
<dj-death>
Ristovski: B/C are configurable
<dj-death>
Ristovski: you route the signals you want to analyze to them
<Ristovski>
Indeed, I _just_ found that in the docs
<dj-death>
you're limited to the number of counters in what you can capture at a given time
<dj-death>
not sure what you goal is
<dj-death>
I heard of putting that in the gallium HUD
<dj-death>
I don't know that it's a good idea
<dj-death>
we have to put the system in a particular mode to sustain the B/C counters
<dj-death>
(aka. no powergating, no deep sleep modes)
<Ristovski>
dj-death: ACounters/NOACounters used to be in GALLIUM_HUD (and still are if using crocus with stable mesa), they freeze whatever app, that is what caused me to look into this
<dj-death>
ah yeah you mentioned a problem when pulling the data out of i915
<Ristovski>
Yeah, I was interested in profiling solutions, one of them was GALLIUM_HUD/AMD_performance_monitor/INTEL_performance_query, and then the i915 perf interface that is used by gputop
<dj-death>
there is also perfetto
<Ristovski>
Yeah I was about to look into that next.
<Ristovski>
I ran that with =1 as a test and it hung my GPU :)
Votes78 has joined #dri-devel
<Ristovski>
with =0 the app freezes, just like in the original report I gave. Note this could simply be crocus being broken since I am testing this with the old version (since the new one removed the ACounters)
_Votes78 has quit [Ping timeout: 480 seconds]
reductum has joined #dri-devel
oneforall2 has joined #dri-devel
<Ristovski>
dj-death: btw, I assume Perfetto has tagging only for Vulkan and not OpenGL?
gouchi has quit [Remote host closed the connection]
mdnavare_ has quit []
mdnavare has joined #dri-devel
fab has quit [Ping timeout: 480 seconds]
lemonzest has quit [Quit: WeeChat 3.5]
<Ristovski>
Oh actually u_trace is anv/iris only, so no support for crocus I assume
<airlied>
probably would need porting
rkanwal has joined #dri-devel
warpme___ has joined #dri-devel
cengiz_io has joined #dri-devel
<dj-death>
Ristovski: correct
<dj-death>
you can look at the iris commit
<dj-death>
it was not hard to implement if you feel like it
<Ristovski>
Yeah I might give it a try
<dj-death>
there is a listing of commands to execute in the docs/perfetto.rst
<dj-death>
should be pretty much a copy paste to get it to work
kts has joined #dri-devel
danvet has quit [Ping timeout: 480 seconds]
genpaku has quit [Remote host closed the connection]
<karolherbst>
maybe we should have a script/bot doing that every now and then... :D
<zmike>
:/
<karolherbst>
I could also remove the commits changing the tags.. but...
<zmike>
well at least we know now and it's not a total mystery
<karolherbst>
yeah...
<karolherbst>
but it would be nice if we got something which is regularly just bumping all tags and saying: check this out, stuff magically broke
<zmike>
yeah
<zmike>
maybe like a weekly thing on weekends
<karolherbst>
yeah.. or monthly
<karolherbst>
or something
<karolherbst>
or whenever it didn't run for weeks and CI is idling
<karolherbst>
anyway.. pushed an update to the MR
<karolherbst>
so let's see if my rusticl CI run is clean as well now
Major_Biscuit has joined #dri-devel
Jeremy_Rand_Talos_ has quit [Remote host closed the connection]
Jeremy_Rand_Talos_ has joined #dri-devel
<daniels>
karolherbst: your MR looks like the right thing to do. sergi has also been working on something for virglrenderer which regularly bumps the tag + deps and updates expectations as an automated part of that - when we can prove it out as part of virgl (since it's a smaller dev base with a more narrow impact) then it would make sense to push it into mesa too.
mvlad has quit [Remote host closed the connection]
apinheiro has joined #dri-devel
oneforall2 has quit [Remote host closed the connection]
oneforall2 has joined #dri-devel
rasterman has quit [Quit: Gettin' stinky!]
idr has quit [Remote host closed the connection]
idr has joined #dri-devel
idr has quit [Remote host closed the connection]
dolphin` has joined #dri-devel
dolphin is now known as Guest383
dolphin` is now known as dolphin
Ryback_[WORK] has quit [Read error: Connection reset by peer]
Ryback_ has joined #dri-devel
Guest383 has quit [Ping timeout: 480 seconds]
<zmike>
jekstrand: I did a spin of reworking discard
<zmike>
should handle all the cases from your dissertation
<karolherbst>
daniels: I meant rather bumping tags without changing anything, just to make sure we don't have hidden fails like that
iive has quit [Quit: They came for me...]
<karolherbst>
anyway.. my rusticl CI stuff is now clean and it also runs CI with llvmpipe :)
<airlied>
karolherbst: does clover still get covered? :-)
<karolherbst>
sure
<daniels>
karolherbst: yeah, that's more or less what the virgl thing does
<karolherbst>
after that we need to fix samplers+texture counts for iris and some other minor details and then it's ready for CL 3.0 conformance
<karolherbst>
and then we should deal with function calling at some point
<airlied>
I think we should deal with function calling before caring about conformance :-P
<airlied>
esp because opencl is a disaster zone in driver land because of inconsistent implementations that are conformant but useless for real apps
<karolherbst>
well it's useful for real apps
<karolherbst>
and those apps which didn't run, didn't run because of missing extensions
<airlied>
though I suppose darktable is a use case
<airlied>
even if CL is slower than cpus :-P
<karolherbst>
yeah, darktable runs perfectly well
<karolherbst>
it's not even slower with CL :D
<karolherbst>
but also not really faster
<bylaws>
jekstrand: Oh I see, presume I need >Pascal to contribute? If not I might have a look at implementing some of the missing features
<karolherbst>
what needs pascal?
nchery is now known as Guest384
nchery has joined #dri-devel
<bylaws>
Very familiar with the majority of engine level Maxwell stuff thanks to working on Switch emulation :)
<karolherbst>
yeah well.. pascal is mostly identical to maxwell
<bylaws>
karolherbst: I meant greater than pascal
<bylaws>
For the oss GPU driver
<bylaws>
Since afaik that doesn't work on pascal
nchery is now known as Guest385
nchery has joined #dri-devel
JohnnyonFlame has joined #dri-devel
<karolherbst>
huh?
<karolherbst>
oh you meant nvidias one?
<karolherbst>
yeah.. that's I think turing+ or something
Guest384 has quit [Ping timeout: 480 seconds]
rkanwal has quit [Ping timeout: 480 seconds]
nchery is now known as Guest386
nchery has joined #dri-devel
Guest385 has quit [Ping timeout: 480 seconds]
<bylaws>
Yeah... I assumed NVK used it cause I saw a 'remove libdrm' commit but apparently that's not the case?
<karolherbst>
correct
<karolherbst>
I just got rid of using libdrm
<karolherbst>
well.. it's still using libdrm, but not libdrm_nouveau
<jekstrand>
bylaws: Pascal should work in theory
<karolherbst>
maxwell as well...
<karolherbst>
just needs some changes on how to launch shaders
<karolherbst>
and uhm...
<karolherbst>
we need a heap for shaders
<karolherbst>
mhh, maybe not
<karolherbst>
jekstrand: thing is.. before volta each shader is launched with an offset into the program heap... not sure how well that would all work.. but I guess we just bind a new heap and set a new offset
<jekstrand>
Right now we assert >= VOLTA for shaders
<karolherbst>
though I suspect this can suck for performance or something
<jekstrand>
karolherbst: It's implementable, we just need the data structure. That's how Intel works
<karolherbst>
yep, shouldn't be too bad
<karolherbst>
definitely somebody with a pascal/maxwell GPU could look into it
<karolherbst>
kepler on the other hand will be more work
<bylaws>
Oh cool! Might try to get something setup then
Guest386 has quit [Ping timeout: 480 seconds]
<airlied>
karolherbst: what's the kepler hurdle?
<karolherbst>
airlied: sampler/texture state is different
<karolherbst>
maybe only samplers...
<karolherbst>
maxwell does support the kepler format though, so one might bring up maxwell and move over to kepler's format and make that work
<karolherbst>
and then figure out all the other details
<airlied>
the only other machine I have is a kepler
<karolherbst>
NOUVEAU_MAXWELL_TIC
<airlied>
I think my nv98 blew up
<karolherbst>
actually.. "screen->tic.maxwell" is the field
<karolherbst>
in case you are interested
<karolherbst>
and there are other subtle differences
<tarceri>
Frogging101: steam uses a new cache dir for each game, you can see them in for example ~/.steam/steam/steamapps/shadercache/ each located under the game's id
<karolherbst>
constant buffers seem to also work differently on maxwell+
<tarceri>
this is also where the foz dbs etc live
<karolherbst>
airlied: anyway... we might want to move development away from our "secret" repo or something? :D
<karolherbst>
not sure when it's a good time for that honestly
fxkamd has quit []
<karolherbst>
or did we want to figure out the new uapi first?
<Frogging101>
tarceri: Thanks. And do you know if using a non standard mesa build (tip of main or whatever) diminishes the ability of the pre-caching system at all? Are there objects that steam downloads that won't apply because I'm using a custom mesa version?
<Frogging101>
i.e. would I see more stuttering than a normal user would because of a non standard mesa build
<Frogging101>
I assumed the foz dbs were device and driver agnostic and the device/driver specific stuff for my system would be compiled during the replay step.
<airlied>
karolherbst: not sure there's value in moving it until we have some uapi in place
<karolherbst>
depends on how much we will be able to squeeze out of the current one...
<airlied>
or at least there'd be a lot more people trying to use something we know doesn't work if we move it
<airlied>
and the subsequent time spent explaining it to everyone individually
<karolherbst>
yeah...
<airlied>
not like it's niche hw
<Sachiel>
Frogging101: device/driver features can change how the pipelines are defined, so you can't make the foz dbs fully generic
<karolherbst>
I think we should know if we can make it work or not
<airlied>
or on an arm board you have to spend 6 months making a kernel work
<airlied>
karolherbst: I think we know it can't work on the current uapi
* Frogging101
does not like ARM for that reason
<airlied>
that was never in doubt
<karolherbst>
if we can make it work and it's ugly, but reliable and not a perf problem... it might be "good enough", but if it absolutely sucks and is terrible from every perspective then maybe better not :P
<karolherbst>
airlied: hold my beer :P
<karolherbst>
but yeah.....
<Frogging101>
Sachiel: So do they work at all for completely novel builds that devs would use?
<airlied>
karolherbst: I see the current work as more of a basis to write a better compiler with while we get uapi in parallel :-P
<karolherbst>
well.. none of the command buffer building will change
<Frogging101>
Since of course nobody else will have used a Mesa build with the same build timestamp as the one I build myself, so steam will not have any fossils for it
<airlied>
hopefully not, might be some bits we don't need to worry about like bo tracking
<bylaws>
Is this about newer GPU generations needing a UAPI change?
<karolherbst>
okay sure... but that's trivial :P
<karolherbst>
bylaws: no, more like the current uapi isn't very useful for vulkan
<karolherbst>
e.g. you already have to provide tiling information when allocating a bo and stuff...
<Sachiel>
Frogging101: well, the fossils can just check if the device and feature set matches and run the replay to warm up the actual cache mesa cares about. Don't know if it actually does that, it's been a while since I looked at it
<karolherbst>
not really sure how relevant that actually is though
<karolherbst>
but there are other issues, like how fencing is done and stuff
<Frogging101>
Sachiel: I grepped the fossilize source code to try and see what information it uses to decide whether it's compatible but I didn't have luck
<karolherbst>
and we want to support sync objects
<Frogging101>
It might be inside Steam, too. Since I doubt steam wants to download variants of the db that don't apply to my configuration
<Frogging101>
The bucket names have the hardware name and a hash. Not sure what the hash is derived from