<cwabbott>
the problem is that different stages contribute different pipeline_flags, and we can't combine them in vk_graphics_pipeline_state_merge() because we'd have to create a new render pass state object instead of doing a shallow copy, and it's just not set up to do that
<cwabbott>
I think we weren't sanitizing the flags as much before this so we didn't run into it
<cwabbott>
but it's a fundamental problem with how we're handling it
<cwabbott>
I think the only non-terrible solution is to move the pipeline_flags up into vk_graphics_pipeline_state
<cwabbott>
that means I'm gonna have to rewrite everything again :/
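A minimal sketch of the idea cwabbott is describing, assuming the flags simply become a field on vk_graphics_pipeline_state so that merging two library states is a plain OR rather than a rebuild of the render pass state; the field and helper names here are illustrative, not the actual Mesa code:

    #include <vulkan/vulkan.h>

    struct vk_graphics_pipeline_state {
       /* hypothetical: create flags hoisted out of the render pass state */
       VkPipelineCreateFlags pipeline_flags;
       /* ... the existing per-topic state objects (rp, ms, ...) stay as
        * shallow-copied pointers, which is all _merge() knows how to do */
    };

    static void
    merge_pipeline_flags(struct vk_graphics_pipeline_state *dst,
                         const struct vk_graphics_pipeline_state *src)
    {
       /* each library contributes its own flags; combining them is now an
        * OR instead of allocating a new render pass state object */
       dst->pipeline_flags |= src->pipeline_flags;
    }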
<dj-death>
hmm I see
<dj-death>
gfx libs again...
* cwabbott
hates GPL
<cwabbott>
another example of the axiom that there are no correct GPL implementations
<dj-death>
I don't even want to think about shader objects
<karolherbst>
shader objects are the best
<karolherbst>
at least nvk and nvidia should have valid and correct implementations for shader objects
<karolherbst>
easily
<cwabbott>
at least shader objects don't have any of this nonsense
<karolherbst>
shader objects map 1:1 to nvidia hardware (more or less)
<cwabbott>
you just pass create flags into the shader and that's it
<cwabbott>
no futzing around with copying state everywhere that's wrong half of the time
<dj-death>
yeah
<dj-death>
should have been called NV_shader_objects
<karolherbst>
well.. it makes life easier for a lot of programmers
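For contrast, a rough sketch of what cwabbott's "you just pass create flags into the shader" looks like with VK_EXT_shader_object; the wrapper below is made up for this example, and real code loads vkCreateShadersEXT through vkGetDeviceProcAddr:

    #include <vulkan/vulkan.h>

    static VkShaderEXT
    create_fs(VkDevice device, const void *spirv_words, size_t spirv_size)
    {
       VkShaderCreateInfoEXT info = {
          .sType = VK_STRUCTURE_TYPE_SHADER_CREATE_INFO_EXT,
          .flags = 0,            /* per-shader create flags, e.g.
                                    VK_SHADER_CREATE_LINK_STAGE_BIT_EXT */
          .stage = VK_SHADER_STAGE_FRAGMENT_BIT,
          .codeType = VK_SHADER_CODE_TYPE_SPIRV_EXT,
          .codeSize = spirv_size,
          .pCode = spirv_words,
          .pName = "main",
       };
       VkShaderEXT shader = VK_NULL_HANDLE;
       /* the flags live on the shader itself; there is no separate library
        * merge step where state from different stages has to be reconciled */
       vkCreateShadersEXT(device, 1, &info, NULL, &shader);
       return shader;
    }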
YuGiOhJCJ has quit [Remote host closed the connection]
<karolherbst>
mareko: yeah.. the UI is way smoother now with COMPUTE_ONLY set
<karolherbst>
though radv doesn't seem to use the compute queue for compute only vulkan contexts? Is this even a thing in vulkan?
<karolherbst>
or maybe zink doesn't do something properly?
<karolherbst>
though does nvidia even support resizable BAR on linux?
<zmike>
yes
<karolherbst>
how?
<karolherbst>
it sounds like I have to flash the vbios?
<zmike>
pretty sure it just works
<karolherbst>
mhh.. let me check my uefi then, because I'm mildly sure I've enabled it there
<zmike>
be pretty hard to do any gaming at all in the current year without it
<karolherbst>
yeah.. it's enabled in my uefi
<karolherbst>
well
aravind has quit [Ping timeout: 480 seconds]
<karolherbst>
maybe I need a newer GPU? I have no idea, on my Turing I get 256MiB out of the box and that's it
<karolherbst>
and there is this "NVIDIA Resizable BAR Firmware Update Tool" thing
<karolherbst>
though there seem to be patches for nvidia's driver to enable it as well...
<tnt>
lspci for my card shows Region 1: Memory at 7400000000 (64-bit, prefetchable) [size=16G]
Duke`` has quit [Ping timeout: 480 seconds]
<karolherbst>
what nvidia gpu is that, and what distribution?
<zmike>
I use a 2070 when I'm testing that
<karolherbst>
I bet you have a nvidia kernel module patch for it then
<karolherbst>
or maybe they added support for it and my driver is outdated..
Duke`` has joined #dri-devel
<karolherbst>
but anyway "nvidia-smi -q | grep -i bar -A 3" reports 256 MiB on my Quadro 6000
<tnt>
karolherbst: That's a 4070 running ubuntu 22.04 with 535.(104?) drivers.
<karolherbst>
I'm on 535.104.05
<tnt>
But you might be right that your card needs a vbios update.
<karolherbst>
it's one of the early turing ones, so yeah...
<karolherbst>
anyway... having to rely on rebar might not be feasible here.. dunno
<karolherbst>
but I suspect the vbios is just toggling some bit given that there is a module patch for it
<tnt>
There is also some reference to a "compute mode"/"graphics mode", the former having an 8G BAR and the latter 256M.
<karolherbst>
mhh.. interesting
<zmike>
things should work, albeit suboptimally, without rebar, but it's definitely not a preferred mode of operation
<karolherbst>
well.. 256 MiB isn't enough and zink fails to allocate memory at some point
<zmike>
yeah and then it'll do a fallback to another heap
<karolherbst>
but there isn't any compatible one
<zmike>
compatibility is relative
<karolherbst>
buffer allocations need HOST_VISIBLE and DEVICE_LOCAL and it doesn't look like it falls back to anything.. maybe I should debug a bit more, but it does look like it's just doing nonsense after that and crashes the GPU
<zmike>
ctrl+f /* demote BAR allocations to a different heap on failure to avoid oom */
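A rough illustration of the fallback zmike is pointing at, not zink's actual code: prefer a BAR memory type (DEVICE_LOCAL | HOST_VISIBLE) and, once the small BAR heap is exhausted, retry from a plain HOST_VISIBLE system-memory type:

    #include <vulkan/vulkan.h>

    static VkResult
    alloc_host_visible(VkDevice dev,
                       const VkPhysicalDeviceMemoryProperties *props,
                       VkMemoryAllocateInfo *info, uint32_t type_bits,
                       VkDeviceMemory *mem)
    {
       const VkMemoryPropertyFlags bar = VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT |
                                         VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT;
       /* pass 0: BAR (device-local + mappable), pass 1: plain system memory */
       const VkMemoryPropertyFlags wanted[] = {
          bar, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT,
       };

       for (unsigned pass = 0; pass < 2; pass++) {
          for (uint32_t i = 0; i < props->memoryTypeCount; i++) {
             VkMemoryPropertyFlags f = props->memoryTypes[i].propertyFlags;
             if (!(type_bits & (1u << i)) || (f & wanted[pass]) != wanted[pass])
                continue;
             /* on the second pass, skip the BAR types we already tried */
             if (pass == 1 && (f & VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT))
                continue;
             info->memoryTypeIndex = i;
             if (vkAllocateMemory(dev, info, NULL, mem) == VK_SUCCESS)
                return VK_SUCCESS;
          }
       }
       return VK_ERROR_OUT_OF_DEVICE_MEMORY;
    }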
<karolherbst>
ohh indeed.. still crashes the GPU though that might be something else going wrong
<karolherbst>
mhh.. the vvl doesn't complain... let me run the CTS
masteratwork has joined #dri-devel
Daanct12 has quit [Quit: WeeChat 4.0.5]
<masteratwork>
I remember a few details about a thing called a binary buddy allocator. In theory it's somehow possible to put both the stack and the heap at pc-relative locations in the kernel and have the kernel's allocator defragment things, but I don't remember the details well. Sortix claims to do that, though I know sbrk and mmap aren't very good in this regard. Of course I could be mistaken; the binary buddy allocator was for win95 and seemed to work great there.
flto has joined #dri-devel
Daanct12 has joined #dri-devel
<masteratwork>
Maybe in the case of the heap it wouldn't make too much sense, because in the loops of memory-intensive apps the heap can be larger than the code, and it's the programmer's responsibility to free it
<masteratwork>
but if those aren't possible then the only thing to do is to index the memory in a compressed format, and that is quite rough
Danct12 has quit [Read error: Connection reset by peer]
<masteratwork>
I mean not the heap overall; the issue is likely with global variables, so that would make more sense, since those can be pc-relative
<masteratwork>
whatever section they live in
<masteratwork>
all the apps' heap allocations would be ping-ponged through global variables or TLS or something like that via the stack
Daanct12 has quit [Ping timeout: 480 seconds]
<masteratwork>
so it's not like I made a huge blooper in my compression theories; it's just that I may still lack some of the kernel skills to do it more easily
<cwabbott>
dj-death: just pushed a fixed version
<cwabbott>
now with this turnip actually passes all the tests again
<cwabbott>
as usual, past me was an idiot
<cwabbott>
dj-death: fyi, because of the reworks rebasing is not going to be trivial
<cwabbott>
also I didn't build-test anv so I'm testing it in CI now
<dj-death>
cwabbott: I'm doing it right now
<dj-death>
it's not too bad
<cwabbott>
one small thing is that anv had a few places passing around the render pass state just to get at the pipeline_flags
<cwabbott>
so they passed around render pass state, multisample state, etc.
<cwabbott>
and now they pass around all those other states... plus the vk_graphics_pipeline_state struct that contains all of them
<cwabbott>
you could collapse all of those arguments down to one now that we have to pass around the overall state anyway, but I just went for the minimal change that replaced state->rp with state
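To illustrate the minimal change cwabbott went for, with a hypothetical anv-style helper (emit_foo_* and its exact parameters are made up for this sketch):

    /* before (hypothetical helper): rp was threaded through purely so the
     * callee could read the pipeline flags */
    void emit_foo_old(struct anv_graphics_pipeline *pipeline,
                      const struct vk_render_pass_state *rp,
                      const struct vk_multisample_state *ms);

    /* after the rework: replace state->rp with state; since the struct
     * contains ms and friends too, the other arguments could be collapsed
     * into it as well */
    void emit_foo_new(struct anv_graphics_pipeline *pipeline,
                      const struct vk_graphics_pipeline_state *state,
                      const struct vk_multisample_state *ms);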
<dj-death>
yeah
Adrinael has joined #dri-devel
Adrinael is now known as Guest1666
alyssa has joined #dri-devel
<alyssa>
cwabbott: hard disagree
<alyssa>
past you wrote a state-of-the-art RA that present me is still digesting, definitely non-idiot
pcercuei has quit [Quit: leaving]
<cwabbott>
dj-death: ugh, one last bug that I needed to fix so I pushed again
<cwabbott>
forgot that other drivers merge libraries first so I have to OR in the flags in vk_pipeline_flags_init
<cwabbott>
anv pipeline libraries will probably blow up without that
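The last fix, sketched with a guessed body for vk_pipeline_flags_init; only the function name comes from the chat, the signature and body are illustrative:

    static void
    vk_pipeline_flags_init(struct vk_graphics_pipeline_state *state,
                           VkPipelineCreateFlags flags)
    {
       /* when libraries are merged before this runs, state already carries
        * the flags those libraries contributed; a plain assignment would
        * silently drop them, so OR the new ones in instead */
       state->pipeline_flags |= flags;
    }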
<alyssa>
airlied: so what happens with fedora 39 given the current mesa + llvm 17 lulz?
<karolherbst>
I probably also should look into llvm 17.. pain
<alyssa>
but the fact that 23827 isn't merged yet tells me fedora 39 should be fine with building against llvm16 lol
<karolherbst>
fedora does indeed provide different llvm runtimes, it's just that `llvm-devel` is always the newest
<karolherbst>
sooo.. let's take a look at this llvm-17 mess
<karolherbst>
alyssa: I'll probably try to make that CL stuff work on llvm-17 now, not sure how long it will take, but let me try to figure things out by next week
<karolherbst>
though I think we already had fixes for most in place
<karolherbst>
mhhh
<alyssa>
dcbaker: adding script with multiple outputs
<alyssa>
i know you don't like that but not as much as everyone won't like what it's for (:
<dcbaker>
alyssa: I'm cool with a script with multiple outputs
<alyssa>
oh I thought that was treif?
<dcbaker>
I'm always the one shooting down "gen_thing_c.py" + "gen_thing_h.py" and arguing that we should have "gen_thing.py" that generates both
<alyssa>
ahhhh
<alyssa>
this is an extremely spicy intel_clc spin-off
<dcbaker>
oh boy, how could intel_clc get any spicier
<alyssa>
I heard you like C, so I'm writing C to generate C from your C for your C
fab has quit [Quit: fab]
<alyssa>
magic that makes arbitraryish OpenCL functions available as nir_builder with no bindings
<dcbaker>
Xzibit approves this message
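A very loose sketch of what "OpenCL functions available as nir_builder with no bindings" could look like from the consuming driver's side; every name and signature below is illustrative, not what the actual generator emits:

    #include "nir.h"
    #include "nir_builder.h"

    /* hypothetical generated prototype: foo(x, y) from foo.cl is compiled to
     * NIR at build time and exposed as an ordinary builder helper */
    nir_def *nir_cl_foo(nir_builder *b, nir_def *x, nir_def *y);

    /* a lowering pass could then emit it like any hand-written helper */
    static nir_def *
    lower_with_cl_foo(nir_builder *b, nir_def *src0, nir_def *src1)
    {
       return nir_cl_foo(b, src0, src1);
    }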
<dcbaker>
the existing CLC stuff is annoying, as is getting people to review the meson bits I'm trying to add to make it less annoying
<alyssa>
cc me on anything you want reviewed
<dcbaker>
I should rephrase that "new Meson features I'm trying to add to Meson itself to make clc less annoying"
<dcbaker>
which I need to make dependency(..., native : 'both') work correctly
ngcortes has quit [Ping timeout: 480 seconds]
<alyssa>
oh
<cmarcelo>
dcbaker: oh, I thought we had issues with multiple outputs? probably can fix this for glsl_type stuff (we have three outputs, so right now three separate scripts).
<dcbaker>
only if we use capture : true, which we shouldn't be using because it makes everything slow on Windows
<cmarcelo>
yeah. I've moved off capture, so maybe we can just merge them. will take a note of that.
<dcbaker>
Well, slow on Linux too, but really slow on Windows because forking is so expensive
<dcbaker>
I would review that change
<alyssa>
how does capture with 2 outputs work..?
<dcbaker>
it doesn't :)
<alyssa>
no perf issue then!
<alyssa>
:P
<dcbaker>
lol
<cmarcelo>
well.... one for stderr and the other for stdout. :-D
<dcbaker>
capture is just slow in general because Meson has to wrap a custom_target that uses feed or capture inside a meson script
<dcbaker>
so you get an extra fork
<alyssa>
ah
<cmarcelo>
not the case for meson, but I think you could open fds and let the child inherit them, so you're not touching stdout or stderr.
<cmarcelo>
(but yeah, unrelated to the wrapper issue)
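cmarcelo's fd-inheritance idea in a generic, Meson-agnostic sketch: the parent opens both output files itself and hands the descriptors to the child, so nothing has to capture stdout; the gen_thing tool and its --c-fd/--h-fd options are made up for this example:

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/wait.h>

    int main(void)
    {
       /* parent opens the two outputs; the fds stay open across exec */
       int out_c = open("thing.c", O_CREAT | O_WRONLY | O_TRUNC, 0644);
       int out_h = open("thing.h", O_CREAT | O_WRONLY | O_TRUNC, 0644);

       pid_t pid = fork();
       if (pid == 0) {
          /* child: pass the inherited fd numbers to the (hypothetical)
           * generator instead of having it write to stdout */
          char arg_c[16], arg_h[16];
          snprintf(arg_c, sizeof(arg_c), "%d", out_c);
          snprintf(arg_h, sizeof(arg_h), "%d", out_h);
          execlp("gen_thing", "gen_thing", "--c-fd", arg_c, "--h-fd", arg_h,
                 (char *)NULL);
          _exit(127);
       }

       int status;
       waitpid(pid, &status, 0);
       close(out_c);
       close(out_h);
       return WIFEXITED(status) ? WEXITSTATUS(status) : 1;
    }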