<javierm>
danvet: give me a few minutes to page all this in my head again. It's been some time since I wrote the email you answered and I no longer remember the details :)
<danvet>
yeah I'm doing the same right now
JohnnyonFlame has joined #dri-devel
pcercuei has joined #dri-devel
JohnnyonF has quit [Ping timeout: 480 seconds]
<javierm>
danvet: Ok, I think I've remembered the details. I'll answer on the list
<danvet>
mripard[m], mlankhorst_ pls don't forget to roll branches forward so fixes don't get lost and drm-misc-next is in linux-next
Danct12 has quit [Remote host closed the connection]
guru_ has joined #dri-devel
rasterman has joined #dri-devel
oneforall2 has quit [Ping timeout: 480 seconds]
rasterman has quit [Quit: Gettin' stinky!]
<javierm>
danvet: done, let me know what you think
rasterman has joined #dri-devel
<danvet>
javierm, makes sense and I think a fix is fairly simple
<javierm>
danvet: yup, I'll get another coffee and then try to write a patch
<danvet>
javierm, ah if you want to get typing I'm fine with reviewing :-)
<javierm>
danvet: great :)
<danvet>
javierm, btw have you seen the patch from tzimmermann for offb?
<danvet>
looks like that one is missing out on the sysfb fun
<javierm>
danvet: yeah, I had it in my TODO but needed to refresh my memory in order to look at it. Now that I have it all in cache I should do it
<danvet>
javierm, oh for encapsulation probably needs a sysfb_try_unregister which takes the struct device
<danvet>
and returns false if it's not the sysfb platform_dev
<javierm>
danvet: but should it be specific to sysfb?
<javierm>
danvet: for instance drivers/video/fbdev/vga16fb.c registers its own pdev
<javierm>
in its module_init() handler
<javierm>
but it also sets info->flags |= FBINFO_MISC_FIRMWARE
<javierm>
danvet: I tried to minimize that problem with 0499f419b76f ("video: vga16fb: Only probe for EGA and VGA 16 color graphic cards")
<javierm>
but there may be other drivers that do the same
<javierm>
that's why I think we need some global state in fbdev core that says "a DRM driver already probed, don't allow generic fbdev drivers to be registered anymore"
<javierm>
danvet: actually, it's more complicated than that... because you could for example probe a DRM driver for a small display but still want to allow FBINFO_MISC_FIRMWARE for your main display controller
<javierm>
anyways, I'll answer in the list to avoid having this convo in two places
rasterman has quit [Read error: No route to host]
rasterman has joined #dri-devel
HankB_ has quit [Remote host closed the connection]
HankB_ has joined #dri-devel
HankB_ has quit [Remote host closed the connection]
oneforall2 has joined #dri-devel
HankB_ has joined #dri-devel
rkanwal has joined #dri-devel
guru_ has quit [Ping timeout: 480 seconds]
guru_ has joined #dri-devel
elongbug has joined #dri-devel
devilhorns has joined #dri-devel
oneforall2 has quit [Ping timeout: 480 seconds]
camus has quit [Ping timeout: 480 seconds]
<danvet>
javierm, yeah we need the state to be in each platform sysfb
<danvet>
so if vga16 has another one, then that also needs handling in there
maxzor has quit [Ping timeout: 480 seconds]
<danvet>
javierm, anyway replied too, and added gregkh for more opinions
<danvet>
javierm, I think I'll resend my series without those last two patches to get them unblocked
mclasen has joined #dri-devel
rkanwal has quit [Quit: rkanwal]
heat has joined #dri-devel
maxzor has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
shashank_s has joined #dri-devel
shashank_sharma has quit [Ping timeout: 480 seconds]
slattann has quit []
adjtm has quit [Quit: Leaving]
thellstrom has joined #dri-devel
adjtm has joined #dri-devel
rkanwal has joined #dri-devel
maxzor has quit [Ping timeout: 480 seconds]
rkanwal has quit [Ping timeout: 480 seconds]
itoral has quit [Remote host closed the connection]
<javierm>
danvet: sorry, got dragged into meetings. Yes, re-sending without those two makes sense to me and we can continue the discussion
Danct12 has joined #dri-devel
nchery has quit [Read error: Connection reset by peer]
shankaru has quit [Quit: Leaving.]
rkanwal has joined #dri-devel
rasterman- has joined #dri-devel
thellstrom has quit [Read error: Connection reset by peer]
thellstrom has joined #dri-devel
rasterman has quit [Ping timeout: 480 seconds]
crabbedhaloablut has joined #dri-devel
thellstrom has quit [Ping timeout: 480 seconds]
loki_val has quit [Ping timeout: 480 seconds]
icecream95 has quit [Ping timeout: 480 seconds]
maxzor has joined #dri-devel
loki_val has joined #dri-devel
crabbedhaloablut has quit [Ping timeout: 480 seconds]
lemonzest has joined #dri-devel
adjtm is now known as Guest1257
adjtm has joined #dri-devel
sdutt has joined #dri-devel
sdutt has quit []
sdutt has joined #dri-devel
Guest1257 has quit [Ping timeout: 480 seconds]
<javierm>
danvet: added another option, this all feels as if the aperture and conflicting framebuffers removal is really a workaround and all this should be moved to the device model core
<danvet>
javierm, yes
<javierm>
that is, expect all DRM drivers to call request_mem_region(), with an option to force it and unregister any device that may have already requested an overlapping aperture
<danvet>
well request_mem_region is another thing
<danvet>
in theory it should be used
<danvet>
in practice the times of jumper controlled IO decoding are long past us, so there's no real incentive for drivers to use it
<javierm>
danvet: yup, but if drivers used it, it could be a good indication of conflicting devices/drivers
<danvet>
but yeah maybe we could integrate it a bit better with request_mem_region
<danvet>
javierm, otherwise the removal flow is probably going to stay specific to graphics, due to the fw fb handover games
<danvet>
there's some of that also going on for input devices, but that's all very platform specific
<javierm>
danvet: yeah
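The "overlapping aperture" idea in this exchange comes down to an interval-intersection test between the range a new driver wants and the ranges already claimed. A minimal sketch in Python, assuming half-open [base, base+size) ranges; the function names and the example addresses are made up for illustration and are not kernel APIs:

```python
def apertures_overlap(base_a, size_a, base_b, size_b):
    """True if [base_a, base_a+size_a) intersects [base_b, base_b+size_b)."""
    return base_a < base_b + size_b and base_b < base_a + size_a


def find_conflicts(new_base, new_size, claimed):
    """Return the owners in `claimed` whose range overlaps the new request.

    `claimed` maps an owner name to a (base, size) tuple. Both the mapping
    and the owner names are hypothetical stand-ins for registered devices.
    """
    return [owner for owner, (base, size) in claimed.items()
            if apertures_overlap(new_base, new_size, base, size)]
```

A "force" option as described above would then unregister every owner returned by `find_conflicts` before claiming the range.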
alyssa has joined #dri-devel
<alyssa>
building radeonsi for the first time
<alyssa>
not a good feel
<javierm>
danvet: I also had a question about offb, I don't think we can do it in sysfb since it doesn't even use a pdev. It's not even a proper driver since it doesn't use the device model
<javierm>
danvet: I wonder if instead of Thomas' patch we should convert this to register a pdev driver and a pdev, and have a proper .probe
OftenTimeConsuming has quit [Remote host closed the connection]
<javierm>
because his assumption that all fbdev are backed by a device sounds reasonable
OftenTimeConsuming has joined #dri-devel
<danvet>
javierm, yeah longer-term, but as a quick regression fix the patch is more well contained
iive has joined #dri-devel
<javierm>
danvet: yeah, agree
<javierm>
danvet: every day I'm more happy that we are disabling all fbdev drivers since Fedora 36
<jekstrand>
hakzsam: Oh, I can keep going. I didn't have it fully passing yet. I didn't know if you thought I *should* keep going given that you seemed to already have it working.
<hakzsam>
I still have my old branch somewhere but rebasing will be annoying, I can do it if you want
rasterman- has quit []
rasterman has joined #dri-devel
kchibisov_ has quit []
kchibisov has joined #dri-devel
mbrost has joined #dri-devel
mlankhorst_ is now known as mlankhorst
<mlankhorst>
danvet: can we find a new maintainer for drm-misc temporarily? I can do this cycle but expecting to be pretty busy next 6 months
bcheng has quit [Remote host closed the connection]
bcheng has joined #dri-devel
aravind has quit [Ping timeout: 480 seconds]
<Venemo>
alyssa: I'd be interested in the "Explain why" part of your MR 15720
idr has joined #dri-devel
thellstrom has joined #dri-devel
<alyssa>
Venemo: right, will write that now
<alyssa>
It's not interesting to ACO fwiw
<Venemo>
maybe so, but it still looks interesting
<alyssa>
It doesn't work if you support geom/tess
<alyssa>
which we don't
mszyprow has quit [Ping timeout: 480 seconds]
<Venemo>
that's okay, it still seems like a neat idea
* alyssa
types
<alyssa>
TL;DR Mali doesn't have any hw support for transform feedback, I thought it did but it was totally wrong
<alyssa>
Generically lowering to a compute shader works well enough and then the rest of the driver can forget about it
<alyssa>
Possibly ditto for AGX? not sure what the geom/tess shader story is there
<Venemo>
don't you still have to output stuff to the rasterizer too, though?
<alyssa>
yes
<alyssa>
so there's a perf cost, but.. sort of inevitable
<alyssa>
well
<Venemo>
so, you run both a CS and a VS?
<alyssa>
under that MR, yes
<alyssa>
I am not sure that's the right way forward
<alyssa>
but if we want a correct implementation for desktop GL, i don't think there's another choice
<Venemo>
does the VS re-compute the output, or does it just read from the XFB buffer?
<alyssa>
recompute on that MR, might be worth optimizing it
<Venemo>
why not just keep the VS intact and store to the XFB buffer from the VS?
HankB_ has quit [Remote host closed the connection]
HankB_ has joined #dri-devel
shashanks has joined #dri-devel
bcheng has quit [Remote host closed the connection]
bcheng has joined #dri-devel
shashank_s has quit [Ping timeout: 480 seconds]
<idr>
Venemo: If you don't have stream out hardware, getting things ordered correctly is probably somewhere between hard and impossible.
<idr>
We thought about doing that on some older hardware that didn't have stream out, and we gave up on the idea pretty quickly.
<idr>
That was years ago, and I don't remember all the details.
<Venemo>
idr: well, don't you have to still solve the same problem for the CS, though?
<idr>
Yes, but getting ordering between CS workers is much easier.
<Venemo>
why?
<idr>
In CS, each worker knows who it is. In VS, you don't know who you are.
<idr>
At least, VS invocations couldn't know that when we were trying to do it.
shankaru has quit [Quit: Leaving.]
<Venemo>
idr: I guess that depends on the HW capabilities
<idr>
From a CS, I suspect it would also be easier to structure it so that each worker processes an entire primitive. I think that could make some things a little easier.
<idr>
But I haven't looked at the MR.
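The ordering argument can be illustrated with a toy model: when each compute worker knows its own index and handles one whole primitive, the output slot for every vertex is a pure function of that index, so the result does not depend on execution order. A rough sketch in Python, assuming a non-indexed triangle list only (indexed draws and strips/fans are exactly the cases noted later as not working yet); `shade` and both function names are hypothetical:

```python
def xfb_byte_offset(prim_id, vert_in_prim, verts_per_prim, stride):
    """Byte offset in the XFB buffer for one vertex of one primitive."""
    return (prim_id * verts_per_prim + vert_in_prim) * stride


def run_xfb_as_compute(num_prims, shade, verts_per_prim=3):
    """Simulate one worker per primitive writing its captured vertices.

    Workers are run in reverse on purpose: because every slot is derived
    from the worker's own id, the output order is still correct.
    """
    out = [None] * (num_prims * verts_per_prim)
    for prim_id in reversed(range(num_prims)):
        for v in range(verts_per_prim):
            vertex_id = prim_id * verts_per_prim + v
            out[vertex_id] = shade(vertex_id)
    return out
```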
<Venemo>
for sure, API VS doesn't do this, but if the HW is capable, you can lower your XFB VS to use whatever the HW can do
Haaninjo has joined #dri-devel
<alyssa>
idr: basically that
rkanwal has quit [Remote host closed the connection]
<alyssa>
right now i'm just trying to replicate what panfrost had before
rkanwal has joined #dri-devel
<alyssa>
oh ffs
pcercuei has quit [Quit: Lost terminal]
<alyssa>
look I don't know what to tell you guys
<alyssa>
I just want to land mali-g57 support
pcercuei has joined #dri-devel
<alyssa>
panfrost's xfb implementation is totally broken, g57 won't be able to use it, might as well rewrite to be somewhat less broken
<alyssa>
the CS approach is strictly less broken and opens the door to fixing harder
<alyssa>
but it's still broken... indexed draws, strips/fans/quads/polygons, some details about bounds checking, this stuff still doesn't work
<alyssa>
but at least with CS there's a way to make it work, with the nonsense "just alias the XFB buffer with an internal varying buffer" approach there's not
<alyssa>
it won't work for GS/TS, guess what, the hw doesn't support those
<ajax>
even though x11_swapchain has a vk_object_base as its first member, so, what am i missing?
<alyssa>
..and I need Midgard compiler changes. joy.
<alyssa>
so much for no hw assumptions
<alyssa>
wait maybe I don't
Duke`` has joined #dri-devel
gawin has quit [Ping timeout: 480 seconds]
maxzor has quit [Ping timeout: 480 seconds]
<Venemo>
alyssa: thanks for your thoughts, I didn't mean to frustrate you, was merely curious
<Venemo>
alyssa: if you can solve the ordering problem in CS but it's not solvable in VS, then for sure this approach does make sense for those drivers
nchery has joined #dri-devel
bcheng has quit [Remote host closed the connection]
naveenk2 has quit []
bcheng has joined #dri-devel
thellstrom1 has joined #dri-devel
thellstrom has quit [Read error: Connection reset by peer]
shankaru has joined #dri-devel
libv has quit [Remote host closed the connection]
mclasen has quit []
mclasen has joined #dri-devel
<jekstrand>
Ugh... C's undefined behavior rules are making it nearly impossible to write ubsan-safe tests for timespec_add with overflow detection. :-(
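The usual way around this class of problem is to compare against the limits before adding, so the overflowing addition never executes. A sketch of that pattern in Python, with the int64 bounds written out explicitly since Python integers do not overflow; this is an illustration of the check ordering only, not Mesa's timespec code, and the name, signature, and saturating return value are assumptions:

```python
INT64_MAX = 2**63 - 1
INT64_MIN = -2**63
NSEC_PER_SEC = 1_000_000_000


def timespec_add_nsec(sec, nsec, delta_ns):
    """Add delta_ns to (sec, nsec), with nsec assumed in [0, NSEC_PER_SEC).

    Returns (overflowed, sec, nsec). On overflow the result saturates.
    """
    # Floor division keeps add_nsec in [0, NSEC_PER_SEC) even for negatives.
    add_sec, add_nsec = divmod(delta_ns, NSEC_PER_SEC)
    nsec += add_nsec
    if nsec >= NSEC_PER_SEC:
        nsec -= NSEC_PER_SEC
        add_sec += 1
    # Test against the bounds *before* adding; in C this is what keeps the
    # signed addition from ever overflowing.
    if add_sec > 0 and sec > INT64_MAX - add_sec:
        return (True, INT64_MAX, NSEC_PER_SEC - 1)
    if add_sec < 0 and sec < INT64_MIN - add_sec:
        return (True, INT64_MIN, 0)
    return (False, sec + add_sec, nsec)
```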
<alyssa>
:|
heat has quit [Remote host closed the connection]
heat has joined #dri-devel
<cheako>
I'm testing flatpak today, does anyone know how to work with `org.freedesktop.Platform.GL.default`? Seems like an embedded copy of mesa, if true I'll need to know how to replace it.
<anholt_>
airlied: as mentioned in #crocus: g41 is back, hsw is in a box somewhere and I haven't found it. also, i915g is back too
<jekstrand>
karolherbst: One line. :)
<jekstrand>
karolherbst: But now it fails. I'll debug that next
<anholt_>
nouveau CI is waiting on geothermal workers to be out of the area, since it's a bit more work to set up. also, nouveau CI is still a disaster so it's not super useful.
<karolherbst>
jekstrand: wouldn't be surprised if iris assumes texture == sampler
<jekstrand>
karolherbst: It doesn't
<karolherbst>
ahh, cool
<jekstrand>
karolherbst: It is telling me it has no samplers and no binding tables. :-/
<karolherbst>
yep
<karolherbst>
it's txf alright
<karolherbst>
ehh wait, it's txs... mhh
<jekstrand>
Kayden: Is there some reason we're not setting BindingTableEntryCount in GPGPU_WALKER->INTERFACE_DESCRIPTOR_DATA?
<gawin>
anholt_: do I need something to trigger a run? (I don't have haswell, so it's gonna be helpful)
<anholt_>
you click the play button in the pipeline.
<karolherbst>
jekstrand: mhh weird.. I am sure luxmark works, so not sure if that test is doing something edgy
<karolherbst>
although that's just write images afaik
<karolherbst>
if at all
nchery has joined #dri-devel
<karolherbst>
jekstrand: btw, I am considering moving towards iris for regression testing
<karolherbst>
weird..
<karolherbst>
./build/test_conformance/images/kernel_read_write/test_image_streams passes here on iris as well
<karolherbst>
well
gouchi has quit [Remote host closed the connection]
<karolherbst>
partly
<karolherbst>
mirroring is broken as it is on llvmpipe
<karolherbst>
or something else
<jekstrand>
I'm sure it's something dumb
MajorBiscuit has quit [Ping timeout: 480 seconds]
<jekstrand>
zmike: Ok, CI is finally happy with the lavapipe MR. Changes since you reviewed: 1) Fixed ubsan issues with the timespec_add_nsec tests. 2) vk_sync_wait_many() for wait semaphores in lvp_queue_submit. 3) Delete some fails for lavapipe-asan test run because they no longer fail.
<zmike>
I like the sound of all these things
shankaru has quit []
<jekstrand>
zmike: Want to look again or should we assign marge?
<jekstrand>
We really should fix the asan issues one of these days. Most of them seem to just be memory leaks.
<zmike>
jekstrand: did you do a full cts run?
<jekstrand>
zmike: No, I can, though.
<zmike>
please do since otherwise I'll be doing it here
<gawin>
stupid me, there's something like "Job dependencies"
<jekstrand>
anholt_: dang!
<airlied>
anholt_: will look at it now
<anholt_>
thanks!
<jekstrand>
zmike: Running. Time to go turn my thermostat down to counter-act the space heater on my desk.
<zmike>
just accept the warmth into your soul
<jekstrand>
zmike: My soul has to be kept below -10C at all times.
<zmike>
your electric bill must be hefty
<jekstrand>
With all these lavapipe runs? Yeah, probably. :P
* airlied
tries to only do lvp runs when solar is going :-P
<jekstrand>
karolherbst: Surfaces and bindings look right. More digging!
<karolherbst>
jekstrand: there are quite a few regressions with iris, mhh, but I fear that some/most are due to USE_HOST_PTR :/
<jekstrand>
karolherbst: USE_HOST_PTR?
<karolherbst>
jekstrand: userptr
<karolherbst>
essentially
<karolherbst>
just crappy and stupid
<karolherbst>
soo CL allows you to back up cl_mem with a given pointer, but there are literally no requirements the client has to make sure of
<karolherbst>
so if the driver needs page aligned pointers? too bad
<karolherbst>
the runtime has to work it out
<jekstrand>
karolherbst: I'm not seeing iris_resource_from_user_memory being called
<karolherbst>
yes, and it fails
<karolherbst>
or should
<jekstrand>
karolherbst: It's not being called in this test
<karolherbst>
ahh, then it's something else there :)
<karolherbst>
soo.. CTS run completes in 7 minutes with iris here, nice
<karolherbst>
takes roughly 25 minutes with llvmpipe
<karolherbst>
let me see where iris regresses
<Kayden>
jekstrand: No reason. We should be setting that. anv does, iris forgot I guess
<jekstrand>
Kayden: Ok, I'll send a patch then.
<karolherbst>
jekstrand: ohh.. CL also allows userptr for images :) "rc/gallium/drivers/iris/iris_resource.c:1178: iris_resource_from_user_memory: Assertion `templ->target == PIPE_BUFFER' failed."
<jekstrand>
karolherbst: !
<karolherbst>
we should think about something to support this for drivers not allowing that
<Kayden>
I don't see why iris couldn't support that
<karolherbst>
Kayden: what about random pointers (no alignment)?
<karolherbst>
although CL allows handing in an alignment for images I think
<Kayden>
I think we'd need cacheline (64B) alignment
<Kayden>
for linear, or tile-size for tiled
<Kayden>
but presumably that's linear
<karolherbst>
what about buffers? I know that i915 (kernel) rejects anything which isn't page aligned
<karolherbst>
anyway... most regressions are indeed because of crappy USE_HOST_PTR support :)
<karolherbst>
Kayden: do you have an idea how we can properly support host pointers by lifting the page alignment requirement or is that something we have to deal with?
<jekstrand>
karolherbst, Kayden: Looks like iris depends on textures_used, a 32-bit bitfield. That's not going to fly for CL.
<karolherbst>
jekstrand: so, what clover does to support it is try resource_from_user_memory, and if that fails, it creates a shadow buffer on the GPU and syncs it at certain points
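The fallback described here (try a zero-copy import of the user pointer first, and only fall back to a synchronized shadow copy when the driver refuses) can be modelled roughly as follows. This is a toy Python sketch, not clover's code: `import_user_memory` is a hypothetical stand-in for the driver hook, and the page-alignment rule it applies is just one example of why a driver might refuse:

```python
PAGE_SIZE = 4096


class ShadowBuffer:
    """GPU-side copy of host memory that must be synced explicitly."""

    def __init__(self, host):
        self.host = host
        self.gpu_copy = bytearray(host)

    def sync_to_gpu(self):
        self.gpu_copy[:] = self.host

    def sync_to_host(self):
        self.host[:] = self.gpu_copy


def import_user_memory(address, host):
    """Hypothetical driver hook: returns None when it rejects the pointer."""
    if address % PAGE_SIZE != 0:
        return None
    return ("zero-copy", host)


def create_buffer(address, host):
    """Prefer zero-copy import; otherwise fall back to a shadow buffer."""
    res = import_user_memory(address, host)
    if res is not None:
        return res
    return ("shadow", ShadowBuffer(host))
```

The cost of the shadow path is that every point where host and device may observe each other's writes needs an explicit sync call.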
<karolherbst>
jekstrand: airlied has patches for that
<karolherbst>
_but_ I think you don't need as many textures in CL, just many samplers
MajorBiscuit has joined #dri-devel
<jekstrand>
karolherbst: I'm still digging through how iris handles samplers. I don't know that it's applicable to this particular test but you may be right that it assumes they're linked.
<jekstrand>
The compiler doesn't but iris might in its binding table setup somewhere.
<karolherbst>
yeah.. wouldn't surprise me
<karolherbst>
does gallium nine run on iris?
<Kayden>
yes
<karolherbst>
mhh
<karolherbst>
ehh max_read_image_args is indeed 128 :( *sigh*
<karolherbst>
jekstrand: anyway, it works with llvmpipe, and I didn't bump the numbers there
<Kayden>
karolherbst: I think the page alignment thing is an i915 restriction
<karolherbst>
it's still on my todo list, but not that important atm
<karolherbst>
Kayden: okay, glad to hear
<Kayden>
I don't think the HW/iris should mind it being 64B
<karolherbst>
I _really_ don't want to deal with shadow buffers if I don't have to
<jekstrand>
Yeah, we can get around the restriction
<jekstrand>
Just pad out to a page on either side.
<Kayden>
I don't think iris requires textures and samplers to be linked. it shouldn't, anyway, but there may be some stupidity in need of purging
<karolherbst>
jekstrand: how?
<jekstrand>
We know that whole pages exist. You can't have a partial page.
<Kayden>
not aware of it being an issue
<karolherbst>
I mean.. you can offset internally, sure
<karolherbst>
if that's what you mean?
<jekstrand>
So we can userptr the set of pages containing the address range. Then offset as needed.
<jekstrand>
It'd be super annoying but we could do it.
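The padding trick amounts to rounding the start of the range down and the end up to page boundaries, then remembering the offset of the original pointer inside that span. A small Python sketch of the arithmetic, assuming a 4096-byte page; `page_span` is an illustrative name, not an existing helper:

```python
PAGE_SIZE = 4096


def page_span(address, size, page_size=PAGE_SIZE):
    """Return (start, length, offset) of the page-aligned span covering
    [address, address + size).

    start  : address rounded down to a page boundary
    length : whole number of pages covering the range
    offset : where the original pointer sits inside the span
    """
    start = address & ~(page_size - 1)
    end = (address + size + page_size - 1) & ~(page_size - 1)
    return start, end - start, address - start
```

For example, a 0x20-byte range at 0x1ff0 straddles a page boundary, so the span is two pages starting at 0x1000 with the data at offset 0xff0.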
<karolherbst>
okay
<Kayden>
right
<karolherbst>
better than shadow buffers
<Kayden>
re textures_used...yeah that'd need to get fixed
* airlied
has gone back and forward on how much those patches are needed
<karolherbst>
airlied: yeah.. but we need 128 read images :(
<airlied>
yes we do, but I'm not sure those patches are required
<karolherbst>
ahh
<airlied>
the use of those bitfields is quite a mess
<airlied>
in some cases they seem only to be useful for GLSL type things
<jekstrand>
Yeah...
<karolherbst>
I see
<airlied>
I've dug in a few times, gotten lost, and run away
<jekstrand>
I generally don't like textures_used
<karolherbst>
shouldn't all textures inside a shader be used? :P
<jekstrand>
It's not used in Vulkan at all
<airlied>
yeah it's very inconsistent
<jekstrand>
We do need some way to set up the binding table in iris.
<jekstrand>
Which means we at least need to know how many images and textures there are.
<karolherbst>
you still have the texture vars, no?
<karolherbst>
ehh samplers in gl
<jekstrand>
Maybe? They may be eliminated by then.
<karolherbst>
mhh
<jekstrand>
The way all this stuff works with gl/gallium is very different from VK
<jekstrand>
And I'm not convinced CL fits with the gallium model well.
<karolherbst>
indirects are also annoying :(
<jekstrand>
In some ways, I guess it fits fine.
<jekstrand>
karolherbst: Actually, indirects aren't hard if we do it right.
<karolherbst>
gallium is a bit too gl centric here, yes
<karolherbst>
but nine has the same problem, no?
<karolherbst>
jekstrand: yeah, CL doesn't have those anyway
<karolherbst>
well I guess we could optimize to indirects if that means dropping some code, but I don't think there is a nice way of actually doing indirects
<karolherbst>
I think CL apps have to create arrays and shit
<jekstrand>
My image handling pass already does indirects
<karolherbst>
cool
<jekstrand>
Like it'll handle it fine if you store a bunch of image pointers in an array and indirect the array.
<jekstrand>
Not sure if that's legal in CL but we'd handle it.
<jekstrand>
Assuming you're using the thing I wrote for clover
<karolherbst>
I am sure it's legal in CL and the only way apps can do that
<Kayden>
The thing about textures_used, is that at least in GL, you can bind a whole bunch of images or samplers, and the current shader may not actually use all the bindings
<Kayden>
so you set-intersect and then only care about things that are both bound and referred to
<karolherbst>
Kayden: sure.. but why would you care about what's bound?
<karolherbst>
or can you not bind a sampler/texture and GL has to make sure it's not crashing?
<jekstrand>
For binding table setup, iris doesn't. It only uses that to figure out which things are referenced and compact the table down.
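The "set-intersect, then compact" step can be sketched with plain bitmasks: intersect what is bound with what the shader references, then assign dense binding-table entries to the surviving slots. A hedged Python illustration, where `compact_bindings` and its return shape are invented for the example and are not iris code:

```python
def compact_bindings(bound_mask, used_mask):
    """Build a compacted binding table from two slot bitmasks.

    Returns (table, slot_to_entry):
      table         : API slots in binding-table order
      slot_to_entry : maps an API slot to its compacted table index
    """
    live = bound_mask & used_mask  # both bound and referenced
    table = []
    slot_to_entry = {}
    slot = 0
    while live >> slot:
        if (live >> slot) & 1:
            slot_to_entry[slot] = len(table)
            table.append(slot)
        slot += 1
    return table, slot_to_entry
```

Python integers are unbounded, so slot 100 works here; a 32-bit bitfield cannot represent it, which is the limitation raised earlier for CL's 128 read images.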
ahajda_ has joined #dri-devel
ahajda has quit [Read error: Connection reset by peer]
<karolherbst>
mhhh
<Kayden>
back in a while
<karolherbst>
GL has a strict indexing, right? you can't just reorder, or can you?
<karolherbst>
although...
<jekstrand>
You can do whatever you want in the driver as long as the client's bindings show up
<karolherbst>
okay.. so you essentially just need to know how many textures there are or does textures_used have any benefits on top?
<karolherbst>
like could you just DCE some, reindex and just move on?
MajorBiscuit has quit [Quit: WeeChat 3.4]
<jekstrand>
Yeah, stuff can get DCEd
<karolherbst>
okay
<jekstrand>
zmike: CTS run looks as good as any other I've done
<karolherbst>
jekstrand: anyway.. did you get the test to pass?
<zmike>
jekstrand: 👍
<jekstrand>
karolherbst: Not yet.
<jekstrand>
zmike: ship it?
<zmike>
🚢 🚢 🚢
<karolherbst>
jekstrand: does something in the stack do something funny with sampler vars?
<jekstrand>
karolherbst: Good question
<karolherbst>
"decl_var uniform INTERP_MODE_NONE sampler @2 (2, 8, 0)" I see that and I could imagine that the driver_location can mess things up
<jekstrand>
karolherbst: It's using txl, not txf so it does use a sampler
<karolherbst>
or well.. the location
fxkamd has quit []
<karolherbst>
I still bind the sampler at index 0 though
<karolherbst>
so I might have to reindex the sampler vars before passing it into drivers
<jekstrand>
It's using the sampler at index 0 but maybe it's not getting bound?
* jekstrand
looks
* jekstrand
feels like he's debugged this before
<karolherbst>
anyway, that reminds me that I have to do something with the sampler vars because atm they just take away space
<karolherbst>
so if it's broken for iris, I can just do this tomorrow then
<jekstrand>
karolherbst: I also need to come up with a less terrible buffer_clear implementation
<karolherbst>
:)
<jekstrand>
Really, I kind-of want rusticl to fall back to kicking off a kernel if buffer_clear doesn't exist.
danvet has quit [Ping timeout: 480 seconds]
<karolherbst>
yeah... I didn't focus on any kind of fallbacks atm
<jekstrand>
That's ok
<karolherbst>
but I think doing fallbacks in kernels is better than doing them in sw :)
<jekstrand>
Very much
<karolherbst>
also for all those copy ops
<jekstrand>
And I could implement something sensible in BLORP but I don't think it'd be any better than what we can do in rusticl for everyone.
<jekstrand>
Yup
<karolherbst>
well at least GPU to GPU copies we can't do on the GPU
<karolherbst>
okay
* karolherbst
adds stuff to an imaginary todo list
<karolherbst>
would be fun to supply our own kernels
<jekstrand>
karolherbst: Yeah, so iris uses textures_used for samplers. :(
<karolherbst>
I guess we would just use nir_builder?
<karolherbst>
jekstrand: figures
<jekstrand>
karolherbst: Or OpenCL C and compile it with clc
<karolherbst>
I'd rather not have to compile too much CLC code at runtime though
<karolherbst>
also.. it's only for copies
<karolherbst>
that's trivial stuff
<karolherbst>
one memcpy intrinsic, a loop and some casts :D
<karolherbst>
I am also thinking about improving what I bind at kernel launching time, so we don't have to rebind everything
<karolherbst>
and support partial updates
pcercuei has quit [Quit: dodo]
<jekstrand>
Yeah, it can be done with nir_builder too
<jekstrand>
We can also run clc at compile time and embed the SPIR-V
<jekstrand>
Lots of options
<karolherbst>
yeah..
<karolherbst>
I will play around with it
<karolherbst>
buffer <-> image copies I'd like to support with something like that :)
<karolherbst>
but at this point I might even have to write a state tracking mechanism for kernels... will be fun
<karolherbst>
or well.. want to
jfalempe has quit [Remote host closed the connection]
jfalempe has joined #dri-devel
<jekstrand>
Ok, looks like samplers are uploading properly
icecream95 has joined #dri-devel
<jekstrand>
karolherbst: Are we still using grid->input?
<karolherbst>
for the time being, yes
<karolherbst>
why?
<jekstrand>
Because I seem to have dropped iris support for it
<jekstrand>
But then how is anything working?!?!?
<karolherbst>
well...
<karolherbst>
I have no idea?
<karolherbst>
I suppose it's still there?
<jekstrand>
Hrm... iris is doing it via sysvals
<jekstrand>
so cbuf0
<karolherbst>
yeah, which is fine :)
* jekstrand
wishes he could debug on hardware with int64 support
<zmike>
are there any generic nir passes which can do array compaction for cross-stage i/o?
<jekstrand>
zmike: look at brw_nir_link_shaders
<zmike>
like if I have `out foo float[32]; out bar float[32];` it'll combine them using location_frac
<zmike>
?
<jekstrand>
Oh, that? I don't think so.
<zmike>
gah
<jekstrand>
There might be something in the linking helpers but I don't remember.
<zmike>
I looked, but they're all for compacting the location
<zmike>
or so it seemed to me
<zmike>
a project for tomorrow I guess
<bnieuwenhuizen>
IIRC tarceri's linking pass can combine varyings tightly in components
<jekstrand>
karolherbst: Well, I may have found the bug. Looks like load_input is getting turned into load_ubo with an offset of 4B for no reason.
<karolherbst>
jekstrand: uhm...
<bnieuwenhuizen>
but you'll need to check that the array is getting lowered to separate varyings first
<bnieuwenhuizen>
(IIRC there was something if all indexing was constant based but ...)
<zmike>
bnieuwenhuizen: hm I'll have to look closer at that tomorrow I guess
<karolherbst>
jekstrand: how can I turn on shader debugging for iris again?
<zmike>
it seems like I should be able to do this trivially by analysis since I can just set location_frac on the whole variable and then export it with location+component
<karolherbst>
yeah, I have that one already though :)
<karolherbst>
jekstrand: mhh.. where is the offset added?
OftenTimeConsuming has quit [Remote host closed the connection]
<jekstrand>
karolherbst: Working on that
<karolherbst>
I see 1, 0 and 1, 0x20 as args
<karolherbst>
but not 1, 4
<bnieuwenhuizen>
zmike: I think the question is "what does that even look like" if you keep the array as an array
OftenTimeConsuming has joined #dri-devel
<jekstrand>
hrm... No. No offset. I was looking at the binding table index. :-/
<zmike>
bnieuwenhuizen: that's the adventure!
<karolherbst>
ohhh
<bnieuwenhuizen>
like AFAIU what you'd want is compact the 32-entry array in e.g. 8 locations of 4 components each right?
anholt__ has joined #dri-devel
<karolherbst>
jekstrand: so the offset is inside the binding table for the cb? ehh...
<zmike>
not entirely sure what I want at this point? in theory a 32-entry array is based on the driver's io capabilities
anholt__ is now known as anholt
<bnieuwenhuizen>
I think how some of this might be happening on radv is that we lower io to a separate tmp array + copies, which means that for final output & initial input all accesses are constant indexed, which helps lowering arrays
<zmike>
so either flattening into 8x4 (x2) or merging with the other array like 32x2
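The "8 locations of 4 components each" layout is a simple index mapping: element i of a scalar array lands in location base + i / 4, component i % 4. A minimal Python sketch of that mapping, assuming float scalars, four components per location, and constant indexing throughout (the caveat raised above); it is not an existing NIR pass and `pack_scalar_array` is a made-up name:

```python
COMPONENTS_PER_LOCATION = 4


def pack_scalar_array(base_location, length, start_component=0):
    """Map each element of a float[length] to a (location, component) pair,
    filling every location's four components before moving to the next."""
    slots = []
    for i in range(length):
        flat = start_component + i
        slots.append((base_location + flat // COMPONENTS_PER_LOCATION,
                      flat % COMPONENTS_PER_LOCATION))
    return slots
```

With this mapping, `float foo[32]` and `float bar[32]` together take 16 locations (bases 0 and 8) instead of the 64 they would use at one scalar per location.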
anholt_ has quit [Ping timeout: 480 seconds]
<zmike>
I'm not looking at changing any of the shader, just the variable decls (ideally)
<zmike>
I only stubbed my toe on this a minute ago, haven't really figured out what I want to do about it
rasterman has quit [Quit: Gettin' stinky!]
* zmike
grumbles about tessellation shaders
<marex>
time to exercise drm-misc commit access and apply a bugfix, stress level -> 11
ahajda_ has quit []
<marex>
whew ...
tursulin has quit [Read error: Connection reset by peer]
bcheng has quit [Remote host closed the connection]
bcheng has joined #dri-devel
heat has quit [Remote host closed the connection]
soreau has quit [Read error: Connection reset by peer]
soreau has joined #dri-devel
iive has quit []
Haaninjo has quit [Quit: Ex-Chat]
alanc has quit [Remote host closed the connection]
mbrost has quit [Ping timeout: 480 seconds]
alanc has joined #dri-devel
mbrost has joined #dri-devel
HankB_ has quit [Remote host closed the connection]