ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
<daniels> karolherbst: pre-armv8, unaligned access would trap for _all_ memory; it's a kernel tweak to decide if it passes on a SIGBUS or if it transparently fixes it up and continues
<karolherbst> :D
<karolherbst> daniels: I guess broken aarch64 platforms could have the same thing?
<karolherbst> you have dts for all of that anyway, so just declare it inside one or something :P
<HdkR> unaligned device memory fixup on AArch64 sounds horrifying
<karolherbst> well..
<karolherbst> don't build broken hardware then?
<jcdutton> daniels, how would we test the SIGBUS transparent fixes feature?
<jekstrand> karolherbst: That sounds good enough for now
jcdutton has quit [Quit: Leaving]
pcercuei has quit [Quit: dodo]
iive has quit []
tursulin has quit [Read error: Connection reset by peer]
tarceri_ has quit []
tarceri has joined #dri-devel
ngcortes has quit [Remote host closed the connection]
dllud_ has joined #dri-devel
dllud has quit [Read error: Connection reset by peer]
columbarius has joined #dri-devel
co1umbarius has quit [Ping timeout: 480 seconds]
JohnnyonFlame has quit [Ping timeout: 480 seconds]
mbrost has quit [Ping timeout: 480 seconds]
shankaru has joined #dri-devel
mbrost has joined #dri-devel
mclasen has quit []
mclasen has joined #dri-devel
mripard_ has joined #dri-devel
saurabhg has joined #dri-devel
saurabhg has quit []
jewins1 has quit [Remote host closed the connection]
mripard has quit [Ping timeout: 480 seconds]
JohnnyonFlame has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
<mareko> what is virglrenderer?
mclasen has quit []
mclasen has joined #dri-devel
shankaru has quit [Quit: Leaving.]
heat has joined #dri-devel
shankaru has joined #dri-devel
wens has joined #dri-devel
JohnnyonFlame has quit [Read error: Connection reset by peer]
mclasen has quit []
mclasen has joined #dri-devel
mbrost has joined #dri-devel
vnayana has joined #dri-devel
mclasen has quit [Ping timeout: 480 seconds]
kts has joined #dri-devel
kem has quit [Ping timeout: 480 seconds]
vnayana_ has joined #dri-devel
vnayana has quit [Ping timeout: 480 seconds]
kem has joined #dri-devel
Daanct12 has joined #dri-devel
ybogdano has quit [Ping timeout: 480 seconds]
mbrost has quit [Ping timeout: 480 seconds]
alatiera has quit [Quit: The Lounge - https://thelounge.chat]
ybogdano has joined #dri-devel
Duke`` has joined #dri-devel
alatiera has joined #dri-devel
<airlied> mareko: the host side of virgl
<airlied> plugs into qemu or crosvm and translates gallium into GL (or in later bits vulkan into vulkan)
ybogdano has quit [Ping timeout: 480 seconds]
mattrope has quit [Read error: Connection reset by peer]
heat_ has joined #dri-devel
heat has quit [Read error: Connection reset by peer]
YuGiOhJCJ has quit [Quit: YuGiOhJCJ]
aravind has joined #dri-devel
kts has quit [Quit: Konversation terminated!]
Daanct12 has quit [Remote host closed the connection]
prahal has quit [Ping timeout: 480 seconds]
heat has joined #dri-devel
heat_ has quit [Read error: No route to host]
danvet has joined #dri-devel
itoral has joined #dri-devel
nchery has quit [Remote host closed the connection]
nchery has joined #dri-devel
eukara_ has quit []
Duke`` has quit [Ping timeout: 480 seconds]
pnowack has joined #dri-devel
heat has quit [Ping timeout: 480 seconds]
soreau has quit [Read error: No route to host]
soreau has joined #dri-devel
Daanct12 has joined #dri-devel
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
shankaru has quit [Quit: Leaving.]
ahajda has joined #dri-devel
frieder has joined #dri-devel
itoral has quit [Remote host closed the connection]
kts has joined #dri-devel
shankaru has joined #dri-devel
jkrzyszt has joined #dri-devel
kts_ has joined #dri-devel
kts has quit [Ping timeout: 480 seconds]
itoral has joined #dri-devel
MajorBiscuit has joined #dri-devel
MajorBiscuit has quit []
kts_ has quit []
nchery has quit [Read error: Connection reset by peer]
MajorBiscuit has joined #dri-devel
illwieckz has quit [Ping timeout: 480 seconds]
tursulin has joined #dri-devel
ella-0_ has quit [Read error: Connection reset by peer]
<pq> agd5f, the only use case I could understand from the email discussions was that you want to kill processes by the GPU reset event.
<pq> agd5f, in my opinion, killing processes is a harmful thing to do, and a step in the wrong direction if we want to make systems recover from a reset.
<pq> This was just the final straw for me.
<danvet> hm which discussion?
<danvet> sounds like something I should look at :-)
<pq> Dri-devel is consuming my time and sanity, but I get nothing from it. I think it's time to stop this addiction. I have disabled mail delivery, but CC should still work. It's time for me to leave, but I will come back if I have to.
<pq> This way I'm less likely to butt in to topics I have nothing to do with.
sdutt has quit [Ping timeout: 480 seconds]
pq has left #dri-devel [Goodbye.]
lynxeye has joined #dri-devel
vnayana_ has quit [Ping timeout: 480 seconds]
anholt has quit [Ping timeout: 480 seconds]
anholt has joined #dri-devel
jbarnes has quit [Remote host closed the connection]
jbarnes has joined #dri-devel
<MrCooper> danvet: "[PATCH v2 1/2] drm: Add GPU reset sysfs event"
famfo has quit []
famfo has joined #dri-devel
<danvet> MrCooper, thx, replied somewhere hopefully useful
<MrCooper> cool
Daanct12 has quit [Remote host closed the connection]
<MrCooper> danvet: FYI, amdgpu doesn't always nuke everything for GPU reset; I've hit a few cases where mutter continued as if nothing happened after amdgpu reset just one engine (though Firefox responded to the corresponding robustness notification by dropping to SW rendering :)
pcercuei has joined #dri-devel
<danvet> javierm, do you plan to apply the patches from geert?
<danvet> maybe also volunteer him for commit rights in a bit
<danvet> MrCooper, yeah there's a "unblock stuck threads in all CU" first step iirc
<danvet> I thought I mentioned that in my mail
<danvet> but after that it's the "nuke the chip" step
<danvet> which is awkward because the locking against kms is pain
<danvet> and the locking against fbcon is impossible :-/
<danvet> intel hw is pretty nice, since you have per block reset
<danvet> where block = engine, except for compute engines because they all share the same EU so hard to reset in isolation
shankaru has quit [Quit: Leaving.]
pochu has joined #dri-devel
tzimmermann has joined #dri-devel
angerctl has quit [Ping timeout: 480 seconds]
angerctl has joined #dri-devel
<tzimmermann> javierm, hi. if you're around, i'd need a review of https://patchwork.freedesktop.org/series/101321/
mclasen has joined #dri-devel
mclasen has quit []
mclasen has joined #dri-devel
frankbinns has joined #dri-devel
rkanwal has joined #dri-devel
mclasen has quit []
mclasen has joined #dri-devel
flacks has quit [Quit: Quitter]
flacks has joined #dri-devel
<javierm> danvet: yes, I was planning to wait a little bit if someone else wanted to look and then apply
<javierm> danvet: and yeah, we should ask him to request commit access too
<javierm> tzimmermann: sure. I didn't because thought that Geert already did but will look at it now
shankaru has joined #dri-devel
mvlad has joined #dri-devel
mclasen has quit []
mclasen has joined #dri-devel
shankaru has quit []
elongbug has joined #dri-devel
elongbug has quit []
MajorBiscuit has quit [Ping timeout: 480 seconds]
lemonzest has quit [Quit: WeeChat 3.4]
shankaru has joined #dri-devel
aravind has quit []
itoral has quit [Remote host closed the connection]
sdutt has joined #dri-devel
shankaru has quit [Quit: Leaving.]
lemonzest has joined #dri-devel
pochu has quit [Quit: leaving]
fxkamd has joined #dri-devel
mbrost has joined #dri-devel
mattrope has joined #dri-devel
mattrope has quit [Remote host closed the connection]
rkanwal has quit [Ping timeout: 480 seconds]
<pjakobsson> danvet, did my explanation in https://patchwork.kernel.org/project/dri-devel/patch/20220317092555.17882-4-patrik.r.jakobsson@gmail.com/ make sense or do you want me to resend?
<danvet> pjakobsson, nah I simply missed those when scrolling through
<danvet> I tried to look for anything like that, I guess I wasn't awake enough yet
<danvet> and yeah maybe for next time around split these up more, the commit message at least sounds a bit like multiple things smashed into one
<pjakobsson> danvet, ok thanks
sdutt has quit []
sdutt has joined #dri-devel
Company has joined #dri-devel
kts has joined #dri-devel
mclasen has quit []
mclasen has joined #dri-devel
shankaru has joined #dri-devel
Haaninjo has joined #dri-devel
nchery has joined #dri-devel
<graphitemaster> So on the topic of DirectStorage from yesterday, what is the equivalent in Linux
<imirkin> p2p dma
<imirkin> no userspace-focused APIs afaik
<imirkin> more like allowing dmabufs to be passed around
<imirkin> and making things Just Work (tm)?
<graphitemaster> Seems like something that should be added to io_uring
<imirkin> i guess i don't know too much about io_uring, but seems entirely unrelated to me
<imirkin> that one's about cpu <-> device interactions, basically
<graphitemaster> The thing is the NVMe device is not a regular mapped drive on the CPU side, it's just a flat store of bytes, GPU side just direct reads from it as if the thing is a big chunk of RAM
<graphitemaster> But you still need to write to the drive as if it was a RAM device (that's how the API works anyways)
<graphitemaster> So I assume that would map best to io_uring, rather than read/write which requires a working VFS implementation for the kernel.
<graphitemaster> It's also just unusually weird if you had to use pwrite/pread giving it pointer _addresses_ reinterpreted as offsets. I suppose you could mmap, but mmap is over a _file_
<graphitemaster> Maybe the device driver just exposes a DirectStorage drive as a singular file, humm..
<imirkin> there's also something called HMM
<imirkin> and afaik that's meant to work with dax pages?
<graphitemaster> I forgot about HMM - in either case we need some working user-space API :P
<graphitemaster> They're also planning on bringing DirectStorage to WebGPU
<graphitemaster> So now my webbrowser can fill an SSD from Javascript :P
<imirkin> woohoo!
MajorBiscuit has joined #dri-devel
kts has quit [Quit: Konversation terminated!]
cheako is now known as cheakoirccloud
cheakoirccloud has quit []
cheakoirccloud has joined #dri-devel
cheakoirccloud has left #dri-devel [#dri-devel]
cheako has joined #dri-devel
<anholt> tomeu: does the current lava stuff really not have a "regex for lines looking like this and restart the job"?
<anholt> should I be able to regex on "for line in logs:"?
<tomeu> we have that in lava, but it is more involved than that
<tomeu> it basically detects what phase of a job it failed, and depending on what it was will fail the job with an infrastructure error
<tomeu> which is handled differently
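A minimal sketch of the phase-aware handling described above, combined with the "positive confirmation that deqp started" idea that comes up later in the log. The marker string, the flake patterns, and the function name are all hypothetical; none of them are taken from LAVA or from the Mesa CI scripts.

```python
import re

# Hypothetical marker and patterns, for illustration only.
TEST_STARTED = re.compile(r"deqp-runner: starting")
KNOWN_FLAKES = [
    re.compile(r"Kernel panic"),
    re.compile(r"timed out waiting for"),
]


def classify_job(log_lines):
    """Classify a job log as 'ok', 'test-failure' or 'infrastructure-error'."""
    phase = "boot"
    for line in log_lines:
        # Positive confirmation: the job only counts as testing once the
        # start marker has been seen.
        if phase == "boot" and TEST_STARTED.search(line):
            phase = "test"
        if any(pattern.search(line) for pattern in KNOWN_FLAKES):
            # The same symptom is handled differently depending on the phase:
            # before the tests start it is blamed on the infrastructure.
            return "infrastructure-error" if phase == "boot" else "test-failure"
    # A job that never reached the test phase is also an infrastructure error.
    return "ok" if phase == "test" else "infrastructure-error"
```

A caller could restart only the jobs classified as `infrastructure-error`, which keeps real test failures visible while limiting how many restarts add to pipeline runtime.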
<anholt> https://gitlab.freedesktop.org/mesa/mesa/-/issues/6158 is pretty critical at this point
<anholt> like, we need to turn off a618 if we can't get progress on it
<tomeu> I think we should remove coverage more aggressively if the impact is high
<tomeu> I can do that
<anholt> here's another a618 with an obvious regex to add to trigger restart https://gitlab.freedesktop.org/mesa/mesa/-/jobs/19878350
<anholt> if this was bare-metal, I would have just added the regexes and been done, but I don't see how to do that for maintaining lava boards.
<tomeu> well, it's hard to keep the total pipeline runtime down if we have so many restarts, we have seen that being a problem already
<anholt> ok, but is someone working on fixing the underlying issue on your end in that case?
<anholt> it's failing multiple marges per day currently.
<anholt> if you don't have someone fixing the underlying thing, then we at least need restarts.
<anholt> or we have to give up on using the lava lab for our coverage.
eukara has joined #dri-devel
<tomeu> yes, we have one person in the lab that is going to be actively looking for these issues (starting from tomorrow)
<daniels> it's failing a bit less often than a630, but close
<daniels> (that's raw numbers, not filtered for legit vs. spurious)
<daniels> anholt: turns out I don't understand 2022 Python, so gallo has the stuff I was doing to change log parsing (so we can regex, and get positive confirmation of starting deqp rather than negative confirmation of specific fail regex, and so it's less hokey, and also way less noisy), and is merging it into the version he has in tree now which has tests etc
jkrzyszt has quit [Ping timeout: 480 seconds]
tobiasjakobi has joined #dri-devel
shankaru has quit [Quit: Leaving.]
tobiasjakobi has quit []
Duke`` has joined #dri-devel
MajorBiscuit has quit [Ping timeout: 480 seconds]
airlied has quit [Remote host closed the connection]
airlied has joined #dri-devel
frieder has quit [Remote host closed the connection]
<agd5f> graphitemaster, both devices need to support the same p2p API in the kernel (GPUs support dma-buf, nvme supports pci p2p), so you'd need to convert one or the other to a common p2p API. Then you need some userspace API to actually utilize the p2p transfer.
<karolherbst> airlied: did you write some patches for nir_op_ball_iequal8 and the likes?
<karolherbst> jekstrand: I think I am almost done :D
<airlied> karolherbst: thought there was a lowering pass
<karolherbst> airlied: there is, just not for 8 and 16
<graphitemaster> agd5f, Well I hope someone is looking into it. I sense a shift in gaming where DirectStorage goes from being an optional thing to a requirement in the next 2 years or so.
<karolherbst> I was just hoping you already have a patch for it
<jekstrand> karolherbst: done?
<karolherbst> jekstrand: yeah.. so maybe 2 or 3 things and all the tests pass at least using a CL 1.0 device
<jekstrand> Nice!
<jekstrand> karolherbst: Images too?
<karolherbst> ahh nope
<karolherbst> I am not sure if I want to do images or 1.2 next
<karolherbst> probably 1.2
<karolherbst> images is probably just wiring up more gallium interfaces really
<karolherbst> I think
<karolherbst> and using whatever lowering pass we have in clc
mhenning has joined #dri-devel
<karolherbst> 1. work offsets 2. linking with external kernels 3. some alu precision stuff
<karolherbst> that's what needs to be done for 1.0 without images I think
<karolherbst> there are some weirdo sub buffer fails as well... not sure what the problem is though
<airlied> karolherbst: does it reproduce under clover?
<karolherbst> what?
<airlied> whatever causes that intrinsic
<karolherbst> ehh.. let me try
<karolherbst> airlied: yeah, it does
<karolherbst> test_relationals: ../src/compiler/nir/nir_lower_bool_to_int32.c:109: lower_alu_instr: Assertion `alu->dest.dest.ssa.bit_size > 1' failed.
illwieckz has joined #dri-devel
<karolherbst> I can fix it, I just hoped you already did so :D
<airlied> I'll look in my wip branch in a bit
heat has joined #dri-devel
<karolherbst> airlied: I also need something for contractions
<karolherbst> 31) Error for float kernel1: -0x1.c59d36p-122 * -0x1.8029ap-17 + -0x0p+0 = *0x0p+0 vs. 0x1.546p-138
<karolherbst> clover also fails here
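The numbers in that error line can be decoded with a few lines of Python. This is only a sketch for reading the log line: it rounds the exact product to single precision and checks where it falls relative to the smallest normal float32. It does not identify the root cause of the failure, and the helper name `to_float32` is made up for this example.

```python
import struct


def to_float32(x):
    """Round a Python float (a double) to the nearest IEEE-754 binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


a = float.fromhex("-0x1.c59d36p-122")
b = float.fromhex("-0x1.8029ap-17")

# The product of these two operands is exactly representable in a double,
# so rounding it to float32 happens only once; adding -0.0 changes nothing.
product = to_float32(a * b + -0.0)
smallest_normal = float.fromhex("0x1p-126")

print(product.hex())
print(product == float.fromhex("0x1.546p-138"))  # the non-zero value in the log
print(0.0 < product < smallest_normal)           # True means float32 subnormal
```

The non-zero value in the log is therefore a float32 subnormal; whether the zero result comes from contraction handling, as suggested in the chat, or from something else is not determined by this check.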
<karolherbst> jekstrand: I am slowly starting to wonder when I should create an MR for it :D
<jekstrand> karolherbst: Why not?
<jekstrand> karolherbst: We can keep working on the MR branch while dcbaker and others argue over the rust build system stuff.
<karolherbst> yeah... my goal was just to get it into good enough shape it passes a basic CL 1.0 CTS run :D
<karolherbst> ahhh
<karolherbst> right, we need that include stuff fixed
<karolherbst> but yeah. I think at this point I am mostly interested in general feedback from everybody knowing more about rust than I do
ngcortes has joined #dri-devel
<dcbaker> If you’re willing to go the structured_sources route, that will work in the upcoming 0.62 release
flto has quit [Ping timeout: 480 seconds]
<karolherbst> dcbaker: ahh, no, we found another solution, which involves dangerous pointer casting instead
<karolherbst> but I meant that other issue
<karolherbst> where I need generated nir header files for bindgen
<dcbaker> Ah, that one
<dcbaker> That will have to wait till 0.63, I’m having to change implementation details of dependencies in non-trivial ways :/
<karolherbst> yeah no worries
<karolherbst> not interested in getting it merged at this point, just for more people to look at it, now that stuff basically works
<dcbaker> I have a start on that, but we’re in release freeze atm
Danct12 has quit [Remote host closed the connection]
<dcbaker> I’m just interested in getting stuff solved asap so we don’t need a Meson from yesterday when we’re ready to turn it on in general
<jekstrand> karolherbst: I am planning to come back to it. Really, I am!
* jekstrand is having too much fun with misc. Vulkan drivers this week
<karolherbst> :D
<jekstrand> I'm slightly tempted to dive into v3dv as well but I think I might put that off a bit.
<jekstrand> zmike: Kicking my lavapipe renderpass MR through CI now
<jekstrand> zmike: crucible's func.first passes so it's not totally hosed. :D
<karolherbst> airlied: I think I really need this compiler stuff though... but compiling llvm-14 locally is going to be pain
<zmike> jekstrand: ci doesn't do a full cts run
<zmike> so that's gonna have to happen
<jekstrand> zmike: It doesn't? Ok, I can do that. Let me shove everything over to my desktop
<zmike> no, it does a very short run
<karolherbst> jekstrand: the most painful part is now how long those test runs take.. 1:10 hours on my i7-12700
<karolherbst> and the CPU is pretty much at 100% all the time
<karolherbst> I hope that's much better on iris
<jekstrand> karolherbst: It's somewhat better on iris because there's an actual GPU but it's probably still pretty compile-heavy
<karolherbst> mhhh I wonder
<karolherbst> there is this thread_dimension full_3d_explicit_local test which alone is responsible for 30 min of runtime
ngcortes has quit [Remote host closed the connection]
<karolherbst> I am sure you'd be able to cut it in half with iris
<karolherbst> maybe that's the next step after getting llvmpipe to work :D
<airlied> I think I had to move that into the state tracker
iive has joined #dri-devel
flto has joined #dri-devel
<airlied> probably need that in clc
<karolherbst> mhhh
<karolherbst> ahh yeah, makes sense
<karolherbst> that clover patch looks like something I can port
rasterman has joined #dri-devel
alyssa has joined #dri-devel
<alyssa> Does anyone know what the spec says about side effects in vertex shaders?
<jekstrand> zmike: Ok, Full CTS run going
<jekstrand> Should take about an hour according to deqp-runner's current estimates.
<alyssa> Is it legal for the shader to be run multiple times? or none, for culled vertices?
<zmike> sounds right
<alyssa> (Relevant for tilers)
<jekstrand> alyssa: vertex shader or fragment?
<alyssa> jekstrand: vertex
<jekstrand> alyssa: multiple: always, I think. None? Depends on if it has side-effects like atomics.
<alyssa> "atomicAdd(count, 1); gl_Position = a_position;"
<alyssa> errm
<alyssa> "atomicAdd(counts[gl_VertexID], 1); gl_Position = a_position;"
<jekstrand> I *think* you're supposed to always run it if there are side-effects
<karolherbst> okay... get_global_offset
<jekstrand> But I could be wrong there. I'm not especially familiar with that particular corner.
<alyssa> It would be convenient if the spec didn't say anything about the value of counts[] for that shader
<alyssa> but hey
<alyssa> not sure what I would even grep for
<imirkin> alyssa: side-effects
<imirkin> and/or "side effects"
<imirkin> you can also look at e.g. the atomic counters spec (i know this isn't about counters, but it's the same idea)
<alyssa> imirkin: There's text for helper invocations but nothing else in ESSL 3.2
* alyssa tries ES3.2 itself
<alyssa> "The repeatability requirement doesn't apply when using shaders containing side effects... because these memory operations are not guaranteed to be processed in a defined order"
* jekstrand can actually hear his desktop right now
<anholt> alyssa: yes, legal to run VSes multiple times.
<alyssa> anholt: OK
<imirkin> alyssa: that's a different thing, fwiw
<imirkin> alyssa: that's for the r600-style apply-everything-at-the-end types of setups
<alyssa> imirkin: ah
<imirkin> and then the exact ordering of blocks can end up being different
<alyssa> spec seems silent here
<anholt> GLES was explicit about this related to atomics, GL wasn't, but it's been treated as a bug when GL tests relied on executing once.
<alyssa> anholt: that helps, thank you :)
<alyssa> (atomics and also SSBOs/images/etc I would hope?)
<anholt> right
<alyssa> does the side effect have to happen for vertices corresponding to primitives that will be culled?
<alyssa> (Practically: do the side effects have to happen in the binning shader?)
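A toy model of why the invocation count matters for the `atomicAdd(counts[gl_VertexID], 1)` shader quoted above. It is plain Python, not GLSL, and the two "implementations" are hypothetical: one runs the vertex shader once per vertex, the other runs it in two passes, as a tiler with a separate binning pass might.

```python
def vertex_shader(vertex_id, counts):
    """Stand-in for: atomicAdd(counts[gl_VertexID], 1); gl_Position = a_position;"""
    counts[vertex_id] += 1


def run_draw(num_vertices, passes):
    """Run the toy vertex shader over every vertex, once per pass."""
    counts = [0] * num_vertices
    for _ in range(passes):
        for vertex_id in range(num_vertices):
            vertex_shader(vertex_id, counts)
    return counts


once_per_vertex = run_draw(3, passes=1)  # one invocation per vertex
two_passes = run_draw(3, passes=2)       # e.g. a binning pass plus a rendering pass

print(once_per_vertex)
print(two_passes)
```

An application that reads `counts[]` back and expects exactly one increment per vertex would see different results on the two models, which is the behaviour that, per the discussion, has been treated as a test bug rather than a driver bug.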
ella-0 has joined #dri-devel
<graphitemaster> Why might an implementation want to execute the VS multiple times?
<imirkin> graphitemaster: binning
<alyssa> graphitemaster: and if re-running the VS is faster than looking up the result in memory
<alyssa> (Might hold for funny caching setups, etc)
<imirkin> graphitemaster: or more importantly, tiling. binning is just an optimization for tiling.
<imirkin> basically all the mobile gpu's only have a teensy little bit of fast ram that's good for blending and such ops
<imirkin> so a FB is cut up into tiles, and you run the full pipeline once for each tile
<imirkin> but then you run multiple draws in a single tile
froz has joined #dri-devel
<imirkin> so you save on moving things in/out of fast memory
rkanwal has joined #dri-devel
<imirkin> in exchange for running the vertex stages a bunch of times
froz has quit []
<alyssa> to add, new Mali supports two (well three) geometry flows
<imirkin> (and binning is an opt which helps avoid running the _whole_ vertex pipeline, and instead only do the vertices which will rasterize to polygons that matter)
<alyssa> a binning flow, and a non-binning one
<imirkin> ah yeah, binning is optional on adreno too
<alyssa> (The non-binning flow is legacy)
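The tiling and binning idea described above can be sketched as follows. This is a simplified model that bins by screen-space bounding box; real hardware and drivers differ, and the function name and data layout are assumptions made for the example.

```python
import math


def bin_triangles(triangles, fb_width, fb_height, tile_size):
    """Map each tile (tx, ty) to the indices of triangles whose bounding box touches it."""
    tiles_x = (fb_width + tile_size - 1) // tile_size
    tiles_y = (fb_height + tile_size - 1) // tile_size
    bins = {}
    for index, triangle in enumerate(triangles):
        xs = [x for x, _ in triangle]
        ys = [y for _, y in triangle]
        # Triangles entirely outside the framebuffer land in no bin at all,
        # so no tile ever has to process them.
        if max(xs) < 0 or min(xs) >= fb_width or max(ys) < 0 or min(ys) >= fb_height:
            continue
        tx0 = max(math.floor(min(xs)) // tile_size, 0)
        tx1 = min(math.floor(max(xs)) // tile_size, tiles_x - 1)
        ty0 = max(math.floor(min(ys)) // tile_size, 0)
        ty1 = min(math.floor(max(ys)) // tile_size, tiles_y - 1)
        for ty in range(ty0, ty1 + 1):
            for tx in range(tx0, tx1 + 1):
                bins.setdefault((tx, ty), []).append(index)
    return bins


triangles = [
    [(1, 1), (5, 1), (1, 5)],              # fits inside the first tile
    [(10, 10), (40, 10), (10, 40)],        # spans several tiles
    [(-20, -20), (-5, -20), (-20, -5)],    # entirely off-screen
]
bins = bin_triangles(triangles, fb_width=64, fb_height=64, tile_size=16)
```

Each tile then only processes the triangles listed in its bin, which is the saving binning buys on top of plain tiling.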
<alyssa> Panfrost is conservative and only uses binning when there are no side effects / XFB
<alyssa> but maybe that's overkill
<alyssa> I am currently wiring up the legacy path for valhall and wondering if I should just, not
agd5f_ has joined #dri-devel
<alyssa> I suppose it helps for that random big GL game that was written without tilers in mind and depends on the exactly-one-invocation behaviour
<alyssa> if we end up with drirc entries for that
<HdkR> every big GL game that is written without tilers in mind*
<alyssa> HdkR: ...and depends on the subtle SSBO behaviour
<HdkR> Of course :D
<imirkin> alyssa: freedreno does XFB in the binning pass ;)
<imirkin> [when there's no hw xfb supported, that is]
<alyssa> imirkin: So I here
<alyssa> hear
agd5f has quit [Ping timeout: 480 seconds]
* alyssa would be somewhat interested in a nir_lower_xfb
<alyssa> using mareko 's new store_output stuff
* alyssa might even write one depending on her motivation levels
<airlied> karolherbst: 15433 has the clover fix if you have a minute
<imirkin> alyssa: i thought the freedreno thing was a common pass .... maybe not
<alyssa> imirkin: will see
<imirkin> it does rely on some special inputs
<alyssa> I don't think it became possible to lower XFB purely generically until a few weeks ago
<imirkin> might not be extremely generic. but it's def an nir pass.
<alyssa> Anyway, doing "hardware" XFB made sense for Malis that didn't do binning, it saved on memory b/w
<Lynne> airlied: "can't use VK_IMAGE_CREATE_DISJOINT_BIT because VK_FORMAT_G8_B8R8_2PLANE_420_UNORM doesn't support VK_FORMAT_FEATURE_DISJOINT_BIT"
<imirkin> should always do hardware xfb. a3xx just doesn't have it. and i'm still RE'ing the finer details on a4xx.
<alyssa> we don't have real hw xfb
<alyssa> it's just, on older malis the varyings all got written to driver-allocated memory anyway
<Lynne> we need to be able to convert between planar images (for codecs) and standalone images (what we normally use everywhere), so not being able to merge images shuts us down on writing a hwaccel
<alyssa> so we can play games to use the same buffer for xfb and for internal varying use
<airlied> Lynne: I don't think the hw can support disjoint for hw operations though
<airlied> Lynne: the decode hw has to have the planes in the same BO allocation
<Lynne> for intel yes, but pretty sure AMD wants separate planes
<Lynne> at least in vaapi land
<airlied> nope
<airlied> you can have separate planes but they must be in the same object allocation
<Lynne> oh, is that why radv hard-requires dedicated allocation?
<airlied> oh maybe I'm confusing intel and amd here, let me dig a bit more
<alyssa> imirkin: anyway, would need to benchmark but I suspect on newer Mali lowering XFB would be a win for perf
<airlied> Lynne: oh looks like you might be right, I should probably allow disjoint on amd then
<alyssa> since Arm seemed to stop caring about perf of the legacy (non-binning) path
lynxeye has quit []
<imirkin> alyssa: ok. i prefer the hw approach since it avoids ... annoyances. but if it's missing, then can't do much about that
<alyssa> imirkin: nod. I guess what I'm saying is, they're both software paths, just with different costs.
<imirkin> ah yeah. then it's annoying.
<airlied> Lynne: looks like radv has no disjoint support at all yet, so will have to think about it a bit
<airlied> bnieuwenhuizen: ^ any ideas?
<alyssa> imirkin: and given we're failing a big pile of piglits (despite passing deqp-gles), I am not convinced the code is correct either :v
<imirkin> alyssa: well, watch out with piglits... a lot of them assume correct GL_QUADS handling
<imirkin> which you might not have, nor care to have
<alyssa> We have real QUADS
<bnieuwenhuizen> airlied: it is annoying to reference multiple BOs in descriptor sets for a single binding without making it huge for everything
agd5f_ has quit []
gouchi has joined #dri-devel
agd5f has joined #dri-devel
<agd5f> graphitemaster, I wouldn't hold your breath considering how much of a challenge it was to get basic p2p support upstream.
<agd5f> plus I think only AMD and Intel CPUs support p2p DMA. There is no PCIe spec to determine whether it's supported or not
<agd5f> so the kernel uses a whitelist
<bnieuwenhuizen> Lynne: not sure about imported images, but in radeonsi VA, all allocated images currently consist of a single BO
<Lynne> airlied: thanks, also look into updating your radv video branch to the newest headers
<Lynne> I've left my more-or-less complete but untested code here - https://github.com/cyanreg/FFmpeg/tree/vulkan_decode
<Lynne> I have a spare machine, I'll sacrifice it and test with nvidia tomorrow
<Lynne> bnieuwenhuizen: even with VA_EXPORT_SURFACE_SEPARATE_LAYERS?
<bnieuwenhuizen> lemme look up what layers were in this case again, but IIRC all layers still get the same BO
<airlied> Lynne: a rebase should bring them up 1.3.207 is that new enough?
<bnieuwenhuizen> just with a different offset
* airlied will fixup the rebase today
<bnieuwenhuizen> Lynne: though AFAICT VA_EXPORT_SURFACE_SEPARATE_LAYERS doesn't really get any handling in the gallium libva frontend
<Lynne> airlied: yah, that's new enough
<Lynne> airlied: " vkCreateImage(): extent.depth 1 exceeds allowable maximum image extent depth 0"
<Lynne> for a 2D image, that doesn't make any sense
<airlied> grrr someone updated the vulkan headers without using the script
<bnieuwenhuizen> I updated but I haven't heard of any script?
<Lynne> someone also broke the video headers by not including vulkan_video_codecs_common.h
<airlied> bin/khronos-update.sh
ybogdano has joined #dri-devel
<airlied> though it might be a bit heavyweight and we should let it just update vulkan headers if needed
<airlied> bnieuwenhuizen: please ack 15434
<bnieuwenhuizen> done
<airlied> Lynne: branch is on updated headers now
<Lynne> thanks
nchery is now known as Guest2430
nchery has joined #dri-devel
<jekstrand> zmike: Pass: 375726, Fail: 5, Crash: 3, Warn: 16, Skip: 713664, Timeout: 41, Flake: 6, Duration: 43:48, Remaining: 0
<jekstrand> zmike: Is that good enough or do you want me to run against main and compare?
<Sachiel> unacceptable. 100% pass ratio or bust
<zmike> jekstrand: lavapipe is currently conformant minus like 2 tests from https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15335
<zmike> so...yeah crashing is no bueno
<jekstrand> Ok
* jekstrand runs mainline
Guest2430 has quit [Ping timeout: 480 seconds]
mclasen has quit []
mclasen has joined #dri-devel
deathmist has joined #dri-devel
<karolherbst> the hell.. decl_var uniform INTERP_MODE_NONE uvec4x0a16B base_global_invocation_id (1, 16, 0)
<karolherbst> I am not sure if I should blame rust or something else on that uvec4x0a16B
<jekstrand> karolherbst: There are passes you may be running which turn vec3 into vec4
<karolherbst> ohh that's not what I mean. I am just mostly confused on how that type is written out
<alyssa> karolherbst: memory corruption? :-p
<karolherbst> I have no idea
<jekstrand> It's a uvec4 with an explicit alignment of 16B
<karolherbst> ahh
<alyssa> jekstrand: where does this notation come from
<jekstrand> And a stride of 0 because stride doesn't matter for vectors
<jekstrand> alyssa: Uh... I came up with it?
<karolherbst> jekstrand: I am do wondering though why microsoft lowest vec3 to vec4
<jekstrand> nir_print.c
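Going only by the description above (base type, then `x` and a stride, then `a` and an alignment in bytes, then `B`), the spelling can be picked apart with a small regex. This sketch is not derived from nir_print.c itself, and the function name and dictionary keys are made up for illustration.

```python
import re

# Pattern built only from the explanation in the chat:
# <type>x<stride>a<alignment>B
EXPLICIT_TYPE = re.compile(r"^(?P<base>\w+?)x(?P<stride>\d+)a(?P<align>\d+)B$")


def parse_explicit_type(text):
    """Split a printed type such as 'uvec4x0a16B' into its three parts."""
    match = EXPLICIT_TYPE.match(text)
    if match is None:
        raise ValueError(f"not in <type>x<stride>a<align>B form: {text!r}")
    return {
        "base": match["base"],
        "stride": int(match["stride"]),
        "align_bytes": int(match["align"]),
    }


print(parse_explicit_type("uvec4x0a16B"))
```

For `uvec4x0a16B` this yields the base type `uvec4`, a stride of 0, and an alignment of 16 bytes, matching the explanation given in the channel.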
<karolherbst> *lowers
<jekstrand> karolherbst: It gets rid of most struct holes so memcpy optimizations work.
<jekstrand> That's one reason, anyway
<karolherbst> ohh
<karolherbst> okay.. then I guess I can keep it anyway and still load internal vec3 uniforms
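The struct-hole argument can be illustrated with a small layout calculation. The sizes here are assumptions chosen for the example: a vec3 of 32-bit components taking 12 bytes with 16-byte alignment, and a vec4 taking 16 bytes with the same alignment. The `layout` helper is hypothetical and is not a NIR function.

```python
def layout(members):
    """Lay out (name, size, align) members in order.

    Returns (offsets, holes, total_size); each hole is (start, length) in bytes.
    """
    offset = 0
    offsets = {}
    holes = []
    max_align = 1
    for name, size, align in members:
        aligned = (offset + align - 1) // align * align
        if aligned > offset:
            holes.append((offset, aligned - offset))
        offsets[name] = aligned
        offset = aligned + size
        max_align = max(max_align, align)
    # Tail padding so that arrays of the struct keep every member aligned.
    total = (offset + max_align - 1) // max_align * max_align
    if total > offset:
        holes.append((offset, total - offset))
    return offsets, holes, total


# Assumed sizes: vec3 = 12 bytes, vec4 = 16 bytes, both 16-byte aligned.
vec3_offsets, vec3_holes, vec3_size = layout([("a", 12, 16), ("b", 12, 16)])
vec4_offsets, vec4_holes, vec4_size = layout([("a", 16, 16), ("b", 16, 16)])

print(vec3_holes, vec3_size)
print(vec4_holes, vec4_size)
```

Under these assumptions both structs occupy the same number of bytes, but only the vec4 version is hole-free, so it can be copied as one contiguous block without skipping padding.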
<deathmist> is there docs on running dEQP? I know it's setup for CI but I'd like to help out with FD540 and run it locally
<mattst88> deathmist: not really any docs, but this is how I run it: https://dpaste.org/jk1L
rkanwal has quit []
rkanwal has joined #dri-devel
<deathmist> mattst88: thanks, I'll test it after building a debug-optimized mesa
<mattst88> yw!
pnowack has quit [Quit: pnowack]
<HdkR> So there are currently two RISCV GPU projects in the world. LibreSoC and RV64X. I believe both projects have been in here before? Anyone have any information as to which one is active?
heat has quit [Read error: No route to host]
<airlied> HdkR: LibreSOC is no longer risc-v based
heat has joined #dri-devel
<bnieuwenhuizen> also don't forget llvmpipe
<HdkR> ah. That's good to know
<HdkR> Did LibreSoC move over to POWER or something?
<alyssa> hnnnn
<bnieuwenhuizen> IIRC yes
<alyssa> HdkR: are you trying to nerdsnipe me
<HdkR> Nah, I'm just trying to find some RISCV CPU designers to poke at :P
<HdkR> Prodding someone at SiFive may be useful as well
<bnieuwenhuizen> someone was also on mesa-dev@ asking about stuff while apparently working on RISC-V extensions for GPUs
<alyssa> if you all aren't trying to nerdsnipe me you're doing a terrible job
<bnieuwenhuizen> alyssa: what is there to be nerdsniped by?
<HdkR> alyssa: So you're saying you DO want to work on x86 emulation
<airlied> HdkR: pixilica is the name of one company involved
<bnieuwenhuizen> AFAICT these projects are all terrible at actually delivering a GPU, so no driver to write
<airlied> tbh the idea of a risc-v gpu is woefully bad
<HdkR> airlied: I love the name
<airlied> just write a GPU ISA, it's not like you need to standardise it
<airlied> you are going to get patent sued to oblivion anyways
<HdkR> haha
<karolherbst> jekstrand: oh wow... I am surprised how far I came without caring about alignment inside the input buffer at all...
<alyssa> bnieuwenhuizen: open hw gpu
<jekstrand> karolherbst: hehe
* airlied would love to have a super fast/massive FPGA, open source tooling, and 5 years of funding, and then 10 years in court
<jekstrand> zmike: Results for my dynamic rendering branch are the same as your sync2 branch that it's based on top of. I'd say that means it's good.
<zmike> jekstrand: weird
<zmike> but not the first time I've seen different systems/llvms give wildly different results
<airlied> jekstrand: what are the crashes?
<jekstrand> zmike: This is also on whatever CTS branch I'm on
<jekstrand> dEQP-VK.glsl.crash_test.divbyzero_comp,Crash
<zmike> yea that's known + fixed-ish
<jekstrand> dEQP-VK.spirv_assembly.instruction.spirv1p4.entrypoint.tess_con_pc_entry_point,Crash
<jekstrand> dEQP-VK.spirv_assembly.instruction.spirv1p4.entrypoint.tess_eval_pc_entry_point,Crash
<zmike> those are new
* jekstrand runs in gdb
<jekstrand> crashes inside LLVM jit code, of course
<jekstrand> :facepalm:
<zmike> the nightmare continues
<zmike> I guess one of us will have to check them out
<zmike> can you file tickets?
<jekstrand> I'm gonna go with not my fault. :D
lemonzest has quit [Quit: WeeChat 3.4]
<zmike> yea prob not
<HdkR> If you name the function in LLVM then it is nice enough to give you gdb symbols :P
ngcortes has joined #dri-devel
<zmike> thx
Haaninjo has quit [Quit: Ex-Chat]
<airlied> alyssa: you can file your lower to scalar dubiousness in an issue if you want :P
illwieckz has quit [Ping timeout: 480 seconds]
<Lynne> airlied: fixed all validation issues, only the dedicated allocation/single->multiplane image issue remains
<Lynne> IIRC nvidia didn't require dedicated allocation, so I'll see if it works tomorrow
<airlied> Lynne: oh we might not require dedicated alloc either actually
<airlied> though other vendors might so no harm in supporting it
illwieckz has joined #dri-devel
<Lynne> oh, right, requiresDedicatedAllocation is 0
Duke`` has quit [Ping timeout: 480 seconds]
mvlad has quit [Remote host closed the connection]
* karolherbst hates regressions
camus1 has joined #dri-devel
rkanwal has quit [Ping timeout: 480 seconds]
camus has quit [Ping timeout: 480 seconds]
<karolherbst> jekstrand: but if I compare how it was to support CL on top of nir in the beginning with the situation now I have to say that's a lot more pleasant today. It's nice when stuff just works (tm)
<karolherbst> "FAILED 1 of 73 tests." nooooo
<karolherbst> ehh kernel_preprocessor_macros .. *sigh*
<karolherbst> __OPENCL_VERSION__ and __OPENCL_C_VERSION__ are quite annoying
<karolherbst> sometimes clang doesn't define them and sometimes it does
<jekstrand> :(
<karolherbst> I think I can fix it by passing in -cl-std but I think that was breaking other stuff
rasterman has quit [Quit: Gettin' stinky!]
alyssa has quit [Quit: leaving]
<karolherbst> test_basic: PASSED 73 of 73 tests. \o/
<karolherbst> airlied: so now I really have to figure out that extern kernel stuff :D
<karolherbst> either by backporting that patch *ugh* or compiling and installing llvm-14 *uhhuhhhgggh*
<jekstrand> Those sound equally terrible
<karolherbst> depends on how terrible it would be to backport that one fix
<jekstrand> backporting also involves building LLVM
<karolherbst> not if it's inside the translator
<jekstrand> Oh, if it's just in the translator, that's not bad
<karolherbst> yeah.. should be fine
<karolherbst> jekstrand: wait.... clang just told me something
<karolherbst> "input.cl:1:1: error: OpenCL C version 1.0 does not support the 'extern' storage class specifier"
<karolherbst> :D
<karolherbst> I think I'm going to skip it then
<karolherbst> a bit annoying that most of those tests kind of assume CL 1.2
<jekstrand> yeah
<karolherbst> let's create an MR and see what happens
danvet has quit [Ping timeout: 480 seconds]
rasterman has joined #dri-devel
<karolherbst> what color does Rust have?
mbrost has quit [Ping timeout: 480 seconds]
<karolherbst> so... it's done
<karolherbst> soo... let's take a look at 1.1 shall we
<karolherbst> uhm.. how can I change the source branch for an MR?
<ajax> that can't be the intended sort order, right?
gouchi has quit [Remote host closed the connection]
<karolherbst> ahh cl_khr_fp64 is required for CL 1.2 nice
<karolherbst> but I think I'll start reporting 3.0 once that's enabled :)
<jekstrand> :)
<imirkin> ajax: going in order of performance...
fxkamd has quit []
<karolherbst> "Device Version OpenCL 3.0" :3
tobiasjakobi has joined #dri-devel
tobiasjakobi has quit []
<airlied> karolherbst: yeah backporting shouldn't be horrible
* karolherbst switching to 3.0 now
<airlied> karolherbst: though for CL3.0 there were also a lot of header file changes that I'm not sure are in llvm-13
<karolherbst> ahh..
<airlied> karolherbst: it might be an idea to avoid using the opencl-c.h in favour of the new thing
<airlied> but I think the new thing is only in pretty new llvm
<Lynne> airlied: just pulled from your radv-vulkan-video-prelim-decode branch, now everything segfaults, starting from radv_GetPhysicalDeviceVideoCapabilitiesKHR
<karolherbst> airlied: yeah.. but we can figure out those details later
<airlied> karolherbst: tstellar had some llvm nightly copr I think now for fedora
<karolherbst> airlied: but that's all handled inside clc anyway, no?
<airlied> Lynne: that seems suboptimal, let me check it here
<airlied> karolherbst: no opencl-c.h comes from clang
<karolherbst> sure, but clc handles including that
<karolherbst> I don't mean libclc
<karolherbst> I mean src/compiler/clc
<airlied> yes clc likely needs those fixes for 3.0 support
<airlied> Lynne: still passes cts tests here, I should clone your work and look
ahajda has quit [Quit: Going offline, see ya! (www.adiirc.com)]
<airlied> Lynne: I think the gstreamer folks hit the same issue with disjoint
<airlied> "This is because the Vulkan format output by the driver’s decoder device is VK_FORMAT_G8_B8R8_2PLANE_420_UNORM, which is NV12 crammed in a single image, while for GstVulkan a NV12 frame is a buffer with two images, one per component. "
<Lynne> hmm, I just remembered we have single-buffer specialcase for intel hardware
<airlied> yeah intel hw definitely can't support it at all
<graphitemaster> I'm having the strangest behavior here, glClearNamedFramebufferfv is being affected by a stale bound shader
<airlied> Lynne: one thing radv does very wrong at the moment is DPB allocation, hoping to try and make it spec compliant
<airlied> unfortunately it breaks the nvidia player if I do
<Lynne> break it all the way.
<Lynne> that mass of C++ object-orientated code deserves all it gets
<airlied> yeah it's a horror show to even hack the fixes into it
<Lynne> a quick ffmpeg command to test my branch: "./ffmpeg_g -init_hw_device "vulkan=vk:0,debug=1" -hwaccel vulkan -hwaccel_output_format vulkan -i test.mkv -frames:v 1 -loglevel verbose -f null -"
<Lynne> it should autodetect vulkan on ./configure
<Lynne> for the test video, anything remotely standard h264 should work
ybogdano has quit [Ping timeout: 480 seconds]
frankbinns has quit [Remote host closed the connection]
rasterman has quit [Quit: Gettin' stinky!]
<Lynne> pushed a change to make it always use contiguous memory for decoding
<Lynne> didn't keep the old repo around, so can't test -_-
rgallaispou1 has quit [Read error: Connection reset by peer]
<airlied> Lynne: you have to pass a VIDEO_DECODE_H264_CAPABILITIES_EXT in pNext to that function I think
<karolherbst> CL 3.0 conformance is looking good as well :)
<karolherbst> ~250 fails
rgallaispou has joined #dri-devel
<airlied> Lynne: though I better go read the spec to confirm
<airlied> but I think what I have passed CTS tests
<Lynne> aaah, that
<Lynne> the validation layer complained about it, so I removed it
<airlied> https://paste.centos.org/view/f412c718 seems correct from reading spec
<Lynne> "vkCreateImageView(): Format VK_FORMAT_G8_B8R8_2PLANE_420_UNORM requires a VkSamplerYcbcrConversion but one was not passed in the pNext chain"
<Lynne> the hell?
<Lynne> airlied: yeah, I pushed that a minute ago
heat has quit [Remote host closed the connection]
<karolherbst> airlied: you wanted me to implement clCloneKernel? :D
heat has joined #dri-devel
<airlied> Lynne: I don't think you need dec_caps anymore
<airlied> Lynne: I no longer see VideoDecodeCapabilities in the spec
<airlied> or maybe I need a newer pdf :)
<airlied> ah indeed I need to update the code again
ybogdano has joined #dri-devel
pcercuei has quit [Quit: dodo]