ChanServ changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://oftc.irclog.whitequark.org/panfrost - <macc24> i have been here before it was popular
soreau has quit [Read error: Connection reset by peer]
soreau has joined #panfrost
JulianGro has joined #panfrost
<jekstrand> alyssa: Fixed it!
* jekstrand watches the operator tests scroll by
<alyssa> Aaaaa!!!!
<alyssa> jekstrand: It's my birthday and Christmas all at once! :p
<jekstrand> alyssa: I'll post the MR as soon as I figure out which CTS tests to enable in CI
<alyssa> *nod*
<alyssa> I imagine your fix + alyssa/mesa:vk2 will pass most of deqp-vk.glsl.*
<jekstrand> :D
rkanwal has quit [Ping timeout: 480 seconds]
<alyssa> and then CTS results should be a lot more meaningful
<alyssa> how'd you solve it, btw?
<jekstrand> alyssa: There's a lot of operator tests. :-/
<alyssa> deqp-runner seems non-optional for VK :p
<jekstrand> Yeah
<alyssa> (the rust one)
<alyssa> on the flip side, if there are lots of operator tests then our pass/fail should be way up :)
<alyssa> woot!
<alyssa> GoGoGo'd-By: Alyssa Rosenzweig <alyssa@collabora.com>
<alyssa> :-p
<alyssa> (I have no idea if that fix is good but I can't argue with zomg passing tests)
<jekstrand> alyssa: There's some tight interaction between front-end and back-end going on there. Not sure if it's correct or not. Then again, if I'm not and you're not, who is?!?
Daanct12 has joined #panfrost
<alyssa> jekstrand: ...bbrezillon? ;)
alpernebbi has quit [Remote host closed the connection]
alpernebbi has joined #panfrost
* icecream95 may or may not be having to install packages for compiling Mesa again for some unspecified reason
<jekstrand> alyssa: We'll give him a chance to wake up and see the MR before I \assign marge.
<jekstrand> alyssa: I would like your ack on putting the attributes_read in the shader_info. I didn't know if that should be a panvk-specific thing or not.
<jekstrand> alyssa: Also, I'm a bit confused about panfrost GLES. Does gallium compact for you so you never have to deal with it?
icecream95 has quit [Remote host closed the connection]
icecream95 has joined #panfrost
Danct12 has quit [Ping timeout: 480 seconds]
jolan has quit [Quit: leaving]
alpernebbi has quit [Quit: No Ping reply in 180 seconds.]
jolan has joined #panfrost
alpernebbi has joined #panfrost
guillaume_g has joined #panfrost
bbrezill1 has quit []
bbrezillon has joined #panfrost
rasterman has joined #panfrost
pjakobsson_ has joined #panfrost
pjakobsson has quit [Ping timeout: 480 seconds]
Daanct12 has quit [Quit: Leaving]
tanty has quit []
tanty has joined #panfrost
rkanwal has joined #panfrost
icecream95 has quit [Ping timeout: 480 seconds]
<CounterPillow> how close is panvk to being functional?
<macc24> CounterPillow: i can try on my bifrost machine if i get wifi running... pcie passthrough is pain
<CounterPillow> I mean, I could try too on my bifrost, I'm just lazy :P
<CounterPillow> afaik panvk also needs some env variable to be set that says something along the lines of "yes I know this vulkan implementation is broken" for it to even be picked
<macc24> well then there's your answer
<CounterPillow> PAN_I_WANT_A_BROKEN_VULKAN_DRIVER
<CounterPillow> I'm just wondering how close it is to like, spinny vkcube
<alyssa> jekstrand: for gles, i have literally never thought about compaction
<alyssa> whatever gallium does seems to just work?
nlhowell has joined #panfrost
<bbrezillon> CounterPillow: vkcube should run (I say should because I didn't test it in a while)
<CounterPillow> ! so it's further along than I thought :D
<jekstrand> alyssa: That's what I figured
<bbrezillon> it's actually the only thing that works :P
<macc24> bbrezillon: am i now allowed to scream at y'all if other stuff doesn't run?
<bbrezillon> macc24: nope, but you're allowed to make other things work :P
<macc24> bbrezillon: so far i made a rk3399 handheld boot mainline linux... and it isn't pinephone pro
<macc24> i am nowhere near good enough with gpus to be poking panvk
nlhowell has quit [Ping timeout: 480 seconds]
<robmur01> "hey, the broken Vulkan driver I asked for is broken!" :P
<macc24> who would have expected that
<alyssa> surprised pikachu
<alyssa> jekstrand: Pass: 12526, Fail: 3005, Crash: 569, Skip: 78280, Missing: 10619, Flake: 1, Duration: 34:13, Remaining: 3:42:41
<alyssa> my vk2 (well vk3 now) branch with your fixes cherrypicked
<alyssa> so about 78% passing ... :)
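The "about 78%" figure seems to be passes divided by the tests that actually ran to a verdict (pass + fail + crash), with skipped and missing tests left out. A minimal Python sketch of that arithmetic, assuming that is the denominator in use; the function name `pass_rate` is made up for illustration:

```python
def pass_rate(passed: int, failed: int, crashed: int) -> float:
    """Fraction of executed tests that passed.

    Skipped and missing tests are deliberately excluded (assumption).
    """
    executed = passed + failed + crashed
    if executed == 0:
        raise ValueError("no tests executed")
    return passed / executed


# Counts from the partial run quoted in the log.
rate = pass_rate(passed=12526, failed=3005, crashed=569)
print(f"{rate:.1%}")
```

The same function can be applied to the later full-run counts to compare runs on equal footing.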
<alyssa> not as high as I might've hoped but still pretty good everything considered
<alyssa> I don't need the machine today, so I think I'll let it finish the CTS run. Curious for the actual number, heh
<alyssa> looks like we're failing a ton of stencil tests, will see what's up with that
<alyssa> though maybe getting through the crashes matters more because of runtime..?
<jekstrand> alyssa: \o/
nlhowell has joined #panfrost
nlhowell has quit [Ping timeout: 480 seconds]
WoC has quit [Remote host closed the connection]
WoC has joined #panfrost
WoC has quit [Remote host closed the connection]
<alyssa> bbrezillon: So I've thought more about the explicit layout code we have
<alyssa> I know it was originally added as "for Vulkan". After looking into the relevant VK requirements, I don't think this is appropriate.
<alyssa> Previously pan_image_layout_init did two things:
<alyssa> 1. Fill out a layout struct with its parameters
<alyssa> 2. Select strides and compute sizes for all the slices.
WoC has joined #panfrost
<alyssa> I recently split these up to simplify the API, now the caller fills out pan_image_layout and pan_image_layout_init just does step 2 -- computing the derived fields from the selected fields.
<alyssa> I.e. its role is to choose strides and sizes and such
<bbrezillon> sounds good
<alyssa> That makes sense for implicit layouts
<alyssa> How about explicit layouts?
<alyssa> Explicit layouts need to control /all/ of the otherwise derived fields.
<bbrezillon> well, we need a function that checks those fields are correct
<bbrezillon> when they're explicitly filled
<alyssa> Right. but that function is logically distinct from a function that decides those fields itself.
<alyssa> Further, there's no in between,
WoC has quit [Read error: Connection reset by peer]
<alyssa> either the caller cares about the layout (and needs to specify strides+offset for /every/ slice, Vulkan has provisions for this if the driver wants to support explicit + array)
<alyssa> or the caller doesn't care about the layout (and then the implicit layout selection is unconstrained and should do whatever is fastest)
<alyssa> So, I propose to split pan_image_layout_init into two functions: "select a layout, void return" and "use this explicit layout, return if valid".
<bbrezillon> makes sense
<alyssa> Internally those two might share some helpers for AFBC, but the external API is totally separate.
<alyssa> Once we go from there, the logical next step is to get rid of pan_explicit_layout altogether, and just have the caller put the layout it wants into pan_image_layout directly.
WoC has joined #panfrost
<alyssa> and make the latter function just "validate the chosen layout (and set internal derived fields, I guess, but those probably shouldn't exist)"
<alyssa> explicit_layout right now is a simple struct, but if we want to support more complex explicit layouts (which Vulkan specs out, optional for driver, no idea if we actually want this), it would grow to contain... pan_image_layout.
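The proposed split (one function that selects a layout and cannot fail, one that only validates a caller-supplied layout) could be sketched in Python as follows. This is a simplified, linear-only illustration, not the Panfrost code: the `Slice` and `ImageLayout` types, the function names, the absence of alignment rules, and the requirement that slices appear in increasing offset order are all assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Slice:
    offset: int      # byte offset of this mip level
    row_stride: int  # bytes between adjacent rows
    size: int        # bytes covered by this level


@dataclass
class ImageLayout:
    width: int
    height: int
    bytes_per_block: int
    slices: List[Slice] = field(default_factory=list)


def _level_dims(layout: ImageLayout, level: int):
    # Each mip level halves the dimensions, clamped to 1.
    return max(layout.width >> level, 1), max(layout.height >> level, 1)


def select_layout(layout: ImageLayout, levels: int) -> None:
    """Implicit path: the driver picks every stride and offset. Cannot fail."""
    layout.slices = []
    offset = 0
    for level in range(levels):
        w, h = _level_dims(layout, level)
        row_stride = w * layout.bytes_per_block
        size = row_stride * h
        layout.slices.append(Slice(offset, row_stride, size))
        offset += size


def validate_layout(layout: ImageLayout) -> bool:
    """Explicit path: the caller filled in every slice; only check it."""
    end = 0
    for level, s in enumerate(layout.slices):
        w, h = _level_dims(layout, level)
        if s.row_stride < w * layout.bytes_per_block:
            return False  # rows would overlap
        if s.size < s.row_stride * h:
            return False  # level does not fit in its declared size
        if s.offset < end:
            return False  # overlaps the previous level
        end = s.offset + s.size
    return True
```

Any layout produced by `select_layout` should pass `validate_layout`, while a caller-supplied layout is only ever checked, never modified.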
<alyssa> Tangent: there is a *lot* of confusion in the layout code between (line strides/row strides) and (bytes per pixel/bytes per block)
<bbrezillon> yeah, I know, would be much simpler if everything was linear :P
<alyssa> Hehe
<bbrezillon> (and much slower at the same time)
<alyssa> case in point, explicit_layout has a field labeled line_stride
<alyssa> ...but if you look carefully, the Vulkan spec actually wants row stride for that!
<bbrezillon> because I think that's what the API passes
<bbrezillon> really?
<alyssa> yeah, let me find the citation again
<bbrezillon> I thought everything was passed at line strides and we had to extrapolate the row stride somehow
<bbrezillon> (that's what I remembered from the DRM extension)
<alyssa> rowPitch describes the number of bytes between each row of texels in an image.
<alyssa> offset
<alyssa> For compressed formats, the rowPitch is the number of bytes between compressed texel blocks in adjacent rows
<alyssa> the fact is, "line stride" doesn't make sense for nonlinear images. "row stride" always makes sense.
<bbrezillon> sure
<alyssa> there are lots of places we compute/use/store a "line stride" that has /no actual meaning/
<alyssa> and the calculations happen to work out due to sufficient cancellation
<alyssa> It's a bit scary.
<alyssa> also:
<alyssa> "If the image is non-linear, then rowPitch, arrayPitch, and depthPitch have an implementation-dependent meaning."
<alyssa> this covers fun cases like "uncompressed format with u-interleaving"
<alyssa> RGBA8 UNORM with 16x16 u-interleaving, for example
<alyssa> There, the row stride should be "the number of bytes from a texel to the texel in the adjacent row"
<alyssa> (16 * 4 * width) in this case
<alyssa> That's a number with meaning for the image. It's also exactly what the hardware wants.
<alyssa> ...Though confusingly, the XML calls this "line stride" anyway...
WoC has quit [Remote host closed the connection]
<alyssa> at least in some places.
WoC has joined #panfrost
<alyssa> oh, no it doesn't. always row stride. good.
<alyssa> Anyway. In Panfrost right now we have (16 * 4 * width) as the row_stride for that image, but (4 * width) for the line_stride
<alyssa> (4 * width) is a number that makes no sense whatsoever for the image.
<alyssa> It's just that, (16 * 4 * width) * (height / 16) = (4 * width) * height
<alyssa> so we get the same overall size
<alyssa> but it's confusing and wrong and a recipe for bugs, if not in panfrost then in the panvk
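The cancellation described for RGBA8 UNORM with 16x16 u-interleaving can be checked numerically. A small Python sketch, assuming width and height are multiples of the tile size; the helper names are invented for illustration:

```python
TILE = 16            # 16x16 u-interleaved tiles
BYTES_PER_PIXEL = 4  # RGBA8 UNORM


def row_stride(width: int) -> int:
    """Bytes from one row of tiles to the next: (16 * 4 * width)."""
    return TILE * BYTES_PER_PIXEL * width


def line_stride(width: int) -> int:
    """The (4 * width) value, which has no physical meaning for a tiled image."""
    return BYTES_PER_PIXEL * width


def size_from_row_stride(width: int, height: int) -> int:
    # One stride step per row of tiles.
    return row_stride(width) * (height // TILE)


def size_from_line_stride(width: int, height: int) -> int:
    # One stride step per pixel line, as if the image were linear.
    return line_stride(width) * height


# Both routes give the same total size, which is why the bogus value "works".
assert size_from_row_stride(64, 32) == size_from_line_stride(64, 32)
```

The two totals agree only because the factor of 16 moves from the stride into the row count; any code that uses `line_stride` to address an individual texel in the tiled image would be wrong.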
WoC has quit [Remote host closed the connection]
WoC has joined #panfrost
<bbrezillon> alyssa: not sure who you're trying to convince, but I'm already convinced :P
<alyssa> Myself, mostly ;P
<alyssa> If I'm going to go changing APIs, I want to make sure I get them right this time around.
<bbrezillon> and yes, I made poor design decisions when working on panvk because I wanted to get the basic stuff working
<alyssa> and getting rid of line_stride would be an invasive change (ie. it would also touch lima, which seems to have inherited the quirks from my code)
<alyssa> sometimes I want to go to younger Alyssa and yell at her.
<bbrezillon> and getting this layout logic shared with panfrost allowed me to test on panfrost first, which was one less source of bug to worry about
<alyssa> Yes, sharing the layout logic was the right decision
<alyssa> The layout logic was just already busted on Panfrost.
<bbrezillon> and I guess I bent the logic to fit vk requirements, without really spending time on a better design
<bbrezillon> which you're doing right now, so that's great!
<alyssa> hey, there's no blame to go around :-)
<alyssa> bbrezillon: panvk_slice_layout is unused, should I delete?
<alyssa> plane_layout too
<alyssa> and plane_memory
<alyssa> and then PANVK_MAX_MIP_LEVELS
<alyssa> yeah going to guess that was all before sharing the layout structs, deleting
guillaume_g has quit []
<alyssa> [always-row-stride 4f31f3ea6d4] panvk: Remove unused layout structs 1 file changed, 43 deletions(-)
Danct12 has joined #panfrost
<bbrezillon> yep
<cphealy> Should https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16181 be tagged for stable releases line 22.0.x or is it not relevant there?
<alyssa> I assume icecream95 hit that with OpenCL or something.
<alyssa> so no
<cphealy> Is that because OpenCL is not supported with Mesa 22.0.x or some other reason?
<alyssa> yes
<cphealy> Ahh, yes, I got OpenCL and GLES 3.1 compute shaders confused. This change doesn't matter for GLES 3.1 compute shaders?
* alyssa tries to get rid of the third slightly different broken stride
<alyssa> inside slice layout
<cphealy> Also, is the state of OpenCL support with Panfrost written down anywhere? I'm curious what the current state is (especially with G52.)
<alyssa> jekstrand: Oh look what decided to finish
<alyssa> Pass: 93382, Fail: 21314, Crash: 4382, Warn: 6, Skip: 584971, Timeout: 3, Missing: 84042, Flake: 23, Duration: 4:25:21, Remaining: 0
<alyssa> So yeah. 78% pass.
<macc24> alyssa: what's that?
<alyssa> produced 2GB of output/, oof
<alyssa> macc24: panvk cts
<macc24> alyssa: :O
<alyssa> macc24: direct all :O to bbrezillon :p
<macc24> bbrezillon: :O
<macc24> alyssa: on midgard, bifrost or valhall?
<alyssa> bifrost
<alyssa> Valhall + PanVK will come when things are a bit further along
<macc24> will panvk ever support midgard?
<alyssa> TBD
<alyssa> It's looking less likely, tbh.
<macc24> :|
<macc24> ok
<alyssa> It'd be a ton of extra work for.. what
<alyssa> what VK content is anyone realistically going to run on rk3399
<alyssa> and no, Zink/ANGLE don't count
<macc24> well, ppsspp
<alyssa> has a gl renderer
<alyssa> no?
<macc24> yeah
<macc24> tho i hear vulkan's faster on it
<macc24> but ok
<alyssa> midgard is like haswell.
<macc24> i don't get it
<alyssa> ^^ might be interesting
<alyssa> actually -EBIG, taking down
<macc24> 2.7mb too big?
<alyssa> macc24: don't want crawlers finding it
<macc24> oh ok
<alyssa> file available on request.
<alyssa> bit less than a third are image_clearing tests
<alyssa> which.. really should be easy...
<alyssa> image atomics look broken, that's probably my fault
<alyssa> and some texturing
<alyssa> missing texturesize/texturequerylevels, that's just some plumbing that I won't know how to do :p
<alyssa> depth/stencil sampling is broken
<alyssa> looks like a LOT of tests would be fixed from Z/S
<alyssa> bunch of texture filtering tests in there
<alyssa> texture swizzling, looks like
<alyssa> alright. 25,700 failures but not a lot of distinct bugs.
<HdkR> macc24: You need to look at x86 Vulkan games
<HdkR> Need to support those on Midgard :P
<macc24> HdkR: with only 4 gigs of ram?
* macc24 is talking about games on anbernic rg552
<HdkR> I'm sure there are some games that can run in that low amount of memory
<macc24> tho i /think/ most other rk3399 boards are 4gb too
<macc24> oh
<macc24> TERRARIA
<macc24> usecase #2 for panvk on midgard
<macc24> and other xna games
<HdkR> But I'm also a realist and I don't really see Vulkan on Midgard to be a sane choice
<alyssa> HdkR: I appreciate you saying as much.
<alyssa> How about Vulkan on Bifrost?
<alyssa> Valhall?
<HdkR> Valhall is the most sane choice
<HdkR> Bifrost...eeeh, take it or leave it
<macc24> or bifrost since valhall is currently unobtainium aside chromebooks and android phones
<jekstrand> alyssa: \o/
<jekstrand> alyssa: Want to throw me a fail to look at?
<alyssa> haven't started looking at the vk fails yet
<alyssa> just got v1 of the yak shave over the wall
<alyssa> down from 4 stride fields to 2 :p
<alyssa> jekstrand: If I could pick your VK brain about 1 thing right now -- how should textureSize work in panvk?
<alyssa> and SSBOs more generally?
<alyssa> They have the same problem -- the way it's handled right now (with the special sysval UBO with mostly direct indexing) is super GL centric
<alyssa> The SSBO code we have now has a big "this is broken" on it
<alyssa> and indeed some of the fails are due to indirect indexing of SSBOs not working
<alyssa> I could commit harder to the GLisms and make indirect SSBO indexing work with the sysval model
<alyssa> and route the texture size sysval etc
<alyssa> but I suspect we.. don't want to do that?
<alyssa> and the whole sysval model in Panfrost needs an overhaul for proper VK?
<jekstrand> Yes
<jekstrand> Descriptors need an overhaul
<jekstrand> You want a UBO per descriptor set that contains ptr/size for SSBOs, whatever extra image stuff for textures/images, etc.
<jekstrand> Then lower descriptor access to an offset into the UBO, load_ubo, and then do whatever with the result.
<alyssa> Sounds complicated :X
<jekstrand> For things like textures, those will have to be offsets into a table as per GL
<jekstrand> No more complicated than the sysval stuff
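The scheme jekstrand outlines (one UBO per descriptor set holding pointer/size records for SSBOs, with descriptor access lowered to an offset into that UBO followed by a load) might look roughly like this in Python. The 16-byte record layout and all function names are assumptions for illustration, not the panvk implementation:

```python
import struct

# Assumed record layout: 64-bit address, 32-bit size, 32-bit padding.
SSBO_DESC = struct.Struct("<QII")


def build_set_ubo(ssbos):
    """Pack one (address, size) record per SSBO binding into the set's UBO."""
    return b"".join(SSBO_DESC.pack(addr, size, 0) for addr, size in ssbos)


def descriptor_offset(index: int) -> int:
    """Lowered descriptor access: binding index -> byte offset in the UBO.

    The index may be computed at run time, so indirect indexing is just
    a dynamic offset.
    """
    return index * SSBO_DESC.size


def load_ssbo_descriptor(ubo: bytes, index: int):
    """Stand-in for load_ubo: fetch the record and return (address, size)."""
    addr, size, _pad = SSBO_DESC.unpack_from(ubo, descriptor_offset(index))
    return addr, size
```

With this shape, a buffer-size query reads the `size` word and a load or store uses the `address` word, without a separate sysval per binding.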
<alyssa> in theory for Valhall, we can load from the resource table directly
<alyssa> i.e. implement textureSize by loading the texture descriptor in the shader and decoding it
<jekstrand> Sure
<alyssa> OTOH, that's not portable (need to update the lowering each time the descriptor is shuffled), and will be slower in general than the UBO approach
<jekstrand> And for valhall we probably want to use the magic texture descriptor thing instead of having a UBO descriptor
<jekstrand> Or we can have a UBO descriptor
<jekstrand> Both work
<alyssa> (decoding isn't free. eg for texture size, the width/height are stored minus 1)
<jekstrand> sure but meh
<alyssa> what I mean is we may as well reuse the bifrost path on valhall
<jekstrand> Sure
* jekstrand is tempted to start writing new panvk descriptor code
<alyssa> I guess that's more VK than Mali, eh? :)
<jekstrand> Yeah
<jekstrand> Do I really want to?
<jekstrand> This is gonna be a big rewrite...
<jekstrand> Maybe I can make it somewhat incremental
* jekstrand creates a branch. Here goes!
<alyssa> Good luck :o
<alyssa> these 1 instruction blit shaders are so cute
<alyssa> Oh I see what's wrong with this one
<alyssa> This looks like another VK thing I won't grok
<alyssa> instr->sampler_index == 1
<alyssa> but I don't see sampler1 filled out
<alyssa> wonder if this is like the operators issue
<alyssa> going to guess this is another case of panvk's descriptor set code is wrong
* alyssa adds it to the stack
<alyssa> Simplest reproducer: dEQP-VK.texture.swizzle.component_mapping.r8_unorm_2d_pot_rgba
<alyssa> Notice the shader:
<alyssa> vec4 32 ssa_3 = (float32)tex ssa_2 (coord), 0 (texture), 1 (sampler)
<alyssa> Yet it only binds sampler #0, not sampler 1 (which is null)
<alyssa> So then the hardware (correctly) detects the out-of-bounds condition on the sampler and reads back 0's.
<alyssa> Mmmmmm
<alyssa> Copy and paste anv code gets 264 tests from Crash->Pass
<alyssa> I like that! :p
<jekstrand> What'd you copy+paste?
<jekstrand> that's gonna conflict with descriptor set reworking...
<alyssa> won't push then :)
rasterman has quit [Quit: Gettin' stinky!]
rasterman has joined #panfrost
<jekstrand> Soo many bugs to fix all at one go
<jekstrand> alyssa: So... panfrost UBOs. Are they a bound thing or are they base64+offset/size?
<alyssa> jekstrand: Bound, if I understand the question right
<alyssa> "Uniform Buffer" structure
<jekstrand> Ok
<alyssa> Draw Call Descriptor points to an array of Uniform Buffers
<jekstrand> But SSBOs are always 64bit address, right?
<alyssa> Each Uniform buffer is a base address + size
<alyssa> There's no SSBO hardware in Bifrost. We lower to a 64bit address for GLES.
<jekstrand> Ok
<alyssa> Ideally PanVK can use lower_explicit_io for that and not lower_ssbo...?
<alyssa> oof, just caught a serious bug
<alyssa> this should help the numbers.
<alyssa> oki. right. grumble.
<jekstrand> alyssa: What's the maximum number of UBOs?
<jekstrand> Or is it UINT32_MAX or something silly like that?
<alyssa> 255
<jekstrand> Ok, 255 means I can use 8 bits \o/
<alyssa> Yep
<alyssa> ...Does Vulkan permit draws with no shader?
<alyssa> no fragment shader, I mean
<alyssa> running purely for depth/stencil side effects
<jekstrand> Yes
<alyssa> Delight.
* alyssa fixes pipeline_builder_init_shaders then, because it's totally broken for that
<jekstrand> alyssa: Is there a NIR intrinsic for loading a pan system value?
icecream95 has joined #panfrost
<alyssa> jekstrand: ...which one?
<alyssa> system_value("sample_positions_pan", 1, bit_sizes=[64])
<alyssa> load("sampler_lod_parameters_pan", [1], flags=[CAN_ELIMINATE, CAN_REORDER])
<alyssa> ...
<alyssa> this is what I meant by "the GLES sysval handling is very GLES"
<jekstrand> alyssa: I need a new sysval; I guess I get to add an intrinsic
<jekstrand> alyssa: Can sysvals be dynamic? i.e. a ssbo_size with a dynamic index?
<icecream95> jekstrand: sysvals are uploaded in a UBO, which should support dynamic indices, though access will be slower than directly accessed sysvals which can be pushed
alpernebbi has quit [Read error: Connection reset by peer]
<alyssa> ^^
rasterman has quit [Quit: Gettin' stinky!]
<alyssa> That's not wired up right now, though
<alyssa> ...Yes, that means `textureSize(textures[index])` is broken :|
alpernebbi has joined #panfrost
<alyssa> XXX: Invalid field of Blend unpacked at word 1
<alyssa> *blink*
<icecream95> Rather than create a bi_function to hold the blocks, I think using a new compiler context for each function would be easier...
<alyssa> quite possibly...
<alyssa> less churn, at least.
<icecream95> It does require pausing compilation halfway through, but I already wrote patches to do that, to push more efficiently for IDVS shaders
<alyssa> ...Sorry?
<icecream95> Uniforms used in both position and varying shaders currently get pushed twice, unless something has changed there
<icecream95> (So the solution is to stop compilation of the position shader after selecting UBO offsets to push, and continue once the varying shader has compiled to that point)
<jekstrand> alyssa: :(
<alyssa> icecream95: I don't think that's the right solution
<alyssa> My intent was:
<alyssa> 1. Position shaders are smaller and run more often than varying shaders.
<alyssa> 2. Therefore the strategy of "first push everything for position, then everything for varying" should be 'good' enough most of the time.
<alyssa> 3. Therefore we can compile the position shader in full, and then compile the varying shader in full (with extra constraints on the pushing for the varying shader).
<alyssa> 4. The varying shader knows what the position shader already pushed, so it doesn't have to push those itself.
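The four-step strategy above can be sketched in Python. This is a toy model assuming a single push budget shared between the position and varying shaders (the Bifrost case) and ignoring FAU reordering; `plan_pushes` and its parameters are hypothetical names:

```python
def plan_pushes(position_uniforms, varying_uniforms, budget):
    """Return (position_push, varying_push) as ordered lists of uniform offsets.

    The position shader is handled first and may claim the whole budget.
    The varying shader then only pushes what the position shader did not.
    """
    position_push = []
    for u in position_uniforms:
        if len(position_push) < budget and u not in position_push:
            position_push.append(u)

    remaining = budget - len(position_push)
    already_pushed = set(position_push)

    varying_push = []
    for u in varying_uniforms:
        if u in already_pushed or u in varying_push:
            continue  # step 4: reuse what the position shader pushed
        if remaining == 0:
            break     # budget exhausted, the rest stays as UBO loads
        varying_push.append(u)
        remaining -= 1

    return position_push, varying_push
```

For example, `plan_pushes([0, 4, 8], [4, 12, 16], budget=4)` lets the varying shader reuse offset 4 and push only offset 12 before the budget runs out.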
<alyssa> The last point is complicated by the FAU reordering. This is probably a "meh?"
<alyssa> Valhall configures FAU separately for position and varying shaders, so the fighting is exclusive to Bifrost.
<alyssa> And FAU is large enough that it's usually ok.
<alyssa> I think the reordering stuff broke #4, or maybe that was never done.
<icecream95> "Valhall configures FAU separately". Maybe I was reading the dump wrong, but I'm unsure if v10 still does that
<alyssa> At any rate, that's a simple logical change internal to the UBO push logic. No global changes.
<alyssa> I haven't looked at v10 yet.
<alyssa> I also don't remember if the (v9) DDK takes advantage of the separate FAU ... Mesa doesn't yet, for Bifrost b/w compat, I'd be unsurprised if the same went for the DDK
<icecream95> alyssa: Just wondering.. have you hooked up preamble shader decoding for v9 yet?
<alyssa> No, v9 doesn't support preamble shaders.
<icecream95> it does!
<alyssa> [citation needed]
* icecream95 builds Piglit for use with the blob again
<alyssa> Pass (Result image matches reference)
<alyssa> ha-Ha!
<jekstrand> alyssa: I've copied+pasted the SSBO code from ANV. :)
<jekstrand> And got it mostly building in panvk
<jekstrand> Not tested yet, of course. :)
<alyssa> :)
<jekstrand> Still need to figure out how I want to do dynamic offsets
<jekstrand> I think I want to use sysvals but I need to make dynamic sysvals a thing if I'm going to do that
<jekstrand> That, or load them all and bcsel. That seems suboptimal, though.
<alyssa> Should definitely lower to a dynamic UBO load...
<jekstrand> Yeah
<alyssa> I assume we want to enable the UBO->push pass at some point
<alyssa> (Given your comments on the khr gitlab about regretting API push constants ...)
<jekstrand> The bigger problem is that the way sysvals currently work, they're populated on-demand, which makes it hard to ensure that all the dynamic offset ones (or TXS for that matter) are consecutive so we can do a dynamic UBO load.
<alyssa> "populated on-demand"?
<alyssa> as in, the driver walks the sysval list?
<alyssa> yeah. suss.
<icecream95> Support specifying an array of SSBO offsets etc. as a sysval?
<jekstrand> Yeah, that's kinda what we want
<jekstrand> The problem is that you pretty much have to pre-walk the IR and find all the sysvals first, then sort or something.
<alyssa> dEQP-VK.pipeline.stencil.* passes with my branch
<jekstrand> Or you can make the UBO big enough to contain them all and only flag the ones you use as needing to be uploaded.
<icecream95> Currently we just do PAN_SYSVAL(type, no) (((no) << 16) | PAN_SYSVAL_##type), but it could be PAN_SYSVAL(type, no, count)
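The two-argument encoding icecream95 quotes packs the index into the upper bits and the type into the lower 16. A Python sketch of that packing and the matching decode; the numeric type IDs here are placeholders, not the real Panfrost enum values, and the decode assumes type IDs fit in 16 bits:

```python
# Placeholder type IDs for illustration only (assumption).
EXAMPLE_TYPE_TEXTURE_SIZE = 3
EXAMPLE_TYPE_SSBO = 4


def pan_sysval(type_id: int, no: int) -> int:
    """Mirror of ((no) << 16) | PAN_SYSVAL_##type."""
    return (no << 16) | type_id


def pan_sysval_type(sysval: int) -> int:
    """Recover the type from the low 16 bits."""
    return sysval & 0xFFFF


def pan_sysval_id(sysval: int) -> int:
    """Recover the index from the bits above the type."""
    return sysval >> 16
```

A three-argument variant with a count would need some of these bits reserved for the count; how they would be divided is not stated in the discussion.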
<jekstrand> Yes, if you're ok with always uploading all of them all the time
<jekstrand> which maybe is ok
<jekstrand> Anyway, I'm not solving that problem tonight.
<jekstrand> alyssa: Yes, we want the push path. Badly.
<alyssa> jekstrand: the "compiler decides push" path as opposed to API push path, I mean
<alyssa> OTOH I'm not sure how that path would work in Vulkan
<alyssa> Compute kernel writes to SSBO, SSBO is bound as a UBO, shader reads from the UBO
<alyssa> I think it's legal in VK
<alyssa> in GL, we just stall if you do that
<alyssa> VK.. I guess needs a compute kernel to do the memcpy?
<alyssa> what part of VK was supposed to be faster for apps again
<icecream95> sigh why isn't my ASLR-resistant diff working any more?
<icecream95> ("The address space is not random enough?")
<alyssa> OK, with the no-shader fixes, we should be down another 6000 fails :-)
<icecream95> Hmm.. maybe I was mistaken about preambles then?
<alyssa> Won't be around the rest of the week, notes to self:
<alyssa> * bunch of MRs to land
<alyssa> * icecream95 MR review
<alyssa> * bifrost worklist land
<alyssa> * internal admin tasks
<alyssa> * finish up XFB
<alyssa> * separate shader lowering
<alyssa> * es3.1 cts on g57
<alyssa> looks like I have my May cut out for me, eh.
<alyssa> jekstrand: Pass: 9805, Fail: 1489, Crash: 402, Skip: 57158, Missing: 7645, Flake: 1, Duration: 24:24, Remaining: 3:47:02
<alyssa> Up to 84% with the no shader fix, that's looking a bit better ;)
<icecream95> Hmm.. so doing sin() on a uniform in a "varying" shader seems to produce a varying shader which isn't decoded.. maybe that is a preamble shader then?
<alyssa> and descriptor set rewrite should fix a big chunk of the remaining
alyssa has left #panfrost [#panfrost]
<jekstrand> alarumbe: \o/
jernej has quit [Ping timeout: 480 seconds]
rkanwal has quit [Ping timeout: 480 seconds]