ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
khfeng has joined #dri-devel
columbarius has joined #dri-devel
co1umbarius has quit [Ping timeout: 480 seconds]
deathmist has joined #dri-devel
mhenning has quit [Quit: mhenning]
alyssa has quit [Quit: leaving]
deathmist has quit [Remote host closed the connection]
deathmist has joined #dri-devel
camus1 has quit [Ping timeout: 480 seconds]
camus has joined #dri-devel
adjtm has quit [Ping timeout: 480 seconds]
khfeng has quit [Remote host closed the connection]
Danct12 has quit [Remote host closed the connection]
Danct12 has joined #dri-devel
heat has quit [Ping timeout: 480 seconds]
pallavim has joined #dri-devel
h0tc0d3 has quit [Remote host closed the connection]
h0tc0d3 has joined #dri-devel
shankaru has joined #dri-devel
aravind has joined #dri-devel
K`den has joined #dri-devel
Kayden has quit [Read error: Connection reset by peer]
Kayden has joined #dri-devel
K`den has quit [Read error: Connection reset by peer]
lemonzest has joined #dri-devel
itoral has joined #dri-devel
OftenTimeConsuming has quit [Remote host closed the connection]
OftenTimeConsuming has joined #dri-devel
itoral_ has joined #dri-devel
itoral has quit [Ping timeout: 480 seconds]
OftenTimeConsuming has quit [Remote host closed the connection]
OftenTimeConsuming has joined #dri-devel
mbrost has joined #dri-devel
mbrost has quit [Remote host closed the connection]
OftenTimeConsuming has quit [Remote host closed the connection]
OftenTimeConsuming has joined #dri-devel
itoral_ has quit [Remote host closed the connection]
HankB has quit [Remote host closed the connection]
HankB has joined #dri-devel
itoral has joined #dri-devel
Duke`` has joined #dri-devel
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
mvlad has joined #dri-devel
adjtm has joined #dri-devel
maxzor has quit [Remote host closed the connection]
maxzor has joined #dri-devel
neonking has quit []
neonking has joined #dri-devel
mbrost has joined #dri-devel
mbrost has quit [Remote host closed the connection]
sadlerap1 has joined #dri-devel
sadlerap has quit [Ping timeout: 480 seconds]
slattann has joined #dri-devel
<slattann> Test Mesg
dj-death has joined #dri-devel
dj-death has quit [Remote host closed the connection]
dj-death has joined #dri-devel
tarceri_ has quit [Ping timeout: 480 seconds]
shankaru has quit [Quit: Leaving.]
tarceri_ has joined #dri-devel
samuelig has quit [Quit: Bye!]
gouchi has joined #dri-devel
shankaru has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
itoral has quit [Remote host closed the connection]
Guest2190 has joined #dri-devel
itoral has joined #dri-devel
itoral has quit [Remote host closed the connection]
Daanct12 has joined #dri-devel
Haaninjo has joined #dri-devel
icecream95 has quit [Ping timeout: 480 seconds]
maxzor has quit [Ping timeout: 480 seconds]
adithya has joined #dri-devel
MajorBiscuit has joined #dri-devel
tanty has quit []
tanty has joined #dri-devel
flacks has quit [Quit: Quitter]
MajorBiscuit has quit [Quit: WeeChat 3.4]
flacks has joined #dri-devel
MajorBiscuit has joined #dri-devel
Guest2190 has quit []
pallavim has quit [Ping timeout: 480 seconds]
deathmist1 has joined #dri-devel
deathmist has quit [Ping timeout: 480 seconds]
sadlerap1 has quit []
sadlerap has joined #dri-devel
h0tc0d3 has quit [Remote host closed the connection]
h0tc0d3 has joined #dri-devel
pcercuei has joined #dri-devel
MajorBiscuit has quit [Ping timeout: 480 seconds]
shankaru has quit [Quit: Leaving.]
MajorBiscuit has joined #dri-devel
adithya has quit []
<FLHerne> slattann: belated ack
<karolherbst> jekstrand: with kernel caching things are _sooo_ fast :O test_basic: 2m15s -> 35s
<karolherbst> some tests even finish in like 0.1s
Duke`` has quit [Ping timeout: 480 seconds]
MajorBiscuit has quit [Ping timeout: 480 seconds]
Emmy_ has quit [Ping timeout: 480 seconds]
itoral has joined #dri-devel
mal has quit [Quit: leaving]
alyssa has joined #dri-devel
Duke`` has joined #dri-devel
Emmy_ has joined #dri-devel
heat has joined #dri-devel
HankB has quit [Remote host closed the connection]
HankB has joined #dri-devel
Daanct12 has quit [Quit: Leaving]
pcercuei has quit [Ping timeout: 480 seconds]
deathmist1 has quit [Remote host closed the connection]
deathmist1 has joined #dri-devel
itoral has quit [Remote host closed the connection]
jewins has joined #dri-devel
pcercuei has joined #dri-devel
<jekstrand> karolherbst: :D
<jekstrand> karolherbst: Yeah, there's a lot of duplicate kernels
<karolherbst> jekstrand: you know what I think happens? I submit stuff too quickly
<jekstrand> karolherbst: That's possible, I guess.
<jekstrand> karolherbst: Have you pulled in either my or pzanoni locking patches?
<karolherbst> I did
<karolherbst> but it's the kernel which is crashing
<jekstrand> Hrm...
<jekstrand> !?!
<karolherbst> yeah
<karolherbst> mem corruption
<jekstrand> GPU mem corruption?
<karolherbst> no
<karolherbst> CPU
mal has joined #dri-devel
<karolherbst> "list_del corruption."
<karolherbst> I do run 24 jobs in parallel though
<jekstrand> karolherbst: Oof...
<karolherbst> yeah...
<jekstrand> karolherbst: That looks like the VMA cache breaking. Fun!
<karolherbst> doesn't happen with kasan :(
<jekstrand> danvet: ^^
<jekstrand> karolherbst: You're opening and closing contexts too fast
<karolherbst> guess so
<jekstrand> Of course, i915 isn't actually locking there because RCU will save us!
<karolherbst> :(
<karolherbst> I just wanted to benchmark the speed of the CTS runs :(
<karolherbst> (though it takes like ~2.5 minutes here in case it doesn't crash)
<karolherbst> btw.. is there a fix I can try?
<jekstrand> No
<karolherbst> damn
<jekstrand> Not an easy one
<jekstrand> I could type one but ugh, this is thorny
<jekstrand> I mean, I may have a patch somewhere that nukes the vma cache...
<jekstrand> But I likely don't have it anymore. It was on my Intel machine and I didn't save off all my kernel branches.
<karolherbst> mhhh
<karolherbst> wondering if I could hit this with the official stack...
<jekstrand> maybe? Depends on how they use contexts.
<jekstrand> It'd make for an interesting IGT test.
<karolherbst> not sure if it really has an impact on how they use context if I still throw 24 processes at the same time at it
<karolherbst> let's see if 20 runs reliably enough
<karolherbst> nope..
<jekstrand> It's a race between context close and execbuffer
<karolherbst> so if there is still work in flight/submitted, but I close the context before that?
<karolherbst> or well.. at the same time
<jekstrand> work in flight is fine. It's the ioctls racing
<karolherbst> ehh
<karolherbst> so yeah.. it happens with intel's stack as well :)
<jekstrand> Well, context close is deferred so it may not just be ioctls
<karolherbst> different, but my machine is frozen
<jekstrand> Oh, sure it's a kernel bug and, theoretically, anything can trigger it.
<jekstrand> You can probably trigger it with GL or Vulkan if you try just right
<karolherbst> I'm just wondering if I should file a bug against the intel CL stack :D
<karolherbst> although I guess intel already investigates this?
<karolherbst> or well.. working on a fix or something
<jekstrand> File a bug against i915.
<karolherbst> my CPU makes weird noises :)
<karolherbst> mhh.. with intel's stack it doesn't actually crash, it just stopped doing anything. I don't even get stacktraces or anything
<karolherbst> oh well
Duke`` has quit [Ping timeout: 480 seconds]
<jekstrand> karolherbst: Optimizing clc takes 14s on Arm. :-/ We need shader cache if we're going to do that...
<karolherbst> yeah...
<karolherbst> but it's optional
<karolherbst> atm
<jekstrand> Yup
<karolherbst> my wish is though to create the nir binary when compiling mesa, but...
<karolherbst> I am wondering if we could make things in a way, that we don't use anything from nir_options
<jekstrand> That's tricky
<karolherbst> jekstrand: maybe we could also lazy init it. Like checking if it's needed at all and skip loading it until we hit a kernel that actually needs it
<karolherbst> even on x86 I have to wait like 2 seconds until clinfo finishes the first time :(
<jekstrand> karolherbst: Unfortunately, with the way clc works now, that's a chicken-and-egg problem because we need the libclc shader in spirv_to_nir right now.
<karolherbst> we do?
<alyssa> "my wish is though to create the nir binary when compiling mesa" ... would be nice, wouldn't it.
<jekstrand> karolherbst: Yeah... But I think it's only so we can look things up.
<karolherbst> since when?
<jekstrand> Maybe we can have a spirv_to_nir mode which generates a prototype NIR with no actual function impls.
<jekstrand> karolherbst: spirv_to_nir_options::clc_shader
<karolherbst> ohhh, right...
<jekstrand> karolherbst: Yeah, for that it looks like all we need is prototypes.
<jekstrand> We can probably have a super-fast light-weight spirv_to_nir pass which just gathers those and doesn't actually parse the whole thing.
<karolherbst> potentially
<karolherbst> I could check what's actually expensive here
<jekstrand> I do think it's good to have a "header" shader at least, if for nothing other than verification.
<karolherbst> maybe it doesn't matter
<jekstrand> Yeah, it's possible a lot of the 3s spent on my Arm board is just I/O loading all that SPIR-V. :-/
<jekstrand> I kind-of doubt it but I'm sure that's non-trivial
<karolherbst> let me profile it here
<jekstrand> But, also, disk cache just solves this problem so I'm kind-of inclined to not care
<jekstrand> I can wire up caching in panfrost if needed.
<alyssa> r-b
<karolherbst> yeah... it's just first start up which is annoying
<karolherbst> jekstrand: yeah.. you don't even have to cache panfrost's internal shaders :D
<karolherbst> jekstrand: also keep in mind that debug builds are super slow here :P
<karolherbst> debug: 2s -> 0.1s
<karolherbst> release: 0.45s -> 0.05s
<karolherbst> anyway.. 82% is nir_load_libclc_shader in the cold cache case
<karolherbst> and spirv_to_nir is just 18% in total
<karolherbst> jekstrand: I think it's better to spend time on improving runtime of opt passes instead :p
<karolherbst> lower_vars_to_ssa uses like 20%
<karolherbst> in the hot cache case load_libclc_shader drops down to 27% with my patch
<karolherbst> mhh
<karolherbst> I might want to skip calling nir_sweep in case it's a cached thing I loaded
<karolherbst> nir_sweep is expensive...
<karolherbst> 5% of the time is spent on just initing libLLVM :(
shankaru has joined #dri-devel
<karolherbst> jekstrand: when I started: Kernels compilation time: 1800ms, now: "Kernels compilation time: 48ms" :) that's the luxmark thing after caching
<karolherbst> but there is still a huge problem.. delayed compilation of things (like backend compiling stuff when creating the cso) does get included in benchmark scores
<karolherbst> so I am wondering if I could create the cso before enqueueing the kernel, but that might get annoying
mdroper has joined #dri-devel
rkanwal has joined #dri-devel
ella-0 has joined #dri-devel
fxkamd has joined #dri-devel
ella-0_ has quit [Read error: Connection reset by peer]
<jekstrand> karolherbst: We should sweep at the end of optimization but no need to sweep before, probably.
<jekstrand> For CLC that is
bbrezillon has quit [Ping timeout: 480 seconds]
<karolherbst> yeah.. atm I do it after loading
<karolherbst> jekstrand: what about nir_split_var_copies btw? Should I add it as well?
<jekstrand> karolherbst: If you're going to lower copies, yes, it's required.
<karolherbst> okay
<jekstrand> I don't think copy lowering is guaranteed to work without copy splitting
<jekstrand> But copy splitting only has to be run once, not in the loop.
<karolherbst> ahh
<jekstrand> Actually.....
<jekstrand> Just run splitting, not lowering.
<jekstrand> splitting is required for correctness for a few other things. vars_to_ssa will lower copies on-demand if it needs them.
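[Editor's sketch] A toy illustration of the ordering jekstrand describes above: run the splitting pass once, then iterate the remaining passes until none reports progress. The pass functions and the list-of-strings IR below are hypothetical stand-ins chosen for brevity; they are not NIR's real passes or API.

```python
def split_var_copies(ir):
    """Toy stand-in: split each whole-struct copy into per-member copies."""
    out = []
    for op in ir:
        if op == "copy_struct":
            out += ["copy_member", "copy_member"]
        else:
            out.append(op)
    progress = out != ir
    ir[:] = out
    return progress


def fold_add_zero(ir):
    """Toy stand-in: rewrite 'add_zero' into a plain 'mov'."""
    progress = False
    for i, op in enumerate(ir):
        if op == "add_zero":
            ir[i] = "mov"
            progress = True
    return progress


def remove_movs(ir):
    """Toy stand-in: drop 'mov' ops, as if they had been copy-propagated."""
    out = [op for op in ir if op != "mov"]
    progress = len(out) != len(ir)
    ir[:] = out
    return progress


def optimize(ir, once, looped):
    """Run `once` passes a single time, then loop `looped` to a fixed point."""
    for run_pass in once:
        run_pass(ir)
    iterations = 0
    progress = True
    while progress:
        progress = False
        for run_pass in looped:
            if run_pass(ir):
                progress = True
        iterations += 1
    return iterations


ir = ["copy_struct", "add_zero", "store"]
iterations = optimize(ir, [split_var_copies], [remove_movs, fold_add_zero])
print(ir, iterations)
```

Keeping the one-time pass out of the loop avoids re-running it on every iteration, while the loop still catches work that one pass exposes for another.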
<jekstrand> Sorry, I should avoid code review for the first few hours of Mondays
<jekstrand> :P
<karolherbst> :D
konstantin has joined #dri-devel
nchery is now known as Guest2225
nchery has joined #dri-devel
Duke`` has joined #dri-devel
FireBurnUK has joined #dri-devel
FireBurn has quit [Read error: Connection reset by peer]
<karolherbst> will check what's up with nir_lower_undef_to_zero, but I assume that's libclc using undefined values
<karolherbst> mhh, used in phis
<karolherbst> _Z12__clc_remquoDv3_fS_PDv3_i has two even and it's quite small
<karolherbst> jekstrand: ahh yeah...
Guest2225 has quit [Ping timeout: 480 seconds]
<karolherbst> uhhh
<alyssa> karolherbst: could be a libclc bug?
<karolherbst> nope
<karolherbst> it's just normal C
<karolherbst> if this is a bug, all C is buggy :p
<alyssa> correct
<karolherbst> _although_ you could write code differently to not make that mistake
<karolherbst> also, jekstrand: potential optimization possible
FireBurnUK has quit [Read error: No route to host]
FireBurn has joined #dri-devel
Duke`` has quit [Ping timeout: 480 seconds]
FireBurnUK has joined #dri-devel
FireBurn has quit [Read error: Connection reset by peer]
shankaru has quit [Quit: Leaving.]
shankaru has joined #dri-devel
Duke`` has joined #dri-devel
FireBurnUK has quit [Read error: Connection reset by peer]
FireBurnUK has joined #dri-devel
jewins1 has joined #dri-devel
jewins has quit [Ping timeout: 480 seconds]
maxzor has joined #dri-devel
Arsen has quit [Quit: Quit.]
Arsen has joined #dri-devel
lemonzest has quit [Ping timeout: 480 seconds]
<jekstrand> karolherbst: What?
<karolherbst> jekstrand: see my comment on the MR
<karolherbst> but I'm inclined to ignore it as after inlining that all goes away anyway
<jekstrand> karolherbst: Sounds like the undef is harmless then
<karolherbst> yep
<jekstrand> Ok, cool.
<karolherbst> clc is doing a bunch of those sadly
<jekstrand> That's fine. We don't need to run undef_to_zero
<jekstrand> Sure
<jekstrand> But I kinda don't care
<jekstrand> Like, that's a standard C pattern and it's safe.
<jekstrand> I was more concerned that maybe there was actual UB in libclc.
<jekstrand> But if there isn't then we can keep using undef
<karolherbst> that might be, but clc is huge
<jekstrand> karolherbst: Sure, but we'll probably find out pretty quick if we've broken libclc
<karolherbst> mhh, well.. I think we do have a bug somewhere, but I can't really find it
<jekstrand> ?
<karolherbst> so llvmpipe is a bit broken the more passes I throw at the kernel, but I couldn't find a reason for this to happen
<jekstrand> ugh
<karolherbst> it segfaults, though
<jekstrand> That should be findable unless it's segfaulting in JIT code
<karolherbst> it's in JIT code
<jekstrand> :(
<karolherbst> yeah...
<karolherbst> I am inclined to not care as long as iris is fine
<karolherbst> we'll figure it out someday, but...
<karolherbst> it's unrelated to my change though
<karolherbst> and more related to the passes I run inside rusticl
sdutt has joined #dri-devel
sdutt has quit []
sdutt has joined #dri-devel
<karolherbst> jekstrand: huh.. there is something which looks a bit wrong though
<karolherbst> intrinsic store_deref (ssa_27, ssa_24)
<karolherbst> ehh wait.. it has a wrmask
<karolherbst> nvm then
<karolherbst> we might be able to optimize then though :D
<karolherbst> *that
mdroper has quit []
ybogdano has joined #dri-devel
<alyssa> The more I think about undef_to_zero the more I conclude what we actually want is a much more aggressive opt_undef
<jekstrand> undef_to_zero is kind-of horrible
<jekstrand> It's not actually the language behavior you want
<alyssa> one that transforms anything(undef) -> undef
<karolherbst> alyssa: can't really do that
<alyssa> that would take care of the use case of undef_to_zero combining well with opt_algebraic
<alyssa> karolherbst: no? not even for alu?
<karolherbst> only if it's scalar I guess
<jekstrand> Yeah, (undef | 0xffff) | 0xffff0000
<karolherbst> ahh, and that ^^
<alyssa> oh, ugh, that's really cursed.
<jekstrand> Yeah, I've thought about how to do this. It's hard.
<jekstrand> I think valgrind, ubsan, and others have a concept of bitwise undef
<jekstrand> And maybe LLVM too?
<jekstrand> It's really annoying once you actually think about it for a while
<karolherbst> maybe we need to declare what effects inputs have on outputs, or mark some instructions as "all inputs affect all output bits"
<alyssa> undef_to_zero doesn't sound all that bad anymore :-p
<jekstrand> undef << x | y. No idea. Could be totally valid depending on context.
<jekstrand> Yeah, the advantage of undef_to_zero is that you can usually fold zero
<karolherbst> I think the better approach is to figure out if the undef bits are unused and optimize them away
<jekstrand> karolherbst: Have fun with that!
<karolherbst> we don't do value range tracking, do we? :P
<jekstrand> One day, I'd like NIR to grow full competent range analysis and undef analysis could be part of that.
<karolherbst> although that only helps a little
<alyssa> is nir_range_analysis incompetent? :v
<karolherbst> yeah...
<jekstrand> Like, exactly the same logic as you use for tracking definitely-set and definitely-unset bits can be used to track undef bits.
<jekstrand> alyssa: It's incomplete
<bnieuwenhuizen> also stuff like open-coded bounds (min(max(undef, x), y)) using comparisons shows why it gets messy even if you go as far as comparisons
<alyssa> Ekstrand's Second Incompleteness Theorem states that all value range tracking systems for a sufficiently powerful IR are incompetent or incomplete.
lemonzest has joined #dri-devel
konstantin has quit []
aravind has quit [Ping timeout: 480 seconds]
<jekstrand> alyssa: Pretty sure that's not even a lemma. It's more one of those "it's obvious that..."
<alyssa> very humble of you
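[Editor's sketch] A minimal model of the "bitwise undef" idea discussed above, using jekstrand's `(undef | 0xffff) | 0xffff0000` example. The `Bits` class is hypothetical and only illustrates per-bit tracking; it is not how NIR, LLVM, or valgrind actually implement it.

```python
MASK32 = 0xFFFFFFFF


class Bits:
    """32-bit value where each bit is either a known 0/1 or undefined."""

    def __init__(self, value, undef=0):
        self.undef = undef & MASK32
        # Undefined bit positions are stored as 0 in `value`.
        self.value = value & MASK32 & ~self.undef

    def __or__(self, other):
        # A bit is a known 1 if either input has a known 1 there.
        ones = self.value | other.value
        # Otherwise it is undefined if either input is undefined there.
        undef = (self.undef | other.undef) & ~ones
        return Bits(ones, undef)

    def __and__(self, other):
        # A bit is a known 0 if either input has a known 0 there.
        zeros = ((~self.value & ~self.undef) | (~other.value & ~other.undef)) & MASK32
        undef = (self.undef | other.undef) & ~zeros
        return Bits(self.value & other.value, undef)

    def shl(self, n):
        # Shifting moves undefined bits along with the value bits.
        return Bits(self.value << n, self.undef << n)


undef = Bits(0, undef=MASK32)        # fully undefined value
low = undef | Bits(0xFFFF)           # low half is now a known 1s
full = low | Bits(0xFFFF0000)        # every bit is now known
print(hex(low.undef), hex(full.value), hex(full.undef))
```

This shows why a blanket "anything(undef) -> undef" rewrite would be wrong for ALU ops: after both ORs, no undefined bit survives and the result is a well-defined constant.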
<karolherbst> jekstrand: yeah soo.. opt_if somehow only ends up increasing size
<jekstrand> ok...
<karolherbst> ahh yeah.. added inots
<karolherbst> so it moves stuff from the else into the then block and adds inots
<karolherbst> ehh
<jekstrand> Woo! Found some gallium bugs...
<karolherbst> wait..
<karolherbst> I have an idea
<karolherbst> jekstrand: sooo.. opt_if only makes sense if we run opt_algebraic as well
<karolherbst> question is.. should I drop the optimize flag and always call opt_algebraic?
heat_ has joined #dri-devel
heat has quit [Read error: No route to host]
<karolherbst> not sure if the optimize flag makes sense at all anyway... I tried to only call passes which all drivers are fine with (hence running such a pessimistic nir_opt_peephole_select with 0, false, false instead of 8, true, true)
<karolherbst> and opt_algebraic already depends on the nir_options, so I guess that's fine as well
<jekstrand> karolherbst: Microsoft wants the optimize flag
<jekstrand> So we can't just drop it
<karolherbst> ohh, so they have to get the libclc shaders without opts run on them?
<jekstrand> Should we always call opt_algebraic? I don't see why not. We're going to disk cache it anyway.
<jekstrand> karolherbst: No, they want to optimize without a disk cache
<zmike> jekstrand: if you're good with zmike/32 I'm gonna add your ack and marge since I've got some other stuff that will probably conflict
<karolherbst> jekstrand: yeah, but I meant drop the optimize flag and always optimize
<jekstrand> zmike: Fine with me.
<karolherbst> so the question is, does somebody want to get libclc without opts at all
<jekstrand> karolherbst: I think we don't want opts for rusticl or clover if there's no disk cache
<jekstrand> It's 17s vs. 3s for clinfo on panfrost.
<karolherbst> ahhh, I see
<karolherbst> okay, makes sense then
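[Editor's sketch] One way to picture the policy jekstrand describes above (only spend time optimizing the library when the result can be cached). Everything here is a hypothetical stand-in: the function names, the dict used as a cache, and the toy "translate" and "optimize" steps are assumptions for illustration, not Mesa's disk cache or libclc loading code.

```python
import hashlib


def cache_key(blob: bytes) -> str:
    """Derive a cache key from the input blob's contents."""
    return hashlib.sha256(blob).hexdigest()


def load_library(blob, cache, translate, optimize):
    """Translate `blob`; optimize and store it only if a cache is available."""
    key = cache_key(blob)
    if cache is not None and key in cache:
        return cache[key]          # hot cache: skip translate and optimize
    lib = translate(blob)
    if cache is not None:
        lib = optimize(lib)        # pay the optimization cost once
        cache[key] = lib
    return lib


calls = {"translate": 0, "optimize": 0}


def toy_translate(blob):
    calls["translate"] += 1
    return blob.decode().split()


def toy_optimize(lib):
    calls["optimize"] += 1
    return [op for op in lib if op != "nop"]


blob = b"add nop mul"
cache = {}
first = load_library(blob, cache, toy_translate, toy_optimize)
second = load_library(blob, cache, toy_translate, toy_optimize)
uncached = load_library(blob, None, toy_translate, toy_optimize)
print(first, second, uncached, calls)
```

With a cache, the slow path runs once and later loads are lookups; without one, the unoptimized library is returned so that startup is not dominated by optimization time.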
<jekstrand> If we could figure out on-demand loading, that'd be swell.
<karolherbst> yeah
shankaru has quit [Quit: Leaving.]
slattann has quit [Ping timeout: 480 seconds]
maxzor has quit [Ping timeout: 480 seconds]
<karolherbst> jekstrand: does brw_kernel.c want to always optimize or just with a cache?
<jekstrand> probably just with a cache
<jekstrand> I think it's only microsoft that wants to optimize without one and only because they have a different caching mechanism.
<karolherbst> yeah.. probably
<karolherbst> clover is slow :(
<karolherbst> ehh maybe I should try a release build
<karolherbst> clinfo runtime clover vs rusticl: 0.180s vs 0.080s
<karolherbst> it's like 0.110s if I disable spirv and nir caching
h0tc0d3 has quit [Quit: Leaving]
<karolherbst> ohh.. I forgot to enable the opts
<karolherbst> 0.06s :)
<karolherbst> nvm
<karolherbst> jekstrand: there is one thing I am wondering about though.. do you think it makes sense to key the optimize flag as well?
<karolherbst> otherwise you'll get the unoptimized libclc if you disabled caching via env the next time you run with caching enabled
<karolherbst> ehhh
<karolherbst> wait...
<karolherbst> that doesn't make sense
MajorBiscuit has joined #dri-devel
mbrost has joined #dri-devel
FireBurnUK is now known as FireBurn
ngcortes has joined #dri-devel
bbrezillon has joined #dri-devel
MajorBiscuit has quit [Ping timeout: 480 seconds]
maxzor has joined #dri-devel
mbrost has quit [Remote host closed the connection]
mbrost has joined #dri-devel
Duke`` has quit [Ping timeout: 480 seconds]
heat has joined #dri-devel
heat_ has quit [Read error: No route to host]
alatiera has quit [Quit: The Lounge - https://thelounge.chat]
alatiera has joined #dri-devel
mvlad has quit [Remote host closed the connection]
<pcercuei> airlied: just asking for confirmation that I can merge drm/drm-next into drm-misc-next
<pcercuei> That's 643 new commits so I don't want to make a mess
* airlied isn't always sure of the procedure there, but I think it's generally safe to backmerge with a reason
<airlied> and the reason should be the fixup for the build fail
<pcercuei> Should I state the reason explicitly in the merge commit?
<airlied> yes
jewins1 has quit [Ping timeout: 480 seconds]
ybogdano has quit [Ping timeout: 480 seconds]
alatiera has quit [Quit: The Lounge - https://thelounge.chat]
alatiera has joined #dri-devel
heat_ has joined #dri-devel
heat has quit [Read error: Connection reset by peer]
<pcercuei> done
alyssa has left #dri-devel [#dri-devel]
mbrost_ has joined #dri-devel
mbrost has quit [Read error: Connection reset by peer]
MajorBiscuit has joined #dri-devel
MajorBiscuit has quit []
MajorBiscuit has joined #dri-devel
ybogdano has joined #dri-devel
jewins has joined #dri-devel
ngcortes has quit [Remote host closed the connection]
Major_Biscuit has joined #dri-devel
MajorBiscuit has quit [Ping timeout: 480 seconds]
ngcortes has joined #dri-devel
JohnnyonFlame has joined #dri-devel
gouchi has quit [Remote host closed the connection]
dj-death has quit [Ping timeout: 480 seconds]
neonking_ has joined #dri-devel
neonking has quit [Ping timeout: 480 seconds]
Namarrgon has quit [Quit: WeeChat 3.4]
Namarrgon has joined #dri-devel
heat_ has quit [Remote host closed the connection]
heat_ has joined #dri-devel
heat_ has quit [Read error: No route to host]
Haaninjo has quit [Quit: Ex-Chat]
heat has joined #dri-devel
ybogdano has quit [Ping timeout: 480 seconds]
ybogdano has joined #dri-devel
neonking_ has quit []
neonking has joined #dri-devel
<jekstrand> karolherbst: My image count and sampler patches are looking good on iris in CI. I'm going to cherry-pick them into my branch and see where things are at.
<karolherbst> doesn't work for CL :P
<jekstrand> karolherbst: You may have an old version. I've fixed a bunch of bugs.
<karolherbst> wait.. I have some patches to fix _some_ issues
<jekstrand> If there are CL bugs, I'll fix them.
<karolherbst> mhh? okay
<karolherbst> did you fix the binding table issue?
<jekstrand> maybe?
<karolherbst> like the one where the hw only has 32 slots?
<jekstrand> test name?
<karolherbst> api min_max_read_image_args
<karolherbst> and min_max_write_image_args
<karolherbst> essentially the binding table runs full
<karolherbst> I had patches to make it bigger than 32, but the hw only has 4 bits or something
<karolherbst> or well.. genxml
<karolherbst> jekstrand: btw, I pushed my branch with all the caching and libclc stuff
<karolherbst> no regressions on llvmpipe
<jekstrand> ok
<karolherbst> can't say the same for iris, because... my machine dies
ybogdano has quit [Ping timeout: 480 seconds]
<jekstrand> heh
<karolherbst> let's see, your image patches are on rusticl/wip still, correct?
Major_Biscuit has quit [Ping timeout: 480 seconds]
<jekstrand> no
<jekstrand> not yet
<karolherbst> ahh..
<karolherbst> maybe I should remove them from my branch then as well
Major_Biscuit has joined #dri-devel
<karolherbst> jekstrand: might have to do a "meson configure build -Drust_std=2021"
Major_Biscuit has quit []
<karolherbst> the latter should be on my branch though
mbrost_ has quit [Ping timeout: 480 seconds]
Duke`` has joined #dri-devel
jewins has quit [synthon.oftc.net reflection.oftc.net]
nchery has quit [synthon.oftc.net reflection.oftc.net]
sadlerap has quit [synthon.oftc.net reflection.oftc.net]
fxkamd has quit [synthon.oftc.net reflection.oftc.net]
alanc has quit [synthon.oftc.net reflection.oftc.net]
oneforall2 has quit [synthon.oftc.net reflection.oftc.net]
tlwoerner has quit [synthon.oftc.net reflection.oftc.net]
jessica_24 has quit [synthon.oftc.net reflection.oftc.net]
OftenTimeConsuming has quit [synthon.oftc.net charm.oftc.net]
anholt has quit [synthon.oftc.net reflection.oftc.net]
quantum5 has quit [synthon.oftc.net reflection.oftc.net]
orbea has quit [synthon.oftc.net reflection.oftc.net]
seanpaul has quit [synthon.oftc.net reflection.oftc.net]
lstrano has quit [synthon.oftc.net reflection.oftc.net]
aknautiy has quit [synthon.oftc.net reflection.oftc.net]
jhli has quit [synthon.oftc.net reflection.oftc.net]
stuartsummers has quit [synthon.oftc.net reflection.oftc.net]
mattrope has quit [synthon.oftc.net reflection.oftc.net]
demarchi has quit [synthon.oftc.net reflection.oftc.net]
TD-Linux has quit [synthon.oftc.net reflection.oftc.net]
radii has quit [synthon.oftc.net reflection.oftc.net]
Karyon has quit [synthon.oftc.net reflection.oftc.net]
xyene has quit [synthon.oftc.net reflection.oftc.net]
bcheng has quit [synthon.oftc.net reflection.oftc.net]
remexre has quit [synthon.oftc.net reflection.oftc.net]
SolarAquarion has quit [synthon.oftc.net reflection.oftc.net]
exit70 has quit [synthon.oftc.net reflection.oftc.net]
samueldr has quit [synthon.oftc.net reflection.oftc.net]
kurufu has quit [synthon.oftc.net reflection.oftc.net]
imirkin has quit [synthon.oftc.net reflection.oftc.net]
rossy has quit [synthon.oftc.net reflection.oftc.net]
robink has quit [synthon.oftc.net reflection.oftc.net]
zzag has quit [synthon.oftc.net reflection.oftc.net]
ajax has quit [synthon.oftc.net reflection.oftc.net]
mattst88 has quit [synthon.oftc.net reflection.oftc.net]
jolan has quit [synthon.oftc.net reflection.oftc.net]
JTL has quit [synthon.oftc.net reflection.oftc.net]
Lightning has quit [synthon.oftc.net reflection.oftc.net]
steev has quit [synthon.oftc.net reflection.oftc.net]
lemes has quit [synthon.oftc.net reflection.oftc.net]
enilflah has quit [synthon.oftc.net reflection.oftc.net]
jrayhawk has quit [synthon.oftc.net reflection.oftc.net]
siqueira has quit [synthon.oftc.net reflection.oftc.net]
smaeul has quit [synthon.oftc.net reflection.oftc.net]
austriancoder has quit [synthon.oftc.net reflection.oftc.net]
arnd has quit [synthon.oftc.net reflection.oftc.net]
neonking has quit [synthon.oftc.net larich.oftc.net]
sdutt has quit [synthon.oftc.net larich.oftc.net]
HankB has quit [synthon.oftc.net larich.oftc.net]
Emmy_ has quit [synthon.oftc.net larich.oftc.net]
flacks has quit [synthon.oftc.net larich.oftc.net]
Kayden has quit [synthon.oftc.net larich.oftc.net]
anujp has quit [synthon.oftc.net larich.oftc.net]
LexSfX has quit [synthon.oftc.net larich.oftc.net]
andrey-konovalov has quit [synthon.oftc.net larich.oftc.net]
soreau has quit [synthon.oftc.net larich.oftc.net]
jljusten has quit [synthon.oftc.net larich.oftc.net]
vyivel has quit [synthon.oftc.net larich.oftc.net]
rcf has quit [synthon.oftc.net larich.oftc.net]
macromorgan has quit [synthon.oftc.net larich.oftc.net]
rsripada has quit [synthon.oftc.net larich.oftc.net]
Znullptr has quit [synthon.oftc.net larich.oftc.net]
rpigott has quit [synthon.oftc.net larich.oftc.net]
xperia64 has quit [synthon.oftc.net larich.oftc.net]
dolphin has quit [synthon.oftc.net larich.oftc.net]
Lightsword has quit [synthon.oftc.net larich.oftc.net]
dri-logger has quit [synthon.oftc.net larich.oftc.net]
mareko has quit [synthon.oftc.net larich.oftc.net]
glisse has quit [synthon.oftc.net larich.oftc.net]
mslusarz has quit [synthon.oftc.net larich.oftc.net]
Sachiel has quit [synthon.oftc.net larich.oftc.net]
cengiz_io has quit [synthon.oftc.net larich.oftc.net]
neoXite__ has quit [synthon.oftc.net larich.oftc.net]
sagar__ has quit [synthon.oftc.net larich.oftc.net]
MTCoster has quit [synthon.oftc.net larich.oftc.net]
kem has quit [synthon.oftc.net larich.oftc.net]
kisak has quit [synthon.oftc.net larich.oftc.net]
craftyguy has quit [synthon.oftc.net reflection.oftc.net]
krushia has quit [synthon.oftc.net larich.oftc.net]
dschuermann has quit [synthon.oftc.net larich.oftc.net]
CosmicPenguin has quit [synthon.oftc.net larich.oftc.net]
cphealy has quit [synthon.oftc.net larich.oftc.net]
jhugo has quit [synthon.oftc.net larich.oftc.net]
hfink has quit [synthon.oftc.net larich.oftc.net]
markyacoub has quit [synthon.oftc.net larich.oftc.net]
tchar has quit [synthon.oftc.net larich.oftc.net]
jstultz has quit [synthon.oftc.net larich.oftc.net]
graphitemaster has quit [synthon.oftc.net larich.oftc.net]
zmike has quit [synthon.oftc.net larich.oftc.net]
narmstrong has quit [synthon.oftc.net larich.oftc.net]
`join_subline has quit [synthon.oftc.net larich.oftc.net]
ogabbay has quit [synthon.oftc.net larich.oftc.net]
JohnnyonFlame has quit [synthon.oftc.net weber.oftc.net]
ngcortes has quit [synthon.oftc.net weber.oftc.net]
eukara has quit [synthon.oftc.net weber.oftc.net]
mclasen has quit [synthon.oftc.net weber.oftc.net]
sneil has quit [synthon.oftc.net weber.oftc.net]
sumits has quit [synthon.oftc.net weber.oftc.net]
Lyude has quit [synthon.oftc.net weber.oftc.net]
abhinav__ has quit [synthon.oftc.net weber.oftc.net]
anarsoul has quit [synthon.oftc.net weber.oftc.net]
flto has quit [synthon.oftc.net weber.oftc.net]
aswar002 has quit [synthon.oftc.net weber.oftc.net]
agd5f has quit [synthon.oftc.net weber.oftc.net]
zf has quit [synthon.oftc.net weber.oftc.net]
tjmercier has quit [synthon.oftc.net weber.oftc.net]
kj has quit [synthon.oftc.net weber.oftc.net]
Ryback_ has quit [synthon.oftc.net weber.oftc.net]
mdnavare has quit [synthon.oftc.net weber.oftc.net]
unerlige has quit [synthon.oftc.net weber.oftc.net]
sarnex has quit [synthon.oftc.net weber.oftc.net]
rcn-ee_ has quit [synthon.oftc.net weber.oftc.net]
rib___ has quit [synthon.oftc.net weber.oftc.net]
nikitalita48 has quit [synthon.oftc.net weber.oftc.net]
mmenzyns has quit [synthon.oftc.net weber.oftc.net]
lileo___ has quit [synthon.oftc.net weber.oftc.net]
benettig has quit [synthon.oftc.net weber.oftc.net]
sh_zam has quit [synthon.oftc.net weber.oftc.net]
eletrotupi has quit [synthon.oftc.net weber.oftc.net]
chadv has quit [synthon.oftc.net weber.oftc.net]
jbarnes has quit [synthon.oftc.net weber.oftc.net]
sumoon has quit [synthon.oftc.net weber.oftc.net]
pzanoni has quit [synthon.oftc.net weber.oftc.net]
ceyusa has quit [synthon.oftc.net weber.oftc.net]
SanchayanMaity has quit [synthon.oftc.net weber.oftc.net]
Frogging101 has quit [synthon.oftc.net weber.oftc.net]
sauce has quit [synthon.oftc.net weber.oftc.net]
ezequielg has quit [synthon.oftc.net weber.oftc.net]
tfiga has quit [synthon.oftc.net weber.oftc.net]
ifreund has quit [synthon.oftc.net weber.oftc.net]
robher has quit [synthon.oftc.net weber.oftc.net]
angular_mike_____ has quit [synthon.oftc.net weber.oftc.net]
mmx_in_orbit has quit [synthon.oftc.net weber.oftc.net]
rodrigovivi has quit [synthon.oftc.net weber.oftc.net]
eric_engestrom has quit [synthon.oftc.net weber.oftc.net]
krh has quit [synthon.oftc.net weber.oftc.net]
hwentlan____ has quit [synthon.oftc.net weber.oftc.net]
robclark has quit [synthon.oftc.net weber.oftc.net]
cmarcelo has quit [synthon.oftc.net weber.oftc.net]
airlied has quit [synthon.oftc.net weber.oftc.net]
rg3igalia has quit [synthon.oftc.net weber.oftc.net]
cwabbott has quit [synthon.oftc.net weber.oftc.net]
melissawen has quit [synthon.oftc.net weber.oftc.net]
clever has quit [synthon.oftc.net weber.oftc.net]
swivel has quit [synthon.oftc.net weber.oftc.net]
Simonx22 has quit [synthon.oftc.net weber.oftc.net]
gpiccoli has quit [synthon.oftc.net weber.oftc.net]
jekstrand has quit [synthon.oftc.net weber.oftc.net]
dianders has quit [synthon.oftc.net weber.oftc.net]
daniels has quit [synthon.oftc.net weber.oftc.net]
ngcortes has joined #dri-devel
eukara has joined #dri-devel
Lyude has joined #dri-devel
tjmercier has joined #dri-devel
zf has joined #dri-devel
Ryback_ has joined #dri-devel
mdnavare has joined #dri-devel
kj has joined #dri-devel
unerlige has joined #dri-devel
sarnex has joined #dri-devel
nikitalita48 has joined #dri-devel
daniels has joined #dri-devel
rib___ has joined #dri-devel
mmx_in_orbit has joined #dri-devel
dianders has joined #dri-devel
ezequielg has joined #dri-devel
robclark has joined #dri-devel
lileo___ has joined #dri-devel
angular_mike_____ has joined #dri-devel
hwentlan____ has joined #dri-devel
sh_zam has joined #dri-devel
jbarnes has joined #dri-devel
gpiccoli has joined #dri-devel
ifreund has joined #dri-devel
Simonx22 has joined #dri-devel
krh has joined #dri-devel
eric_engestrom has joined #dri-devel
SanchayanMaity has joined #dri-devel
rcn-ee_ has joined #dri-devel
jekstrand has joined #dri-devel
melissawen has joined #dri-devel
benettig has joined #dri-devel
JohnnyonFlame has joined #dri-devel
airlied has joined #dri-devel
ceyusa has joined #dri-devel
eletrotupi has joined #dri-devel
mmenzyns has joined #dri-devel
sauce has joined #dri-devel
chadv has joined #dri-devel
Frogging101 has joined #dri-devel
sneil has joined #dri-devel
swivel has joined #dri-devel
mclasen has joined #dri-devel
sumits has joined #dri-devel
sumoon has joined #dri-devel
flto has joined #dri-devel
anarsoul has joined #dri-devel
abhinav__ has joined #dri-devel
agd5f has joined #dri-devel
aswar002 has joined #dri-devel
tfiga has joined #dri-devel
cwabbott has joined #dri-devel
rg3igalia has joined #dri-devel
rodrigovivi has joined #dri-devel
robher has joined #dri-devel
pzanoni has joined #dri-devel
cmarcelo has joined #dri-devel
clever has joined #dri-devel
sadlerap has joined #dri-devel
nchery has joined #dri-devel
jewins has joined #dri-devel
fxkamd has joined #dri-devel
anholt has joined #dri-devel
tlwoerner has joined #dri-devel
orbea has joined #dri-devel
oneforall2 has joined #dri-devel
seanpaul has joined #dri-devel
jessica_24 has joined #dri-devel
lstrano has joined #dri-devel
alanc has joined #dri-devel
mattrope has joined #dri-devel
jhli has joined #dri-devel
smaeul has joined #dri-devel
zzag has joined #dri-devel
arnd has joined #dri-devel
exit70 has joined #dri-devel
austriancoder has joined #dri-devel
ajax has joined #dri-devel
siqueira has joined #dri-devel
xyene has joined #dri-devel
imirkin has joined #dri-devel
SolarAquarion has joined #dri-devel
jolan has joined #dri-devel
enilflah has joined #dri-devel
lemes has joined #dri-devel
kurufu has joined #dri-devel
Karyon has joined #dri-devel
robink has joined #dri-devel
Lightning has joined #dri-devel
jrayhawk has joined #dri-devel
TD-Linux has joined #dri-devel
JTL has joined #dri-devel
rossy has joined #dri-devel
radii has joined #dri-devel
bcheng has joined #dri-devel
craftyguy has joined #dri-devel
mattst88 has joined #dri-devel
samueldr has joined #dri-devel
steev has joined #dri-devel
aknautiy has joined #dri-devel
stuartsummers has joined #dri-devel
remexre has joined #dri-devel
demarchi has joined #dri-devel
quantum5 has joined #dri-devel
Znullptr has joined #dri-devel
rpigott has joined #dri-devel
cphealy has joined #dri-devel
sdutt has joined #dri-devel
Emmy_ has joined #dri-devel
flacks has joined #dri-devel
soreau has joined #dri-devel
jljusten has joined #dri-devel
LexSfX has joined #dri-devel
andrey-konovalov has joined #dri-devel
vyivel has joined #dri-devel
rcf has joined #dri-devel
macromorgan has joined #dri-devel
rsripada has joined #dri-devel
Lightsword has joined #dri-devel
dolphin has joined #dri-devel
xperia64 has joined #dri-devel
dri-logger has joined #dri-devel
mareko has joined #dri-devel
glisse has joined #dri-devel
mslusarz has joined #dri-devel
Sachiel has joined #dri-devel
kem has joined #dri-devel
ogabbay has joined #dri-devel
dschuermann has joined #dri-devel
tchar has joined #dri-devel
MTCoster has joined #dri-devel
markyacoub has joined #dri-devel
zmike has joined #dri-devel
CosmicPenguin has joined #dri-devel
jstultz has joined #dri-devel
hfink has joined #dri-devel
narmstrong has joined #dri-devel
cengiz_io has joined #dri-devel
neoXite__ has joined #dri-devel
HankB has joined #dri-devel
`join_subline has joined #dri-devel
kisak has joined #dri-devel
krushia has joined #dri-devel
jhugo has joined #dri-devel
sagar__ has joined #dri-devel
graphitemaster has joined #dri-devel
neonking has joined #dri-devel
anujp has joined #dri-devel
Kayden has joined #dri-devel
OftenTimeConsuming has joined #dri-devel
<jekstrand> karolherbst: No longer crashing but it fails. Working on why
<jekstrand> karolherbst: I just came up with a terrible idea for how to do CL on crocus
<karolherbst> :O
<karolherbst> let's hear it
<karolherbst> I mean.. nv50 supports CL 1.1, and that one is pre GL 4
<jekstrand> So... HSW doesn't have 48-bit addresses, right? But this also means that the entire address space will fit inside 2 or 4 SSBOs.
<jekstrand> (An SSBO size is 2^30 and I don't remember if HSW's address space is 32 or 31 bits)
<karolherbst> can't even do compute shaders on GL
<karolherbst> well... it can, but crappily
<karolherbst> although I think the issue was more like it doesn't support image_load_store and hence we can't claim compute shaders support
<karolherbst> something with surface ops being compute only
rkanwal has quit [Quit: rkanwal]
<karolherbst> jekstrand: you can have device being 32 bits only
<karolherbst> *Devices
<jekstrand> So we bind 4 ssbos over the entire Address space and then load/store_global(addr) becomes load/store_ssbo(addr >> 30, addr & 0x3fffffff)
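The split jekstrand describes can be sketched as follows. This is a minimal illustration, assuming a 32-bit address space covered by four SSBOs of 2^30 bytes each; the function name is hypothetical and nothing here is actual driver code.

```python
SSBO_BITS = 30                     # each SSBO covers 2^30 bytes, per the discussion
SSBO_MASK = (1 << SSBO_BITS) - 1   # 0x3fffffff

def split_global_address(addr):
    """Map a 32-bit global address to (ssbo_index, offset_in_ssbo)."""
    assert 0 <= addr < (1 << 32), "sketch assumes a 32-bit address space"
    return addr >> SSBO_BITS, addr & SSBO_MASK

index, offset = split_global_address(0x40000010)
print(index, hex(offset))
```

An access that straddles a 2^30 boundary would span two SSBOs; the sketch does not handle that case.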
<jekstrand> The tricky part becomes uploading kernel arguments
ybogdano has joined #dri-devel
<jekstrand> But there's no reason why we can't have a UBO with relocations in it.
<karolherbst> why?
<karolherbst> don't support generic and do 32 bit only?
<karolherbst> would be enough for CL 1.2
<jekstrand> We don't have actual addresses. Only the kernel knows the actual address and only at exec time.
<jekstrand> So we need to patch the input UBO
<bnieuwenhuizen> don't forget your unaligned loads/stores though :P
<karolherbst> jekstrand: I still don't see the problem
* jekstrand really needs to stop caring about HSW. It's like a drug.
<karolherbst> so instead of pointers, we do ssbo offsets
<karolherbst> ms is doing it as well, so it can't be that bad
<jekstrand> karolherbst: Yeah, that's an option if we're ok with one SSBO per memory object
<karolherbst> just needs a special case in kernel::launch I guess
<karolherbst> jekstrand: should be fine for 32 bit devices
<karolherbst> you just say your max buffer size is.. whatever fits in ssbo size
<jekstrand> yeah
<karolherbst> so if that's 2^30, that's fine
<jekstrand> Yeah, that would work too
<karolherbst> it's just stupid to do that on GPUs with like massive amounts of VRAM :)
<jekstrand> And would involve less driver gymnastics
<karolherbst> yep
<jekstrand> It's pretty sad that HSW only has a 32-bit address space given how much ram you can pair with that processor.
<karolherbst> I hardcode 64 pretty much everywhere atm, but I might wire up rusticl on nv50 and see where that goes
<karolherbst> we only have 16 32 bit buffers
<karolherbst> the GPUs VA is bigger, but...
<karolherbst> none of those GPUs have more than 4GB VRAM.. I doubt they even have more than 1GB
<karolherbst> ahh, seems like there were 2GB variants
<clever> talking about a specific gpu line or gpu's in general?
<karolherbst> specific gpus line
<karolherbst> jekstrand: yeah.. that's a bit odd
<karolherbst> also sounds like a bad idea to have a smaller VA on an iGPU than the CPU/MMU supports..
mbrost has joined #dri-devel
<jekstrand> Yeah....
<clever> due to design limits, the pi4's 3d core only has a 32bit addr space, but the host can handle up to 16gig of ram
<clever> there is a single level paging table and mmu patching that mismatch
<karolherbst> oh wow
<clever> and you need 4mb of physically contiguous memory to hold the paging table
<karolherbst> not sure what intel needed that for, but intels GPUs also had some preallocated block of memory where you can even configure the size in the firmware
<clever> drivers/gpu/drm/v3d/v3d_mmu.c deals with that
<clever> for the pre-pi4 models (pi0-pi3), the 3d core is also limited to 32bit, but the host only supports 1gig of ram
<clever> the extra 2 bits of the address are used for cache-control flags
<karolherbst> ahh fun.. i915 survived this run
<clever> `addr | 0xc000_0000` means to ignore all caches
<clever> `addr | 0x8000_0000` is both coherent with the L2, and allocates into the L2
<clever> but the arm core isnt coherent with the L2 on some models
<clever> so you need to use the right addr for the model
<karolherbst> jekstrand: mhh weird.. drm-tip looks more stable
alyssa has joined #dri-devel
<clever> i think 0x4000_0000 was coherent but non-allocating?
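The alias bits clever lists can be sketched as a small helper. This assumes, as stated above, that the top two bits of a 32-bit address select the cache behaviour on pi0-pi3; the constant and function names are hypothetical, and the `0x4000_0000` meaning is the one clever was unsure about.

```python
ALIAS_MASK = 0xC000_0000
ALIAS_UNCACHED = 0xC000_0000        # "ignore all caches"
ALIAS_L2_ALLOCATING = 0x8000_0000   # coherent with the L2, allocates into the L2
ALIAS_L2_NONALLOC = 0x4000_0000     # possibly coherent but non-allocating (unconfirmed)

def with_alias(addr, alias):
    """OR a cache-control alias into an address that fits in the low 30 bits."""
    assert (addr & ALIAS_MASK) == 0, "address must fit in 30 bits (1 GiB)"
    return addr | alias

def strip_alias(bus_addr):
    """Drop the two cache-control bits again."""
    return bus_addr & 0x3FFF_FFFF

print(hex(with_alias(0x1000, ALIAS_UNCACHED)))
```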
* alyssa wonders why she's getting f2f16(phi(f2f32(x), f2f32(y))) instead of a 16-bit phi..
<karolherbst> wow.. the CPU is basically idling most of the time with the caches
<jekstrand> alatiera: Our down-of-upcast optimization can't see through phis. :(
<jekstrand> alyssa, rather ^^
Duke`` has quit [Ping timeout: 480 seconds]
<alyssa> jekstrand: right..
* karolherbst tries -j48
<jekstrand> I've been tempted to write a phi-of-unop CSE like thing
<alyssa> heh
<karolherbst> weird.. things won't go faster if the GPU is already at 100% all the time
<jekstrand> But then someone's going to want me to handle fneg(phi(fneg(x), y))
<alyssa> I would like that handled yes! :-p
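A toy illustration of the rewrite being discussed, using nested tuples rather than real NIR. It assumes every `f2f32` source is already a 16-bit float, so the widen-then-narrow round trip is exact; the function name is hypothetical and this is not the existing NIR optimization.

```python
def sink_downcast_through_phi(expr):
    """Rewrite f2f16(phi(f2f32(a), f2f32(b), ...)) into phi(a, b, ...).

    Only valid under the assumption that a, b, ... are 16-bit values.
    Expressions that do not match the pattern are returned unchanged.
    """
    if not (isinstance(expr, tuple) and len(expr) == 2 and expr[0] == "f2f16"):
        return expr
    phi = expr[1]
    if not (isinstance(phi, tuple) and phi[0] == "phi"):
        return expr
    srcs = phi[1:]
    if srcs and all(isinstance(s, tuple) and s[0] == "f2f32" for s in srcs):
        return ("phi",) + tuple(s[1] for s in srcs)
    return expr

before = ("f2f16", ("phi", ("f2f32", "x"), ("f2f32", "y")))
print(sink_downcast_through_phi(before))
```

A source that is not an upcast (the mixed case jekstrand worries about) is left alone here rather than handled.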
<karolherbst> "2176/2176 02:47" :3
<alyssa> jekstrand: Oh, it's even more sinister than that
<alyssa> My test does "if (x) { gl_FragColor = A; } else { gl_FragColor = B }"
<jekstrand> karolherbst: \o/
<alyssa> gl_FragColor is forced to be fp32 for some reason, so you get the conversion mess
<karolherbst> jekstrand: seems like something one can code generically
<jekstrand> karolherbst: Is that on iris or lavapipe?
<jekstrand> llvmpipe, rather
<karolherbst> jekstrand: iris :D
<karolherbst> llvmpipe is a bit slower
<alyssa> If I rewrite it as "vec4 temp; if (x) { temp = A; } else { temp = B } gl_FragColor = temp"
<jekstrand> karolherbst: Did you fix the max image tests?
<alyssa> ..there are no conversions
<jekstrand> karolherbst: Or is that a subset
<karolherbst> lp needs like 17 minutes with a cold cache
<karolherbst> 6 with a hot one
<karolherbst> jekstrand: ahh no.. it still fails
<karolherbst> mostly benchmarking the caching stuff
<jekstrand> karolherbst: Ah
<karolherbst> on lp the difference is like... small
<karolherbst> guess if it's busy doing compute on the CPU caching won't help much
<jekstrand> alyssa: Yeah, makes sense.
<karolherbst> but for iris it kind of halves the time it needs to run through the tests
<jekstrand> alyssa: Are you using opt_peephole_select? I'm not sure if we can see through bcsel either, but that's easier than a phi
<karolherbst> k.. filing a bug now
mbrost has quit [Ping timeout: 480 seconds]
<alyssa> jekstrand: I am, `A` is a texture op (to test whether my RA can handle phi nodes good :p)
<jekstrand> alyssa: Ah. Yeah....
<alyssa> (answer: it can't yet, too many moves)
<alyssa> ..wait, copyprop should be helping here
<alyssa> can I not copyprop into phis? shame, I should fix that!
<jekstrand> copy-prop can propagate into phies
<jekstrand> *phis
<alyssa> AGX copyprop
<jekstrand> ay
<jekstrand> ah
<alyssa> optimizer bug: it was iterating over `opcode info.num sources` sources instead of using the agx_foreach_src iterator, which does the right thing for phis
<alyssa> ok, now it's copypropped but then RA can't coalesce the phi so no change in total instruction count
<jekstrand> Yeah, but that's still a way better RA test
<alyssa> ah, well, yes
<alyssa> cleaner IR, too
<jekstrand> :D
<alyssa> Admittedly I am very overwhelmed by actually doing coalescing
<alyssa> merge sets seem really complicated.
<jekstrand> alyssa: Didn't you already break that all out into a helper so it's easy now? :P
<alyssa> Hm?
<jekstrand> merge sets
<alyssa> no(t yet)
<alyssa> parallel copy lowering is what I stole from ir3 for the benefit of the commons
* jekstrand hasn't actually read the MR. It's easier to be snarky if you don't know what's going on.
<alyssa> basically FOSS robin hood
* karolherbst filed a bug
<alyssa> I also don't know if the ir3 people will hate that MR
* karolherbst thinks about wiring up CL spir-v support
<karolherbst> ehhh the only thing I really don't like about markdown is new line handling
<jekstrand> karolherbst: I think clover is messing up my inputs. :-/
<jekstrand> rusticl, rather
<karolherbst> noooo
<karolherbst> how so?
<jekstrand> hrm... maybe not?
* jekstrand is confused
<karolherbst> yeah, I am sure rusticl is fine :p
<jekstrand> Well, someone's messing them up. :P
<karolherbst> yeah well..
<karolherbst> depends on what gets messed up
<karolherbst> intels stack does some weird indexing of images/textures if you get both of them, but you might hit something else
<jekstrand> karolherbst: Yeah, pretty sure it's rusticl. :P
<karolherbst> I am sure it's not :P
<jekstrand> vec1 64 ssa_1 = intrinsic load_kernel_input (ssa_0) (base=0, range=0, align_mul=256, align_offset=0)
<jekstrand> (gdb) p/x ((uint32_t *)grid->input)[0]@14
<jekstrand> $11 = {0x0, 0x0, 0xffc00000, 0xfffffffe, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}
<jekstrand> Pretty sure that there address isn't at offset 0. :P
<karolherbst> what's ssa_0?
<karolherbst> and don't tell me it's 0x8 :p
<jekstrand> vec1 32 ssa_0 = load_const (0x00000000 = 0.000000)
<karolherbst> ahh
<karolherbst> weird
<karolherbst> which test?
<jekstrand> min/max image
<jekstrand> But I've not rebased in a few days so maybe you fixed something?
<karolherbst> doubt it
<karolherbst> ohh wait
<karolherbst> jekstrand: when are you dumping it?
<jekstrand> inside iris
<karolherbst> you need to do it after set_global_bindings
<jekstrand> I'm dumping at the moment iris emits the GPGPU_WALKER command, so very late
<karolherbst> mhh right.. that test doesn't even crash, let me check
<karolherbst> jekstrand: it's min_max_read_image_args, right?
<jekstrand> yup
<karolherbst> yeah.. soo weird...
<karolherbst> no, it gets filled
<jekstrand> Oh, it very much gets filled... 8 bytes too late.
<karolherbst> not here
<jekstrand> It's getting filled at the start?
<karolherbst> yeah
<karolherbst> p/x (const uint64_t[4])grid->input
<karolherbst> $22 = {0x7fffb40013d0, 0x100000002, 0x100000001, 0x0}
<karolherbst> but I also didn't fix any kind of bug in this code.. at least not that I am aware of
<karolherbst> ehh wait
<karolherbst> my cast is wrong
<jekstrand> Ok, it's definitely getting an offset of 8 in resource_info
<karolherbst> $26 = {0x0, 0xfffffffeffc00000, 0x0, 0x0}
pcercuei has quit [Quit: dodo]
<karolherbst> that looks more like yours
<jekstrand> There you go. :)
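The two gdb dumps above show the same bytes at different widths. A minimal sketch of that reinterpretation, assuming a little-endian host and using only the first eight 32-bit words from jekstrand's dump:

```python
import struct

# First eight 32-bit words of grid->input as printed by jekstrand's gdb session.
words32 = [0x0, 0x0, 0xffc00000, 0xfffffffe, 0x0, 0x0, 0x0, 0x0]

# Pack as little-endian uint32 values, then read the same bytes as uint64 values.
raw = struct.pack("<8I", *words32)
words64 = list(struct.unpack("<4Q", raw))

# Byte offset of the first non-zero 64-bit slot.
first_nonzero_offset = 8 * next(i for i, w in enumerate(words64) if w)

print([hex(w) for w in words64], first_nonzero_offset)
```

The non-zero value lands in the second 64-bit slot, which is the 8-byte offset both of them observe.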
Gorg has joined #dri-devel
<karolherbst> odd
<karolherbst> but yeah.. there is an offset of 0x8 for whatever reason
<karolherbst> let me try to understand my code there :D
<karolherbst> okay.. I have an idea
<karolherbst> why is the args offset 8...
<karolherbst> it's 0 in the variable
<karolherbst> ohhhhhhh nooooo
<karolherbst> how did stuff even work until now
<karolherbst> jekstrand: the var.data.location field of the sampler messes things up :(
<jekstrand> karolherbst: oh?
<karolherbst> yeah.... I have a uniform/image location -> arg mapping I use
<karolherbst> but samplers have their own index
<jekstrand> right
<karolherbst> I probably just skip samplers as they are opaque
<karolherbst> mhh
<karolherbst> actually..
<karolherbst> weird
<karolherbst> ehh no
jewins has quit [Remote host closed the connection]
<karolherbst> it's just inline samplers
jewins has joined #dri-devel
<karolherbst> yeah.. that makes way more sense
<karolherbst> normal samplers still get their slot, because...
<karolherbst> mhhh
<karolherbst> annoying
<karolherbst> mhhh
<karolherbst> yeah... that stuff is slightly broken
<jekstrand> clearly :P
morphis has quit [Ping timeout: 480 seconds]
<karolherbst> inline samplers are at the end anyway, that makes things easy
morphis has joined #dri-devel
* alyssa also wants to wire up nir_opt_preamble for AGX
<karolherbst> "decl_var uniform INTERP_MODE_NONE sampler @33 (33, 8, 0) = { clamp, false, nearest }" that looks more correct
<alyssa> Ideally I don't even need a UBO pushing pass for AGX, I can just use that
<karolherbst> "PASSED sub-test."
<karolherbst> jekstrand: not sure if relying on the order of uniform/image variables is a good idea...
<karolherbst> but...
<jekstrand> karolherbst: Not especially, no, but it should be safe in practice.
<karolherbst> yeah
<karolherbst> I just ignored images :)
<jekstrand> And there's a NIR function to sort them if you want.
<karolherbst> and non sampler vars
<jekstrand> karolherbst: patch?
<karolherbst> in a sec
<karolherbst> need to clean it a little
<jekstrand> kk
<karolherbst> ehh unions...
<karolherbst> ahh.. screw it, less nice code must be enough for now
<karolherbst> seriously.. that could even fix luxmark
<karolherbst> it doesn't
<karolherbst> jekstrand: ohh.. and "api min_max_constant_args" is my fault as well :)
<karolherbst> and profiling execute passes now as well
<airlied> dschuermann: aco doesn't support variable workgroup size? big problem or little problem? :-P
<karolherbst> jekstrand: big question is now, how to deal with CL_DEVICE_MAX_CONSTANT_ARGS + global things
<karolherbst> but I probably just put them into ubos? dunno
<karolherbst> ahh.. I limited to 32 read images :)
<alyssa> remark: it's easy to make a bad SSA RA and easy to make a mediocre traditional RA,
<alyssa> hard to make a good SSA RA and extremely hard to make a good traditional RA
<alyssa> thus on average we settle for mediocre + traditional :-p
<jekstrand> karolherbst: Result failed to verify. Got 2.14364e+27, expected 8001.
<jekstrand> Better than 0 :D
<karolherbst> heh
<karolherbst> but you use more than 32 sampler views, right?
<jekstrand> That's 128
<karolherbst> mhh yeah..
* jekstrand tries with 16
<karolherbst> it should work with 32
<karolherbst> did you fix the cl image lowering?
<jekstrand> still failing...
<jekstrand> I didn't rebase on yours, I just pulled the one patch. Let me rebase
<karolherbst> I still have some stuff of yours though
<karolherbst> I reverted "iris: Support up to 128 textures" though and with that 32 read images and 64 write images work here
<jekstrand> Yeah, 32's not working either. I'll debug
<karolherbst> 128 read images work with llvmpipe here :)