<jekstrand>
karolherbst: That looks like the VMA cache breaking. Fun!
<karolherbst>
doesn't happen with kasan :(
<jekstrand>
danvet: ^^
<jekstrand>
karolherbst: You're opening and closing contexts too fast
<karolherbst>
guess so
<jekstrand>
Of course, i915 isn't actually locking there because RCU will save us!
<karolherbst>
:(
<karolherbst>
I just wanted to benchmark the speed of the CTS runs :(
<karolherbst>
(though it takes like ~2.5 minutes here in case it doesn't crash)
<karolherbst>
btw.. is there a fix I can try?
<jekstrand>
No
<karolherbst>
damn
<jekstrand>
Not an easy one
<jekstrand>
I could type one but ugh, this is thorny
<jekstrand>
I mean, I may have a patch somewhere that nukes the vma cache...
<jekstrand>
But I likely don't have it anymore. It was on my Intel machine and I didn't save off all my kernel branches.
<karolherbst>
mhhh
<karolherbst>
wondering if I could hit this with the official stack...
<jekstrand>
maybe? Depends on how they use contexts.
<jekstrand>
It'd make for an interesting IGT test.
<karolherbst>
not sure it really matters how they use contexts if I still throw 24 processes at it at the same time
<karolherbst>
let's see if 20 runs reliably enough
<karolherbst>
nope..
<jekstrand>
It's a race between context close and execbuffer
<karolherbst>
so if there is still work in flight/submitted, but I close the context before that?
<karolherbst>
or well.. at the same time
<jekstrand>
work in flight is fine. It's the ioctls racing
<karolherbst>
ehh
<karolherbst>
so yeah.. it happens with intel's stack as well :)
<jekstrand>
Well, context close is deferred so it may not just be ioctls
<karolherbst>
different, but my machine is frozen
<jekstrand>
Oh, sure it's a kernel bug and, theoretically, anything can trigger it.
<jekstrand>
You can probably trigger it with GL or Vulkan if you try just right
<karolherbst>
I'm just wondering if I should file a bug against the intel CL stack :D
<karolherbst>
although I guess intel already investigates this?
<karolherbst>
or well.. working on a fix or something
<jekstrand>
File a bug against i915.
<karolherbst>
my CPU makes weird noises :)
<karolherbst>
mhh.. with intel's stack it doesn't actually crash, it just stops doing anything. I don't even get stacktraces or something
<karolherbst>
oh well
Duke`` has quit [Ping timeout: 480 seconds]
<jekstrand>
karolherbst: Optimizing clc takes 14s on Arm. :-/ We need shader cache if we're going to do that...
<karolherbst>
yeah...
<karolherbst>
but it's optional
<karolherbst>
atm
<jekstrand>
Yup
<karolherbst>
my wish is though to create the nir binary when compiling mesa, but...
<karolherbst>
I am wondering if we could arrange things so that we don't use anything from nir_options
<jekstrand>
That's tricky
<karolherbst>
jekstrand: maybe we could also lazy init it. Like checking if it's needed at all and skip loading it until we hit a kernel that actually needs it
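The lazy-init idea can be sketched in plain C. This is only a toy model, assuming it is possible to tell up front whether a kernel needs libclc at all; `toy_libclc`, `toy_libclc_get` and `toy_compile_kernel` are hypothetical names, not Mesa API, and the sketch ignores thread safety.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the expensive libclc shader. */
struct toy_libclc {
    bool loaded;     /* has the expensive load happened yet? */
    int load_count;  /* how many times we paid for the load */
};

/* Stand-in for the expensive load (parsing SPIR-V, running passes, ...). */
static void toy_libclc_load(struct toy_libclc *lib)
{
    lib->loaded = true;
    lib->load_count++;
}

/* Return the library, loading it on first use only. */
static struct toy_libclc *toy_libclc_get(struct toy_libclc *lib)
{
    if (!lib->loaded)
        toy_libclc_load(lib);
    return lib;
}

/* Only kernels that actually call into libclc pay for the load. */
static void toy_compile_kernel(struct toy_libclc *lib, bool needs_libclc)
{
    if (needs_libclc) {
        struct toy_libclc *l = toy_libclc_get(lib);
        assert(l->loaded);
    }
}
```

Compiling any number of kernels that do not need libclc never triggers the load, and the first kernel that does need it pays the cost once.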
<karolherbst>
even on x86 I have to wait like 2 seconds until clinfo finishes the first time :(
<jekstrand>
karolherbst: Unfortunately, with the way clc works now, that's a chicken-and-egg problem because we need the libclc shader in spirv_to_nir right now.
<karolherbst>
we do?
<alyssa>
"my wish is though to create the nir binary when compiling mesa" ... would be nice, wouldn't it.
<jekstrand>
karolherbst: Yeah... But I think it's only so we can look things up.
<karolherbst>
since when?
<jekstrand>
Maybe we can have a spirv_to_nir mode which generates a prototype NIR with no actual function impls.
<jekstrand>
karolherbst: Yeah, for that it looks like all we need is prototypes.
<jekstrand>
We can probably have a super-fast light-weight spirv_to_nir pass which just gathers those and doesn't actually parse the whole thing.
<karolherbst>
potentially
<karolherbst>
I could check what's actually expensive here
<jekstrand>
I do think it's good to have a "header" shader at least, if for nothing other than verification.
<karolherbst>
maybe it doesn't matter
<jekstrand>
Yeah, it's possible a lot of the 3s spent on my Arm board is just I/O loading all that SPIR-V. :-/
<jekstrand>
I kind-of doubt it but I'm sure that's non-trivial
<karolherbst>
let me profile it here
<jekstrand>
But, also, disk cache just solves this problem so I'm kind-of inclined to not care
<jekstrand>
I can wire up caching in panfrost if needed.
<alyssa>
r-b
<karolherbst>
yeah... it's just first start up which is annoying
<karolherbst>
jekstrand: yeah.. you don't even have to cache panfrost's internal shaders :D
<karolherbst>
jekstrand: also keep in mind that debug builds are super slow here :P
<karolherbst>
debug: 2s -> 0.1s
<karolherbst>
release: 0.45s -> 0.05s
<karolherbst>
anyway.. 82% is nir_load_libclc_shader in the cold cache case
<karolherbst>
and spirv_to_nir is just 18% in total
<karolherbst>
jekstrand: I think it's better to spend time on improving runtime of opt passes instead :p
<karolherbst>
lower_vars_to_ssa uses like 20%
<karolherbst>
in the hot cache case load_libclc_shader drops down to 27% with my patch
<karolherbst>
mhh
<karolherbst>
I might want to skip calling nir_sweep in case it's a cached thing I loaded
<karolherbst>
nir_sweep is expensive...
<karolherbst>
5% of the time is spent on just initing libLLVM :(
shankaru has joined #dri-devel
<karolherbst>
jekstrand: when I started: Kernels compilation time: 1800ms, now: "Kernels compilation time: 48ms" :) that's the luxmark thing after caching
<karolherbst>
but there is still a huge problem.. delayed compilation of things (like backend compiling stuff when creating the cso) does get included in benchmark scores
<karolherbst>
so I am wondering if I could create the cso before enqueueing the kernel, but that might get annoying
mdroper has joined #dri-devel
rkanwal has joined #dri-devel
ella-0 has joined #dri-devel
fxkamd has joined #dri-devel
ella-0_ has quit [Read error: Connection reset by peer]
<jekstrand>
karolherbst: We should sweep at the end of optimization but no need to sweep before, probably.
<jekstrand>
For CLC that is
bbrezillon has quit [Ping timeout: 480 seconds]
<karolherbst>
yeah.. atm I do it after loading
<karolherbst>
jekstrand: what about nir_split_var_copies btw? Should I add it as well?
<jekstrand>
karolherbst: If you're going to lower copies, yes, it's required.
<karolherbst>
okay
<jekstrand>
I don't think copy lowering is guaranteed to work without copy splitting
<jekstrand>
But copy splitting only has to be run once, not in the loop.
<karolherbst>
ahh
<jekstrand>
Actually.....
<jekstrand>
Just run splitting, not lowering.
<jekstrand>
splitting is required for correctness for a few other things. vars_to_ssa will lower copies on-demand if it needs them.
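The ordering described here (copy splitting once up front, vars_to_ssa inside the optimization loop) can be modelled with a toy pass manager. This is a sketch only: `toy_shader` and the `toy_*` functions are hypothetical stand-ins for the NIR passes named in the discussion, not the real implementations.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a shader: counts of things left to clean up. */
struct toy_shader {
    int unsplit_copies;  /* aggregate copies not yet split */
    int local_vars;      /* variables not yet promoted to SSA */
    int split_runs;      /* how many times the split pass ran */
};

/* Stand-in for a one-shot pass such as copy splitting. */
static bool toy_split_var_copies(struct toy_shader *s)
{
    bool progress = s->unsplit_copies > 0;
    s->unsplit_copies = 0;
    s->split_runs++;
    return progress;
}

/* Stand-in for an iterative pass; promotes one variable per call. */
static bool toy_vars_to_ssa(struct toy_shader *s)
{
    /* Models the correctness dependency on splitting. */
    assert(s->unsplit_copies == 0);
    if (s->local_vars == 0)
        return false;
    s->local_vars--;
    return true;
}

static void toy_optimize(struct toy_shader *s)
{
    /* Run once, before the loop. */
    toy_split_var_copies(s);

    /* Repeat the iterative passes until nothing changes. */
    bool progress;
    do {
        progress = false;
        progress |= toy_vars_to_ssa(s);
    } while (progress);
}
```

In this model the split pass runs exactly once regardless of how many iterations the loop needs.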
<jekstrand>
Sorry, I should avoid code review for the first few hours of Mondays
<jekstrand>
:P
<karolherbst>
:D
konstantin has joined #dri-devel
nchery is now known as Guest2225
nchery has joined #dri-devel
Duke`` has joined #dri-devel
FireBurnUK has joined #dri-devel
FireBurn has quit [Read error: Connection reset by peer]
<karolherbst>
will check what's up with nir_lower_undef_to_zero, but I assume that's libclc using undefined values
<karolherbst>
mhh, used in phis
<karolherbst>
_Z12__clc_remquoDv3_fS_PDv3_i has two even and it's quite small
<karolherbst>
jekstrand: ahh yeah...
Guest2225 has quit [Ping timeout: 480 seconds]
<karolherbst>
uhhh
<alyssa>
karolherbst: could be a libclc bug?
<karolherbst>
nope
<karolherbst>
it's just normal C
<karolherbst>
if this is a bug, all C is buggy :p
<alyssa>
correct
<karolherbst>
_although_ you could write code differently to not make that mistake
<karolherbst>
also, jekstrand: potential optimization possible
FireBurnUK has quit [Read error: No route to host]
FireBurn has joined #dri-devel
Duke`` has quit [Ping timeout: 480 seconds]
FireBurnUK has joined #dri-devel
FireBurn has quit [Read error: Connection reset by peer]
shankaru has quit [Quit: Leaving.]
shankaru has joined #dri-devel
Duke`` has joined #dri-devel
FireBurnUK has quit [Read error: Connection reset by peer]
FireBurnUK has joined #dri-devel
jewins1 has joined #dri-devel
jewins has quit [Ping timeout: 480 seconds]
maxzor has joined #dri-devel
Arsen has quit [Quit: Quit.]
Arsen has joined #dri-devel
lemonzest has quit [Ping timeout: 480 seconds]
<jekstrand>
karolherbst: What?
<karolherbst>
jekstrand: see my comment on the MR
<karolherbst>
but I'm inclined to ignore it as after inlining that all goes away anyway
<jekstrand>
karolherbst: Sounds like the undef is harmless then
<karolherbst>
yep
<jekstrand>
Ok, cool.
<karolherbst>
clc is doing a bunch of those sadly
<jekstrand>
That's fine. We don't need to run undef_to_zero
<jekstrand>
Sure
<jekstrand>
But I kinda don't care
<jekstrand>
Like, that's a standard C pattern and it's safe.
<jekstrand>
I was more concerned that maybe there was actual UB in libclc.
<jekstrand>
But if there isn't then we can keep using undef
<karolherbst>
that might be, but clc is huge
<jekstrand>
karolherbst: Sure, but we'll probably find out pretty quick if we've broken libclc
<karolherbst>
mhh, well.. I think we do have a bug somewhere, but I can't really find it
<jekstrand>
?
<karolherbst>
so llvmpipe is a bit broken the more passes I throw at the kernel, but I couldn't find a reason for this to happen
<jekstrand>
ugh
<karolherbst>
it segfaults, though
<jekstrand>
That should be findable unless it's segfaulting in JIT code
<karolherbst>
it's in JIT code
<jekstrand>
:(
<karolherbst>
yeah...
<karolherbst>
I am inclined to not care as long as iris is fine
<karolherbst>
we'll figure it out someday, but...
<karolherbst>
it's unrelated to my change though
<karolherbst>
and more related to the passes I run inside rusticl
sdutt has joined #dri-devel
sdutt has quit []
sdutt has joined #dri-devel
<karolherbst>
jekstrand: huh.. there is something which looks a bit wrong though
<karolherbst>
we might be able to optimize then though :D
<karolherbst>
*that
mdroper has quit []
ybogdano has joined #dri-devel
<alyssa>
The more I think about undef_to_zero the more I conclude what we actually want is a much more aggressive opt_undef
<jekstrand>
undef_to_zero is kind-of horrible
<jekstrand>
It's not actually the language behavior you want
<alyssa>
one that transforms anything(undef) -> undef
<karolherbst>
alyssa: can't really do that
<alyssa>
that would take care of the use case of undef_to_zero combining well with opt_algebraic
<alyssa>
karolherbst: no? not even for alu?
<karolherbst>
only if it's scalar I guess
<jekstrand>
Yeah, (undef | 0xffff) | 0xffff0000
<karolherbst>
ahh, and that ^^
<alyssa>
oh, ugh, that's really cursed.
<jekstrand>
Yeah, I've thought about how to do this. It's hard.
<jekstrand>
I think valgrind, ubsan, and others have a concept of bitwise undef
<jekstrand>
And maybe LLVM too?
<jekstrand>
It's really annoying once you actually think about it for a while
<karolherbst>
maybe we need to declare what effects inputs have on outputs, or mark some instructions as "all inputs affect all output bits"
<alyssa>
undef_to_zero doesn't sound all that bad anymore :-p
<jekstrand>
undef << x | y. No idea. Could be totally valid depending on context.
<jekstrand>
Yeah, the advantage of undef_to_zero is that you can usually fold zero
<karolherbst>
I think the better approach is to figure out if the undef bits are unused and optimize them away
<jekstrand>
karolherbst: Have fun with that!
<karolherbst>
we don't do value range tracking, do we? :P
<jekstrand>
One day, I'd like NIR to grow full competent range analysis and undef analysis could be part of that.
<karolherbst>
although that only helps a little
<alyssa>
is nir_range_analysis incompetent? :v
<karolherbst>
yeah...
<jekstrand>
Like, exactly the same logic as you use for tracking definitely-set and definitely-unset bits can be used to track undef bits.
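A minimal sketch of that idea, assuming a model where every undefined bit is some arbitrary 0 or 1 (so OR with a known 1 gives a defined 1). The `bit_state` struct and the `bits_*` helpers are hypothetical and are not part of nir_range_analysis.

```c
#include <assert.h>
#include <stdint.h>

/* Per-value state. A bit set in none of the masks is defined but unknown. */
struct bit_state {
    uint32_t ones;   /* bits known to be 1 */
    uint32_t zeros;  /* bits known to be 0 */
    uint32_t undef;  /* bits that are undefined */
};

/* A bit may appear in at most one of the three masks. */
static void bits_check(struct bit_state s)
{
    assert((s.ones & s.zeros) == 0);
    assert((s.undef & (s.ones | s.zeros)) == 0);
}

static struct bit_state bits_const(uint32_t v)
{
    struct bit_state s = { .ones = v, .zeros = ~v, .undef = 0 };
    return s;
}

static struct bit_state bits_undef(void)
{
    struct bit_state s = { .ones = 0, .zeros = 0, .undef = UINT32_MAX };
    return s;
}

/* OR: a known 1 on either side defines the bit, even if the other is undef. */
static struct bit_state bits_or(struct bit_state a, struct bit_state b)
{
    struct bit_state r;
    r.ones = a.ones | b.ones;
    r.zeros = a.zeros & b.zeros;
    r.undef = (a.undef | b.undef) & ~r.ones;
    bits_check(r);
    return r;
}

/* AND: a known 0 on either side defines the bit. */
static struct bit_state bits_and(struct bit_state a, struct bit_state b)
{
    struct bit_state r;
    r.ones = a.ones & b.ones;
    r.zeros = a.zeros | b.zeros;
    r.undef = (a.undef | b.undef) & ~r.zeros;
    bits_check(r);
    return r;
}
```

Under these rules `(undef | 0xffff) | 0xffff0000` ends up with an empty undef mask, which is why folding `anything(undef)` to `undef` would be wrong for that expression.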
<jekstrand>
alyssa: It's incomplete
<bnieuwenhuizen>
also stuff like opencoded bounds (min(max(undef, x), y)) using comparisons shows why, even going as far as comparisons, it gets messy
<alyssa>
Ekstrand's Second Incompleteness Theorem states that all value range tracking systems for a sufficiently powerful IR are incompetent or incomplete.
lemonzest has joined #dri-devel
konstantin has quit []
aravind has quit [Ping timeout: 480 seconds]
<jekstrand>
alyssa: Pretty sure that's not even a lemma. It's more one of those "it's obvious that..."
<alyssa>
very humble of you
<karolherbst>
jekstrand: yeah soo.. opt_if somehow only ends up increasing size
<jekstrand>
ok...
<karolherbst>
ahh yeah.. added inots
<karolherbst>
so it moves stuff from the else into the then block and adds inots
<karolherbst>
ehh
<jekstrand>
Woo! Found some gallium bugs...
<karolherbst>
wait..
<karolherbst>
I have an idea
<karolherbst>
jekstrand: sooo.. opt_if only makes sense if we run opt_algebraic as well
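A plain-C analogy of the effect described above, not actual NIR: moving the else body into the then block costs an extra inversion, and a later algebraic cleanup can fold that inversion into the comparison. The function names are hypothetical, and the folding step is an assumption about what opt_algebraic would do in this case.

```c
#include <stdbool.h>

/* Before: empty then-branch, work in the else-branch. */
static int before_opt_if(int a, int b)
{
    int r = 0;
    bool cond = a < b;
    if (cond) {
        /* nothing */
    } else {
        r = a - b;
    }
    return r;
}

/* After the swap: body moved to the then-branch, one extra instruction. */
static int after_opt_if(int a, int b)
{
    int r = 0;
    bool cond = a < b;
    bool ncond = !cond;  /* the added "inot" */
    if (ncond)
        r = a - b;
    return r;
}

/* After an algebraic cleanup: !(a < b) folded to a >= b (integers only). */
static int after_algebraic(int a, int b)
{
    int r = 0;
    if (a >= b)
        r = a - b;
    return r;
}
```

All three functions compute the same result; only the middle one carries the extra inversion, which matches the observed size increase when the cleanup pass is not run.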
<karolherbst>
question is.. should I drop the optimize flag and always call opt_algebraic?
heat_ has joined #dri-devel
heat has quit [Read error: No route to host]
<karolherbst>
not sure if the optimize flag makes sense at all anyway... I tried to only call passes which all drivers are fine with (hence running such a pessimistic nir_opt_peephole_select with 0, false, false instead of 8, true, true)
<karolherbst>
and opt_algebraic already depends on the nir_options, so I guess that's fine as well
<jekstrand>
karolherbst: Microsoft wants the optimize flag
<jekstrand>
So we can't just drop it
<karolherbst>
ohh, so they have to get the libclc shaders without opts run on them?
<jekstrand>
Should we always call opt_algebraic? I don't see why not. We're going to disk cache it anyway.
<jekstrand>
karolherbst: No, they want to optimize without a disk cache
<zmike>
jekstrand: if you're good with zmike/32 I'm gonna add your ack and marge since I've got some other stuff that will probably conflict
<karolherbst>
jekstrand: yeah, but I meant drop the optimize flag and always optimize
<jekstrand>
zmike: Fine with me.
<karolherbst>
so the question is, does somebody want to get libclc without opts at all
<jekstrand>
karolherbst: I think we don't want opts for rusticl or clover if there's no disk cache
<jekstrand>
It's 17s vs. 3s for clinfo on panfrost.
<karolherbst>
ahhh, I see
<karolherbst>
okay, makes sense then
<jekstrand>
If we could figure out on-demand loading, that'd be swell.
<karolherbst>
yeah
shankaru has quit [Quit: Leaving.]
slattann has quit [Ping timeout: 480 seconds]
maxzor has quit [Ping timeout: 480 seconds]
<karolherbst>
jekstrand: does brw_kernel.c want to always optimize or just with a cache?
<jekstrand>
probably just with a cache
<jekstrand>
I think it's only microsoft that wants to optimize without one and only because they have a different caching mechanism.
<karolherbst>
yeah.. probably
<karolherbst>
clover is slow :(
<karolherbst>
ehh maybe I should try a release build
<karolherbst>
clinfo runtime clover vs rusticl: 0.180s vs 0.080s
<karolherbst>
it's like 0.110s if I disable spirv and nir caching
h0tc0d3 has quit [Quit: Leaving]
<karolherbst>
ohh.. I forgot to enable the opts
<karolherbst>
0.06s :)
<karolherbst>
nvm
<karolherbst>
jekstrand: there is one thing I am wondering about though.. do you think it makes sense to key the optimize flag as well?
<karolherbst>
otherwise you'll get the unoptimized libclc if you disabled caching via env the next time you run with caching enabled
<karolherbst>
ehhh
<karolherbst>
wait...
<karolherbst>
that doesn't make sense
MajorBiscuit has joined #dri-devel
mbrost has joined #dri-devel
FireBurnUK is now known as FireBurn
ngcortes has joined #dri-devel
bbrezillon has joined #dri-devel
MajorBiscuit has quit [Ping timeout: 480 seconds]
maxzor has joined #dri-devel
mbrost has quit [Remote host closed the connection]
heat has quit [Read error: Connection reset by peer]
<pcercuei>
done
alyssa has left #dri-devel [#dri-devel]
mbrost_ has joined #dri-devel
mbrost has quit [Read error: Connection reset by peer]
MajorBiscuit has joined #dri-devel
MajorBiscuit has quit []
MajorBiscuit has joined #dri-devel
ybogdano has joined #dri-devel
jewins has joined #dri-devel
ngcortes has quit [Remote host closed the connection]
Major_Biscuit has joined #dri-devel
MajorBiscuit has quit [Ping timeout: 480 seconds]
ngcortes has joined #dri-devel
JohnnyonFlame has joined #dri-devel
gouchi has quit [Remote host closed the connection]
dj-death has quit [Ping timeout: 480 seconds]
neonking_ has joined #dri-devel
neonking has quit [Ping timeout: 480 seconds]
Namarrgon has quit [Quit: WeeChat 3.4]
Namarrgon has joined #dri-devel
heat_ has quit [Remote host closed the connection]
heat_ has joined #dri-devel
heat_ has quit [Read error: No route to host]
Haaninjo has quit [Quit: Ex-Chat]
heat has joined #dri-devel
ybogdano has quit [Ping timeout: 480 seconds]
ybogdano has joined #dri-devel
neonking_ has quit []
neonking has joined #dri-devel
<jekstrand>
karolherbst: My image count and sampler patches are looking good on iris in CI. I'm going to cherry-pick them into my branch and see where things are at.
<karolherbst>
doesn't work for CL :P
<jekstrand>
karolherbst: You may have an old version. I've fixed a bunch of bugs.
<karolherbst>
wait.. I have some patches to fix _some_ issues
<jekstrand>
If there are CL bugs, I'll fix them.
<karolherbst>
mhh? okay
<karolherbst>
did you fix the binding table issue?
<jekstrand>
maybe?
<karolherbst>
like the one where the hw only has 32 slots?
<jekstrand>
test name?
<karolherbst>
api min_max_read_image_args
<karolherbst>
and min_max_write_image_args
<karolherbst>
essentially the binding table runs full
<karolherbst>
I had patches to make it bigger than 32, but the hw only has 4 bits or something
<karolherbst>
or well.. genxml
<karolherbst>
jekstrand: btw, I pushed my branch with all the caching and libclc stuff
<karolherbst>
no regressions on llvmpipe
<jekstrand>
ok
<karolherbst>
can't say the same for iris, because... my machine dies
ybogdano has quit [Ping timeout: 480 seconds]
<jekstrand>
heh
<karolherbst>
let's see, your image patches are on rusticl/wip still, correct?
Major_Biscuit has quit [Ping timeout: 480 seconds]
<jekstrand>
no
<jekstrand>
not yet
<karolherbst>
ahh..
<karolherbst>
maybe I should remove them from my branch then as well
Major_Biscuit has joined #dri-devel
<karolherbst>
jekstrand: might have to do a "meson configure build -Drust_std=2021"
<karolherbst>
the latter should be on my branch though
mbrost_ has quit [Ping timeout: 480 seconds]
Duke`` has joined #dri-devel
jewins has quit [synthon.oftc.net reflection.oftc.net]
nchery has quit [synthon.oftc.net reflection.oftc.net]
sadlerap has quit [synthon.oftc.net reflection.oftc.net]
fxkamd has quit [synthon.oftc.net reflection.oftc.net]
alanc has quit [synthon.oftc.net reflection.oftc.net]
oneforall2 has quit [synthon.oftc.net reflection.oftc.net]
tlwoerner has quit [synthon.oftc.net reflection.oftc.net]
jessica_24 has quit [synthon.oftc.net reflection.oftc.net]
OftenTimeConsuming has quit [synthon.oftc.net charm.oftc.net]
anholt has quit [synthon.oftc.net reflection.oftc.net]
quantum5 has quit [synthon.oftc.net reflection.oftc.net]
orbea has quit [synthon.oftc.net reflection.oftc.net]
seanpaul has quit [synthon.oftc.net reflection.oftc.net]
lstrano has quit [synthon.oftc.net reflection.oftc.net]
aknautiy has quit [synthon.oftc.net reflection.oftc.net]
jhli has quit [synthon.oftc.net reflection.oftc.net]
stuartsummers has quit [synthon.oftc.net reflection.oftc.net]
mattrope has quit [synthon.oftc.net reflection.oftc.net]
demarchi has quit [synthon.oftc.net reflection.oftc.net]
TD-Linux has quit [synthon.oftc.net reflection.oftc.net]
radii has quit [synthon.oftc.net reflection.oftc.net]
Karyon has quit [synthon.oftc.net reflection.oftc.net]
xyene has quit [synthon.oftc.net reflection.oftc.net]
bcheng has quit [synthon.oftc.net reflection.oftc.net]
remexre has quit [synthon.oftc.net reflection.oftc.net]
SolarAquarion has quit [synthon.oftc.net reflection.oftc.net]
exit70 has quit [synthon.oftc.net reflection.oftc.net]
samueldr has quit [synthon.oftc.net reflection.oftc.net]
kurufu has quit [synthon.oftc.net reflection.oftc.net]
imirkin has quit [synthon.oftc.net reflection.oftc.net]
rossy has quit [synthon.oftc.net reflection.oftc.net]
robink has quit [synthon.oftc.net reflection.oftc.net]
zzag has quit [synthon.oftc.net reflection.oftc.net]
ajax has quit [synthon.oftc.net reflection.oftc.net]
mattst88 has quit [synthon.oftc.net reflection.oftc.net]
jolan has quit [synthon.oftc.net reflection.oftc.net]
JTL has quit [synthon.oftc.net reflection.oftc.net]
Lightning has quit [synthon.oftc.net reflection.oftc.net]
steev has quit [synthon.oftc.net reflection.oftc.net]
lemes has quit [synthon.oftc.net reflection.oftc.net]
enilflah has quit [synthon.oftc.net reflection.oftc.net]
jrayhawk has quit [synthon.oftc.net reflection.oftc.net]
siqueira has quit [synthon.oftc.net reflection.oftc.net]
smaeul has quit [synthon.oftc.net reflection.oftc.net]
austriancoder has quit [synthon.oftc.net reflection.oftc.net]
arnd has quit [synthon.oftc.net reflection.oftc.net]
neonking has quit [synthon.oftc.net larich.oftc.net]
sdutt has quit [synthon.oftc.net larich.oftc.net]
HankB has quit [synthon.oftc.net larich.oftc.net]
Emmy_ has quit [synthon.oftc.net larich.oftc.net]
flacks has quit [synthon.oftc.net larich.oftc.net]
Kayden has quit [synthon.oftc.net larich.oftc.net]
anujp has quit [synthon.oftc.net larich.oftc.net]
LexSfX has quit [synthon.oftc.net larich.oftc.net]
andrey-konovalov has quit [synthon.oftc.net larich.oftc.net]
soreau has quit [synthon.oftc.net larich.oftc.net]
jljusten has quit [synthon.oftc.net larich.oftc.net]
vyivel has quit [synthon.oftc.net larich.oftc.net]
rcf has quit [synthon.oftc.net larich.oftc.net]
macromorgan has quit [synthon.oftc.net larich.oftc.net]
rsripada has quit [synthon.oftc.net larich.oftc.net]
Znullptr has quit [synthon.oftc.net larich.oftc.net]
rpigott has quit [synthon.oftc.net larich.oftc.net]
xperia64 has quit [synthon.oftc.net larich.oftc.net]
dolphin has quit [synthon.oftc.net larich.oftc.net]
Lightsword has quit [synthon.oftc.net larich.oftc.net]
dri-logger has quit [synthon.oftc.net larich.oftc.net]
mareko has quit [synthon.oftc.net larich.oftc.net]
glisse has quit [synthon.oftc.net larich.oftc.net]
mslusarz has quit [synthon.oftc.net larich.oftc.net]
Sachiel has quit [synthon.oftc.net larich.oftc.net]
cengiz_io has quit [synthon.oftc.net larich.oftc.net]
neoXite__ has quit [synthon.oftc.net larich.oftc.net]
sagar__ has quit [synthon.oftc.net larich.oftc.net]
MTCoster has quit [synthon.oftc.net larich.oftc.net]
kem has quit [synthon.oftc.net larich.oftc.net]
kisak has quit [synthon.oftc.net larich.oftc.net]
craftyguy has quit [synthon.oftc.net reflection.oftc.net]
krushia has quit [synthon.oftc.net larich.oftc.net]
dschuermann has quit [synthon.oftc.net larich.oftc.net]
CosmicPenguin has quit [synthon.oftc.net larich.oftc.net]
cphealy has quit [synthon.oftc.net larich.oftc.net]
jhugo has quit [synthon.oftc.net larich.oftc.net]
hfink has quit [synthon.oftc.net larich.oftc.net]
markyacoub has quit [synthon.oftc.net larich.oftc.net]
tchar has quit [synthon.oftc.net larich.oftc.net]
jstultz has quit [synthon.oftc.net larich.oftc.net]
graphitemaster has quit [synthon.oftc.net larich.oftc.net]
zmike has quit [synthon.oftc.net larich.oftc.net]
narmstrong has quit [synthon.oftc.net larich.oftc.net]
`join_subline has quit [synthon.oftc.net larich.oftc.net]
ogabbay has quit [synthon.oftc.net larich.oftc.net]
JohnnyonFlame has quit [synthon.oftc.net weber.oftc.net]
ngcortes has quit [synthon.oftc.net weber.oftc.net]
eukara has quit [synthon.oftc.net weber.oftc.net]
mclasen has quit [synthon.oftc.net weber.oftc.net]
sneil has quit [synthon.oftc.net weber.oftc.net]
sumits has quit [synthon.oftc.net weber.oftc.net]
Lyude has quit [synthon.oftc.net weber.oftc.net]
abhinav__ has quit [synthon.oftc.net weber.oftc.net]
anarsoul has quit [synthon.oftc.net weber.oftc.net]
flto has quit [synthon.oftc.net weber.oftc.net]
aswar002 has quit [synthon.oftc.net weber.oftc.net]
agd5f has quit [synthon.oftc.net weber.oftc.net]
zf has quit [synthon.oftc.net weber.oftc.net]
tjmercier has quit [synthon.oftc.net weber.oftc.net]
kj has quit [synthon.oftc.net weber.oftc.net]
Ryback_ has quit [synthon.oftc.net weber.oftc.net]
mdnavare has quit [synthon.oftc.net weber.oftc.net]
unerlige has quit [synthon.oftc.net weber.oftc.net]
sarnex has quit [synthon.oftc.net weber.oftc.net]
rcn-ee_ has quit [synthon.oftc.net weber.oftc.net]
rib___ has quit [synthon.oftc.net weber.oftc.net]
nikitalita48 has quit [synthon.oftc.net weber.oftc.net]
mmenzyns has quit [synthon.oftc.net weber.oftc.net]
lileo___ has quit [synthon.oftc.net weber.oftc.net]
benettig has quit [synthon.oftc.net weber.oftc.net]
sh_zam has quit [synthon.oftc.net weber.oftc.net]
eletrotupi has quit [synthon.oftc.net weber.oftc.net]
chadv has quit [synthon.oftc.net weber.oftc.net]
jbarnes has quit [synthon.oftc.net weber.oftc.net]
sumoon has quit [synthon.oftc.net weber.oftc.net]
pzanoni has quit [synthon.oftc.net weber.oftc.net]
ceyusa has quit [synthon.oftc.net weber.oftc.net]
SanchayanMaity has quit [synthon.oftc.net weber.oftc.net]
Frogging101 has quit [synthon.oftc.net weber.oftc.net]
sauce has quit [synthon.oftc.net weber.oftc.net]
ezequielg has quit [synthon.oftc.net weber.oftc.net]
tfiga has quit [synthon.oftc.net weber.oftc.net]
ifreund has quit [synthon.oftc.net weber.oftc.net]
robher has quit [synthon.oftc.net weber.oftc.net]
angular_mike_____ has quit [synthon.oftc.net weber.oftc.net]
mmx_in_orbit has quit [synthon.oftc.net weber.oftc.net]
rodrigovivi has quit [synthon.oftc.net weber.oftc.net]
eric_engestrom has quit [synthon.oftc.net weber.oftc.net]
krh has quit [synthon.oftc.net weber.oftc.net]
hwentlan____ has quit [synthon.oftc.net weber.oftc.net]
robclark has quit [synthon.oftc.net weber.oftc.net]
cmarcelo has quit [synthon.oftc.net weber.oftc.net]
airlied has quit [synthon.oftc.net weber.oftc.net]
rg3igalia has quit [synthon.oftc.net weber.oftc.net]
cwabbott has quit [synthon.oftc.net weber.oftc.net]
melissawen has quit [synthon.oftc.net weber.oftc.net]
clever has quit [synthon.oftc.net weber.oftc.net]
swivel has quit [synthon.oftc.net weber.oftc.net]
Simonx22 has quit [synthon.oftc.net weber.oftc.net]
gpiccoli has quit [synthon.oftc.net weber.oftc.net]
jekstrand has quit [synthon.oftc.net weber.oftc.net]
dianders has quit [synthon.oftc.net weber.oftc.net]
daniels has quit [synthon.oftc.net weber.oftc.net]
ngcortes has joined #dri-devel
eukara has joined #dri-devel
Lyude has joined #dri-devel
tjmercier has joined #dri-devel
zf has joined #dri-devel
Ryback_ has joined #dri-devel
mdnavare has joined #dri-devel
kj has joined #dri-devel
unerlige has joined #dri-devel
sarnex has joined #dri-devel
nikitalita48 has joined #dri-devel
daniels has joined #dri-devel
rib___ has joined #dri-devel
mmx_in_orbit has joined #dri-devel
dianders has joined #dri-devel
ezequielg has joined #dri-devel
robclark has joined #dri-devel
lileo___ has joined #dri-devel
angular_mike_____ has joined #dri-devel
hwentlan____ has joined #dri-devel
sh_zam has joined #dri-devel
jbarnes has joined #dri-devel
gpiccoli has joined #dri-devel
ifreund has joined #dri-devel
Simonx22 has joined #dri-devel
krh has joined #dri-devel
eric_engestrom has joined #dri-devel
SanchayanMaity has joined #dri-devel
rcn-ee_ has joined #dri-devel
jekstrand has joined #dri-devel
melissawen has joined #dri-devel
benettig has joined #dri-devel
JohnnyonFlame has joined #dri-devel
airlied has joined #dri-devel
ceyusa has joined #dri-devel
eletrotupi has joined #dri-devel
mmenzyns has joined #dri-devel
sauce has joined #dri-devel
chadv has joined #dri-devel
Frogging101 has joined #dri-devel
sneil has joined #dri-devel
swivel has joined #dri-devel
mclasen has joined #dri-devel
sumits has joined #dri-devel
sumoon has joined #dri-devel
flto has joined #dri-devel
anarsoul has joined #dri-devel
abhinav__ has joined #dri-devel
agd5f has joined #dri-devel
aswar002 has joined #dri-devel
tfiga has joined #dri-devel
cwabbott has joined #dri-devel
rg3igalia has joined #dri-devel
rodrigovivi has joined #dri-devel
robher has joined #dri-devel
pzanoni has joined #dri-devel
cmarcelo has joined #dri-devel
clever has joined #dri-devel
sadlerap has joined #dri-devel
nchery has joined #dri-devel
jewins has joined #dri-devel
fxkamd has joined #dri-devel
anholt has joined #dri-devel
tlwoerner has joined #dri-devel
orbea has joined #dri-devel
oneforall2 has joined #dri-devel
seanpaul has joined #dri-devel
jessica_24 has joined #dri-devel
lstrano has joined #dri-devel
alanc has joined #dri-devel
mattrope has joined #dri-devel
jhli has joined #dri-devel
smaeul has joined #dri-devel
zzag has joined #dri-devel
arnd has joined #dri-devel
exit70 has joined #dri-devel
austriancoder has joined #dri-devel
ajax has joined #dri-devel
siqueira has joined #dri-devel
xyene has joined #dri-devel
imirkin has joined #dri-devel
SolarAquarion has joined #dri-devel
jolan has joined #dri-devel
enilflah has joined #dri-devel
lemes has joined #dri-devel
kurufu has joined #dri-devel
Karyon has joined #dri-devel
robink has joined #dri-devel
Lightning has joined #dri-devel
jrayhawk has joined #dri-devel
TD-Linux has joined #dri-devel
JTL has joined #dri-devel
rossy has joined #dri-devel
radii has joined #dri-devel
bcheng has joined #dri-devel
craftyguy has joined #dri-devel
mattst88 has joined #dri-devel
samueldr has joined #dri-devel
steev has joined #dri-devel
aknautiy has joined #dri-devel
stuartsummers has joined #dri-devel
remexre has joined #dri-devel
demarchi has joined #dri-devel
quantum5 has joined #dri-devel
Znullptr has joined #dri-devel
rpigott has joined #dri-devel
cphealy has joined #dri-devel
sdutt has joined #dri-devel
Emmy_ has joined #dri-devel
flacks has joined #dri-devel
soreau has joined #dri-devel
jljusten has joined #dri-devel
LexSfX has joined #dri-devel
andrey-konovalov has joined #dri-devel
vyivel has joined #dri-devel
rcf has joined #dri-devel
macromorgan has joined #dri-devel
rsripada has joined #dri-devel
Lightsword has joined #dri-devel
dolphin has joined #dri-devel
xperia64 has joined #dri-devel
dri-logger has joined #dri-devel
mareko has joined #dri-devel
glisse has joined #dri-devel
mslusarz has joined #dri-devel
Sachiel has joined #dri-devel
kem has joined #dri-devel
ogabbay has joined #dri-devel
dschuermann has joined #dri-devel
tchar has joined #dri-devel
MTCoster has joined #dri-devel
markyacoub has joined #dri-devel
zmike has joined #dri-devel
CosmicPenguin has joined #dri-devel
jstultz has joined #dri-devel
hfink has joined #dri-devel
narmstrong has joined #dri-devel
cengiz_io has joined #dri-devel
neoXite__ has joined #dri-devel
HankB has joined #dri-devel
`join_subline has joined #dri-devel
kisak has joined #dri-devel
krushia has joined #dri-devel
jhugo has joined #dri-devel
sagar__ has joined #dri-devel
graphitemaster has joined #dri-devel
neonking has joined #dri-devel
anujp has joined #dri-devel
Kayden has joined #dri-devel
OftenTimeConsuming has joined #dri-devel
<jekstrand>
karolherbst: No longer crashing but it fails. Working on why
<jekstrand>
karolherbst: I just came up with a terrible idea for how to do CL on crocus
<karolherbst>
:O
<karolherbst>
let's hear it
<karolherbst>
I mean.. nv50 supports CL 1.1, and that one is pre GL 4
<jekstrand>
So... HSW doesn't have 48-bit addresses, right? But this also means that the entire address space will fit inside 2 or 4 SSBOs.
<jekstrand>
(An SSBO size is 2^30 and I don't remember if HSW's address space is 32 or 31 bits)
<karolherbst>
can't even do compute shaders on GL
<karolherbst>
well... it can, but crappily
<karolherbst>
although I think the issue was more like it doesn't support image_load_store and hence we can't claim compute shaders support
<karolherbst>
something with surface ops being compute only
rkanwal has quit [Quit: rkanwal]
<karolherbst>
jekstrand: you can have device being 32 bits only
<karolherbst>
*Devices
<jekstrand>
So we bind 4 ssbos over the entire Address space and then load/store_global(addr) becomes load/store_ssbo(addr >> 30, addr & 0x3fffffff)
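The address split can be sketched in plain C. This is a sketch under the assumptions stated in the discussion (a 32-bit address space covered by four SSBOs of 2^30 bytes each); `ssbo_ref`, `split_global_addr` and `join_global_addr` are hypothetical names, not Mesa or i915 API.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption from the discussion: one SSBO covers 2^30 bytes. */
#define SSBO_SHIFT 30u
#define SSBO_MASK  ((UINT32_C(1) << SSBO_SHIFT) - 1)  /* 0x3fffffff */

/* Hypothetical (binding index, byte offset) pair for a load/store_ssbo. */
struct ssbo_ref {
    uint32_t index;   /* which of the 4 SSBOs, 0..3 */
    uint32_t offset;  /* byte offset inside that SSBO */
};

/* load/store_global(addr) -> load/store_ssbo(addr >> 30, addr & 0x3fffffff) */
static struct ssbo_ref split_global_addr(uint32_t addr)
{
    struct ssbo_ref ref = {
        .index = addr >> SSBO_SHIFT,
        .offset = addr & SSBO_MASK,
    };
    assert(ref.index < 4);  /* 2^32 / 2^30 = 4 windows */
    return ref;
}

/* Inverse mapping, useful for checking the split is lossless. */
static uint32_t join_global_addr(struct ssbo_ref ref)
{
    return (ref.index << SSBO_SHIFT) | ref.offset;
}
```

For example, address 0x40000004 maps to SSBO 1 at offset 4. An access that straddles a 1 GiB window boundary would presumably need extra handling, which this sketch does not attempt.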
<jekstrand>
The tricky part becomes uploading kernel arguments
ybogdano has joined #dri-devel
<jekstrand>
But there's no reason why we can't have a UBO with relocations in it.
<karolherbst>
why?
<karolherbst>
don't support generic and do 32 bit only?
<karolherbst>
would be enough for CL 1.2
<jekstrand>
We don't have actual addresses. Only the kernel knows the actual address and only at exec time.
<jekstrand>
So we need to patch the input UBO
<bnieuwenhuizen>
don't forget your unaligned loads/stores though :P
<karolherbst>
jekstrand: I still don't see the problem
* jekstrand
really needs to stop caring about HSW. It's like a drug.
<karolherbst>
so instead of pointers, we do ssbo offsets
<karolherbst>
ms is doing it as well, so it can't be that bad
<jekstrand>
karolherbst: Yeah, that's an option if we're ok with one SSBO per memory object
<karolherbst>
just needs a special case in kernel::launch I guess
<karolherbst>
jekstrand: should be fine for 32 bit devices
<karolherbst>
you just say your max buffer size is.. whatever fits in ssbo size
<jekstrand>
yeah
<karolherbst>
so if that's 2^30, that's fine
<jekstrand>
Yeah, that would work too
<karolherbst>
it's just stupid to do that on GPUs with like massive amounts of VRAM :)
<jekstrand>
And would involve less driver gymnastics
<karolherbst>
yep
<jekstrand>
It's pretty sad that HSW only has a 32-bit address space given how much ram you can pair with that processor.
<karolherbst>
I hardcode 64 pretty much everywhere atm, but I might wire up rusticl on nv50 and see where that goes
<karolherbst>
we only have 16 32 bit buffers
<karolherbst>
the GPUs VA is bigger, but...
<karolherbst>
none of those GPUs have more than 4GB VRAM.. I doubt they even have more than 1GB
<karolherbst>
ahh, seems like there were 2GB variants
<clever>
talking about a specific gpu line or gpu's in general?
<karolherbst>
a specific GPU line
<karolherbst>
jekstrand: yeah.. that's a bit odd
<karolherbst>
also sounds like a bad idea to have a smaller VA on an iGPU than the CPU/MMU supports..
mbrost has joined #dri-devel
<jekstrand>
Yeah....
<clever>
due to design limits, the pi4's 3d core only has a 32bit addr space, but the host can handle up to 16gig of ram
<clever>
there is a single level paging table and mmu patching that mismatch
<karolherbst>
oh wow
<clever>
and you need 4mb of physically contiguous memory to hold the paging table
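The 4 MB figure follows from simple arithmetic for a single-level table. A rough sketch, assuming 4 KiB pages and one 32-bit entry per page (both values are assumptions chosen because they reproduce the number quoted above):

```python
ADDRESS_BITS = 32   # width of the 3D core's address space, per the chat
PAGE_SHIFT = 12     # assumption: 4 KiB pages
PTE_BYTES = 4       # assumption: one 32-bit page-table entry per page

def page_table_bytes(address_bits=ADDRESS_BITS, page_shift=PAGE_SHIFT,
                     pte_bytes=PTE_BYTES):
    """Size of a flat, single-level page table covering the whole space."""
    entries = 1 << (address_bits - page_shift)   # one entry per page
    return entries * pte_bytes

print(page_table_bytes() // (1024 * 1024), "MiB")
```

Because the table is single-level, the whole array has to be one physically contiguous allocation rather than a tree of smaller pages.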
<karolherbst>
not sure what Intel needed that for, but Intel's GPUs also had some preallocated block of memory whose size you can even configure in the firmware
<clever>
drivers/gpu/drm/v3d/v3d_mmu.c deals with that
<clever>
for the pre-pi4 models (pi0-pi3), the 3d core is also limited to 32bit, but the host only supports 1gig of ram
<clever>
the extra 2 bits of the address are used for cache-control flags
<karolherbst>
ahh fun.. i915 survived this run
<clever>
`addr | 0xc000_0000` means to ignore all caches
<clever>
`addr | 0x8000_0000` is both coherent with the L2, and allocates into the L2
<clever>
but the ARM core isn't coherent with the L2 on some models
<clever>
so you need to use the right addr for the model
<karolherbst>
jekstrand: mhh weird.. drm-tip looks more stable
alyssa has joined #dri-devel
<clever>
i think 0x4000_0000 was coherent but non-allocating?
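A small sketch of the alias scheme described above, where the top two bits of a 32-bit bus address select the cache behaviour. The constant and function names are hypothetical, and the 0x4000_0000 meaning is only as recalled (with some doubt) in the chat.

```python
ALIAS_MASK = 0xC000_0000
ALIAS_UNCACHED = 0xC000_0000             # "ignore all caches"
ALIAS_L2_ALLOCATING = 0x8000_0000        # coherent with L2, allocates into L2
ALIAS_L2_NON_ALLOCATING = 0x4000_0000    # recalled in the chat, not confirmed

def with_alias(phys_addr, alias):
    """Tag a physical address (below 1 GiB) with a cache-control alias."""
    if phys_addr & ALIAS_MASK:
        raise ValueError("physical address must fit in the low 30 bits")
    return phys_addr | alias

def strip_alias(bus_addr):
    """Recover the physical address by clearing the two alias bits."""
    return bus_addr & ~ALIAS_MASK & 0xFFFF_FFFF

print(hex(with_alias(0x0010_0000, ALIAS_UNCACHED)))
```

Which alias is correct depends on the model, since the ARM core is not coherent with the L2 on all of them.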
* alyssa
wonders why she's getting f2f16(phi(f2f32(x), f2f32(y))) instead of a 16-bit phi..
<karolherbst>
wow.. the CPU is basically idling most of the time with the caches
<jekstrand>
alatiera: Our down-of-upcast optimization can't see through phis. :(
<jekstrand>
alyssa, rather ^^
Duke`` has quit [Ping timeout: 480 seconds]
<alyssa>
jekstrand: right..
* karolherbst
tries -j48
<jekstrand>
I've been tempted to write a phi-of-unop CSE like thing
<alyssa>
heh
<karolherbst>
weird.. things won't go faster if the GPU is already at 100% all the time
<jekstrand>
But then someone's going to want me to handle fneg(phi(fneg(x), y))
<alyssa>
I would like that handled yes! :-p
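A toy illustration of the rewrite being discussed, on a made-up tuple IR rather than NIR. It assumes every phi source is an upcast of a 16-bit value (so the downcast of the upcast is exact) and that the phi has no other users; the mixed case such as fneg(phi(fneg(x), y)) is deliberately left untouched.

```python
def fold_downcast_of_phi(expr):
    """Rewrite f2f16(phi(f2f32(a), f2f32(b), ...)) into phi(a, b, ...)."""
    if not (isinstance(expr, tuple) and expr[0] == "f2f16"):
        return expr
    phi = expr[1]
    if not (isinstance(phi, tuple) and phi[0] == "phi"):
        return expr
    sources = phi[1:]
    # Only fold when *every* source is an upcast; otherwise keep the original.
    if not all(isinstance(s, tuple) and s[0] == "f2f32" for s in sources):
        return expr
    return ("phi",) + tuple(s[1] for s in sources)

before = ("f2f16", ("phi", ("f2f32", "x"), ("f2f32", "y")))
print(fold_downcast_of_phi(before))
```

A real pass would also have to insert conversions on the sources that are not already upcasts, which is where the generalisation gets awkward.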
<karolherbst>
"2176/2176 02:47" :3
<alyssa>
jekstrand: Oh, it's even more sinister than that
<alyssa>
My test does "if (x) { gl_FragColor = A; } else { gl_FragColor = B }"
<jekstrand>
karolherbst: \o/
<alyssa>
gl_FragColor is forced to be fp32 for some reason, so you get the conversion mess
<karolherbst>
jekstrand: seems like something one can code generically
<jekstrand>
karolherbst: Is that on iris or lavapipe?
<jekstrand>
llvmpipe, rather
<karolherbst>
jekstrand: iris :D
<karolherbst>
llvmpipe is a bit slower
<alyssa>
If I rewrite it as "vec4 temp; if (x) { temp = A; } else { temp = B } gl_FragColor = temp"
<jekstrand>
karolherbst: Did you fix the max image tests?
<alyssa>
..there are no conversions
<jekstrand>
karolherbst: Or is that a subset
<karolherbst>
lp needs like 17 minutes with a cold cache
<karolherbst>
6 with a hot one
<karolherbst>
jekstrand: ahh no.. it still fails
<karolherbst>
mostly benchmarking the caching stuff
<jekstrand>
karolherbst: Ah
<karolherbst>
on lp the difference is like... small
<karolherbst>
guess if it's busy doing compute on the CPU caching won't help much
<jekstrand>
alyssa: Yeah, makes sense.
<karolherbst>
but for iris it kind of halves the time it needs to run through the tests
<jekstrand>
alyssa: Are you using opt_peephole_select? I'm not sure if we can see through bcsel either, but that's easier than a phi
<karolherbst>
k.. filing a bug now
mbrost has quit [Ping timeout: 480 seconds]
<alyssa>
jekstrand: I am, `A` is a texture op (to test whether my RA can handle phi nodes good :p)
<jekstrand>
alyssa: Ah. Yeah....
<alyssa>
(answer: it can't yet, too many moves)
<alyssa>
..wait, copyprop should be helping here
<alyssa>
can I not copyprop into phis? shame, I should fix that!
<jekstrand>
copy-prop can propagate into phies
<jekstrand>
*phis
<alyssa>
AGX copyprop
<jekstrand>
ay
<jekstrand>
ah
<alyssa>
optimizer bug: it was iterating over `opcode info.num sources` sources instead of using the agx_foreach_src iterator, which does the right thing for phis
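A toy model of that class of bug, not the AGX code: a phi has one source per predecessor block, so a fixed per-opcode source count cannot describe it. The table, the `Instr` class and both helper names are hypothetical.

```python
# Hypothetical static table: source counts fixed per opcode.
# The 0 for "phi" is an assumption standing in for "no fixed count".
STATIC_SOURCE_COUNT = {"mov": 1, "fadd": 2, "phi": 0}

class Instr:
    def __init__(self, op, srcs):
        self.op = op
        self.srcs = list(srcs)

def sources_from_opcode_table(instr):
    """Buggy pattern: trusts the static table, so phi sources are skipped."""
    return instr.srcs[:STATIC_SOURCE_COUNT[instr.op]]

def sources_from_instruction(instr):
    """Safer pattern: walk the sources the instruction actually carries."""
    return list(instr.srcs)

phi = Instr("phi", ["a", "b", "c"])
print(sources_from_opcode_table(phi), sources_from_instruction(phi))
```

A copy-propagation pass built on the first helper never visits the phi's operands, which matches the symptom of copies not being propagated into phis.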
<alyssa>
ok, now it's copypropped but then RA can't coalesce the phi so no change in total instruction count
<jekstrand>
Yeah, but that's still a way better RA test
<alyssa>
ah, well, yes
<alyssa>
cleaner IR, too
<jekstrand>
:D
<alyssa>
Admittedly I am very overwhelmed by actually doing coalescing
<alyssa>
merge sets seem really complicated.
<jekstrand>
alyssa: Didn't you already break that all out into a helper so it's easy now? :P
<alyssa>
Hm?
<jekstrand>
merge sets
<alyssa>
no(t yet)
<alyssa>
parallel copy lowering is what I stole from ir3 for the benefit of the commons
* jekstrand
hasn't actually read the MR. It's easier to be snarky if you don't know what's going on.
<alyssa>
basically FOSS robin hood
* karolherbst
filed a bug
<alyssa>
I also don't know if the ir3 people will hate that MR
* karolherbst
thinks about wiring up CL spir-v support
<karolherbst>
ehhh the only thing I really don't like about markdown is new line handling
<jekstrand>
karolherbst: I think clover is messing up my inputs. :-/
<jekstrand>
rusticl, rather
<karolherbst>
noooo
<karolherbst>
how so?
<jekstrand>
hrm... maybe not?
* jekstrand
is confused
<karolherbst>
yeah, I am sure rusticl is fine :p
<jekstrand>
Well, someone's messing them up. :P
<karolherbst>
yeah well..
<karolherbst>
depends on what gets messed up
<karolherbst>
Intel's stack does some weird indexing of images/textures if you get both of them, but you might hit something else