<karolherbst>
I am sure this also fixes other random crashes with fp64
<karolherbst>
what a pita of a bug
<airlied>
thanks for digging in!
<karolherbst>
airlied: yeah.. that works :)
<airlied>
16288 has the patch
<karolherbst>
heh.. my ADL-S doesn't seem so much faster than my CML-H
<karolherbst>
ahh.. it's GT-1 vs GT-2
<karolherbst>
mhh, it should still be faster..
<airlied>
make sure you took off the LP_NUM_THREADS :-P
<karolherbst>
airlied: how can I make llvmpipe use more threads? :D
<karolherbst>
airlied: nah.. I was testing iris first
<karolherbst>
heh.. LP_NUM_THREADS=24 and still only 1100%
<karolherbst>
where is my perf
<karolherbst>
airlied: so uhm... how do I get more perf out of llvmpipe on my machine? :D
<karolherbst>
guess local size of 32 isn't helping? dunno
<karolherbst>
468 points and image validation seems happy
<airlied>
not really sure, it's probably limited by launch params
<karolherbst>
yeah... so luxmark uses 32 threads on CPU devices
<karolherbst>
and 64 on GPUs
<HdkR>
Is llvmpipe still bounded by vertex heavy jobs rather than fragment?
<karolherbst>
that's on CL :P
<HdkR>
oh wow
<karolherbst>
so.. llvmpipe is a GPU now
<karolherbst>
still only 1100%
* karolherbst
doesn't have 20 cores for nothing
<karolherbst>
but iris seems a little slow
<karolherbst>
so iris on my desktop should be around 50% faster
<karolherbst>
but is only 10%
<karolherbst>
intel_gpu_top says 99%
<karolherbst>
¯\_(ツ)_/¯
<karolherbst>
maybe I hurt perf
<karolherbst>
yeah.. no
ybogdano has quit [Ping timeout: 480 seconds]
<karolherbst>
ADL-S GT1: 2719
<karolherbst>
CML GT2: 2305
<karolherbst>
airlied: guess I have to figure out the llvm header situation
<karolherbst>
and I think I might even require llvm-14, because the opencl header stuff isn't as terribly broken there...
<karolherbst>
it still is, but.. uhhh
* airlied
is going to go dig into coroutines
<karolherbst>
good luck
<karolherbst>
ADL-S GT1 + LP: 3139
mclasen_ has quit [Ping timeout: 480 seconds]
rkanwal has quit [Quit: rkanwal]
kts has quit [Quit: Konversation terminated!]
<karolherbst>
airlied: btw, your skynet email is dead
<airlied>
yeah need to chase down where that server went, might have to retire it
kts has joined #dri-devel
nchery has quit [Ping timeout: 480 seconds]
kts has quit [Quit: Konversation terminated!]
elongbug__ has quit [Ping timeout: 480 seconds]
rsalvaterra_ has joined #dri-devel
rsalvaterra is now known as Guest3407
rsalvaterra_ is now known as rsalvaterra
Guest3408 has quit [Ping timeout: 480 seconds]
fxkamd has quit []
sdutt has quit []
jimjams has joined #dri-devel
mwalle has quit [Quit: WeeChat 3.0]
Duke`` has joined #dri-devel
famfo has joined #dri-devel
itoral has joined #dri-devel
consolers has joined #dri-devel
<consolers>
i think since i moved from mesa-20.2 to 21.2 clinfo started segfaulting: now it segfaults when loading /usr/lib64/gallium-pipe/pipe_iris.so
<consolers>
that was on 22.0 i think i hit this before and figured something out but my mind is a blank
mhenning has quit [Quit: mhenning]
jewins has quit [Read error: Connection reset by peer]
Duke`` has quit [Ping timeout: 480 seconds]
lemonzest has quit [Quit: WeeChat 3.4]
consolers has quit [Ping timeout: 480 seconds]
ppascher has joined #dri-devel
danvet has joined #dri-devel
frieder has joined #dri-devel
garrison has joined #dri-devel
i-garrison has quit [Read error: Connection reset by peer]
consolers has joined #dri-devel
<consolers>
any clues on troubleshooting why clinfo is just crashing with mesa?
mvlad has joined #dri-devel
<consolers>
i know it worked with mesa-20.2.0
<consolers>
but apparently not since then, when i've had 21.2.1 and 22.2.0
digetx has quit [Ping timeout: 480 seconds]
digetx has joined #dri-devel
MajorBiscuit has joined #dri-devel
consolers has quit [Ping timeout: 480 seconds]
tzimmermann has joined #dri-devel
<airlied>
danvet: fyi I backmerged rc5, I had an arm build fail it had a fix for
cheako has quit [Quit: Connection closed for inactivity]
<dolphin>
airlied, danvet: no patches got picked up for drm-intel-fixes this week
<javierm>
yeah, me neither. But I think that would be nice to have kunits for all the conversion helpers
xperia64_ has joined #dri-devel
<javierm>
tzimmermann: I'll add that to my TODO to look at some point, which just keeps growing :)
<tzimmermann>
that's a good idea with these unit tests
xperia64 has quit [Ping timeout: 480 seconds]
lynxeye has joined #dri-devel
<javierm>
tzimmermann: Ok, I'll comment on the list too
<mripard>
javierm: I had to use it a bit recently for the clocks framework, so I can help if needed
<mripard>
(it's awesome)
<javierm>
mripard: great
<javierm>
mripard: yes, I was in a talk about kunit at some conference (plumbers in lisbon maybe?) and thought that was awesome but never had the time to dig deeper
<javierm>
mripard: thanks for the offer, I'll for sure bug you if I want to write some unit tests with kunit :)
nvishwa1 has quit [Read error: Connection reset by peer]
Lyude has quit [Ping timeout: 480 seconds]
mattrope has quit [Ping timeout: 480 seconds]
Lyude has joined #dri-devel
mattrope has joined #dri-devel
vyivel has quit [Read error: Connection reset by peer]
vyivel has joined #dri-devel
<mripard>
I wanted to write some infrastructure for drivers to create unit tests in KMS, but got distracted
<mripard>
maybe that would be worth adding in the TODO too
<javierm>
mripard: Ok, I'll see to add that too when writing the patch for Documentation/gpu/todo.rst
<mripard>
for vc4 for example, we have an atomic_check function that I have unit-tests for, but on my workstation, and it "works" with me copy/pasting the source code each and every time I need to rework it
<mripard>
it's very far from optimal :)
<javierm>
:D
<javierm>
mripard: now you made me even more curious about kunit, gah I wish I had more time
<javierm>
tzimmermann: what a nice patch series, the diff stat speaks for itself. And it's great to see that much code duplication going away
<tzimmermann>
thanks :)
<tzimmermann>
as i said before, i'd like to make these helpers composable, so that complex conversions can be assembled from multiple simple ones. we're not there yet, but it's a big step
<javierm>
tzimmermann: it is a big step indeed
<javierm>
especially since then someone reading these helpers will have to just understand drm_fb_xfrm() (which is complex, true) rather than the small differences between the different conversion helpers
<javierm>
tzimmermann: and the diffstat after your patches speaks for itself :)
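
A rough sketch of the "composable helpers" idea from the messages above: one generic per-line loop plus small conversion steps that can be chained. All names here (`xfrm`, `compose`, `swab16`, ...) are made up for illustration and are not the kernel's drm_fb_xfrm() interface; the bit layout for XRGB8888 and RGB565 is the usual one.

```python
# Hypothetical sketch: none of these names are kernel APIs.

def xrgb8888_to_rgb565(pixels):
    """Convert one line of 32-bit XRGB8888 pixels to 16-bit RGB565."""
    return [((p & 0x00F80000) >> 8) |   # top 5 bits of red
            ((p & 0x0000FC00) >> 5) |   # top 6 bits of green
            ((p & 0x000000F8) >> 3)     # top 5 bits of blue
            for p in pixels]

def swab16(pixels):
    """Swap the two bytes of every 16-bit pixel in a line."""
    return [((p & 0xFF) << 8) | (p >> 8) for p in pixels]

def compose(*steps):
    """Build one per-line conversion out of several simple ones."""
    def convert_line(pixels):
        for step in steps:
            pixels = step(pixels)
        return pixels
    return convert_line

def xfrm(src_lines, convert_line):
    """Generic loop: apply a single per-line conversion to every scanline."""
    return [convert_line(line) for line in src_lines]

# A "conversion + byteswap" helper assembled from the two simple steps.
to_rgb565_swapped = compose(xrgb8888_to_rgb565, swab16)
```

With this shape, a reader only has to understand the generic loop once; each format-specific step stays a few lines long.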
<tzimmermann>
javierm, the next step is to use iosys_map for the pointer arguments. iosys_map will be amended with caching information. from this, we can easily detect which dbuf/sbuf need temporary buffers and which can be used as-is. we should also be able to merge drm_fb_xfrm() and drm_fb_xfrm_toio() into a single function
<javierm>
tzimmermann: yup, I remember you mentioned that. Will speed up for the cases that don't use CMA/need a temp buffer
<javierm>
since currently we are always doing the extra copy just in case
<tzimmermann>
pq, there are drivers that want a conversion+byteswap. i think we can already express that with the proper 4cc code. but conversion helpers are not there yet. i've been unifying these functions for some time and still in the middle of it. for now, i'd prefer to keep it as-is
<tzimmermann>
pq, i'll see if some of that vkms code can go into generic helpers
<pq>
tzimmermann, cool, thanks :-)
<pq>
also, someone who actually does kernel dev would be nice to check by review comments on that series, since I'm not familiar with kernel practices
<pq>
*my review comment
<pq>
tzimmermann, for now, the VKMS intermediate pixel format is not defined as a 4cc in order to use a struct conveniently.
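
For readers unfamiliar with the "4cc" codes mentioned above, here is a small sketch of how a DRM fourcc is packed from four characters. The `fourcc_code` layout and the big-endian flag bit follow my reading of drm_fourcc.h; `fourcc_to_str` is a hypothetical helper added only for illustration.

```python
def fourcc_code(a, b, c, d):
    """Pack four ASCII characters into a 32-bit code, first character lowest."""
    return ord(a) | (ord(b) << 8) | (ord(c) << 16) | (ord(d) << 24)

# Flag bit that marks a format as big-endian (assumed from drm_fourcc.h).
DRM_FORMAT_BIG_ENDIAN = 1 << 31

DRM_FORMAT_XRGB8888 = fourcc_code("X", "R", "2", "4")
DRM_FORMAT_RGB565 = fourcc_code("R", "G", "1", "6")

def fourcc_to_str(code):
    """Hypothetical helper: turn a code back into its four characters."""
    base = code & ~DRM_FORMAT_BIG_ENDIAN
    name = "".join(chr((base >> shift) & 0xFF) for shift in (0, 8, 16, 24))
    return name + (" (BE)" if code & DRM_FORMAT_BIG_ENDIAN else "")
```

A byte-swapped RGB565 can then be named as `DRM_FORMAT_RGB565 | DRM_FORMAT_BIG_ENDIAN`, which is one way a "conversion+byteswap" target could be expressed purely through the format code.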
apinheiro has joined #dri-devel
ppascher has joined #dri-devel
digetx has quit [Ping timeout: 480 seconds]
mclasen has joined #dri-devel
echoed has joined #dri-devel
echoed has left #dri-devel [#dri-devel]
consolers has joined #dri-devel
Lucretia has quit []
<consolers>
could it be some thread thing that causes any opencl thing to segfault when loading the mesa iris gallium dll?
<consolers>
i cant spot any reports on it either - except 2 on libreoffice/opencv i think from 2021 which were solved with downgrades
<consolers>
and if i search for opencl google is giving me results for opened, like some ocr typo
digetx has joined #dri-devel
devilhorns has joined #dri-devel
Lucretia has joined #dri-devel
consolers has quit [Ping timeout: 480 seconds]
sagar__ has quit [Remote host closed the connection]
sagar__ has joined #dri-devel
consolers has joined #dri-devel
MajorBiscuit has quit [Ping timeout: 480 seconds]
rkanwal has joined #dri-devel
rasterman has quit [Quit: Gettin' stinky!]
rasterman has joined #dri-devel
Lucretia has quit []
Lucretia has joined #dri-devel
lemonzest has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
itoral has quit [Remote host closed the connection]
<HdkR>
robclark: I noticed you wanted a way to determine big.little cores. Welcome to the pain train, there is no exact way to determine this, you must use heuristics. You could peek at what FEX-Emu does to classify big versus little but it leaves open the possibility of getting things wrong.
<robclark>
HdkR: I believe "capacity" is exposed in sysfs.. crosvm (or rather the thing that launches it) looks at this to setup big and little vcpu's.. but haven't had a chance to look more closely at that
<daniels>
MrCooper: I think we already had this discussion on IRC a couple of years ago?
<daniels>
MrCooper: by the time the dust had settled between keithp and ickle, it sounded like the conclusion was that it wouldn't be landed until xserver had a smart scheduler which would wait until the fences had actually signaled before doing anything
<daniels>
unfortunately mine & lfrb's time on this earth is but finite
<MrCooper>
thanks, don't remember such a fight, maybe I forgot about it :/
<MrCooper>
that argument does make sense to me though
<daniels>
yeah, it sounded like there was zero support for simply shuttling fences through, i.e. acting as we currently do with implicit sync
<daniels>
and that nothing would be landable until the server was taking decisions itself
<daniels>
it does sound like a good idea in isolation, but given the time that would require, and that you'd only really see any benefit if you weren't simply proxying via Xwl, or if you were mixing present + core rendering ... eh
<MrCooper>
the same thing could be done with implicit sync in principle, using dma-buf fds
rgallaispou has quit [Read error: Connection reset by peer]
<daniels>
it could!
<daniels>
many things are possible
<daniels>
a superset of the things which are sensible :P
<MrCooper>
it's arguably a requirement for proper mailbox behaviour
* daniels
shrugs
rpigott has joined #dri-devel
jewins has joined #dri-devel
<javierm>
tzimmermann: interesting, should we just land it then ?
<tzimmermann>
javierm, sure, why not.
<javierm>
tzimmermann: I wondered the same that Junxiao asked but just did the minimum change to fix this particular issue
mszyprow has joined #dri-devel
<tzimmermann>
well, he has a point
<tzimmermann>
then maybe do a v2 with the other interfaces fixed.
Company has joined #dri-devel
maxzor has joined #dri-devel
<daniels>
MrCooper: it sounds like a good thing to do and I'm certainly not going to talk you out of it :)
<zmike>
anholt: what's the deqp-runner syntax for multiple --env options?
<zmike>
or any deqp-runner expert
<javierm>
tzimmermann: sure. It can't do any harm I guess^Whope :)
consolers has joined #dri-devel
alyssa has joined #dri-devel
alyssa has left #dri-devel [#dri-devel]
tzimmermann has quit [Quit: Leaving]
consolers has quit [Ping timeout: 480 seconds]
mszyprow has quit [Ping timeout: 480 seconds]
fxkamd has joined #dri-devel
MajorBiscuit has quit [Ping timeout: 480 seconds]
rgallaispou has joined #dri-devel
<pepp>
zmike: I think you just pass multiple "--env name=value" param
<ajax>
because release builds enforce -Werror=unused-function. so any file you include the header in, must reference all the statics therein
<ajax>
unless they're inline, or __attribute__((unused)), or whatever
<alyssa>
ajax: ok.. is there a reason debug builds don't enforce -Werror=unused-function?
<daniels>
alyssa: you're calling it from an assert()
<daniels>
which makes it used in debug builds, and unused in release builds
<ajax>
hah, indeed, i skipped a step there
<alyssa>
aaaah
<daniels>
but anyway, just make it inline ... ?
<alyssa>
Yeah, the correct fix is to make it inline
<alyssa>
I'm more baffled why it went through my local build (and made it to CI at all)
Lucretia has quit [Remote host closed the connection]
<alyssa>
instead of gcc screaming at me to mark it inline
<alyssa>
I'm the kind of person that needs compilers to scream at me :p
<karolherbst>
alyssa: some people implement things in headers and include those
<karolherbst>
so if gcc would scream, it would break code :)
<alyssa>
sounds like a good thing to break ;)
<karolherbst>
if you are ready for that bikeshedding, please write the patch and explain why violating some weird spec is fine :D
<karolherbst>
let's make it a daily thing: shitting on C and be annoyed by how bad it is or something
<alyssa>
which is the spec violation?
<karolherbst>
does C even know about headers ?
<karolherbst>
the pre processor is probably a spec on its own
<ajax>
would be somewhat weird for the C standard to both define what goes in what standard headers and not know what headers are
<karolherbst>
I am sure the standard lib is another spec
<karolherbst>
maybe it isn't.. :D
<ajax>
open-std.org is down atm so i can't pull up n1570.pdf to check, but
<karolherbst>
the C spec is cursed
<karolherbst>
"??=define arraycheck(a, b) a??(b??) ??!??! b??(a??)"
<karolherbst>
who is a C expert and knows what that resolves to?
Lucretia has joined #dri-devel
<karolherbst>
that's right, it's #define arraycheck(a, b) a[b] || b[a]
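
To check the answer above without digging out a compiler flag, here is a small sketch that applies the nine C trigraph replacements to a string. It only models the textual substitution step; `replace_trigraphs` is a name made up for this example.

```python
import re

# The nine trigraph sequences: "??" followed by one of these characters.
TRIGRAPHS = {
    "=": "#", "(": "[", "/": "\\", ")": "]", "'": "^",
    "<": "{", "!": "|", ">": "}", "-": "~",
}
_PATTERN = re.compile(r"\?\?([=(/)'<!>-])")

def replace_trigraphs(source):
    """Replace every trigraph in `source`, scanning left to right."""
    return _PATTERN.sub(lambda m: TRIGRAPHS[m.group(1)], source)
```

Feeding it the quoted line, `replace_trigraphs("??=define arraycheck(a, b) a??(b??) ??!??! b??(a??)")`, gives back the `#define arraycheck(a, b) a[b] || b[a]` form.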
<daniels>
trigraphs are so awesome
<karolherbst>
I didn't even know they existed
<ajax>
they don't anymore iirc
<karolherbst>
ajax: I have the C17 spec here :(
<daniels>
in fairness gcc warns when you use trigraphs unless you specifically suppress it
<daniels>
'you 100% do not mean this, if you did then you can enable it but you didn't'
<karolherbst>
what's the reason to add those anyway?
<ajax>
i thought they were getting dropped in c23 was the rumor
<ajax>
because ebcdic doesn't have all of the basic character set for c89 in its minimal subset
<ajax>
so depending which s360 you find yourself on you might not have [] as, like, keys on the keyboard
<karolherbst>
ohh wow.. "The trigraph sequences enable the input of characters that are not defined in the Invariant Code Set as described in ISO/IEC 646, which is a subset of the seven-bit US ASCII code set."
<alyssa>
karolherbst: "I have the C17 spec here :(" ditching clang+llvm-spirv are we now?
<karolherbst>
alyssa: :D
<karolherbst>
I won't comment on that
<ajax>
so you cannot write those characters into files, which makes it hard for the compiler to tokenise them
<karolherbst>
uhhh
<ajax>
i'm blaming ebcdic here and i think there's at least one other non-ascii encoding that was partly to blame here, but
<jekstrand>
hrm... nir_lower_blend really shouldn't require 32-bit for logic ops...
* karolherbst
should use unicode emoticons as function names more often
nchery has joined #dri-devel
<ajax>
greek alphabet in math functions please
<karolherbst>
good idea actually
<karolherbst>
assert becomes 🔥
<karolherbst>
we have those joke programming languages, but maybe there needs to be one where ANSI chars are invalid
<ajax>
every character must be from a unicode codepoint > 0xff
<alyssa>
karolherbst: do it in rust :p
<karolherbst>
I am not sure if the world is ready for that yet
<alyssa>
a C->NIR compiler written in Rust? how hard can it be?
<alyssa>
famous last
<karolherbst>
mhhh
<karolherbst>
don't tempt me
<alyssa>
You've been tempted! :-p
<karolherbst>
how much is rust self hostet, if llvm is still written in C anyway
<karolherbst>
*hosted
stuart has joined #dri-devel
Duke`` has joined #dri-devel
<rgallaispou>
Hi. I'm struggling with gamma again...
<rgallaispou>
In drm_atomic_uapi.c:384, what is the point of this test ? Is it only to test data alignment ? Because it won't pass any error to userland if the data is aligned according to 'expected_elem_size' but out of the struct (let it be 2048 + 8). This is my current issue, shown by kms_color@pipe-a-invalid-gamma-lut-sizes: the ioctl returns 0 when it should not. How does it go on Intel/AMD sides ?
<jekstrand>
Ok, here's a fun question: If someone doesn't write to gl_FragData.w but blending is such that w doesn't matter, do they get well-defined results? I think the answer is yes, unfortunately.
<hch12907>
alyssa: I think I had a C parser somewhere, written in rust... maybe we can repurpose that and make a C->NIR compiler, lol
* karolherbst
doesn't think he is ready for linking inside nir yet
<karolherbst>
heck, not even vtn would be ready
<alyssa>
jekstrand: I think so. Why is that unfortunate?
<jekstrand>
alyssa: Just more juggling we have to do in nir_lower_blend
<alyssa>
right, okay
<jekstrand>
I think the easy thing to do is just make the variable always match the format. Then we'll even get some dead-code action happening, maybe.
<vsyrjala>
rgallaispou: sounds like you're not checking that the blob has the correct size
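
A sketch of the distinction being discussed, assuming the generic check only verifies that the blob is a whole number of LUT entries (as rgallaispou describes it) and that the entry count has to be validated separately. `driver_lut_check` and the 256-entry LUT are hypothetical; the 8-byte entry size assumes struct drm_color_lut holds four 16-bit fields.

```python
import errno

DRM_COLOR_LUT_ENTRY_SIZE = 8  # assumed: four 16-bit fields per entry

def core_blob_check(blob_length, expected_elem_size):
    """Generic check as described above: length must be a multiple of the element size."""
    if expected_elem_size > 0 and blob_length % expected_elem_size != 0:
        return -errno.EINVAL
    return 0

def driver_lut_check(blob_length, hw_lut_entries):
    """Hypothetical extra check: the entry count must match the hardware LUT size."""
    ret = core_blob_check(blob_length, DRM_COLOR_LUT_ENTRY_SIZE)
    if ret:
        return ret
    if blob_length // DRM_COLOR_LUT_ENTRY_SIZE != hw_lut_entries:
        return -errno.EINVAL
    return 0
```

Under these assumptions a blob of 2048 + 8 bytes passes the generic check, since it is still a multiple of 8, and is only rejected once the entry count is compared against the LUT size.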
<alyssa>
jekstrand: hm, alright
<alyssa>
it might be nice to nir_lower_blend for radeonsi-style shader epilogs on AGX
<alyssa>
but.. meh, tbh
<jekstrand>
Sure
<jekstrand>
Doesn't sound like a terrible idea
<alyssa>
actually, jank from shader variants on AGX with AAA games sounds like a great problem to have, don't worry about it ;)
<alyssa>
(and presumably that's all Vulkan content when someday asahivk is a thing)
MajorBiscuit has quit [Ping timeout: 480 seconds]
<rgallaispou>
vsyrjala: it seems it resolves to drm_atomic_replace_property_blob_from_id(), but I don't see any call to the stm driver
<rgallaispou>
vsyrjala: did you mean on a userland level or on the kernel side ?
<MrCooper>
daniels: FWIW, assuming a fence fd becomes readable when the fence is signalled, it shouldn't require a "smart scheduler": IgnoreClient if fence isn't signalled yet, AttendClient when the fd becomes readable
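
A toy model of the scheme MrCooper outlines, under his stated assumption that a fence fd becomes readable once the fence has signalled. A pipe stands in for the fence fd, and the `FenceWaiters` class is invented for this sketch; it only mimics the roles of IgnoreClient and AttendClient and calls nothing from the X server.

```python
import os
import select

def ready_fds(waiting_fds):
    """Return the fds that are readable right now (zero timeout, no blocking)."""
    fds = list(waiting_fds)
    if not fds:
        return []
    readable, _, _ = select.select(fds, [], [], 0)
    return readable

class FenceWaiters:
    """Hypothetical bookkeeping for clients parked on an unsignalled fence."""

    def __init__(self):
        self.ignored = {}  # fence fd -> client name

    def wait_for_fence(self, client, fence_fd):
        # Plays the role of IgnoreClient(): stop servicing this client.
        self.ignored[fence_fd] = client

    def wake_signalled(self):
        # Plays the role of AttendClient(): resume clients whose fd is readable.
        return [self.ignored.pop(fd) for fd in ready_fds(self.ignored)]
```

In a real server the fds would sit in the main event loop rather than being polled on demand; the point is only that no separate scheduler is needed to park and resume a client.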
<jenatali>
Ah, sure, yeah I don't see any reason to not always have alpha
rasterman has quit [Quit: Gettin' stinky!]
stuart has joined #dri-devel
<HdkR>
robclark: sadly capacity only works if that is actually filled out. Also only gives you an idea, you still need to make a choice in big.bigger.biggest or small.smaller.smallest weirdo clustering setups :|
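
A heuristic sketch of the sysfs approach discussed here, assuming each CPU exposes a `cpu_capacity` file under /sys/devices/system/cpu/cpuN/ (which, as HdkR notes, is not filled in everywhere). The function names are made up; the grouping deliberately returns every tier rather than forcing a big/little split, since setups with three or more clusters exist.

```python
import glob
import os
import re

def read_cpu_capacities(root="/sys/devices/system/cpu"):
    """Map CPU number -> capacity; returns {} when the files are absent."""
    caps = {}
    for path in glob.glob(os.path.join(root, "cpu[0-9]*", "cpu_capacity")):
        cpu = int(re.search(r"cpu(\d+)$", os.path.dirname(path)).group(1))
        try:
            with open(path) as f:
                caps[cpu] = int(f.read().strip())
        except (OSError, ValueError):
            continue  # unreadable or malformed entry: skip it
    return caps

def group_by_capacity(caps):
    """Return [(capacity, [cpus...])] sorted from biggest to smallest tier."""
    tiers = {}
    for cpu, cap in caps.items():
        tiers.setdefault(cap, []).append(cpu)
    return [(cap, sorted(cpus)) for cap, cpus in sorted(tiers.items(), reverse=True)]
```

An empty result from `read_cpu_capacities()` means the caller still needs a fallback heuristic, and deciding which tiers count as "big" remains a policy choice.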
nchery has quit [Ping timeout: 480 seconds]
alyssa has joined #dri-devel
<alyssa>
if an OpenCL shader gets a pointer to something on the stack, what does that look like in (optimized, lowered) NIR?
<alyssa>
I guess load_scratch_base_ptr
apinheiro has joined #dri-devel
<karolherbst>
alyssa: yes and no
<karolherbst>
I think we have enough opt passes by now to resolve a lot of those things
<karolherbst>
but yes.. if it ends up as function_temp memory, that gets lowered to scratch
<karolherbst>
alyssa: I just don't think we end up with nir_load_scratch_base_ptr in CL
<alyssa>
Hmm
<karolherbst>
the base_ptr is only relevant for shader_calls as it seems
<alyssa>
huh, ok
<karolherbst>
for CL we just have a scratch space starting at 0 and the driver has to allocate that
<alyssa>
that's better for Mali, I guess
<karolherbst>
llvmpipe just mallocs :)
<karolherbst>
on Nvidia we'd use local memory
<alyssa>
though er how does that work
<karolherbst>
like the same as for spilled memory
<karolherbst>
ehh... spilled registers
<alyssa>
yeah, but those aren't spilled to 0x0
<alyssa>
..
<karolherbst>
the address doesn't matter
<alyssa>
even if you take an address of it..?
<karolherbst>
the only thing CL cares about is alignment of the address
<karolherbst>
alyssa: the neat part is, sharing those pointers across invocations is just undefined behavior
<alyssa>
...oh, I see the trick you're doing now.
<alyssa>
so even if the app does something cruel like *(&x[63] + y)
<karolherbst>
anyway.. we have deref_cast to get the actual pointer value
<alyssa>
it still just turns into load_scratch, never load_global
<karolherbst>
yep
<alyssa>
excellent
<alyssa>
will do the easy thing then
<karolherbst>
yeah
<karolherbst>
just use whatever stuff you use for indirect arrays
mvlad has quit [Remote host closed the connection]
<karolherbst>
and if you don't support scratch mem yet, just port your driver over to it :D
<alyssa>
heh, we have scratch
<karolherbst>
ahh
<karolherbst>
excellent
<alyssa>
but the hw likes to mangle the addresses for cache reasons
<karolherbst>
then it should just work, no?
<karolherbst>
yeah.. shouldn't matter
<alyssa>
and at first blush it looks like that mangling needs to be disabled for CL
<alyssa>
but yeah, ok
<karolherbst>
as long as the alignment stays the same
<alyssa>
Yep
<karolherbst>
CL has strict rules though
<alyssa>
each 16 byte chunk remains as-is
<karolherbst>
so int16 is 0x80 aligned
<karolherbst>
ehh
<karolherbst>
long16
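
The 0x80 figure follows from the OpenCL C rule that a type is aligned to its own size: long16 is 16 components of 8 bytes each. A quick sketch of that arithmetic (helper names are made up for illustration):

```python
def cl_vector_alignment(elem_size, components):
    """Alignment in bytes of an OpenCL C vector type.

    Vector types are aligned to their own size, and 3-component vectors
    are sized and aligned like 4-component ones.
    """
    if components == 3:
        components = 4
    return elem_size * components

def is_aligned(offset, alignment):
    """True if `offset` is a multiple of `alignment`."""
    return offset % alignment == 0
```

So `cl_vector_alignment(8, 16)` is 0x80 for long16, while int16 (4-byte elements) only needs 0x40, which is why the first message was corrected.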
<alyssa>
grumble. guess the mangling goes.
<karolherbst>
it should be fine though
<alyssa>
though... maybe not..?
<karolherbst>
I think if the kernel wants the address we simply use the offset
<alyssa>
because the app can never get to the physical pointer, only the virtual pointer starting at 0, which is aligned?
<karolherbst>
deref_cast (ssa_x) whatever
<karolherbst>
and that's just casting the deref thing to the constant
<karolherbst>
alyssa: yeah, I think so
<karolherbst>
the nir shader doesn't know the physical pointer anyway
<karolherbst>
you get the offset into load_scratch/store_scratch
<karolherbst>
what you do with that is up to you
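
A toy model of the point being made: the shader only ever sees offsets into a scratch space starting at 0, so the driver or hardware is free to place the bytes wherever it likes. The XOR-on-chunk-index mangling below is invented purely for illustration and is not how any particular GPU swizzles scratch; it just keeps each 16-byte chunk intact, as described in the chat.

```python
CHUNK = 16  # bytes kept contiguous by the (hypothetical) mangling

class ScratchModel:
    def __init__(self, size, mask):
        chunks = size // CHUNK
        # XOR with a mask is only a permutation for a power-of-two chunk count.
        assert size % CHUNK == 0 and (chunks & (chunks - 1)) == 0 and mask < chunks
        self.mask = mask
        self.backing = bytearray(size)

    def physical(self, offset):
        """Where a shader-visible offset really lives in the backing store."""
        return ((offset // CHUNK) ^ self.mask) * CHUNK + offset % CHUNK

    def store_scratch(self, offset, data):
        for i, byte in enumerate(data):
            self.backing[self.physical(offset + i)] = byte

    def load_scratch(self, offset, size):
        return bytes(self.backing[self.physical(offset + i)] for i in range(size))
```

Pointer arithmetic in the kernel happens on the shader-visible offset, so a load after a store at the same offset round-trips regardless of where `physical()` put the bytes.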
nchery has joined #dri-devel
<karolherbst>
I don't think we even get a load_scratch_base_ptr at all
<karolherbst>
alyssa: ahhhh.. I know why I never saw any nir_load_scratch_base_ptr
<karolherbst>
using nir_address_format_32bit_offset_as_64bit, for temp memory :)
<karolherbst>
if you'd use nir_address_format_64bit_global _then_ you'd get load_scratch_base_ptr
<alyssa>
Sure, that works great for us :)
<karolherbst>
yeah.. I don't know who even wants real pointers on temp mem
<jenatali>
Intel does
<karolherbst>
well.. except llvmpipe
<jenatali>
That's why jekstrand added the scratch base ptrs IIRC
<karolherbst>
jenatali: seems to work fine without it?
<jenatali>
Or maybe it was only for making it work with generic pointers
<karolherbst>
maybe
<karolherbst>
ahh yeah
<karolherbst>
I think that's it
<alyssa>
generic pointers...?
<karolherbst>
because you need to allow drivers to map it into global mem
<alyssa>
why does CL do this to us
<karolherbst>
so load_scratch_base_ptr is the pointer into _global_ mem of the scratch space
<jenatali>
It's optional in 3.0 at least
<alyssa>
jenatali: optional means wontfix! :p
<jenatali>
Yeah until you find some app that needs it
<jenatali>
Which I hope there aren't any?
<karolherbst>
alyssa: I got CL C 2.0 kernels using generic pointers to compile without any of this mess though :D
<karolherbst>
jenatali: luxmark 3.1
<karolherbst>
but...
<karolherbst>
nir was able to resolve all generics to its original type
<jenatali>
Huh really? It uses generic?
<karolherbst>
yeah
<jenatali>
Interesting
<karolherbst>
that's why we added that alu of cast optimization
<karolherbst>
so we can optimize away NULL checks on generics
<karolherbst>
well.. if NULL is passed as an arg that is
<karolherbst>
anyway.. my hope is, that we can always resolve those... but I am sure that function calling will make that impossible
<karolherbst>
or we duplicate...
<karolherbst>
dunno
<karolherbst>
not a fan of having to generate worse code, just because of generics
<karolherbst>
jenatali: my mistake was to expose CL C 3.0 as the "default" language, turns out, some applications assume you support CL C 2.0 then :)
<karolherbst>
and the spec specifically says to only do that if you support _all_ CL C 2.0 features
<jenatali>
Ah, yeah that makes sense
<karolherbst>
you can still expose it in the list property
<karolherbst>
just not through that single value one
<karolherbst>
CL_DEVICE_OPENCL_C_VERSION needs to be 1.2
<karolherbst>
CL_DEVICE_OPENCL_C_ALL_VERSIONS can list 3.0
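
A sketch of how an application might treat the two queries, using a hypothetical device report in place of real clGetDeviceInfo calls. The single-value string is assumed to follow the "OpenCL C <major>.<minor> <vendor info>" form; the helper names and the sample data are made up.

```python
import re

def parse_opencl_c_version(value):
    """Parse the 'OpenCL C <major>.<minor> ...' string into a (major, minor) tuple."""
    m = re.match(r"OpenCL C (\d+)\.(\d+)", value)
    if not m:
        raise ValueError(f"unexpected version string: {value!r}")
    return int(m.group(1)), int(m.group(2))

def can_build_with(all_versions, wanted):
    """True if `wanted` is listed among the supported OpenCL C versions."""
    return wanted in all_versions

# Hypothetical device report mirroring the setup described above.
device = {
    "CL_DEVICE_OPENCL_C_VERSION": "OpenCL C 1.2 ",
    "CL_DEVICE_OPENCL_C_ALL_VERSIONS": [(1, 0), (1, 1), (1, 2), (3, 0)],
}
```

An application that only reads the single-value query sees 1.2 and does not assume the full 2.0 feature set, while one that checks the list can still opt in to 3.0.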
<airlied>
i think we will need generic addresses for sycl
<karolherbst>
airlied: that's fine
<karolherbst>
I don't claim support for generics, not even using that address mode, but it still works fine
<karolherbst>
there are just really rare corner cases where that would be required
<karolherbst>
like storing it into global mem and loading it later
<karolherbst>
but I think for most applications just implementing functions with generic args we can probably wing it and hope it works out
<karolherbst>
airlied: is there a sycl CTS or something btw?
<karolherbst>
:D
<zmike>
dcbaker: I'll have a couple more backports for the next rc
<zmike>
will prob do them tomorrow morning before you get up
<karolherbst>
alyssa: anyway.. once you get rusticl working, I'd be interested how much breaks :D
<karolherbst>
my hope is that stuff simply passes, but...
<karolherbst>
alyssa: btw.. I have a patch which uses an ubo for the input buffer
<karolherbst>
there are 161 commits, and only 95 do rusticl stuff
<alyssa>
delight
<karolherbst>
most of it is bumping texture/sampler view limits
<karolherbst>
and some iris fixes
<karolherbst>
we also need to fix llvm for conformance, but..
<alyssa>
22.3 then?
<karolherbst>
maybe?
<danvet>
mlankhorst, too late here, pls ping me again tomorrow ...
<karolherbst>
though 22.2 should be possible
<karolherbst>
we just need reviews
<karolherbst>
alyssa: most of the stuff isn't really needed though.. I could do a run without any of those patches and see how bad it would be :D
cheako has joined #dri-devel
<alyssa>
"rusticl: the CTS is a piece of shit"
<alyssa>
maybe some git rebase needed too? :p
<karolherbst>
no, that's intentional
<karolherbst>
:D
<karolherbst>
although I think we might get that fixed in the CTS
<karolherbst>
there are other applications broken by it though
<karolherbst>
it's all so terrible
<karolherbst>
really hate that we have to do it like that
<karolherbst>
yeah.. I guess I'll change that at some point
nchery has quit [Ping timeout: 480 seconds]
Duke`` has quit [Ping timeout: 480 seconds]
nchery has joined #dri-devel
MrCooper has joined #dri-devel
<jekstrand>
jenatali: base pointers are for making it work with generic pointers and for making it work with ray-tracing.
<karolherbst>
jekstrand: you need it for ray tracing? :( sounds awful
<alyssa>
is unaligned access with load/store_scratch defined?
<karolherbst>
nope
<alyssa>
excellent
<karolherbst>
at least not inside llvmpipe as we figured out yesterday :)
<jekstrand>
karolherbst: Yup. RT kernels do scratch totally differently for $REASONS
Haaninjo has quit [Quit: Ex-Chat]
<karolherbst>
alyssa: anyway.. you can assume that you'll get correct alignments for everything
<karolherbst>
if not, we messed up
<jekstrand>
Well, actually, the reason is really simple: Scratch offsets are assigned per logical invocation, not per physical thread because invocations may move around between threads as shaders are dispatched, rays are traced, continuations happen, etc.
<jekstrand>
Ok, maybe that's not simple. (-:
<karolherbst>
sounds horrible
<jekstrand>
It's a pretty straightforward consequence of the API
<karolherbst>
I bet it was sure fun to implement all of that
<alyssa>
raytracing sounds awful
<jekstrand>
Eh, it's kinda fun, actually.
<karolherbst>
implementing OpenCL is also kind of fun :P
* alyssa
fixes piles of spilling bugs on Valhall
<karolherbst>
yay
<karolherbst>
are you running luxmark yet?
mszyprow has quit [Ping timeout: 480 seconds]
rasterman has joined #dri-devel
stuart has quit [Ping timeout: 480 seconds]
<alyssa>
no, ES3.1 cts
danvet has quit [Ping timeout: 480 seconds]
anarsoul has quit [Ping timeout: 480 seconds]
ppascher has quit [Ping timeout: 480 seconds]
rasterman has quit [Quit: Gettin' stinky!]
heat has joined #dri-devel
<anholt>
danylo: does gfxreconstruct have a way to look at the state (particularly image contents) along the way of rendering a frame?
rasterman has joined #dri-devel
iive has joined #dri-devel
stuart has joined #dri-devel
rasterman has quit [Quit: Gettin' stinky!]
rasterman has joined #dri-devel
ppascher has joined #dri-devel
nchery has quit [Ping timeout: 480 seconds]
<danylo>
anholt: nope, no way to look at any state there
nchery has joined #dri-devel
<danylo>
only way is to make a renderdoc capture and inspect it there, which could be tricky when you trying to debug a hang...
fxkamd has quit []
rasterman has quit [Quit: Gettin' stinky!]
<anholt>
luckily not a hang on this one, just the first 2kb of gfxbench vk-5-normal's screen being corrupted.
apinheiro has quit [Quit: Leaving]
maxzor has quit [Ping timeout: 480 seconds]
<HdkR>
Is there any way to get wayland to not autodetect monitor/output removal like X?
<daniels>
HdkR: ask your compositor
<HdkR>
hmmm
<daniels>
Wayland only does what it’s told to
<HdkR>
Sadly I don't think sway has a swaymsg command to disable autodetect
lemonzest has quit [Quit: WeeChat 3.4]
<HdkR>
Oh well, I'll wait for that part of the ecosystem to mature some more :)
pcercuei has quit [Quit: dodo]
eukara has quit []
<Ristovski>
karolherbst: Where can I find progress on radeonsi support for rusticl? In the draft comments you mentioned that airlied is working on that part?
<karolherbst>
Ristovski: dunno.. but talking with airlied on this made it sound like it would take a while, because of how AMD is doing compute is super messy
<Ristovski>
Heh, sounds about right
<karolherbst>
they have their own kernel ABI and stuff
<Ristovski>
Hmm, as in amdkfd?
<karolherbst>
no, shader ABI
Kayden has quit [Quit: go to office]
<Ristovski>
Aaah, that makes more sense
<karolherbst>
so the idea would be to wire up ACO or something, but that also sounds like ton of work
* Ristovski
reads discussion from logs
alyssa has left #dri-devel [#dri-devel]
icecream95 has joined #dri-devel
icecream95 has quit []
icecream95 has joined #dri-devel
eukara has joined #dri-devel
mclasen has quit []
mclasen has joined #dri-devel
<karolherbst>
anybody ever used phoronix-test-suite with their own compiled binaries? I think it just cleans the environment making it a pita to use
<dschuermann>
will definitely take some time to land. we first have to get rid of the remaining radv bits in aco
<Ristovski>
Hmm, does it only support recent GFX or does it go all the way back to GCN1?
<karolherbst>
yeah.. sounds like quite the project
<karolherbst>
Ristovski: probably the same thing where radv runs on
heat has quit [Remote host closed the connection]
heat has joined #dri-devel
<Ristovski>
I see, that should cover GCN1 as well then
<Ristovski>
(asking since I saw MCBP mentioned and that is GFX8+)
<karolherbst>
"The test run did not produce a result." *sigh*
<karolherbst>
ahh works with system bins
<karolherbst>
"fun"
<karolherbst>
well.. it doesn't after all
mdroper has joined #dri-devel
<karolherbst>
"The test run ended quickly" yeah well...
<karolherbst>
wow.. it does crash the GPU context
anarsoul has joined #dri-devel
<karolherbst>
ehh "write: 512 GB in 736.9 ms: 694.8 GB/s" I have questions
gawin has quit [Ping timeout: 480 seconds]
tursulin has quit [Read error: Connection reset by peer]
<Ristovski>
lol
<karolherbst>
either we are that good or something is fishy
<karolherbst>
I suspect we are not handling 64 bit sized things all that well
<karolherbst>
"Test buffers will use GB" well..
<karolherbst>
what crappy code is that
morphis has quit [Ping timeout: 480 seconds]
<karolherbst>
ahh yeah.. it passes a null buffer in? wtf
morphis has joined #dri-devel
<Ristovski>
unrelated PSA: https://github.com/iovisor/bpftrace is seriously OP, I just used it as a no-mess `initcall_debug` alternative and it's probably much less overhead as well. Possibilities are truly endless *goes back to profiling random crap*
<karolherbst>
jekstrand: I can trigger a "[drm] rusticl queue t[1648304 context reset due to GPU hang" reliably :(