<kurufu>
Is it possible to get perf to report symbols for amdgpu module stacks? They seem to be recorded normally with `sudo perf -a`, and `perf script (--show-kernel-path)` seems to report built-in modules fine. However, amdgpu was built as m instead of y, so maybe I need something different to symbolize the stacks?
<kurufu>
i.e. drm symbols between amdgpu stacks show up fine.
grillo_00 has joined #dri-devel
cascardo_ has joined #dri-devel
halves1 has joined #dri-devel
cascardo has quit [Ping timeout: 480 seconds]
grillo_0 has quit [Ping timeout: 480 seconds]
halves has quit [Ping timeout: 480 seconds]
itoral has joined #dri-devel
JRepin has quit []
JRepin has joined #dri-devel
glennk has quit [Ping timeout: 480 seconds]
tzimmermann has joined #dri-devel
coldfeet has joined #dri-devel
YuGiOhJCJ has quit [Quit: YuGiOhJCJ]
haver has quit [Ping timeout: 480 seconds]
bolson has quit [Ping timeout: 480 seconds]
haver has joined #dri-devel
parthiban has joined #dri-devel
kts has joined #dri-devel
fab has quit [Quit: fab]
sima has joined #dri-devel
frieder has joined #dri-devel
kts has quit [Remote host closed the connection]
kts has joined #dri-devel
jsa1 has joined #dri-devel
fab has joined #dri-devel
sghuge has quit [Remote host closed the connection]
sghuge has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
The_Company has quit [Remote host closed the connection]
vliaskov has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
surajkandpal1 has quit [Ping timeout: 480 seconds]
MrCooper_ has joined #dri-devel
oneforall2 has quit [Read error: Connection reset by peer]
NiGaR has quit [Remote host closed the connection]
NiGaR has joined #dri-devel
lion328 has quit [Quit: Leaving]
lion328 has joined #dri-devel
jkrzyszt has joined #dri-devel
rasterman has joined #dri-devel
MrCooper_ has joined #dri-devel
MrCooper has quit [Ping timeout: 480 seconds]
jsa1 has quit []
mehdi-djait3397165695212282475 has joined #dri-devel
<emersion>
i haven't seen that
phasta has joined #dri-devel
itoral has quit [Ping timeout: 480 seconds]
itoral has joined #dri-devel
glennk has joined #dri-devel
MrCooper__ has joined #dri-devel
jsa1 has joined #dri-devel
MrCooper_ has quit [Ping timeout: 480 seconds]
<mlankhorst>
The mmap offset is used when mmaping into the drm fd, I don't think it's required for importing
<mlankhorst>
In fact counterproductive, you would mmap the dma-buf at offset 0
MrCooper__ is now known as MrCooper
<MrCooper>
per the mutter discussion I linked, passing offset 0 to eglCreateImageKHR results in all black, whereas passing the offset from struct drm_mode_map_dumb results in correct output
coldfeet has quit [Ping timeout: 480 seconds]
kts has joined #dri-devel
<sima>
MrCooper, yeah that sounds like busted driver implementation
<sima>
since the map_dumb offset is within the drm_fd and doesn't make sense anywhere else, and has nothing to do with any kind of in-buffer offset you might pass when recreating an image from metadata
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
coldfeet has joined #dri-devel
<MrCooper>
right, thanks, I guess it could actually be a mutter issue though, maybe it just "worked" in a different way than assumed
phasta has quit [Quit: Leaving]
NiGaR has quit [Remote host closed the connection]
NiGaR has joined #dri-devel
<karolherbst>
alyssa: sooo... I'm kinda thinking about using some of your CL C stuff, though I'm also wondering if I want it to be a bit higher level... like a CL meta thing. What I want to do is to accelerate certain CL APIs by using kernels instead of well... CPU copies 🙃 but I also don't really want to integrate any of those helpers on the rust side,
<karolherbst>
so I was wondering if I want to simply compile cl kernels to spirv, and then do the meta thing instead of using it as an internal helper lib like done for the other drivers.
kts has quit [Ping timeout: 480 seconds]
halves1 has quit []
parthiban has quit []
chaos_pr1 has joined #dri-devel
chaos_princess has quit [Read error: Connection reset by peer]
<sima>
airlied, btw on missing drops and accidentally wrong nesting, quite a while ago we discussed adding lockdep annotations to Arc around kref_put so that you never drop a reference while holding locks that would cause trouble if it's the final reference
<sima>
I think lina had the patches once somewhere, but no idea where they are
<sima>
dakr, ^^
<sima>
it's already a pain in C, but with rust's drop mostly not being visible in the source code, it's worse there imo
MrCooper_ has joined #dri-devel
MrCooper is now known as Guest8134
MrCooper_ is now known as MrCooper
chaos_pr1 has quit []
chaos_princess has joined #dri-devel
Guest8134 has quit [Ping timeout: 480 seconds]
digetx has quit [Ping timeout: 480 seconds]
digetx has joined #dri-devel
guludo has joined #dri-devel
<krh>
karolherbst: rust to spirv instead of CL C?
digetx has quit [Ping timeout: 480 seconds]
<karolherbst>
not sure I want to open that can of worms yet
<glehmann>
rewrite rusticl in C? :P
digetx has joined #dri-devel
<karolherbst>
I think the simplest path is to write kernels and simply use whatever internal APIs I have to load the spirv... maybe embed it into the lib so I won't have to do weirdo fs operations
odrling has quit [Remote host closed the connection]
odrling has joined #dri-devel
u-amarsh04 has quit []
MrCooper_ has joined #dri-devel
u-amarsh04 has joined #dri-devel
MrCooper has quit [Ping timeout: 480 seconds]
digetx has quit [Ping timeout: 480 seconds]
pcercuei has joined #dri-devel
nerdopolis has joined #dri-devel
digetx has joined #dri-devel
rgallaispou has quit [Remote host closed the connection]
nerdopolis has quit [Ping timeout: 480 seconds]
surajkandpal has quit [Ping timeout: 480 seconds]
rgallaispou has joined #dri-devel
NiGaR has quit [Remote host closed the connection]
NiGaR has joined #dri-devel
NiGaR has quit [Remote host closed the connection]
NiGaR has joined #dri-devel
itoral has quit [Remote host closed the connection]
rbm has quit [Quit: ---]
rbm has joined #dri-devel
digetx is now known as Guest8146
digetx has joined #dri-devel
Guest8146 has quit [Ping timeout: 480 seconds]
Mangix has quit [Ping timeout: 480 seconds]
feaneron has joined #dri-devel
Mangix has joined #dri-devel
<Ermine>
MrCooper: I guess the driver in question is nvidia?
MrCooper_ is now known as MrCooper
<krh>
karolherbst: yeah, kernels in rust is definitely not a well-trodden path
<karolherbst>
I only need them for memory copies, so writing the kernels themselves isn't the issue here anyway
<karolherbst>
CL allows for strided buffer copies and other funky things
<MrCooper>
Ermine: good guess in general, and it's close but nouveau; anyway, I suspect it's something else funky, not a driver issue
<Ermine>
it uses default dumb_map_offset impl, so yeah
MrCooper_ has joined #dri-devel
MrCooper has quit [Ping timeout: 480 seconds]
digetx has quit [Ping timeout: 480 seconds]
glennk has quit [Read error: Connection reset by peer]
<karolherbst>
I don't think it's worth pulling huge libs like that in, just because I want to run some loops on the gpu copying memory around
<zmike>
pac85: 😬
<eric_engestrom>
karolherbst: ack
MrCooper__ has joined #dri-devel
kts has joined #dri-devel
<alyssa>
karolherbst: so the easy thing to do is to use vtn_bindgen2, which will let you write CL libraries (not kernels) and then it exposes nir_builder bindings for them
<alyssa>
which is all upstream now
<alyssa>
you still are responsible for wrapping it up in a nir_builder_init_simple_shader and such but it at least lets you express the logic in CL C
<alyssa>
the fancy stuff isn't really ready for common code to use yet (i'm working on that) and definitely not for Rust code
MrCooper_ has quit [Ping timeout: 480 seconds]
MrCooper__ is now known as MrCooper
fab has quit [Quit: fab]
ccr has quit [Ping timeout: 480 seconds]
rgallaispou has quit [Read error: Connection reset by peer]
guludo has quit [Ping timeout: 480 seconds]
rgallaispou has joined #dri-devel
cascardo_ has quit []
cascardo has joined #dri-devel
kzd has joined #dri-devel
davispuh has joined #dri-devel
fab has joined #dri-devel
gil has joined #dri-devel
guludo has joined #dri-devel
ccr has joined #dri-devel
MrCooper_ has joined #dri-devel
MrCooper has quit [Ping timeout: 480 seconds]
guludo has quit [Ping timeout: 480 seconds]
bolson has joined #dri-devel
JRepin has quit []
JRepin has joined #dri-devel
jsa1 has quit [Ping timeout: 480 seconds]
nerdopolis has joined #dri-devel
frieder has quit [Remote host closed the connection]
haaninjo has joined #dri-devel
ccr has quit [Ping timeout: 480 seconds]
u-amarsh04 has quit []
u-amarsh04 has joined #dri-devel
Duke`` has joined #dri-devel
guludo has joined #dri-devel
mrbro has joined #dri-devel
coldfeet has quit [Remote host closed the connection]
nerdopolis has quit [Ping timeout: 480 seconds]
MrCooper__ has joined #dri-devel
<karolherbst>
alyssa: mhh yeah.. might be a bit toooo much for what I need, which is simply to write some kernels to run against two memory objects + offset + range inputs, so maybe I just write kernels by hand and embed the spir-v then
<alyssa>
you just copypaste a few lines of meson and go
<karolherbst>
but for that I have to use the nir_builder, no?
<alyssa>
yes, and..?
<karolherbst>
I don't use the nir_builder
<alyssa>
I guess you already have code to ingest spir-v's. lol. fair
<karolherbst>
yeah...
<alyssa>
at least use mesa_clc then
<karolherbst>
that's kinda the thing :D
<alyssa>
no binary blobs in tree please
<karolherbst>
yeah.. that was my original plan
tzimmermann has quit [Quit: Leaving]
<alyssa>
cool
<alyssa>
:P
<karolherbst>
I just wonder if I want to do raw CL calls (so basically creating OpenCL meta) or just my internal core API and create stuff directly...
<karolherbst>
well.. I should prototype it
heat has joined #dri-devel
nerdopolis has quit [Ping timeout: 480 seconds]
<alyssa>
karolherbst: I've contemplated doing a generic Gallium meta backend for my driver CL stuff
<alyssa>
depending on how much of a hurry you're in
<karolherbst>
mhhhhhhhh
<alyssa>
(I've started designing this stuff, will probably start typing next week at this rate, still arguing with myself over lifetime issues & locking & such)
<karolherbst>
90% of the time I spend on this will be on working on my own code anyway
ryanneph has joined #dri-devel
<karolherbst>
I really only want to accelerate some memory copies 🙃
<karolherbst>
mostly for copies between textures and buffers, and for strided buffer copies
<alyssa>
karolherbst: you can also just make this the gallium driver's problem
<karolherbst>
feels like something hardware can do, apparently gallium doesn't have interfaces for it
<alyssa>
because most gallium drivers can do something better than you can
<karolherbst>
yeah....
<karolherbst>
so that's the alternative approach
MrCooper__ is now known as MrCooper
<karolherbst>
but that means making a mess out of resource_copy_region
<alyssa>
any vulkan-capable hw has impls of all this in the vk driver
<alyssa>
it's just not plumbed into gl because nothing gallium has cared yet
<karolherbst>
also strided buffer copies?
yshui has quit [Read error: Connection reset by peer]
<karolherbst>
like..
<karolherbst>
you have two buffers, but for some reason you hallucinate them being images, but not real images, just buffers, but you copy lines with strides around
rasterman has quit [Quit: Gettin' stinky!]
<karolherbst>
(clEnqueueCopyBufferRect)
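A minimal CPU sketch of the strided copy described here, assuming the clEnqueueCopyBufferRect conventions: origin is (x in bytes, y in rows, z in slices), region is (width in bytes, height in rows, depth in slices), and a pitch of 0 means tightly packed. The name `copy_buffer_rect` and its Python signature are hypothetical and only illustrate the addressing; this is not rusticl code.

```python
def copy_buffer_rect(src, dst, src_origin, dst_origin, region,
                     src_row_pitch=0, src_slice_pitch=0,
                     dst_row_pitch=0, dst_slice_pitch=0):
    """Reference strided rectangular copy between two flat byte buffers."""
    width, height, depth = region

    # A pitch of 0 falls back to a tightly packed layout.
    src_row_pitch = src_row_pitch or width
    dst_row_pitch = dst_row_pitch or width
    src_slice_pitch = src_slice_pitch or height * src_row_pitch
    dst_slice_pitch = dst_slice_pitch or height * dst_row_pitch

    for z in range(depth):
        for y in range(height):
            # Byte offset of the start of this row in each buffer.
            s = ((src_origin[2] + z) * src_slice_pitch
                 + (src_origin[1] + y) * src_row_pitch
                 + src_origin[0])
            d = ((dst_origin[2] + z) * dst_slice_pitch
                 + (dst_origin[1] + y) * dst_row_pitch
                 + dst_origin[0])
            if s + width > len(src) or d + width > len(dst):
                raise ValueError("region falls outside a buffer")
            # Only `width` bytes per row are copied; the rest of the
            # pitch (the gap) is never read or written.
            dst[d:d + width] = src[s:s + width]
```

Each row is an independent contiguous copy, which is why the same loop maps naturally onto one GPU work-item per row or per byte.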
<alyssa>
uhhh ok that one is pretty wacky
<karolherbst>
which I also need for things like image from buffer stuff.... I was wondering if I just write it all with a kernel and then I think about using optimized driver paths
<karolherbst>
so I work on the fallback first, so it works everywhere
<karolherbst>
because atm I'm doing copies on the CPU for those things
<karolherbst>
and then if I care enough, I make gallium and drivers be more competent
<zmike>
I think at this point I've been saying for literally years to just ram it through resource_copy_region
<zmike>
some drivers already handle it
<karolherbst>
yeah...
<karolherbst>
but I want to nuke my CPU path
<karolherbst>
so I will probably do the fallback first
<karolherbst>
and then use resource_copy_region
<karolherbst>
atm the fact that I stall everything is worse than using a kernel instead of optimized driver paths
lynxeye has quit [Quit: Leaving.]
jsa1 has joined #dri-devel
nerdopolis has joined #dri-devel
yshui has joined #dri-devel
ccr has joined #dri-devel
nerdopolis has quit [Ping timeout: 480 seconds]
guludo has quit [Ping timeout: 480 seconds]
guludo has joined #dri-devel
dsimic is now known as Guest8164
dsimic has joined #dri-devel
Guest8164 has quit [Ping timeout: 480 seconds]
mrbro has quit [autokilled: This host violated network policy. Mail support@oftc.net if you feel this in error. (2025-02-05 18:26:40)]
coldfeet has quit [Quit: Lost terminal]
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
OpenSauce has joined #dri-devel
OpenSauce has quit []
mehdi-djait3397165695212282475 has quit []
haver has quit [Ping timeout: 480 seconds]
jsa1 has quit [Ping timeout: 480 seconds]
jsa1 has joined #dri-devel
jsa2 has joined #dri-devel
jsa1 has quit [Ping timeout: 480 seconds]
guludo has quit [Quit: WeeChat 4.5.1]
gil has quit []
guludo has joined #dri-devel
gouchi has joined #dri-devel
gouchi has quit [Remote host closed the connection]
feaneron has quit [Quit: feaneron]
<airlied>
karolherbst: seems pointless to do fallback
<karolherbst>
airlied: I have a lot of other things drivers won't be able to provide accelerations for
<airlied>
since most gpus will want to use copy engines anyways
<karolherbst>
I'll need that infra anyway
<karolherbst>
I'll have to do clears of buffers that fake being a 2D image, meaning I'm not allowed to touch the gaps
<karolherbst>
it's annoying
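A rough sketch of why such a clear cannot be a single flat fill, assuming a buffer viewed as a 2D image with a row pitch larger than the row width. `fill_rect` and its parameters are hypothetical names for illustration; the point is only that the padding bytes between rows must stay untouched.

```python
def fill_rect(buf, pattern, origin, region, row_pitch):
    """Fill a 2D region of a flat byte buffer without touching row gaps."""
    x, y = origin            # x in bytes, y in rows
    width, height = region   # width in bytes, height in rows
    if width % len(pattern) != 0:
        raise ValueError("row width must be a multiple of the pattern size")

    # One fully expanded row of the pattern.
    row = pattern * (width // len(pattern))

    for r in range(height):
        start = (y + r) * row_pitch + x
        if start + width > len(buf):
            raise ValueError("region falls outside the buffer")
        # Write exactly `width` bytes; bytes from start + width up to the
        # next row start belong to the gap and keep their old contents.
        buf[start:start + width] = row
```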
<karolherbst>
there are a couple of those around
<karolherbst>
plain image <-> buffer things should hit hardware paths, yes, but that's just one of the issues
<airlied>
anything vulkan exposes should hit hw paths, except in rare circumstances
<karolherbst>
airlied: what vulkan API should I use for clEnqueueCopyBufferRect then?
aljazmc has joined #dri-devel
<karolherbst>
though I think mike knows a dirty hack for that 🙃
<zmike>
how dare you
<zmike>
such accusations will not stand
<karolherbst>
:D
<karolherbst>
though we both know which one I'm talking about
<karolherbst>
though there are still other APIs which are even more of an issue
jsa2 has quit [Ping timeout: 480 seconds]
davispuhh has joined #dri-devel
davispuh has quit [Ping timeout: 480 seconds]
<airlied>
unrolled copy buffers :-P
<jenatali>
That's what I've got in CLOn12 right now
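A sketch of what "unrolled copy buffers" could look like: the rectangular copy is decomposed into one plain 1D buffer copy per row, each of which a driver copy path can handle. `unroll_rect_copy` is a hypothetical helper, not code from CLOn12 or rusticl, and it assumes explicit non-zero pitches.

```python
def unroll_rect_copy(src_origin, dst_origin, region,
                     src_row_pitch, src_slice_pitch,
                     dst_row_pitch, dst_slice_pitch):
    """Return a list of (src_offset, dst_offset, size) 1D copies."""
    width, height, depth = region
    ops = []
    for z in range(depth):
        for y in range(height):
            src_off = ((src_origin[2] + z) * src_slice_pitch
                       + (src_origin[1] + y) * src_row_pitch
                       + src_origin[0])
            dst_off = ((dst_origin[2] + z) * dst_slice_pitch
                       + (dst_origin[1] + y) * dst_row_pitch
                       + dst_origin[0])
            # Each row becomes one contiguous buffer-to-buffer copy.
            ops.append((src_off, dst_off, width))
    return ops
```

The number of copies grows as height times depth, so this trades a CPU stall for many small GPU copies.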
* airlied
won't tell how amd transfer queues do image copies
<karolherbst>
I doubt that's any faster...
<karolherbst>
but yeah, that's what I'm doing atm
<karolherbst>
uhm...
<karolherbst>
was thinking of fill buffer
<karolherbst>
same issue
<karolherbst>
but unrolled copy buffer is better than a cpu copy for real
haver has joined #dri-devel
<karolherbst>
no idea why I haven't thought of that...
<karolherbst>
prolly I was copying clover too aggressively there 🙃
aljazmc has quit []
<jenatali>
I've got a better solution I can/should do for D3D now too, if I ever get back to CL... the image<->buffer path can technically have buffers on both sides of it which are strided like images
<jenatali>
But originally the strides had to be 256-byte aligned. Now that's relaxed though