<Mary>
linkmauve: what version of mesa? Going to update to mainline to see if there is any kind of regression, but latest main works here in a meson devenv
<linkmauve>
Mary, f1f93ff1f68, which was master two weeks ago.
<linkmauve>
I haven’t tried a stable release yet on this board, usually things always work better on master in Mesa.
<Mary>
linkmauve: tried on latest stable-rc + mesa main and I get everything working under a meson devenv
<Mary>
looking at linux master, it seems to only have my prio restriction patch so there shouldn't be any big difference with stable-rc
<Mary>
linkmauve: can you give me some logs? or maybe give meson devenv a try and see if it works under there?
<linkmauve>
I usually build on a different computer than the rk3588, so installing a package is easier for me. Which logs are you interested in?
<linkmauve>
I’m building main as we talk, so I can test on a build from today.
<Mary>
linkmauve: probably clinfo to see if anything is failing on rusticl side
<Mary>
I never really tested rusticl outside of devenv so maybe it's that? (maybe karolherbst has some ideas?)
<linkmauve>
With any debug env var? Because otherwise jagan_’s log is exactly the same as mine.
Consolatis_ has joined #panfrost
Consolatis is now known as Guest4869
Consolatis_ is now known as Consolatis
Guest4869 has quit [Ping timeout: 480 seconds]
<Mary>
hmm... I build in debugoptimized, let me try a proper release build just in case
<Mary>
linkmauve: tried with release no issues, I even installed the library and loaded it just fine too with RUSTICL_ENABLE=panfrost OCL_ICD_VENDORS=/opt/local/panfrost/etc/OpenCL/vendors clinfo
<Mary>
the only diff is that I have libclc installed and don't have that warning you both have so maybe it's related?
<Mary>
(Testing on Fedora on a Rock 5B with the latest stable-rc kernel)
DPA has joined #panfrost
rasterman has joined #panfrost
DPA has quit [Quit: ZNC 1.8.2+deb3.1+deb12u1 - https://znc.in]
DPA has joined #panfrost
DPA2 has joined #panfrost
DPA has quit [Read error: Connection reset by peer]
manu has joined #panfrost
DPA has joined #panfrost
DPA2 has quit [Read error: Connection reset by peer]
DPA2 has joined #panfrost
DPA has quit [Ping timeout: 480 seconds]
DPA has joined #panfrost
DPA2 has quit [Ping timeout: 480 seconds]
rasterman has quit [Quit: Gettin' stinky!]
pbrobinson has joined #panfrost
<karolherbst>
Mary: should just work in a devenv
<karolherbst>
PipeScreen::new might be a good place to look with gdb and see what happens
<linkmauve>
karolherbst, it would be nice to have RUSTICL_ENABLE=list or RUSTICL_ENABLE=help to make sure we don’t typo it.
<linkmauve>
And maybe error out if we try to enable a driver which doesn’t exist.
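The validation suggested here could look roughly like the following Python sketch. It is not how rusticl parses the variable: `KNOWN_DRIVERS` is a hypothetical stand-in for whatever drivers a given build contains, the `list`/`help` values are the proposal from this conversation rather than an existing feature, and the parser assumes that anything after a colon, or any purely numeric entry, is a device index.

```python
import difflib

# Hypothetical list of drivers compiled into this build; a real
# implementation would ask the loader instead of hard-coding names.
KNOWN_DRIVERS = ["llvmpipe", "panfrost", "rocket"]


def check_rusticl_enable(value, known=KNOWN_DRIVERS):
    """Return (enabled, messages) for a RUSTICL_ENABLE-style string."""
    if value.strip() in ("list", "help"):
        # Proposed behaviour: report what could be enabled, enable nothing.
        return [], ["available drivers: " + ", ".join(known)]

    enabled, messages = [], []
    for token in value.split(","):
        name = token.split(":", 1)[0].strip()
        if not name or name.isdigit():
            continue  # assumed to be an empty entry or a device index
        if name in known:
            enabled.append(name)
        else:
            # Suggest the closest known name to catch typos.
            hint = difflib.get_close_matches(name, known, n=1)
            suffix = f" (did you mean {hint[0]!r}?)" if hint else ""
            messages.append(f"unknown driver {name!r}{suffix}")
    return enabled, messages


print(check_rusticl_enable("panfrots"))
```

Returning the messages instead of exiting keeps the sketch easy to test; a real loader would more likely log them and refuse to continue.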
<linkmauve>
Hmm, despite having built Mesa in debug mode, /usr/lib/libRusticlOpenCL.so.1 only has very few symbols.
<linkmauve>
Not true, when I strip it it gets lighter from 50 MiB to 9.9 MiB, but there are no symbols inside, especially not PipeScreen::new() nor PipeScreen::load_screens().
<karolherbst>
linkmauve: rusticl only exports three symbols, but maybe something went wrong with the rust flags?
<karolherbst>
rusticl here is 166MiB
<linkmauve>
I only built the panfrost and rocket Gallium drivers, and the panfrost (panvk) Vulkan driver.
<karolherbst>
yeah.. maybe it's that
<linkmauve>
So it being smaller might be related.
<karolherbst>
I have a couple of more drivers
<karolherbst>
linkmauve: can you break on panfrost_create_screen instead and see if it gets called?
<linkmauve>
I could add llvmpipe to see if the issue comes from rusticl or from panfrost.
<karolherbst>
yeah.. having llvmpipe built is always a good idea
<linkmauve>
Although, the exact same Mesa build gives a clinfo result on a Mali-G52.
<karolherbst>
could be some odd difference in supported stuff
<linkmauve>
karolherbst, no it isn’t.
<linkmauve>
(panfrost_create_screen)
<karolherbst>
mhhhhhh
<karolherbst>
_odd_
<karolherbst>
mhh wait.. the screen gets created after filtering...
<karolherbst>
break on pipe_loader_probe then
<karolherbst>
it might get called multiple times
<linkmauve>
This one does work.
<karolherbst>
but it should return something if panfrost works alright
<karolherbst>
and maybe dump the driver
<linkmauve>
It gets called exactly twice.
<linkmauve>
Breakpoint 1, pipe_loader_probe (devs=0x0, ndev=0, with_zink=true) at ../mesa/src/gallium/auxiliary/pipe-loader/pipe_loader.c:64
<linkmauve>
Breakpoint 1, pipe_loader_probe (devs=0xaaaaaab92990, ndev=1, with_zink=true) at ../mesa/src/gallium/auxiliary/pipe-loader/pipe_loader.c:64
<karolherbst>
sure, but random things can happen and mesa fails to get an fd
<linkmauve>
Oh uh, eglinfo also gives me swrast!
<karolherbst>
yeah.. something something, maybe the kernel driver fails to do something
<karolherbst>
you should check what error you get when mesa tries to open the renderer node
<karolherbst>
should be in loader_open_device
<karolherbst>
there is a log_ thing, but you might want to print errno after open just in case
<linkmauve>
[ 10.193714] [drm] Initialized panthor 1.0.0 for fb000000.gpu on minor 0
<linkmauve>
(gdb) p (int)errno
<linkmauve>
$3 = 22
<linkmauve>
So EINVAL.
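The same check can be done without gdb. The Python sketch below maps an errno value to its symbolic name and tries to open a device node directly. The path `/dev/dri/renderD128` is an assumption (the minor number varies between systems), and `try_open` is a helper written for this illustration, not a Mesa function.

```python
import errno
import os


def try_open(path):
    """Open path read-write; return None on success, else the errno name."""
    try:
        fd = os.open(path, os.O_RDWR | os.O_CLOEXEC)
    except OSError as e:
        return errno.errorcode.get(e.errno, str(e.errno))
    os.close(fd)
    return None


# 22 is the value gdb printed for errno.
print(errno.errorcode[22], "-", os.strerror(22))

# Assumed render node path; adjust to the node that exists on the board.
print(try_open("/dev/dri/renderD128"))
```

Running this as the same user that runs clinfo helps separate a permission problem (EACCES) from a failure inside the kernel driver's open path.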
<karolherbst>
yeah mhhh no idea about that one :) sounds like either a panfrost or... something else bug
<linkmauve>
panthor*
<karolherbst>
or panthor
<karolherbst>
Mary: ^^ open returns EINVAL when opening the file descriptor
<linkmauve>
How do you enable more verbose DRM logs in the kernel?
<karolherbst>
there is drm.debug you can flip at runtime, but that can be a bit verbose if you don't know the right flags to enable
<linkmauve>
Hmm, just says [28734.372029] [drm:drm_stub_open] four times.
<linkmauve>
I wrote 0xff in /sys/module/drm/parameters/debug.
<karolherbst>
yeah.. probably need to check if panthor has any debug flags
<linkmauve>
Not according to modinfo.
<linkmauve>
modinfo drm describes the drm.debug bits btw.
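For reference, a mask such as 0xff can be decoded into category names with a few lines of Python. The bit table below is written from memory of the kernel's drm_print.h and should be treated as an assumption; the output of modinfo drm on the running kernel is the authoritative list.

```python
# Category bits as I understand them from drm_print.h (assumption;
# confirm against `modinfo drm` on the kernel being debugged).
DRM_DEBUG_BITS = {
    0x001: "CORE",
    0x002: "DRIVER",
    0x004: "KMS",
    0x008: "PRIME",
    0x010: "ATOMIC",
    0x020: "VBL",
    0x040: "STATE",
    0x080: "LEASE",
    0x100: "DP",
    0x200: "DRMRES",
}


def decode_drm_debug(mask):
    """List the category names enabled by a drm.debug mask."""
    return [name for bit, name in DRM_DEBUG_BITS.items() if mask & bit]


def encode_drm_debug(*names):
    """Build a drm.debug mask from category names."""
    lookup = {name: bit for bit, name in DRM_DEBUG_BITS.items()}
    mask = 0
    for name in names:
        mask |= lookup[name.upper()]
    return mask


print(decode_drm_debug(0xFF))
print(hex(encode_drm_debug("core", "driver")))
```

Enabling only the core and driver categories keeps the log far quieter than 0xff while still covering open() on the device node.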
<karolherbst>
yeah,.. but I don't have any ideas on how to debug this issue :) Maybe something something needs to happen so one can open the node, but if GL also fails, I suspect it's some general issue
<linkmauve>
Yeah, panvk also fails, all three of them.
<linkmauve>
I’ll try stable 6.11 instead of mainline.
<linkmauve>
If this works I’ll bisect the kernel.
<linkmauve>
6.11.0 does open() fine!
<linkmauve>
And clinfo and eglinfo do work, but PAN_I_WANT_A_BROKEN_VULKAN_DRIVER=1 vulkaninfo still doesn’t.
<linkmauve>
I’ll bisect.
urja has quit [Read error: Connection reset by peer]
urja has joined #panfrost
<linkmauve>
13 steps remaining.
<karolherbst>
impressive... wouldn't have considered a kernel regression
<linkmauve>
Yeah me neither.
kinkinkijkin has quit [Quit: Leaving]
<linkmauve>
… I should have moved from SD to NVMe before doing this bisect, I’m bottlenecked by the IO speed of extracting the modules. ^^'
<Mary>
karolherbst: that's weird...
<Mary>
you don't set any priority on the context creation right?
<linkmauve>
Bisecting: 28 revisions left to test after this (roughly 5 steps)
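The step counts follow from bisection being a binary search: each test roughly halves the remaining range, so about log2(N) tests are needed for N candidate revisions. The Python sketch below illustrates this on a linear history with a known-bad final commit. It is a simplification, since git bisect works on a commit graph and uses its own heuristic for the "roughly N steps" estimate; the function names are made up for this example.

```python
import math


def estimated_steps(revisions_left):
    """Approximate number of tests needed to narrow down N revisions."""
    if revisions_left <= 1:
        return revisions_left
    return math.ceil(math.log2(revisions_left))


def first_bad(commits, is_bad):
    """Binary-search an oldest-to-newest list whose last entry is bad.

    Returns (first bad commit, number of tests performed).
    """
    lo, hi = 0, len(commits) - 1
    steps = 0
    while lo < hi:
        mid = (lo + hi) // 2
        steps += 1
        if is_bad(commits[mid]):
            hi = mid          # first bad commit is at mid or earlier
        else:
            lo = mid + 1      # first bad commit is after mid
    return commits[lo], steps


print(estimated_steps(28), estimated_steps(6))
print(first_bad(list(range(100)), lambda c: c >= 37))
```

For 28 and 6 revisions the estimate comes out at 5 and 3 steps, in line with the figures git reports in this log. At 13 remaining steps the range would be on the order of 2^13 commits, which is why each kernel build and module install matters so much for total bisect time.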
<linkmauve>
I should have used ccache.
<karolherbst>
Mary: nope, but it already fails at loader time, so just opening the renderer node
<Mary>
Okay so it's not my changes around group priority then...
Andrey has joined #panfrost
Andrey has quit [Remote host closed the connection]
warpme has joined #panfrost
warpme has quit []
<Mary>
linkmauve, karolherbst: 6.11.1-rc1 is fine and that contains all the changes for panthor that are in current master
<Mary>
there is a merge commit on master tho
<Mary>
so maybe the issue is coming from that?
cphealy has quit [Quit: Leaving]
rasterman has quit [Quit: Gettin' stinky!]
<linkmauve>
Bisecting: 6 revisions left to test after this (roughly 3 steps)
luiz_felipe has joined #panfrost
luiz_felipe has quit []
<linkmauve>
Mary, karolherbst, jagan_, 641bb4394f405cba498b100b44541ffc0aed5be1 is the first bad commit.