ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
pcercuei has quit [Quit: dodo]
vliaskov has quit [Remote host closed the connection]
danylo has joined #dri-devel
alatiera has quit [Quit: Connection closed for inactivity]
mclasen has quit []
mclasen has joined #dri-devel
mvlad has quit [Remote host closed the connection]
mclasen_ has joined #dri-devel
mclasen has quit [Remote host closed the connection]
mclasen_ has quit [Read error: Connection reset by peer]
glennk has quit [Ping timeout: 480 seconds]
Leopold_ has joined #dri-devel
shashanks_ has joined #dri-devel
Leopold_ has quit [Remote host closed the connection]
Leopold has joined #dri-devel
shashanks has quit [Ping timeout: 480 seconds]
Leopold has quit [Remote host closed the connection]
mbrost_ has quit [Ping timeout: 480 seconds]
Leopold_ has joined #dri-devel
flynnjiang has joined #dri-devel
Leopold_ has quit [Remote host closed the connection]
flynnjiang has quit [Remote host closed the connection]
flynnjiang has joined #dri-devel
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
checkfoc_us has quit []
checkfoc_us has joined #dri-devel
macslayer has quit [Ping timeout: 480 seconds]
yyds has joined #dri-devel
co1umbarius has joined #dri-devel
columbarius has quit [Ping timeout: 480 seconds]
heat has quit [Ping timeout: 480 seconds]
YuGiOhJCJ has joined #dri-devel
cheako has quit [Quit: Connection closed for inactivity]
mbrost has joined #dri-devel
Leopold_ has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
Leopold_ has quit [Remote host closed the connection]
mbrost has joined #dri-devel
Leopold_ has joined #dri-devel
Company has quit [Quit: Leaving]
Leopold_ has quit [Remote host closed the connection]
Leopold has joined #dri-devel
<kurufu> https://registry.khronos.org/EGL/extensions/EXT/EGL_EXT_device_query.txt suggests eglQueryDisplayAttribEXT should never return EGL_TRUE and EGL_NO_DEVICE_EXT, but I am experiencing this and quite confused how this might occur without my EGLDisplay being totally busted.
konstantin_ has joined #dri-devel
konstantin has quit [Ping timeout: 480 seconds]
mbrost has quit [Ping timeout: 480 seconds]
aravind has joined #dri-devel
TMM has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
TMM has joined #dri-devel
Leopold has quit [Remote host closed the connection]
Leopold_ has joined #dri-devel
Leopold_ has quit [Remote host closed the connection]
Leopold_ has joined #dri-devel
<kurufu> Is it possible that glvnd is somehow introducing this, debugging with minimal mesa symbols suggests I get into eglQueryDisplayAttribEXT in eglapi.c, and it writes to value, but it writes 0.
<kurufu> but i dont have symbols to inspect disp.
Leopold_ has quit [Remote host closed the connection]
<HdkR> kurufu: I'd recommend getting symbols
<kurufu> If symbols suggest that device is 0, whats left?
Duke`` has joined #dri-devel
<HdkR> Confirmation at least :)
<kurufu> The write is confirmed at least, by the time it returns to glvnd indeed it has written nothing and returned something that seems to be forbidden by the spec.
<kurufu> but yea ill build mesa later and confirm.
Leopold_ has joined #dri-devel
<kurufu> `$5 = (_EGLDevice *) 0x0` gdb seems to agree the device is 0.
bmodem has joined #dri-devel
fab has joined #dri-devel
Leopold_ has quit [Remote host closed the connection]
itoral has joined #dri-devel
Duke`` has quit [Ping timeout: 480 seconds]
ngcortes has quit [Ping timeout: 480 seconds]
<kurufu> I guess its worth a bug, it seems dri2_setup_device happens and has the device and sets it, but the same display later has the device zeroed out...
kzd has quit [Ping timeout: 480 seconds]
kts has joined #dri-devel
kts has quit [Remote host closed the connection]
Leopold has joined #dri-devel
konstantin_ is now known as konstantin
ngcortes has joined #dri-devel
Leopold has quit [Remote host closed the connection]
Leopold_ has joined #dri-devel
asriel has quit [Quit: Don't drink the water. They put something in it to make you forget.]
Danct12 has quit [Quit: ZNC 1.8.2 - https://znc.in]
Danct12 has joined #dri-devel
Leopold_ has quit [Remote host closed the connection]
Leopold has joined #dri-devel
glennk has joined #dri-devel
ngcortes has quit [Ping timeout: 480 seconds]
Leopold has quit [Remote host closed the connection]
fab has quit [Quit: fab]
Leopold has joined #dri-devel
sima has joined #dri-devel
kts has joined #dri-devel
Leopold has quit [Remote host closed the connection]
kts has quit [Remote host closed the connection]
rasterman has joined #dri-devel
kts has joined #dri-devel
asriel has joined #dri-devel
Leopold_ has joined #dri-devel
Leopold_ has quit [Remote host closed the connection]
fab has joined #dri-devel
Omax has joined #dri-devel
Omax_ has quit [Remote host closed the connection]
bmodem has quit [Ping timeout: 480 seconds]
frieder has joined #dri-devel
tzimmermann has joined #dri-devel
sghuge has quit [Remote host closed the connection]
sghuge has joined #dri-devel
Leopold_ has joined #dri-devel
jsa has joined #dri-devel
Leopold__ has joined #dri-devel
kts has quit [Ping timeout: 480 seconds]
tursulin has joined #dri-devel
djbw has joined #dri-devel
rasterman has quit [Quit: Gettin' stinky!]
hansg has joined #dri-devel
Leopold_ has quit [Remote host closed the connection]
kts has joined #dri-devel
Leopold__ has quit [Remote host closed the connection]
bmodem has joined #dri-devel
camus has quit [Remote host closed the connection]
lynxeye has joined #dri-devel
pcercuei has joined #dri-devel
rgallaispou has joined #dri-devel
lplc has quit [Quit: WeeChat 3.8]
rasterman has joined #dri-devel
vliaskov has joined #dri-devel
lplc has joined #dri-devel
apinheiro has joined #dri-devel
libv has quit [Ping timeout: 480 seconds]
krei-se has quit [Quit: ZNC 1.8.2 - https://znc.in]
krei-se has joined #dri-devel
Kwiboo- has quit [Ping timeout: 480 seconds]
kts has quit [Ping timeout: 480 seconds]
heat has joined #dri-devel
Kwiboo has joined #dri-devel
Leopold has joined #dri-devel
kts has joined #dri-devel
<ity> Hi, is there a channel for asking for kernel driver bugs, namely amdgpu? I got a serious the-driver-crashes-blender-in-the-kernel-on-7900xtx issue :/ I am aware this channel is for mesa, but I thought you all might know a place I could ask for kernel stuff too.
Leopold has quit [Remote host closed the connection]
<pepp> ity: you can report your issue here: https://gitlab.freedesktop.org/drm/amd/-/issues
probablymoony has quit [Ping timeout: 480 seconds]
glennk has quit [Ping timeout: 480 seconds]
moony has joined #dri-devel
libv has joined #dri-devel
mclasen has joined #dri-devel
aravind has quit [Ping timeout: 480 seconds]
glennk has joined #dri-devel
kts has quit [Ping timeout: 480 seconds]
<sima> ity, this is also for kernel stuff, it's kinda the general gpu driver channel
<sima> ity, agd5f and hwentlan are two people here who can help with amdgpu kernel issues
bmodem has quit [Ping timeout: 480 seconds]
f11f12 has joined #dri-devel
f11f12 has quit [Remote host closed the connection]
mclasen has quit [Remote host closed the connection]
mclasen has joined #dri-devel
mclasen_ has joined #dri-devel
<ity> Ooh, oki, should I send the stacktrace + info here?
mclasen has quit [Remote host closed the connection]
mvlad has joined #dri-devel
<ity> Right, so, it occurs when I open Preferences in blender (4.0.2), the blender window freezes, and it does not respond to SIGKILL, which makes me think it's stuck in a syscall. Imma copy the log and send a link right away, gimme a second
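One way to check the "stuck in a syscall" guess would be to look at the process state the kernel reports: a task in uninterruptible sleep (state `D`) does not act on SIGKILL until it leaves that state. The sketch below is a minimal illustration, assuming a Linux `/proc` filesystem; the helper names and the sample line are hypothetical and not taken from ity's system.

```python
def parse_proc_state(stat_line: str) -> str:
    """Return the one-letter state field from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the LAST ')' instead of on whitespace.
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def is_uninterruptible(stat_line: str) -> bool:
    """True if the task is in uninterruptible sleep ('D')."""
    return parse_proc_state(stat_line) == "D"


def read_state(pid: int) -> str:
    """Read the state of a live process (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        return parse_proc_state(f.read())


# Hypothetical sample line, truncated after the first few fields.
sample = "4242 (blender) D 1 4242 4242 0 -1 4194560"
print(is_uninterruptible(sample))
```

If the state is `D`, the file `/proc/<pid>/stack` (readable as root on kernels built with stack tracing) usually shows which kernel function the task is blocked in.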
<ity> uname -a `Linux ity-pc 6.7.0-arch3-1 #1 SMP PREEMPT_DYNAMIC Sat, 13 Jan 2024 14:37:14 +0000 x86_64 GNU/Linux` , Arch Linux-patched kernel
<ity> Do note that both Vulkan & OpenGL seem to work, I tested a few random games + my own few Vulkan test apps, as well as HIPBLAS with llama.cpp.
<ity> Lemme also try stracing blender
mclasen_ has quit [Remote host closed the connection]
mclasen has joined #dri-devel
<ity> write(2, "HIP hipInit: Invalid device\n", 28HIP hipInit: Invalid device
<ity> ) = 28
<ity> ioctl(12, DRM_IOCTL_AMDGPU_GEM_CREATE, 0x7ffd050651e0) = 0
<ity> ioctl(12, DRM_IOCTL_V3D_PERFMON_CREATE
<ity> Last output in strace
<ity> combo of mobo & GPU (/sys/class/dmi/id/board_name = PRO Z690-A WIFI DDR4(MS-7D25))
<ity> Unfortunately I do not have any experience debugging kernel drivers whatsoever, so this is as much info as I can provide right away. I do not have the option to passthrough the GPU to a QEMU VM and attach a debugger to the kernel at the current moment unfortunately. This might be a regression, as it did not happen a few months ago, though I had firmware issues with this particular
<javierm> vsyrjala: at some point the kernel stable process changed from opt-in to opt-out :(
<javierm> now it seems that even commits without a Fixes: tag are getting pulled, I guess yours was due the "Fix" in the subject line?
<pepp> ity: does your kernel have this revert https://patchwork.freedesktop.org/patch/573129/?
aravind has joined #dri-devel
<pepp> if not you probably want to add it because it's likely the fix for your issue
<ity> Actually my browser just crashed and I am unable to open it, and random utilities seem to refuse to start up
mclasen has quit [Ping timeout: 480 seconds]
<ity> Namely doas, firefox, chromium, nix all refuse to cooperate now that the blender crash occurred, no idea how to fix that one up
<ity> Not sure if it's related to the GPU bug
<ity> Random apps that worked before are now crashing, I think I might have to reload my session
ity has quit [Quit: WeeChat 4.1.2]
ity has joined #dri-devel
<ity> Add `login` to the list of things affected, had to reboot :/
aravind has quit [Ping timeout: 480 seconds]
<ity> Lemme check the patch
<ity> Is the patch mainlined?
<ity> pepp:
crabbedhaloablut has quit []
aravind has joined #dri-devel
crabbedhaloablut has joined #dri-devel
heat_ has joined #dri-devel
kts has joined #dri-devel
heat has quit [Read error: No route to host]
mclasen has joined #dri-devel
paulk has quit [Quit: WeeChat 3.0]
mclasen has quit [Remote host closed the connection]
kts has quit [Quit: Leaving]
hansg has quit [Remote host closed the connection]
hansg has joined #dri-devel
hansg has quit [Remote host closed the connection]
<ity> Trying to compile the kernel with the patch applied
hansg has joined #dri-devel
kts has joined #dri-devel
YuGiOhJCJ has quit [Quit: YuGiOhJCJ]
<sima> mripard, some neat discussion going on on #wayland about the totally broken SAND format modifiers in vc4
mclasen has joined #dri-devel
paulk has joined #dri-devel
samuelig has quit [Quit: Bye!]
samuelig has joined #dri-devel
yyds has quit [Ping timeout: 480 seconds]
yyds has joined #dri-devel
mclasen_ has joined #dri-devel
mclasen has quit [Ping timeout: 480 seconds]
yyds has quit [Remote host closed the connection]
<karolherbst> mhhhh... hitting an LTO related bug in radeonsi? :')
ity has quit [Quit: WeeChat 4.1.2]
mclasen_ has quit [Ping timeout: 480 seconds]
mclasen has joined #dri-devel
aravind has quit [Ping timeout: 480 seconds]
simondnnsn has quit [Ping timeout: 480 seconds]
simondnnsn has joined #dri-devel
Company has joined #dri-devel
<karolherbst> getting an `LLVM ERROR: Cannot select: 0x7f91140ba7e0: v4i32 = bitcast 0x7f91140b5940`
mclasen has quit [Remote host closed the connection]
gouchi has joined #dri-devel
gouchi has quit []
rasterman has quit [Quit: Gettin' stinky!]
Leopold_ has joined #dri-devel
kts has quit [Ping timeout: 480 seconds]
kts has joined #dri-devel
ity has joined #dri-devel
ity has quit [Remote host closed the connection]
ity has joined #dri-devel
simondnnsn has quit [Read error: Connection reset by peer]
itoral has quit [Remote host closed the connection]
dri-logg1r has joined #dri-devel
mareko_ has joined #dri-devel
dri-logger has quit [Ping timeout: 480 seconds]
mareko has quit [Ping timeout: 480 seconds]
<ity> weechat wiped my IRC history :/. With that said, seems that on kernel 6.7.2 the GPU seems to be fully out of order, same with 6.7.0 with the patch applied. The kernel never modesets stuff correctly if a monitor is connected to the dGPU's output. The dGPU is also not visible to *anything* now. There might be another thing that happened that broke it to this degree, but my GPU that
<ity> was semi-working yesterday outside of blender is no longer working at all :( There is a bunch of amdgpu stacktraces in dmesg
kts has quit [Ping timeout: 480 seconds]
<karolherbst> ity: maybe your cable isn't properly connected or something? unplugging an eGPU isn't really supported, or rather, pretty much not tested, so a flaky cable could trigger all sorts of weird issues
<karolherbst> best to reboot with it connected and use the `sysfs` `remove` file to remove the eGPU before unplugging
<ity> Wdym by unplugging?
<karolherbst> the eGPU?
<zamundaaa[m]> karolherbst: unplugging eGPUs works completely fine with amdgpu + Plasma Wayland
<ity> I am confused
<zamundaaa[m]> But ity wrote dGPU
<karolherbst> ehh wait..
<karolherbst> ohh yeah...
<karolherbst> my fault 🙃
<karolherbst> brain is silly today
<ity> Why would I unplug the dGPU, like I have before while debugging but the computer has rebooted quite a few times since then
<karolherbst> nevermind me
<ity> Ah
<karolherbst> I read "eGPU" not "dGPU"
<ity> OH
<ity> I kinda hoped when I bought the 7900 XTX that it will be a smooth experience on Linux :/ So far it has given me so much more trouble than even Nvidia :/
<ity> This computer has never 100% worked since I bought it
<pq> ity, are you sure it's not a hardware fault?
<karolherbst> mhh yeah.. new gen issues probably
alatiera has joined #dri-devel
<ity> Well, I stress-tested the GPU on another computer yesterday, and absolutely no issues happened
<karolherbst> or hw being faulty, though in 99% of the cases where that's assumed it's actually a sw bug :P
<pq> ity, like maybe your PSU is not big enough?
<ity> Is 1000W not enough? :P
<pq> I don't know what is, or what you have.
<karolherbst> ity: maybe want to ask inside #radeon .. I think..
<ity> #radeon for questions about amdgpu? I could try
<karolherbst> yeah
<ity> pq: it's the amount of watts the power supply has
<karolherbst> checked the bug tracker? could be there are others with the same issue
<pq> ity, I don't know how much is enough, or how much you have.
<ity> I did say I have 1000W
<pq> sure
<pq> ity, you said the machine has never worked properly? even without the dGPU?
<ity> karolherbst: I haven't yet tbh, this whole situation is extremely stressful for me :/ I just want my computer to at least go back to only crashing inside blender. Could this be a 6.7.2 regression? And downgrading to 6.7.0 would fix it? I dunnooooooooo :/
<ity> The machine never worked *fully*, but the issues it had changed ~monthly
<ity> Namely the dGPU *used* to work ~5 months ago
<pq> so it has always been somewhat unstable?
<karolherbst> ity: tried 6.6?
<karolherbst> but yeah.. I'd file a bug report on gitlab or at least see if others have similar issues. Could also be that it's just HIP being HIP or something...
<ity> haven't tested 6.6 recently yet nope
<karolherbst> are you only seeing those issues when using HIP/ROCm, or generally?
<ity> Currently the dGPU is 100% out of order, it fails to modeset
<ity> HIP/ROCm used to work
<karolherbst> ahh
<karolherbst> on a fresh boot? but it works in a different machine?
<karolherbst> yeah.. I'd try downgrading the kernel release first.. try 6.6 or 6.5 and see if that's better
<ity> Yep on a fresh boot. It works on a different machine running Windows (I have no Windows machines at home so I drove to someone else's place to test it)
<ity> Hmm, downgrading, I guess lemme try downgrading to 6.7.0 first
<ity> Do note I have no IRC bouncer set up, so I won't be able to see any messages while I am rebooting
ity has quit [Quit: WeeChat 4.2.1]
dogukan has joined #dri-devel
simondnnsn has joined #dri-devel
ity has joined #dri-devel
simondnnsn has quit [Read error: Connection reset by peer]
<ity> Back, the kernel that used to work yesterday, 6.7.0, doesn't anymore
<ity> oftc seems to be having some connection issues :/ took 8 tries to connect
<ity> [ 5.997772] amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x00000006 SMN_C2PMSG_82:0x00000000
<ity> [ 5.997774] amdgpu 0000:03:00.0: amdgpu: Failed to enable requested dpm features!
<ity> [ 5.997775] amdgpu 0000:03:00.0: amdgpu: Failed to setup smc hw!
<ity> [ 5.997775] [drm:amdgpu_device_init [amdgpu]] *ERROR* hw_init of IP block <smu> failed -62
<ity> [ 5.997973] amdgpu 0000:03:00.0: amdgpu: amdgpu_device_ip_init failed
<ity> [ 5.997974] amdgpu 0000:03:00.0: amdgpu: Fatal error during GPU init
<ity> [ 5.997976] amdgpu 0000:03:00.0: amdgpu: amdgpu: finishing device.
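The `-62` in the `hw_init` line is a negated Linux errno. In the generic errno numbering, 62 is `ETIME` (timer expired), which would fit the SMU not answering the previous command; that reading is an interpretation and is not confirmed anywhere in this log. A minimal sketch for decoding such lines, with a deliberately small, hand-written errno table rather than a complete one:

```python
import re

# Small subset of Linux errno values (asm-generic numbering).
KERNEL_ERRNO = {
    12: "ENOMEM",
    16: "EBUSY",
    22: "EINVAL",
    62: "ETIME",
    110: "ETIMEDOUT",
}


def decode_failure(line: str):
    """Extract a trailing 'failed -N' code and map it to an errno name."""
    match = re.search(r"failed (-\d+)\s*$", line)
    if match is None:
        return None
    code = -int(match.group(1))
    return code, KERNEL_ERRNO.get(code, "unknown")


line = ("[ 5.997775] [drm:amdgpu_device_init [amdgpu]] *ERROR* "
        "hw_init of IP block <smu> failed -62")
print(decode_failure(line))  # (62, 'ETIME')
```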
<ity> I should probably post this in #radeon too I guess?
<karolherbst> ity: yeah.. I'd check the bug tracker if there is something going on with this
<karolherbst> but looks like a kernel issue
<karolherbst> I'd try 6.6 or older as well
<karolherbst> ity: maybe this is yours? https://gitlab.freedesktop.org/drm/amd/-/issues/3140
<karolherbst> anyway..
<karolherbst> there seem to be regressions in 6.7
<ity> Hmm, I do not *think* so, as 6.7.0 *worked* yesterday, and the error seems slightly diff. I will try with 6.6 though, in case some magic thing triggered the 6.7 regression
<ity> Thanks for the IRC logs btw
<karolherbst> ohh wait
<karolherbst> ity: https://gitlab.freedesktop.org/drm/amd/-/issues/3135 maybe that's yours?
<karolherbst> same GPU at least 🙃
<karolherbst> looks like a linux-firmware update broke it
<karolherbst> which would explain why it broke on 6.7 if your firmware files were updated in the meantime
thaytan has quit [Ping timeout: 480 seconds]
<ity> What oughta do it, I did not downgrade linux-firmware
<ity> Though that does not seem to be the exact issue either
<ity> My GPU fails to modeset, rather than random hangs
<ity> Random hangs *did* happen before but they only happened ~once a week, ~5 months ago or so
talcohen[m] has quit []
<ity> If the hangs are patched tho then that'd be nice
<ity> Though rn my priority is getting the GPU to not fail initialization
angerctl has quit [Quit: WeeChat 4.1.1]
<ity> Ig firmware would explain "*ERROR* hw_init of IP block <smu> failed -62" ?
simondnnsn has joined #dri-devel
<karolherbst> yeah...
<karolherbst> sometimes those issues cause random errors to appear
<karolherbst> I'd try if you can get linux-firmware-git or something installed
<karolherbst> and see if that solves it
<karolherbst> seems like those files were pushed a week ago
<karolherbst> maybe
<karolherbst> could be a duplicate of the other
<karolherbst> who knows :)
<ity> This is torture :)
<karolherbst> welcome to kernel development/debugging
<ity> I can try fetching linux-firmware-git
<ity> Thank you :D This is indeed the 12th ring of hell
<ity> (11th ring being X11)
Namarrgon has joined #dri-devel
<ity> I am gonna try -git + booting with no screen plugged in
<ity> Thank god I can blast loud music outta my speakers to try to destress a bit while doing all this lmao
<karolherbst> mood
<ity> Listening to an English cover of Cruel Angel's Thesis haha.
<ity> While installing the AUR package
<ity> In order to have a change of pace from listening to the German cover
dogukan has quit [Quit: Konversation terminated!]
thaytan has joined #dri-devel
<ity> Eyy it's done, reboot time!!
ity has quit [Quit: WeeChat 4.2.1]
kzd has joined #dri-devel
zf has quit [Remote host closed the connection]
zf has joined #dri-devel
mclasen has joined #dri-devel
simondnnsn has quit [Read error: Connection reset by peer]
ity has joined #dri-devel
fab has quit [Quit: fab]
cheako has joined #dri-devel
<ity> Aaaand now not even the iGPU modesets if the dGPU is plugged in, I had to plug out the dGPU to make the system boot, even into firmware settings
simondnnsn has joined #dri-devel
krumelmonster has joined #dri-devel
mareko_ is now known as mareko
<mareko> karolherbst: you need to set shader_info::image_buffers
<karolherbst> ahh
ity has quit [Remote host closed the connection]
ity has joined #dri-devel
<mareko> it's only set by GLSL right now
<karolherbst> yeah.. let me try that
<karolherbst> okay cool, this seems to work, it crashes a bit later on a test with GL_RGBA16F mhh...
<karolherbst> but GL_RGBA16UI_EXT works..
<karolherbst> oh well
<karolherbst> okay, fixing the image_buffers thing first properly and debug the other bug after that :)
<ity> karolherbst: After booting with linux-firmware-git, somehow now my iGPU isn't modesetting either if the dGPU is plugged in :/ Any ideas what to do now? I plugged it out so I can use my computer at all, but like, :( I don't want my 7900 XTX to only be an expensive paperweight...
<karolherbst> ity: made sure your initramfs was regenerated and everything?
<karolherbst> but yeah.. no ideas besides commenting on the bug with "yeah, same issue here" or so
<ity> mkinitcpio ran yea, hmm
<karolherbst> I'd try 6.6 and older linux-firmware (from when it used to work or so) to get your going at least
<ity> I am now highly afraid to touch this pile of cards, but I will try that in a bit haha
<ity> So, kernel 6.6 and which linux-firmware ver?
<ity> Like, by messing with it seemingly random shit broke
<ity> Eg my mouse no longer works properly :(
<ity> I feel like I am cursed
<ity> I always run into the most obscure bugs ever
<ity> Gonna try with firmware 2023-09-18 & kernel 6.6.0
<ity> Or is that a bad combo?
<karolherbst> ity: probably good to follow the changes here: https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/log/amdgpu
<karolherbst> ity: there can't be a bad combo as it's always supposed to work "somewhat"
<karolherbst> seems like there was a big update on 2023-09-28
<ity> Hmm
<karolherbst> so maybe try before/after that
<karolherbst> maybe before first and see if that works
<ity> Oki
<ity> Time to shut this thing down, plug in the dGPU, and try if it works :/
<ity> Imma report back in hopefully in a bit
ity has quit [Quit: WeeChat 4.2.1]
Leopold_ has quit [Remote host closed the connection]
fab has joined #dri-devel
<karolherbst> mareko: thanks, properly setting image_buffers makes it all work :)
ity has joined #dri-devel
ity has quit [Remote host closed the connection]
ity has joined #dri-devel
simondnnsn has quit [Ping timeout: 480 seconds]
<ity> karolherbst: good news, system boots, OpenGL & Vulkan work, monitor is connected to the dGPU. Bad news, opening Blender Preferences freezes the system. I am unsure as to whether the kernel is up and background services or if it died, I need to test that.
<ity> Time to try the newer firmware package
<karolherbst> ity: yeah.. so.. firmware updates happen for a reason, so it might be they tried to address some issues, but then regressed or something
<karolherbst> soo yeah...
<ity> 2023-10-30 namely, which is the oldest package newer than 2023-09-18 & also newer than 2023-09-28
<ity> Yea :/
simondnnsn has joined #dri-devel
<ity> Lemme also test if other HIP stuff works
<ity> Well, ROCm, (I still have no idea what is the diff between ROCm, HIP, and BLAS)
<ity> Seems that llama.cpp works, it does some ROCm HIPBLAS stuff
<karolherbst> BLAS is an API, ROCm is what AMD calls their current compute stack and HIP is a bad CUDA clone
<ity> A compute API, smth like OpenCL?
<karolherbst> nah
<karolherbst> more on a primitive level
<ity> Oh
<karolherbst> like...
<ity> O?
<karolherbst> implementing algorithms
<ity> I am confused
simondnnsn has quit [Read error: Connection reset by peer]
<karolherbst> BLAS is a collection of common algorithms, and there are many implementations for various compute APIs
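To make "collection of common algorithms" concrete, here is a pure-Python sketch of two classic BLAS operations: AXPY (`y = alpha*x + y`) and GEMM (`C = alpha*A*B + beta*C`). These are reference-style illustrations only. Real implementations such as hipBLAS run the same math on the GPU and typically update the output in place, whereas these hypothetical helpers return new lists.

```python
def axpy(alpha, x, y):
    """Level 1 BLAS: return alpha*x + y for two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return [alpha * xi + yi for xi, yi in zip(x, y)]


def gemm(alpha, a, b, beta, c):
    """Level 3 BLAS: return alpha*(A @ B) + beta*C for row-major lists."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [
            alpha * sum(a[i][k] * b[k][j] for k in range(inner))
            + beta * c[i][j]
            for j in range(cols)
        ]
        for i in range(rows)
    ]


print(axpy(2.0, [1.0, 2.0, 3.0], [10.0, 20.0, 30.0]))  # [12.0, 24.0, 36.0]
```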
<ity> OOOH
<karolherbst> that reminds me.. I wanted to try llama.cpp on top of CL 🙃
<ity> So HIPBLAS is a part of ROCm and is AMD's impl of BLAS on top of their compute API called HIP which is like OpenCL but AMD specific?
<ity> CL?
<ity> ~~You don't mean Common Lisp right~~
<karolherbst> OpenCL
<ity> OH
<ity> Which GPU do you have?
<karolherbst> that's difficult to answer
<ity> Oh lmfao
<karolherbst> because I don't have one GPU, I have like 40 🙃
<karolherbst> maybe 50
<karolherbst> but one of the discrete AMDs I have is a 6700 XT I think
Haaninjo has joined #dri-devel
simondnnsn has joined #dri-devel
<ity> Wait why do you have 50 GPUs, are you a data center O.O
<tnt> karolherbst: did you get an Arc one btw ?
<karolherbst> I haven't
<karolherbst> ity: developer more like
<karolherbst> *driver developer
<ity> OH
<ity> Makes sense
<karolherbst> though most are Nvidia ones :D
<ity> Where do you get the money for it though
<karolherbst> that's the neat part, I don't
<ity> O.O
<karolherbst> my employer does :P
<karolherbst> well
<karolherbst> some GPUs are also just lendings, so there is that
<ity> Oooh, might I ask who is your employer?
<karolherbst> red hat
<ity> Oooh
<ity> Which drivers do you work on?
<karolherbst> mhhh.. I used to work primarily on nouveau, but lately I've been found working on a couple of drivers for various reasons (mostly fixing OpenCL related issues or adding new features or something)
<ity> Ooooh
<ity> Sounds fun tbh haha, I would like to work on drivers but it seems kinda impenetrable lol.
<ity> Well, I am reading driver code at random and trying to understand how the stuff works, but it has been...
<ity> Very hard :D
rasterman has joined #dri-devel
macslayer has joined #dri-devel
simondnnsn has quit [Ping timeout: 480 seconds]
simondnnsn has joined #dri-devel
ity has quit [Quit: WeeChat 4.2.1]
ity has joined #dri-devel
<ity> So, the new firmware seems to have the regression in that the amdgpu does not modeset but it does not prevent the igpu (intel) from modesetting. "*ERROR* hw_init of IP block <smu> failed -62". The regression seems to have happened somewhere between 2023-09-18 (blender crash) and 2023-10-30 (hw_init of IP block)
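The window reported here (2023-09-18 initializes the dGPU, 2023-10-30 fails `hw_init`) could be narrowed by bisecting the firmware snapshots in between. The sketch below assumes there is a single good-to-bad transition and that each snapshot can be tested by booting it. The two intermediate October dates and the stub predicate are hypothetical placeholders, not real linux-firmware releases or test results.

```python
from datetime import date


def first_bad(snapshots, is_good):
    """Binary-search for the first failing snapshot.

    snapshots: sorted oldest to newest; the first entry is known good
    and the last entry is known bad.
    is_good: callable that tests one snapshot (e.g. boot and check dmesg).
    """
    lo, hi = 0, len(snapshots) - 1  # lo is always good, hi is always bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(snapshots[mid]):
            lo = mid
        else:
            hi = mid
    return snapshots[hi]


snapshots = [
    date(2023, 9, 18),   # from the log: dGPU initializes
    date(2023, 9, 28),   # from the log: date of a large amdgpu firmware update
    date(2023, 10, 10),  # hypothetical intermediate snapshot
    date(2023, 10, 20),  # hypothetical intermediate snapshot
    date(2023, 10, 30),  # from the log: hw_init of <smu> fails
]


# Stub standing in for a real boot test; the cutoff is made up for the demo.
def pretend_is_good(snapshot):
    return snapshot < date(2023, 9, 28)


culprit = first_bad(snapshots, pretend_is_good)
```

With five snapshots this needs at most two reboots on top of the two already known endpoints, which matters given the slow POST mentioned later in the log.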
<ity> It is also possible that the blender thing is a regression in the amdgpu driver rather than the firmware like someone mentioned before, idk.
<ity> Though the patch mentions "since v6.6.1. Revert it to fix blender again.", though I am on v6.6.0
<ity> I am not 100% sure how I should report this on the issue tracker
<ity> There is no working version, just diff versions get diff issues
<ity> Which one do I report
<karolherbst> yeah... maybe just file a new issue and describe the entire situation
<ity> Honestly I kinda lost track of the situation partway through myself, there is just so much like, random things happening :/
<ity> Let's see what I remember
Leopold_ has joined #dri-devel
Mangix has quit [Ping timeout: 480 seconds]
<ity> Latest kernel and firmware, unable to boot at all iirc? v6.6.0, firmware-20230918, ROCm, VK & GL work, but Blender Preferences bring down the kernel. Nothing interesting in journalctl --boot=-1. firmware-2023-10-30, the AMD gpu fails to modeset but Intel iGPU modesets properly. Latest kernel & firmware on arch, computer refuses to boot at all, I have no further confirmed
<ity> information.
<karolherbst> ity: I think I'd ignore the blender thing for now
<ity> Oh I am dum I repeated myself at the beginning and end of the message, wtf is with my short term memory
<karolherbst> could be a rocm bug or something
<ity> Hmm
<karolherbst> if your system boots, that's a baseline for the kernel
<karolherbst> userspace misbehaving can bring down the GPU
<karolherbst> and GPU reset isn't the most reliable thing on AMD
<ity> I mean it *used* to work at *some* point, I don't know when though
<ity> Could also be a blender regression
<ity> Ooh
<karolherbst> yeah.. so if your kernel accesses a NULL pointer with bad luck this can bring down your system as well. it shouldn't but...
<karolherbst> those things aren't easy to blame the kernel or userspace for without investigating
<ity> Ah
<karolherbst> so if your system boots, that's your working state :D
<ity> I mean rn the blender thing is decently important for me
<karolherbst> if updating linux-firmware breaks it -> bug, if updating your kernel breaks it -> bug
<karolherbst> yeah...
<karolherbst> but that's a different bug probably
<ity> Might be
<karolherbst> should either file against blender or ROCm
<ity> Should I try older blender versions perhaps hmm
<karolherbst> yeah.. maybe
<karolherbst> or older rocm
<ity> Hmm
<ity> There's 2 HIPs in arch repos for ROCm O.O
<ity> rocm-hip-runtime and hip-runtime-amd
<ity> Hmm
<ity> The arch rocm version is from 2023-11-12
<ity> Well, llama.cpp works so I don't *think* it's rocm? But might also be a diff code path between blender and llama, idk
<karolherbst> yeah..
<karolherbst> it can easily be triggered by a kernel doing weird things
<karolherbst> or something
<ity> I mean I presume that just opening Preferences wouldn't actually run a compute kernel... Right???
<ity> Is this naive hope
<karolherbst> it probably is :D
<ity> Oh no...
<karolherbst> could try to identify capabilities by running stuff, who knows
<karolherbst> or runtime initialization or something
<ity> That's a lovely test, if you hit a negative nothing happens, if you hit a positive the computer blows up :D The 7900 XTX also has a horrid POST time of 20 seconds, so each reboot is costly
<karolherbst> that's quite a bit
<ity> I forgot that nix on non-nixos has problems with graphics acceleration and tried to downgrade blender with nix :D
<ity> Yea :/
tobiasjakobi has joined #dri-devel
<ity> Oh fuck "blender: error while loading shared libraries: libOpenColorIO.so.2.2: cannot open shared object file: No such file or directory"
tobiasjakobi has quit []
ity has quit [Quit: WeeChat 4.2.1]
kts has joined #dri-devel
<jani> drm-tip seems to have a wrong conflict resolution for drivers/gpu/drm/bridge/samsung-dsim.c http://paste.debian.net/1305887/
Mangix has joined #dri-devel
Leopold_ has quit [Remote host closed the connection]
ity has joined #dri-devel
Duke`` has joined #dri-devel
<ity> Back, oftc took a few minutes to decide to stop timing out. So, I tried blender inside a flatpak, and also no dice, it can't see the GPU at all there, but at least it doesn't crash
yyds has joined #dri-devel
heat_ has quit [Remote host closed the connection]
heat has joined #dri-devel
simondnnsn has quit [Read error: Connection reset by peer]
simondnnsn has joined #dri-devel
tzimmermann has quit [Quit: Leaving]
<MrCooper> is it intentional that code-validation stage CI jobs run automatically in Mesa fork pipelines?
yyds_ has joined #dri-devel
yyds has quit [Ping timeout: 480 seconds]
yyds_ has quit [Remote host closed the connection]
simondnnsn has quit [Ping timeout: 480 seconds]
simondnnsn has joined #dri-devel
ADS_Sr has quit [Ping timeout: 480 seconds]
mclasen has quit [Ping timeout: 480 seconds]
<airlied> jani: yes it does, there's a thread, I should get to fixing it up
<jani> airlied: thanks
simondnnsn has quit [Ping timeout: 480 seconds]
simondnnsn has joined #dri-devel
iive has joined #dri-devel
mbrost has joined #dri-devel
hansg has quit [Quit: Leaving]
Duke`` has quit [Ping timeout: 480 seconds]
Mangix has quit [Ping timeout: 480 seconds]
Duke`` has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
mbrost has joined #dri-devel
Kwiboo has quit [Quit: .]
Kwiboo has joined #dri-devel
Kwiboo has quit []
frieder has quit [Remote host closed the connection]
Kwiboo has joined #dri-devel
Mangix has joined #dri-devel
simondnnsn has quit [Read error: Connection reset by peer]
mbrost_ has joined #dri-devel
simondnnsn has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
mbrost_ has quit [Remote host closed the connection]
mbrost_ has joined #dri-devel
konstantin_ has joined #dri-devel
konstantin is now known as Guest1118
konstantin_ is now known as konstantin
Guest1118 has quit [Ping timeout: 480 seconds]
mbrost_ has quit [Remote host closed the connection]
mbrost_ has joined #dri-devel
guru_ has joined #dri-devel
guru_ has quit []
guru_ has joined #dri-devel
<hwentlan> airlied: thanks
guru_ has quit []
kts has quit [Quit: Leaving]
oneforall2 has quit [Ping timeout: 480 seconds]
fab has quit [Read error: No route to host]
fab has joined #dri-devel
mbrost_ has quit [Ping timeout: 480 seconds]
oneforall2 has joined #dri-devel
lynxeye has quit [Quit: Leaving.]
flom84 has joined #dri-devel
flom84 has quit [Remote host closed the connection]
rasterman has quit [Quit: Gettin' stinky!]
Leopold_ has joined #dri-devel
Leopold_ has quit [Remote host closed the connection]
glennk has quit [Remote host closed the connection]
Leopold_ has joined #dri-devel
simon-perretta-img has quit [Ping timeout: 480 seconds]
simon-perretta-img has joined #dri-devel
gouchi has joined #dri-devel
tursulin has quit [Ping timeout: 480 seconds]
gouchi has quit [Remote host closed the connection]
ADS_Sr has joined #dri-devel
gouchi has joined #dri-devel
Leopold_ has quit [Remote host closed the connection]
mort_ has joined #dri-devel
Leopold_ has joined #dri-devel
ngcortes has joined #dri-devel
Leopold_ has quit [Remote host closed the connection]
MajorBiscuit has joined #dri-devel
MajorBiscuit has quit []
Leopold_ has joined #dri-devel
gouchi has quit [Quit: Quitte]
Leopold_ has quit [Remote host closed the connection]
mclasen has joined #dri-devel
Leopold_ has joined #dri-devel
simon-perretta-img has quit [Ping timeout: 480 seconds]
simon-perretta-img has joined #dri-devel
mclasen has quit []
mclasen has joined #dri-devel
Leopold_ has quit [Remote host closed the connection]
Leopold_ has joined #dri-devel
mbrost has joined #dri-devel
mclasen has quit [Ping timeout: 480 seconds]
Leopold_ has quit [Remote host closed the connection]
sima has quit [Ping timeout: 480 seconds]
libv has quit [Remote host closed the connection]
Duke`` has quit [Ping timeout: 480 seconds]
Leopold_ has joined #dri-devel
DavidHeidelberg has quit [Remote host closed the connection]
Leopold_ has quit [Remote host closed the connection]
Leopold_ has joined #dri-devel
mvlad has quit [Remote host closed the connection]
DavidHeidelberg has joined #dri-devel
libv has joined #dri-devel
Leopold_ has quit [Remote host closed the connection]
greenjustin has quit [Ping timeout: 480 seconds]
Leopold has joined #dri-devel
moony has quit [Ping timeout: 480 seconds]
illwieckz has quit [Quit: I'll be back!]
moony has joined #dri-devel
illwieckz has joined #dri-devel
Leopold has quit [Remote host closed the connection]
Leopold_ has joined #dri-devel
apinheiro has quit [Quit: Leaving]
pcercuei has quit [Quit: dodo]
Leopold_ has quit [Remote host closed the connection]
Leopold_ has joined #dri-devel
jsa has quit []
Leopold_ has quit [Remote host closed the connection]
TMM has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
TMM has joined #dri-devel
<airlied> jani: okay should be fixed now
<airlied> as soon as tip rebuilds here
Haaninjo has quit [Quit: Ex-Chat]
vliaskov has quit [Remote host closed the connection]