daniels changed the topic of #freedesktop to: GitLab is currently down for upgrade; will be a while before it's back || https://www.freedesktop.org infrastructure and online services || for questions about freedesktop.org projects, please see each project's contact || for discussions about specifications, please use https://gitlab.freedesktop.org/xdg or xdg@lists.freedesktop.org
columbarius has joined #freedesktop
co1umbarius has quit [Ping timeout: 480 seconds]
konstantin_ has joined #freedesktop
konstantin has quit [Ping timeout: 480 seconds]
AbleBacon has quit [Read error: Connection reset by peer]
spiegela has left #freedesktop [The Lounge - https://thelounge.chat]
kode54 has quit [Quit: Connection closed for inactivity]
lsd|2 has quit []
lsd|2 has joined #freedesktop
lsd|2 has quit []
lsd|2 has joined #freedesktop
epony has quit [Ping timeout: 480 seconds]
epony has joined #freedesktop
<karolherbst> the freedreno runners are kinda... not doing anything
<airlied> they do that when the backlog is on the lava side
<karolherbst> sure, but they fail for every pipeline
<airlied> I think the default answer is to disable them until someone can dig in
kode54 has joined #freedesktop
<anholt> cool, there's a full vk run going on them, and I don't know why. https://gitlab.freedesktop.org/mesa/mesa/-/pipelines/973062
<anholt> maybe that's the scheduled nightly?
<anholt> oh, there's the scheduled tag. yeah.
<karolherbst> fair enough... I'll just disable the a660 jobs then
<anholt> so, right now 3/6 are busy, and will be for half an hour. the rest are busy with other MRs.
<anholt> this is how LAVA queues jobs for all the HW runners doing it, which is confusing and frustrating.
<karolherbst> sure, but it's only the a660 jobs which are timing out
<anholt> have they been all day, or just right now?
<karolherbst> all day
<karolherbst> I already tried like three times yesterday
<karolherbst> and other MRs as well
<anholt> hmm. I don't see load that should cause that from the gitlab side right now. @DavidHeidelberg[m]
<anholt> (I am so so so frustrated with LAVA's opaqueness)
<karolherbst> yeah.. so I'd just disable them so we can get critical fixes in (which that crosvm fix actually is)
<anholt> yeah, please file a bug at collabora folks and disable it for now.
<karolherbst> just add a . in front of the job name, right?
<airlied> yu
<airlied> yup
<airlied> it could have kernel jobs running on it, or it could be dead also
<karolherbst> ehh.. also need to update other places..
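
A minimal sketch of the point above, assuming the pipeline is available as a plain dict of job definitions. GitLab CI treats a job whose name starts with "." as hidden and does not run it, but other jobs may still name it in `needs`, `extends`, or `dependencies`, which is why "other places" need updating. The helper `hide_job` and the job names in the usage note are hypothetical, not part of Mesa's CI.

```python
# Hypothetical helper, not part of Mesa's CI tooling.
# GitLab CI does not run jobs whose name starts with "." (hidden jobs).

def hide_job(jobs, name):
    """Rename job `name` to '.name' and report jobs that still reference it.

    Returns (new_jobs, referrers), where referrers is a sorted list of job
    names whose needs/extends/dependencies mention the old name.
    """
    if name not in jobs:
        raise KeyError(name)
    hidden = "." + name
    # Rebuild the mapping so the original key order is preserved.
    new_jobs = {(hidden if key == name else key): value
                for key, value in jobs.items()}

    referrers = []
    for key, job in new_jobs.items():
        # Skip the hidden job itself and non-job entries such as `stages`.
        if key == hidden or not isinstance(job, dict):
            continue
        referenced = []
        for field in ("needs", "extends", "dependencies"):
            value = job.get(field, [])
            if isinstance(value, str):
                value = [value]
            for item in value:
                # `needs` entries may be plain names or {"job": name} mappings.
                referenced.append(item.get("job") if isinstance(item, dict) else item)
        if name in referenced:
            referrers.append(key)
    return new_jobs, sorted(referrers)
```

For example, hiding a made-up job `a660-example` in `{"a660-example": {...}, "report": {"needs": ["a660-example"]}}` returns the renamed mapping plus `["report"]` as a place that still needs editing.
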
<karolherbst> yeah.. well.. it fails 5 out of 5 pipelines I'm aware of running those jobs in the last 10 hours, and I think we really want to land that crosvm fix to unblock other MRs where people already got annoyed by :')
sima has joined #freedesktop
<karolherbst> looks like they were fine 11 hours ago: https://gitlab.freedesktop.org/mesa/mesa/-/jobs/48180073
<karolherbst> and that's the first failing one: https://gitlab.freedesktop.org/mesa/mesa/-/jobs/48188142
<karolherbst> currently running and not doing anything: https://gitlab.freedesktop.org/mesa/mesa/-/jobs/48204703
<karolherbst> so yeah...
<karolherbst> also.. what's up with this job? https://gitlab.freedesktop.org/mesa/mesa/-/jobs/48208522
<karolherbst> anyway.. retried that one as it was kinda stuck for 15 minutes not doing anything
<karolherbst> "wget: bad address 'gitlab.freedesktop.org'" :')
<karolherbst> but that retry magic is really cool honestly
aninternettroll has joined #freedesktop
Ahuj has joined #freedesktop
<mupuf> bentiss: minio.error.S3Error: S3 operation failed; code: XMinioStorageFull, message: Storage backend has reached its minimum free drive threshold. Please delete a few objects to proceed.
<mupuf> Oh, sorry, that is likely on my end!
Major_Biscuit has joined #freedesktop
Major_Biscuit has quit []
i-garrison has quit [Remote host closed the connection]
i-garrison has joined #freedesktop
<bentiss> mupuf: yeah, we don't have minio anymore on the gitlab side
<mupuf> yeah, it was on my gateway
<mupuf> sorry about that
<mupuf> it's fixed now
<mupuf> half the drive was used by old volumes, containers, and caching of the fdo containers
<bentiss> and FWIW, no 500 error on the registry over the night :) \o/
<karolherbst> noice
mirai_ has joined #freedesktop
mirai has quit [Remote host closed the connection]
An0num0us has joined #freedesktop
ximion has quit [Quit: Detached from the Matrix]
tzimmermann has joined #freedesktop
<mupuf> bentiss: that's amazing!
<mupuf> should increase CI's reliability nicely :)
lsd|2 has quit [Quit: KVIrc 5.0.1 Aria http://www.kvirc.net/]
<bentiss> changing the data backend of the registry now so we disconnect more from the failing cluster
bmodem has joined #freedesktop
nedko has quit [Remote host closed the connection]
<bentiss> looks like something is not happy there
<bentiss> got a lot of blob unknown to registry
<bentiss> could very well be that the ones I tried were failing already
MajorBiscuit has joined #freedesktop
<bentiss> switching back to the previous data storage
konstantin_ is now known as konstantin
<bentiss> doesn't seem to change a bit, so returning to the main cluster
mirai_ has quit [Remote host closed the connection]
mirai_ has joined #freedesktop
<mupuf> bentiss: yeah, I found a lot of missing blobs, even in ci-templates
<bentiss> oops, I think I found out why: there was a missing config in the new registry deployment in which it told users to directly fetch from S3. Not sure how this could have been working yesterday
<bentiss> doesn't seem to change
bmodem has quit [Ping timeout: 480 seconds]
<mupuf> https://gitlab.freedesktop.org/mesa/mesa/-/jobs/48216598 <-- seems like S3 may be missing some artifacts
bmodem has joined #freedesktop
mripard has joined #freedesktop
bmodem has quit [Remote host closed the connection]
bmodem has joined #freedesktop
<bentiss> mupuf: only the arm32 version is available on s3.fd.o
<bentiss> must have been an error in a previous stage
<mupuf> ack!
<bentiss> shit, I understand why: I haven't changed the registry config on packet to not point at GCS :/
<bentiss> which means all registry access from yesterday were uploading/downloading from GCS
<mupuf> oh boy
<bentiss> so probably we need to re-upload some images
<mupuf> do you have the data there?
<bentiss> you mean which images have been uploaded?
<mupuf> well, I was more thinking about rsyncing everything to the new registry
<mupuf> but I guess you don't have ssh access to GCS
<bentiss> nope, because it went directly into the pool of "unknown blobs"
<bentiss> so rsync is out of the question
<mupuf> ack
<bentiss> I can babysit mesa
<bentiss> well, let me first use the proper cluster
<bentiss> it was weird that the rsync was saying that we did not create any image yesterday :)
<bentiss> and now it's 500 everywhere
<bentiss> I think I don't have the correct credentials for accessing the S3 on the cluster...
<bentiss> finally
<mupuf> bentiss: skopeo inspect docker://registry.freedesktop.org/freedesktop/ci-templates/container-build-base:2023-07-12.1
<mupuf> FATA[0001] fetching blob: blob unknown to registry
<mupuf> it was working earlier today
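
A rough sketch of how one might narrow down a "blob unknown to registry" error, assuming an image manifest in the Docker v2 schema 2 / OCI layout (a `config` descriptor plus a `layers` list, each with a `digest`). The `blob_exists` callable is an assumption supplied by the caller; against a real registry it could issue `HEAD /v2/<name>/blobs/<digest>`. Multi-arch manifest lists are not handled here.

```python
# Illustrative only: the function names are hypothetical, and `blob_exists`
# is whatever check the caller provides (e.g. a HEAD request per digest).

def blob_digests(manifest):
    """Return every blob digest an image manifest refers to, config first."""
    digests = []
    config = manifest.get("config")
    if config:
        digests.append(config["digest"])
    digests.extend(layer["digest"] for layer in manifest.get("layers", []))
    return digests


def missing_blobs(manifest, blob_exists):
    """Return the digests for which blob_exists(digest) is false."""
    return [digest for digest in blob_digests(manifest)
            if not blob_exists(digest)]
```

An image is only pullable if this list is empty, so a tag can resolve to a manifest while a pull still fails on a layer stored in a different backend.
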
<bentiss> yeah, I'll fix that in a bit
<hakzsam> still a registry issue?
<bentiss> hakzsam: yeah, I thought it was fine yesterday until I realized this morning that we were not using the correct data backend
<bentiss> mupuf: fixed
<mupuf> thanks!
<hakzsam> bentiss: do you know how to fix it?
<bentiss> hakzsam: I'm fixing it right now
<hakzsam> cool thanks
<bentiss> mupuf: this might be of interest for you: https://pastebin.com/zdjgtAA3 my script to sync the missing tags for a gitlab project
<bentiss> pastebin is full of ads...
<mupuf> thanks!
<mupuf> however, it did not seem to fix container-build-base:2023-07-12.1
<mupuf> still getting missing blobs
<bentiss> works here
<bentiss> mupuf: maybe your local cache?
<mupuf> Is skopeo caching anything?
<mupuf> let's see
<bentiss> not that I know of
<bentiss> but if you point at harbor in registries.conf, maybe that's the reason
<mupuf> I don't. And I am running this on my desktop pc :s
<bentiss> I can also pull it just fine
<mupuf> what does `dig registry.freedesktop.org` answer for you?
<bentiss> 147.75.198.156
<bentiss> and I tried on my desktop and on a server I haven't messed up
<mupuf> same
<mupuf> Da heck is going on :D
<mupuf> let's test from a server
<ystreet00> I'm getting a 'fetching blob: blob unknown to registry' for skopeo inspect docker://registry.freedesktop.org/gstreamer/gstreamer/amd64/fedora:2023-08-25.1-main in https://gitlab.freedesktop.org/ystreet/gstreamer/-/jobs/48223776#L112
killpid_ has joined #freedesktop
<ystreet00> also happens locally (skopeo inspect)
<bentiss> ystreet00: k, on it
<mupuf> bentiss: you are right. it works
<mupuf> there must be a cache somewhere
<bentiss> ystreet00: fixed
hch12907 has joined #freedesktop
<ystreet00> works, thanks
<hch12907> hi, I just noticed gitlab.fd.o doesn't seem to have ipv6 connectivity. is it because of the migration?
<bentiss> hch12907: it could have if the migration happened properly, but the new cluster is not happy, so nopt
<bentiss> nope
<hch12907> i see
<hch12907> but still, the migration must've been stressful, keep up the good work guys
<mupuf> bentiss: is there an http frontend that could be caching requests to the registry?
<bentiss> mupuf: nope. I don't have redis enabled for the registry
<mupuf> ack
<mupuf> I just tried to run skopeo in a container, either fedora or arch, and I keep getting the same result
<mupuf> bentiss: I asked eric to inspect registry.freedesktop.org/freedesktop/ci-templates/container-build-base:2023-07-12.1 and he also gets "blob unknown to the registry"
<mupuf> I'll try to repush it myself
<bentiss> mupuf: can you please retry now?
<mupuf> same
<bentiss> yeah, maybe retry pushing it
mirai_ has quit [Ping timeout: 480 seconds]
<mupuf> bentiss: done, and it works
<mupuf> and the runners seem happy now
<bentiss> strange, but thanks :)
<mupuf> indeed
hch12907 has left #freedesktop [#freedesktop]
<mupuf> bentiss: the images i built this morning were pushed to gcs, not the registry
<mupuf> is that related to packet being configured to push to gcs?
<mupuf> I thought it would only affect the fdo runners. In my case, I pushed it from my runners
bmodem has quit [Ping timeout: 480 seconds]
vkareh has joined #freedesktop
mripard has quit []
<bentiss> mupuf: it was the registry config. It was pointing at gcs, so whatever you pushed and from anywhere, it was going to gcs
<svuorela> win 25
<svuorela> woops
<mupuf> bentiss: ha ha, right
killpid_ has quit [Quit: Quit.]
<zmike> shaderdb again taking hours for nouveau https://gitlab.freedesktop.org/mesa/mesa/-/jobs/48231373
<zmike> DavidHeidelberg[m]: ^
<daniels> just disable it
<DavidHeidelberg[m]> I would blame it on the infra I guess, it never took that much, probably overloaded a lot, if you have two samples, disabling would make sense
emery has joined #freedesktop
<zmike> I've had this happen multiple times over the past couple months
viktoria has quit [Quit: Page closed]
AbleBacon has joined #freedesktop
Ahuj has quit [Ping timeout: 480 seconds]
<pq> Is there still an elevated risk that review etc. comments posted on gitlab might be lost?
mripard has joined #freedesktop
<mupuf> pq: shouldn't be an issue
<pq> cool!
<bentiss> heh, I'm playing with a dump of the registry db, and it seems that no one but ci-templates is using the "fdo.expires-after" label
<bentiss> still, that's 5173 registry images to prune :)
Major_Biscuit has joined #freedesktop
<bentiss> my bad, gst-editing-services, gst-examples and a few others are using it :)
<bentiss> now I wonder if I should mass delete them or not
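
A hedged sketch of how a pruning pass might decide whether an image is past its "fdo.expires-after" label. It assumes the label value is a count followed by `h`, `d`, or `w` (for example `2d`); that format and both function names are assumptions for illustration, not taken from the registry tooling discussed here.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed label format: "<count><unit>" with unit h (hours), d (days), w (weeks).
_UNITS = {"h": "hours", "d": "days", "w": "weeks"}


def parse_expiry(label):
    """Convert a label value such as '2d' into a timedelta."""
    match = re.fullmatch(r"(\d+)([hdw])", label.strip())
    if match is None:
        raise ValueError(f"unrecognised expiry label: {label!r}")
    count, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(count)})


def is_expired(created, label, now=None):
    """True if an image created at `created` has outlived its label."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= created + parse_expiry(label)
```

Listing the expired images first, and deleting only after reviewing that list, keeps a mass delete reversible up to the last step.
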
MajorBiscuit has quit [Ping timeout: 480 seconds]
vyivel has quit [Read error: Connection reset by peer]
vyivel has joined #freedesktop
mvlad has joined #freedesktop
Haaninjo has joined #freedesktop
<DavidHeidelberg[m]> zmike: hmm, then I guess we'll do separation, so feel free to kill it for now. I'm making a TODO to split it into a separate job.
<zmike> what exactly am I killing
<DavidHeidelberg[m]> zmike: the nouveau part of shader-db test, if you're noticing it takes an unusual amount of time sometimes
<DavidHeidelberg[m]> then we split shader-db away and set reasonable timeout for it
gert31 has joined #freedesktop
An0num0us has quit [Ping timeout: 480 seconds]
gert31 has quit [Quit: Leaving]
tzimmermann has quit [Quit: Leaving]
Major_Biscuit has quit [Ping timeout: 480 seconds]
vyivel has quit [Read error: Connection reset by peer]
<DavidHeidelberg[m]> zmike: Ack
vyivel has joined #freedesktop
vyivel has quit [Ping timeout: 480 seconds]
<dcbaker> I'm running into what looks like an internal caching service failing to provide files for CI jobs on the 23.2 branch (I can directly download these files myself): https://mesa.pages.freedesktop.org/-/mesa/-/jobs/48190376/artifacts/results/summary/results/trace@gl-intel-amly@godot@Material%20Testers.x86_64_2020.04.08_13.38_frame799.rdc.html
<dcbaker> Does this look familiar to anyone
bmodem has joined #freedesktop
vyivel has joined #freedesktop
<gallo[m]> dcbaker: it is related to this: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24819
Major_Biscuit has joined #freedesktop
<dcbaker> gallo: thanks, I missed that somehow looking through the log yesterday
Major_Biscuit has quit []
<dcbaker> ... because I didn't do a git pull. thanks again
<gallo[m]> dcbaker: np! happy to help.
vyivel has quit [Read error: Connection reset by peer]
vyivel has joined #freedesktop
vyivel has quit [Read error: Connection reset by peer]
vyivel has joined #freedesktop
vyivel has quit [Read error: Connection reset by peer]
vyivel has joined #freedesktop
fahien has quit []
lsd|2 has joined #freedesktop
thecollaboran147 has quit [Quit: The Lounge - https://thelounge.chat]
ndufresne has quit [Quit: The Lounge - https://thelounge.chat]
italove has quit [Quit: The Lounge - https://thelounge.chat]
gallo72 has quit []
koike has quit [Quit: The Lounge - https://thelounge.chat]
rpavlik has quit [Quit: The Lounge - https://thelounge.chat]
ocrete has quit [Quit: The Lounge - https://thelounge.chat]
nuclearcat2 has quit [Quit: The Lounge - https://thelounge.chat]
bmodem has quit [Ping timeout: 480 seconds]
ocrete has joined #freedesktop
anholt_ has joined #freedesktop
alanc has quit [Remote host closed the connection]
alanc has joined #freedesktop
anholt has quit [Ping timeout: 480 seconds]
ndufresne has joined #freedesktop
ndufresne has quit []
ndufresne has joined #freedesktop
ocrete6 has joined #freedesktop
ocrete is now known as Guest1244
ocrete6 is now known as ocrete
ocrete has quit []
ocrete has joined #freedesktop
flom84 has joined #freedesktop
Guest1244 has quit [Ping timeout: 480 seconds]
An0num0us has joined #freedesktop
ximion has joined #freedesktop
flom84 has quit [Quit: Leaving]
flom84 has joined #freedesktop
flom84 has quit [Remote host closed the connection]
vkareh has quit [Quit: WeeChat 3.6]
italove has joined #freedesktop
flom84 has joined #freedesktop
flom84 has quit [Ping timeout: 480 seconds]
Haaninjo has quit [Quit: Ex-Chat]
sima has quit [Ping timeout: 480 seconds]
mvlad has quit [Remote host closed the connection]
lsd|2 has quit []
atticf has quit [Ping timeout: 480 seconds]
atticf has joined #freedesktop
An0num0us has quit [Ping timeout: 480 seconds]
Guest1037 has quit [Quit: ZNC 1.8.2 - https://znc.in]