#freedesktop on 2023-08-29 — irc logs at oftc.irclog.whitequark.org

2023-08-27 13:06 daniels changed the topic of #freedesktop to: GitLab is currently down for upgrade; will be a while before it's back || https://www.freedesktop.org infrastructure and online services || for questions about freedesktop.org projects, please see each project's contact || for discussions about specifications, please use https://gitlab.freedesktop.org/xdg or xdg@lists.freedesktop.org

00:07 egbert is now known as Guest969

00:07 egbert has joined #freedesktop

00:10 Guest969 has quit [Ping timeout: 480 seconds]

00:10 bnilawar has quit [Ping timeout: 480 seconds]

00:24 Leopold__ has quit [Remote host closed the connection]

00:29 lack has quit [Read error: Connection reset by peer]

00:30 lack has joined #freedesktop

00:33 Leopold_ has joined #freedesktop

00:49 utsweetyfish has quit [Remote host closed the connection]

00:49 utsweetyfish has joined #freedesktop

00:51 AbleBacon has quit [Read error: Connection reset by peer]

00:56 co1umbarius has joined #freedesktop

00:58 columbarius has quit [Ping timeout: 480 seconds]

01:57 _DOOM_ has joined #freedesktop

02:06 <_DOOM_> I am working the StatusNotifierItem spec as the watcher when a host or item have a NameOwnerChange what should the watcher do if the item/host has a new name?

02:06 <_DOOM_> Should the watcher reannounce the item/host?

02:18 ximion has quit [Quit: Detached from the Matrix]

02:36 Yakov has joined #freedesktop

02:36 <Yakov> using base libevdev sample https://www.freedesktop.org/wiki/Software/libevdev/ -> getting error Failed to init libevdev - how to fix?

02:40 utsweetyfish has quit [Remote host closed the connection]

02:40 utsweetyfish has joined #freedesktop

02:50 asdon has joined #freedesktop

02:53 asdon has quit []

02:58 _DOOM_ has quit [Ping timeout: 480 seconds]

03:06 _DOOM_ has joined #freedesktop

03:10 Yakov has quit [Remote host closed the connection]

03:32 utsweetyfish has quit [Remote host closed the connection]

03:32 utsweetyfish has joined #freedesktop

03:37 Leopold_ has quit [Remote host closed the connection]

03:51 Leopold_ has joined #freedesktop

04:04 _DOOM_ has quit [Quit: WeeChat 4.0.2]

04:07 bnilawar has joined #freedesktop

04:07 agd5f has joined #freedesktop

04:12 agd5f has quit [Read error: Connection reset by peer]

04:12 agd5f has joined #freedesktop

04:13 agd5f_ has quit [Ping timeout: 480 seconds]

04:49 anon_ has quit [Quit: Page closed]

04:59 Salz has quit [Remote host closed the connection]

05:12 bmodem has joined #freedesktop

05:17 <tpalli> daniels regarding the rust issue, I will try to bump RUST_VERSION to 1.70.0-2023-06-01 and see if that works out

05:26 Leopold_ has quit [Remote host closed the connection]

05:29 dcunit3d has joined #freedesktop

05:30 Leopold_ has joined #freedesktop

05:31 dcunit3d_ has quit [Ping timeout: 480 seconds]

05:40 <tpalli> aww hitting something else now: "Error: authenticating creds for "harbor.freedesktop.org": can't talk to a V1 container registry"

05:44 <mupuf> tpalli: is that in a fork or in mesa/mesa?

05:45 bmodem has quit [Ping timeout: 480 seconds]

05:45 bmodem has joined #freedesktop

05:47 bmodem has quit [Excess Flood]

05:47 bmodem has joined #freedesktop

05:50 <tpalli> mupuf that is a MR which needs to bump the the rootfs tag .. so I think that is why it is hitting these things, it is https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24600

05:52 Ahuj has joined #freedesktop

05:54 <bbhtt> In the user verification template I don't have external: true, what do I do?

05:57 <mupuf> bbhtt: you have "false"?

05:57 <mupuf> If so, then you don't need to request anything

05:57 <bbhtt> mupuf: Yea

05:57 <bbhtt> Ah thanks

05:57 <mupuf> you should be able to fork :)

05:57 <mupuf> you must work for a company that fd.o trusts

06:00 <bbhtt> I think it's because my account was created before all this

06:03 <mupuf> oh, could be

06:03 <mupuf> tpalli: checking it out

06:04 * mupuf restored the banner about spam since new users need to know they need to request rights

06:05 <tpalli> mupuf thanks!

06:07 <mupuf> tpalli: looks to me like an issue with the unreliable network. It should be improved today

06:08 <tpalli> mupuf okeydokkey

06:14 kode54 has joined #freedesktop

06:19 lazka has left #freedesktop [bye]

06:26 sima has joined #freedesktop

06:31 gbissett has joined #freedesktop

06:31 <kode54> aaaaaa, apparently the gitlab was just migrated sideways?

06:32 gbissett has quit []

06:35 alanc has quit [Remote host closed the connection]

06:35 alanc has joined #freedesktop

06:36 MajorBiscuit has joined #freedesktop

06:39 tzimmermann has joined #freedesktop

06:56 An0num0us has joined #freedesktop

06:57 AbleBacon has joined #freedesktop

07:04 ximion has joined #freedesktop

07:07 ximion has quit []

07:08 pendingchaos has quit [Read error: Network is unreachable]

07:08 MTCoster has quit [Read error: Network is unreachable]

07:08 robclark has quit [Read error: Network is unreachable]

07:08 i509vcb has quit [Write error: connection closed]

07:08 bwidawsk has quit [Write error: connection closed]

07:08 kode54 has quit [Read error: Network is unreachable]

07:08 zmike has quit [Read error: Network is unreachable]

07:08 pendingchaos has joined #freedesktop

07:08 zmike has joined #freedesktop

07:08 kode54 has joined #freedesktop

07:08 i509vcb has joined #freedesktop

07:08 <daniels> yes, it went up to 16.x

07:08 zzag has quit [Remote host closed the connection]

07:08 aswar002 has quit [Remote host closed the connection]

07:08 zzag has joined #freedesktop

07:08 <daniels> which does feature some big UI changes

07:09 <kode54> ah

07:09 bwidawsk has joined #freedesktop

07:09 robclark has joined #freedesktop

07:09 aswar002 has joined #freedesktop

07:10 ebassi has quit [Remote host closed the connection]

07:11 jsto has quit [Remote host closed the connection]

07:11 jsto has joined #freedesktop

07:11 MTCoster has joined #freedesktop

07:15 ebassi has joined #freedesktop

07:18 Yakov has joined #freedesktop

07:18 <Yakov> is it possilbe to detect windows key with libevdev?

07:38 <alatiera> registry usage should be transparent post-migration still right?

07:39 <alatiera> linux runners seem to work fine, but the windows builds can't reach it for push it seems

07:39 <bentiss> alatiera: minus the fact that it's hosted on the new cluster which is showing some serious disk issues

07:39 <alatiera> though I think the windows job did manage to login

07:39 <alatiera> bentiss ack, thanks

07:39 <bentiss> alatiera: plan is to solve this this morning, but I can not seem to pg_dump the current db right now

07:40 <bentiss> alatiera: yeah, I don't think the login requires an access to the db

07:45 <Yakov> can I get help upon lebevdev here?

07:52 <bentiss> I'm glad I made a dump of the registry db yesterady and I kept it around: the registry db on the new cluster is simply not anwswering any requests

07:52 <bentiss> II'll reset it to yesterday's state soon

07:57 <alatiera> what db is currently backing the registy now

07:57 <alatiera> old machine with the old dump?

07:57 <bentiss> alatiera: the one on the new cluster which is failing

07:58 <alatiera> hmm, seems to be working on my end mostly

07:58 <bentiss> I'm making it pointing at the old cluster with the new db

07:58 <alatiera> (as in it's pushing things)

07:58 <bentiss> alatiera: yeah, you'll have to re-push, the db will be reset to yesterday's state

07:59 <alatiera> weird that it doesn't ack requests on your side huh

07:59 <alatiera> bentiss yea I don't mind

07:59 <bentiss> when I run the db dump on the machine it's running, it's simply hanging, so I guess I must not be the only one having issues

08:02 fgdfgdfgd has quit [Ping timeout: 480 seconds]

08:03 AbleBacon has quit [Read error: Connection reset by peer]

08:06 blatant has joined #freedesktop

08:12 <mupuf> bentiss: yeah, the registry has been unreliable

08:14 <mupuf> daniels: did the update happen? I still see 15.X in the admin

08:16 <mupuf> Anyway, the priority should be fixing the registry :)

08:29 <bentiss> hmm... It seems I can now dump the registry that was failing

08:30 <bentiss> and it seems that when noone is accessing the disks, they are fine. That's weird isn't it :)

08:31 <alatiera> if a disk does io and nobody hears it, did it do it at all?

08:31 * alatiera knows where the door is

08:31 <bentiss> good question :)

08:32 <bentiss> anyway, big question: should I keep running the current registry db with the backup from yesterday, or should I dump the one from 30 min ago?

08:33 <bentiss> mupuf: ^^?

08:33 <mupuf> bentiss: the new one, plrase

08:33 <bentiss> mupuf: ok.

08:33 <bentiss> I need to take the registry down then

08:34 <bentiss> it's down now

08:38 <mupuf> Crossing fingers it will go well

08:38 <bentiss> so far so good

08:39 <bentiss> (replicating)

08:39 <bentiss> creating indexes....

08:40 <bentiss> mupuf: and regarding the gitlab migration to , yes it's not done, but I need a stable cluster for that

08:40 <bentiss> 16.x

08:40 <mupuf> Exactly

08:41 <bentiss> and done, respinning up the registry pods

08:41 <bentiss> (they seem to be happy)

08:42 <mupuf> Gitlab reports psql to be 14.9. Isn't that too old for gitlab 16.x?

08:42 <bentiss> mupuf: it's supposed to be 15.9

08:42 <bentiss> oops, no, 14.9, you are correct

08:42 <bentiss> IIRC we were on 13.x before

08:43 <mupuf> I see, hopefully this is good-enough for gitlab 16

08:44 <bentiss> https://docs.gitlab.com/charts/installation/tools.html#postgresql mentions postgres 13 for gitlab 15.x (gitlab chart was 6.x, it's now 7.x)

08:44 <mupuf> Yeah, just saw tgat

08:44 <mupuf> So I guess we were running something older than psql 13 then

08:44 <bentiss> https://docs.gitlab.com/charts/releases/7_0.html -> recommended postgres 14.8

08:45 <bentiss> no, we were on 13, and now we are on 14

08:45 <bentiss> and gitlab 16 requires 14

08:45 <mupuf> Yep, it was 12.7

08:45 <bentiss> we couldn't have run gitlab 15.x on 12.7

08:46 <hakzsam> is it safe to re-assign MR to Marge now?

08:46 <mupuf> Ok, whatever, I am probably misreading

08:46 <mupuf> hakzsam: you can assign, it may or may not go through

08:46 <bentiss> hakzsam: safe, not sure, but you can try :)

08:46 <mupuf> I'll babysit it

08:46 <hakzsam> ack

08:47 <bentiss> mupuf: indeed: https://gitlab.freedesktop.org/freedesktop/helm-gitlab-deployment/-/commit/ecc1760e8bb8533ca8b18f3259aeb2ea529f5dfd

08:47 <hakzsam> is the registry also restored now ? because there is 0 tags from https://gitlab.freedesktop.org/hakzsam/vk-cts-image/container_registry/5327

08:47 <bentiss> also, for mesa, we need to remove the CI variables pointing at harbor, it's useless now

08:47 <mupuf> hakzsam: user tags are gone

08:48 <hakzsam> like lost?

08:48 <mupuf> Not lost, you can transfer them using skopeo. It is explained in the banner

08:48 <mupuf> I'll send you a link when I reach my pc

08:48 <hakzsam> ok

08:49 <bentiss> mupuf: the link in the banner disappeared

08:49 <mupuf> oh, right, I'll add it back

08:50 <bentiss> mupuf: no rush, I haven't updated it

08:50 <alatiera> I wonder, do we have numbers of the size of the registry with and without user tags?

08:50 <bentiss> or maybe we should promote the instruction as a wiki page

08:50 <bentiss> alatiera: no, and we can not, the blobs are shared

08:51 <alatiera> ah

08:51 <bentiss> what I can give you is the size of the registry on gcs and the one we hold now that has garbage collection

08:51 <alatiera> was curious how much of the blobs were leaf to the users

08:51 <mupuf> bentiss: I would love to see the size difference between the registries

08:52 <bentiss> so on GCS, we had 27TB of data, and I pulled only 9.8TB

08:53 <alatiera> if we remove old mesa/gst images we can probably half that

08:53 <bentiss> on those 9.8 TB, the data contains the main projects plus all new registry repos that were created after I started the registry migration (I think last September, one year ago)

08:53 <mupuf> alatiera: more like 75% down :D

08:53 <alatiera> (but that only works when there are no user tags)

08:54 <bentiss> well, harbor has some more numbers, and since I set it up mesa is roughly 1TB of data

08:54 <alatiera> I have half a script to parse the image tags in yml for the gst repo

08:54 <bentiss> but in any case, we have gc now, so in theory, if we can clear the tags, the blobs will be cleared eventually

08:54 <alatiera> but never finished the "query the reigstry and delete everything not in main|stable branches

08:55 <mupuf> bentiss: still getting some 503: https://gitlab.freedesktop.org/mesa/mesa/-/jobs/48138108

08:55 <bentiss> weird

08:57 <bentiss> let me update the dns

08:57 <mupuf> So, the registry's DB is now in the old cluster, but the data is still in the new cluster, right?

08:58 <mupuf> and the migration of the data back to the old cluster should happen online

08:58 <mupuf> and when this is done, nothing will be using the new cluster

08:58 <mupuf> and thus it could be destroyed and re-built

08:59 <bentiss> mupuf: almost, s3.freedesktop.org is also on the new cluster

08:59 <bentiss> and it was working fine for the past week

08:59 <mupuf> but not anymore?

09:00 <bentiss> nobody seems to be complaining (though it's hard to test without the registry)

09:00 <mupuf> Yeah, the big data transfer must have broken a disk down... very unlucky.

09:00 <bentiss> these are different diskss

09:00 <bentiss> the HDDs are working fine

09:00 <bentiss> the SSDs are completely failing

09:00 <mupuf> I see

09:02 blatant has quit [Quit: WeeChat 4.0.4]

09:02 <mupuf> so, what's the plan for s3 then? Remain in the new cluster?

09:03 <bentiss> still TBD

09:03 <bentiss> ideally I need to do forensic to understand what is happening

09:07 konstantin_ is now known as konstantin

09:09 <mupuf> yeah, let's not be brash about it

09:25 <bentiss> I am configuring the runners to directly point at registry.fd.o, not harbor, and rebooting them

09:25 <bentiss> just in case the job log grows

09:29 <mupuf> bentiss: nice!

09:29 <mupuf> so, harbor will be gone then?

09:30 <bentiss> mupuf: that's the plan yes

09:30 <mupuf> what registry server are you using then?

09:30 <bentiss> but I need to ensure that mesa and gfx-ci are synced

09:30 <bentiss> mupuf: the one bundled with gitlab

09:30 <mupuf> ok :)

09:30 <mupuf> well, fewer services == better

09:30 <bentiss> which is capable of properly handle the authorizations and such

09:30 <bentiss> yeah

09:31 <bentiss> no rewrite of the urls on live for the runners to

09:31 <bentiss> too

09:31 <mupuf> damn right!

09:42 <bentiss> ml-24 is in a bad place, I'll reinstall it in a bit

09:42 <mupuf> ack, I think it had some issues previously... so not a big loss

09:43 <bentiss> it was weird: it was showing some link down on the bond, and now everytime I reboot, it's not using the correct boot entry

09:44 <mupuf> there's been some "network down" errors recently

09:44 <mupuf> may be related

09:45 <mupuf> I did not check if it was -24

09:45 <bentiss> could very well be

09:49 mvlad has joined #freedesktop

09:50 llyyr has left #freedesktop [WeeChat 3.8]

09:58 <dabrain34[m]1> I'm trying to rebuild a windows image for my project and I'm getting for the second time a "HTTP status: 503 service unavailable" when pushing it to the registry. Here is the job https://gitlab.freedesktop.org/dabrain34/GstPipelineStudio/-/jobs/48139879

10:01 <mupuf> dabrain34[m]1: hmm

10:03 <mupuf> I retried the job

10:04 <mupuf> bentiss: could this be an s3 error ^

10:04 <mupuf> the push fails

10:05 <mupuf> dabrain34[m]1: it doesn't hurt to "docker push ... || {sleep 5; docker push ...}

10:05 <mupuf> in case of network errors

10:05 <mupuf> but still, seems pretty flaky

10:07 <bentiss> mupuf: that runner is still pointing at the registry in the old cluster. And I can see errors from it. We either need to wait for the dns cache refresh or force one

10:07 <mupuf> bentiss: oh, great, thanks :)

10:08 <dabrain34[m]1> shall I do something ?

10:08 <bentiss> dabrain34[m]1: unless you have root access on that server, no

10:25 vsyrjala_ is now known as vsyrjala

10:30 <mupuf> bentiss: So, anything else you want to work on today or this week? I would like to write that the down time is over

10:30 <mupuf> (and that DNS may take some time to propagate, but otherwise, we are done)

10:31 <bentiss> mupuf: would be nice if we could upgrade gitlab too

10:31 <dabrain34[m]1> how long should I wait more or less for this DNS propagation ?

10:31 <bentiss> dabrain34[m]1: at most 4 hours

10:31 <dabrain34[m]1> ok

10:31 <dabrain34[m]1> thanks for the support :)

10:32 <mupuf> bentiss: right, yeah, probably a good thing to do

10:32 <mupuf> but the registry work is done for now, right?

10:33 <bentiss> maybe? :)

10:33 <mupuf> we have the data in the new cluster, the DB in the old one

10:33 <bentiss> yeah

10:33 <mupuf> good good

10:33 <bentiss> I've disabled the gc while I am copying the data over to the old cluster

10:33 <bentiss> so yeah, not entirely finished

10:33 <mupuf> good call

10:34 <mupuf> is that a hot transfer, or is there still potential for data loss?

10:34 <bentiss> no hot transfer

10:34 <bentiss> no, hot tranfer

10:35 <bentiss> well, could have like a blob not transfered when I switch from the new to the old cluster, but I'll continue to sync the blobs in the background, so like 10 min delay

10:35 <mupuf> ack

10:36 <mupuf> ok, I'll write something down and ask you for a review

10:36 <bentiss> thanks!

10:45 <mupuf> bentiss: how long do you think it would take to upgrade to gitlab 16?

10:45 <mupuf> ~30 minutes?

10:46 <bentiss> mupuf: no idea. It can take a while, and it can be transparent or not depending on how the migration happens

10:46 * bentiss <- lunch, bbl

10:47 <mupuf> bentiss: enjoy!

11:03 Yakov has quit [Remote host closed the connection]

11:07 bnilawar has quit [Ping timeout: 480 seconds]

11:13 Ndfkjhw4 has quit []

11:13 Ndfkjhw4 has joined #freedesktop

11:14 Major_Biscuit has joined #freedesktop

11:16 Ndfkjhw4 has quit []

11:16 MajorBiscuit has quit [Ping timeout: 480 seconds]

11:33 bmodem has quit [Ping timeout: 480 seconds]

12:01 vkareh has joined #freedesktop

12:05 funestia[m] has left #freedesktop [#freedesktop]

12:07 <hakzsam> looks like pushing new images to registry is unavailable: received unexpected HTTP status: 500 Internal Server Error ?

12:09 <bentiss> hakzsam: which job, and which runner?

12:10 <hakzsam> it happened to me when I wanted to push a new image to vk-cts-image

12:11 <bentiss> hakzsam: I can see access to your registry on the failing registry pod, so hopefully when the dns cache gets properly expired, you should be fine

12:11 <hakzsam> ok, I will wait a bit then, thanks!

12:11 <bentiss> (should be another 2 hours tops)

12:14 <hakzsam> sounds good

12:53 peelz has joined #freedesktop

12:53 raghavgururajan has joined #freedesktop

12:53 elibrokeit_ has joined #freedesktop

12:53 rpigott_ has joined #freedesktop

12:53 moses_ has joined #freedesktop

12:53 ifreund_ has joined #freedesktop

12:54 peelz is now known as Guest1037

12:54 _lemes has joined #freedesktop

12:54 raghavgururajan is now known as Guest1043

12:55 MTCoster_ has joined #freedesktop

12:56 MTCoster_ has quit []

12:56 dcunit3d_ has joined #freedesktop

12:56 ebassi_ has joined #freedesktop

12:57 melissawen_ has joined #freedesktop

12:57 Sachiel has quit [resistance.oftc.net larich.oftc.net]

12:57 vkareh has quit [resistance.oftc.net larich.oftc.net]

12:57 ebassi has quit [resistance.oftc.net larich.oftc.net]

12:57 MTCoster has quit [resistance.oftc.net larich.oftc.net]

12:57 aswar002 has quit [resistance.oftc.net larich.oftc.net]

12:57 dcunit3d has quit [resistance.oftc.net larich.oftc.net]

12:57 alanc has quit [resistance.oftc.net larich.oftc.net]

12:57 sumits has quit [resistance.oftc.net larich.oftc.net]

12:57 itaipu has quit [resistance.oftc.net larich.oftc.net]

12:57 mattst88 has quit [resistance.oftc.net larich.oftc.net]

12:57 kem has quit [resistance.oftc.net larich.oftc.net]

12:57 Guest8532 has quit [resistance.oftc.net larich.oftc.net]

12:57 lemes has quit [resistance.oftc.net larich.oftc.net]

12:57 siqueira has quit [resistance.oftc.net larich.oftc.net]

12:57 melissawen has quit [resistance.oftc.net larich.oftc.net]

12:57 bleb has quit [resistance.oftc.net larich.oftc.net]

12:57 anholt has quit [resistance.oftc.net larich.oftc.net]

12:57 lkundrak has quit [resistance.oftc.net larich.oftc.net]

12:57 elibrokeit has quit [resistance.oftc.net larich.oftc.net]

12:57 Guest7111 has quit [resistance.oftc.net larich.oftc.net]

12:57 rpigott has quit [resistance.oftc.net larich.oftc.net]

12:57 ifreund has quit [resistance.oftc.net larich.oftc.net]

12:57 moses has quit [resistance.oftc.net larich.oftc.net]

12:57 abrotman has quit [resistance.oftc.net larich.oftc.net]

12:57 Lyude has quit [resistance.oftc.net larich.oftc.net]

12:57 demarchi has quit [resistance.oftc.net larich.oftc.net]

12:57 abrotman has joined #freedesktop

12:57 vkareh has joined #freedesktop

12:57 moses_ is now known as moses

12:58 elibrokeit_ is now known as elibrokeit

12:58 ifreund_ is now known as ifreund

12:58 siqueira has joined #freedesktop

12:58 MTCoster has joined #freedesktop

12:58 Sachiel has joined #freedesktop

13:00 aswar002 has joined #freedesktop

13:00 demarchi has joined #freedesktop

13:02 lkundrak has joined #freedesktop

13:02 alanc has joined #freedesktop

13:02 Lyude has joined #freedesktop

13:02 kem has joined #freedesktop

13:02 itaipu has joined #freedesktop

13:02 anholt has joined #freedesktop

13:04 sumits has joined #freedesktop

13:06 mattst88_ has joined #freedesktop

13:07 bleb has joined #freedesktop

13:10 <bentiss> alright, fixed the pages jobs and all artifacts uploads... it was trying to access the failing cluster instead of using the current one

13:13 MajorBiscuit has joined #freedesktop

13:14 Major_Biscuit has quit [Ping timeout: 480 seconds]

13:36 spiegela has joined #freedesktop

13:36 <bentiss> I've removed harbor from teh CI configuration in mesa. In theory, no visible impact

13:39 <bentiss> maybe not

13:41 epony has joined #freedesktop

13:56 <alatiera_afk[m]> 🤞

14:00 <zmike> anyone know what's going on with these cargo failures https://gitlab.freedesktop.org/zmike/mesa/-/pipelines/972362

14:01 <bentiss> zmike: I think karolherbst and daniels talked about that last week

14:04 melissawen_ has left #freedesktop [Leaving]

14:05 melissawen has joined #freedesktop

14:19 Haaninjo has joined #freedesktop

14:24 <karolherbst> yeah, but it was unclear what's causing this problem or rather what we want to do to fix it... Those containers don't use our rustup script, so the installed rust version comes from _somewhere_

14:25 <zmike> it's blocking further updates

14:25 <zmike> so ideally we want to do something

14:25 <zmike> even if it's just a stopgap

14:25 <karolherbst> sure, but the infra update happened :) I guess we should get back to that issue

14:26 <karolherbst> but I have no idea about that part of CI, it's something something the kernel stuff is doing there

14:26 <hakzsam> yeah, it's blocking every new containers

14:26 vkareh has quit [Quit: WeeChat 3.6]

14:28 <mupuf> eric_engestrom: that may be something you can help with ^

14:28 <karolherbst> it's probably the clap_lex upgrade to 0.5.1 which happened like 5 days ago and bindgen (or something) selects 0.5.x

14:28 <karolherbst> yeah.. that bumped the rust req from 1.64.0 to 1.70.0

14:29 <karolherbst> I think the solution here is to make crosvm use rustup (and our script for that) and install rustc 1.70 instead of relying on the distributions rustc

14:30 <eric_engestrom> I don't have much context here, but that last sentence makes sense to me karolherbst :)

14:30 <karolherbst> hakzsam, zmike: there is a workaround you can try

14:31 <karolherbst> inside .gitlab-ci/container/build-crosvm.sh

14:31 <zmike> you know I love workaroudns

14:31 <karolherbst> mhh.. maybe not, not sure how --locked actually works here with binaries

14:32 <karolherbst> yeah.. it's doing something else

14:33 <karolherbst> there is a thing called a cargo.lock file, but I've never used it and have no idea how it works

14:35 <bentiss> that's weird, the mesa images have not been mirrored from harbour to registry, when harbor says so

14:35 <karolherbst> who is maintaining/managing the crossvm stuff?

14:35 <karolherbst> *crosvm

14:36 <karolherbst> tintou and DavidHeidelberg[m]?

14:36 <karolherbst> please read ^^

14:36 <karolherbst> crosvm generation has to use a fixed rustc, not whatever the distribution uses, else a crate dependency update _might_ not compile, because of too old rustc

14:37 <karolherbst> it's currently broken, see the pipeline link

14:37 killpid_ has joined #freedesktop

14:37 <tintou> Yeah I actually just bumped on it

14:37 <karolherbst> _maybe_ using a Cargo.lock file is the more reliable solution here

14:37 <karolherbst> probably the one causing less issues

14:44 mattst88_ has quit []

14:55 <bentiss> FWIW, I'm babysitting the mesa pipeline by manually copying from harbor to registry the images that are not here

14:55 <zmike> heroic

14:55 <bentiss> I really don't know why harbor wasn't doing the replication properly

14:57 <eric_engestrom> bentiss: is it possible that somehow multiarch images got lost in translation?

14:57 <eric_engestrom> https://gitlab.freedesktop.org/mesa/mesa/-/jobs/48166178

14:57 <eric_engestrom> > .gitlab-ci/meson/build.sh: line 28: /usr/bin/llvm-config-15: cannot execute binary file: Exec format error

14:57 <bentiss> eric_engestrom: it's not a multiarch image, isn't it?

14:57 <eric_engestrom> wait no, that's not a multiarch image, that's an x86_64 image cross-building to s390x

14:58 <bentiss> let me push it again

14:58 AbleBacon has joined #freedesktop

14:58 <bentiss> the blobs did not match, they are pushed again

14:58 <eric_engestrom> thanks!

14:58 <eric_engestrom> is the push finished?

14:59 <bentiss> not yet

14:59 <bentiss> I'll restart the job

14:59 <eric_engestrom> ack; I jumped the gun and already did

15:00 <bentiss> I need to remove the image on the runners also

15:00 <eric_engestrom> actually no rush on retrying that job, the MR will fail anyway because other jobs have been taking too long so it's too late for marge anyway

15:01 <bentiss> well, it's for the next run

15:02 <bentiss> but maybe it's because it was running on ml24 I just reinstalled

15:02 <bentiss> and *maybe* s390x is not working there

15:02 <eric_engestrom> bentiss: your latest retry worked, thanks!

15:03 <bentiss> ok... no ideas what is happening, the image also works on ml-24

15:04 <bentiss> hah! it's a runner issue

15:04 <bentiss> /usr/bin/llvm-config-15: cannot execute binary file: Exec format error

15:04 <bentiss> error: Checking out added file "/ppc64le-linux-gnu": mkdirat: No such file or directory

15:11 <dabrain34[m]1> mupuf: As a follow up I face now received unexpected HTTP status: 500 Internal Server Error https://gitlab.freedesktop.org/dabrain34/GstPipelineStudio/-/jobs/48154304 is it expected ?

15:11 <mupuf> No, I would have expected the DNS to be updated by now

15:11 <mupuf> But I know it can take a while

15:11 <dabrain34[m]1> the same

15:12 <dabrain34[m]1> ok I'll give a try tomorrow morning

15:12 <mupuf> Maybe you can modify the job to print the IP for registry.freedesktop.org?

15:18 <bentiss> eric_engestrom: a reboot of the runner solved the issue (that's a package we can not put in the current boot apparently)

15:22 vyivel has quit [Remote host closed the connection]

15:23 <dabrain34[m]1> it gives 172.29.208.1

15:24 <dabrain34[m]1> where on my machine it gives 147.75.198.156

15:24 <bentiss> looks like a proxy?

15:26 <dabrain34[m]1> what should I expect ?

15:26 <bentiss> 147.75.198.156 is the correct IP

15:35 MajorBiscuit has quit [Ping timeout: 480 seconds]

15:41 <bentiss> I think the errors on the registry are due to "FATAL: sorry, too many clients already / FATAL: remaining connection slots are reserved for non-replication superuser connections"

15:41 <bentiss> so we are DoS the db

15:43 <mupuf> Oops

15:44 <bentiss> I'll probably split the db in 2 pods, one for registry, and one for gitlab

15:44 vyivel has joined #freedesktop

15:48 <mupuf> bentiss: can't just increase the connection count?

15:48 <bentiss> maybe

15:49 <bentiss> but we already increased it to 300, so maybe we are loading too much the db

15:49 <mupuf> I'm all for splitting, but as quick workaround, it would help

15:50 <mupuf> bentiss: I guess the load average and iostats would tell us that better than connection counts

15:51 <bentiss> but if I increase the connection count, I'll have to cut gitlab (or at least the db), while if I split, I just have to stop the registry for 1-2 min

15:52 <mupuf> And how much work is it?

15:52 <tpalli> brw I did bump the RUST_VERSION in latest version of the particular pipeline that failed, not sure if that is the correct solution

15:52 <mupuf> If it isn't much work, then fuck yeah!

15:52 <bentiss> mupuf: should be too hard to do

15:52 <tpalli> s/r/t/

15:54 <mupuf> Modularity is good then

15:55 todi has quit []

16:11 <bentiss> alright, cutting down the registry for a short amount of time, while I migrate to a separate db

16:14 <mupuf> bentiss: crossing fingers

16:15 <bentiss> db migrated, waiting for the new config to propagate

16:15 <bentiss> pods are starting

16:15 <bentiss> and running

16:15 <mupuf> \o/

16:16 <bentiss> seems to be working (as in skopeo inspect works)

16:18 <mupuf> yeah, and my runners seem to be happy too

16:22 killpid_ has quit [Quit: Quit.]

16:26 <mupuf> bentiss: so far, so good! https://gitlab.freedesktop.org/mesa/mesa/-/pipelines/972658

16:27 <bentiss> yep, not a single 500 since 18:00

16:27 <mupuf> <3

16:27 <bentiss> and noone should be using harbor by now too :)

16:30 <bentiss> alright I'm done for the day. I hope nothing will crash over the night

16:32 <mupuf> bentiss: thanks! Enjoy your evening!

16:36 MajorBiscuit has joined #freedesktop

16:50 tzimmermann has quit [Quit: Leaving]

17:06 MajorBiscuit has quit [Ping timeout: 480 seconds]

17:33 konstantin_ has joined #freedesktop

17:33 konstantin is now known as Guest1099

17:33 konstantin_ is now known as konstantin

17:37 Guest1099 has quit [Ping timeout: 480 seconds]

17:37 alanc has quit [Remote host closed the connection]

17:45 mvlad has quit [Remote host closed the connection]

17:51 lsd|2 has joined #freedesktop

18:06 Ahuj has quit [Ping timeout: 480 seconds]

18:06 ximion has joined #freedesktop

18:24 fgdfgdfgd has joined #freedesktop

18:42 alanc has joined #freedesktop

19:06 An0num0us has quit [Ping timeout: 480 seconds]

19:11 thaller is now known as Guest1110

19:11 thaller has joined #freedesktop

19:15 todi has joined #freedesktop

19:17 Guest1110 has quit [Ping timeout: 480 seconds]

19:21 Haaninjo has quit [Quit: Ex-Chat]

19:22 shbrngdo has quit [Ping timeout: 480 seconds]

19:25 shbrngdo has joined #freedesktop

19:29 mirai has quit [Remote host closed the connection]

19:29 mirai has joined #freedesktop

19:36 krushia has joined #freedesktop

19:37 MajorBiscuit has joined #freedesktop

19:42 An0num0us has joined #freedesktop

19:44 DodoGTA has quit [Quit: DodoGTA]

19:44 Leopold_ has quit []

19:45 DodoGTA has joined #freedesktop

19:49 Leopold_ has joined #freedesktop

19:49 thaller has quit [Read error: Connection reset by peer]

19:49 thaller has joined #freedesktop

19:58 sima has quit [Ping timeout: 480 seconds]

20:16 lsd|2 has quit []

20:16 thaller has quit [Read error: Connection reset by peer]

20:17 thaller has joined #freedesktop

20:17 MajorBiscuit has quit [Ping timeout: 480 seconds]

20:32 melissawen has quit [Remote host closed the connection]

20:33 melissawen has joined #freedesktop

20:56 duncaen has joined #freedesktop

20:57 krushia has quit [Quit: Konversation terminated!]

21:05 MajorBiscuit has joined #freedesktop

21:08 DodoGTA has quit [Quit: DodoGTA]

21:09 DodoGTA has joined #freedesktop

21:18 Major_Biscuit has joined #freedesktop

21:19 MajorBiscuit has quit [Ping timeout: 480 seconds]

21:26 Major_Biscuit has quit [Ping timeout: 480 seconds]

22:27 <koike> bentiss thanks for working on this

22:30 Leopold_ has quit [Ping timeout: 480 seconds]

22:38 Leopold_ has joined #freedesktop

23:09 An0num0us has quit [Ping timeout: 480 seconds]

23:24 lsd|2 has joined #freedesktop