#freedesktop on 2021-06-02 — irc logs at oftc.irclog.whitequark.org

00:00 ximion1 has quit [Remote host closed the connection]

00:00 ximion1 has joined #freedesktop

00:10 jcline has quit [Quit: Bye.]

00:16 shbrngdo has quit [Remote host closed the connection]

00:17 shbrngdo has joined #freedesktop

00:22 tanty has quit [Ping timeout: 480 seconds]

00:22 tanty has joined #freedesktop

00:24 MrCooper has quit [Read error: Connection reset by peer]

00:24 MrCooper has joined #freedesktop

02:28 <bentiss> alright, mc mirror --watch is taking way too much memory, and if I don't stop it now, we are going to lose another node soon

02:29 <bentiss> I've been running it for the past ~24 h, so I'm going to switch the config now, and hopefully people will not notice

02:33 <bentiss> and that should hopefully solve the 500s when uploading artifacts

02:37 ximion1 has quit []

02:47 <bentiss> ok, new config applied. I'll sync the remaining logs tomorrow, but I can target them more easily now given that I don't have to run the mirror on the whole instance

02:56 * bentiss goes back to bed

04:22 agd5f_ has joined #freedesktop

04:22 agd5f has quit [Read error: Connection reset by peer]

04:26 agd5f_ has quit [Read error: Connection reset by peer]

04:26 agd5f_ has joined #freedesktop

05:32 chomwitt has joined #freedesktop

05:53 i-garrison has quit []

05:59 jarthur has quit [Ping timeout: 480 seconds]

06:25 jarthur has joined #freedesktop

06:35 jarthur has quit [Quit: Textual IRC Client: www.textualapp.com]

06:42 chomwitt has quit [Ping timeout: 480 seconds]

06:50 danvet has joined #freedesktop

07:04 chomwitt has joined #freedesktop

07:47 sunarch has joined #freedesktop

07:57 <daniels> bentiss: ooof

08:01 chomwitt has quit [Remote host closed the connection]

08:02 <bentiss> daniels: BTW, there is a security update pending, but I'd like to have a full backup done first

08:03 <bentiss> I enabled the pages bucket to be backed up yesterday, but it keeps failing to work. I think I manually managed to work around the errors, but I'd rather not touch the deployment now

08:04 <bentiss> daniels: in case you wonder too: minio-artifacts is now gone, so is fdo-k3s-large-4, in 2 hours I should have locally copied the 6 backup files, and we can kill minio-backup too

08:06 <bentiss> and FWIW, not a single 500 since the switch to ceph

08:11 pendingchaos has quit [Read error: Connection reset by peer]

08:11 pendingchaos has joined #freedesktop

08:16 <daniels> bentiss: \o/ \o/ \o/

08:16 <daniels> and yeah, I did see the security update, but was thinking waiting might be better ...

08:16 <daniels> nothing in it looks _crushingly_ urgent, but definitely good to have

08:16 <daniels> I have some other things for the next couple of hours btw

08:45 <psychon> somehow GitLab seems to require a lot of maintenance...

08:45 <psychon> anyway, https://gitlab.freedesktop.org/jfkthame/cairo/-/pipelines/325031 only gives me a 500

08:52 <bentiss> psychon: that's a pipeline from last week, and we basically lost all of them

08:54 <bentiss> we are slowly recovering the data from 2021/02/01 to 2021/05/20, there is a hole and then we got data since 2021/05/28

08:58 <daniels> psychon: it's not GitLab itself fwiw, it's our underlying storage

09:53 <bentiss> sigh... today's backup was successful, but only 222GB instead of ~350GB -> the pod could not talk to the old cluster anymore :(

10:12 vmeson has quit [Read error: Connection reset by peer]

10:12 <daniels> task-runner on the new cluster?

10:13 pendingchaos has quit []

10:13 pendingchaos has joined #freedesktop

10:13 <bentiss> yep

10:13 <bentiss> the backups are running on the new cluster

10:15 <bentiss> daniels: my new favorite tool to use remote storage: https://rclone.org/ -> it managed to copy the backups when mc would just fail and s3cmd is soooo slow (because single threaded)

10:20 <psychon> okay, thanks for the info & no problem

10:22 <daniels> bentiss: noted!

10:46 shbrngdo has quit [Remote host closed the connection]

10:46 shbrngdo has joined #freedesktop

11:01 aleksander has quit []

12:59 vmeson has joined #freedesktop

13:47 <shadeslayer> daniels: looks like it works https://gitlab.freedesktop.org/shadeslayer/mesa/-/pipelines/330815/test_report

13:49 <daniels> nice!

14:05 bl4ckb0ne has joined #freedesktop

14:06 <bl4ckb0ne> is the monado channel still on freenode?

14:09 bl4ckb0ne has quit [Remote host closed the connection]

14:09 emersion has quit [Remote host closed the connection]

14:12 <daniels> yeah for now, they're going to move it soon

14:13 <daniels> but realistically all the activity happens on Discord anyway

14:15 emersion has joined #freedesktop

14:16 bl4ckb0ne has joined #freedesktop

14:40 chomwitt has joined #freedesktop

15:50 jarthur has joined #freedesktop

15:51 chomwitt has quit [Ping timeout: 480 seconds]

16:39 <bentiss> daniels: couple of things:

16:40 <bentiss> daniels: 1. the transfer of all the artifacts and backups is now done (rclone was *way* faster)

16:41 <bentiss> daniels: 2. I understand why pages are not properly backing up: the files are uploaded from a git user with id 998, and we are running the task-runner pod as git/1000 -> the `s3cmd sync` call tries to change the owner, and it fails with a permission error

16:41 <bentiss> for 1. -> I'll clean up large-2 and large-3 tonight

16:41 <bentiss> for 2. -> still thinking about what the best course of action is

16:42 <bentiss> maybe we should monkey patch /usr/lib/ruby/vendor_ruby/object_storage_backup.rb

16:43 <bentiss> adding --no-preserve to https://gitlab.com/gitlab-org/build/CNG/-/blob/a30b5bee0b8d37ea80f9c98c001e0a777529692d/gitlab-task-runner/scripts/lib/object_storage_backup.rb#L24 would be good enough...

16:44 ximion has joined #freedesktop

16:47 alanc has quit [Remote host closed the connection]

16:48 alanc has joined #freedesktop

17:00 chomwitt has joined #freedesktop

17:07 ximion1 has joined #freedesktop

17:07 ximion has quit [Remote host closed the connection]

17:13 chomwitt has quit [Ping timeout: 480 seconds]

17:55 chomwitt has joined #freedesktop

18:17 <bentiss> oh, well, this time the backup passed...

18:41 ngcortes has joined #freedesktop

18:48 ngcortes has quit [Remote host closed the connection]

18:56 <bentiss> daniels: large-2 and large-3 are now evicted from the cluster. We "just" need to release the servers

19:45 shbrngdo has quit [Remote host closed the connection]

19:46 shbrngdo has joined #freedesktop

20:36 danvet has quit [Ping timeout: 480 seconds]

20:59 ngcortes has joined #freedesktop

21:14 karolherbst_ is now known as karolherbst

21:22 chomwitt has quit [Ping timeout: 480 seconds]

22:07 ngcortes has quit [Remote host closed the connection]