daniels changed the topic of #freedesktop to: https://www.freedesktop.org infrastructure and online services || for questions about freedesktop.org projects, please see each project's contact || for discussions about specifications, please use https://gitlab.freedesktop.org/xdg or xdg@lists.freedesktop.org
jokester1365 has quit [Remote host closed the connection]
privacy has quit [Quit: Leaving]
manuels2 has quit [Quit: Ping timeout (120 seconds)]
manuels2 has joined #freedesktop
bmodem has joined #freedesktop
Juest is now known as Guest8535
Juest has joined #freedesktop
Guest8535 has quit [Ping timeout: 480 seconds]
tlwoerner has quit [Remote host closed the connection]
tlwoerner has joined #freedesktop
Leopold_ has joined #freedesktop
Shibe has quit [Remote host closed the connection]
vx has quit [Quit: G-Line: User has been permanently banned from this network.]
vx has joined #freedesktop
Guest8266 is now known as DrNick
<bilboed>
IIUC, at the red step the DB already had the data?
AbleBacon has quit [Read error: Connection reset by peer]
alice has quit [Ping timeout: 480 seconds]
Leopold_ has quit [Remote host closed the connection]
<bentiss>
bilboed: nah, the new config confused the script: it thought it had the data when it hadn't
Haaninjo has joined #freedesktop
<bilboed>
Ok. Looks like things are coming along nicely
<bentiss>
yeah, I'm going to restore the service and run a couple of tests. I might have to revert to the last backup, so please don't start working on it
<bilboed>
👍
samuelig has quit []
samuelig has joined #freedesktop
<bentiss>
hmm... the gitlab webservice pods keep crashing
<pq>
It feels like a holiday while gitlab is down... no rush :-)
<bentiss>
I have some errors in the db, running reindex on it
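For context, the reindex mentioned here is a plain PostgreSQL REINDEX. A minimal sketch of driving it from a script, assuming hypothetical connection details and the usual GitLab database name (neither is confirmed by the log):

```python
# Sketch only: reindex the whole GitLab database with psycopg2.
# Host, user and database name are assumptions, not the real deployment values.
import psycopg2

conn = psycopg2.connect(host="localhost", dbname="gitlabhq_production", user="postgres")
conn.autocommit = True  # REINDEX DATABASE cannot run inside a transaction block
with conn.cursor() as cur:
    cur.execute("REINDEX DATABASE gitlabhq_production;")
conn.close()
```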
hec2233 has joined #freedesktop
hec2233 has quit []
ehfd[m] has joined #freedesktop
smpl has joined #freedesktop
jani has quit []
jani has joined #freedesktop
<svuorela>
bentiss: done?
<bentiss>
svuorela: nope :(
<bentiss>
the webservice pods are still crashing
<svuorela>
oh. Just got actual pages when accidentally reloading a tab.
<bentiss>
yeah, it keeps going up and down
mvlad has joined #freedesktop
<bentiss>
I've restarted the 2 db pods and migrated them to different nodes, that seems better
<bentiss>
not really...
Shibe has joined #freedesktop
Haaninjo has quit [Quit: Ex-Chat]
<bentiss>
daniels: I can't seem to make the webservice pods stay up
<bentiss>
attempting to restart gitaly pods, as they might still be confused since the db split (I haven't restarted them since)
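Restarting the gitaly pods here presumably means a rolling restart of their workload. A sketch of the usual trick (bumping a pod-template annotation, which is what `kubectl rollout restart` does); the namespace and StatefulSet name are assumptions:

```python
# Sketch: rolling-restart a workload by bumping a pod-template annotation.
# Namespace and StatefulSet name are assumptions.
from datetime import datetime, timezone
from kubernetes import client, config

config.load_kube_config()
apps = client.AppsV1Api()

patch = {"spec": {"template": {"metadata": {"annotations": {
    "kubectl.kubernetes.io/restartedAt": datetime.now(timezone.utc).isoformat(),
}}}}}
apps.patch_namespaced_stateful_set(name="gitlab-gitaly", namespace="gitlab", body=patch)
```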
<daniels>
bentiss: I'm just coming to work (very slowly; turns out walking on a broken tire isn't much fun) so can have a look in a little bit
<bentiss>
daniels: ok, thanks. it's about lunch time here...
<bentiss>
daniels: long story short: the webservice pods keep going down and up (from 1/2 to 2/2 to 1/2)
<bentiss>
daniels: it could come from the db or the webservice configuration, not sure
<bentiss>
daniels: one solution would be to redeploy today's backup (it takes ~5h), disable db split and see if that changes anything
<bentiss>
we can do that while keeping the current db around as a fallback (we keep the pv around, and just recreate the pvc)
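The "keep the pv around, recreate the pvc" idea hinges on the volume's reclaim policy: the PV must be set to Retain before its claim is deleted, otherwise the data goes with it. A sketch, with the PV name purely hypothetical:

```python
# Sketch: make a PV survive deletion of its PVC by switching the reclaim
# policy to Retain first. The PV name is a made-up placeholder.
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

core.patch_persistent_volume(
    name="pvc-0123abcd-old-main-db",  # hypothetical PV backing the old db
    body={"spec": {"persistentVolumeReclaimPolicy": "Retain"}},
)
# After the old PVC is deleted, the PV shows up as Released; re-binding a new
# PVC to it requires clearing spec.claimRef (not shown here).
```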
<bentiss>
daniels: also as soon as I disable ingress towards the webservice pods, they are back up. So it's load related
<bentiss>
I've tested spinning up and down the number of replicas, but same result
<bentiss>
and of course, nothing interesting in the logs I could find
<bentiss>
let me disable webservice :(
<bentiss>
I'm going to restore the db dump before the split. We'll see if that helps
<daniels>
bentiss: hmmm ok, I was going to go stare at the logs (wow that walk was really slow) - did you see anything?
<bentiss>
not much no...
<bentiss>
daniels: I've kept the pv from the db split. So once that restore ends, we can go live with one or the other
<daniels>
nice, thanks a lot
<bentiss>
the db is restoring (manually with pg_restore, this is faster), and I'm going afk a bit
<bentiss>
(lunch)
<daniels>
bon appetit :)
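One reason a manual pg_restore tends to be faster is that it can load the dump with several parallel jobs. A rough sketch of such an invocation; the host, user, dump path and job count are all assumptions, not the actual commands used here:

```python
# Sketch: parallel pg_restore of a custom-format dump.
# Host, user, dump path and database name are assumptions.
import subprocess

subprocess.run(
    [
        "pg_restore",
        "--clean", "--if-exists",   # drop existing objects before recreating them
        "-j", "8",                   # parallel restore jobs (custom/directory format only)
        "-h", "gitlab-postgresql",  # hypothetical service name
        "-U", "gitlab",
        "-d", "gitlabhq_production",
        "/backups/gitlabhq_production.dump",
    ],
    check=True,
)
```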
vkareh has joined #freedesktop
bmodem has quit [Ping timeout: 480 seconds]
guludo has joined #freedesktop
<ndufresne>
Was this expected to take that long?
<karolherbst>
the downtime was planned to be 48 hours
<ndufresne>
ok, so far so good, is there an ML I perhaps should sign up for?
<karolherbst>
it was communicated via banners on gitlab
<ndufresne>
ah, I should have checked the day of the week, I always assume this is happening over the weekend
<karolherbst>
but maybe it makes sense to add an ETA on the maintenance page in the future as well?
<karolherbst>
bentiss: ^^
<karolherbst>
or rather.. the planned ETA
<ehfd[m]>
karolherbst: I think it's pretty unpredictable.
<karolherbst>
sure, but the point is to give drive-by people some kind of answer
<zmike>
oh no we hit a goto 80%
lsd|2 has joined #freedesktop
teemperor has joined #freedesktop
<teemperor>
question: is there an ETA for the gitlab upgrade? it seems to have been stuck for a while
lsd|2 has quit []
pocek[m] has joined #freedesktop
<bentiss>
karolherbst: OK, added the planned outage window
<karolherbst>
thanks :)
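The banner with the planned window was presumably set through GitLab's admin UI; the same thing can also be scripted through the broadcast-messages API. A sketch, with the token, message text and timestamps as placeholders only:

```python
# Sketch: set a maintenance banner with the planned outage window via the
# GitLab broadcast messages API. Token, message and timestamps are placeholders.
import requests

requests.post(
    "https://gitlab.freedesktop.org/api/v4/broadcast_messages",
    headers={"PRIVATE-TOKEN": "<admin-token>"},
    data={
        "message": "GitLab maintenance in progress; planned window ends at <time> UTC",
        "starts_at": "2023-01-01T08:00:00Z",  # placeholder
        "ends_at": "2023-01-02T08:00:00Z",    # placeholder
    },
    timeout=30,
)
```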
<bentiss>
teemperor: no ETA, no. Things did not go as well as expected. We have to revert part of the changes, but if the problem is different, that might require another 5h of db restore
<bl4ckb0ne>
are you still going with the db split?
<bentiss>
I'm going to see if the webservice pods work better without the split, and decide after that
<bentiss>
but it's unlikely we will be able to re-split given the time left
<karolherbst>
:'(
<karolherbst>
well at least you have backups and could try locally to see what the problem is or something
<karolherbst>
(after getting the system live again)
<bentiss>
karolherbst: if the problems are unrelated to the db split and we fix them I can always revert to the split db
<karolherbst>
fair enough
<bentiss>
that takes a couple of seconds
lsd|2 has joined #freedesktop
<bentiss>
but the moving target was: new postgresql, an upgrade of the bitnami chart (arguably only a minor upgrade), and the db split
<karolherbst>
yeah...
<bentiss>
my bet right now is on the db split, but maybe it's a problem with the new chart or postgres
<karolherbst>
maybe those things should be done individually even if that means downtime more often
<bl4ckb0ne>
thanks! fingers crossed
<bentiss>
daniels: db restore done (in theory)
<bentiss>
so far, it's holding the load
<karolherbst>
that's now without the split db, right?
<bentiss>
yeah
<karolherbst>
kinda sad, but I guess next time
<zmike>
:(
<ehfd[m]>
Probably GitLab didn't harden the split DB scheme enough...
<bentiss>
though one difference is that this time I ran the migration job (because I forgot to opt out)
<bentiss>
but yeah, maybe it's better to keep it that way
<karolherbst>
though it did error when you tried to do the split, no?
<bentiss>
yeah, but it was related to my config change, I went too far in the config because I wanted to have 2 separate postgres, not just one
<bentiss>
so I set up the change first, and then when I started the process it happily told me that the config was already there so it should stop
<daniels>
so the migrate job was designed for having split dbs but on the same host?
<bentiss>
while the data wasn't
<bentiss>
yeah, so far, yeah, same host
<bentiss>
daniels: happy to keep it that way?
<bentiss>
it feels so much smoother now that it's not timing out every request :)
<karolherbst>
heh
<karolherbst>
wait until people use it
<daniels>
hahaha
<daniels>
yeah, I mean, working > not working for sure
<daniels>
I guess before we try again we should probably try the backups with the adjusted config on a shadow host to figure out wtf is making the webservice die?
<zmike>
so maintenance is done?
<bentiss>
yeah, good point. I can deploy a pvc with the old main db (split) so it's not lost
<bentiss>
zmike: I need to remember if all the pieces are together
<bentiss>
and I had a short night :)
<zmike>
😅
<zmike>
just want to make sure it's done before I start using it
<karolherbst>
marge is already busy
<karolherbst>
bentiss: maybe marge triggering CI stuff broke it
<bentiss>
nah, it wasn't even able to start because it failed at pulling its container
<karolherbst>
I see
<daniels>
bentiss: errr
<daniels>
oh, Marge, right :)
<daniels>
not webservice, heh
<bentiss>
yeah webservice was starting but timing out a lot, and then kubelet restarted it every 30 secs
<daniels>
hm, not answering on /ready?
<bentiss>
at least, we upgraded postgres and gitlab, so that's still positive :)
<daniels>
yeah :)
<bentiss>
I extended the timeout from 2 to 20 secs, and it was still randomly shooting down pods
<bentiss>
though 20s was terrible
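For reference, the 2s to 20s change described above would typically be a readiness-probe timeout on the webservice deployment. A sketch of what bumping it could look like; the deployment, namespace and container names are assumptions:

```python
# Sketch: raise the readiness probe timeout on the webservice deployment.
# Deployment, namespace and container names are assumptions.
from kubernetes import client, config

config.load_kube_config()
apps = client.AppsV1Api()

patch = {"spec": {"template": {"spec": {"containers": [{
    "name": "webservice",  # hypothetical container name
    "readinessProbe": {"timeoutSeconds": 20},
}]}}}}
apps.patch_namespaced_deployment(name="gitlab-webservice-default",
                                 namespace="gitlab", body=patch)
```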
<daniels>
also, I wonder, instead of doing pg_dump to pull the db content, would it be safe to snapshot the PV?
<bentiss>
we cannot upgrade the db this way
<daniels>
yeah, not across major versions
<bentiss>
between major postgres versions, we need to reload all of the data
<daniels>
right
<daniels>
I meant outside of the major upgrades, like if we want to try db split tests again
<bentiss>
that's weird, I don't see the postgres errors I was having after the db split
<daniels>
snapshot PV -> provision new PV with snapshot -> test migration from there
<bentiss>
yeah
<daniels>
might save a few hours relative to pg_dump/pg_restore?
<bentiss>
we can also rely on the latest daily backup
<bentiss>
still takes time to provision though
<bentiss>
anyway, something for next time
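The snapshot idea in this exchange relies on the CSI snapshot API: snapshot the database PVC, then provision a throwaway PVC from that snapshot for migration testing. A rough sketch, assuming the cluster has a CSI driver with snapshot support; every name and size below is a placeholder:

```python
# Sketch: snapshot the db PVC and clone it into a new PVC for testing.
# Namespace, PVC name, snapshot class and size are all placeholders.
from kubernetes import client, config

config.load_kube_config()
custom = client.CustomObjectsApi()
core = client.CoreV1Api()

# 1. Snapshot the existing database PVC.
custom.create_namespaced_custom_object(
    group="snapshot.storage.k8s.io", version="v1",
    namespace="gitlab", plural="volumesnapshots",
    body={
        "apiVersion": "snapshot.storage.k8s.io/v1",
        "kind": "VolumeSnapshot",
        "metadata": {"name": "db-presplit-snap"},
        "spec": {
            "volumeSnapshotClassName": "csi-snapclass",
            "source": {"persistentVolumeClaimName": "data-gitlab-postgresql-0"},
        },
    },
)

# 2. Provision a test PVC from the snapshot.
pvc = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="db-split-test"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteOnce"],
        resources=client.V1ResourceRequirements(requests={"storage": "500Gi"}),
        data_source=client.V1TypedLocalObjectReference(
            api_group="snapshot.storage.k8s.io",
            kind="VolumeSnapshot",
            name="db-presplit-snap",
        ),
    ),
)
core.create_namespaced_persistent_volume_claim(namespace="gitlab", body=pvc)
```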
<bentiss>
... and the banner is gone
<DragoonAethis>
\o/
<karolherbst>
nice
<daniels>
ack
<daniels>
thanks!
<bl4ckb0ne>
thanks o/
<karolherbst>
we celebrate by DDOSing gitlab with everybody pushing the work they've done since Monday
* DragoonAethis
flips 10 heavy CI jobs back on
<bentiss>
go ahead!
<karolherbst>
oh no
<karolherbst>
it's already slowing down
* bentiss
doesn't think so
<karolherbst>
mhh, yeah, seems to be still quicker than before actually
teemperor has quit [Remote host closed the connection]
<dwfreed>
they even used it themselves in 2020 when upgrading gitlab.com's postgres
<bentiss>
one benefit of backup/restore is that it cleans the db a lot and it shrinks the overall size
<bentiss>
not sure we've got a vacuum enabled and running properly
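A quick way to answer the vacuum question is to ask postgres directly whether autovacuum is on and when the busiest tables were last vacuumed. A sketch, with the connection details assumed:

```python
# Sketch: check autovacuum status and recent (auto)vacuum activity.
# Connection details are assumptions.
import psycopg2

conn = psycopg2.connect(host="localhost", dbname="gitlabhq_production", user="postgres")
with conn.cursor() as cur:
    cur.execute("SHOW autovacuum;")
    print("autovacuum =", cur.fetchone()[0])

    cur.execute("""
        SELECT relname, n_dead_tup, last_vacuum, last_autovacuum
        FROM pg_stat_user_tables
        ORDER BY n_dead_tup DESC
        LIMIT 10;
    """)
    for row in cur.fetchall():
        print(row)
conn.close()
```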
<bentiss>
anyway if anybody feels like they want to have a look, we welcome new admins :)
<dwfreed>
I know nothing about gitlab
<dwfreed>
Not sure I want to learn, either
<daniels>
it's a lab for git. any questions?
bmodem has quit [Ping timeout: 480 seconds]
<DodoGTA>
So is the GitLab database currently split (or not)?
<daniels>
it's not currently split
AbleBacon has quit [Read error: Connection reset by peer]
alice has joined #freedesktop
dcunit3d has quit [Quit: Quitted]
dcunit3d has joined #freedesktop
dcunit3d has quit [Remote host closed the connection]
alice_ has joined #freedesktop
alice has quit [Ping timeout: 480 seconds]
ximion has joined #freedesktop
dcunit3d has joined #freedesktop
pboushy has joined #freedesktop
lsd|2 has joined #freedesktop
pboushy has quit [Remote host closed the connection]
pboushy has joined #freedesktop
<kusma>
It seems to me like gitlab no longer sends out email notifications, could that be?
<daniels>
no
<daniels>
I'm still getting them
<kusma>
Hmm, strange. I got one at 16:38 CEST, and then silence. And going through the activity, I see new MRs that I should have been notified about...
<daniels>
I've got them as recently as a couple of minutes ago
<daniels>
it is very possible they're not getting delivered to _you_, but yeah, gmail is pretty capricious
<kusma>
Hmm. Seems a subscription I thought I had was missing.
<kusma>
That seems to pre-date the upgrade, so probably PEBCAK
pboushy has quit [Remote host closed the connection]
pboushy has joined #freedesktop
pboushy has quit [Remote host closed the connection]
mripard has quit [Remote host closed the connection]
konstantin_ has joined #freedesktop
konstantin is now known as Guest8614
konstantin_ is now known as konstantin
Guest8614 has quit [Ping timeout: 480 seconds]
Haaninjo has joined #freedesktop
alanc has quit [Remote host closed the connection]
alanc has joined #freedesktop
alice_ is now known as alice
mvlad has quit [Remote host closed the connection]