daniels changed the topic of #freedesktop to: https://www.freedesktop.org infrastructure and online services || for questions about freedesktop.org projects, please see each project's contact || for discussions about specifications, please use https://gitlab.freedesktop.org/xdg or xdg@lists.freedesktop.org
lsd|2 has joined #freedesktop
kode54 has quit [Quit: The Lounge - https://thelounge.chat]
KDDLB has quit [Quit: The Lounge - https://thelounge.chat]
kode54 has joined #freedesktop
lsd|2 has quit [Ping timeout: 480 seconds]
lsd|2 has joined #freedesktop
Leopold_ has quit [Remote host closed the connection]
lsd|2 has quit [Quit: KVIrc 5.2.2 Quasar http://www.kvirc.net/]
guludo has quit [Ping timeout: 480 seconds]
KDDLB has joined #freedesktop
Trevinho has joined #freedesktop
M839ty9[m] has joined #freedesktop
Wallbraker has joined #freedesktop
adziahel[m] has joined #freedesktop
aenderboy[m] has joined #freedesktop
alatiera[m] has joined #freedesktop
anomalous_creator[m] has joined #freedesktop
ashelytina[m] has joined #freedesktop
bendlas[m] has joined #freedesktop
BLumia[m] has joined #freedesktop
cassidy[m] has joined #freedesktop
chrysn[m]1 has joined #freedesktop
cmeissl[m] has joined #freedesktop
dabrain34[m]1 has joined #freedesktop
Hiperion[m] has joined #freedesktop
dcbaker has joined #freedesktop
Guest8266 has joined #freedesktop
enunes[m] has joined #freedesktop
ewlsh[m] has joined #freedesktop
gallo[m] has joined #freedesktop
gegoxaren[m] has joined #freedesktop
general_j[m] has joined #freedesktop
gnfzdz[m] has joined #freedesktop
havdan[m] has joined #freedesktop
Hazematman has joined #freedesktop
hch12907 has joined #freedesktop
heftig has joined #freedesktop
Guest8258 has joined #freedesktop
jasuarez has joined #freedesktop
jenatali has joined #freedesktop
jjardon[m] has joined #freedesktop
jtatz[m] has joined #freedesktop
kusma has joined #freedesktop
LaughingMan[m] has joined #freedesktop
Pope_Rigby[m] has joined #freedesktop
mairacanal[m] has joined #freedesktop
marcel203s[m] has joined #freedesktop
Mark[m]1 has joined #freedesktop
marting[m] has joined #freedesktop
matrix638[m] has joined #freedesktop
mlsatzin[m] has joined #freedesktop
mripard1 has joined #freedesktop
msizanoen[m] has joined #freedesktop
muhlinux[m] has joined #freedesktop
nazarewk[m] has joined #freedesktop
nee[m] has joined #freedesktop
nielsdg has joined #freedesktop
nirbheek_ has joined #freedesktop
pac85[m] has joined #freedesktop
pitch has joined #freedesktop
pv[m] has joined #freedesktop
valida-69[m] has joined #freedesktop
ramman_[m] has joined #freedesktop
razze[m] has joined #freedesktop
Guest8395 has joined #freedesktop
rpurdie[m] has joined #freedesktop
dabrain34[m] has joined #freedesktop
seaweed[m] has joined #freedesktop
serbbenzo[m] has joined #freedesktop
siddh has joined #freedesktop
SintayewGashaw[m] has joined #freedesktop
sergi has joined #freedesktop
Sumera[m] has joined #freedesktop
swick[m] has joined #freedesktop
sythemeta847[m] has joined #freedesktop
tayloralgo1[m] has joined #freedesktop
Nova[m] has joined #freedesktop
therealsteamlord[m] has joined #freedesktop
tintou has joined #freedesktop
underpantsgnome[m] has joined #freedesktop
tinywrkb has joined #freedesktop
tomeu has joined #freedesktop
ttancos[m] has joined #freedesktop
twopubsolar[m] has joined #freedesktop
mitTengiz[m] has joined #freedesktop
Soroush has joined #freedesktop
unrznbl[m] has joined #freedesktop
valentine has joined #freedesktop
Torxon[m] has joined #freedesktop
MatrixTravelerbot[m] has joined #freedesktop
vulpes2[m] has joined #freedesktop
dlx[m] has joined #freedesktop
ylatuya[m] has joined #freedesktop
zredshift[m] has joined #freedesktop
ximion has quit [Quit: Detached from the Matrix]
jokester1365 has joined #freedesktop
jokester1365 has left #freedesktop [#freedesktop]
jokester1365 has joined #freedesktop
jokester1365 has quit [Remote host closed the connection]
privacy has quit [Quit: Leaving]
manuels2 has quit [Quit: Ping timeout (120 seconds)]
manuels2 has joined #freedesktop
bmodem has joined #freedesktop
Juest is now known as Guest8535
Juest has joined #freedesktop
Guest8535 has quit [Ping timeout: 480 seconds]
tlwoerner has quit [Remote host closed the connection]
tlwoerner has joined #freedesktop
Leopold_ has joined #freedesktop
Shibe has quit [Remote host closed the connection]
vx has quit [Quit: G-Line: User has been permanently banned from this network.]
vx has joined #freedesktop
Guest8266 is now known as DrNick
<bilboed> IIUC the red step, the DB already had the data ??
AbleBacon has quit [Read error: Connection reset by peer]
alice has quit [Ping timeout: 480 seconds]
Leopold_ has quit [Remote host closed the connection]
<bentiss> bilboed: nah, the new config confused the script: it thought it had the data when it hadn't
Haaninjo has joined #freedesktop
<bilboed> Ok. Looks like things are coming along nicely
<bentiss> yeah, I'm going to restore the service and run a couple of tests. It might have to revert to the last backup, so please don't start working on it
<bilboed> 👍️
samuelig has quit []
samuelig has joined #freedesktop
<bentiss> hmm... the gitlab webservice pods keep crashing
<pq> It feels like holiday while gitlab is down... no rush :-)
<bentiss> I have some errors in the db, running reindex on it
hec2233 has joined #freedesktop
hec2233 has quit []
ehfd[m] has joined #freedesktop
smpl has joined #freedesktop
jani has quit []
jani has joined #freedesktop
<svuorela> bentiss: done ?
<bentiss> svuorela: nope :(
<bentiss> the webservice pods are still crashing
<svuorela> oh. Just got actual pages when accidentally reloading a tab.
<bentiss> yeah, it keeps going up and down
mvlad has joined #freedesktop
<bentiss> I've restarted the 2 db and migrated them on different nodes, that seems better
<bentiss> not really...
Shibe has joined #freedesktop
Haaninjo has quit [Quit: Ex-Chat]
<bentiss> daniels: I can't seem to make the webservice pods stay up
<bentiss> attempting to restart gitaly pods, as they might still be confused since the db split (I haven't restarted them since)
<daniels> bentiss: I’m just coming to work (very slowly; turns out walking on a broken tire isn’t much fun) so can have a look in a little bit
<bentiss> daniels: ok, thanks. it's about lunch time here...
<bentiss> daniels: long story short: the webservice pods keep getting down and up (from 1/2 to 2/2 to 1/2)
<bentiss> daniels: it could come from the db or the webservice configuration, not sure
<bentiss> daniels: one solution would be to redeploy today's backup (it takes ~5h), disable db split and see if that changes anything
<bentiss> we can do that by keeping the current db in the back (we keep the pv around, and just recreate the pvc)
<bentiss> daniels: also as soon as I disable ingress towards the webservice pods, they are back up. So it's load related
<bentiss> I've tested spinning up and down the number of replicas, but same result
<bentiss> and of course, nothing interesting in the logs I could find
<bentiss> let me disable webservice :(
<bentiss> I'm going to restore the db dump before the split. We'll see if that helps
<daniels> bentiss: hmmm ok, I was going to go stare at the logs (wow that walk was really slow) - did you see anything?
<bentiss> not much no...
<bentiss> daniels: I've kept the pv behind the db split. So once that restore ends, we can go live with one or the other
<daniels> nice, thanks a lot
<bentiss> the db is restoring (manually with pg_restore, this is faster), and I'm going afk a bit
<bentiss> (lunch)
<daniels> bon appetit :)
vkareh has joined #freedesktop
bmodem has quit [Ping timeout: 480 seconds]
guludo has joined #freedesktop
<ndufresne> Was this expected to take that long ?
<karolherbst> the downtime was planned to be 48 hours
<ndufresne> ok, so far so good, is there a ML I should perhaps sign up to ?
<karolherbst> it was communicated via banners on gitlab
<ndufresne> ah, I should have checked the day of the week, I always assume this is happening over the weekend
<karolherbst> but maybe it makes sense to add an ETA on the maintenance page in the future as well?
<karolherbst> bentiss: ^^
<karolherbst> or rather.. the planned ETA
<ehfd[m]> karolherbst: I think it's pretty unpredictable.
<karolherbst> sure, but the point is to give drive-by people some kind of answer
<zmike> oh no we hit a goto 80%
lsd|2 has joined #freedesktop
teemperor has joined #freedesktop
<teemperor> question: is there an ETA for the gitlab upgrade? it seems to have been stuck for a while
lsd|2 has quit []
pocek[m] has joined #freedesktop
<bentiss> karolherbst: OK, added the planned window outage time
<karolherbst> thanks :)
<bentiss> teemperor: no ETA no. Things did not go as well as expected. We have to revert part of the changes, but if the problem is different, that might require another 5h of db restore
<bl4ckb0ne> are you still going with the db split?
<bentiss> I'm going to check whether the webservice pods work better without the split, and take a decision after
<bentiss> but it's unlikely we will be able to re-split given the time left
<karolherbst> :'(
<karolherbst> well at least you have backups and could try locally to see what's the problem or something
<karolherbst> (after getting the system live again)
<bentiss> karolherbst: if the problems are unrelated to the db split and we fix them I can always revert to the split db
<karolherbst> fair enough
<bentiss> that takes a couple of seconds
lsd|2 has joined #freedesktop
<bentiss> but the moving targets were new postgresql, upgrade of the bitnami chart (arguably only a minor upgrade), and db split
<karolherbst> yeah...
<bentiss> my bets are right now on the db split, but maybe it's a problem with the new chart or postgres
<karolherbst> maybe those things should be done individually even if that means down time more often
<bl4ckb0ne> thanks! fingers crossed
<bentiss> daniels: db restore done (in theory)
<bentiss> so far, it's holding the load
<karolherbst> that's now without the split db, right?
<bentiss> yeah
<karolherbst> kinda sad, but I guess next time
<zmike> :(
<ehfd[m]> Probably that GitLab didn't harden the split DB scheme enough...
<bentiss> though one difference is that this time I ran the migration job (because I forgot to opt out)
<bentiss> but yeah, maybe it's better to keep it that way
<karolherbst> though it did error when you tried to do the split, no?
<bentiss> yeah, but it was related to my config change, I went too far in the config because I wanted to have 2 separate postgres, not just one
<bentiss> so I set up the change first and then when I started the process it happily told me that the config was already there so it should stop
<daniels> so the migrate job was designed for having split dbs but on the same host?
<bentiss> while the data wasn't
<bentiss> yeah, so far, yah, same host
<bentiss> *yeah
<bentiss> daniels: happy to keep it that way?
<bentiss> it feels so much smoother now that it's not timing out every request :)
<karolherbst> heh
<karolherbst> wait until people use it
<daniels> hahaha
<daniels> yeah, I mean, working > not working for sure
<daniels> I guess before we try again we should probably try the backups with the adjusted config on a shadow host to figure out wtf is making the webservice die?
<zmike> so maintenance is done?
<bentiss> yeah, good point. I can deploy a pvc with the old main db (split) so it's not lost
<bentiss> zmike: I need to remember if all the pieces are together
<bentiss> and I had a short night :)
<zmike> 👍
<zmike> just want to make sure it's done before I start using it
<karolherbst> marge is already busy 🙃
<karolherbst> bentiss: maybe marged triggering CI stuff broke it
<karolherbst> *marge
<bentiss> nah, it wasn't even able to start because it failed at pulling its container
<karolherbst> I see
<daniels> bentiss: errr
<daniels> oh, Marge, right :)
<daniels> not webservice, heh
<bentiss> yeah webservice was starting but timing out a lot, and then kubelet restarted it every 30 secs
<daniels> hm, not answering to /ready ?
<bentiss> at least, we upgraded postgres and gitlab, so that's still positive :)
<daniels> yeah :)
<bentiss> I extended the timeout from 2 to 20 secs, and it was still randomly shooting at pods
<bentiss> though 20s was terrible
<daniels> also, I wonder, instead of doing pg_dump to pull the db content, would it be safe to snapshot the PV?
<bentiss> we can not upgrade the db this way
<daniels> yeah, not across major versions
<bentiss> between major postgres, we need to reinstall all of the data
<daniels> right
<daniels> I meant outside of the major upgrades, like if we want to try db split tests again
<bentiss> that's weird, I don't see the postgres errors I was having after the db split
<daniels> snapshot PV -> provision new PV with snapshot -> test migration from there
<bentiss> yeah
<daniels> might save a few hours relative to pg_dump/pg_restore?
<bentiss> we can also rely on the latest daily backup
<bentiss> still takes time to provision though
<bentiss> anyway, something for next time
<bentiss> ... and banner is gone
<DragoonAethis> \o/
<karolherbst> nice
<daniels> ack
<daniels> thanks!
<bl4ckb0ne> thanks o/
<karolherbst> we celebrate by DDOSing gitlab through everybody pushing the work they've done since Monday 🙃
* DragoonAethis flips 10 heavy CI jobs back on
<bentiss> go ahead!
<karolherbst> oh no
<karolherbst> it's already slowing down
* bentiss doesn't think so
<karolherbst> mhh, yeah, seems to be still quicker than before actually
teemperor has quit [Remote host closed the connection]
AbleBacon has joined #freedesktop
bmodem has joined #freedesktop
lsd|2 has quit [Quit: KVIrc 5.2.2 Quasar http://www.kvirc.net/]
<nirbheek_> Is it safe to use gitlab yet? Can I merge MRs?
<daniels> yep
privacy has joined #freedesktop
<mupuf> bentiss: thank you so much! Sorry I could not be of any help, but baby deliveries are notoriously unpredictable!
<bentiss> mupuf: hey! congrats!
<dwfreed> bentiss: pg_upgrade claims to work with as far back as 9.2
<dwfreed> in 16
<bentiss> dwfreed: last time I checked, this wasn't the recommended option by gitlab. They might mention it now
<dwfreed> definitely mentions pg_upgrade
<dwfreed> they even used it themselves in 2020 when upgrading gitlab.com's postgres
<bentiss> one benefit of backup/restore is that it cleans the db a lot and it shrinks the overall size
<bentiss> not sure we've got a vacuum enabled and running properly
<bentiss> anyway if anybody feels like they want to have a look, we welcome new admins :)
<dwfreed> I know nothing about gitlab
<dwfreed> Not sure I want to learn, either
<daniels> it's a lab for git. any questions?
bmodem has quit [Ping timeout: 480 seconds]
<DodoGTA> So is the GitLab database currently split (or not)?
<daniels> it's not currently split
AbleBacon has quit [Read error: Connection reset by peer]
alice has joined #freedesktop
dcunit3d has quit [Quit: Quitted]
dcunit3d has joined #freedesktop
dcunit3d has quit [Remote host closed the connection]
alice_ has joined #freedesktop
alice has quit [Ping timeout: 480 seconds]
ximion has joined #freedesktop
dcunit3d has joined #freedesktop
pboushy has joined #freedesktop
lsd|2 has joined #freedesktop
pboushy has quit [Remote host closed the connection]
pboushy has joined #freedesktop
<kusma> It seems to me like gitlab no longer sends out email notification, could that be?
<daniels> no
<daniels> I'm still getting them
<kusma> Hmm, strange. I got one at 16:38 CEST, and then silence. And going through the activity, I see new MRs that I should have been notified about...
<daniels> I've got them as recently as a couple of minutes ago
<daniels> it is very possible they're not getting delivered to _you_, but yeah, gmail is pretty capricious
<kusma> Hmm. Seems a subscription I thought I had was missing.
<kusma> That seems to pre-date the upgrade, so probably PEBCAK
pboushy has quit [Remote host closed the connection]
pboushy has joined #freedesktop
pboushy has quit [Remote host closed the connection]
mripard has quit [Remote host closed the connection]
konstantin_ has joined #freedesktop
konstantin is now known as Guest8614
konstantin_ is now known as konstantin
Guest8614 has quit [Ping timeout: 480 seconds]
Haaninjo has joined #freedesktop
alanc has quit [Remote host closed the connection]
alanc has joined #freedesktop
alice_ is now known as alice
mvlad has quit [Remote host closed the connection]
AbleBacon has joined #freedesktop
guludo has quit [Ping timeout: 480 seconds]
guludo has joined #freedesktop
vkareh has quit [Quit: WeeChat 4.3.0]
scrumplex has joined #freedesktop
scrumplex_ has quit [Ping timeout: 480 seconds]
lsd|2 has quit [Quit: KVIrc 5.2.2 Quasar http://www.kvirc.net/]
valida-69[m] has quit []
Haaninjo has quit [Quit: Ex-Chat]
M839ty9[m] has quit []
MatrixTravelerbot[m] has quit []
Guest8258 has quit []
LaughingMan[m] has quit []
jjardon[m] has quit [Quit: Client limit exceeded: 20000]
anomalous_creator[m] has quit [Quit: Client limit exceeded: 20000]
dcbaker has quit []
hch12907 has quit []