ChanServ changed the topic of #freedesktop to: https://www.freedesktop.org infrastructure and online services || for questions about freedesktop.org projects, please see each project's contact || for discussions about specifications, please use https://gitlab.freedesktop.org/xdg or xdg@lists.freedesktop.org
Hooloovoo has quit [Ping timeout: 480 seconds]
jnoorman has quit [Read error: Network is unreachable]
linyaa has quit [Read error: Network is unreachable]
vignesh has quit [Read error: Network is unreachable]
markco has quit [Read error: Network is unreachable]
zmike has quit [Read error: Network is unreachable]
zmike has joined #freedesktop
kj2 has quit [Read error: Network is unreachable]
pendingchaos has quit [Read error: Network is unreachable]
samuelig has quit [Read error: Network is unreachable]
austriancoder has quit [Read error: Network is unreachable]
rg3igalia has quit [Read error: Network is unreachable]
pendingchaos has joined #freedesktop
austriancoder has joined #freedesktop
samuelig has joined #freedesktop
rg3igalia has joined #freedesktop
jnoorman has joined #freedesktop
linyaa has joined #freedesktop
markco has joined #freedesktop
cascardo_ has quit []
cascardo has joined #freedesktop
vignesh has joined #freedesktop
bwidawsk has quit [Read error: Network is unreachable]
bwidawsk has joined #freedesktop
kj2 has joined #freedesktop
balrog_ has joined #freedesktop
balrog has quit [Read error: Connection reset by peer]
Hooloovoo has joined #freedesktop
guludo has quit [Ping timeout: 480 seconds]
pjakobsson_ has joined #freedesktop
pjakobsson has quit [Ping timeout: 480 seconds]
m5zs7k has quit [Ping timeout: 480 seconds]
m5zs7k has joined #freedesktop
scrumplex has joined #freedesktop
scrumplex_ has quit [Ping timeout: 480 seconds]
jarthur has joined #freedesktop
sewn has joined #freedesktop
ximion has quit [Remote host closed the connection]
pjakobsson has joined #freedesktop
pjakobsson_ has quit [Ping timeout: 480 seconds]
jsa1 has joined #freedesktop
jarthur has quit [Ping timeout: 480 seconds]
jsa1 has quit [Ping timeout: 480 seconds]
haaninjo has joined #freedesktop
haaninjo has quit [Remote host closed the connection]
noodlez1232 has quit [Remote host closed the connection]
noodlez1232 has joined #freedesktop
blu has joined #freedesktop
swatish2 has joined #freedesktop
krastevm has quit [Ping timeout: 480 seconds]
swatish21 has joined #freedesktop
swatish2 has quit [Ping timeout: 480 seconds]
tzimmermann has joined #freedesktop
AbleBacon has quit [Read error: Connection reset by peer]
swatish2 has joined #freedesktop
jsa1 has joined #freedesktop
swatish21 has quit [Ping timeout: 480 seconds]
jsa1 has left #freedesktop [#freedesktop]
jsa1 has joined #freedesktop
swatish2 has quit [Ping timeout: 480 seconds]
swatish2 has joined #freedesktop
sima has joined #freedesktop
sghuge has quit [Remote host closed the connection]
sghuge has joined #freedesktop
dcunit3d has quit [Quit: No Ping reply in 180 seconds.]
dcunit3d has joined #freedesktop
mripard has joined #freedesktop
krastevm has joined #freedesktop
blu has quit [Ping timeout: 480 seconds]
blu has joined #freedesktop
martink has joined #freedesktop
krastevm has quit [Ping timeout: 480 seconds]
blu has quit [Ping timeout: 480 seconds]
<bentiss> sigh, one OSD (disk) is full and ceph decided to stop all operations
<kode54> :/
<soreau> booo
<bentiss> the weird part is ceph should have rebalanced the cluster way before, as other disks have some space
<kode54> maybe the rebalance wasn't queued up yet
<kode54> would be nice if those were queued by disk fill reaching a threshold instead of just on timers
<bentiss> nah, it's been filling this one for days
<kode54> oh
<kode54> quality software :[
<kode54> then again, for pro software, not much else you can do without losing a lot of functionality
<kode54> gitlab is really without any competition for professional self hosting
<bentiss> that's part of why I don't want to deal with ceph anymore
<kode54> gitea and forgejo are nowhere near the feature parity
krastevm has joined #freedesktop
<soreau> needs 'cloud accessible' disks and runners, then run everything on a capable server, like an rpi5 ;)
<kode54> haha, no
<kode54> maybe forgejo is cloud capable, but it lacks a lot of the professional features present in gitlab
<kode54> and it needs the CI functionality
<kode54> I don't know if forgejo is capable of plugging in the level and diversity of build bots that FDO uses
<kode54> considering I think forgejo is way lighter on a capable server than gitlab is
<soreau> raid nfs :P
<kode54> the problem was it was focusing a bunch of data on a single node of storage
<kode54> ceph being a bit weird
<kode54> technically it should have been balancing that across all the nodes
<bentiss> it should have, but depending on the topology and the data on disk, sometimes it doesn't find a good solution and needs a little push
<soreau> almost sounds like some other authoritative agencies - 'due to our error, you have been penalized'
<bentiss> (though it must be at least 1 year since I last did that)
<kode54> soreau: think of ceph like S3 or R2, only self hosted
<kode54> it should have been fully capable of distributing objects across the entire array instead of plopping them all on one node
<kode54> it could be a nfs of a bunch of raids if you like
<kode54> and it will locate the objects on their correct nodes when retrieving them, or even duplicate them for load balancing
<kode54> I think?
<kode54> sounds like something it would do
martink has quit [Ping timeout: 480 seconds]
<kode54> but spamming a bunch of data all on one node? that sounds like it's being fiddly
<soreau> every company that has folks committing to gitlab should split the bill on something nice
<mupuf> welcome back, gitlab!
<mupuf> thanks bentiss
<bentiss> mupuf: only temporary, as I assume the cluster might stop soon :(
<bentiss> (I just pushed the full boundary a bit so the recovery starts)
<kode54> ah
<kode54> you only brought it back
<kode54> maybe force a rebalance?
<kode54> god I hope there's logging that can show why it's focusing all the writes to one node
<kode54> unless it's like one huge object
<kode54> that's just continuously growing in size
<kode54> can't really rebalance a single object, unless ceph supports that?
<bentiss> well, I managed to reweight that OSD and it started the recovery
<bentiss> usually yeah, there are a lot of big objects on an OSD and this screws up the balance of the whole cluster
<bentiss> but also, TBH, we are at 98% usage of the ssd pool. So that migration is not so bad in the end, we'll be able to breathe a little bit more
<bentiss> and we are using too much of the SSDs because the db is constantly growing, and so are the git trees :/
<kode54> having lots of copies of the kernel doesn't help that
guludo has joined #freedesktop
<kode54> the mailing list paradigm is really "great" for such huge trees in keeping much of the data client side
<kode54> but mailing list paradigm sucks so much
* bentiss found the magic command: `ceph osd reweight-by-utilization`
<bentiss> I was manually assigning reweight, when this does it automatically :)
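[Editor's note: a rough sketch of the idea behind `ceph osd reweight-by-utilization`, for readers unfamiliar with it. This is not Ceph's actual code. The function name, the input shape, and the example numbers are hypothetical; the `threshold` and `max_change` defaults (120% of average utilization, 0.05 per run) are assumed to match Ceph's documented defaults and should be checked against your version. The sketch lowers the reweight of any OSD whose utilization is well above the cluster average, in small capped steps.]

```python
def reweight_by_utilization(osds, threshold=1.20, max_change=0.05):
    """Simplified illustration, not Ceph's implementation.

    osds maps a (hypothetical) OSD name to a tuple of
    (utilization as a fraction 0..1, current reweight 0..1).
    Returns the new reweight for each OSD that would be adjusted.
    """
    if not osds:
        return {}
    average = sum(util for util, _ in osds.values()) / len(osds)
    new_weights = {}
    for name, (util, weight) in osds.items():
        # Only touch OSDs that are clearly fuller than the average.
        if util > average * threshold:
            # Scale the weight down in proportion to the overload...
            target = weight * average / util
            # ...but never by more than max_change in a single run.
            new_weights[name] = round(max(target, weight - max_change), 4)
    return new_weights


# Made-up numbers: one OSD at 95%, the others around 55-60%.
cluster = {"osd.0": (0.95, 1.0), "osd.1": (0.60, 1.0), "osd.2": (0.55, 1.0)}
print(reweight_by_utilization(cluster))
```

[Because each run changes a weight only slightly, the real command usually has to be repeated until the cluster settles. Ceph also offers `ceph osd test-reweight-by-utilization` as a dry run that reports what would change without applying it.]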
<soreau> script it :P
<bentiss> heh
<bentiss> though arguably, why on earth ceph doesn't do that automatically (or the regular balancing doesn't do the equivalent)
<bentiss> FWIW, we should be good now, max disk usage is 89%, so far from the 95% deadlock
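[Editor's note: a minimal sketch of the kind of check that could flag a filling OSD before it reaches the 95% mark mentioned above. The function name, OSD names, and usage figures are hypothetical. The 95% "full" limit comes from the conversation; the 85% "nearfull" warning level is an assumption based on Ceph's usual default and may differ on a given cluster.]

```python
def classify_osds(usage, nearfull=0.85, full=0.95):
    """Label each OSD by how close it is to the full limit.

    usage maps a (hypothetical) OSD name to its used fraction, 0..1.
    nearfull is an assumed warning level; full is the 95% limit
    at which the cluster stopped accepting operations.
    """
    status = {}
    for name, fraction in usage.items():
        if fraction >= full:
            status[name] = "full"
        elif fraction >= nearfull:
            status[name] = "nearfull"
        else:
            status[name] = "ok"
    return status


# Made-up figures: 89% is below the 95% limit but above the warning level.
print(classify_osds({"osd.0": 0.89, "osd.1": 0.96, "osd.2": 0.50}))
```

[With these assumed thresholds, the 89% maximum reported above would still count as "nearfull", so it is safe from the deadlock but worth watching.]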
<kode54> perfect timing, considering the Equinix thing
<bentiss> yeah...
<daniels> bentiss: ah thanks for fixing, sorry I was still out at pilates
<bentiss> daniels: no worries
<bentiss> luckily I caught it before people started screaming too much
<daniels> I was screaming too, but only because my hamstrings are )(*@#$
<bentiss> heh
swatish2 has quit [Ping timeout: 480 seconds]
swatish21 has joined #freedesktop
m5zs7k has quit [Ping timeout: 480 seconds]
m5zs7k has joined #freedesktop
swatish2 has joined #freedesktop
<bentiss> \o/ HEALTH_OK on the cluster :)
swatish21 has quit [Ping timeout: 480 seconds]
swatish2 has quit [Ping timeout: 480 seconds]
guludo has quit [Ping timeout: 480 seconds]
guludo has joined #freedesktop
jkhsjdhjs_ has joined #freedesktop
jkhsjdhjs has quit [Ping timeout: 480 seconds]
swatish2 has joined #freedesktop
todi1 has quit []
todi has joined #freedesktop
jsa1 has quit [Ping timeout: 480 seconds]
swatish2 has quit [Ping timeout: 480 seconds]
swatish2 has joined #freedesktop
codegirl has quit [Quit: Ping timeout (120 seconds)]
codegirl has joined #freedesktop
blu has joined #freedesktop
krastevm has quit [Ping timeout: 480 seconds]
jsa1 has joined #freedesktop
guludo has quit [Ping timeout: 480 seconds]
swatish2 has quit [Ping timeout: 480 seconds]
haaninjo has joined #freedesktop
guludo has joined #freedesktop
JerryXiao has quit [Quit: Bye]
JerryXiao has joined #freedesktop
codegirl has quit [Quit: Ping timeout (120 seconds)]
codegirl has joined #freedesktop
jarthur has joined #freedesktop
digetx has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
tzimmermann has quit [Quit: Leaving]
<emersion> bentiss: FWIW we already have a new server for MLs at PSU, but migrating is difficult
<emersion> i think i'm mostly there, just need to figure out a good way to expose the HTTP interface
<bentiss> emersion: oh yeah, that rings a bell now. Anyway, if we can have people donate HW that can surely be a good thing for adding "cheap" runners
<emersion> yeah
digetx has joined #freedesktop
f_ is now known as funderscore
funderscore is now known as f_
jsa1 has quit [Ping timeout: 480 seconds]
AbleBacon has joined #freedesktop
alanc has quit [Remote host closed the connection]
alanc has joined #freedesktop
jsa1 has joined #freedesktop
swatish2 has joined #freedesktop
ximion has joined #freedesktop
jsa1 has quit [Ping timeout: 480 seconds]
swatish2 has quit [Ping timeout: 480 seconds]
frompeacefulplanet has joined #freedesktop
frompeacefulplanet has left #freedesktop [#freedesktop]
sima has quit [Ping timeout: 480 seconds]
jsa1 has joined #freedesktop
digetx has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
digetx has joined #freedesktop
guludo has quit [Ping timeout: 480 seconds]
jsa1 has quit [Ping timeout: 480 seconds]
haaninjo has quit [Quit: Ex-Chat]
jarthur has quit [Ping timeout: 480 seconds]