ChanServ changed the topic of #freedesktop to: https://www.freedesktop.org infrastructure and online services || for questions about freedesktop.org projects, please see each project's contact || for discussions about specifications, please use https://gitlab.freedesktop.org/xdg or xdg@lists.freedesktop.org
oeai has joined #freedesktop
ybogdano has quit [Ping timeout: 480 seconds]
Seirdy has quit []
Seirdy has joined #freedesktop
apteryx_ has quit [Ping timeout: 480 seconds]
apteryx_ has joined #freedesktop
apteryx_ has quit [Ping timeout: 480 seconds]
r_elop has quit [Remote host closed the connection]
oeai has left #freedesktop [#freedesktop]
apteryx_ has joined #freedesktop
ximion has quit []
apteryx_ has quit [Ping timeout: 480 seconds]
danvet has joined #freedesktop
alanc has quit [Remote host closed the connection]
alanc has joined #freedesktop
<bentiss> daniels: yeah, we can probably ditch that one.
GNUmoon has joined #freedesktop
GNUmoon has quit [Ping timeout: 480 seconds]
GNUmoon has joined #freedesktop
<bentiss> daniels: I removed the PV and then the rbd on ceph, and it was empty apparently :)
<bentiss> anyway, about to restart large-6, hopefully this will be transparent
<hakzsam> CI windows failure it seems here https://gitlab.freedesktop.org/mesa/mesa/-/jobs/16774019
<hakzsam> is it a known issue?
alatiera has quit [Remote host closed the connection]
alatiera has joined #freedesktop
hikiko_ has joined #freedesktop
<daniels> alatiera: ^ windows needs some dockering again
hikiko has quit [Ping timeout: 480 seconds]
<alatiera> ack, will kick it
<daniels> thanks!
<alatiera> hakzsam daniels, kicked, please try again
<bentiss> daniels: so AFAICT, the issue with osd.26 is that there is too much fragmentation on the disk and bluefs can not even start
<alatiera> "Failed to load container mount foo, mount does not exist" please don't be NUL files again
<bentiss> it seems we can start the compaction on the disk, but for that, we need the pod to be up, and given that it fails at boot, we can not do much
hikiko_ has quit [Remote host closed the connection]
<bentiss> so my next plan is to remove the disk from the cluster, format it and then re-insert it
<bentiss> which will likely take some time to do if I want to not break the other disks :)
<hakzsam> looks fixed, thanks!
hikiko has joined #freedesktop
<daniels> bentiss: yeah, 'fragmentation 1'
<daniels> bentiss: that sounds totally reasonable, however I have nfi how to do that without causing damage :P
<daniels> bentiss: is there anything I can do?
jarthur has quit [Quit: Textual IRC Client: www.textualapp.com]
ximion has joined #freedesktop
ximion has quit []
ttancos[m] has joined #freedesktop
<bentiss> daniels: sorry, was out hunting for lunch. and now I need to eat it. I have a vague idea on how to do it and a crash test cluster locally I can use to see if that works, so I should be fine
<daniels> bentiss: sure thing, bon appetit :)
alyssa has left #freedesktop [#freedesktop]
ngcortes has quit [Read error: Connection reset by peer]
ppascher has quit [Ping timeout: 480 seconds]
ppascher has joined #freedesktop
<alatiera> jenatali hey, did you end up figuring out how to copy the dumps over ssh? do you need any help with the windows issue?
<bentiss> daniels: still waiting for my local cluster to spin up from dust (looks like something is wrong there), but meanwhile -> https://rook.io/docs/rook/v1.6/ceph-osd-mgmt.html#remove-an-osd
<bentiss> there is an official documentation!
<daniels> oh nice!
thaller has quit [Remote host closed the connection]
reillybrogan_ has joined #freedesktop
rg3igalia_ has joined #freedesktop
jcline_ has joined #freedesktop
Ford_Prefect_ has joined #freedesktop
seanpaul_ has joined #freedesktop
Ford_Prefect has quit [Ping timeout: 480 seconds]
demarchi has quit [Ping timeout: 480 seconds]
reillybrogan has quit [Ping timeout: 480 seconds]
Ford_Prefect_ has quit []
jcline has quit [Ping timeout: 480 seconds]
rg3igalia has quit [Ping timeout: 480 seconds]
seanpaul has quit [Ping timeout: 480 seconds]
demarchi has joined #freedesktop
Ford_Prefect has joined #freedesktop
<jenatali> alatiera: no, I gave up for now
<bentiss> daniels: so I followed the docs, and we have a new disk available now. It is *very* slowly rebalancing the cluster
Haaninjo has joined #freedesktop
<alatiera> jenatali anything I could help with? don't recall where we left it off last time
<alatiera> I think I was looking at having a webdav share between the vm and the host, but something like scp would be easier if we got it to work I guess
<jenatali> alatiera: I just couldn't figure out how to transfer a file from that machine, since I was connected over VNC and not RDP, the Windows machine wasn't running an SSH server, it doesn't have a web browser or file explorer, etc
<alatiera> jenatali thanks, I will look at getting ssh to work then in the next couple days
<jenatali> alatiera: I think I managed to start the built-in Windows SSH server, but couldn't figure out how to poke it through the Linux host so that I could connect to it
<alatiera> needing ssh to forward an ssh port! sysadmins beware!
<__tim> can't you do an outgoing ssh connection? (with a temp key/account)
<alatiera> atm we would need to fwd the port through ssh and then ssh to the new port I think?
<alatiera> I imagine it's doable the same way vnc works, but haven't tried it, nor do I know how windows ssh clients would handle it
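The "forward the port through ssh, then ssh to the new port" idea might look roughly like the sketch below. It only prints the two commands rather than running them. Every name in it (`admin@linux-host.example`, `windows-vm`, port `2222`, `ciuser`, `dump.dmp`) is a hypothetical placeholder, not something from the channel, and it assumes the Linux host can reach the VM's SSH server on port 22.

```shell
# Sketch only: all host names, users, ports and file names are hypothetical.
# Prints (does not run) the two commands for a local port-forward setup.
ssh_forward_plan() {
    jump_host="$1"   # Linux host that can reach the Windows VM
    vm_addr="$2"     # VM address as seen from the Linux host
    local_port="$3"  # free port on the workstation
    vm_user="$4"     # account on the Windows VM
    # Step 1: forward local_port to the VM's sshd (port 22) via the Linux host.
    echo "ssh -N -L ${local_port}:${vm_addr}:22 ${jump_host}"
    # Step 2: copy a file back through the forwarded port (scp takes -P for the port).
    echo "scp -P ${local_port} ${vm_user}@localhost:dump.dmp ."
}

ssh_forward_plan admin@linux-host.example windows-vm 2222 ciuser
```

With a recent OpenSSH client, `ssh -J` (ProxyJump) may collapse both steps into one, again assuming the VM is reachable from the Linux host.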
<daniels> bentiss: :o nice!
___nick___ has joined #freedesktop
<bentiss> daniels: I am now checking which rbd blocks are not used and removing them one by one
<bentiss> there were a couple of blocks that are using quite some space for nothing
<bentiss> when we delete the PV, it doesn't delete the data on ceph and we have a lot of leftovers
<bentiss> daniels: FWIW (and for me later), in the toolbox rook-ceph pod: for i in $(rbd ls replicapool-ssd) ; do rbd -p replicapool-ssd du $i; rbd -p replicapool-ssd status $i; done
<bentiss> this gives the disk usages and the watchers (mounts) of each rbd block
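For later reference, the one-liner above can be wrapped in a small function. This is a sketch that uses only the `rbd ls`, `rbd du` and `rbd status` calls from the message; the function name `report_pool` and the `==` separator line are additions for readability. It is meant to be run from the rook-ceph toolbox pod.

```shell
# Sketch: report disk usage and watchers for every RBD image in a pool.
# report_pool is a hypothetical helper name; the rbd calls are the ones quoted above.
report_pool() {
    pool="$1"
    # Relies on word splitting of the image list, as the original one-liner does.
    for img in $(rbd ls "$pool"); do
        echo "== $img"
        rbd -p "$pool" du "$img"       # provisioned vs. used space
        rbd -p "$pool" status "$img"   # watchers, i.e. who has the image mapped
    done
}

# Usage, in the toolbox pod:
#   report_pool replicapool-ssd
```

An image with a large `du` figure and no watchers is a candidate leftover, but it should still be checked against the existing PVs before removal.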
<daniels> that's awesome, and I guess we could probably also find out by looking at the PV before nuking it?
<daniels> (I nuked all the Sentry PVCs last night to give us more headroom btw)
<bentiss> daniels: so now, with the removal of all the bread crumbs, and fstrim on all nodes, we went from 290 GB of free space to 472 GB
<bentiss> daniels: yes, describe on the PV gives you the csi volume
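A possible non-interactive variant of "describe on the PV": the sketch below reads the CSI volume handle straight from the PV spec. The helper name `pv_volume_handle` and the PV name in the usage line are hypothetical. How the handle maps to an rbd image name depends on the CSI driver and is not shown here.

```shell
# Sketch: print the CSI volume handle recorded on a PersistentVolume.
# pv_volume_handle is a hypothetical helper name.
pv_volume_handle() {
    kubectl get pv "$1" -o jsonpath='{.spec.csi.volumeHandle}'
}

# Usage (PV name is a placeholder):
#   pv_volume_handle pvc-0000
```

Recording the handle before deleting a PV gives something to match against the pool listing afterwards.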
<bentiss> we had 2 disks almost full with ~70 GB of data, probably some left overs of elasticsearch
<bentiss> the next worrying part is the disk fragmentation
<daniels> it doesn't do online defrag?
<bentiss> I am not sure
<bentiss> I saw that we can force the defrag and at some point I took a random disk and got a fragmentation of 0.8
<bentiss> daniels: so in a OSD pod -> `CEPH_ARGS='' ceph daemon osd.$ROOK_OSD_ID bluestore allocator fragmentation block` gives the fragmentation level
<daniels> yeah, plus it's logged at OSD pod startup
<bentiss> actually it's not too bad now, only one OSD at 0.1, all the other below
<bentiss> anyway, enough of fdo admin for me today, I think we are in a better shape than yesterday now ;)
<daniels> just a bit! thankyou again
<bentiss> no worries, thanks for your effort too
thaller has joined #freedesktop
ybogdano has joined #freedesktop
ybogdano has quit [Remote host closed the connection]
thaller has quit [Read error: Connection reset by peer]
thaller has joined #freedesktop
ybogdano has joined #freedesktop
ybogdano has quit [Ping timeout: 480 seconds]
ybogdano has joined #freedesktop
___nick___ has quit []
___nick___ has joined #freedesktop
___nick___ has quit []
___nick___ has joined #freedesktop
ybogdano has quit [Ping timeout: 480 seconds]
ximion has joined #freedesktop
___nick___ has quit [Ping timeout: 480 seconds]
kgrygo has joined #freedesktop
reillybrogan has joined #freedesktop
reillybrogan_ has quit [Ping timeout: 480 seconds]
GNUmoon has quit [Ping timeout: 480 seconds]
ngcortes has joined #freedesktop
GNUmoon has joined #freedesktop
danvet has quit [Ping timeout: 480 seconds]
ybogdano has joined #freedesktop
jcline_ has left #freedesktop [#freedesktop]
jstein_ has joined #freedesktop
jstein_ has quit []
Haaninjo has quit [Quit: Ex-Chat]