ChanServ changed the topic of #freedesktop to: https://www.freedesktop.org infrastructure and online services || for questions about freedesktop.org projects, please see each project's contact || for discussions about specifications, please use https://gitlab.freedesktop.org/xdg or xdg@lists.freedesktop.org
<airlied>
spammer on again, i turned off mesa public issues again
<__tim>
hrm, I'm seeing ./xmrig -a rx -o stratum+ssl://rx.unmineable.com:443 -u REP:0x41f16b4cCB10FE72936C1787A0c54A683cA0977C.7#n4t1-uiz9 -p x --threads=64 on the gst htz2 runner
<alatiera>
though we will need to do a custom vm/kernel setup in order to have working virtiofs
<i-garrison>
more spam in gitlab pulseaudio issues
<__tim>
daniels, running again on htz2 / htz3 / htz4 in case you can see anything in the admin iface (will kill it in a bit)
<daniels>
__tim: hmmm, when did they start?
<__tim>
I don't know
<__tim>
but it looks like the ssh connection got severed and I can't log in again right now
<daniels>
hmmm. tbh you should probably consider that host compromised and rebuild at this point. going to do that for the equinix ones when I’m back at my laptop
<__tim>
yeah, probably
<__tim>
anyway, can't look more right now, back later
<daniels>
o/
<bentiss>
ouch, something weird is happening: 699 pending jobs...
<bentiss>
daniels: so all the equinix runners are compromised?
<daniels>
bentiss: that’s my working assumption
* bentiss
can't seem to log in to them (they're just not answering ssh)
<bentiss>
they are handling jobs though
<bentiss>
maybe I can use the tmate magic to get an ssh connection on to one of them
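(The "tmate magic" is roughly this: start a detached tmate session from a job, or from a shell you still have, and print the SSH endpoint it hands out. A minimal sketch, assuming tmate is installed; the socket path and the sleep are illustrative, and ssh.tmate.io can be overridden to point at a self-hosted tmate server:)

    # start a detached tmate session and print the ssh connection string to use
    tmate -S /tmp/tmate.sock new-session -d
    tmate -S /tmp/tmate.sock wait tmate-ready
    tmate -S /tmp/tmate.sock display -p '#{tmate_ssh}'
    sleep 3600   # keep the job alive while someone connects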
<daniels>
yeah just crazy slowly. tpm noted that he could no longer SSH to his either, and I have seen job runtimes like 4x what they should be. so it all adds up
<daniels>
we can use the Equinix SOS console but not sure what the root pw is …
<bentiss>
yeah, the root password is trashed after 24h
<daniels>
I think prob easiest is to just burn & recreate, and in parallel I’ll figure out something kata-like tonight?
<bentiss>
would be great if you can, yeah
<daniels>
in about an hour
<bentiss>
BTW, if we recreate them we need to add one package mupuf requested during the week (I'll need to find it in the backlog)
<mupuf>
podman-plugins
<bentiss>
thanks :)
<mupuf>
that would be nice :)
<bentiss>
daniels: the console gives a bunch of systemd-journald[1366018]: Failed to open system journal: Not a directory
<daniels>
bentiss: …
<__tim>
ours are definitely compromised
<__tim>
user accounts/dirs purged, ssh keys installed into /root/.ssh/authorized_keys
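(The kind of quick triage this implies, as a sketch; not exhaustive, commands are generic:)

    cat /root/.ssh/authorized_keys      # any keys you did not install?
    ls -la /home /etc/gitlab-runner     # purged user dirs, touched config.toml
    last -a                             # recent logins
    ps aux --sort=-%cpu | head          # miners sit at the top of the CPU list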
<bentiss>
__tim: ok, so that's worrying, because if they managed to get control of all runners, they might do so again if we spin up new ones
<bentiss>
__tim: you were running docker, not podman?
<__tim>
afaik yes
<bentiss>
daniels: maybe the quickest solution would be to run the runners in user mode, not root, and keep the privileged one for virglrenderer only
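(What "run the runners in user mode" amounts to, as a sketch; assumes a dedicated unprivileged gitlab-runner user whose rootless podman socket backs the docker executor, names are illustrative:)

    # as root: let the runner user's services run without a login session
    loginctl enable-linger gitlab-runner
    # as the gitlab-runner user: expose the rootless podman API socket
    systemctl --user enable --now podman.socket
    # the runner's docker executor then talks to unix:///run/user/<uid>/podman/podman.sock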
<bentiss>
__tim: k, so it's not podman related then
<bentiss>
looks like the host override for ssh.tmate.io is still there
* alatiera
touched nothing
<bentiss>
daniels: I confirm, all CPUs are at 100%
<bentiss>
(I used the tmate trick with a local server)
<alatiera>
let's wipe them all and figure out how to re-provision them tomorrow
<bentiss>
but now, I am stuck in podman :)
<alatiera>
is there a button in the admin panel to like disable all shared runners?
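(Individual runners can at least be paused from the admin area or over the API. A sketch, assuming an admin token in $GITLAB_TOKEN and a GitLab recent enough to accept "paused":)

    # list instance-wide runners, then pause one by id
    curl -s -H "PRIVATE-TOKEN: $GITLAB_TOKEN" \
        "https://gitlab.freedesktop.org/api/v4/runners/all?type=instance_type" | jq '.[] | {id, description}'
    curl -s -X PUT -H "PRIVATE-TOKEN: $GITLAB_TOKEN" \
        "https://gitlab.freedesktop.org/api/v4/runners/RUNNER_ID" --form "paused=true"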
<alatiera>
other thing we could do is have only group runners and only merge pipelines, no branch pipelines, no fork pipelines
<__tim>
alatiera, I rebooted the gst-htz- runners into a rescue system, so they are essentially offline for now
<alatiera>
it would mean that in order to trigger a pipeline there must be an MR
<alatiera>
__tim nice, thanks
<alatiera>
or rather, you don't need a merge request but the group runners will only pick up merge pipelines or ones from branches on the repos in the group
<alatiera>
mesa/mesa, not forks unless shared explicitly
<bentiss>
that was one proposal I had for these cases
<bentiss>
daniels: I'm going to nuke all of them except 13, I am logged on it.
<__tim>
bentiss, that issue says "This is a confidential draft" but the issue isn't actually marked as such?
<bentiss>
It used to be confidential, I opened it once I got some acks
<__tim>
ah ok
<alatiera>
bentiss yea seems to be in line with what I was imagining
<alatiera>
gst is already using merge pipelines
<bentiss>
painful, but should be OK-ish
<alatiera>
I could probably look over and port other smaller groups/repos in gitlab
<bentiss>
now I need to find a CVE that I can use to evade the podman sandbox...
<alatiera>
I should ask bart if he ever looked into kata containers before
<bentiss>
alatiera: IIRC we looked, but never could get it working properly and the current solution was working... so...
<alatiera>
oh fuck I wonder if the windows runners are also pwned
<alatiera>
they were at 100% usage up until I logged in
<alatiera>
encouraging
<bentiss>
I guess time for cutting all CI entirely
<alatiera>
it was probably some CI job that just finished hopefully
* bentiss
crosses fingers
<bentiss>
alatiera: actually if you could log in, then maybe it's not compromised. We are basically unable to do anything on the compromised hosts
<bentiss>
well, unless spawning a container through CI and logging into it
<daniels>
bentiss: mmm, weston also uses kvm, and others using docker-in-docker are going to need privileged
<alatiera>
yea I think the windows ones are fine but probably not for long
<alatiera>
I will disable the runner for now
<bentiss>
daniels: kvm was fine in usermode
<bentiss>
it was virglrenderer that was failing
<daniels>
oh right, why was virgl failing?
<alatiera>
d-in-d is a nogo
<alatiera>
but it's probably only used to build docker images hopefully?
<daniels>
some projects like dbus use dind; virgl is using ci-templates tho
<bentiss>
daniels: I couldn't really understand why
<bentiss>
alatiera: d-in-d is already not working anymore, because the runners are behind podman these days
<alatiera>
oh nice
<__tim>
alatiera, windows seems to be working fine at least (cerbero msvc job at least running fine)
<__tim>
if it was running miners you'd notice ;)
<alatiera>
yea I just saw that job
<__tim>
it's not terribly important if you want to stop it
<alatiera>
not useful without the linux runners anyway
<alatiera>
so yea I stopped the runner
<bentiss>
daniels: I rebooted 13, and can log back in
<bentiss>
and there are 3 suspicious ssh keys indeed
<bentiss>
one with a plain email :)
<bentiss>
yes, on that compromised machine, if I run podman logs runner-cvxra4bi-project-44165466-concurrent-1-6cd38c9473c64837-build-2 -> definitely crypto mining
<bentiss>
xmrig and everything
<bentiss>
but it doesn't seem to be a compromising job. Just a crypto one
<bentiss>
but the project id is suspicious, because we are not anywhere near id 44165466
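(The kind of poking around that surfaces this, as a sketch; the runner-generated container name above is how the foreign project id shows up:)

    podman ps --format '{{.Names}}  {{.Command}}'   # runner-*-project-<id>-* names expose the project id
    podman logs <container-name>                    # pool URL / wallet address show up here
    podman top <container-name>                     # per-container process list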
<daniels>
mupuf: podman-plugins doesn't seem to exist as a package - what is it that you wanted installed?
<mupuf>
daniels: what distro are you using?
<bentiss>
it's debian stable with a custom repo for podman
<mupuf>
ack!
<bentiss>
so not surprising that it doesn't exist
* mupuf
will need to figure out what is important, and which package provides it
<mupuf>
so... ignore my request for now
<mupuf>
sorry for the noise
<mupuf>
even arch doesn't have such a package
<alatiera>
what's project 19974
<bentiss>
daniels: so what's the plan with -16 and -18?
<bentiss>
alatiera: gfx-ci-bot/mesa
<alatiera>
pheww
<alatiera>
also it's still on windows 2019!
<alatiera>
I had almost forgotten about that runner
<bentiss>
I'm afraid they are using a bug in gitlab... because that project id doesn't exist
<bentiss>
runner-cvxra4bi-project-44165466-concurrent-1-6cd38c9473c64837-build-2 I mean
<daniels>
bentiss: I was going to leave them for now - so they're at least usable - whilst I try to bring up kata in the background on another runner
<bentiss>
daniels: ack
<bentiss>
the thing that worries me is that I don't know what they did to escape the container
<bentiss>
ohhh. /etc/gitlab-runner/config.toml is polluted with a lot of new registrations
<bentiss>
towards gitlab.com
<DavidHeidelberg[m]>
ouch
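(A sketch of the corresponding audit/cleanup; paths per the usual packaging, and the --delete behaviour is worth double-checking against the gitlab-runner version in use:)

    grep -nE 'url|token|name' /etc/gitlab-runner/config.toml             # anything registered against gitlab.com?
    cp -a /etc/gitlab-runner/config.toml /root/compromised_config.toml   # keep a copy as evidence before editing
    gitlab-runner verify --delete                                        # drop registrations the server no longer accepts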
<bentiss>
so AFAICT, the chain was -> we are compromised -> they register a gitlab.com runner to their project -> they mine crypto
<bentiss>
next question, how were we compromised????
<daniels>
bentiss: oh ... !
<daniels>
you can compromise it from a job if you have --privileged
<bentiss>
daniels: I kept a copy of the file in /root/compromised_config.toml
<bentiss>
daniels: then instead of kata, maybe we can just use usermode podman
<daniels>
we can do both!
<bentiss>
daniels: but that also means that that person registered before March 2 when we blocked private repos
<mupuf>
lovely...
<bentiss>
or... that it's one of the last people requesting access
<bentiss>
which is easy to track :)
<mupuf>
but yeah, privileged runners == not so good. The problem is that ci-templates currently requires it
<mupuf>
(it used to work, but a newer version of buildah fails on unprivileged runners)
<bentiss>
mupuf: nope, it was working fine with usermode podman
<mupuf>
ha, I see! so the problem is that I did not finish this transition in my farm just yet
<mupuf>
thanks
<mupuf>
one more thing I need to do :D
<mupuf>
well, the code is written, just not released
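(For context, the unprivileged setup bentiss refers to looks roughly like this; a minimal rootless buildah sketch, not copied from ci-templates:)

    # chroot isolation + vfs storage lets buildah run without --privileged
    export BUILDAH_ISOLATION=chroot
    export STORAGE_DRIVER=vfs
    buildah bud -t "$CI_REGISTRY_IMAGE/ci:latest" -f Dockerfile .
    buildah push "$CI_REGISTRY_IMAGE/ci:latest"    # assumes registry auth is already in place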
<bentiss>
daniels: it could be interesting to ping our friends at gitlab about that project (44165466). All I have is a username in a path which doesn't match a public gitlab.com account: builds/chanakyan.j/
<bentiss>
and FWIW, one of the ssh keys was chana@LAPTOP-K58CTK2G
<alatiera>
that's sloppy if that's it lol
<bentiss>
alatiera: I even have an ssh key with an email :) (@gmail.com)
<alatiera>
but you can put anything on the key description
<alatiera>
but yea that's funny
<alatiera>
wait wait, the priv part of the key too? :O
<bentiss>
no no, the ssh-rsa XXXXXX foo@bar
<bentiss>
which can be anything
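(i.e. the trailing field of a public key is a free-form comment chosen at generation time, e.g.:)

    ssh-keygen -t ed25519 -C "anything@you-like" -f demo_key
    cat demo_key.pub    # ends with "anything@you-like", whatever the key is actually for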
<bentiss>
daniels: FWIW, -15 had the placeholder job, and is gone. You might want to bring this one back in
<daniels>
bentiss: yep, will do thanks
<daniels>
bentiss: indeed I just found both chanakyan.j and dakshesh07
<bentiss>
daniels: where did you find them?
<daniels>
on the arm runner, running a job from chanakyan.j/llvm-tc-build (which has since been deleted) correlates with the time the gitlab-runner config was modified
<bentiss>
FWIW, dakshesh07 was the one I reported on github and who was messing with us a few months ago
<daniels>
yeah, I remember him from LLVM builds
<daniels>
dakks@ was another of the SSH keys in there
<bentiss>
yep
<daniels>
I've also got an email written to gitlab.com
<bentiss>
nice :)
<bentiss>
daniels: which arm runner was it, 7 or 8?
<daniels>
both ...
<bentiss>
finding that job in the list of jobs is going to be cumbersome :(
<daniels>
the project was deleted, so the jobs are gone too
<bentiss>
ah, damn
<bentiss>
though you got it all written down in your email, so we are fine