ChanServ changed the topic of #freedesktop to: https://www.freedesktop.org infrastructure and online services || for questions about freedesktop.org projects, please see each project's contact || for discussions about specifications, please use https://gitlab.freedesktop.org/xdg or xdg@lists.freedesktop.org
scrumplex has joined #freedesktop
scrumplex_ has quit [Ping timeout: 480 seconds]
scrumplex_ has joined #freedesktop
scrumplex has quit [Ping timeout: 480 seconds]
bozo16 has quit [Remote host closed the connection]
swatish2 has joined #freedesktop
dcunit3d has joined #freedesktop
swatish2 has quit [Ping timeout: 480 seconds]
swatish2 has joined #freedesktop
swatish2 has quit [Ping timeout: 480 seconds]
eluks has quit [Remote host closed the connection]
eluks has joined #freedesktop
ximion has quit [Quit: Detached from the Matrix]
ximion has joined #freedesktop
swatish2 has joined #freedesktop
tlwoerner_ has joined #freedesktop
tlwoerner has quit [Ping timeout: 480 seconds]
tzimmermann has joined #freedesktop
swatish2 has quit [Ping timeout: 480 seconds]
jsa1 has joined #freedesktop
swatish2 has joined #freedesktop
<svuorela> I've gotten a feeling that gitlab has gotten slower ?
<MrCooper> compared to a pre- or post-migration baseline?
nephyrin has quit [Quit: ... besides, it was hot]
DemiMarie is now known as Guest12805
sghuge has quit [Remote host closed the connection]
ximion has quit [Remote host closed the connection]
sghuge has joined #freedesktop
<svuorela> definitely compared to a post-migration baseline, maybe even compared to a pre-migration baseline.
<svuorela> (I'm primarily in poppler if that makes a difference)
swatish2 has quit [Ping timeout: 480 seconds]
nephyrin has joined #freedesktop
swatish2 has joined #freedesktop
sima has joined #freedesktop
AbleBacon has quit [Read error: Connection reset by peer]
swatish21 has joined #freedesktop
MrCooper_ has joined #freedesktop
<bilboed> hm... indeed
MrCooper has quit [Ping timeout: 480 seconds]
swatish2 has quit [Ping timeout: 480 seconds]
jsa1 has quit [Ping timeout: 480 seconds]
<slomo> i think it's still a bit faster than pre-migration, but not as fast as right after migration
jsa1 has joined #freedesktop
<bentiss> one thing that could explain it is the gitaly backups, which I only enabled last Sunday
<bentiss> the daily backup takes 6h, and we are 4h12m in
<bentiss> side note: I've enabled fastly for all *.freedesktop.org pages sites as of this morning. Of course I screwed up the DNS a bit, so if this is not working yet, wait a little longer for the DNS pointing to fastly to get cached properly
<bentiss> (IOW, mesa.freedesktop.org is using fastly, mesa3d.org is not)
overtime69ffcf[m] has joined #freedesktop
<eric_engestrom> bentiss: womp womp... we need to add other tags to fdo runners, otherwise just `priority:low` gets picked by any runner that has that tag, such as... a steamdeck in mupuf's farm: https://gitlab.freedesktop.org/mesa/mesa/-/jobs/73883719
<eric_engestrom> I think we need to have the fdo runners register both the priority tag and an `fdo-runner` tag or something like that, and jobs need to require both
kxkamil2 has quit []
<eric_engestrom> (ci-tron jobs are fine because they always have a tag for the farm they run on, so they don't risk being picked up by fdo runners, it's only the other way around that's a problem right now)
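A minimal sketch of the tag-matching behaviour discussed above, assuming the usual GitLab rule that a runner may pick up a job only when it carries every tag the job lists. The `fdo-runner` tag comes from the proposal; the farm tag name and the helper `runner_can_pick` are hypothetical and only serve the illustration.

```python
def runner_can_pick(job_tags, runner_tags):
    """Return True if the runner carries every tag the job requires."""
    return set(job_tags) <= set(runner_tags)


# Hypothetical tag sets, loosely modelled on the discussion above.
fdo_runner = {"priority:low", "fdo-runner"}
farm_runner = {"priority:low", "example-farm-tag"}  # illustrative farm tag

job_priority_only = {"priority:low"}
job_with_both = {"priority:low", "fdo-runner"}

# A job that only asks for the priority tag can land on either runner.
print(runner_can_pick(job_priority_only, fdo_runner))
print(runner_can_pick(job_priority_only, farm_runner))

# Requiring both tags keeps the job off the farm runner.
print(runner_can_pick(job_with_both, fdo_runner))
print(runner_can_pick(job_with_both, farm_runner))
```

Renaming the farm-side tag, as suggested later in the conversation, would get the same separation without adding a second required tag to every job.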
<bentiss> sigh, the tagging mechanism in gitlab is just shitty
<eric_engestrom> yeah :/
zerozero2 has quit [Ping timeout: 480 seconds]
<eric_engestrom> I think my solution should work though, what do you think?
<bentiss> I just checked, this runner from mupuf is the only one having priority:low (mupuf-gfx10-vangogh-1 and mupuf-gfx10-vangogh-5), so I wonder if we should not address that instead
<bentiss> your solution works, but I feel like that's not the best
MrCooper_ is now known as MrCooper
<bentiss> mupuf: do you use the priority in mupuf-gfx10-vangogh-*?
overtime69ffcf[m] has left #freedesktop [User left]
<eric_engestrom> yeah we use it, but we can rename it to eg. `ci-tron-priority:*`
<eric_engestrom> or `ci-tron:priority:*` to be more in line with our other tags
<eric_engestrom> (I haven't renamed the tag on the farm side, I'll do that when merging this)
<bentiss> thanks!
<mupuf> I proposed an alternative name
swatish21 has quit [Ping timeout: 480 seconds]
samuelig has joined #freedesktop
ybogdano has quit [Remote host closed the connection]
ybogdano has joined #freedesktop
<__tim> we've been seeing loads of "WARNING: Uploading artifacts as "archive" to coordinator... 500 Internal Server Error" since yesterday, is that related to the hetzner S3 problems or something else?
<__tim> (and failed jobs/pipelines as a result)
___nick___ has joined #freedesktop
swatish2 has joined #freedesktop
<mupuf> bentiss:
<mupuf> is fastly still hammering our bandwidth?
<mupuf> I keep getting KVM jobs timing out, seemingly due to slow network, but it could also be insane CPU usage
MrCooper_ has joined #freedesktop
kxkamil has joined #freedesktop
MrCooper has quit [Ping timeout: 480 seconds]
JerryXiao has quit [Quit: Bye]
<bentiss> mupuf: runner-x86-1 is pulling a lot of data from RIPE-ERX-146-75-0-0
<mupuf> bentiss: any idea what this is?
<bentiss> nothing seems abnormal on the runner
<bentiss> android-ndk is running, maybe that's related
<bentiss> it eventually stopped
<bentiss> mupuf: also one thing to remember, is that those runners now only have a single Gbit line, when the Equinix ones had a 10 Gbit (maybe dual)
<mupuf> bentiss: ack, but it shouldn't be using *that* much network
<bentiss> mupuf: link?
<bentiss> __tim: yeah, hetzner is still having a little bit of issues with their object storage
<__tim> "little bit" 😆
<mupuf> this very same step takes less than 5 minutes on all the gateways we have, and none have as good a connection as one would expect from hetzner
<mupuf> at equinix, it took less than 2 minutes
<bentiss> I just don't know what I'm supposed to see
<mupuf> it shouldn't be pulling much data at all, no more than 100 MB... most of it coming from a DNF update
<mupuf> there isn't much to see, indeed
<mupuf> maybe I could re-run the job and from there you could tell if there is a high cpu load or something?
<bentiss> the pull, create, and init the container only took 32 secs, so I guess it's not a network issue
<mupuf> yeah, but sometimes I saw that just revalidating an artifact (a HEAD request) would take over a minute
<bentiss> artifacts is different, as mentioned above hetzner is having issues
MrCooper_ is now known as MrCooper
<mupuf> it's basically been ever since you moved the kvm runner, so that pre-dates the issues at hetzner
<bentiss> k
swatish2 has quit [Ping timeout: 480 seconds]
<mupuf> bentiss, eric_engestrom: the gitlab runner priority has a little bit of a bug
<mupuf> no jobs of a lower priority will be picked up as long as there is at least a high priority job executing
<mupuf> the script was designed for `parallel: 1`
<mupuf> in other words, we should probably drop the parallel and register as many runners as we can run in parallel
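A toy model of the behaviour described above, not the actual priority script: it assumes a gate that refuses to start a job while any higher-priority job is running. The function names and the numeric priorities (larger means higher priority) are hypothetical.

```python
def may_start(job_priority, running_priorities):
    """Allow a job only if no strictly higher-priority job is running."""
    return all(job_priority >= p for p in running_priorities)


def idle_slots_wasted(slots, running_priorities, waiting_priority):
    """Count free slots that stay empty because the gate blocks the waiting job."""
    free = slots - len(running_priorities)
    if free > 0 and not may_start(waiting_priority, running_priorities):
        return free
    return 0


HIGH, LOW = 1, 0

# One slot: the busy slot is the only slot, so the gate costs nothing.
print(idle_slots_wasted(slots=1, running_priorities=[HIGH], waiting_priority=LOW))

# Four slots: one high-priority job keeps three free slots idle.
print(idle_slots_wasted(slots=4, running_priorities=[HIGH], waiting_priority=LOW))
```

Under this model the gate is harmless with a single slot and only starts wasting capacity once several jobs can run side by side, which is why registering one runner per slot, or making each slot decide independently, avoids the problem.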
<eric_engestrom> ah indeed, good catch
<eric_engestrom> I haven't looked at how bentiss integrated my code into the fdo infra; do you have a link I could look at?
<mupuf> but not sure how this works
swatish2 has joined #freedesktop
todi1 has quit []
todi has joined #freedesktop
Caterpillar has quit [Quit: Konversation terminated!]
Caterpillar has joined #freedesktop
<bentiss> yep to all three files
MrCooper_ has joined #freedesktop
<bentiss> mupuf, eric_engestrom: IIRC I fixed that in the deployed version. Each runner has a concurrent variable set to the number of threads, and the commit from above ensures each thread is independent of the others
MrCooper has quit [Ping timeout: 480 seconds]
<mupuf> bentiss: great!
<bentiss> I just realized I promised eric_engestrom a MR with my changes... sorry
swatish2 has quit [Ping timeout: 480 seconds]
guludo has joined #freedesktop
sooc has quit [Remote host closed the connection]
mebious has quit [Write error: connection closed]
Guest9365 has quit [Remote host closed the connection]
moses has quit [Remote host closed the connection]
nucfreq has quit [Remote host closed the connection]
shymega[i] has quit [Remote host closed the connection]
elibrokeit_ has quit [Remote host closed the connection]
rpigott has quit [Remote host closed the connection]
ajhalili2006 has quit [Remote host closed the connection]
MrCooper_ is now known as MrCooper
moses has joined #freedesktop
raghavgururajan has joined #freedesktop
jsa1 has quit [Ping timeout: 480 seconds]
raghavgururajan is now known as Guest12825
shymega[i] has joined #freedesktop
elibrokeit_ has joined #freedesktop
rpigott has joined #freedesktop
jsa1 has joined #freedesktop
sooc has joined #freedesktop
nucfreq has joined #freedesktop
ajhalili2006 has joined #freedesktop
mebious has joined #freedesktop
MrCooper_ has joined #freedesktop
ximion has joined #freedesktop
swatish2 has joined #freedesktop
MrCooper has quit [Ping timeout: 480 seconds]
jsa1 has quit [Ping timeout: 480 seconds]
<eric_engestrom> bentiss: no worries! I applied some already
<eric_engestrom> the concurrent change didn't make enough sense to me so I didn't apply it back for now
<eric_engestrom> also, there's `os.cpu_count()` instead of calling `nproc` :)
<eric_engestrom> also, we might want a check that `cpu_count % concurrent == 0` to make sure we're not leaving some cpus unreachable
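A minimal sketch of the two suggestions above, assuming the script is Python: read the CPU count with `os.cpu_count()` rather than shelling out to `nproc`, and refuse a `concurrent` value that does not divide it evenly. The function name `cpus_per_slot` and its parameters are hypothetical.

```python
import os


def cpus_per_slot(concurrent, cpu_count=None):
    """Return how many CPUs each concurrent slot gets, or raise if uneven."""
    if cpu_count is None:
        # os.cpu_count() may return None when the count cannot be determined.
        cpu_count = os.cpu_count() or 1
    if concurrent <= 0:
        raise ValueError("concurrent must be a positive integer")
    if cpu_count % concurrent != 0:
        raise ValueError(
            f"{cpu_count} CPUs cannot be split evenly across {concurrent} slots"
        )
    return cpu_count // concurrent


print(cpus_per_slot(4, cpu_count=16))
```

Passing `cpu_count` explicitly is only there to make the check easy to try on any machine; in a real script the default would normally be used.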
emersion_ has joined #freedesktop
r00tobo[BNC] has joined #freedesktop
phryk_ has joined #freedesktop
MrCooper__ has joined #freedesktop
minus_ has joined #freedesktop
emersion has quit [Read error: Connection reset by peer]
leftas has quit [Quit: Ping timeout (120 seconds)]
ocrete has quit [Quit: Ping timeout (120 seconds)]
dbrouwer has quit [Quit: Ping timeout (120 seconds)]
tintou has quit [Quit: Ping timeout (120 seconds)]
ndufresne has quit [Quit: Ping timeout (120 seconds)]
ao2_collabora has quit [Quit: Ping timeout (120 seconds)]
konstantin has quit [Remote host closed the connection]
r00tobo has quit [Quit: Quit]
phryk has quit [Read error: Connection reset by peer]
minus has quit [Remote host closed the connection]
mrpops2ko has quit [Remote host closed the connection]
fantom has joined #freedesktop
leftas has joined #freedesktop
konstantin has joined #freedesktop
ocrete has joined #freedesktop
mrpops2ko has joined #freedesktop
ao2_collabora has joined #freedesktop
jsa1 has joined #freedesktop
MrCooper_ has quit [Ping timeout: 480 seconds]
a_fantom has quit [Ping timeout: 480 seconds]
swatish2 has quit [Ping timeout: 480 seconds]
dbrouwer has joined #freedesktop
ndufresne has joined #freedesktop
tintou has joined #freedesktop
tintou has quit []
tintou has joined #freedesktop
tintou has quit []
tintou has joined #freedesktop
<bentiss> eric_engestrom: sure, I'll take any upgrade in that script. It was kind of a "let's get this thing working" situation
<__tim> ooc, how does mesa get any MRs merged? does mesa not do any artefact upload/download?
<bentiss> __tim: looks like they get a few merged regularly, so not sure if they have an issue
<__tim> yes exactly, I'm wondering if we're the only ones having issues :) (esp since it's with our own runners)
<bentiss> oh... link to a failed job?
<__tim> artefact download on mac os runner (this did pass on the 15th retry though): https://gitlab.freedesktop.org/gstreamer/cerbero/-/jobs/73923885
<__tim> usually it's the uploads that are failing
<__tim> and yes, I guess that's the hetzner s3 issue, but why is mesa not so affected?
<bentiss> maybe they use smaller artifacts?
<__tim> maybe :)
<bentiss> but yeah, right now, there isn't much I can do. Worst case we'll have to use a different bucket location, but that means we'd have to move all of the data first, which is a PITA
<__tim> ouch
<bentiss> the artifacts data, not the git data
<bentiss> I put the data there, to be closer to the machines
MrCooper_ has joined #freedesktop
MrCooper__ has quit [Ping timeout: 480 seconds]
haaninjo has joined #freedesktop
emersion_ has quit [Remote host closed the connection]
emersion has joined #freedesktop
jsa1 has quit [Ping timeout: 480 seconds]
MrCooper__ has joined #freedesktop
MrCooper_ has quit [Ping timeout: 480 seconds]
karolherbst has quit [Read error: Connection reset by peer]
karolherbst has joined #freedesktop
swatish2 has joined #freedesktop
tzimmermann has quit [Quit: Leaving]
swatish2 has quit [Ping timeout: 480 seconds]
mripard has quit [Quit: WeeChat 4.6.0]
Kayden has quit [Quit: -> jf]
MrCooper_ has joined #freedesktop
MrCooper__ has quit [Ping timeout: 480 seconds]
guludo has quit [Ping timeout: 480 seconds]
guludo has joined #freedesktop
tlwoerner_ has quit [Ping timeout: 480 seconds]
swatish2 has joined #freedesktop
___nick___ has quit [Ping timeout: 480 seconds]
swatish2 has quit [Ping timeout: 480 seconds]
swatish2 has joined #freedesktop
swatish2 has quit [Ping timeout: 480 seconds]
Guest12805 is now known as DemiMarie
<DemiMarie> Was anyone ever concerned about the security of Hetzner’s bare-metal offerings in light of hardware/firmware infection attacks?
cascardo_ has joined #freedesktop
cascardo has quit [Ping timeout: 480 seconds]
haaninjo has quit [Quit: Ex-Chat]
AbleBacon has joined #freedesktop
sima has quit [Ping timeout: 480 seconds]
<pinchartl> DemiMarie: are you volunteering to go camp in the data centre to keep watch ? :-)
<DemiMarie> pinchartl: what I mean is “was it wise to pick a bare-metal offering run by a not-that-high-end provider, as opposed to one of the big name vendors or a colo”
<pinchartl> are the big names inherently safer ? especially when considering that many of them are USA companies, and are covered by the USA cloud act ?
<pinchartl> I don't think anyone can answer that question with any certainty
<DemiMarie> More resources to spend on things like custom board designs
<DemiMarie> I believe that generally the people who are really concerned about security go for colos
<DemiMarie> or their own datacenters if the scale justifies it (which this does not)
<pinchartl> it reminds me of https://xkcd.com/641/. do you pick the cereals guaranteed 100% free of asbestos, or the ones guaranteed 100% free of plutonium ?
<DemiMarie> see above w.r.t. colos
<DemiMarie> (read: giving up on cloud and using dedicated hardware)
<pinchartl> I don't think fdo can afford building its own data centre indeed :-)
<DemiMarie> I think the general rule is that if security is the top priority, you want to own hardware, not rent it
<pixelcluster> honestly hetzner isn't exactly a no-name provider either is it
<pixelcluster> this really seems like an "if it's so important to you, feel free to provide the resources to make it happen" scenario to me
* pixelcluster is not too involved in infra tbc
<DragoonAethis> DemiMarie: Would you consider "Oracle" to be enough of a big name to trust?
aswar002_ has quit []
<DemiMarie> DragoonAethis: for me, “big name” in the cloud space means “AWS/Azure/GCP”, especially Amazon or Google
<DragoonAethis> And all 3 of these options are at least an order of magnitude more expensive than what Hetzner gets you
<DemiMarie> personally, I would have gone with a colo facility and buying servers from a vendor, but if fd.o doesn’t have the resources for that it makes sense why they had to go with a different option
<DemiMarie> DragoonAethis: you get what you pay for in the hosting space
<DemiMarie> my concern, of course, is that someone would target https://gitlab.freedesktop.org so they can backdoor Mesa or one of the other giant projects
<vyivel> i would just pay someone to push vulnerable code
<vyivel> sounds much easier
<DemiMarie> vyivel: am I too paranoid?
<DragoonAethis> DemiMarie: kinda?
<DragoonAethis> This is a massive project that you would like to run at hyperscaler's levels of corporate security
aswar002 has joined #freedesktop
<DragoonAethis> Whereas the backend gets 3 part-time admins mostly trying to keep it held together with duct tape
<vyivel> oh right bribing/blackmailing an admin is even "better"
<pixelcluster> infecting a server with a baremetal malware to (I guess?) alter some files in the git repo honestly sounds like the most elaborate and expensive setup for the smallest possible result to me
<airlied> indeed if you wanted to run a botnet on hetzner it might be okay or hoping someone with corp secrets would provision the same server after you, but for a server hosting open source git repos, probably not worth it
kasper93 has quit [Ping timeout: 480 seconds]
<alanc> "if security is the top priority" - for fd.o though, security cannot be the top priority - something the org can afford (from both a monetary and admin time perspective) has to be the top priority, since otherwise the project is just dead
<alanc> security is important, and a high priority, but at a level appropriate to the project, not excluding everything else
kasper93 has joined #freedesktop
kasper93 has quit [Ping timeout: 480 seconds]
<zmike> anyone know what's going on with CI jobs on https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34235
<zmike> seems like trace jobs are having issues maybe?
guludo has quit [Ping timeout: 480 seconds]
<robclark> for another example, https://gitlab.freedesktop.org/mesa/mesa/-/pipelines/1396465 .. the traces are not ok
alanc has quit [Remote host closed the connection]
alanc has joined #freedesktop