ChanServ changed the topic of #freedesktop to: https://www.freedesktop.org infrastructure and online services || for questions about freedesktop.org projects, please see each project's contact || for discussions about specifications, please use https://gitlab.freedesktop.org/xdg or xdg@lists.freedesktop.org
<daniels>
though this isn't useful to you if you don't have full admin access ...
<daniels>
the other thing you can do is to either cancel all pipelines on that MR, or just wait an hour
<daniels>
there's a dumb race where Marge latches on to the wrong pipeline for an MR - only the pipeline for the commit ref, not the one for the MR - and so never sees the actual MR pipeline starting
<anholt>
to be fair, I lose that race a lot too
<daniels>
yeah?
<anholt>
push new core code change, refresh web ui, click play on the commit ref pipeline since that shows up before mr pipeline does
<daniels>
ah, heh
ximion has quit [Remote host closed the connection]
ximion has joined #freedesktop
* DavidHeidelberg[m]
is it time to say "I think they should rewrite Gitlab in Rust"? :D
<DavidHeidelberg[m]>
and all the rest of the builds are non-LTO, so if it diverges a little, that path should be tested well enough by all the other jobs
<eric_engestrom>
DavidHeidelberg[m]: yeah, makes sense to me
ximion1 has joined #freedesktop
<eric_engestrom>
besides, I expect basically none of us build with lto enabled locally, so if something ever does slip in because of !20623 we'll all notice it right away
pendingchaos has joined #freedesktop
<DavidHeidelberg[m]>
your words are kind, but what I'm eager for are Acks and Rbs :D
ximion has quit [Ping timeout: 480 seconds]
<anholt>
someone with build system interest should probably look into whether we can actually ship mesa as lto. I expect it would need -fno-strict-aliasing, but would the win of lto beat the loss of strict aliasing?
<eric_engestrom>
about github not being updated anymore, I think it's because of an update of `git` on kemper where it became more strict about * gestures vaguely * things:
<eric_engestrom>
remote: remote: /srv/kemper.freedesktop.org/not-for-git/wayland/update-github.sh: line 11: shift: shift count out of range
<eric_engestrom>
remote: remote: fatal: detected dubious ownership in repository at '/srv/git.freedesktop.org/git/mesa/mesa.git'
<eric_engestrom>
(that `shift out of range` is unrelated, it's been there for literally years and I kept meaning to look into it but never did; I just copied the line because it shows the filename)
<eric_engestrom>
a quick search indicates this should disable this check globally (probably what we want there, but it might be good to look into it more before we do that just in case there's an attack vector that we didn't think about until now):
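For reference, the check in question is git's safe.directory ownership check; a minimal sketch of disabling it globally for the user the hooks run as (assuming that's acceptable security-wise, per the caveat above):

    # Disable the "dubious ownership" check for every repository,
    # for the user that runs the update hooks on kemper:
    git config --global safe.directory '*'

    # Narrower alternative: allow only the specific repository path:
    git config --global --add safe.directory /srv/git.freedesktop.org/git/mesa/mesa.git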
<eric_engestrom>
anholt: I was going to link to arch's package since I knew they built mesa with lto, but then it turns out they actually disabled it when 22.2.0 came out because apparently it caused a rendering issue
<eric_engestrom>
but until then, they did
<anholt>
yeah, over time people periodically try lto, and eventually they back off.
<anholt>
my suspicion is strict aliasing -- we violate it thoroughly on a regular basis, but mostly get away with it and I suspect in a lot of cases it's due to no optimization between compilation units.
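A sketch of what such a test build might look like, assuming Meson's built-in b_lto option and a plain GCC/Clang toolchain; the directory name and extra flags here are illustrative, not Mesa's actual CI configuration:

    # Hypothetical LTO test build: enable link-time optimization but opt out
    # of strict-aliasing-based optimizations, since Mesa is known to violate
    # strict aliasing in places.
    meson setup build-lto \
        -Dbuildtype=release \
        -Db_lto=true \
        -Dc_args="-fno-strict-aliasing" \
        -Dcpp_args="-fno-strict-aliasing"
    ninja -C build-lto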
pendingchaos_ has quit [Ping timeout: 480 seconds]
pendingchaos has quit [Ping timeout: 480 seconds]
pendingchaos has joined #freedesktop
<DavidHeidelberg[m]>
eric_engestrom: one setback (when I enabled it only on debian-build-testing) is that LTO takes a long time, so it's suboptimal for builds where other test jobs are waiting for them to finish.
Leopold_ has quit [Ping timeout: 480 seconds]
<daniels>
DavidHeidelberg[m]: if you're going to be around for a little bit still, could you please get an MR together which s/wget/curl/g and adds the option which will retry on every single error (DNS, network connection, 5xx, whatever - anything other than 4xx) so we try every network request for like 2min?
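Roughly what the curl side could look like, as a sketch; the retry counts and the $URL/$OUTPUT placeholders are assumptions, and --retry-all-errors needs a reasonably recent curl:

    # Retry any failure (DNS, connection, HTTP errors) for about 2 minutes.
    # Note: --retry-all-errors combined with --fail also retries 4xx, which
    # is slightly broader than "anything other than 4xx".
    curl --fail --location --retry 12 --retry-delay 10 --retry-all-errors \
         -o "$OUTPUT" "$URL"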
<bentiss>
DavidHeidelberg[m]: .gitlab-ci/container/container_pre_build.sh is *not* executed in the target container, but in the gitlab-runner helper container, which is locally configured on the machine
<daniels>
DavidHeidelberg[m]: I believe you'd need to add retry_on_host_error and retry_on_http_error=502,503,504
<daniels>
bentiss: it populates /etc/wgetrc
<bentiss>
daniels: yes, it does mount the volume used by the target container, but again, you don't have the same binaries available
<daniels>
bentiss: right, I just mean that it's called in the context of creating the rootfs used for all the target containers, and it populates /etc/wgetrc with a set of options, so it will be influencing what future wget calls do
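A sketch of what the script might append to /etc/wgetrc, using the two settings named above; the specific retry counts and status-code list are placeholders:

    {
      echo '# Retry on DNS/connection failures and on transient HTTP errors'
      echo 'retry_on_host_error = on'
      echo 'retry_on_http_error = 429,500,502,503,504'
      echo 'tries = 4'
      echo 'waitretry = 30'
    } >> /etc/wgetrc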
<bentiss>
daniels: maybe, I was just pointing at the fact that the script in .gitlab-ci/container/container_pre_build.sh could not be used as "wget is working in container_pre_build.sh, so we always have wget available"
<daniels>
sure :)
<bentiss>
and given that it's a different environment when you start your containers, all bets are off
<bentiss>
daniels: while you are here too. I played last week with the registry db on my local install
<bentiss>
so I have a couple of items I'd like you to validate/refute:
<DavidHeidelberg[m]>
daniels: ok, I'm slowly heading to the bed, so I would prefer to push wget solution for now and curl for tomorrow :D
<bentiss>
- we still pay for artifacts, lfs, uploads on GCS, can we nuke them? (it would only be a small amount, but still no reason to keep them IMO)
<bentiss>
- I can enable the registry db feature, but there are warnings everywhere about not using it in a production environment, though gitlab.com is using it
<bentiss>
- the only migration we can do for the registry db is, luckily, the one for when the registry is backed by GCS, so in theory it should go well, but I couldn't test it on my local gitlab+minio install
<daniels>
DavidHeidelberg[m]: sounds good :)
<bentiss>
- basically, even if I enable the postgres registry db and the settings in the chart, we still need to enable a gitlab ruby feature to enable migration, and migration is entirely controlled by ruby from the console
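For context, enabling a GitLab feature flag from the console is normally done along these lines on a chart install; the pod name and the flag name below are placeholders, not the actual flag for the registry metadata-db migration:

    # The Rails console runs in the toolbox pod (pod name is a placeholder):
    kubectl exec -it gitlab-toolbox-0 -- gitlab-rails console
    # then, in the Ruby console (flag name is a placeholder):
    #   Feature.enable(:registry_migration_placeholder_flag)
    #   Feature.enabled?(:registry_migration_placeholder_flag)   # verify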
<daniels>
bentiss: hmmm did we migrate artifacts etc to minio already?
<bentiss>
daniels: yes, we haven't accessed those GCS buckets in a year I think
<bentiss>
it represents ~2TB, so not a lot, but that's still 10% of the $400 a month we pay
<daniels>
ahhh nice, yeah we can nuke them then
<daniels>
the registry … heh … I think we’d need some serious (full-weekend?) downtime to try it tbh
<bentiss>
so summary for the registry db, I'd be tempted to enable it nonetheless, do the migration for a few selected projects, and then do the whole thing
<daniels>
unless we can like … enable it running but unused and run a background rake task to try the migration and see if it works?
<daniels>
yeah
<daniels>
that sounds like a great plan
<bentiss>
the migration is transparent: if the repository exists, it uses the old path. If not, or once it has been migrated, it'll be using the db
<bentiss>
I might have a little bit of time at the end of the week, but not tomorrow FWIW
<daniels>
awesome
<bentiss>
daniels: also note that the registry migration keeps blobs on GCS, so we will potentially double the size of the registry. But in theory we will be copying only the used blobs
<daniels>
I can try to free up some time towards the end of the week
<bentiss>
so less than double
<daniels>
only got back to work yesterday so still catching up
<bentiss>
yeah, no worries
<DavidHeidelberg[m]>
daniels: could be "retry_on_host_error = 429,500,502,503,504"?
<daniels>
DavidHeidelberg[m]: *http_error?
<daniels>
host_error is used for DNS AIUI
<bentiss>
daniels: one thing that *might* be missing is the db backup of the registry. I haven't checked if it was already set up in the gitlab backup jobs
<bentiss>
daniels: one last thing before going to bed: marge and the developers of mesa are safe, because they would use harbour in their MR pipelines, so normally that community won't be impacted
<DavidHeidelberg[m]>
daniels: thanks for the catch; same number of letters, error-prone :D
<daniels>
bentiss: hehe, we can pick another project that can deal with a day or two of no merges - I'd volunteer either Wayland or Weston
<daniels>
DavidHeidelberg[m]: we do still want host_error too tho :)
<daniels>
cf. the 'it's always DNS' meme
<DavidHeidelberg[m]>
I know, I know :D
<DavidHeidelberg[m]>
We want it all, we want it now.
<bentiss>
daniels: ci-templates is also a *very* good candidate
<bentiss>
easy enough to stress test the registry
<bentiss>
alright. I killed the unused buckets and going afk now
<daniels>
thanks, bonne nuit!
<DavidHeidelberg[m]>
let's do it now. At least I'll know if it worked tomorrow.
<DavidHeidelberg[m]>
daniels: I think one cp in lava_build should do the trick. Pushed.
ybogdano has quit [Ping timeout: 480 seconds]
danvet has quit [Ping timeout: 480 seconds]
<daniels>
DavidHeidelberg[m]: ty!
<daniels>
and gute Nacht
<DavidHeidelberg[m]>
daniels: should I just run the pipeline or assign marge?
<DavidHeidelberg[m]>
thx! :) Gute Nacht :)
<daniels>
o/
<daniels>
I've assigned Marge, have only just finished dinner so I'll hang around to see how it turns out