ChanServ changed the topic of #freedesktop to: https://www.freedesktop.org infrastructure and online services || for questions about freedesktop.org projects, please see each project's contact || for discussions about specifications, please use https://gitlab.freedesktop.org/xdg or xdg@lists.freedesktop.org
<daniels>
though this isn't useful to you if you don't have full admin access ...
<daniels>
the other thing you can do is to either cancel all pipelines on that MR, or just wait an hour
<daniels>
there's a dumb race where Marge latches on to the wrong pipeline for an MR - only the pipeline for the commit ref, not the one for the MR - and so never sees the actual MR pipeline starting
<anholt>
to be fair, I lose that race a lot too
<daniels>
yeah?
<anholt>
push new core code change, refresh web ui, click play on the commit ref pipeline since that shows up before mr pipeline does
<daniels>
ah, heh
ximion has quit [Remote host closed the connection]
ximion has joined #freedesktop
* DavidHeidelberg[m]
is it time to say "I think they should rewrite Gitlab in Rust"? :D
<DavidHeidelberg[m]>
and all the rest of the builds are non-LTO, so if it diverges a little, that path should be tested well enough by all the other jobs
<eric_engestrom>
DavidHeidelberg[m]: yeah, makes sense to me
ximion1 has joined #freedesktop
<eric_engestrom>
besides, I expect basically none of us build with lto enabled locally, so if something ever does slip in because of !20623 we'll all notice it right away
pendingchaos has joined #freedesktop
<DavidHeidelberg[m]>
your words are kind, but what I'm eager for are Acks and Rbs :D
ximion has quit [Ping timeout: 480 seconds]
<anholt>
someone with build system interest should probably look into whether we can actually ship mesa as lto. I expect it would need -fno-strict-aliasing, but would the win of lto beat the loss of strict aliasing?
<eric_engestrom>
about github not being updated anymore, I think it's because of an update of `git` on kemper where it became more strict about * gestures vaguely * things:
<eric_engestrom>
remote: remote: /srv/kemper.freedesktop.org/not-for-git/wayland/update-github.sh: line 11: shift: shift count out of range
<eric_engestrom>
remote: remote: fatal: detected dubious ownership in repository at '/srv/git.freedesktop.org/git/mesa/mesa.git'
<eric_engestrom>
(that `shift out of range` is unrelated, it's been there for literally years and I kept meaning to look into it but never did; I just copied the line because it shows the filename)
<eric_engestrom>
a quick search indicates this should disable this check globally (probably what we want there, but it might be good to look into it more before we do that just in case there's an attack vector that we didn't think about until now):
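For reference, the check in question is git's safe.directory ownership check; a minimal sketch of disabling it globally for the user the hooks run as (assuming that's acceptable security-wise, per the caveat above):

    # Disable the "dubious ownership" check for every repository,
    # for the user that runs the update hooks on kemper:
    git config --global safe.directory '*'

    # Narrower alternative: allow only the specific repository path:
    git config --global --add safe.directory /srv/git.freedesktop.org/git/mesa/mesa.git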
<eric_engestrom>
anholt: I was going to link to arch's package since I knew they built mesa with lto, but then it turns out they actually disabled it when 22.2.0 came out because apparently it caused a rendering issue
<eric_engestrom>
but until then, they did
<anholt>
yeah, over time people periodically try lto, and eventually they back off.
<anholt>
my suspicion is strict aliasing -- we violate it thoroughly on a regular basis, but mostly get away with it and I suspect in a lot of cases it's due to no optimization between compilation units.
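A sketch of what such a test build might look like, assuming Meson's built-in b_lto option and a plain GCC/Clang toolchain; the directory name and extra flags here are illustrative, not Mesa's actual CI configuration:

    # Hypothetical LTO test build: enable link-time optimization but opt out
    # of strict-aliasing-based optimizations, since Mesa is known to violate
    # strict aliasing in places.
    meson setup build-lto \
        -Dbuildtype=release \
        -Db_lto=true \
        -Dc_args="-fno-strict-aliasing" \
        -Dcpp_args="-fno-strict-aliasing"
    ninja -C build-lto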
pendingchaos_ has quit [Ping timeout: 480 seconds]
pendingchaos has quit [Ping timeout: 480 seconds]
pendingchaos has joined #freedesktop
<DavidHeidelberg[m]>
eric_engestrom: one setback (when I enabled it only on debian-build-testing) is that LTO takes a long time, so it's suboptimal for builds where other test jobs are waiting for them to finish.
Leopold_ has quit [Ping timeout: 480 seconds]
<daniels>
DavidHeidelberg[m]: if you're going to be around for a little bit still, could you please get an MR together which s/wget/curl/g and adds the option which will retry on every single error (DNS, network connection, 5xx, whatever - anything other than 4xx) so we try every network request for like 2min?
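Roughly what the curl side could look like, as a sketch; the retry counts and the $URL/$OUTPUT placeholders are assumptions, and --retry-all-errors needs a reasonably recent curl:

    # Retry any failure (DNS, connection, HTTP errors) for about 2 minutes.
    # Note: --retry-all-errors combined with --fail also retries 4xx, which
    # is slightly broader than "anything other than 4xx".
    curl --fail --location --retry 12 --retry-delay 10 --retry-all-errors \
         -o "$OUTPUT" "$URL"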
<bentiss>
DavidHeidelberg[m]: .gitlab-ci/container/container_pre_build.sh is *not* executed in the target container, but in the gitlab-runner helper container, which is locally configured on the machine
<daniels>
DavidHeidelberg[m]: I believe you'd need to add retry_on_host_error and retry_on_http_error=502,503,504
<daniels>
bentiss: it populates /etc/wgetrc
<bentiss>
daniels: yes, it does mount the volume used by the target container, but again, you don't have the same binaries available
<daniels>
bentiss: right, I just mean that it's called in the context of creating the rootfs used for all the target containers, and it populates /etc/wgetrc with a set of options, so it will be influencing what future wget calls do
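A sketch of what the script might append to /etc/wgetrc, using the two settings named above; the specific retry counts and status-code list are placeholders:

    {
      echo '# Retry on DNS/connection failures and on transient HTTP errors'
      echo 'retry_on_host_error = on'
      echo 'retry_on_http_error = 429,500,502,503,504'
      echo 'tries = 4'
      echo 'waitretry = 30'
    } >> /etc/wgetrc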
<bentiss>
daniels: maybe, I was just pointing at the fact that the script in .gitlab-ci/container/container_pre_build.sh could not be used as "wget is working in container_pre_build.sh, so we always have wget available"
<daniels>
sure :)
<bentiss>
and given that it's a different environment when you start your containers, all bets are off
<bentiss>
daniels: while you are here too. I played last week with the registry db on my local install
<bentiss>
so I have a couple of items I'd like you to validate/refute:
<DavidHeidelberg[m]>
daniels: ok, I'm slowly heading to the bed, so I would prefer to push wget solution for now and curl for tomorrow :D
<bentiss>
- we still pay for artifacts, lfs, uploads on GCS, can we nuke them? (it would only be a small amount, but still no reason to keep them IMO)
<bentiss>
- I can enable the registry db feature, but there are warnings everywhere about not using it in a production environment, though gitlab.com is using it
<bentiss>
- the only migration we can do for the registry db is, luckily, the one for when the registry is backed by GCS, so in theory it should go well, but I couldn't test it on my local gitlab+minio install
<daniels>
DavidHeidelberg[m]: sounds good :)
<bentiss>
- basically, even if I enable the postgres registry db and the settings in the chart, we still need to enable a gitlab ruby feature to enable migration, and migration is entirely controlled by ruby from the console
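For context, enabling a GitLab feature flag from the console is normally done along these lines on a chart install; the pod name and the flag name below are placeholders, not the actual flag for the registry metadata-db migration:

    # The Rails console runs in the toolbox pod (pod name is a placeholder):
    kubectl exec -it gitlab-toolbox-0 -- gitlab-rails console
    # then, in the Ruby console (flag name is a placeholder):
    #   Feature.enable(:registry_migration_placeholder_flag)
    #   Feature.enabled?(:registry_migration_placeholder_flag)   # verify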
<daniels>
bentiss: hmmm did we migrate artifacts etc to minio already?
<bentiss>
daniels: yes, we haven't accessed those GCS buckets in a year I think
<bentiss>
it represents ~2TB, so not a lot, but that's still 10% of the $400 a month we pay
<daniels>
ahhh nice, yeah we can nuke them then
<daniels>
the registry … heh … I think we’d need some serious (full-weekend?) downtime to try it tbh
<bentiss>
so summary for the registry db, I'd be tempted to enable it nonetheless, do the migration for a few selected projects, and then do the whole thing
<daniels>
unless we can like … enable it running but unused and run a background rake task to try the migration and see if it works?
<daniels>
yeah
<daniels>
that sounds like a great plan
<bentiss>
the migration is transparent: if the repository exists, it uses the old path. If not, or once it has been migrated, it'll be using the db
<bentiss>
I might have a little bit of time at the end of the week, but not tomorrow FWIW
<daniels>
awesome
<bentiss>
daniels: also note that the registry migration keeps blobs on GCS, so we will potentially double the size of the registry. But in theory we will be copying only the used blobs
<daniels>
I can try to free up some time towards the end of the week
<bentiss>
so less than double
<daniels>
only got back to work yesterday so still catching up
<bentiss>
yeah, no worries
<DavidHeidelberg[m]>
daniels: could be "retry_on_host_error = 429,500,502,503,504"?
<daniels>
DavidHeidelberg[m]: *http_error?
<daniels>
host_error is used for DNS AIUI
<bentiss>
daniels: one thing that *might* be missing is the db backup of the registry. I haven't checked if it was already set up in the gitlab backup jobs
<bentiss>
daniels: one last thing before going to bed: marge and the developers of mesa are safe, because they would use harbour in their MR pipelines, so normally that community won't be impacted
<DavidHeidelberg[m]>
daniels: thanks for the catch; same number of letters, error-prone :D
<daniels>
bentiss: hehe, we can pick another project that can deal with a day or two of no merges - I'd volunteer either Wayland or Weston
<daniels>
DavidHeidelberg[m]: we do still want host_error too tho :)
<daniels>
cf. the 'it's always DNS' meme
<DavidHeidelberg[m]>
I know, I know :D
<DavidHeidelberg[m]>
We want it all, we want it now.
<bentiss>
daniels: ci-templates is also a *very* good candidate
<bentiss>
easy enough to stress test the registry
<bentiss>
alright. I killed the unused buckets and going afk now
<daniels>
thanks, bonne nuit!
<DavidHeidelberg[m]>
let's do it now. At least I'll know if it worked tomorrow.
<DavidHeidelberg[m]>
daniels: I think one cp in lava_build should do the trick. Pushed.
ybogdano has quit [Ping timeout: 480 seconds]
danvet has quit [Ping timeout: 480 seconds]
<daniels>
DavidHeidelberg[m]: ty!
<daniels>
and gute Nacht
<DavidHeidelberg[m]>
daniels: should I just run the pipeline or assign marge?
<DavidHeidelberg[m]>
thx! :) Gute Nacht :)
<daniels>
o/
<daniels>
I've assigned Marge, have only just finished dinner so I'll hang around to see how it turns out