ChanServ changed the topic of #freedesktop to: https://www.freedesktop.org infrastructure and online services || for questions about freedesktop.org projects, please see each project's contact || for discussions about specifications, please use https://gitlab.freedesktop.org/xdg or xdg@lists.freedesktop.org
oldpcuser_ has quit [Remote host closed the connection]
oldpcuser_ has joined #freedesktop
oldpcuser_ has quit []
oldpcuser has joined #freedesktop
ximion has quit [Remote host closed the connection]
ximion has joined #freedesktop
AbleBacon has quit [Quit: I am like MacArthur; I shall return.]
AbleBacon has joined #freedesktop
dcunit3d has quit [Remote host closed the connection]
ximion has quit [Quit: Detached from the Matrix]
Leopold___ has joined #freedesktop
alatiera has quit [Ping timeout: 480 seconds]
tzimmermann has joined #freedesktop
Leopold has quit [Ping timeout: 481 seconds]
alatiera has joined #freedesktop
sima has joined #freedesktop
alanc has quit [Remote host closed the connection]
alanc has joined #freedesktop
enunes has joined #freedesktop
AbleBacon has quit [Read error: Connection reset by peer]
<enunes>
hi, were there any changes on freedesktop infra/network over the weekend? in particular to s3.freedesktop.org. my shared runner on gitlab.fdo has always had a somewhat slow connection to it, but since this weekend it's unstable and sometimes doesn't connect at all
<bentiss>
enunes: no changes over the weekend, and I can not reproduce your timeouts with that URL. I guess bad timing? If other people are doing heavy tasks on s3 and gitlab, there is not much we can do :(
<enunes>
I just ran it locally around 10 times, either on my laptop or on the runner, and in one of the runs it just doesn't connect, waits for about 2 minutes, and gives up
<enunes>
so I tried over my mobile tethered network over a separate provider and it's the same...
<enunes>
so my runner is still disabled, I'm not sure what to do next to bring it back up
<bentiss>
enunes: have you tried using the magic curl retry options? It's a little bit slower when it fails, but at least that solves transient network errors in many cases
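The "magic curl retries" aren't spelled out in the log; as a rough illustration of the same idea, here is a minimal Python sketch that retries a download on transient network errors with a fixed delay. The URL, timeout and retry counts are made-up values for illustration, not the actual CI settings.

# sketch: retry a download on transient network errors, analogous to curl's retry options
import time
import urllib.error
import urllib.request

def fetch_with_retries(url, attempts=5, connect_timeout=30, backoff=60):
    """Download url, retrying on transient network errors with a fixed delay."""
    for attempt in range(1, attempts + 1):
        try:
            with urllib.request.urlopen(url, timeout=connect_timeout) as resp:
                return resp.read()
        except (urllib.error.URLError, TimeoutError) as err:
            print(f"attempt {attempt}/{attempts} failed: {err}")
            if attempt == attempts:
                raise
            time.sleep(backoff)

# e.g. fetch_with_retries("https://s3.freedesktop.org/some-artifact.tar.zst")  # hypothetical path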
<enunes>
the CI scripts do that in some places but not all; also, sometimes the requests come from something like s3cp, and that fails in CI when it hits the bad connection now
<enunes>
the first LAVA job times out after a couple of tries to download a tarball from s3, and on the LAVA job's second try the s3cp fails
shbrngdo has quit [Read error: Connection reset by peer]
shbrngdo has joined #freedesktop
<bentiss>
enunes: not sure what is going on: I can follow the traces from the curl in the internal nginx logs, but the PUT that s3cp is supposed to do is not even appearing in the logs
<bentiss>
so the connect error likely means you couldn't connect to the server at all
<bentiss>
maybe it was BGP that dropped the packet outside of the cluster, or maybe nginx didn't have the capacity to accept new connections, but I can't see anything in the logs
<enunes>
since it started yesterday, the plan was to wait for a day and see if any "routing issues" just disappear; should we just wait another day?
<enunes>
mupuf mentioned there was something like that in other countries, so while I now seem to be the only affected runner, and likely the only one in my country, it has apparently happened before
kxkamil2 has joined #freedesktop
kxkamil has quit [Ping timeout: 480 seconds]
<karolherbst>
daniels: somehow the label maker got super unreliable :'(
<karolherbst>
daniels: yeah.. something is fucked on the API level
<karolherbst>
or in the bot..
<karolherbst>
it seems like the bot doesn't get all merge requests loaded
<bentiss>
karolherbst: IIRC we found a better solution with whot... but -ETIME
<karolherbst>
but that isn't the issue?
<bentiss>
karolherbst: it is. Given that mr-label-maker is now immediately notified about a new MR, it tries to apply the labels immediately, but gitlab takes its time to get the MR sorted out
<bentiss>
so mr-label-maker sees that no files are touched, and bails out
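A hedged sketch of one way to handle the race described above, assuming python-gitlab: poll the MR's diff until gitlab has populated it before deciding on labels. The project path, MR iid, path rule and label below are illustrative assumptions, not mr-label-maker's actual code.

# sketch: wait for gitlab to populate a brand-new MR's diff before labelling
import time
import gitlab

gl = gitlab.Gitlab("https://gitlab.freedesktop.org", private_token="REDACTED")
project = gl.projects.get("mesa/mesa")  # hypothetical project path

def changed_paths(mr, attempts=6, delay=10):
    """Return the files touched by the MR, waiting for gitlab to finish the diff."""
    for _ in range(attempts):
        changes = mr.changes().get("changes", [])
        if changes:
            return [c["new_path"] for c in changes]
        time.sleep(delay)  # diff not populated yet, give gitlab some time
    return []  # still empty: either no files touched, or gitlab never caught up

mr = project.mergerequests.get(12345)  # hypothetical MR iid
paths = changed_paths(mr)
if any(p.startswith("src/panfrost/") for p in paths):  # illustrative rule only
    mr.labels = sorted(set(mr.labels) | {"panfrost"})
    mr.save()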
<karolherbst>
ahh right.. but when I was running over all merge requests, some weren't processed either
<karolherbst>
but yeah... the other issue is more pressing I guess
<daniels>
karolherbst: so you're querying the API for 'all open merge requests', and it is returning a number smaller than reality?
<karolherbst>
yeah.. well... not sure. the script sees some MR I missed in dry-run
<karolherbst>
ehh, it doesn't see them in non-dry-run
<karolherbst>
it's weird
<karolherbst>
but now it's listed...
<karolherbst>
maybe something fails when applying the label and it fails silently or something.. dunno
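As a sketch of the two failure modes being speculated about here, assuming python-gitlab: request every page of open merge requests (a truncated first page would look like "missing" MRs) and report label-application failures instead of letting them pass silently. The project path and label are placeholders, not the bot's real configuration.

# sketch: list all open MRs with full pagination and surface labelling errors
import gitlab

gl = gitlab.Gitlab("https://gitlab.freedesktop.org", private_token="REDACTED")
project = gl.projects.get("mesa/mesa")  # hypothetical project path

# all=True makes python-gitlab walk every page; without it only the first
# page (20 results by default) comes back, which looks like missing MRs.
open_mrs = project.mergerequests.list(state="opened", all=True)

for mr in open_mrs:
    try:
        mr.labels = sorted(set(mr.labels) | {"needs-triage"})  # placeholder label
        mr.save()
    except gitlab.exceptions.GitlabError as err:
        # surface the failure instead of dropping it on the floor
        print(f"failed to label !{mr.iid}: {err}")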