#freedesktop on 2025-01-29 — irc logs at oftc.irclog.whitequark.org

2024-07-16 04:52 ChanServ changed the topic of #freedesktop to: https://www.freedesktop.org infrastructure and online services || for questions about freedesktop.org projects, please see each project's contact || for discussions about specifications, please use https://gitlab.freedesktop.org/xdg or xdg@lists.freedesktop.org

01:26 scrumplex has joined #freedesktop

01:30 scrumplex_ has quit [Ping timeout: 480 seconds]

02:00 DragoonAethis has quit [Quit: hej-hej!]

02:00 DragoonAethis has joined #freedesktop

02:22 Turkish-Men has quit [Quit: Leaving]

04:07 marcheu_ has joined #freedesktop

04:09 marcheu has quit [Ping timeout: 480 seconds]

04:09 dri-logger has quit [Ping timeout: 480 seconds]

04:16 marcheu_ has quit [Ping timeout: 480 seconds]

04:17 dri-logger has joined #freedesktop

04:19 marcheu has joined #freedesktop

04:29 dri-logg1r has joined #freedesktop

04:29 eluks has quit [Remote host closed the connection]

04:30 eluks has joined #freedesktop

04:31 dri-logger has quit [Ping timeout: 480 seconds]

04:31 marcheu has quit [Ping timeout: 480 seconds]

04:39 marcheu has joined #freedesktop

04:50 ybogdano has quit [Remote host closed the connection]

04:52 ybogdano has joined #freedesktop

04:57 m5zs7k has quit [Ping timeout: 480 seconds]

05:06 m5zs7k has joined #freedesktop

05:32 swatish2 has joined #freedesktop

05:34 haver has quit [Ping timeout: 480 seconds]

05:37 AbleBacon has quit [Read error: Connection reset by peer]

05:54 haver has joined #freedesktop

06:51 tzimmermann has joined #freedesktop

07:09 pjakobsson has joined #freedesktop

07:16 jsa1 has joined #freedesktop

07:17 pjakobsson has quit [Remote host closed the connection]

07:19 pjakobsson has joined #freedesktop

07:49 <hakzsam> mesa CI is very slow these days, the queue is like ~10h long :/

08:00 sghuge has quit [Remote host closed the connection]

08:00 sghuge has joined #freedesktop

08:12 pjakobsson has quit [Remote host closed the connection]

08:23 lsd|2 has joined #freedesktop

08:34 swatish21 has joined #freedesktop

08:35 swatish2 has quit [Ping timeout: 480 seconds]

08:36 lsd|2 has quit [Quit: KVIrc 5.2.2 Quasar http://www.kvirc.net/]

08:48 sima has joined #freedesktop

09:05 ximion has quit [Remote host closed the connection]

09:19 swatish21 has quit [Ping timeout: 480 seconds]

09:26 swatish2 has joined #freedesktop

09:48 swatish2 has quit [Ping timeout: 480 seconds]

09:50 swatish2 has joined #freedesktop

10:00 <daniels> hakzsam: yeah I know, some of the build jobs were obscenely long which I just managed to trim down last night

10:01 <hakzsam> great, thanks for looking into it!

10:01 <daniels> there's definitely some worrying per-runner variability which I think means we have noisy neighbours competing too much for memory perhaps

10:02 <daniels> but yeah, we now have two Intel compilers to build, NVK and Rusticl to build and link in Rust which takes about a year, ACO not getting any smaller, and we get to eat more of that pain as we run clc in every build

10:04 <daniels> so atm the best thing anyone can do to help out would be to make compile times faster - either the C/C++/Rust compile and build, or the NIR compiler chain to build all the CLC

10:07 <daniels> istr alyssa mentioning nir_validate also got super slow recently?

10:08 swatish2 has quit [Ping timeout: 480 seconds]

10:13 swatish2 has joined #freedesktop

10:20 <mupuf> daniels: thanks <3

10:47 jsa1 has quit [Ping timeout: 480 seconds]

10:49 <daniels> np

10:51 guludo has joined #freedesktop

10:57 swatish21 has joined #freedesktop

11:02 swatish2 has quit [Ping timeout: 480 seconds]

11:12 jsa1 has joined #freedesktop

11:31 swatish2 has joined #freedesktop

11:31 swatish21 has quit [Ping timeout: 480 seconds]

11:41 <karolherbst> mhh, I guess I could see how to make clc compiles faster

11:41 <karolherbst> I did though about adding pch support which might help a bit here

11:41 <karolherbst> *thought

11:53 swatish2 has quit [Ping timeout: 480 seconds]

12:08 mvlad has joined #freedesktop

12:09 swatish2 has joined #freedesktop

12:29 <psykose> there's a libanv file that takes like 50s to compile @_@

12:49 <dj-death> one of the genX ?

12:52 <psykose> bit variable every run, my trace one became 42 for it, src/intel/vulkan/libanv_per_hw_ver300.a.p/genX_cmd_buffer.c.o

12:53 <karolherbst> I'm sure the intel clc stuff would benefit a lot from pch, because it pulls in the expensive internal opencl header, because of that intel subgroup extension

12:54 <karolherbst> though not quite sure how to integrate it well into the build system... maybe as its own target

12:54 <karolherbst> and then users depend on it

12:54 <karolherbst> if somebody is motivated enough to figure it out, please do

12:56 swatish2 has quit [Ping timeout: 480 seconds]

12:58 swatish2 has joined #freedesktop

12:59 <karolherbst> daniels: I'm wondering if we could improve the situation alot by cleaning up the libasan/libubsan situation. Like why not enabling both at once for all builds and just use that instead of splitting it all up? Might require some work here and there, but that would cut down at least jobs, though that might make the individuals pipelines slower,

12:59 <karolherbst> though might not matter much

13:05 <psykose> longest jobs for a default build locally https://img.ayaya.dev/eWQDkrK6xMfn

13:08 <karolherbst> interesting...

13:09 <karolherbst> I tihnk somebody mentioned that the genx macros are super heavy

13:10 <karolherbst> and looking at that genx_init_state.c file I can see how it explodes in size

13:12 <karolherbst> anyway.. guess I'll look into the pch stuff, because I wanted to do that anyway

13:14 <psykose> if you want the full perfetto trace by chance: https://img.ayaya.dev/XALK1KwzgfyQ (includes -ftime-trace so same query needs where depth = 0)

13:14 <psykose> wonder why nir_opt_algebraic.py takes 15s

13:20 todi1 has joined #freedesktop

13:25 todi has quit [Ping timeout: 480 seconds]

13:34 <daniels> karolherbst: asan/ubsan have a pretty huge runtime cost - especially in memory - and can't be used together iirc

13:34 <daniels> oh, it's asan and msan that are mutually exclusive

13:34 <daniels> but yeah, they'll massively slow down the run

13:36 <karolherbst> maybe could merge ubsan+asan at least

13:44 <dj-death> intel-clc should not be a thing anymore

13:44 <dj-death> code is still there but the default build options should avoid building it

13:44 <dj-death> we only need mesa_clc instead

13:44 <dj-death> psykose: I've noticed it takes a while indeed

13:45 <dj-death> psykose: the number of headers included is pretty bad on those files, it's mostly everything pulled by anv_private.h

13:49 <psykose> not sure how includes show in -ftime-trace tbh, like 95% of the time is in backend passes, but maybe it takes more time to shake them out

13:50 <psykose> though it does add up to a lot, plenty of files with >50% frontend

13:56 pjakobsson has joined #freedesktop

13:59 <bentiss> daniels: Anything to add to the new banner I created at https://gitlab.freedesktop.org/admin/broadcast_messages?

13:59 ximion has joined #freedesktop

14:00 <bentiss> in a way I'd like to be dramatic so we get volunteers, but OTOH, it wouldn't be fair to say we are cutting operations on April 30

14:02 <daniels> bentiss: it lgtm, thankyou

14:02 <daniels> hopefully we don't end up having to kill fd.o :P

14:02 <bentiss> k

14:03 <bentiss> well, one solution could be to migrate to gitlab.com and let them deal with the infra :)

14:03 <bentiss> but it's probably worse because vendor lock :(

14:06 <bentiss> daniels: also I had a chat this morning with whot, and while we were talking about AI scrapers, and one solution would be to use cloudflare to chase them out. Would this be something that can be on the table as well?

14:07 <daniels> I'd be fine with CloudFlare, or we might still have contacts at fastly as well

14:08 <daniels> hmmm, is that notification undismissable? ouch :)

14:09 <pinchartl> it sounds like something that shouldn't be dismissed until we find a solution

14:10 <bentiss> daniels: I wanted to make a point for a couple of days before making it dismissable :)

14:10 <pinchartl> what's needed ? sponsorship from a datacenter host ? money ? other things ? all of that ?

14:10 <bentiss> pinchartl: all of that I think

14:11 <bentiss> we were running on datacenter sponsorship which makes things easy, but right now I wish we just had money we spent so we could just handle the situation at the financial level, not at the technical one

14:11 <DragoonAethis> bentiss: how many resources, roughly, GitLab needs?

14:12 <DragoonAethis> CPU cores, storage, etc

14:12 <DragoonAethis> Runners are going to be a mess anyways, but we have some spare hardware and we might be able to figure out something

14:13 <bentiss> DragoonAethis: currently my main concern is the egress data: 56TB over the past month only. So if we start splitting the runners in various datacenter, this is going to explode

14:13 <bentiss> and cost a bunch

14:14 <pinchartl> :-O

14:14 <bentiss> (many thanks to the AI scrapers as well on that)

14:14 <DragoonAethis> by "some" hardware I might mean a rack full of Broadwells

14:14 <DragoonAethis> so not great, not terrible

14:15 <bentiss> right now we have interesting machines, but they are split so we can share more load, so if we have a dozen of not great runners, that might still make the difference

14:15 <pinchartl> (we should all get back to NNTP and let the tech bros destroy the rest of the internet. then we'll rebuild on the ashes)

14:15 <bentiss> heh

14:16 <Consolatis> that sounds like a plan

14:16 * pinchartl things of the last scene of season 1 of Mr Robot

14:16 <pinchartl> s/things/thinks/

14:17 <daniels> well, there was that gopher-v2 thing that was basically exactly what you're describing

14:19 <pinchartl> maybe a CI infrastructure designed on the assumption that resources are infinite (I'm not talking about fd.o in particular, but about CI solutions in general) is not sustainable ?

14:19 <bentiss> DragoonAethis: for gitlab, I'd like to have: a managed DB that I don't have to worry about (the disk used right now is 400GB, and it's growing), then we got 4 gitaly services, each of them between 200 to 500GB of data on disk), ideally on 4 separate machines/nodes, and then there are multiple services like webservers, sidekiq, marge-bot, pages webserver, which should all fit in 2

14:19 <bentiss> to 3 machines, and last, we need a ton of S3 storage: roughly 65TB

14:19 <pinchartl> every time I push a branch to the linux-media CI I feel bad (and that doesn't use the fd.o shared runners)

14:20 <bentiss> FTR, I've put all the numbers on a message to the board yesterday, not sure it's public though

14:20 <bentiss> (and we should probably bring the technical discussion on a public issue)

14:22 <daniels> pinchartl: what would you prefer?

14:22 <DragoonAethis> bentiss: The compute I could deliver, the storage and managed DB - not so much

14:23 <pinchartl> daniels: I don't know yet. I'm not sure anyone has really thought about alternative architectures (at least I haven't read about any) that would significantly cut CPU time and bandwidth

14:23 <DragoonAethis> lemme check what the boxes have, one sec

14:23 <pinchartl> but recompiling a kernel from scratch with 10 compilers/archs for every commit... that's just crazy

14:24 <psykose> normally caching like ccache or incrementally building on older build dirs (this fell out of fashion) helps with that, but the fundamental issue is that to test what you care about there is always a combinatorial explosion of options

14:25 <bentiss> oooh -> https://www.fastly.com/products/storage "zero egress fees"... daniels: if you have contacts at fastly that might be very interesting for us :)

14:25 <psykose> multiple build types * archs * test suites = mega time :D

14:25 <psykose> there's no way to reduce that without losing coverage of something

14:25 swatish2 has quit [Ping timeout: 480 seconds]

14:26 <pinchartl> and caches require more disk space

14:26 <psykose> yup

14:26 <karolherbst> dj-death: right, but I wanted to look into pch support, because intel_subgroups requires pulling in this other header which adds a significant amount of time to compile times

14:26 <pinchartl> but that could be cheaper than CPU time ? and maybe also less costly from an environmental point of view ? I don't have visibility on that

14:26 <daniels> the time-honoured cycle is that people start with shitty but small CI ('I don't need these templates because I don't understand them'), then they have shitty and big CI so we go talk to them, then they fix it to have big but relatively efficient CI

14:36 <svuorela> KDE iirc runs their/our gitlab on a couple of hetzner machines. I'm not sure fdo is bigger... though have weirder test setups for at least mesa.

14:44 <bentiss> svuorela: we are hosting a few kernel repos, and they are big... I wish we could run on just a couple of machines

14:55 kj2 has joined #freedesktop

15:04 <karolherbst> oh wow... using pch 1s -> 0.1s for the libintel shaders

15:04 <karolherbst> just need to figure out how the details all work there...

15:05 <karolherbst> it e.g. complains if the specified CL version differs

15:07 <karolherbst> let's see what it does in regards to the ray tracing ones

15:37 <Consolatis> I wonder how much of that 56TB egress traffic is from CI and by how much caching on the runner side + binary diffs could lower the amount (assuming most of the CI traffic is container images for the runners)

15:41 <mupuf> Consolatis: I would think most of the egress is build artifacts

15:41 <mupuf> The GitLab runner does not deduplicate their downloads :s

15:41 <bentiss> container images are already cached on the runners. The only info I can get is that we pulled almost 5TB from indico+ci-stats+s3.fd.o (s3.fd.o being the public facing s3)

15:42 <bentiss> so we get ~50TB of build artifacts, git pulling and AI scrapers

15:43 <bentiss> and registry.fd.o and all the various pages website we host\

15:44 <Ford_Prefect> Do we have any sort of outreach happening to get more infra etc.? Anything users can be helping amplify?

15:52 todi1 has quit []

15:52 todi has joined #freedesktop

16:04 ximion has quit [Remote host closed the connection]

16:43 soreau has joined #freedesktop

16:44 <soreau> I saw the banner on gitlab about the server provider.. migration doesn't sound like fun

16:45 <soreau> It seemed like fd.o gitlab was coming together pretty well on its current space

16:46 <soreau> It would be nice if I can do something to help but I can only offer thanks to everyone who helps maintain the fd.o space and hoping the transition will somehow change things for the better

16:49 <bentiss> soreau: thanks for the kind words :)

16:52 <soreau> 👍

16:55 jsa1 has quit [Ping timeout: 480 seconds]

17:10 haaninjo has joined #freedesktop

17:10 <daniels> karolherbst: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33287#note_2757025 might be of interest

17:12 <karolherbst> yeah.. looks good from a quick glance. The driver taking the most time to build seems to be intel, hopefully somebody figures out how to make the genxml stuff faster

17:24 <bentiss> daniels: anything to add/correct on https://gitlab.freedesktop.org/freedesktop/freedesktop/-/issues/2011 ?

17:28 <bentiss> sigh... I forgot about cloudflare/fastly

17:31 <jenatali> bentiss: Out of curiosity do you have any numbers for what replacements might end up costing? I... doubt that I can convince folks to pony up finances but I wouldn't have any chance to even try if there's no budget ask

17:34 <bentiss> jenatali: yesterday Arek made a rough guesstimate, and by being overly generous he got between 10k to 15k a month

17:35 AbleBacon has joined #freedesktop

17:35 <bentiss> I think we can cut this down by a fair bit, but we are talking roughly 100K a year, maybe less, maybe more

17:35 <jenatali> 👍

17:35 <bentiss> (that's another emoji I can not see ;-) )

17:37 <daniels> bentiss: nope, it's super thorough

17:37 <bentiss> \o/

17:38 <daniels> thankyou for teh writeup :)

17:38 <bentiss> copy/paste from the previous email mostly :)

17:38 <daniels> Ford_Prefect: the writeup there is probably the start of the outreach. I think a couple of people pinged a couple of people but it didn't get much anywhere. a lot more people are now pinging a lot more people and trying to figure out who would be able to contribute how much

17:39 <daniels> Ford_Prefect: is Asymptotic feeling flush? :)

17:40 * bentiss hangs up for today

17:50 tzimmermann has quit [Quit: Leaving]

18:00 <Ford_Prefect> daniels: haha, if only!

18:09 swatish2 has joined #freedesktop

18:16 alanc has quit [Remote host closed the connection]

18:16 alanc has joined #freedesktop

19:03 random_james has quit [Remote host closed the connection]

19:04 random_james has joined #freedesktop

19:05 jsa1 has joined #freedesktop

19:08 swatish2 has quit [Ping timeout: 480 seconds]

19:11 <mupuf> bentiss: "We don't owe Equinix anything" --> did you mean that Equinix doesn't owe us anything?

19:12 <bentiss> mupuf: maybe... It's too late for me to think straight :)

19:12 <mupuf> ha ha

19:13 <bentiss> feel free to correct these kind of mistakes

19:13 <mupuf> you meant to say that equinix did not have to give us this service, right?

19:13 <mupuf> if so, I'll fix it

19:14 <bentiss> yeah, thanks

19:15 <bentiss> I also hesitated to put something like "any trolling would be moderated and/or start ban actions", but we are a good community, right?

19:35 <mupuf> hehe

19:35 <karolherbst> bentiss: there was an interesting idea to mess with bots I saw today: have a URL disallowed by bots in robots.txt and just ban anything accessing that url

19:36 <vyivel> i would definitely access that url out of curiosity

19:36 <bentiss> karolherbst: good idea, but those bots don't even bother with robots.txt, so how are they going to pull that URL in the first place?

19:37 <karolherbst> some random meta html tag

19:37 <karolherbst> vyivel: yeah, but we could also add comments to not do that :D

19:38 <pinchartl> "do *NOT* press the big red button"

19:40 <karolherbst> there is also things like this: https://zadzmo.org/code/nepenthes/

19:40 <karolherbst> *are

19:40 <pinchartl> and iocaine

19:40 <karolherbst> but yeha...

19:41 <karolherbst> those AI crawlers are a pain

19:41 <karolherbst> would be fun to try out a few things post migration

19:42 <karolherbst> also once we have better monitoring of those things

19:43 <pinchartl> do we have any leverage on (some of) those companies through developers who work there ?

19:45 karolherbst4 has joined #freedesktop

19:46 karolherbst has quit [Read error: Connection reset by peer]

19:48 <pinchartl> I suppose openai wouldn't care if they can't access fd.o

19:48 <pinchartl> and all the chinese AI companies wouldn't either

19:56 karolherbst4 has quit []

19:56 karolherbst has joined #freedesktop

20:14 jsa1 has quit [Ping timeout: 480 seconds]

20:49 <DavidHeidelberg> does cleaning up forks container registry useful or is 0.00% impact?

20:50 <DavidHeidelberg> (for the upcoming FDO migration)

20:53 <ivyl> bentiss: another CDN we can reach out to is bunny.net. I've heard some good things about them. What would be good time to contact them? We can possibly even budget something like that.

21:06 <pinchartl> speaking of container registry, is there a way to automatically delete older container images ? when I modify the libcamera CI in a way that triggers a container rebuild, I don't need to keep the old container images forever. I try to delete them manually when I remember

21:06 <pinchartl> (it won't save much, as we rarely update containers these days, but still)

21:07 <karolherbst> now that I *checks notes* eliminated 5 seconds from the mesa compile time, I'm sure this solves all our mesa CI overload issues

21:13 <ivyl> pinchartl: https://docs.gitlab.com/ee/user/packages/container_registry/reduce_container_registry_storage.html#cleanup-policy

21:18 <pinchartl> ivyl: thanks

21:18 <ivyl> I've been using it on some other gitlab instance. Sadly it looks like it may be disabled on fdo: `This project's cleanup policy for tags is not enabled. Please contact your administrator.`

21:18 <DavidHeidelberg> yup

21:19 <pinchartl> it could be useful to link to that from the FDO_EXPIRES_AFTER documentation. I initially thought FDO_EXPIRES_AFTER was about cleaning up old images

21:19 <pinchartl> ah, if it's not available, then it's not an option :-)

21:20 <ivyl> Maybe it can be enabled? It's optional per project anyway.

21:35 mvlad has quit [Remote host closed the connection]

21:38 ximion has joined #freedesktop

21:43 jsa1 has joined #freedesktop

22:31 Kayden has quit [Quit: upgrade distro]

22:57 jsa1 has quit [Ping timeout: 480 seconds]

23:13 <DragoonAethis> bentiss: another option that comes to mind - purchase the hardware and run it at a colo dc

23:13 <DragoonAethis> Extreme one-time cost, pretty chill afterwards

23:16 <robclark> isn't that more or less what we had pre-gitlab ;-)

23:19 MrBonkers has quit [Quit: Ping timeout (120 seconds)]

23:19 MrBonkers has joined #freedesktop

23:24 haaninjo has quit [Quit: Ex-Chat]

23:32 <__tim> kemper might be out of disk space

23:32 <__tim> remote: remote: fatal: unable to write loose object file: No space left on device

23:32 <__tim> remote: hint: The 'hooks/post-update' hook was ignored because it's not set as executable.

23:32 <__tim> remote: error: remote unpack failed: unpack-objects abnormal exit

23:32 <__tim> remote: hint: You can disable this warning with `git config advice.ignoredHook false`.

23:32 <__tim> remote: To ssh://kemper.freedesktop.org/git/gstreamer/gstreamer

23:32 <__tim> remote: ! [remote rejected] 1.24.12 -> 1.24.12 (unpacker error)

23:32 <__tim> remote: error: failed to push some refs to 'ssh://kemper.freedesktop.org/git/gstreamer/gstreamer'

23:33 <__tim> sorry, that should've been two lines only

23:37 donte has joined #freedesktop

23:37 <donte> https://www.donte.net/

23:37 donte has left #freedesktop [#freedesktop]

23:41 sima has quit [Ping timeout: 480 seconds]