#freedesktop on 2024-04-15 — irc logs at oftc.irclog.whitequark.org

2023-09-08 23:49 daniels changed the topic of #freedesktop to: https://www.freedesktop.org infrastructure and online services || for questions about freedesktop.org projects, please see each project's contact || for discussions about specifications, please use https://gitlab.freedesktop.org/xdg or xdg@lists.freedesktop.org

00:11 zxrom_ has quit []

00:20 lsd|2 has quit [Quit: KVIrc 5.2.2 Quasar http://www.kvirc.net/]

01:31 thaytan has quit [Ping timeout: 480 seconds]

01:33 scrumplex has joined #freedesktop

01:40 scrumplex_ has quit [Ping timeout: 480 seconds]

02:15 ghishadow has quit [Ping timeout: 480 seconds]

02:19 jarthur has joined #freedesktop

02:19 jarthur has quit []

03:04 mrpops2ko has quit []

03:13 mrpops2ko has joined #freedesktop

04:07 bmodem has joined #freedesktop

04:09 ximion1 has joined #freedesktop

04:13 ximion has quit [Ping timeout: 480 seconds]

04:20 ximion1 has quit [Quit: Detached from the Matrix]

04:30 ghishadow has joined #freedesktop

04:52 bmodem has quit [Ping timeout: 480 seconds]

05:11 bmodem has joined #freedesktop

05:49 thaytan has joined #freedesktop

05:54 cisco87_ is now known as cisco87

06:05 emusia has quit []

06:20 Zeroine_ has joined #freedesktop

06:25 Zeroine has quit [Ping timeout: 480 seconds]

06:50 mripard has joined #freedesktop

06:54 tzimmermann has joined #freedesktop

06:54 <bentiss> DavidHeidelberg: yeah, s3 is fine. The only question is how to get your file there. If you can upload it through a job, then that's easy enough. If you need to manually upload it, we might need to create a special bucket for manual uploads. But we'll have to get a JWT from gitlab for that...

07:04 Haaninjo has joined #freedesktop

07:22 <mupuf> bentiss: I would like to add a new test container in mesa, for arm64 testing. This would be used by both imagination and freedreno. Looking at the current code, it seems like we are storing the arm64 rootfs in two places: S3 (for lava), and in a container (bundled with other stuff and extra dependencies) in baremetal. Ideally, I feel like we should have one container that contains all the userspace stuff (stored as zstd:chunked to reduce

07:22 <mupuf> bandwidth when making new versions) which could be used directly by gitlab runners / CI-tron, then have the lava/baremetal jobs create the rootfs they need at run time by extracting the container to NFS and downloading/extracting the kernel they want to use just like they currently do, except by using skopeo/podman rather than wget. Do you have any thoughts on this? Is storage/bandwidth cost the same

07:22 <mupuf> between S3 and the container registry?

07:24 zxrom has joined #freedesktop

07:39 <bentiss> mupuf: no real thoughts on this. storage/bandwidth cost the same between s3 and container registry as it's the same backend (one is serving files directly, the other has a container registry in front of an internal s3)

07:39 <mupuf> good, this matches my expectations

07:40 <bentiss> it just feels like more work to pull the container and extract it, when direct s3 access can be cached by a local proxy

07:40 <mupuf> the container can be cached by a local proxy just as well, no?

07:40 <bentiss> yeah but you need to extract it every time

07:41 <bentiss> from an admin pov, I have more visibility on the s3 filesystem when used directly as the paths are encoded directly (no hash checksum in the paths)

07:41 <bentiss> so if we run out of storage, I can more easily pinpoint who is responsible
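
The accounting argument above (owner-readable paths in the bucket) can be illustrated with a minimal sketch. It assumes a listing of (path, size) pairs is already available; the function name, the `depth` parameter and the example paths are hypothetical and not taken from the freedesktop.org setup.

```python
from collections import Counter


def usage_by_owner(objects, depth=2):
    """Sum object sizes per leading path prefix, largest consumer first.

    objects is an iterable of (path, size_in_bytes) pairs; depth is the
    number of leading path components treated as the owner (an assumption).
    """
    totals = Counter()
    for path, size in objects:
        owner = "/".join(path.strip("/").split("/")[:depth])
        totals[owner] += size
    return totals.most_common()


# Hypothetical listing, for illustration only.
listing = [
    ("mesa/mesa/rootfs/a.tar", 100),
    ("mesa/mesa/b", 50),
    ("drm/kernel/c", 120),
]
report = usage_by_owner(listing)
```

With content-addressed registry blobs the path carries no owner, which is why the same grouping is not possible there without extra metadata.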

07:41 <mupuf> oh, actually, this is what happens with the current way (the rootfs is downloaded and extracted every time) whereas the container would be extracted on the first download and just copied to NFS

07:42 <bentiss> when using the registry, between everybody copying the image, we have more chances of having stale intermediate images and no idea whether we can delete those

07:42 nwm has joined #freedesktop

07:43 <mupuf> The container way should be faster than the current solution, on top of reducing bandwidth needs when new versions get downloaded thanks to zstd:chunked

07:43 <mupuf> oh, right, accountability is indeed an issue

07:43 <bentiss> long story short: I won't prevent you from having one common container for everything if that's simpler

07:44 <mupuf> (not that we don't already have the problem with all the other containers we build for mesa)

07:44 <bentiss> for accountability, we should have a script that goes over the entire container registry, looks at the labels and purges the out-of-date images
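
The selection step of such a purge script might look like the sketch below. It only decides which images are out of date from their labels; listing and deleting images through the registry is left out. The label names `created` and `expires-after-days` are assumptions, since the log does not say which labels the images carry.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical label names, chosen for illustration only.
CREATED_LABEL = "created"
EXPIRES_AFTER_LABEL = "expires-after-days"


def is_out_of_date(labels, now):
    """Return True if the labels say the image has outlived its retention period."""
    created = labels.get(CREATED_LABEL)
    days = labels.get(EXPIRES_AFTER_LABEL)
    if created is None or days is None:
        # Unlabelled images are kept: there is no way to tell if they are safe to delete.
        return False
    created_at = datetime.fromisoformat(created)
    return now - created_at > timedelta(days=int(days))


def select_for_purge(images, now):
    """images maps an image reference to its label dict; returns references to delete."""
    return sorted(ref for ref, labels in images.items() if is_out_of_date(labels, now))


# Hypothetical registry contents.
images = {
    "a": {"created": "2024-01-01T00:00:00+00:00", "expires-after-days": "30"},
    "b": {"created": "2024-04-01T00:00:00+00:00", "expires-after-days": "30"},
    "c": {},
}
to_delete = select_for_purge(images, datetime(2024, 4, 15, tzinfo=timezone.utc))
```

Keeping unlabelled images is a deliberately conservative choice in this sketch; a real script would need a policy for them.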

07:45 <bentiss> mupuf: I'm not saying we don't have the problem, just that this part is easier to see

07:45 <mupuf> yeah, got you :)

07:46 <mupuf> alright, so containers win for local reproducibility of the CI environment and bandwidth reduction but they lose on the ease of accountability of storage

07:46 <bentiss> sounds like a good summary, yeah

07:47 <mupuf> Ack, got it!

07:47 <mupuf> thanks a lot :)

07:47 <bentiss> no worries (not sure I helped much TBH)

07:56 lynxeye has joined #freedesktop

07:59 <mupuf> bentiss: You confirmed that both end up hitting the same place, and you shared your accounting concerns

07:59 <mupuf> that's all I needed

08:08 ungeskriptet has quit [Quit: The Lounge - https://thelounge.chat]

08:09 ungeskriptet has joined #freedesktop

08:10 ungeskriptet has quit []

08:14 ximion has joined #freedesktop

08:14 ximion has quit []

08:17 <daniels> mupuf: well, if you rewrote LAVA and bare-metal to consume containers rather than a tarball URL, then fine

08:17 <daniels> but short of that, we'd have to have something which would pull the container, tar it up, store it somewhere and access it, which is ... exactly what we do now?

08:17 <daniels> so I'm not really sure what problem that's solving atm

08:26 <mupuf> daniels: yeah, to really reduce duplication, we would have to rewrite both

08:27 <mupuf> What this would solve is imagination and freedreno being able to run tests on arm64

08:28 <mupuf> We consume containers; replicating what baremetal does is just making the problem worse

08:29 <mupuf> Podman has a rootfs mode, but using it would be very inefficient

08:29 <daniels> img and fdno are already running tests on arm64 though?

08:29 <daniels> or do you mean enabling those on ci-tron

08:29 <mupuf> Yeah, new jobs using ci tron

08:31 <daniels> yeah I mean, short of rewriting LAVA/bare-metal to both look exactly like ci-tron does (or completely breaking all the existing jobs, or making them much slower than they already are), the best thing for now is just to replicate what all the other jobs do

08:32 <daniels> we'll never get to the point where DUTs are directly consuming containers because most of them are way too slow (either CPU, I/O, or both - plus limited RAM) to take the hit of unpacking a container image during startup

08:33 bmodem has quit [Ping timeout: 480 seconds]

08:33 <daniels> there could be some kind of local cache which takes a container image, flattens it out to a single tarball, and uploads it to some kind of storage service which can be used by the ultimate NFS host, but again that's already what we have
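
As a rough illustration of what "flattening" a container image involves, the sketch below merges layers in order and honours OCI whiteout markers (`.wh.<name>` deletes an entry, `.wh..wh..opq` hides a directory's lower-layer contents). Modelling each layer as a dict of relative path to content is a simplification made here for brevity; real layers are tar archives, and this is not the code used by any of the CI systems discussed.

```python
import posixpath

WHITEOUT_PREFIX = ".wh."
OPAQUE_MARKER = ".wh..wh..opq"


def flatten_layers(layers):
    """Merge layers (oldest first), each a dict of path -> content, into one rootfs dict."""
    rootfs = {}
    for layer in layers:
        # Apply deletions first so files added by the same layer survive.
        for path in layer:
            parent, name = posixpath.split(path)
            if name == OPAQUE_MARKER:
                # Opaque directory: hide everything lower layers put under it.
                for old in [p for p in rootfs if p.startswith(parent + "/")]:
                    del rootfs[old]
            elif name.startswith(WHITEOUT_PREFIX):
                victim = posixpath.join(parent, name[len(WHITEOUT_PREFIX):])
                doomed = [p for p in rootfs if p == victim or p.startswith(victim + "/")]
                for old in doomed:
                    del rootfs[old]
        # Then add the layer's real entries, skipping the marker files themselves.
        for path, content in layer.items():
            if not posixpath.basename(path).startswith(WHITEOUT_PREFIX):
                rootfs[path] = content
    return rootfs


# Hypothetical two-layer image.
base = {"etc/a": b"1", "usr/bin/x": b"2", "tmp/old": b"3"}
update = {"etc/.wh.a": b"", "usr/bin/y": b"4", "tmp/.wh..wh..opq": b"", "tmp/new": b"5"}
flat = flatten_layers([base, update])
```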

08:33 mvlad has joined #freedesktop

08:34 <daniels> (we're not hitting s3 every time because that would obviously be insane - we have multi-tiered local caching proxies and monitoring on hit rates etc - so s3 is primarily just there for better remote visibility)
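
The idea of tiered caching with hit-rate monitoring can be sketched as follows. This is a toy in-memory model, assuming each tier is a simple key-value store and the origin (standing in for s3) is a callable; the class and attribute names are hypothetical and do not describe the actual proxy setup.

```python
class TieredCache:
    """Look a key up in each tier in order, falling back to the origin on a full miss."""

    def __init__(self, tier_count, fetch_from_origin):
        self.tiers = [{} for _ in range(tier_count)]
        self.hits = [0] * tier_count  # per-tier hit counters
        self.origin_fetches = 0
        self.fetch_from_origin = fetch_from_origin

    def get(self, key):
        for index, tier in enumerate(self.tiers):
            if key in tier:
                self.hits[index] += 1
                value = tier[key]
                break
        else:
            # No tier had the key: go to the origin.
            index = len(self.tiers)
            self.origin_fetches += 1
            value = self.fetch_from_origin(key)
        # Populate every tier closer to the client than the one that answered.
        for tier in self.tiers[:index]:
            tier[key] = value
        return value

    def hit_rate(self):
        total = sum(self.hits) + self.origin_fetches
        return sum(self.hits) / total if total else 0.0


origin_calls = []
cache = TieredCache(2, lambda key: origin_calls.append(key) or key.upper())
first = cache.get("rootfs")   # full miss, fetched from the origin
second = cache.get("rootfs")  # served by the nearest tier
```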

08:34 bmodem has joined #freedesktop

08:37 <mupuf> I'm confused because what you guys are doing is just a cruder version of a container runtime, and we could replace that to do the exact same thing without changing anything for the DUTs

08:38 <mupuf> I'll give it a try, maybe I've missed something

08:39 <daniels> well sure, whatever you need to do to provide LAVA a URL to a single flat tarball which contains the rootfs

08:39 <daniels> I'm not really sure what the difference would be with what you're proposing but happy to review MRs

08:40 <daniels> (there's no container runtime on the DUTs obviously, they just boot a single rootfs over NFS)

08:40 <mupuf> Yep

08:40 <mupuf> Lava is the one extracting the rootfs, right?

08:41 <mupuf> Makes sense for its design

08:42 <mupuf> So the best we could do is change baremetal not to package the rootfs in its image and instead download it from the registry

08:42 <mupuf> So the rootfs job would push both a tarball AND a container

08:42 <daniels> that sounds like a nice cleanup, yeah

08:43 <mupuf> I looked into creating a custom manifest that would point to the same layer... but it doesn't seem possible

08:43 <mupuf> Maybe we could get an http link to the layer for lava to consume though

08:43 <mupuf> Will investigate

08:43 <daniels> indeed LAVA extracts the rootfs - the job you submit has a URL to a tarball, then when it's scheduled the job to a given DUT, the local worker attached to that DUT pulls the tarball and unpacks to the NFS root path
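
The unpack step described above can be sketched roughly as below. This is not LAVA's code: the function name and paths are hypothetical, the download of the tarball from the job's URL is omitted, and the path check is an extra safety measure added for the sketch.

```python
import os
import tarfile


def unpack_rootfs(tarball_path, nfs_root):
    """Unpack a flat rootfs tarball into nfs_root, refusing entries that escape it."""
    root = os.path.realpath(nfs_root)
    os.makedirs(root, exist_ok=True)
    with tarfile.open(tarball_path) as tar:
        # Check every member before extracting anything.
        for member in tar.getmembers():
            target = os.path.realpath(os.path.join(root, member.name))
            if os.path.commonpath([root, target]) != root:
                raise ValueError(f"unsafe path in tarball: {member.name}")
        tar.extractall(root)
    return root
```

A worker would call something like `unpack_rootfs("/tmp/rootfs.tar", "/srv/nfs/dut-1")` (both paths made up here) before booting the DUT against that directory.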

08:43 <mupuf> Ack

08:44 <daniels> so yeah, LAVA could be modified to unpack a container image to that path instead

08:44 <mupuf> Thanks!

08:44 <daniels> it just hasn't been the low-hanging fruit to date

08:44 <mupuf> Right, and I doubt I'll be hacking on it

08:45 <mupuf> But baremetal seems more ripe for improvements

08:45 <mupuf> Thanks

08:46 <daniels> np

08:47 blatant has joined #freedesktop

09:05 Juest has quit [Ping timeout: 480 seconds]

09:05 zxrom has quit []

09:06 Juest has joined #freedesktop

09:11 blatant has quit [Quit: WeeChat 4.2.2]

09:13 <martink> good day! would a kind soul lend a hand and send a maintenance MR to mergebot's queue: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28737

09:16 <daniels> martink: sure, done

09:17 <martink> daniels: thanks!

09:18 <daniels> np

09:26 ungeskriptet has joined #freedesktop

09:27 mrpops2ko has quit []

09:27 mrpops2ko has joined #freedesktop

09:47 zxrom has joined #freedesktop

10:43 navarre has joined #freedesktop

11:08 MrCooper has quit [Ping timeout: 480 seconds]

11:24 vkareh has joined #freedesktop

12:02 sentriz has quit [Read error: Connection reset by peer]

12:11 kxkamil has joined #freedesktop

12:13 sentriz has joined #freedesktop

12:30 bmodem has quit [Ping timeout: 480 seconds]

12:38 navarre has quit []

13:08 MrCooper has joined #freedesktop

14:22 Arsen has quit [Quit: Quit.]

14:24 bmodem has joined #freedesktop

14:32 bmodem has quit [Ping timeout: 480 seconds]

14:47 mvlad has quit [Remote host closed the connection]

14:56 Arsen has joined #freedesktop

15:18 Arsen has quit [Quit: Quit.]

15:18 Arsen has joined #freedesktop

16:02 f_ has joined #freedesktop

17:00 lynxeye has quit [Quit: Leaving.]

17:50 f_ has quit [Ping timeout: 480 seconds]

18:14 tzimmermann has quit [Quit: Leaving]

18:18 f_ has joined #freedesktop

18:23 ximion has joined #freedesktop

18:39 f_ has quit [Ping timeout: 480 seconds]

19:55 vkareh has quit [Quit: WeeChat 4.2.1]

20:50 AbleBacon has joined #freedesktop

21:07 karolherbst has joined #freedesktop

22:07 Haaninjo has quit [Ping timeout: 480 seconds]

22:12 Haaninjo has joined #freedesktop

22:38 thaytan has quit [Ping timeout: 480 seconds]

23:06 lsd|2 has joined #freedesktop

23:14 Haaninjo has quit [Quit: Ex-Chat]

23:50 alanc has quit [Remote host closed the connection]

23:51 alanc has joined #freedesktop