<MrCooper> bwidawsk: if you need anything beyond just installing a fixed set of packages, my advice would be to handle everything in the FDO_DISTRIBUTION_EXEC script and don't bother with FDO_DISTRIBUTION_PACKAGES
<MrCooper> bwidawsk: the templates automatically do the autoremove purge dance at the end now, that could be removed from westong
<MrCooper> (ugh, I was reaching for backspace, but hit enter instead :)
<daniels> MrCooper: hm? it’s removing ephemeral packages that we use only in the container build stage
<daniels> same as Mesa
<MrCooper> ah right, templates just do apt-get autoremove
<mupuf> Venemo: works for me. Does it for you?
<Venemo> mupuf: I dunno, I just noticed this problem on an overnight run
<Venemo> I reassigned to the bot again
<mupuf> Venemo: right. This is a temporary failure, so yeah, annoying
<Venemo> could there be some different handling of these kinds of failures vs. when the tests really fail?
<daniels> not really; we've already stuffed in as many retries as we can
<Venemo> I mean instead of "CI failed" I'd like to see a different message
<Venemo> currently "CI failed" is 50%-50% split between: "CI crapped itself because of a networking error so it didn't even run any tests" vs. "CI actually ran the tests but some tests failed because I wrote crap code"
<daniels> you mean a different message from marge?
<Venemo> yes, that's exactly what I mean
<daniels> being worked on
<Venemo> "CI was unable to run properly" vs. "CI ran tests successfully but some of them failed" or something along those lines
<MrCooper> how would Marge be able to tell the difference? (Or is the plan to replace her with ChatGPT? ;)
<Venemo> same way I can?
<Venemo> I don't understand the question
<MrCooper> are you volunteering then? ;)
<Venemo> for what?
<MrCooper> not sure what difference it would make anyway, the MR can only be merged if CI passes
<Venemo> in case I wrote crap code, I need to investigate it and fix it. otherwise I just reassign to marge
<daniels> ManMower: the CI daily reports already do that kind of classification; they break down whether failures are due to infrastructure / hung machine / timeout / failed tests. we've been working on cleaning up and generalising that code, combining it with some of the ci-uprev and ci_run_n_monitor so we can reuse the same stuff across the board and make marge able to tell you more without having to go read the log.
<daniels> in the meantime you have to click through and read
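The classification daniels describes (infrastructure / hung machine / timeout / failed tests) could be sketched roughly as below. This is only an illustration, not the code behind the CI daily reports or Marge: the `FailureKind` enum, the `classify` and `summarize` helpers, and every log pattern are hypothetical and would need to be matched against real job logs. The message wording follows Venemo's suggestion above.

```python
import re
from enum import Enum


class FailureKind(Enum):
    INFRASTRUCTURE = "infrastructure"
    HUNG_MACHINE = "hung machine"
    TIMEOUT = "timeout"
    FAILED_TESTS = "failed tests"
    UNKNOWN = "unknown"


# Hypothetical patterns, checked in order; the first match wins.
# Real job logs will need different (and many more) expressions.
PATTERNS = [
    (FailureKind.INFRASTRUCTURE,
     re.compile(r"(http|error)[^\n]*\b50[234]\b|connection (refused|reset)", re.I)),
    (FailureKind.HUNG_MACHINE,
     re.compile(r"no output for \d+ (seconds|minutes)", re.I)),
    (FailureKind.TIMEOUT,
     re.compile(r"execution took longer than|timed out", re.I)),
    (FailureKind.FAILED_TESTS,
     re.compile(r"\b\d+ (tests? )?failed\b|unexpected results", re.I)),
]


def classify(log_text: str) -> FailureKind:
    """Return the first failure category whose pattern appears in the log."""
    for kind, pattern in PATTERNS:
        if pattern.search(log_text):
            return kind
    return FailureKind.UNKNOWN


def summarize(log_text: str) -> str:
    """Turn a job log into a one-line message a merge bot could post."""
    kind = classify(log_text)
    if kind is FailureKind.FAILED_TESTS:
        return "CI ran the tests but some of them failed"
    if kind is FailureKind.UNKNOWN:
        return "CI failed (cause not recognised; please read the log)"
    return f"CI was unable to run properly ({kind.value})"


if __name__ == "__main__":
    print(summarize("curl: (22) The requested URL returned error: 503"))
    print(summarize("3 tests failed"))
```

Checking infrastructure patterns before test failures is a deliberate choice in this sketch: a job that never got as far as running tests should not be reported as a test failure.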
<Venemo> usually what I do is just always reassign to marge and if it fails 2-3 times then I look at the actual logs
<Venemo> it's really tedious to find out anything from those logs, so I want to avoid that
<MrCooper> that's wasting infrastructure resources
<Venemo> yes, that's why I made the suggestion above
<daniels> yeah, don't do that
<Venemo> if it were clear what happened, I wouldn't
<daniels> the 503 seems pretty clear to me
<daniels> we've been working on cleaning up the logs as well to make them more immediately valuable
<daniels> but if you're just smashing resubmit 3 times before you even try to look, it's hard to take complaints about the merge queue being too long seriously
<Venemo> it's not just me, everyone does this
<daniels> I wouldn't say everyone tbh, but yeah, there certainly are some
<Venemo> I'm exaggerating, of course
<daniels> I've watched people smash MRs at Marge like 5 times in a row before I had to tell them that their code didn't even compile so could they please stop
<daniels> it's hard to know what to do when someone cares that little
<Venemo> about 90% of the time, I already test our stuff in the valve infra, so I rarely assign anything to marge that I didn't already test
<Venemo> it is then very frustrating to see a failure from something that just passed 5 minutes ago
<Venemo> also most of the time even before going to the valve infra I run tests locally as I see reasonable
<daniels> I don't know what else to tell you. people are working hard to fix it, but when the average level of care is 'I can't even be bothered reading the logs so I just make it try again', then it's not exactly easy
<Venemo> no that's not what I'm saying
<Venemo> I'm saying that after all that effort spent testing the thing BEFORE submitting it to the CI, IMO it's more likely to fail due to a network error than be an actual test failure
<Venemo> also, I appreciate the work of everyone involved
<daniels> by far the most common reason for things to fail is flakes, which is exacerbated by people just smashing retry without reading logs or doing anything about any observed failure
<daniels> hence why we've been working on automated systems to find and categorise flakes and just update expectations automatically
<Venemo> I don't doubt that but I personally rarely see flakes these days
<Venemo> maybe we could allow the CI jobs to return a custom error string that indicates why the job failed? then each job would be able to give some useful info
<daniels> yeah, that’s exactly what we’ve been working on
<daniels> but flakes are the biggest issue (and we also have other priorities) so we’ve been working on that first
<Venemo> all right then that's great news
<Venemo> thanks for doing this
<daniels> np!
<bwidawsk> MrCooper: thanks for the suggestion. I'm looking now at doing multiple builds for different feature sets in a Rust crate. Using cargo I think it makes sense to do these all as a single build target so cargo doesn't have to do a full build for different feature sets. Curious if you have advice there.
<bwidawsk> (it would be preferable for me to run them as different jobs, but I believe that's wasteful)
<bwidawsk> meh, I think I'll leave well enough alone until it becomes a problem
<bwidawsk> the runner seems to eat output from any scripts invoked by EXEC
<bwidawsk> Am I doing something wrong?
<bwidawsk> So it'd be pretty nice if I could have FDO_DISTRIBUTION_CARGO_PACKAGES or some such
<bwidawsk> then I don't have to reinstall a bunch of crates every run
<bwidawsk> For whatever reason I can't seem to get them to be permanently there using a script in EXEC
<bwidawsk> oh, I needed to bump the tag
<bwidawsk> ignore me
<pobrn> Hi, sorry for the banal question... but has the font changed on I could swear it looked different a couple days(?) ago, but maybe something on my end has changed (or I am just imagining it...).
<vyivel> pobrn: yes, gitlab was updated and the new version has different fonts
<pobrn> Thanks for the reply. Interesting, but I don't think that affects me as of yet, in my case Firefox simply started choosing Noto Sans instead of Segoe UI.
