daniels changed the topic of #freedesktop to: https://www.freedesktop.org infrastructure and online services || for questions about freedesktop.org projects, please see each project's contact || for discussions about specifications, please use https://gitlab.freedesktop.org/xdg or xdg@lists.freedesktop.org
i509vcb has quit [Quit: Connection closed for inactivity]
AbleBacon has quit [Read error: Connection reset by peer]
bmodem has quit [Ping timeout: 480 seconds]
dos1 has quit [Ping timeout: 480 seconds]
dos1 has joined #freedesktop
bmodem has joined #freedesktop
bluca has joined #freedesktop
<bluca>
hi - we have a contractor working on revamping the systemd documentation, and we need a redirect for https://www.freedesktop.org/software/systemd/man/ set up to be able to make progress, would greatly appreciate if any fdo.org admin had some time to help us out - thanks in advance!
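For readers unfamiliar with the request: a redirect like this is typically a one-line change in the web server configuration. Purely as an illustration (the destination URL and the server software fdo.org actually runs are not stated here, so treat both as placeholders), an Apache-style rule could look like:

    # Illustrative only: redirect the old systemd man page tree to a
    # hypothetical new documentation host, preserving the page path.
    RedirectMatch 301 ^/software/systemd/man/(.*)$ https://docs.example.org/systemd/man/$1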
<daniels>
alanc: because debian:stable is a newer version of Debian than it was two years ago, and that requires a newer version of ci-templates since they changed some stuff
<daniels>
alanc: you can use debian:woody or whatever the actual release codename was (or numeric versions), which pins it in a more predictable way
<MrCooper>
haha, woody, nice in-joke there :)
<MrCooper>
woody can legally drink booze in the US
<daniels>
I figured I might as well give a codename which wasn't available in container form so it was obvious that it'd have to change :P
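To make the pinning advice concrete, here is a rough sketch of what it might look like in a .gitlab-ci.yml that uses freedesktop's ci-templates; the codename, tag, and ref values are purely illustrative, and FDO_DISTRIBUTION_VERSION / FDO_DISTRIBUTION_TAG are the ci-templates variables as I understand them, not something quoted from this channel:

    # Sketch: pin a concrete Debian release rather than tracking debian:stable,
    # so the CI image only changes when the version is bumped on purpose.
    include:
      - project: 'freedesktop/ci-templates'
        ref: 'master'                          # in practice, pin a specific commit
        file: '/templates/debian.yml'

    variables:
      FDO_DISTRIBUTION_VERSION: 'bookworm'     # or a numeric version such as '12'
      FDO_DISTRIBUTION_TAG: '2023-07-01.0'     # bump to force an image rebuild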
shashanks has quit [Remote host closed the connection]
<karolherbst>
mupuf: kinda looks like the valve navi21 runners are having trouble keeping up with the workload
<mupuf>
karolherbst: let me check if anyone is using them too much
<mupuf>
karolherbst: pre-merge testing gone wrong. We need to lower our timeouts
<robclark>
zmike: anholt mentioned that the disk on the servo NUC (the one controlling the a630 runners) was full.. possibly container images not getting cleaned up? I'll look when I get to the office in a bit (was out yesterday)
<eric_engestrom>
zmike: ci_run_n_monitor has `--force-manual`, does it solve your issue?
<DavidHeidelberg>
eric_engestrom: nope, I see this problem myself
<eric_engestrom>
ok
<eric_engestrom>
can you write it down in an issue?
<eric_engestrom>
I don't have the spare cycles to think about it much right now, but that's the kind of thing I want to see working so I'll look into it when I can
<DavidHeidelberg>
I'll try to look into it today or tomorrow, but cannot promise 100%
<zmike>
it used to work until this past week or so
<robclark>
daniels: for reasons, I had to boot init=/bin/bash so I don't really have network access
<DavidHeidelberg>
eric_engestrom: does it help you? I had force-manual on all the time and jobs got canceled anyway due to some issue (the regression from 1-2 weeks ago)
<robclark>
daniels: also, where is this script?
<eric_engestrom>
DavidHeidelberg: not tested, no, but from looking at the code this was obviously wrong so it fixes _something_ at least ^^
<eric_engestrom>
can you check whether the script from before koike's MR works?
* eric_engestrom
ended up not having time to review her MR
<DavidHeidelberg>
Helen's changes had no impact on this type of problem; I had it before the merge
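For anyone trying to reproduce the --force-manual behaviour discussed above, a rough invocation of Mesa's ci_run_n_monitor script might look like the following; only --force-manual comes from the discussion, while the script path and the --rev/--target options are assumptions about how the tool is usually driven:

    # Assumed usage: run and watch only the jobs whose names match the target
    # regex for the given revision, starting manual jobs instead of waiting.
    ./bin/ci/ci_run_n_monitor.py \
        --rev HEAD \
        --target 'zink.*' \
        --force-manual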
<psykose>
you see "warning: dropping unsupported crate type `cdylib`" because it's +crt-static
phire has quit [Remote host closed the connection]
<karolherbst>
do we have any plans for the freedreno farm or do we just disable it? It prevents MRs from merging
phire has joined #freedesktop
<bwidawsk>
psykose:
<bwidawsk>
psykose: I was trying to move to CONFIG_HOME/config.toml
<bwidawsk>
I can reproduce the issue with RUSTFLAGS, one moment
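Some context for the Rust thread: the +crt-static flag behind the warning can come either from the RUSTFLAGS environment variable or from a Cargo config file, which is presumably what the config.toml move above refers to. A minimal sketch, assuming the usual config location (bwidawsk's exact setup isn't stated):

    # ~/.cargo/config.toml (or a project-local .cargo/config.toml), sketch only.
    # With the C runtime statically linked, rustc warns that it is "dropping
    # unsupported crate type `cdylib`" for crates that declare a cdylib.
    [build]
    rustflags = ["-C", "target-feature=+crt-static"]

The equivalent for a single build would be RUSTFLAGS="-C target-feature=+crt-static" cargo build.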
<daniels>
karolherbst: robclark is looking at it
<karolherbst>
okay, cool
AnuthaDev has joined #freedesktop
<robclark>
daniels, karolherbst: the servo NUC is rebooting now and things should be ok when it comes back
<karolherbst>
okay :)
<karolherbst>
I've also seen some piglit trace fails, but those also looked a bit ENOSPC-related...
<DavidHeidelberg>
we're aware :)
<robclark>
everything controlled by the servo NUC (i.e. a630 jobs) would have been unhappy
<robclark>
karolherbst, zmike: a630 runners should be back now
<karolherbst>
cool
<karolherbst>
there is one thing I've noticed with all the fails happening. Often the pipeline keeps running even though it failed, due to some jobs still being triggered/running/whatever. Could we make sure that marge just stops the entire pipeline on failures?
<zmike>
that sounds bad to me
<zmike>
then there's no ability to manually click retry on e.g. network failures
<karolherbst>
I mean when marge already moved on to the next MR
<zmike>
ah
<karolherbst>
we can make marge not do it if there is no other MR queued
<karolherbst>
had an a630 job still running even though marge unassigned 58 minutes ago
<daniels>
karolherbst: tldr no
<daniels>
hmm, actually, maybe yes
<daniels>
I think it's fine
<daniels>
the complication I was thinking about is that the GitLab API doesn't expose anything about job retries or whatever, so marge can't really reason about the status of individual jobs and whether or not they're terminal to a pipeline
<karolherbst>
it could be an "I start processing this MR, let me stop all jobs from the previous one" thing
<daniels>
yeah
<daniels>
so I think that's ok once the pipeline goes to failed and marge gives up
<karolherbst>
yeah
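What karolherbst and daniels settle on above corresponds to a single GitLab API call: cancel the previous MR's pipeline when marge moves on. The sketch below only illustrates that API surface and is not how marge-bot is actually implemented; the project and pipeline ids would come from marge's own state:

    # Minimal sketch using the documented GitLab REST endpoint
    # POST /projects/:id/pipelines/:pipeline_id/cancel
    import os
    import requests

    API = "https://gitlab.freedesktop.org/api/v4"
    HEADERS = {"PRIVATE-TOKEN": os.environ["GITLAB_TOKEN"]}

    def cancel_stale_pipeline(project_id: int, pipeline_id: int) -> None:
        """Cancel all running/pending jobs in the given pipeline."""
        r = requests.post(
            f"{API}/projects/{project_id}/pipelines/{pipeline_id}/cancel",
            headers=HEADERS,
            timeout=30,
        )
        r.raise_for_status()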
todi has joined #freedesktop
swatish2 has quit [Read error: Connection reset by peer]
<elibrokeit>
I've been helping out with their build system, and to fix the CI they tried giving me access to the repo, but that isn't granting me pipeline start permissions... unless I push to a branch of the project repo instead of my fork
cleetus_garfield has joined #freedesktop
<mupuf>
elibrokeit: you need to be added to a ci-OK group somewhere
iNKa is now known as Brocker
<alanc>
ideally the person you're working with would add you to the group for cangjie, but I could probably add you to the xorg group if that would help
<robclark>
karolherbst: no, not expected.. but not really sure how the mapping from ttyUSB to device is maintained.. wonder if it is possible that cheza-16 has become one of the DUTs that we'd previously taken offline?
<robclark>
I guess anholt should know.. but for now seems like we should take that runner out
<karolherbst>
yeah please do, it seems to cause some issues.
<karolherbst>
the flaky test isn't really critical since CI rerunning the test gets us past it, but it's still something we might want to address to speed up CI
<DavidHeidelberg>
Another one bites the dust :(
<daniels>
karolherbst: please marge the flake addition
<robclark>
would be nice if we could symbolize the backtrace in a trace crash
<karolherbst>
or uhm.. DavidHeidelberg I guess as well
<psykose>
bwidawsk: the issue you see now seems to be rustc itself segfaulting, which isn't related to the output (and so stuff like -C link-self-contained=on doesn't do anything)
<DavidHeidelberg>
karolherbst: not nice, for a660 it happens from time to time :/
<DavidHeidelberg>
this seems to be an issue with the magic bootloader I guess
<daniels>
yeah, impressive - for the same sharded mesa job, two different 888s (each in different racks with different power/network/serial/etc hosts) have refused to boot that job and only that job because they can't figure out USB
AnuthaDev has quit []
<daniels>
karolherbst: buy a lottery ticket and let's split the returns between you + me + DavidHeidelberg