ChanServ changed the topic of #freedesktop to:
Seirdy has quit [Quit: exiting 3.2]
Seirdy has joined #freedesktop
<airlied> bentiss, anholt_: registry going away again
ximion has quit []
chomwitt has joined #freedesktop
hikiko_bsd has joined #freedesktop
<bentiss> airlied: done
hikiko_bsd has quit [Ping timeout: 480 seconds]
danvet has joined #freedesktop
alatiera has quit [Quit: The Lounge - https://thelounge.chat]
rbrune has joined #freedesktop
___nick___ has joined #freedesktop
kem has quit [Ping timeout: 480 seconds]
kem has joined #freedesktop
muhomor has quit [Remote host closed the connection]
___nick___ has quit []
___nick___ has joined #freedesktop
ximion has joined #freedesktop
___nick___ has quit []
___nick___ has joined #freedesktop
rbrune has quit [Quit: Leaving]
ximion has quit []
___nick___ has quit [Ping timeout: 480 seconds]
agd5f has joined #freedesktop
<pq> failed to pull image, unexpected EOF again in CI
alanc has quit [Remote host closed the connection]
alanc has joined #freedesktop
jenatali has joined #freedesktop
<anholt_> kicked it again
holmanb has joined #freedesktop
agd5f_ has joined #freedesktop
agd5f has quit [Ping timeout: 480 seconds]
kem has quit [Ping timeout: 480 seconds]
jhweruyuw has joined #freedesktop
kem has joined #freedesktop
kem has quit [Ping timeout: 480 seconds]
jhweruyuw has quit []
kem has joined #freedesktop
ximion has joined #freedesktop
jarthur has joined #freedesktop
<airlied> bentiss: what's happening with the registry? it seems to be crashing once every day or two now, mostly in my working hours :P
<airlied> anholt_: ^ in case you know
<anholt_> airlied: I don't know anything. bentiss says it's been intermittent and doesn't have anything useful in the logs that they've found.
<anholt_> I haven't found anything googling for the specific error message.
<bentiss> airlied, anholt_: yeah, sorry, no ideas on what is going on. Could be ceph not in sync, but why would it fail to handle the load just for this data?
<bentiss> it could be that the docker registry does not like too many parallel accesses to the same data, and messes up its own cache
<bentiss> docker registry *in passthrough mode*
<bentiss> what is certain is that the local cache gets messed up, and killing the pod makes it realize the data is not there
<airlied> bentiss: so we have a registry, with a cache that is failing, can we fall back to hitting the real registry here?
<bentiss> airlied: to have the runner hit the cache I changed the IP address in /etc/hosts
<airlied> ah so hard to fall back then :-)
<bentiss> yeah... well, maybe we can do something clever with haproxy or gobetween
<bentiss> but maybe we could just have a job that kicks the registry cache every 3 hours
<airlied> sounds horrid, maybe not related but horrid
<bentiss> looks awfully similar to what we are seeing
Haaninjo has joined #freedesktop
<bentiss> it would make a lot of sense, knowing that we can not enable garbage collection on the actual upstream registry for exactly that same reason (it will delete blobs that are used by other layers)
<anholt_> /o\
* bentiss tries to correlate the "crashes followed by kicks" with the expiration schedulers kicking in the logs
<bentiss> airlied, anholt_: FWIW this is exactly what we are having: http.response.status=200 http.response.written=0 in the logs when that happens
<bentiss> so at least now we know the culprit
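The symptom bentiss describes (a response logged with http.response.status=200 but http.response.written=0) can be searched for mechanically. The following is a minimal sketch, assuming the registry logs are plain key=value lines; the function names and the sample lines are hypothetical and only illustrate the matching logic, not the actual tooling used on the freedesktop infrastructure.

```python
import re

# Matches key=value pairs where the value is either quoted or a bare token.
_FIELD = re.compile(r'(\S+?)=("(?:[^"\\]|\\.)*"|\S*)')


def parse_fields(line):
    """Split one key=value log line into a dict (quotes stripped)."""
    return {key: value.strip('"') for key, value in _FIELD.findall(line)}


def is_empty_success(line):
    """True when a response was logged as 200 with zero bytes written."""
    fields = parse_fields(line)
    return (
        fields.get("http.response.status") == "200"
        and fields.get("http.response.written") == "0"
    )


def find_empty_successes(lines):
    """Return the log lines that show the suspicious 200/0-byte pattern."""
    return [line for line in lines if is_empty_success(line)]


if __name__ == "__main__":
    # Hypothetical sample lines, shaped like the fields quoted in the chat.
    sample = [
        'level=info msg="response completed" http.response.status=200 http.response.written=0',
        'level=info msg="response completed" http.response.status=200 http.response.written=5120',
    ]
    for hit in find_empty_successes(sample):
        print(hit)
```

Comparing the parsed values exactly, rather than searching for substrings, avoids false hits on lines such as http.response.written=0123.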
<bentiss> maybe the simplest "solution" for now could be to write a kubernetes job that resets the registry cache every 7 days
<bentiss> by reset I mean clearing the data
<bentiss> this could be a good tradeoff between caching data and money we spend rebuilding the registry
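The reset bentiss proposes amounts to two pieces of logic: deciding whether the 7-day interval has elapsed, and clearing the cached data while keeping the storage directory itself in place. The sketch below shows only that logic, assuming the cache is a directory on a mounted volume; the function names and the idea of tracking the last reset as a timestamp are assumptions for illustration, and in practice the scheduling would more likely be left to the Kubernetes job itself.

```python
import shutil
from pathlib import Path

# 7 days, matching the interval suggested in the chat.
RESET_INTERVAL_SECONDS = 7 * 24 * 3600


def reset_due(last_reset, now, interval=RESET_INTERVAL_SECONDS):
    """True when at least `interval` seconds have passed since the last reset."""
    return now - last_reset >= interval


def clear_cache(cache_dir):
    """Delete everything inside cache_dir but keep the directory itself.

    Returns the number of top-level entries removed.
    """
    removed = 0
    for entry in Path(cache_dir).iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # remove a whole cached subtree
        else:
            entry.unlink()  # remove a single file or symlink
        removed += 1
    return removed
```

Keeping the top-level directory means a volume mounted at that path stays valid, so the registry can repopulate the cache on the next pull.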
halfline has quit [Ping timeout: 480 seconds]
tomeu has quit [Read error: Connection reset by peer]
shadeslayer has quit [Write error: connection closed]
ocrete has quit [Remote host closed the connection]
ndufresne has quit [Write error: connection closed]
fahien2 has quit [Remote host closed the connection]
ndufresne has joined #freedesktop
tomeu has joined #freedesktop
fahien2 has joined #freedesktop
ndufresne is now known as Guest1159
ocrete has joined #freedesktop
shadeslayer has joined #freedesktop
ndufresne5 has joined #freedesktop
Guest1159 is now known as ndufresne
xexaxo has quit [Remote host closed the connection]
ndufresne5 has quit []
xexaxo has joined #freedesktop
jstein has joined #freedesktop
danvet has quit [Ping timeout: 480 seconds]
bcarvalho has joined #freedesktop
immibis has quit [Remote host closed the connection]
immibis has joined #freedesktop
karolherbst has quit [Quit: Konversation terminated!]
jhweruyuw has joined #freedesktop
jhweruyuw has quit [Remote host closed the connection]
jhweruyuw has joined #freedesktop
chomwitt has quit [Ping timeout: 480 seconds]
Lyude has quit [Ping timeout: 480 seconds]
ximion has quit [Remote host closed the connection]
ximion has joined #freedesktop
karolherbst has joined #freedesktop
Seirdy has quit [Quit: exiting 3.2]
Seirdy has joined #freedesktop
Haaninjo has quit [Quit: Ex-Chat]
jstein has quit []
jarthur has quit [Ping timeout: 480 seconds]
jarthur has joined #freedesktop