ChanServ changed the topic of #freedesktop to: https://www.freedesktop.org infrastructure and online services || for questions about freedesktop.org projects, please see each project's contact || for discussions about specifications, please use https://gitlab.freedesktop.org/xdg or xdg@lists.freedesktop.org
JanC is now known as Guest13431
JanC has joined #freedesktop
Guest13431 has quit [Ping timeout: 480 seconds]
scrumplex_ has joined #freedesktop
scrumplex has quit [Ping timeout: 480 seconds]
vedm_ has joined #freedesktop
vedm has quit [Ping timeout: 480 seconds]
garrison has quit [Read error: Connection reset by peer]
garrison has joined #freedesktop
<DemiMarie>
emersion: yup, very cheap
<DemiMarie>
if you want to make it real expensive you have to either make a legitimate user work more, do fingerprinting (ewwww), rely on IP addresses (limited effectiveness), or rely on some sort of cryptographic challenge
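A minimal sketch of the "cryptographic challenge" option mentioned above, assuming a hash-based proof of work: the server hands out a random challenge string and the client must find a nonce whose SHA-256 digest starts with a given number of zero bits. The function names, the `challenge:nonce` encoding and the difficulty value are all hypothetical and not taken from any particular tool.

```python
import hashlib
import itertools


def verify(challenge: str, nonce: int, difficulty_bits: int) -> bool:
    """Server side: one hash to check that the nonce meets the difficulty."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    value = int.from_bytes(digest, "big")
    # The top `difficulty_bits` bits of the 256-bit digest must all be zero.
    return value >> (256 - difficulty_bits) == 0


def solve(challenge: str, difficulty_bits: int) -> int:
    """Client side: brute-force nonces until one verifies (expensive on purpose)."""
    for nonce in itertools.count():
        if verify(challenge, nonce, difficulty_bits):
            return nonce


if __name__ == "__main__":
    nonce = solve("example-challenge", 12)  # hypothetical challenge and difficulty
    print(nonce, verify("example-challenge", nonce, 12))
```

Checking costs the server a single hash, while solving costs the client about 2^difficulty_bits hashes on average, which is the asymmetry that makes bulk scraping more expensive than a single page view.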
<DemiMarie>
Are X.org membership email verification emails down?
JanC is now known as Guest13438
JanC has joined #freedesktop
Guest13438 has quit [Ping timeout: 480 seconds]
JanC is now known as Guest13444
JanC has joined #freedesktop
Guest13444 has quit [Ping timeout: 480 seconds]
swatish2 has joined #freedesktop
swatish21 has joined #freedesktop
swatish2 has quit [Ping timeout: 480 seconds]
ximion has quit [Remote host closed the connection]
swatish2 has joined #freedesktop
swatish21 has quit [Ping timeout: 480 seconds]
<colinmarc>
<bentiss> "pinchartl: don't know, but I..." <- the WAF thing should be, no?
<colinmarc>
that's part of their marketing materials, at least
sima has joined #freedesktop
<colinmarc>
<DemiMarie> "emersion: yup, very cheap" <- in theory they can afford it. in practice a little bit of friction goes a looong way here, I think.
<colinmarc>
that said, I think a workable solution would be for big sites who get scraped the most to basically publish the lists of IPs that are scraping them automatically on a regular basis. then smaller sites can just set up an automatic ip ban for those lists.
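A rough sketch of the shared ban list idea, assuming the published list is plain text with one IP address or CIDR range per line and `#` comments. That file format and the function names are assumptions for illustration; no such list is referenced in the discussion.

```python
import ipaddress


def parse_blocklist(text):
    """Parse one address or CIDR range per line, ignoring blanks and # comments."""
    networks = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            # strict=False accepts entries such as 192.0.2.7/24 with host bits set.
            networks.append(ipaddress.ip_network(line, strict=False))
    return networks


def is_blocked(addr, networks):
    """Return True if the address falls inside any listed network."""
    ip = ipaddress.ip_address(addr)
    return any(ip.version == net.version and ip in net for net in networks)


if __name__ == "__main__":
    # Documentation-only address ranges, used here as placeholder data.
    nets = parse_blocklist("192.0.2.0/24\n# comment\n2001:db8::/32\n")
    print(is_blocked("192.0.2.10", nets), is_blocked("198.51.100.7", nets))
```

A linear scan like this is fine for a few thousand entries; a list covering millions of addresses would need a prefix tree or similar structure.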
tzimmermann has joined #freedesktop
JanC is now known as Guest13448
JanC has joined #freedesktop
jsa1 has joined #freedesktop
sghuge has quit [Remote host closed the connection]
<pinchartl>
colinmarc: the issue is that you'd end up blocking half of the internet. unlike search engine bots that rely on a "small" pool of (relatively) well-behaved hosts, AI scrapers use botnets with millions of devices
swatish2 has joined #freedesktop
<pinchartl>
and I wouldn't be surprised if those botnets were made of either compromised devices, or devices "borrowed" from unsuspecting owners (all kinds of home appliances)
<colinmarc>
are you talking about the big AI companies, or other parties?
<colinmarc>
I would be surprised if openAI/meta/etc are using botnets. consumer devices are possible, but if they get their own consumer devices blocked, that's their problem, I think
<bilboed>
The problem is that there's no accountability for being a bad actor on the Internet. If ISPs really told their clients "hey, you're a bad actor, another strike and you're banned" it might have some impact. Right now people don't even care that their devices/computers are hacked, it doesn't impact them
<bilboed>
(Most hosting providers do that. But I guess they have "fewer" clients, and the impact of having their AS banned/blacklisted is catastrophic)
jramsay has quit [Quit: Signing off]
swatish21 has joined #freedesktop
mvlad has joined #freedesktop
swatish2 has quit [Ping timeout: 480 seconds]
swatish21 is now known as swatish2
jsa1 has joined #freedesktop
andy-turner has quit []
AbleBacon has quit [Read error: Connection reset by peer]
guludo has joined #freedesktop
guludo has quit [Ping timeout: 480 seconds]
swatish2 has quit [Ping timeout: 480 seconds]
imre has joined #freedesktop
swatish2 has joined #freedesktop
<slomo>
503 "SSL handshake error" on gitlab currently
<bentiss>
slomo: sorry, I'm trying to get the real ip of clients, and I broke the config
<slomo>
no worries, just wanted to be sure you're aware of it :)
<bentiss>
\o/ this seems to be working now (getting the actual IP, not the one from the fastly POP) -> this should help Anubis
<bentiss>
but the problem is now the runners are talking directly to the hetzner LB, bypassing fastly and we don't have their IPs
italove8 has joined #freedesktop
<bentiss>
reverted to the old configuration, because there were too many corner cases where we don't get the actual IP
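A sketch of the corner case described above, assuming the CDN passes the client address in an `X-Forwarded-For`-style header: the header can only be trusted when the TCP peer is itself a known proxy, so a client that connects directly (as the runners do) has to be identified by its peer address. The header name, the function name and the proxy ranges are assumptions, not the actual freedesktop.org configuration.

```python
import ipaddress


def client_ip(peer_addr, forwarded_for, trusted_proxies):
    """Pick the real client address from the peer address and a forwarded-for header."""
    trusted = [ipaddress.ip_network(n) for n in trusted_proxies]

    def is_trusted(addr):
        ip = ipaddress.ip_address(addr)
        return any(ip.version == net.version and ip in net for net in trusted)

    # Direct connections: ignore the header, since the client could forge it.
    if not is_trusted(peer_addr) or not forwarded_for:
        return peer_addr

    # Walk the hop list right to left, skipping hops that are our own proxies.
    hops = [h.strip() for h in forwarded_for.split(",") if h.strip()]
    for hop in reversed(hops):
        try:
            if not is_trusted(hop):
                return hop
        except ValueError:
            break  # malformed entry: stop trusting the rest of the header
    return peer_addr


if __name__ == "__main__":
    proxies = ["10.0.0.0/8"]  # placeholder range standing in for the CDN / LB
    print(client_ip("10.0.0.5", "203.0.113.9, 10.0.0.7", proxies))
    print(client_ip("198.51.100.20", "203.0.113.9", proxies))
```

The second call models a client that bypasses the CDN: its header is ignored and the peer address is used instead.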
<eric_engestrom>
I'm very rusty on cors things, but I think that header needs to be in the s3.freedesktop.org redirect response with the *.your-objectstorage.com domain
<eric_engestrom>
looking at your link, give me a minute
<eric_engestrom>
bentiss: can I see the current config?
<eric_engestrom>
my first guess would be `cors:allowOrigins:["fsn1.your-objectstorage.com"]`
jsa1 has quit [Ping timeout: 480 seconds]
<eric_engestrom>
no, wait, I think I got it backwards, origin is the domain of the webpage the user has opened
vsro has joined #freedesktop
<eric_engestrom>
so it would be all the *.pages.freedesktop.org that exist... not really something we can enumerate
<eric_engestrom>
if globs are allowed in there then that should do it
<eric_engestrom>
or maybe we can just disable cors? I don't think other websites embedding resources from s3.freedesktop.org can actually cause any issues?
ximion has joined #freedesktop
* eric_engestrom
is not a ~~lawyer~~ security expert
* bentiss
has no idea :/
<eric_engestrom>
after reading up a bit more, I feel fairly confident that the only risk if we disable cors is this: someone creates a webpage that loads a url from s3 that is not public and requires auth by a specific few users (and then eg. uploads it elsewhere), and tricks one of those few users into opening that malicious web page, which then uses that user's creds to download that resource
<eric_engestrom>
given the kind of contents we have on there, I don't think this is much of an issue, so I think we can indeed disable cors on s3.freedesktop.org
<eric_engestrom>
I'd love to hear a second opinion from someone who knows more about cors and the kind of resources available through s3.fdo :)
Paddi has quit [Read error: Connection reset by peer]
<eric_engestrom>
> allowOrigins [...] This support stars in origins.
<eric_engestrom>
so I think we need to enable it with `*.freedesktop.org` as the value; I'll post an MR
<daniels>
I'm comfortable with disabling CORS, but otoh this doesn't apply to direct requests or HTTP redirects - it applies only to stuff which is fetched from JS
<daniels>
so yeah, enabling CORS + allowOrigins *.fd.o seems like the right move
<eric_engestrom>
the requester sets the Origin header, but the server sets the Access-Control-Allow-Origin header that tells the browser which origins are allowed to read the response
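A small sketch of the exchange described above, assuming the allow-list uses shell-style globs such as `*.freedesktop.org`: the server compares the hostname from the request's `Origin` header against the patterns and, on a match, echoes the origin back in `Access-Control-Allow-Origin`. The function name is hypothetical, and the storage service's own glob semantics may differ from Python's `fnmatch`.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


def cors_headers(origin, allowed_patterns):
    """Return the CORS response headers for a request with the given Origin."""
    if not origin:
        return {}  # not a cross-origin browser request: nothing to add
    host = urlsplit(origin).hostname  # lowercased host, None for e.g. "null"
    if host is None:
        return {}
    if any(fnmatchcase(host, pattern) for pattern in allowed_patterns):
        # Echo the origin back, and tell caches the response varies by Origin.
        return {"Access-Control-Allow-Origin": origin, "Vary": "Origin"}
    return {}


if __name__ == "__main__":
    patterns = ["*.freedesktop.org"]
    print(cors_headers("https://mesa.pages.freedesktop.org", patterns))
    print(cors_headers("https://evil.example", patterns))
```

When no header is returned the request itself still succeeds; the browser just refuses to let the calling page read the response.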
<eric_engestrom>
bentiss: added `Access-Control-Allow-Credentials: true` to the MR, although I'm not really clear whether it can help
<eric_engestrom>
I'm still not sure at what layer the problem is, I might still be looking in the wrong place
<eric_engestrom>
oh
<eric_engestrom>
hold on, I think it might be a problem with the html we generate
<eric_engestrom>
we set crossorigin="Anonymous"
<eric_engestrom>
I think removing that might fix everything
* eric_engestrom
needs to figure out what generates this html
<eric_engestrom>
hmm wait, I don't think that's the right file to look at, but it's the only thing that contains `crossorigin`, so I don't understand what else could be setting that
* eric_engestrom
is lost and it's late, giving up
<eric_engestrom>
fyi I'm off until thursday (except mesa release stuff on wednesday), so I won't be able to look into this issue more until then
<eric_engestrom>
I have verified that removing `crossorigin="Anonymous"` from that page fixes the issue though, so if someone can figure out which file sets this, please remove it :)
<bentiss>
cool
<bentiss>
and FWIW, I'll be off next week too
<eric_engestrom>
ack
<eric_engestrom>
tomeu, daniels: posted https://gitlab.freedesktop.org/mesa/piglit/-/merge_requests/1005 which removes the one `crossorigin` that I could find; if there's another one, please remove it too, and once they're all removed, please uprev piglit in mesa ci ❤️
<bentiss>
OK... so I'm now bypassing the loadbalancer from fastly to directly talk to the nodes. This way, we will count the egress on each node AND we will have the proper IP addresses in the logs
swatish2 has joined #freedesktop
kasper93 has quit [Quit: kasper93]
kasper93 has joined #freedesktop
mrpops2ko_ has quit []
mrpops2ko has joined #freedesktop
jsa1 has joined #freedesktop
vsro has quit [Remote host closed the connection]
<bentiss>
and gitlab feels much more reactive now that we have 5 Gb/s of bandwidth instead :)
guludo has quit [Quit: WeeChat 4.6.0]
swatish2 has quit [Ping timeout: 480 seconds]
<bentiss>
(and now anubis.gitlab.freedesktop.org works much better)
guludo has joined #freedesktop
fomys has quit []
ximion has quit [Remote host closed the connection]