Scenario for .well-known requests of Synapse Matrix servers #1206

Closed
wants to merge 2 commits into from

adrianrudnik

Hello crowdsecurity community,

Not sure if this is worthwhile for the Hub or if it is a good idea, but here goes.

This scenario stems from my decision, about a year ago, to set up a Matrix test server and throw it away after about a week.

Since then, I have been getting about 1.2-1.5k requests per day on /.well-known/matrix/server from remote Matrix servers. They fill logfiles and tie up resources, even if each one only gets a simple 404, and there is no way to turn them off.

As it turns out, you have to leave all rooms before deleting your server. There also seems to be no way to tell remote servers to back off on 404 status codes, and my attempts to add a /.well-known/matrix/server pointing to the official server or an unknown server did not reduce the requests in any way. So I, and anyone who works on or owns the domain after me, will have to live with it.

One user posted some statistics detailing ~71k requests over the last two months, along with how to add a rule to the Cloudflare WAF to prevent further requests.

The scenario bans servers issuing those requests. I chose a trigger because one request is enough to identify those servers and there is no other reason to request /.well-known/matrix/server, while /.well-known/matrix/client and /.well-known/matrix/support are still OK, as client and contact information could still be requested for other reasons. I am not sure about the "http:dos" behaviour and the attack.T1498 classification, as it is more like a DDoS: it involves all servers that learned about you when you joined a server or channel.
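
For readers unfamiliar with trigger scenarios, here is a minimal sketch of what such a scenario could look like. It is not the file from this pull request. The scenario name, description, blackhole duration, and the confidence and spoofable values are assumptions made for illustration, and the filter assumes an HTTP log parser that populates `evt.Meta.http_path` (as the Hub's nginx and Apache parsers do).

```yaml
# Hypothetical sketch, not the scenario submitted in this PR.
type: trigger                      # a single matching event is enough to raise an alert
name: my/matrix-well-known-server  # assumed name, for illustration only
description: "Detect Matrix federation lookups on /.well-known/matrix/server"
# Only the server discovery endpoint is matched; /client and /support are left alone.
filter: "evt.Meta.log_type in ['http_access-log', 'http_error-log'] && evt.Meta.http_path startsWith '/.well-known/matrix/server'"
groupby: evt.Meta.source_ip        # one alert per requesting IP
blackhole: 5m                      # assumed value: suppress repeat alerts for the same IP
labels:
  service: http
  behavior: "http:dos"             # the behaviour the author is unsure about
  classification:
    - attack.T1498
  confidence: 3                    # assumed value
  spoofable: 0                     # assumed value
  remediation: true                # lets the profile turn the alert into a ban
```

A trigger fits here because, as described above, one request is enough to identify the remote server; a leaky bucket would only delay the decision.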

Maybe this scenario is too specific or not really suited for the Hub. Just trying to learn and get into crowdsec.

@adrianrudnik
Author

As for the impact, here is a quick overview after running this scenario for 4 hours on the small machine responsible for my domain apex. I assume that would be ~24k dropped packets, minus the ones produced by the http-bad-user-agent, so approx. 1/10 of the total community blocklist in volume, within the same timeframe, am I reading that correctly?

metrics

decisions

@LaurenceJJones
Contributor

LaurenceJJones commented Dec 30, 2024

I'm a little worried about having this as an official scenario, as this is the way that Matrix works. However, I'm shocked that they don't have a way to inform servers that there is no Matrix server here anymore (even if it is something silly like a 418 response code).

However, it might also be good to note we can also handle this in the appsec component (our waf) via a rule.

```yaml
# /etc/crowdsec/appsec-rules/vpatch-matrix-dos.yaml
name: my/vpatch-matrix-dos
description: "HTTP DOS by matrix servers inquiring about possible server"
rules:
  - zones:
      - URI
    transform:
      - lowercase
    match:
      type: contains
      value: /.well-known/matrix/server
labels:
  type: scan
  service: http
```

That is, if you wanted to block the request before it is handled by a scenario. (Note, however, that it doesn't make a decision, so you would still need a scenario to act as a trigger, but I thought I'd throw it out there.)
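
A custom AppSec rule like `my/vpatch-matrix-dos` only takes effect once an AppSec configuration loads it. The following is a minimal sketch of such a configuration, assuming the AppSec component is already set up and its acquisition points at this config. The file path, the config name `my/matrix-appsec`, and the choice of `ban` as remediation are assumptions for illustration.

```yaml
# Hypothetical sketch: /etc/crowdsec/appsec-configs/matrix-appsec.yaml (assumed path)
name: my/matrix-appsec        # assumed name, for illustration only
default_remediation: ban      # what the bouncer is told to do when an in-band rule matches
inband_rules:
  - my/vpatch-matrix-dos      # the rule shown above, evaluated before the request is served
```

In-band rules block the matching request itself. As noted above, turning repeated matches into a longer-lived decision for the source IP still requires a scenario.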

@LaurenceJJones
Contributor

LaurenceJJones commented Dec 30, 2024

> I assume that would be ~24k dropped packets, minus the ones produced by the http-bad-user-agent, so approx. 1/10 of the total community blocklist in volume, within the same timeframe, am I reading that correctly?

Yeah, you are reading it correctly 👍🏻 What is crazy is 58 IPs vs 53k in the community blocklist, so Matrix is very chatty 🗣️

@adrianrudnik
Author

adrianrudnik commented Dec 30, 2024

Thanks for the pointer to the AppSec component, I'll dig into the documentation and see if I can deploy it and get a feel for it!

I also have mixed feelings about framing it as a sort of detox rule, but there is just no other way to get them to stop, and the problems over there just seem to be building up.

I increased the ban time to 24 hours, just to see what was going on, and I'm now at almost the same host numbers as before, a year ago, with 8 hours left on the clock for the first unban:

metrics2
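
For context, ban durations in CrowdSec are set in `profiles.yaml` rather than in the scenario. A minimal sketch of a profile that applies a 24-hour ban only to alerts from this scenario might look like the following. The profile name and the scenario name `my/matrix-well-known-server` are hypothetical, and this is not necessarily how the author changed the duration.

```yaml
# Hypothetical sketch for /etc/crowdsec/profiles.yaml, placed before the default profile.
name: matrix_well_known_remediation   # assumed name, for illustration only
filters:
  # Match only IP-scoped alerts raised by the (hypothetically named) Matrix scenario.
  - Alert.Remediation == true && Alert.GetScope() == "Ip" && Alert.GetScenario() == "my/matrix-well-known-server"
decisions:
  - type: ban
    duration: 24h                     # the ban time mentioned above
on_success: break                     # stop here so the default profile does not also apply
```

Scoping the longer duration to one scenario leaves the ban time of every other scenario unchanged.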

But again, I'm going to look into AppSec, as it's basically just a single HTTPS endpoint being affected, and banning servers that are simply running matrix servers might be a bit drastic.

@adrianrudnik
Author

I'll close here. As a final note, the scenario is now at 226k dropped packets from 223 active per-IP decisions within the last 24h, having seen 532 unique IPs hitting within the last few days.


2 participants