DHIS2 Login GeoIP Analysis

Extracts login events from a DHIS2 server, geolocates the source IPs, checks them against threat-intel blocklists, and produces a report that flags logins from outside the expected home country and accounts that look compromised.

It's aimed at national DHIS2 systems, where almost every login should come from inside one country. A flagged account is a lead, not a verdict — there are often innocent explanations, so treat the report as a starting point for investigation.

How it works

Extract — SSH into the server, pull AuthenticationSuccessEvent entries from the Tomcat journal, and parse out the timestamp, username, and IP address.
Geolocate — Look up each IP against the DB-IP free country database and the Starlink GeoIP database.
Reputation check — Match each IP against free, offline threat-intel blocklists (AbuseIPDB high-confidence aggregate, FireHOL, blocklist.de, Emerging Threats) to flag logins from known-malicious IPs.
Analyse — Flag accounts showing suspicious patterns and write a dated report.

Starlink IPs are identified separately and treated as home-country logins, since Starlink users physically in the home country can appear as foreign due to how satellite traffic is routed.
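The Starlink handling can be sketched roughly as follows. This is a minimal illustration, not the code in analyse.py: the function names, the row layout of the CIDR list, and the sample addresses (documentation-only ranges) are all assumptions.

```python
import ipaddress

def load_starlink_networks(rows):
    """Build (network, country) pairs from hypothetical (cidr, country) rows."""
    return [(ipaddress.ip_network(cidr, strict=False), country)
            for cidr, country in rows]

def classify_login(ip, geo_country, home_country, starlink_networks):
    """Return 'starlink', 'home', 'foreign', or 'unresolved' for one login IP."""
    addr = ipaddress.ip_address(ip)
    # Starlink is checked first, so a Starlink IP never counts as foreign.
    for network, _country in starlink_networks:
        if addr in network:
            return "starlink"
    if geo_country is None:
        return "unresolved"
    return "home" if geo_country == home_country else "foreign"

# Hypothetical data: documentation address ranges, not real Starlink ranges.
networks = load_starlink_networks([("198.51.100.0/24", "US")])
print(classify_login("198.51.100.7", "US", "RW", networks))
print(classify_login("203.0.113.9", "RW", "RW", networks))
print(classify_login("203.0.113.9", "KE", "RW", networks))
```

Checking the Starlink ranges before the country lookup is what keeps satellite users out of the foreign counts, whatever country the GeoIP database reports for them.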

Deployment compatibility

extract_logins.sh is written for servers deployed with dhis2-server-tools or any similar LXD-based setup, where DHIS2 runs inside a named LXC container and Tomcat logs to the systemd journal.

Other deployment types will need a different extraction approach. For example:

Docker — Tomcat logs go to the container log driver rather than journald. You would use docker logs or read the log file directly and adapt the parsing accordingly.
Bare metal / standalone Tomcat — No LXC wrapper; journalctl -u tomcat9 can be run directly on the host without the lxc exec step.

Regardless of how DHIS2 is deployed, the goal is the same: produce a logins.txt file in the format that analyse.py expects (see logins.txt format below), then run analyse.py against it.

Prerequisites

ssh access to the target server (key-based auth recommended)
python3 virtual environment: python3 -m venv env && source env/bin/activate && pip3 install -r requirements.txt
perl (standard on most Linux systems)

Setup

1. Set environment variables

export SERVER=user@hostname   # SSH target
export INSTANCE=prod          # LXC container name
export UNIT=tomcat9           # tomcat9 or tomcat10 (default: tomcat9)
export COUNTRY=RW             # ISO country code for the home country

Add these to your ~/.bashrc or ~/.profile to persist them.

2. Download the GeoIP databases

./download_geoip.sh

This downloads the current month's DB-IP country database (~8 MB) and the Starlink CIDR list. It is safe to re-run: the DB-IP download is skipped if the local file is already from the current month. Run it again at the start of each month to refresh.
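A "current month" check of this kind could look like the Python sketch below. It assumes the decision is based on the file's modification time; download_geoip.sh may use a different mechanism (for example the month in the download file name), and `is_current_month` is a hypothetical helper.

```python
import os
import tempfile
from datetime import datetime

def is_current_month(path, now=None):
    """True if the file exists and was last modified in the same calendar month as `now`."""
    if not os.path.exists(path):
        return False
    now = now or datetime.now()
    modified = datetime.fromtimestamp(os.path.getmtime(path))
    return (modified.year, modified.month) == (now.year, now.month)

# Demo: a freshly created temporary file versus a path that does not exist.
with tempfile.NamedTemporaryFile() as tmp:
    print(is_current_month(tmp.name))
print(is_current_month("no-such-file.mmdb"))
```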

3. Download the threat-intel blocklists

./download_blocklists.sh

This fetches several free, no-API-key IP reputation feeds into blocklists/ and is run automatically by run_all.sh. It is safe to re-run — it skips the download if the lists are less than 12 hours old. The feeds are:

| Feed | Source | Notes |
| --- | --- | --- |
| `abuseipdb_100` | borestad mirror | AbuseIPDB IPs at ~100% abuse confidence (no API key needed) |
| `firehol_level1` / `firehol_level3` | FireHOL | Curated low-false-positive firewall blocklists (includes Spamhaus DROP, DShield, Feodo) |
| `blocklist_de` | blocklist.de | Hosts reported for SSH/brute-force/login attacks |
| `et_compromised` | Emerging Threats | Known compromised / hostile hosts |

If the directory is missing or empty, analyse.py simply skips the malicious-IP check. Respect the source licenses (Spamhaus data requires attribution and prohibits commercial use).

A blocklist hit is enrichment, not proof. Shared NAT/CGNAT, Tor exit nodes, and cloud provider egress IPs can produce false positives — corroborate with the other flags before acting.
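Offline matching against feeds like these can be sketched with Python's standard `ipaddress` module. The sketch assumes each feed is plain text with one IP or CIDR per line and `#` comments; the function names and sample entries are hypothetical, and the real parsing in analyse.py may differ.

```python
import ipaddress

def parse_feed(lines):
    """Parse feed lines into networks, skipping blanks, comments, and junk."""
    networks = []
    for line in lines:
        tokens = line.split("#", 1)[0].split()
        if not tokens:
            continue
        try:
            # A bare IP becomes a /32 network, so IPs and CIDRs share one code path.
            networks.append(ipaddress.ip_network(tokens[0], strict=False))
        except ValueError:
            continue  # not an IP or CIDR
    return networks

def matching_feeds(ip, feeds):
    """Return the sorted names of all feeds that contain the given IP."""
    addr = ipaddress.ip_address(ip)
    return sorted(name for name, nets in feeds.items()
                  if any(addr in net for net in nets))

# Hypothetical feed contents using documentation address ranges.
feeds = {
    "firehol_level1": parse_feed(["# comment line", "192.0.2.0/24", ""]),
    "blocklist_de": parse_feed(["198.51.100.23"]),
}
print(matching_feeds("192.0.2.55", feeds))
print(matching_feeds("203.0.113.1", feeds))
```

A linear scan is fine for a sketch. With feeds of many thousands of entries, splitting exact IPs into a set and keeping only the CIDR ranges in a list would be noticeably faster.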

Usage

Typical workflow

Run the pipeline for a time window: ./run_all.sh "7 days ago".
Open the generated report_*.txt and read the high-suspicion accounts and known-malicious IP matches.
Dig into anything that stands out: ./investigate.sh <username>.
Need more history? Widen the window and re-run — but the server's journal only keeps a limited backlog (often around two weeks), so that's the most you can recover after the fact. For ongoing coverage, run this on a schedule and keep the logins_*.txt extracts.

Full pipeline (recommended)

./run_all.sh "1 day ago"

This runs the whole thing end to end: refresh the databases and blocklists, extract logins, analyse. The time window is passed straight to journalctl -S, so anything journalctl accepts works:

./run_all.sh "6 hours ago"
./run_all.sh "2026-05-01"

Steps individually

Extract only (writes to logins.txt):

./extract_logins.sh "1 day ago"

Analyse an existing logins file:

python3 analyse.py logins.txt

Investigate a single account:

./investigate.sh <username>

For one account, this lists every IP it logged in from (with counts, owner/ASN and reverse DNS via ipinfo.io, and any blocklist match), then prints a chronological timeline that marks country changes. It's the quickest way to tell a compromised account from a false positive: a real user has a consistent local footprint, while a takeover shows a sudden switch to cloud or foreign IPs.

The script needs internet access for the ownership lookups. Blocklist matching here is exact-IP only; use analyse.py for CIDR/range matches.
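The idea of a timeline that marks country changes can be illustrated in a few lines of Python. investigate.sh is a shell script, so this only sketches the logic under assumed inputs: the `(timestamp, ip, country)` tuples, the marker text, and the sample addresses are all hypothetical.

```python
def timeline(events):
    """Format (timestamp, ip, country) tuples chronologically, marking country changes."""
    lines = []
    previous = None
    # ISO 8601 timestamps of equal precision sort correctly as plain strings.
    for ts, ip, country in sorted(events):
        marker = ""
        if previous is not None and country != previous:
            marker = f"  <-- country change {previous} -> {country}"
        lines.append(f"{ts}  {ip:<15}  {country}{marker}")
        previous = country
    return lines

# Hypothetical events for one account (documentation address ranges).
events = [
    ("2026-05-22T09:30:00", "198.51.100.9", "DE"),
    ("2026-05-22T08:00:00", "203.0.113.5", "RW"),
    ("2026-05-22T11:00:00", "203.0.113.5", "RW"),
]
for line in timeline(events):
    print(line)
```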

Output

Each run of analyse.py prints a report to stdout and saves a copy to a dated file (report_YYYYMMDD_HHMMSS.txt).

The report contains:

Overview — total login events, home/foreign split, unresolved IPs, unique accounts, malicious-IP logins
Starlink logins — IPs matched against the Starlink database (excluded from foreign counts)
Foreign login countries — breakdown of logins by non-home country
Known-malicious IP matches — login IPs found in the threat-intel feeds, with login count, affected user count, and the feed that flagged each
Suspicious account tiers:
- Any foreign login — at least one login from outside the home country
- Impossible travel — logins from two different countries within 60 minutes
- No home-country logins — account has never logged in from the home country
- Majority foreign — more logins from outside the home country than inside
- Known-malicious IP — at least one login from an IP on a threat-intel blocklist
- High suspicion — accounts triggering two or more of the above
High suspicion account detail — per-account breakdown of countries, top IPs, and impossible travel events
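The tiers above can be sketched as a set of per-account flags. This is an illustration under stated assumptions, not analyse.py itself: the function and flag names are hypothetical, Starlink logins are assumed to have already been mapped to the home country, and impossible travel is simplified to comparing consecutive logins only.

```python
from datetime import datetime, timedelta

def impossible_travel(events, window=timedelta(minutes=60)):
    """events: (timestamp, country) pairs. Return consecutive pairs from
    different countries that are no more than `window` apart."""
    ordered = sorted(events, key=lambda e: e[0])
    return [(t1, c1, t2, c2)
            for (t1, c1), (t2, c2) in zip(ordered, ordered[1:])
            if c1 != c2 and t2 - t1 <= window]

def account_flags(events, home, malicious_ips=frozenset()):
    """events: (timestamp, country, ip) tuples for a single account."""
    foreign = [e for e in events if e[1] != home]
    home_count = len(events) - len(foreign)
    flags = set()
    if foreign:
        flags.add("any_foreign")
    if impossible_travel([(t, c) for t, c, _ in events]):
        flags.add("impossible_travel")
    if events and home_count == 0:
        flags.add("no_home")
    if len(foreign) > home_count:
        flags.add("majority_foreign")
    if any(ip in malicious_ips for _, _, ip in events):
        flags.add("malicious_ip")
    return flags

def is_high_suspicion(flags):
    """Literal reading of 'two or more of the above'."""
    return len(flags) >= 2

# Hypothetical account: a foreign login 30 minutes after a home login.
events = [
    (datetime(2026, 5, 22, 14, 0), "RW", "203.0.113.5"),
    (datetime(2026, 5, 22, 14, 30), "DE", "198.51.100.9"),
    (datetime(2026, 5, 22, 18, 0), "RW", "203.0.113.5"),
]
flags = account_flags(events, "RW")
print(sorted(flags), is_high_suspicion(flags))
```

Some tiers overlap: an account with no home-country logins also has a foreign login and a foreign majority, so a literal count of flags makes it high suspicion on its own. How analyse.py treats that overlap is not specified here.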

Files

| File | Purpose |
| --- | --- |
| `run_all.sh` | Full pipeline entry point |
| `extract_logins.sh` | Extract logins from the remote server via SSH |
| `download_geoip.sh` | Download/refresh GeoIP databases |
| `download_blocklists.sh` | Download/refresh threat-intel IP blocklists |
| `analyse.py` | Geolocate IPs, reputation-check, and produce the report |
| `investigate.sh` | Deep-dive a single account's footprint + IP ownership |
| `dbip-country-lite.mmdb` | DB-IP country database (auto-downloaded) |
| `starlink-geoip.csv` | Starlink CIDR → country map (auto-downloaded) |
| `blocklists/` | Threat-intel feed files (auto-downloaded) |
| `logins*.txt` | Login extracts (gitignored because they contain usernames and IPs) |
| `report_*.txt` | Saved reports (one per run) |

logins.txt format

analyse.py expects one login event per line with three space-separated fields:

2026-05-22T14:41:11,667 jsmith 102.93.8.147

| Field | Format | Example |
| --- | --- | --- |
| Timestamp | ISO 8601, comma or dot as subsecond separator | `2026-05-22T14:41:11,667` |
| Username | No spaces | `jsmith` |
| IP address | IPv4 | `102.93.8.147` |

If you are adapting the extraction for a different deployment type, this is the format to target. Trailing punctuation on the IP field (e.g. a stray ;) is stripped automatically by analyse.py.
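If you are writing your own extractor, a small parser is a convenient way to check that its output matches this format. The sketch below is an assumption about reasonable parsing, not a copy of analyse.py: `parse_login_line` is a hypothetical name, and the exact set of trailing characters stripped from the IP is a guess.

```python
import ipaddress
from datetime import datetime

def parse_login_line(line):
    """Parse 'TIMESTAMP USERNAME IP' into (datetime, username, ip), or None if malformed."""
    parts = line.split()
    if len(parts) != 3:
        return None
    raw_ts, username, raw_ip = parts
    try:
        # Accept either a comma or a dot before the fractional seconds.
        timestamp = datetime.strptime(raw_ts.replace(",", "."), "%Y-%m-%dT%H:%M:%S.%f")
        # Drop stray trailing punctuation such as ';' before validating the IPv4 address.
        ip = ipaddress.IPv4Address(raw_ip.rstrip(";,.:"))
    except ValueError:
        return None
    return timestamp, username, str(ip)

print(parse_login_line("2026-05-22T14:41:11,667 jsmith 102.93.8.147"))
print(parse_login_line("2026-05-22T14:41:11.667 jsmith 102.93.8.147;"))
print(parse_login_line("not a valid line at all"))
```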

Notes on accuracy

The DB-IP lite database is reliable at country level (~95–99% globally). For the home country, well-known ISP ranges (e.g. MTN, RwandaTel) are correctly attributed. When misattribution does happen, it is more likely to make a home-country user appear foreign than to make a foreign login appear local, particularly for users on Starlink (handled separately) or traffic routed through regional hubs in neighbouring countries.

Cloud provider IPs (Google Cloud, AWS, Azure, etc.) appearing repeatedly in foreign logins, especially switching rapidly with home-country IPs, are a stronger signal than a one-off foreign login. An account that only ever appears from a cloud IP, or a real user whose normal local logins are suddenly interleaved with cloud ones, is worth a close look.

Know your own service accounts. A BI or integration tool (e.g. a Superset/DHIS2 connector) authenticates constantly from a fixed server IP, which is usually foreign — so it will show up as high-suspicion every run. That's an expected false positive, but it's worth confirming the IP belongs to your integration and hasn't changed.
