This repository contains the final work done during the BTech Project (BTP):
Mapping The Maze: The Study of Internet Shutdowns across the world
Aim: Finding a relationship between geopolitical events and internet shutdowns across the world.
Conclusion: We can use BGP data as a parameter to detect internet shutdowns on a macroscopic scale.
The final BTP report can be found in the BTP Report directory.
Successful case studies include:
- Iran
- Uganda
- Myanmar
- US
Unsuccessful case study:
- India
The results can be found in the Results directory.
The older phases and pipeline of the project can be found here:
New & improved pipeline from routeviews_BGP_V3.0
Beta version
Check plan.txt for logic and working
- Dates are flexible: not hardcoded to 1 month, any 30 days can be used
- Support removed for MongoDB, replaced by Python dictionaries (much faster!)
- Execution time is 7 hours, compared to 1.5 days previously
- Efficient storage: using pickle to store dicts as binaries
- More intuitive input for scripts
- Each script has a little man page inside for debugging
- This time there is no need to scrape prefixes; the ribs themselves are used
- Input YYYY MM & DD until 30 days are covered (new UI)
- Input 4 timestamps (0200 0800 1400 2000 recommended for better coverage)
- Input ISP_ASN folder name + LIMIT for graphs
- Perform sanity check for all files
- Display all the inputs (vars array) + show warning
- Confirmation check before proceeding
- Call master.sh in the background & exit
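Two of the changes listed above, replacing MongoDB with Python dictionaries and storing them as pickle binaries, can be sketched roughly as below. This is only an illustration: the helper names `save_dict` and `load_dict`, the file name, and the dictionary layout (prefix -> date -> count) are assumptions and not taken from the actual scripts, and the values are made up.

```python
import pickle
import tempfile
from pathlib import Path


def save_dict(data: dict, path: Path) -> None:
    """Serialize a dictionary to a binary pickle file."""
    with path.open("wb") as fh:
        pickle.dump(data, fh, protocol=pickle.HIGHEST_PROTOCOL)


def load_dict(path: Path) -> dict:
    """Load a dictionary back from a pickle file."""
    with path.open("rb") as fh:
        return pickle.load(fh)


# Hypothetical layout: prefix -> {date: number of times the prefix was seen}.
# Prefixes are documentation ranges and the counts are invented.
prefix_counts = {
    "203.0.113.0/24": {"20200114": 41, "20200115": 12},
    "198.51.100.0/24": {"20200114": 38, "20200115": 37},
}

# Round trip through a temporary file so the sketch leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    pkl_path = Path(tmp) / "prefix_counts.pkl"
    save_dict(prefix_counts, pkl_path)
    restored = load_dict(pkl_path)

print(restored == prefix_counts)
```

Pickle files should only be loaded when they come from your own runs, since unpickling untrusted data can execute arbitrary code.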
Structure of the array VARS :
${vars[0]} = 1st date [start] YYYYMMDD
${vars[1]} = 2nd date
${vars[2]} = 3rd date
.
.
${vars[29]} = 30th date [end]
${vars[30]} = timestamp_1 TTTT
${vars[31]} = timestamp_2
${vars[32]} = timestamp_3
${vars[33]} = timestamp_4
${vars[34]} = LIMIT XX
This array is passed to master.sh
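To make the layout concrete, here is a small Python sketch that builds and unpacks a list with the same 35 slots. The real array is built in shell, so this is only an illustration. Assumptions: the 30 dates are consecutive, TTTT is read as 24-hour HHMM, the year 2020 is arbitrary, and `build_vars` / `unpack_vars` are hypothetical names.

```python
from datetime import date, timedelta

N_DATES = 30       # slots 0..29
N_TIMESTAMPS = 4   # slots 30..33, followed by LIMIT in slot 34


def build_vars(start: date, timestamps: list[str], limit: int) -> list[str]:
    """Build the flat list: 30 dates (YYYYMMDD), 4 timestamps, LIMIT."""
    if len(timestamps) != N_TIMESTAMPS:
        raise ValueError("exactly 4 timestamps are expected")
    for ts in timestamps:
        # Assumption: TTTT means 24-hour HHMM.
        if len(ts) != 4 or not ts.isdigit() or int(ts[:2]) > 23 or int(ts[2:]) > 59:
            raise ValueError(f"bad timestamp: {ts!r}")
    # Assumption: the 30 days are consecutive, starting at `start`.
    dates = [(start + timedelta(days=i)).strftime("%Y%m%d") for i in range(N_DATES)]
    return dates + list(timestamps) + [str(limit)]


def unpack_vars(values: list[str]) -> tuple[list[str], list[str], int]:
    """Split the flat list back into (dates, timestamps, limit)."""
    if len(values) != N_DATES + N_TIMESTAMPS + 1:
        raise ValueError("expected 35 values")
    return values[:N_DATES], values[N_DATES:N_DATES + N_TIMESTAMPS], int(values[-1])


# Example: 14th Jan to 12th Feb with the recommended timestamps.
vars_list = build_vars(date(2020, 1, 14), ["0200", "0800", "1400", "2000"], 20)
dates, stamps, limit = unpack_vars(vars_list)
print(len(vars_list), dates[0], dates[-1], stamps, limit)
```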
To get better insight into the actual approach & hypothesis:
- Read plan.txt
- Each script has a little doc inside
Start from pipeline.sh; it will lead to all other scripts.
The results are out from the new pipeline for ribs from 14th Jan to 12th Feb (for India):
(This is important to analyse as these are the results from the 1st run)
Some good stuff :
- New pipeline now runs in 7 hours, compared to 40 hours previously!
- The hypothesis was right: we got more prefixes overall from ribs than from CIDR
CIDR prefixes -> 19884
Ribs prefixes -> 20720
(834 new prefixes)
- Storage space and RAM are not an issue now; the new pipeline is quite optimized
Storage < 500 MB, RAM < 6 GB
- We got more prefixes with dips > 20% in ribs
Old pipeline : 356
New pipeline : 1205
And that's insane!
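For readers wondering how "prefixes with dips > 20%" could be counted, here is a rough sketch. How the pipeline actually defines a dip is described in plan.txt and the scripts, not here. This sketch assumes a dip is a day-over-day drop of more than 20% in a prefix's daily count. The function names and sample numbers are made up.

```python
DIP_THRESHOLD = 0.20  # the "> 20%" figure quoted in the results


def max_dip(series: list[float]) -> float:
    """Largest fractional day-over-day drop in a series (0.0 if it never drops)."""
    worst = 0.0
    for prev, curr in zip(series, series[1:]):
        if prev > 0:  # skip days with nothing to compare against
            worst = max(worst, (prev - curr) / prev)
    return worst


def prefixes_with_dips(counts: dict[str, list[float]],
                       threshold: float = DIP_THRESHOLD) -> list[str]:
    """Prefixes whose largest drop is strictly greater than the threshold."""
    return sorted(p for p, series in counts.items() if max_dip(series) > threshold)


# Invented daily counts per prefix (documentation address ranges).
counts = {
    "203.0.113.0/24": [40, 41, 12, 39],   # falls from 41 to 12, a drop of about 71%
    "198.51.100.0/24": [38, 37, 38, 36],  # small fluctuations only
    "192.0.2.0/24": [50, 40, 40, 41],     # exactly 20%, so not counted
}

dipping = prefixes_with_dips(counts)
print(len(dipping), dipping)
```

A different definition (for example, comparing against a median baseline instead of the previous day) would give different counts, so the numbers above should not be read against this sketch.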
Some bad stuff :
- There seems to be very little correlation, and what there is could be just coincidence:
only a very small fraction of graphs dip in the right spot, on the days of shutdown
(this is only for India)
(we got perfect correlation for Iran & Myanmar)
A major concern :
We still don't understand it:
if these dips in the graphs are not caused by shutdowns, then what are they caused by?
We didn't see the same pattern anywhere else!
This project is almost over now, and might not be maintained further.