Data is prime commodity for any corporate. While developing an application it is responsibility of the developers to allow restricted access to data and follow proper security protocols. For example, an offshoot of Anonymous, hacktivist group Lulzsec in June 2011 hacked into Sony Pictures via SQL Injection attack and stole data that included names, passwords, e-mail and home addresses of thousands of customers. The group claimed to have compromised over one million accounts.
“As the world is increasingly interconnected, everyone shares the responsibility of securing cyberspace.”
Newton Lee, Counterterrorism and Cybersecurity: Total Information Awareness
As the world gets online, hackers have more and more incentives to breach the security of applications on the web and obtain valuable information. Recent hacking of Equifax only goes to show that not even the biggest companies are completely secure. However, it is not the companies that are the worst hit by these hacks. The users of their products, whose details are leaked, are the ones hardest hit. As more and more personal services go online, it is imperative to secure them properly. Leak of Personal Identifiable Information is the biggest threat to consumers of these applications as it can have consequences that may not be immediately visible.
The main cause of services not being secure enough is the fact that most developers are not thinking about security when they are building these applications. This means that applications are often vulnerable to some of the oldest and easiest attacks. Things like sql-injection, cross-site scripting and CSRF attacks have been known about for quite some time now and many effective mitigation strategies have been developed to counter these attacks. However, we are seeing more and more of these attacks. Hence there is an urgent need to automatically detect if there are vulnerabilities in the code, and to correct them as soon as possible.
There are several static analysis tools and fuzzing tools to test the security of an application, however, they require the developer to be constantly running them in his workstation and keep checking for their result. We feel that an automated bot can ease this process, so that the developers can concentrate on writing code. The bot we imagine will create ‘issues’ on github repos, comment on specific lines of code which introduce a vulnerability and assign these to the developer who wrote the code.
We aim at developing a drone bot that responds to different git operations when applied to a git repository. For example, when a collaborator commits to a git repo, then it will scan the code using OWASP ZAP API’s and static source analyzers to find the components of User Interface or a module that can be compromised for a security attack.
ZAP API works by actually running the web application in a browser. It attempts penetration testing on different web pages and identifies the network attacks to which a page is prone to. WAP is another tool to detect and correct input validation vulnerabilities in web applications written in PHP and predicts false positives. The tool combines source code static analysis and data mining to detect vulnerabilities and predict false positives. It then corrects the source code to remove the vulnerabilities inserting fixes(small functions) in the right places of the source code.
Our bot application will be using these tools to scan the changes in the code for vulnerabilities and possible network attacks to raise issues if any such vulnerability is found and can raise comments on appropriate locations, for example, on commits applied to a repository. The application can be integrated to any github repository and it can also be invoked with git pull request and commit operation.
Our bot will aid users while performing the following tasks:
-
Check security vulnerabilities of an application when bot is integrated with the repo for the first time
- Preconditions
The repo must have the security bot installed. - Main Flow
Github signals the bot to check the code present in the repository against any potential vulnerabilities. Bot fetches the code from the repository and runs OWASP ZAP and static analysis tools [S1]. Bot will collect the list of vulnerabilities and raise an issue with the vulnerabilities found [S2]. - Subflows
[S1] - The tools will be picked according to the programming language or framework used. [S2] - The issue will be raised with title as <type_of_vulnerability>. - Alternative Flows
If no vulnerabilities are found, the bot comments that no vulnerability is detected in the code.
- Preconditions
-
A developer wants to see all the vulnerabilities introduced in the code because of his commit
- Preconditions
The repo must have the security bot installed. - Main Flow
One of the developers commits a piece of code. Github signals the bot with the commit id. Bot fetches the code from the commit and runs OWASP ZAP and static analysis tools [S1]. Bot will collect the list of vulnerabilities and post a comment on the commit with the new vulnerabilities found because of it. It also raises an issue with the vulnerabilities found [S2]. - Subflows
[S1] - The tools will be picked according to the programming language or framework used. [S2] - The issue will be raised with title as <type_of_vulnerability>. - Alternative Flows
If no vulnerabilities are found, the bot comments that no vulnerability is detected in the code.
- Preconditions
-
A repo collaborator has to check a pull request for vulnerabilities.
- Preconditions
The repo must have the security bot installed. - Main Flow
Some contributor raises a pull request [S1]. Github signals the bot with the pull request number. Bot fetches the code from the Pull Request and runs OWASP ZAP and static analysis tools. Bot makes comments on the Pull Request, detailing the new vulnerabilities added because of it. - Subflows
[S1] - The contributor can raise a pull request from another fork, or from one of the branches in the same repo. The bot will analyze both the base and the branch to be merged, and report any differences. - Alternative Flows
If no vulnerabilities are found, the bot comments that no vulnerability is detected in the code.
- Preconditions
-
Get vulnerability report of the day via email
- Preconditions
The repo must have the security bot installed. - Main Flow
Bot will query the database and fetch list of vulnerabilities from past 5 commits. It sends out an email stating the vulnerabilities found. - Alternative Flows
If no vulnerabilities are found, the bot sends an email that no vulnerability is detected in the code.
- Preconditions
Robocop will work as a service in github, and have the following screens :
The idea here is to show how a potential owner can add Robocop to his repo. He can go to settings and to services and integration tab, and search for Robocop. He just adds Robocop as a service, and Robocop starts doing its magic.
After Robocop is added to a repo, it will raise issues for the vulnerabilities it finds in the application.
When a user pushes a commit, robocop will make comments if there are any new security vulnerabilities created because of that commit.
Someone creates a pull request. This can be from the same repo of from a fork of this repo. Robocop will analyse the pull request and make comments about the security vulnerabilities introduced in this pull request.
The bot can also send timely vulnerability report to the owner or the security team so that they may act on these quickly.
Some storyboards describing primary tasks of our bot:
A developer tries to use RoboCop to test his application against any security vulnerability as he builds it. He adds the service to his GitHub repository and starts development and make a commit. The commit triggers RoboCop to check the code/application and report any vulnerability found in the form of comments and issues.
Some developer wants to add a feature to an application utilizing RoboCop. He creates a pull request which triggers RoboCop to check for any security vulnerability present in the code. RoboCop comments on the pull request with list of vulnerabilities found.
Before looking at the architecture, we wanted to figure out the different components of Robocop.
The first distinction is that the actual repository lives on github, while Robocop will run on some machine. So these two will be two separate components.
Also, we plan to use 3 types of security analysis tools.
- Attack based tools like OWASP ZAP.
- Static code analysis tools like OWASP WAP and Bandit.
- Dependency checkers like Snyk and OWASP Dependency check.
Since these tools go through either the actual code or a deployed instance of the application, Robocop will need a git clone and a deployed instance of that application. This will be another part of the application.
Also, Robocop will need some database to store the vulnerabilities it finds, to make sure they are not reported again.
With that, we made the following overall architecture.
So Robocop acts as an interface between github repo, and the various security tools, and makes the decisions on which tools to call and what to do with the output. It checks the vulnerabilities it finds in the database, whether they have been reported before or not. If they are new vulnerabilities, they are reported to github in the appropriate format.
We can imagine a layered structure of Robocop. It listens to commit and pull request events and then checks for security vulnerabilities by calling appropriate APIs. The APIs could attack the application in various ways. They can analyze the code, check dependencies or directly attack the application webpages and find out vulnerabilities present. There are several tools available which can serve this purpose. Code analysis tools differ with programming languages or frameworks used.
OWASP ZAP is an API to test for network vulnerabilities in the code. It tests the application to identify if any element is prone to network attacks. The Robocop continuously listens to the git operations and invokes the ZAP API to test the application for network vulnerabilities. ZAP performs penetration testing on the application to identify the vulnerabilities in the application. It also identifies the network attack that a component of the application is prone to. It gives back the discovered vulnerabilities and security issues to Robocop which stores them in a MySQL database. Robocop also raises an issue for each vulnerability found and also generates a report to the owner of the repository.
Snyk.io is a software which checks an application for any dependency related vulnerability. Robocop receives dependency list from GitHub which it passes on to Snyk which will check the full list and provide all vulnerable dependencies present. Robocop will save the list in the database so that it can report the vulnerabilities whenever asked.
Design of Robocop is a hybrid of 3 designs.
Publish - Subscribe :
The main idea is to check for new security vulnerabilities introduced with each commit or pull request. So we use publish subscribe architecture where github notifies Robocop of all commit or pull request events. Robocop then checks the commits or pull request for vulnerabilities and reports the findings by either making comments or raising issues.
Object-Oriented Pattern:
We have identified three different kinds of attack tools that inherit from the same attack interface. They take the same github event as input, and dispatch the attack command to the relavant attack tool. Polymorphic design helps providing a generic attack interface while delegating the attack request to relevant module. It also enables easy addition of more module and tools in future.
Layered Pattern: Robocop uses three different types of layers.
-
Github Communication layer - This layer subscribes to github events and forwards the event to underlying layer. It also accepts the response from the underlying and decides how to give it to github.
-
Attack Surface layer - This layer is the front end for all the attack request. Business logic identifies the attack type and delegates to approrpriate attack handler, like ZapHandler. It reports the vulnerabilities found to the layer above.
-
Attack Tool Handler layer - This layer consists of wrappers for multiple attack tools which can be added or removed as required. It communicates with the actual tool using REST Apis or CLI interface, depending on the tool. It also stores or retrieves vulnerability information to/from the database.
-
Tool layer - This layer consists of the actual tools used to conduct the attacks. Tools can be added or removed in accordance with the above layer.
The design had to be modified slightly, because of the way github integrations work. In the case of an integration, Github only forwards events to one machine/server. Hence the Github Communication Layer is split between 2 parts. The Robocop Server and the Robocop Clients. Robocop server receives the events pushed by Github, and stores them in a Database. Robocop Clients can then query the server to get one of the events. While the design introduces runtime inefficiency, it also allows many clients to run simultaneously for the same Repo, thus enabling parallel processing in a way.
Some other patterns that might become useful are -
-
Implicit Invocation - As our bot receives events from github, it implicitly invokes the attack surface which will in turn call the actual attack tool. Similar implicit invocation can be used for integrating multiple attack modules.
-
Pipes and Filters - We may come across a situation where we want to give the output of one attack tool to another attack tool. In this case, pipes and filters might come in handy.