
Onion Hunter API

cribdragg3r edited this page Feb 4, 2020 · 5 revisions



Configuration Class

  • Description: Contains all the user-customizable objects, such as the sub-reddits to search, the keywords used to index and categorize onions, and the Reddit API details. All user customization is contained within this class.

  • Methods:

    • __init__() - This is the only method; it contains all the required variables.
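A minimal sketch of what this class might look like; the attribute names and values below are illustrative placeholders, not the actual contents of the project's configuration file.

```python
# Hypothetical sketch of the Configuration class. Attribute names and
# example values are assumptions, not the project's real settings.
class Configuration:
    def __init__(self):
        # Sub-reddits to search for onion links (illustrative values)
        self.sub_reddits = ["onions", "deepweb", "darknet"]
        # Keywords used to index and categorize onions
        self.keywords = ["market", "forum", "hosting"]
        # Reddit API credentials (placeholders)
        self.r_client_id = "YOUR_CLIENT_ID"
        self.r_client_secret = "YOUR_CLIENT_SECRET"
        self.r_user_agent = "onion-hunter"
```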


Database Manager Class

  • Description: This class maintains all communication to and from the SQLite3 database (onion.db). The onion.db file is located within the root directory of the repository, and the name is hard-coded within this class. If, for any reason, you would like to change the database name or security level (e.g., password protection), this class would need to be amended.

  • Methods:

    • deleteAll() - Deletes ALL data from ALL tables and resets the SQLite table index values for the primary keys (id).

      • Returns: N/A

    • onionsInsert() - Takes six (6) arguments containing information about a specific Onion domain. This function inserts the data into the ONIONS table.

      • Returns: N/A

    • freshInsert() - Takes two (2) arguments. This function inserts an Onion domain into the FRESH_ONION_SOURCES table.

      • Returns: N/A

    • knownOnionsInsert() - Takes two (2) arguments. This function inserts an Onion Domain into the KNOWN_ONIONS table.

      • Returns: N/A

    • getFreshOnionDomains() - Aggregates a list of all Onion URIs within the FRESH_ONION_SOURCES table.

      • Returns: List Object

    • checkOnionsDuplicate() - Checks if a Domain already exists within the ONIONS table.

      • Returns: Boolean
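A minimal sketch of how checkOnionsDuplicate() could be implemented, assuming domains are compared by their SHA-256 DOMAIN_HASH column (the column exists in the ONIONS schema below; the query itself is an assumption).

```python
import sqlite3

# Sketch of a duplicate check against the ONIONS table, assuming the
# domain is identified by its DOMAIN_HASH value.
def check_onions_duplicate(conn, domain_hash):
    cur = conn.execute(
        "SELECT COUNT(*) FROM ONIONS WHERE DOMAIN_HASH = ?", (domain_hash,)
    )
    return cur.fetchone()[0] > 0
```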

    • cleanupFreshOnions() - Deletes any tuple that does not meet the Fresh Onion requirements from the FRESH_ONION_SOURCES table and then optimizes the onion.db.

      • Returns: N/A

    • cleanupOnions() - Deletes all timed-out domains and any domain whose URI contains Facebook or nytimes.

      • Returns: N/A

    • addTitlesFromSource() - The ONIONS table has been amended to include the HTML index source as its own attribute, which makes webpage-type attribution fast and easy. This function only uses what is already in the database; every new finding will have this attribute included.

Building a Fresh Database

If you need to build a new database, below are the SQL statements required to create the three (3) tables that Hunt.py needs to operate.

CREATE TABLE ONIONS
(ID INTEGER PRIMARY KEY AUTOINCREMENT,
DATE_FOUND TEXT NOT NULL,
DOMAIN_SOURCE TEXT NOT NULL,
URI TEXT NOT NULL,
URI_TITLE TEXT,
DOMAIN_HASH TEXT NOT NULL,
KEYWORD_MATCHES TEXT,
KEYWORD_MATCHES_SUM INT,
INDEX_SOURCE TEXT NOT NULL);

CREATE TABLE FRESH_ONION_SOURCES
(ID INTEGER PRIMARY KEY AUTOINCREMENT,
URI TEXT NOT NULL,
DOMAIN_HASH TEXT NOT NULL,
FOREIGN KEY (DOMAIN_HASH) REFERENCES ONIONS (DOMAIN_HASH));

CREATE TABLE KNOWN_ONIONS
(ID INTEGER PRIMARY KEY AUTOINCREMENT,
DOMAIN_HASH TEXT NOT NULL,
DATE_REPORTED TEXT,
REPORTED INT NOT NULL,
FOREIGN KEY (DOMAIN_HASH) REFERENCES ONIONS (DOMAIN_HASH));
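The statements above can be applied in one shot with Python's built-in sqlite3 module, for example (the helper name and use of `executescript` are a sketch, not the project's own build routine):

```python
import sqlite3

# Create the three tables Hunt.py requires. The schema string mirrors
# the CREATE TABLE statements documented above.
SCHEMA = """
CREATE TABLE ONIONS
(ID INTEGER PRIMARY KEY AUTOINCREMENT,
DATE_FOUND TEXT NOT NULL,
DOMAIN_SOURCE TEXT NOT NULL,
URI TEXT NOT NULL,
URI_TITLE TEXT,
DOMAIN_HASH TEXT NOT NULL,
KEYWORD_MATCHES TEXT,
KEYWORD_MATCHES_SUM INT,
INDEX_SOURCE TEXT NOT NULL);

CREATE TABLE FRESH_ONION_SOURCES
(ID INTEGER PRIMARY KEY AUTOINCREMENT,
URI TEXT NOT NULL,
DOMAIN_HASH TEXT NOT NULL,
FOREIGN KEY (DOMAIN_HASH) REFERENCES ONIONS (DOMAIN_HASH));

CREATE TABLE KNOWN_ONIONS
(ID INTEGER PRIMARY KEY AUTOINCREMENT,
DOMAIN_HASH TEXT NOT NULL,
DATE_REPORTED TEXT,
REPORTED INT NOT NULL,
FOREIGN KEY (DOMAIN_HASH) REFERENCES ONIONS (DOMAIN_HASH));
"""

def build_database(path="onion.db"):
    # onion.db is hard-coded in the Database Manager class, so the
    # default path matches; executescript runs all three statements.
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```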

Utilities Class

  • Description: Contains all non-specific functions that are required for use.

  • Methods:

    • getSHA256() - Takes one (1) argument (String Object) and returns a SHA256 hash of that object.

      • Returns: String SHA256 Hash
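A one-function sketch of getSHA256() using the standard library's hashlib (the function name styling is adapted; the hashing itself is standard):

```python
import hashlib

# Sketch of getSHA256(): hash a string and return the hex digest.
def get_sha256(data):
    return hashlib.sha256(data.encode("utf-8")).hexdigest()
```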

    • getOnions() - Takes one (1) argument (String Object) and searches for Onion Domains within the text via a RegEx function.

      • RegEx: `https?://(?:[-\w.]|(?:%[\da-fA-F]{2}))+\.onion`

      • Returns: List Object of all found domains
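A sketch of getOnions() built around the documented pattern, with the trailing dot escaped so it matches ".onion" literally:

```python
import re

# Sketch of getOnions(): extract Onion domains from free text using the
# pattern documented above.
ONION_RE = re.compile(r"https?://(?:[-\w.]|(?:%[\da-fA-F]{2}))+\.onion")

def get_onions(text):
    return ONION_RE.findall(text)
```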

    • isFreshOnionRepo() - Checks an Onion domain's source for the number of unique Onion domains listed. If more than 50 domains are found, the domain is categorized as a possible Fresh Onion Source.

      • Returns: Boolean
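The check above amounts to counting unique domain matches against a threshold; a minimal sketch, assuming the same onion pattern used by getOnions():

```python
import re

# Sketch of isFreshOnionRepo(): a page qualifies as a possible Fresh
# Onion source when it lists more than 50 unique Onion domains.
ONION_RE = re.compile(r"https?://(?:[-\w.]|(?:%[\da-fA-F]{2}))+\.onion")

def is_fresh_onion_repo(page_source, threshold=50):
    unique_domains = set(ONION_RE.findall(page_source))
    return len(unique_domains) > threshold
```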

    • isTorEstablished() - Verifies that the system is connected to a Tor proxy and is capable of navigating to Onion Addresses.

      • Returns: Boolean

    • deepPasteEnum() - Deep Paste uses MD5 hashes in its URIs. This function finds all the MD5 hashes in order to build valid URIs for searching.

      • Returns: List
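Since an MD5 hash is a fixed 32-hex-character token, the enumeration can be sketched as a regex scan; the base URL below is a placeholder, not Deep Paste's actual address:

```python
import re

# Sketch of deepPasteEnum(): collect 32-hex-character MD5 tokens from a
# page and rebuild searchable URIs from them. The base URL is a
# hypothetical placeholder.
MD5_RE = re.compile(r"\b[a-fA-F0-9]{32}\b")

def deep_paste_enum(page_source, base="http://example.onion/md5/"):
    return [base + md5 for md5 in MD5_RE.findall(page_source)]
```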
