SiteTextRanker

Receives a URL and outputs the 25 most common words on the page. The URL is the one and only parameter.

Assumptions made:

  1. The URL passed to the class is fully formed and valid for Jsoup to use.
  2. Outputting the list to the terminal is fine.
  3. The SiteTextGetter implementation works correctly.
    1. Jsoup does not correctly extract all text in every case. This is noted in the SiteTextGetter class.
  4. Counting words is not case sensitive.
  5. Numbers (e.g., 123) are not words, but single-letter "words" like 'v' or 'x' are.
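
As a rough illustration of assumptions 4 and 5, here is a minimal sketch of how the counting step could look. It is not the project's actual code: the class name `WordRankSketch`, the method `topWords`, and the tie-breaking rule (alphabetical order for equal counts) are all hypothetical, and the page text is assumed to have already been fetched as a plain string.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not the SiteTextRanker implementation.
class WordRankSketch {
    // A "word" is a run of letters, so "123" is skipped and "x" counts.
    // Assumption: a mixed token like "abc123" contributes only "abc".
    private static final Pattern WORD = Pattern.compile("\\p{L}+");

    static List<Map.Entry<String, Integer>> topWords(String text, int limit) {
        Map<String, Integer> counts = new HashMap<>();
        // Lower-casing first makes the count case insensitive.
        Matcher m = WORD.matcher(text.toLowerCase(Locale.ROOT));
        while (m.find()) {
            counts.merge(m.group(), 1, Integer::sum);
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        // Highest count first; ties broken alphabetically (an assumption).
        entries.sort(Map.Entry.<String, Integer>comparingByValue(Comparator.reverseOrder())
                .thenComparing(Map.Entry.<String, Integer>comparingByKey()));
        return entries.subList(0, Math.min(limit, entries.size()));
    }

    public static void main(String[] args) {
        String sample = "The cat and the hat 123 x x X the";
        for (Map.Entry<String, Integer> e : topWords(sample, 25)) {
            System.out.println(e.getKey() + ": " + e.getValue());
        }
    }
}
```

Passing 25 as `limit` matches the top-25 output described above; the sample string in `main` is only there to exercise the case-folding and number-skipping rules.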

How to run

Run it as you would any executable jar, passing the URL as the only argument:

java -jar SiteTextRanker.jar https://www.your-cool-site.com