codesurf/NLP-PoS-Tagging


NLP-PoS-Tagging

A file is given as input, and its text is pre-processed, tokenized, stemmed, and lemmatized. The lemmatized words are then tagged in the context of the Brown Corpus. The result is a list of words along with their respective tags. Some side results are also produced:

  1. Bar-Graphs:
    1. Relationship between word length and frequency (tokens with stopwords)
    2. Relationship between word length and frequency (tokens without stopwords)
    3. Top 10 most frequent words in the file (with stopwords)
    4. Top 10 most frequent words in the file (without stopwords)
    5. Relationship between word tags and frequency
  2. Word Cloud:
    1. Of tokens (with stopwords)
    2. Of tokens (without stopwords)
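
The counts behind the bar graphs listed above can be sketched with the standard library alone. This is a minimal illustration, not the repository's actual code: the helper names (`tokenize`, `remove_stopwords`, `length_frequency`, `top_words`, `tag_frequency`), the tiny `STOPWORDS` set, and the sample sentence and tags are all assumptions made for the example. Stemming, lemmatization, and Brown Corpus tagging are left out, so `tag_frequency` simply takes a list of already tagged `(word, tag)` pairs.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword set used only for this sketch;
# a real run would presumably rely on a much fuller list.
STOPWORDS = {"a", "an", "and", "the", "is", "in", "of", "to", "on"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stopwords(tokens, stopwords=STOPWORDS):
    """Return the tokens that are not in the stopword set."""
    return [t for t in tokens if t not in stopwords]


def length_frequency(tokens):
    """Map each word length to the number of tokens of that length."""
    return Counter(len(t) for t in tokens)


def top_words(tokens, n=10):
    """Return the n most frequent tokens as (word, count) pairs."""
    return Counter(tokens).most_common(n)


def tag_frequency(tagged):
    """Count how often each tag occurs in a list of (word, tag) pairs."""
    return Counter(tag for _, tag in tagged)


sample = "The cat sat on the mat and the dog sat in the sun"
tokens = tokenize(sample)            # tokens with stopwords
content = remove_stopwords(tokens)   # tokens without stopwords

lengths_all = length_frequency(tokens)
lengths_content = length_frequency(content)
top_all = top_words(tokens)
top_content = top_words(content)

# Illustrative tags only; in the project the tags come from the tagging step.
tags = tag_frequency([("cat", "NN"), ("sat", "VBD"), ("dog", "NN")])
```

Each `Counter` maps directly onto one bar graph (keys on the x-axis, counts on the y-axis), and the token lists `tokens` and `content` are the natural inputs for the two word clouds.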
