Weighted Term Co-associations

A weighted term co-association approach for producing more coherent topics, a ranking of those topics, and a visualization of the topical structure.

Step 1: Pre-processing

Pre-process the corpus of text:

python prep-text.py -o dataset --df 20 --tfidf --norm path/to/dataset
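As a rough illustration of what this step produces, the sketch below builds a TF-IDF weighted, unit-length document-term representation in plain Python. It assumes that `--df` sets a minimum document frequency, `--tfidf` applies TF-IDF weighting and `--norm` scales each document vector to unit length. These readings of the flags, the `build_tfidf` helper and the toy documents are illustrative assumptions, not the internals of prep-text.py.

```python
import math
from collections import Counter

def build_tfidf(docs, min_df=1):
    """Return (vocabulary, rows); each row is an L2-normalized TF-IDF dict."""
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vocab = sorted(t for t, n in df.items() if n >= min_df)
    keep = set(vocab)
    n_docs = len(docs)
    rows = []
    for tokens in tokenized:
        tf = Counter(t for t in tokens if t in keep)
        # Smoothed IDF keeps weights positive even for very common terms.
        weights = {
            t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1)
            for t, c in tf.items()
        }
        norm = math.sqrt(sum(w * w for w in weights.values()))
        rows.append({t: w / norm for t, w in weights.items()} if norm else {})
    return vocab, rows

# Hypothetical toy corpus; min_df=2 plays the role assumed for --df.
docs = [
    "stocks fall as markets react",
    "markets rally as stocks rise",
    "team wins the cup final",
]
vocab, rows = build_tfidf(docs, min_df=2)
print(vocab)
```

Terms that appear in fewer than `min_df` documents are dropped, so the third toy document ends up with an empty vector.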

Step 2: Topic Modeling

Apply NMF to the pre-processed corpus for a fixed number of topics or a range of values (here --kmin 5 and --kmax 5 give exactly 5 topics):

python topic-nmf.py dataset.pkl --init random --kmin 5 --kmax 5 -r 20 --seed 1000 --maxiters 100 -o models/dataset
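For readers unfamiliar with NMF, here is a minimal sketch of the factorization using the standard multiplicative update rules in plain Python. It is not the code in topic-nmf.py: the `nmf` function, its parameter names (chosen to echo `--seed` and `--maxiters`) and the toy matrix are assumptions made for illustration.

```python
import random

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    Yt = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in Yt] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def frobenius_error(A, W, H):
    """Frobenius norm of A - WH."""
    WH = matmul(W, H)
    return sum((a - b) ** 2 for ra, rb in zip(A, WH) for a, b in zip(ra, rb)) ** 0.5

def nmf(A, k, max_iters=100, seed=1000, eps=1e-9):
    """Approximate non-negative A (docs x terms) as W (docs x k) times H (k x terms)."""
    rng = random.Random(seed)
    n, m = len(A), len(A[0])
    # Random non-negative initialization of both factors.
    W = [[rng.random() for _ in range(k)] for _ in range(n)]
    H = [[rng.random() for _ in range(m)] for _ in range(k)]
    for _ in range(max_iters):
        # H <- H * (W^T A) / (W^T W H), element-wise.
        Wt = transpose(W)
        num, den = matmul(Wt, A), matmul(matmul(Wt, W), H)
        H = [[h * a / (d + eps) for h, a, d in zip(rh, ra, rd)]
             for rh, ra, rd in zip(H, num, den)]
        # W <- W * (A H^T) / (W H H^T), element-wise.
        Ht = transpose(H)
        num, den = matmul(A, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (d + eps) for w, a, d in zip(rw, ra, rd)]
             for rw, ra, rd in zip(W, num, den)]
    return W, H

# Hypothetical document-term matrix with two clearly separated groups.
terms = ["stocks", "markets", "cup", "final"]
A = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
W, H = nmf(A, k=2)
error = frobenius_error(A, W, H)
# Topic descriptors: the highest-weighted terms in each row of H.
topics = [[terms[j] for j in sorted(range(len(terms)), key=lambda j: -row[j])[:2]]
          for row in H]
print(error, topics)
```

Rows of W give document-topic weights (the basis for a partition of the documents), and rows of H give term weights (the basis for ranked term lists).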

To check the results:

python display-topics.py -t 10 data/bbc/nmf_k05/*rank*

Step 3: Weighted Term Co-association

python ensemble-weighted-coassoc.py -k 5 -m wikipedia2016-w2v-cbow-d100.bin -t 10 data/bbc.pkl data/bbc/nmf_k05/*partition* data/bbc/nmf_k05/*rank* -o results/bbc
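The general idea of a weighted term co-association matrix can be sketched as follows. This is one plausible reading, not the algorithm implemented in ensemble-weighted-coassoc.py: it assumes that a pair of terms is scored by the fraction of runs in which the two terms share a topic descriptor, multiplied by their cosine similarity in the embedding. The function names, the toy runs and the 2-dimensional vectors are all hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations

def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def weighted_coassociation(runs, embedding):
    """Score each term pair: (fraction of runs sharing a topic) * cosine similarity."""
    counts = defaultdict(int)
    for topics in runs:
        # Collect each pair at most once per run, in sorted order.
        pairs = set()
        for descriptor in topics:
            pairs.update(combinations(sorted(set(descriptor)), 2))
        for pair in pairs:
            counts[pair] += 1
    return {
        (a, b): (n / len(runs)) * cosine(embedding[a], embedding[b])
        for (a, b), n in counts.items()
    }

# Hypothetical toy data: two runs, each with two topic descriptors.
runs = [
    [["stocks", "markets"], ["cup", "final"]],
    [["stocks", "markets"], ["cup", "stocks"]],
]
embedding = {
    "stocks": [1.0, 0.0],
    "markets": [1.0, 0.0],
    "cup": [0.0, 1.0],
    "final": [0.0, 1.0],
}
weights = weighted_coassociation(runs, embedding)
for pair, score in sorted(weights.items()):
    print(pair, round(score, 3))
```

In this toy case, pairs that co-occur in every run and are close in the embedding receive the highest weight, while pairs that co-occur once but are semantically unrelated receive a weight near zero.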

Step 4: Evaluation with Coherence

Embeddings are available to download here

python evaluate-embedding.py -b -t 10 -m wikipedia2016-w2v-cbow-d100.bin -o results/bbc-coherence.csv data/bbc/nmf_k05/*rank*
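A common way to compute embedding-based coherence is the mean pairwise cosine similarity between the top terms of a topic. The sketch below shows that definition in plain Python. Whether evaluate-embedding.py uses exactly this formula is an assumption, and the `topic_coherence` helper and toy vectors are hypothetical stand-ins for a real word2vec model.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def topic_coherence(descriptor, embedding):
    """Mean cosine similarity over all pairs of terms in a topic descriptor."""
    pairs = list(combinations(descriptor, 2))
    if not pairs:
        return 0.0  # A single-term descriptor has no pairs to compare.
    return sum(cosine(embedding[a], embedding[b]) for a, b in pairs) / len(pairs)

# Hypothetical 2-dimensional embedding for illustration only.
embedding = {
    "stocks": [1.0, 0.0],
    "markets": [0.8, 0.6],
    "cup": [0.0, 1.0],
}
focused = topic_coherence(["stocks", "markets"], embedding)
mixed = topic_coherence(["stocks", "markets", "cup"], embedding)
print(round(focused, 3), round(mixed, 3))
```

A descriptor whose terms sit close together in the embedding scores higher than one that mixes unrelated terms.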

Step 5: Evaluation with NMI

python evaluate-accuracy.py -o results/bbc-accuracy.csv data/bbc.pkl data/bbc/nmf_k05/*partition*
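Normalized mutual information (NMI) compares a predicted document partition against ground-truth labels. The sketch below computes it in plain Python, normalizing by the arithmetic mean of the two entropies. Which normalization evaluate-accuracy.py uses is not stated here, so that choice, the `nmi` helper and the toy labels are assumptions.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (natural log) of a list of labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def nmi(truth, predicted):
    """Mutual information divided by the mean of the two label entropies."""
    n = len(truth)
    joint = Counter(zip(truth, predicted))
    count_t, count_p = Counter(truth), Counter(predicted)
    mi = sum(
        (c / n) * math.log(c * n / (count_t[t] * count_p[p]))
        for (t, p), c in joint.items()
    )
    denom = (entropy(truth) + entropy(predicted)) / 2
    # Both labelings trivial (one cluster each): treat as perfect agreement.
    return mi / denom if denom else 1.0

# Hypothetical labels: cluster ids need not match the class names.
truth = ["business", "business", "sport", "sport"]
perfect = nmi(truth, [1, 1, 0, 0])
unrelated = nmi(truth, [0, 1, 0, 1])
print(round(perfect, 3), round(unrelated, 3))
```

Because NMI depends only on how documents are grouped, a partition that matches the labels up to renaming of clusters scores 1, and a partition independent of the labels scores 0.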

The full weighted term co-association generation and evaluation process can also be run from the provided Jupyter notebook, co-association.ipynb.