DeweyDecimalGenerator

Takes the name of a book and generates the Dewey Decimal Classification (DDC) code that a library needs to catalogue the book.

File structure

-DDC.pickle and imageread.py - The DDC data is available only as a scanned book, so the page images need to be converted to text. Tesseract, an OCR library, was used for this task, and imageread.py contains the code that does it. The images are too large to be hosted here, but they can easily be obtained online. DDC.pickle contains the resulting output: a list of book names and their codes

-classifier_binary.py and classifier_multi.py - Contain prototyping code for a simple text classifier built with NLTK

-bookclassifier.py - Can work in training or inference mode. In training mode, it reads DDC.pickle, builds the multiple levels of NaiveBayesClassifiers needed to generate the DDC code, and saves them to disk. See Explanation.MD for details on how it works. Training can take an hour. In inference mode, it reads the saved classifiers and outputs the code

-Explanation.MD - Contains a summary of this work and how the system is designed
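
The multi-level idea described for bookclassifier.py can be illustrated with a minimal sketch. It is not the repository's code: the shape of the DDC.pickle data (a list of `(title, code)` pairs), the three-digit code depth, and the names `features`, `WordVoteClassifier`, `train_levels` and `predict` are all assumptions made for illustration. To stay dependency-free, the sketch swaps NLTK's NaiveBayesClassifier for a tiny word-overlap stand-in that exposes the same `train(labeled_featuresets)` / `classify(featureset)` shape.

```python
import pickle
from collections import Counter, defaultdict

# Toy data in the shape DDC.pickle is ASSUMED to have: (book title, DDC code).
pairs = [
    ("Introduction to Computer Programming", "005"),
    ("Computer Networks", "004"),
    ("Organic Chemistry Basics", "547"),
    ("A History of Ancient Rome", "937"),
]

# Round-trip through pickle, as reading a file like DDC.pickle would.
loaded = pickle.loads(pickle.dumps(pairs))


def features(title):
    """Bag-of-words feature dict, in the {name: True} style NLTK classifiers accept."""
    return {word: True for word in title.lower().split()}


class WordVoteClassifier:
    """Hypothetical stand-in for nltk.NaiveBayesClassifier (same train/classify shape)."""

    def __init__(self, counts):
        self.counts = counts  # label -> Counter of words seen with that label

    @classmethod
    def train(cls, labeled_featuresets):
        counts = defaultdict(Counter)
        for feats, label in labeled_featuresets:
            counts[label].update(feats.keys())
        return cls(dict(counts))

    def classify(self, feats):
        # Choose the label whose training titles share the most words with the input.
        return max(sorted(self.counts),
                   key=lambda label: sum(self.counts[label][f] for f in feats))


def train_levels(data, depth=3, train=WordVoteClassifier.train):
    """Train one classifier per code prefix: '' picks digit 1, '0' picks digit 2, ..."""
    groups = defaultdict(list)
    for title, code in data:
        for level in range(depth):
            groups[code[:level]].append((features(title), code[level]))
    return {prefix: train(examples) for prefix, examples in groups.items()}


def predict(classifiers, title, depth=3):
    """Walk down the levels, appending one predicted digit at a time."""
    code = ""
    for _ in range(depth):
        code += classifiers[code].classify(features(title))
    return code


classifiers = train_levels(loaded)
print(predict(classifiers, "computer programming for beginners"))  # prints 005
```

Because the stand-in mirrors NLTK's interface, passing `train=nltk.NaiveBayesClassifier.train` to `train_levels` should work the same way when NLTK is installed. Splitting the problem by prefix keeps each classifier's label set small (at most ten digits) instead of asking a single model to choose among every possible code.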

Genius1237/DeweyDecimalGenerator
