Where are all the parts of the project located?
.
├── db
│ ├── backup_tsv
│ ├── bold_definitions
│ ├── deconstructor
│ ├── epd
│ ├── families
│ ├── frequency
│ ├── inflections
│ ├── lookup
│ ├── sanskrit
│ └── sinhala
├── db_tests
├── docs
├── dps
├── exporter
│ ├── bold_definitions
│ ├── deconstructor
│ ├── goldendict
│ ├── grammar_dict
│ ├── kindle
│ ├── kobo
│ ├── other_dictionaries
│ ├── share
│ ├── sinhala
│ ├── tbw
│ ├── tpr
│ └── webapp
├── go_modules
│ ├── deconstructor
│ ├── dpdDb
│ ├── frequency
│ └── tools
├── gui
├── resources
├── scripts
│ ├── add
│ ├── archive
│ ├── backup
│ ├── bash
│ ├── build
│ ├── export
│ ├── find
│ ├── fix
│ ├── info
│ └── tutorial
├── shared_data
├── temp
└── tools
There are four main parts to the code:
- db: Create the database and build up the tables of derived data.
- db_tests: Run data integrity tests on the db.
- gui: Add new words, edit and update the db with a GUI.
- exporter: Compile all the parts and export into various dictionary formats.
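As a quick sanity check of a fresh checkout, the four main parts listed above can be verified with a few lines of Python. This is only a minimal sketch: the helper name `missing_parts` is hypothetical and not part of the project, and it assumes the script is run from the project root.

```python
from pathlib import Path

# The four main parts named in the list above.
MAIN_PARTS = ("db", "db_tests", "gui", "exporter")


def missing_parts(project_root: Path, parts: tuple[str, ...] = MAIN_PARTS) -> list[str]:
    """Return the names in `parts` that are not directories under `project_root`.

    Hypothetical helper for illustration only; it is not part of the project.
    """
    return [name for name in parts if not (project_root / name).is_dir()]


if __name__ == "__main__":
    # Assumption: the current working directory is the project root.
    missing = missing_parts(Path.cwd())
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("All four main parts are present.")
```

The check only confirms that the folders exist; it says nothing about whether the database has been built or the tests pass.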
- db/: All code related to building and populating the various tables and columns of the database.
  - models.py: SQLAlchemy model of the database.
  - db_helpers.py: Helper functions to create the database, get a session, get column names, etc.
  - backup_tsv/: TSV backups of the database source tables.
  - bold_definitions/: Extract bold definitions from CST texts and compile them for easy searching.
  - epd/: Compile data for the English to Pāḷi dictionary.
  - families/: Compile HTML and JSON lists to populate the `family_compound`, `family_idiom`, `family_root`, `family_word` and `family_set` tables in the database, for easy access.
  - frequency/: Create the frequency maps for every word in CST texts.
  - inflections/: Populate the `inflection_templates` table and use that to create HTML inflection tables for every word.
  - lookup/: Populate the `lookup` table for instant access to every word in the dictionary.
  - sanskrit/: Data used for updating the `sanskrit` column in the `dpd_headwords` table.
  - sinhala/: Data used to populate the `sinhala` table.
- db_tests/: Tests to ensure the completeness and accuracy of the Pāḷi database.
- docs/: Helpful project documentation.
  - project_folder_structure.md: What you're reading right now.
  - dpd_headwords_table.md: Description of the real and virtual columns in the `dpd_headwords` table.
- dps/: All of Devamitta's code related to creating and populating the `russian` and `sbs` tables.
- exporter/: Export the DPD database into various formats.
  - deconstructor/: Export DPD Deconstructor to GoldenDict and MDict.
  - goldendict/: Export DPD, EPD, Help and Abbreviations to GoldenDict and MDict.
  - grammar_dict/: Export DPD Grammar to GoldenDict and MDict.
  - kindle/: Export a light version of DPD for Kindle.
  - kobo/: Export a light version of DPD for Kobo eReader.
  - other_dictionaries/: Export other Pāḷi and Sanskrit dictionaries to GoldenDict and MDict.
  - sinhala/: Export a light version of DPD in Sinhala to GoldenDict and MDict.
  - tbw/: Export a light version of DPD for integration into The Buddha's Words website.
  - tpr/: Export DPD grammar deconstructor data for integration into the Tipitaka Pali Reader app.
  - webapp/: A web application using the DPD database and FastAPI, hosted at www.dpdict.net
- go_modules/: When Python is too slow, write it in Go.
  - deconstructor/: Break up more than a million compound words.
  - dpdDb/: Access the database using GORM.
  - frequency/: Generate frequency tables and data lists for every word.
  - tools/: Go packages used across the project.
- gui/: A GUI for data capture and running database tests.
- resources/: All external resources used by the project, imported as submodules.
- scripts/: Useful scripts for project maintenance. See the scripts readme for more info.
- shared_data/: Data used across the project.
- tools/: Python modules used across the project.