Welcome to NLP server repo
The purpose of this system is to provide NLP services to platforms like DEEP. At the moment, DEEP is the major consumer of the services, but we hope to have more consumers in the future.
Various NLP services are deployed in AWS, such as text extraction, classification, and summarization, and they make use of different AWS services like Lambda and ECS. The NLP server provides a consistent REST wrapper for all of those services.
Besides exposing the NLP services, this system is also intended to monitor their performance, primarily that of the classification service. To support this, data from DEEP is periodically pulled and stored in the database so that the monitoring scripts can run against it.
More details on the infrastructure can be found here.
- ToFetchProject: A project in DEEP for which data needs to be fetched and processed/monitored. Typically this is used to fetch the entries and leads for that project.
- DeepDataFetchTracker: To track when the analysis frameworks and orgs were last fetched from DEEP.
- Organization: To store relevant information for an Organization in DEEP.
- AFMapping: To store relevant information for an Analysis Framework in DEEP, including tags and mappings.
- Project: Corresponding Project in DEEP, but with a subset of the project info.
- Lead: Corresponding Lead in DEEP, but with a subset of the lead info.
- Entry: Corresponding Entry in DEEP, but with a subset of the entry info.
- NLPRequest: To keep track of each incoming request and its status.
- FailedCallback: To keep track of failed tasks in ECS which were triggered from here.
- ClassificationModel: To keep track of classification models (name, url, version) deployed in AWS.
- ClasificationPredictions
- ProjectWisePerfMetrics
- TagWisePerfMetrics
- AllProjectPerfMetrics
- CategoryWiseMatchRatios
- ProjectWiseMatchRatios
- ComputedFeatureDrift
- Clone the repo.
- Set the DEEP database details in the `.env` file. If the data is being accessed from the alpha/prod server, you might need to set up a proxy to the DEEP database using ssh.
- Run `docker-compose up`.
- Use `poetry` to add/remove packages.
Every request to nlp-server should contain a valid token in the header as `Authorization: Bearer <token>`.
- From the admin panel (https://HOST:PORT/admin), create a user if one does not already exist, and then create a token for that user.
- Distribute the token to clients.
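
On the client side, attaching the token could look roughly like the sketch below. It uses only the Python standard library and does not send anything. The host `nlp.example.org`, the path `/api/example/`, and the token value are placeholders chosen for illustration and are not taken from this repo.

```python
import urllib.request
from typing import Optional


def build_request(url: str, token: str, payload: Optional[bytes] = None) -> urllib.request.Request:
    """Build (but do not send) a request that carries the token in the header."""
    req = urllib.request.Request(url, data=payload)
    # The header format described above: "Authorization: Bearer <token>"
    req.add_header("Authorization", f"Bearer {token}")
    if payload is not None:
        # Assumption: request bodies are JSON; adjust if the endpoint differs.
        req.add_header("Content-Type", "application/json")
    return req


# Placeholder URL and token, for illustration only.
req = build_request("https://nlp.example.org/api/example/", "my-secret-token")
print(req.get_header("Authorization"))
```

Passing the resulting object to `urllib.request.urlopen` would send it. The same header can be set with any other HTTP client.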
- `fetch_new_projects`: Fetches newly added active projects and stores them as `ToFetchProject`.
- `fetch_deep_data`: Fetches leads, entries, AFs, and orgs from DEEP based on `ToFetchProject`.
- `calculate_model_metrics`: Calculates and stores various performance metrics for the NLP models.
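
The way the first two tasks build on each other can be pictured with a small in-memory sketch. This is not the repo's implementation: the `InMemoryStore` class, the `sketch_` function names, and the loader callbacks are all hypothetical stand-ins for the database models and the DEEP queries.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Set


@dataclass
class InMemoryStore:
    """Hypothetical stand-in for the database tables."""
    to_fetch: Set[int] = field(default_factory=set)            # plays the role of ToFetchProject
    leads: Dict[int, List[str]] = field(default_factory=dict)
    entries: Dict[int, List[str]] = field(default_factory=dict)


def sketch_fetch_new_projects(store: InMemoryStore, active_project_ids: Iterable[int]) -> List[int]:
    """Register only the projects that are not tracked yet and return them."""
    new_ids = set(active_project_ids) - store.to_fetch
    store.to_fetch |= new_ids
    return sorted(new_ids)


def sketch_fetch_deep_data(
    store: InMemoryStore,
    load_leads: Callable[[int], List[str]],
    load_entries: Callable[[int], List[str]],
) -> None:
    """Pull leads and entries for every tracked project."""
    for project_id in sorted(store.to_fetch):
        store.leads[project_id] = load_leads(project_id)
        store.entries[project_id] = load_entries(project_id)


store = InMemoryStore()
first = sketch_fetch_new_projects(store, [1, 2])
second = sketch_fetch_new_projects(store, [2, 3])  # project 2 is already tracked
sketch_fetch_deep_data(
    store,
    load_leads=lambda pid: [f"lead-{pid}"],        # fake loader instead of a DEEP query
    load_entries=lambda pid: [f"entry-{pid}"],
)
```

The point of the sketch is the ordering: the set of tracked projects decides what the data fetch covers, so a project only starts being monitored once it has been registered.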