This project contains source code and supporting files for a serverless application that you can deploy with the AWS Serverless Application Model (AWS SAM). The project exposes an API to manage document information and classify documents with Amazon Comprehend. It includes the following files and folders:
- src - Code for the application's Lambda functions.
- template.yaml - A template that defines the application's AWS resources and references the stack deployed from the main project.
To get started, deploy the CloudFormation template:
aws cloudformation deploy --template-file infra.yaml --stack-name your-stack-name --capabilities CAPABILITY_NAMED_IAM
To use the AWS SAM CLI, you need the following tools:
- AWS SAM CLI - Install the AWS SAM CLI.
- Node.js - Install Node.js 16, including the npm package management tool.
- Docker - Install Docker community edition.
To build and deploy the application for the first time, refer to the main project. You can check the article on THE TBBC WEBSITE.
If you just want to deploy the APIs and apply new changes to this project, run the following in your shell:
sam build --use-container
cd .aws-sam/build/
# for the first time
sam deploy --guided
# afterwards
sam deploy --config-file ../../samconfig.toml
The sam build command builds the source of your application. The sam deploy command packages and deploys your application to AWS. The API Gateway endpoint is displayed in the outputs when the deployment is complete.
- file-manager: this Lambda function exposes an Express API with the following routes:
- extract-data: S3 trigger that takes the uploaded file and invokes the Amazon Textract API.
- textract-checker: because uploaded files can contain multiple pages, text extraction is processed asynchronously through an SQS trigger; the extracted data is then persisted in DynamoDB.
- classify-document: S3 trigger that takes the file and classifies it with Amazon Comprehend (batch process). The file name must carry a prefix (classify_<file_name>.extension).
- training-files: Lambda function triggered by EventBridge that generates the CSV files from the DynamoDB records that are pending sync.
- classify-extractor: S3 trigger that takes the classification output from Amazon Comprehend, extracts the information, and persists the classification results (file_name, score, label) in a new DynamoDB table.
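The two classification steps above can be illustrated with a minimal sketch. It is not the project's actual code: the helper names keysToClassify and toClassificationRows are hypothetical, and the Comprehend output is assumed to be JSON Lines where each line has a File field and a Classes array of { Name, Score } entries. The sketch only shows how the classify_ prefix could be checked on incoming S3 keys and how one output line could be mapped to a (file_name, score, label) row; the calls to Comprehend and DynamoDB are left out.

```javascript
// Hypothetical prefix constant, taken from the naming rule classify_<file_name>.extension.
const CLASSIFY_PREFIX = "classify_";

// Hypothetical helper: returns the decoded object keys of an S3 event
// whose file name (last path segment) starts with the classify_ prefix.
function keysToClassify(event) {
  return (event.Records || [])
    // Keys in S3 event notifications are URL-encoded, with spaces sent as "+".
    .map((record) => decodeURIComponent(record.s3.object.key.replace(/\+/g, " ")))
    .filter((key) => key.split("/").pop().startsWith(CLASSIFY_PREFIX));
}

// Hypothetical helper: maps assumed JSON Lines output to rows of
// (file_name, score, label), keeping the highest-scoring class per line.
function toClassificationRows(jsonLines) {
  return jsonLines
    .split("\n")
    .filter((line) => line.trim() !== "")
    .flatMap((line) => {
      const { File, Classes = [] } = JSON.parse(line);
      if (Classes.length === 0) return []; // nothing to persist for this line
      const best = Classes.reduce((a, b) => (b.Score > a.Score ? b : a));
      return [{ file_name: File, score: best.Score, label: best.Name }];
    });
}

// Usage with a made-up event: only the prefixed file is selected.
const sampleEvent = {
  Records: [
    { s3: { object: { key: "uploads/classify_invoice+1.pdf" } } },
    { s3: { object: { key: "uploads/report.pdf" } } },
  ],
};
console.log(keysToClassify(sampleEvent)); // [ 'uploads/classify_invoice 1.pdf' ]
```

Keeping both helpers as pure functions means they can be unit tested without AWS credentials; the Lambda handlers would call them and then pass the results to the AWS SDK.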