The AI Service is the intelligence layer of the SMARTDRIVE platform. It analyzes uploaded files, generates metadata, and produces AI-powered insights. It integrates with Google Gemini AI for summarization and tagging, AWS Rekognition for image analysis, and AWS Textract for document text extraction.
┌─────────────────┐    ┌──────────────┐    ┌─────────────┐
│  File Storage   │───▶│  AI Service  │───▶│   Google    │
│    Service      │    │              │    │  Gemini AI  │
└─────────────────┘    └──────────────┘    └─────────────┘
                              │
                              ▼
                       ┌─────────────┐
                       │     AWS     │
                       │ Rekognition │
                       └─────────────┘
                              │
                              ▼
                       ┌─────────────┐
                       │     AWS     │
                       │  Textract   │
                       └─────────────┘
- Object Detection using AWS Rekognition
- Face Recognition and analysis
- Text Extraction from images (OCR)
- Image Labeling and classification
- Content Moderation and safety checks
- Text Extraction using AWS Textract
- Document Structure analysis
- Form Data extraction
- Table Recognition and parsing
- Handwriting Recognition
- Content Summarization using Gemini AI
- Keyword Extraction and tagging
- Entity Recognition and classification
- Sentiment Analysis
- Topic Modeling
- Comprehensive Tags and categories
- Content Descriptions and summaries
- Search Optimization metadata
- Accessibility information
- Content Quality scoring
- Framework: Spring Boot 3.2
- Language: Java 17
- AI Services: Google Gemini AI, AWS Rekognition, AWS Textract
- Storage: AWS S3
- Message Queue: AWS SQS
- Monitoring: OpenTelemetry, Micrometer
- Documentation: OpenAPI 3.0 (Swagger)
- Java 17+
- Docker & Docker Compose
- AWS Account with Rekognition and Textract access
- Google Cloud Account with Gemini AI API access
# AWS Configuration
AWS_ACCESS_KEY_ID=your_access_key
AWS_SECRET_ACCESS_KEY=your_secret_key
AWS_REGION=us-east-1
# Google Gemini AI
GEMINI_API_KEY=your_gemini_api_key
GEMINI_BASE_URL=https://generativelanguage.googleapis.com/v1
# AWS Services
S3_BUCKET_NAME=smartdrive-uploads
# Service Configuration
SERVER_PORT=8082
# Clone the repository
git clone <repository-url>
cd ai-service
# Build the project
./gradlew build
# Run with Docker Compose
docker compose up -d
# Or run locally
./gradlew bootRun
# Build Docker image
docker build -t smartdrive-ai-service .
# Run container
docker run -p 8082:8082 \
-e AWS_ACCESS_KEY_ID=your_key \
-e AWS_SECRET_ACCESS_KEY=your_secret \
-e GEMINI_API_KEY=your_gemini_key \
smartdrive-ai-service
POST /api/v1/metadata/generate
Content-Type: application/json
{
  "contentId": "uuid",
  "s3Key": "uuid-document.pdf",
  "fileName": "document.pdf",
  "contentType": "application/pdf",
  "size": 1024000
}
# Response
{
  "contentId": "uuid",
  "tags": ["document", "pdf", "business"],
  "summary": "Business document containing quarterly reports...",
  "imageLabels": ["text", "document", "paper"],
  "imageObjects": ["person", "chart", "graph"],
  "imageText": ["Q1 2024", "Revenue", "$1.2M"],
  "extractedText": "Quarterly Report Q1 2024...",
  "processingStatus": "COMPLETED"
}
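For callers on the JVM, the request above can be assembled with the JDK's built-in `java.net.http` API. This is a minimal sketch, not the service's own client: the class name `MetadataRequestExample` and the base URL argument are hypothetical, and the field values are assumed not to need JSON escaping.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

/** Hypothetical client helper; not part of the service's published code. */
final class MetadataRequestExample {

    /** Builds the JSON body from the example request. Values are assumed to need no escaping. */
    static String buildBody(String contentId, String s3Key, String fileName,
                            String contentType, long size) {
        return String.format(Locale.ROOT,
            "{\"contentId\":\"%s\",\"s3Key\":\"%s\",\"fileName\":\"%s\","
                + "\"contentType\":\"%s\",\"size\":%d}",
            contentId, s3Key, fileName, contentType, size);
    }

    /** Builds (but does not send) a POST to /api/v1/metadata/generate. */
    static HttpRequest buildRequest(String baseUrl, String jsonBody) {
        return HttpRequest.newBuilder()
            .uri(URI.create(baseUrl + "/api/v1/metadata/generate"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(jsonBody, StandardCharsets.UTF_8))
            .build();
    }
}
```

To send it, pass the result to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` against a running instance, for example `http://localhost:8082`.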
POST /api/v1/images/analyze
Content-Type: multipart/form-data
# Response
{
  "labels": ["person", "office", "business"],
  "objects": ["person", "desk", "computer"],
  "faces": ["adult", "male", "smiling"],
  "text": ["Meeting", "Schedule", "9:00 AM"]
}
POST /api/v1/documents/extract
Content-Type: multipart/form-data
# Response
{
  "text": "Extracted text content...",
  "tables": [{"headers": ["Name", "Value"], "rows": [...]}],
  "forms": [{"key": "Date", "value": "2024-01-01"}]
}
server:
  port: 8082
spring:
  application:
    name: ai-service
app:
  s3:
    bucket-name: ${S3_BUCKET_NAME:smartdrive-uploads}
  gemini:
    api-key: ${GEMINI_API_KEY}
    base-url: ${GEMINI_BASE_URL:https://generativelanguage.googleapis.com/v1}
  ai:
    max-image-size: 5MB
    supported-image-types:
      - image/jpeg
      - image/png
      - image/gif
    supported-document-types:
      - application/pdf
      - image/tiff
      - image/png
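The size and type limits above lend themselves to a pre-check before any AI call is made. The following is a rough sketch of such a check, not the service's actual validation code: `UploadValidator` is a hypothetical name, `5MB` is assumed to mean 5 * 1024 * 1024 bytes, and the size limit is assumed to apply to images only.

```java
import java.util.Locale;
import java.util.Set;

/** Hypothetical pre-check mirroring the app.ai settings; not the service's own code. */
final class UploadValidator {
    // Assumption: "5MB" is read as binary megabytes (5 * 1024 * 1024 bytes).
    static final long MAX_IMAGE_BYTES = 5L * 1024 * 1024;

    static final Set<String> IMAGE_TYPES =
        Set.of("image/jpeg", "image/png", "image/gif");
    static final Set<String> DOCUMENT_TYPES =
        Set.of("application/pdf", "image/tiff", "image/png");

    /** True when the type is a supported image type and the size is within the limit. */
    static boolean isAcceptedImage(String contentType, long sizeBytes) {
        return contentType != null
            && IMAGE_TYPES.contains(contentType.toLowerCase(Locale.ROOT))
            && sizeBytes > 0
            && sizeBytes <= MAX_IMAGE_BYTES;
    }

    /** True when the type is a supported document type. */
    static boolean isAcceptedDocument(String contentType) {
        return contentType != null
            && DOCUMENT_TYPES.contains(contentType.toLowerCase(Locale.ROOT));
    }
}
```

Note that `image/png` appears in both lists, so a PNG upload passes both checks.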
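import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import software.amazon.awssdk.services.rekognition.RekognitionClient;
import software.amazon.awssdk.services.textract.TextractClient;

@Configuration
public class AiConfig {
    // With no explicit settings, both builders fall back to the AWS SDK's
    // default region and credentials provider chains (e.g. AWS_REGION, AWS_ACCESS_KEY_ID).
    @Bean
    public RekognitionClient rekognitionClient() {
        return RekognitionClient.builder().build();
    }

    @Bean
    public TextractClient textractClient() {
        return TextractClient.builder().build();
    }
}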
GET /actuator/health
GET /actuator/health/aws-rekognition
GET /actuator/health/aws-textract
GET /actuator/health/gemini-ai
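A deployment script or smoke test might poll these endpoints and check the reported status. The sketch below assumes the service returns the standard Spring Boot Actuator body (for example `{"status":"UP"}`) unmodified; `HealthProbe` is a hypothetical helper, and the regular expression is a shortcut that reads the first `status` field rather than parsing the JSON fully.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for checking the health endpoints listed in this README. */
final class HealthProbe {
    static final List<String> PATHS = List.of(
        "/actuator/health",
        "/actuator/health/aws-rekognition",
        "/actuator/health/aws-textract",
        "/actuator/health/gemini-ai");

    // Matches the first "status" field; assumed to be the top-level one.
    private static final Pattern STATUS =
        Pattern.compile("\"status\"\\s*:\\s*\"([A-Z_]+)\"");

    /** Resolves every health path against the given base URL. */
    static List<URI> endpoints(String baseUrl) {
        List<URI> uris = new ArrayList<>();
        for (String path : PATHS) {
            uris.add(URI.create(baseUrl + path));
        }
        return uris;
    }

    /** True when the response body reports status UP. */
    static boolean isUp(String responseBody) {
        Matcher m = STATUS.matcher(responseBody);
        return m.find() && "UP".equals(m.group(1));
    }
}
```

Each URI can be fetched with `java.net.http.HttpClient` and the response body passed to `HealthProbe.isUp`.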
- Processing Success Rate: smartdrive.ai.processing.success.rate
- Processing Time: smartdrive.ai.processing.time
- AI Service Latency: smartdrive.ai.service.latency
- Error Rate: smartdrive.ai.error.rate
logging:
  level:
    com.smartdrive.aiservice: INFO
    software.amazon.awssdk: INFO
  pattern:
    console: "%d{HH:mm:ss.SSS} [%X{traceId:-}] %-5level %logger{20} - %msg%n"
- Rate Limiting for AI API calls
- Input Validation and sanitization
- Content Filtering and moderation
- Privacy Protection for sensitive content
- Secure API Keys management
- Content Encryption in transit
- Audit Logging for all AI operations
- GDPR Compliance for data processing
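This README does not say how rate limiting for AI API calls is implemented. One common approach is a token bucket, sketched below purely as an illustration: `TokenBucket`, its parameters, and the injected clock are all hypothetical and are not taken from the service's code.

```java
import java.util.function.LongSupplier;

/** Illustrative token bucket; an assumption, not the service's actual limiter. */
final class TokenBucket {
    private final long capacity;
    private final double refillPerSecond;
    private final LongSupplier nanoClock;
    private double tokens;
    private long lastRefillNanos;

    /** The clock is injected so behavior can be tested; use System::nanoTime in practice. */
    TokenBucket(long capacity, double refillPerSecond, LongSupplier nanoClock) {
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
        this.nanoClock = nanoClock;
        this.tokens = capacity;              // start with a full bucket
        this.lastRefillNanos = nanoClock.getAsLong();
    }

    /** Takes one token if available; returns false when the caller should back off. */
    synchronized boolean tryAcquire() {
        long now = nanoClock.getAsLong();
        double refill = (now - lastRefillNanos) * refillPerSecond / 1_000_000_000.0;
        tokens = Math.min(capacity, tokens + refill);
        lastRefillNanos = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

A caller would construct one bucket per upstream provider, for example `new TokenBucket(10, 5.0, System::nanoTime)`, and skip or queue the AI call whenever `tryAcquire()` returns false. The capacity and refill rate shown are placeholders.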
./gradlew test
./gradlew integrationTest
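# Test image analysis (the multipart field name "file" is assumed; replace the path with a local image)
curl -X POST http://localhost:8082/api/v1/images/analyze \
  -F "file=@/path/to/image.jpg"
# Test document extraction (the multipart field name "file" is assumed; replace the path with a local document)
curl -X POST http://localhost:8082/api/v1/documents/extract \
  -F "file=@/path/to/document.pdf"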
version: '3.8'
services:
  ai-service:
    build: .
    ports:
      - "8082:8082"
    environment:
      - AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
      - AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}
      - GEMINI_API_KEY=${GEMINI_API_KEY}
    depends_on:
      - elasticsearch
      - redis
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ai-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: ai-service
  template:
    metadata:
      labels:
        app: ai-service
    spec:
      containers:
        - name: ai-service
          image: smartdrive/ai-service:latest
          ports:
            - containerPort: 8082
          env:
            - name: GEMINI_API_KEY
              valueFrom:
                secretKeyRef:
                  name: ai-credentials
                  key: gemini-api-key
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Documentation: SMARTDRIVE Docs
- Issues: GitHub Issues
- Discussions: GitHub Discussions