Enterprise-grade ETL system for synchronizing patron records from various library management systems into FOLIO for the MOBIUS Consortium
The Patron Data to FOLIO Import system is a robust, enterprise-grade ETL (Extract, Transform, Load) solution that automates the synchronization of patron records from various library management systems into FOLIO. Designed specifically for the MOBIUS consortium, this system manages patron data imports for over 30 academic institutions across Missouri.
🔄 Two-Phase Processing: Separate staging and import phases for data validation and error recovery
🧩 Plugin Architecture: Modular parser system supporting multiple data formats and institutions
📊 Data Integrity: Advanced duplicate detection, fingerprinting, and validation
📈 Scalable Design: Handles high-volume imports with parallel processing capabilities
📧 Comprehensive Reporting: Detailed import reports with HTML and CSV exports
🔒 Security: Secure FOLIO API integration with proper authentication
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ Data Sources │────│ Parser System │────│ PostgreSQL DB │
│ │ │ │ │ │
│ • Sierra ILS │ │ • SierraParser │ │ • stage_patron │
│ • CSV Files │ │ • CovenantParser│ │ • patron │
│ • Custom Format │ │ • TRCParser │ │ • address │
│ │ │ • MWParser │ │ │
│ │ │ • StateTechPars │ │ │
└─────────────────┘ └─────────────────┘ └─────────────────┘
│
▼
┌─────────────────┐
│ FOLIO Service │
│ │
│ • REST API │
│ • Authentication│
│ • Error Handle │
└─────────────────┘
│
▼
┌─────────────────┐
│ FOLIO System │
│ │
│ • User Import │
│ • mod-users │
│ • Custom Fields │
└─────────────────┘
The system follows a five-component architecture:
- Data Sources: Sierra ILS, CSV Files, and Custom Formats from various institutions
- Parser System: Plugin-based parsers (SierraParser, CovenantParser, TRCParser, MWParser, StateTechParser) that transform raw data into standardized format
- PostgreSQL Database: Staging and production tables (stage_patron, patron, address) for data validation and storage
- FOLIO Service: REST API integration layer with authentication and error handling
- FOLIO System: Target system with User Import, mod-users, and Custom Fields functionality
- Perl 5.x with required modules
- PostgreSQL 12+ with patron_import schema
- FOLIO System with mod-user-import API access
- File System Access to dropbox directories
git clone https://github.com/mcoia/patron-data-to-folio-import.git
cd patron-data-to-folio-import# Install required Perl modules
cpan install DBI DBD::Pg JSON LWP::UserAgent Data::Dumper
cpan install Config::Tiny Email::MIME MIME::Base64# Create PostgreSQL database and user
createdb foliopatronimport
createuser foliopatronimport
# Import schema
psql -d foliopatronimport -f resources/sql/migrate/000-initial-schema.sql
# Grant permissions
psql -d foliopatronimport -c "GRANT ALL PRIVILEGES ON SCHEMA patron_import TO foliopatronimport;"Copy and customize the configuration file:
cp example.patron-import.conf patron-import.confEdit patron-import.conf with your settings:
# Database Configuration
dbhost = localhost
db = foliopatronimport
dbuser = foliopatronimport
dbpass = your_password
port = 5432
schema = patron_import
# File Paths
dropBoxPath = /mnt/dropbox
projectPath = /path/to/patron-data-to-folio-import
# FOLIO API Settings
folioUrl = https://your-folio-instance.org
folioTenant = your_tenant
folioUser = import_user
folioPassword = import_passwordThe system provides three main operational modes:
# Verify installation and dependencies
./patron-import.pl --test# Parse patron files and store in staging tables
./patron-import.pl --stage# Import staged patrons to FOLIO system
./patron-import.pl --import# Run both staging and import phases
./patron-import.pl# Use custom configuration file
./patron-import.pl --config /path/to/custom.conf
# Get help information
./patron-import.pl --help
# Initialize database schema
./patron-import.pl --initDB- Extend the Interface:
package Parsers::YourInstitutionParser;
use strict;
use warnings;
use parent 'Parsers::ParserInterface';
sub onInit {
my $self = shift;
# Initialize parser settings
}
sub beforeParse {
my $self = shift;
# Pre-processing logic
}
sub parse {
my ($self, $line) = @_;
# Main parsing logic
return $patronData;
}
sub afterParse {
my $self = shift;
# Post-processing logic
}
sub finish {
my $self = shift;
# Cleanup and finalization
}- Register the Parser:
# In lib/ParserManager.pm
$self->{parsers} = {
'sierra' => Parsers::SierraParser->new(),
'covenant' => Parsers::CovenantParser->new(),
'your_institution' => Parsers::YourInstitutionParser->new(),
};- Add Institution Configuration:
# In resources/mapping/MOBIUS Patron Loads - Patron Loads.csv
Institution,Parser,File Pattern,Active
Your Institution,your_institution,patron_*.txt,TRUE| Method | Purpose | Required |
|---|---|---|
onInit() |
Initialize parser state | ✓ |
beforeParse() |
Pre-processing setup | ✓ |
parse($line) |
Parse individual record | ✓ |
afterParse() |
Post-processing cleanup | ✓ |
finish() |
Finalization tasks | ✓ |
The system provides a REST API for querying patron data:
# Start API server
perl api.pl
# Query endpoints
GET /api/patrons # List all patrons
GET /api/patrons/:id # Get specific patron
GET /api/institutions # List institutions
GET /api/reports # Get import reportsKey tables in the patron_import schema:
stage_patron- Raw parsed patron datapatron- Processed patron recordsaddress- Normalized address informationinstitution- Institution configurationsjob- Import job trackingimport_response- FOLIO API responses
Logs are written to patron_import{timestamp}.log with detailed information about:
- Processing statistics
- Error conditions
- API responses
- Performance metrics
Automated HTML reports are sent after each import containing:
- Processing summary
- Success/failure counts
- Failed record details
- Performance statistics
resources/reports/report.html- Detailed HTML reportresources/reports/failed_patrons.csv- CSV of failed imports
Database Connection Errors:
# Check PostgreSQL service
systemctl status postgresql
# Verify database credentials
psql -h localhost -U foliopatronimport -d foliopatronimportParser Registration Issues:
# Verify parser is registered in ParserManager
grep -r "YourParser" lib/ParserManager.pmFOLIO API Errors:
# Test FOLIO connectivity
curl -X POST "https://your-folio/authn/login" \
-H "Content-Type: application/json" \
-d '{"username":"user","password":"pass"}'Enable verbose logging:
# Edit patron-import.conf
print2Console = true
logLevel = DEBUG- CPU: 4+ cores recommended
- RAM: 8GB+ for large imports
- Storage: 100GB+ for logs and temporary files
- Network: Stable connection to FOLIO API
# In patron-import.conf
maxProcesses = 8 # Parallel processing threads
db_pool_size = 20 # Database connection pool
maxPatronFileAge = 90 # File retention daysSet up monitoring for:
- Import success rates
- Processing times
- Database performance
- FOLIO API response times
- Fork the repository
- Create a feature branch
- Implement your changes
- Add tests for new functionality
- Submit a pull request
- Follow Perl Best Practices
- Document all public methods
- Include unit tests
- Update configuration examples
For technical support and questions:
- Email: [email protected]
- Issues: Report bugs through your internal issue tracking system
This project is proprietary software developed for the MOBIUS Consortium.
MOBIUS Consortium
Building the Future of Library Services