Skip to content

mcoia/patron-data-to-folio-import

Repository files navigation

Patron Data to FOLIO Import

Enterprise-grade ETL system for synchronizing patron records from various library management systems into FOLIO for the MOBIUS Consortium
Perl PostgreSQL FOLIO REST API

Overview

The Patron Data to FOLIO Import system is a robust, enterprise-grade ETL (Extract, Transform, Load) solution that automates the synchronization of patron records from various library management systems into FOLIO. Designed specifically for the MOBIUS consortium, this system manages patron data imports for over 30 academic institutions across Missouri.

Key Features

🔄 Two-Phase Processing: Separate staging and import phases for data validation and error recovery
🧩 Plugin Architecture: Modular parser system supporting multiple data formats and institutions
📊 Data Integrity: Advanced duplicate detection, fingerprinting, and validation
📈 Scalable Design: Handles high-volume imports with parallel processing capabilities
📧 Comprehensive Reporting: Detailed import reports with HTML and CSV exports
🔒 Security: Secure FOLIO API integration with proper authentication

Architecture

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Data Sources  │────│  Parser System  │────│  PostgreSQL DB  │
│                 │    │                 │    │                 │
│ • Sierra ILS    │    │ • SierraParser  │    │ • stage_patron  │
│ • CSV Files     │    │ • CovenantParser│    │ • patron        │
│ • Custom Format │    │ • TRCParser     │    │ • address       │
│                 │    │ • MWParser      │    │                 │
│                 │    │ • StateTechPars │    │                 │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                                        │
                                                        ▼
                                               ┌─────────────────┐
                                               │  FOLIO Service  │
                                               │                 │
                                               │ • REST API      │
                                               │ • Authentication│
                                               │ • Error Handle  │
                                               └─────────────────┘
                                                        │
                                                        ▼
                                               ┌─────────────────┐
                                               │  FOLIO System   │
                                               │                 │
                                               │ • User Import   │
                                               │ • mod-users     │
                                               │ • Custom Fields │
                                               └─────────────────┘

The system follows a five-component architecture:

  1. Data Sources: Sierra ILS, CSV Files, and Custom Formats from various institutions
  2. Parser System: Plugin-based parsers (SierraParser, CovenantParser, TRCParser, MWParser, StateTechParser) that transform raw data into standardized format
  3. PostgreSQL Database: Staging and production tables (stage_patron, patron, address) for data validation and storage
  4. FOLIO Service: REST API integration layer with authentication and error handling
  5. FOLIO System: Target system with User Import, mod-users, and Custom Fields functionality

Prerequisites

  • Perl 5.x with required modules
  • PostgreSQL 12+ with patron_import schema
  • FOLIO System with mod-user-import API access
  • File System Access to dropbox directories

Installation

1. Clone the Repository

git clone https://github.com/mcoia/patron-data-to-folio-import.git
cd patron-data-to-folio-import

2. Install Perl Dependencies

# Install required Perl modules
cpan install DBI DBD::Pg JSON LWP::UserAgent Data::Dumper
cpan install Config::Tiny Email::MIME MIME::Base64

3. Database Setup

# Create PostgreSQL database and user
createdb foliopatronimport
createuser foliopatronimport

# Import schema
psql -d foliopatronimport -f resources/sql/migrate/000-initial-schema.sql

# Grant permissions
psql -d foliopatronimport -c "GRANT ALL PRIVILEGES ON SCHEMA patron_import TO foliopatronimport;"

4. Configuration

Copy and customize the configuration file:

cp example.patron-import.conf patron-import.conf

Edit patron-import.conf with your settings:

# Database Configuration
dbhost = localhost
db = foliopatronimport
dbuser = foliopatronimport
dbpass = your_password
port = 5432
schema = patron_import

# File Paths
dropBoxPath = /mnt/dropbox
projectPath = /path/to/patron-data-to-folio-import

# FOLIO API Settings
folioUrl = https://your-folio-instance.org
folioTenant = your_tenant
folioUser = import_user
folioPassword = import_password

Usage

Command Line Interface

The system provides three main operational modes:

1. System Test

# Verify installation and dependencies
./patron-import.pl --test

2. Stage Data

# Parse patron files and store in staging tables
./patron-import.pl --stage

3. Import to FOLIO

# Import staged patrons to FOLIO system
./patron-import.pl --import

4. Complete Pipeline

# Run both staging and import phases
./patron-import.pl

Advanced Options

# Use custom configuration file
./patron-import.pl --config /path/to/custom.conf

# Get help information
./patron-import.pl --help

# Initialize database schema
./patron-import.pl --initDB

Parser Development

Creating a New Parser

  1. Extend the Interface:
package Parsers::YourInstitutionParser;
use strict;
use warnings;
use parent 'Parsers::ParserInterface';

sub onInit {
    my $self = shift;
    # Initialize parser settings
}

sub beforeParse {
    my $self = shift;
    # Pre-processing logic
}

sub parse {
    my ($self, $line) = @_;
    # Main parsing logic
    return $patronData;
}

sub afterParse {
    my $self = shift;
    # Post-processing logic
}

sub finish {
    my $self = shift;
    # Cleanup and finalization
}
  1. Register the Parser:
# In lib/ParserManager.pm
$self->{parsers} = {
    'sierra' => Parsers::SierraParser->new(),
    'covenant' => Parsers::CovenantParser->new(),
    'your_institution' => Parsers::YourInstitutionParser->new(),
};
  1. Add Institution Configuration:
# In resources/mapping/MOBIUS Patron Loads - Patron Loads.csv
Institution,Parser,File Pattern,Active
Your Institution,your_institution,patron_*.txt,TRUE

Available Parser Methods

Method Purpose Required
onInit() Initialize parser state
beforeParse() Pre-processing setup
parse($line) Parse individual record
afterParse() Post-processing cleanup
finish() Finalization tasks

API Reference

REST Endpoints

The system provides a REST API for querying patron data:

# Start API server
perl api.pl

# Query endpoints
GET /api/patrons          # List all patrons
GET /api/patrons/:id      # Get specific patron
GET /api/institutions     # List institutions
GET /api/reports          # Get import reports

Database Schema

Key tables in the patron_import schema:

  • stage_patron - Raw parsed patron data
  • patron - Processed patron records
  • address - Normalized address information
  • institution - Institution configurations
  • job - Import job tracking
  • import_response - FOLIO API responses

Monitoring and Reporting

Log Files

Logs are written to patron_import{timestamp}.log with detailed information about:

  • Processing statistics
  • Error conditions
  • API responses
  • Performance metrics

Email Reports

Automated HTML reports are sent after each import containing:

  • Processing summary
  • Success/failure counts
  • Failed record details
  • Performance statistics

Report Files

  • resources/reports/report.html - Detailed HTML report
  • resources/reports/failed_patrons.csv - CSV of failed imports

Troubleshooting

Common Issues

Database Connection Errors:

# Check PostgreSQL service
systemctl status postgresql

# Verify database credentials
psql -h localhost -U foliopatronimport -d foliopatronimport

Parser Registration Issues:

# Verify parser is registered in ParserManager
grep -r "YourParser" lib/ParserManager.pm

FOLIO API Errors:

# Test FOLIO connectivity
curl -X POST "https://your-folio/authn/login" \
  -H "Content-Type: application/json" \
  -d '{"username":"user","password":"pass"}'

Debug Mode

Enable verbose logging:

# Edit patron-import.conf
print2Console = true
logLevel = DEBUG

Production Deployment

System Requirements

  • CPU: 4+ cores recommended
  • RAM: 8GB+ for large imports
  • Storage: 100GB+ for logs and temporary files
  • Network: Stable connection to FOLIO API

Performance Tuning

# In patron-import.conf
maxProcesses = 8          # Parallel processing threads
db_pool_size = 20         # Database connection pool
maxPatronFileAge = 90     # File retention days

Monitoring

Set up monitoring for:

  • Import success rates
  • Processing times
  • Database performance
  • FOLIO API response times

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Implement your changes
  4. Add tests for new functionality
  5. Submit a pull request

Code Standards

  • Follow Perl Best Practices
  • Document all public methods
  • Include unit tests
  • Update configuration examples

Support

For technical support and questions:

  • Email: [email protected]
  • Issues: Report bugs through your internal issue tracking system

License

This project is proprietary software developed for the MOBIUS Consortium.


MOBIUS Consortium
Building the Future of Library Services

About

Enterprise-grade ETL system for synchronizing patron records from various library management systems into FOLIO for the MOBIUS Consortium

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors