pg_reloaded
pg_reloaded is a program for restoring databases on a schedule. You can use it to refresh databases for online demos, development databases, and anywhere else you may want to reset the data after use. You schedule your databases to be restored from a backup.
Currently, you have to build it from source; binary releases will be made available soon.
Get the usage information by running pg_reloaded --help
$ pg_reloaded check --config="pg_reloaded.yml"
# Run with the default configuration
$ pg_reloaded start
# Or with a path to a configuration file
$ pg_reloaded start --config="development.yml" --log-file="./path/to/log"
If you would like to restore a database immediately, run the following:
$ pg_reloaded run "database"
To be effective, pg_reloaded needs to run in the background as a daemon.
There's a sample Systemd Unit File here.
You can also use Supervisor to run the pg_reloaded daemon. Below is an example configuration for Supervisor:
[program:pg_reloaded]
command=/usr/bin/pg_reloaded start --config=/etc/pg_reloaded/pg_reloaded.yml
Please note that these process management systems must be configured to start pg_reloaded on boot to make the most of its scheduling capabilities.
pg_reloaded is configured via a YAML configuration file that records how and when to restore your databases.
By default, pg_reloaded reads its configuration from a file named pg_reloaded.yml in the home directory, if present, i.e. $HOME/pg_reloaded.yml (on Windows, %UserProfile%\pg_reloaded.yml).
You can specify a path to the configuration file via the --config option on the command line.
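The lookup order described above can be sketched as a small shell helper. This is only an illustration for Unix-like systems: `resolve_config` is a hypothetical function, not part of pg_reloaded, and it assumes that an explicit path always takes precedence over the file in the home directory.

```shell
# Hypothetical helper (not part of pg_reloaded): print the configuration
# file that would apply, following the lookup order described above.
resolve_config() {
    explicit="$1"   # the value you would pass to --config; may be empty
    if [ -n "$explicit" ]; then
        # An explicit path is assumed to win over the default location
        printf '%s\n' "$explicit"
    elif [ -f "$HOME/pg_reloaded.yml" ]; then
        # Default location in the home directory
        printf '%s\n' "$HOME/pg_reloaded.yml"
    else
        echo "no configuration file found" >&2
        return 1
    fi
}
```

For example, `pg_reloaded check --config="$(resolve_config "")"` would check the file in your home directory when it exists.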
A configuration file looks like the following:
# Absolute path to the directory containing postgresql client programs
# The following client programs are searched for specifically:
# psql, pg_restore, pg_dump
psql_path: "/path/to/psql-dir"
# Absolute path to the logfile, will be created if it does not exist
log_file: "/path/to/logfile"
servers:
  # name - A name to identify the server in the "databases" section of the configuration
  - name: "my-development-server"
    # host - The host for the database
    host: "localhost"
    # port - The port for the database
    port: 5432
    # username - The user must have CREATE DATABASE & DROP DATABASE privileges
    username: "appuser"
    # password - The password for the user role on the database
    password: "password"
databases:
  # name - The database name
  - name: "my_database_name"
    # server - The name of a server in the "servers" list
    server: "my-development-server"
    # schedule - Specifies the schedule for running the database restores in
    # daemon mode. Supports simple interval notation and CRON expressions
    schedule: "@every 24h"
    # source - Specifies where to get the schema and data to restore
    source:
      # The type of file(s) to restore the database from.
      # The following types are (will be) supported:
      #
      # * sql - load schema & data from SQL files using psql
      # * tar - load schema & data from a tar archive using pg_restore
      # * csv - load data from CSV files using pgfutter
      # * json - load data from JSON files using pgfutter
      type: "sql"
      # The absolute path to the file to restore the database from
      file: "/path/to/file"
      # The absolute path to the schema file to be used to create tables, functions etc.
      # Schema MUST be specified if the source type is one of: csv, json
      # or if the SQL file only contains data
      schema: "/path/to/schema/file.sql"
The real value of pg_reloaded is in its ability to restore databases according to a schedule.
The following syntax is supported for scheduling:
- Intervals: Simple interval notation is supported, but only for seconds (s), minutes (m) and hours (h), e.g. @every 10m, @every 2h, @weekly, @monthly
- CRON expressions: Most valid CRON expressions should be accepted
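To make the interval notation concrete, here is a minimal sketch that converts an `@every` value into seconds. `interval_to_seconds` is a hypothetical helper written for illustration; it is not how pg_reloaded parses schedules, and it only handles a single whole number followed by s, m or h.

```shell
# Hypothetical helper: convert an "@every <n><unit>" value into seconds.
# Illustration only; handles one whole number followed by s, m or h.
interval_to_seconds() {
    spec="${1#@every }"               # strip the "@every " prefix
    [ "$spec" != "$1" ] || return 1   # prefix was missing, e.g. "@weekly"
    unit="${spec#"${spec%?}"}"        # last character: the unit
    count="${spec%?}"                 # everything before it: the number
    case "$count" in
        ''|*[!0-9]*) return 1 ;;      # reject empty or non-numeric counts
    esac
    case "$unit" in
        s) echo "$count" ;;
        m) echo $((count * 60)) ;;
        h) echo $((count * 3600)) ;;
        *) return 1 ;;                # days, weeks etc. are not intervals here
    esac
}
```

With this sketch, `interval_to_seconds "@every 10m"` prints 600 and `interval_to_seconds "@every 24h"` prints 86400, while values such as `@weekly` or a CRON expression are rejected because they are not interval notation.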
- The PostgreSQL client programs must be present on your PATH, or configured in the config file or on the command line, for pg_reloaded to work. In particular, the program may need to execute psql, pg_restore or pg_dump during its operation.
In the YAML file:
psql_path: "/path/to/psql-dir/"
On the command line:
$ pg_reloaded --psql-path="/path/to/psql-dir"
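Before starting the daemon, it can help to confirm that the client programs are actually reachable. The following is a rough pre-flight sketch for Unix-like systems; `check_pg_clients` is a hypothetical helper, not something pg_reloaded ships, and it assumes the directory you pass is the same one you would set as psql_path.

```shell
# Hypothetical pre-flight check: verify that psql, pg_restore and pg_dump
# can be found either in the given directory or on PATH.
check_pg_clients() {
    dir="$1"      # optional: the directory you would set as psql_path
    missing=0
    for prog in psql pg_restore pg_dump; do
        if [ -n "$dir" ]; then
            # Look for an executable file inside the given directory
            [ -x "$dir/$prog" ] && continue
        else
            # Fall back to searching PATH
            command -v "$prog" >/dev/null 2>&1 && continue
        fi
        echo "missing: $prog" >&2
        missing=1
    done
    return "$missing"
}
```

Calling `check_pg_clients "/path/to/psql-dir"` returns a non-zero status and names each missing program on stderr; calling it with an empty argument searches PATH instead.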
Currently, the only supported sources are:
- SQL via dumped SQL file - default source, load dumps/data files from the filesystem
You can build Docker images/containers using the Dockerfile. At this time, you can pull images from @gkawamoto's Docker Hub:
$ docker pull gkawamoto/pg_reloaded
I encourage you to build pg_reloaded from source, if only to get you to try out Go ;). The first step is to install Go.
The next step is to clone the repo; after that, building the binary should be as simple as running go build:
$ git clone https://github.com/nndi-oss/pg_reloaded.git
$ cd pg_reloaded
$ go build -o dist/pg_reloaded
This project would not be possible without great open-source tools and their authors and contributors, whose shoulders are steady enough to stand on.
- Running as a Service on Windows
pg_reloaded has to run in the background for the scheduling functionality to be of any value, so a Service wrapper would be ideal on Windows. Until one exists, you will have to run it in the foreground.
- BE CAREFUL, IT MAY EAT YOUR LUNCH!
This is not meant to be run on production databases which house critical data that you can't afford to lose. It's meant for demo and development databases that can be dropped and restored without losing a dime. Use good judgment.
Some rough ideas on how to take this thing further:
- Add preload and post-load conditions for databases
    preload:
      dump: true
      checkActivity: true
    postload:
      notify: true
- Back up the current database before restoring, using pg_dump or pgclimb
- A Windows Service wrapper
- Add support for the following sources:
  - csv: load data from CSV files, just like you would with pgfutter
  - json: load data from JSON files, just like you would with pgfutter
  - postgres: load a database from a remote PostgreSQL database
  - http: load a database dump from an HTTP server
  - s3: load a database dump from AWS S3
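Until the backup-before-restore idea from the list above is built in, something similar can be done by hand with pg_dump. The sketch below is an assumption about how you might wire that up, not a pg_reloaded feature: `backup_database` and the `PG_DUMP` override variable are hypothetical names, and the password is expected to come from the usual PostgreSQL mechanisms (for example the PGPASSWORD environment variable or a ~/.pgpass file).

```shell
# Hypothetical helper: dump a database to a timestamped file before it is
# restored. PG_DUMP may point at a specific pg_dump binary; otherwise the
# one found on PATH is used.
backup_database() {
    host="$1"; port="$2"; user="$3"; db="$4"; out_dir="$5"
    stamp="$(date +%Y%m%d%H%M%S)"
    out="$out_dir/$db-$stamp.dump"
    # --format=custom writes an archive that pg_restore can read
    "${PG_DUMP:-pg_dump}" --host="$host" --port="$port" --username="$user" \
        --format=custom --file="$out" "$db" || return 1
    # Print the backup path so callers can log or reuse it
    printf '%s\n' "$out"
}
```

A manual refresh could then look like `backup_database localhost 5432 appuser my_database_name /var/backups && pg_reloaded run "my_database_name"`, where the backup directory is just an example path.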
Issues and Pull Requests welcome.
MIT License
Copyright (c) 2019 - 2020, NNDI