This is an ETL process for extracting and publishing data from the City of Philadelphia 311 system.
- git clone this repo
- Create a virtualenv, activate it, and run pip install -r requirements.txt
- Rename sample_config.py to config.py and enter actual values (or download from Lastpass).
- Create a batch file to activate the virtualenv and run python sync.py. Schedule this to run regularly.
seed.py truncates the cases table and reloads it from a CSV dump. The basic usage is:
python seed.py <file>
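The truncate-and-reload pattern can be sketched roughly as follows. This is not the code in seed.py: it uses an in-memory-capable SQLite connection as a stand-in for the real database, and the function name reseed and its parameters are hypothetical. It assumes the CSV dump has a header row whose names match the table's columns.

```python
import csv
import sqlite3


def reseed(conn, table, csv_file):
    """Empty `table`, then reload it from an open CSV file with a header row.

    Assumption: `table` and the header names come from a trusted source,
    since they are formatted directly into the SQL text.
    """
    reader = csv.reader(csv_file)
    header = next(reader)
    rows = list(reader)
    columns = ", ".join('"{}"'.format(name) for name in header)
    placeholders = ", ".join("?" for _ in header)
    insert_sql = 'INSERT INTO "{}" ({}) VALUES ({})'.format(
        table, columns, placeholders
    )
    # One transaction: if the load fails, the delete is rolled back too.
    with conn:
        # SQLite has no TRUNCATE statement; DELETE without WHERE empties the table.
        conn.execute('DELETE FROM "{}"'.format(table))
        conn.executemany(insert_sql, rows)
    return len(rows)
```

With a file on disk, this might be called as reseed(conn, "cases", f) inside a with open(path, newline="") as f: block. Wrapping the delete and the inserts in one transaction means a bad dump does not leave the table empty.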
sync.py checks the database table for the most recent updated_datetime and fetches all records from Salesforce that have been updated since then. For a description of the command-line arguments, run python sync.py --help.
The basic usage is:
python sync.py
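The incremental logic described above (find the latest updated_datetime already stored, then take only newer records) can be illustrated with a small sketch. It is an assumption-laden stand-in, not the code in sync.py: SQLite replaces the real database, the table is assumed to be named cases, timestamps are assumed to be ISO 8601 strings (which sort correctly as text), and source_records is a hypothetical in-memory list standing in for the Salesforce query results.

```python
import sqlite3


def last_synced(conn):
    """Return the most recent updated_datetime in the cases table, or None if empty."""
    row = conn.execute("SELECT MAX(updated_datetime) FROM cases").fetchone()
    return row[0]


def records_to_sync(conn, source_records):
    """Keep only the source records updated after the table's latest timestamp.

    Assumption: each record is a dict with an ISO 8601 "updated_datetime"
    string, so plain string comparison matches chronological order.
    """
    since = last_synced(conn)
    if since is None:
        # Empty table: everything in the source is new.
        return list(source_records)
    return [r for r in source_records if r["updated_datetime"] > since]
```

In a real sync the "updated since" filter would presumably be part of the query sent to Salesforce rather than applied after fetching; the in-memory filter here only shows the comparison.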
If the Salesforce query times out, you may have to chunk the updates into individual days. To sync just a single day, use the -d option:
python sync.py -d 2016-05-18
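To catch up over a longer gap one day at a time, the single-day command can be driven from a small wrapper. The sketch below is a hypothetical helper, not part of this repo; it relies only on the -d option shown above and assumes it is run from the directory containing sync.py, inside the activated virtualenv.

```python
import subprocess
import sys
from datetime import date, timedelta


def days_between(start, end):
    """Yield each date from start to end, inclusive."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)


def sync_by_day(start, end, run=subprocess.check_call):
    """Run `python sync.py -d YYYY-MM-DD` once per day in the range.

    check_call raises CalledProcessError on a non-zero exit code, so the
    loop stops at the first day that fails. The `run` parameter exists only
    so the command runner can be swapped out (for example, in a dry run).
    """
    for day in days_between(start, end):
        run([sys.executable, "sync.py", "-d", day.isoformat()])
```

For example, sync_by_day(date(2016, 5, 16), date(2016, 5, 18)) would run the sync for May 16, 17, and 18 in order.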