Canadian Legislative Scrapers

Usage

Follow the instructions in the Python Quick Start Guide to install Homebrew, Git, PostGIS, Python 3.3+ and virtualenv.

mkvirtualenv scrapers-ca --python=`which python3`
git clone https://github.com/opencivicdata/scrapers-ca.git
cd scrapers-ca
pip install -r requirements.txt

Initialize the database:

createdb pupa
psql pupa -c "CREATE EXTENSION postgis;"
pupa dbinit ca

Run a scraper

pupa update ca_ab_edmonton

To run only the scraping step and skip the import step, add the --scrape switch:

pupa update --scrape ca_ab_edmonton

For documentation on the pupa command:

pupa -h

For documentation on the update subcommand:

pupa update -h

Create a scraper

Find division identifiers using the Open Civic Data Division Identifier (OCD-ID) Viewer or by browsing the list of identifiers. In most cases, a municipality will have a division identifier with a type ID of csd. Then, create a scraper with:

pupa init ca_on_toronto
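To illustrate what "a type ID of csd" means, here is a minimal sketch that reads the type ID off the last segment of a division identifier and filters a list of identifiers by name and type. The two-column sample data and the type_id and find_divisions helpers are assumptions made for this example; the real country-ca.csv has more rows and may have more columns.

```python
import csv
import io

# Illustrative rows only, not the full contents of country-ca.csv.
SAMPLE = """id,name
ocd-division/country:ca,Canada
ocd-division/country:ca/province:on,Ontario
ocd-division/country:ca/csd:3520005,Toronto
"""


def type_id(division_id):
    """Return the type ID of the last segment, e.g. 'csd' for '.../csd:3520005'."""
    last_segment = division_id.rsplit('/', 1)[-1]
    return last_segment.split(':', 1)[0]


def find_divisions(csv_text, name, wanted_type='csd'):
    """Return identifiers whose name matches and whose last segment has the wanted type."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row['id']
        for row in reader
        if row['name'] == name and type_id(row['id']) == wanted_type
    ]


matches = find_divisions(SAMPLE, 'Toronto')
print(matches)
```

A municipality that has no csd identifier would return an empty list here, which is the cue to look for a different type ID in the viewer.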

Develop a scraper

Read the Pupa documentation or an existing scraper's code.

Avoid using the XPath string() function unless the expression is known to have no matches on some pages. string() returns an empty string when nothing matches, so a scraper may continue to run without error despite failing to find a match. Accompany each use of string() with a comment like "# can be empty" or "# allow string()".
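The difference can be seen in a small sketch using lxml, assuming it is installed. The HTML fragment and the first_text helper are made up for this example and are not part of this repository.

```python
from lxml import html  # third-party library

# Made-up page fragment with a name but no email address.
page = html.fromstring('<div><p class="name">Jane Doe</p></div>')

# string() returns '' when nothing matches, so the failure goes unnoticed.
silent = page.xpath('string(.//p[@class="email"])')

# A node-set expression returns a list; an empty list makes the failure visible.
matches = page.xpath('.//p[@class="email"]/text()')


def first_text(node, expression):
    """Return the first text match, or raise if the expression finds nothing."""
    results = node.xpath(expression)
    if not results:
        raise ValueError('no match for ' + expression)
    return results[0].strip()


name = first_text(page, './/p[@class="name"]/text()')
```

Raising on an empty result is one way to make a changed page layout fail loudly instead of importing blank values.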

Use the get_email and get_phone helpers as much as possible.

Maintenance

Check module names, class names, classification, division_name, name and url in __init__.py files:

invoke tidy
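As a rough illustration of the kind of attributes this check covers, the sketch below uses stand-in classes (not the repository's real base class) and reports any of the attributes named above that are missing or empty. The class names, attribute values, and the missing_attributes helper are hypothetical; invoke tidy itself may apply different or stricter rules.

```python
# Attributes named in the maintenance check above.
REQUIRED = ('classification', 'division_name', 'name', 'url')


class ExampleJurisdiction:
    """Stand-in for a jurisdiction class; the values are examples only."""
    classification = 'legislature'
    division_name = 'Toronto'
    name = 'Toronto City Council'
    url = 'http://www.toronto.ca'


class Incomplete:
    """Stand-in with two attributes left out on purpose."""
    classification = 'legislature'
    name = 'Example City Council'


def missing_attributes(cls, required=REQUIRED):
    """Return the required attribute names that are absent or empty on cls."""
    return [attr for attr in required if not getattr(cls, attr, None)]


complete = missing_attributes(ExampleJurisdiction)
problems = missing_attributes(Incomplete)
print(complete, problems)
```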

Check sources are credited:

invoke sources

Check jurisdiction URLs:

invoke urls

Check PEP 8 conformance:

flake8 .

Update the OCD-IDs:

curl -O https://raw.githubusercontent.com/opencivicdata/ocd-division-ids/master/identifiers/country-ca.csv

Scraper code rarely undergoes code review. The focus is on the quality of the data.

Bugs? Questions?

This repository is on GitHub: https://github.com/opencivicdata/scrapers-ca, where your contributions, forks, bug reports, feature requests, and feedback are greatly welcomed.

