We recommend the use of pip and virtualenv for environment and dependency management
in this and other Python projects. If you don't have them installed, we
recommend sudo easy_install pip and then sudo pip install virtualenv.
- Copy and configure any local settings: DATABASES, SECRET_KEY, SOLR_, FEDORA_, customize LOGGING, etc.
- Create a new virtualenv and activate it.
- Install fabric:
pip install fabric
- Use fabric to run a local build, which will install python dependencies in
your virtualenv, run unit tests, and build sphinx documentation:
fab build
Deployment to QA and production should be done using fab deploy.
After configuring your database, run syncdb:
python manage.py syncdb
Use eulindexer to index repository content into your configured Solr instance.
When first installing this project, you'll need to create a virtual environment
for it. The environment is just a directory. You can store it anywhere you
like; in this documentation it'll live right next to the source. For instance,
if the source is in /home/httpd/openemory/src, consider creating an
environment in /home/httpd/openemory/env. To create such an environment, su
into apache's user and:
$ virtualenv --no-site-packages /home/httpd/openemory/env
This creates a new virtual environment in that directory. Source the activation file to invoke the virtual environment (requires that you use the bash shell):
$ . /home/httpd/openemory/env/bin/activate
Once the environment has been activated inside a shell, Python programs spawned from that shell will read their environment only from this directory, not from the system-wide site packages. Installations will correspondingly be installed into this environment.
Installation instructions and upgrade notes below assume that you are already in an activated shell.
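To confirm that a shell really is running inside the activated environment before installing anything, a small check like the following can help. This is a minimal sketch, not part of OpenEmory; the function name in_virtualenv is made up for illustration.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter is running inside a virtual environment."""
    # Classic virtualenv releases set sys.real_prefix; venv and newer
    # virtualenv releases make sys.base_prefix differ from sys.prefix.
    if hasattr(sys, "real_prefix"):
        return True
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


if __name__ == "__main__":
    print("virtualenv active:", in_virtualenv())
    print("packages install under:", sys.prefix)
```

When the environment described above is active, the reported prefix should be the environment directory (for example /home/httpd/openemory/env).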
Beginning with Release 0.5 - Faculty Profiles, OpenEmory uses the Python Imaging Library (Pillow) to support faculty profile photo uploads. Pillow can be installed via pip, but support for JPEG and PNG formats depends on certain system libraries. For JPEG, libjpeg is required; for PNG, libz is required. On recent versions of Ubuntu, the libjpeg8-dev and zlib1g-dev packages should be installed (libjpeg62-dev probably works with the path adjustment noted below).
By default on Ubuntu, libz.so is not installed directly in the standard
library directory, but in an architecture-specific directory such as
/usr/lib/x86_64-linux-gnu. As a work-around, add a symlink either to /usr/lib
or to the virtualenv lib directory, e.g.:
$ sudo ln -s /usr/lib/i386-linux-gnu/libz.so /usr/lib
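To find out which architecture-specific directory holds the library on a given machine (and therefore what the symlink source should be), something like the following can be used. It is a rough sketch: the helper name locate_library is hypothetical, and it only looks directly in the library root and one directory level below it.

```python
import glob
import os


def locate_library(filename, lib_root="/usr/lib"):
    """Return paths to `filename` directly in lib_root or one subdirectory below it."""
    candidates = [os.path.join(lib_root, filename)]
    candidates += sorted(glob.glob(os.path.join(lib_root, "*", filename)))
    # Keep only paths that actually exist (broken symlinks are skipped).
    return [path for path in candidates if os.path.exists(path)]


if __name__ == "__main__":
    for name in ("libz.so", "libjpeg.so"):
        print(name, "->", locate_library(name) or "not found")
```

If the only hit is under an architecture-specific directory, that path is the one to symlink from.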
To test that the required libraries are installed correctly, pip install PIL
(or pip install --upgrade PIL if already installed).
At the end of the installation, PIL setup provides a summary of the
configuration. Check to see that JPEG and PNG are listed as available:
--- JPEG support available
--- ZLIB (PNG/ZIP) support available
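If the installation summary has already scrolled by, codec support can also be checked from Python afterwards. The sketch below assumes a reasonably recent Pillow that ships the PIL.features module; on an installation without it (including the original PIL), the function simply returns None. The name imaging_support is illustrative only.

```python
def imaging_support():
    """Return JPEG/PNG codec availability, or None if PIL.features is not importable."""
    try:
        from PIL import features
    except ImportError:
        return None
    return {
        "jpeg": bool(features.check("jpg")),
        "png (zlib)": bool(features.check("zlib")),
    }


if __name__ == "__main__":
    print(imaging_support())
```

Both values should be True before faculty photo uploads are expected to work.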
python-ldap requires the following packages: python-dev libldap2-dev libsasl2-dev libssl-dev
OpenEmory depends on several python libraries. The installation is mostly automated, and will print status messages as packages are installed. If there are any errors, pip should announce them very loudly.
To install python dependencies, cd into the repository checkout and:
$ pip install -r pip-install-req.txt
Note: installation of some dependencies (e.g. django-tracking) requires that Django settings are available. To install these manually, use this:
$ env DJANGO_SETTINGS_MODULE=openemory.settings pip install -r pip-install-after-config.txt
If you are a developer or are installing to a continuous integration server where you plan to run unit tests, code coverage reports, or build sphinx documentation, you probably will also want to:
$ pip install -r pip-dev-req.txt
After this step, your virtual environment should contain all of the needed dependencies.
OpenEmory uses Solr and :mod:`eulindexer` for searching and indexing Fedora
content. The Solr schema included with the source code at solr/schema.xml
should be used as the Solr schema configuration. For convenience, this
directory also contains a sample solrconfig.xml and minimal versions of all
other solr configuration files used by the index.
The url for accessing the configured Solr instance should be set in
Repository content accessible via OpenEmory should be indexed using
EULindexer. To add OpenEmory to an installed and configured
instance of EULindexer, add the deployed indexdata url to the
eulindexer localsettings.py, e.g.:
INDEXER_SITE_URLS = {
    'openemory': 'http://openemory.library.emory.edu/indexdata/',
}
To populate the index initially, or to reindex all content, run the
reindex script that is available in EULindexer:
$ python manage.py reindex -s openemory
After installing dependencies, copy and edit the wsgi and apache configuration files in src/apache inside the source code checkout. Both may require some tweaking for paths and other system details.
Configure application settings by copying localsettings.py.dist
and editing for local settings (database, Fedora repository, PID Manager, etc.).
After configuring all settings, initialize the db with all needed tables and initial data using:
$ python manage.py syncdb
$ python manage.py migrate
Load Fedora fixtures and control objects to the configured repository using:
$ python manage.py syncrepo
This application makes use of the :mod:`django.contrib.sites` module
to generate ARKs. After running syncdb and starting the web app, use the
Django DB Admin site to configure the default site by replacing the
example.com domain with the domain for the deployed web application.
The application uses database-backed sessions. Django recommends periodically clearing the session table in this configuration. To do this, set up a cron job to run the following command periodically from within the application's virtual environment:
$ manage.py cleanup
This script removes any expired sessions from the database. We recommend doing this about every week, though exact timing depends on usage patterns and administrative discretion.
The application relies on current directory information about faculty. This information is provided by Emory Shared Data, but we also index it in solr for improved searching capabilities. Set up a nightly cron job to re-scan the ESD data and update the index:
$ manage.py index_faculty
The application collects usage statistics and sends quarterly reports to article authors. Set up a cron job to create and send these reports by running the following command from within the application's virtual environment. The script should run at the beginning of January, April, July, and October:
$ manage.py quarterly_stats_by_author
The application harvests article metadata from PubMed Central nightly and stores it in the OpenEmory SQL database to be ingested later. Run the following command nightly to keep the harvest queue up to date. In this mode, article metadata is harvested from the last harvest date to the present:
$ manage.py fetch_pmc_metadata --auto-date
Additionally, there is a second job which runs once a month that does a full harvest to catch any records that may have been missed for any reason:
$ manage.py fetch_pmc_metadata
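The recurring jobs described above can be collected into crontab entries. The sketch below only renders the lines; it does not install them. The interpreter path follows the example environment used in this documentation, while the location of manage.py inside the checkout and every time of day are assumptions to adjust locally. Calling the virtualenv's own python is used here as a stand-in for activating the environment first.

```python
# Virtualenv interpreter, following the example environment path in this document.
PYTHON = "/home/httpd/openemory/env/bin/python"
# Assumed location of manage.py inside the source checkout.
MANAGE = "/home/httpd/openemory/src/manage.py"

# (cron schedule, manage.py arguments). Times of day are illustrative only.
JOBS = [
    ("0 3 * * 0", "cleanup"),                           # weekly session cleanup
    ("30 1 * * *", "index_faculty"),                    # nightly ESD re-scan
    ("0 6 1 1,4,7,10 *", "quarterly_stats_by_author"),  # start of each quarter
    ("0 2 * * *", "fetch_pmc_metadata --auto-date"),    # nightly incremental harvest
    ("0 4 1 * *", "fetch_pmc_metadata"),                # monthly full harvest
]


def crontab_lines(jobs=JOBS, python=PYTHON, manage=MANAGE):
    """Render one crontab line per job."""
    return ["{0} {1} {2} {3}".format(schedule, python, manage, args)
            for schedule, args in jobs]


if __name__ == "__main__":
    print("\n".join(crontab_lines()))
```

The printed lines can be reviewed and pasted into the crontab of the user that owns the deployment.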
Set up iWatch to trigger notifications on folder where reports are created.
- Please use the Django Admin to edit the flatpage contents in the database so that the site navigation can be updated. The "/about/authors-rights/" needs to be updated to "/about/author-rights/", title "Authors' Rights" needs to be updated to "Author Rights".
- Please use the Django Admin to edit the flatpage contents in the database so that the site navigation can be updated. The "/data-archiving/" needs to be updated to "/publishing-your-data/", title "Data Archiving" needs to be updated to "Publishing Your Data".
- Please use the Django Admin to edit the flatpage contents in the database so that the site navigation can be updated. There needs to be "/about/depositadvice/" added, "/how-to/submit/" updated, and "/about/staff/" title updated.
- Please check the "django_flatpage_sites" table in the database and make sure that the "site_id" is all marked as "1" or the "site_id" that we are using for this app.
- Currently the Admin page may not be viewable due to a problem in eullocal; until it is fixed permanently, we just need to delete: SiteProfileNotAvailable from /home/httpd/openemory/env/lib/python2.7/site-packages/eullocal/django/ldap/backends.py.
- fixing embargo duration
- pdf file download bug
- pubsid report
- download pmc subset
- fixing styles for publication page
- adjusting mods to save non emory faculty authors
- mime type debugging
- fixing styles
- adding new content
- adding new content
- debugging conflicting policies in XACML
run this script to cleanup journal articles (updated)
$ python manage.py journal_title
run this script to match all content models for articles and books
$ python manage.py cmodel_cleanup
run this script to match all current journal titles with Sherpa Romeo
$ python manage.py journal_title
run migrations for downtime
$ python ./manage.py migrate downtime
$ python ./manage.py migrate mx
run migrations for publication
$ python ./manage.py migrate publication
create LastRun object (from the Django shell, python manage.py shell):
>>> from openemory.publication.models import LastRun
>>> LastRun(name='Convert Symp to OE', start_time='2014-01-01 00:00:00').save()
Set up iWatch to trigger notifications on folder where reports are created
Setup cron job to run import command
in localsettings.py
run migrations for accounts to add add_articlerecord to Site Admin group permissions:
$ python manage.py migrate accounts
Add the following variables to localsettings.py:
Run migrations:
$ python ./manage.py migrate accounts
Run add_dc_ident to modify dc data:
$ python ./manage.py add_dc_ident --username=<USERNAME>
Run add_to_oai to update OAI info:
$ python ./manage.py add_to_oai --username=<USERNAME>
The system pip and virtualenv packages need to be updated before the fab file is run:
$ sudo pip install --upgrade pip
$ sudo pip install --upgrade virtualenv
Run add_dc_ident to restore dc identifiers:
$ python ./manage.py add_dc_ident
Add the following to local settings BEFORE fab is run. Values will be provided at deploy time:
# reCAPTCHA keys for your server or domain from https://www.google.com/recaptcha/
RECAPTCHA_PUBLIC_KEY = ''
RECAPTCHA_PRIVATE_KEY = ''
RECAPTCHA_OPTIONS = {}
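Because the keys start out empty and are only filled in at deploy time, it can be useful to verify that nothing was left blank before running fab. The following is a minimal sketch; the helper missing_settings is hypothetical, and a SimpleNamespace stands in for the real settings module.

```python
from types import SimpleNamespace

# Settings that must be non-empty before deployment.
REQUIRED = ("RECAPTCHA_PUBLIC_KEY", "RECAPTCHA_PRIVATE_KEY")


def missing_settings(settings, names=REQUIRED):
    """Return the names that are absent or empty on the given settings object."""
    return [name for name in names if not getattr(settings, name, "")]


if __name__ == "__main__":
    # Stand-in for localsettings with the placeholder values shown above.
    example = SimpleNamespace(RECAPTCHA_PUBLIC_KEY="",
                              RECAPTCHA_PRIVATE_KEY="",
                              RECAPTCHA_OPTIONS={})
    print("still to fill in:", missing_settings(example))
```

The same function could be pointed at an imported localsettings module instead of the stand-in object.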
Run syncrepo to load collection object:
$ python ./manage.py syncrepo
A manage command needs to be run to prepare the articles to be harvested by OAI:
$ python manage.py add_to_oai --username=<USERNAME> > oai.log
Run migrations to add License model:
$ python ./manage.py migrate
Run the following command to load the initial license info:
$ python ./manage.py loaddata init_license
A manage command needs to be run to remove empty contentMetadata datastreams, copy license info into the MODS, and add OAI info. The script should be run with a username supplied via --username:
$ python manage.py cleanup_articles --username=<USERNAME> > cleanup.log
New configurations have been added:
- GOOGLE_ANALYTICS_ENABLED - set True/False to enable/disable Google Analytics on the site (analytics should generally only be enabled in production)
- GOOGLE_SITE_VERIFICATION - set to the value provided by Google Webmaster Tools to allow site verification
for examples.
Now using :mod:`django.contrib.flatpages` for pages with static site content (about, how-tos, etc). Run syncdb and migrate to update the database:
$ python manage.py syncdb
$ python manage.py migrate
For an existing installation with a database you want to preserve, you will have to fake the 0012_add_model_announcement migration if you receive the error message Table accounts_announcement already exists:
$ python manage.py migrate accounts 0012 --fake --delete-ghost-migrations
You can then run the migrate command above to finish the migrations.
A nightly cron job is needed to run the following command to check for embargoes that have expired and reindex them so that the full text can be searched:
$ python manage.py expire_embargo
The output of this script should be redirected to a log. The log should be rotated on a regular basis.
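One way to get both the redirection and the rotation in a single step is a small wrapper that runs the command and writes its output through a size-rotated log handler. This is a sketch under assumptions, not the project's tooling: the function name run_and_log, the size limit, and the number of backups are all illustrative, and system tools such as logrotate are an equally valid choice.

```python
import logging
import logging.handlers
import subprocess


def run_and_log(command, log_path, max_bytes=1000000, backups=5):
    """Run `command`, append its combined output to a size-rotated log, return the exit code."""
    logger = logging.getLogger("cronlog:" + log_path)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        handler = logging.handlers.RotatingFileHandler(
            log_path, maxBytes=max_bytes, backupCount=backups)
        handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
        logger.addHandler(handler)
    # Capture stdout and stderr together so errors land in the same log.
    result = subprocess.run(command, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, universal_newlines=True)
    for line in result.stdout.splitlines():
        logger.info(line)
    return result.returncode
```

A cron entry could then call a short script that invokes run_and_log(["python", "manage.py", "expire_embargo"], "expire_embargo.log") from the directory containing manage.py; the log file name here is made up.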
A nightly cron job is needed to sync indexed faculty data with ESD:
$ python manage.py index_faculty
A cron job is needed to run at the beginning of each quarter to send out stats for the previous quarter:
$ python manage.py quarterly_stats_by_author
The output of this script should be redirected to a log. The log should be rotated on a regular basis.
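For reference, "the previous quarter" relative to the day the job runs can be worked out as below. This only illustrates the date arithmetic and assumes calendar quarters; it is not taken from the quarterly_stats_by_author command itself, and the function name is hypothetical.

```python
import datetime


def previous_quarter(today):
    """Return (first_day, last_day) of the calendar quarter before the one containing `today`."""
    # First month of the quarter that contains `today`: 1, 4, 7 or 10.
    first_month_of_current = 3 * ((today.month - 1) // 3) + 1
    current_start = datetime.date(today.year, first_month_of_current, 1)
    # The previous quarter ends the day before the current one starts.
    last_day = current_start - datetime.timedelta(days=1)
    # last_day.month is always 3, 6, 9 or 12, so subtracting 2 stays within the year.
    first_day = datetime.date(last_day.year, last_day.month - 2, 1)
    return first_day, last_day


if __name__ == "__main__":
    print(previous_quarter(datetime.date.today()))
```

A job running on 1 January therefore covers 1 October through 31 December of the preceding year.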
ESD faculty information is now indexed in Solr for search functionality. In order to accommodate indexing disparate types of data, the unique key for Solr has been changed. Solr should be configured with the new schema, and then all data must be cleared and reindexed.
Restart eulindexer after this and any other solr schema changes.
After updating Solr with the new schema, index Faculty data from Emory Shared Data into Solr:
$ python manage.py index_faculty
This release adds models and migrations. Sync and migrate the database:
$ python manage.py syncdb $ python manage.py migrate
- Now makes use of the PID manager and the :mod:`django.contrib.sites` module to generate ARKs for repository content. To configure:
  - After running syncdb and starting the web app, use the Django DB Admin site to configure the default site by replacing the example.com domain with the domain for the deployed web application.
  - Create a domain and user for OpenEmory ARKs on the PID manager (the user should have permissions to create pids and targets), and configure all of the PIDMAN_ settings based on the examples in localsettings.py.dist.
Now includes :mod:`south` for database migrations. For a new installation, you should run syncdb to add the required database tables for south and any of the other tables not managed by South:
$ python manage.py syncdb
By default, Django will prompt you to create a superuser when you run syncdb on a new database; since the user profile model is managed by :mod:`south`, you should not attempt to create any accounts until after you have completed the migrations. To skip this prompt, you may run syncdb with the --noinput option. After migrations are complete, use the createsuperuser manage.py command to create a new superuser.
Then run the south migrate command to update the database tables that are now managed by :mod:`south`:
$ python manage.py migrate
For an existing installation with a database you want to preserve, run the syncdb step above to add the required database tables for south, and then fake the initial migrations:
$ python manage.py migrate accounts 0001 --fake
$ python manage.py migrate harvest 0001 --fake
$ python manage.py migrate publication 0001 --fake
After this step, you should be able to use South migrations normally.
Python dependencies now include Python Imaging Library (PIL). See Install System Dependencies for instructions on the libraries required for JPEG and PNG support.
Profile editing provides an option for users to upload images; this user-uploaded content will be stored in the configured MEDIA_ROOT directory. System administrators may wish to revisit the configuration for this Django setting (previously set elsewhere but now included in localsettings.py; see localsettings.py.dist for example configuration).
Run syncdb to add new article review permissions and update the Site Admin group permissions:
$ python manage.py syncdb
Added new logic for generating Article MODS from NLM records harvested from PubMed Central. Any existing test records should either be removed and reharvested, or updated as follows. Activate the virtualenv and start the Django console:
$ python manage.py shell
Then run the following to update Articles in the configured repository with NLM xml:
from eulfedora.server import Repository
from openemory.publication.models import Article
from django.conf import settings

repo = Repository(username=settings.FEDORA_MANAGEMENT_USER,
                  password=settings.FEDORA_MANAGEMENT_PASSWORD)
for a in repo.get_objects_with_cmodel(Article.ARTICLE_CONTENT_MODEL, type=Article):
    if a.contentMetadata.exists:
        try:
            if str(a.contentMetadata.content):
                a.descMetadata.content = a.contentMetadata.content.as_article_mods()
                a.save('populating MODs from NLM xml')
        except:
            pass
This release includes new solr fields. Configure a new core and reindex project content into it.
This release includes support for editing inactive Fedora items. This support requires updated Fedora policies. Update Fedora policies while upgrading this package.
Updated Fedora policies provide read access to all OpenEmory content (not published content only) to logged-in users with the "indexer" role. It is recommended to create a Fedora user with an indexer role and configure :mod:`eulindexer` to use this account. For example:
<user name="eulindexer" password="...">
  <attribute name="fedoraRole">
    <value>indexer</value>
  </attribute>
</user>
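If user entries like this one are generated by a script rather than edited by hand, the same element can be built with the standard library. This is only a sketch of producing the fragment shown above; the function name is made up, and where the fragment is placed in the Fedora configuration is left to the administrator.

```python
import xml.etree.ElementTree as ET


def fedora_user_element(name, password, role="indexer"):
    """Build a <user> element carrying a single fedoraRole attribute value."""
    user = ET.Element("user", {"name": name, "password": password})
    attribute = ET.SubElement(user, "attribute", {"name": "fedoraRole"})
    ET.SubElement(attribute, "value").text = role
    return user


if __name__ == "__main__":
    # The password is a placeholder, as in the example above.
    print(ET.tostring(fedora_user_element("eulindexer", "..."), encoding="unicode"))
```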
This release includes new relational Python modules and database tables. To upgrade, install new python dependencies in your virtualenv:
$ pip install -r pip-install-req.txt
And then update the database with new tables via syncdb:
$ python manage.py syncdb
As part of this release, the user profile model has been customized, which entails a database change. If you wish to create profiles for existing Emory LDAP users, run the inituser script with the usernames. You may also want to drop the former ldap profile table, as it is no longer in use. Any users created or updated after this upgrade will get the new profiles automatically.
This release includes new relational database tables and fixtures. Upgrade requires a syncdb:
$ python manage.py syncdb
This release changes the project solr schema. Before installing the software, set up a new solr core for the project. The solr configuration files will be produced as part of the release. If the URL of this solr core is different from the old one, then update the configured URL. After the updated OpenEmory website is live, reindex the site. As eulindexer:
$ python manage.py reindex -s openemory