21 changes: 21 additions & 0 deletions doajtest/testbook/article_xml_upload/article_doaj_xml_upload.yml
@@ -357,3 +357,24 @@ tests:
- Related background job is found
- status is "complete"
- Outcome Status is "success"

- title: Upload a file before OA start date of the Journal
  context:
    role: publisher
  steps:
  - step: Make sure the OA start date of the journal is in the future
  - step: Go to the "Upload Article XML" tab in the "Publisher Area"
  - step: Select "Choose file" and select the test resource file "successful.xml"
    resource: /xml_upload_test_package/DOAJ/successful.xml
  - step: Click "Upload"
    results:
    - 'A flash message appears at the top of the screen indicating a successful upload:
      File uploaded and waiting to be processed. Check back here for updates.(Dismiss)'
    - Your file is shown in the "History of uploads" with status "pending"
  - step: wait a short amount of time for the job to process, then reload the page
      (do not re-submit the form data). If the job remains in "pending", reload the
      page until the status changes.
    results:
    - Your file is shown in the "History of uploads" with status "processing failed"
      and an entry in the "Notes" that reads 'Article(s) cannot be uploaded before OA start date of the Journal'. Check that the explanation link goes to
      a suitable reason and resolution for the problem.
75 changes: 74 additions & 1 deletion doajtest/unit/test_tasks_ingestDOAJarticles.py
I ran the whole test suite and got a number of failures. They are not related to this file; I think there's a general problem with the fixtures for some of the tests:

FAILED test_bll_article_batch_create_article.py::TestBLLArticleBatchCreateArticle::test_01_batch_create_article_033_34 - portality.bll.exceptions.ArticleNotAcceptable: Article(s) 'Article Title' cannot be uploaded before OA start date of the Journal
FAILED test_bll_article_batch_create_article.py::TestBLLArticleBatchCreateArticle::test_01_batch_create_article_089_90 - portality.bll.exceptions.ArticleNotAcceptable: Article(s) 'Article Title' cannot be uploaded before OA start date of the Journal
FAILED api_tests/test_api_crud_article.py::TestCrudArticle::test_02_create_article_success - portality.api.common.Api400Error: Article(s) '{title}' cannot be uploaded before OA start date of the Journal
FAILED api_tests/test_api_crud_article.py::TestCrudArticle::test_08_update_article_success - portality.api.common.Api400Error: Article(s) '{title}' cannot be uploaded before OA start date of the Journal
FAILED api_tests/test_api_crud_article.py::TestCrudArticle::test_09_update_article_fail - portality.api.common.Api400Error: Article(s) '{title}' cannot be uploaded before OA start date of the Journal
FAILED api_tests/test_api_crud_article.py::TestCrudArticle::test_10_delete_article_success - portality.api.common.Api400Error: Article(s) '{title}' cannot be uploaded before OA start date of the Journal
FAILED api_tests/test_api_crud_article.py::TestCrudArticle::test_11_delete_article_fail - portality.api.common.Api400Error: Article(s) '{title}' cannot be uploaded before OA start date of the Journal
FAILED api_tests/test_api_crud_article.py::TestCrudArticle::test_12_too_many_keywords - portality.api.common.Api400Error: Article(s) '{title}' cannot be uploaded before OA start date of the Journal
FAILED api_tests/test_api_bulk_article_class.py::TestBulkArticle::test_01_create_articles_success - portality.api.common.Api400Error: Article(s) 'Article Title' cannot be uploaded before OA start date of the Journal
FAILED api_tests/test_api_crud_article.py::TestCrudArticle::test_14_test_via_endpoint - assert 400 == 201
FAILED api_tests/test_api_crud_article.py::TestCrudArticle::test_15_no_redirects - assert 400 == 201
FAILED api_tests/test_api_bulk_article_class.py::TestBulkArticle::test_04_delete_article_success - portality.api.common.Api400Error: Article(s) 'Article Title' cannot be uploaded before OA start date of the Journal
FAILED api_tests/test_api_bulk_article_class.py::TestBulkArticle::test_05_delete_articles_fail - portality.api.common.Api400Error: Article(s) 'Article Title' cannot be uploaded before OA start date of the Journal
FAILED api_tests/test_api_crud_article.py::TestCrudArticle::test_16_no_redirects - assert 400 == 201
FAILED test_bll_article_create_article.py::TestBLLArticleCreateArticle::test_01_create_article_0_1 - portality.bll.exceptions.ArticleBeforeOAStartDate: Article(s) '{title}' cannot be uploaded before OA start date of the Journal
FAILED test_bll_article_create_article.py::TestBLLArticleCreateArticle::test_01_create_article_2_3 - portality.bll.exceptions.ArticleBeforeOAStartDate: Article(s) '{title}' cannot be uploaded before OA start date of the Journal
FAILED api_tests/test_api_bulk_article_class.py::TestBulkArticle::test_06_test_via_endpoint - TimeoutError: fail to meet the condition within the timeout period.
FAILED test_bll_article_create_article.py::TestBLLArticleCreateArticle::test_01_create_article_8_9 - portality.bll.exceptions.ArticleBeforeOAStartDate: Article(s) '{title}' cannot be uploaded before OA start date of the Journal
FAILED application_processors/test_application_processor_emails.py::TestUpdateRequestReviewEmails::test_01_maned_review_emails - assert 1 == 3
FAILED api_tests/test_api_bulk_article_class.py::TestBulkArticle::test_07_v1_no_redirects - TimeoutError: fail to meet the condition within the timeout period.
FAILED test_paths.py::TestPaths::test_get_project_root - AssertionError: assert 'doaj3' == 'doaj'
FAILED test_create_article.py::TestCreateOrUpdateArticle::test_00_no_doi_and_url_changed - portality.bll.exceptions.ArticleBeforeOAStartDate: Article(s) '{title}' cannot be uploaded before OA start date of the Journal
FAILED api_tests/test_api_bulk_article_class.py::TestBulkArticle::test_08_v2_no_redirects - TimeoutError: fail to meet the condition within the timeout period.
FAILED test_update_article.py::TestCreateOrUpdateArticle::test_00_no_doi_and_url_changed - portality.bll.exceptions.ArticleBeforeOAStartDate: Article(s) '{title}' cannot be uploaded before OA start date of the Journal
FAILED test_create_article.py::TestCreateOrUpdateArticle::test_01_new_doi_new_url - portality.bll.exceptions.ArticleBeforeOAStartDate: Article(s) '{title}' cannot be uploaded before OA start date of the Journal
FAILED test_update_article.py::TestCreateOrUpdateArticle::test_01_new_doi_new_url - portality.bll.exceptions.ArticleBeforeOAStartDate: Article(s) '{title}' cannot be uploaded before OA start date of the Journal
FAILED test_create_article.py::TestCreateOrUpdateArticle::test_04_old_doi_new_url - portality.bll.exceptions.ArticleBeforeOAStartDate: Article(s) '{title}' cannot be uploaded before OA start date of the Journal
FAILED test_update_article.py::TestCreateOrUpdateArticle::test_04_old_doi_new_url - portality.bll.exceptions.ArticleBeforeOAStartDate: Article(s) '{title}' cannot be uploaded before OA start date of the Journal
FAILED test_create_article.py::TestCreateOrUpdateArticle::test_05_new_doi_old_url - portality.bll.exceptions.ArticleBeforeOAStartDate: Article(s) '{title}' cannot be uploaded before OA start date of the Journal
FAILED test_update_article.py::TestCreateOrUpdateArticle::test_05_new_doi_old_url - portality.bll.exceptions.ArticleBeforeOAStartDate: Article(s) '{title}' cannot be uploaded before OA start date of the Journal
FAILED test_admin_edit_metadata.py::TestAdminEditMetadata::test_02_update_article_metadata_no_url_fulltext - portality.bll.exceptions.ArticleBeforeOAStartDate: Article(s) '{title}' cannot be uploaded before OA start date of the Journal
FAILED test_admin_edit_metadata.py::TestAdminEditMetadata::test_03_update_fulltext_valid - portality.bll.exceptions.ArticleBeforeOAStartDate: Article(s) '{title}' cannot be uploaded before OA start date of the Journal
FAILED test_admin_edit_metadata.py::TestAdminEditMetadata::test_05_update_doi_valid - portality.bll.exceptions.ArticleBeforeOAStartDate: Article(s) '{title}' cannot be uploaded before OA start date of the Journal
FAILED api_tests/test_api_crud_returnvalues.py::TestCrudReturnValues::test_03_articles_crud - AssertionError: 400
FAILED event_consumers/test_update_request_publisher_accepted_notify.py::TestUpdateRequestPublisherAcceptedNotify::test_should_consume - assert False
FAILED event_consumers/test_update_request_publisher_assigned_notify.py::TestApplicationPublisherAssignedNotify::test_should_consume - assert False
FAILED event_consumers/test_update_request_publisher_rejected_notify.py::TestUpdateRequestPublisherRejectedNotify::test_should_consume - assert False
FAILED event_consumers/test_update_request_publisher_submitted_notify.py::TestUpdateRequestPublisherSubmittedNotify::test_should_consume - AssertionError: assert False
FAILED test_tasks_harvest.py::TestHarvester::test_harvest - TimeoutError: fail to meet the condition within the timeout period.
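
One detail in the log above: the single-article paths print a literal `'{title}'`, which suggests the message template is raised there without being formatted. A minimal sketch of filling the placeholder, assuming the template text matches `Messages.EXCEPTION_ARTICLE_BEFORE_OA_START_DATE`; the helper name `before_oa_message` is hypothetical and not part of the codebase:

```python
# Template text copied from the failure messages above; in the codebase it
# is defined on the Messages class.
TEMPLATE = "Article(s) '{title}' cannot be uploaded before OA start date of the Journal"


def before_oa_message(titles):
    """Hypothetical helper: fill the {title} placeholder with one or more titles."""
    # Sorting keeps the message stable regardless of set iteration order.
    return TEMPLATE.format(title=", ".join(sorted(titles)))
```

Calling `before_oa_message({"Article Title"})` yields the same text the batch tests report, with no unformatted placeholder left over.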

@@ -23,7 +23,7 @@
from portality.crosswalks import article_doaj_xml
from portality.tasks import ingestarticles
from portality.ui.messages import Messages

from portality.lib import dates

class TestIngestArticlesDoajXML(DoajTestCase):

@@ -1048,3 +1048,76 @@ def test_61_journal_not_indoaj(self):

assert file_upload.status == "failed"
assert file_upload.error == Messages.EXCEPTION_ADDING_ARTICLE_TO_WITHDRAWN_JOURNAL

    def test_62_article_before_oa_start(self):
        journal = article_upload_tester.create_simple_journal("testowner", pissn="1234-5678", eissn="9876-5432")
        # OA starts this year; the fixture article is expected to predate it and be rejected
        journal.bibjson().oa_start = dates.now().year
        helpers.save_all_block_last([
            journal,
            article_upload_tester.create_simple_publisher("testowner")
        ])

        # single upload handle for the fixture file
        handle1 = DoajXmlArticleFixtureFactory.upload_2_issns_correct()

        f1 = FileMockFactory(stream=handle1)

        job1 = ingestarticles.IngestArticlesBackgroundTask.prepare("testowner", schema="doaj", upload_file=f1)
        id1 = job1.params.get("ingest_articles__file_upload_id")
        self.cleanup_ids.append(id1)

        # because file upload gets created and saved by prepare
        time.sleep(1)

        task1 = ingestarticles.IngestArticlesBackgroundTask(job1)

        task1.run()

        # because file upload needs to be re-saved
        time.sleep(1)

        fu1 = models.FileUpload.pull(id1)

        assert fu1.status == "failed", "received status: {}".format(fu1.status)
        assert job1.outcome_status == "fail"

        assert any('Articles before OA start date: Imaginaires autochtones contemporains. Introduction' in entry['message'] for entry in
                   job1.audit), "No message found with 'Articles before OA start date'"

        # check that the article was not created
        assert models.Article.count_by_issns(["1234-5678", "9876-5432"]) == 0

    def test_63_article_after_oa_start(self):
        journal = article_upload_tester.create_simple_journal("testowner", pissn="1234-5678", eissn="9876-5432")
        # OA started in 2013; the fixture article is expected to be accepted
        journal.bibjson().oa_start = 2013
        helpers.save_all_block_last([
            journal,
            article_upload_tester.create_simple_publisher("testowner")
        ])

        # single upload handle for the fixture file
        handle1 = DoajXmlArticleFixtureFactory.upload_2_issns_correct()

        f1 = FileMockFactory(stream=handle1)

        job1 = ingestarticles.IngestArticlesBackgroundTask.prepare("testowner", schema="doaj", upload_file=f1)
        id1 = job1.params.get("ingest_articles__file_upload_id")
        self.cleanup_ids.append(id1)

        # because file upload gets created and saved by prepare
        time.sleep(1)

        task1 = ingestarticles.IngestArticlesBackgroundTask(job1)

        task1.run()

        # because file upload needs to be re-saved
        time.sleep(1)

        fu1 = models.FileUpload.pull(id1)

        assert fu1.status == "processed", "received status: {}".format(fu1.status)

        assert not any('Articles before OA start date: Imaginaires autochtones contemporains. Introduction' in entry['message'] for entry in
                       job1.audit), "Unexpected 'Articles before OA start date' message found"

        # check that the article was created
        assert models.Article.count_by_issns(["1234-5678", "9876-5432"]) == 1
6 changes: 6 additions & 0 deletions portality/bll/exceptions.py
@@ -73,6 +73,12 @@ def __str__(self):
super(ArticleNotAcceptable, self).__str__()
return self.message

class ArticleBeforeOAStartDate(ArticleNotAcceptable):
"""
Exception to raise when an article's publication year is before the OA start date of the Journal
"""
pass

class ArticleMergeConflict(Exception):
"""
Exception to raise when it's not clear which article to merge an update with
17 changes: 15 additions & 2 deletions portality/bll/services/article.py
@@ -55,6 +55,7 @@ def batch_create_articles(self, articles, account, duplicate_check=True, merge_d
all_shared = set()
all_unowned = set()
all_unmatched = set()
all_before_oa_start_date = set()

# Hold on to the exception so we can raise it later
e_not_acceptable = None
@@ -70,6 +71,11 @@ def batch_create_articles(self, articles, account, duplicate_check=True, merge_d
dry_run=True)
except (exceptions.ArticleMergeConflict, exceptions.ConfigurationException):
raise exceptions.IngestException(message=Messages.EXCEPTION_ARTICLE_BATCH_CONFLICT)
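except exceptions.ArticleBeforeOAStartDate as e:
    # Collect every offending title so the final message lists them all
    all_before_oa_start_date.add(article.bibjson().title)
    result = {'fail': 1}
    # Rebuild from the template so this works whether or not e.message was already formatted
    e.message = Messages.EXCEPTION_ARTICLE_BEFORE_OA_START_DATE.format(title=", ".join(sorted(all_before_oa_start_date)))
    e_not_acceptable = e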
except exceptions.ArticleNotAcceptable as e:
# The ArticleNotAcceptable exception is a superset of reasons we can't match a journal to this article
e_not_acceptable = e
@@ -84,7 +90,7 @@ def batch_create_articles(self, articles, account, duplicate_check=True, merge_d
all_unmatched.update(result.get("unmatched", set()))

report = {"success": success, "fail": fail, "update": update, "new": new, "shared": all_shared,
-          "unowned": all_unowned, "unmatched": all_unmatched}
+          "unowned": all_unowned, "unmatched": all_unmatched, "before_oa_start_date": all_before_oa_start_date}

# if there were no failures in the batch, then we can do the save
if fail == 0:
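
The accumulation pattern in this hunk can be read in isolation as the sketch below. It is a simplified model, not the service code: the exception classes are stand-ins for the ones in `portality.bll.exceptions`, and `dry_run_batch`, its dict-based articles and the `create` callable are hypothetical.

```python
class ArticleNotAcceptable(Exception):
    """Stand-in for portality.bll.exceptions.ArticleNotAcceptable."""


class ArticleBeforeOAStartDate(ArticleNotAcceptable):
    """Stand-in for the subclass added in this change."""


def dry_run_batch(articles, create):
    """Call `create` on each article and collect titles rejected for OA start date.

    `articles` is a list of dicts with a "title" key and `create` is any
    callable that may raise; both are simplifications of the real objects.
    """
    before_oa_start_date = set()
    success = fail = 0
    for article in articles:
        try:
            create(article)
            success += 1
        except ArticleBeforeOAStartDate:
            # Must be listed before the parent class, otherwise the parent
            # handler would catch it and this branch would never run.
            before_oa_start_date.add(article["title"])
            fail += 1
        except ArticleNotAcceptable:
            fail += 1
    return {"success": success, "fail": fail,
            "before_oa_start_date": before_oa_start_date}
```

The ordering of the two `except` clauses is the part worth noting: because the new exception subclasses `ArticleNotAcceptable`, its handler has to come first.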
@@ -235,8 +241,15 @@ def create_article(self, article, account, duplicate_check=True, merge_duplicate
except (exceptions.DuplicateArticleException, exceptions.ArticleMergeConflict, exceptions.ConfigurationException) as e:
raise e

# Reject the article if it was published before the journal's OA start date
journal = article.get_journal()
year = article.bibjson().year
# Guard against a missing journal or year instead of failing with an unrelated error
oa_start_date = journal.has_oa_start_date() if journal is not None else None
if oa_start_date and year and int(year) < oa_start_date:
    raise exceptions.ArticleBeforeOAStartDate(
        message=Messages.EXCEPTION_ARTICLE_BEFORE_OA_START_DATE.format(title=article.bibjson().title))

if add_journal_info:
-    article.add_journal_metadata()
+    article.add_journal_metadata(j=journal)

# finally, save the new article
if not dry_run:
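
The year comparison added in this hunk can be exercised on its own, including the boundary where the article year equals the OA start year. This is a minimal sketch under stated assumptions: `is_before_oa_start` is a hypothetical helper, and treating missing or non-numeric values as "not before" is a choice made for the sketch rather than behaviour confirmed by the change.

```python
from typing import Union


def is_before_oa_start(published_year: Union[str, int, None],
                       oa_start: Union[str, int, None]) -> bool:
    """Return True only when both years are known and the article predates OA start."""
    if not oa_start or published_year in (None, ""):
        # Nothing to compare against: treat the article as acceptable.
        return False
    try:
        # Strict comparison: an article from the OA start year itself is allowed.
        return int(published_year) < int(oa_start)
    except (TypeError, ValueError):
        # Non-numeric year (assumption: such records are validated elsewhere).
        return False
```

With this shape, `is_before_oa_start("1991", 2013)` is true while `is_before_oa_start(2013, 2013)` is false, which matches the strict `<` used in the hunk.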
3 changes: 3 additions & 0 deletions portality/tasks/helpers/articles_upload_helper.py
@@ -84,6 +84,7 @@ def upload_process(articles_upload: BaseArticlesUpload,
shared = result.get("shared", [])
unowned = result.get("unowned", [])
unmatched = result.get("unmatched", [])
before_oa_start_date = result.get("before_oa_start_date", [])

if success == 0 and fail > 0 and not ingest_exception:
articles_upload.failed("All articles in file failed to import")
@@ -99,6 +100,8 @@ def upload_process(articles_upload: BaseArticlesUpload,
job.add_audit_message("Shared ISSNs: " + ", ".join(list(shared)))
job.add_audit_message("Unowned ISSNs: " + ", ".join(list(unowned)))
job.add_audit_message("Unmatched ISSNs: " + ", ".join(list(unmatched)))
if before_oa_start_date:
    job.add_audit_message("Articles before OA start date: " + ", ".join(before_oa_start_date))

if new:
ids = [a.id for a in articles]
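
For reference, the audit lines this hunk produces from a batch report can be sketched as a pure function. The name `audit_lines` is hypothetical, and sorting the values is an addition made here for stable output; the hunk itself joins the collections in their existing order.

```python
def audit_lines(result):
    """Build audit messages from a batch report dict (keys as in the hunk above)."""
    lines = [
        "Shared ISSNs: " + ", ".join(sorted(result.get("shared", []))),
        "Unowned ISSNs: " + ", ".join(sorted(result.get("unowned", []))),
        "Unmatched ISSNs: " + ", ".join(sorted(result.get("unmatched", []))),
    ]
    before = result.get("before_oa_start_date", [])
    if before:
        # Only emitted when at least one article was rejected for this reason.
        lines.append("Articles before OA start date: " + ", ".join(sorted(before)))
    return lines
```

The new line is conditional, so uploads that never hit the OA start date check keep the same three audit messages as before.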
11 changes: 11 additions & 0 deletions portality/templates-v2/public/publisher/xml_help.html
@@ -226,6 +226,17 @@ <h2 id="explanations">Explanation of XML errors</h2>
Ensure that all your articles have the correct DOIs and full-text links. If it still doesn’t work please <a href="https://github.com/DOAJ/doaj/issues/new/choose" target="_blank" rel="noopener">submit a bug report</a> or <a href="{{ url_for('doaj.contact') }}">contact us</a> with the details; we may need to clean up your existing articles manually.
</td>
</tr>
<tr>
<th>
<code>Article(s) cannot be uploaded before OA start date of the Journal</code>
</th>
<td>
One or more articles in the XML have a <strong>publicationDate</strong> that is earlier than the Open Access start date of the Journal.
</td>
<td>
Ensure that every article in the file was published on or after the Open Access start date of the Journal.
</td>
</tr>
</tbody>
</table>
{% endblock %}
1 change: 1 addition & 0 deletions portality/ui/messages.py
@@ -64,6 +64,7 @@ class Messages(object):
EXCEPTION_IDENTICAL_PISSN_AND_EISSN = "The Print and Online ISSNs supplied are identical. If you supply two ISSNs, they must be different."
EXCEPTION_NO_ISSNS = "Neither the Print ISSN nor Online ISSN have been supplied. DOAJ requires at least one ISSN."
EXCEPTION_INVALID_BIBJSON = "Invalid article bibjson: " # + Dataobj exception message
EXCEPTION_ARTICLE_BEFORE_OA_START_DATE = "Article(s) '{title}' cannot be uploaded before OA start date of the Journal"

EXCEPTION_IDENTIFIER_CHANGE_CLASH = "DOI or Fulltext URL has been changed to match another article that already exists in DOAJ"
EXCEPTION_IDENTIFIER_CHANGE = "Either the DOI or Fulltext URL has been changed. This operation is not permitted; please contact an administrator for help."