Commit 9acb6cf

Merge pull request #327 from BU-ISCIII/develop
Release 1.2.0
2 parents 9ec59c7 + 6ef3a10 commit 9acb6cf

26 files changed, +1190 -426 lines changed

.github/workflows/pypi_publish.yml

-39
Original file line number | Diff line number | Diff line change
@@ -47,42 +47,3 @@ jobs:
4747
path: dist/
4848
- name: Publish to PyPI
4949
uses: pypa/gh-action-pypi-publish@release/v1
50-
51-
github-release:
52-
name: Sign dist with Sigstore and upload to GitHub Release
53-
needs:
54-
- publish-to-pypi
55-
runs-on: ubuntu-latest
56-
permissions:
57-
contents: write
58-
id-token: write
59-
steps:
60-
- name: Download all the dists
61-
uses: actions/download-artifact@v4
62-
with:
63-
name: python-package-distributions
64-
path: dist/
65-
- name: Sign the dists with Sigstore
66-
uses: sigstore/[email protected]
67-
with:
68-
inputs: >-
69-
./dist/*.tar.gz
70-
./dist/*.whl
71-
- name: Create GitHub Release
72-
env:
73-
GITHUB_TOKEN: ${{ github.token }}
74-
run: >-
75-
gh release create
76-
'${{ github.ref_name }}'
77-
--repo '${{ github.repository }}'
78-
--notes ""
79-
- name: Upload artifact signatures to GitHub Release
80-
env:
81-
GITHUB_TOKEN: ${{ github.token }}
82-
# Upload to GitHub Release using the `gh` CLI.
83-
# `dist/` contains the built packages, and the
84-
# sigstore-produced signatures and certificates.
85-
run: >-
86-
gh release upload
87-
'${{ github.ref_name }}' dist/**
88-
--repo '${{ github.repository }}'

.github/workflows/python_lint.yml

+21
@@ -21,8 +21,19 @@ jobs:
2121
uses: actions/checkout@master
2222
- name: Install flake8
2323
run: pip install flake8
24+
- name: Check for Python file changes
25+
id: file_check
26+
uses: tj-actions/changed-files@v44
27+
with:
28+
sha: ${{ github.event.pull_request.head.sha }}
29+
files: |
30+
**.py
2431
- name: Run flake8
32+
if: steps.file_check.outputs.any_changed == 'true'
2533
run: flake8 --ignore E501,W503,E203,W605
34+
- name: No Python files changed
35+
if: steps.file_check.outputs.any_changed != 'true'
36+
run: echo "No Python files have been changed."
2637

2738
black_lint:
2839
runs-on: ubuntu-latest
@@ -31,5 +42,15 @@ jobs:
3142
uses: actions/checkout@v2
3243
- name: Install black in jupyter
3344
run: pip install black[jupyter]
45+
- name: Check for Python file changes
46+
id: file_check
47+
uses: tj-actions/changed-files@v44
48+
with:
49+
sha: ${{ github.event.pull_request.head.sha }}
50+
files: '**.py'
3451
- name: Check code lints with Black
52+
if: steps.file_check.outputs.any_changed == 'true'
3553
uses: psf/black@stable
54+
- name: No Python files changed
55+
if: steps.file_check.outputs.any_changed != 'true'
56+
run: echo "No Python files have been changed."
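The gating added to both lint jobs above runs the linter only when the `any_changed` output is `'true'` for the `**.py` pattern. A rough local approximation of that decision is sketched below. It assumes the list of changed paths is already known, and the helper name `any_python_changed` is hypothetical, not part of `tj-actions/changed-files`, whose glob matching may differ in edge cases.

```python
from pathlib import PurePosixPath


def any_python_changed(changed_paths):
    """Return True if at least one changed path ends in the .py suffix."""
    return any(PurePosixPath(path).suffix == ".py" for path in changed_paths)


if __name__ == "__main__":
    # Hypothetical list of paths touched by a pull request.
    changed = ["README.md", "relecov_tools/__main__.py"]
    if any_python_changed(changed):
        print("Python files changed: run flake8 and black")
    else:
        print("No Python files have been changed.")
```

Keeping the check as a separate step, as the workflow does, lets the job finish successfully with an explicit message when nothing needs linting.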

.github/workflows/test_sftp_handle.yml

+7-16
@@ -1,12 +1,14 @@
11
name: test_sftp_handle
22

33
on:
4-
push:
5-
branches: "**"
64
pull_request_target:
7-
types: [opened, reopened, synchronize, closed]
5+
types: [opened, reopened, synchronize]
86
branches: "**"
9-
7+
8+
concurrency:
9+
group: ${{ github.repository }}-test_sftp_handle
10+
cancel-in-progress: false
11+
1012
jobs:
1113
security_check:
1214
runs-on: ubuntu-latest
@@ -24,21 +26,10 @@ jobs:
2426
echo "Current permission level is ${{ steps.checkAccess.outputs.user-permission }}"
2527
echo "Job originally triggered by ${{ github.actor }}"
2628
exit 1
27-
28-
sleep_to_ensure_concurrency:
29-
needs: security_check
30-
runs-on: ubuntu-latest
31-
steps:
32-
- name:
33-
run: sleep 10s
34-
shell: bash
3529
3630
test_sftp_handle:
37-
needs: [security_check, sleep_to_ensure_concurrency]
31+
needs: security_check
3832
if: github.repository_owner == 'BU-ISCIII'
39-
concurrency:
40-
group: ${{ github.repository }}-test_sftp_handle
41-
cancel-in-progress: false
4233
runs-on: ubuntu-latest
4334
strategy:
4435
max-parallel: 1

CHANGELOG.md

+45-3
@@ -4,29 +4,71 @@ All notable changes to this project will be documented in this file.
44

55
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
66

7-
## [1.X.Xdev] - 2024-XX-XX : https://github.com/BU-ISCIII/relecov-tools/releases/tag/1.X.X
7+
## [1.2.0] - 2024-10-11 : https://github.com/BU-ISCIII/relecov-tools/releases/tag/1.2.0
88

99
### Credits
1010

11-
Code contributions to the hotfix:
11+
Code contributions to the release:
12+
13+
- [Juan Ledesma](https://github.com/juanledesma78)
14+
- [Pablo Mata](https://github.com/Shettland)
15+
- [Sergio Olmos](https://github.com/OPSergio)
1216

1317
### Modules
1418

19+
- Included wrapper module to launch download, read-lab-metadata and validate processes sequentially [#322](https://github.com/BU-ISCIII/relecov-tools/pull/322)
20+
- Changed launch-pipeline name for pipeline-manager when tools are used via CLI [#324](https://github.com/BU-ISCIII/relecov-tools/pull/324)
21+
1522
#### Added enhancements
1623

24+
- Now also check for gzip file integrity after download. Moved cleaning process to end of workflow [#313](https://github.com/BU-ISCIII/relecov-tools/pull/313)
25+
- Introduced a decorator in sftp_client.py to reconnect when connection is lost [#313](https://github.com/BU-ISCIII/relecov-tools/pull/313)
26+
- Add Hospital Universitari Doctor Josep Trueta to laboratory_address.json [#316](https://github.com/BU-ISCIII/relecov-tools/pull/316)
27+
- samples_data json file is no longer mandatory as input in read-lab-metadata [#314](https://github.com/BU-ISCIII/relecov-tools/pull/314)
28+
- Included handling of alternative column names to support two distinct headers using the same schema in read-lab-metadata [#314](https://github.com/BU-ISCIII/relecov-tools/pull/314)
29+
- Included a new hospital (Hospital Universitario Araba) to laboratory_address.json [#315](https://github.com/BU-ISCIII/relecov-tools/pull/315)
30+
- More accurate cleaning process, skipping only sequencing files instead of whole folder [#321](https://github.com/BU-ISCIII/relecov-tools/pull/321)
31+
- Now single logs summaries are also created for each folder during download [#321](https://github.com/BU-ISCIII/relecov-tools/pull/321)
32+
- Introduced handling for missing/dup files and more accurate information in prompt for pipeline_manager [#321](https://github.com/BU-ISCIII/relecov-tools/pull/321)
33+
- Included excel resize, brackets removal in messages and handled exceptions in log_summary.py [#322](https://github.com/BU-ISCIII/relecov-tools/pull/322)
34+
- Included processed batches and samples in read-bioinfo-metadata log summary [#324](https://github.com/BU-ISCIII/relecov-tools/pull/324)
35+
- When no samples_data.json is given, read-lab-metadata now creates a new one [#324](https://github.com/BU-ISCIII/relecov-tools/pull/324)
36+
- Handling for missing sample ids in read-lab-metadata [#324](https://github.com/BU-ISCIII/relecov-tools/pull/324)
37+
- Better logging for download, read-lab-metadata and wrapper [#324](https://github.com/BU-ISCIII/relecov-tools/pull/324)
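The gzip integrity check mentioned in the enhancements above is not shown in this diff. One common way to do it with the standard library is to decompress the whole file and treat any error as corruption. The function name `gzip_is_intact` and the chunk size are assumptions for illustration, not the code in relecov-tools.

```python
import gzip
import zlib


def gzip_is_intact(path, chunk_size=1024 * 1024):
    """Return True if the gzip file at `path` decompresses without errors."""
    try:
        with gzip.open(path, "rb") as handle:
            # Read to the end so the trailing CRC32 and length get verified.
            while handle.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        # BadGzipFile is an OSError; truncated files raise EOFError.
        return False
    return True
```

Reading in chunks keeps memory use flat even for large sequencing files, at the cost of decompressing every file once after download.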
38+
1739
#### Fixes
1840

41+
- Fixed wrong city name in relecov_tools/conf/laboratory_address.json [#320](https://github.com/BU-ISCIII/relecov-tools/pull/320)
42+
- Fixed wrong single-paired layout detection in metadata due to Capital letters [#321](https://github.com/BU-ISCIII/relecov-tools/pull/321)
43+
- Error handling in merge_logs() and create_logs_excel() methods for log_summary.py [#322](https://github.com/BU-ISCIII/relecov-tools/pull/322)
44+
- Included handling of multiple empty rows in metadata xlsx file [#322](https://github.com/BU-ISCIII/relecov-tools/pull/322)
45+
1946
#### Changed
2047

48+
- Renamed and refactored "bioinfo_lab_heading" for "alt_header_equivalences" in configuration.json [#314](https://github.com/BU-ISCIII/relecov-tools/pull/314)
49+
- Included a few schema fields that were missing or outdated, related to bioinformatics results [#314](https://github.com/BU-ISCIII/relecov-tools/pull/314)
50+
- Updated metadata excel template, moved to relecov_tools/assets [#320](https://github.com/BU-ISCIII/relecov-tools/pull/320)
51+
- Now python lint only triggers when PR includes python files [#320](https://github.com/BU-ISCIII/relecov-tools/pull/320)
52+
- Moved concurrency to whole workflow instead of each step in test_sftp-handle.yml [#320](https://github.com/BU-ISCIII/relecov-tools/pull/320)
53+
- Updated test_sftp-handle.yml testing datasets [#320](https://github.com/BU-ISCIII/relecov-tools/pull/320)
54+
- Now download skips folders containing "invalid_samples" in its name [#321](https://github.com/BU-ISCIII/relecov-tools/pull/321)
55+
- read-lab-metadata: Some warnings now include label. Also removed trailing spaces [#322](https://github.com/BU-ISCIII/relecov-tools/pull/322)
56+
- Renamed launch-pipeline for pipeline-manager and updated keys in configuration.json [#324](https://github.com/BU-ISCIII/relecov-tools/pull/324)
57+
- Pipeline manager now splits data based on enrichment_panel and version. One folder for each group [#324](https://github.com/BU-ISCIII/relecov-tools/pull/324)
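The last entry above says the pipeline manager now splits data by enrichment panel and version, with one folder per group. The grouping step might look like the sketch below. The field names `enrichment_panel`, `enrichment_panel_version` and `sample_id` are assumed for illustration and may not match the keys in the validated JSON.

```python
from collections import defaultdict


def split_by_panel(samples):
    """Group sample ids by (enrichment_panel, enrichment_panel_version)."""
    groups = defaultdict(list)
    for sample in samples:
        # Both field names are assumptions, not confirmed schema keys.
        key = (
            sample.get("enrichment_panel", "unknown"),
            sample.get("enrichment_panel_version", "unknown"),
        )
        groups[key].append(sample["sample_id"])
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical batch of validated samples.
    batch = [
        {"sample_id": "S1", "enrichment_panel": "ARTIC", "enrichment_panel_version": "4.1"},
        {"sample_id": "S2", "enrichment_panel": "ARTIC", "enrichment_panel_version": "5.3.2"},
        {"sample_id": "S3", "enrichment_panel": "ARTIC", "enrichment_panel_version": "4.1"},
    ]
    for (panel, version), sample_ids in split_by_panel(batch).items():
        print(f"{panel}_{version}: {sample_ids}")
```

Each key of the returned dictionary would then map to one output folder.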
58+
2159
#### Removed
2260

61+
- Removed duplicated tests with pushes after PR was merged in test_sftp-handle [#312](https://github.com/BU-ISCIII/relecov-tools/pull/312)
62+
- Deleted deprecated auto-release in pypi_publish as it does not work with tag pushes anymore [#312](https://github.com/BU-ISCIII/relecov-tools/pull/312)
63+
- Removed first sleep time for reconnection decorator in sftp_client.py, sleep time now increases in the second attempt [#321](https://github.com/BU-ISCIII/relecov-tools/pull/321)
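The reconnection decorator in sftp_client.py is not part of the hunks shown here. A minimal sketch of the behaviour described above (retry after reconnecting, no pause before the first retry, longer pauses afterwards) could look like this. The decorator name, the `reconnect()` method, the `ConnectionError` exception type and the timings are all assumptions, not the project's implementation.

```python
import functools
import time


def reconnect_on_failure(max_attempts=3, base_sleep=1.0, sleep=time.sleep):
    """Retry a method after calling self.reconnect() when the connection drops."""

    def decorator(method):
        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return method(self, *args, **kwargs)
                except ConnectionError:
                    if attempt == max_attempts:
                        raise
                    # No pause before the first retry; the wait grows afterwards.
                    delay = base_sleep * (attempt - 1)
                    if delay:
                        sleep(delay)
                    self.reconnect()

        return wrapper

    return decorator
```

Passing `sleep` as a parameter is only a convenience for testing, so that retries can be exercised without real waiting.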
64+
2365
### Requirements
2466

2567
## [1.1.0] - 2024-09-13 : https://github.com/BU-ISCIII/relecov-tools/releases/tag/1.1.0
2668

2769
### Credits
2870

29-
Code contributions to the hotfix:
71+
Code contributions to the release:
3072

3173
- [Pablo Mata](https://github.com/Shettland)
3274
- [Sara Monzón](https://github.com/saramonzon)

README.md

+31-14
@@ -24,7 +24,8 @@ relecov-tools is a set of helper tools for the assembly of the different element
2424
- [upload-to-ena](#upload-to-ena)
2525
- [upload-to-gisaid](#upload-to-gisaid)
2626
- [update-db](#update-db)
27-
- [launch-pipeline](#launch-pipeline)
27+
- [pipeline-manager](#pipeline-manager)
28+
- [wrapper](#wrapper)
2829
- [logs-to-excel](#logs-to-excel)
2930
- [build-schema](#build-schema)
3031
- [Mandatory Fields](#mandatory-fields)
@@ -63,7 +64,7 @@ $ relecov-tools --help
6364
\ \ / |__ / |__ | |___ | | | \ /
6465
/ / \ | \ | | | | | | \ /
6566
/ |--| | \ |___ |___ |___ |___ |___| \/
66-
RELECOV-tools version 1.1.0
67+
RELECOV-tools version 1.2.0
6768
Usage: relecov-tools [OPTIONS] COMMAND [ARGS]...
6869
6970
Options:
@@ -73,16 +74,19 @@ Options:
7374
--help Show this message and exit.
7475
7576
Commands:
76-
download Download files located in sftp server.
77-
read-lab-metadata Create the json compliant to the relecov schema from...
78-
read-bioinfo-metadata Create the json compliant to the relecov schema with Bioinfo Metadata.
79-
validate Validate json file against schema.
80-
map Convert data between phage plus schema to ENA,...
81-
upload-to-ena parsed data to create xml files to upload to ena
82-
upload-to-gisaid parsed data to create files to upload to gisaid
83-
update-db feed database with metadata jsons
84-
build-schema Generates and updates JSON Schema files from...
85-
launch-pipeline Create the symbolic links for the samples which...
77+
download Download files located in sftp server.
78+
read-lab-metadata Create the json compliant to the relecov schema...
79+
validate Validate json file against schema.
80+
map Convert data between phage plus schema to ENA,...
81+
upload-to-ena parse data to create xml files to upload to ena
82+
upload-to-gisaid parsed data to create files to upload to gisaid
83+
update-db upload the information included in json file to...
84+
read-bioinfo-metadata Create the json compliant from the Bioinfo...
85+
metadata-homogeneizer Parse institution metadata lab to the one used...
86+
pipeline-manager Create the symbolic links for the samples which...
87+
wrapper Execute download, read-lab-metadata and validate...
88+
build-schema Generates and updates JSON Schema files from...
89+
logs-to-excel Creates a merged xlsx report from all the log...
8690
```
8791
#### download
8892
The command `download` connects to a transfer protocol (currently sftp) and downloads all files in the different available folders in the passed credentials. In addition, it checks if the files in the current folder match the files in the metadata file and also checks if there are md5sum for each file. Else, it creates one before storing in the final repository.
@@ -247,10 +251,10 @@ Usage: relecov-tools upload-to-gisaid [OPTIONS]
247251
-t, --type Select the type of information to upload to database [sample,bioinfodata,variantdata]
248252
-d, --databaseServer Name of the database server receiving the data [iskylims,relecov]
249253

250-
#### launch-pipeline
254+
#### pipeline-manager
251255
Create the folder structure to execute the given pipeline for the latest sample batches after executing download, read-lab-metadata and validate modules. This module will create symbolic links for each sample and generate the necessary files for pipeline execution using the information from validated_BATCH-NAME_DATE.json.
252256
```
253-
Usage: relecov-tools launch-pipeline [OPTIONS]
257+
Usage: relecov-tools pipeline-manager [OPTIONS]
254258
255259
Create the symbolic links for the samples which are validated to prepare for
256260
bioinformatics pipeline execution.
@@ -263,6 +267,19 @@ Options:
263267
--help Show this message and exit.
264268
```
265269

270+
#### wrapper
271+
Execute download, read-lab-metadata and validate sequentially using a config file to fill the arguments for each one. It also creates a global report with all the logs for the three processes in a user-friendly .xlsx format. The config file should include the name of each module that is executed, along with the necessary parameters in YAML format.
272+
```
273+
Usage: relecov-tools wrapper [OPTIONS]
274+
275+
Executes the modules in config file sequentially
276+
277+
Options:
278+
-c, --config_file PATH Path to config file in yaml format [required]
279+
-o, --output_folder PATH Path to folder where global results are saved [required]
280+
--help Show this message and exit.
281+
```
282+
266283
#### logs-to-excel
267284
Creates an xlsx file with all the entries found for a specified laboratory in a given set of log_summary.json files (from log-summary module). The laboratory name must match the name of one of the keys in the provided logs to work.
268285
```
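The `wrapper` section added to the README in the hunk above describes running download, read-lab-metadata and validate in sequence, with the arguments for each taken from a YAML config file. The dispatch idea can be sketched as follows, assuming the config has already been parsed into a dictionary (for example by `yaml.safe_load`). The section names, parameters and stand-in functions are hypothetical and do not reproduce `relecov_tools.dataprocess_wrapper`.

```python
def run_modules(config, registry):
    """Call each configured module in order and collect what it returns."""
    results = {}
    for name, params in config.items():  # dicts preserve insertion order
        if name not in registry:
            raise ValueError(f"Unknown module in config: {name}")
        results[name] = registry[name](**params)
    return results


# Stand-ins for the real modules, which take different arguments.
def download(user, output_location):
    return f"downloaded as {user} into {output_location}"


def read_lab_metadata(metadata_file):
    return f"read {metadata_file}"


def validate(json_file):
    return f"validated {json_file}"


REGISTRY = {
    "download": download,
    "read-lab-metadata": read_lab_metadata,
    "validate": validate,
}

if __name__ == "__main__":
    # Shape a parsed wrapper config might have; all keys are assumptions.
    config = {
        "download": {"user": "lab_user", "output_location": "/tmp/batches"},
        "read-lab-metadata": {"metadata_file": "metadata_lab.xlsx"},
        "validate": {"json_file": "lab_metadata.json"},
    }
    for name, outcome in run_modules(config, REGISTRY).items():
        print(name, "=>", outcome)
```

Rejecting unknown section names early avoids running part of the sequence before a typo in the config is noticed.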

relecov_tools/__main__.py

+8-6
@@ -24,6 +24,7 @@
2424
import relecov_tools.upload_ena_protocol
2525
import relecov_tools.pipeline_manager
2626
import relecov_tools.build_schema
27+
import relecov_tools.dataprocess_wrapper
2728

2829
log = logging.getLogger()
2930

@@ -61,7 +62,7 @@ def run_relecov_tools():
6162
)
6263

6364
# stderr.print("[green] `._,._,'\n", highlight=False)
64-
__version__ = "1.1.0"
65+
__version__ = "1.2.0"
6566
stderr.print(
6667
"\n" "[grey39] RELECOV-tools version {}".format(__version__), highlight=False
6768
)
@@ -476,12 +477,12 @@ def metadata_homogeneizer(institution, directory, output):
476477
help="select the template config file",
477478
)
478479
@click.option("-o", "--output", type=click.Path(), help="select output folder")
479-
def launch_pipeline(input, template, output, config):
480+
def pipeline_manager(input, template, output, config):
480481
"""
481482
Create the symbolic links for the samples which are validated to prepare for
482483
bioinformatics pipeline execution.
483484
"""
484-
new_launch = relecov_tools.pipeline_manager.LaunchPipeline(
485+
new_launch = relecov_tools.pipeline_manager.PipelineManager(
485486
input, template, output, config
486487
)
487488
new_launch.pipeline_exc()
@@ -565,22 +566,23 @@ def logs_to_excel(lab_code, output_folder, files):
565566
logsum = relecov_tools.log_summary.LogSum(output_location=output_folder)
566567
merged_logs = logsum.merge_logs(key_name=lab_code, logs_list=all_logs)
567568
final_logs = logsum.prepare_final_logs(logs=merged_logs)
568-
logsum.create_logs_excel(logs=final_logs)
569+
excel_outpath = os.path.join(output_folder, lab_code + "_logs_report.xlsx")
570+
logsum.create_logs_excel(logs=final_logs, excel_outpath=excel_outpath)
569571

570572

571573
@relecov_tools_cli.command(help_priority=16)
572574
@click.option(
573575
"-c",
574576
"--config_file",
575577
type=click.Path(),
576-
help="Path to config file in yaml format",
578+
help="Path to config file in yaml format [required]",
577579
required=True,
578580
)
579581
@click.option(
580582
"-o",
581583
"--output_folder",
582584
type=click.Path(),
583-
help="Path to the base schema file. This file is used as a reference to compare it with the schema generated using this module. (Default: installed schema in 'relecov-tools/relecov_tools/schema/relecov_schema.json')",
585+
help="Path to folder where global results are saved [required]",
584586
required=False,
585587
)
586588
def wrapper(config_file, output_folder):
Binary file not shown.
