Merge pull request #1 from analysiscenter/release
Initial public release
roman-kh authored Nov 22, 2017
2 parents d654f72 + fd116eb commit 75d6689
Showing 69 changed files with 16,459 additions and 3 deletions.
6 changes: 6 additions & 0 deletions .gitattributes
@@ -0,0 +1,6 @@
# Set the default behavior, in case people don't have core.autocrlf set.
* text=auto

# Explicitly declare text files you want to always be normalized on checkout.
*.py text
*.sh text
6 changes: 6 additions & 0 deletions .gitignore
@@ -0,0 +1,6 @@
*.pyc
.cache/*
__pycache__
__pycache__/*
*/__pycache__/*
.ipynb_checkpoints
3 changes: 3 additions & 0 deletions .gitmodules
@@ -0,0 +1,3 @@
[submodule "cardio/dataset"]
path = cardio/dataset
url = https://github.com/analysiscenter/dataset.git
38 changes: 38 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,38 @@
- Before any operations with repositories, every user must configure their name and email address:
```bash
git config --global user.name "Firstname Lastnameov"
git config --global user.email [email protected]
```
The email **must match** an email listed in your GitHub account (an account may have several emails).

- The root directory of every repository must contain a README.md file with a brief description of the project and the source code structure, installation instructions, and links to the documentation.

- All substantive files should be placed in subdirectories; the root directory should hold only descriptive files (README.md, INSTALL.md, etc.),
installation files (setup.py, requirements.txt, etc.), and configuration files and makefiles.

- File names must contain only Latin letters. Spaces in file names are not allowed.

- Commits to the `master` branch are not allowed. It must be protected from deletion and from history rewriting
(Settings - Branches - Protected branches).

- Changes to source code and repository files should be made only within tasks (issues).
For each change, the assignee opens a separate branch named `<iTASK-ID>-<short branch name>` (for example, `i15-dataset` or `i22-HMM`).

- Several branches may be created in one repository for a single task.

- If you do not have a task, it makes sense to open one and record it explicitly in issues.

- Commits to working branches should be made regularly, so that each commit contains changes that are not too large
yet complete and independent of everything else in the repository
(it is better to commit 3 changed lines than 300 at once).

- A commit must have a one-line comment in English (20-60 characters long)
that reflects the source code and file changes it includes.

- A more detailed description of the changes should be kept in the HISTORY.md file in the root directory of the repository.

- After completing the task and finishing all changes, the assignee opens a pull request to merge the working branch into the production branch (for example, master).

- Before merging, the working branch must not be behind the production branch (this can be checked with `git status`). To ensure this, synchronize the working branch beforehand (`git pull`).

- After merging, the working branch is deleted.
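
The branch-naming and commit-message conventions above could be checked automatically. The following is only a minimal sketch: the helper names are hypothetical, the task ID is assumed to be numeric (as in `i15-dataset`), and "English" is approximated by requiring ASCII characters.

```python
import re

# Hypothetical pattern: "i" + numeric task ID + "-" + short branch name.
# The numeric task ID is an assumption based on the examples i15 and i22.
BRANCH_RE = re.compile(r"^i\d+-[A-Za-z0-9][A-Za-z0-9_-]*$")


def is_valid_branch(name):
    """Return True for names such as 'i15-dataset' or 'i22-HMM'."""
    return BRANCH_RE.match(name) is not None


def is_valid_commit_subject(message):
    """Return True for a one-line ASCII message of 20 to 60 characters.

    ASCII-only is used here as a rough proxy for 'English'.
    """
    one_line = "\n" not in message
    ascii_only = all(ord(ch) < 128 for ch in message)
    return one_line and ascii_only and 20 <= len(message) <= 60
```

Such checks could run in a local hook or a CI step; where to run them is left open here.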
4 changes: 2 additions & 2 deletions LICENSE
@@ -178,15 +178,15 @@
APPENDIX: How to apply the Apache License to your work.

To apply the Apache License to your work, attach the following
-boilerplate notice, with the fields enclosed by brackets "[]"
+boilerplate notice, with the fields enclosed by brackets "{}"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.

-Copyright [yyyy] [name of copyright owner]
+Copyright {yyyy} {name of copyright owner}

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
16 changes: 16 additions & 0 deletions MANIFEST.in
@@ -0,0 +1,16 @@
include MANIFEST.in
include LICENSE
include README.md
include setup.py

recursive-include cardio *
recursive-include docs *
recursive-include tutorials *
recursive-exclude docs/_build *

global-exclude *.pyc *.pyo *.pyd
global-exclude *.git
global-exclude *.so
global-exclude *~
global-exclude \#*
global-exclude .DS_Store
98 changes: 97 additions & 1 deletion README.md
@@ -1 +1,97 @@
-# cardio
+# CardIO

CardIO is a library for working with electrocardiograms (ECG). With CardIO you can:

* load and save signals in various formats
* resample, crop, filter and flip signals
* detect PQ, QT and QRS segments
* calculate heart rate and other standard ECG characteristics
* apply complex transformations like FFT and wavelets, or any other custom functions
* recognize heart diseases from ECG
* efficiently work with large datasets that do not even fit into memory
* easily arrange new custom actions into pipelines
* do end-to-end ECG processing
* build, train and test custom models for deep research

… and do everything under a single API.

For more details see [the documentation and tutorials](https://analysiscenter.github.io/cardio/).

## About CardIO

The library is based on [Dataset](https://github.com/analysiscenter/dataset/). We suggest reading Dataset's [documentation](https://analysiscenter.github.io/dataset/) to learn more.

CardIO has three modules: [```batch```](https://analysiscenter.github.io/cardio/intro/batch.html), [```models```](https://analysiscenter.github.io/cardio/intro/models.html) and [```pipelines```](https://analysiscenter.github.io/cardio/intro/pipeline.html).

Module ```batch``` contains low-level actions for ECG processing.
Actions are included in the ```EcgBatch``` class, which also defines how
ECGs are stored. From these actions you can build new pipelines. You can also
write a custom action and include it in ```EcgBatch```.

In ```models``` we provide several models that were developed to address the most important problems in ECG analysis:
* how to recognize specific features of ECG like R-peaks, P-wave, T-wave
* how to recognize heart diseases from ECG, for example, atrial fibrillation.

Module ```pipelines``` contains high-level methods that
* train a model to detect PQ, QT and QRS segments
* calculate heart rate
* train a model to find probabilities of heart diseases.

Under the hood these methods contain many actions that load signals, filter them and do complex calculations. With pipelines you do not have to think about this part of the work: you simply pass ECG datasets and get results.
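
To illustrate one of the quantities mentioned above, heart rate can be estimated from the positions of R-peaks. The sketch below is not CardIO's implementation: the function name `heart_rate_bpm` and its arguments are hypothetical, and it assumes the R-peak sample indices have already been detected.

```python
def heart_rate_bpm(r_peaks, fs):
    """Estimate the mean heart rate in beats per minute.

    r_peaks: sorted sample indices of detected R-peaks (assumed given).
    fs: sampling rate of the signal in Hz.
    """
    if len(r_peaks) < 2:
        raise ValueError("need at least two R-peaks")
    # RR intervals in seconds between consecutive R-peaks.
    rr = [(b - a) / fs for a, b in zip(r_peaks, r_peaks[1:])]
    mean_rr = sum(rr) / len(rr)
    # One beat per mean RR interval, converted to beats per minute.
    return 60.0 / mean_rr
```

For example, peaks that are 300 samples apart in a signal sampled at 300 Hz correspond to an RR interval of 1 s, that is, 60 beats per minute.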

## Basic usage

Here is an example of a pipeline that loads ECG signals, does some preprocessing and trains a model for 50 epochs.
```python
# Assumes dtst (an ECG dataset), model_config, make_data, LABELS_PATH,
# DirichletModel and V are defined or imported beforehand.
train_ppl = (
    dtst.train
        .pipeline
        .init_model("dynamic", DirichletModel, name="dirichlet",
                    config=model_config)
        .init_variable("loss_history", init=list)
        # Load signals with metadata, then labels from a CSV file.
        .load(components=["signal", "meta"], fmt="wfdb")
        .load(components="target", fmt="csv", src=LABELS_PATH)
        .drop_labels(["~"])
        # Merge the "N" and "O" labels into a single "NO" class.
        .replace_labels({"N": "NO", "O": "NO"})
        .flip_signals()
        .random_resample_signals("normal", loc=300, scale=10)
        .random_split_signals(2048, {"A": 9, "NO": 3})
        .binarize_labels()
        # Train on each batch and append the loss to "loss_history".
        .train_model("dirichlet", make_data=make_data, fetches="loss",
                     save_to=V("loss_history"), mode="a")
        .run(batch_size=100, shuffle=True, drop_last=True, n_epochs=50)
)
```

As a result of this pipeline one obtains a trained model.
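
The chained style of the example can be pictured with a toy model. The sketch below is not how Dataset or CardIO implements pipelines; `ToyPipeline`, `flip` and `scale` are hypothetical names used only to show actions being recorded first and applied batch by batch when `run` is called.

```python
class ToyPipeline:
    """Records actions lazily and applies them to each batch on run()."""

    def __init__(self, data):
        self.data = list(data)
        self.actions = []

    def add(self, func, *args, **kwargs):
        # Store the action; nothing is executed yet.
        self.actions.append((func, args, kwargs))
        return self  # returning self enables method chaining

    def run(self, batch_size):
        results = []
        for start in range(0, len(self.data), batch_size):
            batch = self.data[start:start + batch_size]
            # Apply the recorded actions in the order they were added.
            for func, args, kwargs in self.actions:
                batch = func(batch, *args, **kwargs)
            results.append(batch)
        return results


def flip(batch):
    """Invert the sign of every value in the batch."""
    return [-x for x in batch]


def scale(batch, factor):
    """Multiply every value in the batch by a constant factor."""
    return [x * factor for x in batch]


out = (
    ToyPipeline([1, 2, 3, 4, 5])
    .add(flip)
    .add(scale, factor=2)
    .run(batch_size=2)
)
```

Because only one batch is processed at a time, the same pattern extends to data that is read from disk in chunks rather than held in a list.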

## Installation

> The `CardIO` module is in beta. Your suggestions and improvements are very welcome.

> `CardIO` supports Python 3.5 or higher.

### Installation as a Python package

With [pipenv](https://docs.pipenv.org/):

    pipenv install git+https://github.com/analysiscenter/cardio.git#egg=cardio

With [pip](https://pip.pypa.io/en/stable/):

    pip3 install git+https://github.com/analysiscenter/cardio.git

After that just import `cardio`:
```python
import cardio
```

### Installation as a project repository

    git clone --recursive https://github.com/analysiscenter/ecg.git

The `--recursive` flag also clones the submodules.

## Citing CardIO
Please cite CardIO in your publications if it helps your research.

Khudorozhkov R., Illarionov E., Kuvaev A., Podvyaznikov D. CardIO library for data science research of heart signals. 2017.
8 changes: 8 additions & 0 deletions cardio/__init__.py
@@ -0,0 +1,8 @@
""" ECG package """
import sys

from .batch import * # pylint: disable=wildcard-import
from . import dataset # pylint: disable=wildcard-import


__version__ = '0.1.0'
3 changes: 3 additions & 0 deletions cardio/batch/__init__.py
@@ -0,0 +1,3 @@
""" ECG Batch """
from .ecg_batch import EcgBatch
from .ecg_dataset import EcgDataset