Sync main from soda-core #9

Open: wants to merge 112 commits into base: main
112 commits
5164d34
Catch exceptions while building results file (#1936)
m1n0 Sep 13, 2023
71dfe19
[pre-commit.ci] pre-commit autoupdate (#1935)
pre-commit-ci[bot] Sep 14, 2023
8fa452e
Reference check: support must NOT exist (#1937)
m1n0 Sep 18, 2023
995b4ac
Bump to 3.0.49
m1n0 Sep 19, 2023
67597f2
Add thresholds and diagnostics to scan result (#1939)
m1n0 Sep 21, 2023
8e74d93
Fix databricks numeric types profiling (#1941)
m1n0 Sep 27, 2023
67111aa
Bump to 3.0.50
m1n0 Sep 27, 2023
f743fc7
Allow to specify virtual file name for add sodacl string (#1943)
m1n0 Oct 2, 2023
3fdac3c
Feature/add more file formats for duckdb (#1942)
PaoloLeonard Oct 6, 2023
b34e271
added BigQuery Job Labels (#1947)
m1n0 Oct 10, 2023
d25316f
Bump to 3.0.51
m1n0 Oct 11, 2023
2f67adb
Distribution: compute value counts in DB rather than in python
baturayo Oct 13, 2023
fe27fc3
Fix 3.8 compatibility
m1n0 Oct 17, 2023
431a0ee
feat: Add Dask/Pandas configurable data source naming support (#1951)
dirkgroenen Oct 25, 2023
5312c43
Bump to 3.0.52
dirkgroenen Oct 25, 2023
f6505f0
Freshness: support mixed thresholds (#1957)
m1n0 Oct 31, 2023
7affe19
Add License to every package (#1958)
m1n0 Nov 1, 2023
b3c112e
Bump to 3.0.53
m1n0 Nov 1, 2023
2c9cde9
Failed rows check: support thresholds (#1960)
m1n0 Nov 3, 2023
59191bf
Updated install doc to include MotherDuck support via DuckDB (#1963)
janet-can Nov 7, 2023
c7182b1
remove % from pattern (#1956)
chuwangBA Nov 9, 2023
7505aa3
Sqlserver: support quoting tables with brackets, "quote_tables" mode …
m1n0 Nov 14, 2023
644546d
Bump to 3.0.54
m1n0 Nov 14, 2023
5f268b8
Contracts
tombaeyens Nov 15, 2023
6ffddd9
Fix check source payload (#1966)
m1n0 Nov 15, 2023
2a142e7
Bump to 3.1.0
m1n0 Nov 16, 2023
3f8fcc7
Update python api docs (#1967)
m1n0 Nov 16, 2023
88640a9
Make custom identity fixed as v4 (#1968)
m1n0 Nov 20, 2023
09c00a2
Freshness: support in-check filters (#1970)
m1n0 Dec 1, 2023
ae8d325
Bump to 3.1.1
m1n0 Dec 2, 2023
8249949
Adding support for authentication via a chained list of delegate acco…
nathadfield Dec 15, 2023
17c67cf
fix anomaly detection frequency aggregation bug (#1975)
baturayo Dec 15, 2023
46206eb
upgrade pydantic from v1 to v2 (#1974)
baturayo Dec 15, 2023
cb950c9
[pre-commit.ci] pre-commit autoupdate (#1938)
pre-commit-ci[bot] Dec 15, 2023
b7103e1
Bump to 3.1.2
m1n0 Dec 15, 2023
e80f118
feat: implement warn_only for anomaly score (#156) (#1980)
baturayo Dec 27, 2023
3c05346
Bump to 3.1.3
m1n0 Jan 3, 2024
1a44ce0
Dbt: improve parsing logs (#1981)
m1n0 Jan 4, 2024
2bde90c
Sampler: fix link href (#1983)
m1n0 Jan 5, 2024
c3c9521
Document group by example for Soda Core with failed rows check (#1984)
janet-can Jan 5, 2024
45a5a74
Schema check: support custom identity (#1988)
m1n0 Jan 16, 2024
34d65af
Add semver release with major, minor, latest (#1993)
dirkgroenen Jan 23, 2024
036204b
bug: handle null values for continuous dist (#165) (#1994)
baturayo Jan 23, 2024
55b85f5
[pre-commit.ci] pre-commit autoupdate (#1977)
pre-commit-ci[bot] Jan 23, 2024
ceab226
feat: implement new anomaly detection in soda core (#1995)
baturayo Jan 24, 2024
9445d1e
feat: support built-in prophet public holidays (#1997)
baturayo Jan 24, 2024
64bc338
Bump to 3.1.4
m1n0 Jan 24, 2024
b6f4329
Hive data source improvements (#1982)
robertomorandeira Jan 24, 2024
79b513a
feat: implement migrate from anomaly score check config (#168) (#1998)
baturayo Jan 25, 2024
311f1f2
Bump Prophet (#2000)
m1n0 Jan 25, 2024
89da879
Tests: use approx comparison for floats (#1999)
m1n0 Jan 25, 2024
8e0ae62
hive: add configuration parameters (#36)
vijaykiran Jul 3, 2023
2d00558
Bump to 3.1.5
m1n0 Jan 26, 2024
594d026
feat: implement severity level paramaters (#2001)
baturayo Jan 29, 2024
339309f
Always use datasource specifis COUNT expression (#2003)
m1n0 Jan 29, 2024
51a30fb
fix: anomaly detection feedbacks (#2005)
baturayo Jan 31, 2024
70b8753
[pre-commit.ci] pre-commit autoupdate (#2002)
pre-commit-ci[bot] Feb 2, 2024
1d2e8ac
feat: anomaly detection simulator (#163) (#2010)
baturayo Feb 6, 2024
e172b7d
feat: added dremio token support (#2009)
JorisTruong Feb 7, 2024
fc8e191
Bump to 3.2.0
m1n0 Feb 8, 2024
68d44b3
feat: correctly identified anomalies are excluded from training data …
baturayo Feb 9, 2024
1a211f5
fix: show more clearly the detected frequency using warning message f…
baturayo Feb 9, 2024
16ea0b9
Fix simulator import and streamlit path (#2017)
m1n0 Feb 12, 2024
a02f463
[pre-commit.ci] pre-commit autoupdate (#2016)
pre-commit-ci[bot] Feb 13, 2024
2c3ce9d
Update oracle_data_source.py (#2012)
vinod901 Feb 13, 2024
eb2abf9
Oracle: cast config to str/int to prevent oracledb errors (#2018)
m1n0 Feb 13, 2024
dd63d9e
Bump to 3.2.1
m1n0 Feb 13, 2024
ea5831e
Fix assets folder (#2020)
m1n0 Feb 14, 2024
f47801c
fix timezone issue and log messages (#188) (#2023)
baturayo Feb 21, 2024
fe70d82
feat: in anomaly detection simulator use soda core historic check res…
baturayo Feb 28, 2024
7d2ed7b
Update dask-sql (#2026)
m1n0 Feb 29, 2024
f07eba9
Add dask-sql version comment
m1n0 Feb 29, 2024
97c3545
Bump to 3.2.2
m1n0 Feb 29, 2024
6245a4c
feat: implement daily and monthly seasonality to external regressor ……
baturayo Feb 29, 2024
b62550e
Dremio: fix token support (#2028)
m1n0 Mar 6, 2024
8179c50
Bump to 3.2.3
m1n0 Mar 6, 2024
8e41a2c
[pre-commit.ci] pre-commit autoupdate (#2022)
pre-commit-ci[bot] Mar 11, 2024
91dd60f
bugfix: support attributes on multiple checks (#2032)
milanaleksic Mar 12, 2024
e3787d1
Use dbt's new access_url pattern to access cloud API (#2035)
bastienboutonnet Mar 14, 2024
c25a872
Bump to 3.2.4
m1n0 Mar 16, 2024
98c52ce
Contracts 2nd iteration (#2006)
tombaeyens Mar 16, 2024
bd04e84
Bump to 3.3.0
m1n0 Mar 16, 2024
a1a2008
feat: improved wording and tooltip formatting in simulator (#2038)
bastienboutonnet Mar 19, 2024
c20eb59
Failed rows: fix warn/fail thresholds (#2042)
m1n0 Mar 22, 2024
de1d4b4
Bump opentelemetry to 1.22 (#2043)
m1n0 Mar 22, 2024
d4b8183
Bump dev requirements (#2045)
m1n0 Mar 23, 2024
ae33e9f
Bump to 3.3.1
m1n0 Mar 24, 2024
aee8045
Rename argument in set_scan_results_file method (#2047)
ozgenbaris1 Apr 9, 2024
2e40e45
Dremio: support disableCertificateVerification option (#2049)
m1n0 Apr 9, 2024
9e95906
[pre-commit.ci] pre-commit autoupdate (#2037)
pre-commit-ci[bot] Apr 16, 2024
1d21a34
Denodo: fix connection timeout attribute (#2065)
m1n0 Apr 23, 2024
34ace6a
Update db2_data_source.py (#2063)
4rahulae Apr 23, 2024
c046af0
Bump to 3.3.2
m1n0 Apr 24, 2024
76159ca
Update autoflake precommit (#2070)
m1n0 Apr 30, 2024
062b1e2
Contracts v3 (#2067)
tombaeyens Apr 30, 2024
5e51e69
Bump to 3.3.3
tombaeyens Apr 30, 2024
31b1ab3
Fix automated monitoring, prevent duplicate queries (#2075)
m1n0 May 3, 2024
cc02c01
Hive: support scheme (#2077)
m1n0 May 7, 2024
63c73f8
Bump dev requirements (#2078)
m1n0 May 7, 2024
7866d27
Bump deps (#2079)
m1n0 May 7, 2024
8a1ce04
Bump to 3.3.4
m1n0 May 7, 2024
1819347
Failed rows: fix warn/fail thresholds for fail condition (#2084)
m1n0 May 16, 2024
09262b0
upgrade to latest version of ibm-db python client (#2076)
Antoninj May 17, 2024
5d1163c
User defined metric fail query (#2089)
m1n0 May 23, 2024
b014718
Bump to 3.3.5
m1n0 May 23, 2024
4e09b27
CLOUD-7708 - Add Snowflake CI account to pipeline for soda-core (#2088)
dakue-soda May 27, 2024
5776b5e
[CLOUD-7400] Improve memory usage (#2081)
dirkgroenen May 29, 2024
c3dc141
lower pre-commit version to support py38
dirkgroenen May 30, 2024
7e631d5
Duplicate check: fail gracefully in case of error in query (#2093)
m1n0 Jun 5, 2024
552a716
Bump requests and tox/docker (#2094)
m1n0 Jun 5, 2024
af649b9
Duplicate check: support sample exclude columns fully (#2096)
m1n0 Jun 7, 2024
a94bd47
Merge remote-tracking branch 'upstream/main'
bichitra95 Jun 16, 2024
31 changes: 12 additions & 19 deletions .github/workflows/build-docker.yml
@@ -8,35 +8,30 @@ jobs:
   docker:
     runs-on: ubuntu-latest
     steps:
-      -
-        name: check if a version tag
+      - name: check if a version tag
         id: check-version-tag
         run: |
           if [[ ${{ github.event.client_payload.tag }} =~ ^v[0-9]+\.[0-9]+\.[0-9]+$ ]]; then
             echo ::set-output name=match::true
           fi
-      -
-        name: Sleep for 900s
+      - name: Sleep for 900s
         if: steps.check-version-tag.outputs.match == 'true'
         uses: juliangruber/sleep-action@v1
         with:
           time: 900s
-      -
-        name: check if a version tag in ref
+      - name: check if a version tag in ref
         if: steps.check-version-tag.outputs.match == 'true'
         id: get-version-tag-in-ref
         run: |
           if [[ ${{ github.event.client_payload.ref }} =~ ^refs/tags/v[0-9]+\.[0-9]+\.[0-9]+$ ]]; then
             echo ::set-output name=versiontag::$(echo "${{github.event.client_payload.ref}}" | cut -d / -f 3)
           fi
-      -
-        name: Checkout
+      - name: Checkout
         if: github.event.client_payload.tag == steps.get-version-tag-in-ref.outputs.versiontag
         uses: actions/checkout@v3
         with:
           ref: ${{ github.event.client_payload.ref }}
-      -
-        name: Docker meta
+      - name: Docker meta
         if: github.event.client_payload.tag == steps.get-version-tag-in-ref.outputs.versiontag
         id: meta
         uses: docker/metadata-action@v4
@@ -45,27 +40,25 @@ jobs:
             sodadata/soda-core
           tags: |
             type=raw,value=${{ github.event.client_payload.tag }}
-      -
-        name: Set up QEMU
+            type=semver,pattern=v{{major}}.{{minor}},value=${{ github.event.client_payload.tag }}
+            type=semver,pattern=v{{major}},value=${{ github.event.client_payload.tag }}
+      - name: Set up QEMU
         if: github.event.client_payload.tag == steps.get-version-tag-in-ref.outputs.versiontag
         uses: docker/setup-qemu-action@v2
-      -
-        name: Set up Docker Buildx
+      - name: Set up Docker Buildx
         if: github.event.client_payload.tag == steps.get-version-tag-in-ref.outputs.versiontag
         uses: docker/setup-buildx-action@v2
-      -
-        name: Login to DockerHub
+      - name: Login to DockerHub
         if: github.event.client_payload.tag == steps.get-version-tag-in-ref.outputs.versiontag
         uses: docker/login-action@v2
         with:
           username: ${{ secrets.DOCKERHUB_USERNAME }}
           password: ${{ secrets.DOCKERHUB_TOKEN }}
-      -
-        name: Build and push
+      - name: Build and push
         if: github.event.client_payload.tag == steps.get-version-tag-in-ref.outputs.versiontag
         uses: docker/build-push-action@v3
         with:
           context: .
           push: ${{ github.event_name != 'pull_request' }}
           tags: ${{ steps.meta.outputs.tags }}
-          labels: ${{ steps.meta.outputs.labels }}
+          labels: ${{ steps.meta.outputs.labels }}
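The tag handling in this workflow can be tried locally with a small bash sketch. The two regular expressions and the `cut -d / -f 3` call are copied from the workflow steps; the function names are hypothetical, and `floating_tags` only approximates what the two new `type=semver` patterns are expected to produce in `docker/metadata-action` (an assumption, not taken from the action's source).

```shell
#!/usr/bin/env bash
# Hypothetical helpers; only the regexes and the cut call come from the workflow.

# Mirrors the "check if a version tag" step: accept only vMAJOR.MINOR.PATCH.
is_version_tag() {
  [[ "$1" =~ ^v[0-9]+\.[0-9]+\.[0-9]+$ ]]
}

# Mirrors the "check if a version tag in ref" step:
# refs/tags/v3.3.5 -> v3.3.5 (third "/"-separated field), empty otherwise.
tag_from_ref() {
  if [[ "$1" =~ ^refs/tags/v[0-9]+\.[0-9]+\.[0-9]+$ ]]; then
    echo "$1" | cut -d / -f 3
  fi
}

# Assumed effect of the new patterns v{{major}}.{{minor}} and v{{major}}.
floating_tags() {
  local version="${1#v}"        # strip the leading "v"
  local major="${version%%.*}"  # text before the first dot
  local rest="${version#*.}"    # text after the first dot
  local minor="${rest%%.*}"     # text before the next dot
  echo "v${major}.${minor}"
  echo "v${major}"
}

floating_tags v3.3.5   # prints v3.3, then v3
```

If that assumption holds, a release tagged `v3.3.5` would also be published under the floating tags `v3.3` and `v3`, alongside the raw tag.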
40 changes: 35 additions & 5 deletions .github/workflows/main.workflow.yml
@@ -17,6 +17,8 @@ jobs:
     steps:
       - uses: actions/checkout@v3
       - uses: actions/setup-python@v3
+        with:
+          python-version: '3.11.x'
       - uses: pre-commit/[email protected]
         with:
           extra_args: --all-files
@@ -49,11 +51,10 @@ jobs:
     env:
       DATA_SOURCE: ${{ matrix.data-source }}
       PYTHON_VERSION: ${{ matrix.python-version }}
-      SNOWFLAKE_HOST: ${{ secrets.SNOWFLAKE_HOST }}
-      SNOWFLAKE_ACCOUNT: ${{ secrets.SNOWFLAKE_ACCOUNT }}
-      SNOWFLAKE_USERNAME: ${{ secrets.SNOWFLAKE_USERNAME }}
-      SNOWFLAKE_PASSWORD: ${{ secrets.SNOWFLAKE_PASSWORD }}
-      SNOWFLAKE_DATABASE: ${{ secrets.SNOWFLAKE_DATABASE }}
+      SNOWFLAKE_ACCOUNT: ${{ secrets.SNOWFLAKE_CI_ACCOUNT }}
+      SNOWFLAKE_USERNAME: ${{ secrets.SNOWFLAKE_CI_USERNAME }}
+      SNOWFLAKE_PASSWORD: ${{ secrets.SNOWFLAKE_CI_PASSWORD }}
+      SNOWFLAKE_DATABASE: ${{ secrets.SNOWFLAKE_CI_DATABASE }}
       SNOWFLAKE_SCHEMA: "public"
       BIGQUERY_ACCOUNT_INFO_JSON: ${{ secrets.BIGQUERY_ACCOUNT_INFO_JSON }}
       BIGQUERY_DATASET: "test"
@@ -169,6 +170,35 @@ jobs:
       - name: Test with tox
         run: |
           tox -- soda -k soda/scientific
+  test-contracts:
+    runs-on: ubuntu-latest
+    strategy:
+      fail-fast: false
+      matrix:
+        python-version:
+          - "3.9"
+
+    env:
+      PYTHON_VERSION: ${{ matrix.python-version }}
+    steps:
+      - uses: actions/checkout@v3
+
+      - name: Set up Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v4
+        with:
+          python-version: ${{ matrix.python-version }}
+
+      - name: Install dependencies
+        run: |
+          sudo apt-get update
+          sudo apt-get install -y libsasl2-dev
+          python -m pip install --upgrade pip
+          cat dev-requirements.in | grep tox | xargs pip install
+
+      - name: Test with tox
+        run: |
+          tox -- soda -k soda/contracts
+
   publish-pypi:
     name: Build & Publish Package
     if: contains(github.ref, 'refs/tags/')
51 changes: 40 additions & 11 deletions .github/workflows/pr.workflow.yml
@@ -13,6 +13,8 @@ jobs:
     steps:
       - uses: actions/checkout@v3
       - uses: actions/setup-python@v3
+        with:
+          python-version: '3.11.x'
       - uses: pre-commit/[email protected]
         with:
           extra_args: --all-files
@@ -35,15 +37,13 @@ jobs:
           - "duckdb"
           - "dask"

-
     env:
       DATA_SOURCE: ${{ matrix.data-source }}
       PYTHON_VERSION: ${{ matrix.python-version }}
-      SNOWFLAKE_HOST: ${{ secrets.SNOWFLAKE_HOST }}
-      SNOWFLAKE_ACCOUNT: ${{ secrets.SNOWFLAKE_ACCOUNT }}
-      SNOWFLAKE_USERNAME: ${{ secrets.SNOWFLAKE_USERNAME }}
-      SNOWFLAKE_PASSWORD: ${{ secrets.SNOWFLAKE_PASSWORD }}
-      SNOWFLAKE_DATABASE: ${{ secrets.SNOWFLAKE_DATABASE }}
+      SNOWFLAKE_ACCOUNT: ${{ secrets.SNOWFLAKE_CI_ACCOUNT }}
+      SNOWFLAKE_USERNAME: ${{ secrets.SNOWFLAKE_CI_USERNAME }}
+      SNOWFLAKE_PASSWORD: ${{ secrets.SNOWFLAKE_CI_PASSWORD }}
+      SNOWFLAKE_DATABASE: ${{ secrets.SNOWFLAKE_CI_DATABASE }}
       SNOWFLAKE_SCHEMA: "public"
       BIGQUERY_ACCOUNT_INFO_JSON: ${{ secrets.BIGQUERY_ACCOUNT_INFO_JSON }}
       BIGQUERY_DATASET: "test"
@@ -61,7 +61,7 @@ jobs:
MYSQL_PASSWORD: sodacore
MYSQL_ROOT_PASSWORD: sodacore
SPARK_DF_HOST: ${{ secrets.SPARK_DF_HOST }}

steps:
- uses: actions/checkout@v3

@@ -81,8 +81,8 @@ jobs:

       - name: Test with tox
         run: |
-          tox --exit-and-dump-after 3600 -- soda -k soda/core
-          tox --exit-and-dump-after 3600 -- soda -k soda/${{ matrix.data-source }}
+          tox -- soda -k soda/core
+          tox -- soda -k soda/${{ matrix.data-source }}
         env:
           test_data_source: ${{ matrix.data-source }}

@@ -113,7 +113,7 @@ jobs:

       - name: Test with tox
         run: |
-          tox --exit-and-dump-after 3600 -- soda -k soda/core
+          tox -- soda -k soda/core
         env:
           test_data_source: postgres
           WESTMALLE: BETTER_THAN_LA_TRAPPE
@@ -145,4 +145,33 @@ jobs:

       - name: Test with tox
         run: |
-          tox --exit-and-dump-after 3600 -- soda -k soda/scientific
+          tox -- soda -k soda/scientific
+
+  test-contracts:
+    runs-on: ubuntu-latest
+    strategy:
+      fail-fast: false
+      matrix:
+        python-version:
+          - "3.9"
+
+    env:
+      PYTHON_VERSION: ${{ matrix.python-version }}
+    steps:
+      - uses: actions/checkout@v3
+
+      - name: Set up Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v4
+        with:
+          python-version: ${{ matrix.python-version }}
+
+      - name: Install dependencies
+        run: |
+          sudo apt-get update
+          sudo apt-get install -y libsasl2-dev
+          python -m pip install --upgrade pip
+          cat dev-requirements.in | grep tox | xargs pip install
+
+      - name: Test with tox
+        run: |
+          tox -- soda -k soda/contracts
15 changes: 8 additions & 7 deletions .pre-commit-config.yaml
@@ -4,7 +4,7 @@ files: ^soda/
 exclude: antlr/
 repos:
   - repo: https://github.com/pre-commit/pre-commit-hooks
-    rev: v4.4.0
+    rev: v4.6.0
     hooks:
       - id: trailing-whitespace
       - id: check-added-large-files
@@ -18,24 +18,25 @@ repos:
       - id: debug-statements
       - id: detect-private-key
       - id: end-of-file-fixer
-  - repo: https://github.com/humitos/mirrors-autoflake.git
-    rev: v1.1
+  - repo: https://github.com/PyCQA/autoflake
+    rev: v2.2.1
     hooks:
       - id: autoflake
         args: ["--in-place", "--remove-all-unused-imports"]
   - repo: https://github.com/asottile/pyupgrade
-    rev: v3.10.1
+    rev: v3.15.2
     hooks:
       - id: pyupgrade
-        args: [--py37-plus]
+        exclude: _models?\.py$
+        args: [--py38-plus, --keep-runtime-typing]
   - repo: https://github.com/PyCQA/isort
-    rev: 5.12.0
+    rev: 5.13.2
     hooks:
       - id: isort
         additional_dependencies: [toml]
         name: Sort imports using isort
   - repo: https://github.com/psf/black
-    rev: 23.7.0
+    rev: 24.4.0
     hooks:
       - id: black
         name: Run black formatter
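Judging by the hunk's line counts (24 old lines, 25 new), the `exclude: _models?\.py$` line on the pyupgrade hook is an addition. pre-commit treats `exclude` as a regular expression searched against each file path, so the bash sketch below approximates which files the hook would now skip. The function name and the sample paths are hypothetical, and bash's extended regex stands in for Python's `re.search`, which behaves the same for this simple pattern.

```shell
#!/usr/bin/env bash
# Hypothetical helper: returns success when a path matches the exclude pattern.
skipped_by_pyupgrade() {
  [[ "$1" =~ _models?\.py$ ]]
}

# Illustrative paths only; they are not taken from the repository.
for path in soda/foo_model.py soda/foo_models.py soda/models.py soda/foo_model_utils.py; do
  if skipped_by_pyupgrade "$path"; then
    echo "skip  $path"
  else
    echo "check $path"
  fi
done
```

Only file names ending in `_model.py` or `_models.py` are skipped; a plain `models.py` has no leading underscore before `model` and is still processed.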
6 changes: 6 additions & 0 deletions .streamlit/config.toml
@@ -0,0 +1,6 @@
+[theme]
+primaryColor = "#00D891" # Primary color
+backgroundColor = "#F5F7F7" # Background color
+# secondaryBackgroundColor = "#00D891" # Color for the sidebar and other secondary backgrounds
+textColor = "#262730" # Primary text color
+font = "sans serif" # Font style (e.g., "sans serif", "serif", "monospace")
12 changes: 8 additions & 4 deletions dev-requirements.in
@@ -1,16 +1,20 @@
-pip-tools~=6.5
+pip-tools~=7.3
 pytest~=7.0
 python-dotenv~=1.0
-tox~=4.6
-tox-docker~=4.1
+tox~=4.12
+tox-docker~=5.0
 pytest-html~=3.1
 pytest-cov~=3.0
 faker~=13.3
-tbump~=6.7
+tbump~=6.11
 black==22.6.0
 typing_extensions>=4.3.0,<5
 urllib3~=1.26
 pygments~=2.11
 readme-renderer~=32.0
 certifi>=2022.12.07
 wheel>=0.38.1
+docutils<0.21 # 0.21 dropped py38 support, remove this after py38 support is gone
+pre-commit<3.6 # 3.6 dropped py38, remove this after py38 support is gone
+requests>=2.32.3
+
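The CI jobs bootstrap tox with `cat dev-requirements.in | grep tox | xargs pip install`, so only the requirement lines containing `tox` are installed before the test run. A minimal bash sketch of that selection, using an abbreviated copy of the new file contents (the subset chosen here is illustrative):

```shell
#!/usr/bin/env bash
# Abbreviated copy of the updated dev-requirements.in (illustrative subset).
requirements=$(cat <<'EOF'
pip-tools~=7.3
pytest~=7.0
tox~=4.12
tox-docker~=5.0
pre-commit<3.6
requests>=2.32.3
EOF
)

# Same filter as the workflow step, minus the actual pip install.
selected=$(printf '%s\n' "$requirements" | grep tox)
printf '%s\n' "$selected"
```

With the versions in this diff, the filter selects `tox~=4.12` and `tox-docker~=5.0`, which is what the workflow then hands to `pip install`.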