[testing] add sdf to hooli #121

Draft
wants to merge 8 commits into
base: master
6 changes: 5 additions & 1 deletion Dockerfile
@@ -2,11 +2,15 @@ FROM python:3.12-slim

WORKDIR /opt/dagster/app

-RUN apt-get update && apt-get install -y git gcc
+RUN apt-get update && apt-get install -y git gcc curl

RUN apt install -y default-jre

RUN python -m pip install -U pip

+RUN curl -LSfs https://cdn.sdf.com/releases/download/install.sh | sh -s


# libcrypto fix oct 2023; should be able to remove sometime after that
RUN python -m pip uninstall oscrypto -y
RUN python -m pip install git+https://github.com/wbond/oscrypto.git@d5f3437ed24257895ae1edd9e503cfb352e635a8
2 changes: 1 addition & 1 deletion hooli-data-ingest/hooli_data_ingest/assets/sling.py
@@ -15,7 +15,7 @@ def __init__(self, target_prefix="RAW_DATA"):

    def get_group_name(self, stream_definition):
        return "RAW_DATA"

    def get_tags(self, stream_definition):
        # derive storage_kind from the target set in the replication_config
        storage_kind = self.replication_config.get("target", "DUCKDB")
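For readers skimming the `get_tags` hunk above: the lookup falls back to `"DUCKDB"` whenever the replication config has no `target` key. A minimal sketch of that fallback, assuming `replication_config` behaves like a plain dict; the helper name `storage_kind_for` and the sample configs are hypothetical and not part of this PR, and the hunk is cut off before it shows how `storage_kind` is used afterwards.

```python
# Hypothetical helper mirroring the lookup shown in get_tags;
# it assumes replication_config is a plain dict.
def storage_kind_for(replication_config: dict) -> str:
    # Use the configured target if present, otherwise default to "DUCKDB".
    return replication_config.get("target", "DUCKDB")


if __name__ == "__main__":
    # Made-up configs for illustration only.
    print(storage_kind_for({"target": "SNOWFLAKE"}))
    print(storage_kind_for({}))
```

Note that `dict.get` only applies the default when the key is absent; a config with an explicit `"target": None` would yield `None` rather than `"DUCKDB"`.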
2 changes: 1 addition & 1 deletion hooli-data-ingest/hooli_data_ingest/jobs/__init__.py
@@ -1,7 +1,7 @@
from dagster import AssetSelection, define_asset_job


raw_location_by_day = AssetSelection.keys(["RAW_DATA", "locations"])
#raw_location_by_day = AssetSelection.keys(["locations"])


daily_sling_job = define_asset_job(