This project demonstrates a clean, production-style dbt setup modeling a simple e-commerce system.
It follows best practices for modular, environment-agnostic dbt projects, including raw source management, staging, marts, tests, and documentation.
This tutorial walks through building a simple, production-quality modern data platform:
- Ingest raw data
- Transform it using dbt
- Analyze with business marts
- Visualize key metrics
You will need:
- Google Cloud account with BigQuery enabled (free tier is sufficient)
- dbt CLI installed locally (or optionally a dbt Cloud account)
- Tableau Public account (optional for visualization)
Recommended tools:
- Git
- VS Code (or another code editor)
The project is organized into three main layers:
- Raw sources: simulated
customers
andorders
data (indev_raw
dataset) - Staging models: light cleaning and standardization
- Mart models: business-ready metrics including Customer Lifetime Value and Monthly Revenue Trends
- Logical project structure:
models/staging/
,models/marts/
- Raw data sources declared using
source()
- Staging models (
stg_*
) clean and standardized - Business models (
dim_*
,fact_*
) built withref()
- Tests (
unique
,not_null
) declared inschema.yml
- All models and columns documented
- Environment-agnostic: no hardcoded project/schema names
- Separate targets for seeds (
seed_raw
) and models (dev
) - Clean local runs:
dbt run
,dbt test
,dbt docs generate
- Secure: No credentials or secrets in the repo
models/
├── staging/
│ ├── schema.yml
│ ├── stg_customers.sql
│ └── stg_orders.sql
├── marts/
│ ├── schema.yml
│ ├── dim_customers.sql
│ └── fact_orders.sql
seeds/
├── customers.csv
├── orders.csv
├── schema.yml
dbt_project.yml
README.md
deploy.sh
git clone https://github.com/tstrall/analytics.git
cd analytics
- Create a new dataset in BigQuery named
dev_raw
. - Create another dataset named
dbt_yourname
(e.g.,dbt_tstrall
) for dbt models. - Enable BigQuery Storage API if prompted.
Create or edit ~/.dbt/profiles.yml
with the following structure:
analytics:
outputs:
dev:
type: bigquery
method: service-account
project: your-gcp-project-id
dataset: dbt_yourname
keyfile: path/to/your/service_account.json
threads: 4
seed_raw:
type: bigquery
method: service-account
project: your-gcp-project-id
dataset: dev_raw
keyfile: path/to/your/service_account.json
threads: 4
target: dev
Replace:
your-gcp-project-id
with your GCP project IDdbt_yourname
with your dataset for modelspath/to/your/service_account.json
with your service account key path
Test your connection:
dbt debug
This project provides seed files containing dummy raw data. To load them into BigQuery:
dbt --target seed_raw seed
This loads tables into your dev_raw
dataset.
dbt --target seed_raw test --select path:seeds/
dbt run
dbt test --select path:models/
dbt docs generate
dbt docs serve
Access documentation locally at http://localhost:8080
.
You can deploy everything automatically:
./deploy.sh
This will:
- Seed raw data
- Test seeds
- Build models
- Test models
- Generate documentation
For a full step-by-step deployment guide, see DEPLOYMENT.md.
Note:
This Tableau dashboard reflects a snapshot of the dbt-transformed data as of the publish date.
Tableau Public dashboards do not maintain live BigQuery connections.
Problem | Solution |
---|---|
Dataset not found | Confirm your BigQuery dataset names match your profiles.yml |
dbt CLI can't find profile | Ensure your profiles.yml is located at ~/.dbt/profiles.yml |
Missing raw tables | Ensure you ran dbt --target seed_raw seed successfully |
Partial Parsing Warnings | Expected when switching targets; safe to ignore |
This project is built first for BigQuery. Future enhancements include:
- Snowflake version
- Automated ingestion with Fivetran
- Additional visualization dashboards
- dbt — Data transformation and modeling
- Google BigQuery — Cloud data warehouse
- Tableau — Data visualization
- GitHub — Source control and project management
Built by Ted Strall — strall.com
# Seed data into dev_raw
dbt --target seed_raw seed
# Test seeds
dbt --target seed_raw test --select path:seeds/
# Build models into dbt_tstrall
dbt run
# Test models
dbt test --select path:models/
# Generate docs
dbt docs generate
# Serve docs
dbt docs serve
Version v1.0.0 — Initial dbt + BigQuery + Tableau stack
- dbt transformations and tests
- Automated deploy script
- Tableau dashboard for Top Customers and Revenue Trends