
Commit

update the quick example
sh-rp committed May 26, 2024
1 parent 42067f0 commit 1aeddfa
Showing 1 changed file with 31 additions and 33 deletions.
64 changes: 31 additions & 33 deletions README.md
@@ -1,23 +1,12 @@
# dlt-init-openapi
`dlt-init-openapi` generates [`dlt`](https://dlthub.com/docs) pipelines from OpenAPI 3.x documents/specs using the [`dlt` `rest_api` `verified source`](https://dlthub.com/docs/dlt-ecosystem/verified-sources/rest_api). If you do not know `dlt` or our `verified sources`, please read:
# dlt openapi source generator (dlt-init-openapi)
`dlt-init-openapi` generates [`dlt`](https://dlthub.com/docs) data pipelines from OpenAPI 3.x specs using the [`dlt` `rest_api` `verified source`](https://dlthub.com/docs/dlt-ecosystem/verified-sources/rest_api) to extract data from any REST API. If you do not know `dlt` or our `verified sources`, please read:

* [Getting started](https://dlthub.com/docs/getting-started) to learn the `dlt` basics
* [dlt rest_api](https://dlthub.com/docs/dlt-ecosystem/verified-sources/rest_api) to learn how our `rest_api` source works

> This generator does not support OpenAPI 2.x (formerly known as Swagger). If you need to use an older document, try upgrading it to
version 3 first with one of the many available converters.


## Prior work
This project started as a fork of [openapi-python-client](https://github.com/openapi-generators/openapi-python-client). Pretty much all parts are heavily changed or completely replaced, but some lines of code still exist and we like to acknowledge the many good ideas we got from the original project :)


## Support
If you need support for this tool, [join our slack community](https://dlthub.com/community) and ask for help on the technical help channel. We're usually around to help you out or discuss features :)


## Features
The dlt-init-openapi generates code from an OpenAPI spec that you can use to extract data from a `rest_api` into any [`destination`](https://dlthub.com/docs/dlt-ecosystem/destinations/) (e.g. Postgres, BigQuery, Redshift...) `dlt` supports.
`dlt-init-openapi` generates code from an OpenAPI spec that you can use to extract data from a `rest_api` into any [`destination`](https://dlthub.com/docs/dlt-ecosystem/destinations/) (e.g. Postgres, BigQuery, Redshift...) `dlt` supports. `dlt-init-openapi` additionally executes a set of heuristics to discover information that is not explicitly defined in OpenAPI specs.

Features include

@@ -27,32 +16,26 @@ Features include
* **Payload JSON path [data selector](https://dlthub.com/docs/dlt-ecosystem/verified-sources/rest_api#data-selection) discovery** for results nested in the returned json
* **[Authentication](https://dlthub.com/docs/dlt-ecosystem/verified-sources/rest_api#authentication)** discovery for an API

## Setup

You will need Python 3.9 or higher installed, as well as pip.

```console
# 1. install this tool locally
$ pip install dlt-init-openapi
## Support
If you need support for this tool or `dlt`, please [join our Slack community](https://dlthub.com/community) and ask for help on the technical help channel. We're usually around to help you out or discuss features :)

# 2. Show the version of the installed package to verify it worked
$ dlt-init-openapi --version
```
## A quick example

## Basic Usage
You will need Python 3.9 or higher installed, as well as pip. You can run `pip install dlt-init-openapi` to install the current version.

Let's create an example pipeline from the [PokeAPI spec](https://raw.githubusercontent.com/cliffano/pokeapi-clients/ec9a2707ef2a85f41b747d8df013e272ef650ec5/specification/pokeapi.yml). You can point to any other OpenAPI Spec instead if you like.
We will create a simple example pipeline from a [PokeAPI spec](https://pokeapi.co/) in our repo. You can point to any other OpenAPI Spec instead if you like.

```console
# 1.a. Run the generator with a URL:
$ dlt-init-openapi pokemon --url https://raw.githubusercontent.com/cliffano/pokeapi-clients/ec9a2707ef2a85f41b747d8df013e272ef650ec5/specification/pokeapi.yml
$ dlt-init-openapi pokemon --path tests/cases/e2e_specs/pokeapi.yml --global-limit 2

# 1.b. If you have a local file, you can use the --path flag:
$ dlt-init-openapi pokemon --path ./my_specs/pokeapi.yml

# 2. You can now pick the endpoints you need from the popup
# 2. You can now pick both of the endpoints from the popup

# 3. After selecting your pokemon endpoints and hitting Enter, your pipeline will be rendered
# 3. After selecting your pokemon endpoints and hitting Enter, your pipeline will be rendered.

# 4. If your pipeline needs any kind of authentication (this example does not), open `.dlt/secrets.toml` and provide the credentials. You can find further settings in `.dlt/config.toml`.

@@ -68,19 +51,31 @@ $ pip install pandas streamlit
$ dlt pipeline pokemon_pipeline show

# 8. You can go to our docs at https://dlthub.com/docs to learn how to modify the generated pipeline to load to many destinations, place schema contracts on your pipeline, and many other things.

# NOTE: we used the `--global-limit 2` CLI flag to limit the requests to the PokeAPI for this example. This way the Pokemon collection endpoint only gets queried twice, resulting in 2 x 20 Pokemon details being rendered.
```
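
To make the arithmetic in the note above concrete, here is a minimal, network-free sketch of how a request limit caps a paginated collection. It is not the generator's or `dlt`'s implementation: `fake_collection_page`, `collect`, `PAGE_SIZE` and `TOTAL_ITEMS` are hypothetical names, and the page size of 20 is taken from the "2 x 20" figure in the note.

```python
from typing import List, Optional, Tuple

PAGE_SIZE = 20     # assumed items per collection request (from the "2 x 20" note)
TOTAL_ITEMS = 100  # hypothetical size of the fake collection


def fake_collection_page(offset: int) -> Tuple[List[str], Optional[int]]:
    """Stand-in for one request to a paginated collection endpoint."""
    end = min(offset + PAGE_SIZE, TOTAL_ITEMS)
    names = [f"pokemon-{i}" for i in range(offset, end)]
    next_offset = end if end < TOTAL_ITEMS else None  # None means last page
    return names, next_offset


def collect(global_limit: Optional[int] = None) -> List[str]:
    """Follow pages until they run out or the request limit is reached."""
    items: List[str] = []
    offset: Optional[int] = 0
    requests_made = 0
    while offset is not None:
        if global_limit is not None and requests_made >= global_limit:
            break  # stop paginating once the limit is hit
        names, offset = fake_collection_page(offset)
        requests_made += 1
        items.extend(names)
    return items


print(len(collect(global_limit=2)))  # 40
print(len(collect()))                # 100
```

With a limit of 2, only two pages are fetched, so 40 items come back instead of the full collection.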

## What will be created?

When you run the `init` command above, the following files will be generated:

* `./pokemon-pipeline` - a folder containing the full project.
* `./pokemon-pipeline/pipeline.py` - a file which you can execute to run your pipeline.
* `./pokemon-pipeline/pokemon/__init__.py` - a file that contains the generated code to connect to the PokeAPI. You can inspect this file and manually change it to your liking or to fix incorrectly generated results.
* `./pokemon-pipeline/.dlt` - a folder with the `config.toml`. You can add your `secrets.toml` with credentials here.
* `./pokemon-pipeline/rest_api` - a folder that contains the rest_api source from our verified sources.
pokemon_pipeline/
├── .dlt/
│ ├── config.toml # dlt config, learn more at dlthub.com/docs
│ └── secrets.toml # your secrets, only needed for APIs with auth
├── pokemon/
│ └── __init__.py # your rest_api dictionary, learn more below
├── rest_api/
│ └── ... # rest_api copied from our verified sources repo
├── .gitignore
├── pokemon_pipeline.py # your pipeline file that you can execute
├── README.md # a list of your endpoints with some additional info
└── requirements.txt # the pip requirements for your pipeline
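
As a rough illustration of the "rest_api dictionary" mentioned for `pokemon/__init__.py`, the sketch below shows the general shape of such a configuration as a plain Python dict. It is not the generator's actual output: the resource names, endpoint paths and the `results` data selector are assumptions for illustration, and the `resource_names` helper is hypothetical. Only the `client` / `resources` layout follows the `rest_api` source documentation linked above.

```python
from typing import Any, Dict, List

# Hypothetical configuration; a generated file will differ in names and detail.
pokemon_config: Dict[str, Any] = {
    "client": {
        "base_url": "https://pokeapi.co/",
    },
    "resources": [
        {
            "name": "pokemon_list",  # assumed resource name
            "endpoint": {
                "path": "/api/v2/pokemon/",   # assumed collection path
                "data_selector": "results",   # assumed JSON path to the items
            },
        },
        {
            "name": "berry_list",  # assumed resource name
            "endpoint": {
                "path": "/api/v2/berry/",
                "data_selector": "results",
            },
        },
    ],
}


def resource_names(config: Dict[str, Any]) -> List[str]:
    """Return the names of all resources declared in a config dict."""
    return [resource["name"] for resource in config["resources"]]


print(resource_names(pokemon_config))
```

Editing this dictionary by hand (for example, changing a `data_selector`) is the kind of manual fix you may need when the heuristics guess incorrectly.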

> If you re-generate your pipeline, you will be prompted to continue if this folder exists. If you select yes, all generated files will be overwritten. All other files you may have created will remain in this folder.
## A closer look

## CLI commands

```console
@@ -141,6 +136,9 @@ $ dlt-init-openapi pokemon --url ... --config config.yml
## Telemetry
We track your usage of this tool similar to how we track other commands in the dlt core library. Read more about this and how to disable it here: https://dlthub.com/docs/reference/telemetry.

## Prior work
This project started as a fork of [openapi-python-client](https://github.com/openapi-generators/openapi-python-client). Pretty much all parts have been heavily changed or completely replaced, but some lines of code still exist, and we would like to acknowledge the many good ideas we got from the original project :)

## Implementation notes
* OAuth authentication is currently not natively supported; you can supply your own.
* Per-endpoint authentication is currently not supported by the generator; only the first globally set securityScheme will be applied. You can add your own per endpoint if you need to.
