Commit 8694ef8

Merge pull request #80 from data-apis/document-adding-new-model
Add instructions on adding new builds
2 parents 11a3fcf + 7c6c439 commit 8694ef8

3 files changed

+70
-9
lines changed

README.md

Lines changed: 51 additions & 5 deletions
@@ -28,11 +28,6 @@ def argmax(
2828

2929
Contributions are very welcome! Please feel free to open an issue, or reach out directly, if there is anything you would like to discuss or explore! You can also check out the [issue tracker](https://github.com/data-apis/python-record-api/issues) for some possible next steps that we could use help on.
3030

31-
## Hosted Usage
32-
33-
We have this repository set up with Kubernetes and Github Actions to automatically analyze a number of libraries. The ones we have added are in [`k8/images`](./k8/images). Do you have a library that you would also like to see analyzed? Please open a PR adding that image to the folder there. Make sure to test it locally first, to see that it runs.
34-
35-
Once it's added to the repo, it will be run and the data will be added to `data/api/<library_name>.json` and from there, it will be used to generate the NumPy and Pandas APIs. Those are present in [`data/api.json`](./data/api.json), in machine readable form, as well as in [`data/typing`](data/typing) in human readable form.
3631

3732
## Usage
3833

@@ -84,6 +79,57 @@ env PYTHON_RECORD_API_OUTPUT=typing/ \
8479
```
8580

8681

82+
## Hosted Usage
83+
84+
We have this repository set up with Kubernetes and Github Actions to automatically analyze a number of libraries. The ones we have added are in [`k8/images`](./k8/images).
85+
86+
Once it's added to the repo, it will be run and the data will be added to `data/api/<library_name>.json` and from there, it will be used to generate the NumPy and Pandas APIs. Those are present in [`data/api.json`](./data/api.json), in machine readable form, as well as in [`data/typing`](data/typing) in human readable form.
87+
88+
### Adding more libraries
89+
90+
Do you have a library that you would also like to see analyzed? We welcome PRs!
91+
92+
You can add a new backend just from the Github editor UX, but you probably want to test it locally first.
93+
94+
95+
First install Docker >= v18.06 and Python.
96+
97+
Then [fork the library](https://github.com/data-apis/python-record-api/fork), clone your fork, and check out a new branch:
98+
99+
```bash
100+
git clone git@github.com:<FORK_NAME>/python-record-api.git
101+
cd python-record-api
102+
103+
git checkout -b <CHANGES_NAME>
104+
```
105+
106+
And install the python package locally:
107+
108+
```bash
109+
pip install flit
110+
flit install
111+
```
112+
113+
Now either:
114+
115+
1. Edit an existing library, by editing a `Dockerfile` in any of the directories in `./k8/images`. If you make any changes, also increment the number in the corresponding `version` file.
116+
2. Add a new library, by creating a new directory with the name of the module in the `./k8/images` directory. The name cannot contain any underscores. It should have at least a `Dockerfile` and a `version` file, with the contents of `0`. You should copy an existing `Dockerfile` and modify it to download some downstream library and run some command as the entrypoint which produces a trace.
117+
3. Add a new module to trace calls against, by modifying the `TO_MODULES` constant in `./k8/Makefile` to add another Python module and also incrementing the `./k8/argo/version` file.
118+
119+
Then you can test an image locally, to make sure the Dockerfile builds correctly and the tracing works:
120+
121+
```bash
122+
cd k8
123+
make test-local-<LIBRARY NAME>
124+
```
125+
126+
This should output a number of lines of traces.
127+
128+
Now, commit your changes, push your branch, and open a PR. This will then build the image in CI and attempt to kick off a build of it.
129+
130+
Once the PR is merged, every time we update the core library, this tracing will be re-run and the updated data will be added to the repo.
131+
132+
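Reviewer note: the "add a new library" option in the README text above could be scripted roughly as follows. This is only a sketch and is not part of the commit. The helper name `new_image` and its arguments are hypothetical, and the file name `version` is an assumption. The underscore check and the initial contents of `0` follow the README text.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of the repository): scaffold a new image
# directory from an existing one.
# Usage: new_image <images_dir> <template_name> <new_name>
new_image() {
  local images_dir="$1" template="$2" name="$3"

  # The README says the directory name cannot contain underscores.
  if [[ "$name" == *_* ]]; then
    echo "error: image name '$name' must not contain underscores" >&2
    return 1
  fi

  # Refuse to overwrite an image directory that already exists.
  if [[ -e "$images_dir/$name" ]]; then
    echo "error: $images_dir/$name already exists" >&2
    return 1
  fi

  mkdir -p "$images_dir/$name"
  # Start from an existing Dockerfile; it still has to be edited by hand so
  # that it downloads the downstream library and runs a traced entrypoint.
  cp "$images_dir/$template/Dockerfile" "$images_dir/$name/Dockerfile"
  # New images start at version 0 (file name assumed to be `version`).
  echo 0 > "$images_dir/$name/version"
}
```

Assuming the repository layout described in the README, a call might look like `new_image ./k8/images <EXISTING_LIBRARY> <LIBRARY NAME>`, followed by `make test-local-<LIBRARY NAME>` from the `k8` directory.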
87133
## Development
88134

89135
First install the local package:

k8/Makefile

Lines changed: 16 additions & 3 deletions
@@ -1,4 +1,5 @@
11

2+
TO_MODULES := pandas,numpy
23
PYTHON_PACKAGE_VERSION := $(shell python -c 'import record_api; print(record_api.__version__)')
34

45
$(info Python package version = ${PYTHON_PACKAGE_VERSION})
@@ -86,22 +87,34 @@ argo-submit-%:
8687
-p label="$(*F)" \
8788
-p base-image="$(BASE_IMAGE)" \
8889
-p image="$(call sub_image,$(*F))" \
90+
-p to_modules="$(TO_MODULES)" \
8991
--name "$(call workflow_name,$(*F))"
9092

9193

9294
push-images: docker-bake.json
9395
docker buildx bake base --push
9496
docker buildx bake --push
9597

98+
test-local-%:
99+
env DOCKER_BUILDKIT=1 docker build \
100+
--tag $(call sub_image,$(*F)) \
101+
--build-arg "FROM=${BASE_IMAGE}" \
102+
images/$(*F)
103+
docker run \
104+
--rm \
105+
-it \
106+
-e PYTHON_RECORD_API_TO_MODULES=$(TO_MODULES) \
107+
-e PYTHON_RECORD_API_OUTPUT_FILE=/dev/stdout \
108+
$(call sub_image,$(*F))
96109

97-
test-%: docker-bake.json
98-
# docker buildx bake --push $(*F)
110+
test-remote-%: docker-bake.json
111+
docker buildx bake --push $(*F)
99112
kubectl run \
100113
$(*F) \
101114
--rm \
102115
-it \
103116
--restart='Never' \
104-
--env=PYTHON_RECORD_API_TO_MODULES=numpy,pandas \
117+
--env=PYTHON_RECORD_API_TO_MODULES=$(TO_MODULES) \
105118
--env=PYTHON_RECORD_API_OUTPUT_FILE=/dev/stdout \
106119
--image=$(call sub_image,$(*F))
107120
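Reviewer note: the README asks for a version file to be incremented whenever a `Dockerfile` changes, and for `./k8/argo/version` to be incremented whenever `TO_MODULES` in this Makefile changes. A minimal sketch of that bump is below. It assumes each version file holds a single non-negative integer, which the README implies but does not state outright. The helper name `bump_version` is hypothetical and not part of the commit.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of the repository): increment the integer
# stored in a version file.
# Usage: bump_version <path_to_version_file>
bump_version() {
  local file="$1" current

  # Read the file and drop whitespace such as the trailing newline.
  current="$(tr -d '[:space:]' < "$file")"

  # Assumption: the file contains only a non-negative integer.
  if ! [[ "$current" =~ ^[0-9]+$ ]]; then
    echo "error: $file does not contain a non-negative integer" >&2
    return 1
  fi

  # Force base 10 so a value with leading zeros is not parsed as octal.
  echo $((10#$current + 1)) > "$file"
}
```

For example, after adding a module to `TO_MODULES`, one might run `bump_version ./k8/argo/version` from the repository root before committing.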

k8/argo/workflow.yml

Lines changed: 3 additions & 1 deletion
@@ -26,6 +26,8 @@ spec:
2626
value: "..."
2727
- name: image
2828
value: "..."
29+
- name: to_modules
30+
value: "..."
2931
templates:
3032
- name: all
3133
steps:
@@ -48,7 +50,7 @@ spec:
4850
- name: PYTHON_RECORD_API_OUTPUT_FILE
4951
value: /tmp/vol/raw.jsonl
5052
- name: PYTHON_RECORD_API_TO_MODULES
51-
value: numpy,pandas
53+
value: "{{workflow.parameters.to_modules}}"
5254
# https://github.com/argoproj/argo/blob/master/docs/resource-duration.md#request-defaults
5355
# Must override or get defaults which are too low
5456
resources:
