Commit e30c113

kurman authored and facebook-github-bot committed
Initial public docs for tracker API. (#622)
Summary:
Pull Request resolved: #622

Public docs for torchx.tracker api

Reviewed By: d4l3k

Differential Revision: D40451386

fbshipit-source-id: a0ef137bc30295d43695af7db458d80706eeb913
1 parent 25bdc52 commit e30c113

4 files changed, +134 -0 lines changed


docs/source/index.rst

Lines changed: 1 addition & 0 deletions
@@ -45,6 +45,7 @@ Documentation
    runner.config
    advanced
    custom_components.md
+   tracker
 
 .. fbcode::
 

docs/source/tracker.rst

Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+torchx.tracker
+==============
+
+Overview & Usage
+~~~~~~~~~~~~~~~~~~
+
+.. automodule:: torchx.tracker
+.. currentmodule:: torchx.tracker
+
+.. autoclass:: AppRun
+.. autoclass:: torchx.tracker.api.TrackerBase
+.. autoclass:: torchx.tracker.backend.fsspec.FsspecTracker
+
+.. autoclass:: torchx.cli.cmd_tracker.CmdTracker

setup.py

Lines changed: 3 additions & 0 deletions
@@ -73,6 +73,9 @@ def get_nightly_version():
         "console_scripts": [
             "torchx=torchx.cli.main:main",
         ],
+        "torchx.tracker": [
+            "fsspec=torchx.tracker.backend.fsspec:create",
+        ],
     },
     extras_require={
         "kfp": ["kfp==1.6.2"],

torchx/tracker/__init__.py

Lines changed: 116 additions & 0 deletions
@@ -4,6 +4,122 @@
 # This source code is licensed under the BSD-style license found in the
 # LICENSE file in the root directory of this source tree.
 
+"""
+.. note:: PROTOTYPE, USE AT YOUR OWN RISK, APIs SUBJECT TO CHANGE
+
+
+Practitioners running ML jobs often need to track information such as:
+
+* Job inputs:
+    * configuration
+        * model configuration
+        * HPO parameters
+    * data
+        * version
+        * sources
+* Job results:
+    * metrics
+    * model location
+* Conceptual job groupings
+
+
+:py:class:`~torchx.tracker.api.AppRun` provides a uniform **interface** for experiment and artifact tracking.
+It wraps pluggable tracking backends, each of which is supplied as a :py:class:`~torchx.tracker.api.TrackerBase`
+adapter implementation.
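The wrapper idea in the paragraph above can be sketched in a few lines. This is a minimal, self-contained illustration: `ToyTracker`, `InMemoryTracker`, and `ToyRun` are hypothetical stand-ins, not torchx classes, and the real `AppRun` and `TrackerBase` signatures may differ.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable, List


class ToyTracker(ABC):
    """Hypothetical backend adapter interface (a stand-in, not torchx's TrackerBase)."""

    @abstractmethod
    def add_metadata(self, run_id: str, **kwargs: object) -> None:
        ...

    @abstractmethod
    def metadata(self, run_id: str) -> Dict[str, object]:
        ...


class InMemoryTracker(ToyTracker):
    """Toy backend that keeps everything in a dict keyed by run id."""

    def __init__(self) -> None:
        self._store: Dict[str, Dict[str, object]] = {}

    def add_metadata(self, run_id: str, **kwargs: object) -> None:
        self._store.setdefault(run_id, {}).update(kwargs)

    def metadata(self, run_id: str) -> Dict[str, object]:
        return dict(self._store.get(run_id, {}))


class ToyRun:
    """Hypothetical run handle that forwards each call to every configured backend."""

    def __init__(self, run_id: str, backends: Iterable[ToyTracker]) -> None:
        self.id = run_id
        self.backends: List[ToyTracker] = list(backends)

    def add_metadata(self, **kwargs: object) -> None:
        for backend in self.backends:
            backend.add_metadata(self.id, **kwargs)


backend_a, backend_b = InMemoryTracker(), InMemoryTracker()
run = ToyRun("scheduler://session/job_id", [backend_a, backend_b])
run.add_metadata(lr=0.01, model_location="s3://my_bucket/model.pt")
```

The point of the split is that job code talks only to the run handle, while the set of backends can change without touching the job.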
+
+
+Example usage
+-------------
+Sample `code <https://github.com/pytorch/torchx/blob/main/torchx/examples/apps/tracker/main.py>`__ using the tracker API.
+
+
+Tracker Setup
+-------------
+Enabling tracking requires:
+
+1. Defining tracker backends (entrypoints and configuration) on the launcher side using :doc:`runner.config`
+2. Adding entrypoints within a user job using entry_points (`specification`_)
+
+.. _specification: https://packaging.python.org/en/latest/specifications/entry-points/
+
+
+1. Launcher side configuration
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Users can define any number of tracker backends under the **torchx:tracker** section in :doc:`runner.config`, where:
+
+* Key: an arbitrary name for the tracker; the same name is used to configure its properties
+  under `[tracker:<TRACKER_NAME>]`
+* Value: the *entrypoint/factory method* that must be available within the user job. The value is injected into the
+  user job and used to construct the tracker implementation.
+
+.. code-block:: ini
+
+    [torchx:tracker]
+    tracker_name=<entry_point>
+
+
+Each tracker can be additionally configured (currently limited to the `config` parameter) under the `[tracker:<TRACKER_NAME>]` section:
+
+.. code-block:: ini
+
+    [tracker:<TRACKER_NAME>]
+    config=configvalue
+
+For example, ~/.torchxconfig may be set up as:
+
+.. code-block:: ini
+
+    [torchx:tracker]
+    tracker1=tracker1
+    tracker12=backend_2_entry_point
+
+    [tracker:tracker1]
+    config=s3://my_bucket/config.json
79+
80+
2. User job configuration (Advanced)
81+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
82+
83+
Entrypoint value defined in the previous step must be discoverable under `[torchx.tracker]` group and callable within user job
84+
(depending on packaging/distribution mechanism) to create an instance of the :py:class:`~torchx.tracker.api.TrackerBase`.
85+
86+
To accomplish that define entrypoint in the distribution in `entry_points.txt` as:
87+
88+
.. code-block:: ini
89+
90+
[torchx.tracker]
91+
entry_point_name=my_module:create_tracker_fn
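One way to check that an entrypoint is actually discoverable in the job's environment is to list the group with the standard `importlib.metadata` module. The sketch below assumes only the `torchx.tracker` group name from the text; `tracker_entry_points` is a hypothetical helper, not part of torchx.

```python
from importlib.metadata import entry_points
from typing import Dict


def tracker_entry_points(group: str = "torchx.tracker") -> Dict[str, str]:
    """Return {entry point name: "module:attr"} for every entry point in the group."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+ exposes a selectable collection.
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9 return a plain dict of group name -> list of entry points.
        selected = eps.get(group, [])
    return {ep.name: ep.value for ep in selected}


found = tracker_entry_points()
```

An empty result means no installed distribution registers that group. Calling `ep.load()` on a discovered entry point would import the factory; passing it the configured `config` value is an assumption drawn from the description above rather than a documented signature.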
+
+
+Acquiring :py:class:`~torchx.tracker.api.AppRun` instance
+---------------------------------------------------------
+
+Use :py:func:`~torchx.tracker.app_run_from_env`:
+
+
+>>> import os; os.environ["TORCHX_JOB_ID"] = "scheduler://session/job_id" # Simulate running job first
+>>> from torchx.tracker import app_run_from_env
+>>> app_run = app_run_from_env()
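The simulated job id in the doctest above is URL-shaped. As a rough illustration, and assuming the three parts read as scheduler, session, and job name (the text does not spell this out), the standard library can split it as follows. This does not claim to mirror torchx's own parsing.

```python
import os
from urllib.parse import urlparse

# Simulated value, same as in the doctest above.
os.environ["TORCHX_JOB_ID"] = "scheduler://session/job_id"

parsed = urlparse(os.environ["TORCHX_JOB_ID"])
scheduler = parsed.scheme            # "scheduler"
session = parsed.netloc              # "session"
job_name = parsed.path.lstrip("/")   # "job_id"
```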
+
+
+Reference :py:class:`~torchx.tracker.api.TrackerBase` implementation
+--------------------------------------------------------------------
+:py:class:`~torchx.tracker.backend.fsspec.FsspecTracker` provides a reference implementation of a tracker backend.
+The GitHub example `directory <https://github.com/pytorch/torchx/blob/main/torchx/examples/apps/tracker/>`__ shows how to
+configure and use it in a user application.
+
+
+Querying data
+-------------
+* :py:class:`~torchx.cli.cmd_tracker.CmdTracker` exposes operations available to users at the CLI level:
+    * ``torchx tracker list jobs [--parent-run-id RUN_ID]``
+    * ``torchx tracker list metadata RUN_ID``
+    * ``torchx tracker list artifacts [--artifact ARTIFACT_NAME] RUN_ID``
+* Alternatively, backend implementations may expose a UI for user consumption.
+
+
+"""
+
 from .api import AppRun
 
 