
Teacherenv #416 (Open)

J-SUPHA wants to merge 55 commits into main from teacherenv

Conversation

@J-SUPHA (Collaborator) commented Mar 13, 2026:

No description provided.


if isinstance(default_server_configs, APIServerConfig):
-    server_configs = final_openai_config
+    server_configs = [final_openai_config]
@J-SUPHA (Collaborator, Author):

If you pass a list of configs here, they are used directly. But if you pass a single non-list config object, it goes into "template mode" and server URLs/ports are auto-generated.
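A minimal sketch of that dispatch, assuming a dataclass-style APIServerConfig; the helper name resolve_server_configs, the num_servers parameter, and the port scheme are all illustrative, not the actual implementation:

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class APIServerConfig:
    base_url: str = "http://localhost:9001/v1"
    model_name: str = "default-model"


def resolve_server_configs(
    default_server_configs: Union[APIServerConfig, List[APIServerConfig]],
    num_servers: int = 2,
) -> List[APIServerConfig]:
    if isinstance(default_server_configs, list):
        # A list of configs is used directly, one per server.
        return default_server_configs
    # A single config acts as a template: URLs/ports are auto-generated.
    template = default_server_configs
    return [
        APIServerConfig(
            base_url=f"http://localhost:{9000 + i}/v1",
            model_name=template.model_name,
        )
        for i in range(num_servers)
    ]
```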


@J-SUPHA (Collaborator, Author):

Agreed; the issue was the wrong config shape here. I fixed it so this path now returns [final_openai_config] instead of a bare APIServerConfig.

@dmahan93 (Collaborator) left a comment:

I do not think different-tokenizer distillation is possible, so at minimum remove it from this PR. If it is possible, move it to DMs and we can get a blog post out.

Comment on lines +66 to +89
teacher_base_url: Optional[str] = Field(
default=None,
description="Teacher server base URL (OpenAI-compatible).",
)
teacher_model_name: Optional[str] = Field(
default=None,
description="Teacher model name used in teacher server requests.",
)
teacher_api_key: str = Field(
default="",
description="Teacher API key, if required by the teacher endpoint.",
)
teacher_server_type: str = Field(
default="vllm",
description="Teacher server type (e.g. vllm, sglang, trl, openai).",
)
teacher_tokenizer_name: str = Field(
default="none",
description=(
"Tokenizer name for teacher server. If 'none', teacher_model_name is used. "
"When this resolves to a different vocabulary than the student tokenizer, "
"cross-tokenizer alignment is applied automatically."
),
)
Collaborator:

Can't we just make these ServerConfigs?

@J-SUPHA (Collaborator, Author):

Yes; I moved those endpoint fields into teacher_server: APIServerConfig on TeacherDistillationConfig.

)

async with self.server.managed_server(tokenizer=self.tokenizer) as managed:
logger.warning(
Collaborator:

logger.debug please

max_tokens=self.config.max_token_length,
temperature=1.0,
)
logger.warning(
Collaborator:

logger.debug


state = managed.get_state()
nodes = state["nodes"]
logger.warning(
Collaborator:

logger.debug

Comment on lines +106 to +107
"X-Atropos-Client": "trainer",
"X-Atropos-Pid": str(os.getpid()),
Collaborator:

er, what are these for?

@J-SUPHA (Collaborator, Author):

This was a sanity check; it has been removed.



teacher_server: Optional[APIServerConfig] = Field(
default=None,
description="Teacher inference server configuration.",
)
Collaborator:

Ah, I probably commented poorly. It should be set up the same way as the server_manager, so we may need to pass a new argument into init.

@J-SUPHA (Collaborator, Author) commented Mar 13, 2026:

Updated this to follow that pattern. I removed teacher_server from TeacherDistillationConfig, so the env config now only carries env-level knobs like teacher_enabled and teacher_top_k. Teacher server wiring is now passed separately via teacher_server_configs at init.
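A minimal sketch of that split, assuming dataclass-style configs; besides the names quoted in this thread (TeacherDistillationConfig, teacher_enabled, teacher_top_k, teacher_server_configs, APIServerConfig), the class shape and the TeacherEnv constructor are hypothetical:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class APIServerConfig:
    base_url: str
    model_name: str
    api_key: str = ""


@dataclass
class TeacherDistillationConfig:
    # Env-level knobs only; no endpoint fields here anymore.
    teacher_enabled: bool = False
    teacher_top_k: int = 0


class TeacherEnv:
    def __init__(
        self,
        config: TeacherDistillationConfig,
        teacher_server_configs: Optional[List[APIServerConfig]] = None,
    ):
        self.config = config
        # Server wiring arrives separately at init, mirroring the
        # server_manager setup rather than living on the env config.
        self.teacher_server_configs = teacher_server_configs or []
```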

"teacher_top_k", self.config.teacher_top_k
)
)
top_k = max(1, top_k)
Collaborator:

The max should be 0, because prompt logprobs are (selected token + top_k); disabled would be setting it to -1 or lower. I would also be amenable to a group override like skip_teacher_top_k.
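A sketch of the clamping rule being proposed (an assumption about intent, not merged code; the helper name and skip_teacher_top_k override are illustrative):

```python
def resolve_teacher_top_k(requested: int, skip_teacher_top_k: bool = False) -> int:
    """Clamp the teacher top_k per the review suggestion.

    0 keeps only the selected token's logprob (since prompt logprobs
    return selected token + top_k extras); -1 or lower disables the
    feature entirely, as does the group-level skip override.
    """
    if skip_teacher_top_k or requested <= -1:
        return -1  # disabled
    return max(0, requested)
```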

Collaborator:

I think this may need to be reverted?

Collaborator:

revert

@J-SUPHA requested a review from dmahan93 on March 14, 2026, 15:20.