
chore: Update ruff, mypy and poetry.lock. #87

Open · undo76 wants to merge 5 commits into main

Conversation

@undo76 (Contributor) commented on Jan 13, 2025

Update development dependencies. Also pin litellm to < 1.56.2 until BerriAI/litellm#7583 is solved.

Also included a workaround for this: BerriAI/litellm#7668

@lsorber (Member) commented on Jan 13, 2025

Thanks @undo76, those are welcome improvements. I haven't dug into it yet, but could you check why the pipeline is failing?

@undo76 (Contributor, Author) commented on Jan 13, 2025

> Thanks @undo76, those are welcome improvements. I haven't dug into it yet, but could you check why the pipeline is failing?

For some reason the reranker test is failing; maybe we should add more results to prevent "random accidents":

>               assert τ_search >= τ_random >= τ_inverse
E               assert 0.3157894736842105 >= 0.37894736842105264

Also, for Python 3.12, there is a problem with llvm_config. I'm not sure why it happens.

@undo76 requested a review from @lsorber on January 14, 2025 at 20:05
@lsorber (Member) left a comment

First review round

Comment on lines +476 to +500
cast(
    llama_types.ChatCompletionResponseChoice,
    {
        "finish_reason": "tool_calls",
        "index": 0,
        "logprobs": completion["choices"][0]["logprobs"],
        "message": {
            "role": "assistant",
            "content": None,
            "tool_calls": [
                {
                    "id": "call_" + f"_{i}_" + tool_name + "_" + completion["id"],
                    "type": "function",
                    "function": {
                        "name": tool_name,
                        "arguments": completion["choices"][0]["text"],
                    },
                }
                for i, (tool_name, completion) in enumerate(
                    zip(completions_tool_name, completions, strict=True)
                )
            ],
        },
    },
)
@lsorber (Member):

Instead of a cast, it would be better to actually construct the object here. That way we benefit from proper type checking.
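For illustration, direct construction could look something like the sketch below. This is only a sketch: it assumes llama_types (from llama-cpp-python) also exposes ChatCompletionResponseMessage, ChatCompletionMessageToolCall, and ChatCompletionMessageToolCallFunction as TypedDicts; verify the exact names before adopting it.

# Sketch only: construct the TypedDicts directly so mypy can check every field,
# instead of casting an untyped dict literal. The inner TypedDict names are
# assumptions about llama_types and should be verified.
choice = llama_types.ChatCompletionResponseChoice(
    finish_reason="tool_calls",
    index=0,
    logprobs=completion["choices"][0]["logprobs"],
    message=llama_types.ChatCompletionResponseMessage(
        role="assistant",
        content=None,
        tool_calls=[
            llama_types.ChatCompletionMessageToolCall(
                id="call_" + f"_{i}_" + tool_name + "_" + completion["id"],
                type="function",
                function=llama_types.ChatCompletionMessageToolCallFunction(
                    name=tool_name,
                    arguments=completion["choices"][0]["text"],
                ),
            )
            for i, (tool_name, completion) in enumerate(
                zip(completions_tool_name, completions, strict=True)
            )
        ],
    ),
)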

Comment on lines +21 to +37
 class UserProfileResponse(BaseModel):
     """The response to a user profile extraction request."""

     model_config = ConfigDict(extra="forbid" if strict else "allow")
     username: str = Field(..., description="The username.")
-    password: str = Field(..., description="The password.")
-    system_prompt: ClassVar[str] = "Extract the username and password from the input."
+    email: str = Field(..., description="The email address.")
+    system_prompt: ClassVar[str] = "Extract the username and email from the input."

-# Extract structured data.
-username, password = "cypher", "steak"
-login_response = extract_with_llm(
-    LoginResponse, f"username: {username}\npassword: {password}", strict=strict, config=config
+# Example input data.
+username, email = "cypher", "[email protected]"
+profile_response = extract_with_llm(
+    UserProfileResponse, f"username: {username}\nemail: {email}", strict=strict, config=config
 )
 # Validate the response.
-assert isinstance(login_response, LoginResponse)
-assert login_response.username == username
-assert login_response.password == password
+assert isinstance(profile_response, UserProfileResponse)
+assert profile_response.username == username
+assert profile_response.email == email
@lsorber (Member):

For my information, why modify this test? Was it failing?

@undo76 (Contributor, Author):

Llama was reluctant to write passwords, as it considered them insecure.

-    assert τ_search >= τ_random >= τ_inverse
+    assert τ_search >= τ_random >= τ_inverse, (
+        f"τ_search: {τ_search}, τ_random: {τ_random}, τ_inverse: {τ_inverse}"
+    )
@lsorber (Member):

Are these changes necessary? Does the test fail without them? If so, do you know why?

@undo76 (Contributor, Author):

This was added to help diagnose the problem. Sometimes τ_random was larger than τ_search; this has been mitigated by taking a larger sample.
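As a standalone illustration (not code from this PR) of why a larger sample helps: Kendall's τ computed on a small noisy ranking has high variance, so an occasional inversion of the expected ordering is likely, and the variance shrinks as the sample grows.

# Standalone illustration: the spread of Kendall's tau estimates shrinks as
# the number of ranked items grows, making accidental inversions rarer.
import numpy as np
from scipy.stats import kendalltau

rng = np.random.default_rng(42)
for n in (20, 200):
    taus = []
    for _ in range(1000):
        # A noisy ranking that is positively correlated with the true order.
        noisy_scores = np.arange(n) + rng.normal(scale=n / 4, size=n)
        tau, _ = kendalltau(np.arange(n), noisy_scores)
        taus.append(tau)
    print(f"n={n}: mean tau={np.mean(taus):.2f}, std={np.std(taus):.2f}")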

Comment on lines +37 to +42
    # Remove repeated \n to make it more resilient to variations between pdftext versions.
    sentences = [re.sub(r"\n+", "\n", sentence) for sentence in sentences]
    for sentence, expected_sentence in zip(
        sentences[: len(expected_sentences)], expected_sentences, strict=True
    ):
        assert sentence == expected_sentence
@lsorber (Member):

Just to be sure: is it a new version of pdftext that's causing the differences here, and not SaT?

In addition: I'd prefer not to make changes to the expected_sentence, nor to the output of split_sentences. Instead, I would make the assertion itself (on line 42) invariant to the patterns you would consider 'equivalent'.

@undo76 (Contributor, Author):

Yes, pdftext is now returning slightly different strings. I understand what you mean; it makes sense to create a custom assert_similar_sentence.
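A sketch of what that helper might look like, assuming 'equivalent' means equal up to whitespace (the helper name comes from the comment above; the exact normalization rule is an assumption):

import re

def assert_similar_sentence(sentence: str, expected_sentence: str) -> None:
    """Assert that two sentences match up to whitespace differences."""

    def normalize(s: str) -> str:
        # Collapse any run of whitespace (including repeated \n) to one space.
        return re.sub(r"\s+", " ", s).strip()

    assert normalize(sentence) == normalize(expected_sentence), (
        f"{sentence!r} does not match {expected_sentence!r}"
    )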

@@ -1,4 +1,4 @@
-# This file is automatically @generated by Poetry 1.8.5 and should not be changed by hand.
+# This file is automatically @generated by Poetry 1.8.0 and should not be changed by hand.
@lsorber (Member):

The previous lock file was generated with Poetry v1.8.5, but this update was generated with Poetry v1.8.0.

Could you update to the latest patch version of Poetry (v1.8.5) before running poetry lock --no-update to make sure we don't inadvertently introduce any dependency issues?

Comment on lines +37 to +41
 litellm = ">=1.48.4,<1.56.2"
 llama-cpp-python = ">=0.3.2"
 pydantic = ">=2.7.0"
 # Approximate Nearest Neighbors:
-pynndescent = ">=0.5.12"
+pynndescent = ">=0.5.13"
@lsorber (Member):

What I think we want to do here is:

  1. Explicitly upgrade ruff, mypy, and other tools in pyproject.toml.
  2. Then run poetry lock --no-update to avoid updating any dependencies that we don't explicitly upgrade.

Running a full poetry lock would significantly change the dependency set we run the tests against.

In the future, we should test against both the 'oldest' solution to the specification, and the 'newest' solution to the specification. That's something we can do when we migrate from Poetry to uv.
