chore: Update ruff, mypy and poetry.lock. #87
Conversation
Thanks @undo76, those are welcome improvements. I haven't dug in yet, but could you check why the pipeline is failing?

For some reason the reranker test is failing; maybe we should add more results to prevent "random accidents". Also, for Python 3.12, there is a problem with llvm_config. Not sure why it happens.
First review round
```python
cast(
    llama_types.ChatCompletionResponseChoice,
    {
        "finish_reason": "tool_calls",
        "index": 0,
        "logprobs": completion["choices"][0]["logprobs"],
        "message": {
            "role": "assistant",
            "content": None,
            "tool_calls": [
                {
                    "id": "call_" + f"_{i}_" + tool_name + "_" + completion["id"],
                    "type": "function",
                    "function": {
                        "name": tool_name,
                        "arguments": completion["choices"][0]["text"],
                    },
                }
                for i, (tool_name, completion) in enumerate(
                    zip(completions_tool_name, completions, strict=True)
                )
            ],
        },
    },
)
```
Instead of a cast, it would be better to actually construct the object here. That way we benefit from proper type checking.
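As a sketch of that suggestion: `ChatCompletionResponseChoice` in llama-cpp-python is a `TypedDict`, so building the value through typed constructors lets mypy check every key and value type instead of trusting a `cast`. The classes below are minimal illustrative stand-ins, not the real `llama_types` definitions:

```python
from typing import TypedDict


# Minimal stand-ins for the llama_cpp.llama_types TypedDicts (illustrative only).
class ChatCompletionFunction(TypedDict):
    name: str
    arguments: str


class ChatCompletionToolCall(TypedDict):
    id: str
    type: str
    function: ChatCompletionFunction


def make_tool_call(
    i: int, tool_name: str, completion_id: str, arguments: str
) -> ChatCompletionToolCall:
    # mypy verifies each key and value type here; a misspelled key or a
    # wrong value type becomes a type error instead of a silent runtime bug.
    return ChatCompletionToolCall(
        id=f"call__{i}_{tool_name}_{completion_id}",
        type="function",
        function=ChatCompletionFunction(name=tool_name, arguments=arguments),
    )


call = make_tool_call(0, "get_weather", "cmpl-123", '{"city": "Ghent"}')
```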
```diff
 class UserProfileResponse(BaseModel):
     """The response to a user profile extraction request."""

     model_config = ConfigDict(extra="forbid" if strict else "allow")
     username: str = Field(..., description="The username.")
-    password: str = Field(..., description="The password.")
-    system_prompt: ClassVar[str] = "Extract the username and password from the input."
+    email: str = Field(..., description="The email address.")
+    system_prompt: ClassVar[str] = "Extract the username and email from the input."

-# Extract structured data.
-username, password = "cypher", "steak"
-login_response = extract_with_llm(
-    LoginResponse, f"username: {username}\npassword: {password}", strict=strict, config=config
-)
+# Example input data.
+username, email = "cypher", "[email protected]"
+profile_response = extract_with_llm(
+    UserProfileResponse, f"username: {username}\nemail: {email}", strict=strict, config=config
+)
 # Validate the response.
-assert isinstance(login_response, LoginResponse)
-assert login_response.username == username
-assert login_response.password == password
+assert isinstance(profile_response, UserProfileResponse)
+assert profile_response.username == username
+assert profile_response.email == email
```
For my information, why modify this test? Was it failing?
Llama was reluctant to write passwords, as it considered them insecure.
```diff
-assert τ_search >= τ_random >= τ_inverse
+assert τ_search >= τ_random >= τ_inverse, (
+    f"τ_search: {τ_search}, τ_random: {τ_random}, τ_inverse: {τ_inverse}"
+)
```
Are these changes necessary? Does the test fail without them? If so, do you know why?
This was added for diagnosing the problem: sometimes tau_random was larger than tau_search. This has been mitigated by taking a larger sample.
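For intuition: the spread of a rank-correlation estimate shrinks roughly like 1/√n as the sample grows, which is why a larger sample suppresses these "random accidents". The small self-contained simulation below (not the project's test; the helper names are made up) shows that Kendall's τ between a score and a noisy copy of it fluctuates less across trials at a larger sample size:

```python
import random
from itertools import combinations
from statistics import pstdev


def kendall_tau(x: list[float], y: list[float]) -> float:
    """Kendall rank correlation via pairwise concordance (O(n^2), fine for small n)."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    return (concordant - discordant) / (len(x) * (len(x) - 1) / 2)


def tau_spread(n: int, trials: int = 200, noise: float = 1.0, seed: int = 0) -> float:
    """Standard deviation of τ estimates between a score and a noisy copy of it."""
    rng = random.Random(seed)
    taus = []
    for _ in range(trials):
        x = [rng.random() for _ in range(n)]
        y = [xi + rng.gauss(0.0, noise) for xi in x]
        taus.append(kendall_tau(x, y))
    return pstdev(taus)
```

With these settings, `tau_spread(8)` is noticeably larger than `tau_spread(40)`: small samples produce τ estimates noisy enough to occasionally invert the expected ordering.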
```python
# Remove repeated \n to make it more resilient to variations between pdftext versions
sentences = [re.sub(r"\n+", "\n", sentence) for sentence in sentences]
for sentence, expected_sentence in zip(
    sentences[: len(expected_sentences)], expected_sentences, strict=True
):
    assert sentence == expected_sentence
```
Just to be sure: is it a new version of pdftext that's causing the differences here, and not SaT?
In addition, I'd prefer not to make changes to the `expected_sentence` values, nor to the output of `split_sentences`. Instead, I would make the assertion itself (on line 42) invariant to the patterns you would consider 'equivalent'.
Yes, pdftext is now returning slightly different strings. I understand what you mean; it makes sense to create a custom `assert_similar_sentence`.
```diff
@@ -1,4 +1,4 @@
-# This file is automatically @generated by Poetry 1.8.5 and should not be changed by hand.
+# This file is automatically @generated by Poetry 1.8.0 and should not be changed by hand.
```
The previous lock file was generated with Poetry v1.8.5, but this update was generated with Poetry v1.8.0. Could you update to the latest patch version of Poetry (v1.8.5) before running `poetry lock --no-update`, to make sure we don't inadvertently introduce any dependency issues?
```diff
 litellm = ">=1.48.4,<1.56.2"
 llama-cpp-python = ">=0.3.2"
 pydantic = ">=2.7.0"
 # Approximate Nearest Neighbors:
-pynndescent = ">=0.5.12"
+pynndescent = ">=0.5.13"
```
What I think we want to do here is:

- Explicitly upgrade ruff, mypy, and other tools in pyproject.toml.
- Then run `poetry lock --no-update` to avoid updating any dependencies that we don't explicitly upgrade.

Running a full `poetry lock` would significantly change the dependency set we run the tests against.
In the future, we should test against both the 'oldest' solution to the specification, and the 'newest' solution to the specification. That's something we can do when we migrate from Poetry to uv.
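A sketch of the intended workflow (command flags per Poetry 1.8 and current uv documentation; worth double-checking against the versions you actually run):

```shell
# Re-lock with the same Poetry version that produced the existing lock file,
# so only the explicitly bumped tools change.
pipx install --force 'poetry==1.8.5'
poetry lock --no-update

# After migrating to uv: resolve against both ends of the specification.
uv lock --resolution lowest-direct   # oldest allowed versions of direct deps
uv lock --upgrade                    # newest allowed versions
```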
Update development dependencies. Also pin litellm to <1.56.2 until BerriAI/litellm#7583 is solved. A workaround for BerriAI/litellm#7668 is also included.