
chore: Update ruff, mypy and poetry.lock. #87

Open · undo76 wants to merge 5 commits into main

Conversation

@undo76 (Contributor) commented on Jan 13, 2025

Update development dependencies. Also pin litellm to < 1.56.2 until BerriAI/litellm#7583 is solved.

Also included a workaround for this: BerriAI/litellm#7668

@lsorber (Member) commented on Jan 13, 2025

Thanks @undo76, those are welcome improvements. I haven't dug into it yet, but could you check why the pipeline is failing?

@undo76 (Contributor, Author) commented on Jan 13, 2025

> Thanks @undo76, those are welcome improvements. I haven't dug into it yet, but could you check why the pipeline is failing?

For some reason the reranker test is failing; maybe we should add more results to prevent "random accidents":

>               assert τ_search >= τ_random >= τ_inverse
E               assert 0.3157894736842105 >= 0.37894736842105264

Also, for Python 3.12, there is a problem with llvm_config. I'm not sure why it happens.

@undo76 requested a review from @lsorber on January 14, 2025 at 20:05
@lsorber (Member) left a comment

First review round

Comment on lines +476 to +500
cast(
    llama_types.ChatCompletionResponseChoice,
    {
        "finish_reason": "tool_calls",
        "index": 0,
        "logprobs": completion["choices"][0]["logprobs"],
        "message": {
            "role": "assistant",
            "content": None,
            "tool_calls": [
                {
                    "id": "call_" + f"_{i}_" + tool_name + "_" + completion["id"],
                    "type": "function",
                    "function": {
                        "name": tool_name,
                        "arguments": completion["choices"][0]["text"],
                    },
                }
                for i, (tool_name, completion) in enumerate(
                    zip(completions_tool_name, completions, strict=True)
                )
            ],
        },
    },
)
@lsorber (Member):

Instead of a cast, it would be better to actually construct the object here. That way we benefit from proper type checking.
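For illustration, direct construction could look something like the sketch below. This is only a sketch: it assumes llama_types (from llama-cpp-python) also exposes ChatCompletionResponseMessage, ChatCompletionMessageToolCall, and ChatCompletionMessageToolCallFunction as TypedDicts; verify the exact names before adopting it.

# Sketch only: construct the TypedDicts directly so mypy can check every field,
# instead of casting an untyped dict literal. The inner TypedDict names are
# assumptions about llama_types and should be verified.
choice = llama_types.ChatCompletionResponseChoice(
    finish_reason="tool_calls",
    index=0,
    logprobs=completion["choices"][0]["logprobs"],
    message=llama_types.ChatCompletionResponseMessage(
        role="assistant",
        content=None,
        tool_calls=[
            llama_types.ChatCompletionMessageToolCall(
                id="call_" + f"_{i}_" + tool_name + "_" + completion["id"],
                type="function",
                function=llama_types.ChatCompletionMessageToolCallFunction(
                    name=tool_name,
                    arguments=completion["choices"][0]["text"],
                ),
            )
            for i, (tool_name, completion) in enumerate(
                zip(completions_tool_name, completions, strict=True)
            )
        ],
    ),
)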

Comment on lines +21 to +37
 class UserProfileResponse(BaseModel):
     """The response to a user profile extraction request."""

     model_config = ConfigDict(extra="forbid" if strict else "allow")
     username: str = Field(..., description="The username.")
-    password: str = Field(..., description="The password.")
-    system_prompt: ClassVar[str] = "Extract the username and password from the input."
+    email: str = Field(..., description="The email address.")
+    system_prompt: ClassVar[str] = "Extract the username and email from the input."

-# Extract structured data.
-username, password = "cypher", "steak"
-login_response = extract_with_llm(
-    LoginResponse, f"username: {username}\npassword: {password}", strict=strict, config=config
+# Example input data.
+username, email = "cypher", "[email protected]"
+profile_response = extract_with_llm(
+    UserProfileResponse, f"username: {username}\nemail: {email}", strict=strict, config=config
 )
 # Validate the response.
-assert isinstance(login_response, LoginResponse)
-assert login_response.username == username
-assert login_response.password == password
+assert isinstance(profile_response, UserProfileResponse)
+assert profile_response.username == username
+assert profile_response.email == email
@lsorber (Member):

For my information, why modify this test? Was it failing?

@undo76 (Contributor, Author):

Llama was reluctant to write passwords, as it considered them insecure.

-    assert τ_search >= τ_random >= τ_inverse
+    assert τ_search >= τ_random >= τ_inverse, (
+        f"τ_search: {τ_search}, τ_random: {τ_random}, τ_inverse: {τ_inverse}"
+    )
@lsorber (Member):

Are these changes necessary? Does the test fail without them? If so, do you know why?

@undo76 (Contributor, Author):

This was added to help diagnose the problem. Sometimes τ_random was larger than τ_search; this has been mitigated by taking a larger sample.
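As a standalone illustration (not code from this PR) of why a larger sample helps: Kendall's τ computed on a small noisy ranking has high variance, so an occasional inversion of the expected ordering is likely, and the variance shrinks as the sample grows.

# Standalone illustration: the spread of Kendall's tau estimates shrinks as
# the number of ranked items grows, making accidental inversions rarer.
import numpy as np
from scipy.stats import kendalltau

rng = np.random.default_rng(42)
for n in (20, 200):
    taus = []
    for _ in range(1000):
        # A noisy ranking that is positively correlated with the true order.
        noisy_scores = np.arange(n) + rng.normal(scale=n / 4, size=n)
        tau, _ = kendalltau(np.arange(n), noisy_scores)
        taus.append(tau)
    print(f"n={n}: mean tau={np.mean(taus):.2f}, std={np.std(taus):.2f}")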

Comment on lines +37 to +42
    # Remove repeated \n to make it more resilient to variations between pdftext versions.
    sentences = [re.sub(r"\n+", "\n", sentence) for sentence in sentences]
    for sentence, expected_sentence in zip(
        sentences[: len(expected_sentences)], expected_sentences, strict=True
    ):
        assert sentence == expected_sentence
@lsorber (Member):

Just to be sure: is it a new version of pdftext that's causing the differences here, and not SaT?

In addition: I'd prefer not to make changes to the expected_sentence, nor to the output of split_sentences. Instead, I would make the assertion itself (on line 42) invariant to the patterns you would consider 'equivalent'.

@undo76 (Contributor, Author):

Yes, pdftext is now returning slightly different strings. I understand what you mean; it makes sense to create a custom assert_similar_sentence.
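A sketch of what that helper might look like, assuming 'equivalent' means equal up to whitespace (the helper name comes from the comment above; the exact normalization rule is an assumption):

import re

def assert_similar_sentence(sentence: str, expected_sentence: str) -> None:
    """Assert that two sentences match up to whitespace differences."""

    def normalize(s: str) -> str:
        # Collapse any run of whitespace (including repeated \n) to one space.
        return re.sub(r"\s+", " ", s).strip()

    assert normalize(sentence) == normalize(expected_sentence), (
        f"{sentence!r} does not match {expected_sentence!r}"
    )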

@@ -1,4 +1,4 @@
-# This file is automatically @generated by Poetry 1.8.5 and should not be changed by hand.
+# This file is automatically @generated by Poetry 1.8.0 and should not be changed by hand.
@lsorber (Member):

The previous lock file was generated with Poetry v1.8.5, but this update was generated with Poetry v1.8.0.

Could you update to the latest patch version of Poetry (v1.8.5) before running poetry lock --no-update to make sure we don't inadvertently introduce any dependency issues?

Comment on lines +37 to +41
 litellm = ">=1.48.4,<1.56.2"
 llama-cpp-python = ">=0.3.2"
 pydantic = ">=2.7.0"
 # Approximate Nearest Neighbors:
-pynndescent = ">=0.5.12"
+pynndescent = ">=0.5.13"
@lsorber (Member):

What I think we want to do here is:

  1. Explicitly upgrade ruff, mypy, and other tools in pyproject.toml.
  2. Then run poetry lock --no-update to avoid updating any dependencies that we don't explicitly upgrade.

Running a full poetry lock would significantly change the dependency set we run the tests against.

In the future, we should test against both the 'oldest' solution to the specification, and the 'newest' solution to the specification. That's something we can do when we migrate from Poetry to uv.
