Add backward compatibility with v0 #1518

Merged 13 commits on Apr 10, 2025

Conversation

RobinPicard
Contributor

@RobinPicard RobinPicard commented Mar 28, 2025

The objective of this PR is to make the v1 of Outlines backward compatible with the current v0.

We want users to still be able to run their code as long as they were using the regular high-level API (some objects have been deleted and are not supported anymore). For the general approach, I tried to do the following:

  • modify the v1 code as little as possible (notable exception is the OpenAI model)
  • keep the legacy code as separate as possible from the v1 code (put in a dedicated v0_legacy directory)
  • start using the v1 code as soon as possible in the legacy objects

Things this PR does:

  • Restore the model loading functions from v0 (they return a regular v1 model instance)
  • Update the OpenAI model to support the __init__ signatures of both v0 and v1. When an instance is initialized with the v0 signature, its methods delegate to a legacy_instance attribute that implements the legacy interface
  • Restore the generate functions (text, regex, json...): they now return a GeneratorV0Adapter that stores a v1 generator while providing the expected interface of v0
  • Add warnings for everything deprecated
  • Add tests for all restored objects
  • Fix little issues in v1 code that were encountered along the way
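
To illustrate the adapter approach described in the list above, here is a minimal, self-contained sketch. It does not reproduce the actual Outlines code: only the name GeneratorV0Adapter comes from this PR, while `GeneratorV1`, `legacy_generate_text`, and the keyword arguments are hypothetical stand-ins chosen for the example.

```python
import warnings


class GeneratorV1:
    """Hypothetical stand-in for a v1-style generator."""

    def __init__(self, model, output_type):
        self.model = model
        self.output_type = output_type

    def __call__(self, prompt, **inference_kwargs):
        # In this sketch the "model" is any callable taking these arguments.
        return self.model(prompt, self.output_type, **inference_kwargs)


class GeneratorV0Adapter:
    """Stores a v1 generator and exposes a v0-looking call signature.

    The parameter names below are assumptions made for illustration.
    """

    def __init__(self, v1_generator):
        self.v1_generator = v1_generator

    def __call__(self, prompts, max_tokens=None, stop_at=None):
        # Forward only the arguments the caller actually provided.
        kwargs = {}
        if max_tokens is not None:
            kwargs["max_tokens"] = max_tokens
        if stop_at is not None:
            kwargs["stop_at"] = stop_at
        return self.v1_generator(prompts, **kwargs)


def legacy_generate_text(model):
    """Hypothetical restored v0 entry point: warn, then wrap v1 code."""
    warnings.warn(
        "legacy_generate_text is deprecated; build a v1 generator instead.",
        DeprecationWarning,
        stacklevel=2,
    )
    return GeneratorV0Adapter(GeneratorV1(model, str))
```

The point of the design is that the legacy entry point contains no generation logic of its own: it warns, builds the v1 object, and translates arguments.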

I have not tested the exllamav2 model yet, as it requires a GPU. If someone could run its tests, that would be nice.

@RobinPicard RobinPicard force-pushed the add_backward_compatibility branch 2 times, most recently from c318b9b to a96ab23 Compare March 30, 2025 12:23
@RobinPicard RobinPicard force-pushed the add_backward_compatibility branch from a96ab23 to 7c8bbd9 Compare March 31, 2025 13:49
@RobinPicard RobinPicard marked this pull request as ready for review March 31, 2025 14:03
@RobinPicard RobinPicard requested a review from rlouf March 31, 2025 14:03
@rlouf
Member

rlouf commented Apr 2, 2025

I like the thorough explanation in the warning messages. I'm not sure what is going on with the coverage: is everything tested? Also, before merging we should do a smoke test to make sure the deeplearning.ai notebooks can run with this branch.

@cpfiffer
Contributor

cpfiffer commented Apr 3, 2025

First -- I have no additional comments. Remi mentioned all the ones I would have. I tried a few examples and the deprecation warnings are comprehensive and very clear about what should be changed. Extremely good work here!

I'm also running smoke tests on the DLAI notebooks. The warnings are actually fine because all the notebooks disable warning printouts, so everyone should be able to run things as normal if they copy the code over to a v1 outlines install. Otherwise, the course pins outlines==0.2.1.
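
As a rough illustration of why suppressed warnings leave legacy calls working, here is a small sketch using only the standard `warnings` module. How the notebooks actually disable warnings is not stated in this thread, so the filter call is an assumption, and `legacy_call` is a hypothetical stand-in for any deprecated function.

```python
import warnings


def legacy_call():
    """Hypothetical deprecated function that still returns a result."""
    warnings.warn("legacy_call is deprecated", DeprecationWarning, stacklevel=2)
    return "result"


# Inside this block every warning is ignored, so nothing is recorded or
# printed, while the deprecated call itself still runs normally.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("ignore")
    value = legacy_call()
```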

The code

generator = outlines.generate.json(
    model, 
    Person,
    sampler = outlines.samplers.greedy()
)

yields the error

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
Cell In[4], line 1
----> 1 from utils import track_logits
      3 generator = outlines.generate.json(
      4     model, 
      5     Person,
      6     sampler = outlines.samplers.greedy()
      7 )
      9 # Add tools to track token probabilities as they are generated

File [~/dottxt/outlines/SC-DotTxt-Outlines-C1/L4/utils.py:16](http://server.languid.ai:8080/lab/tree/SC-DotTxt-Outlines-C1/L4/SC-DotTxt-Outlines-C1/L4/utils.py#line=15)
     13 from numpy.typing import NDArray
     14 import matplotlib.pyplot as plt
---> 16 from outlines.processors.base_logits_processor import OutlinesLogitsProcessor, Array
     18 if TYPE_CHECKING:
     19     from outlines.generate import Generator

ImportError: cannot import name 'Array' from 'outlines.processors.base_logits_processor' ([/home/cameron/dottxt/outlines/outlines/processors/base_logits_processor.py](http://server.languid.ai:8080/lab/tree/SC-DotTxt-Outlines-C1/L4/outlines/processors/base_logits_processor.py))

@RobinPicard RobinPicard force-pushed the add_backward_compatibility branch from d58a35f to 2aeca4a Compare April 8, 2025 14:45
@RobinPicard RobinPicard force-pushed the add_backward_compatibility branch from 2aeca4a to 8b88746 Compare April 8, 2025 15:02
@RobinPicard
Contributor Author

I've added a lot of tests to increase our coverage.

@cpfiffer
Contributor

cpfiffer commented Apr 8, 2025

The exllamav2 tests require nvcc>11, which is a pain to upgrade. I may have to defer it to another time.

@cpfiffer
Contributor

cpfiffer commented Apr 8, 2025

Update on the ImportError reported above (cannot import name 'Array' from 'outlines.processors.base_logits_processor'):

This is due to some custom code I have in the DeepLearning.ai notebooks that does a bunch of weird stuff to the logit processor. This code is very much a bandaid and wasn't really intended to be production code.

That code was sort of formalized here, but it needs more work on the interface and shifting it to use the new v1 interface.

We might be able to just let this one go, especially because it's more of a perk feature than anything else. I flagged it as experimental in the videos so I'm less worried.
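
For anyone who does want the notebook helper to import on both versions, one possible workaround is a guarded import. This is only a sketch, assuming `Array` was used purely as a type annotation in the custom code; the fallback to `typing.Any` and the `describe` helper are illustrative choices, not something decided in this thread.

```python
from typing import Any

try:
    # Import path taken from the traceback in this thread.
    from outlines.processors.base_logits_processor import Array
except ImportError:
    # Assumed fallback: treat Array as an unconstrained type alias when the
    # name (or the package) is unavailable.
    Array = Any


def describe(logits: Array) -> str:
    """Hypothetical helper showing Array used only as an annotation."""
    return type(logits).__name__
```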

@cpfiffer
Contributor

cpfiffer commented Apr 8, 2025

The next error I found is this one here, due to changes in the regex DSL:

from outlines.types import sentence, digit
from outlines.types.dsl import to_regex

# Write between 1-3 Sentences
reasoning = "Reasoning: " + sentence.repeat(1,2)
# Answer in 1-4 digits
answer = "So the answer is: " + digit.repeat(1,4)

to_regex(reasoning + "\n" + answer)

Error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[18], line 5
      2 from outlines.types.dsl import to_regex
      4 # Write between 1-3 Sentences
----> 5 reasoning = "Reasoning: " + sentence.repeat(1,2)
      6 # Answer in 1-4 digits
      7 answer = "So the answer is: " + digit.repeat(1,4)

AttributeError: 'Regex' object has no attribute 'repeat'
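
For reference, the intent of the snippet above can be expressed with the standard `re` module, independent of the Outlines DSL. This is a sketch only: the `SENTENCE` and `DIGIT` patterns and the `repeat` helper are assumptions written for this example and are not the definitions used by `outlines.types`.

```python
import re

# Assumed patterns for illustration; not the ones shipped in outlines.types.
SENTENCE = r"[A-Z][^.!?\n]*[.!?]"
DIGIT = r"\d"


def repeat(pattern: str, min_count: int, max_count: int) -> str:
    """Wrap a pattern in a non-capturing group with a {min,max} quantifier."""
    return f"(?:{pattern}){{{min_count},{max_count}}}"


# Same structure as the notebook code: sentences repeated (1, 2) times,
# then an answer of 1 to 4 digits on the next line.
reasoning = re.escape("Reasoning: ") + repeat(SENTENCE, 1, 2)
answer = re.escape("So the answer is: ") + repeat(DIGIT, 1, 4)
full_pattern = re.compile(reasoning + r"\n" + answer)
```

A string such as "Reasoning: It adds up.\nSo the answer is: 42" fully matches `full_pattern`, while an answer with five digits does not.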

@RobinPicard
Contributor Author

This is a change that's already in the main branch. I'll open a different PR to address it. Is this code using the repeat method also from the DeepLearning.ai notebooks?

@cpfiffer
Contributor

cpfiffer commented Apr 8, 2025

Yeah, that stuff is used in DLAI.

@RobinPicard RobinPicard merged commit 902402a into dottxt-ai:v1.0 Apr 10, 2025
5 checks passed
@rlouf rlouf linked an issue Apr 15, 2025 that may be closed by this pull request
Development

Successfully merging this pull request may close these issues.

Make v1.0 backward compatible
3 participants