
Releases: ohmeow/blurr

v.1.0.0

12 Apr 19:16

The official v.1 release of ohmeow-blurr

This is a massive refactoring over the previous iterations of blurr, including namespace modifications that will make it easier for us to add in support for vision, audio, etc. transformers in the future. If you've used any of the previous versions of blurr or the development build we covered in part 2 of the W&B study group, please make sure you read the docs and note the namespace changes.

To get up to speed with how to use this library, check out the W&B x fastai x Hugging Face study group. The docs are your friend and full of examples as well. I'll be working on updating the other examples floating around the internet as I have time.

If you have any questions, please use the hf-fastai channel in the fastai Discord or GitHub issues. As always, any and all PRs are welcome.

0.0.26 release

04 May 02:44
Pre-release

Check out the README for more info.

This release fixes a couple of issues and also includes a few breaking changes. Make sure you update your version of fastai to >= 2.3.1 and your Hugging Face transformers to >= 4.5.x.

Goodbye 2020 release!

01 Jan 03:59
Pre-release
  • Updated the Seq2Seq models to use some of the latest huggingface bits like tokenizer.prepare_seq2seq_batch.
  • Separated out the Seq2Seq and Token Classification metrics into metrics-specific callbacks for a better separation of concerns. As a best practice, you should now pass them as callbacks to fit_one_cycle, etc., rather than attaching them to your Learner.
  • NEW: Translation models are now available in blurr, joining causal language modeling and summarization in our core Seq2Seq stack.
  • NEW: Integration of huggingface's Seq2Seq metrics (rouge, bertscore, meteor, bleu, and sacrebleu). Plenty of info on how to set this up in the docs.
  • NEW: Added default_text_gen_kwargs, a method that given a huggingface config, model, and task (optional), will return the default/recommended kwargs for any text generation models.
  • A lot of code cleanup (e.g., refactored naming and consolidated redundant code into shared classes/methods).
  • More model support and more tests across the board! Check out the docs for more info.
  • Misc. validation improvements and bug fixes.

See the docs for each task for more info!
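The idea behind default_text_gen_kwargs can be pictured as a config-plus-task merge: start from the model config's generation defaults, then layer task-specific overrides on top. The sketch below is illustrative only and is not blurr's actual implementation — the task names, override values, and the plain-dict config argument are all assumptions made for the example (blurr's real helper inspects a Hugging Face config/model object).

```python
# Illustrative sketch only (NOT blurr's actual code): merge a config's
# generation defaults with hypothetical per-task overrides.

# Hypothetical per-task overrides; real values come from Hugging Face configs.
TASK_OVERRIDES = {
    "summarization": {"num_beams": 4, "max_length": 142, "min_length": 56},
    "translation": {"num_beams": 4, "max_length": 300},
}

def default_text_gen_kwargs(config, task=None):
    """Return text-generation kwargs: config defaults, then task overrides."""
    kwargs = {
        "do_sample": config.get("do_sample", False),
        "num_beams": config.get("num_beams", 1),
        "max_length": config.get("max_length", 20),
    }
    if task is not None:
        # Task-specific recommendations win over the generic defaults.
        kwargs.update(TASK_OVERRIDES.get(task, {}))
    return kwargs
```

Here a plain dict stands in for the model config so the example stays self-contained; in blurr the equivalent values are read from the actual Hugging Face config, model, and (optional) task.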

PyTorch 1.7 and fast.ai 2.1.x compliant release

04 Nov 23:36

Makes blurr PyTorch 1.7 and fast.ai 2.1.x compliant.

Added new examples section

Misc. improvements/fixes.

On-the-fly Batch-Time Tokenization Release

25 Sep 22:47

This release simplifies the API and introduces a new on-the-fly tokenization feature whereby all tokenization happens during mini-batch creation. There are several upsides to this approach. First, it gets you training faster. Second, it reduces RAM utilization during the reading of your raw data (esp. nice with very large datasets that would give folks problems on platforms like Colab). And lastly, I believe the approach provides some flexibility to include data augmentation and/or build adversarial models, amongst other things.
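The batch-time approach can be sketched as a collate step: the dataset keeps raw strings, and tokenization plus padding happen only when a mini-batch is assembled, so no tokenized copy of the full corpus ever sits in RAM. This is a minimal illustration of the idea, not blurr's actual code; the whitespace toy_tokenize below stands in for a real Hugging Face tokenizer.

```python
# Minimal sketch of batch-time tokenization (NOT blurr's actual code):
# raw text stays raw until a mini-batch is built.

def toy_tokenize(text, vocab):
    # Hypothetical whitespace tokenizer standing in for a HF tokenizer;
    # assigns new ids on first sight of a token.
    return [vocab.setdefault(tok, len(vocab)) for tok in text.split()]

def batch_tokenize_collate(raw_texts, vocab, pad_id=0):
    """Tokenize and pad a list of raw strings at batch-creation time."""
    ids = [toy_tokenize(t, vocab) for t in raw_texts]
    # Pad only to the longest sequence in THIS batch, not the whole corpus.
    max_len = max(len(seq) for seq in ids)
    return [seq + [pad_id] * (max_len - len(seq)) for seq in ids]

vocab = {"<pad>": 0}
batch = batch_tokenize_collate(["hello world", "hello there friend"], vocab)
```

Because padding length is decided per batch rather than per dataset, memory use scales with the batch, which is the main win described above.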

0.0.12

16 Sep 18:42
Pre-release

This pre-release does tokenization/numericalization the traditional way, as a fastai type transform.