
Unable to pickle functions that are decorated with functools.lru_cache() #178

Open
csadorf opened this issue Jun 20, 2018 · 14 comments

@csadorf

csadorf commented Jun 20, 2018

Observed behavior

The cloudpickle module is unable to pickle functions that are cached via the functools.lru_cache decorator function.

The following minimal example triggers the issue on Python 3.6.5 and cloudpickle version 0.5.3:

# minimal.py
from functools import lru_cache
import cloudpickle

@lru_cache()
def cached_func():
    pass

cloudpickle.dumps(cached_func)

Resulting in the following traceback:

$ python minimal.py
Traceback (most recent call last):
  File "minimal.py", line 8, in <module>
    cloudpickle.dumps(cached_func)
  File "/Users/csadorf/miniconda3/lib/python3.6/site-packages/cloudpickle/cloudpickle.py", line 895, in dumps
    cp.dump(obj)
  File "/Users/csadorf/miniconda3/lib/python3.6/site-packages/cloudpickle/cloudpickle.py", line 268, in dump
    return Pickler.dump(self, obj)
  File "/Users/csadorf/miniconda3/lib/python3.6/pickle.py", line 409, in dump
    self.save(obj)
  File "/Users/csadorf/miniconda3/lib/python3.6/pickle.py", line 507, in save
    self.save_global(obj, rv)
  File "/Users/csadorf/miniconda3/lib/python3.6/site-packages/cloudpickle/cloudpickle.py", line 642, in save_global
    return self.save_dynamic_class(obj)
  File "/Users/csadorf/miniconda3/lib/python3.6/site-packages/cloudpickle/cloudpickle.py", line 497, in save_dynamic_class
    self.save_reduce(tp, (obj.__name__, obj.__bases__, type_kwargs), obj=obj)
AttributeError: 'functools._lru_cache_wrapper' object has no attribute '__bases__'

Versions:

$ python -V
Python 3.6.5 :: Anaconda, Inc.
$ python -c "import cloudpickle; print(cloudpickle.__version__)"
0.5.3

Expected behavior

The cloudpickle module should handle cached functions by either pickling or ignoring the cache, or should at least fail with a descriptive error message.
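As a stopgap, the undecorated function can be serialized instead of the wrapper and re-wrapped on the receiving side. This sketch relies only on the fact that lru_cache sets __wrapped__ via functools.update_wrapper; plain pickle is used here so the example is dependency-free, which requires the inner function to remain importable by name (cloudpickle would also handle functions defined only in __main__):

```python
import pickle
from functools import lru_cache

def compute(x):
    # Kept accessible by name at module level so plain pickle
    # can serialize it by reference.
    return x * 2

cached_func = lru_cache(maxsize=None)(compute)

# Serialize the undecorated function, not the wrapper.
payload = pickle.dumps(cached_func.__wrapped__)

# On the receiving side, re-apply the decorator:
# same behaviour, fresh (empty) cache.
restored = lru_cache(maxsize=None)(pickle.loads(payload))
print(restored(21))  # 42
```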

@ghost

ghost commented Aug 12, 2018

Using dill, the cache is pickled.

@aldanor

aldanor commented Nov 23, 2018

Hit this as well. Wonder if it could be special-cased in cloudpickle, since it's a part of the standard library?

@guyskk

guyskk commented Dec 15, 2018

@aldanor The standard pickle works, so I think this is a cloudpickle-specific issue.

@csadorf
Author

csadorf commented Dec 17, 2018

@guyskk For which Python version did you test that?

@guyskk

guyskk commented Dec 18, 2018

@csadorf python 3.6

@ogrisel
Contributor

ogrisel commented Feb 21, 2019

The standard pickle does not support pickling functions defined in the __main__ module, so a direct comparison is not necessarily possible. But I agree it's a bug.
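guyskk's observation can be reproduced with the standard library alone: since Python 3.6 the C wrapper defines a __reduce__ that returns its qualified name, so pickle stores a by-name reference rather than the function or its cache, and the round trip only succeeds where that name can be resolved:

```python
import pickle
from functools import lru_cache

@lru_cache()
def cached_func(x):
    return x + 1

# The wrapper's __reduce__ returns its qualified name, so pickle
# stores a reference to the name rather than the cache contents.
data = pickle.dumps(cached_func)
restored = pickle.loads(data)
print(restored is cached_func)  # True: resolved by name in this process
```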

@pierreglaser
Member

Yes, functools._lru_cache_wrapper actually has a __reduce__ method, so it ends up in save_global to be pickled by attribute. This comes from these 3 lines in pickle.py that our patched save_global did not consider.

Pickling an actual cache raises a few problems:

  • First, the current version of lru_cache relies on the C implementation of _lru_cache_wrapper, which overrides the pure-Python one and does not expose the cache internals. If we wanted to pickle a cache, we would have to make CPython expose a "pure Python" _lru_cache.
  • Second, even the pure-Python version relies on objects such as RLock that are not picklable for now, even by cloudpickle.
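The first point can be seen directly: the C wrapper exposes only aggregate statistics, not the cached key/value entries (a small demonstration):

```python
from functools import lru_cache

@lru_cache(maxsize=2)
def square(x):
    return x * x

square(2)
square(2)  # served from the cache
square(3)

# Only aggregate statistics are exposed by the C wrapper:
print(square.cache_info())  # CacheInfo(hits=1, misses=2, maxsize=2, currsize=2)

# The underlying key -> value mapping is not reachable from Python,
# which is why a reducer cannot simply serialize the cache contents.
print(hasattr(square, "cache"))  # False
```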

@ogrisel
Contributor

ogrisel commented Feb 22, 2019

I believe we should not try to pickle the cache contents anyway, just the fact that this function is wrapped by an lru_cache wrapper.

@shenker

shenker commented Mar 21, 2019

This would be very useful for me. Any idea where to start?

@pcmoritz
Contributor

I agree that just pickling the information that the function was LRU wrapped without the cache so far seems to be the way to go forward here. Is there an easy way to do this?

@ssanderson

I took a stab at implementing this here: #309.

Unfortunately, I don't think there's a way for us to properly preserve the typed parameter for lru_caches in CPython, so the implementation in that PR silently sets typed to the default value of False on deserialization. I think to support this properly we'd have to propose an upstream change to expose the typed value as a read-only attribute of lru_cached functions.
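For reference, newer CPython versions expose exactly such a read-only accessor: since Python 3.9, lru_cache wrappers provide cache_parameters(), which reports typed as well as maxsize:

```python
import functools

@functools.lru_cache(maxsize=32, typed=True)
def f(x):
    return x

# cache_parameters() was added in Python 3.9 and reports both
# arguments the wrapper was created with, including `typed`.
print(f.cache_parameters())  # {'maxsize': 32, 'typed': True}
```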

@orf

orf commented Nov 22, 2021

FYI you can fix this with a snippet like so:

import functools
import copyreg

# The wrapper type is not exposed publicly, so recover it from an instance.
lru_cache_type = type(functools.lru_cache()(lambda: None))

def new_lru_cache(func, cache_kwargs):
    # Re-wrap the function on unpickling; the cache starts out empty.
    return functools.lru_cache(**cache_kwargs)(func)

def _pickle_lru_cache(obj: lru_cache_type):
    params = {}
    # cache_parameters() is only available on Python 3.9+.
    if hasattr(obj, "cache_parameters"):
        params = obj.cache_parameters()
    return new_lru_cache, (obj.__wrapped__, params)


copyreg.pickle(lru_cache_type, _pickle_lru_cache)
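For completeness, a self-contained round trip using this copyreg approach (note the restored wrapper is a new object with an empty cache, and the inner function must still be picklable by plain pickle, i.e. importable by name):

```python
import copyreg
import functools
import pickle

lru_cache_type = type(functools.lru_cache()(lambda: None))

def new_lru_cache(func, cache_kwargs):
    return functools.lru_cache(**cache_kwargs)(func)

def _pickle_lru_cache(obj):
    # cache_parameters() exists on Python 3.9+; fall back to defaults.
    params = obj.cache_parameters() if hasattr(obj, "cache_parameters") else {}
    return new_lru_cache, (obj.__wrapped__, params)

copyreg.pickle(lru_cache_type, _pickle_lru_cache)

def double(x):
    return x * 2

cached = functools.lru_cache(maxsize=None)(double)
cached(5)  # populate the cache

restored = pickle.loads(pickle.dumps(cached))
print(restored is cached)              # False: a new wrapper object
print(restored.cache_info().currsize)  # 0: the cache is not carried over
```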

@garrettwrong

Good morning, I think I just encountered this issue when attempting to use ray multiprocessing on some code recently decorated with lru_cache. I will try to make a minimal bug report for ray in their repo... but if it circles back to this issue, is it stalled/blocked?

Thanks!

@csadorf
Author

csadorf commented Dec 14, 2022

@garrettwrong There is a WIP-PR for this referenced earlier (#178 (comment)), but this is indeed stalled.
