Merged
11 changes: 10 additions & 1 deletion src/boutdata/data.py
@@ -1242,9 +1242,10 @@ def _init_parallel(self):
             proc_list = tuple(p for p in range(filenum, filenum + files_per_proc[i]))
             filenum = filenum + files_per_proc[i]
             worker = Process(
-                target=self._worker_function,
+                target=self._clean_for_mp()._worker_function,
                 args=(child_connection, proc_list, self._shared_buffer_raw),
             )
+
             worker.start()
             self._workers.append((worker, parent_connection))

@@ -1628,6 +1629,14 @@ def _collect_parallel(self, varname):
             self._shared_buffer[global_slices].copy(), attributes=var_attributes
         )
 
+    def _clean_for_mp(self):
+        new = object.__new__(BoutOutputs)
+        new.__dict__ = self.__dict__.copy()
+        new._file0 = None
+        new._root_file = None
+        new._workers = None
+        return new
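
The diff does not say why the worker target is taken from a cleaned copy. One plausible reading is that a bound method carries its instance with it, so under a start method that pickles the `Process` arguments (such as `spawn`), every attribute of `self` has to be picklable. The sketch below illustrates that reading with a hypothetical `Reader` class, where a `threading.Lock` stands in for an open file handle. None of these names come from boutdata.

```python
import pickle
import threading


class Reader:
    """Hypothetical stand-in for a class that holds unpicklable handles."""

    def __init__(self):
        self.paths = ["run.0.nc", "run.1.nc"]
        self._handle = threading.Lock()  # a lock cannot be pickled

    def work(self, index):
        return self.paths[index]

    def cleaned(self):
        # Same pattern as _clean_for_mp: skip __init__, shallow-copy the
        # attribute dict, then blank the attribute that blocks pickling.
        new = object.__new__(type(self))
        new.__dict__ = self.__dict__.copy()
        new._handle = None
        return new


def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds."""
    try:
        pickle.dumps(obj)
    except (TypeError, AttributeError, pickle.PicklingError):
        return False
    return True


if __name__ == "__main__":
    reader = Reader()
    print(can_pickle(reader.work))            # bound method drags reader along
    print(can_pickle(reader.cleaned().work))  # cleaned copy has no lock
```

The copy is shallow, so mutable attributes such as `paths` are shared between the original and the cleaned object until one side is pickled.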
Member

Does __getstate__ allow us to do this without having to call _clean_for_mp()? Then it might just be:

Suggested change
-    def _clean_for_mp(self):
-        new = object.__new__(BoutOutputs)
-        new.__dict__ = self.__dict__.copy()
-        new._file0 = None
-        new._root_file = None
-        new._workers = None
-        return new
+    def __getstate__(self):
+        new = self.__dict__.copy()
+        new.pop("_file0")
+        new.pop("_root_file")
+        new.pop("_workers")
+        return new
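
A minimal sketch of how such a `__getstate__` behaves, using a hypothetical `Outputs` class rather than the real `BoutOutputs`. It differs from the suggestion in one assumed detail: `pop` is given a default so that a missing attribute does not raise `KeyError`.

```python
import pickle
import threading


class Outputs:
    """Hypothetical stand-in; not the real BoutOutputs class."""

    # Attribute names taken from the suggestion above.
    _DROPPED = ("_file0", "_root_file", "_workers")

    def __init__(self):
        self.varnames = ["n", "T"]
        self._file0 = threading.Lock()      # stands in for an open file
        self._root_file = threading.Lock()  # stands in for an open file
        self._workers = []

    def __getstate__(self):
        # pickle calls this to obtain the state to serialize.
        state = self.__dict__.copy()
        for name in self._DROPPED:
            # A default avoids KeyError if the attribute was never set.
            state.pop(name, None)
        return state


if __name__ == "__main__":
    restored = pickle.loads(pickle.dumps(Outputs()))
    # The dropped names are absent on the restored object, not None.
    print(sorted(vars(restored)))
```

Unlike `_clean_for_mp`, which sets the three attributes to `None`, popping them means the unpickled object does not have them at all, so any later access raises `AttributeError` unless something recreates them.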

Contributor Author

The question is whether users expect to be able to pickle an object and still have _file0, _root_file and _workers afterwards.
How does BoutOutputs behave if those are not set?

It is probably fine to allow BoutOutputs to be pickled. If someone later wants to unpickle it and it is broken for their use case, we can add the fixes to __setstate__.

I will update the PR, thanks for the hints 👍
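
For reference, a `__setstate__` fix of the kind mentioned above could look roughly like this. It is a sketch on a hypothetical `Outputs` class. Restoring the dropped attributes as `None` is an assumption, and the real class may need to reopen files instead.

```python
import threading


class Outputs:
    """Hypothetical stand-in; not the real BoutOutputs class."""

    _DROPPED = ("_file0", "_root_file", "_workers")

    def __init__(self):
        self.varnames = ["n", "T"]
        self._file0 = threading.Lock()      # stands in for an open file
        self._root_file = threading.Lock()  # stands in for an open file
        self._workers = []

    def __getstate__(self):
        # Serialize everything except the dropped attributes.
        return {k: v for k, v in self.__dict__.items() if k not in self._DROPPED}

    def __setstate__(self, state):
        # pickle calls this on a fresh instance, without running __init__.
        self.__dict__.update(state)
        for name in self._DROPPED:
            # Recreate the dropped names as None so attribute access works.
            self.__dict__.setdefault(name, None)
```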

Member

I'm not sure, but I agree we can cross that bridge when we come to it :)


     def _worker_function(self, connection, proc_list, shared_buffer_raw):
         data_files = [DataFile(self._file_list[i]) for i in proc_list]
         dim_sizes = tuple(self.grid_info["sizes"][d] for d in ("t", "x", "y", "z"))