
Commit 03cccaa

Vincent Moens authored and pytorchmergebot committed
Doc: Rewrite the storage.rst file to emphasize untyped storages (pytorch#140145)
Pull Request resolved: pytorch#140145
Approved by: https://github.com/janeyx99
1 parent 1a8752b commit 03cccaa

File tree: 1 file changed (+128 -10 lines)

docs/source/storage.rst

Lines changed: 128 additions & 10 deletions
@@ -1,8 +1,134 @@
torch.Storage
=============

In PyTorch, a regular tensor is a multi-dimensional array that is defined by the following components:

- Storage: The actual data of the tensor, stored as a contiguous, one-dimensional array of bytes.
- ``dtype``: The data type of the elements in the tensor, such as torch.float32 or torch.int64.
- ``shape``: A tuple indicating the size of the tensor in each dimension.
- Stride: The step size needed to move from one element to the next in each dimension.
- Offset: The starting point in the storage from which the tensor data begins. This will usually be 0 for newly
  created tensors.

These components together define the structure and data of a tensor, with the storage holding the
actual data and the rest serving as metadata.
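
Each of these components can be inspected directly. The following is a minimal sketch, assuming a freshly
created 2x3 ``float32`` tensor (the exact values below follow from that choice):

>>> import torch
>>> t = torch.arange(6, dtype=torch.float32).reshape(2, 3)
>>> t.untyped_storage().nbytes()  # storage: 6 elements of 4 bytes each
24
>>> t.dtype
torch.float32
>>> t.shape
torch.Size([2, 3])
>>> t.stride()  # move 3 elements for the next row, 1 for the next column
(3, 1)
>>> t.storage_offset()
0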

Untyped Storage API
-------------------

A :class:`torch.UntypedStorage` is a contiguous, one-dimensional array of elements. Its length is equal to the number of
bytes of the tensor. The storage serves as the underlying data container for tensors.
In general, a tensor created in PyTorch using regular constructors such as :func:`~torch.zeros`, :func:`~torch.zeros_like`
or :func:`~torch.Tensor.new_zeros` will produce tensors where there is a one-to-one correspondence between the tensor
storage and the tensor itself.

However, a storage is allowed to be shared by multiple tensors.
For instance, any view of a tensor (obtained through :meth:`~torch.Tensor.view` or some, but not all, kinds of indexing
like integers and slices) will point to the same underlying storage as the original tensor.
When serializing and deserializing tensors that share a common storage, the relationship is preserved, and the tensors
continue to point to the same storage. Interestingly, deserializing multiple tensors that point to a single storage
can be faster than deserializing multiple independent tensors.
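
A small sketch of this sharing behaviour, using a view created through :meth:`~torch.Tensor.view`:

>>> t = torch.arange(4)
>>> v = t.view(2, 2)  # a view reuses the storage of the original tensor
>>> v.untyped_storage().data_ptr() == t.untyped_storage().data_ptr()
True
>>> t[0] = 42  # writes through the shared storage are visible in the view
>>> v[0, 0]
tensor(42)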

A tensor storage can be accessed through the :meth:`~torch.Tensor.untyped_storage` method. This will return an object of
type :class:`torch.UntypedStorage`.
Fortunately, storages have a unique identifier that can be accessed through the :meth:`torch.UntypedStorage.data_ptr` method.
In regular settings, two tensors with the same data storage will have the same storage ``data_ptr``.
However, a tensor can point to two separate storages, one for its data attribute and another for its grad
attribute. Each will require a ``data_ptr()`` of its own. In general, there is no guarantee that a
:meth:`torch.Tensor.data_ptr` and :meth:`torch.UntypedStorage.data_ptr` match and this should not be assumed to be true.
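
As an illustrative sketch, a sliced view shares its base tensor's storage, yet its own ``data_ptr()`` points
past the start of that storage (the 8-byte offset below assumes a ``float32`` tensor sliced at index 2):

>>> t = torch.arange(4, dtype=torch.float32)
>>> u = t[2:]  # a view starting 2 elements into the shared storage
>>> u.untyped_storage().data_ptr() == t.untyped_storage().data_ptr()
True
>>> u.data_ptr() - u.untyped_storage().data_ptr()  # 2 elements * 4 bytes
8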

Untyped storages are somewhat independent of the tensors that are built on them. Practically, this means that tensors
with different dtypes or shapes can point to the same storage.
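
For instance, :meth:`~torch.Tensor.view` also accepts a ``dtype`` argument, producing a tensor that reinterprets
the same bytes. A sketch, assuming a ``float32`` input (the bit pattern of 1.0 reads as 1065353216 when
reinterpreted as ``int32``):

>>> t = torch.ones(3, dtype=torch.float32)
>>> i = t.view(torch.int32)  # same storage, different dtype
>>> i.untyped_storage().data_ptr() == t.untyped_storage().data_ptr()
True
>>> i
tensor([1065353216, 1065353216, 1065353216], dtype=torch.int32)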

It also implies that a tensor storage can be changed, as the following example shows:

>>> t = torch.ones(3)
>>> s0 = t.untyped_storage()
>>> s0
 0
 0
 128
 63
 0
 0
 128
 63
 0
 0
 128
 63
[torch.storage.UntypedStorage(device=cpu) of size 12]
>>> s1 = s0.clone()
>>> s1.fill_(0)
 0
 0
 0
 0
 0
 0
 0
 0
 0
 0
 0
 0
[torch.storage.UntypedStorage(device=cpu) of size 12]
>>> # Fill the tensor with a zeroed storage
>>> t.set_(s1, storage_offset=t.storage_offset(), stride=t.stride(), size=t.size())
tensor([0., 0., 0.])

.. warning::
    Please note that directly modifying a tensor's storage as shown in this example is not a recommended practice.
    This low-level manipulation is illustrated solely for educational purposes, to demonstrate the relationship between
    tensors and their underlying storages. In general, it's more efficient and safer to use standard ``torch.Tensor``
    methods, such as :meth:`~torch.Tensor.clone` and :meth:`~torch.Tensor.fill_`, to achieve the same results.
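
For reference, the tensor-level equivalent of the example above is a single, safer call:

>>> t = torch.ones(3)
>>> t.fill_(0)  # no storage manipulation required
tensor([0., 0., 0.])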

Other than ``data_ptr``, untyped storages also have other attributes such as :attr:`~torch.UntypedStorage.filename`
(in case the storage points to a file on disk), :attr:`~torch.UntypedStorage.device` or
:attr:`~torch.UntypedStorage.is_cuda` for device checks. A storage can also be manipulated in-place or
out-of-place with methods like :attr:`~torch.UntypedStorage.copy_`, :attr:`~torch.UntypedStorage.fill_` or
:attr:`~torch.UntypedStorage.pin_memory`. For more information, check the API
reference below. Keep in mind that modifying storages is a low-level API and comes with risks!
Most of these APIs also exist on the tensor level: if present, they should be prioritized over their storage
counterparts.
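
A short sketch of these attributes on a plain in-memory CPU storage (a storage created this way is not
backed by a file, hence the ``None`` filename):

>>> s = torch.ones(3).untyped_storage()
>>> s.device
device(type='cpu')
>>> s.is_cuda
False
>>> s.filename is None
True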

Special cases
-------------

We mentioned that a tensor with a non-None ``grad`` attribute actually contains two pieces of data.
In this case, :meth:`~torch.Tensor.untyped_storage` will return the storage of the :attr:`~torch.Tensor.data` attribute,
whereas the storage of the gradient can be obtained through ``tensor.grad.untyped_storage()``.

>>> t = torch.zeros(3, requires_grad=True)
>>> t.sum().backward()
>>> assert list(t.untyped_storage()) == [0] * 12  # the storage of the tensor is just 0s
>>> assert list(t.grad.untyped_storage()) != [0] * 12  # the storage of the gradient isn't

There are also special cases where tensors do not have a typical storage, or have no storage at all:

- Tensors on the ``"meta"`` device: Tensors on the ``"meta"`` device are used for shape inference
  and do not hold actual data (see the sketch after this list).
- Fake tensors: Another internal tool used by PyTorch's compiler is
  `FakeTensor <https://pytorch.org/docs/stable/torch.compiler_fake_tensor.html>`_, which is based on a similar idea.
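
A minimal sketch of the ``"meta"`` device behaviour: shape and dtype metadata are tracked and propagate through
operations, but no actual data is allocated or printed:

>>> m = torch.empty(3, device="meta")
>>> m.shape, m.dtype
(torch.Size([3]), torch.float32)
>>> m + 1  # shapes propagate without any data being computed
tensor(..., device='meta', size=(3,))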

Tensor subclasses or tensor-like objects can also display unusual behaviours. In general, we do not
expect many use cases to require operating at the Storage level!

.. autoclass:: torch.UntypedStorage
    :members:
    :undoc-members:
    :inherited-members:

Legacy Typed Storage
--------------------

.. warning::
    For historical context, PyTorch previously used typed storage classes, which are
    now deprecated and should be avoided. The following details this API in case you
    encounter it, although its usage is highly discouraged.
    All storage classes except for :class:`torch.UntypedStorage` will be removed
    in the future, and :class:`torch.UntypedStorage` will be used in all cases.

:class:`torch.Storage` is an alias for the storage class that corresponds with
the default data type (:func:`torch.get_default_dtype()`). For example, if the
default data type is :attr:`torch.float`, :class:`torch.Storage` resolves to
:class:`torch.FloatStorage`.
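
A brief sketch of the legacy API (depending on your PyTorch version, calling :meth:`~torch.Tensor.storage` may
emit a deprecation warning, which is precisely why :class:`torch.UntypedStorage` should be preferred; the exact
type shown below is also version-dependent):

>>> s = torch.ones(3).storage()  # legacy typed storage, deprecated
>>> type(s)
<class 'torch.storage.TypedStorage'>
>>> s.dtype
torch.float32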
@@ -22,20 +148,12 @@ holds the data as an untyped array of bytes.
Every strided :class:`torch.Tensor` contains a :class:`torch.TypedStorage`,
which stores all of the data that the :class:`torch.Tensor` views.

.. autoclass:: torch.TypedStorage
    :members:
    :undoc-members:
    :inherited-members:

.. autoclass:: torch.DoubleStorage
    :members:
    :undoc-members:
