Why do tuples use less memory than lists in Python?

A short tutorial on CPython memory internals, with code examples
17 October 2017

Let's create a Python tuple:

>>> a = (1,2,3)
>>> a.__sizeof__()
48

and a list:

>>> b = [1,2,3]
>>> b.__sizeof__()
64

The tuple uses less memory than the list. Let's figure out why.
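
A quick aside: __sizeof__ reports the object's own size, while sys.getsizeof additionally counts the garbage-collector header. Both tuples and lists carry the same header, so the comparison is fair either way:

>>> import sys
>>> sys.getsizeof(a) - a.__sizeof__() == sys.getsizeof(b) - b.__sizeof__()
True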

The main difference: lists are variable-sized, while tuples are fixed-size.

So a tuple can store its elements directly inside its struct, while a list needs a layer of indirection: it stores a pointer to a separately allocated array of elements. On 64-bit systems that pointer is 64 bits, hence 8 bytes.
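
You can see this by sizing small literals one element at a time (numbers from a 64-bit CPython build; exact values vary by version, but the pattern holds):

>>> ().__sizeof__(), (1,).__sizeof__(), (1, 2).__sizeof__()
(24, 32, 40)
>>> [].__sizeof__(), [1].__sizeof__(), [1, 2].__sizeof__()
(40, 48, 56)

Each element adds 8 bytes (one pointer) to both types, yet the list's constant overhead is 16 bytes larger: the 8-byte ob_item pointer just described plus one more 8-byte field, explained next.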

But there's another thing lists do: they over-allocate. Otherwise list.append would always be an O(n) operation; to make it amortized O(1) (much faster), the list over-allocates. But then it has to keep track of both the allocated size and the filled size (a tuple only needs to store one size, because allocated and filled size are always identical). That means each list stores one extra "size" field, which on 64-bit systems is a 64-bit integer, again 8 bytes.

So lists need at least 16 bytes more memory than tuples, plus whatever the over-allocation adds: over-allocating means reserving more space than is currently needed. How much extra space a list carries depends on how you created it and on its append/deletion history:

>>> l = [1,2,3]
>>> l.__sizeof__()
64
>>> l.append(4)  # triggers re-allocation (with over-allocation), because the original list is full
>>> l.__sizeof__()
96

>>> l = []
>>> l.__sizeof__()
40
>>> l.append(1)  # re-allocation with over-allocation
>>> l.__sizeof__()
72
>>> l.append(2)  # no re-alloc
>>> l.append(3)  # no re-alloc
>>> l.__sizeof__()
72
>>> l.append(4)  # still has room, so no over-allocation needed (yet)
>>> l.__sizeof__()
72
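
To watch the over-allocation pattern over a longer run, a small loop can print the size whenever it changes (the exact growth steps are CPython-version-specific; these numbers come from the same build as above):

>>> l = []
>>> last = l.__sizeof__()
>>> for i in range(20):
...     l.append(i)
...     if l.__sizeof__() != last:
...         last = l.__sizeof__()
...         print(len(l), last)
...
1 72
5 104
9 168
17 240

Each jump is a re-allocation; between jumps, append just fills the spare slots.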

Some handy links on this topic:

  • tuple struct
  • list struct

Facebook Releases PyTorch 1.0

The release adds support for major cloud platforms, a C++ interface, and a set of JIT compilers
10 December 2018

Facebook has released a stable version of the PyTorch machine-learning library. This iteration, PyTorch 1.0, adds support for major cloud platforms, a C++ interface, a set of JIT compilers, and various other improvements.

The stable version ships with a set of JIT compilers that remove the code's dependence on the Python interpreter. Model code is transformed into Torch Script, a language built on top of Python. The model remains usable from the Python environment, but it can also be loaded into projects that have nothing to do with the language: the PyTorch developers state that code processed this way can be used from the C++ API.
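
As a minimal sketch of that workflow (the module and file name here are made up for illustration): a model is traced into Torch Script with an example input and serialized; the resulting file can then be loaded outside Python, e.g. via torch::jit::load in the C++ API.

import torch

class Doubler(torch.nn.Module):
    def forward(self, x):
        return x * 2

# Trace the model into Torch Script by recording ops on an example input
traced = torch.jit.trace(Doubler(), torch.randn(3))
traced.save("doubler.pt")  # now loadable without a Python interpreter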

The torch.distributed package and the torch.nn.parallel.DistributedDataParallel module have been completely redesigned. torch.distributed now performs better and works asynchronously with the Gloo, NCCL, and MPI backends.
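
A minimal sketch of the redesigned setup, assuming each process has the usual rendezvous environment variables (MASTER_ADDR, MASTER_PORT, RANK, WORLD_SIZE) set:

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel

# Join the process group; "gloo" could be swapped for "nccl" or "mpi"
dist.init_process_group(backend="gloo", init_method="env://")

# Gradients are averaged across processes during backward()
model = DistributedDataParallel(torch.nn.Linear(10, 10))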

The developers added a C++ frontend to PyTorch 1.0. It contains analogs of the Python interface components, such as torch.nn, torch.optim, and torch.data. According to the creators, the new interface should provide high performance for C++ applications. The C++ API is still experimental, but it can already be used in projects.

To make working with PyTorch 1.0 more convenient, the Torch Hub repository has been created to store pre-trained neural-network models. You can publish your own model using a hubconf.py file, after which any user can download it via the torch.hub.load API.
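
For illustration, a hubconf.py can be as small as this (the entry-point name and architecture are made up):

# hubconf.py at the root of a GitHub repository
import torch

dependencies = ['torch']  # pip packages needed to load the model

def my_model(pretrained=False, **kwargs):
    # Entry point: must return an initialized model
    return torch.nn.Linear(10, 10)  # stand-in for a real architecture

Consuming a published model is then a single call, e.g. with the pytorch/vision resnet18 example given by the developers:

import torch
model = torch.hub.load('pytorch/vision', 'resnet18', pretrained=True)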

Support for C extensions and the torch.utils.trainer module has been removed from the library.

Facebook released a preview version of PyTorch 1.0 at the beginning of October 2018, and within two months the developers brought the framework to a stable state.