Zarr

Cubed was designed to work seamlessly with Zarr data. The examples below demonstrate using cubed.from_zarr(), cubed.to_zarr() and cubed.store() to read and write Zarr data.

Write to Zarr

We’ll start by creating a small chunked array of random data in Cubed and writing it to Zarr with cubed.to_zarr(). Note that to_zarr executes eagerly: the array is computed and written as soon as the call is made.

import cubed
import cubed.random

# 2MB chunks
a = cubed.random.random((5000, 5000), chunks=(500, 500))

# write to Zarr
cubed.to_zarr(a, "a.zarr")

Read from Zarr

We can check that the Zarr file was created by loading it from disk using cubed.from_zarr():

cubed.from_zarr("a.zarr")
       Array              Chunk
Bytes  200.0 MB           2.0 MB
Shape  (5000, 5000)       (500, 500)
Count  1 arrays in Plan   100 Chunks
Type   float64            np.ndarray
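The figures in this summary can be reproduced by hand. The following is a small sketch in plain Python (no Cubed required) that recomputes them from the array shape, the chunk shape, and the 8-byte item size of float64. The helper name chunk_summary is made up for illustration and is not part of Cubed.

```python
import math


def chunk_summary(shape, chunks, itemsize):
    """Return (array_bytes, chunk_bytes, n_chunks) for a regularly chunked array.

    Hypothetical helper for illustration; not part of the Cubed API.
    """
    array_bytes = math.prod(shape) * itemsize
    chunk_bytes = math.prod(chunks) * itemsize
    # Each axis needs ceil(size / chunk) chunks; edge chunks may be smaller.
    n_chunks = math.prod(math.ceil(s / c) for s, c in zip(shape, chunks))
    return array_bytes, chunk_bytes, n_chunks


# float64 values occupy 8 bytes each
array_bytes, chunk_bytes, n_chunks = chunk_summary((5000, 5000), (500, 500), 8)
print(array_bytes / 1e6, "MB in total")   # 200.0 MB in total
print(chunk_bytes / 1e6, "MB per chunk")  # 2.0 MB per chunk
print(n_chunks, "chunks")                 # 100 chunks
```

Here MB means 10^6 bytes, which matches the 200.0 MB and 2.0 MB values shown in the summary.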

Multiple arrays

To write multiple arrays in a single computation, use cubed.store():

import cubed
import cubed.random

# 2MB chunks
a = cubed.random.random((5000, 5000), chunks=(500, 500))
b = cubed.random.random((5000, 5000), chunks=(500, 500))

# write to Zarr
arrays = [a, b]
paths = ["a.zarr", "b.zarr"]
cubed.store(arrays, paths)

To read the arrays back, we call cubed.from_zarr() on each path and apply whatever array operations we like to the results. These steps are lazy: the whole computation is executed only when we call to_zarr.

import cubed.array_api as xp

# read from Zarr
a = cubed.from_zarr("a.zarr")
b = cubed.from_zarr("b.zarr")

# perform operation
c = xp.add(a, b)

# write to Zarr
cubed.to_zarr(c, store="c.zarr")
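The difference between building a computation and executing it can be illustrated without Cubed. The sketch below is a toy model in plain Python, not Cubed's implementation: the Lazy class and the load, add, and write functions are hypothetical names that only mimic the pattern of recording operations first and running them when a write is requested.

```python
class Lazy:
    """Toy deferred value: records how to compute a result without computing it.

    Hypothetical class for illustration; unrelated to Cubed's internals.
    """

    def __init__(self, func, *inputs):
        self.func = func
        self.inputs = inputs

    def compute(self):
        # Recursively evaluate the inputs, then apply this node's function.
        args = [i.compute() if isinstance(i, Lazy) else i for i in self.inputs]
        return self.func(*args)


calls = []  # records when work actually happens


def load(name):
    calls.append(f"load {name}")
    return [1.0, 2.0, 3.0]


def add(x, y):
    calls.append("add")
    return [p + q for p, q in zip(x, y)]


def write(lazy_value, store):
    # Stand-in for an eager write: triggers the whole computation.
    store["result"] = lazy_value.compute()


a = Lazy(load, "a")
b = Lazy(load, "b")
c = Lazy(add, a, b)      # nothing has run yet
assert calls == []

store = {}
write(c, store)          # the whole graph runs here
print(calls)             # ['load a', 'load b', 'add']
print(store["result"])   # [2.0, 4.0, 6.0]
```

In this toy model, as in the example above, reading and adding only describe work; the write step is what triggers it.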