Zarr

Cubed was designed to work seamlessly with Zarr data. The examples below demonstrate using cubed.from_zarr(), cubed.to_zarr() and cubed.store() to read and write Zarr data.

Write to Zarr

We’ll start by creating a small chunked array of random data in Cubed and writing it to Zarr using cubed.to_zarr(). Note that cubed.to_zarr() executes eagerly: the array is computed and written as soon as the call is made.

import cubed
import cubed.random

# 2MB chunks
a = cubed.random.random((5000, 5000), chunks=(500, 500))

# write to Zarr
cubed.to_zarr(a, "a.zarr")

Read from Zarr

We can check that the Zarr store was created by opening it with cubed.from_zarr(), which displays a summary of the array:

cubed.from_zarr("a.zarr")

       Array             Chunk
Bytes  200.0 MB          2.0 MB
Shape  (5000, 5000)      (500, 500)
Count  1 arrays in Plan  100 Chunks
Type   float64           np.ndarray
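The figures in this summary follow directly from the shape, chunk shape and dtype. As a quick sanity check, the arithmetic can be sketched in plain Python, assuming 8 bytes per float64 element and MB meaning 10^6 bytes; the variable names are illustrative only.

```python
from math import prod

shape = (5000, 5000)
chunks = (500, 500)
itemsize = 8  # bytes per float64 element

# Size of one chunk and of the whole array, in bytes.
chunk_nbytes = prod(chunks) * itemsize
array_nbytes = prod(shape) * itemsize

# Number of chunks: ceiling division along each axis, then multiply.
num_chunks = prod(-(-s // c) for s, c in zip(shape, chunks))

print(chunk_nbytes / 1e6, array_nbytes / 1e6, num_chunks)  # 2.0 200.0 100
```

This is also where the "2MB chunks" comment in the code comes from: 500 × 500 × 8 bytes = 2,000,000 bytes.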

Multiple arrays

To write multiple arrays in a single computation, use cubed.store():

import cubed
import cubed.random

# 2MB chunks
a = cubed.random.random((5000, 5000), chunks=(500, 500))
b = cubed.random.random((5000, 5000), chunks=(500, 500))

# write to Zarr
arrays = [a, b]
paths = ["a.zarr", "b.zarr"]
cubed.store(arrays, paths)

Then, to read the Zarr stores back, we call cubed.from_zarr() for each array and apply whatever array operations we like. These steps are lazy: the whole computation is executed only when we call cubed.to_zarr().

import cubed.array_api as xp

# read from Zarr
a = cubed.from_zarr("a.zarr")
b = cubed.from_zarr("b.zarr")

# perform operation
c = xp.add(a, b)

# write to Zarr
cubed.to_zarr(c, store="c.zarr")