Proteomics Tutorial with SpatialData blobs
This tutorial shows two equivalent ways to build a Vitessce proteomics config:
proteomics_from_split_sources: image, labels, and table are passed as separate paths.
proteomics_from_spatialdata: layers are resolved from a SpatialData object and can use coordinate systems.
Both workflows use the same underlying data so you can compare behavior directly.
%load_ext autoreload
%autoreload 2
import harpy_vitessce as hpv
import tempfile
from pathlib import Path
tmp_dir = Path(tempfile.mkdtemp(prefix="spatialdata_blobs"))
import scanpy as sc
from spatialdata.datasets import blobs
from spatialdata.models import TableModel
sdata = blobs()
adata = sdata["table"]
# add leiden clusters using a dummy scanpy pipeline
sc.pp.scale(adata, max_value=10)
sc.pp.pca(
    adata,
    n_comps=2,
    svd_solver="arpack",
)
sc.pp.neighbors(
    adata,
    use_rep="X_pca",
    n_neighbors=10,
)
sc.tl.leiden(adata, resolution=0.6, key_added="leiden")
sc.tl.umap(adata, min_dist=0.3)
# Uncomment these lines to convince yourself that Vitessce (when using SpatialDataWrapper)
# falls back to the table index if the table has no instance/region key.
# import uuid
# del adata.obs["instance_id"]
# del adata.uns[TableModel.ATTRS_KEY]
# adata.obs.index = [f"segmentation_{uuid.uuid4()}" for _ in range(len(adata.obs))]
spatialdata_path = tmp_dir / "sdata.zarr"
sdata.write(
    spatialdata_path,
    overwrite=True,
)
Why index alignment matters for split sources
proteomics_from_split_sources uses separate wrappers for image/labels/table.
For cell-level linking, segmentation IDs in labels_source should match
the AnnData observation IDs (adata.obs_names, i.e. the table index).
If they do not match, selections in spatial and feature views cannot be synchronized correctly.
import dask.array as da
display(sdata["table"].obs.index)
# should match the label IDs in the segmentation mask (0 is the background label)
display(da.unique(sdata["blobs_labels"].data).compute())
Index(['1', '2', '3', '4', '5', '6', '8', '9', '10', '11', '12', '13', '15',
'16', '17', '18', '19', '20', '22', '23', '24', '25', '26', '27', '29',
'30'],
dtype='object')
array([ 0, 1, 2, 3, 4, 5, 6, 8, 9, 10, 11, 12, 13, 15, 16, 17, 18,
19, 20, 22, 23, 24, 25, 26, 27, 29, 30], dtype=int16)
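The visual comparison above can also be automated. The following is a minimal sketch, assuming that label 0 is background and that the table index holds the label IDs as strings, as in the output above. `check_alignment` is a hypothetical helper written for this illustration and is not part of harpy_vitessce.

```python
def check_alignment(label_ids, obs_names, background=0):
    """Return (labels without a table row, table rows without a label)."""
    # Compare as strings because the table index stores IDs as strings.
    labels = {str(i) for i in label_ids if i != background}
    names = set(obs_names)
    return sorted(labels - names, key=int), sorted(names - labels, key=int)


# Values copied from the output above.
label_ids = [0, 1, 2, 3, 4, 5, 6, 8, 9, 10, 11, 12, 13, 15, 16, 17, 18,
             19, 20, 22, 23, 24, 25, 26, 27, 29, 30]
obs_names = ['1', '2', '3', '4', '5', '6', '8', '9', '10', '11', '12', '13',
             '15', '16', '17', '18', '19', '20', '22', '23', '24', '25', '26',
             '27', '29', '30']

missing_in_table, missing_in_labels = check_alignment(label_ids, obs_names)
print(missing_in_table, missing_in_labels)
```

Two empty lists mean every non-background label has a table row and vice versa. Any non-empty list names the IDs that would fail to link between the spatial and feature views.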
Build a Vitessce config from split sources
This call reads image, labels, and table from separate paths and builds a linked
multi-view layout (spatial view + marker/cluster views). This is useful when you want to store the AnnData table as a chunked Zarr store (for example, via adata.write_zarr(..., chunks=...)), or when your image, labels, and table are stored in different locations.
from IPython.display import HTML, display
vc = hpv.proteomics_from_split_sources(
    # the image must have dimensions ("c", "y", "x")
    image_source=spatialdata_path / "images" / "blobs_multiscale_image",
    # the segmentation mask must have dimensions ("y", "x")
    labels_source=spatialdata_path / "labels" / "blobs_labels",
    microns_per_pixel_image=0.5,  # set as appropriate for your data
    microns_per_pixel_mask=0.5,
    channels=[0, 1, 2],
    adata_source=spatialdata_path / "tables" / "table",
    visualize_feature_matrix=True,
    visualize_heatmap=False,
    embedding_key="X_umap",
    embedding_key_display_name="UMAP",
    spatial_key=None,
    cluster_key="leiden",
    cluster_key_display_name="Leiden clusters",
)
url = vc.web_app()
display(HTML(f'<a href="{url}" target="_blank">Open in Vitessce</a>'))
SpatialData-native workflow
With proteomics_from_spatialdata, linkage is derived from SpatialData table annotations.
The table should annotate the labels element using SpatialData table attributes
(region/instance semantics), rather than relying on index matching alone.
sdata["table"].uns[TableModel.ATTRS_KEY]  # -> the table annotates blobs_labels
{'region': 'blobs_labels',
'region_key': 'region',
'instance_key': 'instance_id'}
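To make the region/instance semantics concrete, here is a minimal pure-Python sketch of how these three attributes tie table rows to label IDs. The `obs` rows and the helper `rows_for_element` are hypothetical and only illustrate the idea. This is not how SpatialData or Vitessce implement the lookup.

```python
attrs = {
    "region": "blobs_labels",
    "region_key": "region",
    "instance_key": "instance_id",
}

# Hypothetical obs rows; a real table stores these as columns of adata.obs.
obs = [
    {"region": "blobs_labels", "instance_id": 1, "leiden": "0"},
    {"region": "blobs_labels", "instance_id": 2, "leiden": "1"},
    {"region": "other_labels", "instance_id": 1, "leiden": "0"},
]


def rows_for_element(obs, attrs, element):
    """Map label ID -> obs row for the rows that annotate `element`."""
    return {
        row[attrs["instance_key"]]: row
        for row in obs
        if row[attrs["region_key"]] == element
    }


linked = rows_for_element(obs, attrs, "blobs_labels")
print(sorted(linked))
```

The column named by `region_key` selects the rows belonging to an element, and the column named by `instance_key` gives the label ID for each row. Linkage therefore does not depend on the values in the table index.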
Coordinate systems and transformations
For the SpatialData-based API, view-space scaling and reorientation are controlled through named coordinate systems.
Below we add:
micron: isotropic scaling from pixels to microns.
rotation: an affine rotation in the (x, y) plane.
global: required for OME-NGFF compatibility.
import numpy as np
from spatialdata.transformations import Affine, Identity, Scale, set_transformation
microns_per_pixel = 10
rotation_degrees = 20
rotation_radians = np.deg2rad(rotation_degrees)
rotation_matrix = [
    [np.cos(rotation_radians), -np.sin(rotation_radians), 0.0],
    [np.sin(rotation_radians), np.cos(rotation_radians), 0.0],
    [0.0, 0.0, 1.0],
]
transformations = {
    "micron": Scale(axes=("x", "y"), scale=[microns_per_pixel, microns_per_pixel]),
    "rotation": Affine(
        matrix=rotation_matrix, input_axes=("x", "y"), output_axes=("x", "y")
    ),
    "global": Identity(),  # a global coordinate system is needed for OME-NGFF
}
set_transformation(
    sdata["blobs_multiscale_image"],
    transformation=transformations,
    set_all=True,
    write_to_sdata=sdata,
)
set_transformation(
    sdata["blobs_labels"],
    transformation=transformations,
    set_all=True,
    write_to_sdata=sdata,
)
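To see what these transformations do to a single coordinate, the sketch below applies the same scale and rotation matrices to one pixel position using only the standard library. `apply_affine` is a hypothetical helper written for this illustration, and the example point is arbitrary. SpatialData performs the equivalent computation internally when a coordinate system is selected.

```python
import math


def apply_affine(matrix, x, y):
    """Apply a 3x3 homogeneous matrix to the point (x, y)."""
    return (
        matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
        matrix[1][0] * x + matrix[1][1] * y + matrix[1][2],
    )


theta = math.radians(20)  # same angle as rotation_degrees
rotation = [
    [math.cos(theta), -math.sin(theta), 0.0],
    [math.sin(theta), math.cos(theta), 0.0],
    [0.0, 0.0, 1.0],
]
# Isotropic scale with microns_per_pixel = 10, written as a homogeneous matrix.
scale = [[10.0, 0.0, 0.0], [0.0, 10.0, 0.0], [0.0, 0.0, 1.0]]

pixel = (100.0, 0.0)  # arbitrary example point in pixel coordinates
in_micron = apply_affine(scale, *pixel)
in_rotation = apply_affine(rotation, *pixel)
print(in_micron)
print(in_rotation)
```

The scale multiplies both axes by the same factor. The rotation turns the point about the origin and leaves its distance from the origin unchanged.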
Build a Vitessce config from SpatialData
We render the same data twice, changing only to_coordinate_system (micron vs rotation),
so you can see how coordinate-system selection affects the spatial view while preserving
feature-level linking.
from IPython.display import HTML, display
vc = hpv.proteomics_from_spatialdata(
    sdata_path=spatialdata_path,
    labels_name="blobs_labels",
    image_name="blobs_multiscale_image",
    table_name="table",
    channels=[0, 1, 2],
    visualize_feature_matrix=True,
    to_coordinate_system="micron",  # specify the micron coordinate system
    visualize_heatmap=True,
    embedding_key="X_umap",
    cluster_key="leiden",
    cluster_key_display_name="Leiden",
)
url = vc.web_app()
display(HTML(f'<a href="{url}" target="_blank">Open in Vitessce</a>'))
vc = hpv.proteomics_from_spatialdata(
    sdata_path=spatialdata_path,
    labels_name="blobs_labels",
    image_name="blobs_multiscale_image",
    table_name="table",
    channels=[0, 1, 2],
    visualize_feature_matrix=True,
    to_coordinate_system="rotation",  # or the rotation coordinate system
    visualize_heatmap=True,
    embedding_key="X_umap",
    cluster_key="leiden",
    cluster_key_display_name="Leiden",
)
url = vc.web_app()
display(HTML(f'<a href="{url}" target="_blank">Open in Vitessce</a>'))