Spatial Clustering Demonstration of Breast Cancer (10x Visium)

In this tutorial, we demonstrate how to use 3d-OT to obtain clustering results for a breast cancer dataset.

Loading packages

[1]:
from lib_3d_OT.utils import *
import scanpy as sc
import numpy as np
import pandas as pd
import torch
from lib_3d_OT.single_modialty import *
import torch.optim as optim
import warnings
warnings.filterwarnings("ignore")
R[write to console]:                    __           __
   ____ ___  _____/ /_  _______/ /_
  / __ `__ \/ ___/ / / / / ___/ __/
 / / / / / / /__/ / /_/ (__  ) /_
/_/ /_/ /_/\___/_/\__,_/____/\__/   version 6.1.1
Type 'citation("mclust")' for citing this R package in publications.

[ ]:
device = torch.device("cuda:1" if torch.cuda.is_available() else "cpu")

Loading data

We use the SCANPY package to select the top 3,000 highly variable genes (HVGs) and perform standard data processing. The processed expression matrix adata.X is used as input.

[3]:
adata=sc.read_visium('/home/dbj/mouse/vision3/')
adata.var_names_make_unique()
sc.pp.highly_variable_genes(adata, n_top_genes=3000, flavor='seurat_v3')
# Copy the subset so the in-place steps below act on a real AnnData, not a view
adata = adata[:, adata.var.highly_variable].copy()
sc.pp.normalize_total(adata, inplace=True)
sc.pp.log1p(adata)
sc.pp.scale(adata, zero_center=False, max_value=10)
cluster=pd.read_csv('/home/dbj/mouse/metadata.tsv',sep='\t',index_col=0)
adata.obs['truth']=cluster['ground_truth']
adata.obs['truth'] = adata.obs['truth'].astype('category')
adata.obsm['feat']=adata.X
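
To make the normalization step concrete, here is a minimal dependency-free sketch of what `sc.pp.normalize_total` followed by `sc.pp.log1p` computes on a tiny count matrix. It assumes the scanpy default `target_sum=None`, for which each spot is rescaled to the median total count across spots, and it assumes every spot has at least one count. The function name `normalize_log1p` and the toy matrix are hypothetical and not part of 3d-OT or scanpy.

```python
import math
from statistics import median


def normalize_log1p(counts):
    """Rescale each row (spot) to the median row total, then apply log(1 + x).

    Hypothetical helper for illustration only; assumes no row sums to zero.
    """
    totals = [sum(row) for row in counts]
    target = median(totals)  # shared library size after normalization
    return [
        [math.log1p(x * target / total) for x in row]
        for row, total in zip(counts, totals)
    ]


# Toy raw counts: 3 spots x 3 genes, with row totals 40, 20 and 80.
toy_counts = [
    [10, 0, 30],
    [5, 5, 10],
    [0, 60, 20],
]
normalized = normalize_log1p(toy_counts)

# Undoing the log shows that every spot now has the same total (the median, 40).
row_totals = [sum(math.expm1(v) for v in row) for row in normalized]
print(row_totals)
```

The subsequent `sc.pp.scale(adata, zero_center=False, max_value=10)` call in the cell above is not reproduced in this sketch.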

Ground-truth annotation of the breast cancer data

[5]:
sc.pl.spatial(adata, img_key="hires", color="truth", alpha=0.7, size=1.5)
../_images/Single_Omics_spatial_domain_identification_Breast_cancer_7_0.png

Construct a neighbor graph and train the PointNet++ encoder

[6]:
set_seed(8)
graph = prepare_data(adata, location="spatial", nb_neighbors=16).to(device)
input_dim1 = graph.express.shape[-1]
model = extractMODEL(args=None,input_dim=input_dim1)
optimizer = optim.Adam(model.parameters(), lr=0.001)
best_model, min_loss = train_graph_extractor(graph, model, optimizer, device,epochs=1150)
Epoch 1150/1150, Loss: 0.678885, Min Loss: 0.679147
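
The internals of `prepare_data` are not shown in this tutorial. As a rough illustration of what a spatial neighbor graph with `nb_neighbors=16` could look like, the sketch below links each spot to its k nearest spots by Euclidean distance. This is an assumption about the general technique, not the library's implementation; `knn_edges` and the toy grid are hypothetical.

```python
import math


def knn_edges(coords, k):
    """Return directed edges (i, j) from each point to its k nearest neighbours.

    Brute-force O(n^2) search, fine for a toy example; hypothetical helper.
    """
    edges = []
    for i, p in enumerate(coords):
        # Distances from point i to every other point, excluding itself.
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(coords) if j != i
        )
        edges.extend((i, j) for _, j in dists[:k])
    return edges


# Toy 3x3 grid standing in for the spot coordinates in adata.obsm["spatial"].
grid = [(x, y) for x in range(3) for y in range(3)]
edges = knn_edges(grid, k=2)
print(len(edges))  # one edge per (spot, neighbour) pair
```

For real Visium data with thousands of spots, a tree-based or library nearest-neighbor search would be the more practical choice than this brute-force loop.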

Obtain the reconstruction matrix decoded_features and store it in adata.obsm['3d-OT']

[8]:
with torch.no_grad():
    model.eval()
    z = model.get_features(graph)
    decoded_features = model.decode(z)
    gene_expression_matrix = decoded_features.cpu().squeeze(0).detach().numpy()
adata.obsm['3d-OT']=gene_expression_matrix

We use mclust for clustering

[9]:
clustering(adata, n_clusters=20, radius=50, key='3d-OT', method='mclust', refinement=True,random=38,n_comp=10)
Using 3d-OT representation for clustering...
fitting ...
  |======================================================================| 100%
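
The call above passes `refinement=True`, but this tutorial does not show what the refinement step does or how `radius` is interpreted. One common form of spatial refinement is a majority vote over each spot's nearest neighbors, sketched below under that assumption. `refine_labels`, its `n_neighbors` parameter, and the toy data are hypothetical and are not taken from the 3d-OT code.

```python
import math
from collections import Counter


def refine_labels(coords, labels, n_neighbors):
    """Replace each label with the most common label among the nearest neighbours.

    The spot's own label is excluded from the vote. Hypothetical helper.
    """
    refined = []
    for i, p in enumerate(coords):
        nearest = sorted(
            (math.dist(p, q), j) for j, q in enumerate(coords) if j != i
        )[:n_neighbors]
        votes = Counter(labels[j] for _, j in nearest)
        refined.append(votes.most_common(1)[0][0])
    return refined


# Five spots on a line; the middle one carries an isolated label "B".
coords = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
labels = ["A", "A", "B", "A", "A"]
smoothed = refine_labels(coords, labels, n_neighbors=3)
print(smoothed)
```

The isolated "B" is outvoted by its neighbors, which is the smoothing effect such a step is meant to have on noisy cluster assignments.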

Clustering result for the breast cancer data

[10]:
sc.pl.spatial(adata, color='3d-OT', img_key='hires', alpha=0.7, size=1.5)
../_images/Single_Omics_spatial_domain_identification_Breast_cancer_15_0.png

Calculate the supervised metrics: adjusted Rand index (ARI) and normalized mutual information (NMI)

[11]:
from sklearn.metrics import adjusted_rand_score,normalized_mutual_info_score
ARI = adjusted_rand_score(adata.obs['truth'], adata.obs['3d-OT'])
NMI = normalized_mutual_info_score(adata.obs['truth'], adata.obs['3d-OT'])
ARI,NMI
[11]:
(0.6834927219439044, 0.7146305989724012)
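
For readers who want to see what the ARI measures, the following sketch computes it from the contingency counts of the two labelings, using the standard pair-counting formula. It is an illustration rather than a replacement for `sklearn.metrics.adjusted_rand_score`; the function name `adjusted_rand_index` and the toy labels are hypothetical, and the sketch assumes at least two labeled items.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index from pair counts (hypothetical illustrative helper)."""
    n = len(labels_true)
    cells = Counter(zip(labels_true, labels_pred))  # contingency table entries
    rows = Counter(labels_true)                     # sizes of the true classes
    cols = Counter(labels_pred)                     # sizes of the predicted clusters

    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    expected = sum_rows * sum_cols / comb(n, 2)  # agreement expected by chance
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:  # degenerate case, e.g. one cluster on both sides
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Identical partitions with permuted label names score 1.0.
same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# Partitions that split every true class evenly score below zero.
crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
print(same, crossed)
```

Because the index is corrected for chance, a score near 0 indicates agreement no better than random labeling, and the score does not depend on how the clusters are named.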