Spatial Clustering Demonstration on Breast Cancer (10x Visium)
In this tutorial, we demonstrate how to use 3d-OT to cluster the spots of a 10x Visium breast cancer dataset.
Loading packages
[1]:
from lib_3d_OT.utils import *
import scanpy as sc
import numpy as np
import pandas as pd
import torch
from lib_3d_OT.single_modialty import *
import torch.optim as optim
import warnings
warnings.filterwarnings("ignore")
R[write to console]: mclust version 6.1.1
Type 'citation("mclust")' for citing this R package in publications.
[ ]:
device = torch.device("cuda:1" if torch.cuda.is_available() else "cpu")
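The hard-coded `cuda:1` assumes a machine with at least two visible GPUs. On a single-GPU machine that index does not exist, and moving data to the device typically fails. A minimal sketch of a more defensive choice follows; `pick_device_name` is a hypothetical helper written for this illustration and is not part of 3d-OT.

```python
def pick_device_name(preferred_index: int, n_gpus: int) -> str:
    """Return a torch-style device string that is valid for n_gpus visible GPUs."""
    if n_gpus <= 0:
        return "cpu"                      # no GPU visible: run on the CPU
    if 0 <= preferred_index < n_gpus:
        return f"cuda:{preferred_index}"  # the requested GPU exists
    return "cuda:0"                       # otherwise fall back to the first GPU


try:
    import torch
    n_gpus = torch.cuda.device_count() if torch.cuda.is_available() else 0
    device = torch.device(pick_device_name(1, n_gpus))
except ImportError:
    # torch is not installed here; keep only the device name
    device = pick_device_name(1, 0)

print(device)
```

The rest of the tutorial can use `device` unchanged, since it is only passed to `.to(device)` and to the training function.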
Loading data
We use the SCANPY package to select the top 3000 highly variable genes (HVGs) and perform standard preprocessing. The processed expression matrix adata.X is used as the model input.
[3]:
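# Load the 10x Visium dataset (counts, spatial coordinates, tissue images)
adata = sc.read_visium('/home/dbj/mouse/vision3/')
adata.var_names_make_unique()
# Select the top 3000 highly variable genes; 'seurat_v3' expects raw counts
sc.pp.highly_variable_genes(adata, n_top_genes=3000, flavor='seurat_v3')
adata = adata[:, adata.var.highly_variable].copy()  # copy so later steps do not act on a view
# Normalize library size, log-transform, then scale without centering (values clipped at 10)
sc.pp.normalize_total(adata, inplace=True)
sc.pp.log1p(adata)
sc.pp.scale(adata, zero_center=False, max_value=10)
# Attach the ground-truth annotation; pandas aligns it to the spot barcodes by index
cluster = pd.read_csv('/home/dbj/mouse/metadata.tsv', sep='\t', index_col=0)
adata.obs['truth'] = cluster['ground_truth']
adata.obs['truth'] = adata.obs['truth'].astype('category')
# Use the processed expression matrix as the model input
adata.obsm['feat'] = adata.X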
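To make the three preprocessing steps concrete, the sketch below reproduces them on a small dense matrix with NumPy. It is an approximation for illustration only: it assumes the normalization target is the median total count per spot and that scaling uses the unbiased (ddof=1) standard deviation, and the function name `normalize_log_scale` is hypothetical.

```python
import numpy as np


def normalize_log_scale(counts, max_value=10.0):
    """Library-size normalization, log1p, and unit-variance scaling without centering."""
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)       # total counts per spot
    target = np.median(totals)                       # assumed target: median total count
    normed = counts / np.where(totals == 0, 1.0, totals) * target
    logged = np.log1p(normed)
    std = logged.std(axis=0, ddof=1)                 # per-gene standard deviation
    std[std == 0] = 1.0                              # leave constant genes unchanged
    scaled = logged / std                            # no mean subtraction (zero_center=False)
    return np.minimum(scaled, max_value)             # clip large values at max_value


# Toy count matrix: 3 spots x 3 genes (hypothetical data)
toy_counts = np.array([[10, 0, 5],
                       [2, 8, 0],
                       [4, 4, 4]])
X = normalize_log_scale(toy_counts)
print(X.round(3))
```

Because the data are not centered, zero counts stay exactly zero, which keeps a sparse matrix sparse in the real pipeline.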
Ground-truth annotation of the breast cancer section
[5]:
sc.pl.spatial(adata, img_key="hires", color="truth", alpha=0.7, size=1.5)
Construct a neighbor graph and train the PointNet++ encoder
[6]:
set_seed(8)
graph = prepare_data(adata, location="spatial", nb_neighbors=16).to(device)
input_dim1 = graph.express.shape[-1]
model = extractMODEL(args=None,input_dim=input_dim1)
optimizer = optim.Adam(model.parameters(), lr=0.001)
best_model, min_loss = train_graph_extractor(graph, model, optimizer, device,epochs=1150)
Epoch 1150/1150, Loss: 0.678885, Min Loss: 0.679147
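The internals of `prepare_data` are not shown in this tutorial. As an illustration of what a spatial neighbor graph with `nb_neighbors=16` means, the sketch below finds the k nearest spots of every spot from 2D coordinates by brute force. The function `knn_indices` and the toy grid are hypothetical, and the actual 3d-OT graph construction may differ.

```python
import numpy as np


def knn_indices(coords, k):
    """Return an (n_spots, k) array with the indices of each spot's k nearest spots."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))   # pairwise Euclidean distances
    np.fill_diagonal(dist, np.inf)             # a spot is not its own neighbor
    return np.argsort(dist, axis=1)[:, :k]


# Toy 3 x 3 grid of spot coordinates (hypothetical data)
grid = np.array([[x, y] for x in range(3) for y in range(3)])
neighbors = knn_indices(grid, k=2)
print(neighbors[0])  # the two nearest spots of the corner spot (0, 0)
```

The brute-force distance matrix needs memory quadratic in the number of spots; for a full Visium section a tree-based neighbor search is usually the more practical choice.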
Obtain the reconstructed expression matrix decoded_features
[8]:
with torch.no_grad():
    model.eval()
    z = model.get_features(graph)
    decoded_features = model.decode(z)
gene_expression_matrix = decoded_features.cpu().squeeze(0).detach().numpy()
adata.obsm['3d-OT'] = gene_expression_matrix
We use mclust for clustering
[9]:
clustering(adata, n_clusters=20, radius=50, key='3d-OT', method='mclust', refinement=True,random=38,n_comp=10)
Using 3d-OT representation for clustering...
fitting ...
|======================================================================| 100%
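The call above enables `refinement=True` with `radius=50`. How 3d-OT implements this step is not shown here. One common form of spatial refinement, sketched below under that assumption, replaces each spot's label with the majority label among its nearest spatial neighbors, which removes isolated labels. The function `refine_labels` and the toy data are hypothetical.

```python
from collections import Counter

import numpy as np


def refine_labels(coords, labels, n_neighbors):
    """Replace each spot's label by the majority label among its nearest spots."""
    coords = np.asarray(coords, dtype=float)
    labels = list(labels)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))     # pairwise Euclidean distances
    refined = []
    for i in range(len(labels)):
        order = np.argsort(dist[i])              # closest spots first
        nearest = [j for j in order if j != i][:n_neighbors]  # exclude the spot itself
        votes = Counter(labels[j] for j in nearest)
        refined.append(votes.most_common(1)[0][0])
    return refined


# Five spots on a line; the middle spot carries an isolated label
coords = [[0, 0], [1, 0], [2, 0], [3, 0], [4, 0]]
labels = ["A", "A", "B", "A", "A"]
print(refine_labels(coords, labels, n_neighbors=3))  # ['A', 'A', 'A', 'A', 'A']
```

A larger neighborhood smooths more aggressively, so small but real domains can be absorbed into their surroundings.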
Clustering result for the breast cancer section
[10]:
sc.pl.spatial(adata, color='3d-OT', img_key='hires', alpha=0.7, size=1.5)
Calculate the supervised metrics ARI and NMI
[11]:
from sklearn.metrics import adjusted_rand_score,normalized_mutual_info_score
ARI = adjusted_rand_score(adata.obs['truth'], adata.obs['3d-OT'])
NMI = normalized_mutual_info_score(adata.obs['truth'], adata.obs['3d-OT'])
ARI,NMI
[11]:
(0.6834927219439044, 0.7146305989724012)
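Both metrics compare two labelings through their contingency table and do not depend on how the clusters are named. The sketch below computes them in plain Python as an illustration of the definitions; it assumes NMI is normalized by the arithmetic mean of the two entropies, and the function names are hypothetical. For the analysis itself, the scikit-learn functions used above remain the reference.

```python
from collections import Counter
from math import comb, log


def adjusted_rand_index(truth, pred):
    """Adjusted Rand index computed from pair counts of the contingency table."""
    n = len(truth)
    if n < 2:
        return 1.0
    sum_cells = sum(comb(c, 2) for c in Counter(zip(truth, pred)).values())
    sum_rows = sum(comb(c, 2) for c in Counter(truth).values())
    sum_cols = sum(comb(c, 2) for c in Counter(pred).values())
    expected = sum_rows * sum_cols / comb(n, 2)   # agreement expected by chance
    maximum = (sum_rows + sum_cols) / 2           # largest attainable agreement
    if maximum == expected:                       # degenerate, e.g. one cluster in both
        return 1.0
    return (sum_cells - expected) / (maximum - expected)


def normalized_mutual_info(truth, pred):
    """Mutual information divided by the mean entropy of the two labelings."""
    n = len(truth)
    rows, cols = Counter(truth), Counter(pred)
    mi = sum(c / n * log(n * c / (rows[t] * cols[p]))
             for (t, p), c in Counter(zip(truth, pred)).items())
    h_rows = -sum(c / n * log(c / n) for c in rows.values())
    h_cols = -sum(c / n * log(c / n) for c in cols.values())
    mean_entropy = (h_rows + h_cols) / 2
    return 1.0 if mean_entropy == 0 else mi / mean_entropy


# Toy labelings (hypothetical data): one true cluster is split in the prediction
toy_truth = [0, 0, 1, 1]
toy_pred = [0, 0, 1, 2]
print(adjusted_rand_index(toy_truth, toy_pred), normalized_mutual_info(toy_truth, toy_pred))
```

For this toy input the ARI is 4/7 and the NMI is 0.8: splitting one true cluster is penalized by both metrics, although every predicted cluster is pure.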