.. _Vignette_LabelTransferHeart_label:
Label transfer using customized heart cell reference
====================================================
In this vignette, we show how to use hECA to build a customized
reference dataset for an organ and perform label transfer to
annotate other cells from the same organ.
**Step 1: load packages**
.. code:: r
library(Seurat)
library(SingleR)
library(scran)
| **Step 2: build reference dataset from hECA**
| In this experiment we will build a reference from two studies of heart
cells, in the following 5 steps:
| 2.1 - Open `hECA website `__, click “Cell
Sorting” in the menu.
| 2.2 - Choose “Add Filters”, then “Organ”, type in “Heart”, click to
select it, and keep the default of including subtypes.
| 2.3 - Click “Apply”. After a few seconds, click “Download Data” to
download the keys of the sorted cells as a csv file, “keys.csv”.
|
2.4 - Download the cells by following the tutorials of
`ECAUGT `__
and save the results to csv files (in Python):
.. code:: python
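import multiprocessing
import pandas as pd
import ECAUGT
# Assumes the ECAUGT connection has been set up and `gene_condition` (the
# column filter) has been defined as described in the ECAUGT tutorials.
# Build the primary-key list: one [('cid', value)] entry per selected cell.
rows_to_get = pd.read_csv('keys.csv')
rows_to_get = [[('cid', i)] for i in rows_to_get['cid']]
# Download the selected cells in parallel, leaving one CPU core free.
result = ECAUGT.get_columnsbycell_para(rows_to_get=rows_to_get, cols_to_get=None,
col_filter=gene_condition, do_transfer=True,
thread_num=max(1, multiprocessing.cpu_count() - 1))
# The first 43878 columns are genes; the next 18 are metadata fields.
genes = result.columns[:43878]
metaCols = result.columns[43878:43878+18]
expr = result.loc[:, genes]
meta = result.loc[:, metaCols]
# Save the expression matrix (cells as rows) and the cell metadata.
expr.to_csv("hECA_exprs.csv", index=True)
meta.to_csv("hECA_metadata.csv", index=True)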
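Two details of the block above are easy to get wrong: the shape of the key list passed to ECAUGT, and the positional split of the result columns into genes and metadata. The following is a minimal sketch of both, using only the standard library. The three-row keys file, the integer ``cid`` values, the toy column names, and the helper names ``build_rows_to_get`` and ``split_columns`` are all assumptions made for illustration; they are not part of hECA or ECAUGT.

```python
import csv
import io

# Made-up stand-in for the downloaded keys.csv (assumption: integer cids).
KEYS_CSV = "cid\n101\n102\n103\n"


def build_rows_to_get(csv_text):
    """Return one [('cid', value)] primary-key list per cell."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [[("cid", int(row["cid"]))] for row in reader]


def split_columns(columns, n_genes, n_meta):
    """Split a column list positionally into gene names and metadata names."""
    if n_genes + n_meta > len(columns):
        raise ValueError("fewer columns than n_genes + n_meta")
    return columns[:n_genes], columns[n_genes:n_genes + n_meta]


rows_to_get = build_rows_to_get(KEYS_CSV)

# Toy column list: 3 genes followed by 2 metadata fields.
toy_columns = ["GeneA", "GeneB", "GeneC", "cell_type", "organ"]
genes, meta_cols = split_columns(toy_columns, n_genes=3, n_meta=2)

print(rows_to_get)
print(genes, meta_cols)
```

With real data, calling such a check with ``list(result.columns)``, 43878, and 18 would raise an error instead of silently producing a truncated metadata table if the downloaded result has fewer columns than expected.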
2.5 - Load the downloaded expression matrix and metadata as the customized
reference dataset (continuing in R).
.. code:: r
# rows are cells, columns are genes (as written by the Python step above)
expr <- read.csv("hECA_exprs.csv", header=TRUE, row.names=1)
meta <- read.csv("hECA_metadata.csv", header=TRUE, row.names=1)
| **Step 3: load query data**
| Please replace “query_data.csv” with the path to your query dataset.
.. code:: r
query_path <- "query_data.csv"
query_data <- read.csv(query_path, header=T, row.names=1)
| **Step 4: perform label transfer with SingleR**
| Now we will use SingleR to transfer labels from the reference data to
the query data.
Step 4.1: Train SingleR model
.. code:: r
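# get labels
ct.ref <- as.character(meta$cell_type)
# SingleR expects a genes x cells matrix; the csv from step 2.4 stores
# cells as rows, so convert to a matrix and transpose
ref.mat <- t(as.matrix(expr))
# train model
trainedR <- trainSingleR(ref.mat, ct.ref, de.method = "wilcox")
## (optional) save trained model
# save(trainedR,file = "trainedModel.Rdata")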
Step 4.2: Predict labels of query data
.. code:: r
# the query must be a genes x cells matrix using the same gene names as the reference
predict <- classifySingleR(as.matrix(query_data), trainedR)
Step 4.3: Check prediction results
.. code:: r
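# get true labels (assumes `obj.query` is an object holding the query
# metadata with a `cell_type` column; it is not created in this vignette)
truth <- obj.query$cell_type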
# construct result dataframe
df.result <- predict[,c("pruned.labels","labels")]
df.result$truth <- truth
df.result <- data.frame(df.result)
## (optional) save results
# write.csv(df.result,"result1.csv")
# draw confusion matrix and accuracy scores
lv <- unique(df.result$truth)
caret::confusionMatrix(factor(df.result$pruned.labels, levels = lv),
factor(df.result$truth, levels = lv))
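SingleR sets ``pruned.labels`` to ``NA`` for cells whose assignment it considers unreliable, so it helps to look at how many cells were pruned alongside the accuracy on the remaining ones. The sketch below illustrates that bookkeeping in plain Python on four made-up (predicted, truth) pairs, with ``None`` standing in for ``NA``. The cell type names and the helper names are assumptions for illustration only, and the sketch does not reproduce everything that ``caret::confusionMatrix`` reports.

```python
from collections import Counter

# Hypothetical (predicted, truth) pairs; None marks a pruned (NA) prediction.
pairs = [
    ("Cardiomyocyte", "Cardiomyocyte"),
    ("Fibroblast", "Fibroblast"),
    ("Fibroblast", "Cardiomyocyte"),
    (None, "Endothelial cell"),
]


def confusion_counts(pairs):
    """Count (predicted, truth) combinations, skipping pruned cells."""
    return Counter((pred, truth) for pred, truth in pairs if pred is not None)


def accuracy(pairs):
    """Fraction of non-pruned cells whose prediction matches the truth."""
    kept = [(pred, truth) for pred, truth in pairs if pred is not None]
    if not kept:
        return float("nan")
    return sum(pred == truth for pred, truth in kept) / len(kept)


def pruned_fraction(pairs):
    """Fraction of all cells whose prediction was pruned."""
    return sum(pred is None for pred, _ in pairs) / len(pairs)


counts = confusion_counts(pairs)
for (pred, truth), n in sorted(counts.items()):
    print(f"predicted={pred!r} truth={truth!r} n={n}")
print("accuracy on kept cells:", accuracy(pairs))
print("pruned fraction:", pruned_fraction(pairs))
```

Reporting the pruned fraction next to the accuracy avoids reading a high accuracy as good coverage when many cells were left unlabeled.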