.. _Vignette_LabelTransferHeart_label:

Label transfer using customized heart cell reference
====================================================

In this vignette, we will explore how to use hECA to build a customized
reference dataset for a given organ and use it to transfer labels to,
and so annotate, other cells from the same organ.

**Step 1: load packages**

.. code:: r

   library(Seurat)
   library(SingleR)
   library(scran)

| **Step 2: build reference dataset from hECA**
| In this experiment we will build a reference from two studies of heart cells, in the following five steps:
| 2.1 - Open the hECA website and click “Cell Sorting” in the menu.
| 2.2 - Click “Add Filters”, then “Organ”, type in “Heart”, and click to select it. Leave “include subtypes” at its default.
| 2.3 - Click “Apply”. After a few seconds, click “Download Data” to download the keys of the sorted cells as a csv file, “keys.csv”.
| 2.4 - Download the cells following the ECAUGT tutorials and save the results to csv files (in Python):

.. code:: python

   import multiprocessing

   import pandas as pd
   import ECAUGT

   # Assumes the ECAUGT client is already set up and `gene_condition` is
   # already defined, as described in the ECAUGT tutorials.
   rows_to_get = pd.read_csv('keys.csv')
   rows_to_get = [[('cid', i)] for i in rows_to_get['cid']]
   result = ECAUGT.get_columnsbycell_para(rows_to_get=rows_to_get,
                                          cols_to_get=None,
                                          col_filter=gene_condition,
                                          do_transfer=True,
                                          thread_num=max(1, multiprocessing.cpu_count() - 1))

   # The first 43878 columns are genes; the next 18 are metadata fields.
   genes = result.columns[:43878]
   metaCols = result.columns[43878:43878 + 18]
   expr = result.loc[:, genes]
   meta = result.loc[:, metaCols]
   expr.to_csv("hECA_exprs.csv", index=True)
   meta.to_csv("hECA_metadata.csv", index=True)

2.5 - Load the downloaded expression matrix and metadata as the
customized reference dataset (continue in R).

.. code:: r

   expr <- read.csv("hECA_exprs.csv", header = TRUE, row.names = 1)
   meta <- read.csv("hECA_metadata.csv", header = TRUE, row.names = 1)

| **Step 3: load query data**
| Replace “query_data.csv” with the path to your query dataset.

.. code:: r

   query_path <- "query_data.csv"
   query_data <- read.csv(query_path, header = TRUE, row.names = 1)

| **Step 4: perform label transfer with SingleR**
| Now we will use SingleR to transfer labels from the reference data to the query data.

Step 4.1: Train SingleR model

.. code:: r

   # get labels
   ct.ref <- as.character(meta$cell_type)
   # SingleR expects genes in rows and cells in columns; hECA_exprs.csv was
   # written with one row per cell, so transpose it
   ref.mat <- t(as.matrix(expr))
   # train model
   trainedR <- trainSingleR(ref.mat, ct.ref, de.method = "wilcox")
   ## (optional) save trained model
   # save(trainedR, file = "trainedModel.Rdata")

Step 4.2: Predict labels of query data

.. code:: r

   # query_data must also have genes in rows and cells in columns
   pred <- classifySingleR(as.matrix(query_data), trainedR)

Step 4.3: Check prediction results

.. code:: r

   # get true labels
   # (`obj.query` is not created in this vignette; it is assumed to hold the
   # known labels of the query cells in `cell_type`)
   truth <- obj.query$cell_type
   # construct result dataframe
   df.result <- pred[, c("pruned.labels", "labels")]
   df.result$truth <- truth
   df.result <- data.frame(df.result)
   ## (optional) save results
   # write.csv(df.result, "result1.csv")
   # draw confusion matrix and accuracy scores
   caret::confusionMatrix(
     factor(df.result$pruned.labels, levels = unique(df.result$truth)),
     factor(df.result$truth, levels = unique(df.result$truth))
   )
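
To make the final check easier to interpret, the sketch below shows in
plain Python roughly what a confusion matrix and an overall accuracy
score summarize. It is an illustration only, not caret's implementation.
The helper names and the toy cell-type labels are hypothetical, and it
assumes that cells whose label was pruned (shown here as ``None``) are
left out of the score.

.. code:: python

   from collections import Counter


   def confusion_counts(predicted, truth):
       """Count (truth, predicted) pairs, skipping cells with a pruned label."""
       return Counter((t, p) for p, t in zip(predicted, truth) if p is not None)


   def accuracy(predicted, truth):
       """Fraction of cells with a non-pruned label whose label matches the truth."""
       kept = [(p, t) for p, t in zip(predicted, truth) if p is not None]
       if not kept:
           return float("nan")
       return sum(p == t for p, t in kept) / len(kept)


   # Hypothetical toy labels; None marks a pruned (low-confidence) prediction.
   pruned = ["Cardiomyocyte", "Fibroblast", None, "Fibroblast"]
   truth = ["Cardiomyocyte", "Fibroblast", "Endothelial", "Cardiomyocyte"]

   counts = confusion_counts(pruned, truth)
   acc = accuracy(pruned, truth)

   for (t, p), n in sorted(counts.items()):
       print(f"truth={t:<14} predicted={p:<14} n={n}")
   print(f"accuracy over non-pruned cells: {acc:.3f}")

Each off-diagonal pair (truth differs from prediction) points to a cell
type that the reference separates poorly from another, which is often
more informative than the single accuracy number.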