This tutorial is desgined for users do not have pre-build gene regulatory networks.
Step0. Select motif database based on input species
IReNA contains DNA motif datasets for four species (Homo sapiens, Mus musculus, Zebrafish and Chicken) derived from TRANSFAC version 201803. Following codes are used to get the motif dataset from TRANSFAC or user-defined motif dataset which should have the same format as these from TRANSFAC database. Selected motif database is used to infer gene regulatory networks.
library(IReNA)
###call Mus musculus motif database
<- Tranfac201803_Mm_MotifTFsF
motif1 ###call Homo sapiens motif database
<- Tranfac201803_Hs_MotifTFsF
motif1 ###call Zebrafish motif database
<- Tranfac201803_Zf_MotifTFsF
motif1 ###call Chicken motif database
<- Tranfac201803_Ch_MotifTFsF motif1
The species of test data used here is Homo sapiens, so we select Tranfac201803_Hs_MotifTFsF as motif database
<- Tranfac201803_Hs_MotifTFsF
motif1 = unlist(strsplit(motif1$TFs,';')) candidate_tfs
Step1. Build moduliazed pseudo-bulk expression profile
IReNA utilizes modularized gene regulatory networks to identify enriched transcription factors and simplified networks to represent biological processes. Therefore, before inferring gene regulatory networks, we need to modularize all input features (genes) to obtain a modularized gene regulatory network in subsequent steps. In addition, because scRNA-seq data are highly sparse, this may inhibit the accurate inference of regulatory relationships; therefore, we need to construct pseudo-bulk expression profiles (metacells).
To run this part, you need to prepare a seurat object with ‘Pseudotime’ column in metadata.
= readRDS("E:/IReNA/SSC_seurat_with_time.rds")
obj = irena_step1(obj,50,4) exp
## [1] "Perform k-means"
##
## 1 2 3 4
## 557 649 1021 457
## [1] "Sort genes"
## [1] "Revise outlier"
## [1] "Number of outlier: 0" "Number of outlier: 0"
Then, we can visualize the moduliazed pseudo-bulk expression profile
plot_kmeans_pheatmap(exp,ModuleColor1 = c('#67C7C1','#5BA6DA','#FFBF0F','#C067A9'))
Step2. infer gene regulatory networks
infer co-expression networks
= irena_step2(as.matrix(exp),candidate_tfs=candidate_tfs) regulatory_relationships
## [1] "Using pearson correlation to infer basic grn"
### Then we can filter weak regualtory relationships based on regulatory weight
= regulatory_relationships[abs(regulatory_relationships$Correlation)>0.8,] regulatory_relationships
Refine regulatory relationships based on binding motifs of target genes (optional)
For ATAC-seq data is not available. IReNA uses fimo to check whether motifs of transcription factors that regulate the target gene occur in the upstream of target genes. If motifs of transcription factors that regulate the target gene exist, this gene pair will be retained.
Firstly select candidate gene tss database.
### select hg38 gene tss for human data
= hg38_gene_tss
gene_tss ### you can also using the following code to select mm10 gene tss for mouse data
# gene_tss = mm10_gene_tss
Then calculate potential binding TFs of target genes throguh R package motifmatchr.
library(BSgenome.Hsapiens.UCSC.hg38)
= gene_tss[gene_tss$symbol %in% rownames(exp),]
gene_tss <- GRanges(paste0(gene_tss[,1],':',gene_tss[,2],'-',gene_tss[,3]))
gene_tss_GR <- identify_region_tfs(gene_tss_GR,gene_tss[,6],motif1,BSdb=BSgenome.Hsapiens.UCSC.hg38)
fimo_regulation <- filter_regulation_fimo(fimo_regulation, regulatory_relationships) regulatory_relationships
Step3. Decode gene regulatory networks
The primary novelty of IReNA lies in its ability to decode regulatory relationships among modules using a hypergeometric test. Here, you can use the following code to perform IReNA analysis. The output of IReNA analysis contains three most important part:
Enriched TFs that regulate other module
Enriched TFs networks: regulatory network of enriched TFs.
Simplified networks: regualtory network among modules.
<- network_analysis(regulatory_relationships,exp[,1:2],TFFDR1=10,TFFDR2 = 10,ModuleFDR = 0.01) IReNA_result
## [1] "Total TFs: 145"
## [1] "Enriched TFs: 69"
## [1] "Significant regulations: 4"
Browse enriched TFs that regulate other module
$TF_module_regulation[1:3,] IReNA_result
## TF TFGroup LogFDR TargetGroup RegulationType
## out2 CCNT2 1 -0.744319429562015 Group1 Positive
## out2.1 REST 1 -0.815215941738212 Group1 Positive
## out2.2 APEX1 1 -0.617368646095004 Group1 Positive
Browse enriched TFs network
$TF_network[1:3,] IReNA_result
## TF TFGroup TFMinNlogfdr TFMinGroup SigActModules SigRepModules Target
## 34906 CCNT2 1 Inf P1 1 NA CCNT2
## 37591 REST 1 Inf P1 1 NA REST
## 64441 APEX1 1 Inf P1 1 NA APEX1
## Correlation TFGroup TargetGroup Regulation
## 34906 1 1 1 Positive
## 37591 1 1 1 Positive
## 64441 1 1 1 Positive
Browse simplified networks
$intramodular_network[1:3,] IReNA_result
## TFGroup TargetGroup Regulation Correlation
## Regulation12Pnum 1 1 Positive 1
## Regulation12Pnum.4 2 2 Positive 0.952059736819138
## Regulation12Pnum.7 3 3 Positive 0.851203572257657
## NumberRegulation Pvalue NlogFdr
## Regulation12Pnum 12;12;42;12 0 Inf
## Regulation12Pnum.4 7;7;47;7 0 Inf
## Regulation12Pnum.7 29;29;25;29 0 Inf
In addition, you can also use our function to visualize enriched tf network
plot_tf_network(IReNA_result)