MicroRNAs (miRNAs) are small non-coding RNAs (~22 nt) that control a wide range of biological processes including cancers via regulating target genes [1-5]. Therefore, it is important to uncover miRNA functions and regulatory mechanisms in cancers.
The emergence of competing endogenous RNA (ceRNA) hypothesis [6] has challenged the traditional knowledge that coding RNAs only act as targets of miRNAs. Actually, a pool of coding and non-coding RNAs that shares common miRNA biding sites competes with each other, thus act as ceRNAs to release coding RNAs from miRNAs control. These ceRNAs are also known as miRNA sponges or miRNA decoys, and include long non-coding RNAs (lncRNAs), pseudogenes, circular RNAs (circRNAs) and messenger RNAs (mRNAs), etc [7-10]. Recent studies [11, 12] have shown that miRNA sponge regulation can help to reveal the biological mechanism in cancer.
To accelerate the research of miRNA sponge, an R package named ‘miRspongeR’ is developed to implement popular methods for the identification and analysis of miRNA sponge regulation from putative miRNA-target interactions or/and transcriptomics data (including bulk, single-cell and spatial gene expression data).
In ‘spongeMethod’ function, ‘miRspongeR’ implements eight popular methods (including miRHomology [13, 14], pc [15, 16], sppc [17], ppc [11], hermes [18], muTaME [19], cernia [20], and SPONGE [21]) to identify miRNA sponge interactions. The eight methods should meet a basic condition: the significance of sharing of miRNAs by each RNA-RNA pair (e.g. adjusted p-value < 0.01). Each method has its own merit due to different evaluating indicators. Thus, ‘miRspongeR’ presents an integrate method to combine predicted miRNA sponge interactions from different methods.
The miRHomology method is based on the homology of sharing miRNAs.
miR2Target <- system.file("extdata", "miR2Target.csv", package="miRspongeR")
miRTarget <- read.csv(miR2Target, header=TRUE, sep=",")
miRHomologyceRInt_original <- spongeMethod(miRTarget, method = "miRHomology")
miRHomologyceRInt_parallel <- spongeMethod(miRTarget, method = "miRHomology_parallel")
head(miRHomologyceRInt_original)
## sponge_1 sponge_2 #shared miRNAs p.adjusted_value of shared miRNAs
## [1,] "QKI" "NFIA" "31" "1.61969028532605e-05"
## [2,] "QKI" "TCF4" "20" "0.00164001938044806"
## [3,] "QKI" "TNKS2" "28" "0.00098462411218998"
## [4,] "QKI" "PTEN" "31" "0.00172299770543381"
## [5,] "QKI" "KLF6" "25" "1.61969028532605e-05"
## [6,] "SOX5" "RB1" "11" "0.00313418502601189"
head(miRHomologyceRInt_parallel)
## sponge_1 sponge_2 #shared miRNAs p.adjusted_value of shared miRNAs
## [1,] "QKI" "NFIA" "31" "1.61969028532605e-05"
## [2,] "QKI" "TCF4" "20" "0.00164001938044806"
## [3,] "QKI" "TNKS2" "28" "0.00098462411218998"
## [4,] "QKI" "PTEN" "31" "0.0017229977054338"
## [5,] "QKI" "KLF6" "25" "1.61969028532605e-05"
## [6,] "SOX5" "RB1" "11" "0.00313418502601189"
The pc method considers expression data based on miRHomology method. Significantly positive miRNA sponge pairs are regarded as output.
miR2Target <- system.file("extdata", "miR2Target.csv", package="miRspongeR")
miRTarget <- read.csv(miR2Target, header=TRUE, sep=",")
ExpDatacsv <- system.file("extdata", "ExpData.csv", package="miRspongeR")
ExpData <- read.csv(ExpDatacsv, header=TRUE, sep=",")
pcceRInt_original <- spongeMethod(miRTarget, ExpData, method = "pc")
pcceRInt_parallel <- spongeMethod(miRTarget, ExpData, method = "pc_parallel")
head(pcceRInt_original)
## sponge_1 sponge_2 #shared miRNAs p.adjusted_value of shared miRNAs
## [1,] "QKI" "NFIA" "31" "1.61969028532605e-05"
## [2,] "QKI" "TCF4" "20" "0.00164001938044806"
## [3,] "QKI" "TNKS2" "28" "0.00098462411218998"
## [4,] "QKI" "PTEN" "31" "0.00172299770543381"
## [5,] "QKI" "KLF6" "25" "1.61969028532605e-05"
## [6,] "SOX5" "RB1" "11" "0.00313418502601189"
## correlation p.adjusted_value of positive correlation
## [1,] "0.491729673394525" "1.31981734203614e-09"
## [2,] "0.732930517949784" "2.7017217087929e-24"
## [3,] "0.397085235409979" "1.57216583466138e-06"
## [4,] "0.554784433631471" "2.2899877573925e-12"
## [5,] "0.605337991455224" "5.23010520014821e-15"
## [6,] "0.239965818711153" "0.00427058208653246"
head(pcceRInt_parallel)
## sponge_1 sponge_2 #shared miRNAs p.adjusted_value of shared miRNAs
## [1,] "QKI" "NFIA" "31" "1.61969028532605e-05"
## [2,] "QKI" "TCF4" "20" "0.00164001938044806"
## [3,] "QKI" "TNKS2" "28" "0.00098462411218998"
## [4,] "QKI" "PTEN" "31" "0.0017229977054338"
## [5,] "QKI" "KLF6" "25" "1.61969028532605e-05"
## [6,] "SOX5" "RB1" "11" "0.00313418502601189"
## correlation p.adjusted_value of positive correlation
## [1,] "0.491729673394525" "1.31981734203614e-09"
## [2,] "0.732930517949784" "2.70172170879289e-24"
## [3,] "0.397085235409979" "1.57216583466138e-06"
## [4,] "0.554784433631471" "2.2899877573925e-12"
## [5,] "0.605337991455224" "5.2301052001482e-15"
## [6,] "0.239965818711153" "0.00427058208653247"
The sppc method is based on sensitivity correlation (difference between the Pearson and partial correlation coefficients).
miR2Target <- system.file("extdata", "miR2Target.csv", package="miRspongeR")
miRTarget <- read.csv(miR2Target, header=TRUE, sep=",")
ExpDatacsv <- system.file("extdata", "ExpData.csv", package="miRspongeR")
ExpData <- read.csv(ExpDatacsv, header=TRUE, sep=",")
sppcceRInt_original <- spongeMethod(miRTarget, ExpData, senscorcutoff = 0.1, method = "sppc")
sppcceRInt_parallel <- spongeMethod(miRTarget, ExpData, senscorcutoff = 0.1, method = "sppc_parallel")
head(sppcceRInt_original)
## sponge_1 sponge_2 #shared miRNAs p.adjusted_value of shared miRNAs
## [1,] "QKI" "NFIA" "31" "1.61969028532605e-05"
## [2,] "QKI" "TCF4" "20" "0.00164001938044806"
## [3,] "QKI" "TNKS2" "28" "0.00098462411218998"
## [4,] "QKI" "PTEN" "31" "0.00172299770543381"
## [5,] "QKI" "KLF6" "25" "1.61969028532605e-05"
## [6,] "SOX5" "RB1" "11" "0.00313418502601189"
## correlation p.adjusted_value of positive correlation
## [1,] "0.491729673394525" "1.31981734203614e-09"
## [2,] "0.732930517949784" "2.7017217087929e-24"
## [3,] "0.397085235409979" "1.57216583466138e-06"
## [4,] "0.554784433631471" "2.2899877573925e-12"
## [5,] "0.605337991455224" "5.23010520014821e-15"
## [6,] "0.239965818711153" "0.00427058208653246"
## sensitivity correlation
## [1,] "0.307037199068782"
## [2,] "0.316438618435417"
## [3,] "0.138436130309623"
## [4,] "0.346274495026087"
## [5,] "0.354300353701363"
## [6,] "0.188537090552444"
head(sppcceRInt_parallel)
## sponge_1 sponge_2 #shared miRNAs p.adjusted_value of shared miRNAs
## [1,] "QKI" "NFIA" "31" "1.61969028532605e-05"
## [2,] "QKI" "TCF4" "20" "0.00164001938044806"
## [3,] "QKI" "TNKS2" "28" "0.00098462411218998"
## [4,] "QKI" "PTEN" "31" "0.0017229977054338"
## [5,] "QKI" "KLF6" "25" "1.61969028532605e-05"
## [6,] "SOX5" "RB1" "11" "0.00313418502601189"
## correlation p.adjusted_value of positive correlation
## [1,] "0.491729673394525" "1.31981734203614e-09"
## [2,] "0.732930517949784" "2.70172170879289e-24"
## [3,] "0.397085235409979" "1.57216583466138e-06"
## [4,] "0.554784433631471" "2.2899877573925e-12"
## [5,] "0.605337991455224" "5.2301052001482e-15"
## [6,] "0.239965818711153" "0.00427058208653247"
## sensitivity correlation
## [1,] "0.307037199068782"
## [2,] "0.316438618435417"
## [3,] "0.138436130309623"
## [4,] "0.346274495026087"
## [5,] "0.354300353701363"
## [6,] "0.188537090552444"
The hermes method predicts competing endogenous RNAs via evidence for competition for miRNA regulation based on conditional mutual information.
miR2Target <- system.file("extdata", "miR2Target.csv", package="miRspongeR")
miRTarget <- read.csv(miR2Target, header=TRUE, sep=",")
ExpDatacsv <- system.file("extdata", "ExpData.csv", package="miRspongeR")
ExpData <- read.csv(ExpDatacsv, header=TRUE, sep=",")
hermesceRInt_original <- spongeMethod(miRTarget, ExpData, num_perm = 10, method = "hermes")
hermesceRInt_parallel <- spongeMethod(miRTarget, ExpData, num_perm = 10, method = "hermes_parallel")
head(hermesceRInt_original)
## sponge_1 sponge_2 #shared miRNAs p.adjusted_value of shared miRNAs
## [1,] "QKI" "NFIA" "31" "1.61969028532605e-05"
## [2,] "RB1" "PTEN" "21" "0.00334325253854548"
## [3,] "NFIA" "TCF4" "21" "1.81490295278364e-05"
## [4,] "NFIA" "TNKS2" "23" "0.00344389982365612"
## [5,] "NFIA" "PTEN" "29" "0.000182921588755158"
## [6,] "NFIA" "KLF6" "23" "1.61969028532605e-05"
## p.adjusted_value of RNA competition
## [1,] "0.00650112946906951"
## [2,] "0.00210431851630092"
## [3,] "0.00522980494348687"
## [4,] "0.00650112946906951"
## [5,] "0.000553964501191847"
## [6,] "0.00650112946906951"
head(hermesceRInt_parallel)
## sponge_1 sponge_2 #shared miRNAs p.adjusted_value of shared miRNAs
## [1,] "QKI" "NFIA" "31" "1.61969028532605e-05"
## [2,] "QKI" "TNKS2" "28" "0.00098462411218998"
## [3,] "RB1" "PTEN" "21" "0.00334325253854548"
## [4,] "NFIA" "TCF4" "21" "1.81490295278364e-05"
## [5,] "NFIA" "PTEN" "29" "0.000182921588755158"
## [6,] "TCF4" "PTEN" "23" "2.77533469138317e-05"
## p.adjusted_value of RNA competition
## [1,] "0.0053612964521979"
## [2,] "0.00236875606365036"
## [3,] "0.0053612964521979"
## [4,] "0.00347707514082423"
## [5,] "6.64551043885334e-05"
## [6,] "0.00236875606365036"
Parameter ‘num_perm’ is used to set the number of permutations of input expression data. The larger the number is, the slower the calculation is.
The ppc method is a variant of the hermes method. However, it predicts competing endogenous RNAs via evidence for competition for miRNA regulation based on Partial Pearson Correlation.
miR2Target <- system.file("extdata", "miR2Target.csv", package="miRspongeR")
miRTarget <- read.csv(miR2Target, header=TRUE, sep=",")
ExpDatacsv <- system.file("extdata", "ExpData.csv", package="miRspongeR")
ExpData <- read.csv(ExpDatacsv, header=TRUE, sep=",")
ppcceRInt_original <- spongeMethod(miRTarget, ExpData, num_perm = 10, method = "ppc")
ppcceRInt_parallel <- spongeMethod(miRTarget, ExpData, num_perm = 10, method = "ppc_parallel")
head(ppcceRInt_original)
## sponge_1 sponge_2 #shared miRNAs p.adjusted_value of shared miRNAs
## [1,] "QKI" "NFIA" "31" "1.61969028532605e-05"
## [2,] "QKI" "TCF4" "20" "0.00164001938044806"
## [3,] "QKI" "TNKS2" "28" "0.00098462411218998"
## [4,] "QKI" "PTEN" "31" "0.00172299770543381"
## [5,] "QKI" "KLF6" "25" "1.61969028532605e-05"
## [6,] "SOX5" "RB1" "11" "0.00313418502601189"
## p.adjusted_value of RNA competition
## [1,] "3.24863700780281e-10"
## [2,] "2.83150048689073e-07"
## [3,] "1.09386075001538e-08"
## [4,] "4.23392739007838e-10"
## [5,] "2.15999455679945e-09"
## [6,] "0.000284164148158203"
head(ppcceRInt_parallel)
## sponge_1 sponge_2 #shared miRNAs p.adjusted_value of shared miRNAs
## [1,] "QKI" "NFIA" "31" "1.61969028532605e-05"
## [2,] "QKI" "TCF4" "20" "0.00164001938044806"
## [3,] "QKI" "TNKS2" "28" "0.00098462411218998"
## [4,] "QKI" "PTEN" "31" "0.0017229977054338"
## [5,] "QKI" "KLF6" "25" "1.61969028532605e-05"
## [6,] "SOX5" "RB1" "11" "0.00313418502601189"
## p.adjusted_value of RNA competition
## [1,] "3.24863700780281e-10"
## [2,] "2.77892977932344e-07"
## [3,] "1.77112563653867e-08"
## [4,] "4.23392739007838e-10"
## [5,] "2.15999455679945e-09"
## [6,] "0.000982265076819532"
Parameter ‘num_perm’ is used to set the number of permutations of input expression data. The larger the number is, the slower the calculation is.
‘miRspongeR’ implements the muTaME method based on the logarithm of four scores: (1) the fraction of common miRNAs, (2) the density of the MREs for all shared miRNAs, (3) the distribution of MREs of the putative RNA-RNA pairs and (4) the relation between the overall number of MREs for a putative miRNA sponge compared with the number of miRNAs that yield these MREs. There is no reason to decide which score has more contribution than the rest. Thus, ‘miRspongeR’ calculates a combined score by adding these four scores. To evaluate the strength of each RNA-RNA pair, ‘miRspongeR’ further normalizes the combined scores to obtain normalized scores with interval [0 1].
miR2Target <- system.file("extdata", "miR2Target.csv", package="miRspongeR")
miRTarget <- read.csv(miR2Target, header=TRUE, sep=",")
MREs <- system.file("extdata", "MREs.csv", package="miRspongeR")
mres <- read.csv(MREs, header=TRUE, sep=",")
muTaMEceRInt_original <- spongeMethod(miRTarget, mres = mres, method = "muTaME")
muTaMEceRInt_parallel <- spongeMethod(miRTarget, mres = mres, method = "muTaME_parallel")
head(muTaMEceRInt_original)
## sponge_1 sponge_2 #shared miRNAs p.adjusted_value of shared miRNAs
## [1,] "QKI" "NFIA" "31" "1.61969028532605e-05"
## [2,] "QKI" "TCF4" "20" "0.00164001938044806"
## [3,] "QKI" "TNKS2" "28" "0.00098462411218998"
## [4,] "QKI" "PTEN" "31" "0.00172299770543381"
## [5,] "QKI" "KLF6" "25" "1.61969028532605e-05"
## [6,] "NFIA" "PTEN" "29" "0.000182921588755158"
## Score 1 Score 2 Score 3
## [1,] "-0.437213806422745" "-165.610621988223" "257.7778713546"
## [2,] "-0.470003629245736" "-100.765642874738" "155.515558123873"
## [3,] "-0.559615787935423" "-152.637669361688" "228.018475194044"
## [4,] "-0.626455806061273" "-172.33389579933" "256.339724121501"
## [5,] "-0.336472236621213" "-134.332311526516" "203.667092432857"
## [6,] "-0.503905180921417" "-126.210306682534" "179.441855213251"
## Score 4 Combined score Normalized score
## [1,] "-0.0201046632956751" "91.7099308966585" "1"
## [2,] "-0.0188882845202057" "54.2610233353697" "0.638676224274825"
## [3,] "-0.0226424767497598" "74.7985475676708" "0.836831425131288"
## [4,] "-0.0192015809668538" "83.3601709351436" "0.919437788919319"
## [5,] "-0.0209431738452431" "68.9773654958742" "0.780666062178499"
## [6,] "-0.0470453636900539" "52.6805979861055" "0.623427575071551"
head(muTaMEceRInt_parallel)
## sponge_1 sponge_2 #shared miRNAs p.adjusted_value of shared miRNAs
## [1,] "QKI" "NFIA" "31" "1.61969028532605e-05"
## [2,] "QKI" "TCF4" "20" "0.00164001938044806"
## [3,] "QKI" "TNKS2" "28" "0.00098462411218998"
## [4,] "QKI" "PTEN" "31" "0.0017229977054338"
## [5,] "QKI" "KLF6" "25" "1.61969028532605e-05"
## [6,] "NFIA" "PTEN" "29" "0.000182921588755158"
## Score 1 Score 2 Score 3
## [1,] "-0.437213806422745" "-165.610621988223" "257.7778713546"
## [2,] "-0.470003629245736" "-100.765642874738" "155.515558123873"
## [3,] "-0.559615787935423" "-152.637669361688" "228.018475194044"
## [4,] "-0.626455806061273" "-172.33389579933" "256.339724121501"
## [5,] "-0.336472236621213" "-134.332311526516" "203.667092432857"
## [6,] "-0.503905180921417" "-126.210306682534" "179.441855213251"
## Score 4 Combined score Normalized score
## [1,] "-0.0201046632956751" "91.7099308966585" "1"
## [2,] "-0.0188882845202057" "54.2610233353697" "0.638676224274825"
## [3,] "-0.0226424767497598" "74.7985475676708" "0.836831425131288"
## [4,] "-0.0192015809668538" "83.3601709351436" "0.919437788919319"
## [5,] "-0.0209431738452431" "68.9773654958742" "0.780666062178499"
## [6,] "-0.0470453636900539" "52.6805979861055" "0.623427575071551"
The cernia method is based on the logarithm of seven scores: (1) the fraction of common miRNAs, (2) the density of the MREs for all shared miRNAs, (3) the distribution of MREs of the putative RNA-RNA pairs, (4) the relation between the overall number of MREs for a putative miRNA sponge compared with the number of miRNAs that yield these MREs, (5) the density of the hybridization energies related to MREs for all shared miRNAs, (6) the DT-Hybrid recommendation scores and (7) the pairwise Pearson correlation between putative RNA-RNA pair expression data. There is no reason to decide which score has more contribution than the rest. Thus, ‘miRspongeR’ calculates a combined score by adding these seven scores. To evaluate the strength of each RNA-RNA pair, ‘miRspongeR’ further normalizes the combined scores to obtain normalized scores with interval [0 1].
miR2Target <- system.file("extdata", "miR2Target.csv", package="miRspongeR")
miRTarget <- read.csv(miR2Target, header=TRUE, sep=",")
MREs <- system.file("extdata", "MREs.csv", package="miRspongeR")
mres <- read.csv(MREs, header=TRUE, sep=",")
ExpDatacsv <- system.file("extdata", "ExpData.csv", package="miRspongeR")
ExpData <- read.csv(ExpDatacsv, header=TRUE, sep=",")
cerniaceRInt_original <- spongeMethod(miRTarget, ExpData, mres, method = "cernia")
cerniaceRInt_parallel <- spongeMethod(miRTarget, ExpData, mres, method = "cernia_parallel")
head(cerniaceRInt_original)
## sponge_1 sponge_2 #shared miRNAs p.adjusted_value of shared miRNAs
## [1,] "QKI" "NFIA" "31" "1.61969028532605e-05"
## [2,] "QKI" "TCF4" "20" "0.00164001938044806"
## [3,] "QKI" "TNKS2" "28" "0.00098462411218998"
## [4,] "QKI" "PTEN" "31" "0.00172299770543381"
## [5,] "QKI" "KLF6" "25" "1.61969028532605e-05"
## [6,] "SOX5" "RB1" "11" "0.00313418502601189"
## Score 1 Score 2 Score 3
## [1,] "-0.437213806422745" "-165.610621988223" "257.7778713546"
## [2,] "-0.470003629245736" "-100.765642874738" "155.515558123873"
## [3,] "-0.559615787935423" "-152.637669361688" "228.018475194044"
## [4,] "-0.626455806061273" "-172.33389579933" "256.339724121501"
## [5,] "-0.336472236621213" "-134.332311526516" "203.667092432857"
## [6,] "-0.693147180559945" "-48.1031359789178" "59.3153148480852"
## Score 4 Score 5 Score 6
## [1,] "-0.0201046632956751" "-92.2247034012848" "0.106465490518013"
## [2,] "-0.0188882845202057" "-56.2215260182082" "0.093724938445459"
## [3,] "-0.0226424767497598" "-84.6432209105312" "0.105901597149822"
## [4,] "-0.0192015809668538" "-92.9082256214383" "0.111354127972192"
## [5,] "-0.0209431738452431" "-73.2294093715465" "0.0969028171707324"
## [6,] "-0.0224308469881821" "-20.3501801808809" "0.075154823978529"
## Score 7 Combined score Normalized score
## [1,] "-0.709826157809743" "-1.11813317191805" "0.756556965720048"
## [2,] "-0.310704372925734" "-2.17748211731878" "0.737120534653221"
## [3,] "-0.923604322573739" "-10.6623760682843" "0.581443732308459"
## [4,] "-0.589175648556801" "-10.0258762068793" "0.593121929312341"
## [5,] "-0.501968313383334" "-4.65710937188486" "0.691625512060294"
## [6,] "-1.42725878781999" "-11.205683303103" "0.57147538933374"
head(cerniaceRInt_parallel)
## sponge_1 sponge_2 #shared miRNAs p.adjusted_value of shared miRNAs
## [1,] "QKI" "NFIA" "31" "1.61969028532605e-05"
## [2,] "QKI" "TCF4" "20" "0.00164001938044806"
## [3,] "QKI" "TNKS2" "28" "0.00098462411218998"
## [4,] "QKI" "PTEN" "31" "0.0017229977054338"
## [5,] "QKI" "KLF6" "25" "1.61969028532605e-05"
## [6,] "SOX5" "RB1" "11" "0.00313418502601189"
## Score 1 Score 2 Score 3
## [1,] "-0.437213806422745" "-165.610621988223" "257.7778713546"
## [2,] "-0.470003629245736" "-100.765642874738" "155.515558123873"
## [3,] "-0.559615787935423" "-152.637669361688" "228.018475194044"
## [4,] "-0.626455806061273" "-172.33389579933" "256.339724121501"
## [5,] "-0.336472236621213" "-134.332311526516" "203.667092432857"
## [6,] "-0.693147180559945" "-48.1031359789178" "59.3153148480852"
## Score 4 Score 5 Score 6
## [1,] "-0.0201046632956751" "-92.2247034012848" "0.106465490518013"
## [2,] "-0.0188882845202057" "-56.2215260182082" "0.093724938445459"
## [3,] "-0.0226424767497598" "-84.6432209105312" "0.105901597149822"
## [4,] "-0.0192015809668538" "-92.9082256214383" "0.111354127972192"
## [5,] "-0.0209431738452431" "-73.2294093715465" "0.0969028171707324"
## [6,] "-0.0224308469881821" "-20.3501801808809" "0.075154823978529"
## Score 7 Combined score Normalized score
## [1,] "-0.709826157809743" "-1.11813317191805" "0.756556965720047"
## [2,] "-0.310704372925734" "-2.17748211731878" "0.73712053465322"
## [3,] "-0.923604322573739" "-10.6623760682843" "0.581443732308459"
## [4,] "-0.589175648556801" "-10.0258762068793" "0.59312192931234"
## [5,] "-0.501968313383334" "-4.65710937188486" "0.691625512060293"
## [6,] "-1.42725878781999" "-11.205683303103" "0.57147538933374"
The SPONGE method predicts competing endogenous RNAs with sparse partial correlation.
miR2Target <- system.file("extdata", "miR2Target.csv", package="miRspongeR")
miRTarget <- read.csv(miR2Target, header=TRUE, sep=",")
ExpDatacsv <- system.file("extdata", "ExpData.csv", package="miRspongeR")
ExpData <- read.csv(ExpDatacsv, header=TRUE, sep=",")
null_model_file <- system.file("extdata", "precomputed_null_model.rda", package="miRspongeR")
load(null_model_file)
spongeceRInt_parallel <- spongeMethod(miRTarget, ExpData, padjustvaluecutoff = 0.05, senscorcutoff = 0.1, null_model = precomputed_null_model, method = "sponge_parallel")
head(spongeceRInt_parallel)
## sponge_1 sponge_2 #shared miRNAs p.adjusted_value of shared miRNAs
## [1,] "QKI" "NFIA" "31" "2.09606978101018e-05"
## [2,] "QKI" "TNKS2" "28" "0.00127421943930468"
## [3,] "QKI" "PTEN" "31" "0.00222976173644375"
## [4,] "QKI" "KLF6" "25" "2.09606978101018e-05"
## [5,] "SOX5" "NFIA" "11" "0.0396676479882843"
## [6,] "SOX5" "TNKS2" "12" "0.0163679216117328"
## correlation p.adjusted_value of positive correlation
## [1,] "0.491729673394525" "1.70799891322324e-09"
## [2,] "0.397085235409979" "1.52592566305369e-06"
## [3,] "0.554784433631471" "2.9635135683903e-12"
## [4,] "0.605337991455224" "6.76837143548591e-15"
## [5,] "0.40652609360582" "9.4248730164215e-07"
## [6,] "0.311193449867247" "0.000189927771179422"
## partial correlation sensitivity correlation
## [1,] "0.184692474325743" "0.307037199068782"
## [2,] "0.258649105100356" "0.138436130309623"
## [3,] "0.208509938605384" "0.346274495026087"
## [4,] "0.251037637753861" "0.354300353701363"
## [5,] "0.0968069891680962" "0.309719104437724"
## [6,] "0.0826156937347901" "0.228577756132457"
## p.adjusted_value of sensitivity correlation
## [1,] "0.02"
## [2,] "0.0166666666666667"
## [3,] "0.0375"
## [4,] "0.0166666666666667"
## [5,] "0.0166666666666667"
## [6,] "0.025"
To obtain a list of high-confidence miRNA sponge interactions, ‘miRspongeR’ implements ‘integrateMethod’ function to integrate results of different methods.
miR2Target <- system.file("extdata", "miR2Target.csv", package="miRspongeR")
miRTarget <- read.csv(miR2Target, header=TRUE, sep=",")
ExpDatacsv <- system.file("extdata", "ExpData.csv", package="miRspongeR")
ExpData <- read.csv(ExpDatacsv, header=TRUE, sep=",")
miRHomologyceRInt <- spongeMethod(miRTarget, method = "miRHomology")
pcceRInt <- spongeMethod(miRTarget, ExpData, method = "pc")
sppcceRInt <- spongeMethod(miRTarget, ExpData, method = "sppc")
Interlist <- list(miRHomologyceRInt[, 1:2], pcceRInt[, 1:2], sppcceRInt[, 1:2])
IntegrateceRInt <- integrateMethod(Interlist, Intersect_num = 2)
head(IntegrateceRInt)
## sponge_1 sponge_2
## [1,] "QKI" "NFIA"
## [2,] "QKI" "TCF4"
## [3,] "QKI" "TNKS2"
## [4,] "QKI" "PTEN"
## [5,] "QKI" "KLF6"
## [6,] "SOX5" "RB1"
Parameter ‘Intersect_num’ is used to set the least number of methods intersected for integration. That is to say, ‘miRspongeR’ only reserves those miRNA sponge interactions predicted by at least ‘Intersect_num’ methods.
‘miRspongeR’ uses a sample control variable strategy to infer sample-specific miRNA sponge interaction network. In the strategy, eight popular methods (miRHomology, pc, sppc, ppc, hermes, muTaME, cernia, and SPONGE) to identify miRNA sponge interactions for each sample.
ExpDatacsv <- system.file("extdata","ExpData.csv",package="miRspongeR")
ExpData <- read.csv(ExpDatacsv, header=TRUE, sep=",")
miR2Target <- system.file("extdata", "miR2Target.csv", package="miRspongeR")
miRTarget <- read.csv(miR2Target, header=TRUE, sep=",")
sponge_sample_specific_net <- sponge_sample_specific(miRTarget, ExpData, senscorcutoff = 0.1, method = "sppc")
head(sponge_sample_specific_net[[1]])
## from to
## 1 TNKS2 PTEN
## 2 TCF4 TNKS2
In terms of sample-specific miRNA sponge interaction networks, ‘miRspongeR’ further identifies sample-sample correlation network. The three similarity methods (Simpson [22], Jaccard [23] and Lin [24]) are used. The significance p-values of similarity are calculated using a hypergeometric distribution test.
ExpDatacsv <- system.file("extdata","ExpData.csv",package="miRspongeR")
ExpData <- read.csv(ExpDatacsv, header=TRUE, sep=",")
miR2Target <- system.file("extdata", "miR2Target.csv", package="miRspongeR")
miRTarget <- read.csv(miR2Target, header=TRUE, sep=",")
sponge_sample_specific_net <- sponge_sample_specific(miRTarget, ExpData, senscorcutoff = 0.1, method = "sppc")
sample_cor_network_res <- sample_cor_network(sponge_sample_specific_net, genes_num = 31, padjustvaluecutoff = 0.05)
head(sample_cor_network_res)
## sample_1 sample_2 similarity p.adjusted_value of similarity
## [1,] "1" "19" "1" "0.0159065628476203"
## [2,] "1" "53" "1" "0.0159065628476203"
## [3,] "1" "56" "1" "0.0178948832035728"
## [4,] "1" "67" "1" "0.0159065628476203"
## [5,] "1" "144" "1" "0.0178948832035728"
## [6,] "9" "52" "1" "0.0178948832035728"
To validate the predicted miRNA sponge interactions, ‘miRspongeR’ implements ‘spongeValidate’ function. The built-in groundtruth of miRNA sponge interactions are from miRSponge (http://bio-bigdata.hrbmu.edu.cn/miRSponge/), lncACTdb (http://bio-bigdata.hrbmu.edu.cn/LncACTdb/), LncCeRBase (http://www.insect-genome.com/LncCeRBase/front/).
miR2Target <- system.file("extdata", "miR2Target.csv", package="miRspongeR")
miRTarget <- read.csv(miR2Target, header=TRUE, sep=",")
miRHomologyceRInt <- spongeMethod(miRTarget, method = "miRHomology")
Groundtruthcsv <- system.file("extdata", "Groundtruth.csv", package="miRspongeR")
Groundtruth <- read.csv(Groundtruthcsv, header=TRUE, sep=",")
spongenetwork_validated <- spongeValidate(miRHomologyceRInt[, 1:2], directed = FALSE, Groundtruth)
spongenetwork_validated
## sponge_1 sponge_2
## 1 PTEN KLF6
## 2 TNKS2 PTEN
## 3 RB1 PTEN
To further understand the module-level properties of miRNA sponges, ‘miRspongeR’ implements ‘netModule’ function to identify miRNA sponge modules. Users can choose FN [25], MCL [26], LINKCOMM [27], MCODE [28], betweenness [29], infomap [30], prop [31], eigen [32], louvain [33] and walktrap [34] for module identification.
miR2Target <- system.file("extdata", "miR2Target.csv", package="miRspongeR")
miRTarget <- read.csv(miR2Target, header=TRUE, sep=",")
miRHomologyceRInt <- spongeMethod(miRTarget, method = "miRHomology")
spongenetwork_Cluster <- netModule(miRHomologyceRInt[, 1:2], modulesize = 2)
spongenetwork_Cluster
## [[1]]
## [1] "SOX5" "RB1"
##
## [[2]]
## [1] "QKI" "NFIA" "TCF4" "TNKS2" "PTEN" "KLF6"
‘miRspongeR’ implements ‘moduleDEA’ function to conduct disease enrichment analysis of modules. The disease databases used include DO: Disease Ontology database (http://disease-ontology.org/), DGN: DisGeNET database (http://www.disgenet.org/) and NCG: Network of Cancer Genes database (http://ncg.kcl.ac.uk/). Moreover, ‘moduleFEA’ function is implemented to conduct functional GO, KEGG and Reactome enrichment analysis of modules. The ontology databases used contain GO: Gene Ontology database (http://www.geneontology.org/), KEGG: Kyoto Encyclopedia of Genes and Genomes Pathway Database (http://www.genome.jp/kegg/), and Reactome: Reactome Pathway Database (http://reactome.org/).
miR2Target <- system.file("extdata", "miR2Target.csv", package="miRspongeR")
miRTarget <- read.csv(miR2Target, header=TRUE, sep=",")
miRHomologyceRInt <- spongeMethod(miRTarget, method = "miRHomology")
spongenetwork_Cluster <- netModule(miRHomologyceRInt[, 1:2], modulesize = 2)
sponge_Module_DEA <- moduleDEA(spongenetwork_Cluster)
sponge_Module_FEA <- moduleFEA(spongenetwork_Cluster)
To make survival analysis of miRNA sponge modules, ‘miRspongeR’ implements ‘moduleSurvival’ function.
miR2Target <- system.file("extdata", "miR2Target.csv", package="miRspongeR")
miRTarget <- read.csv(miR2Target, header=TRUE, sep=",")
ExpDatacsv <- system.file("extdata", "ExpData.csv", package="miRspongeR")
ExpData <- read.csv(ExpDatacsv, header=TRUE, sep=",")
SurvDatacsv <- system.file("extdata", "SurvData.csv", package="miRspongeR")
SurvData <- read.csv(SurvDatacsv, header=TRUE, sep=",")
pcceRInt <- spongeMethod(miRTarget, ExpData, method = "pc")
spongenetwork_Cluster <- netModule(pcceRInt[, 1:2], modulesize = 2)
sponge_Module_Survival <- moduleSurvival(spongenetwork_Cluster,
ExpData, SurvData, devidePercentage=.5)
sponge_Module_Survival
## Chi-square p-value HR HRlow95 HRup95
## Module 1 2.0610023 0.1511107 1.269461 0.9062752 1.778192
## Module 2 0.5682278 0.4509640 1.134920 0.8109446 1.588325
Parameter ‘devidePercentage’ is used to set the percentage of high risk group.
‘miRspongeR’ provides several functions to explore miRNA sponge (also called ceRNA or miRNA decoy) regulation from putative miRNA-target interactions or/and transcriptomics data (including bulk, single-cell and spatial gene expression data). It provides eight popular methods for identifying miRNA sponge interactions, and an integrative method to integrate miRNA sponge interactions from different methods, as well as the functions to validate miRNA sponge interactions, and infer miRNA sponge modules, conduct enrichment analysis of miRNA sponge modules, and conduct survival analysis of miRNA sponge modules. By using a sample control variable strategy, it provides a function to infer sample-specific miRNA sponge interactions. In terms of sample-specific miRNA sponge interactions, it implements three similarity methods to construct sample-sample correlation network. It could provide a useful tool for the research of miRNA sponges.
[1] Ambros V. microRNAs: tiny regulators with great potential. Cell, 2001, 107:823–6.
[2] Bartel DP. MicroRNAs: genomics, biogenesis, mechanism, and function. Cell, 2004, 116:281–97.
[3] Du T, Zamore PD. Beginning to understand microRNA function. Cell Research, 2007, 17:661–3.
[4] Esquela-Kerscher A, Slack FJ. Oncomirs—microRNAs with a role in cancer. Nature Reviews Cancer, 2006, 6:259–69.
[5] Lin S, Gregory RI. MicroRNA biogenesis pathways in cancer. Nature Reviews Cancer, 2015, 15:321–33.
[6] Salmena L, Poliseno L, Tay Y, et al. A ceRNA hypothesis: the Rosetta Stone of a hidden RNA language? Cell, 2011, 146(3):353-8.
[7] Cesana M, Cacchiarelli D, Legnini I, et al. A long noncoding RNA controls muscle differentiation by functioning as a competing endogenous RNA. Cell, 2011, 147:358–69.
[8] Poliseno L, Salmena L, Zhang J, et al. A coding-independent function of gene and pseudogene mRNAs regulates tumour biology. Nature, 2010, 465:1033–8.
[9] Hansen TB, Jensen TI, Clausen BH, et al. Natural RNA circles function as efficient microRNA sponges. Nature, 2013, 495:384–8.
[10] Memczak S, Jens M, Elefsinioti A, et al. Circular RNAs are a large class of animal RNAs with regulatory potency. Nature, 2013, 495:333–8.
[11] Le TD, Zhang J, Liu L, et al. Computational methods for identifying miRNA sponge interactions. Brief Bioinform., 2017, 18(4):577-590.
[12] Tay Y, Rinn J, Pandolfi PP. The multilayered complexity of ceRNA crosstalk and competition. Nature, 2014, 505:344–52.
[13] Li JH, Liu S, Zhou H, et al. starBase v2.0: decoding miRNA-ceRNA, miRNA-ncRNA and protein-RNA interaction networks from large-scale CLIP-Seq data. Nucleic Acids Res., 2014, 42(Database issue):D92-7.
[14] Sarver AL, Subramanian S. Competing endogenous RNA database. Bioinformation, 2012, 8(15):731-3.
[15] Zhou X, Liu J, Wang W, Construction and investigation of breast-cancer-specific ceRNA network based on the mRNA and miRNA expression data. IET Syst Biol., 2014, 8(3):96-103.
[16] Xu J, Li Y, Lu J, et al. The mRNA related ceRNA-ceRNA landscape and significance across 20 major cancer types. Nucleic Acids Res., 2015, 43(17):8169-82.
[17] Paci P, Colombo T, Farina L, Computational analysis identifies a sponge interaction network between long non-coding RNAs and messenger RNAs in human breast cancer. BMC Syst Biol., 2014, 8:83.
[18] Sumazin P, Yang X, Chiu HS, et al. An extensive microRNA-mediated network of RNA-RNA interactions regulates established oncogenic pathways in glioblastoma. Cell, 2011, 147(2):370-81.
[19] Tay Y, Kats L, Salmena L, et al. Coding-independent regulation of the tumor suppressor PTEN by competing endogenous mRNAs. Cell, 2011, 147(2):344-57.
[20] Sardina DS, Alaimo S, Ferro A, et al. A novel computational method for inferring competing endogenous interactions. Brief Bioinform., 2017, 18(6):1071-1081.
[21] List M, Dehghani Amirabad A, Kostka D, Schulz MH. Large-scale inference of competing endogenous RNA networks with sparse partial correlation. Bioinformatics. 2019;35(14):i596-i604.
[22] Tucker CM, Cadotte MW, Carvalho SB, et al. A guide to phylogenetic metrics for conservation, community ecology and macroecology. Biol Rev Camb Philos Soc. 2017;92(2):698-715.
[23] Jaccard P. The Distribution of the Flora in the Alpine Zone. The New Phytologist 11, no. 2 (1912): 37–50.
[24] Lin D, et al. An information-theoretic definition of similarity. In: Icml, vol. 98. 1998. p. 296–304.
[25] Clauset A, Newman ME, Moore C. Finding community structure in very large networks. Phys Rev E Stat Nonlin Soft Matter Phys., 2004, 70(6 Pt 2):066111.
[26] Enright AJ, Van Dongen S, Ouzounis CA. An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res., 2002, 30(7):1575-84.
[27] Kalinka AT, Tomancak P. linkcomm: an R package for the generation, visualization, and analysis of link communities in networks of arbitrary size and type. Bioinformatics, 2011, 27(14):2011-2.
[28] Bader GD, Hogue CW. An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinformatics, 2003, 4:2.
[29] Newman ME, Girvan M. Finding and evaluating community structure in networks. Phys Rev E Stat Nonlin Soft Matter Phys. 2004;69(2 Pt 2):026113.
[30] Rosvall M, Bergstrom CT. Maps of random walks on complex networks reveal community structure. Proc Natl Acad Sci U S A. 2008;105(4):1118-1123.
[31] Raghavan UN, Albert R, Kumara S. Near linear time algorithm to detect community structures in large-scale networks. Phys Rev E Stat Nonlin Soft Matter Phys. 2007;76(3 Pt 2):036106.
[32] Newman ME. Finding community structure in networks using the eigenvectors of matrices. Phys Rev E Stat Nonlin Soft Matter Phys. 2006;74(3 Pt 2):036104.
[33] Blondel VD, Guillaume JL, Lambiotte R, Lefebvre E. Fast unfolding of communities in large networks. Journal of statistical mechanics: theory and experiment, 2008, 2008(10): P10008.
[34] Pons P, Latapy M. Computing communities in large networks using random walks. Graph Algorithms Appl. 2006.
sessionInfo()
## R version 4.2.1 (2022-06-23)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.5 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.16-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.16-bioc/R/lib/libRlapack.so
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_GB LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] miRspongeR_2.2.0 BiocStyle_2.26.0
##
## loaded via a namespace (and not attached):
## [1] rappdirs_0.3.3 ModelMetrics_1.2.2.2 tidyr_1.2.1
## [4] ggplot2_3.3.6 bit64_4.0.5 knitr_1.40
## [7] data.table_1.14.4 rpart_4.1.19 KEGGREST_1.38.0
## [10] hardhat_1.2.0 RCurl_1.98-1.9 doParallel_1.0.17
## [13] generics_0.1.3 BiocGenerics_0.44.0 cowplot_1.1.1
## [16] RSQLite_2.2.18 shadowtext_0.1.2 future_1.28.0
## [19] bit_4.0.4 tzdb_0.3.0 enrichplot_1.18.0
## [22] xml2_1.3.3 lubridate_1.8.0 assertthat_0.2.1
## [25] tidyverse_1.3.2 viridis_0.6.2 gargle_1.2.1
## [28] gower_1.0.0 xfun_0.34 hms_1.1.2
## [31] jquerylib_0.1.4 evaluate_0.17 fansi_1.0.3
## [34] progress_1.2.2 dbplyr_2.2.1 readxl_1.4.1
## [37] Rgraphviz_2.42.0 igraph_1.3.5 DBI_1.1.3
## [40] googledrive_2.0.0 stats4_4.2.1 purrr_0.3.5
## [43] ellipsis_0.3.2 dplyr_1.0.10 ggpubr_0.4.0
## [46] backports_1.4.1 bookdown_0.29 linkcomm_1.0-14
## [49] biomaRt_2.54.0 vctrs_0.5.0 Biobase_2.58.0
## [52] abind_1.4-5 caret_6.0-93 cachem_1.0.6
## [55] withr_2.5.0 ggforce_0.4.1 HDO.db_0.99.1
## [58] treeio_1.22.0 prettyunits_1.1.1 cluster_2.1.4
## [61] DOSE_3.24.0 ape_5.6-2 lazyeval_0.2.2
## [64] crayon_1.5.2 glmnet_4.1-4 recipes_1.0.2
## [67] pkgconfig_2.0.3 tweenr_2.0.2 GenomeInfoDb_1.34.0
## [70] nlme_3.1-160 nnet_7.3-18 rlang_1.0.6
## [73] globals_0.16.1 lifecycle_1.0.3 SPONGE_1.20.0
## [76] downloader_0.4 filelock_1.0.2 BiocFileCache_2.6.0
## [79] modelr_0.1.9 cellranger_1.1.0 randomForest_4.7-1.1
## [82] polyclip_1.10-4 matrixStats_0.62.0 graph_1.76.0
## [85] rngtools_1.5.2 Matrix_1.5-1 aplot_0.1.8
## [88] carData_3.0-5 reprex_2.0.2 ggridges_0.5.4
## [91] GlobalOptions_0.1.2 googlesheets4_1.0.1 png_0.1-7
## [94] viridisLite_0.4.1 rjson_0.2.21 bitops_1.0-7
## [97] gson_0.0.9 pROC_1.18.0 Biostrings_2.66.0
## [100] blob_1.2.3 doRNG_1.8.2 shape_1.4.6
## [103] stringr_1.4.1 qvalue_2.30.0 parallelly_1.32.1
## [106] readr_2.1.3 rstatix_0.7.0 gridGraphics_0.5-1
## [109] S4Vectors_0.36.0 ggsignif_0.6.4 reactome.db_1.82.0
## [112] scales_1.2.1 graphite_1.44.0 memoise_2.0.1
## [115] magrittr_2.0.3 plyr_1.8.7 zlibbioc_1.44.0
## [118] compiler_4.2.1 scatterpie_0.1.8 RColorBrewer_1.1-3
## [121] clue_0.3-62 miRBaseConverter_1.22.0 cli_3.4.1
## [124] XVector_0.38.0 listenv_0.8.0 patchwork_1.1.2
## [127] MASS_7.3-58.1 tidyselect_1.2.0 stringi_1.7.8
## [130] forcats_0.5.2 yaml_2.3.6 GOSemSim_2.24.0
## [133] ggrepel_0.9.1 grid_4.2.1 sass_0.4.2
## [136] fastmatch_1.1-3 tools_4.2.1 future.apply_1.9.1
## [139] parallel_4.2.1 circlize_0.4.15 foreach_1.5.2
## [142] logging_0.10-108 gridExtra_2.3 prodlim_2019.11.13
## [145] farver_2.1.1 ggraph_2.1.0 digest_0.6.30
## [148] BiocManager_1.30.19 lava_1.7.0 ppcor_1.1
## [151] Rcpp_1.0.9 car_3.1-1 broom_1.0.1
## [154] gRbase_1.8.8 org.Hs.eg.db_3.16.0 httr_1.4.4
## [157] AnnotationDbi_1.60.0 ComplexHeatmap_2.14.0 colorspace_2.0-3
## [160] rvest_1.0.3 XML_3.99-0.12 fs_1.5.2
## [163] IRanges_2.32.0 splines_4.2.1 yulab.utils_0.0.5
## [166] tidytree_0.4.1 expm_0.999-6 graphlayouts_0.8.3
## [169] ggplotify_0.1.0 MCL_1.0 tnet_3.0.16
## [172] jsonlite_1.8.3 ggtree_3.6.0 dynamicTreeCut_1.63-1
## [175] tidygraph_1.2.2 corpcor_1.6.10 cvms_1.3.6
## [178] timeDate_4021.106 ggfun_0.0.7 ipred_0.9-13
## [181] MetBrewer_0.2.0 R6_2.5.1 pillar_1.8.1
## [184] htmltools_0.5.3 glue_1.6.2 fastmap_1.1.0
## [187] clusterProfiler_4.6.0 BiocParallel_1.32.0 class_7.3-20
## [190] codetools_0.2-18 fgsea_1.24.0 utf8_1.2.2
## [193] lattice_0.20-45 bslib_0.4.0 tibble_3.1.8
## [196] curl_4.3.3 ReactomePA_1.42.0 GO.db_3.16.0
## [199] survival_3.4-0 rmarkdown_2.17 munsell_0.5.0
## [202] GetoptLong_1.0.5 GenomeInfoDbData_1.2.9 iterators_1.0.14
## [205] haven_2.5.1 reshape2_1.4.4 gtable_0.3.1