Title: | Generating Robust Biclusters from a Bicluster Set (Ensemble Biclustering) |
---|---|
Description: | Biclusters are submatrices in the data matrix which satisfy certain conditions of homogeneity. The package contains functions for generating biclusters that are robust with respect to the initialization parameters, given the bicluster solutions contained in a bicluster set; the procedure is also known as ensemble biclustering. The set of biclusters is evaluated based on the similarity of its elements (the overlap), and afterwards a hierarchical tree is constructed to obtain cut-off points for the classes of robust biclusters. The result is a number of robust (or super) biclusters with little or no overlap. |
Authors: | Tatsiana Khamiakova |
Maintainer: | Tatsiana Khamiakova <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.2 |
Built: | 2025-03-12 05:32:19 UTC |
Source: | https://github.com/cran/superbiclust |
The package contains functions for computing the similarity matrix of biclusters obtained with a variety of methods, initialization seeds or parameter settings. It uses biclustering output as generated by biclust or fabia. The isa2 package can also be used to generate the biclusters; however, the output must first be converted to a biclust object with the isa2.biclust() function. The similarity matrix is used to construct a hierarchical tree based on overall similarity, row similarity or column similarity, from which cut-off points are obtained for the similarity metric of choice. Various statistics are output per bicluster set, such as the number of times a given gene (compound) or gene (compound) set has been present in any bicluster of the output or per run. After the tree is cut, the robust (or super) biclusters are obtained in the form of a biclust object, which can be used further for plotting the biclusters. Biclusters are submatrices in the data which satisfy certain conditions of homogeneity. For more details on biclusters and biclustering see Madeira and Oliveira (2004).
Package: | superbiclust |
Type: | Package |
Version: | 0.99 |
Date: | 2012-08-23 |
License: | GPL |
LazyLoad: | yes |
Tatsiana Khamiakova <[email protected]>
Madeira and Oliveira (2004) Biclustering algorithms for biological data analysis: a survey. IEEE/ACM Trans Comput Biol Bioinform. 1(1):24-45.
Shi et al. (2010) A bi-ordering approach to linking gene expression with clinical annotations in gastric cancer. BMC Bioinformatics. 11:477.
BiclustSet
The BiclustSet class contains the biclustering result in the form of bicluster rows and bicluster columns.
Objects can be created by calls of the form new("BiclustSet", ...).
A variety of inputs (isa2, fabia, biclust, ...) can be used.
GenesMembership: logical, object of class "matrix", with row membership within a bicluster
ColumnMembership: logical, object of class "matrix", with column membership within a bicluster
Number: object of class "numeric", number of biclusters in the set
Tatsiana Khamiakova
The method extracts the relevant information from a variety of biclustering inputs and constructs a BiclustSet object
signature(x = "ANY")
signature(x = "Biclust")
Converts a Biclust object into a BiclustSet object
signature(x = "Factorization")
Converts a FABIA Factorization object into a BiclustSet object
signature(x = "list")
Converts a list with biclustering results into a BiclustSet object
library(superbiclust)
library(fabia)
# simulate data with two implanted biclusters
test <- matrix(rnorm(5000), 100, 50)
test[11:20,11:20] <- rnorm(100, 3, 0.1)
test[17:26,21:30] <- rnorm(100, 3, 0.1)
# run FABIA
set.seed(1)
FabiaRes1 <- fabia(test)
# construct BiclustSet object from FABIA output
FabiabiclustSet <- BiclustSet(FabiaRes1)
FabiabiclustSet
Combine two Biclust objects into one
combine(x,y)
x | 1st Biclust object containing bicluster results |
y | 2nd Biclust object containing bicluster results |
If a biclust function returns an empty set, the joined result contains only the results of the non-empty object. The Info and Parameters slots of the "Biclust" object contain information about both biclustering runs.
an object of class Biclust
Tatsiana Khamiakova
# combine the output of two biclust objects
library(biclust)
library(superbiclust)
test <- matrix(rnorm(5000), 100, 50)
test[11:20,11:20] <- rnorm(100, 3, 0.1)
test[17:26,21:30] <- rnorm(100, 3, 0.1)
set.seed(1)
PlaidRes1 <- biclust(x=test, method=BCPlaid())
set.seed(2)
PlaidRes2 <- biclust(x=test, method=BCPlaid())
combinedRes <- combine(PlaidRes1,PlaidRes2)
summary(combinedRes)
For a given bicluster set, compute the frequency of appearance within a bicluster for each row and column in the data
getStats(x)
x | Biclust object containing bicluster results |
a list of column and row frequencies
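The idea of row and column frequencies can be sketched in base R. This is only an illustration, not the package's implementation: the membership matrices `genes` and `samples`, their orientation (one column per bicluster) and the use of raw counts rather than proportions are all assumptions made for the example.

```r
# Hypothetical logical membership matrices: one column per bicluster.
genes <- cbind(bc1 = 1:8 %in% 1:4,
               bc2 = 1:8 %in% 3:6,
               bc3 = 1:8 %in% 3:4)
samples <- cbind(bc1 = 1:5 %in% 1:2,
                 bc2 = 1:5 %in% 2:4,
                 bc3 = 1:5 %in% 2:3)

# Count in how many biclusters each row (gene) and column (sample) appears.
freqSketch <- list(rows = rowSums(genes), cols = rowSums(samples))
print(freqSketch)
```

Rows or columns with a high count are the ones recovered by many biclusters in the set.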
Tatsiana Khamiakova
Constructs and plots a hierarchical tree of the bicluster output based on the similarity matrix
HCLtree(x)
x | Similarity object containing pairwise similarity indices for all biclusters in the output |
This function operates on a similarity matrix, which is converted to the distance between biclusters according to $\mathrm{dist} = 1 - \mathrm{sim}$, where the smaller the distance, the higher the overlap in terms of rows and columns.
The tree is constructed using the complete linkage method and plotted.
Further, the structure must be explored and robust or super-biclusters obtained after cutting the tree.
The identify function can be applied to the hierarchical tree to see the partition and get the plots of biclusters.
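The tree construction described above can be sketched with base R alone. The sketch assumes the distance is one minus the similarity, and the similarity values and the cut height of 0.5 are invented for illustration; it is not the code of HCLtree itself.

```r
# Hypothetical 4 x 4 similarity matrix for four biclusters (invented values).
sim <- matrix(c(1.0, 0.9, 0.1, 0.2,
                0.9, 1.0, 0.2, 0.1,
                0.1, 0.2, 1.0, 0.8,
                0.2, 0.1, 0.8, 1.0),
              nrow = 4, byrow = TRUE,
              dimnames = list(paste0("BC", 1:4), paste0("BC", 1:4)))

# Convert similarity to distance (assumed form: 1 - similarity),
# so that a high overlap gives a small distance.
d <- as.dist(1 - sim)

# Complete-linkage hierarchical clustering.
tree <- hclust(d, method = "complete")

# Cut the tree at an arbitrary height to group similar biclusters.
groups <- cutree(tree, h = 0.5)
print(groups)
```

Biclusters that fall into the same group are candidates to be merged into one robust bicluster.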
tree
Tatsiana Khamiakova
# compute sensitivity for BiMAX biclusters
library(biclust)
library(superbiclust)
test <- matrix(rnorm(5000), 100, 50)
test[11:20,11:20] <- rnorm(100, 3, 0.1)
test[17:26,21:30] <- rnorm(100, 3, 0.1)
testBin <- binarize(test,2)
res <- biclust(x=testBin, method=BCBimax(), minr=4, minc=4, number=10)
BiMaxBiclustSet <- BiclustSet(res)
SensitivityMatr <- similarity(BiMaxBiclustSet,index="sensitivity")
# construct hierarchical clustering based on the sensitivities
HCLsensitivity <- HCLtree(SensitivityMatr)
plot(HCLsensitivity, main="structure of bicluster solution")
Computes Jaccard similarity matrix for biclusters in two bicluster sets
jaccardMat(x, y, type=c("rows", "cols", "both"))
x | BiclustSet object containing row and column indicators of bicluster membership, number of biclusters |
y | BiclustSet object containing row and column indicators of bicluster membership, number of biclusters |
type | whether to compute Jaccard index in two dimensions, row dimension or column dimension |
This function operates on BiclustSet objects and computes pairwise similarity based on the common elements between biclusters.
The Jaccard similarity score for two biclusters A and B is computed as $J(A, B) = \frac{|A \cap B|}{|A \cup B|}$.
matrix of pairwise Jaccard indices
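A pairwise Jaccard matrix for the row dimension can be sketched in base R as follows, taking the index as the size of the intersection divided by the size of the union. The membership matrices `setX` and `setY`, their orientation and the helper `jaccard` are hypothetical and only stand in for the row indicators of two bicluster sets.

```r
# Hypothetical row membership: one row per gene, one column per bicluster.
setX <- cbind(bc1 = 1:30 %in% 11:20, bc2 = 1:30 %in% 17:26)
setY <- cbind(bc1 = 1:30 %in% 11:20, bc2 = 1:30 %in% 21:30)

# Jaccard index of two logical membership vectors.
jaccard <- function(a, b) sum(a & b) / sum(a | b)

# Entry [i, j] compares bicluster i of setX with bicluster j of setY.
jaccardSketch <- outer(seq_len(ncol(setX)), seq_len(ncol(setY)),
                       Vectorize(function(i, j) jaccard(setX[, i], setY[, j])))
print(jaccardSketch)
```

Identical biclusters score 1 and biclusters with no common rows score 0.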
Tatsiana Khamiakova
similarity, kulczynskiMat, ochiaiMat, sensitivityMat, specificityMat, sorensenMat
Computes Kulczynski similarity matrix for biclusters in two bicluster sets
kulczynskiMat(x, y, type=c("rows", "cols", "both"))
x | BiclustSet object containing row and column indicators of bicluster membership, number of biclusters |
y | BiclustSet object containing row and column indicators of bicluster membership, number of biclusters |
type | whether to compute Kulczynski index in two dimensions, row dimension or column dimension |
This function operates on BiclustSet objects and computes pairwise similarity based on the common elements between biclusters.
The Kulczynski similarity score for two biclusters A and B is computed as $K(A, B) = \frac{1}{2}\left(\frac{|A \cap B|}{|A|} + \frac{|A \cap B|}{|B|}\right)$.
matrix of pairwise Kulczynski indices
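As a small illustration, the Kulczynski index of two row sets can be computed in base R, assuming the usual definition as the mean of the two inclusion proportions. The helper `kulczynski` and the example vectors are hypothetical.

```r
# Kulczynski index: average of |A n B| / |A| and |A n B| / |B| (assumed form).
kulczynski <- function(a, b) {
  common <- sum(a & b)
  0.5 * (common / sum(a) + common / sum(b))
}

# Bicluster B (rows 6 to 10) is nested inside bicluster A (rows 1 to 10).
a <- 1:30 %in% 1:10
b <- 1:30 %in% 6:10

kulczynskiAB <- kulczynski(a, b)
jaccardAB <- sum(a & b) / sum(a | b)
print(c(kulczynski = kulczynskiAB, jaccard = jaccardAB))
```

For nested biclusters the Kulczynski index stays comparatively high (0.75 here against a Jaccard index of 0.5), because full inclusion in one direction is rewarded.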
Tatsiana Khamiakova
similarity, jaccardMat, ochiaiMat, sensitivityMat, specificityMat, sorensenMat
Computes Ochiai similarity matrix for biclusters in two bicluster sets
ochiaiMat(x, y, type=c("rows", "cols", "both"))
x | BiclustSet object containing row and column indicators of bicluster membership, number of biclusters |
y | BiclustSet object containing row and column indicators of bicluster membership, number of biclusters |
type | whether to compute Ochiai index in two dimensions, row dimension or column dimension |
This function operates on BiclustSet objects and computes pairwise similarity based on the common elements between biclusters.
The Ochiai similarity score for two biclusters A and B is computed as $O(A, B) = \frac{|A \cap B|}{\sqrt{|A|\,|B|}}$.
matrix of pairwise Ochiai indices
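The Ochiai index can be illustrated in base R, assuming the usual definition: the number of common elements divided by the geometric mean of the two bicluster sizes. The helper `ochiai` and the example vectors are hypothetical.

```r
# Ochiai index: |A n B| / sqrt(|A| * |B|) (assumed form).
ochiai <- function(a, b) sum(a & b) / sqrt(sum(a) * sum(b))

a <- 1:30 %in% 1:9    # 9 rows
b <- 1:30 %in% 7:10   # 4 rows, 3 of them shared with a

ochiaiAB <- ochiai(a, b)
print(ochiaiAB)
```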
Tatsiana Khamiakova
similarity, jaccardMat, kulczynskiMat, sensitivityMat, specificityMat, sorensenMat
Plot Gene Expression Profiles Across All Samples of the Original Data
plotProfilesAcrossAllSamples(x, coreBiclusterGenes, coreBiclusterSamples)
x | data |
coreBiclusterGenes | vector of genes belonging to bicluster |
coreBiclusterSamples | vector of samples belonging to bicluster |
The plot re-sorts the samples by bicluster membership and highlights them in red. Only the genes of a bicluster are plotted.
no return value; a plot is drawn to the current device
Tatsiana Khamiakova
Plot Gene Expression Profiles within a (Core) Bicluster
plotProfilesWithinBicluster(x, main = "", sampleNames, geneNames = NULL)
x | expression matrix (of class 'matrix') for the subset of genes and samples corresponding to the bicluster under study |
main | main title for the graph |
sampleNames | names of the samples to be used for annotating the x axis (character vector of length equal to the number of columns of the expression matrix 'x' representing the bicluster) |
geneNames | names of the genes to be plotted in a legend (character vector of length equal to the number of rows of the expression matrix 'x'); only suitable for biclusters containing a small number of genes |
no return value; a plot is drawn to the current device
Tatsiana Khamiakova
Function for plotting gene profiles for compounds within a constructed super-bicluster
plotSuper(x, data, BiclustSet)
x | a vector containing the indices of the biclusters to be joined to obtain the robust bicluster |
data | matrix, dataset, from which the bicluster results are obtained |
BiclustSet | a BiclustSet object containing bicluster output |
This function constructs a robust bicluster from a set of biclusters identified in a hierarchical tree and plots gene profiles for the columns in the robust bicluster. Each line represents a gene from the bicluster.
The bicluster is saved as a Biclust object, which can be plotted further with the functions available in the biclust package.
The number of biclusters used to generate the resulting robust bicluster is saved in the Call slot of the object.
This information shows how often the bicluster has been discovered under different parameter settings (e.g. initialization seeds).
Indices used as input can be obtained with the identify function or by cutting the tree.
biclust object containing bicluster and the information about bicluster subset used to generate it
Tatsiana Khamiakova
HCLtree, plotSuperAll, plotProfilesWithinBicluster
Function for plotting bicluster gene profiles for all samples in the data
plotSuperAll(x, data, BiclustSet)
x | a vector containing the indices of the biclusters to be joined to obtain the robust bicluster |
data | matrix, dataset, from which the bicluster results are obtained |
BiclustSet | a BiclustSet object containing bicluster output |
This function constructs a robust bicluster from the subset of biclusters specified in the x argument and plots the expression profiles.
biclust object
Tatsiana Khamiakova
HCLtree, plotProfilesAcrossAllSamples
Computes sensitivity matrix for biclusters in two bicluster sets
sensitivityMat(x, y, type=c("rows", "cols", "both"))
x | BiclustSet object containing row and column indicators of bicluster membership, number of biclusters |
y | BiclustSet object containing row and column indicators of bicluster membership, number of biclusters |
type | whether to compute sensitivity in two dimensions, row dimension or column dimension |
This function operates on BiclustSet objects and computes pairwise similarity based on the common elements between biclusters.
The sensitivity inclusion score of biclusters A and B is computed as $\mathrm{Sens}(A, B) = \frac{|A \cap B|}{|A|}$.
matrix of pairwise sensitivities
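Unlike the symmetric indices, an inclusion score depends on the order of the two biclusters. The base R sketch below assumes sensitivity is the share of the first bicluster that is covered by the second; the helper `sensitivity` and the example vectors are hypothetical.

```r
# Sensitivity (assumed form): |A n B| / |A|.
sensitivity <- function(a, b) sum(a & b) / sum(a)

a <- 1:30 %in% 1:10   # 10 rows
b <- 1:30 %in% 6:10   # 5 rows, all inside a

sensAB <- sensitivity(a, b)   # half of a is covered by b
sensBA <- sensitivity(b, a)   # all of b is covered by a
print(c(sensAB = sensAB, sensBA = sensBA))
```

Because the score is asymmetric, a sensitivity matrix is in general not symmetric either.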
Tatsiana Khamiakova
similarity, jaccardMat, ochiaiMat, kulczynskiMat, specificityMat, sorensenMat
Computes the similarity matrix for the biclustering output based on one of the pairwise similarity indices of biclusters in a given bicluster set
similarity(x, index = "jaccard", type="rows")
x | BiclustSet object containing row and column indicators of bicluster membership, number of biclusters |
index | similarity index for the biclusters in output |
type | whether to compute similarity in two dimensions, "both" (recommended for biclustering), row dimension, "rows" (default, requires less computation) or column dimension, "cols" |
This function operates on a BiclustSet object and computes pairwise similarity based on the common elements between biclusters.
The type argument controls whether the similarity index is constructed for all elements, or in one dimension (rows or columns) only.
In general, similarity indices for one dimension (row or column) are higher than for two dimensions.
Several options for similarity indices are offered: jaccard (default), kulczynski, sensitivity, specificity, sorensen and ochiai.
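The remark that one-dimensional indices tend to be higher than two-dimensional ones can be checked on a toy case in base R. The sketch assumes that two-dimensional overlap is counted over matrix cells and uses the Jaccard index; the two biclusters and the helper `jaccard` are invented for illustration.

```r
# Two hypothetical 10 x 10 biclusters in a 20 x 20 data matrix,
# overlapping in rows 6 to 10 and columns 6 to 10.
cellsA <- matrix(FALSE, 20, 20); cellsA[1:10, 1:10] <- TRUE
cellsB <- matrix(FALSE, 20, 20); cellsB[6:15, 6:15] <- TRUE

jaccard <- function(a, b) sum(a & b) / sum(a | b)

# One dimension: compare only the sets of rows.
rowsJ <- jaccard(rowSums(cellsA) > 0, rowSums(cellsB) > 0)
# Two dimensions: compare the sets of matrix cells.
bothJ <- jaccard(cellsA, cellsB)
print(c(rows = rowsJ, both = bothJ))
```

Here the row-wise index is 1/3 while the cell-wise index is 1/7, since two biclusters must share both rows and columns for a cell to count as common.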
a "similarity" object containing similarity matrix
Tatsiana Khamiakova
HCLtree, plotSuper, jaccardMat, kulczynskiMat, ochiaiMat, sensitivityMat, specificityMat, sorensenMat
# compute sensitivity for BiMAX biclusters
library(biclust)
library(superbiclust)
test <- matrix(rnorm(5000), 100, 50)
test[11:20,11:20] <- rnorm(100, 3, 0.1)
test[17:26,21:30] <- rnorm(100, 3, 0.1)
testBin <- binarize(test,2)
res <- biclust(x=testBin, method=BCBimax(), minr=4, minc=4, number=10)
BiMaxBiclustSet <- BiclustSet(res)
SensitivityMatr <- similarity(BiMaxBiclustSet,index="sensitivity", type="rows")
SensitivityMatr
Computes Sorensen similarity matrix for biclusters in two bicluster sets
sorensenMat(x, y, type=c("rows", "cols", "both"))
x | BiclustSet object containing row and column indicators of bicluster membership, number of biclusters |
y | BiclustSet object containing row and column indicators of bicluster membership, number of biclusters |
type | whether to compute Sorensen index in two dimensions, row dimension or column dimension |
This function operates on BiclustSet objects and computes pairwise similarity based on the common elements between biclusters.
The Sorensen similarity score for two biclusters A and B is computed as $S(A, B) = \frac{2|A \cap B|}{|A| + |B|}$.
matrix of pairwise Sorensen indices
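A short base R illustration of the Sorensen index follows, assuming the usual definition (twice the number of common elements divided by the sum of the two sizes). It also checks the known relation to the Jaccard index J, namely S = 2J / (1 + J). The helper `sorensen` and the example vectors are hypothetical.

```r
# Sorensen index: 2 * |A n B| / (|A| + |B|) (assumed form).
sorensen <- function(a, b) 2 * sum(a & b) / (sum(a) + sum(b))

a <- 1:30 %in% 11:20
b <- 1:30 %in% 17:26   # shares rows 17 to 20 with a

sorensenAB <- sorensen(a, b)
jaccardAB <- sum(a & b) / sum(a | b)
print(c(sorensen = sorensenAB, fromJaccard = 2 * jaccardAB / (1 + jaccardAB)))
```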
Tatsiana Khamiakova
similarity, jaccardMat, ochiaiMat, sensitivityMat, specificityMat, kulczynskiMat
Computes specificity matrix for biclusters in two bicluster sets
specificityMat(x, y, type=c("rows", "cols", "both"))
x | BiclustSet object containing row and column indicators of bicluster membership, number of biclusters |
y | BiclustSet object containing row and column indicators of bicluster membership, number of biclusters |
type | whether to compute specificity in two dimensions, row dimension or column dimension |
This function operates on BiclustSet objects and computes pairwise similarity based on the common elements between biclusters.
The specificity inclusion score of biclusters A and B is computed as $\mathrm{Spec}(A, B) = \frac{|A \cap B|}{|B|}$.
matrix of pairwise specificities
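The specificity score can be sketched in base R in the same spirit, assuming it is the share of the second bicluster that is covered by the first. Under that assumption it mirrors the sensitivity score with the arguments swapped. The helpers `specificity` and `sensitivity` and the example vectors are hypothetical.

```r
# Specificity (assumed form): |A n B| / |B|.
specificity <- function(a, b) sum(a & b) / sum(b)
# Sensitivity (assumed form): |A n B| / |A|, used here only for comparison.
sensitivity <- function(a, b) sum(a & b) / sum(a)

a <- 1:30 %in% 1:10   # 10 rows
b <- 1:30 %in% 6:10   # 5 rows, all inside a

specAB <- specificity(a, b)   # all of b lies within a
specBA <- specificity(b, a)   # only half of a lies within b
print(c(specAB = specAB, specBA = specBA, sensBA = sensitivity(b, a)))
```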
Tatsiana Khamiakova
similarity, jaccardMat, ochiaiMat, kulczynskiMat, sensitivityMat, sorensenMat