Last data update: 2014.03.03

R: Show significant clusters of mutations
lfm    R Documentation

Show significant clusters of mutations

Description

The method lfm (low frequency mutations) retrieves the original mutations that formed the significant clusters identified by the entropy calculation on the consensus.

Usage

lfm(object, metric='qvalue', threshold=.05, conservation=NULL)

Arguments

object

a LowMACA class object

metric

a character that defines whether to use 'pvalue' or 'qvalue' to select significant positions. Default: 'qvalue'

threshold

a numeric defining the threshold of significance for the defined metric. Default: 0.05

conservation

a numeric value in the range 0-1 that defines the minimum trident conservation score a position must have to be included. The default (NULL) inherits the value stored in the entropy slot, whose default is 0.1.
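As a minimal sketch of how these arguments combine, the call below selects positions by p-value with a stricter threshold instead of the default q-value at 0.05. It uses only the signature shown under Usage and the lmObj example dataset; the variable name strict_muts and the threshold 0.01 are illustrative choices, and the requireNamespace guard is added here only so the snippet is skipped when LowMACA is not installed.

```r
# Sketch: select significant positions by p-value with a stricter cutoff.
# 'strict_muts' and the 0.01 threshold are illustrative, not package defaults.
if (requireNamespace("LowMACA", quietly = TRUE)) {
  library(LowMACA)
  data(lmObj)
  lmObj <- entropy(lmObj)

  # conservation is left as NULL, so the value stored by entropy() is reused
  strict_muts <- lfm(lmObj, metric = "pvalue", threshold = 0.01)

  # Number of retrieved mutations per consensus position
  print(table(strict_muts[["Multiple_Aln_pos"]]))
}
```

Leaving conservation unset keeps the selection consistent with the conservation threshold already used by entropy.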

Details

After the alignment, all information about the original input sequences is lost. The consensus sequence is a product of the alignment and may not correspond to any real human protein. lfm goes back to the original dataset and retrieves the proteins and the actual positions of the mutations that we consider 'conserved'.
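The relation between original and consensus coordinates can be illustrated with a few rows copied from the example output under Results. The sketch below computes each mutation's offset inside its domain envelope; the column Domain_Offset is a hypothetical helper, not something lfm returns. For these rows the offset happens to equal Multiple_Aln_pos, but that should only be expected when no alignment gaps precede the position, which is why lfm reports the consensus position explicitly.

```r
# Rows taken from the example output (one per gene); Domain_Offset is a
# hypothetical helper column added for illustration.
muts <- data.frame(
  Gene_Symbol         = c("ALX4", "CDX4", "CUX1"),
  Amino_Acid_Position = c(218, 177, 1248),
  Envelope_Start      = c(215, 174, 1245),
  Envelope_End        = c(271, 230, 1301),
  Multiple_Aln_pos    = c(4, 4, 4),
  stringsAsFactors    = FALSE
)

# 1-based offset of the mutated residue inside the domain envelope
muts$Domain_Offset <- muts$Amino_Acid_Position - muts$Envelope_Start + 1

# Every mutation must fall inside its envelope
stopifnot(all(muts$Amino_Acid_Position >= muts$Envelope_Start &
              muts$Amino_Acid_Position <= muts$Envelope_End))

print(muts[, c("Gene_Symbol", "Amino_Acid_Position",
               "Domain_Offset", "Multiple_Aln_pos")])
```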

Value

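A data.frame with one row per retrieved mutation. The 13 columns documented here are listed below; the example output under Results additionally shows a metric column between Multiple_Aln_pos and Entrez: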

  1. Gene_Symbol: gene symbols of the mutations

  2. Amino_Acid_Position: amino acid positions relative to the original protein

  3. Amino_Acid_Change: amino acid changes in HGVS format

  4. Sample: barcode of the sample where the mutation was found

  5. Tumor_Type: tumor type of the sample

  6. Envelope_Start: start of the Pfam domain in the protein

  7. Envelope_End: end of the Pfam domain in the protein

  8. Multiple_Aln_pos: positions in the consensus

  9. Entrez: Entrez IDs of the mutated genes

  10. Entry: UniProt entry of the protein

  11. UNIPROT: UniProt entry names of the proteins (for example ALX4_HUMAN)

  12. Chromosome: cytobands of the genes

  13. Protein.name: extended protein names
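To make the Amino_Acid_Change column easier to work with, the sketch below splits values such as R218Q into reference residue, position and alternate residue. The values are copied from the example output; the regular expression is an assumption that covers only simple single-residue substitutions, and the column names Ref, Pos and Alt are illustrative.

```r
# Values copied from the example output under Results
changes   <- c("R218Q", "R177C", "R1248W")
positions <- c(218, 177, 1248)

# Assumed pattern: one reference residue, digits, one alternate residue (or *)
pattern <- "^([A-Z])([0-9]+)([A-Z*])$"
stopifnot(all(grepl(pattern, changes)))

parsed <- data.frame(
  Amino_Acid_Change = changes,
  Ref = sub(pattern, "\\1", changes),
  Pos = as.integer(sub(pattern, "\\2", changes)),
  Alt = sub(pattern, "\\3", changes),
  stringsAsFactors = FALSE
)

# The parsed position should agree with Amino_Acid_Position
stopifnot(all(parsed$Pos == positions))
print(parsed)
```

Other change types (insertions, deletions, frameshifts) would need a broader pattern.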

Author(s)

Stefano de Pretis, Giorgio Melloni

See Also

entropy

Examples

# Load the package and the homeobox example, then run the entropy method
library(LowMACA)
data(lmObj)
lmObj <- entropy(lmObj)
significant_muts <- lfm(lmObj)
# Display the original mutations that formed significant clusters (column Multiple_Aln_pos)
head(significant_muts)
# Position 4 has a qvalue < 0.05
# Which genes are mutated at position 4 of the consensus?
cluster_4_genes <- significant_muts[significant_muts[['Multiple_Aln_pos']] == 4, 'Gene_Symbol']
# Display the genes and their number of mutations at consensus position 4
sort(table(cluster_4_genes))
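The same kind of tabulation can be extended to other columns of the result. The sketch below cross-tabulates genes against tumor types for consensus position 4, using the six rows shown by head() in the Results section rather than the full lfm output, so the counts cover only those rows. The object names pos4, by_tumor and tumor_totals are illustrative.

```r
# Six rows copied from the head() output under Results (position 4 only)
pos4 <- data.frame(
  Gene_Symbol      = c("ALX4", "CDX4", "CDX4", "CDX4", "CUX1", "CUX1"),
  Tumor_Type       = c("coadread", "skcm", "ucec", "skcm", "skcm", "ucec"),
  Multiple_Aln_pos = 4,
  stringsAsFactors = FALSE
)

# Genes in rows, tumor types in columns
by_tumor <- table(pos4$Gene_Symbol, pos4$Tumor_Type)
print(by_tumor)

# Mutations per tumor type, most frequent first
tumor_totals <- sort(colSums(by_tumor), decreasing = TRUE)
print(tumor_totals)
```

On the real significant_muts data.frame, the same two calls would apply after subsetting on Multiple_Aln_pos.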

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(LowMACA)
Checking if clustalo is in the PATH...
Checking perl installation...
Checking perl modules XML::Simple and LWP...
Can't locate XML/Simple.pm in @INC (you may need to install the XML::Simple module) (@INC contains: /etc/perl /usr/local/lib/x86_64-linux-gnu/perl/5.22.1 /usr/local/share/perl/5.22.1 /usr/lib/x86_64-linux-gnu/perl5/5.22 /usr/share/perl5 /usr/lib/x86_64-linux-gnu/perl/5.22 /usr/share/perl/5.22 /usr/local/lib/site_perl /usr/lib/x86_64-linux-gnu/perl-base .).
BEGIN failed--compilation aborted.
Warning messages:
1: In .ClustalChecks(ClustalCommand = "clustalo") :
  Clustal Omega is not in the PATH:
You can either change clustalo command using lmParams function or use the web service. See ?setup
2: running command '/usr/bin/perl -MXML::Simple -e 1' had status 2 
3: In .PerlModuleChecks(stop = FALSE, perl = "perl") :
  XML::Simple module for perl is not installed. 
            If you don't want to install a local clustal omega and use the web service, XML::Simple is required

> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/LowMACA/lfm.Rd_%03d_medium.png", width=480, height=480)
> ### Name: lfm
> ### Title: Show significant clusters of mutations
> ### Aliases: lfm
> 
> ### ** Examples
> 
> #Load homeobox example and launch entropy method
> data(lmObj)
> lmObj <- entropy(lmObj)
Making uniform model...
Assigned bandwidth: 0
> significant_muts <- lfm(lmObj)
> #Display original mutations that formed significant clusters (column Multiple_Aln_pos)
> head(significant_muts)
  Gene_Symbol Amino_Acid_Position Amino_Acid_Change          Sample Tumor_Type
1        ALX4                 218             R218Q TCGA-AA-3949-01   coadread
2        CDX4                 177             R177C TCGA-D3-A2JO-06       skcm
3        CDX4                 177             R177C TCGA-AP-A0LM-01       ucec
4        CDX4                 177             R177C   MEL-Ma-Mel-85       skcm
5        CUX1                1248            R1248W TCGA-ER-A193-06       skcm
6        CUX1                1248            R1248W TCGA-BG-A18B-01       ucec
  Envelope_Start Envelope_End Multiple_Aln_pos       metric Entrez  Entry
1            215          271                4 1.721185e-11  60529 Q9H161
2            174          230                4 1.721185e-11   1046 O14627
3            174          230                4 1.721185e-11   1046 O14627
4            174          230                4 1.721185e-11   1046 O14627
5           1245         1301                4 1.721185e-11   1523 P39880
6           1245         1301                4 1.721185e-11   1523 P39880
     UNIPROT Chromosome                       Protein.name
1 ALX4_HUMAN    11p11.2 Homeobox protein aristaless-like 4
2 CDX4_HUMAN     Xq13.2             Homeobox protein CDX-4
3 CDX4_HUMAN     Xq13.2             Homeobox protein CDX-4
4 CDX4_HUMAN     Xq13.2             Homeobox protein CDX-4
5 CUX1_HUMAN     7q22.1        Homeobox protein cut-like 1
6 CUX1_HUMAN     7q22.1        Homeobox protein cut-like 1
> #Position 4 has a qvalue<0.05
> #What are the genes mutated in position 4 in the consensus?
> cluster_4_genes <- significant_muts[ significant_muts[['Multiple_Aln_pos']]==4 , 'Gene_Symbol']
> #Display the genes and their number of mutation in consensus position 4
> sort(table(cluster_4_genes))
cluster_4_genes
 ALX4  DBX2  EVX2  ISL1  LHX8  CUX1   HDX HOXA5 HOXD3  CDX4  DUXA HOXA1   ISX 
    1     1     1     1     1     2     2     2     2     3     4     4    15 
> dev.off()
null device 
          1 
>