The method lfm (low frequency mutations) retrieves
the original mutations that generated the significant clusters
calculated by the entropy method on the consensus
metric
a character that defines whether to use 'pvalue' or
'qvalue' to select significant positions. Default: 'qvalue'
threshold
a numeric defining the threshold of significance for the defined metric. Default: 0.05
conservation
a numeric value in the range 0-1 that defines
the threshold on the trident conservation score used to include a position.
The default value is inherited from the slot entropy, whose default is 0.1
Details
After the alignment, all information about the original sequences used as input is lost.
The consensus sequence is in fact an alignment, and it may not correspond to any real human protein.
lfm goes back to the original dataset and retrieves the proteins and the actual positions
of the mutations that we consider 'conserved'.
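The idea described above can be pictured with a small base R sketch. It is not the LowMACA implementation: the two toy tables (mutations, position_stats), their values and the helper keep_significant are hypothetical, and only the column names Gene_Symbol, Amino_Acid_Position and Multiple_Aln_pos are borrowed from the value returned by lfm. The sketch assumes that a position is selected when its chosen metric is below the threshold, and it ignores the conservation filter.

```r
# Toy per-mutation table: each row keeps its original protein position
# and the consensus (alignment) position it maps to. Values are made up.
mutations <- data.frame(
  Gene_Symbol         = c("GENE_A", "GENE_B", "GENE_B", "GENE_C"),
  Amino_Acid_Position = c(120L, 45L, 60L, 300L),
  Multiple_Aln_pos    = c(4L, 4L, 19L, 30L),
  stringsAsFactors    = FALSE
)

# Toy per-position statistics on the consensus (hypothetical values).
position_stats <- data.frame(
  Multiple_Aln_pos = c(4L, 19L, 30L),
  pvalue           = c(1e-6, 0.20, 0.03),
  qvalue           = c(3e-6, 0.20, 0.045)
)

# Hypothetical helper: keep the mutations that fall on consensus positions
# whose chosen metric is below the threshold.
keep_significant <- function(mutations, position_stats,
                             metric = c("qvalue", "pvalue"),
                             threshold = 0.05) {
  metric <- match.arg(metric)
  signif_pos <- position_stats[position_stats[[metric]] < threshold, ]
  out <- merge(mutations,
               signif_pos[, c("Multiple_Aln_pos", metric)],
               by = "Multiple_Aln_pos")
  # Store the selected statistic under a single column name
  names(out)[names(out) == metric] <- "metric"
  out[order(out$Multiple_Aln_pos, out$Gene_Symbol), ]
}

significant <- keep_significant(mutations, position_stats)
print(significant)
```

Each surviving row still carries the gene and the position in the original protein, which is the information that the consensus alone no longer provides.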
Value
A data.frame with 14 columns corresponding to the mutations retrieved:
Gene_Symbol gene symbols of the mutations
Amino_Acid_Position amino acid positions relative to the original protein
Amino_Acid_Change amino acid changes in HGVS format
Sample barcode of the sample where the mutation was found
Tumor_Type tumor type of the sample
Envelope_Start start of the Pfam domain in the protein
Envelope_End end of the Pfam domain in the protein
Multiple_Aln_pos positions in the consensus
metric value of the selected metric (p-value or q-value) for the consensus position
Entrez Entrez ids of the mutated genes
Entry UniProt entry of the protein
UNIPROT UniProt entry name of the protein (e.g. ALX4_HUMAN)
Chromosome cytobands of the genes
Protein.name extended protein names
Author(s)
Stefano de Pretis, Giorgio Melloni
See Also
entropy
Examples
#Load homeobox example and launch entropy method
data(lmObj)
lmObj <- entropy(lmObj)
significant_muts <- lfm(lmObj)
#Display original mutations that formed significant clusters (column Multiple_Aln_pos)
head(significant_muts)
#Position 4 has a qvalue<0.05
#What are the genes mutated in position 4 in the consensus?
cluster_4_genes <- significant_muts[ significant_muts[['Multiple_Aln_pos']]==4 , 'Gene_Symbol']
#Display the genes and their number of mutations in consensus position 4
sort(table(cluster_4_genes))
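The same kind of summary can be broken down by tumor type or by amino acid change. The following base R sketch is only an illustration: instead of calling lfm, it rebuilds a small data.frame (toy_muts, a hypothetical name) from the first six rows printed by head(significant_muts) in the Results section, so the counts cover those six rows only and not the full result.

```r
# Six rows copied from the head(significant_muts) output; toy_muts is a
# hypothetical stand-in for the full data.frame returned by lfm.
toy_muts <- data.frame(
  Gene_Symbol       = c("ALX4", "CDX4", "CDX4", "CDX4", "CUX1", "CUX1"),
  Amino_Acid_Change = c("R218Q", "R177C", "R177C", "R177C", "R1248W", "R1248W"),
  Tumor_Type        = c("coadread", "skcm", "ucec", "skcm", "skcm", "ucec"),
  Multiple_Aln_pos  = rep(4L, 6),
  stringsAsFactors  = FALSE
)

# Keep the mutations that map to consensus position 4
pos4 <- toy_muts[toy_muts[["Multiple_Aln_pos"]] == 4, ]

# Cross-tabulate genes against tumor types
gene_by_tumor <- table(pos4$Gene_Symbol, pos4$Tumor_Type)
print(gene_by_tumor)

# Rank amino acid changes by how often they recur
recurrent <- sort(table(pos4$Amino_Acid_Change), decreasing = TRUE)
print(recurrent)
```

On the real output of lfm the same two calls to table can be applied unchanged, replacing toy_muts with significant_muts.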
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
> library(LowMACA)
Checking if clustalo is in the PATH...
Checking perl installation...
Checking perl modules XML::Simple and LWP...
Can't locate XML/Simple.pm in @INC (you may need to install the XML::Simple module) (@INC contains: /etc/perl /usr/local/lib/x86_64-linux-gnu/perl/5.22.1 /usr/local/share/perl/5.22.1 /usr/lib/x86_64-linux-gnu/perl5/5.22 /usr/share/perl5 /usr/lib/x86_64-linux-gnu/perl/5.22 /usr/share/perl/5.22 /usr/local/lib/site_perl /usr/lib/x86_64-linux-gnu/perl-base .).
BEGIN failed--compilation aborted.
Warning messages:
1: In .ClustalChecks(ClustalCommand = "clustalo") :
Clustal Omega is not in the PATH:
You can either change clustalo command using lmParams function or use the web service. See ?setup
2: running command '/usr/bin/perl -MXML::Simple -e 1' had status 2
3: In .PerlModuleChecks(stop = FALSE, perl = "perl") :
XML::Simple module for perl is not installed.
If you don't want to install a local clustal omega and use the web service, XML::Simple is required
> ### Name: lfm
> ### Title: Show significant clusters of mutations
> ### Aliases: lfm
>
> ### ** Examples
>
> #Load homeobox example and launch entropy method
> data(lmObj)
> lmObj <- entropy(lmObj)
Making uniform model...
Assigned bandwidth: 0
> significant_muts <- lfm(lmObj)
> #Display original mutations that formed significant clusters (column Multiple_Aln_pos)
> head(significant_muts)
  Gene_Symbol Amino_Acid_Position Amino_Acid_Change          Sample Tumor_Type
1        ALX4                 218             R218Q TCGA-AA-3949-01   coadread
2        CDX4                 177             R177C TCGA-D3-A2JO-06       skcm
3        CDX4                 177             R177C TCGA-AP-A0LM-01       ucec
4        CDX4                 177             R177C   MEL-Ma-Mel-85       skcm
5        CUX1                1248            R1248W TCGA-ER-A193-06       skcm
6        CUX1                1248            R1248W TCGA-BG-A18B-01       ucec
  Envelope_Start Envelope_End Multiple_Aln_pos       metric Entrez  Entry
1            215          271                4 1.721185e-11  60529 Q9H161
2            174          230                4 1.721185e-11   1046 O14627
3            174          230                4 1.721185e-11   1046 O14627
4            174          230                4 1.721185e-11   1046 O14627
5           1245         1301                4 1.721185e-11   1523 P39880
6           1245         1301                4 1.721185e-11   1523 P39880
     UNIPROT Chromosome                       Protein.name
1 ALX4_HUMAN    11p11.2 Homeobox protein aristaless-like 4
2 CDX4_HUMAN     Xq13.2             Homeobox protein CDX-4
3 CDX4_HUMAN     Xq13.2             Homeobox protein CDX-4
4 CDX4_HUMAN     Xq13.2             Homeobox protein CDX-4
5 CUX1_HUMAN     7q22.1        Homeobox protein cut-like 1
6 CUX1_HUMAN     7q22.1        Homeobox protein cut-like 1
> #Position 4 has a qvalue<0.05
> #What are the genes mutated in position 4 in the consensus?
> cluster_4_genes <- significant_muts[ significant_muts[['Multiple_Aln_pos']]==4 , 'Gene_Symbol']
> #Display the genes and their number of mutation in consensus position 4
> sort(table(cluster_4_genes))
cluster_4_genes
 ALX4  DBX2  EVX2  ISL1  LHX8  CUX1   HDX HOXA5 HOXD3  CDX4  DUXA HOXA1   ISX
    1     1     1     1     1     2     2     2     2     3     4     4    15