R: Show significant clusters of mutations
lfm    R Documentation

Show significant clusters of mutations


The method lfm (low frequency mutations) retrieves the original mutations that generated the significant clusters identified by the entropy calculation on the consensus.


lfm(object , metric='qvalue', threshold=.05, conservation=NULL)



object: a LowMACA class object


metric: a character string that defines whether to use 'pvalue' or 'qvalue' to select significant positions. Default: 'qvalue'


threshold: a numeric value defining the significance threshold for the chosen metric. Default: 0.05


conservation: a numeric value in the range 0-1 that defines the threshold of trident conservation score a position must reach to be included. The default value is inherited from the slot entropy, whose default is 0.1
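
To illustrate how metric and threshold interact, the following base R sketch filters a made-up table of per-position p-values. The table, the helper name select_positions, the strict '<' comparison, and the use of Benjamini-Hochberg correction to obtain q-values are all assumptions made for illustration; they are not taken from the LowMACA source.

```r
# Toy per-position significance table (values invented for illustration).
positions <- data.frame(
  Multiple_Aln_pos = 1:6,
  pvalue = c(0.20, 0.04, 0.50, 0.0001, 0.03, 0.90)
)
# Assumption: q-values are Benjamini-Hochberg corrected p-values.
positions$qvalue <- p.adjust(positions$pvalue, method = "BH")

# Hypothetical helper that mimics the 'metric' and 'threshold' arguments.
select_positions <- function(tab, metric = c("qvalue", "pvalue"),
                             threshold = 0.05) {
  metric <- match.arg(metric)
  tab[tab[[metric]] < threshold, "Multiple_Aln_pos"]
}

sig_q <- select_positions(positions, metric = "qvalue")  # position 4
sig_p <- select_positions(positions, metric = "pvalue")  # positions 2, 4, 5
```

In this toy table the uncorrected p-value selects three positions, while the corrected q-value keeps only the strongest one, which is why 'qvalue' is the more conservative choice.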


After the alignment, all information about the original sequences used as input is lost. The consensus sequence is an artifact of the alignment and may not correspond to any real human protein. lfm lets you go back to the original dataset and retrieve the proteins and the real positions of the mutations that fall on the positions considered 'conserved'.
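
The idea of going from a consensus column back to a residue in the original protein can be sketched in base R as follows. The helper aln_to_protein_pos and the toy aligned sequences are hypothetical; the only assumption is that a protein position equals the domain start plus the number of non-gap residues up to the alignment column, which agrees with the example output (ALX4: Envelope_Start 215, consensus position 4, protein position 218). The actual LowMACA implementation may differ.

```r
# Hypothetical helper: map a column of the multiple alignment back to the
# residue number in the original protein.
#   aligned_seq    domain sequence as it appears in the alignment ('-' = gap)
#   envelope_start position of the first domain residue in the full protein
#   aln_pos        column of the alignment (consensus position)
aln_to_protein_pos <- function(aligned_seq, envelope_start, aln_pos) {
  residues <- strsplit(aligned_seq, "")[[1]]
  # A gap in this column means the protein has no residue here
  if (residues[aln_pos] == "-") return(NA_integer_)
  # Count the real residues up to and including the alignment column
  n_res <- sum(residues[seq_len(aln_pos)] != "-")
  as.integer(envelope_start + n_res - 1)
}

# Invented sequences; only the envelope starts come from the example output.
pos_no_gap   <- aln_to_protein_pos("ACDEFGH", 215, 4)  # 218
pos_with_gap <- aln_to_protein_pos("A-DEFGH", 174, 4)  # 176
pos_on_gap   <- aln_to_protein_pos("A-DEFGH", 174, 2)  # NA
```

The gapped case shows why the same consensus position can correspond to different offsets within the domain of each protein.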


A data.frame with 13 columns corresponding to the mutations retrieved:

  1. Gene_Symbol gene symbols of the mutations

  2. Amino_Acid_Position amino acid positions relative to the original protein

  3. Amino_Acid_Change amino acid changes in HGVS format

  4. Sample Sample barcode where the mutation was found

  5. Tumor_Type Tumor type of the Sample

  6. Envelope_Start start of the Pfam domain in the protein

  7. Envelope_End end of the Pfam domain in the protein

  8. Multiple_Aln_pos positions in the consensus

  9. Entrez Entrez IDs of the mutated genes

  10. Entry UniProt entry of the protein

  11. UNIPROT other protein names in UniProt

  12. Chromosome cytobands of the genes

  13. extended protein names
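
Because the result is an ordinary data.frame, it can be summarized with base R. The sketch below builds a toy data.frame containing a subset of the columns listed above, with rows copied from the first lines of the example output, and counts mutations per gene and per tumor type. The object name muts is illustrative; in real use it would be replaced by the value returned by lfm.

```r
# Toy data.frame with a subset of the documented columns; rows are copied
# from the first six lines of the example output.
muts <- data.frame(
  Gene_Symbol       = c("ALX4", "CDX4", "CDX4", "CDX4", "CUX1", "CUX1"),
  Amino_Acid_Change = c("R218Q", "R177C", "R177C", "R177C", "R1248W", "R1248W"),
  Tumor_Type        = c("coadread", "skcm", "ucec", "skcm", "skcm", "ucec"),
  Multiple_Aln_pos  = c(4, 4, 4, 4, 4, 4),
  stringsAsFactors  = FALSE
)

# Number of mutations per gene at each consensus position
per_gene <- table(muts$Gene_Symbol, muts$Multiple_Aln_pos)

# Tumor types contributing mutations to each gene
per_tumor <- table(muts$Gene_Symbol, muts$Tumor_Type)
```

Cross-tabulating Gene_Symbol against Multiple_Aln_pos shows which genes drive each significant consensus position.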


Stefano de Pretis, Giorgio Melloni

See Also



#Load homeobox example and launch entropy method
library(LowMACA)
data(lmObj)
lmObj <- entropy(lmObj)
significant_muts <- lfm(lmObj)
#Display original mutations that formed significant clusters (column Multiple_Aln_pos)
head(significant_muts)
#Position 4 has a qvalue<0.05
#What are the genes mutated in position 4 in the consensus?
cluster_4_genes <- significant_muts[ significant_muts[['Multiple_Aln_pos']]==4 , 'Gene_Symbol']
#Display the genes and their number of mutations in consensus position 4
sort(table(cluster_4_genes))


> #Load homeobox example and launch entropy method
> data(lmObj)
> lmObj <- entropy(lmObj)
Making uniform model...
Assigned bandwidth: 0
> significant_muts <- lfm(lmObj)
> #Display original mutations that formed significant clusters (column Multiple_Aln_pos)
> head(significant_muts)
  Gene_Symbol Amino_Acid_Position Amino_Acid_Change          Sample Tumor_Type
1        ALX4                 218             R218Q TCGA-AA-3949-01   coadread
2        CDX4                 177             R177C TCGA-D3-A2JO-06       skcm
3        CDX4                 177             R177C TCGA-AP-A0LM-01       ucec
4        CDX4                 177             R177C   MEL-Ma-Mel-85       skcm
5        CUX1                1248            R1248W TCGA-ER-A193-06       skcm
6        CUX1                1248            R1248W TCGA-BG-A18B-01       ucec
  Envelope_Start Envelope_End Multiple_Aln_pos       metric Entrez  Entry
1            215          271                4 1.721185e-11  60529 Q9H161
2            174          230                4 1.721185e-11   1046 O14627
3            174          230                4 1.721185e-11   1046 O14627
4            174          230                4 1.721185e-11   1046 O14627
5           1245         1301                4 1.721185e-11   1523 P39880
6           1245         1301                4 1.721185e-11   1523 P39880
     UNIPROT Chromosome             
1 ALX4_HUMAN    11p11.2 Homeobox protein aristaless-like 4
2 CDX4_HUMAN     Xq13.2             Homeobox protein CDX-4
3 CDX4_HUMAN     Xq13.2             Homeobox protein CDX-4
4 CDX4_HUMAN     Xq13.2             Homeobox protein CDX-4
5 CUX1_HUMAN     7q22.1        Homeobox protein cut-like 1
6 CUX1_HUMAN     7q22.1        Homeobox protein cut-like 1
> #Position 4 has a qvalue<0.05
> #What are the genes mutated in position 4 in the consensus?
> cluster_4_genes <- significant_muts[ significant_muts[['Multiple_Aln_pos']]==4 , 'Gene_Symbol']
> #Display the genes and their number of mutation in consensus position 4
> sort(table(cluster_4_genes))
    1     1     1     1     1     2     2     2     2     3     4     4    15 
