lfmSingleSequence
R Documentation
Show significant clusters of mutations of every
gene in a LowMACA object without alignment
Description
The method lfmSingleSequence
(low frequency mutations in Single Sequence) launches the lfm method
on every gene or domain inside a LowMACA object without aligning the sequences.
Arguments
metric
a character that defines whether to use 'pvalue' or
'qvalue' to select significant positions. Default: 'qvalue'
threshold
a numeric element between 0 and 1 defining the threshold
of significance for the defined metric. Default: 0.05
conservation
a numeric value in the range of 0-1 that defines
the threshold of trident conservation score
to include the specified position. Default: 0.1
BPPARAM
An object of class BiocParallelParam specifying parameters related to
the parallel execution of some of the tasks and calculations within this function.
See function bpparam() from the BiocParallel package.
mail
If not NULL, it must be a valid email address, which is used to access the EBI clustalo web service.
The default (NULL) uses a local clustalo installation.
perlCommand
a character string containing the path to the Perl executable.
If missing, "perl" is used as default. Only used in web mode.
verbose
logical. Whether to print verbose output.
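As a rough illustration of how these arguments fit together, the call below spells out each default listed above. The argument name `metric` and the choice of `BiocParallel::SerialParam()` as backend are assumptions not stated on this page. The guard skips the call when LowMACA, BiocParallel, or a local clustalo is unavailable.

```r
# Sketch only: the argument name `metric` and the SerialParam backend are
# assumptions; the other argument names and defaults are taken from this page.
can_run <- requireNamespace("LowMACA", quietly = TRUE) &&
  requireNamespace("BiocParallel", quietly = TRUE) &&
  nzchar(Sys.which("clustalo"))   # a local Clustal Omega must be on the PATH

if (can_run) {
  data("lmObj", package = "LowMACA")
  significant_muts <- LowMACA::lfmSingleSequence(
    lmObj,
    metric       = "qvalue",   # or "pvalue"
    threshold    = 0.05,       # significance cutoff for the chosen metric
    conservation = 0.1,        # minimum trident conservation score
    BPPARAM      = BiocParallel::SerialParam(),
    mail         = NULL,       # NULL: use the local clustalo installation
    perlCommand  = "perl",     # only used in web mode
    verbose      = FALSE
  )
}
```

Raising `threshold` or lowering `conservation` should make the selection more permissive, at the cost of more positions to review.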
Details
This function completes a LowMACA analysis by analyzing
every gene or domain in the LowMACA object as if a 'single sequence' analysis had been started
in the first place. The result is a data.frame showing all the significant positions
of every gene. If you have a LowMACA object composed of 100 genes,
it will launch 100 LowMACA single-gene analyses and aggregate
the results of every lfm launched on these 100 objects.
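The "one analysis per gene, then aggregate" pattern described here can be pictured with the conceptual sketch below. It does not reproduce LowMACA internals: `run_single_gene` and the gene symbols are hypothetical stand-ins introduced only for illustration.

```r
# Conceptual sketch of "run one analysis per gene, then stack the results".
# `run_single_gene` is a hypothetical stand-in, NOT a LowMACA function.
run_single_gene <- function(gene) {
  # Pretend each gene yields exactly one significant position.
  data.frame(
    Gene_Symbol = gene,
    Amino_Acid_Position = 10L,
    stringsAsFactors = FALSE
  )
}

genes <- c("GENE_A", "GENE_B", "GENE_C")   # made-up gene symbols

# One analysis per gene, then a single data.frame with all the rows.
per_gene <- lapply(genes, run_single_gene)
aggregated <- do.call(rbind, per_gene)
```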
The output looks very similar to that of lfm, but in this case the
column Multiple_Aln_pos has a different meaning. While in lfm it shows
where the mutation falls in the consensus sequence, here it refers to the position in
the consensus within the gene. If the original LowMACA object had mode equal to 'gene', the column
Multiple_Aln_pos will always be equal to Amino_Acid_Position. If mode is 'pfam', the two are also equal unless
a gene harbors more than one domain of the same type within its sequence. In that case, an internal alignment
of every domain inside the protein is performed.
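One way to spot the repeated-domain case is to compare the two columns directly. The sketch below uses a mock data.frame with made-up values; only the column names are taken from this page.

```r
# Mock result with made-up values; only the column names come from this page.
mock_res <- data.frame(
  Gene_Symbol = c("GENE_A", "GENE_A", "GENE_B"),
  Amino_Acid_Position = c(12L, 45L, 45L),
  Multiple_Aln_pos = c(12L, 45L, 7L),
  stringsAsFactors = FALSE
)

# Rows where the position in the within-gene consensus differs from the
# protein coordinate, as expected when a gene carries repeated domains.
differs <- mock_res$Multiple_Aln_pos != mock_res$Amino_Acid_Position
mock_res[differs, ]
```

On a result obtained in 'gene' mode, `any(differs)` would be expected to be FALSE.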
Value
A data.frame with the following columns, corresponding to the mutations retrieved:
Gene_Symbol          gene symbols of the analyzed genes
Amino_Acid_Position  amino acid positions relative to the original protein
Amino_Acid_Change    amino acid changes in HGVS format
Sample               sample barcode where the mutation was found
Tumor_Type           tumor type of the sample
Envelope_Start       start of the Pfam domain in the protein
Envelope_End         end of the Pfam domain in the protein
Multiple_Aln_pos     positions in the consensus relative to the sequence analyzed. See the Details section
Entrez               Entrez IDs of the mutated genes
Entry                UniProt entry of the protein
UNIPROT              other protein names for UniProt
Chromosome           cytobands of the genes
Protein.name         extended protein names
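A returned table of this shape can be summarized with base R. The sketch below works on a mock data.frame with made-up values and assumes one row per reported mutation; only the column names come from this page.

```r
# Mock output with made-up values; only the column names come from this page.
mock_hits <- data.frame(
  Gene_Symbol = c("GENE_A", "GENE_A", "GENE_A", "GENE_B"),
  Amino_Acid_Position = c(12L, 12L, 45L, 7L),
  Sample = c("S1", "S2", "S3", "S1"),
  stringsAsFactors = FALSE
)

# Number of reported mutations per gene (assuming one row per mutation).
muts_per_gene <- table(mock_hits$Gene_Symbol)

# Number of distinct significant positions per gene.
pos_per_gene <- tapply(
  mock_hits$Amino_Acid_Position,
  mock_hits$Gene_Symbol,
  function(x) length(unique(x))
)
```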
Author(s)
Stefano de Pretis, Giorgio Melloni
See Also
lfm
Examples
# Attach the package
library(LowMACA)
# Load homeobox example
data(lmObj)
# Run lfmSingleSequence (needs a local clustalo or the web service)
significant_muts <- lfmSingleSequence(lmObj)
# Show the result
head(significant_muts)
# Show all the genes that harbor significant mutations without the alignment
unique(significant_muts$Gene_Symbol)
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Platform: x86_64-pc-linux-gnu (64-bit)
> library(LowMACA)
Checking if clustalo is in the PATH...
Checking perl installation...
Checking perl modules XML::Simple and LWP...
Can't locate XML/Simple.pm in @INC (you may need to install the XML::Simple module) (@INC contains: /etc/perl /usr/local/lib/x86_64-linux-gnu/perl/5.22.1 /usr/local/share/perl/5.22.1 /usr/lib/x86_64-linux-gnu/perl5/5.22 /usr/share/perl5 /usr/lib/x86_64-linux-gnu/perl/5.22 /usr/share/perl/5.22 /usr/local/lib/site_perl /usr/lib/x86_64-linux-gnu/perl-base .).
BEGIN failed--compilation aborted.
Warning messages:
1: In .ClustalChecks(ClustalCommand = "clustalo") :
Clustal Omega is not in the PATH:
You can either change clustalo command using lmParams function or use the web service. See ?setup
2: running command '/usr/bin/perl -MXML::Simple -e 1' had status 2
3: In .PerlModuleChecks(stop = FALSE, perl = "perl") :
XML::Simple module for perl is not installed.
If you don't want to install a local clustal omega and use the web service, XML::Simple is required
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/LowMACA/lfmSingleSequence.Rd_%03d_medium.png", width=480, height=480)
> ### Name: lfmSingleSequence
> ### Title: Show significant clusters of mutations of every gene in a
> ### LowMACA object without alignment
> ### Aliases: lfmSingleSequence
>
> ### ** Examples
>
> #Load homeobox example
> data(lmObj)
> #Run lfmSingleSequence
> significant_muts <- lfmSingleSequence(lmObj)
Warning in mapMutations(object) :
We excluded these genes (or domains) because they have less than 1 mutations
NULL
Error in .clustalOAlign(genesData, clustal_cmd, clustalo_filename, mail, :
Clustal Omega command not found. clustalo is not in your PATH or it was not installed
Calls: lfmSingleSequence ... tryCatch -> tryCatchList -> tryCatchOne -> <Anonymous>
In addition: There were 13 warnings (use warnings() to see them)
Execution halted
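The run above stops because clustalo is not on the PATH. A minimal pre-flight check, sketched below with base R only, can detect this before lfmSingleSequence is called. The messages summarize the options named in the warnings above, and the variable names are illustrative.

```r
# Check for a local Clustal Omega binary before calling lfmSingleSequence.
clustalo_path <- Sys.which("clustalo")   # "" when not found on the PATH
has_local_clustalo <- unname(nzchar(clustalo_path))

if (has_local_clustalo) {
  message("Local clustalo found at: ", clustalo_path)
} else {
  message(
    "clustalo is not on the PATH: install it, or pass a valid email ",
    "address via the `mail` argument to use the EBI web service ",
    "(which also needs the Perl modules XML::Simple and LWP)."
  )
}
```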