R: SNP metadata used in calling for 1000 genomes pilot data
c6snp
R Documentation
SNP metadata used in calling for 1000 genomes pilot data
Description
SNP metadata used in calling for the 1000 Genomes pilot data, restricted to chromosome 6
Usage
data(c6snp)
Format
A data frame with 1143009 observations on the following 16 variables.
chr
a numeric vector
chrPosFrom
a numeric vector
chrPosTo
a numeric vector
rs
a factor with levels rs1000, rs1000009, rs1000025, rs10000302, ...
ChrAllele
a factor with levels -AAAAAAAAAAAAAAAAAAAAAAAAAAAA ...
variantAllele
a factor with levels (CA)11/12/13/14/-, (G)14/15/16/18/19/20/21/22/23/C, (G)20/21/22/23/24/25/27/-/G/GGG, (LAREDELETION)/-/A, ...
snpAlleleChrOrien
a factor with levels (A)1/13/15/G, (A)10/12, (A)10/14, (C)10/11, (CA)10/11/13/14/15, (CA)10/14/15/16/17/18/20/21, (CA)10/17/18/19/20/21/22/23/24/25, (CA)11/12/13/14/-/CACA, (CA)11/12/13/14/15/16/17, ...
snp2chrOrien
a numeric vector
snpClassAbbrev
a factor with levels Microsatellite, Named snp, dips, mixed, multi-base, single base
Column description:
==================
Col1: Chr
Col2: chrPosFrom: all chromosome positions are 1 based, that is the first base is counted as 1. Position is for each base, not "interbase".
Col3: chrPosTo:
Col4: rs
Col5: ChrAllele: the base or bases on the chromosome at the snp position or range.
Col6: variantAllele: This is the other allele, the one that is not on the chromosome. For example, if the snp is A/C and the chromosome has A, variantAllele will be "C".
Col7: snpAlleleChrOrien: This is the list of alleles for the snp in the chromosome orientation.
Col8: snp2chrOrien: the alignment orientation between the snp flank and the chromosome sequence. Values: 0 = same; 1 = opposite.
Col9: snpClassAbbrev: the variation type of the snp.
Details at: http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=helpsnpfaq&part=Reports#Reports.Variation_Class
Col10: snpClassCode: The numeric code for snp variation class.
Possible snpClassCode, snpClassAbbrev and descriptions are below:
1  single base     Only single base variation. ex. A/G.
2  dips            indel or dips: deletion insertion polymorphism. ex. -/T. ex. ss149071 obs=AA/GCCTG
3  HETEROZYGOUS    HETEROZYGOUS
4  Microsatellite  Microsatellite
5  Named snp       Observed field starts with '(', and is not class 3 or 4. ex. (Alu)
6  NOVARIATION     NOVARIATION
7  mixed           The subsnps in an rs cluster have different snp classes.
8  multi-base      Multiple Nucleotide Polymorphism, where all alleles are the same length, and length > 1. ex. ss2421179 AT/GA
Col11: mapLocType: The alignment type at the snp site. Possible values and meanings are:
1 Insertion on the contig: the snp is always represented as one base, and this one base in the snp sequence is substituted with more than one base on the contig sequence in the alignment.
2 Contig allele is one base long: the snp is always represented as one base, and this one base in the snp sequence is substituted with exactly one base on the contig.
3 Deletion on the contig: part of the snp flanking sequence, including the snp, is absent from the contig sequence in the alignment.
4 In the alignment, part of the snp flanking sequence, including the snp, is replaced with contig sequence of a longer length.
5 In the alignment, part of the snp flanking sequence, including the snp, is replaced with contig sequence of exactly the same length.
6 In the alignment, part of the snp flanking sequence, including the snp, is replaced with contig sequence of a shorter length.
Col12: mapLocCnt: the total number of locations the snp maps to within the assembly.
Col13: mapWeight:
A number that codes for the mapping quality of the snp on each assembly:
1 = snp aligns at exactly one location
2 = snp aligns at two loci on the same chromosome
3 = snp aligns at two loci on different chromosomes, or at more than 3 and fewer than 10 locations
10 = snp aligns at 10 or more locations
Col14: contigLabel: This shows when a snp maps to an alternative haplotype or to the PAR. Possible values are:
DR53
PAR
c22_H2
c5_H2
c6_COX
c6_QBL
mitochondrial genome
reference
Col15: unPlacedContig: This field only has a value when a snp hits an unplaced contig. In that case there are no chromosome positions for the snp:
chrPosFrom and chrPosTo will be NULL, and unPlacedContig will hold the accession of the contig that is unplaced on a chromosome.
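The numeric codes described above can be turned into readable labels with named lookup vectors. The following is a minimal sketch, not part of the package: the helper `decode` and the data frame `toy` are hypothetical names, the values in `toy` are invented for illustration, and the label text is transcribed from the column description.

```r
# Lookup tables transcribed from the column description (labels shortened).
snp_class <- c("1" = "single base", "2" = "dips", "3" = "HETEROZYGOUS",
               "4" = "Microsatellite", "5" = "Named snp", "6" = "NOVARIATION",
               "7" = "mixed", "8" = "multi-base")
map_weight <- c("1"  = "one location",
                "2"  = "two loci, same chromosome",
                "3"  = "two loci on different chromosomes, or >3 and <10 locations",
                "10" = "10 or more locations")

# Hypothetical helper: index by name (character) rather than by position,
# so that code 10 is looked up as "10" and unknown codes come back as NA.
decode <- function(codes, lookup) unname(lookup[as.character(codes)])

# Invented values, only to exercise the helper.
toy <- data.frame(snpClassCode = c(1, 2, 8), mapWeight = c(1, 3, 10))
toy$classLabel  <- decode(toy$snpClassCode, snp_class)
toy$weightLabel <- decode(toy$mapWeight, map_weight)
print(toy)
```

Indexing by character name is a deliberate choice here: mapWeight skips from 3 to 10, so positional indexing would pick the wrong entry or fall off the end of the vector.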
Source
1000 genomes pilot data
Examples
data(c6snp)
c6snp[1:3,]
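# A further sketch, under stated assumptions: it keeps single-base SNPs that
# align at exactly one location on the reference contig, then derives 0-based,
# half-open coordinates from the 1-based chrPosFrom/chrPosTo. The data frame
# `toy` and all of its values are invented so the code runs without the
# package; it only borrows column names from c6snp.

```r
toy <- data.frame(
  chr          = c(6, 6, 6),
  chrPosFrom   = c(100, 250, 400),        # invented 1-based positions
  chrPosTo     = c(100, 250, 403),
  rs           = c("rsA", "rsB", "rsC"),  # placeholder identifiers
  snpClassCode = c(1, 1, 2),
  mapWeight    = c(1, 1, 3),
  contigLabel  = c("reference", "reference", "c6_COX"),
  stringsAsFactors = FALSE
)

# snpClassCode 1 = single base; mapWeight 1 = aligns at exactly one location.
keep <- with(toy, snpClassCode == 1 & mapWeight == 1 & contigLabel == "reference")
sel <- toy[keep, ]

# 1-based inclusive [from, to] becomes 0-based half-open [from - 1, to).
sel$start0 <- sel$chrPosFrom - 1
sel$end0   <- sel$chrPosTo
sel$width  <- sel$chrPosTo - sel$chrPosFrom + 1
print(sel)
```

# With the package loaded, the same filter could presumably be applied to
# c6snp itself in place of `toy`, assuming its columns are named as in the
# printed output.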
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
> library(ind1KG)
Loading required package: chopsticks
Loading required package: survival
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/ind1KG/c6snp.Rd_%03d_medium.png", width=480, height=480)
> ### Name: c6snp
> ### Title: SNP metadata used in calling for 1000 genomes pilot data
> ### Aliases: c6snp
> ### Keywords: datasets
>
> ### ** Examples
>
> data(c6snp)
> c6snp[1:3,]
  chr chrPosFrom chrPosTo        rs ChrAllele variantAllele snpAlleleChrOrien
1   6       5238     5238 rs3915767         C             T               C/T
2   6       5597     5597 rs2854679         A             G               A/G
3   6       5658     5658 rs1419824         C             T               C/T
  snp2chrOrien snpClassAbbrev snpClassCode mapLocType mapLocCnt mapWeight
1            0    single base            1          2         4         3
2            1    single base            1          2         4         3
3            0    single base            1          2         4         3
  contigLabel unPlacedContig
1   reference
2   reference
3   reference
> dev.off()
null device
1
>