The original data to be used. It is suggested to use input similar
to that of InterVA4, with the first column being the death ID. The only
difference in input is that InsilicoVA takes three levels: “present”,
“absent”, and “missing (no data)”. As in the InterVA software,
“present” symptoms take the value “Y”; “absent” symptoms take the value
“NA” or “”. Missing symptoms, e.g., questions not asked or not answered
in the original interview, corrupted data, etc., should be coded
as “.” to distinguish them from the “absent” category. The order of the columns does
not matter as long as the column names are correct. Currently the function cannot handle other
non-symptom columns such as subpopulation. Everything other than
the death ID, physician ID, and physician code columns should be
a symptom.
phy.id
vector of column names for physician ID
phy.code
vector of column names for physician code
phylist
vector of physician IDs used in the physician ID columns
causelist
vector of causes used in physician code columns
tol
tolerance of the EM algorithm
max.itr
maximum number of iterations to run
verbose
logical indicator for printing out likelihood change
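The three-level symptom coding described for the input data can be sketched in base R as follows. This is only an illustration under stated assumptions: the data frame `raw`, its columns `fever` and `cough`, and the helper `recode_symptom` are hypothetical and not part of InSilicoVA, and the raw data are assumed to store symptoms as `TRUE`, `FALSE`, or `NA` (question not asked or answered).

```r
# Hypothetical raw interview data (not from the package):
# TRUE = symptom present, FALSE = absent, NA = not asked / no data
raw <- data.frame(
  ID    = c("d1", "d2", "d3"),
  fever = c(TRUE, FALSE, NA),
  cough = c(NA, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Hypothetical helper: map one logical symptom column to "Y", "" or "."
recode_symptom <- function(x) {
  out <- rep(".", length(x))   # default: missing (no data)
  out[!is.na(x) & x]  <- "Y"   # present
  out[!is.na(x) & !x] <- ""    # absent
  out
}

symptoms <- setdiff(names(raw), "ID")
coded <- raw
coded[symptoms] <- lapply(raw[symptoms], recode_symptom)

# Keep the death ID as the first column
coded <- coded[, c("ID", symptoms)]
print(coded)
```

Keeping “.” as the default and overwriting only the observed answers means that any value that is not explicitly present or absent stays in the “missing” category rather than being silently treated as absent.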
Value
code.debias
Individual cause likelihood distribution
csmf
Cause-specific distribution in the sample
phy.bias
Bias matrix for each physician
cond.prob
Conditional probability of symptoms given causes
References
M. Salter-Townshend and T. B. Murphy (2013). Sentiment
analysis of online media. In Algorithms from and for Nature and
Life, pages 137-145. Springer.
Examples
data(RandomPhysician)
head(RandomPhysician[, 1:10])
## Not run:
causelist <- c("Communicable", "TB/AIDS", "Maternal",
"NCD", "External", "Unknown")
phydebias <- physician_debias(RandomPhysician, phy.id = c("rev1", "rev2"),
phy.code = c("code1", "code2"), phylist = c("doc1", "doc2"),
causelist = causelist, tol = 0.0001, max.itr = 5000)
# see the first physician's bias matrix
round(phydebias$phy.bias[[1]], 2)
## End(Not run)
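Before calling physician_debias, it may help to confirm that the columns named in phy.id and phy.code exist and that their values fall within phylist and causelist. The sketch below is a hypothetical pre-flight check, not a function provided by InSilicoVA. The helper name `check_physician_input` and the toy data frame are assumptions for illustration, and treating `NA` or "" as "no review recorded" is also an assumption about how uncoded deaths are stored.

```r
# Hypothetical helper (not part of InSilicoVA): report physician IDs and
# cause codes that are not covered by phylist / causelist.
check_physician_input <- function(data, phy.id, phy.code, phylist, causelist) {
  stopifnot(length(phy.id) == length(phy.code))
  missing_cols <- setdiff(c(phy.id, phy.code), names(data))
  if (length(missing_cols) > 0) {
    stop("Columns not found in data: ", paste(missing_cols, collapse = ", "))
  }
  ids   <- unique(unlist(data[phy.id], use.names = FALSE))
  codes <- unique(unlist(data[phy.code], use.names = FALSE))
  # Assumption: NA or "" means no physician review was recorded
  ids   <- ids[!is.na(ids) & ids != ""]
  codes <- codes[!is.na(codes) & codes != ""]
  list(
    unknown_physicians = setdiff(ids, phylist),
    unknown_causes     = setdiff(codes, causelist)
  )
}

# Toy data using the column names from the example call above
toy <- data.frame(
  ID    = c("d1", "d2"),
  fever = c("Y", ""),
  rev1  = c("doc1", "doc2"),
  rev2  = c("doc2", "doc3"),
  code1 = c("NCD", "External"),
  code2 = c("NCD", "Other"),
  stringsAsFactors = FALSE
)

res <- check_physician_input(
  toy,
  phy.id    = c("rev1", "rev2"),
  phy.code  = c("code1", "code2"),
  phylist   = c("doc1", "doc2"),
  causelist = c("Communicable", "TB/AIDS", "Maternal",
                "NCD", "External", "Unknown")
)
print(res)
```

In this toy input, `doc3` and `Other` are reported because they do not appear in `phylist` and `causelist`.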
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
> library(InSilicoVA)
Loading required package: rJava
Loading required package: coda
Loading required package: ggplot2
Please cite the 'InSilicoVA' package as:
Tyler H. McCormick, Zehang R. Li, Clara Calvert, Amelia C. Crampin, Kathleen Kahn and Samuel J. Clark (2014). Probabilistic cause-of-death assignment using verbal autopsies, Journal of the American Statistical Association, to appear
> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/InSilicoVA/physician_debias.Rd_%03d_medium.png", width=480, height=480)
> ### Name: physician_debias
> ### Title: Implement physician debias algorithm
> ### Aliases: physician_debias
>
> ### ** Examples
>
> data(RandomPhysician)
> head(RandomPhysician[, 1:10])
ID elder midage adult child under5 infant neonate male female
1 d1 Y NA NA . . Y
2 d2 Y NA NA . . Y
3 d3 Y NA NA . . Y
4 d4 Y NA NA . . Y
5 d5 Y NA NA . . Y
6 d6 Y NA NA . . Y
> ## Not run:
> ##D causelist <- c("Communicable", "TB/AIDS", "Maternal",
> ##D "NCD", "External", "Unknown")
> ##D phydebias <- physician_debias(RandomPhysician, phy.id = c("rev1", "rev2"),
> ##D phy.code = c("code1", "code2"), phylist = c("doc1", "doc2"),
> ##D causelist = causelist, tol = 0.0001, max.itr = 5000)
> ##D
> ##D # see the first physician's bias matrix
> ##D round(phydebias$phy.bias[[1]], 2)
> ## End(Not run)
>
> dev.off()
null device
1
>