R: Expected Vocabulary Growth by Binomial Interpolation (zipfR)
vgc.interp
R Documentation
Description
vgc.interp computes the expected vocabulary growth curve for
random samples taken from a data set described by the frequency
spectrum object obj.
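As a rough illustration of what binomial interpolation means, the following base R sketch applies the textbook binomial formulas, E[V(N)] = V(N0) - sum_m V_m(N0) (1 - N/N0)^m for the vocabulary size and a binomial thinning of the spectrum for E[V_m(N)]. It assumes these standard formulas; the toy spectrum and the helper names expected_V and expected_Vm are hypothetical and not part of zipfR, whose actual implementation (see EV.spc) may differ in detail.

```r
# Toy frequency spectrum (hypothetical data): Vm[i] types occur exactly m[i] times
m  <- c(1, 2, 3)
Vm <- c(3, 2, 1)
N0 <- sum(m * Vm)   # observed sample size: 10 tokens
V0 <- sum(Vm)       # observed vocabulary size: 6 types

# Expected vocabulary size in a random subsample of n <= N0 tokens
expected_V <- function(n) {
  sapply(n, function(ni) V0 - sum(Vm * (1 - ni / N0)^m))
}

# Expected number of types occurring exactly r times in the subsample:
# a type with frequency m keeps r of its tokens with binomial probability
expected_Vm <- function(n, r) {
  sapply(n, function(ni) sum(Vm * dbinom(r, size = m, prob = ni / N0)))
}

n   <- c(0, 5, 10)
ev  <- expected_V(n)
ev1 <- expected_Vm(n, 1)
print(data.frame(N = n, V = ev, V1 = ev1))
```

At N = N0 the sketch reproduces the observed values (6 types, 3 of them occurring once), and at N = 0 it returns 0, which is a quick sanity check on the formulas.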
Usage
vgc.interp(obj, N, m.max=0, allow.extrapolation=FALSE)
Arguments
obj
an object of class spc, representing the frequency
spectrum of the data set from which samples are taken
N
a vector of increasing non-negative integers specifying the
sample sizes for which the expected vocabulary size is calculated (as
well as the expected spectrum elements, if requested)
m.max
an integer in the range 1 … 9, specifying the
number of spectrum elements to be included in the vocabulary growth
curve (default: 0, i.e. none)
allow.extrapolation
if TRUE, the requested sample sizes
N may be larger than the sample size of the frequency spectrum
obj, so that binomial extrapolation is performed.
This option should be used with great caution (see
EV.spc for details).
Details
See the EV.spc manpage for more information, especially
concerning binomial extrapolation.
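To give an intuition for why extrapolation calls for caution, the base R sketch below evaluates the textbook binomial formula E[V(N)] = V(N0) - sum_m V_m(N0) (1 - N/N0)^m at sample sizes beyond N0. For N > N0 the factor (1 - N/N0) is negative, so the terms alternate in sign and grow in magnitude with m. The toy spectrum and the helper name binomial_V are hypothetical, and the sketch assumes the standard formula rather than reproducing the zipfR code.

```r
# Toy spectrum (hypothetical data) with observed sample size N0 = 10 tokens
m  <- c(1, 2, 3)
Vm <- c(3, 2, 1)
N0 <- sum(m * Vm)
V0 <- sum(Vm)

# Binomial formula evaluated without restricting n to n <= N0
binomial_V <- function(n) {
  sapply(n, function(ni) V0 - sum(Vm * (1 - ni / N0)^m))
}

n   <- c(10, 20, 100)   # N0, 2 * N0, 10 * N0
est <- binomial_V(n)
print(data.frame(N = n, V = est))
```

For this toy spectrum the estimate at N = 100 is 600 types, which exceeds the number of tokens and is therefore impossible. Real spectra contain much higher frequency classes, so the alternating terms can become far larger.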
Note that the result of vgc.interp is an object of class
vgc (a vocabulary growth curve), but its input is an
object of class spc (a frequency spectrum).
Value
An object of class vgc, representing the expected vocabulary
growth curves for random samples taken from the data set described by
obj. Data points will be generated for the specified sample
sizes N.
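A short sketch of how the returned object might be inspected, assuming the zipfR package is installed and using its accessor functions N(), V() and Vm(). The variable names sizes and curve are illustrative only.

```r
library(zipfR)

data(TigerPP.spc)                 # frequency spectrum (class spc)
sizes <- (1:10) * 9100            # sample sizes up to about 91k tokens
curve <- vgc.interp(TigerPP.spc, sizes, m.max = 1)

class(curve)                      # includes "vgc"
N(curve)                          # sample sizes of the data points
V(curve)                          # expected vocabulary size at each N
Vm(curve, 1)                      # expected number of hapax legomena (V_1)
```

Requesting m.max = 1 is what makes Vm(curve, 1) available; with the default m.max only the expected vocabulary size is stored in the curve.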
See Also
vgc for more information about vocabulary growth curves
and links to relevant functions; spc for more
information about frequency spectra.
The implementation of vgc.interp is based on the functions
EV.spc and EVm.spc. See the respective
manpages for technical details.
spc.interp computes the expected frequency spectrum for
a random sample by binomial interpolation.
Examples
## load the Tiger PP expansion spectrum
## (sample size: about 91k tokens)
data(TigerPP.spc)
## binomially interpolated curve
TigerPP.bin.vgc <- vgc.interp(TigerPP.spc,(1:100)*910)
summary(TigerPP.bin.vgc)
## let's also add growth of V_1 to V_5 and plot
TigerPP.bin.vgc <- vgc.interp(TigerPP.spc,(1:100)*910,m.max=5)
plot(TigerPP.bin.vgc,add.m=c(1:5))