Fits a relational event model to dyadic edgelist data, using either the ordinal or temporal likelihood. Maximum likelihood, posterior mode, and sampling importance resampling methods are supported.
edgelist
a three-column edgelist matrix, with each row containing (in order) the time/order, sender, and receiver for the event in question.
n
number of senders/receivers.
effects
a character vector indicating which effects to use; see below for specification.
ordinal
logical; should the ordinal likelihood be used? (If FALSE, the temporal likelihood is used instead.)
acl
optionally, a pre-computed acl structure.
cumideg
optionally, a pre-computed cumulative indegree structure.
cumodeg
optionally, a pre-computed cumulative outdegree structure.
rrl
optionally, a pre-computed recency-ranked communications list.
covar
an optional list of sender/receiver/event covariates.
ps
optionally, a pre-computed p-shift matrix.
tri
optionally, a pre-computed triad statistic structure.
optim.method
the method to be used by optim.
optim.control
additional control parameters to optim.
coef.seed
an optional vector of coefficients to use as the starting point for the optimization process.
hessian
logical; compute the hessian of the log-likelihood/posterior surface?
sample.size
sample size to use when estimating the sum of event rates.
verbose
logical; deliver progress reports?
fit.method
method to use when fitting the model.
conditioned.obs
the number of initial observations on which to condition when fitting the model (defaults to 0).
prior.mean
for Bayesian estimation, location vector for prior distribution (multivariate-t). (Can be a single value.)
prior.scale
for Bayesian estimation, scale vector for prior distribution. (Can be a single value.)
prior.nu
for Bayesian estimation, degrees of freedom for prior distribution. (Setting this to Inf results in a Gaussian prior.)
sir.draws
for sampling importance resampling method, the number of posterior draws to take (post-resampling).
sir.expand
for sampling importance resampling method, the expansion factor to use in the initial (pre-resampling) sample; sample size is sir.expand*sir.draws.
sir.nu
for sampling importance resampling method, the degrees of freedom for the t distribution used to obtain initial (pre-resampling) sample.
gof
logical; calculate goodness-of-fit information?
x
an object of class rem.dyad.
object
an object of class rem.dyad.
...
additional arguments.
Details
rem.dyad fits a (dyadic) relational event model to an event sequence, using either the full temporal or ordinal data likelihoods. Three estimation methods are currently supported: maximum likelihood estimation, Bayesian posterior mode estimation, and Bayesian sampling importance resampling. For the Bayesian methods, an adjustable multivariate-t (or, if prior.nu==Inf, Gaussian) prior is employed. In the case of Bayesian sampling importance resampling, the posterior mode (and the hessian of the posterior about it) is used as the basis for a multivariate-t sample, which is then resampled via SIR methods to obtain an approximate set of posterior draws. While this approximation is not guaranteed to work well, it is generally more robust than pure mode approximations (or, in the case of the MLE, estimates of uncertainty derived from the inverse hessian matrix).
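The sampling importance resampling idea described above can be illustrated on a toy one-parameter problem. The following is a minimal base-R sketch, not the internal implementation of rem.dyad: the Poisson data, the log_post function, and the prior settings are all assumptions made for illustration, and the names sir.draws, sir.expand, and sir.nu are borrowed from the arguments above only for readability.

```r
# Illustrative SIR sketch on a toy posterior (not rem.dyad internals)
set.seed(1)
y <- rpois(50, lambda = exp(0.5))   # toy data: Poisson counts, log-rate 0.5

# Unnormalized log-posterior: Poisson likelihood plus a t prior on the log-rate
log_post <- function(theta, prior.mean = 0, prior.scale = 10, prior.nu = 4) {
  sum(dpois(y, exp(theta), log = TRUE)) +
    dt((theta - prior.mean) / prior.scale, df = prior.nu, log = TRUE)
}

# Step 1: find the posterior mode and the curvature (hessian) about it
opt <- optim(log(mean(y)), function(th) -log_post(th),
             method = "BFGS", hessian = TRUE)
post_mode <- opt$par
sd_mode <- sqrt(1 / opt$hessian[1, 1])

# Step 2: draw an expanded sample from a t proposal centered at the mode
sir.draws <- 500
sir.expand <- 10
sir.nu <- 4
prop <- post_mode + sd_mode * rt(sir.draws * sir.expand, df = sir.nu)

# Step 3: importance weights, target over proposal, computed on the log scale
# (normalizing constants cancel when the weights are used for resampling)
log_w <- sapply(prop, log_post) -
  dt((prop - post_mode) / sd_mode, df = sir.nu, log = TRUE)
w <- exp(log_w - max(log_w))

# Step 4: resample in proportion to the weights
post <- sample(prop, sir.draws, replace = TRUE, prob = w)
```

A heavier-tailed proposal (small sir.nu) and a larger expansion factor generally make the weights better behaved, at the cost of more evaluations of the log-posterior.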
Whether Bayesian or frequentist methods are used, the relevant likelihood is based either entirely on the order of events (ordinal=TRUE) or on the realized event times (ordinal=FALSE). In the latter case, all event times are understood to be relative to the onset of observation (i.e., observation starts at time 0), and the last event time given is taken to be the end of the observation period. (If a sender and receiver are also specified for this final entry, that event is ignored.)
Effects to be fit by rem.dyad are determined by the eponymous effects argument, a character vector which lists the effects to be used. These are as follows:
NIDSnd: Normalized indegree of v affects v's future sending rate
NIDRec: Normalized indegree of v affects v's future receiving rate
NODSnd: Normalized outdegree of v affects v's future sending rate
NODRec: Normalized outdegree of v affects v's future receiving rate
NTDegSnd: Normalized total degree of v affects v's future sending rate
NTDegRec: Normalized total degree of v affects v's future receiving rate
FrPSndSnd: Fraction of v's past actions directed to v' affects v's future rate of sending to v'
FrRecSnd: Fraction of v's past receipt of actions from v' affects v's future rate of sending to v'
RRecSnd: Recency of receipt of actions from v' affects v's future rate of sending to v'
RSndSnd: Recency of sending to v' affects v's future rate of sending to v'
CovSnd: Covariate effect for outgoing actions (requires a covar entry of the same name)
CovRec: Covariate effect for incoming actions (requires a covar entry of the same name)
CovInt: Covariate effect for both outgoing and incoming actions (requires a covar entry of the same name)
CovEvent: Covariate effect for each (v,v') action (requires a covar entry of the same name)
OTPSnd: Number of outbound two-paths from v to v' affects v's future rate of sending to v'
ITPSnd: Number of incoming two-paths from v' to v affects v's future rate of sending to v'
OSPSnd: Number of outbound shared partners for v and v' affects v's future rate of sending to v'
ISPSnd: Number of inbound shared partners for v and v' affects v's future rate of sending to v'
FESnd: Fixed effects for outgoing actions
FERec: Fixed effects for incoming actions
FEInt: Fixed effects for both outgoing and incoming actions
Note that not all effects may lead to identified models in all cases - it is up to the user to ensure that the postulated model makes sense.
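As a small illustration of how an effects vector might be assembled and checked before fitting, the sketch below compares a hypothetical model specification against the effect names listed above. The known_effects vector is simply transcribed from that list, and the particular choice of effects is an assumption made for the example; rem.dyad itself is not called here.

```r
# Effect names transcribed from the list above
known_effects <- c("NIDSnd", "NIDRec", "NODSnd", "NODRec",
                   "NTDegSnd", "NTDegRec",
                   "FrPSndSnd", "FrRecSnd", "RRecSnd", "RSndSnd",
                   "CovSnd", "CovRec", "CovInt", "CovEvent",
                   "OTPSnd", "ITPSnd", "OSPSnd", "ISPSnd",
                   "FESnd", "FERec", "FEInt")

# A hypothetical model: popularity, reciprocal recency, and a sender covariate
effects <- c("NIDSnd", "RRecSnd", "CovSnd")

# Catch misspelled effect names before fitting
unknown <- setdiff(effects, known_effects)
if (length(unknown) > 0)
  stop("Unrecognized effects: ", paste(unknown, collapse = ", "))

# Covariate effects require a covar entry of the same name
needs_covar <- intersect(effects, c("CovSnd", "CovRec", "CovInt", "CovEvent"))
needs_covar
```

A check of this kind only catches naming errors; it does not establish that the resulting model is identified.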
Data to be used by rem.dyad must consist of an edgelist matrix, whose rows contain information on successive events. This matrix must have three columns, containing (respectively) the event times, sender IDs (as integers from 1 to n), and receiver IDs (also from 1 to n). As already noted, event times should be relative to onset of observation where the temporal likelihood is being used; otherwise, only event order is employed. In the temporal likelihood case, the last row should contain the time for the termination of the observation period – any event on this row is ignored. If conditioned.obs>0, the relevant number of initial observations is taken as fixed, and the likelihood of the remaining sequence is calculated conditional on these values; this can be useful when analyzing an event history with no clear starting point.
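The data layout described above can be sketched as follows for the temporal likelihood case. The raw event log, the actor labels, and the observation window (obs_start, obs_end) are hypothetical; the sketch only shows one way to obtain times relative to onset, integer IDs from 1 to n, and a terminal row marking the end of observation.

```r
# Hypothetical raw event log with clock times and actor labels
raw <- data.frame(
  time = c(103.2, 101.5, 107.9, 110.4),
  from = c("ann", "bob", "cat", "ann"),
  to   = c("bob", "ann", "ann", "cat"),
  stringsAsFactors = FALSE
)
obs_start <- 100   # assumed onset of observation
obs_end   <- 115   # assumed end of observation

raw <- raw[order(raw$time), ]                 # rows must be in event order
actors <- sort(unique(c(raw$from, raw$to)))   # map labels to IDs 1..n
n <- length(actors)

edgelist <- cbind(
  raw$time - obs_start,       # times relative to onset (observation starts at 0)
  match(raw$from, actors),    # sender IDs
  match(raw$to, actors)       # receiver IDs
)

# Terminal row: end of the observation period; any event here is ignored
edgelist <- rbind(edgelist, c(obs_end - obs_start, NA, NA))

# The matrix could then be passed on, e.g.:
# fit <- rem.dyad(edgelist, n, effects = effects, ordinal = FALSE)
```

For the ordinal likelihood only the row order matters, so the time shift and terminal row are unnecessary in that case.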
If covariate effects are indicated, then appropriate covariate values must be supplied as a list in argument covar. The elements of covar should be given the same name as the effect type to which they correspond (e.g., CovSnd, CovRec, etc.); any other elements will be ignored. The format of a given covariate element depends both on the effect type and on the number of covariates specified. The basic cases are as follows:
Single covariate, time invariant: For CovSnd, CovRec, or CovInt, a vector or single-column matrix/array. For CovEvent, an n by n matrix or array.
Multiple covariates, time invariant: For CovSnd, CovRec, or CovInt, a two-dimensional n by p matrix/array whose columns contain the respective covariates. For CovEvent, a p by n by n array, whose first dimension indexes the covariate matrices.
Single or multiple covariates, time varying: For CovSnd, CovRec, or CovInt, an m by p by n array whose respective dimensions index time (i.e., event number), covariate, and actor. For CovEvent, an m by p by n by n array, whose dimensions are analogous to the previous case.
Note that “time varying” covariates may only change values when events transpire; thus, they should be regarded as temporally endogenous. (See the reference below for details.)
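The covariate formats above can be made concrete with a short sketch. The values of n, m, and p and the random covariate values are placeholders chosen for illustration; only the shapes of the objects are the point here.

```r
set.seed(42)
n <- 10   # number of actors
m <- 25   # number of events
p <- 2    # number of covariates

# Single covariate, time invariant
covar_single <- list(
  CovSnd   = rnorm(n),                         # vector of length n
  CovEvent = matrix(rnorm(n * n), n, n)        # n by n matrix
)

# Multiple covariates, time invariant
covar_multi <- list(
  CovSnd   = matrix(rnorm(n * p), n, p),             # n by p, one column per covariate
  CovEvent = array(rnorm(p * n * n), dim = c(p, n, n))  # p by n by n
)

# Time-varying covariates (values may change only when events occur)
covar_tv <- list(
  CovSnd   = array(rnorm(m * p * n), dim = c(m, p, n)),        # event, covariate, actor
  CovEvent = array(rnorm(m * p * n * n), dim = c(m, p, n, n))  # event, covariate, sender, receiver
)

sapply(covar_tv, function(a) paste(dim(a), collapse = " x "))
```

Any one of these lists would be supplied through the covar argument, with the matching effect names (here CovSnd and CovEvent) included in effects.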
References
Butts, C.T. (2008). “A Relational Event Framework for Social Action.” Sociological Methodology, 38(1).
See Also
rem
Examples
## Not run:
#Generate some simple sample data based on fixed effects
roweff<-rnorm(10) #Build rate matrix
roweff<-roweff-roweff[1] #Adjust for later convenience
coleff<-rnorm(10)
coleff<-coleff-coleff[1]
lambda<-exp(outer(roweff,coleff,"+"))
diag(lambda)<-0
ratesum<-sum(lambda)
esnd<-as.vector(row(lambda)) #List of senders/receivers
erec<-as.vector(col(lambda))
time<-0
edgelist<-vector()
while(time<15){ # Observe the system for 15 time units
  drawsr<-sample(1:100,1,prob=as.vector(lambda)) #Draw from model
  time<-time+rexp(1,ratesum)
  if(time<=15) #Censor at 15
    edgelist<-rbind(edgelist,c(time,esnd[drawsr],erec[drawsr]))
  else
    edgelist<-rbind(edgelist,c(15,NA,NA))
}
#Fit the model, ordinal BPM
effects<-c("FESnd","FERec")
fit.ord<-rem.dyad(edgelist,10,effects=effects,hessian=TRUE)
summary(fit.ord)
par(mfrow=c(1,2)) #Check the coefficients
plot(roweff[-1],fit.ord$coef[1:9],asp=1)
abline(0,1)
plot(coleff[-1],fit.ord$coef[10:18],asp=1)
abline(0,1)
#Now, find the temporal BPM
fit.time<-rem.dyad(edgelist,10,effects=effects,ordinal=FALSE,hessian=TRUE)
summary(fit.time)
plot(fit.ord$coef,fit.time$coef,asp=1) #Similar results
abline(0,1)
#Finally, try the BSIR method (note: a much larger expansion factor
#is recommended in practice)
fit.bsir<-rem.dyad(edgelist,10,effects=effects,fit.method="BSIR",
    sir.draws=100,sir.expand=5)
summary(fit.bsir)
par(mfrow=c(3,3)) #Examine the approximate posterior marginals
for(i in 1:9){
  hist(fit.bsir$post[,i],main=names(fit.bsir$coef)[i],prob=TRUE)
  abline(v=roweff[i+1],col=2,lwd=3)
}
for(i in 10:18){
  hist(fit.bsir$post[,i],main=names(fit.bsir$coef)[i],prob=TRUE)
  abline(v=coleff[i-8],col=2,lwd=3)
}
## End(Not run)