R Graphical Manual

Browse All

Last data update: 2014.03.03

R: Pre-processing of Raw Point Data

procdnpoint

R Documentation

Pre-processing of Raw Point Data

Description

Transforms raw data of point distributions into formats appropriate for ulterior analysis. Please, see the details section for the different tasks that this function performs.

Usage

procdnpoint(dnpoint, tolerance = 1e-03)

Arguments

`dnpoint`	The object 'dnpoint' to be pre-processed.
`tolerance`	A value specifying the maximal distance to consider any point of the data set to be 'close enough' to the known or observed records of each species.

Details

The structure of data is firstly polished in removing duplicated points. Three main tasks are then developed:

1) Distance calculations. Two matrices are produced: (i) points X points, and (ii) species X points. Great circle or Euclidean distances are calculated if the coordinate system is geographical or rectangular, respectively. The second matrix of species X points corresponds to the Hausdorff distance between any single point of the data and the set of observed points for each species.

2) Data organization. A Boolean table accounts for the real species distributions. Here, species are the rows whereas the universe of unique records is arranged into columns. Although it is redundant, a list of species occurrences is also created. This list enables us to include additional points for each species if those extra points are separated from the observed/known points by a negligible distance. The parameter tolerance dictates the criterion to consider a spatial gap between records as negligible.

The argument tolerance is assumed to be in kilometers for the geographical coordinate system. Eventually, negative values are converted to absolute ones. You can pass a numeric vector specifying the tolerance radius for each point of the data set. If the length of the vector tolerance differs from the number of points indicated by the input object, then its values are recycled through rep function until they reach the required length. Finally, if different tolerance values are set to the same point because there are different species occurring on it, then we arbitrarily consider the corresponding value in lexicographic order.

3) Minimum spanning tree report. For each species, a full report about its MST is provided (i.e. total length, endvertices of MST arcs, normalized weight for each point proportional to the mean length of its incident MST arcs).

Value

An object of class dotdata, which is a list with elements:

`Call`	All arguments passed to this function when it was called.
`Label`	Character vector giving the labels for each species.
`dntable`	Boolean distributional table of species by records.
`Occupancy`	List of sampled records by species. Additional points falling in the tolerance radius around observed points are also included.
`Coords`	Two-columns table with (Longitude, Latitude)/(x, y) coordinates for the repository of unique points found throughout the data set.
`MSTsp`	List with data about the minimum spanning tree for each species.

Author(s)

Daniel A. Dos Santos <dadossantos@csnat.unt.edu.ar>

Examples

  #####
  # You can recognize the format of a typical input file
  # in the following created .txt:
  write(c("sp", "latitude", "longitude"), file= "proof.txt", ncolumns = 3, append = TRUE, sep = ",")
  # Sample 20 points from a normal distribution and segregate them into two sets equally sized. 
  x <- c(rnorm(10), rnorm(10, 2))
  y <- rnorm(20)
  for (i in 1:20) 
    write(c(LETTERS[ceiling(i/10)], x[i], y[i]),file= "proof.txt", 3, TRUE, sep = ",")
  # Put getwd() to identify the path where
  # the file 'proof.txt' has been located
  # Read the generated file.
  proof <- read.coord(inputfile = "proof.txt", type = "cartesian") 
  procdnpoint(proof) # Show the values returned by this function
  #####
  unlink("proof.txt") # Delete