Housing data for 506 census tracts of Boston from the 1970
census. The data frame BostonHousing contains the original data by
Harrison and Rubinfeld (1978); the data frame BostonHousing2 contains
the corrected version with additional spatial information (see
References below).
Usage
data(BostonHousing)
data(BostonHousing2)
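The calls above can be tried out as in the following minimal sketch. It assumes the two data frames are provided by the mlbench package and that the package is installed; the guard and the variable name extra_cols are illustrative only.

```r
# Assumption: BostonHousing and BostonHousing2 ship with the 'mlbench' package.
if (requireNamespace("mlbench", quietly = TRUE)) {
  data(BostonHousing, package = "mlbench")
  data(BostonHousing2, package = "mlbench")

  # Structure of the original data (506 observations on 14 variables)
  str(BostonHousing)

  # Columns that exist only in the corrected version
  extra_cols <- setdiff(names(BostonHousing2), names(BostonHousing))
  print(extra_cols)
}
```

Passing the package argument to data() avoids attaching the whole package when only the data sets are needed.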
Format
The original data are 506 observations on 14 variables,
medv being the target variable:
crim
per capita crime rate by town
zn
proportion of residential land zoned for lots over 25,000 sq.ft
indus
proportion of non-retail business acres per town
chas
Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
nox
nitric oxides concentration (parts per 10 million)
rm
average number of rooms per dwelling
age
proportion of owner-occupied units built prior to 1940
dis
weighted distances to five Boston employment centres
rad
index of accessibility to radial highways
tax
full-value property-tax rate per USD 10,000
ptratio
pupil-teacher ratio by town
b
1000(B - 0.63)^2 where B is the proportion of blacks by town
lstat
percentage of the population of lower status
medv
median value of owner-occupied homes in USD 1000's
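The transformation behind the variable b may be easier to read as code. The sketch below only evaluates the formula 1000(B - 0.63)^2 from the list above for a few made-up proportions; the function name b_from_share and the input values are hypothetical and not part of the data set.

```r
# Formula for 'b' as given in the variable description: 1000 * (B - 0.63)^2
b_from_share <- function(B) 1000 * (B - 0.63)^2

# Hypothetical proportions, chosen only to illustrate the shape of the transform
shares <- c(0, 0.25, 0.63, 1)
b_vals <- b_from_share(shares)
print(b_vals)  # 396.9 144.4 0.0 136.9
```

Because the formula is a parabola centred at 0.63, two different proportions can map to the same value of b, so the original proportion cannot in general be recovered from b alone.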
The corrected data set has the following additional columns:
cmedv
corrected median value of owner-occupied homes in USD 1000's
town
name of town
tract
census tract
lon
longitude of census tract
lat
latitude of census tract
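As a sketch of how the corrected columns might be used, the following compares medv with cmedv and fits a small linear model to the corrected target. It assumes the mlbench package is installed; the choice of predictors is arbitrary and purely illustrative, not a recommended model.

```r
# Assumption: BostonHousing2 ships with the 'mlbench' package.
if (requireNamespace("mlbench", quietly = TRUE)) {
  data(BostonHousing2, package = "mlbench")

  # Number of tracts whose median value differs between original and corrected data
  changed <- with(BostonHousing2, medv != cmedv)
  print(sum(changed))

  # Illustrative model only: corrected median value on a few of the listed variables
  fit <- lm(cmedv ~ crim + rm + lstat + ptratio, data = BostonHousing2)
  print(coef(fit))
}
```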
Source
The original data have been taken from the UCI Repository of Machine Learning
Databases (Newman et al., 1998; see References).
See Statlib and the references there for details on the corrections.
Both data sets were converted to R format by Friedrich Leisch.
References
Harrison, D. and Rubinfeld, D.L. (1978).
Hedonic prices and the demand for clean air.
Journal of Environmental Economics and Management, 5,
81–102.
Gilley, O.W., and R. Kelley Pace (1996). On the Harrison and Rubinfeld
Data. Journal of Environmental Economics and Management, 31,
403–405. [Provided corrections and examined censoring.]
Newman, D.J., Hettich, S., Blake, C.L. and Merz, C.J. (1998).
UCI Repository of machine learning databases
[http://www.ics.uci.edu/~mlearn/MLRepository.html]. Irvine, CA:
University of California, Department of Information and Computer
Science.
Pace, R. Kelley, and O.W. Gilley (1997). Using the Spatial Configuration of
the Data to Improve Estimation. Journal of Real Estate Finance
and Economics, 14, 333–340. [Added georeferencing and spatial
estimation.]