NCEAS Scientific Computing: Solutions Center Use Case

Compute Polygon Centroids Using R

This use case demonstrates how to compute, display, and use centroids of polygons stored in standard R spatial data objects, using two different R geophysical packages.


The centroid of a polygon is the geometric center of that polygon, calculated using a standard mathematical algorithm. Polygon centroids are are widely used to estimate the 'center of gravity' of individual polygons in a collection; in addition, centroids are used when plotting maps as the default location for displaying polygon labels

The R Environment provides several different options for calculating, storing, and using polygon centroids. In this Use Case, we demonstrate methods included in two widely-used R packages:

  • maptools contains tools for input/output, processing, and analysis of geophysical data, in particular, ESRI shape files. maptools also provides software interfaces ('wrappers') for exchanging spatial object with other R packages.

  • PBSMapping was developed by a team of fisheries scientists. It extends the R language to include two-dimensional plotting features similar to those commonly available in a Geographic Information System (GIS).

  • rgdal The R environment implementation of the Geospatial Data Abstraction Library (GDAL), which includes the widely-used PROJ.4 library for map projection/transformation operations.

Click here to download a .zip file archive containing the R source code and sample shape file used in this Example.

I. Using the maptools package

The first example:

  • 1) Ingests a polygon Shape file into an R Map object,
  • 2) computes the centroid for each polygon,
  • 3) reads the polygon file into an R Spatial object using a method that extracts the file's map projection attributes,
  • 4) projects the centroid points into the same coordinate system as the polygons,
  • 5) appends the centroid point coordinates onto the polygon data object's attribute table,
  • 6) creates a NEW data frame object containing the polygon centroids and selected attributes from the input polygon file.

Note: The maptools package provides data input method and Spatial class; the sp class provides data plotting method

ComputePolyCentroids <- function()
{
# R Script demonstrates two methods to compute Polygon Centroids:
# calcCentroid(), supplied by PBSmapping package, and get.Pcent(),
# supplied by maptools package. Also demonstrates R map projection
# transformation methods relevant to centrold calculation
#
# August 18, 2007
# Author: Rick Reeves, Scientific Programmer, 
# National Center for Ecological Analysis and Synthesis (NCEAS)
# University of California, Santa Barbara www.nceas.ucsb.edu
#
# Field scientists often work with sets of polygons stored in 
# ESRI shape file format, a widely used, platform-independant
# format for storing vector-based spatial data.
#
# R provides two alternative methods for computing centroids 
# for a set of polygons stored in Shape files.:
#
#   1) get.Pcent(), in the maptools package of geographic 
#      data analysis tools for shape file spatial data.
#
#   2) calcCentroid(), in the PBSMapping package of GIS-type 
#      spatial analysis and mapping tools. 
# 
# One issue that arises is the map projection and coordinate
# system, if any, incorporated in the incoming Polygon data set. 
# We will want to be sure that the calculated polygon centroids
# incorporate the same projection/coordinate system attributes.
# Specific R methods, demonstrated here, deal with the projection 
# issues.
#
# First, load the R packages that we need for the demonstration. 
#  
   library(rgdal)      # contains readOGR() function for reading shape files., auto-detecting the .prj file. 
   library(maptools)   # contains get.Pcent() function; automatically loads foreign and sp
   library(PBSmapping) # for GIS_like geospatial object manipulation / analysis including poly
#
# Next, read in the polygon shape file Several alternative routines exist for doing this:
# 
#      read.shape    (maptools): Reads shape file into Map object, which is the format required for 
#                    input to the get.Pcent() method.
#      readShapePoly (maptools): Reads shape file into Spatial object, widely used
#                    in R to store geospatial shape and attribute data sets
#      readOGR       (rgdal) Reads shape file into Spatial object, also auto-detects
#                    the shape file's projection attribute string (if it exists).
#
#
# First, demonstrate the use of get.Pcent(), supplied by maptools package.
# We will also show how to append the centroids as attributes onto the incoming Polygon file.
#
# Step 1: Read in vector polygon and file (in Shape File format), get the polygon projection information, 
#         and compute the polygon centroids. 
#
# Note: get.Pcent() method returns a two-column numeric matrix containing centroid point coordinates.
# Input polygon file contains outlines of all counties in five northwestern United States
# get.Pcent() requires a Map data object as input. 'Map' is an older, 'deprecated' class that
# we use here because get.Pcent requires it.
#
   MapPolysMapClass = read.shape("FiveNWStatesCounties")
   MapPolyCents     = get.Pcent(MapPolysMapClass)
#
# Use this information to project the matrix of map coordinates returned by get.Pcent
# into the same system  as the incoming polygons. 
# To get the map projection parameters for the polygon file, use readOGR() from rgdal library, 
# which auto detects the projection parameters as it reads the '.prj' component of the shape file.
#
   MapPolysDataFrame <- readOGR("FiveNWStatesCounties.shp","FiveNWStatesCounties")
   ProjString = proj4string(MapPolysDataFrame)

   CentCoords  = matrix(c(MapPolyCents[,1],MapPolyCents[,2]),nrow=length(MapPolyCents[,1]),ncol=2,byrow=FALSE) 
   ProjectedCentroids = spTransform(SpatialPoints(coordinates(CentCoords),proj4string=CRS(ProjString)),CRS(ProjString))
#
# Create a basic plot using the base graphics package features,
# using only the default plotting parameters.

   plot(MapPolysDataFrame,)  
   points(ProjectedCentroids)
 
Five State County Polygons Five State County Polygons
#
# Note: Use the class() function to check the R class types for 
# MapPolys (Map object) and MapPolyCents ( matrix object)
# 
# How many centroids were calculated? Get number of rows in the second column of centroid matrix.
# 
   numberOfCentroids = length(MapPolyCents[,2])
#
# How many polygons in five-state area? Get number of elements in the Shapes list component of the
# Map object (enter > MapPolysMapClass@names to get list of the components).
#
   numberOfPolygons = MapPolysMapClass$Shapes
#
# Now that we have confirmed that the two objects have equal length, lets create two new data frames: 
# 1) The polygon centroids, with county FIPS codes and county names attached.
# 2) The original map polygons, with polygon centroid coordinates appended. 
#
# How many centroids were calculated? Get number of rows in the second column of centroid matrix.
# 
   PolyAttrNames = names(MapPolysMapClass$att.data)
   PolyAttrNames  # we will select the attribute that stores County Names.
# 
   CountyNames=(MapPolysMapClass$att.data["COUNTYNAME"])
#
# cbind() merges vectors, returning a data frame.
#
   NamedCentroids = cbind(MapPolyCents,CountyNames)
#
# Now append the centroid coordinates to the map polygon data frame using all-purpose cbind()
# Make a copy and then append to the $att.data component
#
   MapPolysAndCentroids = MapPolysMapClass
   MapPolysAndCentroids$att.data = cbind(MapPolysAndCentroids$att.data,MapPolyCents)
# 
# take a look at the new attribute file:
#
   MapPolysAndCentroids$att.data 
#
# We need to create a better set of names for the revised
# the revised/appended data frame, and replace the existing ones. 
#
   NewColumnNames = c(names(MapPolysMapClass$att.data),"LONGITUDE","LATITUDE")
   names(MapPolysAndCentroids$att.data) = NewColumnNames

II. Using the PBSmapping package

The second example:

  • 1) Ingests a polygon Shape File into a PolySet object use by the PBSmapping package,
  • 2) computes the centroid for each polygon,
  • 3) plots the polygons and centroids,
  • 4) creates a NEW data frame object containing the polygon centroids and selected attributes from the input polygon file,
  • 5) plots the polygons and centroids using the base graphics plot routines, adding color and symbol attributes.
#
# Next, generate the centroids using routines in the PBSmapping package, which provides 
# GIS-like services.
# Get map projection attributes (included in the '.prj' component of the shape file).
#
   MapPolysDataFrame <- readOGR("FiveNWStatesCounties.shp","FiveNWStatesCounties")
   ProjString = proj4string(MapPolysDataFrame)
#
# In any event, the incoming polygon data now has a projection attribute.
# Generate centroids using PBSmapping method. First, we promote the 
# Spatial to the PolySet data object used by PBSmapping
# methods.
#   
   InputPolygonsPS  = SpatialPolygons2PolySet(MapPolysDataFrame)
#
# Note rollup=1 parameter: This generates one centroid for each
# unique Polygon Identifier (PID) in the shape file, rather than
# for each element in the shape file. Some polygons with 'holes'
# may occupy multiple elements. For other options, see PBSmapping
# documentation. 
# 
   PolyCentPBS = calcCentroid(InputPolygonsPS,rollup=1)
#
# Plot the polygons and their centroids 
# 
   plotPolys(InputPolygonsPS,col="wheat1",xlab="longitude",ylab="latitude")
   addPoints(PolyCentPBS,col="blue",pch=19)
 
PBSMapping Polygons PBSMapping Polygons/Centroids
   
#
# PolyCentPBS returns a PolyData object, which contains just the
# centroid X and Y coordinates and the PID for corresponding polygons. 
#
   CentCoords  = matrix(c(PolyCentPBS$X,PolyCentPBS$Y),nrow=length(PolyCentPBS$X),ncol=2,byrow=FALSE) 
#
# Create a SpatialPointsDataFrame and assign the county polygon name and FIPS code attributes to the centroids.
#
   CentroidAttrs = data.frame(MapPolysDataFrame$COUNTYNAME,MapPolysDataFrame$COUNTYFIPS)
   CentroidsSPDF = SpatialPointsDataFrame(CentCoords,CentroidAttrs,coords.nrs=numeric(0),CRS(ProjString))
#
# Use the generic R Graphics to create simple plot of the polygons and Spatial Points format points, 
# this time adding color to the map.
#
   plot(MapPolysDataFrame, col="wheat")
   points(CentroidsSPDF,col="dark blue",pch=16)
Five State County Polygons Five State County Polygons
#
# Done. 
#
}

Learning More:

The CRAN (R) Task View summarizes the R Analytical Environment's spatial data analysis and management tools..

This edition of the R Newsletter contains an in-depth article describing the R spatial data classes; parts of the article appear here.

This ESRI White Paper describes the Shape File format.

This Web Site describes the standard .dbf format which is a component of the ESRI Shape File.

Point of Contact for this Use Case: Rick Reeves, NCEAS Scientific Programmer reeves@nceas.ucsb.edu reeves@nceas.ucsb.edu
This Use Case was compiled August, 2007.

Site Home | NCEAS Home | KNB Home