|
|
Sampling Vector Polygon Data using a Point Grid Overlay
Using elements of the R sp package to extract attribute values at regular intervals from a vector polygon map using a regularly-spaced point grid
Bailey's Ecoregions for the continental United States delineate areas of common climactic and vegetation characteristics across the continent. They are available as a digital spatial dataset (various formats) from from the United States National Atlas data portal. An NCEAS scientist wished to generate a regularly-spaced grid covering the entire map, and then assign the underlying Ecoregion code to each grid location. Further, they wished to vary the spatial resolution of this grid order to create sample the polygon data at different spatial frequencies. A straightforward geospatial processing task, that until the advent of the R geospatial processing tools, required a full-scale Geographic Information System (GIS) to perform. Now, we can produce the required results using functions in the sp and PBSMapping packages.
In this Use Case we will work through the process of reading, computing, and displaying the input Bailey's Ecoregions file and the spatial datasets derived from this map.
Click here to download the source code and sample data used in this example.
Read and Display Vector Polygons: Bailey's Ecoregions Map
On the left is a display of the Bailey's Ecoregions polygons. On the right is the R code used to set up the R graphics environment and display the regions in map form. Note: The PBSMapping package methods are tailored to the display of multi-layer spatial data sets in that it is easy to 'stack' calls to the routines to generate a single multi-dataset (e.g., points-over-polygons) diagrams. In addition, they provide axis labels and map 'tic-marks' by default. There is a tradeoff: The map data need to be converted into PBSMapping-specific objects (e.g., the PolySet) before creating the diagram. PBSMapping provides methods for making this conversion, of course.
 |
SampleBaileysEcoRegions <- function()
{
library(sp)
library(PBSMapping)
#
# Read ESRI Shape File into SpatialPolygonsDataFrame
# This produces a SpatialPolygonsDataFrame.
#
EcoRegion = readShapePoly("BaileysEcoRegionsConusGP")
#
# Use PBSMapping routines to generate nice spatial object plots
# Note: This requires data conversion to a PBSMapping PolySet.
#
EcoRegionPS = SpatialPolygons2PolySet(EcoRegion)
#
plotPolys(EcoRegionPS,xlab="W Longitude",ylab="N Latitude")
#
|
Extract and Use Subset of Baileys Ecoregion Attributes for Use Case
The Baileys Ecoregion polygons have nine descriptive attributes. We will extract three - ECO_US, ECO_US_ID, ECO_CODE - for use in this example.
The ECO_CODE attribute will be sampled using the point grid that we will create in the next step. On the right is the R script, on the left is a portion of the reduced Data component
'data.frame': 1690 obs. of 3 variables:
$ ECO_US_ : num 1246 1247 1248 1249 1250 ...
$ ECO_US_ID: num 135 136 1402 15 1403 ...
$ ECOCODE : Factor w/ 164 levels "-212A","-212B",..: 1 116 51...
ECO_US_ ECO_US_ID ECOCODE
0 1246 135 -212A
1 1247 136 M212A
2 1248 1402 -242A
3 1249 15 -242A
4 1250 1403 -242A
5 1251 14 M242B
6 1252 1404 -242A
7 1253 1405 -242A
8 1254 1 M242A
9 1255 1406 -242A
10 1256 1407 -242A
|
#
# Make a copy of the Spatial Polygon file's Data component,
# including ONLY columns 3 through 5. Then, create a copy
# of the Spatial Polygon file, using the new data subset.
#
NewData = EcoRegion@data[,3:5]
EcoRegionTrunc = EcoRegion
EcoRegionTrunc@data = NewData
str(NewData)
NewData[1:10,]
#
|
Create the Sampling Point Grid
Next, use the sp package tools to generate two sampling grids with two degree and
one degree spatial resolution.On the right is the R script, on the left are (top to bottom): the two-degree grid, the two-degree grid overlaid on the Bailey's Ecoregions polygons, the one-degree grid overlaid on the Baley's Ecoregions, and a display (using the ESRI ArcMap GIS Identify feature) of a single grid point value.
 |
#
vals = EcoRegionTrunc@bbox
deltaLong = as.integer((vals[1,2] - vals[1,1]) + 1.5)
deltaLat = as.integer((vals[2,2] - vals[2,1]) + 1.5)
#
# grid resolution in decimal degress.
#
gridRes = 1.0
#
gridSizeX = deltaLong / gridRes
gridSizeY = deltaLat / gridRes
#
# GridTopology object forms the basis for the sampling grid
#
GridTopoOneDeg = GridTopology(vals[,1],c(gridRes,gridRes),
c(gridSizeX,gridSizeY))
#
# Create the SpatialGrid object....
#
SamplingGridOneDeg = SpatialGrid(GridTopoOneDeg)
#
# NOTE: To create the one-half Deg Spatial Grid,
# change the parameter 'gridRes', and use the
# same code subset displayed here...
#
gridRes = 2.0
#
# repeat above code.....
#
# ......
#
SamplingGridTwoDeg = SpatialGrid(GridTopoTwoDeg)
#
# We get a more suitable output from the overlay() method
# if we use a SpatialPoint object as an input, so promote
# the SpatialGrid object to one.
#
SamplingPointsTwoDeg = as(SamplingGridTwoDeg,"SpatialPoints")
#
# plot the two kilometer grid on top of the base map,
# then create a new plot to display the two-and one-degree grids
# Note: we convert the points to PBSMapping-compatible EventData
# to allow use of addPoints() method.
#
NumPoints = length(SamplingPointsTwoDeg@coords[,1])
TwoDegEvents <- as.EventData(data.frame(EID=1:NumPoints,X=SamplingPointsTwoDeg@coords[,1],
Y=SamplingPointsTwoDeg@coords[,2]),projection = "NA")
addPoints(TwoDegEvents,col="green",pch = 20)
#
# plot the two-degree grid by itself
#
dev.set(1)
plotPoints(TwoDegEvents, col = "green", pch = 20)
#
NumPoints = length(SamplingPointsTwoDeg@coords[,1])
OneDegEvents <- as.EventData(data.frame(EID=1:NumPoints,X=SamplingPointsOneDeg@coords[,1],
Y=SamplingPointsOneDeg@coords[,2]),projection = "NA")
#
# create a second plot in new graphics window to display the One Degree grid
#
dev.set(1)
plotPolys(EcoRegionPS, xlab = "W Longitude", ylab = "N Latitude")
addPoints(OneDegEvents,col="red",pch=20)
#
# Use the overlay() method to extract a polygon attribute value
# at each point location within the polygon data set boundary
#
SamplingResultsDFTwoDeg = overlay(EcoRegionTrunc,SamplingPointsTwoDeg)
#
#
# Create a point shape file to contain the
# sampling results. The attribute table will include
# all three Bailey's measures along with the corresponding
# point location.
# Make some modifications to the overlay results data frame
# so that the shape file has usable attributes.
#
# First, add the sampling point coordinates
#
Temp2 = cbind(SamplingPointsTwoDeg@coords,SamplingResultsDFTwoDeg)
#
# Next, convert the ECOCODE attribute from a factor to a character string
#
ecoCode = as.character(Temp1$ECOCODE)
flags = is.na(ecoCode)
ecoCode[flags] = "NoMatch"
OutputDataFrame = cbind(Temp2[1:4],ecoCode)
#
# Finally, generate a point shape file that includes the EcoRegion attribute.
#
write.pointShape(SamplingPointsTwoDeg@coords,OutputDataFrame,"BaileyEcoRegionGridTwoDeg")
#
|
 |
 |
 |
Create new point Shape File, then read and display the file
Lets create an output file consisting ONLY of grid points falling within the United States boundary. First, perform a Point-In-Polygon operation using the PBSMapping findPolys() method to assign the bounding polygon to each grid point.
- 1. Perform a Point-In-Polygon operation.
- 2. Write the point subset into a Shape File.
- 3. Read in and display the points in the Shape File.
- 4. Take a look at the R data object structure for both Point types.
Here is the R script, and the resulting plot:
 |
#
# Bind the Bailey's EcoCodes to the points in the
# entire two-degree grid.
#
TwoDegEventsWithCodes = cbind(TwoDegEvents,ecoCode)
#
# Extract ONLY grid points within the USA boundaries.
#
PointsInsideRegions = findPolys(TwoDegEventsWithCodes,EcoRegionPS)
PointIndexesInsideRegions = sort(unique(PointsInsideRegions$EID))
PointsInsideRegions = TwoDegEvents[PointIndexesInsideRegions,]
#
# write a new point shape file
#
write.pointShape(cbind(PointsInsideRegions$X,PointsInsideRegions$Y),
PointsInsideRegions,"BaileyEcoRegionGridClipped")
#
# look at the structure of the point data object
# before and after conversion to Events
#
TwoDegInUSA = readShapePoints("BaileyEcoRegionGridClipped")
#
NumPoints = length(TwoDegInUSA@data[,1])
PointEvents <- as.EventData(data.frame(EID=1:NumPoints ,X=TwoDegInUSA@coords[,1],
Y=TwoDegInUSA@coords[,2]))# ,projection = "NA")
#
# Make a new plot
#
addPoints(TwoDegEvents,col="orange",pch = 20)
plotPolys(EcoRegionPS, xlab = "W Longitude", ylab = "N Latitude")
addPoints(PointEvents,col="red",pch = 20)
#
# Look at the structure of the point data object
# before and after conversion to Events
#
str(PointEvents)
#
str(TwoDegInUSA)
#
# End of example!
#
|
#
Formal class 'SpatialPointsDataFrame' [package "sp"] with 5 slots
..@ data :'data.frame': 206 obs. of 4 variables:
.. ..$ EID : num [1:206] 3 4 5 6 7 8 9 10 11 12 ...
.. ..$ X : num [1:206] -121 -119 -117 -115 -113 ...
.. ..$ Y : num [1:206] 48.5 48.5 48.5 48.5 48.5 ...
.. ..$ ecoCode: Factor w/ 114 levels "-212F","-212H",..: 88 111 ...
.. ..- attr(*, "data_types")= chr [1:4] "N" "N" "N" "C"
..@ coords.nrs : num(0)
..@ coords : num [1:206, 1:2] -121 -119 -117 -115 -113 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : NULL
.. .. ..$ : chr [1:2] "coords.x1" "coords.x2"
..@ bbox : num [1:2, 1:2] -122.8 26.5 -68.8 48.5
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:2] "coords.x1" "coords.x2"
.. .. ..$ : chr [1:2] "min" "max"
..@ proj4string:Formal class 'CRS' [package "sp"] with 1 slots
.. .. ..@ projargs: chr NA
#
# Note that EventData class does not retain the point attributes
#
Classes ‘EventData’ and 'data.frame': 206 obs. of 3 variables:
$ EID: int 1 2 3 4 5 6 7 8 9 10 ...
$ X : num -121 -119 -117 -115 -113 ...
$ Y : num 48.5 48.5 48.5 48.5 48.5 ...
#
|
Learning More:
The USFS Ecoregions Center: US Forest Service Natural Resource Assessment and Analysis Ecoregions Center web site
The CRAN (R) Task View summarizes the R Analytical Environment's spatial data analysis and management tools..
This ESRI White Paper describes the Shape File format.
This Web Site describes the standard .dbf format which is a component of the ESRI Shape File.
|