- 1 Spatial Data in R: Building Objects from Scratch!
- 2. Importing and Exporting Spatial Data using
`rgdal`

- 3. OPTIONAL: Understanding Spatial* Objects

Welcome to Spatial Data in R! This first set of tutorials (in three parts) is provides an introduction to the two types of spatial data you will encounter in R: vector data and raster data. By the end of this tutorial, you should have a good sense of how R thinks about spatial data, and how to import and export spatial datasets you may get from other programs or sources.

Almost all spatial vector data structures in R are based on the `sp`

package. Even other libraries that may seem independent are usually built on top of `sp`

, even if you can’t see it.

The `sp`

package has three main types of spatial data we’ll work with: points, lines, and polygons. There are some differences between these different types, but they are all very similar.

To help you understand what these data structures are like, in this section we’ll create some spatial data from scratch. This is probably not how you’ll work with data in the future – most of the time you just import spatial data from a source – but this exercise will help give you a good foundation and help you know how to troubleshoot problems in the future.

There are three basic steps to creating spatial data by hand:

**Create geometric objects (points, lines, or polygons)****Convert those geometric objects to**`Spatial*`

objects (`*`

stands for Points, Lines, or Polygons)- Geometric objects live in an abstract space (the x-y plane). To make them
*spatial*objects, we also need to include information on how those x-y coordinates relate the places in the real world using a Coordinate Reference System (CRS).

- Geometric objects live in an abstract space (the x-y plane). To make them
**(***Optional*:) Add a data frame with attribute data, which will turn your`Spatial*`

object into a`Spatial*DataFrame`

object.

**Exercise 0**

Discuss with your neighbor: What do we need to specify in order to define spatial vector data?

Points are the most basic geometric shape, so we begin by building a `SpatialPoints`

object.

A points is defined by a pair of x-y coordiantes, so we can create a set of points by (a) creating matrix of x-y points, and (b) passing them to the `SpatialPoints`

function to create our first `SpatialPoints`

object:

```
library(sp)
toy.coordinates <- rbind(c(1.5, 2), c(2.5, 2), c(0.5, 0.5), c(1, 0.25), c(1.5,
0), c(2, 0), c(2.5, 0), c(3, 0.25), c(3.5, 0.5))
toy.coordinates
```

```
[,1] [,2]
[1,] 1.5 2.00
[2,] 2.5 2.00
[3,] 0.5 0.50
[4,] 1.0 0.25
[5,] 1.5 0.00
[6,] 2.0 0.00
[7,] 2.5 0.00
[8,] 3.0 0.25
[9,] 3.5 0.50
```

```
my.first.points <- SpatialPoints(toy.coordinates) # ..converted into a spatial object
plot(my.first.points)
```

To get a summary of how R sees these points, we can ask it for summary information in a couple different ways. Here’s a summary of available commands:

`summary(my.first.points)`

```
Object of class SpatialPoints
Coordinates:
min max
coords.x1 0.5 3.5
coords.x2 0.0 2.0
Is projected: NA
proj4string : [NA]
Number of points: 9
```

`coordinates(my.first.points)`

```
coords.x1 coords.x2
[1,] 1.5 2.00
[2,] 2.5 2.00
[3,] 0.5 0.50
[4,] 1.0 0.25
[5,] 1.5 0.00
[6,] 2.0 0.00
[7,] 2.5 0.00
[8,] 3.0 0.25
[9,] 3.5 0.50
```

Unlike a simple geometric object, a `SpatialPoints`

object has the ability to keep track of how the coordinates of its points relate to places in the real world through an associated “Coordinate Reference System” (CRS – the combination of a geographic coordinate system and possibly a projection), which is stored using a code called a `proj4string`

. The proj4string is so important to a `SpatialPoints`

object, that it’s presented right in the summary:

`summary(my.first.points)`

```
Object of class SpatialPoints
Coordinates:
min max
coords.x1 0.5 3.5
coords.x2 0.0 2.0
Is projected: NA
proj4string : [NA]
Number of points: 9
```

In this case, however, while our `SpatialPoints`

object clearly knows what a CRS *is*, the Spatial object we just created **does not** have a projection or geographic coordinate system defined. It is ok to plot, but be aware that for many meaningful spatial operations you will need to define a CRS.

CRS objects can be created by passing the `CRS()`

function the code associated with a known projection. You can find the codes for most commonly used projections from www.spatialreference.org.

Note that the same CRS can often be called in many ways. For example, one of the most commonly used CRS is the WGS84 latitude-longitude projection. You can create a WGS84 lat-long projection object by either passing the reference code for the projection – `CRS("+init=EPSG:4326"`

) – or by directly calling all its relevant parameters – `CRS("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")`

.

**Note:** Some users report this only works if `EPSG`

is lower-case (`epsg`

), so try that if you have issues!

Here’s an illustration of assigning a CRS:

`is.projected(my.first.points) # see if a projection is defined. `

`[1] NA`

```
# Returns `NA` if no geographic coordinate system or projection; returns
# FALSE if has geographic coordinate system but no projection.
crs.geo <- CRS("+init=EPSG:32633") # UTM 33N
proj4string(my.first.points) <- crs.geo # define projection system of our data
is.projected(my.first.points)
```

`[1] TRUE`

`summary(my.first.points)`

```
Object of class SpatialPoints
Coordinates:
min max
coords.x1 0.5 3.5
coords.x2 0.0 2.0
Is projected: TRUE
proj4string :
[+init=EPSG:32633 +proj=utm +zone=33 +datum=WGS84 +units=m +no_defs
+ellps=WGS84 +towgs84=0,0,0]
Number of points: 9
```

When `CRS`

is called with only an EPSG code, R will try to complete the CRS with the parameters looked up in the EPSG table:

```
crs.geo <- CRS("+init=EPSG:32633") # looks up UTM 33N
crs.geo # prints all parameters
```

```
CRS arguments:
+init=EPSG:32633 +proj=utm +zone=33 +datum=WGS84 +units=m +no_defs
+ellps=WGS84 +towgs84=0,0,0
```

We’ll talk more about managing projections in the next lesson.

Geometry-only objects (objects without attributes) can be subsetted similar to how vectors or lists are subsetted; we can select the first two points by

`my.first.points[1:2]`

```
class : SpatialPoints
features : 2
extent : 1.5, 2.5, 2, 2 (xmin, xmax, ymin, ymax)
coord. ref. : +init=EPSG:32633 +proj=utm +zone=33 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0
```

Moving from a `SpatialPoints`

to a `SpatialPointsDataFrame`

occurs when you add a `data.frame`

of attributes to the points. Let’s just add an arbitrary table to the data – this will label each point with a letter and a number. **Note points will merge with the data.frame based on the order of observations.**

```
df <- data.frame(attr1 = c("a", "b", "z", "d", "e", "q", "w", "r", "z"), attr2 = c(101:109))
df
```

```
attr1 attr2
1 a 101
2 b 102
3 z 103
4 d 104
5 e 105
6 q 106
7 w 107
8 r 108
9 z 109
```

```
my.first.spdf <- SpatialPointsDataFrame(my.first.points, df)
summary(my.first.spdf)
```

```
Object of class SpatialPointsDataFrame
Coordinates:
min max
coords.x1 0.5 3.5
coords.x2 0.0 2.0
Is projected: TRUE
proj4string :
[+init=EPSG:32633 +proj=utm +zone=33 +datum=WGS84 +units=m +no_defs
+ellps=WGS84 +towgs84=0,0,0]
Number of points: 9
Data attributes:
attr1 attr2
z :2 Min. :101
a :1 1st Qu.:103
b :1 Median :105
d :1 Mean :105
e :1 3rd Qu.:107
q :1 Max. :109
(Other):2
```

Now that we have attributes, we can also subset our data the same way we would subset a `data.frame`

. Some subsetting:

`my.first.spdf[1:2, ] # row 1 and 2 only`

```
class : SpatialPointsDataFrame
features : 2
extent : 1.5, 2.5, 2, 2 (xmin, xmax, ymin, ymax)
coord. ref. : +init=EPSG:32633 +proj=utm +zone=33 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0
variables : 2
names : attr1, attr2
min values : a, 101
max values : b, 102
```

`my.first.spdf[1:2, "attr1"] # row 1 and 2 only, attr1`

```
class : SpatialPointsDataFrame
features : 2
extent : 1.5, 2.5, 2, 2 (xmin, xmax, ymin, ymax)
coord. ref. : +init=EPSG:32633 +proj=utm +zone=33 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0
variables : 1
names : attr1
min values : a
max values : b
```

`plot(my.first.spdf[which(my.first.spdf$attr2 > 105), ]) # select if attr2 > 5`

A `SpatialPointsDataFrame`

object can be created directly from a `data.frame`

by specifying which columns contain the coordinates. This is interesting, for example if you have a spreadsheet that contains latitude, longitude and some values. You can read this into a data frame with `read.table`

, and then create the object from the data frame in one step by using the `coordinates()`

command. That automatically turns the dataframe object into a `SpatialPointsDataFrame`

.

- If you haven’t already, create a new directory
`R_Workshop`

on your Desktop. - Download and place the
`RGIS1_Data`

folder from coursework in this directory. `read.csv()`

to read`sf_restaurant_inspections.csv`

into a dataframe in R and name it`sf.df`

.- Use
`head()`

to examine the first few lines of the dataframe. - Use
`class()`

to examine which object class the table belongs to. - Load the
`sp`

library (`library(sp)`

) Convert the table into a spatial object using the

`coordinates`

command and passing the names of the columns in the table that correspond to longitude and latitude:`coordinates(sf.df) <- c("longitude", "latitude")`

**Note the reverse order! While we usually say “latitude and longitude”, since longitude corresponds to the x-coordinates on a map and latitude corresponds to the y-coodinates and R assumes coordinates to be ordered (x,y), we pass c("longitude","latitude")!**

Use

`class()`

again to examine which object class the table belongs to now:

What to you observe?Plot restaurants with terrible scores by subsetting on

`Score`

and using the`plot`

command.

SpatialPolygons are very, very common, especially in political science (think administrative borders, electoral constituencies, etc.), so they’re important to get used to.

`SpatialPolygons`

from scratch.`SpatialPolygons`

are a little more complicated than `SpatialPoints`

. With `SpatialPoints`

, we moved directly from x-y coordinates to a `SpatialPoints`

object.

To get a `SpatialPolygons`

object, we have to build it up by (a) creating `Polygon`

objects, (b) combining those into `Polygons`

objects (note the “s” at the end), and finally (c) combining those to create `SpatialPolygons`

. So what are these components?

- A
`Polygon`

object is a single geometric shape (e.g. a square, rectangle, etc.) defined by a single uninterrupted line around the exterior of the shape. - A
`Polygons`

object consists of one*or more*simple geometric objects (`Polygon`

objects) that combine to form what you think of as a single unit of analysis (an “observation”). For example, each island in Hawaii would be a`Polygon`

, but Hawaii itself is the`Polygons`

consisting of all the individual island`Polygon`

objects. - A
`SpatialPolygons`

object is a collection of`Polygons`

objects, where each`Polygons`

object is an “observation”. For example, if we wanted a map of US states, we would make a`Polygons`

object for each state, then combine them all into a single`SpatialPolygons`

. If you’re familiar with shapefiles,`SpatialPolygons`

is basically the R analogue of a`shapefile`

or`layer`

.

**One special note:** if you want to put a hole in a polygon (e.g. drawing a donut, or if you wanted to draw South Africa and leave a hole in the middle for Lesotho) you do so by (a) creating a `Polygon`

object for the outline, (b) creating a second `Polygon`

object for the hole and passing the argument `hole=TRUE`

, and (c) combine the two into a `Polygons`

object.

Let’s try building up a `SpatialPolygon`

!

```
# create polyon objects from coordinates. Each object is a single geometric
# polygon defined by a bounding line.
house1.building <- Polygon(rbind(c(1, 1), c(2, 1), c(2, 0), c(1, 0)))
house1.roof <- Polygon(rbind(c(1, 1), c(1.5, 2), c(2, 1)))
house2.building <- Polygon(rbind(c(3, 1), c(4, 1), c(4, 0), c(3, 0)))
house2.roof <- Polygon(rbind(c(3, 1), c(3.5, 2), c(4, 1)))
house2.door <- Polygon(rbind(c(3.25, 0.75), c(3.75, 0.75), c(3.75, 0), c(3.25,
0)), hole = TRUE)
# create lists of polygon objects from polygon objects and unique ID A
# `Polygons` is like a single observation.
h1 <- Polygons(list(house1.building, house1.roof), "house1")
h2 <- Polygons(list(house2.building, house2.roof, house2.door), "house2")
# create spatial polygons object from lists A SpatialPolygons is like a
# shapefile or layer.
houses <- SpatialPolygons(list(h1, h2))
plot(houses)
```

As with SpatialPoints, we can associated a `data.frame`

with SpatialPolygons. There are two things that are important about doing this with SpatialPolygons:

- When you
**first**associate a`data.frame`

with a SpatialPolygons object, R will line up rows and polygons by matching`Polygons`

object names with the`data.frame`

`row.names`

. - After the initial association, this relationship is
**NO LONGER**based on row.names! For the rest of the`SpatialPolygonsDataFrame`

’s life, the association between`Polygons`

and rows of your`data.frame`

is based on the order of rows in your`data.frame`

, so don’t try to change the order of the`data.frame`

by hand!

Make attributes and plot. Note how the door – which we created with `hole=TRUE`

is empty!

`attr <- data.frame(attr1 = 1:2, attr2 = 6:5, row.names = c("house2", "house1"))`

```
houses.DF <- SpatialPolygonsDataFrame(houses, attr)
as.data.frame(houses.DF) # Notice the rows were re-ordered!
```

```
attr1 attr2
house1 2 5
house2 1 6
```

`spplot(houses.DF)`