class: center, middle, inverse, title-slide .title[ # Geospatial Analysis with
R-Programming
] .subtitle[ ## .f3[🗺️ | Application of ‘terra’, ‘raster’, & ‘gdal’] ] .author[ ###
Dr. Ankit Deshmukh
] .institute[ ### Department of Civil Engineering,
Pandit Deendayal Energy University, Gandhinagar ] .date[ ### 12 March 2024 ] --- class: middle inverse
<!-- ------------------------- Start your slides ------------------------- --> # .f2.gold[“‘A map says to you.<br />Read me carefully, follow me closely, doubt me not ... <br />I am the earth in the palm of your hands.”] .b.f3[~ Beryl Markham] --- class: .incremental ## Outline of this talk! - .f2[What is Geospatial analysis?] -- - .f2[Why use scripting for Spatial analysis?] -- - .f2[R Programming!] - Quick setup of R and Rstudio. - Basics of R. -- - .f2[Geospatial Analysis with R.] - Basics of GIS datatypes - Raster operations. - Vector operations. - Statistics for GIS data. - etc... .footnote.i[You can access the slides from: **[ankitdeshmukh.com/slides/GST_with_R/**](www.ankitdeshmukh.com/slides/GST_with_R/)] --- class: .incremental ## What do we understand from "Geospatial Analysis"? .f3[ - A multidisciplinary approach involving patterns identification, relationships, and trends within geographic data. - Concepts from geography, cartography, geographic information systems (GIS), and statistical analysis to gain insights into spatial phenomena. ] -- ### Key components of geospatial analysis include -- - **Geographic Information Systems (GIS):** GIS is a technology that enables the capture, storage, analysis, and visualization of spatial data. <br /> -- - **Remote Sensing:** Remote sensing involves acquiring information about the Earth's surface without direct physical contact. <br /> -- - **Spatial Statistics:** This involves the application of statistical techniques to analyze spatial patterns and relationships. <br /> -- - **Spatial Modeling:** This involves creating mathematical representations of spatial relationships to simulate and predict real-world phenomena. <br /> --- class: .incremental ## What is Scripting? .pull-left[ - Scripting refers to writing a series of instructions or commands that can be executed by a computer. - These instructions are written in a programming language (R, Python, SQL) - We use R programming language for this. - R is scripting language specifically designed for statistical computing and graphics. ] .pull-right[ <img src="images/R Programming.jpg" class="w-100 br4 dib center"> ] --- ## Why use scripting .pull-left[ - **Control:** With scripting, you have precise control over every step of the analysis process. - **Efficiency:** Scripting allows for automation of repetitive tasks, saving time and effort compared to manually clicking through GUI interfaces. - **Speed:** Scripting allows for faster execution of repetitive tasks compared to navigating through multiple GUI menus. - **Customization:** Scripting offers greater flexibility in tailoring analyses to your specific needs. - **Reproducibility:** This is crucial for transparency, peer review, and building upon existing research. - **Scalability:** Efficient scripts can handle large datasets more efficiently than GUI-based tools. ] .pull-right[ <img src="https://imgs.xkcd.com/comics/dependency.png" class="w-70 br1 dib center"> .footnote[Image Sourse: https://xkcd.com/2347/] ] --- ## The shift towards open geospatial Open source refers to software that is designed to be publicly accessible, allowing anyone to see, modify, and distribute the code as they see fit. .pull-left[ - Open Data applies the principles of free and open to geospatial data. - Open Education applies the principles of open source to the creation of teaching materials allowing organizations to share syllabus materials reducing cost and reaching a wider audience. - Open Standards promote inter-operability between applications. - Open standard are a key tool allowing geo-spatial practitioners to work together, with the added benefit of avoiding **technology lock-in**. ] .pull-right[ <img src="images/Open Source.jpg" class="w-80 br2 dib center"> ] --- ## Why to use open sourced tools/software .f3[ - **Free to use:** Open-source software is often available at no cost, saving you money on licensing fees. - **Community support:** You can benefit from a large community of users and developers who provide support, resources, and updates. - **Customization:** Open-source software allows you to modify and adapt the code to fit your specific needs and preferences. - **Interoperability:** It seamlessly integrates with other software and systems, promoting compatibility and collaboration. - **Transparency:** The code is openly accessible, ensuring transparency and trust in the software's functionality and security. ] --- class: .incremental ## Basic data types in GIS .pull-left[ - **Raster data** consists of continuous cells or pixels, such as elevation or satellite imagery. - **Vector data** is the most common type of GIS data. Vector data represents geographic data symbolized as points, lines, or polygons. - **GeoPackage (GPKG)** is an open, non-proprietary, platform-independent and standards-based data format for geographic information systems. - Can store multiple types of geospatial data - Portability. ] .pull-right[ <img src="images/Data.png" class="w-80 br4 dib center"> ] --- class: .incremental ## R package for Geospatial analysis -- - **terra:** Spatial Data Analysis - Offers advanced raster data handling capabilities, including large dataset support and high-performance computation. -- - **raster:** Raster Data Analysis - Offers functions for working with gridded spatial data -- - **sf:** Simple Features for R - A comprehensive package for working with spatial data using modern data structures and methods. -- - **stars:** Spatiotemporal Arrays - Provides a unified framework for handling spatiotemporal raster data, allowing for efficient analysis and visualization. -- - **rgdal:** Bindings for the 'Geospatial' Data Abstraction Library -- - **leaflet:** Interactive Web Maps with the JavaScript 'Leaflet' Library spatial: Functions for Kriging and Point Pattern Analysis - Implements various spatial statistical methods. -- - **tmap:** Thematic Maps - Provides a flexible framework for creating thematic maps. <br /> -- etc... --- ## `terra`: for spatial data analysis This package is an attempt to climb on the shoulders of giants (GDAL, PROJ, GEOS, NCDF, GeographicLib, Rcpp, R). - `terra` provides methods to manipulate geographic (spatial) data in "raster" and "vector" form. <img src="images/terra.png" class="w-80 br4 dib center"> .footnote[https://rspatial.org/ | https://github.com/rspatial/terra] --- ## Adding basic Map in to the slides A javascript library for R.
--- class: .incremental ## Basics of R-Programming .note[ - It’s open-source. No fees or licenses are needed. - It’s platform-independent. - It has lots of packages. - It’s great for statistics. - It’s well suited for Machine Learning. - R lets you perform data wrangling. - R is still growing. ] -- .f3[ - [Getting Started with R Programming]( https://ankitdeshmukh.com/post/2021-09-20-getting-started-with-r/)<br /> - [Hello R Markdown!](https://ankitdeshmukh.com/post/2021-12-01-r-markdown/) ] .f3.blue.b[to the RStudio ...] --- ## Plotting the map .pull-left[ ```r # Plot the output in a map. plot(ahm_b10, type="interval", ylab = "Latitude [m]", xlab = "Longitude [m]", main = "Ahemdabad District") plot(v, border = "#220066", lwd = 1.5, add = TRUE) sbar(50000, xy = "bottomright", type="bar", below="m", label=c(0,25000,50000), cex=0.8) north(type=3, cex=1) ``` ] .pull-right[ <img src="images/Ahm10.png" width="100%" style="display: block; margin: auto;" /> ] --- ## Reading vector data and copping ```r if(!require(sf)){install.packages("sf");library(sf)} raster_data <- rast( "./data/Large Tiff/Landsat-8-B2.TIF") vector_data <- vect(st_read("./Data/Large Tiff/Bounding_Box.gpkg")) cropped_data <- terra::crop(x = raster_data, y = vector_data, mask = TRUE) writeRaster(x = cropped_data, filename = "./data/Cropped_Image.tif") ``` <img src="images/Ahm_Data.png" class="w-50 br4 dib center"> --- ## A raster data file ```r raster_data <- rast(x = "./data/LS_8_Band_2.tif") # reading the file raster_data # Printing ``` ``` ## class : SpatRaster ## dimensions : 750, 695, 1 (nrow, ncol, nlyr) ## resolution : 30, 30 (x, y) ## extent : 250305, 271155, 2551605, 2574105 (xmin, xmax, ymin, ymax) ## coord. ref. : WGS 84 / UTM zone 43N (EPSG:32643) ## source : LS_8_Band_2.tif ## name : LC08_L2SP_148044_20240118_20240124_02_T1_SR_B2 ## min value : 6532 ## max value : 17495 ``` --- ## This is different from the traditional GIS software: <img src="images/Incorrect_Projection.png" class="w-80 br4 dib center"> .b.blue[Images with different projection (EPSG:32642 and EPSG:32643 image.)] --- class: .incremental ## Landsat Bands .pull-left[ ```r rast_fname <- paste0("./data/LS_8_Band_",c(1:7,10),".tif") sample_rast <- rast(rast_fname) names(sample_rast) <- c('ultra-blue', 'blue', 'green', 'red', 'NIR', 'SWIR1', 'SWIR2',"TIR") sample_rast ``` ``` ## class : SpatRaster ## dimensions : 750, 695, 8 (nrow, ncol, nlyr) ## resolution : 30, 30 (x, y) ## extent : 250305, 271155, 2551605, 2574105 (xmin, xmax, ymin, ymax) ## coord. ref. : WGS 84 / UTM zone 43N (EPSG:32643) ## sources : LS_8_Band_1.tif ## LS_8_Band_2.tif ## LS_8_Band_3.tif ## ... and 5 more source(s) ## names : ultra-blue, blue, green, red, NIR, SWIR1, ... ## min values : 6093, 6532, 7920, 7145, 7662, 7455, ... ## max values : 15947, 17495, 20631, 23095, 28933, 54498, ... ``` ] .pull-right[ <img src="images/Landsat_Band.jpg" class="w-80 br1 dib center"> ] --- ## Single band and composite maps ```r par(mfrow = c(2,2)) plot(b2, main = "Blue", col = gray(0:100 / 100)) plot(b3, main = "Green", col = gray(0:100 / 100)) plot(b4, main = "Red", col = gray(0:100 / 100)) plot(b5, main = "NIR", col = gray(0:100 / 100)) ``` <img src="images/Band_GrayScale.png" class="w-60 br4 dib center"> --- ## True color composite with visible RGB band .pull-left[ ```r b2 <- rast(x = "./data/LS_8_Band_2.tif") # Blue b3 <- rast(x = "./data/LS_8_Band_3.tif") # Green b4 <- rast(x = "./data/LS_8_Band_4.tif") # Red landsatRGB <- c(b4, b3, b2) plotRGB(landsatRGB, stretch = "lin") ``` ] .pull-right[ <img src="Slides_files/figure-html/unnamed-chunk-10-1.png" width="100%" /> ] --- ## False color composite with NIR, Red, Blue band .pull-left[ ```r b2 <- rast(x = "./data/LS_8_Band_2.tif") # Blue b4 <- rast(x = "./data/LS_8_Band_4.tif") # Red b5 <- rast(x = "./data/LS_8_Band_5.tif") # NIR landsatFCC <- c(b5, b4, b2) plotRGB(landsatFCC, stretch = "lin") ``` ] .pull-right[ <img src="Slides_files/figure-html/unnamed-chunk-12-1.png" width="100%" /> ] --- ## Relation between bands A scatterplot matrix can be helpful in exploring relationships between raster layers. This can be done with the pairs function. .pull-left[ ```r pairs(sample_rast[[1:2]], main = "UB versus Blue") ``` <img src="Slides_files/figure-html/unnamed-chunk-13-1.png" width="100%" /> ] .pull-right[ ```r pairs(sample_rast[[4:5]], main = "Red versus NIR") ``` <img src="Slides_files/figure-html/unnamed-chunk-14-1.png" width="100%" /> ] --- ## Vegetation indices Vegetation indices (VIs) provide valuable insights into the health, and distribution of vegetation across landscapes. > In the function below, img is a muti-layer SpatRaster object and i and k are the indices of the layers (layer numbers) used to compute the vegetation index. ```r vi <- function(img, k, i) { bk <- img[[k]] bi <- img[[i]] vi <- (bk - bi) / (bk + bi) return(vi) } ``` --- ## Compute NDVI and GNDVI with Landsat-8 bands. .pull-left[ The normalized difference vegetation index (NDVI): a metric for quantifying the health and density of vegetation. ```r ndvi <- vi(sample_rast, 5, 4) plot(ndvi, col=rev(rainbow(10)), main = "NDVI") ``` <img src="Slides_files/figure-html/unnamed-chunk-16-1.png" width="100%" /> ] .pull-right[ Green Normalized Difference Vegetation Index (GNDVI): estimating photo synthetic activity. ```r gndvi <- vi(sample_rast, 5, 3) plot(gndvi, col=rev(terrain.colors(10)), main="GNDVI") ``` <img src="Slides_files/figure-html/unnamed-chunk-17-1.png" width="100%" /> ] --- ## Histogram .pull-left[ - To represent the distribution of values contained within our raster using hist to produces a histogram. - Histograms are often useful in identifying **outliers** and bad data values in our raster data. ```r hist(ndvi, main = "NDVI values", xlab = "NDVI", ylab= "Frequency", col = "#ffaa00", xlim = c(-0.5, 1), breaks = 30, xaxt = "n") axis(side=1, at = seq(-0.6, 1, 0.2), labels = seq(-0.6, 1, 0.2)) ``` ] .pull-right[ <img src="Slides_files/figure-html/unnamed-chunk-19-1.png" width="100%" /> ] --- ## Thresholding .pull-left[ We can apply threshold for values such as NDVI range [0.2 - 0.35] - Note that NDVI values are standardized and ranges between -1 to +1. - Higher values indicate more green cover. ```r classify_mat <- c(-Inf, 0.25, NA, 0.25, 0.3, 1, 0.3, Inf, NA) Ranges <- matrix(classify_mat, ncol = 3, byrow = TRUE) Ranges land_class <- classify(ndvi, Ranges) plot(land_class, main = 'NDVI between [0.2 - 0.35]') ``` ] .pull-right[ <img src="Slides_files/figure-html/unnamed-chunk-21-1.png" width="100%" /> ] --- ## Thresholding based on multiple class with `classify` function .pull-left[ ```r library(RColorBrewer) Classes <- c(-1,0.25, 0.3, 0.4, 0.5, 1) Class_data <- classify(ndvi, Classes) plot(Class_data, col = brewer.pal(n = 5, name = "Dark2"), main = 'NDVI based thresholding') ``` ] .pull-right[ <img src="Slides_files/figure-html/unnamed-chunk-23-1.png" width="100%" /> ] --- ## Clustering of Data Clustering of `NDVI` layer ```r set.seed(123) nr <- as.data.frame(ndvi, cell=TRUE) head(nr) ``` ``` ## cell NIR ## 13 13 0.27633870 ## 14 14 0.17334302 ## 15 15 0.12448494 ## 16 16 0.09909983 ## 17 17 0.11282411 ## 18 18 0.13928502 ``` --- ## K-means clustering of data We create 6 clusters, allow 500 iterations, start with 5 random sets using "Lloyd" method. ```r kmncluster <- kmeans(nr[,-1], centers=6, iter.max = 500, nstart = 5, algorithm="Lloyd") str(kmncluster) ``` ``` ## List of 9 ## $ cluster : int [1:505784] 1 4 2 6 2 2 2 2 2 2 ... ## $ centers : num [1:6, 1] 0.259 0.139 0.431 0.197 0.333 ... ## ..- attr(*, "dimnames")=List of 2 ## .. ..$ : chr [1:6] "1" "2" "3" "4" ... ## .. ..$ : NULL ## $ totss : num 4787 ## $ withinss : num [1:6] 37.5 36.8 33.1 38.1 37.5 ... ## $ tot.withinss: num 254 ## $ betweenss : num 4533 ## $ size : int [1:6] 100008 110257 26403 131142 64238 73736 ## $ iter : int 146 ## $ ifault : NULL ## - attr(*, "class")= chr "kmeans" ``` --- ## Visulize the clusters ```r # Use the ndvi object to set the cluster values to a new raster knr <- rast(ndvi, nlyr=1) knr[nr$cell] <- kmncluster$cluster knr ``` ``` ## class : SpatRaster ## dimensions : 750, 695, 1 (nrow, ncol, nlyr) ## resolution : 30, 30 (x, y) ## extent : 250305, 271155, 2551605, 2574105 (xmin, xmax, ymin, ymax) ## coord. ref. : WGS 84 / UTM zone 43N (EPSG:32643) ## source(s) : memory ## name : NIR ## min value : 1 ## max value : 6 ``` --- ## Visulize the clusters ```r mycolor <- c("#fef64b", "#ff0000", "#daa520", "#0000ff", "#00ff00", "#f400f3") par(mfrow = c(1,2)) plot(ndvi, col = rev(terrain.colors(6)), main = "Landsat-NDVI") plot(knr, main = 'Unsupervised classification', col = mycolor, type="classes") ``` <img src="images/Clustering.png" class="w-70 br4 dib center"> --- ## Acknowledgment Concept and some codes are adopted form "Aniruddha Ghosh and Robert J. Hijmans" ## Reading Material - https://ankitdeshmukh.com/post/ - https://github.com/anixn/Geospatial_Analysis_with_R.git - https://r.geocompx.org/ - https://rspatial.org/