Spatial Data Management For GIS and Data Scientists
Written by Nikos Vaggalis
Friday, 24 November 2023
Videos of the lectures taught in Fall 2023 at the University of Tennessee are now available as a YouTube playlist. They provide a complete overview of the concepts of geospatial science using Google Earth Engine, PostgreSQL/PostGIS, DuckDB, Python and SQL.
The course was taught on campus, but recorded for the rest of us to enjoy for free, by Dr. Qiusheng Wu, an Associate Professor in the Department of Geography & Sustainability at the University of Tennessee. Dr. Wu is also an Amazon Visiting Academic and a Google Developer Expert (GDE) for Earth Engine.
The course targets GIScientists and geographers who want to learn about data science and, the other way around, data scientists who want to work with geographical data, as well as students in those areas.
Geographical data is everywhere nowadays. In its simplest form you'll be familiar with it from Google Maps, mobile applications and social media metadata, while at the more advanced end there's the need to model real-world objects that are location aware. The software industry aside, many traditional businesses have lately started working with this kind of data.
In this course, then, you'll learn how to manage geospatial and big data using Google Earth Engine, PostgreSQL/PostGIS, DuckDB, Python and SQL, which you will use to query, analyze, and manipulate spatial databases effectively. Take note that PostGIS, a geospatial extension to Postgres, is the most popular Postgres extension. From that perspective, the value of a course that explores various techniques for efficiently retrieving and managing spatial data increases many times over.
As such, students who successfully complete the course should be able to:
Know the commonly used vector and raster formats
Understand the basics of Python (e.g., variables, data types, functions, loops, modules)
Use practical tools for data science (e.g., Jupyter notebook, Colab, Anaconda, VS Code)
Explain the Earth Engine data types (e.g., Image, ImageCollection, FeatureCollection)
Visualize local vector and raster datasets interactively in a Jupyter environment
Visualize Earth Engine vector and raster datasets interactively in a Jupyter environment
Perform geospatial analysis with Earth Engine datasets
Export Earth Engine datasets
Create spatial databases with PostgreSQL and PostGIS
Store and query spatial data with PostGIS
Perform spatial analysis with PostGIS
Download spatial data from various sources efficiently
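To give a flavor of what "store and query spatial data" means in practice, here is a minimal sketch of a bounding-box query driven from Python with SQL. It is not taken from the course: it assumes Python's built-in sqlite3 module as a stand-in for a real spatial database, and the table, the helper functions and the sample city coordinates are all hypothetical. A spatial database such as PostGIS would store proper geometry columns and offer dedicated functions (for example ST_Within) instead of plain numeric comparisons.

```python
import sqlite3

# Hypothetical sample data: (name, longitude, latitude) in WGS84 degrees.
CITIES = [
    ("Knoxville", -83.92, 35.96),
    ("Nashville", -86.78, 36.16),
    ("Memphis", -90.05, 35.15),
    ("Atlanta", -84.39, 33.75),
]


def build_database(rows):
    """Create an in-memory table of point locations (illustrative schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cities (name TEXT, lon REAL, lat REAL)")
    conn.executemany("INSERT INTO cities VALUES (?, ?, ?)", rows)
    return conn


def cities_in_bbox(conn, min_lon, min_lat, max_lon, max_lat):
    """Return the names of cities whose point lies inside the bounding box."""
    query = """
        SELECT name FROM cities
        WHERE lon BETWEEN ? AND ?
          AND lat BETWEEN ? AND ?
        ORDER BY name
    """
    cursor = conn.execute(query, (min_lon, max_lon, min_lat, max_lat))
    return [row[0] for row in cursor]


conn = build_database(CITIES)
# Rough bounding box around Tennessee (approximate, for illustration only).
print(cities_in_bbox(conn, -90.4, 34.9, -81.6, 36.7))
```

The same pattern of loading rows, then filtering them by location in SQL, carries over to spatially enabled databases, where the bounding-box comparison would typically be replaced by a geometry predicate backed by a spatial index.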
The tech stack used throughout the course is impressive too. The tools used include:
ArcGIS Pro
QGIS
Miniconda
Google Colab
Visual Studio Code
PyCharm
Google Earth Engine
Geemap
PostgreSQL
The course makes use of that stack from very early on, as the curriculum spanning 13 weeks shows:
Week 1: Course Introduction
Week 1: Spatial Data Models
Week 2: Installing Miniconda and geemap
Week 2: Introducing Visual Studio Code
Week 2: Setting Up Powershell for VS Code
Week 2: Introducing Git and GitHub
Week 3: Python Basics
Week 3: Getting Started with Geemap
Week 4: Using Earth Engine Image
Week 4: Filtering Image Collection
Week 4: Filtering Feature Collection
Week 5: Styling Feature Collection
Week 5: Earth Engine Data Catalog
Week 5: Visualizing Cloud Optimized GeoTIFF (COG)
Week 6: Visualizing STAC and Vector Data
Week 6: Downloading OpenStreetMap Data
Week 6: Visualizing Earth Engine Data
Week 7: Timeseries visualization and zonal statistics
Week 7: Parallel processing with the map function
Week 7: Earth Engine data reduction
Week 8: Creating Cloud-free Imagery with Earth Engine
Week 9: Downloading Earth Engine Images
Week 9: Downloading Earth Engine Image Collections
Week 9: Earth Engine Applications
Week 10: DuckDB for Geospatial
Week 10: Introduction to DuckDB (CLI, Python API, VS Code, DBeaver)
Week 10: DuckDB CLI and SQL Basics
Week 10: Introducing SQL Basics with DuckDB
Week 11: Intro to the DuckDB Python API
Week 11: Importing Spatial Data Into DuckDB
Week 11: Exporting Spatial Data From DuckDB
Week 12: Working with Geometries in DuckDB
Week 13: Analyzing Spatial Relationships with DuckDB
Week 13: Visualizing Geospatial Data in DuckDB with leafmap and lonboard
Of course, 13 weeks was the duration on campus; the rest of us can enjoy the course at our own pace. The videos are also accompanied by an online reference book in HTML format.
Quality-wise, Dr. Qiusheng Wu clearly explains the concepts and showcases the whole process of working with the tools that handle geodata. This means that even if you are not familiar with geoscience, the course is well worth attending for the tech stack it employs, especially the PostgreSQL part. If, on the other hand, you already are a data scientist, then this is a must-do.