Exercise: Organizing Spatial Data
Contents: Overview; The Scenario; Step 1: Create your project workspace; Step 2: Organizing and preparing data in your workspace (Study Area, ASTER Elevation Data, Land cover data, Testing your workspace); Recap / What's next
All spatial analyses require data and often generate a lot of temporary files. You'll benefit from keeping your data organized and from documenting what each of your files is.
In this exercise, we review some best practices for managing spatial data for a project. These aren't hard-and-fast rules: you can still get valid results without following them. But in my years of experience, these tips have saved me gobs of time.
We'll explore these best practices in the context of a fabricated project. Here, I provide you with some datasets in various formats, scales, and projections. Some are well documented; others are not. Your task will be to prepare a workspace for analysis. In subsequent tutorials, we will explore where you might turn when you aren't given the source data but rather have to find them yourself. From there we'll review basic principles of cartography and best practices for presenting your results in map, tabular, and text format.
Congratulations on your new job as GIS specialist for the Malagasy Conservation Group (MCG)! Your first assignment is to assist in planning the route of some newly acquired unmanned aerial vehicles (UAVs or drones) over Masoala National Park in the NE corner of Madagascar. As yet, the specific objective of the analysis is not known, but you are told to prepare the data for whatever may be asked of you.
Your predecessor has left you all the data you need, but not in a very organized fashion and with varying levels of documentation. These data sets can be found here: https://env761.github.io/ConsGIS/DATA/Lab0_Data.zip. Your task will be to organize these files and prepare an ArcGIS Pro workspace so that you are up and running when the drone team comes to visit in the next few days.
The following steps will guide you through this process. At the end of the tutorial you will have a single project folder containing all the data you need for the analysis as well as an ArcGIS Pro project and toolbox with the proper environment settings applied. This project folder can be backed up or moved to different locations while still retaining all the data and formatting required to allow you to jump right back into the analysis.
The first step in any geospatial project should be to create a workspace that will keep your files organized. Geospatial analysis is notorious for making many, many intermediate datasets, themselves made up of multiple files, which can easily clutter up your machine. ArcGIS Pro has some mechanisms for handling this, but we recommend a few additional steps prior to creating your ArcGIS Pro project that will help in keeping your workspace organized. These steps involve creating a folder structure consisting of a project or root folder in which everything else is stored, and within this folder are four subfolders - data, docs, scratch, and scripts - each with a specific purpose. Once that is complete, we'll create a new ArcGIS Pro project in the root folder and our workspace components will be complete. In the end, it will have the following structure:
ProjectFolder A directory on your local machine. Copy this folder to copy the entire project
/data Container for all spatial data and non-spatial data related to the project
/docs Container for descriptive, help or other documentation related to the project
/scratch Location for scratch data
/scripts Container for any geoprocessing scripts used in the project
/*.aprx The ArcGIS Pro project file
/*.tbx The project's toolbox
/readme.txt A brief description of the project
Project folder and sub-folders
Using Windows Explorer, create a project folder (we'll call it Lab0_Masoala for now) and the four subfolders for your Masoala project. Always be sure that no spaces occur in the project folder name or anywhere in the path to this folder. Spaces in file, folder, and path names can cause errors when certain ArcGIS tools are run. Use underscores if you need to (e.g. "My_project", not "My project"), but avoid spaces and other odd characters.
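If you'd rather script the folder setup than click through Windows Explorer, here is a minimal sketch using only Python's standard library. The function names `path_is_safe` and `create_workspace` are made up for this illustration, and the check covers only spaces, not every "odd character".

```python
from pathlib import Path

# The four subfolders described above
SUBFOLDERS = ("data", "docs", "scratch", "scripts")


def path_is_safe(path):
    """Return True if no spaces occur anywhere in the path."""
    return " " not in str(path)


def create_workspace(parent, project_name="Lab0_Masoala"):
    """Create the project folder and its four subfolders; return the project path."""
    project = Path(parent) / project_name
    # Spaces in the path can cause errors in some ArcGIS tools, so stop early.
    if not path_is_safe(project):
        raise ValueError(f"Path contains spaces: {project}")
    for name in SUBFOLDERS:
        (project / name).mkdir(parents=True, exist_ok=True)
    return project
```

Calling `create_workspace` with the parent folder of your choice builds the whole structure in one go, and it is safe to re-run because existing folders are left alone.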
The readme.txt file
Create a new text file named README.txt in your project folder. Use this file to store a few comments about the workspace - enough to briefly explain what the project is about and to differentiate it from other workspaces in case you or someone else revisits this workspace after a long hiatus. Include your name and the date.
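The README can also be generated from a script, which makes it easy to stamp the date automatically. This is only a sketch: the function name `write_readme` and the field layout are my own choices, not a required format.

```python
from datetime import date
from pathlib import Path


def write_readme(project_folder, author, description):
    """Write a short README.txt describing the workspace; return its path."""
    project_folder = Path(project_folder)
    readme = project_folder / "README.txt"
    lines = [
        f"Project: {project_folder.name}",
        f"Author: {author}",
        f"Created: {date.today().isoformat()}",  # today's date, e.g. 2024-01-31
        "",
        description,
    ]
    readme.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return readme
```

A sentence or two in `description` is plenty; the point is that a stranger (or future you) can tell what the workspace is for.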
The ArcGIS Pro project and its components
Open ArcGIS Pro and create a new blank project. Let's name it Masoala and save it in the project folder you just created on your class drive. Be sure the option to create a new folder for the project is NOT checked.
When you do this, ArcGIS Pro creates a new project file (Masoala.aprx), a default toolbox (Masoala.tbx), and a default geodatabase (Masoala.gdb).
Create a scratch geodatabase
In the Scratch folder, create a scratch geodatabase called scratch.gdb. You'll have to do this from within ArcGIS Pro's Catalog pane by right-clicking on the Databases option and selecting New File Geodatabase.
Set geoprocessing environment variables
Finally, you'll want to set your geoprocessing environment variables - at least the workspace variables - for your project. This is done from the Analysis tab, using the Environments button.
Set the Current Workspace to your Data folder.
Set the Scratch Workspace to your Scratch folder.
Depending on other needs of your analysis, you might want to set other environment variables, but this will do for now.
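For those who script their geoprocessing, the same two workspace settings can be expressed in Python. The sketch below assumes the folder names used in this exercise; the helper names `workspace_settings` and `apply_settings` are hypothetical. `arcpy` is imported only inside `apply_settings`, so the first function runs anywhere, while the second needs an ArcGIS Pro Python environment.

```python
from pathlib import Path


def workspace_settings(project_folder):
    """Return the current and scratch workspace paths for a project folder."""
    root = Path(project_folder)
    return {
        "workspace": str(root / "data"),            # Current Workspace
        "scratchWorkspace": str(root / "scratch"),  # Scratch Workspace
    }


def apply_settings(settings):
    """Apply the settings through arcpy (requires ArcGIS Pro's Python)."""
    import arcpy  # imported here so the rest of the sketch runs without ArcGIS

    arcpy.env.workspace = settings["workspace"]
    arcpy.env.scratchWorkspace = settings["scratchWorkspace"]
```

In a standalone script these settings last only for that script's session, so you would still set them in the project itself as described above.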
→ In the end your workspace should look like this. (You may have to refresh your Lab0_Masoala folder in ArcGIS...)
You now should have your workspace all set and should be ready to begin your analysis. Getting in the habit of starting each project by creating a workspace in this format will likely save you a lot of time and headache in the long run. You can view an example of how your workspace should look by expanding the ExampleWorkspace_Masoala.zip file.
With our workspace set, we can organize the data we need to do our analysis. All input data sets should be stored in your Data folder, but you can add subfolders if you wish to further organize your data, e.g. by source, date, type, or whatever.
Often, you will need to preprocess your data sets before you actually do any analysis with them. This can involve uncompressing files, converting formats, defining projections, reprojecting data, etc. While you will definitely want to keep the processed files, it's up to you whether you want to retain the original files in your workspace after pre-processing them for analysis. Usually, if the data sets can be easily obtained again, if necessary, and/or if they consume valuable disk space, I will delete them and just keep the data in the format I need for processing.
The MCG has provided two geospatial data files delimiting the study extent. These are found in the MCGdata folder (zipped as MCG.zip on Sakai). Your first task is to prepare these files and add them to your map. Also in this folder is a README_MCG.txt, which contains information about these files.
Copy/unzip the entire MCG data folder to your workspace data folder.
Create a new map in your project.
Add the ParkBoundary.shp shapefile to your map, then zoom to the layer. Does the feature appear to be in Madagascar, as you'd expect? Nope. Looks like it has a projection issue.
Check the layer's coordinate system (Properties > Source > Spatial Reference).
Add the LabordeGrid.shp file. Better luck with this one?
You've just discovered that these two shapefiles do not have any defined projections. The metadata file indicates they use the "Laborde" projection, but since the files themselves have no defined coordinate system, ArcGIS Pro has no way of knowing this. So, the next step is to define the projection for these files.
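A shapefile stores its coordinate system in a sidecar file with the same base name and a .prj extension, so an undefined projection typically shows up as a missing .prj file. The sketch below scans a folder for such shapefiles; it assumes that is how the problem manifests here, and the function names are made up for illustration.

```python
from pathlib import Path


def has_prj(shapefile):
    """Return True if a .prj sidecar file sits next to the shapefile."""
    return Path(shapefile).with_suffix(".prj").is_file()


def missing_projection(folder):
    """List the shapefiles under a folder that lack a .prj sidecar file."""
    return sorted(p.name for p in Path(folder).rglob("*.shp") if not has_prj(p))
```

Running `missing_projection` on your data folder before and after the next steps is a quick way to confirm that the projection definition was actually written.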
Open the Define Projection tool and add the ParkBoundary shapefile as the input.
Now we need to locate the correct coordinate system to assign to these data sets.
The README_MCG.txt file indicates these data use the "Laborde" coordinate system. Try searching for Laborde in the Spatial Reference Properties box.
Review the details of the coordinate system you find (via the Details link in the Coordinate system window), and cross-reference the values to the ones in the README_MCG.txt file.
In addition to the park boundary and the quadrat grid, the drone team will also need an elevation and a land cover dataset. ASTER elevation data were given to us in their raw downloaded form as 4 zip files, each containing a 1 x 1° tile of 1 arc-second (roughly 30 m) DEM data. In this section, we will uncompress the files, mosaic the tiles into a single dataset, and reproject the data to match the Laborde projection used above.
Copy the ASTER folder from the 761_data folder to your data folder.
Unzip one of the zip files and read its README.pdf file to discover what the _dem.tif and the _num.tif files represent.
Add the two unzipped GeoTIFF files to your map; examine them and their properties.
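Unzipping the ASTER archives by hand works fine for four files, but it can also be scripted. The sketch below uses Python's standard `zipfile` module to extract every archive in a folder and then sorts the resulting tiles by their `_dem.tif` and `_num.tif` suffixes. The function names are hypothetical, and I'm assuming the archives can all be extracted into the same folder.

```python
import zipfile
from pathlib import Path


def unzip_all(folder):
    """Extract every .zip archive in the folder into that same folder."""
    folder = Path(folder)
    for archive in sorted(folder.glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            # Files sharing a name across archives are simply overwritten.
            zf.extractall(folder)


def group_tiles(folder):
    """Return the tile file names grouped by their suffix."""
    folder = Path(folder)
    return {
        "dem": sorted(p.name for p in folder.rglob("*_dem.tif")),
        "num": sorted(p.name for p in folder.rglob("*_num.tif")),
    }
```

The two lists returned by `group_tiles` give you a quick inventory of which tiles you have of each kind before combining them.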
Unzip the remaining zip files. (You can overwrite the README.pdf files as each is identical.) As a preprocessing step, we will merge each set of GeoTIFFs into a single raster dataset, as they'll be easier to manage in the project.
Open the Mosaic To New Raster tool.
Add the _dem.tif tiles as inputs to the tool, and save the output to the scratch folder. Set the pixel depth to the same as the input images. Save the output as ASTER_DEM.img and run the tool.
Repeat with the _num.tif files, saving the output to ASTER_NUM.img.
The remaining step is to reproject these datasets to the Laborde projection. Use the Project Raster tool to do this, saving the outputs to the data folder. Some important aspects to note when using this tool:
For the geographic transformation, use Tananarive_1925_To_WGS_1984_2. (This may appear by default.)
For the resampling technique, the choice depends on the data: for categorical data (e.g. the ASTER_NUM dataset) you should use NEAREST or MAJORITY, while for continuous data (e.g. the ASTER_DEM dataset) you should use BILINEAR. Be sure you understand why.
When the datasets are reprojected, take a look at the output. You'll see that you have elevation data that extends far beyond the study area.
* A quick trick to subset is to set the processing extent environment variable on the geoprocessing tool. (If you don't know how to do this, ask the instructor or one of the TAs...)
* It's your call whether you want to run the Mosaic To New Raster tool on its own or add it to a geoprocessing model in your workspace. I find it useful to add complex preprocessing steps to a toolbox, which both documents the process and provides an easy way to repeat the steps if necessary.
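To see why the resampling choice for Project Raster matters, here is a toy illustration in plain Python (not the tool's actual implementation). It samples a 2 x 2 grid at a point between cell centers using nearest-neighbor and bilinear rules. The grids and values are invented for the example.

```python
def nearest(grid, x, y):
    """Nearest neighbor: take the value of the closest cell center."""
    return grid[round(y)][round(x)]


def bilinear(grid, x, y):
    """Bilinear: distance-weighted average of the four surrounding cells."""
    x0, y0 = int(x), int(y)
    fx, fy = x - x0, y - y0
    top = grid[y0][x0] * (1 - fx) + grid[y0][x0 + 1] * fx
    bottom = grid[y0 + 1][x0] * (1 - fx) + grid[y0 + 1][x0 + 1] * fx
    return top * (1 - fy) + bottom * fy


# Hypothetical grids: category codes (only 1 and 5 are valid) and elevations in meters
codes = [[1, 1], [5, 5]]
elevation = [[100.0, 110.0], [120.0, 130.0]]

# Sample both grids at the same in-between location (valid for 0 <= x, y < 1)
code_nearest = nearest(codes, 0.25, 0.4)
code_bilinear = bilinear(codes, 0.25, 0.4)
elev_bilinear = bilinear(elevation, 0.25, 0.4)
```

Here `code_nearest` is 1, a valid code, while `code_bilinear` comes out at about 2.6, a code that does not exist in the data. For elevation the opposite holds: the bilinear result of about 110.5 m is a sensible in-between value, whereas nearest neighbor would just copy the 100 m of the closest cell and produce a blockier surface.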
We now have ASTER DEM data (and their corresponding quality control rasters) for the extent and in the coordinate system of our analysis. To conserve space, we'll delete the original GeoTIFF files and their compressed counterparts. In the end you should only have 8 files in your folder.
Search for the following: Africa Land Cover owner:consbio
From the results, view the details for "Land cover, Africa and the Arabian Peninsula". Review the details and then right-click and add this dataset to your map. (It may take a few seconds...)
Note that data added this way is stored on the C:/ drive, and thus won't be available to you if you move machines.
At the end of the exercise you have a tidy, efficient workspace ready for analysis. All the data are organized and in a single projection, which will greatly simplify analysis and minimize errors. In later exercises we will explore how you might assemble a dataset when the data are not simply given to you.