ENV 761 (Fay) - Conservation GIS

Optimal Reserve Design with Marxan (v2)

Introduction

We've spent the past few weeks developing landscape attributes that can be used to rank, sort, and prioritize our Mogollon planning units (HUC 12s) for conservation. The next step is to establish a method to combine all these metrics into a decision support system (DSS) that land managers can use to evaluate different land use scenarios (i.e. different prioritization schemes) and identify which set of HUC12s make up the most efficient combination given a specific conservation objective.

The simplest means to do this (especially from a GIS standpoint) is just to sort the HUCs on one or a few of the landscape attributes we've developed. For example, take the top X% of the HUCs in terms of, say, pronghorn habitat area, and then from those, take the top Y% of HUCs in terms of low threat, etc., until you have a predetermined area of [quality] pronghorn habitat or species conserved – or until you have run out of funds.

A second, slightly more complex approach is performing a weighted overlay of the various HUC attributes, much like we did in creating our weighted threat classification a few weeks back. From there, we sort the HUCs on their overall score and select enough HUCS until an objective is met.

These two methods can be quite useful, but they don't necessarily achieve the most parsimonious solution. They both employ a purely "greedy" algorithm meaning that each HUC is selected for its ranking, independent of what's already been selected. It's possible that the two top ranked HUCs may each have the highest richness scores. However, if most of the species included in HUC 2 also occur in HUC 1, you might get a better overall richness score if you chose HUC 1 and a HUC ranked lower than HUC 2 if it included more species that were not included in HUC 1. In short, you may want to look for the set of HUCs that most complement each other.

Using previous selections to inform subsequent ones is known as a heuristic approach, i.e. one that learns as it goes. A greedy heuristic approach does exactly what is mentioned above: finds the single best HUC and then adds another HUC based on how it complements already selected ones. However, this approach also may not achieve the most parsimonious solution. It could be that a starting with a combination of two lower-ranked HUCs rather than the single top ranked HUC leads to a more efficient overall suite of HUCs.

So, it turns out that finding the exact combination of HUCs that ends up with the single most parsimonious suite of HUCs is not that easy! It involves evaluating every unique combination of HUCs. With our 142 HUCs, that would be or roughly 1042 combinations!!

So instead, we use an approach called simulated annealing to poke around for optimal solutions. This approach uses something of a greedy heuristic approach, but uses a "cooling factor" whereby sub-optimal HUCs are occasionally added to the portfolio to explore whether they may ultimately lead to an overall more parsimonious solution. To do this we use Marxan

Marxan is a standalone program that allows the user to identify suites of planning units (i.e. patches or HUCs) that maximizes conservation goals with the least amount of cost. Marxan attempts to find the optimal suite using the simulated annealing process. The process starts with seed HUCs (either randomly selected or user defined), and then adds other HUCs at random, each time evaluating the marginal gain or loss of adding that HUC using an objective function which is based on the cost of adding the HUC and the contribution to meeting a specified target. The HUC is added to the portfolio if its gain exceeds a threshold (). At first, the threshold is set low – allowing more sub-optimal HUCs to be included in the ongoing portfolio. But as more HUCs are added the threshold increases, and thus only HUCs more likely to contribute to an overall solution are added.

Marxan is may not find the single most optimal solution. Instead, it makes several repeat runs from the same inputs and points to the best solution of those runs to be our optimal solution. En route to this, Marxan also produces a listing of how frequently each HUC occurs in the optimized solution from each run (called the "summed solutions"). Both are useful in arriving at a conservation portfolio.

In this lab, we explore how Marxan is applied to our Mogollon datasets. We use ArcGIS to create the Marxan input files and then run Marxan. Both the “best solution” from 100 model iterations and the “summed solution” are displayed geographically using ArcGIS. It is not intended to be an exhaustive review of Marxan and what it can do, but it should give you enough familiarity with the software to explore further and also some idea on how to create the inputs from familiar datasets using ArcGIS.

 


Materials/Data

For this exercise, you'll be given (in the OptimizationLab.zip file). This file includes:


Analysis Steps

Overview

Marxan requires 5 input files to run. These files contain necessary information on the planning units (pu.dat), the conservation targets (spec.dat), the amount of conservation target found in each planning unit (puvsp.dat), the amount of shared or exposed boundaries among planning units (bound.dat), and finally run-time options for Marxan execution (input.dat). (Refer to the Marxan documentation for a full description of these files.)

These files are simply text files, but they adhere to a specific format; deviating from that format will prevent Marxan from running, and often with little clue why it's refusing to run. Fortunately, the ArcMarxan toolkit takes care of setting up a proper Marxan workspace and configures all these files. Still, we need to generate the GIS file from which ArcMarxan creates these files. And we also need to devise models that can process the Marxan Outputs.

And so the workflow include the following:

 

A. Preparation for analysis

You should see a formatted workspace that includes the datasets mentioned above along with a folder named Software. This folder contains the Marxan executable (Marxan_x64.exe), the Inedit.exe program, and two folders: MarxanInputs and MarxanOutputs. The folder also contains a sub-folder of links to the additional Marxan information and downloads as well as the ArcMarxan.pyt Python toolkit modified to work with ArcGIS Pro.

 

B. Creating hexagon planning units

While we'll be using the HUC12s as planning units, it's good to know how to construct tessellated planning units as well. Below are the steps for this;

This generates hexagons for the entire extent of the provided feature class. However, we want to remove those falling completely outside the study area. We can do this by using the Select Features By Location tool to select features intersecting the Mogollon Plateau feature class, but then we should write the selected features to a new feature class so that the remaining features have sequential feature IDs, which will streamline bookkeeping later on.

Lastly, while the internal feature id (FID or OBJECTID field, depending on whether you saved as a shapefile or geodatabase feature class) is sequential, it's never good to rely on those values remaining consistent; ESRI internal IDs can change without notice, so we'll want to save these values to a new field.

 

C. Adding cost and status values to the planning units

We need to assign cost and status to attributes to each planning unit. We'll go back to using our HUC12s, but make a copy of the HUC12 feature class (to have a pristine backup) and use the copy for the following analyses.

 

The cost field refers to the cost of including a given planning unit in the reserve, and is used by Marxan to find the least costly portfolio of units meeting our GAP cover representation objective. Costs can be actual dollar amounts to acquire the unit, if known, but more often a proxy of cost is used. Common proxies are unit area (the larger the unit the more costly to acquire) or perhaps area within a certain ownership class (e.g. private land could be more expensive than public land). However, unit costs can include anything that favors one HUC over another, including opportunity costs such as the cost of not including a HUC that has good metrics.

In our example, however, we'll keep it simple for now and just use the area field: the cost of a HUC will simply be the area of the HUC, in acres.

 

The status field is used to specify any HUCs that we might want to 'lock' into or out of the final solution -- regardless of whether it contributes to the optimal solution. This can be in cases where we know that one or a few HUCs will already be protected and we want to find the other HUCs that will complement these best. We'll begin our analyses setting all values in the status field to zero indicating that none are locked into or out of the final solution.

 

We can seed our Marxan runs to start with a set of planning units by setting the status of those PUs to 1. We'll favor betweenness centrality in our analysis by setting the PUs with the top 10 betweenness centrality scores to have a status of 1. (Alternatively, we could lock those PUs in by setting the value to 2.)

D. Adding species representation values

The next step is to add a column to our PU feature class for each conservation feature we want to include in our reserve portfolio and in that column list the amount of that conservation feature that is represented in the PU. If you had raster datasets of species abundance or density for individual species you could compute these values using zonal statistics and then join the appropriate values to the planning unit attribute table.

In our case, however, we are going to use GAP cover types as proxies for species and design a reserve network that contains representative amounts of each cover type. Therefore, rather than single rasters for each conservation target, we have one raster that includes all conservation target. This actually makes our job easier as we can use the Tabulate Area tool quickly generate a table listing the area of each GAP cover type falling within each planning unit.

Take a quick look at your PU attribute table to ensure everything looks ok.

 

E. Running the ArcMarxan Tools

With the PU's tagged with a cost, status, and conservation target areas, we now have what we need to use the ArcMarxan tools to generate the Marxan inputs.

First we need to create a folder for our Marxan data.

Next, we use the ArcMarxan Create Input File and Folders tool to create the Marxan workspace

Create the Marxan files using the following tools. For each, use your PU file and the input folder created above as the tool inputs. Select the other tool inputs appropriately.

Examine the output files in a text editor.

 

F. Editing the spec.dat file

The default spec.dat does not include proper representation targets for our conservation features. We will edit it in Excel, setting default targets of 30% of the existing extent of each.

 

G. Setting MARXAN run-time parameters

With our input tables set, we're now ready to set the run-time parameters for our MARXAN analysis. The Inedit.exe program (found in the Software folder) is used to create and modify the "input.dat" file that contains the run-time settings read by MARXAN when executed.

  1. Copy the Indedit.exe program into your MARXAN folder (the one containing the "input", "output", and "pu" folders).

  2. Open up this copied Inedit.exe program.

  3. Configure the settings to match the screen captures below. This initial setting will run a purely greedy richness algorithm, i.e. no annealing. Later we'll modify this to use simulated annealing and see if it finds a more cost efficient solution. Note: make sure to set the input and output paths to where your data files are!

 

H. Running Marxan

When Marxan completes, it will have created 5 new files in the MarxanOutput folder. These files, each named after the string you provided in the "Save File Name" entry in the Inedit.exe program above, include:

The Marxan documentation contains full explanations of each of these files, but we'll take a quick look at three of these file and use two of them, the "_best.txt" file and the "_ssoln.txt" files, to display our results in ArcMap.

 

I. Reviewing conservation target representation in the optimal solution

The _mvbest.txt file created by Marxan (in your output folder) lists all the conservation targets (i.e. the GAP cover types) and summarizes how each fared in the best solution found by Marxan.

The last column indicated whether the target was met or not. Were any targets missed?

 

J. Mapping the planning units included the Marxan solutions

Of the 100 iterations Marxan runs, the solution with the lowest objective function score bubbles up as the "best solution". The planning units (i.e. HUCs) included in this solution are listed in the _best.txt output file. This file can be joined with the HUC dataset to map out these HUCs.

Marxan also provides another useful result: the summed solutions files. This file, _ssoln.txt, lists each HUC and the number of times that HUC was included in the solution of a particular iteration. We ran 100 iterations of the HUC selection process, so a value of 99 for a given HUC (for example) in this table indicates that the HUC appeared in all but one of the solutions.

This metric is useful as it identifies HUCs that may not appear in the single "best" solution, but occur frequently in sub-optimal solutions. There must be some reason why this HUC keeps on getting selected, and this allows us to see that.

Use the steps below to map out the HUCs involved in the single best solution as well as to display the frequency that each HUC occurs in the solution set of all Marxan iterations.

You can now display HUC polygons as those occurring in the single best solution of all runs, or labeled by the number of times each appears in the optimal solution in a single run.


Assignment

This assignment will not be a formal lab report. Instead, we will Marxan a few times with our HUCs with different run time settings to explore the different ways in which optimization can be calculated.

A. Complete the following Marxan runs, being sure to rename output so that the files are not overwritten.
B. Examine the results of the above runs and fill out the missing values in the MarxanResults.xls table.

Values for the first three rows of the MarxanResults.xls table (in the Docs folder) can be obtained by looking at the _sum.txt file generated for each run. Open the table in ArcGIS and sort on the Score column, from smallest to largest. The row with the lowest score represents the 'best' solution. Use the values in this row to identify the total cost, the number of planning units, and the number of missing targets in the best solution and fill out the corresponding rows in the MarxanResults.xls table.

Add the _ssoln.txt file to ArcMap. This file, which lists in how many iterations of Marxan analyses did a particular planning unit appear in the solution (max = 100), contains the information you need to figure out the remaining two rows in the MarxanResults.xls table.

C. Create simple maps showing the HUCs in the best solution for Runs 2 and 4.

Include only the Mogollon Plateau boundary and the HUCs in this map. No other cartographic elements are needed; this map will not be scored on elegance, but the reader should be able to easily discern included HUCs from excluded ones (i.e., make included HUCs dark and excluded ones light).

D. Create simple maps showing HUC frequency in the solutions for Runs 2 and 4 (i.e, use the ssoln.txt files).

Set the break values in the symbology to show HUCs that occur in 0, 25, 50, 99, and 100 solutions of the 100 model iterations. Don't worry if some classes have no HUCs or if HUCs are too small to show in your map.

► Submit your two sets of maps, your MarxanResults.xls table, and a brief interpretation of the results listed in the MarxanResults.xls table. Include in your summary the following:

Also, zip up your entire Software folder and submit it into your Sakai Dropbox.