CanoPy technical manual

CanoPy is the Python module for the Georgia Canopy Analysis 2009 project sponsored by the Georgia Forestry Commission (GFC). For further information about this project, please refer to the CanoPy page.

This document outlines the uses and methodology of the functions contained within the CanoPy module. To learn how to use this module, please read the user manual. Also refer to the tutorial.

Authors

Owen Smith
Huidae Cho, Ph.D.

Requirements

ArcGIS Desktop 10.x
ArcPy
Python 2 standard module: os
Feature Analyst™ by the Textron Systems
Automated Feature Extraction (AFE) models trained using Feature Analyst

We are currently planning on developing a fully open source solution without using ArcGIS and Feature Analyst.

canopy_config module

Contained in canopy_config.py are all the data paths that CanoPy functions operate with. Example configuration can be found in canopy_config-example.py and copied into canopy_config.py. This module will be imported by the CanoPy module.

phyregs_layer

Type: str
Layer containing polygon features for all physiographic regions
Required attribute fields:
- NAME (Text)
- PHYSIO_ID (Long)
- AREA (Float)
Example: phyregs_layer = 'Physiographic_Districts_GA'

naipqq_layer

Type: str
Layer containing polygon features for all NAIP quarter quad (QQ) tiles
Required attribute field:
- FileName (Text)
Example: naipqq_layer = 'naip_ga_2009_1m_m4b'

naipqq_phyregs_field

Type: str
New field name for assigning physiographic region IDs to the naipqq_layer
This output text field will be created in the naipqq_layer by canopy.assign_phyregs_to_naipqq(). PHYSIO_IDs from the physregs_layer in which a naipqq_layer polygon is contained are output into this field.
Output format: ,#,#,…,
Example: naipqq_phyregs_field = 'phyregs'

naip_path

Type: str
Input folder in which NAIP imagery is stored

The structure of this input folder is defined by USDA, the original source of NAIP imagery. Under this folder are multiple 5-digit numeric folders that contain actual imagery GeoTIFF files.

F:/Georgia/ga/
              34083/
                    m_3408301_ne_17_1_20090929.tif
                    m_3408301_ne_17_1_20090930.tif
                    ...
              34084/
                    m_3408407_ne_16_1_20090930.tif
                    m_3408407_nw_16_1_20090930.tif
                    ...
              ...

Example: naip_path = 'F:/Georgia/ga'

spatref_wkid

Type: int
Desired output coordinate system in WKID format
Well-Known IDs (WKIDs) are numeric identifiers for coordinate systems administered by Esri. This variable specifies the target spatial reference for output files. The WKID used for the GFC canopy project is 102039 (USA Contiguous Albers Equal Area Conic USGS version).
Example: spatref_wkid = 102039

project_path

Type: str
Folder path with which all other output paths are determined

The default structure of the project folder is defined as follows:

C:/.../ (project_path)
       Data/
            Physiographic_Districts_GA.shp (added as a layer)
       Results/
               gtpoints_Winder_Slope.shp
               ...
       2009 Analysis/ (analysis_path)
                     Data/
                          naip_ga_2009_1m_m4b.shp (added as a layer)
                          snaprast.tif (snaprast_path)
                     Results/ (results_path)
                             Winder_Slope/ (physiographic region name)
                                          Inputs/
                                                 reprojected NAIP tiles
                                          Outputs/
                                                  intermediate output tiles
                                                  canopy_2009_Winder_Slope.tif
                             ...
       2019 Analysis/ (analysis_path)
                     Data/
                          naip_ga_2019_1m_m4b.shp (added as a layer)
                          snaprast.tif (snaprast_path)
                     Results/ (results_path)
                             Winder_Slope/ (physiographic region name)
                                          Inputs/
                                                 reprojected NAIP tiles
                                          Outputs/
                                                  intermediate output tiles
                                                  canopy_2019_Winder_Slope.tif
                             ...
       ...

NOTE: Output folder must be manually created. It is used when running Feature Analyst and is _NOT_ created by CanoPy.
Example: project_path = 'C:/work/Research/GFC Canopy Assessment'

analysis_path_format

Type: str
Format of the analysis path for one year
Example: analysis_path_format = '%s/%%d Analysis' % project_path

analysis_year

Type: int
Year for analysis
Example: analysis_year = 2009

snaprast_path

Type: str
Snap raster to which all output tiles will be snapped
This input/output raster is used to snap NAIP tiles to a consistent grid system. If this file does not already exist, the filename part of snaprast_path must be r + the filename of an existing original NAIP tile so that canopy.reproject_naip_tiles() can automatically create it based on the folder structure of the NAIP imagery data (naip_path).
Example: snaprast_path = '%s/Data/rm_3408504_nw_16_1_20090824.tif' % analysis_path

results_path

Type: str
Folder where all results will be stored
Example: results_path = '%s/Results' % analysis_path

canopy module

All functions designed for preproccessing NAIP imagery and for postprocessing trained/classified canopy tiles in addition to utility functions are contained within canopy.py.

NOTE: The physregs_layer and naipqq_layer must be added to an ArcMap or ArcGIS Pro dataframe for CanoPy functions to run.

assign_phyregs_to_naipqq()

This function adds the phyregs field to the NAIP QQ shapefile and populates it with physiographic region IDs that intersect each NAIP tile.

Arguments: None

Config variables assigned with canopy_config:

phyregs_layer = canopy_config.phyregs_layer
naipqq_layer = canopy_config.naipqq_layer
naipqq_phyregs_field = canopy_config.naipqq_phyregs_field

Process:

The data fields of the input NAIP QQ shapefile are read using arcpy.ListFields and a new text field titled naip_phyregs_field is added. If the field already exists, it is deleted and a new field is created.
Using arcpy.CalculateField_managment, a comma (,) is inserted into the newly created naip_phyregs_field. This becomes important as the format for the naip_phyregs_field must be ,#,#,…, to allow for SQL statments in following functions to be able to read the naip_phyregs_field properly. The SQL selections will allow for the right NAIP tiles to be computed as the NAIP QQ shapedfile has a corresponding field for ile names.
All selections are cleared and each NAIP QQ polygon will contain the naip_phyregs_field filled with the IDs of physiographic regions that the QQ tile intersects.

reproject_naip_tiles(phyreg_ids)

This function reprojects and snaps the NAIP tiles that intersect selected physiographic regions.

Arguments:

phyreg_ids (list of int): IDs of physiographic regions to process

Config variables assigned with canopy_config:

phyregs_layer = canopy_config.phyregs_layer
naipqq_layer = canopy_config.naipqq_layer
naipqq_phyregs_field = canopy_config.naipqq_phyregs_field
spatref_wkid = canopy_config.spatref_wkid
snaprast_path = canopy_config.snaprast_path
naip_path = canopy_config.naip_path
results_path = canopy_config.results_path

Process:

The spatial reference desired is set using the WKID specified in the canopy_config using arcpy.SpatialReference which reads the WKID.
If the snap raster does not exist within the snaprast_path, it is created and arcpy.env.snapRaster is used to set all output cell alignments to match the snap.
All NAIP tiles intersecting the input phyreg_ids are selected using an SQL clause to select the phyreg_ids.
The FileName field from each selected NAIP QQ polygon is read.
Using arcpy.ProjectRaster_managment, the selected NAIP are reprojected to the specified WKID and saved as outputs and the prefix r (reprojected) is added to the filename.
The outputs of this function are saved in an inputs folder and are what will used by Textron's Feature Analysis.

convert_afe_to_final_tiles(phyreg_ids)

This function converts Textron's Feature Analyst classified outputs to final GeoTIFF files

Arguments:

phyreg_ids (list of int): IDs of physiographic regions to process

Config variables assigned with canopy_config:

phyregs_layer = canopy_config.phyregs_layer
naipqq_layer = canopy_config.naipqq_layer
naipqq_phyregs_field = canopy_config.naipqq_phyregs_field
snaprast_path = canopy_config.snaprast_path
results_path = canopy_config.results_path

Process:

All NAIP tiles in the desired physiographic region are first selected using an SQL statement to select the input physiographic IDs.
The filenames from the NAIP QQ shapefile with the reprojected prefix are used to as the outputs folder created to save the classified imagery is walked through.
Conversion is necessary as some AFE models used in feature analysis output GeoTIFF files and some output shapefiles.
1. If the file is a shapefile, it is converted to raster with classes 1 and 0.
2. If the file is a GeoTIFF file, the values are reclassified from 1 to 0 and 2 to 1.
3. If the file has already run through this function and has the appropriate prefix, nothing happens to it.
Outputs are saved in the outputs folder with the prefix fr (final reprojected).

clip_final_tiles(phyreg_ids)

This function clips final tiles to their respective NAIP QQ area to eliminate overlap.

Arguments:

phyreg_ids (list of int): IDs of physiographic regions to process

Config variables assigned with canopy_config:

phyregs_layer = canopy_config.phyregs_layer
naipqq_layer = canopy_config.naipqq_layer
naipqq_phyregs_field = canopy_config.naipqq_phyregs_field
snaprast_path = canopy_config.snaprast_path
results_path = canopy_config.results_path

Process:

First, the OID field of the entire NAIP QQ shapefile is encoded.
All NAIP tiles in the desired physiographic region are first selected using an SQL statement to select the input physiographic IDs.
The output files from canopy.convert_afe_to_final_tiles are looped over and, using the corresponding OID field, are then clipped to their respective NAIP QQ polygons.
If the tile has already been clipped and has the appropriate prefix, it will be skipped. If not, the tile will be clipped and the output GeoTIFF will have the prefix cfr (clipped final reprojected).

mosaic_clipped_final_tiles(phyreg_ids)

This function mosaics clipped final GeoTIFF and then clips the mosaicked files to their corresponding physiographic regions

Arguments:

phyreg_ids (list of int): IDs of physiographic regions to process

Config variables assigned with canopy_config:

phyregs_layer = canopy_config.phyregs_layer
naipqq_layer = canopy_config.naipqq_layer
naipqq_phyregs_field = canopy_config.naipqq_phyregs_field
analysis_year = canopy_config.analysis_year
snaprast_path = canopy_config.snaprast_path
results_path = canopy_config.results_path

Process:

All NAIP tiles in the input physiographic regions are first selected using an SQL statement to select the input physiographic IDs.
If the mosaicked file with the analysis year set by the canopy_config file exists, the function ends. If no mosaiked layer with the analysis year exists, the process continues.
Input tiles to be mosiacked are products from canopy.clip_final_tiles with the prefix cfr.
Mosiacking occurs using arcpy.MosaicToNewRaster to create the output raster as a new 2 bit GeoTIFF file.
The new mosaiked data set is clipped to the outline of the physiographic region with the corresponding physiographic ID.

convert_afe_to_canopy_tif(phyreg_ids)

This function is a wrapper function that converts AFE outputs to the final canopy GeoTIFF file by invoking convert_afe_to_final_tiles(), clip_final_tiles(), and mosaic_clipped_final_tiles() in the correct order.

Arguments:

phyreg_ids (list of int): IDs of physiographic regions to process

Config variables assigned with canopy_config:

phyregs_layer = canopy_config.phyregs_layer
naipqq_layer = canopy_config.naipqq_layer
naipqq_phyregs_field = canopy_config.naipqq_phyregs_field
analysis_year = canopy_config.analysis_year
snaprast_path = canopy_config.snaprast_path
results_path = canopy_config.results_path

generate_gtpoints(phyreg_ids, point_density, max_points=400, min_points=200)

This function generates randomized points for ground truthing with fields for corresponding years populated with the corresponding canopy value at each point.

Arguments:

phyreg_ids (list of int): IDs of physiographic regions to process
point_density (int): Number of randomly generated points per square kilometer
max_points (int, default=400): Maximum number of points generated per region
min_points (int, default=200): Minimum number of points generated per region

Config variables assigned with canopy_config:

phyregs_layer = canopy_config.phyregs_layer
naipqq_layer = canopy_config.naipqq_layer
spatref_wkid = canopy_config.spatref_wkid
analysis_year = canopy_config.analysis_year
results_path = canopy_config.results_path

Process:

The physiographic regions are selected using the input physiographic IDs.
Random points in each region are created using arcpy.CreateRandomPoints.
Fields are created in each point shapefile with the field name of GT (e.g., GT).
The attributes of the naip_layer are spatially joined with the created random points shapefile and saved as a new point layer.
A new spatially joined point shapefile allows for the file names of each point's corresponding classified NAIP QQ to be read and converted to a NumPy array. The conversion of each NAIP QQ to a NumPy array allows the function to handle the memory requirements of the regions.
Each point is converted to corresponding rows and columns within the corresponding NAIP QQ using calculate_row_column().
The value of the NumPy cell at each point's row and column is then added to the GT attribute field of the spatially joined shapefile.
After all point values are read, all fields except FID, Shape, and GT are deleted in the spatially joined shapefile.
The originally generated point shapefile is deleted.

software

CLAWRIM Wiki

Table of Contents

CanoPy technical manual

Authors

Requirements

canopy_config module

phyregs_layer

naipqq_layer

naipqq_phyregs_field

naip_path

spatref_wkid

project_path

analysis_path_format

analysis_year

snaprast_path

results_path

canopy module

assign_phyregs_to_naipqq()

reproject_naip_tiles(phyreg_ids)

convert_afe_to_final_tiles(phyreg_ids)

clip_final_tiles(phyreg_ids)

mosaic_clipped_final_tiles(phyreg_ids)

convert_afe_to_canopy_tif(phyreg_ids)

generate_gtpoints(phyreg_ids, point_density, max_points=400, min_points=200)