CanoPy technical manual
CanoPy is the Python module for the Georgia Canopy Analysis 2009 project sponsored by the Georgia Forestry Commission (GFC). For further information about this project, please refer to the CanoPy page.
This document outlines the uses and methodology of the functions contained within the CanoPy module. To learn how to use this module, please read the user manual. Also refer to the tutorial.
Authors
- Huidae Cho, Ph.D.
Requirements
- ArcGIS Desktop 10.x
- ArcPy
- Python 2 standard module: os
- Feature Analyst™ by Textron Systems
- Automated Feature Extraction (AFE) models trained using Feature Analyst
We are currently planning to develop a fully open source solution that does not require ArcGIS or Feature Analyst.
canopy_config module
canopy_config.py contains all the data paths that CanoPy functions operate on. An example configuration can be found in canopy_config-example.py and copied into canopy_config.py. This module is imported by the CanoPy module.
phyregs_layer
- Type: str
- Layer containing polygon features for all physiographic regions
- Required attribute fields: NAME (Text), PHYSIO_ID (Long), AREA (Float)
- Example: phyregs_layer = 'Physiographic_Districts_GA'
naipqq_layer
- Type: str
- Layer containing polygon features for all NAIP quarter quad (QQ) tiles
- Required attribute field: FileName (Text)
- Example: naipqq_layer = 'naip_ga_2009_1m_m4b'
naipqq_phyregs_field
- Type: str
- New field name for assigning physiographic region IDs to the naipqq_layer
- This output text field will be created in the naipqq_layer by canopy.assign_phyregs_to_naipqq(). The PHYSIO_IDs from the phyregs_layer in which a naipqq_layer polygon is contained are written to this field.
- Output format: ,#,#,…,
- Example: naipqq_phyregs_field = 'phyregs'
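The output format above can be illustrated with a minimal sketch. The helper name format_phyreg_ids is hypothetical and not part of CanoPy; it only shows how a list of IDs maps to a string with a leading and a trailing comma.

```python
def format_phyreg_ids(phyreg_ids):
    """Join IDs into the ,#,#,..., form (hypothetical helper, not in CanoPy).

    The string always starts with a comma and every ID is followed by one,
    so each ID ends up delimited by commas on both sides.
    """
    return ',' + ''.join('%d,' % phyreg_id for phyreg_id in phyreg_ids)


print(format_phyreg_ids([3, 12]))
```

With this layout, a tile that intersects no region would hold a single comma, which is also the value the field starts with in assign_phyregs_to_naipqq().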
naip_path
- Type: str
- Input folder in which NAIP imagery is stored
- The structure of this input folder is defined by USDA, the original source of NAIP imagery. Under this folder are multiple 5-digit numeric folders that contain the actual GeoTIFF imagery files.
  F:/Georgia/ga/
    34083/
      m_3408301_ne_17_1_20090929.tif
      m_3408301_ne_17_1_20090930.tif
      ...
    34084/
      m_3408407_ne_16_1_20090930.tif
      m_3408407_nw_16_1_20090930.tif
      ...
    ...
- Example: naip_path = 'F:/Georgia/ga'
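Given this folder layout, the location of a tile can be derived from its file name alone. The sketch below assumes that the 5-digit folder name is the first five digits of the quad ID in the file name, as in the listing above; naip_tile_path is a hypothetical helper, not a CanoPy function.

```python
def naip_tile_path(naip_path, filename):
    """Build the full path of an original NAIP tile (hypothetical helper).

    Assumes file names like m_3408301_ne_17_1_20090929.tif, where the
    second underscore-separated part is the quad ID and its first five
    digits name the folder that holds the tile.
    """
    quad_id = filename.split('_')[1]  # e.g. '3408301'
    folder = quad_id[:5]              # e.g. '34083'
    return '%s/%s/%s' % (naip_path, folder, filename)


print(naip_tile_path('F:/Georgia/ga', 'm_3408301_ne_17_1_20090929.tif'))
```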
spatref_wkid
- Type: int
- Desired output coordinate system in WKID format
- Well-Known IDs (WKIDs) are numeric identifiers for coordinate systems administered by Esri. This variable specifies the target spatial reference for output files. The WKID used for the GFC canopy project is 102039 (USA Contiguous Albers Equal Area Conic USGS version).
- Example: spatref_wkid = 102039
project_path
- Type: str
- Folder path from which all other output paths are derived
- The default structure of the project folder is defined as follows:
  C:/.../ (project_path)
    Data/
      Physiographic_Districts_GA.shp (added as a layer)
    Results/
      gtpoints_Winder_Slope.shp
      ...
    2009 Analysis/ (analysis_path)
      Data/
        naip_ga_2009_1m_m4b.shp (added as a layer)
        snaprast.tif (snaprast_path)
      Results/ (results_path)
        Winder_Slope/ (physiographic region name)
          Inputs/
            reprojected NAIP tiles
          Outputs/
            intermediate output tiles
            canopy_2009_Winder_Slope.tif
        ...
    2019 Analysis/ (analysis_path)
      Data/
        naip_ga_2019_1m_m4b.shp (added as a layer)
        snaprast.tif (snaprast_path)
      Results/ (results_path)
        Winder_Slope/ (physiographic region name)
          Inputs/
            reprojected NAIP tiles
          Outputs/
            intermediate output tiles
            canopy_2019_Winder_Slope.tif
        ...
    ...
- NOTE: The Outputs folder must be created manually. It is used when running Feature Analyst and is NOT created by CanoPy.
- Example: project_path = 'C:/work/Research/GFC Canopy Assessment'
analysis_path_format
- Type: str
- Format of the analysis path for one year
- Example: analysis_path_format = '%s/%%d Analysis' % project_path
analysis_year
- Type: int
- Year for analysis
- Example: analysis_year = 2009
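The doubled percent sign in analysis_path_format means the string is formatted in two stages. The sketch below uses the example values from this manual. The second step, which fills in analysis_year to obtain analysis_path, is an assumption based on the project folder structure rather than something this manual states.

```python
project_path = 'C:/work/Research/GFC Canopy Assessment'
analysis_year = 2009

# First substitution: %s is replaced by project_path and %%d collapses
# to a literal %d placeholder that is still waiting for a year.
analysis_path_format = '%s/%%d Analysis' % project_path
print(analysis_path_format)
# C:/work/Research/GFC Canopy Assessment/%d Analysis

# Second substitution (assumed): fill in the year to get the analysis path.
analysis_path = analysis_path_format % analysis_year
print(analysis_path)
# C:/work/Research/GFC Canopy Assessment/2009 Analysis
```

Keeping the year as a placeholder lets the same format string serve several analysis years, such as the 2009 and 2019 folders shown in the project structure.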
snaprast_path
- Type: str
- Snap raster to which all output tiles will be snapped
- This input/output raster is used to snap NAIP tiles to a consistent grid system. If this file does not already exist, the filename part of snaprast_path must be r + the filename of an existing original NAIP tile so that canopy.reproject_naip_tiles() can automatically create it based on the folder structure of the NAIP imagery data (naip_path).
- Example: snaprast_path = '%s/Data/rm_3408504_nw_16_1_20090824.tif' % analysis_path
results_path
- Type: str
- Folder where all results will be stored
- Example: results_path = '%s/Results' % analysis_path
canopy module
canopy.py contains all functions for preprocessing NAIP imagery and postprocessing trained/classified canopy tiles, as well as utility functions.
NOTE: The phyregs_layer and naipqq_layer must be added to an ArcMap or ArcGIS Pro dataframe for CanoPy functions to run.
assign_phyregs_to_naipqq()
This function adds the phyregs field to the NAIP QQ shapefile and populates it with the IDs of the physiographic regions that intersect each NAIP tile.
Arguments: None
Config variables assigned with canopy_config:
phyregs_layer = canopy_config.phyregs_layer
naipqq_layer = canopy_config.naipqq_layer
naipqq_phyregs_field = canopy_config.naipqq_phyregs_field
Process:
- The data fields of the input NAIP QQ shapefile are read using arcpy.ListFields and a new text field named by naipqq_phyregs_field is added. If the field already exists, it is deleted and a new field is created.
- Using arcpy.CalculateField_management, a comma (,) is inserted into the newly created naipqq_phyregs_field. This is important because the format of the naipqq_phyregs_field must be ,#,#,…, so that SQL statements in the following functions can read the naipqq_phyregs_field properly. The SQL selections allow the right NAIP tiles to be processed because the NAIP QQ shapefile has a corresponding field for file names.
- All selections are cleared and each NAIP QQ polygon will contain the naipqq_phyregs_field filled with the IDs of the physiographic regions that the QQ tile intersects.
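To see why the surrounding commas matter, consider how a later selection might look up one region ID. The sketch below is an illustration only: the exact SQL used by CanoPy is not shown in this manual, so the LIKE clause and both helper names are assumptions.

```python
def phyreg_where_clause(field, phyreg_id):
    """Build a LIKE clause matching one ID in a ,#,#,..., field (assumed form)."""
    return "%s LIKE '%%,%d,%%'" % (field, phyreg_id)


def contains_id(field_value, phyreg_id):
    """Pure-Python equivalent of the LIKE test, for illustration only."""
    return ',%d,' % phyreg_id in field_value


print(phyreg_where_clause('phyregs', 3))

# With comma delimiters, ID 3 is not confused with 13 or 23.
print(contains_id(',13,23,', 3))   # False
print(contains_id(',3,13,', 3))    # True

# Without them, a plain substring test would match the wrong tiles.
print('3' in '13,23')              # True, which would be a false match
```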
reproject_naip_tiles(phyreg_ids)
This function reprojects and snaps the NAIP tiles that intersect selected physiographic regions.
Arguments:
- phyreg_ids (list of int): IDs of physiographic regions to process
Config variables assigned with canopy_config:
phyregs_layer = canopy_config.phyregs_layer
naipqq_layer = canopy_config.naipqq_layer
naipqq_phyregs_field = canopy_config.naipqq_phyregs_field
spatref_wkid = canopy_config.spatref_wkid
snaprast_path = canopy_config.snaprast_path
naip_path = canopy_config.naip_path
results_path = canopy_config.results_path
Process:
- The desired spatial reference is created with arcpy.SpatialReference from the WKID specified in canopy_config.
- If the snap raster does not exist at snaprast_path, it is created, and arcpy.env.snapRaster is used to set all output cell alignments to match the snap raster.
- All NAIP tiles intersecting the input phyreg_ids are selected using an SQL clause on the phyreg_ids.
- The FileName field from each selected NAIP QQ polygon is read.
- Using arcpy.ProjectRaster_management, the selected NAIP tiles are reprojected to the specified WKID and saved as outputs with the prefix r (reprojected) added to the filename.
- The outputs of this function are saved in an Inputs folder and are what will be used by Textron's Feature Analyst.
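The naming of the reprojected outputs can be sketched as follows. The helper reprojected_tile_path is hypothetical; the Inputs folder and the r prefix come from this manual, while replacing spaces in the region name with underscores is an assumption based on folder names such as Winder_Slope.

```python
def reprojected_tile_path(results_path, phyreg_name, filename):
    """Return the output path of a reprojected NAIP tile (hypothetical helper)."""
    # Assumption: the region folder is the region name with underscores.
    folder = phyreg_name.replace(' ', '_')
    # The r prefix marks the tile as reprojected.
    return '%s/%s/Inputs/r%s' % (results_path, folder, filename)


print(reprojected_tile_path('C:/work/2009 Analysis/Results',
                            'Winder Slope',
                            'm_3408301_ne_17_1_20090929.tif'))
```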
convert_afe_to_final_tiles(phyreg_ids)
This function converts Textron's Feature Analyst classified outputs to final GeoTIFF files.
Arguments:
- phyreg_ids (list of int): IDs of physiographic regions to process
Config variables assigned with canopy_config:
phyregs_layer = canopy_config.phyregs_layer
naipqq_layer = canopy_config.naipqq_layer
naipqq_phyregs_field = canopy_config.naipqq_phyregs_field
snaprast_path = canopy_config.snaprast_path
results_path = canopy_config.results_path
Process:
- All NAIP tiles in the desired physiographic regions are first selected using an SQL statement on the input physiographic region IDs.
- The file names from the NAIP QQ shapefile, with the reprojected prefix added, are used to find the classified files as the Outputs folder that holds the classified imagery is walked through.
- Conversion is necessary because some AFE models used in Feature Analyst output GeoTIFF files and some output shapefiles.
- If the file is a shapefile, it is converted to a raster with classes 1 and 0.
- If the file is a GeoTIFF file, the values are reclassified from 1 to 0 and from 2 to 1.
- If the file has already been run through this function and has the appropriate prefix, nothing happens to it.
- Outputs are saved in the Outputs folder with the prefix fr (final reprojected).
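The GeoTIFF reclassification step amounts to a simple value mapping. The sketch below applies it to a small nested list instead of a real raster so that it runs without ArcPy; keeping values other than 1 and 2 unchanged is an assumption, and reclassify is a hypothetical name.

```python
# Mapping described above: AFE class 1 becomes 0 and class 2 becomes 1.
RECLASS = {1: 0, 2: 1}


def reclassify(rows):
    """Return a new grid with AFE classes remapped (hypothetical helper).

    Values that are not in RECLASS are kept as they are (assumption).
    """
    return [[RECLASS.get(value, value) for value in row] for row in rows]


afe_tile = [[1, 2, 2],
            [1, 1, 2]]
final_tile = reclassify(afe_tile)
print(final_tile)
```

After this step, tiles from GeoTIFF models and tiles from shapefile models share the same two classes, 1 and 0.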
clip_final_tiles(phyreg_ids)
This function clips final tiles to their respective NAIP QQ area to eliminate overlap.
Arguments:
- phyreg_ids (list of int): IDs of physiographic regions to process
Config variables assigned with canopy_config:
phyregs_layer = canopy_config.phyregs_layer
naipqq_layer = canopy_config.naipqq_layer
naipqq_phyregs_field = canopy_config.naipqq_phyregs_field
snaprast_path = canopy_config.snaprast_path
results_path = canopy_config.results_path
Process:
- First, the OID field of the entire NAIP QQ shapefile is encoded.
- All NAIP tiles in the desired physiographic regions are then selected using an SQL statement on the input physiographic region IDs.
- The output files from canopy.convert_afe_to_final_tiles are looped over and, using the corresponding OID field, are clipped to their respective NAIP QQ polygons.
- If the tile has already been clipped and has the appropriate prefix, it is skipped. If not, the tile is clipped and the output GeoTIFF is given the prefix cfr (clipped final reprojected).
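The skip logic in the last step can be sketched with plain file names. This is an illustration under assumptions: tiles_to_clip is a hypothetical helper, and checking for an existing cfr file by name is only one way the "already clipped" test could be done.

```python
def tiles_to_clip(final_tiles, existing_files):
    """Return (input, output) name pairs that still need clipping.

    final_tiles: file names found in the Outputs folder.
    existing_files: set of file names that already exist there.
    """
    pending = []
    for name in final_tiles:
        if not name.startswith('fr'):
            continue  # not an output of convert_afe_to_final_tiles()
        out_name = 'c' + name  # fr... becomes cfr...
        if out_name in existing_files:
            continue  # already clipped, so skip it
        pending.append((name, out_name))
    return pending


found = ['frm_3408301_ne.tif', 'frm_3408301_nw.tif', 'cfrm_3408301_ne.tif']
print(tiles_to_clip(found, set(found)))
```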
mosaic_clipped_final_tiles(phyreg_ids)
This function mosaics clipped final GeoTIFF tiles and then clips the mosaicked files to their corresponding physiographic regions.
Arguments:
- phyreg_ids (list of int): IDs of physiographic regions to process
Config variables assigned with canopy_config:
phyregs_layer = canopy_config.phyregs_layer
naipqq_layer = canopy_config.naipqq_layer
naipqq_phyregs_field = canopy_config.naipqq_phyregs_field
analysis_year = canopy_config.analysis_year
snaprast_path = canopy_config.snaprast_path
results_path = canopy_config.results_path
Process:
- All NAIP tiles in the input physiographic regions are first selected using an SQL statement on the input physiographic region IDs.
- If the mosaicked file for the analysis year set in canopy_config already exists, the function ends. If no mosaicked layer for the analysis year exists, the process continues.
- Input tiles to be mosaicked are products of canopy.clip_final_tiles with the prefix cfr.
- Mosaicking is done using arcpy.MosaicToNewRaster_management to create the output raster as a new 2-bit GeoTIFF file.
- The new mosaicked dataset is clipped to the outline of the physiographic region with the corresponding physiographic ID.
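The early exit in the second step depends on the name of the mosaicked file. The sketch below follows the canopy_2009_Winder_Slope.tif pattern from the project folder structure; both helper names are hypothetical, and the underscore replacement in the region name is an assumption.

```python
def canopy_mosaic_name(analysis_year, phyreg_name):
    """Return the mosaic file name, e.g. canopy_2009_Winder_Slope.tif."""
    # Assumption: spaces in the region name become underscores.
    return 'canopy_%d_%s.tif' % (analysis_year, phyreg_name.replace(' ', '_'))


def needs_mosaic(analysis_year, phyreg_name, existing_files):
    """Return True when no mosaic for this year and region exists yet."""
    return canopy_mosaic_name(analysis_year, phyreg_name) not in existing_files


existing = {'canopy_2009_Winder_Slope.tif'}
print(needs_mosaic(2009, 'Winder Slope', existing))  # False
print(needs_mosaic(2019, 'Winder Slope', existing))  # True
```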
convert_afe_to_canopy_tif(phyreg_ids)
This function is a wrapper function that converts AFE outputs to the final canopy GeoTIFF file by invoking convert_afe_to_final_tiles(), clip_final_tiles(), and mosaic_clipped_final_tiles() in the correct order.
Arguments:
- phyreg_ids (list of int): IDs of physiographic regions to process
Config variables assigned with canopy_config:
phyregs_layer = canopy_config.phyregs_layer
naipqq_layer = canopy_config.naipqq_layer
naipqq_phyregs_field = canopy_config.naipqq_phyregs_field
analysis_year = canopy_config.analysis_year
snaprast_path = canopy_config.snaprast_path
results_path = canopy_config.results_path
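The wrapper pattern can be sketched without ArcPy by passing the three steps in as callables. Here run_postprocessing and the recording stubs are hypothetical stand-ins; the real wrapper calls the canopy functions directly.

```python
def run_postprocessing(phyreg_ids, steps):
    """Run each postprocessing step in order on the same region IDs."""
    for step in steps:
        step(phyreg_ids)


# Stand-in steps that only record the order in which they are called.
calls = []


def make_stub(name):
    def stub(phyreg_ids):
        calls.append((name, list(phyreg_ids)))
    return stub


steps = [make_stub('convert_afe_to_final_tiles'),
         make_stub('clip_final_tiles'),
         make_stub('mosaic_clipped_final_tiles')]

run_postprocessing([8, 9], steps)
print([name for name, _ in calls])
```

The order matters because each step consumes the files written by the previous one: fr tiles are clipped into cfr tiles, which are then mosaicked.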
generate_gtpoints(phyreg_ids, point_density, max_points=400, min_points=200)
This function generates randomized points for ground truthing with fields for corresponding years populated with the corresponding canopy value at each point.
Arguments:
- phyreg_ids (list of int): IDs of physiographic regions to process
- point_density (int): Number of randomly generated points per square kilometer
- max_points (int, default=400): Maximum number of points generated per region
- min_points (int, default=200): Minimum number of points generated per region
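One plausible reading of how these arguments interact is that the density-based count is clamped to the given limits. The sketch below shows that reading only; this manual does not spell out the exact calculation, and point_count is a hypothetical helper.

```python
def point_count(area_km2, point_density, max_points=400, min_points=200):
    """Return the number of points for one region (assumed calculation).

    The count implied by the density is limited to the range
    [min_points, max_points].
    """
    count = int(round(area_km2 * point_density))
    return max(min_points, min(max_points, count))


print(point_count(50, 1))    # small region: raised to min_points
print(point_count(300, 1))   # within the limits: density-based count
print(point_count(5000, 1))  # large region: capped at max_points
```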
Config variables assigned with canopy_config:
phyregs_layer = canopy_config.phyregs_layer
naipqq_layer = canopy_config.naipqq_layer
spatref_wkid = canopy_config.spatref_wkid
analysis_year = canopy_config.analysis_year
results_path = canopy_config.results_path
Process:
- The physiographic regions are selected using the input physiographic IDs.
- Random points in each region are created using arcpy.CreateRandomPoints_management.
- Fields are created in each point shapefile with the field name of GT (e.g., GT).
- The attributes of the naipqq_layer are spatially joined with the created random points shapefile and saved as a new point layer.
- The spatially joined point shapefile allows the file name of each point's corresponding classified NAIP QQ tile to be read, and the tile is then converted to a NumPy array. Converting each NAIP QQ tile to a NumPy array allows the function to handle the memory requirements of the regions.
- Each point is converted to the corresponding row and column within its NAIP QQ tile using calculate_row_column().
- The value of the NumPy array cell at each point's row and column is then written to the GT attribute field of the spatially joined shapefile.
- After all point values are read, all fields except FID, Shape, and GT are deleted in the spatially joined shapefile.
- The originally generated point shapefile is deleted.
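The row and column lookup can be sketched as follows. The signature of calculate_row_column() is not documented here, so xy_to_row_column is a hypothetical stand-in that assumes a north-up raster with square cells whose origin is the upper-left corner. A nested list replaces the NumPy array so the example runs with the standard library only.

```python
def xy_to_row_column(x, y, x_min, y_max, cell_size):
    """Convert map coordinates to (row, column) indices (assumed approach).

    Rows are counted downward from the top edge (y_max) and columns
    rightward from the left edge (x_min). The point is assumed to lie
    inside the raster.
    """
    column = int((x - x_min) / cell_size)
    row = int((y_max - y) / cell_size)
    return row, column


# A 3 x 4 classified tile whose upper-left corner is at (1000, 2000)
# and whose cells are 1 map unit wide.
tile = [[0, 0, 1, 1],
        [0, 1, 1, 0],
        [1, 1, 0, 0]]

row, column = xy_to_row_column(1002.5, 1998.2, 1000.0, 2000.0, 1.0)
canopy_value = tile[row][column]
print(row, column, canopy_value)
```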