High spatial resolution three-dimensional mapping of vegetation spectral dynamics using computer vision

High spatial resolution three-dimensional (3D) measurements of vegetation by remote sensing are advancing ecological research and environmental management. However, substantial economic and logistical costs limit this application, especially for observing phenological dynamics in ecosystem structure and spectral traits. Here we demonstrate a new aerial remote sensing system enabling routine and inexpensive aerial 3D measurements of canopy structure and spectral attributes, with properties similar to those of LIDAR, but with RGB (red-green-blue) spectral attributes for each point, enabling high frequency observations within a single growing season. This "Ecosynth" methodology applies photogrammetric "Structure from Motion" computer vision algorithms to large sets of highly overlapping low altitude (<130 m) aerial photographs acquired using off-the-shelf digital cameras mounted on an inexpensive (<USD $4000), lightweight (<2 kg), hobbyist-grade unmanned aerial system (UAS). Ecosynth 3D point clouds with densities of 30–67 points m−2 were produced using commercial computer vision software from digital photographs acquired repeatedly by UAS over three 6.25 ha (250 m × 250 m) Temperate Deciduous forest sites in Maryland USA. Ecosynth point clouds were georeferenced with a precision of 1.2–4.1 m horizontal radial root mean square error (RMSE) and 0.4–1.2 m vertical RMSE. Understory digital terrain models (DTMs) and canopy height models (CHMs) were generated from leaf-on and leaf-off point clouds using procedures commonly applied to LIDAR point clouds.
At two sites, Ecosynth CHMs were strong predictors of field-measured tree heights (R² 0.63 to 0.84) and were highly correlated with a LIDAR CHM (R 0.87) acquired 4 days earlier, though Ecosynth-based estimates of aboveground biomass and carbon densities included significant errors (31–36% of field-based estimates). Repeated scanning of a 50 m × 50 m forested area at six different times across a 16 month period revealed ecologically significant dynamics in canopy color at different heights and a structural shift upward in canopy density, as demonstrated by changes in vertical height profiles of point density and relative RGB brightness. Changes in canopy relative greenness were highly correlated (R² = 0.87) with MODIS NDVI time series for the same area, and vertical differences in canopy color revealed the early green up of the dominant canopy species, Liriodendron tulipifera, strong evidence that Ecosynth time series measurements can capture vegetation structural and spectral phenological dynamics at the spatial scale of individual trees. The ability to observe canopy phenology in 3D at high temporal resolutions represents a breakthrough in forest ecology. Inexpensive user-deployed technologies for multispectral 3D scanning of vegetation at landscape scales (<1 km²) herald a new era of participatory remote sensing by field ecologists, community foresters and the interested public. © 2013 The Authors. Published by Elsevier Inc.

response of terrestrial ecosystems to changes in climate and land use (Frolking et al., 2009;Morisette et al., 2008;Richardson et al., 2009), yet no single instrument is technically or logistically capable of combining structural and spectral observations at high temporal and spatial resolutions. Here we demonstrate an inexpensive user-deployed aerial remote sensing system that enables high spatial resolution 3D multispectral observations of vegetation at high temporal resolutions, and discuss its prospects for advancing the remote sensing of forest structure, function and dynamics.
Tree heights, generally in the form of canopy height models (CHM), are the most common remotely sensed 3D vegetation measurements. CHMs can be produced using stereo-pair and multiple-stereo photogrammetry applied to images acquired from aircraft and satellites (Hirschmugl et al., 2007;St. Onge et al., 2008) and active synthetic aperture radar (SAR) sensors (Treuhaft et al., 2004), but are now most commonly produced using active LIDAR remote sensing (Light Detection and Ranging). LIDAR CHMs with precisions of 0.2-2 m can be produced across forest types and acquisition settings (i.e., altitude, point density, etc.; Andersen et al., 2006;Wang & Glenn, 2008) based on the return times of laser pulses reflected from canopy surfaces and the ground, by generating models of understory terrain elevations (digital terrain models; DTM) and top canopy surface heights, which are then subtracted (Dubayah & Drake, 2000;Popescu et al., 2003). Canopy heights and other metrics of vertical structure are useful for estimating aboveground biomass and carbon density (Goetz & Dubayah, 2011;Lefsky et al., 2002), biomass change (from multiple LIDAR missions; Hudak et al., 2012), fire risk (Andersen et al., 2005;Skowronski et al., 2011), and for individual tree extraction by species (Falkowski et al., 2008;Vauhkonen et al., 2008) among many other scientific and management applications.
While conventional airborne LIDAR acquisitions have become less expensive over time, they remain very costly for researchers and other end-users, especially if required at high spatial resolution over a few small areas or at high temporal frequencies (Gonzalez et al., 2010;Schimel et al., 2011). When applied over large spatial extents (e.g., > hundreds of square kilometers) LIDAR can be used to map aboveground biomass at a cost of $0.05-$0.20 per hectare (Asner, 2009). However, typical commercial aerial LIDAR acquisitions often cost a minimum of $20,000 per flight regardless of study area size (Erdody & Moskal, 2010), representing a significant barrier to widespread application, especially for local environmental management and in ecological field studies based on annual or more frequent observations at numerous small sites or sampling plots (e.g., Holl et al., 2011). Even LIDAR satellite missions require local calibration data from multiple small sampling locations dispersed across spatial scales (Defries et al., 2007;Dubayah et al., 2010;Frolking et al., 2009).
The fusion of active-3D and optical-image remote sensing datasets has become increasingly common for the mapping of vegetation structural and spectral traits for applications including the measurement of aboveground biomass and carbon, identifying individual species, and modeling the spatial heterogeneity of vegetation biochemistry (Anderson et al., 2008;Ke et al., 2010;Turner et al., 2003;Vitousek et al., 2009). However, the need to combine data from different sensors presents multiple challenges to both analysis and application, including areas of no data, spatial misalignment, and the need to reduce the quality of one dataset to match the other, such as coarsening LIDAR structural observations to match optical image observations (Hudak et al., 2002;Geerling et al., 2007;Mundt et al., 2006;Anderson et al., 2008). Recent advances in 3D remote sensing have combined active 3D and spectral measurements in a calibrated sensor package (Asner & Martin, 2009). Yet despite their high utility, integrated fusion instruments remain too costly to be deployed at the frequent time intervals needed to capture vegetation temporal dynamics at the same location within a growing season (Kampe et al., 2010;Schimel et al., 2011).
To overcome the cost and logistical barriers to routine and frequent acquisition of high spatial resolution 3D datasets, three rapidly emerging technologies can be combined: low-cost, hobbyist-grade Unmanned Aircraft Systems (UAS); high-speed consumer digital cameras (continuous frame rates >1 s−1); and automated 3D reconstruction algorithms based on computer vision. Recent advances in hobbyist-grade UAS capable of autonomous flight make it possible for an individual to obtain over the Internet a small (<1 m diameter), light-weight (<2 kg), and relatively low-cost (<USD $4000) aerial image acquisition platform that can be programmed to fly a specified route over an area at a fixed altitude (e.g., 100 m above the ground). Dandois and Ellis (2010) demonstrated that high spatial resolution 3D "point cloud" models of vegetation structure and color (RGB; red-green-blue) can be produced by applying Structure from Motion computer vision algorithms (SfM; Snavely et al., 2010) to sets of regular digital photographs acquired with an off-the-shelf digital camera deployed on a kite, without any information about sensor position and orientation in space. While this early "Ecosynth" system proved capable of yielding useful data, kite platforms proved incapable of supporting the consistent, repeated acquisitions of overlapping high quality images needed to observe dynamics in vegetation structure and color at high spatial resolutions in 3D over larger areas.
This study will demonstrate that by enhancing Ecosynth methods with automated UAS image acquisition techniques, high spatial resolution multispectral 3D datasets can be repeatably and consistently produced, thereby enabling the structural and spectral dynamics of forest canopies to be observed in 3D: a major advance in the remote sensing of forest ecosystems. Ecosynth methods encompass the full process and suite of hardware and software used to observe vegetation structural and spectral traits from ordinary digital cameras using computer vision. Ecosynth methods are not presented as a replacement for remote sensing systems designed to map large extents, but rather as an inexpensive user-deployed system for detailed observations across local sites and landscapes at scales generally less than 1 km², much like ground-based Portable Canopy LIDAR (PCL; Hardiman et al., 2011) or web-cam phenology imaging systems deployed at carbon flux towers (PhenoCam; Richardson et al., 2009; Mizunuma et al., 2013). Nevertheless, the general utility and maturity of Ecosynth methods for routine and inexpensive forest measurements on demand will be demonstrated by comparing these with estimates of understory terrain, canopy height, and forest aboveground biomass density produced by field and LIDAR methods across three >6 ha forest study sites. The unprecedented ability of Ecosynth methods to simultaneously observe vegetation structural and spectral dynamics at high spatial resolutions will be demonstrated by comparing vertical profiles of vegetation structure (Parker & Russ, 2004) and RGB relative brightness (Mizunuma et al., 2013; Richardson et al., 2009) acquired at six times across the Northern Temperate growing season to data from vegetation stem maps, discrete return LIDAR, and a MODIS NDVI time series.

Computer vision for remote sensing
Automated photogrammetric systems based on computer vision SfM algorithms (Snavely et al., 2008) enable the production of geometrically precise 3D point cloud datasets based entirely on large sets of overlapping digital photographs taken from different locations (Dandois & Ellis, 2010;Dey et al., 2012;Rosnell & Honkavaara, 2012). SfM relies on photogrammetric methods that have already been used for estimating tree height from overlapping images acquired using large-format, photogrammetric-grade cameras coupled with flight time GPS and IMU data, including automated feature extraction, matching and bundle adjustment (Hirschmugl et al., 2007;Ofner et al., 2006), and these methods have been discussed as a viable alternative to LIDAR for 3D forestry applications (Leberl et al., 2010). However, SfM differs from prior photogrammetric applications in that camera position and orientation data that are conventionally acquired using GPS and IMU instruments carried by the aircraft are removed from the 3D modeling equation, and instead the 3D reconstruction of surface feature points is determined automatically based on the inherent "motion" of numerous overlapping images acquired from different locations (Snavely et al., 2008). The result is an extremely simple remote sensing instrument: an ordinary digital camera taking highly overlapping images while moving around or along objects.
SfM techniques have already proved successful for accurate 3D modeling of built structures, bare geological substrates, and fine spatial scale individual plant structure (de Matías et al., 2009; Dey et al., 2012; Harwin & Lucieer, 2012; Snavely et al., 2010). SfM has been applied to generate 3D surface models of open fields, forests and trees from aerial images acquired from a remote-controlled multi-rotor aircraft (Rosnell & Honkavaara, 2012; Tao et al., 2011). Recently, Wallace et al. (2012) used SfM algorithms to improve the calculation of sensor position and orientation on a lightweight UAS (≈5 kg with payload) carrying a mini-LIDAR sensor with lightweight GPS and new micro-electromechanical system (MEMS) based IMU equipment (2.4 kg), finding sub-meter horizontal and vertical spatial accuracies of ground targets (0.26 m and 0.15 m, respectively). That study found low variance (0.05 m-0.25 m) of manually extracted individual tree height measurements from the LIDAR point cloud but did not compare these with field measured tree heights.

UAS for remote sensing
UAS are increasingly being deployed for low-cost, on-demand aerial photography and photogrammetry applications (Harwin & Lucieer, 2012;Hunt et al., 2010;Rango et al., 2009). Rosnell and Honkavaara (2012) used an autonomous multirotor aircraft to take aerial photos in a grid pattern to generate orthomosaics and land surface elevation models using photogrammetry and computer vision software. Lin et al. (2011) recently explored the deployment of LIDAR sensors on relatively small UAS (11.5 kg with platform, battery and payload) suggesting a technology useful for measuring forest structure, but without demonstrating the production of canopy height or other forestry measures. As both conventional LIDAR and photogrammetric techniques require precise measurements of sensor position and orientation during flight, these techniques require high-accuracy global positioning systems (GPS) and inertial monitoring units (IMU), both of which are relatively expensive and heavy instruments (>10 kg) that tend to limit applications to the use of relatively large UASs (>10 kg) and higher altitudes (>130 m), invoking logistical and regulatory requirements similar to those of conventional manned aircraft.

Study areas
Research was carried out across three 6.25 ha (250 m × 250 m) forest research study sites in Maryland USA: two areas on the campus of the University of Maryland Baltimore County (UMBC; 39°15′18″N 76°42′32″W) and one at the Smithsonian Environmental Research Center in Edgewater Maryland (SERC; 38°53′10″N 76°33′51″W). UMBC sites are centered on and expanded from the smaller study sites described by Dandois and Ellis (2010).
The first UMBC study site, "Knoll," centers on a forested hill surrounded by turfgrass and paved surfaces, peaking at ≈60 m ASL (above mean sea level) and gradually descending by 5 to 20 m. The forest is composed of a mixed-age canopy (mean canopy height 25 m, max. 42 m) dominated by American beech (Fagus grandifolia), oak (Quercus spp.), and hickory (Carya spp.) but also including several large mature white ash (Fraxinus americana) and tulip-poplar (L. tulipifera). The second UMBC study site, "Herbert Run," consists of a remnant forest patch similar in size and composition (mean canopy height 20 m, max. 34 m) to the Knoll (elevation 55 m ASL) but steeply sloping (up to 50% grade) down to a riparian forest along a small stream (Herbert Run; 40 m ASL) and back up to a road running parallel to the stream. The riparian forest canopy consists mostly of an even-aged stand of black locust (Robinia pseudoacacia) overstory with black cherry (Prunus serotina) understory along the steep stream banks, with honey locust (Gleditsia triacanthos) and green ash (Fraxinus pennsylvanica) becoming dominant in closest proximity to the stream.
The "SERC" study site is located approximately at the center of the "Big Plot" at the Smithsonian Environmental Research Center in Edgewater, Maryland that has been the long-term focus of a variety of forest ecology and remote sensing studies (McMahon et al., 2010;Parker & Russ, 2004). The site is comprised of floodplain with a gradual slope (8% mean grade) from a small hill (≈19 m ASL) at the north to a riparian area (≈0 m ASL) to the east and south. The canopy is dominated by tulip-poplar, American beech, and several oak (Quercus spp.) species in the overstory (mean canopy height 37 m, max. 50 m).

Forestry field methods
At UMBC sites, a 25 m × 25 m subplot grid was staked out within forested areas using a Sokkia Set 5A Total Station and Trimble TSC2 Data Logger based on the local geodetic survey network (0.25 m horizontal radial RMSE, 0.07 m vertical RMSE; WGS84 UTM Zone 18N datum). Tree location, species, DBH and height of trees greater than 1 cm DBH were hand mapped within the subplot grid between June 2012 and March 2013. Tree heights were measured by laser hypsometer during leaf-off conditions over the same period for the five largest trees per subplot, based on DBH, as the average of three height measurements taken at approximately 120° intervals around each tree at an altitude angle of <45°. Subplot canopy height was then estimated as the mean height of the 5 tallest trees, i.e., average maximum height.
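The subplot statistic described above (mean field height of the five largest-DBH trees per subplot) can be sketched in a few lines; the subplot identifiers, DBH and height values below are hypothetical, not field data from this study.

```python
# Sketch of the subplot "average maximum height" statistic: the mean
# field-measured height of the five largest trees (by DBH) per subplot.
# Tree records below are hypothetical: (subplot_id, dbh_cm, height_m).
from collections import defaultdict

trees = [
    ("A1", 62.0, 28.4), ("A1", 55.1, 26.9), ("A1", 48.3, 25.0),
    ("A1", 40.2, 24.1), ("A1", 33.7, 21.5), ("A1", 12.0, 9.8),
    ("B2", 70.5, 31.2), ("B2", 15.4, 12.3),
]

def average_maximum_height(records, n=5):
    """Mean height of the n largest-DBH trees in each subplot."""
    by_plot = defaultdict(list)
    for plot, dbh, height in records:
        by_plot[plot].append((dbh, height))
    return {
        plot: sum(h for _, h in sorted(rows, reverse=True)[:n]) / min(n, len(rows))
        for plot, rows in by_plot.items()
    }

heights = average_maximum_height(trees)
```

Subplots with fewer than five measured trees simply average whatever is available, a choice made here for the sketch rather than stated in the text.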
Field data for SERC were collected as part of a long-term forest inventory and monitoring program as described by McMahon et al. (2010). In that project, individual trees greater than 1 cm DBH were mapped to a surveyed 10 m × 10 m subplot grid using a meter tape placed on the ground and were identified to species. For the current study, a sample of field measured tree heights were obtained by overlaying a 25 m × 25 m subplot grid across the existing stem map in GIS and selecting the five largest trees per subplot based on DBH. During winter 2013, tree heights were measured as described above in 30 of the 100 25 m × 25 m subplots: 26 in randomly selected subplots and 4 in a group of subplots that comprise a 50 m × 50 m subset area.

Aerial LIDAR
LIDAR data covering UMBC sites were acquired in 2005 by a local contractor for the Baltimore County Office of Information Technology with the goal of mapping terrain at high spatial resolution across Baltimore County MD, USA. The collection used an Optech ALTM 2050 LIDAR with Airborne GPS and IMU under leaf-off conditions in the spring of 2005 (2005/03/18-2005/04/15; ≈800-1200 m above ground surface; ≈140 kn airspeed; 36 Hz scan frequency; 20° scan width half angle; 50 kHz pulse rate; ≈150 m swath overlap; mean point density 1.5 points m−2; NAD83 Harn Feet horizontal datum; NAVD88 Feet vertical datum). More recent LIDAR data for UMBC sites were not available (Baltimore County has a 10 year LIDAR collection plan), so the 2005 LIDAR dataset represents the only existing 3D forest canopy dataset at these sites. Airborne LIDAR data for SERC were collected 2011/10/05 by the NASA GSFC G-LiHT (Goddard LIDAR-Hyperspectral-Thermal; Cook et al., 2012) remote sensing fusion platform (350 m above ground surface; 110 kn airspeed; 300 kHz pulse repetition frequency; 150 kHz effective measurement rate; 30° scan width half angle; 387 m swath width at 350 m altitude; mean point density 78 points m−2; WGS84 UTM Zone 18N horizontal coordinate system; GEOID09 vertical datum; data obtained and used with permission from Bruce Cook, NASA GSFC on 2012/02/22).

Ecosynth-computer vision remote sensing
The term "Ecosynth" is used here and in prior research (Dandois & Ellis, 2010) to describe the entire processing pipeline and suite of hardware involved in generating ecological data products (e.g., canopy height models (CHMs), aboveground biomass (AGB) estimates, and canopy structural and spectral vertical profiles) and is diagrammed in Fig. 1. The Ecosynth method combines advances and techniques from many areas of research, including computer vision structure from motion, UAS, and LIDAR point cloud data processing.

Image acquisition using UAS
An autonomously flying, hobbyist-grade multi-rotor helicopter, "Mikrokopter Hexakopter" (Fig. 1a; HiSystems GmbH, Moormerland, Germany; http://www.mikrokopter.de/ucwiki/en/MikroKopter) was purchased as a kit, constructed, calibrated and programmed for autonomous flight according to online instructions. The flying system included a manufacturer-provided wireless telemetry downlink to a field computer, enabling real-time ground monitoring of aircraft altitude, position, speed, and battery life.
Image acquisition flights were initiated at the geographic center of each study site because Hexakopter firmware restricted autonomous flight to within a 250 m radius of takeoff, in compliance with German law. This required manual piloting of the Hexakopter through a canopy gap at the Knoll and SERC sites; flights at Herbert Run were initiated from an open field near the study site center. Flights were programmed to follow a predetermined square parallel flight plan covering the study site plus a 50 m buffer added to avoid edge effects in image acquisition, at a fixed altitude approximately 40 m above the peak canopy height of each study site. Once the Hexakopter reached this required altitude, as determined by flight telemetry, automated flight was initiated by remote control. Flight paths were designed to produce a minimum photographic side overlap of 40% across UMBC sites and 50% at SERC, owing to higher wind prevalence at that study site at the time of acquisition; forward overlap was >90% for all acquisitions.
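The geometry linking flying height, camera field of view and side overlap to flight-line spacing can be sketched as follows; the 30° across-track half angle is an assumed camera parameter for illustration, not a value reported in this study.

```python
import math

def flight_line_spacing(altitude_m, half_angle_deg, side_overlap):
    """Across-track image footprint at a given flying height, and the
    flight-line spacing that yields the requested side overlap between
    adjacent flight lines."""
    footprint = 2.0 * altitude_m * math.tan(math.radians(half_angle_deg))
    spacing = footprint * (1.0 - side_overlap)
    return footprint, spacing

# Hypothetical values: 100 m above the canopy, 30 degree across-track
# half angle, 40% side overlap (the UMBC minimum).
footprint, spacing = flight_line_spacing(100.0, 30.0, 0.40)
```

Higher wind, as at SERC, motivates a larger overlap fraction and hence tighter line spacing, since gusts push actual image footprints away from their planned positions.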
A Canon SD4000 point-and-shoot camera was mounted under the Hexakopter pointing at nadir and set to "Continuous Shooting" mode to collect 10 megapixel resolution photographs continuously at a rate of 2 frames s−1. Camera focal length was set to "Infinity Focus" (≈4.90 mm) and exposure was calibrated against an 18% grey camera target in full sun, with a shutter speed no slower than 1/800 s. Images were acquired across each study site under both leaf-on and leaf-off conditions as described in Supplement 2. Two leaf-on acquisitions were produced at the Knoll study site to assess the repeatability of height measurements and the spectral changes caused by Fall leaf senescence (Leaf-on 2; Supplement 2). At SERC, four additional data sets were collected across a 16 month period to capture the structural and spectral attributes of the canopy at distinct points throughout the growing season (winter leaf-off, early spring, spring green-up, summer mature green, early fall leaf-on, senescing). Upon completion of its automated flight plan, the aircraft returned to the starting location and was manually flown vertically down to land.

3D point cloud generation using SfM
Multi-spectral (red-green-blue, RGB) three-dimensional (3D) point clouds were generated automatically from the sets of aerial photographs described in Supplement 2 using a purchased copy of Agisoft Photoscan, a commercial computer vision software package (http://www.agisoft.ru; v0.8.4 build 1289). Photoscan uses proprietary algorithms that are similar to, but not identical with, those of Bundler (Personal email communication with Dmitry Semyonov, Agisoft LLC, 2010/12/01) and was used for its greater computational efficiency over the open source Bundler software used previously for vegetation point cloud generation (estimated at least 10 times faster for photo sets > 2000; Dandois & Ellis, 2010). Photoscan has already been used for 3D modeling of archaeological sites from kite photos (Verhoeven, 2011) and has been proposed for general image-based surface modeling applications (Remondino et al., 2011).
Prior to running Photoscan, image sets were manually trimmed to remove photos from take-off and landing using the camera time stamp and the time stamp of GPS points recorded by the Mikrokopter. Photoscan provides a completely automated computer vision SfM pipeline, taking a set of images as input and performing the steps of feature identification, matching and bundle adjustment (Verhoeven, 2011). To generate each 3D RGB point, Photoscan automatically extracts "keypoints" from individual photos, identifies "keypoint matches" among photos (e.g., Lowe, 2004), and then uses bundle adjustment algorithms to estimate and optimize the 3D location of feature correspondences together with the location and orientation of cameras and camera internal parameters (Snavely et al., 2008; Triggs et al., 1999). A comprehensive description of the SfM process is presented in Supplement 7. Photoscan was run using the "Align Photos" tool with the settings "High Accuracy" and "Generic Pair Pre-selection"; this tool performs the computer vision structure from motion process described above using proprietary algorithms. According to the manufacturer's description, the "High Accuracy" setting provides a better solution of camera position at the cost of greater computation time, while the "Generic Pair Pre-selection" setting uses an initial low accuracy assessment to determine which photos are most likely to match, reducing computation time. After this, no other input is provided by the user until processing is complete, at which time the user exports the forest point cloud model to an ASCII XYZRGB file and the camera positions to an ASCII XYZ file.
Photoscan was installed on a dual Intel Xeon X5670 workstation (12 compute cores) with 48 GB of RAM, which required 2-5 days of continuous computation to complete the generation of a single point cloud for each study site, depending roughly on the size of the input photo set (Supplement 2). Point clouds thus produced consisted of a set of 3D points in an arbitrary but internally consistent geometry, with RGB color extracted for each point from the input photos, together with the 3D location of the camera for each photo and its camera model parameters, both intrinsic (e.g., lens distortion, focal length, principal point) and extrinsic (e.g., XYZ location, rotational pose along all three axes), in the same coordinate system as the point cloud (Fig. 1b).

SfM point cloud georeferencing and post-processing
Ground control point (GCP) markers (five-gallon orange buckets) were positioned across sites prior to image acquisition in configurations recommended by Wolf and Dewitt (2000). The XYZ locations of each GCP marker were measured using a Trimble GeoXT GPS with differential correction to within 1 m accuracy (UTM; Universal Transverse Mercator projection Zone 18N, WGS84 horizontal datum). The coordinates of each GCP marker in the point cloud coordinate system were determined by manually identifying orange marker points in the point cloud and measuring their XYZ coordinates using ScanView software (Menci Software; http://www.menci.com). Six GCPs were selected for use in georeferencing, the center-most and the five most widely distributed across the study site; remaining GCPs were reserved for georeferencing accuracy assessment.
A 7-parameter Helmert transformation was used to georeference SfM point clouds to GCPs by means of an optimal transformation model implemented in Python (v2.7.2; Scipy v0.10.1; Optimize module) obtained by minimizing the sum of squared residuals in X, Y, and Z between the SfM and GCP coordinate systems, based on a single factor of scale, three factors of translation along each axis, and three angles of rotation along each axis ( Fig. 1c; Wolf & Dewitt, 2000). Georeferencing accuracy was assessed using National Standard for Spatial Data Accuracy (NSSDA) procedures (RMSE = Root Mean Square Error, RMSE r = Radial (XY) RMSE, RMSE z = Vertical (Z) RMSE, 95% Radial Accuracy, and 95% Vertical Accuracy; Flood, 2004), by comparing the transformed coordinates of the GCP markers withheld from the transformation model with their coordinates measured by precision GPS in the field. This technique for georeferencing is referred to as the "GCP method".
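A minimal sketch of such a 7-parameter Helmert fit with SciPy least squares is shown below using synthetic GCP coordinate pairs; the rotation axis order, initial guess and the `least_squares` call are implementation assumptions, not the study's exact code.

```python
import numpy as np
from scipy.optimize import least_squares

def rotation_matrix(rx, ry, rz):
    """Rotation about the X, then Y, then Z axis (angles in radians)."""
    cx, sx = np.cos(rx), np.sin(rx)
    cy, sy = np.cos(ry), np.sin(ry)
    cz, sz = np.cos(rz), np.sin(rz)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def helmert_fit(src, dst):
    """Estimate the 7 parameters (scale s, translations t, rotations r)
    such that dst ~ s * R(r) @ src + t, by minimizing the sum of
    squared residuals in X, Y and Z."""
    def residuals(p):
        s, tx, ty, tz, rx, ry, rz = p
        pred = s * (rotation_matrix(rx, ry, rz) @ src.T).T + np.array([tx, ty, tz])
        return (pred - dst).ravel()
    return least_squares(residuals, x0=[1.0, 0, 0, 0, 0, 0, 0]).x

# Synthetic check: transform 6 hypothetical GCPs with known parameters,
# then recover those parameters from the point pairs alone.
rng = np.random.default_rng(0)
src = rng.uniform(0, 250, size=(6, 3))                   # SfM coordinates
dst = 1.5 * (rotation_matrix(0.02, -0.01, 0.3) @ src.T).T \
      + np.array([1000.0, 2000.0, 50.0])                 # UTM-like coordinates
params = helmert_fit(src, dst)
```

With six GCPs and seven parameters the system is well overdetermined (18 residual equations), which is what allows the withheld markers to serve as an independent accuracy check.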
GCP markers at SERC were obscured by forest canopy under leaf-on conditions, so georeferencing was only achievable using GPS track data downloaded from the Hexakopter. This method was also applied to the Knoll and Herbert Run datasets to evaluate its accuracy against the GCP method. Owing to hardware limitations of the Hexakopter GPS, positions could only be acquired every 5 s, a much lower frequency that was out of sync with photograph acquisition (2 frames s−1). To overcome this mismatch and the lower precision of the Hexakopter GPS, the entire aerial GPS track (UTM coordinates) and the entire set of camera positions along the flight path (SfM coordinate system) were fitted to independent spline curves, from which a series of 100 XYZ pseudo-pairs of GPS and SfM camera locations were obtained using an interpolation algorithm (Python v2.7.2, Scipy v0.10.1 Interpolate module) and then used as input for georeferencing the point clouds with the same Helmert transformation algorithm used in the GCP method. This technique for georeferencing is referred to as the "spline method". SERC georeferencing accuracy with the spline method was then assessed during leaf-off conditions based on 12 GCP markers placed along a road bisecting the study site that were observable in the SfM point cloud, using the same methods as for UMBC sites (Supplement 2). However, the poor geometric distribution of these GCP markers across the SERC study site precluded their direct use for georeferencing.
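The spline pseudo-pair idea can be sketched with SciPy's parametric spline routines; the flight path, sampling rates and scale factor below are hypothetical, and the choice of `splprep`/`splev` is an assumption about which Interpolate functions were used.

```python
import numpy as np
from scipy.interpolate import splprep, splev

def spline_pseudo_pairs(track_a, track_b, n=100):
    """Fit an independent parametric spline curve to each 3D track and
    sample n XYZ 'pseudo-pairs' at matching fractional positions along
    each curve."""
    sampled = []
    for track in (track_a, track_b):
        tck, _ = splprep(track.T, s=0)          # interpolating cubic spline
        u = np.linspace(0.0, 1.0, n)
        sampled.append(np.array(splev(u, tck)).T)
    return sampled[0], sampled[1]

def path(t):
    """Hypothetical S-shaped flight path in local coordinates."""
    return np.column_stack([100.0 * t, 50.0 * np.sin(2 * np.pi * t), 40.0 + 0 * t])

gps_track = path(np.linspace(0, 1, 20))            # sparse 5 s GPS fixes
cam_track = 0.01 * path(np.linspace(0, 1, 150))    # dense SfM camera positions
gps_pts, cam_pts = spline_pseudo_pairs(gps_track, cam_track, n=100)
```

The 100 pseudo-pairs then play the role the GCP pairs play in the Helmert fit, trading individual point accuracy for a large, well-distributed sample along the flight path.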

Noise filtering of SfM point clouds
Georeferenced SfM point clouds for each study site included a small but significant number of points located far outside the possible spatial limits of real-world features, most likely as the result of errors in feature matching (Triggs et al., 1999). As in LIDAR post-processing, these "noise" points were removed from point clouds after georeferencing using statistical outlier filtering (Sithole & Vosselman, 2004). First, georeferenced point clouds were clipped to a 350 m × 350 m extent: the 250 m × 250 m study site plus a 50 m buffer on all sides to avoid edge effects. A local filter was applied by overlaying a 10 m grid across the clipped point cloud, computing standardized Z-scores (Rousseeuw & Leroy, 1987) within each grid cell, and removing all points with |Z-score| > 3; between 1% and 2% of input points were removed at this stage (Supplement 2). While filtering did remove some verifiable canopy points, filters were implemented instead of manual editing to facilitate automation. At this point, "Ecosynth" point clouds were ready for vegetation structure measurements.
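A minimal sketch of the gridded Z-score filter follows, assuming per-cell standardization of point elevations; the demo coordinates and the single artifact point are hypothetical.

```python
import numpy as np

def grid_zscore_filter(points, cell=10.0, zmax=3.0):
    """Drop elevation outliers: standardize point elevations (Z-scores)
    within each cell of a horizontal grid and remove points with
    |Z-score| > zmax."""
    ij = np.floor(points[:, :2] / cell).astype(int)
    cells, inverse = np.unique(ij, axis=0, return_inverse=True)
    keep = np.ones(len(points), dtype=bool)
    for k in range(len(cells)):
        idx = np.where(inverse == k)[0]
        z = points[idx, 2]
        sd = z.std()
        if sd > 0:
            keep[idx] = np.abs((z - z.mean()) / sd) <= zmax
    return points[keep]

# Hypothetical demo: 200 canopy points near 30 m elevation in one 10 m
# cell, plus a single 500 m reconstruction artifact.
rng = np.random.default_rng(1)
pts = np.column_stack([rng.uniform(0, 10, 201),
                       rng.uniform(0, 10, 201),
                       np.append(rng.normal(30.0, 2.0, 200), 500.0)])
clean = grid_zscore_filter(pts)
```

Standardizing within 10 m cells rather than globally lets the filter tolerate large but legitimate elevation differences between, say, hilltop canopy and riparian terrain.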

Terrain filtering and DTM creation
After georeferencing and noise-filtering of computer vision point clouds, a 1 m grid was imposed across the entire clipped point cloud of the study site and the median elevation point within each 1 m grid cell was retained; all other points were discarded. Understory digital terrain models (DTMs) were then generated from these median-filtered leaf-on and leaf-off point clouds using morphological filter software designed for discrete return LIDAR point clouds ( Fig. 1d; Zhang & Cui, 2007;Zhang et al., 2003). This software distinguished terrain points based on elevation differences within varying window sizes around each point within a specified grid mesh. This algorithm enabled convenient batching of multiple filtering runs with different algorithm parameters, a form of optimization that is a common and recommended practice with other filtering algorithm packages (Evans & Hudak, 2007;Sithole & Vosselman, 2004;Tinkham et al., 2012;Zhang et al., 2003), and has previously been used across a range of different forest types, including high biomass redwood forests of the Pacific northwest (Gonzalez et al., 2010), Florida mangroves (Simard et al., 2006;Zhang, 2008) and in prior studies at similar sites (Dandois & Ellis, 2010). Ordinary Kriging was then used to interpolate 1 m raster DTMs from terrain points using ArcGIS 10.0 (ESRI, Redlands, CA; Popescu et al., 2003).
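The 1 m median-elevation prefilter described above can be sketched as follows; the grid keying and the upper-median convention for even point counts are implementation assumptions made so that an actual point (not an interpolated value) is always retained.

```python
import numpy as np
from collections import defaultdict

def median_point_per_cell(points, cell=1.0):
    """Retain, for each grid cell, the single point with the median
    elevation among all points falling in that cell (upper median for
    even counts); all other points are discarded."""
    cells = defaultdict(list)
    for p in points:
        key = (int(np.floor(p[0] / cell)), int(np.floor(p[1] / cell)))
        cells[key].append(p)
    kept = []
    for plist in cells.values():
        plist.sort(key=lambda q: q[2])       # sort by elevation
        kept.append(plist[len(plist) // 2])
    return np.array(kept)

# Hypothetical demo: three points in cell (0, 0), one in cell (1, 0).
pts = np.array([[0.2, 0.3, 5.0], [0.6, 0.1, 9.0],
                [0.9, 0.8, 7.0], [1.5, 0.5, 12.0]])
filtered = median_point_per_cell(pts)
```

This thinning step gives the morphological terrain filter one representative elevation per 1 m cell, which both reduces computation and suppresses residual high and low noise before DTM interpolation.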
Ecosynth DTM error was evaluated across the 250 m × 250 m sites as a whole relative to slope and land cover classes (Clark et al., 2004) following NSSDA procedures (Flood, 2004). Land cover across the Knoll and Herbert Run sites was manually interpreted and digitized in ArcGIS 10.0 from a 2008 leaf-off aerial orthophotograph (0.6 m horizontal accuracy, 0.3 m pixel resolution, collected 2008/03/01-2008/04/01) into seven categories: forest (woody vegetation >2 m height), turfgrass, brush (woody vegetation <2 m height), buildings, pavement, water, and other (i.e., rock rip-rap, unpaved trail). Land cover feature heights (e.g., greater or less than 2 m) and aboveground feature outlines (e.g., for buildings and forest canopy) were determined from the Ecosynth canopy height model for each study site. The SERC study site was classified as all forest.
LIDAR understory DTMs were generated at UMBC sites using a bare earth point cloud product provided by the LIDAR contractor and interpolated to a 1 m grid using Ordinary Kriging. Despite being collected 5 years prior to the current study, the 2005 LIDAR bare earth product still provided an accurate depiction of the relatively unchanged terrain at the UMBC study sites. A LIDAR understory DTM was generated for the SERC study site using the morphological terrain filter on the set of "last return" points and interpolating to a 1 m grid using Ordinary Kriging.

CHM generation and canopy height metrics
Sets of aboveground point heights were produced from Ecosynth and LIDAR point clouds by subtracting DTM cell values from the elevation of each point above each DTM cell; points below the DTM were discarded (Popescu et al., 2003). To investigate the accuracy of Ecosynth methods, aboveground point heights for Ecosynth leaf-on point clouds were computed against three different DTMs: those from leaf-on Ecosynth, leaf-off Ecosynth, and LIDAR bare earth. LIDAR CHMs were only processed against LIDAR DTMs. All aboveground points ≥2 m in height were accepted as valid canopy points and used to prepare CHM point clouds. CHM point height summary statistics were calculated within 25 m × 25 m subplots across each study site, including median (Hmed), mean (Hmean), minimum (Hmin), maximum (Hmax), and quantiles (25th, 75th, 90th, 95th and 99th = Q-25, Q-75, Q-90, Q-95 and Q-99, respectively). At all sites, Ecosynth and LIDAR CHM metrics were compared with field measured heights of the five tallest trees within each subplot using simple linear regressions (Dandois & Ellis, 2010), although at Knoll and Herbert Run the LIDAR comparisons must be considered illustrative only: the long delay since LIDAR data acquisition precludes direct quantitative comparison. At SERC, where Ecosynth and LIDAR were collected only a few days apart in 2011, Ecosynth canopy height statistics were also compared directly with LIDAR height statistics within 25 m × 25 m grid cells overlaid across the study site, again using simple linear regression. For each site, one outlier was identified (Grubbs, 1969) and removed from analysis where Ecosynth overestimated field height by >10 m, due to tree removal (Knoll), tall canopy spreading into a plot with few small trees (Herbert Run), and a plot that had only one large tree and several smaller, suppressed understory trees (SERC).
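The CHM point-height and summary-statistic computation can be sketched as below. This is illustrative only: the function name is hypothetical, and the DTM is assumed to be a simple 2D elevation array on a 1 m grid aligned with the point coordinates.

```python
import numpy as np

def canopy_height_metrics(points, dtm, cell=1.0, min_height=2.0):
    """Subtract the DTM cell elevation from each point to obtain aboveground
    heights, keep only valid canopy points (>= min_height, which also drops
    points below the DTM), and summarize canopy height statistics.
    `points` is (N, 3) x, y, z; `dtm` is a 2D array of ground elevations."""
    ix = np.floor(points[:, 0] / cell).astype(int)
    iy = np.floor(points[:, 1] / cell).astype(int)
    heights = points[:, 2] - dtm[iy, ix]
    heights = heights[heights >= min_height]  # valid canopy points only
    qs = np.percentile(heights, [25, 75, 90, 95, 99])
    return {"Hmed": np.median(heights), "Hmean": heights.mean(),
            "Hmin": heights.min(), "Hmax": heights.max(),
            "Q25": qs[0], "Q75": qs[1], "Q90": qs[2],
            "Q95": qs[3], "Q99": qs[4]}
```

In practice these metrics would be computed per 25 m × 25 m subplot by first clipping the point cloud to each subplot extent.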

Prediction of forest aboveground biomass and carbon from 3D point clouds
Ecosynth and LIDAR CHMs were used to predict forest canopy aboveground biomass density (AGB, Mg ha −1 ) at all study sites using linear regression to relate canopy height metrics to field based estimates of biomass within forested 25 m × 25 m subplots. Biomass density was estimated by first computing per tree biomass using standardized allometric equations for the "hard maple/oak/hickory/beech" group (Jenkins et al., 2003; AGB = EXP(−2.0127 + 2.4342 * LN(DBH))), summing total AGB per subplot and then standardizing to units of Mg ha −1 . Linear regression was then used to predict subplot AGB from CHM height metrics, with prediction error computed as the RMSE between observed and predicted AGB values (Drake et al., 2002; Lefsky et al., 1999). Aboveground forest carbon density was estimated by multiplying AGB by a factor of 0.5 (Hurtt et al., 2004). As with estimates of canopy height, AGB predictions obtained from LIDAR at Knoll and Herbert Run would be expected to show large errors due to the large time difference between LIDAR (2005) and field measurements (2011). Nevertheless, AGB predictions were made at all sites using both Ecosynth and LIDAR to demonstrate the general utility of Ecosynth for applications similar to those of LIDAR.
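The field-based AGB and carbon calculation can be expressed directly. This sketch assumes the Jenkins et al. (2003) convention of DBH in cm and per-tree AGB in kg dry mass, and a 25 m × 25 m (625 m²) subplot; function names are illustrative.

```python
import numpy as np

def subplot_agb_density(dbh_cm, subplot_area_m2=625.0):
    """Per-tree AGB from the 'hard maple/oak/hickory/beech' allometric
    equation, AGB = exp(-2.0127 + 2.4342 * ln(DBH)), summed over the
    subplot and converted to Mg per hectare."""
    agb_kg = np.exp(-2.0127 + 2.4342 * np.log(np.asarray(dbh_cm)))
    total_mg = agb_kg.sum() / 1000.0               # kg -> Mg
    return total_mg * (10000.0 / subplot_area_m2)  # per subplot -> per ha

def carbon_density(agb_mg_ha):
    """Aboveground carbon approximated as 0.5 x AGB (Hurtt et al., 2004)."""
    return 0.5 * agb_mg_ha
```

These subplot densities form the response variable that the CHM height metrics are regressed against.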

Repeated seasonal 3D RGB vertical profiles
Computer vision RGB point clouds were used to assess forest spectral dynamics in 3D by producing multiple point cloud datasets of the SERC site in leaf-off (Winter), early spring (Spring 1), spring green-up (Spring 2), mature green (Summer), early senescing leaf-on (Fall 1), and senescing (Fall 2) conditions between October 2010 and June 2012 (Supplement 2). A single 50 m × 50 m sample area was selected for its diverse fall colors, clipped from each point cloud and stratified into 1 m vertical height bins for analysis. Canopy height profiles (CHPs) were then generated for all points within the 50 m × 50 m sample area across the six point clouds, with each 1 m height bin colorized using the mean RGB channel value of all points within the bin. The relative RGB channel brightness (e.g., R/(R + G + B)) was computed based on the mean RGB point color within each 1 m bin (Richardson et al., 2009). A CHP of the sample area was also generated from the leaf-on G-LiHT point cloud for comparison, combining all returns. For each of the six seasonal point clouds, the relative green channel brightness (i.e., G/(R + G + B) = strength of green: S green ) was extracted for all points within the height bin corresponding to mean field measured canopy height within the 50 m × 50 m sample area (Mizunuma et al., 2013; Richardson et al., 2009).
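The relative channel brightness per 1 m height bin might be sketched as follows (the function name is hypothetical; computing the mean bin color before taking the channel ratio follows the description above):

```python
import numpy as np

def relative_brightness_profile(points_rgb, bin_size=1.0):
    """For an (N, 4) array with columns [height, R, G, B], compute the mean
    RGB color per height bin and the relative brightness of each channel
    (e.g., S_green = G / (R + G + B)) from that mean color."""
    bins = np.floor(points_rgb[:, 0] / bin_size).astype(int)
    profile = {}
    for b in np.unique(bins):
        r, g, bl = points_rgb[bins == b, 1:4].mean(axis=0)
        total = r + g + bl
        profile[int(b)] = {"mean_rgb": (r, g, bl),
                           "S_red": r / total,
                           "S_green": g / total,
                           "S_blue": bl / total}
    return profile
```

For the seasonal S green time series, only the bin at mean field-measured canopy height would be extracted from each scan's profile.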

Image acquisition and point cloud generation
Image acquisition flight times ranged from 11 to 16 min, acquiring between 1600 and 2500 images per site, depending mostly on prevailing winds. As wind speeds approached 16 kph, flight times increased substantially, image acquisition trajectories ranged further from plan, and photo counts increased. Wind speeds >16 kph generally resulted in incomplete image overlap and the failure of point cloud generation and were thus avoided. Point cloud generation using commercial SfM software required between 27 and 124 h of continuous computation to complete image processing across 6.25 ha sites, depending in part on the number of photographs (Supplement 2).

Characteristics of Ecosynth point clouds
Ecosynth point clouds are illustrated in Fig. 2 and described in Supplement 2. Point cloud density varied substantially with land cover and between leaf-on and leaf-off acquisitions (Table 1, Fig. 3), with forested leaf-on point clouds generally having the highest densities (Table 1). Densities of leaf-off point clouds were similar across all three study sites (20-23 points m −2 ), and leaf-on densities were similar across UMBC sites (27-37 points m −2 ), but the leaf-on SERC cloud was twice as dense (67 points m −2 ) as the leaf-on UMBC clouds. Point cloud density varied substantially with land cover type at the Knoll and Herbert Run, and was generally highest in types with the greatest structural and textural complexity, such as forest, low brush and rock riprap (29-54 points m −2 ; Table 1), and lowest in types that were structurally simple and had low variation in contrast, like roads, sidewalks, and turfgrass (7-22 points m −2 ). However, building rooftops at Herbert Run, where roofs were shingled, had point densities similar to vegetated areas (35 points m −2 ), unlike the simple asphalt roofs at the Knoll.
Point cloud georeferencing accuracies are reported in Table 2. For the Knoll and Herbert Run sites, horizontal georeferencing accuracies of 1.2 m-2.1 m RMSE r and vertical accuracies of 0.4 m-0.6 m RMSE z were achieved using the GCP method. Horizontal and vertical accuracies of 4.1 m and 1.2 m, RMSE r and RMSE z respectively, were achieved for the SERC leaf-off point cloud using the spline method. However, the spline method produced lower horizontal and vertical accuracies (higher RMSE) than the GCP method at the Knoll and Herbert Run sites (RMSE r 3.5 m-5.4 m, RMSE z 1.7 m-4.7 m, Supplement 4). Horizontal and vertical RMSE for LIDAR are generally much lower (0.15 m, 0.24 m, contractor reported).

Digital terrain models
Understory DTMs generated from computer vision are compared with LIDAR bare earth DTMs in Table 3 and Fig. 4. Ecosynth DTM errors were higher under forest cover than in open areas at the Knoll and Herbert Run sites (Fig. 4). Ecosynth leaf-off DTMs more accurately captured understory terrain than Ecosynth leaf-on DTMs (leaf-off RMSE z 0.89 m-3.04 m; leaf-on RMSE z 2.49 m-5.69 m; Table 3). At the Knoll, DTM difference maps between Ecosynth leaf-off and LIDAR (Fig. 4c) revealed large error sinks (<−5 m) in the north-west, north-east, and southern portions of the study site. Leaf-on DTMs generally overestimated understory terrain elevation at all three study sites (Fig. 4c) resulting in spikes of error (>5 m) compared to LIDAR DTMs. At all sites, DTM differences between Ecosynth and LIDAR were larger in forest compared with non-forest areas (Fig. 4c and d; Table 4).

Canopy height, biomass and carbon estimates
Use of Ecosynth and LIDAR CHMs to predict field-measured tree heights across forest subplots at all sites is described in Table 5 and plotted for Ecosynth only in Fig. 6. At the Knoll and Herbert Run sites, results demonstrate that Ecosynth CHMs adequately predicted field-measured heights of the five tallest trees per subplot (i.e., average maximum height, AvgTop5) when either Ecosynth leaf-off (R 2 0.82-0.83) or LIDAR DTMs (R 2 0.83-0.84) were used. When Ecosynth leaf-on DTMs were used, the quality of canopy height predictions was much lower (R 2 0.62-0.67). For the SERC site, Ecosynth predictions of field measured canopy height were very weak for all DTMs (R 2 0.07-0.30), lower even than the weaker-than-expected LIDAR prediction of field heights (R 2 0.50). For Ecosynth, field height prediction errors with the leaf-off DTM (3.9-9.3 m RMSE) were generally higher than when the LIDAR DTM was used (3.2-6.8 m RMSE) but lower than when the leaf-on DTM was used (7.1-10.9 m RMSE). LIDAR CHMs at Knoll and Herbert Run showed a strong relationship to field measurements (R 2 0.71 & 0.77), but had larger errors (RMSE 5.7 & 5.4 m), as expected given the 5 years elapsed between LIDAR and field measurements. At SERC, estimates of error between Ecosynth and LIDAR predictions of field canopy height were comparable (RMSE 3.3 & 3.6 m). Direct comparison of Ecosynth and LIDAR CHMs at SERC, where data were collected only days apart, also revealed strong agreement between the two sensor systems (R 0.87, RMSE 2.3 m; Supplement 5), suggesting that the two sensors were characterizing the canopy with a similar degree of precision.
Aboveground biomass (AGB) predictions from Ecosynth and LIDAR CHMs at all sites are shown in Table 6. For Knoll and Herbert Run, Ecosynth predictions of field estimated AGB showed relatively strong relationships, but also relatively high error (R 2 0.71 & 0.73; RMSE 94 & 87 Mg ha −1 , Table 6), with errors representing approximately 31-36% of field estimated per subplot mean AGB densities from allometric equations. LIDAR predictions of AGB at Knoll and Herbert Run showed similar relationships to those from Ecosynth, but with 3-9% more error relative to field estimated mean AGB (R 2 0.63 & 0.72; RMSE 101 & 107 Mg ha −1 ), a result that is expected given the time lag between LIDAR data acquisition and field measurements. At SERC, where Ecosynth and LIDAR data were collected at approximately the same time, the close resemblance of AGB predictions (Table 6) provides strong evidence that these systems generally yield similar estimates of AGB and aboveground carbon, which is approximated by multiplying AGB by a factor of 0.5 (Hurtt et al., 2004).

Vertical canopy profiles
Vertical canopy height profiles (CHPs) of Ecosynth CHMs are shown in Fig. 7 for a selected 50 m × 50 m sample area at SERC, illustrating the relative frequency of points within 1 m height bins and their mean RGB color. CHPs from the Spring 2, Summer, Fall 1 and Fall 2 time periods (Fig. 7g, i, and k) showed a similar vertical density profile as the single leaf-on LIDAR acquisition at this site. However, at the same time periods, Ecosynth observed almost no points in the understory and at ground level when compared with both LIDAR and Ecosynth Winter and Spring 1 scans ( Fig. 7a and c). Mean RGB channel brightness was fairly constant across the vertical profile in each time period, except in the Spring 1 and Fall 1 time periods, with slightly lower green and higher blue levels at the top of the canopy under early senescing conditions (Fall 1, Fig. 7j), and slightly higher green at the same height under early spring conditions (Spring 1, Fig. 7d). Time-series comparison of MODIS NDVI (MOD13Q1) for 2011 and Ecosynth S green for the 38-39 m height bin for the corresponding day of year (DOY) is shown in Fig. 8. For the observed time periods, the time series pattern of Ecosynth S green closely matched that of MODIS NDVI and corresponding NDVI and S green values were highly correlated (R 2 0.87).

Ecosynth Canopy Height Models (CHMs)
Ecosynth CHMs produced strong predictions of field-measured tree heights at the Knoll and Herbert Run (R 2 0.82-0.84, Table 5, Fig. 6), well within the typical range of LIDAR predictions (Andersen et al., 2006; Wang & Glenn, 2008), except when Ecosynth leaf-on DTMs were used for CHM generation (R 2 0.62-0.67, Table 5). At the SERC site, Ecosynth CHM predictions of field-measured tree height were very weak (R 2 0.07-0.30) regardless of the DTM used, as were LIDAR CHM predictions (R 2 0.50, Table 5). The weaker prediction power of Ecosynth and LIDAR CHMs at SERC may be explained by the inherent error of plot-level canopy height predictions from small footprint LIDAR, which is generally between 1 and 3 m RMSE (Andersen et al., 2006; Clark et al., 2004; Hyyppä et al., 2008). Errors in LIDAR predictions of field-measured tree heights at the Knoll and Herbert Run (RMSE 5.7 & 5.4 m) are readily explained by the 5 year time lag between LIDAR acquisition and field measurements. While comparisons of Ecosynth and LIDAR CHMs at the Knoll and Herbert Run sites are biased by the 5 year time lag since LIDAR acquisition, a number of ecologically relevant changes in canopy structure are observable in Fig. 5a and b. At the Knoll, Ecosynth revealed a large tree gap just north of center in the forest where a large beech had fallen after the LIDAR acquisition, the planting of about 30 small ornamental trees (≈10 m height) to the south-east of the main forest area, and general increases in tree height over 5 years. At Herbert Run, rapid growth of black locust, honey locust and green ash trees is visible in a recovering riparian forest area (below road). Errors in Ecosynth canopy height predictions are less well understood than those for LIDAR, but include some similar sources, including errors in measuring tree height and location in the field, DTM error, and errors introduced by limitations of the sensor system (Andersen et al., 2006; Falkowski et al., 2008; Hyyppä et al., 2008).
With LIDAR, lower flight altitudes generally produce more accurate observations of the forest canopy, at the cost of reduced spatial coverage (Hyyppä et al., 2008). Ecosynth images were acquired at much lower altitudes than typical for LIDAR (40 m above canopy vs. > 350 m, Supplement 2), but it is not known if higher altitudes, which would increase the spatial coverage of observations, would also reduce the accuracy of height measurements, or even increase it. The point densities of Ecosynth were comparable to the dense point clouds produced by the NASA G-LiHT LIDAR at SERC (Table 1), but it is not known whether Ecosynth point densities are correlated with the accuracy of canopy height estimates. LIDAR studies indicate that estimates of height, biomass, and other structural attributes are relatively robust to changes in point cloud density down to 0.5 points m −2 (Naesset, 2009;Naesset & Gobakken, 2005;Treitz et al., 2012).
In Ecosynth methods, overstory occlusion limits observations and point densities lower in the canopy. Leaf-on DTMs were therefore of much lower quality than those from leaf-off conditions, lowering the accuracy of CHMs that can be produced in regions without a leaf-off period. However, repeated estimates of forest canopy heights confirm that Ecosynth methods are robust under a range of forest, terrain, weather, flight configuration, and computational conditions. For example, at the Knoll site, two leaf-on image collections were acquired under different canopy conditions and coloration (due to autumn senescence) and different lighting conditions (clear and uniformly cloudy; Supplement 2), and were processed using different versions of computer vision software (Photoscan v0.8.4 and v0.7.0), yet produced canopy height estimates comparable to field measurements when LIDAR or Ecosynth leaf-off DTMs were used (Knoll leaf-on 1 R 2 0.83 & 0.82; leaf-on 2 R 2 0.84 & 0.83; Table 5). Nevertheless, the accuracy and density of Ecosynth point clouds do appear to be sensitive to a number of poorly characterized factors, including camera resolution, flight altitude, and the SfM algorithm used for 3D processing, justifying further research into the influence and optimization of these factors to produce more accurate estimates of vegetation structure.

Predictions of aboveground biomass and carbon
At Knoll and Herbert Run, Ecosynth predictions of field estimated AGB from canopy height metrics (R 2 0.71 & 0.73; Table 6) were comparable to those common for LIDAR and field measurements, which have R 2 values ranging from 0.38 to 0.80 (e.g., Popescu et al., 2003). However, at SERC both Ecosynth and LIDAR predictions of field estimated AGB were lower than would be expected (R 2 0.27 & 0.34). When assessed using cross-validated RMSE as a measure of AGB prediction 'accuracy' (Drake et al., 2002; Goetz & Dubayah, 2011), Ecosynth AGB was also less accurate than field estimates based on allometry and LIDAR (Table 6). In addition to errors in canopy height metrics, AGB error sources include errors in field measurements and in allometric modeling of AGB from those measurements, which carry uncertainties of 30%-40% (Jenkins et al., 2003). Another limit to the strength of AGB predictions (R 2 ) is the relatively low variation in canopy heights and biomass estimates across this study; higher R 2 values are generally attained for models of forests across a wider range of successional states (Lefsky et al., 1999). For example, at SERC the relatively low variation in subplot AGB (coefficient of variation, CV, 40%) relative to other sites (53% & 65%) may explain the low R 2 and large error in LIDAR estimates of AGB (R 2 0.34, RMSE 106 Mg ha −1 ); at the Knoll and Herbert Run, 2005 LIDAR AGB predictions cannot be fairly compared with those based on 2011 field measurements. Despite their generally lower quality, Ecosynth canopy height metrics can be successfully combined with field measurements of biomass, carbon or other structural traits (e.g., canopy bulk density, rugosity) to generate useful high-resolution maps for forest carbon inventory, fire and habitat modeling and other research applications (Skowronski et al., 2011).

Observing canopy spectral dynamics in 3D
Vertical profiles of forest canopy density and color generated from Ecosynth point clouds reveal the tremendous potential of computer vision remote sensing for natively coupled observations of vegetation structure and spectral properties at high spatial and temporal resolutions (Figs. 7, 8). Canopy structure observed by LIDAR and Ecosynth 4 days apart at SERC under early fall (Fall 1) conditions yielded similar point densities across canopy height profiles with a notable peak around 30 m (Fig. 7i), and similar densities were observed under senescing (Fig. 7k) and summer conditions (Fig. 7g). As expected, under leaf-off conditions Ecosynth point density showed a completely different pattern, with the highest density observed near the ground (Fig. 7a). Comparison of green and senescent canopy color profiles reveals a shift in relative brightness from "more green" to "more red" (Fig. 7j & l), caused by increasing red leaf coloration in deciduous forests during autumn senescence, a pattern also observed in annual time series from stationary multispectral web cameras in deciduous forests in New Hampshire, USA (Richardson et al., 2009). Under leaf-off conditions, colors were fairly constant across the vertical canopy profile, with the relatively grey-brown coloration of tree trunks and the forest floor (Fig. 7b). During spring green-up, the canopy profile showed a strong increase in canopy density in the upper layers, likely due to the emergence of new small leaves and buds (Fig. 7c), and this is confirmed by slight increases in relative green brightness at the top of the canopy (Fig. 7d).
Summer and fall canopy profiles (Summer, Fall 1, Fall 2) had greater point densities than LIDAR at the overstory peak, but few to no points below this peak (Fig. 7g, i, and k), likely because dense canopy cover under these conditions occluded and shadowed understory features. When cameras cannot observe forest features, those features cannot be detected or mapped by computer vision algorithms, a significant limitation to observing forest features deeper in the canopy, especially under leaf-on conditions. Conversely, the spring green-up profile (Spring 2; Fig. 7e) showed a greater density of points in the understory compared to summer and fall profiles, but also a lower peak. This may be because the Spring 2 photos were over-exposed (e.g., brighter but with reduced contrast) owing to changes in illumination during the scan flyover, better illuminating shadowed areas and allowing observations deeper into the canopy (Cox & Booth, 2009).
The strength of greenness (S green ) in the overstory at SERC followed a similar pattern throughout the growing season as the MODIS NDVI time series over the same area and S green was highly correlated with corresponding MODIS NDVI DOY values (R 2 0.87; Fig. 8), suggesting that Ecosynth may be a useful proxy for NDVI. NDVI measured with satellite remote sensing provides strong predictions of ecosystem phenology and dynamics (Morisette et al., 2008;Pettorelli et al., 2005;Zhang & Goldberg, 2011) and high spatial resolution, near surface observations obtained with regular digital cameras can provide more detailed information to help link ground and satellite based observations (Graham et al., 2010;Mizunuma et al., 2013;Richardson et al., 2009). High spatial and temporal resolution 3D-RGB Ecosynth data provides an additional level of detail for improving understanding of ecosystem dynamics by incorporating information about canopy 3D structural change along with color spectral change. The increase in S green at the top of the canopy in the Spring 1 point cloud may be associated with the dominant L. tulipifera (tulip-poplar) tree crowns within the forest, which are expected to green-up first in the season (Supplement 6; Parker & Tibbs, 2004).
Unlike current LIDAR image fusion techniques, Ecosynth methods natively produce multispectral 3D point clouds without the need for high precision GPS and IMU equipment, enabling data acquisition using inexpensive, lightweight, low altitude UAS. This facilitates routine observations of forest spectral dynamics at high spatial resolutions in 3D, a new and unprecedented observational opportunity for forest ecology and environmental management. Ecosynth methods may also complement LIDAR image fusion collections by enabling high frequency observations of forest canopy dynamics in between infrequent LIDAR acquisitions, with LIDAR DTMs enhancing Ecosynth CHMs in regions where forests do not have a leaf-off season.

General characteristics of Ecosynth 3D point clouds
Ecosynth point clouds are generated from photographs, so 3D points cannot be observed in locations that are occluded from view in multiple photos, including understory areas occluded by the overstory, or in areas masked in shadow, leading to incomplete 3D coverage in Ecosynth datasets. In contrast, LIDAR provides relatively complete observations of the entire canopy profile, from top to ground, even in leaf-on conditions, owing to the ability of laser pulses to penetrate through the canopy (Dubayah & Drake, 2000). Nevertheless, 3D point clouds produced using UAS-enhanced Ecosynth methods compare favorably with those from aerial LIDAR, though positional accuracies of Ecosynth point clouds were significantly lower (horizontal error: 1.2 m-4.1 m; vertical error: 0.4 m-1.2 m; Table 2) than those derived from LIDAR (0.15 m, 0.24 m, contractor reported). While lower positional accuracies are certainly an important consideration, accuracies in the one to four meter range are generally considered adequate for most forestry applications (Clark et al., 2004), and were consistently achieved by Ecosynth methods under all conditions. Ecosynth point cloud densities (23-67 points m −2 ) were substantially higher than those common for commercial LIDAR products (1.5 points m −2 ; UMBC sites) and were comparable with those of the NASA G-LiHT LIDAR at SERC (Table 1). Point cloud densities also varied between sites (Fig. 3) and in forested versus non-forested areas within sites. Structurally homogenous and simple land surfaces (e.g., rooftops, open grass, pavement) produced far fewer points when compared with structurally complex surfaces (e.g., forest canopy, riprap and low brush; Table 1). This higher point density in tree covered areas is most likely the result of high textural variation in image intensity and/or brightness, which are the basis for feature identification in computer vision (de Matías et al., 2009), though the greater height complexity of forested areas is probably also a factor.
Regardless of mechanism, the density and accuracy of 3D point clouds produced by Ecosynth methods across forested landscapes are clearly sufficient for general forestry applications.
Practical challenges in producing Ecosynth point cloud measurements

Image acquisition using UAS
UAS image acquisition systems generally performed well, but required significant investments in training and hardware. Operator training and system building required six weeks and was accomplished using only online resources. To maintain image acquisition capabilities on demand in the face of occasional aircraft damage and other issues, it was necessary to purchase and maintain at least two, or better, three fully functional UAS imaging systems, an investment of approximately $4000 for the first unit and $3000 for additional units (some equipment was redundant). Automated UAS image acquisitions by trained operators were mostly routine (the 9 scans of this study were acquired in 11 acquisition missions), enabling repeated acquisitions on demand across 6.25 ha sites using the same flight plan with flight times <15 min. The only major limitations to acquisition flights were precipitation and wind speeds >16 kph, which caused significant deflection of the aircraft and incomplete image acquisitions. Technological developments in hobbyist-grade UAS are rapid and accelerating, improving capabilities, driving down prices and increasing availability, as exemplified by the rapid growth and spread of the open-source Ardupilot platform (e.g. http://code.google.com/p/arducopter/wiki/ArduCopter) and the DIYdrones online community (http://diydrones.com).

Computation
The commercial computer vision software used in this study required >27 h to produce a single 3D point cloud across a 250 m × 250 m site when run on a high-end computer graphics workstation with full utilization of all CPU and RAM resources (Supplement 2). The widely available open source computer vision software Bundler (Snavely et al., 2008) would likely take more than one month to produce similar results. These computational limits are being overcome by more rapid and efficient open-source computer vision algorithms now under development, utilizing parallel processing (e.g. Agarwal et al., 2009) and Graphics Processing Units (Wang & Olano, 2011), and by redesigning the computer vision processing pipeline to incorporate the sequential structure of image acquisitions (Wang & Olano, 2011).

Georeferencing
Two different georeferencing techniques were used to produce Ecosynth point clouds: one based on ground markers visible from the air (GCP method) and one based on the aircraft GPS path (spline method). As would be expected from the relatively low precision of the inexpensive lightweight GPS in the UAS, the spline method consistently produced point clouds with higher horizontal and vertical RMSE (4.3 m, 2.5 m; Supplement 4) than the GCP method (1.7 m, 0.6 m; Table 2). This is likely the major source of the large georeferencing errors observed at the SERC site (4.1 m, 1.2 m), where ground markers were obscured by the tree canopy and the spline method had to be used. Use of a more precise (and expensive) aircraft GPS, improving the accuracy of the spline method algorithms, and the development of other techniques for georeferencing without GCP markers would be useful foci for future research, as field marker placement is both time consuming and often infeasible in closed canopy forests without regular canopy gaps. One possible solution may be the development of algorithms that combine aerial GPS locations directly with the camera intrinsic parameters solved for by computer vision algorithms (Xiang & Tian, 2011). It might also be useful to improve the georeferencing accuracy of the GCP method by more accurately surveying GCP marker locations (mapping-grade GPS was used in this study)-a relevant consideration at field research sites where permanent GCP markers can be established to facilitate repeated data collections.

Terrain models
DTM accuracy fundamentally constrains the accuracy of canopy height and related measures of vegetation structure (Andersen et al., 2006; Wang & Glenn, 2008). Ecosynth DTMs showed large deviations from LIDAR DTMs (Fig. 4), which are expected to have elevation precisions of approximately ±2 m RMSE depending on many factors not specifically evaluated in the current study (Gatziolis et al., 2010; Kobler et al., 2007; Tinkham et al., 2011). As would be expected, the precision of Ecosynth DTMs was highest under leaf-off conditions (RMSE z 0.73 m to 2.72 m) compared with leaf-on acquisitions (3.37 m to 5.69 m; Table 3; Fig. 4), and was also higher in the non-forested areas of the Knoll and Herbert Run (0.60 to 4.49 m) compared with forested areas. Nevertheless, Ecosynth leaf-off DTMs accurate to within 1-3 m RMSE of LIDAR DTMs can be produced, adequate for estimating and mapping forest canopy heights.
In Tropical Moist Forests and other regions without leaf-off periods, the limited leaf-on DTM accuracy of Ecosynth methods remains a significant challenge to producing accurate measurements of vegetation structure. Even under leaf-off conditions, there are multiple challenges to producing accurate Ecosynth DTMs. Leaf-off point densities in forested areas of Herbert Run (Fig. 3b) were much lower than at the other sites, although non-forest densities were comparable (Table 1). Some inherent characteristic of Herbert Run forests might explain this, but differences in lighting conditions offer a stronger explanation: imagery used for Herbert Run DTM generation was collected under overcast conditions, in contrast with the Knoll (partly cloudy) and SERC (clear), where brighter understory illumination may have enhanced computer vision point recognition and produced deeper and denser understory point clouds. Further study of the effects of lighting and other scene conditions may help identify more optimal strategies for Ecosynth DTM production. A second challenge in Ecosynth DTM production is terrain filtering.
Even after noise filtering to remove extreme outliers, Ecosynth DTMs tended to retain large sinks caused by low outliers in the terrain point cloud that were not removed by terrain filtering algorithms, which were designed for LIDAR point clouds (Sithole & Vosselman, 2004). These sinks are clearly visible in the north-east, north-west, and southern parts of the Knoll leaf-off DTM (Fig. 4b and c). DTM accuracy is generally influenced by terrain slope, vegetation cover, and the type of filtering algorithm employed (Sithole & Vosselman, 2004; Tinkham et al., 2011; Tinkham et al., 2012), with the greatest accuracies usually achieved by manual filtering (Gatziolis et al., 2010; Kobler et al., 2007). Terrain filtering algorithms designed specifically for Ecosynth DTM production would likely produce stronger results than those designed for LIDAR point clouds, another useful area for future study.
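One simple way to suppress sink-forming low outliers, sketched below under stated assumptions, is to compare each terrain point's elevation against the median of its local grid cell and discard points far below it. This is a hypothetical filter for illustration (the function `drop_low_outliers` and its thresholds are not the paper's method, and production tools use more sophisticated morphological or progressive filters):

```python
import numpy as np

def drop_low_outliers(points, cell=5.0, max_drop=3.0):
    """Remove points far below their grid cell's median elevation.

    points: (N, 3) array of x, y, z coordinates in meters.
    cell: horizontal grid cell size used to define local neighborhoods.
    max_drop: allowed depth below the cell median before a point is
    flagged as a low outlier (a sink-causing artifact).
    """
    ix = np.floor(points[:, 0] / cell).astype(int)
    iy = np.floor(points[:, 1] / cell).astype(int)
    keep = np.ones(len(points), dtype=bool)
    for key in set(zip(ix.tolist(), iy.tolist())):
        mask = (ix == key[0]) & (iy == key[1])
        med = np.median(points[mask, 2])  # robust local ground estimate
        keep[mask] &= points[mask, 2] >= med - max_drop
    return points[keep]

# Toy cell: five ground returns near 100 m plus one 8 m-deep artifact.
pts = np.array([[1, 1, 100.2], [2, 1, 99.9], [3, 2, 100.1],
                [1, 3, 100.0], [2, 2, 100.3], [3, 3, 92.0]])
print(len(drop_low_outliers(pts)))  # the 92.0 m artifact point is removed
```

The median is used rather than the mean so that a single deep outlier does not drag the local ground estimate downward, which would let it survive the filter.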

Advancing computer vision remote sensing
By combining automated UAS image acquisition with state-of-the-art computer vision algorithms, consistent and repeatable high spatial resolution 3D point clouds of vegetation were produced across study sites with practical levels of computer resources, largely addressing the major challenges raised in prior work (Dandois & Ellis, 2010). Yet substantial room remains to improve understanding of the parameter space of computer vision remote sensing systems (Table 7). With LIDAR, observational error models and the effects on accuracy of different sensor parameters, including altitude and scan resolution, are well understood thanks to decades of research (Glennie, 2007; Naesset, 2009). With Ecosynth, basic questions remain about the effects on accuracy of basic elements of the remote sensing system (e.g., the platform, camera, processing algorithms) and the conditions of observation (e.g., wind, illumination, forest type and phenology), and these parameters likely interact in determining the quality and accuracy of Ecosynth results. It is also not clear precisely how computer vision algorithms "see" canopy structure to identify features in imagery (e.g., leaves, branches, gaps), or how ecologically relevant spectral information might be better acquired by these algorithms. Future investigations of these factors influencing Ecosynth data quality and accuracy across a range of different forest types should enable a more complete understanding of how Ecosynth methods can be optimized to measure forest structural and spectral traits and their dynamics.

Conclusions
Ecosynth methods produce coupled spectral and structural observations at the high spatial and temporal resolutions required to observe vegetation phenology in 3D, portending new approaches to observing and understanding the dynamics of woodland ecosystems. Moreover, Ecosynth yields 3D forest measurements and mapping products comparable to LIDAR and field-based methods at low economic and logistical costs, facilitating multispectral 3D scanning of vegetation on demand at landscape scales (<1 km²) by end users of these data, heralding a new era of participatory remote sensing by field ecologists, community foresters, and even the interested public. Applications of Ecosynth range from high spatial resolution 3D observations of vegetation phenology at the cutting edge of ecological research to the monitoring of forest carbon stocks or habitat quality by local land managers and conservation groups (Goetz & Dubayah, 2011). This is only the beginning of the transformation of remote sensing by computer vision technologies. By combining inexpensive imagery with computation for 3D canopy reconstruction, computer vision remote sensing systems can be made ever more lightweight, inexpensive, and easy to use. As computing power increases, Ecosynth and related methodologies might ultimately enable multispectral 3D remote sensing on demand by anyone with a cell phone.

Table 7
Key factors influencing the quality of data obtained by computer vision remote sensing.

Factor | Effects on data quality
Platform | Altitude, speed, and flight path overlap affect the detail and depth of canopy that can be observed. Camera angle and potentially camera array structure may affect point densities, detail, and depth of observations into the canopy.
Camera | Resolution, frame rate, overlap, exposure, color settings, and spectral channels (RGB, NIR) may all affect feature identification and matching, resulting in different point cloud spectral properties and densities.
Algorithms | Algorithms for feature identification, feature matching, use of secondary densification algorithms, color assignment to features, and camera calibration may affect point cloud 3D model accuracy, density, and spectral properties.
Georeferencing | UAS GPS and GCP quality affect spatial accuracy of point clouds and estimates of vegetation structure.
Post-processing, filtering | Different filtering algorithms (e.g., DTM filtering) affect accuracy of terrain and canopy height models.
Wind | Route-following errors can reduce image overlap; moving leaves and branches limit feature matching and generate positional errors.
Illumination | Brighter light/full sun increases shadow, leading to decreased penetration in CHPs. Diffuse lighting appears to increase penetration in CHPs but also lowers contrast, reducing feature identification.
Forest: type, species, phenology