A review on stereo vision for feature characterization of upland crops and orchard fruit trees

Md Rejaul Karim; Shahriar Ahmed; Md Nasim Reza; Kyu-Ho Lee; HongBin Jin; Mohammod Ali; Sun-Ok Chung; Joonjea Sung

doi:10.22765/pastj.20240008

Review Article

Precision Agriculture Science and Technology. 30 June 2024. 104-122
https://doi.org/10.22765/pastj.20240008

Md Rejaul Karim¹

Shahriar Ahmed¹

Md Nasim Reza¹²

Kyu-Ho Lee¹

HongBin Jin²

Mohammod Ali¹

Sun-Ok Chung¹²^*^†

Joonjea Sung³^†

¹Department of Agricultural Machinery Engineering, Graduate School, Chungnam National University, Daejeon 34134, Republic of Korea

²Department of Smart Agricultural Systems, Graduate School, Chungnam National University, Daejeon 34134, Republic of Korea

³FYD Company Ltd., Suwon 16676, Republic of Korea

^*Corresponding Author

^†These authors equally contributed to this study as corresponding authors.

License (open-access, https://creativecommons.org/licenses/by-nc/4.0/):

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

ABSTRACT

Characterization of plant features is essential for monitoring plant growth and for precise management practices in crop production. Stereo vision captures images from multiple perspectives to create a three-dimensional (3D) representation, enabling thorough analysis of crop structure and morphology. The objective of this study was to review the application of stereo vision in feature characterization of plants and fruit trees. Plant features such as height, canopy volume, plant spacing, intra-row spacing, and leaf area were surveyed for their characterization potential, along with data acquisition and data processing algorithms including image segmentation, 3D image reconstruction, depth mapping, and disparity mapping. The reviewed studies reported the following results for stereo vision. Tree canopy estimation showed errors of 6-7% for elliptical and 2-3% for conical trees. Corn plants were detected with an accuracy of 96.7% under natural light conditions, with maximum distance errors of 5 cm and 1 cm for 74.6% and 62.3% of the detections, respectively. The plant heights of cabbage, potato, sesame, radish, and soybean were estimated with R² values of 0.78 to 0.84 and errors of less than 5%. Stereo vision achieved 97% precision with an RMSE of 0.016 m in wheat height measurement, and distance measurement for hazelnut trees showed errors below 5% at distances below 20 m. The paper concludes by addressing challenges, summarizing key findings, and suggesting directions for further research in plant growth and crop production practices.

Keywords

Precision agriculture

Stereo vision

Stereo matching

Depth mapping

Orchard

MAIN

Introduction
Working principle of stereo vision sensor and calibration process
Algorithms for plant feature characterization
Application of stereo vision for feature characterization in crops
Application of stereo vision for feature characterization in fruit trees
Factors affecting stereo vision and stereo matching algorithm
Challenges of stereovision imaging and solutions
Conclusions

Introduction

Feature characterization is the process of extracting and describing distinctive features or points of interest in an image, and it is necessary for applications such as object tracking, 3D reconstruction, and plant monitoring. Stereo vision systems can be used for detailed 3D characterization and feature extraction of plants by combining RGB color data with depth information. Characterization of plant features such as plant height, canopy volume, row distance, and plant spacing is essential for understanding and managing plant growth and agricultural practices. In the cultivation of upland crops and fruits, it is crucial for optimizing resources, yield estimation, monitoring and management, harvest planning, resource allocation, and research and development that promote sustainable agricultural practices. Stereo vision uses two or more cameras to capture images of the same scene from different viewpoints. These images are then processed to extract corresponding feature points and calculate their 3D coordinates through triangulation. It is necessary for applications that require depth perception, such as autonomous driving, robotics, and 3D reconstruction (Amean et al., 2013). It provides high-resolution and accurate depth maps, especially for close-range objects and textured surfaces. Stereo vision can work in various lighting conditions and is cost-effective compared with other depth sensing technologies. Stereo vision has enabled feature characterization of fruit trees and upland crops using depth maps and disparity maps (Hu et al., 2020; Hao et al., 2022; Zhong et al., 2021), and it has also been applied to autonomous driving in agriculture (Feng et al., 2020; Muhovic and Pers, 2020), robot navigation (Yu et al., 2019; Zhang et al., 2022), and agro-industrial inspection (Chen and Shen, 2023; Zhou et al., 2020).

Several key methods are followed for feature characterization of plants using stereo vision: (i) structural characterization, which constructs the 3D structure and geometry of plants by generating dense point clouds from the depth data, measures plant height, leaf area, biomass, and other structural traits from the 3D models, and analyzes growth patterns and morphological changes over time by registering 3D models from different time points (Ruigrok et al., 2024; Dandrifosse et al., 2020); (ii) spectral characterization, which estimates plant health parameters such as chlorophyll content and nutrient deficiencies from the spectral data (Yoon and Thai, 2010); and (iii) deep learning for feature extraction, which uses RGB-D data from stereo cameras as input to deep learning models for plant detection and segmentation in dense scenes, and trains convolutional neural networks (CNNs) on stereo data to automatically extract features such as leaf angles and plant architecture (Xiang et al., 2023).

Stereo vision has been applied in many smart agricultural activities, particularly in the cultivation of upland crops and fruit trees, such as detecting, mapping, and digitizing canopy geometry for different plant architectures (Scalisi et al., 2024), geometric characterization of trees (Rosell and Sanz, 2012), crop height estimation (Kim et al., 2021), orchard and tree mapping (Nielsen et al., 2009), automatic plant feature recognition (Amean, 2017), and characterization of upland crop plants (Xiang et al., 2023). It is an important technique for quantifying plant features such as canopy height, canopy volume, and plant and row spacing for smart agriculture practices (Malekabadi et al., 2019; Bao et al., 2019; Dandrifosse et al., 2020; Guo et al., 2018; Kim et al., 2020; Ni and Burks, 2018; Shan et al., 2018; Rovira-Mas et al., 2010; Milien et al., 2012). Stereo vision has also enabled the reconstruction of 3D images of plants and trees (Lee et al., 2010; Sanz-Cortiella et al., 2011a, 2011b; Rosell and Sanz, 2012; Ni and Burks, 2013; Usha and Singh, 2013; Malekabadi et al., 2019). Stereo vision offers cost-effective and rapid plant structural information, including growth patterns and 3D geometry reconstruction. It facilitates measurements of plant height, convex hull volume, surface area, and stem diameter, surpassing several other sensors used in agriculture (Dandrifosse et al., 2020; Ni et al., 2016; Ni and Burks, 2013; Ni and Burks, 2018). For these reasons, stereo vision is becoming more popular for feature characterization of plants.

A previous study highlighted stereo vision with a custom image processing algorithm that calculated geometric features such as leaf area and plant dimensions of Boston lettuce grown in a plant factory, with promising results (Yeh et al., 2014). Stereo vision has joined other sensors and measurement techniques used in plant feature characterization, such as laser sensors, depth cameras, LiDAR, high-resolution radar, ultrasonic sensors, digital photographs, and high-resolution X-ray computed tomography (Rosell and Sanz, 2012; Malekabadi et al., 2019; Muller-Linow et al., 2015). Stereo vision has been used for disparity mapping and for analyzing tree and plant geometry, and it assists in distance measurement when a stereo vision camera is attached to agricultural machinery. It has also been utilized for object detection, fruit recognition, and growth stage monitoring through accurate 3D reconstruction with depth mapping (Tavares and Vaz, 2009). It is preferred for plant feature characterization because it provides cost-effective and rapid 3D imaging, adapts to natural field conditions, and can overcome challenges such as homogeneous leaf texture and complex canopy architecture. Its suitability for outdoor imaging under varying illumination conditions and its high-resolution depth and color information make stereo vision crucial for plant feature characterization. Given the varied advantages and extensive applicability of stereo vision technology, this review aimed to provide an overview of its use in characterizing features of upland crops and fruit trees.

Working principle of stereo vision sensor and calibration process

Stereo vision sensors are widely used in agricultural applications to provide depth perception and 3D information about objects and the environment. Examples include plant monitoring, such as plant height and canopy volume measurement using disparity and depth maps, autonomous navigation of agricultural vehicles such as tractors and combine harvesters under field conditions, and inter-plant spacing measurement (Jin and Tang, 2009; Vazquez-Arellano et al., 2016). Several stereo vision sensors commonly used for stereo vision technology are shown in Table 1 with their specifications.

Stereo vision uses two identical cameras to capture two images of a target from different perspectives, and depth is inferred from these two viewpoints. Fig. 1 represents the working principle of the stereo vision technique. Stereo vision generates depth maps from images by using the triangular similarity of rays from multiple viewpoints, mimicking human visual perception. Depth is quantified by recording objects with two cameras located on a common baseline, with a fixed distance between the two lens centers (Degadwala et al., 2020). Observing a scene from two slightly different perspectives is the main concept of stereo vision: the relationship between pixel positions in the two images is determined according to the principle of triangulation, from which three-dimensional information can be extracted. In Fig. 1, the image coordinate system is illustrated with deviations dl and dr for the left and right images of a stereo pair, respectively. The baseline (b) is the distance between the two lens centers, and the focal length (f) is the distance from each image plane to the center of its lens. The difference between the two offsets is inversely proportional to the distance (Rd) from the object to the camera, and this relationship is utilized to compute the depth information of the object. Using the values of dl and dr, the disparity (Dp) is calculated according to equation (1).

(1)

Dp = dl - dr

https://cdn.apub.kr/journalsite/sites/kspa/2024-006-02/N0570060204/images/kspa_2024_062_104_F1.jpg

Fig. 1.

Working principle of stereo vision.

Table 1.

Several stereo vision sensors that are usually used for stereo vision technology.

Sensors	Technical specifications	Purpose
Binocular camera module	Resolution: 8 MP (3280×2464 pixels); Sensor: IMX219; Focal length: 2.6 mm; FOV: 83°×73°×50°; Distortion: <1%; Base length: 60 mm	Measuring leaf area index (LAI), mean tilt angle (MTA), leaf angle distribution (LAD), and canopy height
ZED 2	Frame rate: 30 fps; High definition: 1080p; Improved depth perception with neural engine	Remote monitoring and data collection
Bumblebee	Focal length: 3.8 mm; FOV: 70° HFOV; Baseline: 12 cm; Frame rate: 1.8-30 fps; Gain: automatic/manual; Power: 8-32 V	Estimation of vegetation indices (VIs), chlorophyll content in leaves, LAI, biomass, and plant size
Intel RealSense D415	Ideal range: 0.5 m to 3 m; FOV: 65°×40°; Depth resolution: 1280×720 pixels; Depth frame rate: 90 fps; RGB resolution: 1920×1080 pixels; RGB FOV: 69°×42°; RGB frame rate: 30 fps; RGB sensor resolution: 2 MP	Weed detection
Intel RealSense D435	Ideal range: 0.2 m to 10 m; FOV: 65°×40°; Depth resolution: 1280×720 pixels; Depth frame rate: 90 fps; RGB resolution: 1920×1080 pixels; RGB FOV: 69°×42°; RGB frame rate: 30 fps; RGB sensor resolution: 2 MP	Plant height estimation
HBVCAM binocular synchronous camera module	Ideal range: 30 cm to infinity; Maximum resolution: 2560×720 pixels	Plant feature characterization

Rd is calculated from Dp using equation (2).

(2)

Rd = \frac{b \times f}{Dp \times W}

Where b is the baseline of the stereo camera (mm), f is the focal length of the lenses (mm), Dp is the disparity (pixels), W is the pixel size (mm/pixel), and Rd is the Z-coordinate distance in a 3D frame (mm).

Similarly, the X and Y coordinates are calculated according to equation (3) using the pixel coordinates xl and yl of the matched point and the distance Rd from equation (2) (Rovira-Mas et al., 2004).

(3)

X = \frac{xl \times W \times Rd}{f}, Y = \frac{yl \times W \times Rd}{f}
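
As an illustration of equations (1) to (3), the following minimal Python sketch computes the disparity, the depth, and the lateral coordinates of a single point. The function names and all numeric values are hypothetical and chosen only for easy arithmetic, and the pixel coordinates are assumed to be measured from the image centre; it is not taken from any of the reviewed studies.

```python
def disparity_px(dl_px, dr_px):
    """Equation (1): Dp = dl - dr (pixels)."""
    return dl_px - dr_px


def depth_mm(b_mm, f_mm, dp_px, w_mm_per_px):
    """Equation (2): Rd = (b * f) / (Dp * W), returned in mm."""
    if dp_px <= 0:
        raise ValueError("disparity must be positive to give a finite depth")
    return (b_mm * f_mm) / (dp_px * w_mm_per_px)


def lateral_mm(coord_px, w_mm_per_px, rd_mm, f_mm):
    """Equation (3): X = xl * W * Rd / f (use yl in place of xl for Y)."""
    return coord_px * w_mm_per_px * rd_mm / f_mm


# Hypothetical camera and pixel values, chosen only for easy arithmetic.
b, f, w = 60.0, 3.0, 0.002   # baseline (mm), focal length (mm), pixel size (mm/pixel)
dl, dr = 120.0, 30.0         # offsets in the left and right images (pixels)
xl, yl = 150.0, -75.0        # pixel coordinates (assumed relative to the image centre)

dp = disparity_px(dl, dr)
rd = depth_mm(b, f, dp, w)
x = lateral_mm(xl, w, rd, f)
y = lateral_mm(yl, w, rd, f)
print(f"Dp = {dp:.0f} px, Rd = {rd:.0f} mm, X = {x:.0f} mm, Y = {y:.0f} mm")
```

With these assumed values the point lies roughly 1 m from the camera. Halving the disparity would double the computed depth, which is why small matching errors matter more for distant objects.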

Precise camera calibration is essential for accurate 3D information because it corrects lens distortions and estimates intrinsic parameters such as focal length and principal point (Malekabadi et al., 2019). Calibration methods such as those using chessboard patterns optimize accuracy by minimizing discrepancies in observed features (Kumar et al., 2020; Zhang et al., 2023). Various calibration approaches, including manual, MATLAB toolbox, and OpenCV-based methods, are adopted for stereo vision (Rovira-Mas et al., 2010; Bao et al., 2019; Zhang et al., 2022), with Zhang's method being popular due to its simplicity and robustness (Sampling and Methods, 2023; Yu et al., 2019; Li et al., 2018; Zhong et al., 2021). Multi-camera calibration methods (Liu et al., 2022), such as those using circular plates for feature extraction, enhance precision (Cui et al., 2016). Calibration significantly impacts stereo vision accuracy (Feng and Fang, 2021; Sampling and Methods, 2023; Yu et al., 2019), with errors leading to increased uncertainties in 3D reconstruction (Korthals et al., 2018).

Data acquisition and processing techniques

Data acquisition and processing techniques are vital for the effectiveness of stereo vision and matching algorithms. By applying appropriate data acquisition and processing techniques, stereo vision systems can achieve robust and accurate depth estimation, which is essential for applications such as autonomous navigation, augmented reality, and object recognition. Ni et al. (2016) developed a procedure for data acquisition of plants and trees using stereo vision, in which two stereo cameras were assembled in parallel with a baseline of 30 mm for reconstruction of the full view of plants and trees. Multiple images were captured from different angles of view, with the target plants and trees in the center and the stereo vision camera positioned around them. Moreover, images were suggested to be captured from adjacent locations with an overlapping region. Bao et al. (2019) used Phenobot 1.0, an autonomous data acquisition system using stereo vision, and collected over 100,000 stereo images of sorghum, a tall crop with dense canopies. The study faced challenges at the maturity stage because of the plant height (0.5-3 m) and because horizontally spreading leaves blocked the mid-level and top-level camera views. The study suggested an alternative solution of attaching additional sets of stereo vision camera heads vertically with variable tilting angles. The processing consists of image segmentation, 3D image reconstruction, depth mapping, and disparity mapping.

Image segmentation is crucial in various computer vision applications, such as 3D reconstruction, classification, object recognition, and motion detection (Mohammed and El-Sheimy, 2019). One study highlighted the effectiveness of stereo vision in 3D reconstruction, noise-free object segmentation, and producing a segmented disparity map (Mohammed and El-Sheimy, 2019). Combining stereo vision with a clustering algorithm was found to provide improved segmentation results compared with methods relying solely on color or geometry (Dal et al., 2012; Mutto et al., 2011; Imaging et al., 2021; Sheng et al., 2020; Zhao et al., 2016). Various methods exist for image segmentation, including thresholding, edge-based, region-based, watershed, clustering-based, and neural-network-based methods (Imaging et al., 2021). The thresholding method analyzes the gray-level histogram of full or partial images to generate threshold values and segments objects by clustering pixels; it is widely accepted for its simplicity, robustness, and accuracy (Imaging et al., 2021; Li et al., 2011). The edge-based method identifies object boundaries by detecting image edges, offering low complexity but susceptibility to noise (Imaging et al., 2021). The region-based method uses the homogeneity of pixel features such as gray scale, color, or texture to partition the image into regions with distinct characteristics (Angelina et al., 2012; Khokher et al., 2013).
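
The histogram-based thresholding described above can be sketched as follows. This example uses Otsu's criterion (maximizing the between-class variance) as one common way to pick the threshold. That choice, the function name, and the pixel values are assumptions made for illustration, not the specific method of the cited studies.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that best separates pixels <= t from pixels > t."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_between = 0, -1.0
    w0, sum0 = 0, 0                      # pixel count and intensity sum of the dark class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0                  # pixel count of the bright class
        if w0 == 0 or w1 == 0:
            continue                     # one class is empty, so there is no valid split here
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2   # between-class variance (unnormalized)
        if between > best_between:
            best_t, best_between = t, between
    return best_t


# Hypothetical 8-bit gray levels: four dark background pixels, four bright plant pixels.
gray = [10, 12, 11, 13, 200, 205, 210, 198]
t = otsu_threshold(gray)
mask = [p > t for p in gray]             # True where the pixel falls in the bright class
print(t, mask)
```

A real image would be flattened into the `pixels` list first; the same function then applies to a full or partial image, as the thresholding description suggests.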

The watershed method treats an image as a topographic map, utilizing variations in flood water heights and watershed lines for segmentation (Chai et al., 2006; Kang et al., 2009). A high-efficiency hardware accelerator was developed for a self-organizing map (SOM) neural network to implement unsupervised color segmentation of stereo images in real time (Imaging et al., 2021; Torbati et al., 2014; Ortiz et al., 2014). The clustering method leverages intraclass and interclass homogeneity for optimal segmentation, often using K-means clustering (Imaging et al., 2021), fuzzy c-means (Li and Shen, 2008), and probabilistic extensions such as the Gaussian mixture model with the expectation–maximization algorithm. This approach was widely adopted for its simplicity and accurate segmentation results (Fauvel et al., 2013). Stereo rectification is a process that eliminates lens distortions and standardizes image pairs by aligning the optical axes and ensuring row alignment of the image planes; plant image pairs are rectified in this way. Stereo rectification aligns the images along epipolar lines and compensates for lens distortion, including the fish-eye effect around the image boundary (Li et al., 2017).
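
To make the clustering-based segmentation concrete, the sketch below runs a plain K-means on RGB pixel values. The deterministic initialization, the function names, and the sample colors are assumptions made for illustration; practical pipelines usually rely on a library implementation and random restarts.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two color tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=20):
    """Cluster points into k groups; returns (labels, centers)."""
    if len(points) < k:
        raise ValueError("need at least k points")
    # Assumed deterministic initialization: k points evenly spaced through the input.
    step = len(points) // k
    centers = [points[i * step] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each pixel joins its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centers.append(centers[c])   # keep an empty cluster's center unchanged
        if new_centers == centers:
            break                                # converged
        centers = new_centers
    return labels, centers


# Hypothetical RGB pixels: three greenish (leaf) and three brownish (soil).
pixels = [(30, 120, 40), (35, 130, 45), (28, 125, 38),
          (120, 90, 60), (125, 95, 65), (118, 88, 58)]
labels, centers = kmeans(pixels, k=2)
print(labels)
```

The resulting labels can be reshaped to the image dimensions to obtain a segmentation mask. Adding the depth value as a fourth feature is one way to combine color and geometry.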

For projective reconstruction of plant and tree canopies with stereo cameras, Ni et al. (2016b) used the visual structure-from-motion (VisualSFM) method. The actual 3D points for image pairs were estimated, and the projective reconstruction was transformed into a metric reconstruction through rigid transformation. A validation experiment using a hexagon box demonstrated the ability of the method to achieve true-size reconstruction. For 3D reconstruction of a dormant cherry tree, two Kinect devices were used in an indoor environment, where some branches were missed because of occlusion and the long distance between the camera and the tree (Wang and Zhang, 2013). To reconstruct 3D models of horticultural crops, Song and Eng (2008) used stereo vision with cameras installed above the crops to reconstruct the top view of the plants. Han and Burks (2013) studied 3D reconstruction of citrus canopy using stereo depth cameras, where the eight-point algorithm was used for stitching consecutive images into a mosaic, but the results did not achieve real-size reconstruction. Stereo vision was also used for 3D reconstruction of corn plants (Wang et al., 2009).

Cheng et al. (2016) demonstrated that depth mapping through stereo vision can be used to estimate the 3D structure of a scene from images captured by a stereo camera. The study also showed that pixel depth could be estimated by matching pixels and knowing the camera geometry, and that depth maps were applied in robot navigation, driver assistance systems, and autonomous driving. Ansari et al. (2010) mentioned that a depth map provides distance information for tasks such as object detection and distance estimation, and that stereo vision-based depth sensing captures depth at longer ranges with a high frame rate and a larger field of view, making it suitable for both indoor and outdoor applications. The depth map enables 3D object detection and 3D reconstruction of objects in agricultural fields. Fig. 2 shows the procedure for calculating the depth values from the images for depth mapping according to equation (4):

(4)

D = f \frac{b}{d} = f (\frac{x}{x_{l}}) = f (\frac{b}{x_{l} - x_{r}})

Where D is the depth value from the images, f is the focal length, b is the baseline, and d = x_l - x_r is the disparity between the two images; by the same relation, the ratio x/x_l equals b/d.

https://cdn.apub.kr/journalsite/sites/kspa/2024-006-02/N0570060204/images/kspa_2024_062_104_F2.jpg

Fig. 2.

Depth and disparity calculation using stereo vision.

A disparity map is a visual representation of the difference in position between corresponding pixels of a pair of stereo images, from which depth is derived. In agriculture, it is used to monitor the growth of trees, plants, and crops, as well as for crop detection and height measurement in digital farming (Nugroho et al., 2020). The disparity map helps in assessing the geometric characteristics of tree canopies and is an important tool in precision agriculture. It has diverse applications in robotics, object detection, remote sensing, and autonomous driving that are closely relevant to agriculture (Quan et al., 2023; Shean et al., 2016; Zhou et al., 2020). It enables automatic measurement of crop height, which is a crucial factor in agricultural management and decision-making. The use of stereo vision systems and the computation of the disparity map contribute to efficient monitoring and management of agricultural resources and can support agricultural innovation and productivity improvement (Malekabadi et al., 2019). Several commonly used stereo vision sensors are listed in Table 1 with their technical specifications. A disparity map is a matrix with the same dimensions as the image, and its values represent the distances between corresponding pixels in the left and right images of a stereo pair (Malekabadi et al., 2019). Amean (2017) described a binocular stereo model with identical cameras separated by a baseline distance and coplanar image planes for disparity calculation.
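
Because a disparity map is simply a matrix, converting it into a depth map with equation (4) is an element-wise operation. The sketch below assumes the focal length is expressed in pixels and the baseline in millimetres, so that depth comes out in millimetres. It also assumes a disparity of zero marks an unmatched pixel, and all values are hypothetical.

```python
def depth_map_mm(disparity_map, f_px, b_mm):
    """Apply D = f * b / d to every pixel; unmatched pixels (d <= 0) become None."""
    return [
        [(f_px * b_mm / d) if d > 0 else None for d in row]
        for row in disparity_map
    ]


# Hypothetical 2 x 2 disparity map in pixels; 0 marks a pixel with no match (assumption).
disparities = [
    [84, 42],
    [0, 21],
]
depths = depth_map_mm(disparities, f_px=700.0, b_mm=120.0)
for row in depths:
    print(row)
```

Larger disparities map to nearer surfaces, so in a crop scene the canopy top typically shows higher disparity values than the ground seen from an overhead camera.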

Algorithms for plant feature characterization

Stereo matching algorithms and 3D reconstruction algorithms are used for plant feature characterization with stereo vision. Stereo matching algorithms are crucial for characterizing plant features because they measure and reconstruct 3D structures quickly and affordably. Corresponding pixels are identified in multiple views to compute disparities, aiding applications such as autonomous driving and robotics (Quan et al., 2023; Shean et al., 2016; Zhou et al., 2020). Methods such as energy minimization and window-based comparison are utilized in stereo matching algorithms (Malekabadi et al., 2019; Quan et al., 2023; Yao and Xu, 2019; Zhou et al., 2020). Stereo matching algorithms are categorized into local, global, and semi-global methods (Zhou et al., 2020), with approaches including intensity-based and feature-based matching (Chen et al., 2023; Islam et al., 2023; Okura, 2022; Zhong et al., 2021). Stereo matching has been enhanced by deep learning (Zhou et al., 2020); non-end-to-end and unsupervised learning algorithms show promise but face challenges such as high computational errors and low-quality results (Quan et al., 2023; Tankovich et al., 2021). Techniques such as stereo camera calibration and rectification are crucial for quality 3D reconstruction of field crops (Bao et al., 2019).
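
A local, window-based matcher can be sketched in a few lines. The example below uses the sum of absolute differences (SAD) as the window cost on a single rectified image row. The cost function, the window size, and the synthetic intensities are assumptions chosen for illustration rather than the method of any cited study.

```python
def sad(left_row, right_row, x, d, half):
    """Sum of absolute differences between the window at x (left) and x - d (right)."""
    return sum(
        abs(left_row[x + k] - right_row[x - d + k])
        for k in range(-half, half + 1)
    )


def match_row(left_row, right_row, max_disp, half=1):
    """Return the best disparity per pixel of a rectified row (0 where no window fits)."""
    n = len(left_row)
    disparities = [0] * n
    for x in range(half, n - half):
        best_d, best_cost = 0, None
        for d in range(max_disp + 1):
            if x - d - half < 0:
                break                    # the right-image window would leave the row
            cost = sad(left_row, right_row, x, d, half)
            if best_cost is None or cost < best_cost:
                best_d, best_cost = d, cost
        disparities[x] = best_d
    return disparities


# Synthetic rectified rows: the left row is the right row shifted by 2 pixels.
right = [5, 9, 40, 80, 20, 60, 7, 3, 1, 0]
left = [0, 0, 5, 9, 40, 80, 20, 60, 7, 3]
disp = match_row(left, right, max_disp=3)
print(disp)
```

On textured data such as this the true shift of 2 pixels is recovered wherever the window fits. On homogeneous leaf surfaces many candidate windows give similar costs, which is the weakness that global and semi-global methods address with smoothness terms.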

Various techniques have been used for 3D reconstruction of plants in agricultural fields using stereo vision to address challenges such as capturing non-rigid plants in noisy environments. These techniques include sensor data fusion, real-time reconstruction using depth cameras such as Microsoft Kinect, algorithms based on multi-view image sequences, and integration of machine learning for crop analysis and morphological feature characterization (Chen et al., 2020; Sampaio et al., 2021). These approaches cater to the unique requirements of 3D reconstruction in agriculture, providing advantages such as detailed morphological feature characterization and improved monitoring of plant characteristics. Real-time 3D reconstruction using Microsoft Kinect cameras was found to be cost-effective for in-field applications (Harandi et al., 2023). Integration of stereo matching algorithms with 3D reconstruction techniques could enhance the accuracy and efficiency of plant feature characterization. Fig. 3 shows the flow diagram of stereo matching for 3D model reconstruction of plants.

https://cdn.apub.kr/journalsite/sites/kspa/2024-006-02/N0570060204/images/kspa_2024_062_104_F3.jpg

Fig. 3.

Flow diagram of stereo matching for 3D model reconstruction of plants.

Application of stereo vision for feature characterization in crops

Measurement of plant height and canopy volume

Kim et al. (2021) described plant height as an important morphological factor for crop growth identification, yield prediction, and crop cultivation management. Jiang et al. (2016) estimated the height of cotton plants using a tractor-mounted setup with a Kinect-v2 sensor, demonstrating its efficacy in real field conditions. Andujar et al. (2016) developed a 3D model for measuring cauliflower plants using a Kinect-v1 based algorithm, showing only 2 cm variance from the actual height. Dandrifosse et al. (2020) generated depth maps of wheat plants and used a segmentation mask to obtain the plant height as the difference between the camera-to-ground and camera-to-wheat distances (Fig. 4). Stereo vision achieved 97% precision in determining mean spike top heights when compared with manual measurements, with an RMSE of 0.016 m. Kim et al. (2021) estimated the plant height of five crops (cabbage, potato, sesame, radish, and soybean) using stereo vision, with R² ranging from 0.78 to 0.84 and less than 5% error. Fig. 5 shows the steps of the stereo image processing technique for measuring crop plant height. Three types of plants (croton, jalapeno pepper, and lemon tree) with varying leaf sizes were reconstructed with stereo vision using a metric reconstruction method for canopy volume estimation, and the reconstruction accuracy was verified with a hexagon box of known volume wrapped with printed citrus leaf images. Plant canopy volumes were calculated by dividing bounding boxes into voxels, removing unused voxels, and summing the volume of the remaining voxels (Ni et al., 2016).
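
The two measurements described above reduce to short computations once depth data are available. The sketch below shows plant height as the difference between camera-to-ground and camera-to-plant distances, and canopy volume as the summed volume of occupied voxels. The function names, voxel size, and point coordinates are hypothetical, and the voxel grid here is a simplified stand-in for the bounding-box procedure of the cited study.

```python
import math


def plant_height_m(camera_to_ground_m, camera_to_plant_m):
    """Height of the plant top above ground for a downward-looking camera."""
    return camera_to_ground_m - camera_to_plant_m


def voxel_volume_m3(points, voxel_size_m):
    """Sum the volume of voxels that contain at least one 3D point."""
    occupied = {
        tuple(math.floor(c / voxel_size_m) for c in p)   # integer voxel index per axis
        for p in points
    }
    return len(occupied) * voxel_size_m ** 3


# Hypothetical depth readings (m) taken from a depth map.
height_m = plant_height_m(camera_to_ground_m=1.50, camera_to_plant_m=0.85)

# Hypothetical canopy points (m); the first two fall in the same 0.1 m voxel.
canopy_points = [
    (0.01, 0.02, 0.03),
    (0.05, 0.05, 0.05),
    (0.15, 0.02, 0.03),
    (0.15, 0.12, 0.03),
]
volume_m3 = voxel_volume_m3(canopy_points, voxel_size_m=0.1)
print(f"height = {height_m:.2f} m, volume = {volume_m3:.3f} m^3")
```

The voxel size controls the trade-off: large voxels overestimate sparse canopies, while very small voxels undercount when the point cloud is not dense enough to fill them.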

https://cdn.apub.kr/journalsite/sites/kspa/2024-006-02/N0570060204/images/kspa_2024_062_104_F4.jpg

Fig. 4.

Schematic diagram of processing the stereo images. (A) depth and color processing, and (B) RGB image processing including segmentation, clustering, and depth filtering.

https://cdn.apub.kr/journalsite/sites/kspa/2024-006-02/N0570060204/images/kspa_2024_062_104_F5.jpg

Fig. 5.

Schematic diagram of plant height estimation using stereo vision.

Plant spacing and row distance measurement

Kim et al. (2021) utilized stereo images to generate disparity maps, calculating pixel disparities to determine crop depth for inter-row distance and plant spacing measurements. Qiu et al. (2018) highlighted the laborious nature of conventional inter-plant spacing measurements, which makes automatic measurement methods necessary. Stereo vision was suggested for plant spacing and row distance measurements because individual plants are difficult to separate using colour cameras alone. Mooney and Johnson (2014) demonstrated a stereo vision-based corn plant sensing technique with promising performance in individual corn plant detection and centre location measurements. Under natural light conditions, 96.7% of plants were correctly detected, with maximum distance errors of 5 cm and 1 cm for 74.6% and 62.3% of detections, respectively. Therefore, stereo vision can successfully measure plant spacing and row distance.
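
Once plant centres have been located along a row, spacing and location accuracy follow from simple arithmetic. The sketch below computes the spacing between consecutive detections and the share of detections that fall within a distance tolerance of reference positions. The nearest-reference matching rule and all positions are assumptions made for illustration, not the evaluation protocol of the cited study.

```python
def plant_spacings_cm(centres_cm):
    """Spacing between consecutive plant centres along a row."""
    ordered = sorted(centres_cm)
    return [b - a for a, b in zip(ordered, ordered[1:])]


def fraction_within(detected_cm, reference_cm, tol_cm):
    """Share of detections lying within tol_cm of their nearest reference position."""
    hits = sum(
        1 for d in detected_cm
        if min(abs(d - r) for r in reference_cm) <= tol_cm
    )
    return hits / len(detected_cm)


# Hypothetical positions along a crop row (cm).
reference = [0.0, 20.0, 40.0, 60.0]      # manually measured plant centres
detected = [0.5, 23.0, 39.0, 66.0]       # centres estimated from stereo images

spacings = plant_spacings_cm(detected)
within_5cm = fraction_within(detected, reference, tol_cm=5.0)
within_1cm = fraction_within(detected, reference, tol_cm=1.0)
print(spacings, within_5cm, within_1cm)
```

Reporting the fraction at two tolerances, as in the study above, shows both how often plants are located roughly correctly and how often they are located precisely.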

Application of stereo vision for feature characterization in fruit trees

Tree height and canopy estimation

Characterization of the morphological features of fruit trees is an essential but labor-intensive task in horticulture. Tree height and canopy are often measured manually, but manual measurements might not be accurate and reliable, particularly for canopy volume. Malekabadi et al. (2019) captured stereo images of plants and generated disparity maps of the canopy shape; the algorithm achieved height calculations with errors of 6-7% for elliptical trees and 2-3% for conical trees. Dong and Isler (2018) used stereo vision techniques to estimate morphological parameters of apple trees, such as tree height and canopy volume, using an alpha-shape algorithm with bounding boxes. The height and volume of the bounding box represented the tree height and volume, respectively. The results showed errors of 4 cm and 3.8 cm (trunk diameter) in the height and volume measurements, respectively. Costa et al. (2019) used stereo vision to measure the distance to hazelnut trees and reported reasonable accuracy, with errors of less than 5% at distances below 20 m.
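
The bounding-box idea can be illustrated with a short sketch that takes a tree point cloud and returns the box height and volume, together with a percentage error against a manual measurement. It uses a plain axis-aligned bounding box and does not implement the alpha-shape step; the point coordinates and the manual height are hypothetical.

```python
def bounding_box(points):
    """Axis-aligned bounding box of 3D points as (min_corner, max_corner)."""
    xs, ys, zs = zip(*points)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


def box_height_and_volume(points):
    """Box height (z extent) and box volume, used as proxies for tree height and volume."""
    lo, hi = bounding_box(points)
    dx, dy, dz = (h - l for h, l in zip(hi, lo))
    return dz, dx * dy * dz


def percent_error(estimated, measured):
    """Absolute error of the estimate relative to the manual measurement, in percent."""
    return abs(estimated - measured) / measured * 100.0


# Hypothetical tree points (m) reconstructed from stereo images; z is the vertical axis.
tree_points = [(0.0, 0.0, 0.0), (2.0, 0.0, 1.0), (1.0, 1.5, 3.0), (0.5, 1.0, 2.5)]
tree_height, box_volume = box_height_and_volume(tree_points)
height_error = percent_error(tree_height, measured=3.2)   # 3.2 m is an assumed manual height
print(tree_height, box_volume, round(height_error, 2))
```

A bounding box overestimates the volume of irregular canopies because it includes empty corners, which is one reason tighter hulls such as alpha shapes are used in practice.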

Comparison of stereo vision and different proximal sensors

In agriculture, various proximal sensors and techniques, including LiDAR, time-of-flight cameras, structure-from-motion, and ultrasonic sensors, are used alongside mono and multi-view stereo vision for plant and tree characterization (Hui et al., 2018; Jay et al., 2014; Kazmi et al., 2012; Li et al., 2014; Perez-Sanz et al., 2017; Scharr et al., 2017). LiDAR sensors create precise canopy models but are expensive and are often combined with RGB cameras for colour accuracy. Time-of-flight cameras offer quick depth computation but struggle in strong sunlight. Binocular stereo vision is cost-effective for outdoor conditions but faces challenges with stereo matching errors. Multi-view stereo systems enhance depth map quality, while structure-from-motion reconstructs scenes using a single moving camera. In field conditions, stereo vision is a simple and robust technique for studying canopy architecture, although in-field applications for crop detection and leaf characterization are limited compared with laboratory settings (Leemans et al., 2013; Muller-Linow et al., 2015; Tilneac et al., 2012). Recent comparisons of 3D sensors for plant feature characterization indicate that stereo vision, while sensitive to sunlight and not ideal for outdoor use, can still provide depth information without special shading. Sunlight issues are also faced by time-of-flight cameras with no current solution, while multi-view stereo and structure-from-motion methods, like binocular stereo, can be enhanced with additional cameras or more shots per scene (Li et al., 2014; Perez-Sanz et al., 2017; Qiu et al., 2018; Vazquez-Arellano et al., 2016; Wang et al., 2018; Yuan et al., 2018). Ultrasonic sensors are cost-effective for measuring plant height but lack the accuracy needed for creating 3D models, unlike LiDAR, which offers better accuracy.
Although stereo vision is cheaper, smaller, and more flexible, and supplies color and metadata without multiple sensors, it is outperformed by LiDAR, which offers better resolution (Jimenez-Berni et al., 2018; Li et al., 2017). Table 2 summarizes several key considerations for the LiDAR and stereo vision sensors used for plant characterization.

Table 2.

Comparison of stereo vision with LiDAR.

Key considerations	LiDAR	Stereo vision
Depth perception and accuracy	Uses active laser scanning to precisely capture 3D point clouds of plant canopy structures, providing enhanced depth accuracy	Estimates depth from the differences (disparity) between two camera images
Performance and robustness	Provides more robust and accurate data than stereo vision for 3D modelling in outdoor conditions	Performance can be degraded by insufficient surface texture and intense sunlight
Resolution and measurement	Provides a more detailed and precise 3D point cloud of the plant canopy, which leads to more accurate measurement	Promising, but the 3D results depend on the algorithm applied
Adoption and uses	Both LiDAR and stereo vision are successfully used for morphological feature characterization of plants and are more convenient for large-scale measurements	Proven to be an effective plant imaging technique in both indoor and outdoor environments

Factors affecting stereo vision and stereo matching algorithms

Stereo matching, a core technology in stereo vision and in computer vision more broadly, plays an important role in reconstructing the three-dimensional (3D) structure of the real world from two-dimensional (2D) images. Its applications span diverse fields, including autonomous driving, augmented reality, robotic navigation, and agricultural field operations. Despite this widespread utility, matching pixels (disparity estimation) across differently exposed stereo or multi-view images presents considerable challenges, and traditional stereo vision and matching algorithms face limitations that impede their effectiveness in complex scenarios. Direct sunlight affects stereo matching, causing issues such as overexposure that degrade image quality and algorithm performance (Li et al., 2014). A study evaluating a stereo vision system for cotton row detection and boll location estimation in direct sunlight underscores the importance of considering light conditions in stereo matching algorithm development (Fue et al., 2020). When direct sunlight and leaf reflection violate the constraints assumed by the algorithm, pixel intensities vary across the stereoscopic images, which makes stereo matching difficult, especially in sunlit zones, and reduces the clarity of leaf texture (Muller-Linow et al., 2015). Sunlight also affects stereoscopic system design, including the baseline and the camera height (Li et al., 2017). Field crop stereo matching is influenced by factors such as lighting conditions, homogeneous colors, occlusion caused by dense canopies, specular reflection caused by intense sunlight, and the difficulty of preserving thin structures such as stems (Bao et al., 2019). Further challenges in stereo matching include color inconsistencies, varying illumination, sensor differences, and specular reflections (Dattagupta, 2012). Solutions to sunlight sensitivity include enhancing stereo matching algorithms with the census transform and using a shadowing device to minimize the impact of sunlight (Dandrifosse et al., 2020).
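
To illustrate why the census transform helps under changing illumination, the following minimal sketch encodes each pixel by the brightness ordering of its neighbours rather than by absolute intensity, and compares codes with the Hamming distance. It is a generic textbook-style version written without external libraries, not the implementation of Dandrifosse et al. (2020); the window radius, the function names, and the sample patch are assumptions made for illustration.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of pixel intensities


def census_transform(img: Image, radius: int = 1) -> List[List[int]]:
    """Encode each interior pixel as bits: 1 where a neighbour is darker than the centre."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(radius, h - radius):
        for x in range(radius, w - radius):
            code = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    if dy == 0 and dx == 0:
                        continue  # the centre pixel is not compared with itself
                    code = (code << 1) | int(img[y + dy][x + dx] < img[y][x])
            out[y][x] = code
    return out


def hamming(a: int, b: int) -> int:
    """Matching cost between two census codes: number of differing bits."""
    return bin(a ^ b).count("1")


# Hypothetical 3x3 patch and a brighter copy (gain 2, offset 10).
patch = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
bright = [[2 * v + 10 for v in row] for row in patch]
c_dark = census_transform(patch)[1][1]
c_bright = census_transform(bright)[1][1]
print(c_dark, c_bright, hamming(c_dark, c_bright))
```

Because the brighter copy preserves the ordering of intensities, both patches receive the same census code and the matching cost is zero, whereas a cost based on raw intensity differences would be large.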

Handling occlusions and reflections is a common challenge in stereo vision because they lead to mismatches and ambiguities in stereo matching. Minimizing occlusions and reflections through suitable camera configurations and lighting conditions is essential, and robust stereo matching algorithms capable of handling outliers and errors are also necessary. Proper camera calibration is crucial for determining camera parameters accurately, ensuring precise depth estimates and clear images. Using high-quality calibration targets such as checkerboards or dot grids and following a rigorous procedure that covers various angles and distances minimizes calibration errors. Regular recalibration, especially after exposure to environmental factors, maintains accuracy. Textureless and repetitive regions pose another challenge because they lack the distinctive cues needed for stereo matching. Adding artificial texture or markers to such regions and using stereo matching algorithms that incorporate global or semi-global constraints help overcome this challenge by optimizing matching accuracy over a large area.
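
One common way to flag occluded or mismatched pixels is a left-right consistency check, sketched below for a single rectified scanline. This technique is not named in the text above and is offered only as an illustrative assumption: a left-image disparity is kept only if the right-image disparity at the matched column agrees within a tolerance. The function name, the tolerance, and the sample disparities are hypothetical.

```python
from typing import List, Optional


def left_right_check(disp_left: List[int], disp_right: List[int],
                     tol: int = 1) -> List[Optional[int]]:
    """Keep a left disparity only if the right image agrees at the matched column."""
    checked: List[Optional[int]] = []
    for x, d in enumerate(disp_left):
        xr = x - d  # column in the right image that left pixel x is matched to
        if 0 <= xr < len(disp_right) and abs(disp_right[xr] - d) <= tol:
            checked.append(d)
        else:
            checked.append(None)  # occluded, out of view, or mismatched pixel
    return checked


# Hypothetical disparities along one rectified scanline (pixels).
disp_left = [1, 1, 1, 3, 1, 1]
disp_right = [1, 1, 1, 1, 1, 1]
print(left_right_check(disp_left, disp_right))
```

For this sample the first pixel maps outside the right image and the fourth disagrees with the right-image disparity, so the result is `[None, 1, 1, None, 1, 1]`; the invalidated pixels can then be filled by a later filtering or interpolation step.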

Challenges of stereovision imaging and solutions

Several challenges have been reported in the characterization of plants using stereo vision, including long matching times, cost, incorrect matching outputs, stereo image collection in unstructured orchard environments, registration of reconstructions of the two sides of fruit tree rows, computation of the disparity map, and the effects of canopy shapes (Zhang et al., 2023). Occlusion, especially in dense plant canopies, was found to affect the accuracy of plant feature characterization and measurement (Lowe, 2004; Mirbod et al., 2023; Tan et al., 2007; Wang and Zhang, 2013; Zhang et al., 2023). Challenges in accurately estimating plant parameters and canopy structure included computation of the disparity map of trees, the effects of canopy shapes on stereo vision, and registration of reconstructions of the two sides of fruit tree rows (Ni et al., 2016; Zhang et al., 2023). The complexity of data across the stages of plant growth, ambient environmental conditions, and leaf overlap were also identified as challenges in plant feature characterization and measurement (Amean, 2017). To address these challenges, several techniques have been suggested to improve accuracy, such as improving texture and lighting, avoiding strong sunlight, using local algorithms where global algorithms do not perform well, reconstructing plant canopies with camera matrices, and combining stereo vision with other 3D methods (Li et al., 2014; Ni and Burks, 2013; Wang et al., 2020). Zhang et al. (2023) demonstrated that occlusion by leaves, branches, and fruits in unstructured orchards hampers proper depth measurement. To solve this issue, the semi-global matching method was optimized to obtain high accuracy with sparse disparity values, and an improved bilateral filtering technique was suggested to fill the holes and discontinuities generated by occlusion. Furthermore, a pyramid fusion model was recommended to combine multiple low-resolution bilateral filtering results, improving accuracy and efficiency and producing dense disparity maps with errors reduced to 3.2 mm, an average relative error of 1.79%, and time savings of more than 90%.

Under poor lighting, some portions of the depth image are not visible, which makes it difficult to extract detailed information from the images. To address this challenge, image enhancement algorithms such as gamma correction, histogram equalization, and Contrast Limited Adaptive Histogram Equalization (CLAHE) have been suggested to improve image quality. Gamma correction makes the image brighter without changing the disparity. Histogram equalization can reveal parts of the depth image that are missing because of poor lighting, but it can amplify noise. CLAHE has been reported to enhance depth images, revealing the missing parts with less noise (Xu et al., 2016).
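
The first two enhancement steps mentioned above can be sketched without external libraries as follows. This is a simplified illustration assuming an 8-bit grayscale image stored as nested lists; the function names and the sample image are hypothetical, and the code is not taken from Xu et al. (2016). CLAHE applies the same equalization idea per image tile with a clipped histogram; it is omitted here for brevity and is available in libraries such as OpenCV (`cv2.createCLAHE`).

```python
from typing import List

Image = List[List[int]]  # 8-bit grayscale image as rows of intensities (0-255)


def gamma_correct(img: Image, gamma: float) -> Image:
    """Brighten (gamma < 1) or darken (gamma > 1): out = 255 * (in / 255) ** gamma."""
    lut = [round(255 * (v / 255) ** gamma) for v in range(256)]
    return [[lut[v] for v in row] for row in img]


def equalize_histogram(img: Image) -> Image:
    """Global histogram equalization: remap intensities through the normalized CDF."""
    hist = [0] * 256
    for row in img:
        for v in row:
            hist[v] += 1
    cdf, total = [], 0
    for count in hist:
        total += count
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)  # CDF value at the darkest occupied level
    n = cdf[-1]
    if n == cdf_min:  # constant image: nothing to stretch
        return [row[:] for row in img]
    lut = [max(0, round(255 * (c - cdf_min) / (n - cdf_min))) for c in cdf]
    return [[lut[v] for v in row] for row in img]


# Hypothetical under-exposed 2x4 image.
dark = [[10, 10, 20, 20], [30, 30, 40, 40]]
print(gamma_correct(dark, 0.5))
print(equalize_histogram(dark))
```

Both functions are monotonic per-pixel remappings. Equalization stretches the four occupied grey levels of the sample across the full 0-255 range, which also illustrates why it can amplify noise in nearly uniform regions.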

Addressing the dynamic environment is critical when acquiring stereo vision images in open fields. To ensure robust and accurate depth perception in such dynamic settings, a variety of factors should be considered, including lighting conditions, computational demands, algorithmic approaches such as efficient algorithms and post-processing, hardware configurations such as the camera setup and generic multi-core CPUs or optimized hardware, and adaptability to environmental variation such as changing weather conditions and times of day. By addressing these concerns, real-time stereo vision systems may achieve excellent performance and reliability over wide areas, making them appropriate for applications such as autonomous driving of agricultural machinery, robotic navigation, and outdoor depth sensing in agricultural operating environments.

Conclusions

Stereo vision has emerged as a promising tool for characterizing the features of upland crop plants and fruit trees, as evidenced by various studies. It has been successfully used to estimate plant height, canopy volume, row distance, and plant spacing, demonstrating its capability to offer detailed and accurate insights in plant feature characterization and to provide high-precision measurements and understanding of plant 3D geometries. This review emphasized data acquisition, camera calibration, stereo mapping, the algorithms used for stereo matching, ROI segmentation using bounding boxes, and 3D reconstruction of plants. Several challenges in plant feature characterization were identified, and techniques for overcoming them were suggested. These include improving texture and lighting, avoiding strong sunlight, using local algorithms where global algorithms do not perform well, reconstructing plant canopies with camera matrices, and combining stereo vision with other 3D methods to improve the accuracy of stereo image processing. An improved bilateral filtering technique was suggested to fill the holes and discontinuities generated by occlusion, and image enhancement algorithms such as gamma correction, histogram equalization, and CLAHE can be applied to stereo images for greater accuracy. Wider adoption of stereo vision techniques in precision agriculture holds considerable promise for advancing agricultural practices and facilitating efficient and precise measurement of plants and trees.

Acknowledgements

This work was supported by the Korea Institute of Planning and Evaluation for Technology in Food, Agriculture and Forestry (IPET) through the Open Field Smart Agriculture Technology Short-term Advancement Program, funded by the Ministry of Agriculture, Food and Rural Affairs (MAFRA) (322029-3).

References

Amean Z.M., Low T., McCarthy C., Hancock N. 2013. Evaluation of stereovision for extracting plant features. Faculty of Engineering and Surveying. University of Southern Queensland, Toowoomba, QLD. CORE 1-10.

Amean Z.M. 2017. Automatic plant features recognition using stereo vision for crop monitoring. Ph.D. dissertation, The University of Southern Queensland, Queensland, Australia.

Andujar D., Ribeiro A., Fernandez-Quintanilla C., Dorado J. 2016. Using depth cameras to extract structural parameters to assess the growth state and yield of cauliflower crops. Computers and Electronics in Agriculture 122: 67-73.

10.1016/j.compag.2016.01.018

Angelina S., Suresh L.P., Veni S.H.K. 2012. Image segmentation based on genetic algorithm for region growth and region merging. International Conference on Computing, Electronics and Electrical Technologies, ICCEET 970-974.

10.1109/ICCEET.2012.6203833

Ansari M.E., Mousset S., Bensrhair A., Bebis G. 2010. Temporal consistent fast stereo matching for advanced driver assistance systems (ADAS). IEEE Intelligent Vehicles Symposium, Proceedings 825-831.

Bao Y., Tang L., Breitzman M.W., Salas Fernandez M.G., Schnable P.S. 2019. Field-based robotic phenotyping of sorghum plant architecture using stereo vision. Journal of Field Robotics 36(2): 397-415.

10.1002/rob.21830

Chai Y.H., Gao L.Q., Lu S., Tian L. 2006. Wavelet-based watershed for image segmentation algorithm. Proceedings of the World Congress on Intelligent Control and Automation (WCICA) 2: 9595-9599.

Chen C., Shen P. 2023. Research on Crack Width Measurement Based on Binocular Vision and Improved DeeplabV3+. Applied Sciences (Switzerland) 13(5).

10.3390/app13052752

Chen H., Chen J., Guan Z., Li Y., Cheng K., Cui Z., Zhang X. 2023. Toward real-time and accurate dense 3D mapping of crop fields for combine harvesters using a stereo camera. Science Progress 106(4): 1-16.

10.1177/00368504231215974 (PMID: 37990514; PMCID: PMC10666697)

Chen Y., Zhang B., Zhou J., Wang K. 2020. Real-time 3D unstructured environment reconstruction utilizing VR and Kinect-based immersive teleoperation for agricultural field robots. Computers and Electronics in Agriculture 175: 105579.

10.1016/j.compag.2020.105579

Cheng Y.L., Lee C.Y., Huang Y.L., Buckner C.A., Lafrenie R.M., Denommee J.A., Caswell J.M., Want D.A., Gan G.G., Leong Y.C., Bee P.C., Chin E., Teh A.K.H., Picco S., Villegas L., Tonelli F., Merlo M., Rigau J., Diaz D., Mathijssen R.H.J. 2016. We are IntechOpen, the world leading publisher of open access books Built by scientists, for scientists TOP 1%. Intech, 11: 13.

Cui J., Huo J., Yang M. 2016. Novel method of calibration with restrictive constraints for stereo-vision system. Journal of Modern Optics 63(9): 835-846.

10.1080/09500340.2015.1106602

Costa C., Febbi P., Pallottino F., Cecchini M., Figorilli S., Antonucci F. 2019. Stereovision system for estimating tractors and agricultural machines transit area under orchards canopy. International Journal of Agricultural and Biological Engineering 12(1): 1-5.

10.25165/j.ijabe.20191201.4123

Dal C., Dominio F., Zanuttigh P., Mattocci S. 2012. Stereo Vision and Scene Segmentation. Current Advancements in Stereo Vision.

10.5772/45903

Dal M.C., Zanuttigh P., Cortelazzo G.M., Mattoccia S. 2011. Scene segmentation assisted by stereo vision. Proceedings - 2011 International conference on 3D imaging, modeling, processing, visualization and transmission 57-64.

Dattagupta A. 2012. Stereo matching-improving image quality. Open Access Theses & Dissertations. 2069

Dandrifosse S., Bouvry A., Leemans V., Dumont B., Mercatoris B. 2020. Imaging wheat canopy through stereo vision: overcoming the challenges of the laboratory to field transition for morphological features extraction. Frontiers in Plant Science 11.

10.3389/fpls.2020.00096 (PMID: 32133023; PMCID: PMC7040167)

Degadwala S., Vyas D., Mahajan A. 2020. Review on stereo vision based depth estimation. International Journal of Scientific Research in Science, Engineering and Technology 665-671.

10.32628/IJSRSET207261

Dong W., Isler V. 2018. Tree morphology for phenotyping from semantics-based mapping in orchard environments. arXiv:1804.05905.

Fauvel M., Tarabalka Y., Benediktsson J.A., Chanussot J., Tilton J.C. 2013. Advances in spectral-spatial classification of hyperspectral images. Proceedings of the IEEE 101(3): 652-675.

10.1109/JPROC.2012.2197589

Feng M., Liu Y., Jiang P., Wang J. 2020. Object detection and localization based on binocular vision for autonomous vehicles. Journal of Physics: Conference Series 1544(1).

10.1088/1742-6596/1544/1/012134

Feng X., Fang B. 2021. Algorithm for epipolar geometry and correcting monocular stereo vision based on a plane mirror. Optik 226(P1): 165890.

10.1016/j.ijleo.2020.165890

Fue K., Porter W., Barnes E., Li C., Rains G. 2020. Evaluation of a stereo vision system for cotton row detection and boll location estimation in direct sunlight. Agronomy 10(8).

10.3390/agronomy10081137

Guo Q., Wu F., Pang S., Zhao X., Chen L., Liu J., Xue B., Xu G., Li L., Jing H., Chu C. 2018. Crop 3D-a LiDAR based platform for 3D high-throughput crop phenotyping. Science China Life Sciences 61(3): 328-339.

10.1007/s11427-017-9056-0 (PMID: 28616808)

Han S., Burks T.F. 2013. 3D reconstruction of a citrus canopy. ASABE, 0300(09): 1-12.

Hao Y.N., Tan Y.C., Tai V.C., Zhang X.D., Wei E.P., Ng S.C. 2022. Review of key technologies for warehouse 3D reconstruction. Journal of Mechanical Engineering and Sciences 16(3): 9142-9156.

10.15282/jmes.16.3.2022.15.0724

Harandi N., Vandenberghe B., Vankerschaver J., Depuydt S., Van Messem A. 2023. How to make sense of 3D representations for plant phenotyping: a compendium of processing and analysis techniques. Plant Methods 19(1): 1-46.

10.1186/s13007-023-01031-z (PMID: 37353846; PMCID: PMC10288709)

Hu G., Zhou Z., Cao J., Huang H. 2020. Highly accurate 3D reconstruction based on a precise and robust binocular camera calibration method. IET Image Processing 14(14): 3588-3595.

10.1049/iet-ipr.2019.1525

Hui F., Zhu J., Hu P., Meng L., Zhu B., Guo Y., Li B., Ma Y. 2018. Image-based dynamic quantification and high-accuracy 3D evaluation of canopy structure of plant populations. Annals of Botany 121(5): 1079-1088.

10.1093/aob/mcy016 (PMID: 29509841; PMCID: PMC5906925)

Imaging S., Hardwired U., Segmentation O. 2021. Object segmentation. Computer Vision 884-884.

10.1007/978-3-030-63416-2_300072

Islam R., Habibullah H., Hossain T. 2023. AGRI-SLAM: a real-time stereo visual SLAM for agricultural environment. Autonomous Robots 47(6): 649-668.

10.1007/s10514-023-10110-y

Jay S., Rabatel G., Gorretta N. 2014. In-field crop row stereo-reconstruction for plant phenotyping. Second International Conference on Robotics and Associated High-Technologies and Equipment for Agriculture and Forestry (RHEA-2014) 10.

Jiang Y., Li C., Paterson A.H. 2016. High throughput phenotyping of cotton plant height using depth images under field conditions. Computers and Electronics in Agriculture 130: 57-68.

10.1016/j.compag.2016.09.017

Jin J., Tang L. 2009. Corn plant sensing using real-time stereo vision. Journal of Field Robotics 26: 591-608.

10.1002/rob.20293

Jimenez-Berni J.A., Deery D.M., Rozas-Larraondo P., Condon A.T.G., Rebetzke G.J., James R.A., Bovill W.D., Furbank R.T., Sirault X.R.R. 2018. High throughput determination of plant height, ground cover, and above-ground biomass in wheat with LiDAR. Frontiers in Plant Science 9.

10.3389/fpls.2018.00237 (PMID: 29535749; PMCID: PMC5835033)

Kang W.X., Yang Q.Q., Liang R.P. 2009. The comparative research on image segmentation algorithms. Proceedings of the 1st International Workshop on Education Technology and Computer Science ETCS 2009 2: 703-707.

10.1109/ETCS.2009.417 (PMID: 19424825)

Kazmi W., Foix S., Alenya G. 2012. Plant leaf imaging using time of flight camera under sunlight, shadow and room conditions. IEEE International Symposium on Robotic and Sensors Environments, ROSE 2012 - Proceedings 192-197.

10.1109/ROSE.2012.6402615

Khokher M.R., Ghafoor A., Siddiqui A.M. 2013. Image segmentation using multilevel graph cuts and graph development using fuzzy rule-based system. IET Image Processing 7(3): 201-211.

10.1049/iet-ipr.2012.0082

Kim W.S., Lee D.H., Kim Y.J., Kim T., Lee W.S., Choi C.H. 2021. Stereo-vision-based crop height estimation for agricultural robots. Computers and Electronics in Agriculture 181: 105937.

10.1016/j.compag.2020.105937

Kim W.S., Lee D.H., Kim Y.J., Kim Y.S., Kim T., Park S.U., Kim S.S., Hong D.H. 2020. Crop height measurement system based on 3D image and tilt sensor fusion. Agronomy 10(11).

10.3390/agronomy10111670

Korthals T., Kragh M., Christiansen P., Karstoft H., Jørgensen R.N., Ruckert U. 2018. Multi-modal detection and mapping of static and dynamic obstacles in agriculture for process evaluation. Frontiers Robotics AI 5.

10.3389/frobt.2018.00028 (PMID: 33500915; PMCID: PMC7806069)

Kumar G.A., Lee J.H., Hwan J., Park J., Youn S.H., Kwon S. 2020. LiDAR and camera fusion approach for object distance estimation in self-driving vehicles. Symmetry 12(2).

10.3390/sym12020324

Lee H., Slatton K.C., Roth B.E., Cropper W.P. 2010. Adaptive clustering of airborne LiDAR data to segment individual tree crowns in managed pine forests. International Journal of Remote Sensing 31(1): 117-139.

10.1080/01431160902882561

Leemans V., Dumont B., Destain M.F. 2013. Assessment of plant leaf area measurement by using stereo-vision. International Conference on 3D Imaging, IC3D 2013 - Proceedings 1-5.

10.1109/IC3D.2013.6732085

Li D., Xu L., Tang X.S., Sun S., Cai X., Zhang P. 2017. 3D imaging of greenhouse plants with an inexpensive binocular stereo vision system. Remote Sensing 9(5).

10.3390/rs9050508

Li J., Kaneko A.M., Fukushima E.F. 2014. Proposal of terrain mapping under extreme light conditions using direct stereo matching methods. 2014 IEEE/SICE International Symposium on System Integration 153-158.

10.1109/SII.2014.7028029

Li L., Zhang Q., Huang D. 2014. A review of imaging techniques for plant phenotyping. Sensors (Switzerland) 14(11): 20078-20111.

10.3390/s141120078 (PMID: 25347588; PMCID: PMC4279472)

Li Y., Shen Y. 2008. Robust image segmentation algorithm using fuzzy clustering based on kernel-induced distance measure. Proceedings - International Conference on Computer Science and Software Engineering 1: 1065-1068.

10.1109/CSSE.2008.694

Li Y., Zhang Y., Li H., Zhang W., Zhang Q. 2018. Epipolar geometry and stereo matching algorithm for underwater fish-eye images. International Journal of Advanced Robotic Systems 15(2): 1-9.

10.1177/1729881418764715

Li Z., Yang J., Liu G., Cheng Y., Liu C. 2011. Unsupervised range-constrained thresholding. Pattern Recognition Letters 32(2): 392-402.

10.1016/j.patrec.2010.09.020

Liu X., Tian J., Kuang H., Ma X. 2022. A Stereo Calibration Method of Multi-Camera Based on Circular Calibration Board. Electronics (Switzerland) 11(4).

10.3390/electronics11040627

Lowe D.G. 2004. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision 60(2): 91-110.

10.1023/B:VISI.0000029664.99615.94

Malekabadi A.J., Khojastehpour M., Emadi B. 2019. Disparity map computation of tree using stereo vision system and effects of canopy shapes and foliage density. Computers and Electronics in Agriculture 156: 627-644.

10.1016/j.compag.2018.12.022

Milien M., Renault-Spilmont A.S., Cookson S.J., Sarrazin A., Verdeil J.L. 2012. Visualization of the 3D structure of the graft union of grapevine using X-ray tomography. Scientia Horticulturae 144: 130-140.

10.1016/j.scienta.2012.06.045

Mirbod O., Choi D., Heinemann P.H., Marini R.P., He L. 2023. On-tree apple fruit size estimation using stereo vision with deep learning-based occlusion handling. Biosystems Engineering 226: 27-42.

10.1016/j.biosystemseng.2022.12.008

Mohammed H.M., El-Sheimy N. 2019. Segmentation of image pairs for 3D reconstruction. International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences - ISPRS Archives 42(2/W16): 175-180.

10.5194/isprs-archives-XLII-2-W16-175-2019

Mooney J.G., Johnson E.N. 2014. A Comparison of Automatic Nap-of-the-earth Guidance Strategies for Helicopters. Journal of Field Robotics 26: 1-17.

Muhovic J., Pers J. 2020. Correcting decalibration of stereo cameras in self-driving vehicles. Sensors (Switzerland) 20(11): 1-17.

10.3390/s20113241 (PMID: 32517299; PMCID: PMC7313687)

Muller-Linow M., Pinto-Espinosa F., Scharr H., Rascher U. 2015. The leaf angle distribution of natural plant populations: Assessing the canopy with a novel software tool. Plant Methods 11(1): 1-16.

10.1186/s13007-015-0052-z (PMID: 25774205; PMCID: PMC4359433)

Ni Z., Burks T.F. 2013. Plant or tree reconstruction based on stereo vision. American Society of Agricultural and Biological Engineers Annual International Meeting 2013, ASABE 2013 3: 2476-2484.

Ni Z., Burks T.F. 2013. Three-dimensional dense reconstruction of plant or tree canopy based on stereo vision 20(2).

Ni Z., Burks T.F., Lee W.S. 2016. 3D reconstruction of plant/tree canopy using monocular and binocular vision. Journal of Imaging 2(4).

10.3390/jimaging2040028

Nielsen M., Slaughter D., Gliever C., Upadhyaya S. 2009. Orchard and tree mapping and description using stereo vision and Lidar. (Larue 1989), 1-6.

Nugroho A.P., Fadilah M.A.N., Wiratmoko A., Azis Y.A., Efendi A.W., Sutiarso L., Okayasu T. 2020. Implementation of crop growth monitoring system based on depth perception using stereo camera in plant factory. IOP Conference Series: Earth and Environmental Science 542(1).

10.1088/1755-1315/542/1/012068

Okura F. 2022. 3D modeling and reconstruction of plants and trees: A cross-cutting review across computer graphics, vision, and plant phenotyping. Breeding Science 72(1): 31-47.

10.1270/jsbbs.21074 (PMID: 36045890; PMCID: PMC8987840)

Ortiz A., Gorriz J.M., Ramirez J., Salas-Gonzalez D. 2014. Improving MR brain image segmentation using self-organising maps and entropy-gradient clustering. Information Sciences 262: 117-136.

10.1016/j.ins.2013.10.002

Perez-Sanz F., Navarro P.J., Egea-Cortines M. 2017. Plant phenomics: An overview of image acquisition technologies and image data analysis algorithms. GigaScience 6(11): 1-18.

10.1093/gigascience/gix092 (PMID: 29048559; PMCID: PMC5737281)

Qiu R., Wei S., Zhang M., Li H., Sun H., Liu G., Li M. 2018. Sensors for measuring plant phenotyping: A review. International Journal of Agricultural and Biological Engineering 11(2): 1-17.

10.25165/j.ijabe.20181102.2696

Quan Z., Wu B., Luo L. 2023. An image stereo matching algorithm with multi-spectral attention mechanism. Sensors 23(19).

10.3390/s23198179 (PMID: 37837009; PMCID: PMC10574877)

Rosell J.R., Sanz R. 2012. A review of methods and applications of the geometric characterization of tree crops in agricultural activities. Computers and Electronics in Agriculture 81: 124-141.

10.1016/j.compag.2011.09.007

Rovira-Mas F., Wang Q., Zhang Q. 2010. Design parameters for adjusting the visual field of binocular stereo cameras. Biosystems Engineering 105(1): 59-70.

10.1016/j.biosystemseng.2009.09.013

Ruigrok T., van Henten E.J., Kootstra G. 2024. Stereo vision for plant detection in dense scenes. Sensors 24(6): 1942.

10.3390/s24061942 (PMID: 38544205; PMCID: PMC10974154)

Sampaio G.S., da Silva L.A., Marengoni M. 2021. 3D reconstruction of non-rigid plants and sensor data fusion for agriculture phenotyping. Sensors 21(12): 1-25.

10.3390/s21124115 (PMID: 34203831; PMCID: PMC8232764)

Sampling L.H., Methods E. 2023. A systematic stereo camera calibration strategy : leveraging experiment methods.

Han S., Burks T.F. 2009. 3D reconstruction of a citrus canopy. American Society of Agricultural and Biological Engineers (p. 1).

Sanz-Cortiella R., Llorens-Calveras J., Escola A., Arno-Satorra J., Ribes-Dasi M., Masip-Vilalta J., Camp F., Gracia-Aguila F., Solanelles-Batlle F., Planas-Demartí S., Palleja-Cabre T., Palacin-Roca J., Gregorio-Lopez E., Del-Moral-Martínez I., Rosell-Polo J.R. 2011. Innovative LIDAR 3D dynamic measurement system to estimate fruit-tree leaf area. Sensors 11(6): 5769-5791.

10.3390/s110605769 (PMID: 22163926; PMCID: PMC3231410)

Sanz-Cortiella R., Llorens-Calveras J., Rosell-Polo J.R., Gregorio-Lopez E., Palacin-Roca J. 2011. Characterisation of the LMS200 laser beam under the influence of blockage surfaces. Influence on 3D scanning of tree orchards. Sensors 11(3): 2751-2772.

10.3390/s110302751 (PMID: 22163765; PMCID: PMC3231636)

Scalisi A., McClymont L., Peavey M., Morton P., Scheding S., Underwood J., Goodwin I. 2024. Detecting, mapping and digitising canopy geometry, fruit number and peel colour in pear trees with different architecture. Scientia Horticulturae 326: 112737.

10.1016/j.scienta.2023.112737

Scharr H., Briese C., Embgenbroich P., Fischbach A., Fiorani F., Müller-Linow M. 2017. Fast high resolution volume carving for 3D plant shoot reconstruction. Frontiers in Plant Science 1-12.

10.3389/fpls.2017.01680 (PMID: 29033961; PMCID: PMC5625571)

Shan B., Yuan W., Wang H., Zuo Z. 2018. A parallel stereovision method used for monitoring the collapse of a three-story frame model subjected to seismic loading. International Journal of Distributed Sensor Networks 14(9).

10.1177/1550147718800626

Shean D.E., Alexandrov O., Moratto Z.M., Smith B.E., Joughin I.R., Porter C., Morin P. 2016. An automated, open-source pipeline for mass production of digital elevation models (DEMs) from very-high-resolution commercial stereo satellite imagery. ISPRS Journal of Photogrammetry and Remote Sensing 116(206): 101-117.

10.1016/j.isprsjprs.2016.03.012

Sheng H., Wei S., Yu X. 2020. Image segmentation and object measurement based on stereo vision. Proceedings - 2020 Chinese Automation Congress 3637-3641.

10.1109/CAC51589.2020.9327319

Song Y., Eng B. 2008. Modelling and Analysis of Plant Image Data for Crop Growth Monitoring in Horticulture. University of Warwick Institutional Repository, Department of Computer Science.

Tan P., Zeng G., Wang J., Kang S.B., Quan L. 2007. Image-based tree modeling. ACM Transactions on Graphics 26(3): 1-8.

10.1145/1276377.1276486

Tankovich V., Hane C., Zhang Y., Kowdle A., Fanello S., Bouaziz S. 2021. HitNet: hierarchical iterative tile refinement network for real-time stereo matching. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition 14357-14367.

10.1109/CVPR46437.2021.01413

Tavares J.M.R., Jorge R.M. 2009. Advances in computational vision and medical image processing. Computational Methods in Applied Sciences Series, 13, Springer, ISBN 978-1-4020-9085-1.

Tilneac M., Dolga V., Grigorescu S., Bitea M.A. 2012. 3D stereo vision measurements for weed-crop discrimination. Elektronika Ir Elektrotechnika, 123(7): 9-12.

10.5755/j01.eee.123.7.2366

Torbati N., Ayatollahi A., Kermani A. 2014. An efficient neural network based method for medical image segmentation. Computers in Biology and Medicine 44(1): 76-87.

10.1016/j.compbiomed.2013.10.029 (PMID: 24377691)

Usha K., Singh B. 2013. Potential applications of remote sensing in horticulture-A review. Scientia Horticulturae 153: 71-83.

10.1016/j.scienta.2013.01.008

Vazquez-Arellano M., Griepentrog H.W., Reiser D., Paraforos D.S. 2016. 3-D imaging systems for agricultural applications-a review. Sensors 16(5): 618.

10.3390/s16050618 (PMID: 27136560; PMCID: PMC4883309)

Wang H., Zhang W., Zhou G., Yan G., Clinton N. 2009. Image-based 3D corn reconstruction for retrieval of geometrical structural parameters. International Journal of Remote Sensing 30(20): 5505-5513.

10.1080/01431160903130952

Wang J., Zhang Y., Gu R. 2020. Research status and prospects on plant canopy structure measurement using visual sensors based on three-dimensional reconstruction. Agriculture (Switzerland) 10(10): 1-26.

10.3390/agriculture10100462

Wang Q., Zhang Q. 2013. Three-dimensional reconstruction of a dormant tree using RGB-D cameras. American Society of Agricultural and Biological Engineers Annual International Meeting 2: 1341-1350.

Wang X., Singh D., Marla S., Morris G., Poland J. 2018. Field-based high-throughput phenotyping of plant height in sorghum using different sensing technologies. Plant Methods 14(1): 1-16.

10.1186/s13007-018-0324-5 (PMID: 29997682; PMCID: PMC6031187)

Xiang L., Gai J., Bao Y., Yu J., Schnable P., Tang L. 2023. Field-based robotic leaf angle detection and characterization of maize plants using stereo vision and deep convolutional neural networks. Journal of Field Robotics 40: 1034-1053.

10.1002/rob.22166

Xu Y., Long Q., Mita S., Tehrani H., Ishimaru K., Shirai N. 2016. Real-time stereo vision system at nighttime with noise reduction using simplified non-local matching cost. IEEE Intelligent Vehicles Symposium (IV), Proceedings 1079-1084.

Yao M., Xu B. 2019. A dense stereovision system for 3D body imaging. IEEE Access 7: 170907-170918.

10.1109/ACCESS.2019.2955915

Yeh Y.H.F., Lai T.C., Liu T.Y., Liu C.C., Chung W.C., Lin T.T. 2014. An automated growth measurement system for leafy vegetables. Biosystems Engineering 117(C): 43-50.

10.1016/j.biosystemseng.2013.08.011

Yoon S.C., Thai C.N. 2010. Stereo spectral imaging system for plant health characterization. In Technological developments in networking, education and automation. pp. 181-186. Dordrecht: Springer Netherlands.

10.1007/978-90-481-9151-2_31 (PMCID: PMC3412470)

Yu X., Fan Z., Wan H., He Y., Du J., Li N., Yuan Z., Xiao G. 2019. Positioning, navigation, and book accessing/returning in an autonomous library robot using integrated binocular vision and QR code identification systems. Sensors (Switzerland) 19(4).

10.3390/s19040783 (PMID: 30769857; PMCID: PMC6412710)

Yuan W., Li J., Bhatta M., Shi Y., Baenziger P.S., Ge Y. 2018. Wheat height estimation using LiDAR in comparison to ultrasonic sensor and UAS. Sensors (Switzerland) 18(11).

10.3390/s18113731 (PMID: 30400154; PMCID: PMC6263480)

Zhang L., Hao Q., Mao Y., Su J., Cao J. 2023. Beyond Trade-Off: An optimized binocular stereo vision-based depth estimation algorithm for designing harvesting robot in orchards. Agriculture (Switzerland) 13(6).

10.3390/agriculture13061117

Zhang M., Cai W., Xie Q., Xu S. 2022. Binocular-vision-based obstacle avoidance design and experiments verification for underwater quadrocopter vehicle. Journal of Marine Science and Engineering 10(8).

10.3390/jmse10081050

Zhao Y., Gong L., Huang Y., Liu C. 2016. A review of key techniques of vision-based control for harvesting robot. Computers and Electronics in Agriculture 127: 311-323.

10.1016/j.compag.2016.06.022

Zhong L., Qin J., Yang X., Zhang X., Shang Y., Zhang H., Yu Q. 2021. An accurate linear method for 3D line reconstruction for binocular or multiple view stereo vision. Sensors (Switzerland) 21(2): 1-19.

10.3390/s21020658 (PMID: 33477878; PMCID: PMC7832884)

Zhou G., Sun X., Dong Q., Cao S., Li M. 2020. Research on camera calibration method for visual inspection of excavator working object. Journal of Physics: Conference Series 1678(1).

10.1088/1742-6596/1678/1/012022

Zhou K., Meng X., Cheng B. 2020. Review of stereo matching algorithms based on deep learning. Computational Intelligence and Neuroscience 2020(1).

10.1155/2020/8562323 (PMID: 32273887; PMCID: PMC7125450)

Precision Agriculture Science and Technology ISSN:2672-0086(Print) 2713-5632(Online) 정밀농업과학기술
