Evaluation of Clumping Effects on the Estimation of Global Terrestrial Evapotranspiration

In terrestrial ecosystems, leaves are aggregated into different spatial structures and their spatial distribution is non-random. Clumping index (CI) is a key canopy structural parameter, characterizing the extent to which leaf deviates from the random distribution. To assess leaf clumping effects on global terrestrial ET, we used a global leaf area index (LAI) map and the latest version of global CI product derived from MODIS BRDF data as well as the Boreal Ecosystem Productivity Simulator (BEPS) to estimate global terrestrial ET. The results show that global terrestrial ET in 2015 was 511.9 ± 70.1 mm yr−1 for Case I, where the true LAI and CI are used. Compared to this baseline case, (1) global terrestrial ET is overestimated by 4.7% for Case II where true LAI is used ignoring clumping; (2) global terrestrial ET is underestimated by 13.0% for Case III where effective LAI is used ignoring clumping. Among all plant functional types (PFTs), evergreen needleleaf forests were most affected by foliage clumping for ET estimation in Case II, because they are most clumped with the lowest CI. Deciduous broadleaf forests are affected by leaf clumping most in Case III because they have both high LAI and low CI compared to other PFTs. The leaf clumping effects on ET estimation in both Case II and Case III is robust to the errors in major input parameters. Thus, it is necessary to consider clumping effects in the simulation of global terrestrial ET, which has considerable implications for global water cycle research.


Introduction
Evapotranspiration (ET) of terrestrial ecosystems consists of water lost to the atmosphere from plant leaves via transpiration and that lost from the soil, wet vegetation surfaces, and water bodies, through evaporation [1]. ET is the second-largest component of the global terrestrial water cycle since it returns over 60% of terrestrial precipitation to the atmosphere [2,3], and the accompanying latent heat fluxes occupy > 50% of the solar radiation received by the land surface [4]. ET is also coupled with vegetation photosynthesis through canopy stomatal conductance [1,5]. These linkages among water, energy, and carbon cycle make ET a crucial parameter in short-term numerical weather forecasts as well as long-term climate projections. Thus, it is important to estimate terrestrial ET accurately for advancing our understanding of the Earth's climate system.
Global terrestrial ET was firstly simulated with the land surface model, of which ET simulation was coupled with the carbon processes [6]. Various estimates of the magnitude of global terrestrial ET in different units have been reported: (1) 524.9 mm yr −1 (58.4 × 10 3 km 3 yr −1 ) [7]; (2) 272.0-441.5 mm yr −1 [8]; (3) 604 mm yr −1 [9]; and (4) 564.4 mm yr −1 (62.8 × 10 3 km 3 yr −1 ) [10]. These estimates differ considerably due to different datasets and different methods used for the estimation of global terrestrial ET. Luo et al. [11] compared ET simulations of nine flux tower sites using different leaf-tocanopy upscaling schemes with the same data inputs. They found that the big-leaf scheme underestimated ET across all sites and this underestimation was more significant at the sites with higher leaf area index (LAI) values. It indicated that methods of canopy-scale ET simulation would be a major source of uncertainties in global ET simulation.
Based on the pioneering works of Penman [12] and Monteith [13], ET can be calculated by the widely used Penman-Monteith (PM) equation [13]. Although the PM equation has a solid physical foundation, its application to large scales still has several challenges, one of which is the upscaling methodology for canopy ET estimation [14]. The big-leaf model treating the canopy as a big leaf over the soil is a widely used leaf-to-canopy upscaling methodology for ET estimation [13,15,16] and it has several limitations. One is that many parameters of the big-leaf model can only be determined by tuning the model using eddy covariance (EC) flux measurements and do not represent any measurable biochemical or biophysical variables [17]. Another shortcoming of the big-leaf model is the effects of ignoring micrometeorological gradients and intermittent turbulence on the estimation of EC fluxes [17]. The big-leaf model would cause biases in the simulation of canopy gross primary productivity (GPP) and ET [11,18,19].
The two-leaf model, which divides the canopy into sunlit and shaded components has been proved to perform better in estimating canopy GPP and ET than the big-leaf model [11,18,19]. Sunlit-shaded leaf separation needs an accurate description of canopy structure using two canopy structural parameters [14]. One is LAI which is defined as half of the total surface area of green leaves per unit ground surface area [20]. The other one is CI, which quantifies the extent to which leaves deviate from the random distribution [21,22]. There are four reasons why CI is required for ET simulation: (1) CI is required for deriving true LAI from effective LAI (L e ) that can be measured directly using ground-based optical instruments (i.e., LAI-2200) or derived from satellite sensors; (2) CI is needed in the two-leaf model for calculating sunlit and shaded LAI; (3) CI is used in the calculation of canopy gap fraction for accurately simulating radiation transfer inside the canopy [18,22,23]; and (4) CI is also used in estimating the rate of intercepted rain or snow by the canopy [24]. Chen et al. (2016) assessed the impacts of leaf clumping on ET estimation at eight forest flux tower sites using the BEPS model and found that the annual ET could be underestimated by 11.5% on average if Le was used ignoring leaf clumping. Also, under the same modeling scheme, the bias of ET estimation was larger in a more clumped canopy [14]. Moreover, the knowledge of experimental LAI is crucial in ecohydraulic research [25,26].
Recently, a new global daily CI product with 500 m spatial resolution has been produced based on MODIS BRDF data for 2001-2017 [27]. This CI product considers the variations in solar zenith angles, and it could be used for evaluating leaf clumping effects on global ET estimation with the fine resolution of the global CI maps. The aims of this study are (1) to assess leaf clumping effects on estimating global terrestrial ET; (2) to assess leaf clumping effects on the calculation of key biophysical parameters controlling ET simulation, and; (3) to evaluate the robustness of leaf clumping effects with errors in key model parameters.

Modeling Methodology
The BEPS model was used for simulating global terrestrial ET in this research. Although it was a process-based ecosystem model initially developed for estimating the carbon fluxes and ET over the terrestrial ecosystems in Canada [28][29][30][31], it has been applied over ecosystems in East Asia [32], China [33], and over global terrestrial ecosystems [23]. BEPS uses the sunlit-shaded leaf separation method for simulating canopy GPP and ET [18,31] and it was proved that this two-leaf model can remove the uncertainties propagated from the artificial leaf-to-canopy upscaling using the big-leaf and two-big-leaf models [11]. The biases in estimated GPP and ET using the two-leaf model are not affected by LAI while the biases in big-leaf and two-big-leaf models increase with LAI [11] (Table 1).  [39] a V cmax0 is leaf maximum carboxylation rate at the top of the canopy at 25 • C; N 0 is leaf nitrogen content at the top of canopy; χ n is the slope of V cmax variation with leaf nitrogen content; m and g 0 are the slope and intercept in the BWB equation. β is the simple numerical index of rooting distribution based on the asymptotic equation p(z) = 1 − β z , where p(z) is the proportion of roots from the surface to the depth (z). RASM th is the threshold of relatively available soil moisture. a is the fitted slope in the equation of accumulated soil water deficit (ASWD) adjusted V cmax ratio (Appendix A). The peak LAI during the growing season and the annual mean clumping index are given as the mean and standard deviation for each PFT. b ENF: evergreen needleleaf forests; DNF: deciduous needleleaf forests; DBF: deciduous broadleaf forests; EBF: evergreen broadleaf forests; MF: mixed forests.

ET Modeling
The Penman-Monteith equation was used in BEPS to calculate ET (in m s −1 ) [14,31]: where R n represents net radiation (W m −2 ) by the surface of interest and G represents conductive heat flux (W m −2 ) away from the surface. e s is the saturated water vapor pressure (kPa) at the surface. e a represents the actual water vapor pressure (kPa) above the surface. g a represents the aerodynamic conductance (m s −1 ) from evaporating surface to a reference height. γ represents the psychometric constant (≈0.066 kPa • C −1 ). λ v represents the latent heat of vaporization. ρ a is the air density. c p represents the specific heat of the air. g s represents the surface conductance to water vapor (m s −1 ). ∆ represents the change of saturated water vapor pressure with air temperature and it is calculated as [40]: where Ta represents air temperature ( • C). In BEPS, ET is the sum of nine components and it is calculated as: where T o and T u are transpiration rates of overstory and understory dry canopies, respectively. E o and E u are evaporation from intercepted liquid water in overstory and understory canopies. S o and S u is sublimation rates of intercepted snow by overstory and understory canopies, respectively. E s is soil evaporation rate. E g is the evaporation rate of ponded water on the ground and S g is the sublimation rate of the snow on the ground.
In BEPS, leaf transpiration is coupled with leaf photosynthesis rate through stomatal conductance, which is estimated based on the Ball-Woodrow-Berry (BWB) stomatal conductance model [41]: where g sw is stomatal conductance to water vapor (mol m −2 s −1 ). h s is relative humidity (RH) at the leaf surface. C s represents CO 2 concentration (µmol mol −1 ) at the leaf surface. m is a slope (dimensionless) and g 0 is a small intercept term (mol m −2 s −1 ). The m values are highly variable among plant functional types (PFTs) [42]. We choose the most frequently used values in this study ( Table 1). The values of g 0 are usually very low and treated invariant for all PFTs (Table 1). Leaf net photosynthesis rate (A in µmol m −2 s −1 ) is calculated as [18]: where R d is leaf dark respiration (µmol m −2 s −1 ); W c and W j are Rubisco-limited and light-limited gross photosynthesis rates (µmol m −2 s −1 ), respectively. They are calculated as [43]: where V cmax is leaf maximum carboxylation rate (µmol m −2 s −1 ); J is the electron transport rate (µmol m −2 s −1 ); C i is the intercellular CO 2 concentration; Γ is the CO 2 compensation point without dark respiration; K is a function of enzyme kinetics. The values of leaf maximum rate of carboxylation at top of canopy normalized to 25 • C (V 25 cmax0 ) for various PFTs are taken from He et al. [34], who produced global V cmax maps (2007-2017) based on GOME-2 SIF observation [44]. The default V 25 cmax0 is used for the others [23], which is not available in He et al. [34].
The surface resistance to the soil evaporation (r soil ) was calculated as [45]: where θ 1 and θ s are soil water content (m 3 m −3 ) and porosity (m 3 m −3 ) of the 1st soil layer.

The Two-Leaf Model for ET Modeling
BEPS follows the sunlit-shaded two-leaf scheme in simulating canopy-scale GPP and ET [18,31]. In BEPS, the vegetation canopy is stratified into two layers as the overstory canopy and the understory canopy. Both are divided into sunlit and shaded portions because the greatest difference in leaf irradiance exists between sunlit and shaded leaves [18]. The method of calculating sunlit and shaded LAI [46] was improved to incorporate the leaf clumping effect on the canopy radiation transfer [18]: LAI u_sh = LAI u − LAI u_sun (12) where LAI o_sh and LAI o_sun are overstory shaded and sunlit LAI, respectively. LAI u_sh and LAI u_sun are understory shaded and sunlit LAI, respectively. θ is solar zenith angle. Canopy ET was calculated as the sum of sunlit and shaded components as: ET u = ET u_sun LAI u_sun + ET u_sh LAI u_sh (14) where ET o_sun and ET o_sh are ET from a representative sunlit and a representative shaded leaf in the overstory canopy, respectively. ET u_sun and ET u_sh are ET from a representative sunlit and a representative shaded leaf in the understory canopy, respectively.

Meteorological Data
The gridded meteorology data include downward solar radiation, two-meter T a , RH, rainfall (Figure 1), and ten-meter wind speed, and they are downloaded from the ERA-Interim Archive at the European Centre for Medium-Range Weather Forecasts (ECMWF). The data assimilation system used for producing ERA-Interim was based on a 2006 release of the Integrated Forecast System (Cy31r2). This data assimilation system includes a 4-dimensional variational analysis (4D-Var) with a 12-hour analysis window [47]. The data assimilation performance has been closely monitored during the production of ERA-Interim [47]. The spatial resolution of this dataset is approximately 80 km (~1 • ). For hourly simulation at a 1 • resolution, the cubic spline interpolation method in Matlab was applied to interpolate ERA-interim meteorology data from three hourly to one hourly.

Land Cover Map
Global land cover (LC) map in 2013 at 500 m resolution from the Global Land Cover by National Mapping Organizations (GLCNMO) (https://globalmaps.github.io/glcnmo.html, accessed on 3 July 2021) is used in this study. There are 19 LC types based on the Land Cover Classification System (LCCS) [48] (Figure 2). For global ET simulation, the detailed LC types were lumped into nine PFTs. Specifically, EBF, DBF, ENF, DNF, MF, and shrub are kept as they are. Herbaceous is labeled as grass. Cropland, paddy field, and Cropland/Other Vegetation Mosaic were labeled as crop. Other LC types in Figure 1 were labeled as others. For each 1 • modeling pixel, the areal fraction of each PFT was calculated to upscale from 500 m to 1 • resolution [49]. In this study, the BEPS model was run at 1 • resolution for ET simulation. Thus, the GLCNMO global land cover map at 500 m resolution is good enough for this study. The latest GLCNMO land cover map available online is the Version 3 data which was produced from the MODIS data in 2013. This is the reason why we used the GLCNMO2013 land cover map. There is only two years' difference between 2015 and 2013 and the changes of land cover types should be minor within two years. Thus, we assumed that this global land cover map in 2013 could still be used for our ET simulation in 2015.

Global LAI Product
The global LAI product in 2015 used in this study was retrieved from the GLOBMAP LAI Version 3 dataset, which includes consistent long-term (1981-2019) global LAI maps at 8 km resolution [37]. The global LAI maps in 2015 were derived from 8-day synthesis MODIS land surface reflectance products (MOD09A1 C6) and the angular information. The red, near-infrared (NIR), and shortwave-infrared (SWIR) reflectance were used for LAI derivation based on the GLOBCARBON LAI algorithm. The cloud pixels of MOD09A1 C6 data were identified by an algorithm of cloud detection [50], and the cloud pixels were filled using the approach of locally adjusted cubic spline capping [51]. The leaf clumping effects had been considered for each pixel with a global CI map at 500 m resolution [52]. This global LAI product had been validated using field measurements of 28 sites and 45 LAI maps of fine resolution over 29 sites which covered major global PFTs [37]. A global LAI map on 4-11 July 2015, is shown as an example in Figure 3. For ET modeling at the one-degree resolution, the global LAI maps at 8 km resolution were aggregated to derive the mean LAI value for each PFT in each one-degree grid. In this approach, each PFT within the one-degree grid is considered, and the LAI related to each specific PFT is averaged from all the fine resolution pixels. This allowed separate ET estimation for each PFT in each one-degree grid. The ET value of each one-degree grid was then calculated as the sum of ET for all PFTs weighted by their areal fractions [49].

Global Clumping Index Map
CI is a key canopy structural parameter quantifying the level of leaf clumping in vegetation canopy. Chen et al. (2005) firstly proposed the Normalized Difference between Hotspot and Darkspot (NDHD), which is an improved angular index for deriving the global CI map with multi-angular remote sensing data at 6 km resolution [53]. In this study, we used a global CI product (hereafter CAS-CI) at 500 m resolution in 2015, which was derived from the MODIS BRDF product [27]. This is currently the finest global CI map of which temporal resolution is 8-day ( Figure 4). This global CI map has been validated with in-situ observations of CI at thirty-three sites [27]. This global CI map series was aggregated to derive the mean CI value for each PFT in each 1 • grid and it was the basis for the assessment of the impacts of leaf clumping on global ET simulation.

Soil Texture Data
The soil texture type for each one-degree pixel was determined by the sand, silt, and clay fractions. The soil texture data were downloaded from the Harmonized World Soil Database (HWSD) (http://webarchive.iiasa.ac.at/Research/LUC/External-World-soildatabase/HTML/HWSD_Data.html?sb=4, accessed on 3 July 2021). The spatial resolution of the original HWSD dataset is 0.00833 degrees (~1 km) and they were resampled into grids of 0.004167 • (~500 m) to match the spatial resolution of the global LC map. The soil texture type of each grid cell was determined by the USDA soil texture triangle and the sand, silt, and clay fractions.
For global ET simulation at 1 • resolution, the soil texture map at 500 m resolution was aggregated to obtain the dominant soil texture types for the 9 PFTs (Section 3.2). The soil texture types were used to determine the soil hydrological and thermal parameter values such as saturated hydraulic conductivity, porosity, field capacity, wilting point, and soil thermal conductivity, which would influence the estimation of soil evaporation through the ground surface temperature and soil moisture.

Simulation Cases and Model Validation
For the purpose of investigating leaf clumping effects on the estimated global terrestrial ET, three cases of model simulations were conducted. For Case I, both of the true LAI and CI retrieved from remote sensing data were used. This Case was believed to be unbiased. For Case II, the true LAI was used but clumping was ignored (CI = 1). This case would cause the overestimation of sunlit LAI and the underestimation of shaded LAI compared to Case I. For Case III, the effective LAI (L e ) calculated as the product of remote sensing true LAI (GLOBMAP LAI) and remote sensing CAS-CI was used and clumping was ignored. In this case, the sunlit LAI is not biased but the shaded LAI is underestimated compared to Case I. The BEPS simulated soil moisture (SM) from Case I was used to drive the other case simulations for removing the effects of simulated SM on simulated ET among different cases.
The BEPS model simulated ET has been validated extensively with the eddy-covariance (EC) measured ET for C3 and C4 crops in the Western Lake Erie Basin, the USA [54], boreal and temperate forest ecosystems in North America [11,14,55], forest, grassland and cropland in China [56][57][58]. In these studies, the BEPS model simulated half-hourly ET was compared with the EC measured ET and the R 2 is in the range of 0.43 to 0.86 and the RMSE ranging from 0.02 to 0.08 mm hr −1 . The Penman-Monteith equation used to estimate ET and the two-leaf model used to upscale ET from leaf-level to canopy scale have been proved to perform better than their alternatives [11,59]. Thus, the global terrestrial ET simulated by the BEPS model in this study should be reasonable as long as the uncertainties from the input dataset could be well-constrained.

Simulations of Global Terrestrial ET
Using the various land cover, meteorology, and soil inputs, a global annual ET map for 2015 is produced at one-degree resolution for each of the three cases, based on the hourly BEPS simulation. The global ET map of Case I is considered as representing the reality ( Figure 5). The highest ET appeared in the tropics while low ET appeared in the northern high latitudes, high-altitude mountains, and arid regions (i.e., Australia, central Asia, the western US, and the Sahel). The mean ET variations with latitude are also shown in Figure 4. The highest ET (933.4 ± 472.5 mm yr −1 ) occurs at the latitude of 1 • N and ET gradually decreases with the increase of latitude to the North or the South ( Figure 5). The variation of annual ET with latitude is large due to strong latitudinal variations in meteorological conditions, vegetation types, and densities. The ET averaged over the global land surface is 511.9 ± 70.1 mm yr −1 , where the uncertainty ± 70.1 mm yr −1 is the sum of the uncertainty (±26.7 mm yr −1 ) resulting from the input vegetation data as well as model parameters, and the standard deviation (±43.4 mm yr −1 , Figure 8) of the annual global ET during 2013 to 2016 resulting from meteorological variations. The uncertainty from meteorological variations is calculated using the same model parameters and the same LAI and CI input dataset in 2015 but actual hourly meteorological data from 2013 to 2016. The uncertainty resulting from the input vegetation data and model parameters is approximated by the range of global terrestrial ET variation caused by LAI uncertainty of 20% (see Table 2) [23].

Spatial Patterns of Leaf Clumping Effects on ET Estimation
Compared to Case I, leaf clumping effects on global terrestrial ET simulation are assessed. The spatial patterns of bias errors for ET estimates among cases have consistent signs, either positive ( Figure 6) or negative (Figure 7), indicating that the clumping effects are spatially consistent, despite considerable differences in other variables such as meteorology, plants, and soils. It means that leaf clumping effects are vital for the two cases. The largest biases due to neglecting leaf clumping exist in the tropical and the boreal forests (Figures 6 and 7) where high clumping is detected from the MODIS BRDF product (Figure 4). Figure 6 demonstrates that the ET values are overestimated globally if true LAI is used ignoring clumping. The reason is that overstory sunlit LAI is overestimated in Case II. Leaf clumping leads to more light penetration into the canopy and a decrease in overstory sunlit LAI. If ignoring clumping, overstory sunlit LAI would be positively biased while overstory shaded LAI would be negatively biased. Because ET of sunlit leaves is larger than that of shaded leaves per unit leaf area, the total ET is positively biased for Case II.  Figure 7 demonstrates the converse effects of leaf clumping globally for Case III. It can be found that the detailed spatial distributions of the leaf clumping effects are relatively different in Figures 6 and 7. In Case II (Figure 6), the high clumping effects on ET do not appear where LAI is large. The reason is that the relative change in overstory shaded LAI with CI is larger at lower LAI values while the absolute value of overstory shaded LAI varies little with LAI (Equations (9) and (10)). In Case III (Figure 7), the high clumping effects appear in areas with high LAI. This is because the overstory shaded LAI would be more negatively biased with larger LAI values when the CI is the same.

Leaf Clumping Effects on ET Estimation and Its Components for Various PFTs
Based on Case I, leaf clumping effects on estimations of global terrestrial ET as well as its components have been assessed for various PFTs (Figure 8). The discrepancy of ET is 18.3 mm yr −1 (4.7%) between Case II and Case I and −48.3 mm yr −1 (−13.0%) between Case III and Case I (Figure 8). ET of Case II is always larger than those of Case I for all PFTs (Figure 8). Evergreen needleleaf forests (ENF) have the largest relative differences (RD) between Case II and Case I because they are most clumped with the lowest CI on average ( Figure 8).
In contrary to Case II, ET is underestimated for all PFTs in Case III. The reason is that the effective LAI of a highly clumped canopy is much smaller than its true LAI. Deciduous broadleaf forests (DBF) have the largest RD between Case III and Case I because DBF has both high LAI and low CI compared to other PFTs ( Figure 8). If effective LAI is used ignoring clumping, shaded LAI is much smaller in Case III than that in Case I, especially for high LAI and low CI, while the sunlit LAI is unchanged (Equation (9)). Thus, the ET of shaded leaves in Case III is lower than those in Case I so that the total ET is underestimated for Case III.
ET of Case III is underestimated considerably due to the large fractions of shaded T o in ET. The fractions of shaded T o to total ET are 23.2%, 17.0%, 17.3%, 18.7%, 20.1%, 4.1%, 9.4%, 4.4% and 8.3% to ET for ENF, DNF, DBF, EBF, MF, grass, crop, shrubs and others. The discrepancy in ET between Case III and Case I is more than that between Case II and Case I because ET is more biased due to the underestimation of shaded LAI in Case III than that caused by the overestimation of sunlit LAI in Case II. This is consistent with what Chen et al. [14] found by assessing leaf clumping effect on ET estimates over eight forest sites in North America.

Leaf Clumping Effects on the Calculation of Key Biophysical Parameters Controlling Transpiration Simulations
In this section, we explored leaf clumping effects on the calculation of key biophysical parameters controlling transpiration (T) estimation, which is the major component of ET, such as leaf-level absorbed photosynthetic photon flux density (PPFD), net radiation (R n ), stomatal conductance to water vapor (g sw ) and LAI. The leaf clumping effects on the calculation of these biophysical parameters were investigated for sunlit and shaded leaves in the overstory and understory canopies, respectively (Figures 10 and 11).
If true LAI is used in transpiration estimation and clumping is ignored, leaf-level absorbed PPFD, Rn, g sw, and T of the representative sunlit leaf in the overstory canopy would be negatively biased (Figure 10a,b,d,e). In contrast, leaf-level absorbed PPFD, R n , g sw, and T of the shaded leaf in the overstory canopy would be positively biased when true LAI is used in transpiration calculation and clumping is ignored (Figure 10a,b,d,e). For instance, in a typical conifer forest, where LAI is 4.0 and Ω is 0.5, downward shortwave radiation above the canopy is 500 W m −2 and relative humidity is 80%, leaf-level absorbed PPFD, R n , g sw, and T of the representative sunlit leaf in the overstory canopy would be 663.7 µmol m −2 s −1 , 234.3 W m −2 , 4.15 mm s −1 and 0.0391 g H 2 O m −2 s −1 when clumping is considered (Figure 9). However, they will be 641.0 µmol m −2 s −1 , 229.8 W m −2 , 4.12 mm s −1 , and 0.0384 g H 2 O m −2 s −1 when clumping is ignored (Figure 9). Thus, the relative errors (RE) in the estimations of leaf-level absorbed PPFD, Rn, gsw, and T for the representative sunlit leaf in the overstory canopy are −3.4%, −1.9%, −0.6%, and −1.8% when clumping is ignored. In this case, leaf-level absorbed PPFD, Rn, gsw and T of the representative shaded leaf in the overstory canopy would be 75.8 µmol m −2 s −1 , 24.3 W m −2 , 1.76 mm s −1 and 0.0053 g H 2 O m −2 s −1 when clumping is considered ( Figure 10). However, they will be 91.9 µmol m −2 s −1 , 27.9 W m −2 , 1.98 mm s −1, and 0.0061 g H 2 O m −2 s −1 when clumping is ignored ( Figure 10). Thus, the relative errors in the estimations of leaf-level absorbed PPFD, Rn, gsw, and T for the representative shaded leaf in the overstory canopy are 21.2%, 14.7%, 12.8%, and 15.1% when clumping is ignored. The positive biases in these variables for the shaded leaf in the overstory are much higher than the negative biases for the sunlit leaf in the overstory.
In this case, leaf-level absorbed PPFD, Rn, gsw, and T for understory sunlit and shaded leaves would be negatively biased when true LAI is used in transpiration estimation and clumping is ignored ( Figure 10). Meanwhile, understory sunlit LAI would be negatively biased while understory shaded LAI would be positively biased when true LAI is used in transpiration calculation and clumping is ignored ( Figure 10). In this case, canopy-scale understory sunlit and shaded T would be both negatively biased (Figure 10f).  For the same L e , the overstory or understory shaded LAI as well as total LAI increased with the decrease of Ω (more clumped) while overstory or understory sunlit LAI were invariant (Figure 11c). When L e is used in estimating T ignoring clumping, leaf-level absorbed PPFD, R n , g sw, and T of the overstory representative sunlit leaf were invariant while those of the overstory representative shaded leaf were overestimated ( Figure 11). Canopylevel overstory sunlit T was invariant while the shaded T and total T were negatively biased when effective LAI is used ignoring clumping (Figure 11f).
Using the same example as mentioned above, effective LAI for the overstory and understory canopies are set as a constant of 2.0 and 0.2, respectively. If Ω is 0.5, absorbed PPFD, R n , g sw and T for the overstory representative shaded leaf at θ = 45  (Figure 11). The RE in the estimation of PPFD, R n , g s,w, and T when clumping is ignored will be 105.3%, 74.8%, 50.0%, and 67.9%, for the overstory representative shaded leaf. Meanwhile, the sunlit LAI in overstory and understory are unchanged and the shaded LAI in overstory and understory were negatively biased by 68.2% and 56.3% (Ω = 0.5). Total LAI was underestimated by 50.0% if Ω is 0.5 (Figure 11c). Accordingly, T of the overstory sunlit canopy would be unchanged while that of the shaded counterpart was underestimated by 46.8% (Figure 11f). On the other hand, T of understory sunlit canopy would be underestimated by 11.1% while that of the shaded counterpart was underestimated by 33.3%. The total canopy-level transpiration rate was underestimated by 13.6% in this Case (Figure 11f).

Estimations of Global Terrestrial ET
The multiyear average of annual global terrestrial ET (511.9 ± 43.4 mm yr −1 ) from 2013 to 2016 simulated by the BEPS model in this study is within the range (453.7 ± 5.2 to 697.3 ± 10.4 mm yr −1 ) of those estimated by four remote sensings (RS) based physical models, two machine learning methods (i.e., model tree ensemble and random forest), and 14 land surface models (LSM) (Figure 12) [1]. For the model tree ensemble (MTE) method, eddy covariance (EC) measured ET was integrated with satellite remote sensing and surface meteorological data to estimate global terrestrial ET monthly from 1982 to 2008 at 0.5 • spatial resolution [60]. For the random forest (RF) algorithm, a set of independent regression trees was generated by randomly selecting training samples automatically [61]. The RF global terrestrial ET product has a temporal resolution of half-hourly for the period of 2001-2014 at 0.5 spatial resolution [62]. The 14 LSM modeled ET products were from Trends and Drivers of the

Sensitivities of ET to Errors in Key Parameters of the BEPS Model
Uncertainties in the global ET estimation resulting from errors in key model parameters might be as large as the impacts of leaf clumping mentioned above. To investigate the robustness of our results on the impacts of leaf clumping on global ET estimation, sensitivity analysis was conducted for four key parameters such as LAI, Ω, V cmax, and the threshold of relatively available soil moisture under which soil water stress occurs (RASM th ). If LAI is uniformly increased or reduced by 20% (assessed LAI error), ET is increased by 6.0% or reduced by 7.0% (Table 2). Nevertheless, the relative differences (RD) between Case II and Case I and between Case III and Case I are affected slightly (less than 0.5%) by the variations in LAI. This indicated that all cases are influenced by the LAI changes in the same direction. Similar to LAI, if V 25 cmax0 is increased or reduced by 20% in the BEPS model, global terrestrial ET would be increased or decreased in similar magnitudes (around 3.0%), but RD between the cases representing leaf clumping effects changed very little (<0.5%). When RASM th is increased or reduced by 20%, global terrestrial ET is decreased by 1.6% or increased by 2.1%, but the RD between the cases changed very little (<0.5%). Generally, if other parameters in the BEPS model (i.e., N 0 and m in Table 1) are modified, the three cases would be moved in the same direction, and the leaf clumping effects quantified as the RD in global terrestrial ET between the cases would not be considerably influenced by these changes. These sensitivity tests support the justification of the spatially consistent leaf clumping effects (Figures 6 and 7).
In line with expectations, if Ω is increased or reduced by 20%, the RD among the cases would be changed substantially ( Table 2). In the initial evaluation with field measurements, the average bias error of the CAS-CI product was around 0.01 (~2%) and nearly all CAS-CI values were located within ±0.1 of the field observations [27]. Therefore, the 20% variation is generous for most PFTs. The variation range has been made large for covering the high uncertainty in CI estimation of tropical forests [23]. Although leaf clumping is decreased by 20% (Ω is increased), there is still a 2.7% difference between Case II and Case I, 7.4% difference between Case III and Case I, indicating that leaf clumping effects should not be neglected even though substantial uncertainties still exist in Ω measurements.

Implications and Uncertainties of This Study
For ET estimation, both LAI and Ω are required for describing complex canopy architecture. Both LAI and Ω can be acquired from remote sensing. Nevertheless, there are still uncertainties in the values of the two parameters. By now, several global LAI products with moderate resolution (250 m to 7 km) are available based on different algorithms [63]. Depending on field measurements and models used in the algorithm, some LAI products are close to L e [64,65] while some are close to true LAI [7,37,66]. If field-measured LAI is based on allometric equations or acquired from destructive sampling, the LAI would be close to true LAI. If optical instruments (i.e., LAI-2200 or DHP) are used to measure LAI without the clumping correction, the LAI is effective LAI [23]. Some remote sensing LAI products (i.e., GA-TIP and JRC-TIP) were produced based on data assimilation retrieval from albedo without the clumping collection, and thus they are effective LAI instead of the true LAI [63]. Clumping correction is needed for these LAI products before they are used for global ET simulation.
Although the GLOBMAP LAI product used in this study has incorporated the leaf clumping effects, there are still biases in the global CI map used for the GLOBMAP LAI production due to using the RossThick-LiSparse Reciprocal (RTLSR) model with a constant θ of 0 • [37,52]. CI of slight to middle clumped canopies was underestimated while they were overestimated for sparsely vegetated areas by this BRDF and SZA configuration [39]. The CAS-CI product performed better than the CI product produced by He et al. [52], and it has a lower bias when validated with ground measurements. However, the CAS-CI product does not perform well over very sparse crop sites due to the substantial impacts of exposed soil, which impair the derivation of CI from remote sensing data using the NDHD method [27,39]. Moreover, the NDHD method has a disadvantage of underestimating the true CI for tropical forests as satellite sensors are majorly detecting the thick overstory canopy, which would keep off the understory canopy [67]. The influences of frequent clouds would also impair the CI retrievals over tropical areas [27].
There are only a few CI validations for extremely sparse and dense vegetation conditions [27]. Thus, in the spatial patterns of leaf clumping effects on ET, the largest uncertainties would exist in these areas. More field observations are required for further evaluation of the CI estimation over these land cover types. Shrubs and other vegetation types would also have large uncertainties as these PFTs have not been extensively validated in the BEPS model.

Conclusions
The significance of leaf clumping to global terrestrial ET simulation is revealed from its influences on sunlit and shaded LAI stratification, on the calculation of true LAI from effective LAI, and estimations of leaf-level irradiance. This is the first time that foliage clumping effects on ET estimation are assessed globally. The following main conclusions are drawn from this study:

1.
Even though accurate global LAI data is used, neglecting leaf clumping will overestimate global terrestrial ET by about 4.7%. The reason is that leaf clumping reduces sunlit LAI while increases shaded LAI, resulting in an overall reduction in ET.

2.
If L e is used, neglecting leaf clumping will underestimate global terrestrial ET by 13.0%. The reason is that L e is lower than true LAI for a clumped canopy. If L e instead of true LAI is used in ET simulation ignoring leaf clumping impacts, shaded LAI would be substantially negatively biased while sunlit LAI is invariant, causing negative bias in ET.

3.
Although the accuracy of global terrestrial ET simulation still needs improvements, leaf clumping impacts quantified by the RDs between the model simulation cases considering or ignoring clumping are still robust since errors of key model parameters will move in the same directions for all cases.
This study highlights the importance of considering leaf clumping in estimating global terrestrial ET, which would be helpful to improve the quality of global terrestrial ET products derived from various land surface models. Both LAI and CI are needed for accurately estimating sunlit and shaded LAI as well as ET by the two-leaf model. This two-leaf separately ET estimation is important because stomatal conductance controlling transpiration is coupled with photosynthesis at leaf-level, of which sunlit and shaded leaves are controlled by different biochemical processes.
For better serving the global water research, the remote sensing community should produce more accurate global CI products based on data from new satellite sensors. Physical retrieval methods should also be explored to produce high-resolution global CI products. Finally, more field measurements are also needed for the validation of global CI products, especially in tropical regions.
where S and L represent shortwave and longwave radiation, respectively. The superscript asterisk represents the R n constituent. The subscripts 'sun' and 'sh' represent the single sunlit leaf and the single shaded leaf, respectively and the subscripts 'o' and 'u' represent overstory and understory, respectively. The subscripts 'dir' and 'dif' represent direct and diffuse constituents. S * di f _1 and S * di f _2 represent the net shortwave diffuse radiation on the sunlit and shaded representative leaves of the overstory canopy, respectively. S * di f _3 and S * di f _4 represent the net shortwave diffuse radiation on the sunlit and shaded single leaves of the understory, respectively. Methods of calculating shortwave irradiances are taken from Chen et al. (1999). The diffuse (S di f ) and direct (S dir ) components of the incoming global solar radiation are calculated as: where S g is the global solar radiation (W m −2 ) and r is calculated as: where S 0 is the solar constant (1367 W m −2 ). The direct part of net shortwave irradiance of a sunlit leaf in the overstory and understory canopies and the direct part of net shortwave irradiance of the ground are estimated, respectively, as: where α o , α u and α g are overstory, understory, and the ground albedo, respectively. α is the mean leaf-sun angle. Net shortwave diffuse irradiance on a sunlit leaf in the overstory is estimated as: where S di f ,under1 is the shortwave diffuse radiation under the upper layers of leaves (mostly sunlit) in the overstory canopy. C o_sun is from multiple scattering of direct radiation over the overstory sunlit leaves and it is calculated as: S di f ,under1 is calculated as: where P o_sun = e −0.5LAI o_sun /cosθ 1 and θ 1 is a representative zenith angle for diffuse radiation transmission through the upper layers of leaves (mostly sunlit) in the overstory canopy and slightly dependent on overstory sunlit LAI: The net shortwave diffuse irradiance on a shaded leaf in the overstory is estimated as: where C o_sh is from multiple scattering of direct radiation over the overstory shaded leaves and it is calculated as: S di f ,under2 is shortwave diffuse radiation under the overstory canopy and it is calculated as: where P o = e −0.5ΩLAI o /cosθ o and θ o is a representative zenith angle for diffuse radiation transmission through the overstory canopy and slightly dependent on overstory LAI: The net shortwave diffuse irradiance on a sunlit leaf in the understory is estimated as: where C u_sun is from multiple scattering of direct radiation over the understory sunlit leaves and it is calculated as: S di f ,under3 is the shortwave diffuse radiation under the upper layers of leaves (mostly sunlit) in the understory canopy and it can be calculated as: where P u_sun = e −0.5LAI u_sun /cosθ 3 and θ 3 is a representative zenith angle for diffuse radiation transmission through the upper layers of leaves (mostly sunlit) in the understory canopy and slightly dependent on understory sunlit LAI: The net shortwave diffuse irradiance on a shaded leaf in the understory is estimated as: where C u_sh is from multiple scattering of direct radiation over the understory shaded leaves and it is calculated as: S di f ,under4 is the shortwave diffuse radiation reaching to the ground and it can be calculated as: where P = e −0.5Ω(LAI o +LAI u )/cosθ and θ is a representative zenith angle for diffuse radiation transmission through the whole plant canopy and slightly dependent on the sum of overstory and understory LAI: The diffuse part of net shortwave irradiance on the ground surface is estimated as: Net radiation of the ground is estimated as: where the net shortwave irradiance of the ground (S * g ) is the sum of direct and diffuse parts: The net longwave irradiance for a sunlit leaf in the overstory (L * sun_o ) is calculated as: LAI o_sun [ε a σT 4 a + ε o σT 4 o_sh (1 − P o_sh ) +ε u σT 4 u_sun (1 − P u_sun )P o_sh +ε u σT 4 u_sh (1 − P u_sh )P u_sun P o_sh + ε g σT 4 g P u P o_sh ] −2ε o σT 4