Improving spatial synchronization between X-ray and near-infrared spectra information to predict wood density profiles

Wood density is one of the most important physical properties of the wood, used in improvement programs for wood quality of major timber species. Traditional core sampling of standing trees has been widely used to assess wood density profiles at high spatial resolution by X-ray microdensitometry methods, but alternative methods to predict wood properties quality are also needed. Near-infrared (NIR) spectroscopy, a non-destructive technique, is being increasingly used for wood property assessment and has already been demonstrated to be able to predict wood density. However, the estimation of wood density profiles by NIR has not yet been extensively studied, and improved models using spectra information (NIR) and X-ray data need to be developed. To this end, partial least square regression (PLS-R) models for predicting wood density were developed at a 1.4 mm spatial resolution on Pinus pinaster wood cores, with an improved spatial synchronization along the tangential and radial directions of the strip, between X-ray data and NIR spectra. The validation of the best model showed a high coefficient of determination (0.95), low error (0.026) and no outlier. Compression wood samples were not detected as outliers and were correctly predicted by the model. However, pith spectra were detected as outliers and its predicted values were overestimated by 33% due to unusual spectra suggesting a diverse chemical composition. The results suggest that NIR-PLS models obtained can be used for screening maritime pine wood density profiles along the radii at 1.4 mm spatial resolution.


Introduction
Wood density has long been recognized as one of the most important wood physical properties. It has a direct impact on most wood utilizations, but it also is a common indicator of wood quality since it is related to other wood properties such as timber strength and shrinkage, fibre properties, pulp yield and properties or calorific values (Macdonald and Hubert 2002;Mäkinen et al. 2002;Saranpää 2003;Zobel and van Buijtenen 1989). In addition, wood density is one of the simplest and easiest wood properties to asses without the need for complex or expensive equipment, although more sophisticated equipment is required for detailed studies aimed at intra-ring density analysis, like the indirect X-ray microdensitometer (Hevia et al. 2020;Jacquin et al. 2017;Louzada and Fonseca 2002;Louzada 2003;Rozenberg et al. 2001). The procedure developed by Polge (1966) provides a high-resolution diametrical density profile from each wood sample, but the micrographic techniques are too tedious and too time-consuming to be used for large samplings (Polge 1978) as required in improvement programs for wood quality, where wood density is the property most widely used (Rozenberg and Cahalan 1997). An automated system that combined scanning X-ray microdensitometry with image analysis to assess the wood structure at 50 micrometres level was developed by Evans and co-workers (Evans et al. 1995) back in the 90s mainly to assess young softwood plantations (Silviscan-1) (Evans 1994) and later upgraded with an X-ray diffractometer to allow the determination of microfibril angle, and a complex image analysis system that allowed the analysis of hardwood plantations (Silviscan-2) (Anonimous 1997).
Maritime pine is the main softwood species in southwest Europe, representing a planted area of about 3 million ha in Spain, France and Portugal (AEF 2016;da Silva Perez et al. 2007;ICNF 2015). Due to its importance, this species has been under improvement programs initially for straightness and volume followed by improvement in wood quality, including wood density (da Silva Perez et al. 2007;Gaspar et al. 2009;Pot et al. 2002Pot et al. , 2006. Core sampling has been widely used to estimate wood density profiles and demonstrated to play a very important role in performing early selection and shortening the breeding cycle as well as improving the breeding efficiency (Bouffier et al. 2008;Gaspar et al. 2009;Meder et al. 2010;Pan et al. 2018). Genetic analysis of density components of Pinus pinaster, Pinus nigra and Pinus sylvestris showed that the earlywood density is under stronger and most stable genetic control and consequently most suitable to be included in selection and improvement programs (Louzada and Fonseca 2002;Louzada 2003;Gaspar et al. 2008aGaspar et al. , b, 2009Gaspar et al. , 2011Fernandes et al. 2017a,b;Dias et al. 2018). However, for larch, similar heritabilities were found for earlywood and latewood (Pâques et al. 2013).
Currently near-infrared (NIR) spectroscopy is recognized as a powerful nondestructive technique for wood traits. In particular, several studies, condensed in a few reviews, demonstrated that wood density can be predicted by NIR for different tree species (Leblon et al. 2013;Tsuchikawa 2007;Tsuchikawa and Kobori 2015;Tsuchikawa and Schwanninger 2013). More recent studies showed the potential of NIR for high spatial resolution, useful for intra-ring analysis using X-ray microdensitometry as the reference method (Alves et al. 2012;Baettig et al. 2017;Jones et al. 2007;Rodrigues et al. 2013). The main limitation of the equipment currently available is that the window of existing commercial NIR fibreoptic probes is circular and often too large, thus inadequate for the purpose of higher spatial resolution. To circumvent this limitation, different strategies were followed for the spectra acquisition in the radial or transverse sections, directions that maximize the property variation along the wood: (1) using selected rings with earlywood and latewood widths that match the size of the diameter of the fibre-optic window (Alves et al. 2012); (2) acquiring a custom fibre-optic probe with a narrow measurement area (Meder et al. 2010); or (3) reducing the window of existing fibre-optic probes with a low absorption black matte polymer, which in addition has the advantage of converting the circular area in an approximate rectangle (Rodrigues et al. 2013). Even though, these procedures cannot allow a spatial resolution as X-ray microdensitometry does, still a sub-millimetre spatial resolution can be attained (Meder et al. 2010;Rodrigues et al. 2013). A comparable X-ray resolution can be attained by an emerging tool based on the hyperspectral imaging (HSI). HSI was already used to predict the wood density in Pinus pinea (Fernandes et al. 2013a, b) and in Cryptomeria japonica (Ma et al. 2017). However, the errors obtained so far are above 0.063 (Fernandes et al. 2013a, b) or 0.105 (Ma et al. 2017).
Despite the lower errors obtained by using a fibre-optic probe compared to hyperspectral imaging, this methodology still has a major disadvantage regarding the acquisition of the spectra by moving the wood strip by hand along the length, which takes longer than using X-ray techniques. However, some time can be gained since sample preparation requirements are simpler and faster for NIR and there also is the possibility to automate the acquisition (Meder et al. 2010). The time and efforts for spectra acquisition using a fibre probe can be further justified if the spectra are used to assess other parameters like MFA or chemical composition (Meder et al. 2010).
Although an automated linear transport system that allows strips to be scanned by NIR at a spatial resolution of 1 mm steps via a custom fibre-optic probe in the radial direction already exists (Meder et al. 2010), for the remaining NIR users in the wood field the situation is not as good. The major NIR brands do not offer dedicated fibre probes with the required dimensions and there are software copyright limitations to a true integration in order to obtain a fully automated measurement system. This study will show how to circumvent the limitation of the fibre probe by using a black mate plastic with almost no NIR absorption to allow for the conversion of a circular probe in a rectangle with 1 by 3 mm that allows the advancement of the strip in 1 mm increments. In previous publications, using a similar outfit, the quality of the developed models for wood density, based on core sampling, and in particular, the high number of outliers found were discussed in view of the difficulties associated with synchronization between the X-ray data and spectra information (Alves et al. 2012;Rodrigues et al. 2013). This study will show that it is possible to obtain models with similar lower errors but with a reduced percentage of outliers, by using an improved spatial synchronization.

Wood samples collection and preparation
The trees examined in this study were obtained from a natural stand of Pinus pinaster Aiton in Northwest Spain (Atlantic region), one of the most important conifer timber species in the Atlantic area and Southern Europe (Hevia et al. 2017). From a total of 42 trees, 12 mm increment cores were extracted bark-pith-bark (North/South orientation) at breast height (1.30 m from above ground) from 4-to 5-years-old trees. The age at which early selection for wood quality characteristics in radiata pine is feasible, since at younger ages heritability reaches almost zero and stabilizes at 4-5 years old (Wu et al. 2007). Each sampled increment core was glued to a wood holder and cut using a high precision twin-bladed saw to obtain a longitudinal radial strip (approximately 2 mm thick). The final dimensions of the strips were 12 mm width (tangential direction) by the total length from bark to bark (radial direction) by 2 mm thick (axial direction). The obtained strips with uniform surface were nevertheless sanded (P150 to P800-grit sanding paper) to ensure a good surface and clearly visible ring boundaries. From the 42 trees, three radial strips (I, II and III) were randomly selected for NIR analysis.

X-ray microdensitometry
Thin radial strips were scanned for wood density (g/cm 3 ) determination (in radial direction) using an Itrax multiscanner (Cox Analytical Systems, www.coxsy s.se), equipped with a Cr-tube that was operated at 20 kV, 50 mA and 50 ms at each measurement point, with a resolution of 50 μm. The radiographic greyscale images obtained by microdensitometry were used to determine the wood density data of each point, which enabled a comparison of intra-ring and annual tree-ring density values for each sample. Moreover, different radial and tangential positions of the analysis paths ( Fig. 1) were defined in the analysis of the density profile of each radiographic image by WinDendro ® (Regent Instruments, Québec, QC, Canada). Wood density profile values were extracted from the radiographic images after calibrating the greyscale intensities to wood densities using the calibration wedge provided by the manufacturer (Cox Analytical Systems).

Near-infrared (NIR) spectroscopy
Fourier transform near-infrared (FT-NIR) spectra were recorded with a Bruker IN 263 fibre-optic probe with a 10 mm diameter head and a 9-10 mm 2 (about 3.6 mm diameter) window (Alves et al. 2012). The spectra were collected in the wavenumber range from 12,500 to 4000 cm −1 with an 8 cm −1 spectral resolution and 64 scans per spectrum on a Bruker MPA. The thermoplastic resin Spectralon ® (http://www.labsp here.com) was used as a reference.
The measurement window of the fibre-optic probe is a circle and too large to measure a density profile of most of the intra-ring annual rings along the wood strip. To reduce it and further convert it to a rectangle, a low NIR absorber, black matte plastic (0.36 mm thick) with a slit with 1 × 3.6 mm was fixed in the centre of the window probe between the sample and the probe (Rodrigues et al. 2013). Additionally, the rectangular window (1 × 3.6 mm) allowed an easier assignment of the X-ray microdensity reference values when compared to a circle (Alves et al. 2012). The NIR spectra of the whole sample profile were recorded by moving the wood strip in 1 mm increments from South to North direction by hand. Samples were measured under laboratory conditions at 22 °C and 60% humidity.

NIR and X-ray data synchronization
Although the matte black plastic had a slit with 1 mm width, the laser light emerging on the opposite side of the wood strip had an illuminated area of 1.4 mm, due to scattering. Since each spectrum was obtained at 1 mm intervals (moving the strip by hand at 1 mm increments), the average wood density determined by Itrax for each spectrum was obtained considering a window of 1.4 × 3.6 mm, radial x tangential.
The most important step to establish a calibration model is the correct synchronization between the high-resolution X-ray data (50 × 50 µm) and the NIR spectra data (1.4 × 3.6 mm) (Alves et al. 2012;Rodrigues et al. 2013). In the first trial, several paths were tested (Fig. 1): path (a) was selected along each wood strip as close as possible to the actual location where the NIR spectra were obtained. For the correction of the path along the tangential direction (width) of the strip, two additional paths were selected 0.25 mm above (b) and below (c) of the first path (a). This distance was chosen taking into account that the maximum error in the localization of the path would be inferior to 1 mm. In fact, using paths way outside this range originated models with coefficients of determination below 0.67 and errors above 0.056. If required, additional paths would be tested if paths (b Fig. 1 Synchronization for the selected wood strips: 1) tangential direction (above), 3 paths (a, b and c) and 2) radial direction (below) 7 paths (0 to 6) or c) outperformed (a) until no further improvement was obtained. After the best path for each strip was chosen according to the statistics of the PLS-R models obtained for each path, a new origin was selected along the radii of the wood strip in increments of 50 µm to the maximum of six times (300 µm) (Fig. 1). It is worth noticing that the wood strip covers completely the matte black plastic and the slit making it difficult to determine correctly the origin. In addition, the first spectra of each strip cannot start too close to the border of the slit due to the loss of light by scattering. Additionally, WinDENDRO (Regent Instruments, Québec, QC, Canada) has limited functionality when the sides of the paths are too close to the image boundary. The beginning and ending points of each path must be located within a half path-width distance from that boundary (WinDENDRO 2009).

Partial least square regression (PLS-R)
OPUS Quant 2 software (version 7.0, Bruker Optics, Germany) was used for the calculation of the PLS-R models. For the synchronization trials, the PLS-R models used the wavenumber range from 10,500 to 5450 cm −1 . For the model (243 spectra), the spectra were randomly split into 70% for calibration and 30% for validation and in addition, the ranges from 10,500 to 4600 cm −1 and 10,500 to 4450 cm −1 were used.
For all the models developed, the spectra were mean-centred before any preprocessing. Several pre-processes were tested, beside the raw spectra (no pre-processing), first (1stDer) and second derivatives (2ndDer), multiplicative scattering correction (MSC) and vector normalization (VN) as well as combinations of the first derivative with MSC and VN. Cross-validation results using the spectra with no further pre-processing except for mean-centring resulted always in PLS-R models with the best statistics, so only these models are shown. The optimum number of PLS components (rank) was determined by the full inner cross-validation method (leave one out). All models were calculated to a maximum rank of 10, and the statistics of the cross-validation [coefficient of determination (r 2 ), root-mean-square error of cross-validation (RMSECV) and residual prediction deviation (RPD)] were compared.

Outlier detection
Mahalanobis distance is used as a diagnostic for outlier detection in multivariate calibration. The Mahalanobis distance limit value is used for the identification of outliers. OPUS calculates this limit value as follows: The Mahalanobis distance limit is determined based on the distribution of all calibration spectra. For this purpose, the mean value and the standard deviation are calculated. Assuming that there is a normal distribution, a one-sided limit is defined which covers a probability of 99.999%.

X-ray microdensitometry (Itrax)
The average density of the strips of the 42 trees ranged from 0.395 to 0.527 g/cm 3 . The minimum, maximum and average wood density from X-ray densitometry of the three selected strips are shown in Table 1. There is a larger variation within-tree than among-trees as expected (Zobel and van Buijtenen 1989). The between-trees range for the 42 trees is 0.132 g/cm 3 , whereas the within-tree variation ranged from 0.353 g/cm 3 (II), to 0.514 g/cm 3 (III).
Similar wood density results for juvenile wood (0.441-0.451 g/cm 3 ) were reported for this species in France (Alves et al. 2012;da Silva Perez et al. 2007). Somewhat higher (0.484 g/cm 3 ) values were reported for 18 years-old samples collected in Portugal (Louzada and Fonseca 2002).

Path selection
The PLS-R models obtained for each path within each strip were reasonably similar to the main exception of the path (c) for strip I ( Table 2). The path correction along the tangential direction of the strip (a, b and c) hardly shows significant improvements over the initial path (a), which shows that the initial synchronization along the tangential direction was successful ( Table 2). The decision regarding the selection of the best path was based on the statistics of each model; in the case of similar results, as for instance strip I paths a and b (Table 2), path b was selected since it had no outlier.
The correction along the radial direction of the strip was not necessary for strip II, but for strips I and III, a correction of 100 µm (I) and 300 µm (III) improved the synchronization between spectra and Itrax data, resulting in models with better statistics and a lower number of outliers in the case of strip III (Table 2).

PLS-R modelling
A number of models with good statistics were obtained for the combination of spectral ranges 10,500-5450, 10,500-4600 and 10,500-4450 cm −1 , with several pre-processes. For the three regions, the best overall models were obtained with no pre-processing, so only these are shown in Table 3. The cross-validation of the models has good statistics with coefficient of determination (r 2 ) above 0.95, RMSECV below 0.025 g/cm 3 , RPD above 4.5 and a low number of outliers (5), of which two were from the pith (Table 3). However, the smallest region gave the best statistics for the validation, with a high coefficient of determination (0.95), low RMSEP (0.026 g/ cm 3 ), RPD of 4.4 and no outliers detected. The final model, with all spectra (243), using the region 10500-5450 cm −1 resulted in a PLS-R model with similar statistics and eight PLS vectors, with a coefficient of determination (r 2 ) of 0.95, a RMSECV of 0.024 g/m 3 and RPD of 4.6. For Table 2 Statistics of the cross-validation results of the PLS-R models (strips I, II and III), selection of the best path -tangential direction (a, b or c) and origin correction -radial direction (1 to 6) Path, tangential direction; Origin correction, radial direction; n, number of spectra; r 2 , coefficient of determination; RMSECV, rootmean-square error of cross-validation; RPD, residual prediction deviation; Rk, number of principal components; OL, outlier the calculation of the final model, all spectra were used, including the ones flagged as outliers in model development (cross-validation). In total, seven spectra (2.8%) were detected as outliers; five of which were the same as already found in the model development including the two from the pith. The comparison with other models using X-ray as the reference method and fibre-optic probes for spectra acquisition shows that the error obtained in this work (0.024 g/cm 3 ) is similar to or smaller than the reported ones taking into account that the spatial resolution varied from 0.9 to 10 mm. A similar error (0.023 g/cm 3 ) was obtained in the calibration for a 25-year-old Japanese larch (Larix kaempferi) with a spatial resolution of 0.9 mm; however, in this model a larger percentage of outliers (37%) was found in the validation of the model (Rodrigues et al. 2013). Interestingly, in that study, the model returned a higher coefficient of determination (0.98) and a higher RPD (7.6), due to a wider wood density range. A somewhat larger error (0.030 g/cm 3 ) for the same species was obtained using the entire circular window of the fibre probe (3.6 mm), also with a low percentage (7%) of outliers (Alves et al. 2012). Larger errors (0.043 g/cm 3 ) were reported for Pinus radiata at a resolution of 1.4 mm (Baettig et al. 2017) and for Pinus taeda (0.056 g/cm 3 ) at a 10 mm spatial resolution, while for 2 mm resolution, errors in excess of 0.133 g/cm 3 for different NIR instruments were obtained (Jones et al. 2007).
PLS-R models using hyperspectral imaging systems, and X-ray data as the reference data, also resulted in larger errors with a RMSEP of 0.065 g/cm 3 (Fernandes et al. 2013a). Using neural networks, this error improved to 0.063 g/cm 3 (Fernandes et al. 2013b). An even larger RMSECV error (0.105 g/cm 3 ) was reported for a Cryptomeria japonica model using hyperspectral imaging system (SilviScan) and X-ray data as reference (Ma et al. 2017). The poor errors obtained are probably due to the lower spectral resolution of existing HSI equipment, although difficulties in the synchronization between spectra data and X-ray at these high resolutions cannot be ruled out.

Influence of compression wood and pith
Compression wood is characterized by its high lignin content (Timell 1986) as well as composition (Alves et al. 2008(Alves et al. , 2009. The density of compression wood ranged Table 3 Statistics for cross-validation and validation of the PLS-R models, after selection of the best path and correction along the radial direction The rest of the variables are described in Table 2 RMSEP, root-mean-square error of prediction from 0.512 to 0.588 g/cm 3 within the range of latewood density (0.500-0.811 g/ cm 3 ). The spectra were not removed from the model during cross-validation and the spectra were not flagged as outliers. This seems to indicate that the predicted values were correctly predicted within the error of the model, even if its chemical composition differs from that of normal wood. Different results for the influence of compression wood were reported. Higher values for compression wood of Cryptomeria japonica were found and further poorly predicted by the model, which were attributed to the different chemical composition (Ma et al. 2017).
The density of pith determined by Itrax was 0.312-0.314 g/cm 3 within the range of earlywood density (0.246-0.430 g/cm 3 ); however, the pith spectra were removed from the model during cross-validation since its predicted values were 0.394 and 0.440 g/cm 3 , respectively. Moreover, the spectra were flagged as outliers with the spectra Mahalanobis distance exceeding the limit by 3-4 times. Pith or medulla is originated from primary growth of the trunk and very little is known regarding its chemical composition. However, the comparison of the second derivative spectra from 6100 to 5450 cm −1 of the pith (dashed line in Fig. 2), normal wood (grey, Fig. 2) and compression wood (black, Fig. 2) reveals that pith has an unusual chemical composition with a low lignin content compared to normal wood. The relative intensity of the valley at 5983 cm −1 from the aromatic skeletal due to lignin (Schwanninger et al. 2011) shows a gradation in terms of lignin content from pith (lower) to compression wood (higher) and with normal wood somewhere in the middle between the two. Besides this difference, several other differences are clearly visible. In the region from 5940 to 5840 cm −1 , instead of a single valley with minima at 5878 or 5987 cm −1 (normal and compression wood, respectively), the spectra of pith show two minima at 5905 and 5858 cm −1 mainly from holocellulose due to the low contribution of lignin to this region. The minimum at 5804 cm −1 is shifted Fig. 2 FT-NIR spectra in the second derivatives (2ndDer) mode of Pinus pinaster from different tissues: dashed line-pith; black line-compression wood; grey line-normal wood to 5797 cm −1 and the minima at 5740, 5636 and 5589 cm −1 are missing. These differences may explain why the pith spectra were tagged as outliers.

Conclusion
Results of this work show that it is possible to reduce and change the profile of available fibre-optic probes that allow obtaining models for the prediction of wood density at a low spatial resolution with acceptable errors and low number of outliers. Methods for the selection of the correct path and origin were developed that allowed obtaining models with better statistics.
The density of compression wood spectra was correctly predicted within the error of the model and the spectra were not flagged as outliers. However, the pith spectra were considered as outliers with their Mahalanobis distance exceeding the limit by 3-4 times resulting in a 33% overestimation of the predicted density.
The final model obtained can be used for screening and for quality control and selection of the Pinus pinaster wood density with a spatial resolution of 1.4 mm. The final model has a coefficient of determination (r 2 ) of 0.95, RMSECV of 0.024 g/m 3 , RPD of 4.6 and a low number of outliers. The spatial resolution can facilitate the detection of annual growth ring features and evaluation of aspects of heterogeneous wood quality. The technology presents itself as a potential tool for the Pinus pinaster breeding program, enabling the early non-destructive evaluation of intra-ring density at each annual ring.