Genetic diversity assessment among Corema album (L.) D. Don (Ericaceae) genotypes based on ISSR markers and agro-morphological traits

Corema album (L.) D. Don is the only species of the genus Corema growing naturally on sand dunes throughout the Atlantic coast of the Iberian Peninsula, and is noted for its white berries and their nutritional value. The lack of genetic studies on the species is one of the main limitations to its establishment as a new crop. Thus, this study assesses its genetic diversity based on inter-simple sequence repeat (ISSR) molecular markers and morphological traits. Seventy-one female plants, from four sampling sites, were evaluated using six ISSR primers and eight morphological traits. Fifty polymorphic loci were detected. The dendrogram based on the UPGMA method and the principal coordinate analysis classified the 71 C. album genotypes into distinct clusters. The analyses revealed that accessions from the same geographical area were generally, but not entirely, clustered into the same group. Analysis of molecular variance showed that variation was higher among populations than within populations. The analysis of morphological traits revealed no distinct separation among the C. album genotypes grown in different geographic areas. To our knowledge, this is the first study on the assessment of the genetic diversity in this species.


Introduction
Corema album (L.) D. Don belongs to the Ericaceae family and occurs naturally on sand dunes of the Atlantic coast of the Iberian Peninsula (ssp. album), and in the Azores Islands on volcanic lava and ash fields (ssp. azoricum Pinto da Silva). Besides its great ecological importance (Guitián et al. 1997; Zunzunegui et al. 2006), its edible white berries have been highly appreciated and exploited (Oliveira and Dale 2012) due to their high nutritional value and important antioxidant properties. These berries contain high amounts of antioxidants, phenolic acids and flavonoids and low amounts of anthocyanins (León-González et al. 2013; Andrade et al. 2017b) and are also an important source of fibres and sugars (Andrade et al. 2017a).
Despite these properties, with potential benefits for human health, the species is still poorly exploited commercially and is still harvested from the wild without any cultivation. However, there is an increasing demand for berries with a distinct white colour. Therefore, establishing this species as a new crop with high-quality fruit, in order to respond to consumer demand and explore market opportunities, is a challenge.
The characterization of DNA-based markers provides information on genetic diversity and relationships both within a population (intragroup diversity) and among different populations (intergroup diversity), which provides useful knowledge for breeding selection (Agarwal et al. 2008). Inter-simple sequence repeat (ISSR) markers have shown great potential for assessing the genetic diversity of wild species and the structure of natural populations (Zietkiewicz et al. 1994; Ueno et al. 2015; Zoratti et al. 2015). The ISSR technique is simple, fast, highly reproducible and low in cost, and no prior genome knowledge is needed for its implementation (Reddy et al. 2002). These markers have also been used in genetic studies on Vaccinium populations, which also belong to the Ericaceae family (An et al. 2015; Debnath 2007, 2009; Gawroński et al. 2017; Yakimowski and Eckert 2008). For C. album, the only molecular approach using ISSR was carried out among male and female plants to identify a putative sex-specific marker (Nóbrega et al. 2016).
The present work surveys wild populations of C. album along its distribution area to evaluate their genetic diversity using ISSR markers.

Plant material and sampling sites
We sampled 20 female C. album plants from each of four populations (Table 1) located in different biogeographical units along the Atlantic coast of Portugal. For all plants, geographical coordinates were recorded using the Global Positioning System (GPS).
In all populations, the vegetation cover was dominated by C. album shrubs, and Pinus pinaster Ait. was present further inland of the dune systems, except in Monte Clérigo. In Comporta and Meco, the populations were at the interface between the dunes and the pine woodland, whereas the population of Duna de Quiaios was under the canopy of the pine woodland.

DNA isolation and amplification
Fresh and healthy leaves were ground to a fine powder in liquid nitrogen using a pre-cooled mortar and pestle and then stored at −80 °C until use for DNA extraction.
Total genomic DNA was isolated from approximately 100 mg of leaf powder using the DNeasy Plant Mini Kit (QIAGEN, Hilden, Germany) following the manufacturer's instructions. The concentration of DNA was estimated with the NanoDrop 2000 UV-Vis Spectrophotometer (Thermo Fisher Scientific, Massachusetts, USA).
A set of 20 ISSR primers was screened. Of these, six primers were selected based on their reproducibility and levels of polymorphism and used for the final analysis. The primers used, with their respective sequences and annealing temperatures, are shown in Table 2.
The amplification profile consisted of an initial denaturation of 3 min at 94 °C, followed by 40 cycles of 1 min at 94 °C, 1 min at the annealing temperature (Table 2) and 2 min at 72 °C, and a final extension of 10 min at 72 °C. Amplicons were separated by electrophoresis at 5 V cm⁻¹ in agarose gel (1.5%) containing 0.5 µg/mL ethidium bromide, in 1× TBE running buffer.
In order to have a representative survey of the distribution area, 20 accessions per population were sampled, as described above.
However, the DNA from three plants from Meco, two from Monte Clérigo and four from Duna de Quiaios did not amplify properly by PCR, showing reduced yields. Thus, although we aimed to analyse 80 accessions, these nine samples had to be excluded and only 71 genotypes were included in the assessment of the genetic diversity.

Morphological data
A total of 10 quantitative phenotypic traits were assessed in detail in all plants during the flowering (March to May) and fruiting (August and September) seasons. The list included the most important leaf, flower and fruit features: plant volume, flower/inflorescence ratio, number of leaves per whorl, length of annual growths, branching number, number of fruits with more than 10.25 mm in diameter, number of fruits with diameter between 10.25 and 7.50 mm, number of fruits with less than 7.50 mm in diameter, percentage of white fruits and fruit dry/fresh weight ratio.
Of the 10 traits, only eight were used for further analyses, owing to collinearity among some of them and to their agronomic relevance, as explained in the statistical analysis of the morphological variation. Thus, the number of fruits with diameter between 10.25 and 7.50 mm and the number with less than 7.50 mm were discarded.

Genetic diversity
The genetic diversity of the four populations was analysed using the six selected ISSR primers.
ISSR amplification products were scored by size, considering only clear and unambiguous bands. The results were transformed into a binary presence (1)/absence (0) matrix. Genetic diversity was estimated by three indices calculated for each ISSR marker: the polymorphism information content (PIC) (Roldán-Ruiz et al. 2000), the resolving power (RP) (Prevost and Wilkinson 1999) and the marker index (MI) (Powell et al. 1996), and by the following parameters: number of bands, number of polymorphic bands and percentage of polymorphic loci.
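As an illustration of how two of these indices can be derived from a binary band matrix, the following minimal Python sketch computes the per-band PIC as 2f(1 − f) and the resolving power as the sum of band informativeness, Ib = 1 − 2|0.5 − f|, where f is the fraction of genotypes showing the band. These are the formulas commonly attributed to the cited references; the score matrix and function names are hypothetical, and the sketch is not the procedure actually run in this study. The marker index is not covered.

```python
def band_frequencies(matrix):
    """Fraction of genotypes (rows) showing each band (column) in a 0/1 matrix."""
    n_genotypes = len(matrix)
    n_bands = len(matrix[0])
    return [sum(row[j] for row in matrix) / n_genotypes for j in range(n_bands)]


def pic_per_band(freqs):
    """PIC of each dominant-marker band, assumed here to be 2 * f * (1 - f)."""
    return [2 * f * (1 - f) for f in freqs]


def resolving_power(freqs):
    """RP of a primer: sum over bands of Ib = 1 - 2 * |0.5 - f|."""
    return sum(1 - 2 * abs(0.5 - f) for f in freqs)


# Hypothetical scores for one primer: 4 genotypes (rows) x 3 bands (columns).
scores = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 1],
]

freqs = band_frequencies(scores)
mean_pic = sum(pic_per_band(freqs)) / len(freqs)
rp = resolving_power(freqs)
print(freqs, round(mean_pic, 3), rp)
```

A band present in half of the genotypes contributes the maximum to both indices (PIC = 0.5, Ib = 1), which is why primers with many intermediate-frequency bands score highest.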
The data obtained by scoring the ISSR profiles of the different primers were subjected to cluster analysis using the 71 × 71 matrix of the Nei-Li dissimilarity coefficient (Nei and Li 1979), computed with the stats R package (R Core Team 2013). The dendrogram was built using the UPGMA clustering method (Unweighted Pair Group Method with Arithmetic Mean) with the factoextra R package (Kassambara and Mundt 2017). The genetic structure and the differences between populations were evaluated through a Principal Coordinate Analysis (PCoA) using the vegan R package (Oksanen et al. 2013). A hierarchical analysis of molecular variance (AMOVA) was also performed using the pegas R package (Paradis et al. 2018). In order to seek relations between variables in all plants of the four populations (n = 71), Spearman's correlation (α = 0.05) was computed with the Hmisc R package (Harrell 2014).
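The clustering step can be sketched in a few lines. The Python example below is only an illustration, not the R workflow used in this study: it computes the Nei-Li dissimilarity between binary band profiles as 1 − 2a/(2a + b + c), where a is the number of shared bands and b and c are the bands unique to each profile, and then applies a naive average-linkage (UPGMA) agglomeration. The three profiles and all function names are hypothetical.

```python
def nei_li_dissimilarity(x, y):
    """1 - 2a / (2a + b + c) for two 0/1 band profiles of equal length."""
    shared = sum(1 for i, j in zip(x, y) if i == 1 and j == 1)
    total = sum(x) + sum(y)  # equals 2a + b + c
    return 0.0 if total == 0 else 1 - 2 * shared / total


def distance_matrix(profiles):
    """Full symmetric matrix of pairwise Nei-Li dissimilarities."""
    n = len(profiles)
    return [[nei_li_dissimilarity(profiles[i], profiles[j]) for j in range(n)]
            for i in range(n)]


def upgma(dist, labels):
    """Naive UPGMA; returns merges as (members_a, members_b, distance)."""
    clusters = {i: [i] for i in range(len(labels))}
    merges = []
    while len(clusters) > 1:
        keys = sorted(clusters)
        best = None
        for ai in range(len(keys)):
            for bi in range(ai + 1, len(keys)):
                a, b = keys[ai], keys[bi]
                # Cluster distance = mean of all pairwise member distances.
                pairs = [dist[i][j] for i in clusters[a] for j in clusters[b]]
                d = sum(pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append(([labels[i] for i in clusters[a]],
                       [labels[i] for i in clusters[b]], d))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return merges


# Hypothetical band profiles for three genotypes.
labels = ["G1", "G2", "G3"]
profiles = [
    [1, 1, 0, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]

dist = distance_matrix(profiles)
merges = upgma(dist, labels)
for members_a, members_b, d in merges:
    print(members_a, members_b, round(d, 3))
```

In this toy case G1 and G2 share two of their bands and merge first; G3 then joins at the mean of its distances to G1 and G2, which is the averaging rule that distinguishes UPGMA from single or complete linkage.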
The traits were first selected according to their correlations and their agronomic interest. Eight traits were then used to perform a principal component analysis (PCA), with the factoextra R package (Kassambara and Mundt 2017), in order to find which traits best differentiate each population.
A dissimilarity matrix was calculated from the eigenvalues extracted from the first three PCA axes, using the Mahalanobis distance, as in Pereira-Lorenzo et al. (2012). This was done with the StatMatch R package (D'Orazio and D'Orazio 2019). The dendrogram was also built using the UPGMA method by means of the factoextra R package (Kassambara and Mundt 2017). To evaluate the relationship between the genetic and morphological dissimilarity matrices, the Mantel test correlation was calculated using the vegan R package (Oksanen et al. 2013).
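The logic of a Mantel test can be outlined as follows. This Python sketch is an illustration under stated assumptions rather than the vegan implementation used here: it correlates the upper triangles of two symmetric dissimilarity matrices with Pearson's r and estimates a one-sided p-value by permuting the rows and columns of one matrix together. The matrices, the number of permutations and the seed are hypothetical.

```python
import random
from math import sqrt


def upper_triangle(m, order=None):
    """Upper-triangle values of a symmetric matrix, optionally with rows and
    columns reordered by the same permutation."""
    n = len(m)
    idx = order if order is not None else list(range(n))
    return [m[idx[i]][idx[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel(m1, m2, permutations=999, seed=0):
    """Return (r, p): observed matrix correlation and one-sided permutation p."""
    rng = random.Random(seed)
    v1 = upper_triangle(m1)
    r_obs = pearson(v1, upper_triangle(m2))
    n = len(m1)
    hits = 0
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)
        # Count permutations at least as correlated as the observed data.
        if pearson(v1, upper_triangle(m2, order)) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical matrices: "morphological" distances are exactly twice the
# "genetic" ones, so the observed correlation is 1.
genetic = [
    [0, 1, 3, 7],
    [1, 0, 2, 6],
    [3, 2, 0, 4],
    [7, 6, 4, 0],
]
morphological = [[2 * v for v in row] for row in genetic]

r, p = mantel(genetic, morphological)
print(round(r, 3), p)
```

Permuting whole rows and columns, rather than individual cells, keeps each individual's set of distances intact, which is what makes the test valid for the non-independent entries of a distance matrix.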

Genetic diversity
All six ISSR primers produced distinct, reproducible polymorphic banding patterns, with a total of 51 scorable bands, ranging from 6 to 10 bands per primer, with an average of 8 bands per primer (Table 3). Amplification products ranged from 500 bp to 2.0 kb.
The polymorphism information content (PIC) values of the primers varied between 0.16 (primer M13) and 0.37 (primer UBC855), with an average of 0.31. The highest marker index (MI) was obtained with primer M13 (17.41), with an average of 10.1 per primer. The highest resolving power was that of UBC840 (17.24), with an average per primer of 11.26. Regarding the ISSR banding profiles, the genotypes from Meco and Comporta presented higher diversity than the genotypes from Monte Clérigo and Duna de Quiaios (Table 3).
The dendrogram obtained from the cluster analysis based on the 51 ISSR markers is shown in Fig. 1. Five distinct clusters were identified. Cluster four consisted of ...
The AMOVA showed that the proportion of variation attributable to differences among populations was high (62.4%), whereas only 37.6% occurred within populations.

Morphological variability
The Kruskal-Wallis test performed on the 10 morphological traits showed significant differences among populations (Table 4). Vegetative traits showed the most significant differences among the four sites. The number of fruits in each of the three classes was higher in Meco and Comporta, although no significant differences were found in the two larger fruit classes. The percentage of white fruits was significantly different only in Monte Clérigo. The dry/fresh fruit weight ratio was highest in Quiaios and lowest in Comporta, which means that the pulp/seed size ratio was higher in Quiaios.
Spearman's correlation showed that the number of branches and the number of fruits with diameter between 10.25 and 7.50 mm had the highest significant correlations with the other variables (data not shown); therefore, both traits were excluded from the principal component analysis (PCA). The remaining traits showed smaller or non-significant correlations.
The first three axes of the PCA accounted for 62.1% of the trait variation (Fig. 3). PC1 (25.2% of the variation) was influenced mainly by the length of the annual growths, the ratio of flowers per inflorescence and the percentage of white fruits. Plant volume, number of fruits with more than 10.25 mm in diameter and dry/fresh fruit weight ratio had a positive loading on PC2 (24% of the variation). Finally, PC3, which accounted for 12.9% of the variation, was influenced by the number of fruits smaller than 7.5 mm in diameter and the number of leaves per two whorls.
Cluster analysis based on the Mahalanobis distance, calculated as a dissimilarity measure between the 71 genotypes from the first three standardized principal components (PCs) and clustered by UPGMA, produced the dendrogram presented in Fig. 4. The morphological traits varied widely, and the highest phenotypic diversity was observed in Comporta and Meco. However, the general distribution of the genotypes did not reflect their geographical origin.
The Mantel test between the morphological and molecular dissimilarity matrices showed a low correlation (r = 0.139; p = 0.004).

Discussion
The genetic improvement of any crop depends on the use of well-characterized wild relatives and breeding techniques. The assessment of genetic diversity is a requirement for selecting high-yielding genotypes.
Fifty alleles were identified by six ISSR markers, proving their ability to be used as polymorphic markers in C. album accessions. Although ISSR markers have been extensively used to assess genetic diversity, there are no reports for the genus Corema, which includes only two species. Thus, to our knowledge, this is the first study on the assessment of the molecular genetic diversity in this species.
Regarding molecular data, the clustering analyses showed that Meco and Comporta were the populations with the most dispersed accessions in the dendrogram, which suggests a higher genetic diversity compared with Monte Clérigo and Duna de Quiaios.
Both the PCoA and the phylogeny reconstruction based on ISSR markers reveal a clustering pattern consistent with geographical location and suggestive of a response to variation in environmental conditions. In fact, the PCoA showed that the Comporta and Meco populations present a higher dispersion than the other populations. This ...
The AMOVA values differed from those observed in other species of the Ericaceae family. Within-population variance was higher, also using ISSR markers, in Vaccinium myrtillus (Zoratti et al. 2015), Vaccinium angustifolium (Debnath 2009) and Vaccinium vitis-idaea (Debnath and Sion 2009), probably due to the self-pollination capability of Vaccinium, which is absent in Corema. C. album is cross-pollinated by wind (Álvarez-Cansino et al. 2010; Guitián et al. 1997) and has a low germination percentage (Calviño-Cancela 2004), leading to a higher variation between populations.
The first three axes of the PCA explained 62.1% of the variation among agro-morphological traits. Although this is not a high percentage, similar results were obtained for morphological traits in other species (Pereira-Lorenzo et al. 2012; Ciarmiello et al. 2015; Kouakou et al. 2018). All four populations partially overlap, with Comporta being the population with the greatest morphological diversity. Monte Clérigo and Quiaios had almost symmetric distributions, even though they slightly overlap.
Comporta had plants with larger volumes and larger fruits. Quiaios was strongly influenced by the dry/fresh fruit weight ratio and the length of annual vegetative growth.
The cluster analysis of morphological traits revealed no distinct separation among the C. album genotypes growing in different geographic areas. This variability within and among populations was also found in other studies (Burgos et al. 2018). The morphological traits of the species did not show any accordance with its geographical distribution, as Solouki et al. (2008) found, for example, in Matricaria chamomilla.
The Mantel test between morphological and molecular data yielded a quite low correlation value, similar to other studies (Fanizza et al. 1999; Allel et al. 2017; Giordani et al. 2017; Burgos et al. 2018). Several factors might explain this somewhat unexpected low correlation. Molecular markers could be covering parts of the genome with both coding and noncoding regions and could be less subject to selection pressures than morphological traits (Burgos et al. 2018; Semagn 2002). A correlation of morphological traits with environmental conditions, but with no genetic correspondence, could mean distinct phenotypes that are not distinct genotypes (Johns et al. 1997). Such discordance might relate to evolutionary and biogeographical processes that are beyond the aims of the current analysis (see also Martins et al. 2006).

Conclusion
The analyses based on the ISSR markers grouped the accessions according to their geographical distribution and, most importantly, revealed genetic diversity. Genetic diversity was higher between populations than within them. The morphological traits showed high diversity, but no signal of geographic location. The Mantel test between genetic and morphological data showed a low correlation. Comporta and Meco had higher molecular and morphological diversity, and are therefore the populations chosen for future plant material selection.
As the first approach, to our knowledge, to the characterization of the genetic and morphological diversity of C. album in Portugal, the study reveals a high degree of diversity among the accessions, which can be further used for crop improvement. This may provide an opportunity to enhance the breeding strategy.

Fig. 1 Dendrogram obtained with UPGMA method using the Nei and Li coefficient for 71 plants of the four populations (r = 0.729)

Fig. 2 Principal Coordinate Analysis (PCoA) of the molecular data, which explained 22% of the variation among populations

Fisher's post hoc test (α = 0.05) was performed and significant differences between columns were marked with a different letter

... analyses showed three distinct groups (Monte Clérigo, Comporta and Duna de Quiaios), and an overlap of Meco with Monte Clérigo and Comporta. The northern population (Duna de Quiaios) formed a different group from the three southern populations. However, the overlap between Meco and Monte Clérigo was unexpected, since the population geographically closest to Monte Clérigo was that of Comporta. A putative reason for this might be linked to seed dispersers: some studies indicate that seagulls are among the main dispersers of C. album (Calviño-Cancela 2002, 2004), and their capability to travel long distances could favour gene flow among populations (Calviño-Cancela 2011).

Fig. 3 Principal component analysis (PCA) showing the dispersion of the individuals of the four populations sampled, as well as the traits that had the biggest influence on each population. Vl = Plant volume, RFI = flower/inflorescence ratio, NLW = Number of leaves per two whorls, LAG = Length of annual growths, BN = Branching number, HFC = Number of fruits with more than 10.25 mm in diameter, PWt = Percentage of white fruits, RP = fruit dry/fresh weight ratio

Table 1
A Kruskal-Wallis test among the four populations, at a significance level of α = 0.05, was performed on all traits assessed. Fisher's post hoc test (α = 0.05) for mean separation was also conducted. The agricolae R package (De Mendiburu 2019) was used.

Table 3
Genetic diversity estimates of C. album populations including NB: number of bands; NPB: Number of polymorphic bands; PPB: Percentage of polymorphic bands; PIC:

Table 4
Summary table regarding morphological traits mean differences, in the four populations