2.5. Data analysis
The 20 qualitative traits were classified, and values were assigned to
each class in accordance with the survey results. The distribution
frequency of each class was also calculated. The Shannon diversity
index (I) was then calculated from the distribution frequencies as
follows:
\(I=-\sum_{i=1}^{n} p_i \ln p_i\),
where \(p_i\) represents the relative frequency of the ith phenotypic class of a trait (Kouam et al., 2018). The
maximum, minimum, average, standard deviation (SD), and coefficient of
variation (CV) of six quantitative traits were calculated using SPSS
25.0 software. Then, in accordance with the overall average
(\(\bar{x}\)) and SD (σ), the quantitative trait data
were divided into 10 levels, from the first level
[Xi < (\(\bar{x}\) - 2σ)] to the 10th level [Xi >
(\(\bar{x}\) + 2σ)], with each level spanning 0.5σ.
Principal component (PC) analysis was carried out with 26 phenotypic
indices on SPSS 25.0 software. In accordance with the phenotypic trait
survey data, a binary (1, 0) matrix was constructed: a trait recorded at its ith level was scored 1 for that level and 0 otherwise.
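The grading and diversity calculations described above can be illustrated with a minimal Python sketch. It is not the SPSS workflow used in the study. The function names (`shannon_index`, `grade`, `one_hot`), the toy measurements, the use of the sample SD, and the placement of boundary values in the higher level are all assumptions made for illustration.

```python
import math
from collections import Counter
from statistics import mean, stdev


def shannon_index(classes):
    """Shannon diversity index I = -sum(p_i * ln(p_i)) over the observed classes."""
    n = len(classes)
    counts = Counter(classes)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def grade(value, mean_x, sd):
    """Assign a level from 1 to 10 in steps of 0.5 SD.

    Level 1 is below mean - 2 SD and level 10 is at or above mean + 2 SD.
    Placing boundary values in the higher level is an assumption.
    """
    lower = mean_x - 2 * sd
    level = math.floor((value - lower) / (0.5 * sd)) + 2
    return max(1, min(10, level))


def one_hot(level, n_levels=10):
    """Binary (1, 0) row: 1 at the recorded level, 0 elsewhere."""
    return [1 if i == level else 0 for i in range(1, n_levels + 1)]


# Made-up measurements for one quantitative trait (not data from the study).
values = [4.1, 5.0, 5.2, 5.9, 6.3, 6.8, 7.4, 9.0]
m, s = mean(values), stdev(values)  # sample SD is an assumption
levels = [grade(v, m, s) for v in values]

print(levels)                           # level of each measurement
print(round(shannon_index(levels), 3))  # diversity of the graded trait
print(one_hot(levels[0]))               # binary row for the first measurement
```

For a qualitative trait, the recorded classes can be passed to `shannon_index` directly, without the grading step.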
The bands of SRAP and SSR markers were scored for each primer as present
(1) or absent (0) at each locus, and a binary matrix was
constructed for statistical analysis. The allele number
(Na), effective number of alleles
(Ne), allele frequency, Nei’s (1973) gene
diversity index (H), and I of each primer were calculated
on POPGENE software version 1.32 (Yeh et al., 1999). The genetic
similarity coefficient was calculated and principal coordinate
analysis (PCoA) was conducted using NTSYS-pc
software version 2.10e (Rohlf, 2000). Cluster analysis of the phenotypic
traits and molecular markers was performed with the unweighted
pair-group method with arithmetic means (UPGMA) in MEGA software
version 4.1 (Tamura et al., 2007).
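As a rough illustration of the per-locus statistics named above, the following Python sketch computes Nei's gene diversity and the effective number of alleles from a set of allele frequencies, together with a pairwise similarity between two binary band profiles. It does not reproduce POPGENE or NTSYS-pc. The function names are hypothetical, the allele frequencies are taken as given (how they are estimated from band data is not covered here), and the Jaccard coefficient is an assumption chosen only for illustration, since the text does not state which similarity coefficient was used.

```python
def gene_diversity(allele_freqs):
    """Nei's gene diversity H = 1 - sum(p_i^2) for one locus."""
    return 1.0 - sum(p * p for p in allele_freqs)


def effective_alleles(allele_freqs):
    """Effective number of alleles Ne = 1 / sum(p_i^2) for one locus."""
    return 1.0 / sum(p * p for p in allele_freqs)


def jaccard(profile_a, profile_b):
    """Jaccard similarity between two 1/0 band profiles (illustrative choice).

    Returns 0.0 when neither profile has any band, which is a convention
    adopted here, not something taken from the text.
    """
    shared = sum(1 for a, b in zip(profile_a, profile_b) if a == 1 and b == 1)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a == 1 or b == 1)
    return shared / either if either else 0.0


# Made-up inputs (not data from the study).
freqs = [0.5, 0.5]
print(gene_diversity(freqs))     # two equally frequent alleles
print(effective_alleles(freqs))
print(jaccard([1, 1, 0, 0], [1, 0, 1, 0]))
```

A matrix of such pairwise similarities, converted to distances, is the usual input for PCoA and UPGMA clustering.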
The combined SRAP and SSR data were analyzed with the Bayesian model
implemented in STRUCTURE software version 2.3.1 to infer population
structure (Pritchard et al., 2000). K
(number of clusters) was set to range from 2 to 10, and the
software was run three times to determine this value. STRUCTURE
HARVESTER (Earl and vonHoldt, 2012), which determines the best K on the
basis of the probability of the data given K and ΔK (Evanno et al., 2005),
was used to estimate the most likely number of clusters.
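The ΔK criterion can be sketched in a few lines of Python. This is an illustrative reading of the Evanno et al. (2005) statistic, not the STRUCTURE HARVESTER code: ΔK is taken as the absolute second-order difference of the mean log probability of the data, L(K), divided by the standard deviation of L(K) across replicate runs. The function name `delta_k`, the use of the sample standard deviation, and the toy log-probability values are assumptions. ΔK is undefined for the smallest and largest K tested, because each value needs both neighbors.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Return {K: delta K} from {K: [L(K) for each replicate run]}.

    delta K = |mean L(K+1) - 2 * mean L(K) + mean L(K-1)| / sd(L(K)).
    K values without both neighbors, or with zero SD, are skipped.
    """
    means = {k: mean(v) for k, v in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 in means and k + 1 in means:
            sd = stdev(lnp_by_k[k])  # sample SD across runs (assumption)
            if sd > 0:
                second_diff = means[k + 1] - 2 * means[k] + means[k - 1]
                result[k] = abs(second_diff) / sd
    return result


# Made-up L(K) values for two runs per K (not results from the study).
toy_runs = {
    1: [-100.0, -102.0],
    2: [-80.0, -82.0],
    3: [-50.0, -52.0],
    4: [-49.0, -51.0],
    5: [-48.0, -50.0],
}
dk = delta_k(toy_runs)
best_k = max(dk, key=dk.get)
print(dk)
print(best_k)
```

In this toy input, L(K) rises steeply up to K = 3 and then flattens, so the largest ΔK falls at the K where the curve bends.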