Country‐wide genetic monitoring over 21 years reveals lag in genetic recovery despite spatial connectivity in an expanding carnivore (Eurasian otter, Lutra lutra) population

Abstract Numerous terrestrial mammal species have experienced extensive population declines during past centuries, due largely to anthropogenic pressures. For some species, including the Eurasian otter (Lutra lutra), environmental and legal protection has more recently led to population growth and recolonization of parts of their historic ranges. While heralded as conservation success, only few such recoveries have been examined from a genetic perspective, i.e. whether genetic variability and connectivity have been restored. We here use large‐scale and long‐term genetic monitoring data from UK otters, whose population underwent a well‐documented population decline between the 1950s and 1970s, to explore the dynamics of a population re‐expansion over a 21‐year period. We genotyped otters from across Wales and England at five time points between 1994 and 2014 using 15 microsatellite loci. We used this combination of long‐term temporal and large‐scale spatial sampling to evaluate 3 hypotheses relating to genetic recovery that (i) gene flow between subpopulations would increase over time, (ii) genetic diversity of previously isolated populations would increase and that (iii) genetic structuring would weaken over time. Although we found an increase in inter‐regional gene flow and admixture levels among subpopulations, there was no significant temporal change in either heterozygosity or allelic richness. Genetic structuring among the main subpopulations hence remained strong and showed a clear historical continuity. These findings highlight an underappreciated aspect of population recovery of endangered species: that genetic recovery may often lag behind the processes of spatial and demographic recovery. In other words, the restoration of the physical connectivity of populations does not necessarily lead to genetic connectivity. Our findings emphasize the need for genetic data as an integral part of conservation monitoring, to enable the potential vulnerability of populations to be evaluated.

have been restored. We here use large-scale and long-term genetic monitoring data from UK otters, whose population underwent a well-documented population decline between the 1950s and 1970s, to explore the dynamics of a population re-expansion over a 21-year period. We genotyped otters from across Wales and England at five time points between 1994 and 2014 using 15 microsatellite loci. We used this combination of long-term temporal and large-scale spatial sampling to evaluate 3 hypotheses relating to genetic recovery that (i) gene flow between subpopulations would increase over time, (ii) genetic diversity of previously isolated populations would increase and that (iii) genetic structuring would weaken over time. Although we found an increase in inter-regional gene flow and admixture levels among subpopulations, there was no significant temporal change in either heterozygosity or allelic richness. Genetic structuring among the main subpopulations hence remained strong and showed a clear historical continuity. These findings highlight an underappreciated aspect of population recovery of endangered species: that genetic recovery may often lag behind the processes of spatial and demographic recovery. In other words, the restoration of the physical connectivity of populations does not necessarily lead to genetic connectivity.
Our findings emphasize the need for genetic data as an integral part of conservation monitoring, to enable the potential vulnerability of populations to be evaluated.

K E Y W O R D S
gene flow, genetic monitoring, microsatellites, population genetics, population recovery, time lag

| INTRODUC TI ON
Many large mammal species have experienced population declines during past centuries, largely due to anthropogenic causes (Cardillo et al., 2005). Some of these declines are now being reversed due to protective legislation, and population recoveries are being observed (Chapron et al., 2014). These declines and subsequent re-expansions into historically occupied ranges provide natural experiments to measure how genetic diversity and structure change both temporally and spatially during population growth, allowing for the testing of theoretical predictions from simulation studies (Hagen et al., 2015).
Theoretical studies have shown that population expansions are likely to be accompanied by changes in genetic diversity (Excoffier & Ray, 2008), which may differ from the changes caused by demographic growth alone (Excoffier et al., 2009). During population expansions, sequential founder events can cause unusual phenomena, including reduced genetic diversity at the wavefront due to random genetic drift, combined with the relative isolation of the founder individuals from the rest of the population (Excoffier & Ray, 2008).
As populations continue to expand, connectivity may be established between previously isolated demes. If this spatial connectivity results in effective gene flow, reductions in spatial population structuring between differentiated groups should be observed, resulting in increased genetic diversity within groups (Ibrahim et al., 1996;Excoffier et al., 2009). Different population genetic metrics will respond with different speeds to such re-establishment of gene flow.
A leading, i.e. early indicator of such genetic recovery is allelic richness, while heterozygosity-based statistics will respond more slowly and thus are lagging indicators (Nei, 1987).
Wildlife populations that have experienced large declines are often fragmented in the process, leading to differentiation through genetic drift (Manel et al., 2004;Rueness et al., 2003). Subsequent recovery of these populations through expansion is therefore predicted to reduce spatial genetic structure through gene flow by reestablishing connectivity between previously isolated genetically differentiated units (Hagen et al., 2015). However, detecting this connectivity requires temporal sampling of the population to determine changes in gene flow, genetic differentiation and genetic diversity between and within subpopulations over time. Such information quantifying the dynamics of range expansions is vital in order to make predictions about the nature of population recoveries, and to develop and monitor progress towards realistic management goals.
The Eurasian otter (Lutra lutra; henceforth referred to as otter) is a largely piscivorous mesocarnivore, which feeds both in freshwater and coastal habitat (Kruuk, 1995). Otters in the United Kingdom are undergoing a population expansion, recovering from a welldocumented large-scale decline that occurred in the second half of the 20th century (Crawford, 2010;Strachan, 2015). This decline was largely attributed to the use and bioaccumulation of pesticides and other industrial compounds such as dieldrin and polychlorinated biphenyls (PCBs), although habitat loss and direct persecution through hunting may also have contributed (Conroy & Chanin, 2000;Mason & Macdonald, 2004). Resulting population declines since the 1950s led to the extinction of otters across large areas of England ( Figure S1), and previous genetic analyses have shown strong northsouth genetic differentiation, as well as several genetically distinct subpopulations (Hobbs et al., 2011;Stanton et al., 2014). This structuring likely reflects both remnant populations, which survived the decline (strongholds in South West England, Wales and Scotland), and past reinforcement programmes (in northeast and southern England; Hobbs et al., 2011).
A wide range of threats pose significant, new and persistent challenges to freshwater biodiversity (Reid et al., 2018), including both direct and indirect threats to otters (O'Rourke et al., 2022). Despite these challenges, national otter survey data (Mathews et al., 2018) indicates that the UK otter population has been expanding both in range and population size since the 1980s ( Figure S1). This population recovery is attributed to legislative protection for both otters (Wildlife and Countryside Act, 1981) and their freshwater habitats (EC Habitats Directive, 1992), and reduction in environmental pollution . The spatial and demographic recovery evidenced by national otter survey data has resulted in an apparently spatially contiguous population in the UK, with previously isolated subpopulations now re-joined (Mathews et al., 2018). It remains unclear, however, to what degree this apparent spatial contiguity might have translated into genetically connected populations. Using samples and data collected over a twenty-one-year period  during population recovery, we focus on the otter as an ideal case study with which to test the following hypotheses: 1. Genetic structuring across the study area will weaken over time as demographic population recovery and spatial expansion proceed, reconnecting previously isolated subpopulations.
2. Gene flow between subpopulations/regions will increase over time with increased contact due to range expansions.
3. Genetic diversity of previously isolated subpopulations will increase, as gene flow between neighbouring subpopulations allows the influx of new alleles and changes allele frequencies of standing genetic variation within populations.

| Samplecollectionandselection
Road-killed otter carcasses from across Wales and England were collected and their locations recorded as part of the national Cardiff University Otter Project programme (www.cardi ff.ac.uk/otter -project). Muscle samples were taken from the hind leg and stored in ethanol at −20°C.
This maximized the overall temporal span (within the constraint of samples available to the Cardiff University Otter Project at the time of analysis) while providing a sufficient number of discrete time points to facilitate the interpretation of trends. A minimum of 3 years separation between time points ensured appropriate temporal independence, based on a typically assumed otter generation time of 3 years (Randi et al., 2003; although more conservative models suggest 7.6 years; calculated according to Pacifici et al., 2013). Earlier temporal subsets were pooled across three consecutive years due to insufficient sample availability within individual years. For time points up to 2004, genotype data were used from a previous study (Hobbs et al., 2011(Hobbs et al., ), including 1993(Hobbs et al., -5 (n = 28), 1998(Hobbs et al., -2000 and 2004 (n = 79). For 2009, 97 samples were genotyped and 7 existing genotypes were included from Stanton et al., 2014;from 2014, 96 samples were genotyped (see Table S2 for data allocation to each study).
When selecting the additional samples to genotype for this study, to avoid spatial pseudoreplication, which might arise through truly random sampling, we randomly selected one sample from every 20 km grid square in which at least one dead otter had been collected that year. We chose a 20 km grid based on otter range size, for which estimates vary between 7.5 km ± 1.5 km (Néill et al., 2009) and 20-30 km (Erlinge, 1968). The resulting dataset comprised 407 individuals, including 236 males, 168 females and three individuals where neither sex nor age class could be determined (males: 147 adults, 78 subadults and 10 juveniles along with one individual of unknown age class; females: 91 adults, 64 subadults and 10 juveniles, along with three individuals of unknown age class). Due to our use of (primarily) roadkill as a sampling mechanism, it is possible that our dataset is biased toward younger male otters, relative to the wild population (Philcox et al., 1999), although without any means of independent verification of population demography, this is not possible to ascertain.
Each sample was mapped as a point location in ESRI ArcGIS 10.3 (Figure 1), and each point allocated to a River Basin District (as specified in the Water Framework Directive Cycle 2, Environment Agency, 2015; Natural Resources Wales, 2015, which are based on groupings of river catchments). Otters typically occupy linear home ranges along freshwater habitats such as rivers (Kruuk, 1995), hence we chose watershed-based spatial aggregations of data as likely to provide an ecologically relevant unit. Aggregation of data within smaller (e.g. catchment-based) units provided insufficient sample size for some of our analyses. RBDs provided a suitable sample size in most areas, although aggregation of adjacent RBDs was required in some areas to ensure adequate sample size for certain analyses (Table 1). We defined these spatial units as 'RBD regions' and combined these with the temporal data (sampling year) to yield 23 spatial-temporal groupings (STGs) for further analysis.
A subset of 14 previously genotyped samples (Hobbs et al., 2011) was also re-analysed to allow calibration of fragment size scoring between the three studies: Stanton et al. (2014) calibrated their data to Hobbs et al. (2011), therefore calibration using the 14 samples from Hobbs et al. (2011) was sufficient to bring all three sets of data together. Despite the known challenges of combining microsatellite datasets (Ellis et al., 2011), this procedure enabled adjustment of allele sizes to account for variation between sequencing platforms and scorers, allowing the dataset to be analysed as a whole (see Section 3 for details).

| Geneticvariabilitybylocus,within spatiotemporalgroupings(STGs)ofRiverBasin District(RBD)regions
Genotyping errors and null allele frequencies were estimated using Microchecker V2.2.3 (Van Oosterhout et al., 2004). Each spatialtemporal grouping (STG) was run independently, and loci were only removed if they were identified as having null alleles in the majority of regions at a specific time point (D'Urban Jackson et al., 2017). Null alleles are a potential source of bias during the estimation of population differentiation (F ST ) therefore it is important to identify them if present. An exact test of Hardy-Weinberg equilibrium (HWE) and a test for linkage disequilibrium (LD) between all pairs of loci were conducted in ARLEQUIN 3.5 (Excoffier & Lischer, 2010). Deviation from HWE was estimated using 1,000,000 Markov chain steps and 100,000 dememorisation steps. Linkage disequilibrium between loci was estimated using 10,000 permutations with the number of random initial conditions set to 2 and a significance level of 0.05. These analyses were run on the whole dataset (i.e. for all STGs combined) initially and then repeated on the samples in each STG individually, for LD only time points 2009 and 2014 were included in the analysis as a representative sample as for true LD one would expect the same result at any given time point and these data had full spatial coverage over the study area and the largest sample size.
Genetic diversity per locus was estimated using MICRO-SATELLITE TOOLKIT (Park, 2001). Unbiased expected heterozygosity (Nei, 1987), observed heterozygosity and number of alleles per locus were calculated along with the mean of each of these statistics.
A paired 2-sample t-test was used to test for a significant difference between the expected and observed heterozygosity using R studio (R version 3.4.3, R Core Team, 2017).
Genetic diversity was estimated for each STG using multiple diversity statistics. We used MICROSATELLITE TOOLKIT (Park, 2001) to estimate an unbiased estimator of expected heterozygosity, based on Nei's unbiased gene diversity (Nei, 1987) and observed heterozygosity including standard deviations. Expected heterozygosity and observed heterozygosity are more robust to sample size changes, when the sample size is greater than 5-10 individuals (a threshold that was met in the majority, 21/23 for N > 5 and 19/23 for N > 10 of our STGs); however, it is particularly important that standard deviations are reported and considered alongside estimates for smaller sample sizes (Pruett & Winker, 2008). HP-RARE v1.1 (Kalinowski, 2005) was used to estimate allelic richness (A r ) and private allelic richness (pA r ) for each STG; initially, this was calculated using the smallest sample size at each individual time point, respectively. As sample size fluctuated both between regions and time points, we considered the bias that these fluctuations might introduce into each analysis. For analyses that were sensitive to such data fluctuations (e.g. allelic richness), we used the resampling approach implemented in HP-RARE where the resample size was set to the smallest sample size in any RBD region for that year (Pruett & Winker, 2008) to allow comparisons to be made across all RBD regions at a particular time point. A second analysis was run where each dataset was resampled at the smallest sample size across all time points (N = 6) to account for sample size bias in the estimations and allow comparisons to be made across all STGs. Additional statistical analyses of genetic diversity estimates were carried out using R; correlations between both allelic richness and private allelic richness under both resampling regimes were assessed using Kendall's Tau (nonparametric rank correlation method that allows for tied values), as was the relationship between observed heterozygosity and RBD region land area.
The inbreeding coefficient (F IS ) was initially estimated based on all individuals from each STG, using FSTAT version 2.9.3.2 (Goudet, 1995), and tested for significant deviation from 0 using a 1-sample T-test in R with correction for false discovery rate (FDR) F I G U R E 1 Genetic clusters (K2-5) identified in UK otters sampled across 1999-2014 using a Bayesian approach in structure. Circles show the location of each otter in the dataset and the colours indicate the proportion of each genetic cluster to each individual belongs to. For K = 4 and K = 5, a denotes the major mode and b the minor mode where not all 10 repeat runs agreed, see Figure S4 for the number of runs attributed to each mode by CLUMPP.
using the Benjamini and Hochberg (1995) method to account for multiple testing. Population structure and admixture can affect F IS estimates, however. For example, if individuals from more than one genetic cluster are analysed as a single population, a deficiency of heterozygotes is likely to be observed due to violation of the Hardy-Weinberg equilibrium (HWE) assumption of a single, randomly mating population (Hardy, 1908;Waples, 2015;Weinberg, 1908). This deficiency produces a positive F IS value and thus a false indication of inbreeding, and is termed the 'Wahlund effect' (Wahlund, 1928). In order to distinguish between the Wahlund effect and actual, recent population-level inbreeding, we therefore recalculated F IS using only individuals identified as belonging to the dominant genetic cluster for the STG, defined as follows. First, cluster assignments and admixture proportions were calculated for each STG using STRUCTURE 2.3.4 (Pritchard et al., 2000), at the smallest value of K, which maximized global likelihood (Kalinowski, 2011). For each STG, we then identified individuals as either belonging to the dominant cluster or to one of the two minor clusters, based on their proportional allocation to each. Individuals not belonging to the major cluster were excluded from the recalculated value of F IS , and a comparison of the two F IS estimates allowed us to highlight any results potentially explained by Wahlund effects. Individuals were considered 'admixed' if their assignment proportion was less than 0.8 to a single cluster (Heppenheimer et al., 2018;Rutledge et al., 2010), remaining individuals were considered 'nonadmixed'.
To statistically explore variation in genetic diversity we tested for differences between regions and years in several measures: individual H o , A r (calculated within each STG, and resampled at N = 6, see above) and individual admixture proportions (using the largest assignment at K = 3 as our indicator), with each measure in turn assigned as the dependent variable. To ensure that our models were not confounded by uneven sampling across region/year groups, we used a GLM approach including both RBD region and year as independent terms and excluded 1994 due to data deficiency. For our models of individual H o and individual admixture proportions, we also included Sex as a categorical variable, to test for sex bias.
In preliminary testing for H o and admixture, we incorporated the year:RBD region interaction to evaluate whether putative change over time differed between regions, but in all cases, the interaction was nonsignificant (p > 0.05) and in order to simplify model reporting this term was removed from all starting models. For all GLMs, model fit was evaluated by exploration of residuals to check assumptions of normality, homogeneity of variance and absence of leverage.
Tukey tests were applied post hoc to test pairwise differences. For models of H o and A r , a Gaussian model with an identity link met all model assumptions. Models of admixture failed to meet necessary assumptions and no suitable error family/link function combination was found. We therefore used a univariate approach, testing region, year and sex separately using a Kruskal-Wallis test for differences in medians with region or sex, and a Kendall rank correlation to test for trend over time. Because our univariate tests were unable to control for uneven sampling within the model, we tested for year differences within regions and tested for regional differences within years. Any groups with N < 6 were excluded (see Table 3).

| Populationstructureandgeneflow
Population differentiation was estimated between STGs in ARLEQUIN 3.5, using the pairwise F ST estimator by Weir and Cockerham (1984) with the 'number of different alleles (F ST -like)' option and 10,000 permutations and significance level set at 0.05.
Analysis of Molecular Variance (AMOVA) was also performed in ARLEQUIN 3.5, with 10,000 permutations for significance and 1000 permutations for mantel test, using the 'number of different alleles (F ST -like)' option for molecular distance.
We tested for isolation by distance and spatial autocorrelation using pairwise matrices of land distance and genetic distance estimates for each time point. Land distance was calculated using the GDISTANCE package in R to determine the least-cost pathway between each pair of otters at each respective time point from a rasterised map of Great Britain with cell size set to 1 km 2 . On an irregularly shaped island like Great Britain, with multiple peninsulas separated by sea, land distance was deemed a more realistic measure of physical space between otters than Euclidean distance. Each land cell was given a resistance of 1 (sea cells were classified as 'NoData'), such that the least-cost resistance estimates calculated reflected the land distance between each pair of otters. For genetic distance, the proportion of shared alleles was calculated between pairs of otters using GenAlEx 6.5 (Peakall & Smouse, 2012). Mantel tests and Mantel correlograms were performed using the package VEGAN in R using Pearson's correlations and 10,000 permutations with an α = 0.05. The Holm correction for testing multiple p-values was used for the Mantel correlograms, and breakpoints for distance classes were set to every 50 km from 0 to 800 km. Potential sex differences TA B L E 1 Geographic regions used in genetic diversity and differentiation analysis of UK Eurasian otters (Lutra lutra).  , 2015), N is the number of otters genotyped and Land Area is the total km 2 of land within an RBD region (no Land Area is provided for the 'other' RBD region as it covers a vast and unconnected area of land with very few samples).

Landarea (km 2 )
in isolation by distance and spatial autocorrelation were additionally evaluated by applying the same testing to subsets of the data including only adult females, and only adult males.
Gene flow between STGs was estimated using two different methods, for four time points spanning 1999-2014 (the first time point, 1994, was omitted due to a small sample size and lack of geographic coverage across the whole study region). Firstly, GENEPOP v4.6 (Rousset, 2008) was used to estimate the effective number of migrants (Nm), corrected for sample size, using the private alleles method developed by Barton and Slatkin (1986), which should be most sensitive to recent migration due to the rare nature of private alleles (Yamamichi & Innan, 2012). Nm was estimated across the whole dataset and pairwise between all STGs. These results will be referred to as nondirectional migration. Secondly, BayesAss v3.0 (Wilson & Rannala, 2003)  ing points) with 10,000,000 MCMC iterations following a burn-in of 1,000,000 MCMC iterations and a sample interval of 5000. Trace output files were recorded and used to monitor for mixing and convergence using TRACER v1.7.1 (Rambaut & Drummond, 2003) and the three runs were averaged to obtain the percentage of migrants between each region in a pairwise fashion. Migration rates between the Western Wales and Severn regions and the Northern and Eastern regions, respectively, were unlikely to provide reliable results as the pairwise F ST between these regions was <0.05 and as such the results for these pairwise estimates were discarded (Faubet et al., 2007).
To determine the extent of population structure within UK otters we used two complementary approaches, one parametric and one nonparametric. The first was a Bayesian clustering algorithm implemented in STRUCTURE 2.3.4 (Pritchard et al., 2000), with no location prior, using the admixture model with correlated allele frequencies. All samples, regardless of collection year, were run together as one dataset for K = 1 to K = 13, with a burn-in of 100,000 followed by 1,000,000 MCMC steps, running 10 replicates for each value of K. We chose K = 13 as the maximum K value based on the 11 river basin districts in Wales and England, plus Ireland and Scotland. The results were summarized using STRUCTURE HARVESTER v 0.6.94 (Earl & vonHoldt, 2012), and we used the method by Evanno et al. (2005) and the likelihood of K (Pritchard & Wen, 2003) to explore the most likely number of clusters present in the data. Individual admixture proportions were averaged across the 10 runs for each K using CLUMPAK using default parameters (Kopelman et al., 2015). Based on a cut-off of 0.8 for the proportion of assignment to a specific cluster, we determined the number of 'admixed' individuals (Heppenheimer et al., 2018;Rutledge et al., 2010) at each time point for each value of K (i.e. any individual with less than 0.8 assignment to a single cluster was considered admixed). We used these data to quantify the percentage of admixed individuals across all clusters in the dataset at each time point. We note that a cut-off at 0.8 is commonly used ( West), based on data for all regions. As before, a burn-in of 100,000 followed by 1,000,000 MCMC steps was used, restricting K to K = 2 for 5 replicate runs. For each run, individuals were assigned to one of the two clusters based on >50% assignment and each cluster went through another round of partitioning at K = 2 until the assignment of individuals was ~50% to each cluster.

| GeneticvariabilitybylocusandRiverBasin district(RBD)region
Full genotypes (for 15 microsatellite loci) were obtained for all 193 samples newly analysed in this study (100% genotyping success rate), and re-analysis of the 14 calibration samples allowed an additional 214 genotypes selected from previous studies (Hobbs et al., 2011;Stanton et al., 2014) to be adjusted such that the datasets could be merged and analysed as one. Across years, none of the loci showed significant evidence (p < 0.05) of null alleles at >2 of the 5 geographic regions, and therefore, all 15 loci were retained for further analysis.
All 15 loci were polymorphic, with the number of alleles per locus ranging from 6 to 11, and observed heterozygosity per locus from 0.40 to 0.70 (Table 2).
Observed heterozygosity was consistently and significantly smaller than expected heterozygosity across all loci (paired twosample t 14 = 11.219, p < 0.001; Table 2), potentially indicating population substructuring (Jin & Chakraborty, 1995 although the latter estimate was based only on two individuals and therefore should be treated with caution, see Table 3).  Figure 2 for Region and Year comparison).
Other measures of diversity followed a similar spatial pattern, of higher genetic diversity in the Eastern than the Western RBD regions ( The F IS estimates suggested significant inbreeding in the majority of RBD regions at certain time points, but many incidents of apparent inbreeding were likely due to the Wahlund effect (annotated with i and w, respectively, in Table 3, with additional detail in Table S3).

| Populationstructureandgeneflow
Tests Results from all explored values of K are shown in Figure S4. The percentage of individuals considered genetically admixed between clusters (Q < 0.8) increased over time (i.e. suggesting increased gene flow between areas in the UK) from 4% in 1994 (at K = 2, K = 3, K = 4a and K = 5b) to 16%, 17%, 24% and 23% in 2014 (at K = 2, K = 3 K = 4a and K = 5b, respectively). The alternate modes of K = 4 and K = 5 also showed increased admixture over time, albeit from a different starting point due to the partitioning of the Welsh cluster into two groups (for detail see Table S5). Correlation analysis for values at K = 3 suggests that increases in admixture were significant in Eastern, South West and Western Wales RBD regions (Kendall rank correlation: p = 0.029, 0.036, 0.029, respectively), whereas trends in Northern and Severn RBD regions were not significant (p > 0.05; for detail see Figure S6). Differences in admixture between RBD regions were only significant in 1999 (Kruskal Wallis chi-square = 9.632, p = 0.025, when admixture was higher in Northern than in other regions); in all other years, admixture showed no difference between regions (p > 0.05). Differences in the degree of individual admixture between the sexes were also not significant (p > 0.05).

The number of genetic clusters in the data as inferred by
Discriminant Analysis of Principal Components (DAPC) using the lowest BIC score as the model evaluation criterion was K = 8; however, these clusters were very difficult to rationalize biologically or spatially (data not shown) and thus we explored other values of K, using the gradient of the decline between BIC values as a guide.
Comparison of the clusters inferred by DAPC ( Figure S7) with those inferred from STRUCTURE ( Figure 1) show close agreement for K = 2 and K = 3, and agreement with the major STRUCTURE modes for K = 4 and K = 5. This indicates that parametric and nonparametric methods agree for the strongest genetic divisions within the overall population (here strong agreement up to K = 3) but that assignments start to differ between methods as genetic divisions weaken (i.e. at higher values of K).

| DISCUSS ION
Our study represents comprehensive population genetic tracking of a recovering carnivore population over a twenty-one-year period.
Quantifying changes in genetic structure over time allowed us to test theoretical predictions about demographic population recovery and ensuing genetic changes. Using the Eurasian otter as our case study, we predicted a weakening of genetic structure as anthropogenically  (Weir & Cockerham, 1984). Asterisks indicate significant deviation from 0 after FDR correction (*p < 0.05, **p < 0.01); w indicates likely Wahlund effect whereas i indicates inbreeding (based on comparison with F IS2 , see Table S3 for detail). A r and pA r are reported twice, once with rarefaction based on the smallest sample size at each time point (excluding sample sizes <5), and once using n = 6 (12 genes). Data in italics derive from small sample sizes and should hence be treated with caution, na's indicate that a particular statistic could not be calculated for that sample size. Underlined years indicate that the time point includes the year stated ±1 year, to increase the sample size. Samples from Ireland or Scotland are not included here due to a lack of temporal coverage.
fragmented populations reconnected as part of population recovery. Our use (predominantly) of roadkill samples may have biased our dataset toward younger males, which are assumed to disperse longer distances than other demographics (Philcox et al., 1999). This could lead to an overestimation of connectivity and diversity, but we found that the overarching population genetic structure showed surprisingly little change over time, with no increase in genetic variability (H o and A r ) within regions despite increased gene flow and admixture. These results suggest that the UK otter population is less functionally connected than previously presumed, perhaps due to landscape or other ecological/behavioural barriers impeding effective genetic mixing. Understanding such limitations in recovering populations is important with respect to managing and quantifying conservation successes.

| Geneticvariability
Spatial differences in both H o and A r were highly significant, with higher genetic variability in Eastern and Northern than in Western regions (Severn, South West and Western Wales) that persisted across the 21-year study period.
In the absence of historic data predating the otter population crash, it is difficult to determine whether these spatial differences in diversity are historic or recent. However, translocations of otters into the Eastern and Northern regions are likely to have contributed to higher diversity in these regions. Population reinforcement projects have occurred in both regions, during which otters were released to bolster very small, residual populations that were deemed unlikely to recover without intervention. In the Northern region, 25 otters were released between 1990 and 1993; all were wild-born otters rehabilitated following injury or abandonment and were translocated from widely dispersed UK locations to the Derwent and Esk catchments in Yorkshire (to the northeast of the Northern region; White et al., 2003). In the Eastern region, 117 otters were released between 1983 and 1999; all were captive bred in the UK, and most were released in East Anglia (to the east of the Eastern region), although later releases were also made on the upper Thames catchment (on the western edge of the Eastern region; Bonesi et al., 2013;Jefferies et al., 2000;Wayre, 1985).
Although details of breeding stock are unknown, it appears likely that one of the utilized breeding lines may have included some Eurasian otters of non-UK origin (Hájková et al., 2007). TA B L E 4 Pairwise F ST estimates (Weir & Cockerham, 1984) between RBD regions between 1999 and 2014. Note: n/a: not enough data to estimate F ST at that RBD-year combination. All values were significant (p < 0.05) based on permutation tests of population differentiation in Arlequin (10,000 permutations), except where noted (n.s.). Bottom rows: global differentiation estimated from AMOVA, along with the number of RBD regions (# RBD regions) included in the analysis.
has not resulted in significant mixing between these and adjacent regions.
Differences in the number of loci deviating from HWE and in LD-when calculated for the whole dataset versus by RBD regionare likely due to the Wahlund effect, i.e. an excess of homozygotes being observed due to sampling from a structured population, treating the whole dataset as one when subdivisions exist. A further important consequence of this Wahlund effect is allelic associations between loci across the total population, which can lead to signals of LD (Garnier-Géré & Chikhi, 2013). If the signal reflected physical linkage between pairs of loci, then it would be expected to occur across all regions, whereas the patterns observed suggest other processes are occurring which emulate true physical linkage. One alternative explanation could be genetic admixture between regions. Admixture linkage disequilibrium has been described in unlinked loci due to differing allele frequencies in parental populations when there is gene flow between genetically distinct demes (Pfaff et al., 2001;Rybicki et al., 2002). Several other studies across Europe have used similar methods to estimate the genetic diversity and population structure of otters at various spatial scales. In our study, across all samples, the average number of alleles (N A ) was high compared with other European studies, whereas observed heterozygosity was relatively low (Lanszki et al., 2008;Mucci et al., 2010). This difference could be due to populations is similar to that observed within population strongholds in the fragmented French otter population (Pigneur et al., 2019). The only other study in Europe with a N A higher than ours is Arrendal et al. (2004), which similarly sampled across the whole of Sweden (and part of Norway), including populations that were highly genetically distinct from one another and that had received introductions.
When split by spatial-temporal grouping (STG), N A showed higher diversity in eastern regions than western ones within our study, but the overall range of values was within that for otter populations in countries across Europe (Mucci et al., 2010).

| Populationstructureandgeneflow
Both clustering analyses of population structure (STRUCTURE and DAPC) showed that otters in the United Kingdom have maintained a highly genetically structured population despite increased connectivity between subpopulations through range expansion. At higher values of K, the two methods deviated slightly in their spatial-clustering pattern, which could be due to the assumptions made of Hardy-Weinberg equilibrium and/or linkage equilibrium in the Bayesian clustering model implemented in STRUCTURE, which does not exist in the DAPC model (Jombart et al., 2009). For both approaches, the inferred genetic clusters reflect known otter stronghold populations and agree with previous studies on the genetic structure of the UK otters (Hobbs et al., 2011;Stanton et al., 2014). This maintenance of population structure over time is also reflected in the predominantly high and temporally consistent pairwise F ST estimates between most regions across all time points, in contrast to global F ST , which decreased over time. This could be due to the difference in the way that the two estimates are calculated (with pairwise F ST using allele frequencies and our implemented AMOVA taking into account size distances among alleles; Excoffier et al., 1992) and also the larger There are few empirical studies that explicitly quantify changes in genetic structure over time during contemporary population expansions, and therefore, the body of work with which to compare our results is limited. However, the maintenance of genetic structuring seen in UK otters over more than 20 years is somewhat unexpected.
For example, the rapid disintegration of genetic structuring was   (Hindrikson et al., 2017) and central European lynx (Mueller et al., 2022).
There are several factors that could contribute to the long-term maintenance of population structure in UK otters. Firstly, the landscape in the United Kingdom has changed significantly over the 40 years that otters were absent from large areas. Increased urbanization, human population size, number of roads, volume of traffic and alterations to rivers could all affect the realized connectivity between subpopulations. However, there is also a lag time for the genetic signatures left by old barriers to disappear and those created by new barriers to become detectable (Landguth et al., 2010). When assessing the effects of barriers to gene flow, the number of generations since (positive or negative) changes in barrier permeability need to be taken into account, as well as the dispersal ability of the species, as both of these factors have been shown to affect the rate at which such genetic signatures change (Landguth et al., 2010). It is assumed that otters can disperse over relatively large distances, as several tracking studies have recorded individuals moving tens of kilometres in one night (Green et al., 1984;Jenkins, 1980;Quaglietta et al., 2013), therefore dispersal limitation should be less likely in otters than in other less mobile species. Secondly, isolation in fragmented populations may have also affected other aspects of otter ecology through genetic drift-for example, Kean et al. (2017) found regionally differentiated scent odour profiles in the UK, which reflected genetic structure within the population. The discovery of these dialects is important given that otters communicate predominantly by scent material left in their spraint (Trowbridge, 1983).
Scent differences apparently signalling age, reproductive status, sex and even individual identity have previously been described (Kean et al., 2011). These differing regional scent 'dialects' could be restricting gene flow, if otters are preferentially mating with otters of similar scent. Under this scenario, intrinsic aspects of otter behaviour may contribute to the maintenance of genetic structuring, despite the re-establishment of spatial connectivity.
The results from our spatial autocorrelation analysis showed similar patterns at each time point (negative spatial autocorrelation in distance classes at small spatial scales, suggesting local dispersal).
Given the territorial nature of adult otters (Erlinge, 1968), it would be expected that as a local population grows and reaches carrying capacity, individuals would need to move further to establish their own territory (Sjöåsen, 1997), leading to a gradual increase in the distance over which we observe negative spatial autocorrelation.
Instead, our results showed no change over time, suggesting that even though populations are assumed to be approaching carrying capacity in some areas, dispersal distance has remained largely unchanged. A lack of density-dependent dispersal in otters might explain this result but seems unlikely given that this phenomenon has been observed in over 70% of mammal species studied (reviewed in Matthysen, 2005). Furthermore, density-dependent dispersal was also proposed as a mechanism for the short dispersal distances detected by Quaglietta et al. (2013) in otters in Portugal. An alternative explanation for our findings could be that improvements in river water and habitat quality have led to increased local carrying capacities, for instance through the recovery of prey populations (e.g. brown trout, Monteith et al., 2005). At higher prey densities, otter range size may decrease (Néill et al., 2009;Sidorovich, 1991), as has been observed for some other carnivore species such as the Eurasian lynx (Herfindal et al., 2005), countering increased dispersal distances associated with expanding populations. Our analysis of adult females and adult males suggests that both sexes disperse, as they exhibit similar patterns of negative spatial autocorrelation in the smaller distance classes (<150 km). These findings are contrary to previous evidence from radio tracking (which suggested male-biased dispersal in Eurasian otters; Quaglietta et al., 2013) but are similar to the limit of gene flow detected for both sexes in Scotland by Dallas et al. (2002).
We note, however, that our spatial autocorrelation analyses represent a relatively simplistic approach to explore spatial patterns of relatedness. Given that the landscape across Great Britain is highly heterogeneous, further analysis using a landscape genetics approach utilizing habitat-specific resistance values should be undertaken to further elucidate otter dispersal in the UK.

| CON CLUS IONS
Between surveys carried out in the late 1970s and 2010, the otter population in Great Britain expanded from small stronghold populations and near extinction in some areas (i.e. East Anglia) to an almost continuous distribution (Crawford, 2010;Strachan, 2015).
This large-scale population recovery was proclaimed a success for policy and practice, where changes in pollution control and the improvement of river and riparian habitats (Crawford, 2010) have supported a largely natural population expansion. Our study shows that this spatial connectivity has not translated into genetic connectivity in the manner or speed expected, with the population having retained the strong spatial genetic structuring observed earlier on in the recovery process (Hobbs et al., 2011;Stanton et al., 2014). The appearance of a spatially continuous population may therefore give a false sense of security with respect to genetic robustness, since in fact the population remains vulnerable, comprised of genetically fragmented subpopulations (Reed, 2004). Given the overall increasing levels of gene flow and admixture seen in this study, it may be that more time is all that is needed for the achieved spatial connectivity of the otter population to translate into genetic mixing. Future analysis of the data using landscape genetic techniques (Manel et al., 2003) or esti- Our findings illustrate that spatial recovery of formerly endangered species may not necessarily imply that genetic recovery has occurred as well. Genetic recovery may require much longer than is apparent from spatial data alone leaving populations more vulnerable than they first appear. Newly published evidence from the 6th Otter Survey for Wales  has detected declines in otter signs across Wales, for the first time since surveys began. These new signals of potential population decline in Wales, alongside the lack of genetic recovery of otters across Wales and England illustrated by the current study, reinforce the value of population monitoring programmes that explicitly include genetic monitoring. Importantly, such studies are likely to provide a cornerstone for genetic monitoring programmes needed for the post-2020 CBD biodiversity monitoring framework, wherein genetic diversity will be included for the first time (Hoban et al., 2021).

ACK N OWLED G M ENTS
We thank the past and present Cardiff University Molecular Ecology

CO N FLI C TO FI NTE R E S T
The authors declare no conflict of interest.

DATAAVA I L A B I L I T YS TAT E M E N T
A csv file of the raw genotype data with population information has been made available in the Supporting Information. See also the