Using spatially explicit models to improve sampling of rare marine species: behavior and ecology drive depth distribution of rare species eDNA in pelagic marine environments

Author

Amy M. Van Cise1*, Eiren K. Jacobson2, Megan R. Shaffer3, Pedro F.P. Brandão Dias3, David L. Miller4,5, Brice X. Semmens6, Len Thomas2, Ryan P. Kelly3, A. Ole Shelton7, Krista M. Nichols7, Kim M. Parsons7

  1. School of Aquatic and Fishery Sciences, University of Washington, Seattle, WA, USA
  2. Centre for Research into Ecological and Environmental Modelling, School of Mathematics and Statistics, University of St Andrews, St Andrews, Scotland
  3. School of Environmental and Marine Affairs, University of Washington, Seattle, WA, USA
  4. Biomathematics and Statistics Scotland, Dundee, UK
  5. UK Centre for Ecology and Hydrology, Lancaster Environment Centre, Lancaster, UK
  6. Scripps Institution of Oceanography, University of California San Diego, San Diego, CA, USA
  7. Northwest Fisheries Science Center, National Marine Fisheries Service, National Oceanic and Atmospheric Administration, Seattle, WA, USA


*Corresponding author email: avancise@gmail.com


Running page head: Depth distribution of rare target detections

Introduction

Detecting rare species is a longstanding challenge facing many ecological fields, e.g. ecosystem-level studies of biodiversity or community composition (e.g. Cao, Larsen, and Thorne 2001, 2001; Queheillalt et al. 2002; White et al. 2023; Jeliazkov et al. 2022) as well as species-level spatial ecology studies (e.g. Engler, Guisan, and Rechsteiner 2004; Le Lay et al. 2010; Guisan et al. 2006). Many methodological approaches have been developed to address this problem, both in the experimental design phase (Le Lay et al. 2010; Guisan et al. 2006) and in the data analysis phase (Engler, Guisan, and Rechsteiner 2004; Guélat and Kéry 2018; Lyet et al. 2013). More recently, eDNA is emerging as a viable tool for detecting rare species, with promise to improve spatial distribution models (SDMs) and biodiversity monitoring (e.g. Afonso et al. 2024; Valdivia-Carrillo et al. 2025), particularly when integrated with existing sampling methods (e.g. Claver et al. 2025; Shelton et al. 2022).

However, detecting rare species in the marine environment presents several unique challenges owing to its vast expanse and extreme depths, most of which is remote and difficult (i.e. costly) to observe directly. Within this large environment, the distribution of species in three dimensions is driven by their ecological behaviors, and rare species will be overlooked if their depth distribution is not accounted for in experimental design. This is especially true for mobile species capable of utilizing a broad swath of the water column; some rare marine species have depth ranges spanning up to 3km (Schorr et al. 2014). Using approaches such as niche-based modeling and spatial modeling in the design phase of an experiment can increase sampling efficiency and increase positive detections while reducing the cost of monitoring distributions of rare and/or mobile species (Guisan et al. 2006; Le Lay et al. 2010).

While eDNA-based species detection is a promising approach to detecting rare species, it relies on the collection of relatively small amounts of water (generally 1-10 L) throughout the range of the target species, necessitating careful decision-making regarding where to collect point-samples with the aim of describing geographic (\(x\) and \(y\)) distribution, either independently of or alongside depth (\(z\)) distribution. However, sampling across the \(z\) dimension will enhance our capability to describe spatiotemporal variability in \(z\) habitat use that, when combined with variability in \(xy\) habitat use, may provide novel insight into the spatiotemporal ecology of these species in three dimensions. That new insight may be used to improve spatial distribution models that can then be incorporated into the management of rare species, improve our understanding of the ecology of understudied marine species, and be used to improve future sampling designs.

Marine mammals, and specifically cetaceans, are notably some of the rarest species in the marine environment. Traditionally, their habitat use and distributions are mapped using visual and/or passive acoustic observations (e.g. Dalpaz et al. 2021; Rankin et al. 2008; Elizabeth A. Becker et al. 2020a, 2022; Fiedler et al. 2023; Forney and Barlow 1998; Findlay et al. 1992; Kanaji and Miyashita 2021). Visual line-transect surveys are the most common, in which observers aboard a ship or airplane will record detections of any marine mammals and their distance to the transect line, covering the survey area in a systematic pattern. These surveys are limited to the surface of the ocean, where most cetaceans spend very little of their time, and to daylight hours - and their effectiveness decreases in poor weather or sea state. Ship-board passive acoustic line-transect surveys collect recordings along similar systematic transects, and can often detect species up to XX kilometers away from the ship in any direction. They are not limited to surface or daylight observations, but only detect species that are actively vocalizing. Systematic eDNA surveys, where water samples are collected along transects, do not share limitations related to depth, daylight, weather conditions, or species’ vocal activity and are therefore potentially powerful as standalone or complementary monitoring tools (Pikitch 2018). However, eDNA sampling notably has a spatial detection range that is much smaller than either of the two previous surveys and strongly affected by diffusion and transport of DNA by oceanographic features (Yamahara et al. 2025).

Because eDNA samples have a limited radius of detection, and because they are costly both to collect and to process, studies targeting a specific set of rare species may benefit from restricting their range of sampling depths via spatial or niche-based modeling. ere is currently a lack of published data providing guidance regarding the sampling depths to target when cetacean species, taxonomic groups, or ecological groups of rare species, or what ecological or observational processes may drive variability in the probability of detecting specific species across depths.

Here, use spatial and niche-based modeling to describe the \(z\) distribution of cetacean species’ eDNA in the water column and identify broad ecological drivers of the observed patterns. We specifically ask:

  1. Does cetacean eDNA distribution vary in the depth dimension? If so, can this variabilty be predicted by preferred prey or phylogeny?
  2. Does latitudinal and longitudinal variability in eDNA distribution interact with depth variability to affect species-specific probability of detection in three dimensions?

In addition to these ecological questions, we quantify the effect of common sources of observational bias on our detection of rare species, including technical replication (Ficetola et al. 2015) and primer specificity (Shaffer et al. 2025). We also consider the effect of DNA degradation on detection rates; while DNA degradation has been shown to decay over time (e.g. Brandão-Dias et al. 2025), we specifically consider the effect of freeze/thaw cycles on DNA decay and downstream detection probability.

To answer these questions, we utilize eDNA (water) samples collected in the California Current in 2019 in order to assess the abundance of North Pacific hake (Merluccius productus). We leverage this opportunistic dataset to examine cetacean detections generated via genetic metabarcoding of three mitochondrial target regions. By collapsing eDNA detections from these three primers across latitude and longitude (\(xy\)) and focusing only on the depth (\(z\)) distribution of sampling effort (0-500 m), we aim to determine (1) how detection probability varies with primer specificity, technical replication, or freezing and thawing samples; and (2) whether species- or taxon-specific detection probability varies with depth. Finally, we explicitly incorporate spatial (\(xy\)) distributions to determine (3) whether depth-dependent detection probability varies with latitude and longitude.

Methods

Sample Collection

We used eDNA samples collected on the 2019 US-Canada Integrated Ecosystems and Acoustics-Trawl Survey for Pacific hake from 2 July to 19 August 2019. The study area is encompassed by the California Current System, which is 500-800km wide, extends from 43°N to 23°N, and comprises an equatorward surface current and a subsurface poleward current that is concentrated mainly on the continental shelf (Hickey 1979). The system is characterized by strong alongshore advection, inshore upwelling concentrated primarily along the continental shelf break, and multiple terrestrial and oceanic water sources Auad, Roemmich, and Gilson (2011). These complex oceanographic processes support a highly productive and diverse assemblage of plankton and fishes, and form a rich prey base for at least 25 cetacean species that inhabit the study area (Carretta et al. 2023).

A full description of field sampling and DNA extraction are provided in Shelton et al. (2022). Briefly, we used water samples that were previously collected from seawater at 175 stations in the study area, spanning south-north from latitudes 37.89499°N to 48.39881°N and west-east 126.4298°W to 122.8638°W. We targeted six depths, so that our sample set of 516 samples included 148 samples collected at 0 m, 147 samples collected at 50 m, 74 samples collected at 150 m, 73 samples collected at 300 m, and 74 samples collected at 500 m. The field team collected water using Niskin bottles attached to a CTD rosette and filtered 2.5 L of water from each station and depth using a 47-mm diameter mixed cellulose ester (MCE) filter with a 1-µm pore size and a vacuum filtration manifold connected to a vacuum pump, then immediately preserved the filters in Longmire’s buffer and stored them at room temperature until extraction.

Sample Processing

DNA was previously extracted using a phase-lock phenol chloroform method as described in Ramón-Laca et al. (2021). Using this extracted DNA, we randomized stations and depths across 96-well working plates, removed one plate at a time for PCR amplification and library preparation, and stored samples in a -20°C freezer when not in use. Each time we removed a plate from the freezer, we recorded a freeze/thaw cycle for that plate in order to examine the effect of expected DNA degradation due to freeze/thaw cycles on downstream probability of detecting rare species.

Next, we tested all samples for inhibition by spiking an internal positive control (IPC) of known concentration into each sample and quantifying it using a probe assay with quantitative PCR (qPCR). We identified inhibited samples by examining the IPC concentration, as described in Shelton et al. (2022). For each sample, we selected a dilution that was determined from the qPCR data (described above) to be not inhibited, which was either 1x or 0.2x across all samples.

We then amplified each sample with three different mitochondrial primer sets: MFU targeting fishes using the 12S region (Miya et al. 2015), MV1 targeting vertebrates using the 12S region, (Valsecchi et al. 2020) and DL targeting only cetaceans using the control region (Baker et al. 2006). For all samples, we ran one technical PCR replicate for MFU, one for DL, and three for MV1. We optimized primer, Taq, and cycling conditions using mock communities (2025).

Our reaction mixture for amplification (PCR1) consisted of 10 μL of Phusion Master Mix (2X), 0.4 μL of the forward primer (10 μM); 0.4 μL of the reverse primer (10 μM), 0.6 μL of 100% DMSO, 0.5 μL of rAlbumin (20 μg/μL), 4.4 μL of nuclease-free water and 2 μL of DNA template. We ran PCR1 with the following cycling conditions: an initial denaturation of 98°C for 30 sec; followed by cycles of: denaturation (98°C for 10 sec), annealing (60°C for 30 sec), and extension (72°C for 3 sec); with a final extension of 72°C for 10 min and hold at 4°C. We ran 35 cycles for MFU and MV1, and 40 cycles for DL.

If a sample failed to produce a band at the previously determined optimal dilution when amplified using MV1 or MFU, we diluted the sample and reran it until a visible band was produced. We could not do the same for DL, which targets only cetaceans, because the rarity of our target made it prohibitively difficult to distinguish between an inhibited sample and a negative sample. Here we assumed that the optimal dilution selected by qPCR was not inhibited and treated samples without bands as negative detections.

Any MFU and MV1 samples that returned fewer than 100 total reads after sequencing were rerun in an effort to complete the dataset to include one successful MFU replicate and three successful MV1 replicates. This required up to six replicate sequencing runs in some cases.

Library Preparation and Sequencing

We then cleaned the amplified PCR product using Ampure Beads (1.2x beads:product ratio, with two 70% ethanol washes) and then indexed each sample in a second PCR reaction (PCR2) using the following reaction mixture: 12.5 μL of KAPA HiFi HotStart ReadyMix (Roche Diagnostics), 1.25 μL of IDT for Illumina DNA/RNA UD Indexes (Sets A-D), as well as primer-specific volumes of PCR1 product and nuclease free water. We added 5 μL PCR1 product, and 6.25 μL of nuclease free water for all MFU and MV1 samples, and added 11.25 μL PCR1 product and no nuclease free water to all DL samples. We ran PCR2 using the following cycling conditions: an initial denaturation of 95°C for 5 min; cycles of: 98°C for 20 sec, 56°C for 30 sec, 72°C for 1 min; and a final extension of 72°C for 5 min. We ran 35 [IS THIS RIGHT?] cycles for MiFish-U and MarVer1, and 12 cycles for D-loop.

After indexing, we visualized the PCR product on a 2% agarose gel and quantified DNA concentration using Quant-iT™ dsDNA Assay Kit (Thermo Fisher Scientific, USA) with Fluoroskan™ Microplate Fluorometer (Thermo Fisher Scientific, USA). We then normalized all indexed products to a concentration of 5ng/ul [IS THIS RIGHT?], pooled the normalized product into 96-sample libraries for sequencing, and then extracted the target band using the E-Gel™ SizeSelect™ II Agarose Gels (Thermo Fisher Scientific, USA). Finally, we loaded our pooled libraries, in 6-8 pM concentrations with a 20% PhiX spike-in, onto an Illumina MiSeq platform at the University of Washington with a v3 600 cycle kit. In total, we ran 36 sequencing runs between May 2023 and July 2025.

For every sequencing run, we added a sample with nuclease-free water (negative control) and genomic DNA from kangaroo (positive control) to PCR1 and carried it through the rest of the library preparation workflow. If anything other than human or NA was found in the negative control, or if kangaroo appeared in any other eDNA sample, we omitted that plate from the final analysis. This occurred for only one plate of technical replicates, which was later re-run without contamination [IS THIS RIGHT?].

Bioinformatic Processing

We analyzed all sequencing data using the bioinformatics workflow established in Guri et al. (2025). We trimmed Illumina-demultiplexed reads with Cutadapt v4.9 to remove primer and adapter sequences, allowing a maximum mismatch rate of 0.1 (Martin, 2011). We then conducted quality filtering and minimum-length truncation using DADA2, and inferred amplicon sequence variants (ASVs) using default denoising settings (Callahan et al. 2016).

Following initial QAQC of the raw sequence data, we queried each unique ASV against a vertebrate mitochondrial subset of the January 2025 NCBI nr eukaryote database using standalone BLAST configured with “-word_size 20 -evalue 1e-30 -max_target_seqs 100” (Altschul et al., 1990), supplemented with a negative-taxid file to remove taxa that are not known to occur in the North Pacific Ocean. We made taxonomic assignments using a least-common-ancestor procedure in TaxonKit v0.17 (Shen & Ren, 2021), prioritizing exact-identity matches and otherwise applying a stepwise identity threshold (>99.3%, >98%). We manually reviewed all cetacean identifications to ensure consistency with regional species distributions.

Detection Data QAQC

After bioinformatic processing of all samples, we removed any additional dilutions of MFU or MV1 samples that were run to maximize sequencing success, keeping only the iteration with the greatest total read count. We then defined a sample as a station/depth/technical replicate combination. To obtain positive detections for each sample, we kept any species with > 1 sequencing read. We filtered out any non-target (non-cetacean) species, and also removed detections of bottlenose dolphins, which we considered likely contamination from concurrent labwork and/or captive animals, and unidentified Balaenopterids or Delphinids. Across all samples, we define non-detection as 0 reads for a given cetacean species in a station/depth location that was positive for non-target species (e.g. fish) in either MFU or MV1 sequence data.

Data standardization

Because primer efficacy likely varies, and different numbers and types of technical replicates were sometimes run for different stations, we chose to standardize the dataset for further analysis so that the same number and type of technical replicates were included for each station and depth combination in the dataset. The standardized dataset included 1 technical replicate using MFU, 3 technical replicates using MV1, and 1 technical replicate using DL.

Data Analysis

Question 1: How does observation bias affect rare DNA detectability?

Because various decisions made during sample collection, processing, and sequencing can introduce observation bias, we first leveraged the information contained in multiple technical replicates from individual biological samples to better understand the detection processes for cetacean eDNA. Here we examine the probability of detecting cetacean DNA from any species, under the assumption that all cetacean DNA is equally capable of being detected if it is present [cite Shaffer paper here?]. We used a Bayesian hierarchical occupancy model to separately estimate the probability of occupancy/capture by depth and detectability by primer and number of freeze/thaw cycles, based on the model presented in Patin et al. (in review).

We modeled the presence of cetacean eDNA in a given biological replicate at site \(i\) and depth \(z\), \(y_{i,z}\), as a Bernoulli random variable with occurrence probability \(\Psi_d\):

\[ y_{i,z} \sim \text{Bernoulli}(\Psi_{z}) \]

where probability of presence in a sample at depth \(\Psi_z\) is a smooth function of sample depth \(z\):

\[ \text{logit}(\Psi_z) = \beta_0 + f(z). \tag{1}\]

We then modeled the observed detection of cetacean eDNA in each technical replicate \(x_t\) as a Bernoulli random variable with primer- and freeze/thaw-specific detection probability \(\Omega_\text{primer*thaw}\) conditional on presence in the biological sample \(y_{i,z}\):

\[ x_t \sim \text{Bernoulli}({\Omega_\text{primer}} \cdot z_{i,z}). \tag{2}\]

where detection probability \(\Omega\) is a logit-linear function of primer type and number of freeze/thaw cycles that the sample used in the technical replicate has been subjected to \(F\):

\[ \text{logit}(\Omega_\text{primer*thaw}) = \beta_\text{primer} + \beta_\text{thaw} \cdot F. \tag{3}\]

Here, \(\beta_\text{primer}\) is a primer-specific intercept and \(\beta_\text{thaw}\) is a slope parameter that interacts with \(F\), the number of freeze/thaw cycles.

We implemented this model using nimble [de Valpine et al. (2017); DOI] in R (R Core Team 2016). We used uninformative or weakly informative prior distributions for model parameters and ran four chains, each with 250,000 iterations, 225,000 of which were burnin, and assessed convergence using the Gelman-Rubin diagnostic (Gelman and Rubin 1992).

Using the results of this model, we simulated the primer-specific expected probability of detection with increasing technical replication.

Question 2: Does species-specific POD vary with depth?

To determine whether species-specific POD varies in the z (depth) dimension, we used generalized additive models [GAMs; S. N. Wood (2017)] implemented using the mgcv package in R (R Core Team 2016). GAMs have previously been demonstrated to be an appropriate tool for modeling spatial distributions of cetaceans (e.g. Elizabeth A. Becker et al. (2016); Elizabeth A. Becker et al. (2019); Elizabeth A. Becker et al. (2020b); Elizabeth A. Becker et al. (2022); Abrahms et al. (2019)). We fit all GAMs using detection or non-detection for each site, depth, and technical replicate as the response variable. We used a binomial distribution with a logit link function, and estimated model parameters using REML. Models examining the effect of depth were built iteratively, starting by modelling the probability of detection (POD) of any cetacean species as a smooth function of sampling depths (Eq. Equation 4), then allowing the intercept to vary by species (Eq. Equation 5), and finally allowing both intercept and model smooth to vary by species (Eq. Equation 6).

\[ \text{logit}(\mu_i) = \beta_0 + f(z_i) \tag{4}\]

\[ \text{logit}(\mu_{i,s}) = \beta_{s} + f(z_i) \tag{5}\]

\[ \text{logit}(\mu_{i,s}) = \beta_{s} + f_s(z_i) \tag{6}\]

In these equations, \(\mu_i\) is the probability of detecting any cetacean for location \(i\) and depth \(z_i\) and \(\mu_{i,s}\) is the probability of detecting species \(s\) at location and depth \(i\). \(\beta_0\) is a shared intercept parameter and \(\beta_{s}\) is a species-specific intercept parameter. \(f(d_i)\) denotes a smooth on depth across species and \(f_s(d_i)\) denotes a species-specific smooth on depth.

In addition to examining depth variability across species-level strata, we examined several two other stratification schemes in order to determine whether depth variability in detection probability is driven by (1) phylogeny or (2) preferred prey type. Although we have determined that primer specificity, technical replication, and number of freeze/thaw cycles affect POD in our samples, we have excluded these from downstream modeling since primers and technical replication were applied consistently across samples.

We assessed model fit using AIC, and model performance using standard metrics including the percent of deviance explained, the area under the receiver operating characteristic curve (AUC, Fawcett 2006), the true skill statistic (TSS, Allouche, Tsoar, and Kadmon 2006), and a visual comparison of predicted vs. observed detection rates across sampling depths. We also visually compared our model-based predictions of probability of detection across depth with estimates of species-specific time-at-depth as reported by Watwood and Buonantony (2012) and the U.S. Department of the Navy (2017).

Question 3: Does depth variability in POD covary with geographic variability?

We also used a GAM to examine covariability between depth (\(z\)), latitude (\(y\)), and longitude (\(x\)) on a per-species basis. Using the best-fit model from Question 2 as our starting point (species-specific variability in POD with \(z\)), we tested covariation in \(xyz\) on a species-specific basis with the following:

\[ \text{logit}(\mu_{s,i}) = \beta_{s} + f_s(x_i,y_i,z_i) \tag{7}\]

\(f(x_i,y_i,z_i)\) was constructed as tensor product of marginal smoothers of x, y, and z and in the case of the species-specific factor-smooth interaction we used the tensor product of x, y, and z, and a random effect on species. Using the ti() contraction in mgcv enabled us to construct intermediate tensor products on interactions that we were interested in.

As in Question 2, we assessed model fit using AIC, and model performance using percent deviance explained, AUC, and TSS. We then conducted a visual comparison of the predicted depth of maximum POD across and our observed detection depth within the study area for the most frequently observed baleen whale, dolphin, and beaked whale.

Results

We selected 516 water samples collected at 175 unique locations during the 2019 NOAA hake abundance survey (Figure 1), targeting 0, 50, 150, 300, and 500 m sampling depths at each station. Where the bottom was less than 500 m deep, we did not collect a sample at 500 m. We amplified and sequenced these samples in 1 technical replicate using MFU, 3 technical replicates using MV1, and 1 technical replicate using DL, resulting in a final total of 1548 amplified and sequenced samples.

Table 1: Scientific names, common names, abbrevation, taxonomic family, broad prey category, and total number of detections for each cetacean species detected in our study area.
Species Common name Sp. abbrevation Family Broad taxon Prey family # Detections DL MFU MV1
Lagenorhynchus obliquidens Pacific white-sided dolphin Lobl Delphinidae Dolphin/Porpoise fish 95 4 20 71
Phocoenoides dalli Dall’s porpoise Pdal Phocoenidae Dolphin/Porpoise fish 74 6 5 63
Megaptera novaeangliae humpback whale Mnov Balaenopteridae Baleen whale invert 73 9 14 50
Balaenoptera physalus fin whale Bphy Balaenopteridae Baleen whale invert 60 4 16 40
Stenella coeruleoalba striped dolphin Scoe Delphinidae Dolphin/Porpoise fish 33 0 2 31
Phocoena phocoena harbor porpoise Ppho Phocoenidae Dolphin/Porpoise fish 32 2 4 26
Grampus griseus Risso’s dolphin Ggris Delphinidae Dolphin/Porpoise fish 26 5 3 18
Orcinus orca killer whale Oorc Delphinidae Dolphin/Porpoise fish 16 2 3 11
Berardius bairdii northern giant bottlenose whale Bbai Ziphiidae Beaked whale squid 14 0 2 12
Ziphius cavirostris goose-beaked whale Zcav Ziphiidae Beaked whale squid 12 2 2 8
Eschrichtius robustus gray whale Erob Eschrichtiidae Baleen whale invert 11 0 3 8
Balaenoptera musculus blue whale Bmus Balaenopteridae Baleen whale invert 8 1 1 6
Balaenoptera acutorostrata minke whale Bacu Balaenopteridae Baleen whale invert 7 1 1 5
Lissodelphis borealis northern right whale dolphin Lbor Delphinidae Dolphin/Porpoise fish 4 4 0 0
Mesoplodon stejnegeri saber-toothed whale Mste Ziphiidae Beaked whale squid 2 0 0 2
Balaena mysticetus bowhead whale Bmys Balaenidae Baleen whale invert 1 0 1 0

Across all samples, we observed 468 positive detections of 16 cetacean species, including 6 baleen whale species, 5 dolphin species, 2 porpoise species, and 3 beaked whale species (Table 1). 11 species were observed more than 10 times, and and additional 5 were observed fewer than 10 or fewer times. Notably, this includes one detection of a bowhead whale (Balaena mysticetus), as well as three detections of saber-toothed whales (Mesoplodon stejnegeri). Figure 1 illustrates the distribution of detections across sampling depths z, ignoring xy distribution, for all cetacean species included in the study.

Figure 1: Left: Distribution of detections across sampling depths for all species included in the study. Species are grouped as follows: Baleen whales includes the Balaenopteridae, Balaenidae, and Eschericthidae; Dolphins/Porpoises includes all Delphinidae and Phocoenidae; Beaked whales includes all Ziphiidae. Right: Geographic distribution of samples used in this study, acquired from water collected during a 2019 survey of hake distribution in the California Current.

Question 1: Quantifying the effect of observation bias on POD

Given the presence of cetacean eDNA in a sample, the probability of detecting that DNA is significantly affected by primer specificity (Figure 2, left). Assuming a single freeze/thaw cycle, MV1 had the highest predicted POD (0.44), followed by DL (0.35) and then MFU (0.31). Using the posterior POD for each primer, we estimated the expected rate of increase in POD with technical replication by primer (Figure 2, middle), again assuming a single freeze/thaw cycle. Achieving a POD of >90% for cetacean DNA collected in a sample would require four technical replicates of MV1, ~5-6 technical replicates of DL, or 7 technical replicates of MFU.

While increasing the number of technical replicates increases POD, we found that DNA degradation by freeze/thaw cycles can decrease the POD of DNA in a sample (Figure 2, right). We found a significant relationship between freeze/thaw cycle and posterior estimates of POD for the longest primer in our study (DL, 360 bp), and a weaker and nonsignificant relationship between freeze/thaw cycle and posterior POD for MV1 (202 bp). There was no relationship detected between posterior POD and freeze/thaw cycles for MFU (185 bp), noting that only 5 of 516 total MFU samples sequenced in this study had > 2 thaw cycles.

Given our posterior estimates of detection probability by primer and number of freeze/thaw cycles, we calculated a sample-specific probability of detection for each sample in our study, assuming cetacean DNA is present in the sample. Despite the significant reduction in POD in our DL samples with a high number of freeze/thaw cycles (lowest DL POD = 0.08), including four replicates using primers that were not strongly affected by freeze/thaw cycle ensured a high overall probability of detecting cetacean eDNA in a sample, conditional on its presence. Mean POD across all samples in the study was 0.87 (range: 0.85 - 0.92). Averaged across all samples in this study when accounting for freeze/thaw cycles, the mean POD for DL is 0.14, mean POD for MFU is 0.31, and 0.39 for MV1.

Figure 2: Left: Posterior distribution of primer-specific probability of detecting cetacean eDNA given its presence in a sample (horizontal axis) with one freeze-thaw cycle. Middle: Posterior mean primer-specific detection probability (vertical axis) with increasing technical replication (horizontal axis) for each primer used in this study. Right: Posterior distribution of the probability of detection given presence (vertical axis) by number of freeze/thaw cycles (horizontal axis) and primer (panels). For all plots, MFU = Mifish, MV1 = MarVer1, DL = Dloop.

Question 2: Does POD vary with depth?

We first negated the null hypothesis of no variation across depth and that depth does not interact with primer or freeze/thaw cycle to affect POD (not sure if this is the best way to say this) (Supplemental Figure S1, add Q2 model of depth POD here??). We then determined that the best alternative model describing POD includes significant smooth terms for depth, species, and a depth x species interaction (AIC = 4547, Table 2). This model explained 14.3% of variance in eDNA detection probability, with a mean AUC of 0.73 and a mean optimized TSS of 0.38. Full model TSS for all thresholds, as well as per-species AUC, optimized TSS, TSS values across all thresholds are shown in Supplemental Table S2 and Supplemental Figure S2. Species-specific model predictions generally agreed with the raw detection data.

Table 2: Akaike Information Criterion values for all GAMs tested to characterize the relationship between depth and POD of eDNA.
model stratification df deltaAIC
depth, s(species) species 48 0
depth + species species 19.28 141.6
depth, s(family) family 19.72 307.4
depth, s(prey type) prey 13.16 412.5
depth none 3.734 549.4

Broadly, delphinids were more likely to be detected at the surface, beaked whales at depths > 200m, and baleen whales either at the surface or at depths > 400m (Figure 3). Note that although detection data from the five species detected fewer than 10 times were included in our model input, we did not make predictions for any of these species.

Among the delphinids, species-specific trends in POD generally reflected model estimates of time-at-depth. Conversely, most beaked whale and baleen whale species in the study were very likely to be detected outside depths where they spend most of their time (Figure 3). Most notably, predicted POD-at-depth plots for two of the three baleen whales indicated a high probability of detection at 500 m, while these two species spend most of their time above ~200 m. Per-species model predictions are shown alongside observed detections in Supplemental Figure S5.

Figure 3: Species-specific probability of detection across all depths sampled in this study. Solid line shows POD; shaded regions show 95%, 75%, and 50% confidence intervals. Dashed line shows proportional time at depth. Species abbreviation codes can be found in Table 1; plots are colored by taxonomic category as outlined in Table 1.

Models stratifying detections by taxonomic Family and preferred prey type both had lower AIC values than the model stratified by species (AIC = 4854 and 4960, respectively); however, these models still had significant terms for Family and depth x Family interactions (taxonomic model) and for depth, prey type, and depth x prey type interactions (prey type model). Predicted POD curves for each model illustrate varying trends in POD with depth across strata (Figure 4). Similar to the model stratified by species, we did not make predictions for Families with fewer than 10 detections using the taxonomic model, which excluded the Balaenidae (bowhead whale family).

Figure 4: Left: POD by depth for species stratified by taxonomic Family. Right: POD by depth for species stratified based on broad prey categories.

Question 3: Does depth variability in POD covary with geographic variability?

Adding xy covariates to the best fit model from Q2 improved overall model performance. Our model included terms for species, lat/lon, and depth, as well as species x lat/lon, species x depth, and species x depth x lat/lon (AIC = 4201), and explained 25.3% of the variability in eDNA detection. Although terms for depth and lat/lon were not significant, removing these terms increased model AIC to 4947. Mean AUC across a 10-fold validation test was 0.78 and mean optimized TSS was 0.45. Full model TSS for all thresholds are shown in Supplemental Table S2 and Supplemental Figure S2.

Although \(xy\) predictor variables improved model performance, the predicted \(z\) distribution of POD for the most frequently detected delphinid, baleen whale, and beaked whale exhibited minimal variability across \(x\) and \(y\), largely adhering to the broad trends identified in Q2 (Figure 5).

The Pacific white-side dolphin is most likely to be detected near the surface, with max POD varying between depths of 0 m and 220 m, and deeper max POD depths toward the southeast corner of the study area. POD was considered 0 (less than 1%) in the region off the shelf break between Point Conception and San Fransisco Bay (Figure 5).

Depths of highest POD for Baird’s beaked whales varied between 130 m and 470 m, with minimal spatial variability in depth of max POD across the study area. Notably, POD was considered 0 (less than 1%) along most of the continental shelf.

Depths of highest POD for humpback whales had two disparate peaks, at 0 m and 500 m, with almost no intermediate depths at max POD. The two peaks in depth of greatest POD for humpback whales were roughly distributed so that higher probabilities of detection at the surface occurred along the continental shelf, and higher probabilities of detection at depth occured on the shelf break or in deeper water. POD was considered 0 (less than 1%) along the coast near Point Conception, as well as off the shelf in the same region.

Figure 5: Depth of maximum probability of detection (POD; top) and range of depths falling within 50% CI of the depth of maximum detection (bottom) for three species: Baird’s beaked whale (B. bairdii), max POD ~ 0.3; Pacific white-sided dolphin (L. obliquidens), max POD ~ 0.35; humpback whale (M. novaegliae), max POD ~ 0.54. Regions with max POD < 0.01 were masked in light grey.

In addition to examining geographic patterns in the depth of maximum POD, we examined uncertainty around these values by mapping the range of depths that fell within the 50% confidence interval of maximum POD (Figure 5, bottom). In the northern portion of the study area, where Pacific white-sided dolphins are most likely to be observed near the surface, the range of depths within the 50% CI of maximum POD is neglgible (minimum depth range = 0 m). That range increases toward the south and east (maximum depth range = 500 m), with the greatest variability in the south and east of the study area, near the region where POD is close to 0. Baird’s beaked whales exhibited minimal variability in 50% CI width of maximum depth (minimum depth range = 0 m), except for a small portion of the northwestern edge of the study area (maximum depth range = 470 m) near the edge of the prediction grid.

The bimodal peak of maximum POD observed in humpback whale eDNA was mirrored in the range of depths within the 50% CI of maximum POD: the depth range was largest in the transition zones between the two peaks (maximum depth range = 500 m), and small outside of the transition zones (minimum depth range = 0 m).

Discussion

In addition to being extremely rare targets, cetaceans are highly mobile in \(xy\) space, and their \(z\) distributions span 100s to 1,000s of meters. Mapping the \(z\) distribution of their eDNA can provide new insight regarding their 3D spatial use, including seasonal or geographic variability in \(z\) distribution. Additionally, eDNA monitoring has the potential to improve our understanding of \(xy\) spatial distribution, especially for visually or acoustically elusive species, when used in joint species distribution models. Here we show that the use of occupancy models and niche-based models can inform and optimize eDNA sample collection and processing. For cetaceans and other highly mobile rare marine species, tailoring experimental design based on the results of this study could increase detection rates and improve distributional models while minimizing cost.

Question 1:

We began with an occupancy model to explicitly separate the effects of eDNA presence at the sampling site/in the bottle and downstream detection in the sequence data. By implementing this framework we observed an increase in POD with increased primer specificity. Similarly, increasing technical replication increases POD, except that increasing freeze/thaw cycles decreases POD and may counteract the effect of technical replication for longer target regions, as was observed with DL in this study. Shorter regions (MFU and MV1) more robust to decay from freeze/thaw cycles but lower genetic variability in cetacean DNA in each of those regions hinders the ability to classify sequences to species, which necessitated heavy manual curation of the reference database for each of these primers. Within the Delphinidae family, several sequences could not be resolved to species, resulting in the loss of 1 DL species-level detection, 2 MFU species-level detections, and 17 MV1 species-level detections. These family-level detections could not be included in downstream species-specific models of eDNA distribution.

Similar studies have found that observation bias and rare species dropout in eDNA sampling are affected by a number of parameters in the sample collection and processing protocols (e.g. Shelton et al. 2023; Gold et al. 2023), including sampling volume (Patin et al. in review), filter type and pore size (Liu et al. 2024), primer specificity (Shaffer et al. 2025), and technical replication (Ficetola et al. 2015), among others. The results of our study indicate that rare species in a sample can be detected more efficiently using primers with higher specificity and targeting regions with greater variability among the taxa of interest; further, the samples should be subjected to a limited number of freeze/thaw cycles in order to preserve the integrity of longer DNA fragments. Because increased technical replication is adviseable to increase POD of rare species, the best practice would be to conduct all technical replicates for a given sample set concurrently, thus only needing to thaw the DNA 0-1 times before amplification and sequencing.

After accounting for observation bias in our detection rates, we found that species-specific patterns of cetacean eDNA presence with depth were qualitatively similar in shape when fitted in an occupancy model context vs GAM models used in the remainder of the study. Estimates of POD from the occupancy model were higher overall than POD estimates from GAM models, with confidence intervals approaching 1. This makes sense since the Bayesian hierarchical occupancy model was estimates the POD conditional on the presence of target eDNA in the sample, whereas the GAM collapse conditional POD with the probability of target species’ occurence at a sampling site, which will lower overall POD.

While the occupancy model produced qualitatively similar estimates of the depth distribution of cetcean POD, we were not able to produce similar models of species-specific POD using the occupancy model. We hypothesize that this is due to overpartitioning of the detection data amongst strata, and used GAM models for Questons 2 and 3 in our study. While the GAM models do not explicitly account for the sources of observation bias identified in this study, our cetacean-wide occupancy model results indicate that the results would be qualitatively similar, with additional resolution between species-specific conditional POD and the probability of a species’ presence at a site. Continue eDNA sampling at these locations will facilitate the convergence of these models in the future.

Question 2:

POD across all cetaceans varied significantly with sampling depth (Supplemental Figure S1). This variability was best described at a species-specific level, and broadly reflected our current understanding of species-specific variability in use of the water column (Figure 3). This was especially true among the delphinids and porpoises; on the other hand, we observed some noticeable deviations from expected trends in both the baleen whales and the beaked whales. These trends could be inherent to the behavior of the animals themselves, or more likely may reflect the “behavior” of eDNA.

For example, baleen whales are generally thought to be shallow divers, and while some dives up to approximately 500 m have been recorded in a few species (e.g. fin whales, Panigada et al. 1999), recent reports of time-at-depth for cetacean species indicates that baleen whales spend the vast majority of their time within the top 100-200 m of the ocean (Watwood and Buonantony 2012; U.S. Department of the Navy 2017). However, we observe a sharp spike in detection probability at 500 m for two of the three baleen whale species included in our model predictions. Gray whale eDNA was not likely to be observed at 500 m; this is most likely because they are a nearshore coastal species that is rarely observed in deep water. Humpback and fin whales both have pelagic distributions, and both species’ eDNA was as likely to be observed at 500 m as at the surface, with low POD at intermediate depths.

This observed pattern has several possible explanations. It is possible that some of these deeper eDNA detections, especially those of deeper-diving fin whales, are detections of deep dives made by animals foraging for krill at depth. However, based on our current knowledge of the diving limits of baleen whales, we consider it highly unlikely that most of the deep detections are indicative of the presence of the detected species at that depth. We considered the possibility that the large spike in deep detections was driven by whalefall or predation (i.e. whole or parts of animals sinking in the water column). In total, baleen whales were detected below 250 m at 28 locations; of those, 8 detections (28.5%) were within 100 m of the ocean floor and could be potential detections of whalefall. A map of potential whalefall locations is included in Supplemental Figure S2. In addition to this, a single baleen whale detection coincided with the detection of killer whale eDNA, and therefore could be attributed to predation. No eDNA detections of baleen whales coincided with the detection of shark eDNA.

The remaining 19 deep-water detections of baleen whale eDNA (68%) may be attributed to whale poop. While sinking rates and depths of baleen whale poop are generally unknown, we know from expert observations that fecal material from baleen whales is quite large and can sink rapidly from the surface (WHO CAN I PERS COMM FOR THIS?!? KIM?!?!). Larger clumps of eDNA can be detectable for up to 2-4 days at the surface (Brandão-Dias et al. 2025; S. A. Wood et al. 2020), and likely longer at depth where it is colder, darker, and there are fewer microbes digesting cellular material (Andruszkiewicz, Sassoubre, and Boehm 2017). Research on oceanic particle sinking rate indicates that smaller particles (up to 6 mm) can sink at rates of ~50-300 m day, and that sinking rates increase rapidly with particle size while degradation rates decrease with increasing particle size, with maximum sinking depths in the range of ~800-5,000 m (Giering et al. 2020; McDonnell, Boyd, and Buesseler 2015; Iversen and Ploug 2010).

Based on this, the maximum sinking depth of larger poop particles may far exceed the \(z\) range of our study area. It is possible that baleen whale detection rates will continue to increase with increasing depth in some regions due to rapid particle sinking and longer eDNA residency times at depth (Andruszkiewicz, Sassoubre, and Boehm 2017). Similarly, detections made near the ocean floor that may be whalefall could also be signals of accumulated whale poop. Genearlly, eDNA is thought to travel less than 10 horizontal km before it is no longer detectable; therefore on a broad scale these detections are likely to reflect the \(xy\) distribution of these species but the temporal resolution of detections may be decreased from days to months in samples from deeper environments.

Beaked whales were also generally detected deeper than where they are thought to spend most of their time (Figure 3), although still within their normal range of vertical habitat use. Beaked whales spend less of their time near the surface than either baleen whales or delphinids; time-at-depth models estimate that they spend approximately 70% of their time below 100 m (Watwood and Buonantony 2012; U.S. Department of the Navy 2017), with depth ranges spanning down as far as 3,000 m (Schorr et al. 2014). While below the surface, beaked whales generally make deep (500+ m) foraging dives interspersed with several shallow (200-300 m) “resting” dives (Baird et al. 2006). We found that goose-beaked whales, considered the most common beaked whale in the Pacific Ocean, were most likely to be detected between 200 and 300 m, in the range of their resting dives. Baird’s beaked whales were most likely to be detected at 400-500 m; as 500 m was the extent of our sampling range, it is possible that this species or other beaked whales have higher detection probabilities at depths greater than 500 m. While we did not make predictions of POD by depth for the saber-toothed whale, all three detections of this species were at 300 m (n = 1) and 500 m (n =2).

Finally, for both baleen whales and beaked whales, high POD at depth may be caused by either slower decay/longer residency times or decreased abundance of highly prevalent taxa such as microbes, plankton, fishes, etc (Robinson et al. 2024; Pinfield et al. 2019).

In general, sampling near the surface will prioritize detections of delphinids and porpoises, and sampling at 300+ m will prioritize detection rates of beaked whales, with baleen whales detectable at both the surface and depths >300 m. While we found that POD varies by species, we also observed significant trends in POD with depth when species were grouped by phylogenetic family or by broad prey category. This indicates some degree of similarity in habitat use across the \(z\) axis within these groups, such that researchers wanting to design an eDNA monitoring study targeting a species for which there is little informaton could stratify their sampling according to phylogenic or niche-based distribution models.

Question 3:

Trends in POD by depth varied geographically by species, as illustrated by the three example species for which we made predictions of depth of maximum POD (Figure 5). At the broad scale of this study, we expect that this at least partially reflects geographic variability in species-specific use of the water column. That may reflect geographic shifts in the depth distribution of prey or preferred prey items of a given predator, which may in turn reflect geographic variability in oceanographic conditions.

Alternatively, it is likely that at least a portion of this observed variability is directly linked to oceanographic variability, and its effect on the fate and transport of eDNA once deposited in the water column. eDNA degradation and residence time will be affected by water temperature, pH, and microbial density. During that time, horizontal transport will be affected by currents, eddies, etc. Vertical transport will be affected by particle size, stratification, upwelling, mixed layer depth, etc. This study is relatively insensitive to horizontal transport of eDNA, which is expected to travel up to XX km on average before it is no longer detectable (e.g., Xiong et al. 2025; Brasseale et al. 2025; Brandão-Dias et al. 2025), our samples were collected at 10 km intervals throughout the study area. Given this high level resolution, our predictions of the \(z\) distribution in probability of detecting rare species eDNA is most likely to be affected by oceanographic processes that change sinking rates or residence time.

It is likely that variability in the \(z\) distribution of species-specific POD represents a combination of both \(xy\) variability in species-specific use of the water column, as well as \(xy\) variability in eDNA sinking and residency times. In both cases, underlying oceanographic processes are an important driver of the observed patterns. To assess these links between oceanography and 3D spatial distribution, our next study will explicitly focus on the development of 3D spatial distribution models using eDNA and using these models to determine the important oceanographic drivers of cetacean eDNA in 3D space.

Considerations for beaked whales

Among cetaceans, beaked whales were the most rarely observed in our dataset, echoing the results of previous visual and acoustic data collection efforts in the California Current, resulting in poor performance metrics for SDMs and density estimates for this clade (e.g. Elizabeth A. Becker et al. 2020b). For example, during regular surveys of the U.S. EEZ portion of the California Current ecosystem, Berardius bairdii were observed approximately 45 times between 1991 and 2018 (Carretta et al. 2023) across a study area roughly double the latitudinal width of our study area.

We observed the same species 14 times using eDNA, and note that all but one of these detections were made below the surface. Most detections were made below 100 m, and the highest predicted POD for any beaked whales species was at depths > 200 m both with and without \(xy\) model covariates. Incorporating \(xy\) covariates illustrated that beaked whales are not likely to be observed on the continental shelf, supporting previous SDMs showing their absence near the coast (Elizabeth A. Becker et al. 2020b), and that their offshore eDNA detections are most likely to be at depths > 300 m throughout the Califonia Current. In the context of beaked whale ecology, this result is not surprising, but it illustrates the promise of eDNA to supplement our current database of positive detections in the case of extremely rarely sighted species such as beaked whales, which may improve SDMs considerably. eDNA can also be used to refine our understanding of habitat use in the \(z\) dimension for these deepwater species.

Marine Mammal monitoring: lessons for eDNA experimental design

Species-specific spatial models, niche-based modeling, or phylogeny-based distribution models can be used to stratify eDNA sampling designs in order to increase sampling efficiency and positive detections of target taxa, especially for rare and mobile marine species, while reducing sampling costs. While eDNA sampling is promising as a complementary or standalone approach to characterizing the distributions of marine mammals, its efficacy will be greatest for species with SDMs that are low resolution or have high confidence intervals due to a paucity of visual or acoustic observations. These are generally deeper-diving species such as beaked whales. Future marine mammal surveys may consider integrating nighttime collection of eDNA samples from depths >300 m to increase positive detections of these species. Increasing positive detections for these or any rare species will increase spatiotemporal resolution of SDMs and habitat use models, which can then be fed back into experimental design for future eDNA sampling surveys.

Once samples are collected, primer specificity, DNA degradation associated with freeze–thaw cycles, and the level of technical replication all influence the probability of detecting eDNA in each sample. Increasing primer specificity and technical replication increases probability of detecting rare species in the sample. While targeting shorter fragments that are robust to DNA decay from freeze/thaw cycles may be a useful way to increase technical replication, it may also decrease the ability to classify sequence data down to species-level resolution. Therefore, sequencing longer and more specific fragments first, using as few freeze/thaw cycles as possible, and conducting all technical replicates simultaneously will increase POD of DNA that is rare in the sample.

Integrating optimized eDNA sampling design with strategies that maximize DNA capture has the potential to substantially expand the utility of eDNA for modeling the distribution of rare marine species, including cetaceans. The framework developed here provides practical guidance for tailoring eDNA sample collection and processing to single or multiple taxa in accordance with study objectives, increasing detection efficiency while reducing sampling effort and cost.

Data availability

All raw sequence data have been uploaded to GenBank (Accessions: XXXX). Detection data, sample metadata, and all code used to generate this manuscript are available at https://github.com/MMARINeDNA/zDistribution.

Acknowledgements

We are grateful to the many people who supported eDNA sample collection and processing, including XXXX, and the crew of the R/V XXXX. Funding for this research came from ONR Grant XXXX.

Supplemental Materials

Table S1. Sample metadata.

Table 3: Table S2. Per-species AUC, optimized TSS, and optimized TSS threshold for Q1 and Q3 models.
Figure 6: Supplemental Figure S1. Q1 null hypothesis plot.
Figure 7: Supplemental Figure S2. TSS plots for the best-fit Q@ model (top) and Q3 model (bottom) showing the 10-fold mean TSS values across all detection thresholds. The optimal detection threshold is that which maximizes TSS.
Figure 8: Supplemental Figure S3. Locations of eDNA detections within 100 m of the ocean floor, which may represent nearby whalefall.
Figure 9: Supplemental Figure S5. Q1 best fit model with observed detections.

References

Abrahms, Briana, Heather Welch, Stephanie Brodie, Michael G. Jacox, Elizabeth A. Becker, Steven J. Bograd, Ladd M. Irvine, Daniel M. Palacios, Bruce R. Mate, and Elliott L. Hazen. 2019. “Dynamic Ensemble Models to Predict Distributions and Anthropogenic Risk Exposure for Highly Mobile Species.” Diversity and Distributions 25 (8): 1182–93. https://doi.org/10.1111/ddi.12940.
Afonso, Luís, Joana Costa, Ana Mafalda Correia, Raul Valente, Eva Lopes, Maria Paola Tomasino, Ágatha Gil, et al. 2024. “Environmental DNA as a Complementary Tool for Biodiversity Monitoring: A Multi-Technique and Multi-Trophic Approach to Investigate Cetacean Distribution and Feeding Ecology.” PLOS ONE 19 (10): e0300992. https://doi.org/10.1371/journal.pone.0300992.
Allouche, Omri, Asaf Tsoar, and Ronen Kadmon. 2006. “Assessing the Accuracy of Species Distribution Models: Prevalence, Kappa and the True Skill Statistic (TSS).” Journal of Applied Ecology 43 (6): 1223–32. https://doi.org/10.1111/j.1365-2664.2006.01214.x.
Andruszkiewicz, Elizabeth A., Lauren M. Sassoubre, and Alexandria B. Boehm. 2017. “Persistence of Marine Fish Environmental DNA and the Influence of Sunlight.” PLOS ONE 12 (9): e0185043. https://doi.org/10.1371/journal.pone.0185043.
Auad, Guillermo, Dean Roemmich, and John Gilson. 2011. “The California Current System in Relation to the Northeast Pacific Ocean Circulation.” Progress in Oceanography 91 (4): 576–92. https://doi.org/10.1016/j.pocean.2011.09.004.
Baird, Robin W., Daniel L. Webster, Daniel J. McSweeney, Allan D. Ligon, Gregory S. Schorr, and Jay Barlow. 2006. “Diving Behaviour of Cuvier’s ( Ziphius Cavirostris ) and Blainville’s ( Mesoplodon Densirostris ) Beaked Whales in Hawai‘i.” Canadian Journal of Zoology 84 (8): 1120–28. https://doi.org/10.1139/z06-095.
Baker, C. S., V. Lukoschek, S. Lavery, M. L. Dalebout, M. Yong-un, T. Endo, and N. Funahashi. 2006. “Incomplete Reporting of Whale, Dolphin and Porpoise ‘Bycatch’ Revealed by Molecular Monitoring of Korean Markets.” Animal Conservation 9 (4): 474–82. https://doi.org/10.1111/j.1469-1795.2006.00062.x.
Becker, Elizabeth A., James V. Carretta, Karin A. Forney, Jay Barlow, Stephanie Brodie, Ryan Hoopes, Michael G. Jacox, et al. 2020a. “Performance Evaluation of Cetacean Species Distribution Models Developed Using Generalized Additive Models and Boosted Regression Trees.” Ecology and Evolution 10 (12): 5759–84. https://doi.org/10.1002/ece3.6316.
Becker, Elizabeth A., Karin A. Forney, David L. Miller, Jay Barlow, Lorenzo Rojas-Bracho, Jorge Urbán R, and Jeff E. Moore. 2022. “Dynamic Habitat Models Reflect Interannual Movement of Cetaceans Within the California Current Ecosystem.” Frontiers in Marine Science 9 (May). https://doi.org/10.3389/fmars.2022.829523.
Becker, Elizabeth A., Karin A. Forney, Bruce J. Thayre, Amanda Debich, Gregory S. Campbell, Katherine Whitaker, Annie B. Douglas, Anita Gilles, Ryan Hoopes, and John A. Hildebrand. 2016. “Habitat-Based Density Models for Three Cetacean Species Off Southern California Illustrate Pronounced Seasonal Differences 4 (May): 1–14. https://doi.org/10.3389/fmars.2017.00121.
Becker, Elizabeth A, Karin A Forney, David L Miller, Paul C Fiedler, Jay Barlow, and Jeff E Moore. 2020b. “Habitat-Based Density Estimates for Cetaceans in the California Current Ecosystem Based on 1991-2018 Survey Data.” December.
Becker, Elizabeth A, Karin A Forney, Jessica V Redfern, Jay Barlow, Michael G Jacox, Jason J Roberts, and Daniel M Palacios. 2019. “Predicting Cetacean Abundance and Distribution in a Changing Climate.” Diversity and Distributions 25 (4): 626–43. https://doi.org/10.1111/ddi.12867.
Brandão-Dias, Pedro Fp, Megan Shaffer, Gledis Guri, Kim M. Parsons, Ryan P. Kelly, and Elizabeth Andruszkiewicz Allan. 2025. “Differential Decay of Multiple Environmental Nucleic Acid Components.” Scientific Reports 15 (1): 26791. https://doi.org/10.1038/s41598-025-12916-5.
Brasseale, Elizabeth, Nicolaus Adams, Elizabeth Andruszkiewicz Allan, Eiren K. Jacobson, Ryan P. Kelly, Owen R. Liu, Stephanie Moore, Megan Shaffer, Jilian Xiong, and Kim Parsons. 2025. “Marine eDNA Production and Loss Mechanisms.” Journal of Geophysical Research: Oceans 130 (4): e2024JC021643. https://doi.org/10.1029/2024JC021643.
Callahan, Benjamin J, Paul J McMurdie, Michael J Rosen, Andrew W Han, Amy Jo A Johnson, and Susan P Holmes. 2016. DADA2: High-resolution Sample Inference from Illumina Amplicon Data.” Nature Methods 13 (7): 581–83. https://doi.org/10.1038/nmeth.3869.
Cao, Y., D. P. Larsen, and R. St-J. Thorne. 2001. “Rare Species in Multivariate Analysis for Bioassessment: Some Considerations.” Journal of the North American Benthological Society 20 (1): 144–53. https://doi.org/10.2307/1468195.
Carretta, James V., Erin M. Oleson, Karin A. Forney, David W. Weller, Aimee R. Lang, Jason Baker, Anthony J Orr, et al. 2023. “U.S. Pacific Marine Mammal Stock Assessments: 2022.” U.S. Department of Commerce, NOAA Technical Memorandum NMFS-SWFSC-684.
Claver, Cristina, Beatriz Sobradillo, Iñaki Mendibil, Oriol Canals, Guillermo Boyra, Leire Ibaibarriaga, Ryan P. Kelly, and Naiara Rodríguez-Ezpeleta. 2025. “Integrating eDNA and Acoustic-Trawl Data to Provide Small Pelagic Biomass Estimates for Fisheries Assessment.” bioRxiv. https://doi.org/10.1101/2025.06.11.659118.
Dalpaz, L., A. D. Paro, F. G. Daura-Jorge, M. Rossi-Santos, T. F. Norris, S. N. Ingram, and L. L. Wedekin. 2021. “Better Together: Analysis of Integrated Acoustic and Visual Methods When Surveying a Cetacean Community.” Marine Ecology Progress Series 678 (November): 197–209. https://doi.org/10.3354/MEPS13898.
de Valpine, Perry, Daniel Turek, Christopher J. Paciorek, Clifford Anderson-Bergman, Duncan Temple Lang, and Rastislav Bodik. 2017. “Programming With Models: Writing Statistical Algorithms for General Model Structures With NIMBLE.” Journal of Computational and Graphical Statistics 26 (2): 403–13. https://doi.org/10.1080/10618600.2016.1172487.
Engler, Robin, Antoine Guisan, and Luca Rechsteiner. 2004. “An Improved Approach for Predicting the Distribution of Rare and Endangered Species from Occurrence and Pseudo-Absence Data.” Journal of Applied Ecology 41 (2): 263–74. https://doi.org/10.1111/j.0021-8901.2004.00881.x.
Fawcett, Tom. 2006. “An Introduction to ROC Analysis.” Pattern Recognition Letters, ROC Analysis in Pattern Recognition, 27 (8): 861–74. https://doi.org/10.1016/j.patrec.2005.10.010.
Ficetola, Gentile F., Johan Pansu, Aurélie Bonin, Eric Coissac, Charline Giguet-Covex, Marta De Barba, Ludovic Gielly, et al. 2015. “Replication Levels, False Presences and the Estimation of the Presence/Absence from eDNA Metabarcoding Data.” Molecular Ecology Resources 15 (3): 543–56. https://doi.org/10.1111/1755-0998.12338.
Fiedler, Paul C., Elizabeth A. Becker, Karin A. Forney, Jay Barlow, and Jeff E. Moore. 2023. “Species Distribution Modeling of Deep-Diving Cetaceans.” Marine Mammal Science 39 (4): 1178–1203. https://doi.org/10.1111/mms.13057.
Findlay, K. P., P. B. Best, G. J. B. Ross, and V. G. Cockcroft. 1992. “The Distribution of Small Odontocete Cetaceans Off the Coasts of South Africa and Namibia.” South African Journal of Marine Science 12 (1): 237–70. https://doi.org/10.2989/02577619209504706.
Forney, Karen, and Jay Barlow. 1998. “Seasonal Patterns in the Abundance and Distribution of California Cetaceans.” Marine Mammal Science 14 (July): 460–89.
Gelman, Andrew, and Donald B. Rubin. 1992. “Inference from Iterative Simulation Using Multiple Sequences.” Statistical Science 7 (4): 457–72. https://doi.org/10.1214/ss/1177011136.
Giering, Sarah Lou Carolin, Emma Louise Cavan, Sünnje Linnéa Basedow, Nathan Briggs, Adrian B. Burd, Louise J. Darroch, Lionel Guidi, et al. 2020. “Sinking Organic Particles in the OceanFlux Estimates From in Situ Optical Devices.” Frontiers in Marine Science 6 (February). https://doi.org/10.3389/fmars.2019.00834.
Gold, Zachary, Andrew Olaf Shelton, Helen R. Casendino, Joe Duprey, Ramón Gallego, Amy Van Cise, Mary Fisher, et al. 2023. “Signal and Noise in Metabarcoding Data.” PLOS ONE 18 (5): e0285674. https://doi.org/10.1371/journal.pone.0285674.
Guélat, Jérôme, and Marc Kéry. 2018. “Effects of Spatial Autocorrelation and Imperfect Detection on Species Distribution Models.” Methods in Ecology and Evolution 9 (6): 1614–25. https://doi.org/10.1111/2041-210X.12983.
Guisan, Antoine, Olivier Broennimann, Robin Engler, Mathias Vust, Nigel G. Yoccoz, Anthony Lehmann, and Niklaus E. Zimmermann. 2006. “Using Niche-Based Models to Improve the Sampling of Rare Species.” Conservation Biology 20 (2): 501–11. https://doi.org/10.1111/j.1523-1739.2006.00354.x.
Hickey, Barbara M. 1979. “The California Current System—Hypotheses and Facts.” Progress in Oceanography 8 (4): 191–279. https://doi.org/10.1016/0079-6611(79)90002-8.
Iversen, M. H., and H. Ploug. 2010. “Ballast Minerals and the Sinking Carbon Flux in the Ocean: Carbon-Specific Respiration Rates and Sinking Velocity of Marine Snow Aggregates.” Biogeosciences 7 (9): 2613–24. https://doi.org/10.5194/bg-7-2613-2010.
Jeliazkov, Alienor, Yoni Gavish, Charles J. Marsh, Jonas Geschke, Neil Brummitt, Duccio Rocchini, Peter Haase, William E. Kunin, and Klaus Henle. 2022. “Sampling and Modelling Rare Species: Conceptual Guidelines for the Neglected Majority.” Global Change Biology 28 (12): 3754–77. https://doi.org/10.1111/gcb.16114.
Kanaji, Yu, and Tomio Miyashita. 2021. “Habitat Use and Habitat Overlap of Delphinidae Species Revealed by Mixed-Species Group Occurrence in the North Pacific Ocean.” Marine Mammal Science 37 (3): 1128–38. https://doi.org/10.1111/mms.12794.
Le Lay, Gwenaëlle, Robin Engler, Erika Franc, and Antoine Guisan. 2010. “Prospective Sampling Based on Model Ensembles Improves the Detection of Rare Species.” Ecography 33 (6): 1015–27. https://doi.org/10.1111/j.1600-0587.2010.06338.x.
Liu, Qianqian, Juan Tan, Min Wang, Ni Xin, Rui Qi, and Hui Wang. 2024. “Optimization of Pore Size and Filter Material for Better Enrichment of Environmental DNA.” Frontiers in Environmental Science 12 (June). https://doi.org/10.3389/fenvs.2024.1422269.
Lyet, Arnaud, Wilfried Thuiller, Marc Cheylan, and Aurélien Besnard. 2013. “Fine-Scale Regional Distribution Modelling of Rare and Threatened Species: Bridging GIS Tools and Conservation in Practice.” Diversity and Distributions 19 (7): 651–63. https://doi.org/10.1111/ddi.12037.
McClatchie, Sam. 2014. “Oceanography of the Southern California Current System Relevant to Fisheries.” In Regional Fisheries Oceanography of the California Current System: The CalCOFI Program, edited by Sam McClatchie, 13–60. Dordrecht: Springer Netherlands. https://doi.org/10.1007/978-94-007-7223-6_2.
McDonnell, A. M. P., P. W. Boyd, and K. O. Buesseler. 2015. “Effects of Sinking Velocities and Microbial Respiration Rates on the Attenuation of Particulate Carbon Fluxes Through the Mesopelagic Zone.” Global Biogeochemical Cycles 29 (2): 175–93. https://doi.org/10.1002/2014GB004935.
Miya, M., Y. Sato, T. Fukunaga, T. Sado, J. Y. Poulsen, K. Sato, T. Minamoto, et al. 2015. MiFish, a Set of Universal PCR Primers for Metabarcoding Environmental DNA from Fishes: Detection of More Than 230 Subtropical Marine Species.” Royal Society Open Science 2 (7): 150088. https://doi.org/10.1098/rsos.150088.
Panigada, S, M Zanardelli, S Canese, and M Jahoda. 1999. “How Deep Can Baleen Whales Dive?” Marine Ecology Progress Series 187: 309–11. https://doi.org/10.3354/meps187309.
Pikitch, Ellen K. 2018. “A Tool for Finding Rare Marine Species.” Science 360 (6394): 1180–82. https://doi.org/10.1126/science.aao3787.
Pinfield, Róisín, Eileen Dillane, Anne Kathrine W. Runge, Alice Evans, Luca Mirimin, Jonas Niemann, Thomas E. Reed, et al. 2019. “False-Negative Detections from Environmental DNA Collected in the Presence of Large Numbers of Killer Whales (Orcinus Orca).” Environmental DNA 1 (4): 316–28. https://doi.org/10.1002/edn3.32.
Queheillalt, Dianna M., James W. Cain, Denise E. Taylor, Michael L. Morrison, Stacia L. Hoover, Natasha Tuatoo-Bartley, Lourdes Rugge, et al. 2002. “The Exclusion of Rare Species from Community-Level Analyses.” Wildlife Society Bulletin (1973-2006) 30 (3): 756–59. https://www.jstor.org/stable/3784228.
R Core Team. 2016. “R: A Language and Environment for Statistical Computing.” Vienna, Austria: R Foundation for Statistical Computing.
Rankin, Shannon, Jay Barlow, Julie N. Oswald, and Lisa Ballance. 2008. “Acoustic Studies of Marine Mammals During Seven Years of Combined Visual and Acoustic Line-Transect Surveys for Cetaceans in the Eastern and Central Pacific Ocean.” NOAA-TM-NMFS-SWFSC-429.
Robinson, Chloe V., Karina Dracott, Robin D. Glover, Adam Warner, and Amy Migneault. 2024. DNA from Dives: Species Detection of Humpback Whales (Megaptera Novaeangliae) from Flukeprint eDNA.” Environmental DNA 6 (2): e524. https://doi.org/10.1002/edn3.524.
Schorr, Gregory S., Erin A. Falcone, David J. Moretti, and Russel D. Andrews. 2014. “First Long-Term Behavioral Records from Cuvier’s Beaked Whales (Ziphius Cavirostris) Reveal Record-Breaking Dives.” PLOS ONE 9 (3): e92633. https://doi.org/10.1371/journal.pone.0092633.
Shaffer, Megan R., Elizabeth Andruszkiewicz Allan, Amy M. Van Cise, Kim M. Parsons, Andrew Olaf Shelton, and Ryan P. Kelly. 2025. “Observation Bias in Metabarcoding.” Molecular Ecology Resources 25 (7): e14119. https://doi.org/10.1111/1755-0998.14119.
Shelton, Andrew Olaf, Zachary J Gold, Alexander J Jensen, Erin D\primeAgnese, Elizabeth Andruszkiewicz Allan, Amy Van Cise, Ramón Gallego, et al. 2023. “Toward Quantitative Metabarcoding.” Ecology 104 (2). https://doi.org/10.1002/ecy.3906.
Shelton, Andrew Olaf, Ana Ramón-Laca, Abigail Wells, Julia Clemons, Dezhang Chu, Blake E. Feist, Ryan P. Kelly, et al. 2022. “Environmental DNA Provides Quantitative Estimates of Pacific Hake Abundance and Distribution in the Open Ocean.” Proceedings of the Royal Society B 289 (1971). https://doi.org/10.1098/RSPB.2021.2613.
U.S. Department of the Navy. 2017. “Dive Distribution and Group Size Parameters for Marine Species Occurring in the u.s. Navy’s Atlantic and Hawaiisouthern California Training and Testing Study Areas.” Newport, Rhode Island: Naval Undersea Warfare Center Division.
Valdivia-Carrillo, Tania, Amy Van Cise, Megan Shaffer, Kim Parsons, Ally Im, Andrew Shelton, Eiren K. Jacobson, et al. 2025. “Modeling Cetacean eDNA Distribution Along the Washington Coast Using Metabarcoding from Opportunistic Samples and Generalized Additive Models.” bioRxiv. https://doi.org/10.1101/2025.11.21.689289.
Valsecchi, Elena, Jonas Bylemans, Simon, J Goodman, Roberto Lombardi, Ian Carr, Laura Castellano, Andrea Galimberti, and Paolo Galli. 2020. “Novel Universal Primers for Metabarcoding Environmental DNA Surveys of Marine Mammals and Other Marine Vertebrates.” https://doi.org/10.1002/edn3.72.
Watwood, Stephanie L, and Danielle M Buonantony. 2012. “Dive Distribution and Group Size Parameters for Marine Species Occurring in Navy Training and Testing Areas in the North Atlantic and North Pacific Oceans. NUWC-NPT Technical Document 12,085.” Newport, Rhode Island: Naval Undersea Warfare Center Division.
White, Hannah J., Caroline M. McKeon, Robin J. Pakeman, and Yvonne M. Buckley. 2023. “The Contribution of Geographically Common and Rare Species to the Spatial Distribution of Biodiversity.” Global Ecology and Biogeography 32 (10): 1730–47. https://doi.org/10.1111/geb.13734.
Wood, S N. 2017. Generalized Additive Models: An Introduction with R. Chapman and Hall/CRC.
Wood, Susanna A., Laura Biessy, Janie L. Latchford, Anastasija Zaiko, Ulla von Ammon, François Audrezet, Melania E. Cristescu, and Xavier Pochon. 2020. “Release and Degradation of Environmental DNA and RNA in a Marine System.” Science of The Total Environment 704 (February): 135314. https://doi.org/10.1016/j.scitotenv.2019.135314.
Xiong, Jilian, Parker MacCready, Elizabeth Brasseale, Elizabeth Andruszkiewicz Allan, Ana Ramón-Laca, Kim M. Parsons, Megan Shaffer, and Ryan P. Kelly. 2025. “Advective Transport Drives Environmental DNA Dispersal in an Estuary.” Environmental Science & Technology 59 (15): 7506–16. https://doi.org/10.1021/acs.est.5c01286.
Yamahara, Kevan M., Elizabeth A. Allan, Julie Robidart, William H. Wilson, James M. Birch, Pascal Craw, Ethan Edson, et al. 2025. “A State-Of-The-Art Review of Aquatic eDNA Sampling Technologies and Instrumentation: Advancements, Challenges, and Future Prospects.” Environmental DNA 7 (4): e70170. https://doi.org/10.1002/edn3.70170.