
Sample points from system detection efficiency surfaces
Source: R/sample_points_from_system_de.R
sample_points_from_system_de.Rd
Generates random points based on cumulative probability surfaces from system detection efficiency calculations. Points are sampled with probability proportional to the cumulative detection probability across the receiver array.
Usage
sample_points_from_system_de(
  system_de,
  n_points = 100,
  prob_column = "cumulative_prob",
  fish_select = NULL,
  time_select = NULL,
  position_points = NULL,
  min_prob_threshold = 0.001,
  max_prob_threshold = 1,
  seed = NULL,
  by_group = TRUE,
  uniform = FALSE,
  crs = NULL
)
Arguments
- system_de
A list returned by the system detection efficiency calculation, containing a data tibble with cumulative_prob values.
- n_points
Integer. Number of points to sample per fish-time combination (if fish_id and time columns exist), or in total otherwise. Default is 100.
- prob_column
Character. Name of the probability column to use for sampling. Default is "cumulative_prob".
- fish_select
Integer, character vector, or NULL. Fish ID(s) to sample from. Only used if fish_id column exists in the data. If NULL, samples from all fish. Default is NULL.
- time_select
Numeric vector, character vector, POSIXct vector, or NULL. Time period(s) to sample from. Only used if time-related columns exist. If NULL, samples from all time periods. Default is NULL.
- position_points
An sf object (e.g., from sample_points_from_probabilities) containing fish_id and time_period information to use as a template. When provided, this overrides fish_select and time_select parameters, sampling the same fish-time combinations present in position_points. Default is NULL.
- min_prob_threshold
Numeric. Minimum probability threshold (0-1). Only cells with probability above this threshold are eligible for sampling. Default is 0.001 to exclude zero-probability cells.
- max_prob_threshold
Numeric. Maximum probability threshold (0-1). Only cells with probability below this threshold are eligible for sampling. Default is 1.0 (no upper limit).
- seed
Integer. Random seed for reproducible sampling. Default is NULL.
- by_group
Logical. If TRUE and fish/time columns exist, samples n_points for each fish-time combination. If FALSE or no grouping columns exist, samples n_points total. Default is TRUE.
- uniform
Logical. If TRUE, samples uniformly from all eligible cells. If FALSE, samples with probability proportional to prob_column values. Default is FALSE (probability-weighted sampling).
- crs
Coordinate reference system for the output sf object. Can be:
  - NULL (default): attempts to use the CRS from system_de$crs and falls back to WGS84
  - Numeric EPSG code (e.g., 4326 for WGS84, 32618 for UTM Zone 18N)
  - Character proj4 string
  - An sf/sfc object from which to extract the CRS
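The fallback order described for crs can be sketched with sf::st_crs(), which accepts EPSG codes, proj4 strings, and sf/sfc objects. The helper name resolve_crs is hypothetical and is not part of the package; it only illustrates the documented behaviour under the assumption that system_de$crs, when present, holds a value st_crs() understands.

```r
library(sf)

# Hypothetical helper, not the package implementation.
# Resolves the output CRS in the documented order:
# explicit crs argument, then system_de$crs, then WGS84 (EPSG:4326).
resolve_crs <- function(crs = NULL, system_de = list()) {
  if (is.null(crs)) {
    crs <- if (!is.null(system_de$crs)) system_de$crs else 4326
  }
  # st_crs() handles numeric EPSG codes, character strings, and sf/sfc objects
  sf::st_crs(crs)
}

wgs84_fallback <- resolve_crs()                          # nothing supplied
utm_explicit   <- resolve_crs(32618)                     # explicit EPSG code
from_system_de <- resolve_crs(NULL, list(crs = 32618))   # taken from system_de$crs
```

Keeping the resolution in one place means the sampled points always carry a defined CRS, even when neither the caller nor the input list supplies one.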
Value
An sf object containing the sampled points with columns:
- x
X coordinates
- y
Y coordinates
- probability
The cumulative probability value used for sampling
- sample_id
Sequential sample identifier
- fish_id
Fish identifier (if present in input data)
- time_period
Time period identifier (if present in input data)
- time_period_posix
POSIXct datetime (if present in input data)
- time_period_label
Human-readable time label (if present in input data)
- group_id
Unique identifier for each fish-time combination (if applicable)
- geometry
sf point geometry
Details
This function performs weighted random sampling where each spatial cell has a probability of being selected proportional to its cumulative detection probability value. This is useful for:
- Simulating animal release locations based on detection coverage
- Monte Carlo analysis of array performance
- Generating test positions weighted by detection probability
- Array design optimization studies
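The weighted draw described above can be sketched in base R. This is a minimal illustration, not the package's code: the cells data frame and the sample_cells helper are hypothetical, the thresholds are assumed to be inclusive bounds, and sampling is assumed to be with replacement so that n_points may exceed the number of eligible cells.

```r
# Hypothetical stand-in for the data tibble inside system_de
cells <- data.frame(
  x = c(0, 1, 2, 3),
  y = c(0, 0, 1, 1),
  cumulative_prob = c(0, 0.2, 0.5, 0.9)
)

# Illustrative helper, not the package implementation
sample_cells <- function(cells, n_points, prob_column = "cumulative_prob",
                         min_prob_threshold = 0.001, max_prob_threshold = 1,
                         uniform = FALSE, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  p <- cells[[prob_column]]
  # Assumption: both thresholds are treated as inclusive bounds
  eligible <- which(p >= min_prob_threshold & p <= max_prob_threshold)
  if (length(eligible) == 0) stop("No cells fall within the thresholds")
  # NULL weights give equal probability to every eligible cell
  weights <- if (uniform) NULL else p[eligible]
  # Draw positions within `eligible`, then map back to row indices
  pick <- sample.int(length(eligible), n_points, replace = TRUE, prob = weights)
  idx <- eligible[pick]
  out <- cells[idx, c("x", "y")]
  out$probability <- p[idx]
  out$sample_id <- seq_len(n_points)
  rownames(out) <- NULL
  out
}

pts <- sample_cells(cells, n_points = 1000, seed = 123)
moderate <- sample_cells(cells, n_points = 10,
                         min_prob_threshold = 0.3, max_prob_threshold = 0.7)
```

In this sketch the zero-probability cell is never drawn because it falls below the default minimum threshold, and the moderate call can only return the single cell whose value lies between 0.3 and 0.7.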
The function automatically detects whether fish_id and time-related columns exist in the input data and handles grouping accordingly. If these columns are not present, it samples n_points in total across all eligible cells.
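The grouping behaviour can be sketched as follows. The grid data frame and the sample_by_group helper are hypothetical and only illustrate the idea: check which grouping columns are present, draw n_points within each fish-time combination when by_group is TRUE, and otherwise draw n_points from the pooled cells. Sampling with replacement is an assumption.

```r
# Hypothetical grouped input: two fish, two time periods, three cells each
grid <- expand.grid(
  cell = 1:3,
  fish_id = c(1, 2),
  time_period = c("2025-07-15", "2025-07-16"),
  stringsAsFactors = FALSE
)
grid$cumulative_prob <- rep(c(0.1, 0.3, 0.6), times = 4)

# Illustrative helper, not the package implementation
sample_by_group <- function(data, n_points, by_group = TRUE,
                            prob_column = "cumulative_prob") {
  # Detect which grouping columns actually exist in the input
  group_cols <- intersect(c("fish_id", "time_period"), names(data))
  # Weighted draw of n_points rows from one block of cells
  draw <- function(d) {
    rows <- sample.int(nrow(d), n_points, replace = TRUE, prob = d[[prob_column]])
    d[rows, , drop = FALSE]
  }
  if (by_group && length(group_cols) > 0) {
    # One block per fish-time combination present in the data
    pieces <- split(data, data[group_cols], drop = TRUE)
    out <- do.call(rbind, lapply(seq_along(pieces), function(i) {
      s <- draw(pieces[[i]])
      s$group_id <- i
      s
    }))
  } else {
    out <- draw(data)
  }
  rownames(out) <- NULL
  out
}

set.seed(1)
grouped <- sample_by_group(grid, n_points = 5)                    # 5 per combination
pooled  <- sample_by_group(grid, n_points = 5, by_group = FALSE)  # 5 in total
```

With four fish-time combinations, the grouped call returns 20 rows and the pooled call returns 5, which mirrors the difference between by_group = TRUE and by_group = FALSE.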
Examples
if (FALSE) {
  # Sample from system detection efficiency
  sampled_points <- sample_points_from_system_de(
    system_DE,
    n_points = 500,
    seed = 123
  )

  # Focus on moderate detection areas
  moderate_samples <- sample_points_from_system_de(
    system_DE,
    n_points = 200,
    min_prob_threshold = 0.3,
    max_prob_threshold = 0.7
  )

  # Sample with specific CRS (UTM Zone 18N)
  utm_samples <- sample_points_from_system_de(
    system_DE,
    n_points = 300,
    crs = 32618 # EPSG code for UTM Zone 18N
  )

  # Uniform sampling (ignore probability values)
  uniform_samples <- sample_points_from_system_de(
    system_DE,
    n_points = 500,
    uniform = TRUE # All eligible cells have equal probability
  )

  # Plot sampled points
  library(ggplot2)
  ggplot() +
    geom_sf(data = sampled_points, aes(color = probability), alpha = 0.5) +
    scale_color_viridis_c(name = "Cumulative\nProbability") +
    theme_minimal()

  # If fish_id and time data exist in future versions:
  # multi_samples <- sample_points_from_system_de(
  #   system_DE,
  #   n_points = 50,
  #   fish_select = c(1, 2, 3),
  #   time_select = c("2025-07-15", "2025-07-16"),
  #   by_group = TRUE
  # )

  # Use position_points as a template for fish-time combinations
  # First get position points from positioning results
  position_points <- sample_points_from_probabilities(
    positioning_results,
    n_points = 100,
    fish_select = c(1, 2),
    time_select = c("2025-07-15", "2025-07-16")
  )

  # Then sample from system_de using same fish-time combinations
  matched_de_samples <- sample_points_from_system_de(
    system_DE,
    n_points = 100, # Same number per group as position_points
    position_points = position_points # Use as template
  )
}