Sample points from system detection efficiency surfaces

Generates random points from the cumulative probability surface produced by a system detection efficiency calculation. By default, points are sampled with probability proportional to the cumulative detection probability across the receiver array; set uniform = TRUE to sample all eligible cells with equal probability.

Usage

sample_points_from_system_de(
  system_de,
  n_points = 100,
  prob_column = "cumulative_prob",
  fish_select = NULL,
  time_select = NULL,
  position_points = NULL,
  min_prob_threshold = 0.001,
  max_prob_threshold = 1,
  seed = NULL,
  by_group = TRUE,
  uniform = FALSE,
  crs = NULL
)

Arguments

system_de

A list returned by the system detection efficiency calculation, containing a data tibble with cumulative_prob values.

n_points

Integer. Number of points to sample per fish-time combination (if fish_id and time columns exist), or in total otherwise. Default is 100.

prob_column

Character. Name of the probability column to use for sampling. Default is "cumulative_prob".

fish_select

Integer, character vector, or NULL. Fish ID(s) to sample from. Only used if fish_id column exists in the data. If NULL, samples from all fish. Default is NULL.

time_select

Numeric vector, character vector, POSIXct vector, or NULL. Time period(s) to sample from. Only used if time-related columns exist. If NULL, samples from all time periods. Default is NULL.

position_points

An sf object (e.g., from sample_points_from_probabilities) containing fish_id and time_period information to use as a template. When provided, this overrides fish_select and time_select parameters, sampling the same fish-time combinations present in position_points. Default is NULL.

min_prob_threshold

Numeric. Minimum probability threshold (0-1). Only cells with probability above this threshold are eligible for sampling. Default is 0.001 to exclude zero-probability cells.

max_prob_threshold

Numeric. Maximum probability threshold (0-1). Only cells with probability below this threshold are eligible for sampling. Default is 1.0 (no upper limit).

seed

Integer. Random seed for reproducible sampling. Default is NULL.

by_group

Logical. If TRUE and fish/time columns exist, samples n_points for each fish-time combination. If FALSE or no grouping columns exist, samples n_points total. Default is TRUE.

uniform

Logical. If TRUE, samples uniformly from all eligible cells. If FALSE, samples with probability proportional to prob_column values. Default is FALSE (probability-weighted sampling).

crs

Coordinate reference system for the output sf object. Can be:

NULL (default) - attempts to use the CRS from system_de$crs and falls back to WGS84 if none is available
Numeric EPSG code (e.g., 4326 for WGS84, 32618 for UTM Zone 18N)
Character proj4 string
An sf/sfc object from which to extract CRS

Value

An sf object containing the sampled points with columns:

x: X coordinates
y: Y coordinates
probability: The cumulative probability value used for sampling
sample_id: Sequential sample identifier
fish_id: Fish identifier (if present in input data)
time_period: Time period identifier (if present in input data)
time_period_posix: POSIXct datetime (if present in input data)
time_period_label: Human-readable time label (if present in input data)
group_id: Unique identifier for each fish-time combination (if applicable)
geometry: sf point geometry

Details

This function performs weighted random sampling where each spatial cell has a probability of being selected proportional to its cumulative detection probability value. This is useful for:

Simulating animal release locations based on detection coverage
Monte Carlo analysis of array performance
Generating test positions weighted by detection probability
Array design optimization studies
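
The weighted sampling described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the cells data frame is an invented stand-in for the data tibble in system_de, and both the inclusive threshold bounds and sampling with replacement are assumptions.

```r
# Hypothetical stand-in for the data tibble inside system_de:
# a 10 x 10 grid of cell centres with invented cumulative_prob values.
set.seed(123)
cells <- expand.grid(x = 1:10, y = 1:10)
cells$cumulative_prob <- runif(nrow(cells))

n_points <- 500
min_prob_threshold <- 0.001
max_prob_threshold <- 1
uniform <- FALSE

# Keep only cells inside the threshold window (inclusive bounds assumed).
eligible <- cells[cells$cumulative_prob >= min_prob_threshold &
                    cells$cumulative_prob <= max_prob_threshold, ]
stopifnot(nrow(eligible) > 0)

# Weight each cell by its probability, or use equal weights when uniform.
weights <- if (uniform) NULL else eligible$cumulative_prob

# Draw cell indices with replacement (an assumption of this sketch).
idx <- sample.int(nrow(eligible), size = n_points, replace = TRUE,
                  prob = weights)

sampled <- eligible[idx, c("x", "y", "cumulative_prob")]
names(sampled)[3] <- "probability"
sampled$sample_id <- seq_len(n_points)
rownames(sampled) <- NULL
```

With uniform set to TRUE, the prob argument is NULL and every eligible cell is equally likely to be drawn.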

The function automatically detects whether fish_id and time-related columns exist in the input data and handles grouping accordingly. If these columns are not present, it samples n_points in total across all eligible cells, without grouping.
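
The grouping behaviour can be sketched the same way. The example below is a hedged illustration in base R, assuming an invented cells data frame that already carries fish_id and time_period columns; the group_id format shown here (fish and time joined by a dot) is an artefact of split() and is not taken from the package.

```r
# Hypothetical input with grouping columns (all values invented).
set.seed(42)
cells <- expand.grid(x = 1:5, y = 1:5,
                     fish_id = c(1, 2),
                     time_period = c("t1", "t2"),
                     stringsAsFactors = FALSE)
cells$cumulative_prob <- runif(nrow(cells), min = 0.01, max = 1)

n_points <- 50

# One data frame per fish-time combination.
groups <- split(cells, list(cells$fish_id, cells$time_period), drop = TRUE)

# Draw n_points weighted samples within each group, then stack the results.
sampled <- do.call(rbind, lapply(names(groups), function(g) {
  grp <- groups[[g]]
  idx <- sample.int(nrow(grp), size = n_points, replace = TRUE,
                    prob = grp$cumulative_prob)
  out <- grp[idx, ]
  out$group_id <- g
  out
}))
rownames(sampled) <- NULL
sampled$sample_id <- seq_len(nrow(sampled))
```

Each of the four fish-time combinations contributes n_points rows, which mirrors what by_group = TRUE is documented to do; with by_group = FALSE the same draw would be made once over all eligible cells.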

Examples

if (FALSE) {
# Sample from system detection efficiency
sampled_points <- sample_points_from_system_de(
  system_DE,
  n_points = 500,
  seed = 123
)

# Focus on moderate detection areas
moderate_samples <- sample_points_from_system_de(
  system_DE,
  n_points = 200,
  min_prob_threshold = 0.3,
  max_prob_threshold = 0.7
)

# Sample with specific CRS (UTM Zone 18N)
utm_samples <- sample_points_from_system_de(
  system_DE,
  n_points = 300,
  crs = 32618  # EPSG code for UTM Zone 18N
)

# Uniform sampling (ignore probability values)
uniform_samples <- sample_points_from_system_de(
  system_DE,
  n_points = 500,
  uniform = TRUE  # All eligible cells have equal probability
)

# Plot sampled points
library(ggplot2)
ggplot() +
  geom_sf(data = sampled_points, aes(color = probability), alpha = 0.5) +
  scale_color_viridis_c(name = "Cumulative\nProbability") +
  theme_minimal()

# If fish_id and time data exist in future versions:
# multi_samples <- sample_points_from_system_de(
#   system_DE,
#   n_points = 50,
#   fish_select = c(1, 2, 3),
#   time_select = c("2025-07-15", "2025-07-16"),
#   by_group = TRUE
# )

# Use position_points as a template for fish-time combinations
# First get position points from positioning results
position_points <- sample_points_from_probabilities(
  positioning_results,
  n_points = 100,
  fish_select = c(1, 2),
  time_select = c("2025-07-15", "2025-07-16")
)

# Then sample from system_de using same fish-time combinations
matched_de_samples <- sample_points_from_system_de(
  system_DE,
  n_points = 100,  # Same number per group as position_points
  position_points = position_points  # Use as template
)
}
