Estimates fish positions by combining detection and non-detection data from acoustic telemetry arrays. The function aggregates detection events into time bins, models detection efficiency, and calculates weighted position probabilities.

Usage

calculate_fish_positions(
  station_detections,
  station_distances_df,
  receiver_stations,
  de_model = NULL,
  time_aggregation = "seconds",
  bin_size_seconds = 3600,
  detection_weight = 0.5,
  non_detection_weight = 1,
  integration_method = "subtractive",
  max_non_detection_distance = 2000,
  weighting_method = "information_theoretic",
  percentile_cutoff = 0.95,
  temporal_grouping = "day",
  dampening_factor = 1,
  normalization_method = "min_max",
  fish_id_col = "path_id",
  time_col = "time_seconds",
  station_col = "station_id",
  station_info = NULL,
  temporal_info = NULL,
  crs = NULL,
  include_barriers = FALSE,
  scale_non_detections = TRUE,
  verbose = TRUE
)

Arguments

station_detections

A data frame containing detection data with fish tracks and detection events at receiver stations.

station_distances_df

A data frame containing pre-calculated distances between receiver stations and spatial grid cells, typically from calculate_station_distances.

receiver_stations

An sf object containing receiver station locations and metadata, typically from point generation functions. This parameter is optional when station_info is provided, as receiver_stations will be automatically created from station_info coordinates.

de_model

A fitted detection efficiency model object (e.g., from create_logistic_curve_depth). The model should accept dist_m and depth_m as predictors. Default is NULL, which requires a DE_pred column to already exist in station_distances_df.

time_aggregation

Character. Method for time aggregation. Options are:

  • "seconds" - Numeric seconds with bin_size_seconds (default)

  • "hour" - Hourly aggregation using POSIX datetime

  • "day" - Daily aggregation using POSIX datetime

  • "month" - Monthly aggregation using POSIX datetime

Default is "seconds".

bin_size_seconds

Numeric. Time bin size in seconds for aggregating detections when time_aggregation = "seconds". Default is 3600 (1 hour).

detection_weight

Numeric. Weight given to detection events in the integrated probability calculation (0-1). Default is 0.5.

non_detection_weight

Numeric. Weight given to non-detection events in the integrated probability calculation (0-1). Default is 1.

integration_method

Character. Method for integrating detection and non-detection evidence. Options are:

  • "subtractive" (default): Detection field is the base; non-detection evidence subtracts from it: det - (nondet * non_detection_weight), clamped to 0. Produces tight, detection-anchored position estimates.

  • "multiplicative": Detection field scaled down by non-detection evidence: det * (1 - nondet * non_detection_weight). Smoother penalty than subtractive; stays non-negative naturally.

  • "additive": Original WADE formula: weighted sum of detection and inverted non-detection probabilities. Can inflate spatial footprint beyond detection zones.

For "subtractive" and "multiplicative", detection_weight is ignored (detection is always the base); only non_detection_weight controls the strength of non-detection penalty.
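The three formulas can be compared side by side on a few cells. The following is a minimal base R sketch, not the package's internal code: the helper name integrate_evidence is hypothetical, det and nondet are assumed to be probabilities in [0, 1] for the same cells, and the "additive" branch is one assumed reading of "weighted sum of detection and inverted non-detection probabilities".

```r
# Hypothetical helper illustrating the three integration formulas.
# det and nondet are assumed to be probabilities in [0, 1] for the same cells.
integrate_evidence <- function(det, nondet,
                               method = c("subtractive", "multiplicative", "additive"),
                               detection_weight = 0.5,
                               non_detection_weight = 1) {
  method <- match.arg(method)
  switch(method,
    # Detection is the base; the penalty is subtracted and clamped at 0.
    subtractive = pmax(det - nondet * non_detection_weight, 0),
    # Detection is scaled down by the non-detection evidence.
    multiplicative = det * (1 - nondet * non_detection_weight),
    # Assumed reading of the additive (WADE) weighted sum.
    additive = detection_weight * det + non_detection_weight * (1 - nondet)
  )
}

det <- c(0.9, 0.5, 0.2)
nondet <- c(0.1, 0.5, 0.6)

sub_out <- integrate_evidence(det, nondet, "subtractive")
mul_out <- integrate_evidence(det, nondet, "multiplicative")
add_out <- integrate_evidence(det, nondet, "additive",
                              detection_weight = 0.5,
                              non_detection_weight = 0.5)
```

In this toy input the third cell has weak detection evidence and strong non-detection evidence, so the subtractive result is clamped to 0 while the multiplicative result stays small but positive.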

max_non_detection_distance

Numeric. Maximum distance (in meters) from detecting stations to consider non-detecting stations. Set to NULL to include all stations. Default is 2000.

normalization_method

Character. Method for normalizing detection efficiency values. Options are "min_max", "z_score", or "robust". Default is "min_max".
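To give a feel for how the three options differ, here is a small base R sketch. It is not the package's implementation: the helper name normalize_de is hypothetical, and the "robust" branch assumes median centring with IQR scaling, which may differ from what the function does internally.

```r
# Hypothetical helper showing three common normalization schemes.
normalize_de <- function(x, method = c("min_max", "z_score", "robust")) {
  method <- match.arg(method)
  switch(method,
    # Rescale to [0, 1]; undefined (NaN) when all values are equal.
    min_max = (x - min(x)) / (max(x) - min(x)),
    # Centre on the mean and scale by the standard deviation.
    z_score = (x - mean(x)) / sd(x),
    # Assumption: centre on the median and scale by the interquartile range.
    robust = (x - median(x)) / IQR(x)
  )
}

de <- c(0.1, 0.2, 0.4, 0.9)

de_min_max <- normalize_de(de, "min_max")
de_z_score <- normalize_de(de, "z_score")
de_robust <- normalize_de(de, "robust")
```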

fish_id_col

Character. Name of the column containing fish identifiers. Default is "path_id".

time_col

Character. Name of the column containing time values. Can be numeric seconds (for time_aggregation = "seconds") or POSIX datetime (for time_aggregation = "hour"/"day"/"month"). Default is "time_seconds".

station_col

Character. Name of the column containing station identifiers. Default is "station_id".

station_info

Station information as a data frame, CSV file path, or sf object. Must contain a station identifier (station_id or point_id) and coordinates (x, y). If an sf object is supplied (e.g., from generate_exact_regular_points), coordinates are extracted automatically. Optional temporal columns start_date and end_date define deployment windows. When station_info is provided, receiver_stations is created automatically. Default is NULL.

temporal_info

Data frame with daily environmental conditions for temporal DE prediction. Must contain date column and environmental predictors used by de_model. Only used when station_info contains temporal columns and de_model is provided. Default is NULL.

crs

Character. Coordinate reference system for station_info coordinates. If NULL (default), the function attempts to auto-detect it from station_distances_df coordinate ranges and otherwise falls back to "EPSG:32617". Use the same CRS as your depth_raster.

include_barriers

Logical. Whether to apply barrier masking so that position estimates cannot pass through land obstacles. Default is FALSE. When TRUE, detection efficiency is set to 0 for any cell-receiver pair where the line of sight crosses a barrier. Requires a crosses_barrier column in station_distances_df (generated by calculate_station_distances() with the barrier_raster parameter).

scale_non_detections

Logical. Whether to scale non-detection evidence by total detections per fish/time period. Default is TRUE. When TRUE, non-detection fields are proportionally stronger in periods with many detections (more missed transmissions expected), balancing the detection-weighted mean that favours high-count stations.

verbose

Logical. Whether to print progress messages. Default is TRUE.

Value

A list containing:

position_probabilities

Data frame with integrated position probabilities for each fish, time period, and spatial cell. The integrated_prob column is rescaled to [0, 1] per fish/time period.

detection_data

Data frame with processed detection probability data

non_detection_data

Data frame with processed non-detection probability data

station_detections_binned

Data frame with time-binned detection events and station coordinates

station_coordinates

Data frame with station coordinate information

summary

List with summary statistics of the positioning analysis

Details

The function implements a multi-step positioning algorithm:

  1. Time binning of detection events (standardized to time_period)

  2. Aggregation of detection data with spatial distance information

  3. Creation of non-detection events for nearby stations

  4. Normalization of detection efficiency across receivers

  5. Integration of detection and non-detection probabilities

  6. Calculation of weighted position estimates
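Steps 1 and 2 can be pictured with a small base R sketch for the numeric "seconds" case. This is an illustration under stated assumptions rather than the package's code: the helper name bin_seconds, the n_detections column, and the sample data are hypothetical, while path_id, station_id, time_seconds, and time_period follow the names used on this page.

```r
# Hypothetical helper: floor numeric seconds to the start of each bin.
bin_seconds <- function(time_seconds, bin_size_seconds = 3600) {
  floor(time_seconds / bin_size_seconds) * bin_size_seconds
}

# Toy detection records for two fish at two stations.
detections <- data.frame(
  path_id = c(1, 1, 1, 2),
  station_id = c("A", "A", "B", "A"),
  time_seconds = c(10, 3599, 3600, 7300)
)

# Step 1: assign each detection to a time bin.
detections$time_period <- bin_seconds(detections$time_seconds)

# Step 2: count detections per fish, station, and time bin.
detections$n_detections <- 1
binned <- aggregate(
  n_detections ~ path_id + station_id + time_period,
  data = detections,
  FUN = sum
)
```

With a 3600-second bin, the detections at 10 s and 3599 s fall in the same bin, while the one at 3600 s starts the next bin.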

For additive integration, detection and non-detection weights must sum to 1. For subtractive and multiplicative integration, only non_detection_weight controls the penalty strength (0-1); detection_weight is ignored. The algorithm focuses non-detection analysis on stations within a specified distance of detecting stations to maintain biological realism and computational efficiency.
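The distance restriction on non-detecting stations can be sketched as follows. This is a simplified base R illustration that assumes straight-line distances in a projected CRS with metre units; the station layout and the variable names detecting, nearest, and candidates are hypothetical, and the package may compute distances differently.

```r
# Hypothetical station coordinates in a projected CRS (metres).
stations <- data.frame(
  station_id = c("A", "B", "C", "D"),
  x = c(0, 1000, 2500, 6000),
  y = c(0, 0, 0, 0)
)
detecting <- c("A")  # stations that detected the fish in this time bin
max_non_detection_distance <- 2000

# Pairwise Euclidean distances between stations.
d <- as.matrix(dist(stations[, c("x", "y")]))
dimnames(d) <- list(stations$station_id, stations$station_id)

# Distance from each station to its nearest detecting station.
nearest <- apply(d[, detecting, drop = FALSE], 1, min)

# Non-detecting stations close enough to contribute non-detection evidence.
within_range <- names(nearest)[nearest <= max_non_detection_distance]
candidates <- setdiff(within_range, detecting)
```

Here only station B (1000 m from A) qualifies; stations C and D lie beyond the 2000 m cutoff and are ignored for this time bin.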

Barrier Masking: When include_barriers = TRUE, the function masks position estimates at cells where the direct path to any receiver crosses a land barrier. This prevents physically impossible position estimates through islands, shorelines, or other obstacles. The barrier information must be pre-computed in the station_distances_df data frame using calculate_station_distances() with a barrier raster. Barrier masking works with both static DE mode (pre-computed DE_pred) and temporal DE mode (on-the-fly DE calculation using de_model).
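The effect of masking on pre-computed detection efficiency can be illustrated with a toy table. This is a minimal base R sketch, not the function's internal code: DE_pred and crosses_barrier are the column names described above, while cell_id and the values are hypothetical.

```r
# Toy distance table: two grid cells, two receivers.
station_distances_df <- data.frame(
  cell_id = c(1, 1, 2, 2),
  station_id = c("A", "B", "A", "B"),
  DE_pred = c(0.8, 0.3, 0.6, 0.4),
  crosses_barrier = c(FALSE, TRUE, FALSE, FALSE)
)

# Set detection efficiency to 0 wherever the line of sight crosses a barrier.
masked <- station_distances_df
masked$DE_pred[masked$crosses_barrier] <- 0
```

Only the cell 1 / station B pair is affected; all other cell-receiver pairs keep their original DE_pred.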

Time Aggregation Options:

  • "seconds": Uses numeric time with bin_size_seconds (backward compatible)

  • "hour": Groups POSIX datetime to hourly bins (e.g., 2024-07-01 15:00:00)

  • "day": Groups POSIX datetime to daily bins (e.g., 2024-07-01)

  • "month": Groups POSIX datetime to monthly bins (e.g., 2024-07-01)
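The POSIX options above correspond to flooring each timestamp to the start of its hour, day, or month. The base R sketch below shows one way to produce such bins; it is an assumption about the bin representation (in particular that monthly bins are labelled by the first day of the month), not a description of the package's internals, and the timestamps are made up.

```r
# Toy detection timestamps in UTC.
datetime <- as.POSIXct(
  c("2024-07-01 15:42:10", "2024-07-01 23:05:00", "2024-07-02 00:10:00"),
  tz = "UTC"
)

# Hourly bins: drop minutes and seconds.
hour_bin <- as.POSIXct(trunc(datetime, units = "hours"))

# Daily bins: keep only the calendar date.
day_bin <- as.Date(datetime, tz = "UTC")

# Monthly bins: label each timestamp with the first day of its month.
month_bin <- as.Date(format(datetime, "%Y-%m-01"))
```

Setting tz explicitly matters here: a detection shortly after midnight can fall on a different calendar day depending on the time zone used for truncation.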

Examples

if (FALSE) {
# Basic positioning analysis (backward compatible)
results <- calculate_fish_positions(
  station_detections = fish_tracks$detections,
  station_distances_df = distances,
  receiver_stations = stations,
  bin_size_seconds = 3600
)

# POSIX datetime aggregation by day
results <- calculate_fish_positions(
  station_detections = fish_tracks$detections,
  station_distances_df = distances,
  receiver_stations = stations,
  time_col = "datetime",
  time_aggregation = "day"
)

# POSIX datetime aggregation by hour
results <- calculate_fish_positions(
  station_detections = fish_tracks$detections,
  station_distances_df = distances,
  receiver_stations = stations,
  time_col = "datetime",
  time_aggregation = "hour"
)
}