Estimates fish positions by combining detection and non-detection data from acoustic telemetry arrays. The function aggregates detection events into time bins, models detection efficiency, and calculates weighted position probabilities.

Usage

calculate_fish_positions(
  station_detections,
  station_distances_df,
  receiver_stations,
  de_model = NULL,
  time_aggregation = "seconds",
  bin_size_seconds = 3600,
  detection_weight = 0.5,
  non_detection_weight = 0.5,
  max_non_detection_distance = 2000,
  weighting_method = "information_theoretic",
  percentile_cutoff = 0.95,
  temporal_grouping = "day",
  dampening_factor = 1,
  normalization_method = "min_max",
  fish_id_col = "path_id",
  time_col = "time_seconds",
  station_col = "station_id",
  station_info = NULL,
  temporal_info = NULL,
  crs = NULL,
  verbose = TRUE
)

Arguments

station_detections

A data frame containing detection data with fish tracks and detection events at receiver stations.

station_distances_df

A data frame containing pre-calculated distances between receiver stations and spatial grid cells, typically from calculate_station_distances.

receiver_stations

An sf object containing receiver station locations and metadata, typically from point generation functions. This argument is optional when station_info is provided, because receiver_stations is then created automatically from the station_info coordinates.

de_model

A fitted detection efficiency model object (e.g., from create_logistic_curve_depth). The model should accept dist_m and depth_m as predictors. Default is NULL, in which case a DE_pred column must already exist in station_distances_df.

time_aggregation

Character. Method for time aggregation. Options are:

  • "seconds" - Numeric seconds with bin_size_seconds (default)

  • "hour" - Hourly aggregation using POSIX datetime

  • "day" - Daily aggregation using POSIX datetime

  • "month" - Monthly aggregation using POSIX datetime

Default is "seconds".

bin_size_seconds

Numeric. Time bin size in seconds for aggregating detections when time_aggregation = "seconds". Default is 3600 (1 hour).

detection_weight

Numeric. Weight given to detection events in the integrated probability calculation (0-1). Default is 0.5.

non_detection_weight

Numeric. Weight given to non-detection events in the integrated probability calculation (0-1). Default is 0.5.

max_non_detection_distance

Numeric. Maximum distance (in meters) from detecting stations to consider non-detecting stations. Set to NULL to include all stations. Default is 2000.

normalization_method

Character. Method for normalizing detection efficiency values. Options are "min_max", "z_score", or "robust". Default is "min_max".

fish_id_col

Character. Name of the column containing fish identifiers. Default is "path_id".

time_col

Character. Name of the column containing time values. Can be numeric seconds (for time_aggregation = "seconds") or POSIX datetime (for time_aggregation = "hour"/"day"/"month"). Default is "time_seconds".

station_col

Character. Name of the column containing station identifiers. Default is "station_id".

station_info

Station information as a data frame, CSV file path, or sf object. Must contain a station identifier (station_id or point_id) and coordinates (x, y). If an sf object is supplied (e.g., from generate_exact_regular_points), coordinates are extracted automatically. Optional temporal columns: start_date and end_date for deployment windows. When provided, receiver_stations is created automatically. Default is NULL.

temporal_info

Data frame with daily environmental conditions for temporal DE prediction. Must contain a date column and the environmental predictors used by de_model. Only used when station_info contains temporal columns and de_model is provided. Default is NULL.

crs

Character. Coordinate reference system for station_info coordinates. If NULL (default), the function attempts to auto-detect the CRS from the coordinate ranges in station_distances_df and otherwise falls back to "EPSG:32617". Use the same CRS as your depth_raster.

verbose

Logical. Whether to print progress messages. Default is TRUE.

Value

A list containing:

position_probabilities

Data frame with integrated position probabilities for each fish, time period, and spatial cell

detection_data

Data frame with processed detection probability data

non_detection_data

Data frame with processed non-detection probability data

station_detections_binned

Data frame with time-binned detection events and station coordinates

station_coordinates

Data frame with station coordinate information

summary

List with summary statistics of the positioning analysis

Details

The function implements a multi-step positioning algorithm:

  1. Time binning of detection events (standardized to time_period)

  2. Aggregation of detection data with spatial distance information

  3. Creation of non-detection events for nearby stations

  4. Normalization of detection efficiency across receivers

  5. Integration of detection and non-detection probabilities

  6. Calculation of weighted position estimates
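
Step 4 can be illustrated with a minimal sketch of the three normalization_method options. The helper normalize_values is hypothetical and not part of the package; in particular, treating "robust" as median centering with IQR scaling is an assumption, since the exact formula is not stated here.

```r
# Hypothetical helper, for illustration only; not the package's implementation.
normalize_values <- function(x, method = c("min_max", "z_score", "robust")) {
  method <- match.arg(method)
  switch(
    method,
    # Rescale to the [0, 1] range
    min_max = (x - min(x, na.rm = TRUE)) /
      (max(x, na.rm = TRUE) - min(x, na.rm = TRUE)),
    # Center on the mean and scale by the standard deviation
    z_score = (x - mean(x, na.rm = TRUE)) / stats::sd(x, na.rm = TRUE),
    # Assumption: "robust" centers on the median and scales by the IQR
    robust = (x - stats::median(x, na.rm = TRUE)) / stats::IQR(x, na.rm = TRUE)
  )
}

# Example detection efficiency values for four receivers (made-up numbers)
de <- c(0.10, 0.40, 0.70, 0.90)
normalize_values(de, "min_max")
normalize_values(de, "robust")
```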

Detection and non-detection weights must sum to 1. The algorithm focuses non-detection analysis on stations within a specified distance of detecting stations to maintain biological realism and computational efficiency.
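
The weight constraint and the distance cutoff can be sketched as follows. Both helpers are hypothetical, and the linear weighted combination is an assumption used for illustration; the function's actual integration formula may differ.

```r
# Hypothetical helper: combine per-cell probabilities with the two weights.
integrate_probabilities <- function(p_detect, p_non_detect,
                                    detection_weight = 0.5,
                                    non_detection_weight = 0.5) {
  # Documented constraint: the two weights must sum to 1
  stopifnot(isTRUE(all.equal(detection_weight + non_detection_weight, 1)))
  # Assumption: a simple linear combination, renormalized across cells
  combined <- detection_weight * p_detect + non_detection_weight * p_non_detect
  combined / sum(combined)
}

# Hypothetical helper: flag non-detecting stations close enough to be considered.
select_non_detecting <- function(distance_m, max_non_detection_distance = 2000) {
  # NULL means every station is included
  if (is.null(max_non_detection_distance)) {
    return(rep(TRUE, length(distance_m)))
  }
  distance_m <= max_non_detection_distance
}

# Made-up probabilities for three grid cells
p_detect <- c(0.6, 0.3, 0.1)
p_non_detect <- c(0.2, 0.5, 0.3)
integrate_probabilities(p_detect, p_non_detect, 0.7, 0.3)

# Made-up distances (meters) from a detecting station
select_non_detecting(c(500, 1999, 2500))
```

Renormalizing after the combination keeps the result a proper probability distribution over cells even when the inputs are not individually normalized.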

Time Aggregation Options:

  • "seconds": Uses numeric time with bin_size_seconds (backward compatible)

  • "hour": Groups POSIX datetime to hourly bins (e.g., 2024-07-01 15:00:00)

  • "day": Groups POSIX datetime to daily bins (e.g., 2024-07-01)

  • "month": Groups POSIX datetime to monthly bins, labeled by the first day of the month (e.g., 2024-07-01)
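
The binning behind these options can be approximated in base R. This is a sketch under the assumption that each bin is labeled by its start; the variable names are illustrative and the function may implement the binning differently.

```r
# Numeric seconds: floor each time to the start of its bin
time_seconds <- c(10, 3599, 3600, 7300)
bin_size_seconds <- 3600
seconds_bin <- floor(time_seconds / bin_size_seconds) * bin_size_seconds

# POSIX datetimes (made-up timestamps, fixed to UTC to avoid time zone shifts)
datetime <- as.POSIXct(c("2024-07-01 15:42:10", "2024-07-02 03:05:00"),
                       tz = "UTC")

# "hour": truncate to the start of the hour
hour_bin <- as.POSIXct(trunc(datetime, units = "hours"))

# "day": keep only the calendar date
day_bin <- as.Date(datetime, tz = "UTC")

# "month": label each datetime with the first day of its month
month_bin <- as.Date(format(datetime, "%Y-%m-01", tz = "UTC"))
```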

Examples

if (FALSE) {
# Basic positioning analysis (backward compatible)
results <- calculate_fish_positions(
  station_detections = fish_tracks$detections,
  station_distances_df = distances,
  receiver_stations = stations,
  bin_size_seconds = 3600
)

# POSIX datetime aggregation by day
results <- calculate_fish_positions(
  station_detections = fish_tracks$detections,
  station_distances_df = distances,
  receiver_stations = stations,
  time_col = "datetime",
  time_aggregation = "day"
)

# POSIX datetime aggregation by hour
results <- calculate_fish_positions(
  station_detections = fish_tracks$detections,
  station_distances_df = distances,
  receiver_stations = stations,
  time_col = "datetime",
  time_aggregation = "hour"
)
}