Working with Imagery

This guide covers loading, managing, and processing imagery in VISTA.

Note

While working with imagery is a key VISTA use case, VISTA can also be used to analyze detections and tracks.

Loading Imagery

VISTA supports loading imagery from HDF5 files in format versions 1.5, 1.6, and 1.7. See Imagery Module for programmatic details.

From the GUI

To load imagery in the VISTA GUI:

  1. Click File → Open

  2. Select an HDF5 file (.h5 or .hdf5)

  3. VISTA will automatically detect the format and load all sensors and imagery

[Screenshot: Loading imagery]

From the Python API

from vista.imagery import Imagery
from vista.sensors import SampledSensor

# Imagery is typically loaded through the GUI's data loader
# For programmatic creation, see the HDF5 Format section below

HDF5 File Format

VISTA uses HDF5 as its native format for storing multi-frame imagery along with metadata, sensor calibration, and coordinate transformation data.

Format Overview

Current Version: 1.7 (simplified timestamps with nanosecond precision)

Legacy Support: version 1.6 (hierarchical with split timestamps), version 1.5 (flat structure, deprecated)

The version 1.7 format uses a hierarchical sensor-based structure allowing multiple sensors and multiple imagery datasets per sensor in a single file, with simplified timestamp storage using a single nanosecond field.

File Structure (version 1.7)

The HDF5 file has the following hierarchical structure:

root/
├── [attributes]
│   ├── format_version: "1.7"
│   └── created: "2024-01-01T12:00:00"
└── sensors/
    └── <sensor_uuid>/
        ├── [attributes]
        │   ├── name: "Sensor Name"
        │   ├── uuid: "uuid-string"
        │   └── sensor_type: "Sensor" or "SampledSensor"
        ├── position/              (SampledSensor only)
        │   ├── positions          [3 × N array: x, y, z in ECEF meters]
        │   └── unix_nanoseconds   [N array: nanoseconds since epoch]
        ├── geolocation/           (optional, for coordinate transforms)
        │   ├── frames             [M array: frame numbers for polynomials]
        │   ├── pointing           [M × 2 array: azimuth, elevation]
        │   ├── poly_pixel_to_arf_azimuth    [M × P array]
        │   ├── poly_pixel_to_arf_elevation  [M × Q array]
        │   ├── poly_arf_to_row    [M × R array]
        │   └── poly_arf_to_col    [M × S array]
        ├── radiometric/           (optional, for calibration)
        │   ├── bias_images                     [K × H × W array]
        │   ├── bias_image_frames               [K array]
        │   ├── uniformity_gain_images          [L × H × W array]
        │   ├── uniformity_gain_image_frames    [L array]
        │   ├── bad_pixel_masks                 [J × H × W array]
        │   ├── bad_pixel_mask_frames           [J array]
        │   ├── radiometric_gain                [M array]
        │   └── radiometric_gain_frames         [M array]
        └── imagery/
            └── <imagery_uuid>/
                ├── [attributes]
                │   ├── name: "Imagery Name"
                │   ├── uuid: "uuid-string"
                │   ├── description: "Optional description"
                │   ├── row_offset: 0
                │   └── column_offset: 0
                ├── images           [N × H × W array, float32, chunked]
                ├── frames           [N array: frame numbers]
                └── unix_nanoseconds [N array: nanoseconds since epoch]
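
For orientation, this layout can be inspected directly with h5py. The following is a minimal sketch (the file name is a placeholder) that walks a version 1.7 file and prints each sensor and imagery dataset:

import h5py

# Sketch: walk the version 1.7 hierarchy shown above.
with h5py.File('example.h5', 'r') as f:
    print('format_version:', f.attrs['format_version'])
    for sensor_uuid, sensor_group in f['sensors'].items():
        print('sensor:', sensor_group.attrs['name'],
              '(type:', sensor_group.attrs['sensor_type'], ')')
        for imagery_uuid, img_group in sensor_group['imagery'].items():
            images = img_group['images']
            print('  imagery:', img_group.attrs['name'],
                  'shape:', images.shape, 'dtype:', images.dtype)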

Detailed Dataset Descriptions

Root Attributes

format_version:

String indicating the format version (e.g., "1.7")

created:

ISO 8601 timestamp of file creation

Sensor Attributes

name:

Human-readable sensor identifier

uuid:

Unique identifier for this sensor (UUID string)

sensor_type:

Either "Sensor" (base class) or "SampledSensor" (with position data)

Sensor Position Data (SampledSensor only)

positions:

3 × N array of ECEF (Earth-Centered, Earth-Fixed) positions in meters

  • Row 0: X coordinate

  • Row 1: Y coordinate

  • Row 2: Z coordinate

unix_nanoseconds:

N-element array of nanoseconds since Unix epoch (1970-01-01 00:00:00 UTC). int64 datatype provides nanosecond precision with valid range from 1970-01-01 to 2262-04-11.

Geolocation Data (optional)

Required for pixel-to-geodetic coordinate conversion:

frames:

Frame numbers for which polynomial coefficients apply

pointing:

M × 2 array of sensor pointing [azimuth, elevation] in radians

poly_pixel_to_arf_azimuth:

Polynomial coefficients for pixel → ARF azimuth

poly_pixel_to_arf_elevation:

Polynomial coefficients for pixel → ARF elevation

poly_arf_to_row:

Polynomial coefficients for ARF → pixel row

poly_arf_to_col:

Polynomial coefficients for ARF → pixel column

Radiometric Calibration (optional)

bias_images:

K × H × W array of bias frames (dark current corrections)

bias_image_frames:

Frame numbers indicating when each bias image applies

uniformity_gain_images:

L × H × W array of flat-field correction images

uniformity_gain_image_frames:

Frame numbers for uniformity corrections

bad_pixel_masks:

J × H × W boolean array identifying defective pixels

bad_pixel_mask_frames:

Frame numbers for bad pixel masks

radiometric_gain:

M-element array of overall gain values per frame

radiometric_gain_frames:

Frame numbers for radiometric gains

Note

Calibration arrays define frame ranges: a calibration entry defined at frame N applies to all imagery frames from N until the frame of the next calibration entry.
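
In code, this rule amounts to selecting the most recent calibration entry at or before a given imagery frame. A minimal NumPy sketch (array names follow the datasets above, assumed sorted ascending):

import numpy as np

def calibration_index(calibration_frames, frame):
    # Index of the calibration entry in effect at `frame`;
    # None if `frame` precedes the first calibration.
    idx = np.searchsorted(calibration_frames, frame, side='right') - 1
    return None if idx < 0 else idx

bias_image_frames = np.array([0, 500, 1000])
assert calibration_index(bias_image_frames, 750) == 1  # bias defined at frame 500 applies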

Imagery Attributes

name:

Human-readable imagery identifier

uuid:

Unique identifier for this imagery dataset

description:

Optional long-form description

row_offset:

Vertical offset if imagery is a spatial crop (default: 0)

column_offset:

Horizontal offset if imagery is a spatial crop (default: 0)

Imagery Datasets

images:

N × H × W array of image frames

  • Datatype: float32

  • Chunked: (1, H, W) for efficient frame-by-frame access

  • N = number of frames

  • H = image height (rows)

  • W = image width (columns)

frames:

N-element array of frame numbers (int64)

Frame numbers need not be sequential or start at zero. They identify each image within the sensor’s temporal sequence.

unix_nanoseconds:

N-element array of nanoseconds since Unix epoch (int64)

Note

Times are stored as nanoseconds since Unix epoch (1970-01-01 00:00:00 UTC) for nanosecond precision with int64 datatype. Valid range: 1970-01-01 to 2262-04-11.

In Python, the stored integers correspond one-to-one with numpy datetime64[ns] values.
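
The round trip is a plain astype in NumPy, as this short sketch shows:

import numpy as np

# int64 nanoseconds -> datetime64[ns]
unix_nanoseconds = np.array([1_704_067_200_000_000_000], dtype=np.int64)
times = unix_nanoseconds.astype('datetime64[ns]')  # 2024-01-01T00:00:00

# datetime64[ns] -> int64 nanoseconds (round trip)
assert (times.astype(np.int64) == unix_nanoseconds).all()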

Creating HDF5 Files

Using the VISTA API

import numpy as np
from vista.imagery import Imagery, save_imagery_hdf5
from vista.sensors import SampledSensor

# Create sensor with position data
positions = np.array([[1e6], [2e6], [3e6]])  # ECEF coordinates
times = np.array([np.datetime64('2024-01-01T00:00:00')], dtype='datetime64[ns]')
frames = np.array([0])

sensor = SampledSensor(
    name="MySensor",
    positions=positions,
    times=times,
    frames=frames
)

# Create imagery
images = np.random.randn(100, 256, 256).astype(np.float32)
img_frames = np.arange(100)
img_times = np.array([
    np.datetime64('2024-01-01T00:00:00') + np.timedelta64(i*100, 'ms')
    for i in range(100)
], dtype='datetime64[ns]')

imagery = Imagery(
    name="Test Imagery",
    images=images,
    frames=img_frames,
    times=img_times,
    sensor=sensor,
    description="Example imagery dataset"
)

# Save to HDF5
save_imagery_hdf5("output.h5", {"MySensor": [imagery]})

Using h5py Directly

For advanced users, you can create HDF5 files directly:

import h5py
import numpy as np

with h5py.File('custom_imagery.h5', 'w') as f:
    # Set root attributes
    f.attrs['format_version'] = '1.7'
    f.attrs['created'] = '2024-01-01T12:00:00'

    # Create sensor structure
    sensors_group = f.create_group('sensors')
    sensor_group = sensors_group.create_group('sensor-uuid-here')
    sensor_group.attrs['name'] = 'MySensor'
    sensor_group.attrs['uuid'] = 'sensor-uuid-here'
    sensor_group.attrs['sensor_type'] = 'Sensor'

    # Create imagery structure
    imagery_group = sensor_group.create_group('imagery')
    img_group = imagery_group.create_group('imagery-uuid-here')
    img_group.attrs['name'] = 'MyImagery'
    img_group.attrs['uuid'] = 'imagery-uuid-here'
    img_group.attrs['description'] = 'Custom imagery'
    img_group.attrs['row_offset'] = 0
    img_group.attrs['column_offset'] = 0

    # Create datasets
    images = np.random.randn(100, 256, 256).astype(np.float32)
    img_group.create_dataset('images', data=images, chunks=(1, 256, 256))
    img_group.create_dataset('frames', data=np.arange(100))

    # Optional: Add timestamps
    unix_nanoseconds = np.arange(100, dtype=np.int64) * 100_000_000  # 100 ms between frames, in nanoseconds
    img_group.create_dataset('unix_nanoseconds', data=unix_nanoseconds)

Format Versions

Version 1.7 (Current)

  • Uses single unix_nanoseconds field for timestamps (int64)

  • Simplified timestamp storage with nanosecond precision

  • Valid time range: 1970-01-01 to 2262-04-11 (292 years)

  • All other features from version 1.6 retained

Version 1.6 (Legacy, Deprecated)

  • Hierarchical structure with sensors/ root group

  • Supports multiple sensors per file

  • Supports multiple imagery datasets per sensor

  • Uses split unix_times and unix_fine_times fields

  • Fully supported for loading (backward compatible)

Warning

When opening version 1.6 files, VISTA displays a deprecation warning. Convert legacy files to version 1.7 format by loading and re-saving through the GUI: File → Open (load the version 1.6 file) then File → Save (saves as version 1.7).

Version 1.5 (Legacy, Deprecated)

  • Flat structure with datasets at root level

  • Single sensor, single imagery per file

  • Still supported for loading but not recommended for new files

  • Will be removed in a future VISTA version

Warning

When opening version 1.5 files, VISTA displays a deprecation warning. Convert legacy files to version 1.7 format by loading and re-saving through the GUI: File → Open (load version 1.5) then File → Save (saves as version 1.7).

Attitude Reference Frame (ARF)

The Attitude Reference Frame (ARF) is a local sensor-centric coordinate system used in VISTA for efficient pixel-to-geodetic coordinate transformations and geolocation calculations.

Purpose

The ARF serves as an intermediate coordinate system in the transformation chain between image pixel coordinates and geodetic coordinates on the Earth:

Pixel (row, col) → ARF (azimuth, elevation) → Geodetic (lat, lon, alt)
                 ↑                          ↑
          Polynomial transforms      Earth intersection

Using ARF as an intermediate step provides several benefits:

  • Compact polynomial representation: ARF angles change smoothly across the image, allowing accurate polynomial approximations with low-order terms

  • Sensor independence: ARF is defined relative to sensor pointing, not absolute coordinates

  • Numerical stability: Local coordinates avoid precision issues with large ECEF values

  • Physical intuition: Azimuth/elevation angles are easier to interpret than ECEF vectors

ARF Definition

The ARF is a right-handed Cartesian coordinate system defined by three orthonormal axes relative to the sensor’s position and pointing direction:

X-axis (Boresight)

Points along the sensor’s boresight (pointing direction). This is the primary viewing direction of the sensor.

Z-axis (North-aligned)

Points as close to North as possible while remaining orthogonal to the X-axis. Specifically, it’s the component of the “toward North pole” vector that is perpendicular to the boresight.

Y-axis (Completes right-hand system)

Computed as the cross product of X and Z axes: Y = X × Z. This creates a right-handed coordinate system.

Note

The ARF rotates with the sensor. As the sensor moves and its pointing changes, the ARF axes change accordingly. This makes ARF a dynamic coordinate system that must be recomputed for each sensor position and pointing angle.

Mathematical Construction

Given sensor position P (in ECEF coordinates, km) and sensor pointing unit vector D (in ECEF), the ARF transformation matrix is constructed as follows:

  1. X-axis: ARF X-axis = sensor pointing direction

    \[\mathbf{\hat{x}}_{ARF} = \mathbf{D}\]
  2. Northish vector: Vector from sensor toward North pole

    \[\mathbf{N}_{pole} = [0, 0, 6356.752314245]^T \text{ km (Earth polar radius)}\]
    \[\mathbf{\hat{N}} = \frac{\mathbf{N}_{pole} - \mathbf{P}}{|\mathbf{N}_{pole} - \mathbf{P}|}\]
  3. Z-axis: Orthogonal component of northish vector

    Remove the projection of ARF X-axis onto the northish vector:

    \[\mathbf{z}_{ARF} = \mathbf{\hat{N}} - (\mathbf{\hat{x}}_{ARF} \cdot \mathbf{\hat{N}}) \mathbf{\hat{x}}_{ARF}\]

    Normalize to unit vector:

    \[\mathbf{\hat{z}}_{ARF} = \frac{\mathbf{z}_{ARF}}{|\mathbf{z}_{ARF}|}\]
  4. Y-axis: Cross product of X and Z

    \[\mathbf{y}_{ARF} = \mathbf{\hat{x}}_{ARF} \times \mathbf{\hat{z}}_{ARF}\]

    Normalize to ensure unit length:

    \[\mathbf{\hat{y}}_{ARF} = \frac{\mathbf{y}_{ARF}}{|\mathbf{y}_{ARF}|}\]
  5. Transformation matrix: Converts global vectors to ARF

    \[\begin{split}\mathbf{M}_{global \rightarrow ARF} = \begin{bmatrix} \mathbf{\hat{x}}_{ARF}^T \\ \mathbf{\hat{y}}_{ARF}^T \\ \mathbf{\hat{z}}_{ARF}^T \end{bmatrix}\end{split}\]
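
The construction translates directly into NumPy. The sketch below follows the five steps; it illustrates the math rather than reproducing vista.transforms.arf.get_arf_transform:

import numpy as np

def arf_matrix(sensor_pos_km, pointing):
    # 1. X-axis: sensor boresight (assumed to be a unit vector)
    x_arf = np.asarray(pointing, dtype=float)

    # 2. Northish vector: from the sensor toward the North pole
    n_pole = np.array([0.0, 0.0, 6356.752314245])  # Earth polar radius, km
    northish = n_pole - np.asarray(sensor_pos_km, dtype=float)
    northish /= np.linalg.norm(northish)

    # 3. Z-axis: northish minus its projection onto the boresight
    z_arf = northish - np.dot(x_arf, northish) * x_arf
    z_arf /= np.linalg.norm(z_arf)

    # 4. Y-axis: cross product, normalized to unit length
    y_arf = np.cross(x_arf, z_arf)
    y_arf /= np.linalg.norm(y_arf)

    # 5. Rows of the transformation matrix are the ARF axes
    return np.vstack([x_arf, y_arf, z_arf])

M = arf_matrix([5000.0, 2000.0, 3000.0], [0.0, 0.0, -1.0])
assert np.allclose(M @ M.T, np.eye(3))  # orthonormal: inverse is transpose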

ARF Angles

Directions in ARF are commonly expressed as spherical coordinates (azimuth, elevation):

Azimuth

Angle in radians measured in the Y-Z plane from the ARF Z-axis toward the Y-axis (consistent with azimuth = arctan2(y, z) below). Range: [-π, π] radians (-180° to 180°)

Elevation

Angle in radians measured from the Y-Z plane toward the ARF X-axis. Range: [-π/2, π/2] radians (-90° to 90°)

Conversion between ARF Cartesian and spherical coordinates:

# Cartesian (x, y, z) to spherical (azimuth, elevation)
azimuth = arctan2(y, z)
elevation = arctan2(x, sqrt(y**2 + z**2))

# Spherical to Cartesian
x = sin(elevation)
y = cos(elevation) * sin(azimuth)
z = cos(elevation) * cos(azimuth)

Usage in Geolocation

VISTA uses ARF in the pixel-to-geodetic transformation pipeline stored in the HDF5 geolocation data:

Step 1: Pixel → ARF angles

2D polynomials map pixel coordinates to ARF azimuth and elevation:

azimuth = evaluate_2d_polynomial(poly_pixel_to_arf_azimuth, row, col)
elevation = evaluate_2d_polynomial(poly_pixel_to_arf_elevation, row, col)

Step 2: ARF angles → ECEF direction

Convert ARF angles to Cartesian unit vector, then transform to ECEF:

arf_vector = spherical_to_cartesian(azimuth, elevation)
ecef_direction = arf_to_global_matrix @ arf_vector

Step 3: ECEF direction → Ground intersection

Ray-trace from sensor position along ECEF direction to intersect Earth ellipsoid:

lat, lon, alt = earth_intersection(sensor_pos, ecef_direction)

Inverse: ECEF → ARF angles → Pixel

The reverse transformation uses different polynomial coefficients:

# ECEF direction → ARF angles
arf_vector = global_to_arf_matrix @ ecef_direction
azimuth, elevation = cartesian_to_spherical(arf_vector)

# ARF angles → Pixel coordinates
row = evaluate_2d_polynomial(poly_arf_to_row, azimuth, elevation)
col = evaluate_2d_polynomial(poly_arf_to_col, azimuth, elevation)

Polynomial Coefficients

The geolocation data in HDF5 files stores polynomial coefficients for these transformations:

poly_pixel_to_arf_azimuth:

Maps (row, col) → ARF azimuth

poly_pixel_to_arf_elevation:

Maps (row, col) → ARF elevation

poly_arf_to_row:

Maps (ARF azimuth, ARF elevation) → pixel row

poly_arf_to_col:

Maps (ARF azimuth, ARF elevation) → pixel column

Each polynomial coefficient array has shape (num_frames, num_coefficients), where:

  • num_frames: Number of frames with polynomial data

  • num_coefficients: (order + 1) * (order + 2) / 2 for polynomial of given order

Polynomial terms are ordered by total degree, then by decreasing powers of the first variable:

  • Order 0: c₀ (1 coefficient)

  • Order 1: c₁·x + c₂·y (3 coefficients total)

  • Order 2: c₃·x² + c₄·x·y + c₅·y² (6 coefficients total)

  • Order 3: c₆·x³ + c₇·x²·y + c₈·x·y² + c₉·y³ (10 coefficients total)
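
With this ordering, evaluation is a double loop over total degree and powers of the first variable. This is an illustrative sketch, not VISTA's internal evaluator:

import numpy as np

def eval_poly_2d(coeffs, x, y):
    # Coefficients ordered by total degree, then decreasing powers of x:
    # [1, x, y, x^2, x*y, y^2, x^3, x^2*y, x*y^2, y^3, ...]
    result, i, degree = 0.0, 0, 0
    while i < len(coeffs):
        for px in range(degree, -1, -1):  # powers of x, high to low
            result += coeffs[i] * x**px * y**(degree - px)
            i += 1
        degree += 1
    return result

# Order-1 polynomial 2 + 3x + 4y evaluated at (1, 2):
assert eval_poly_2d(np.array([2.0, 3.0, 4.0]), 1.0, 2.0) == 13.0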

Example: ARF Transform

import numpy as np
from vista.transforms.arf import get_arf_transform

# Sensor position in ECEF (km)
sensor_pos = np.array([5000, 2000, 3000])

# Sensor pointing direction (unit vector in ECEF)
sensor_pointing = np.array([0.0, 0.0, -1.0])  # Pointing down (nadir)

# Get transformation matrix: global → ARF
global_to_arf = get_arf_transform(sensor_pos, sensor_pointing)

# Transform a vector from global ECEF to ARF
global_vector = np.array([1.0, 0.0, 0.0])  # East direction
arf_vector = global_to_arf @ global_vector

print(f"Global vector: {global_vector}")
print(f"ARF vector: {arf_vector}")

# Get inverse transform: ARF → global
arf_to_global = global_to_arf.T  # Orthonormal matrix: inverse = transpose
global_vector_reconstructed = arf_to_global @ arf_vector
print(f"Reconstructed: {global_vector_reconstructed}")

See Also

For a detailed illustrated explanation of the ARF coordinate system with visualizations and examples, see the Jupyter notebook:

notebooks/attitude_reference_frame.ipynb


Imagery Properties

Each imagery dataset has the following properties accessible in Python:

# Array properties
imagery.images        # 3D array: (frames, height, width)
imagery.frames        # 1D array: frame numbers
imagery.times         # 1D array: datetime64[ns] timestamps

# Dimensions
len(imagery)          # Number of frames
imagery.shape         # Tuple: (num_frames, height, width)

# Metadata
imagery.name          # String identifier
imagery.description   # Long-form description
imagery.uuid          # Unique identifier

# Offsets (for cropped imagery)
imagery.row_offset    # Vertical offset in pixels
imagery.column_offset # Horizontal offset in pixels

# Associated sensor
imagery.sensor        # Sensor object with calibration data

Slicing and Subsetting

VISTA supports efficient imagery slicing:

# Temporal slicing (by frame index, not frame number)
subset = imagery[10:50]  # Frames at indices 10-49

# Spatial cropping via AOI
from vista.aoi import AOI
aoi = AOI(name="Region", x=50, y=50, width=100, height=100)
cropped = imagery.get_aoi(aoi)

# Accessing individual frames
frame_0 = imagery.images[0]  # First frame as 2D array

# Frame number lookup
frame_idx = imagery.get_frame_index(42)  # Index of frame number 42
if frame_idx is not None:
    frame_data = imagery.images[frame_idx]

Imagery Controls

Pan and Zoom

  • Left-click and drag on the imagery viewer to pan.

  • Use the scroll wheel to zoom. Reset the view via the right-click context menu’s “View All” option, or by clicking the “A” in the lower-left of the imagery viewer.

[Screenshot: Imagery zoom and pan]

Histogram

  • Left-click and drag the histogram boundary box to shift the histogram range displayed in the viewer.

  • Left-click and drag the bounds of the histogram boundary box to resize it.

  • Right-click on the gradient bar to adjust the gradient color.

  • Left-click on the gradient bar to add ticks.

  • Left-click and drag gradient bar ticks to adjust them.

  • Use the scroll wheel to zoom in and out on the histogram.

  • Right-click on the histogram and press “View All” to reset the histogram view.

  • Click and drag an empty area of the histogram to pan left and right.

[Screenshot: Imagery histogram]

Playback

  • Drag the slider to slide through frames.

  • Click the play/pause button (or press spacebar) to play or pause imagery animation.

  • Click the reverse button to reverse playback direction.

  • Click the Previous Frame or Next Frame buttons, or use the left/right arrow keys (or A/D), to step forward or backward one frame at a time.

  • Check the “Bounce” checkbox to set a pair of frames for playback to bounce between.

  • Adjust the player’s target frames-per-second (FPS) using the FPS input or dial.

[Screenshot: Imagery playback]

Note

The FPS setting is a target rate that may not be achievable for some data and systems. In those cases, playback runs as fast as possible.

Tooltips

  • Select the geospatial tooltip to view the latitude / longitude corresponding to the hovered location in the imagery.

  • Select the pixel details tooltip to view the row, column, and counts for the hovered location in the imagery. Counts are reported for the nearest hovered pixel.

[Screenshot: Imagery tooltips]

Treatments and Processing

VISTA provides several image treatment operations accessible through the GUI:

Subset Frames

This tool crops the imagery to a subset of the input frames.

Radiometric Corrections

Bias Removal

Subtracts dark current using bias frames from sensor calibration data. Access via Algorithms → Treatments → Bias Removal

Non-Uniformity Correction (NUC)

Applies flat-field correction using uniformity gain images. Access via Algorithms → Treatments → Non-Uniformity Correction

Bad Pixel Replacement

Interpolates over defective pixels identified in bad pixel masks. Applied automatically when the sensor has bad pixel mask data.
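
The order in which VISTA's treatments apply these corrections is not spelled out here, but a conventional radiometric chain composes them roughly as follows (a rough sketch, with placeholder arrays standing in for the calibration datasets above):

import numpy as np

raw = np.random.randn(256, 256).astype(np.float32)       # one raw frame
bias = np.zeros((256, 256), dtype=np.float32)             # from bias_images
uniformity_gain = np.ones((256, 256), dtype=np.float32)   # from uniformity_gain_images
radiometric_gain = np.float32(1.0)                        # from radiometric_gain

# Assumed ordering: subtract bias, flat-field, then apply overall gain
corrected = (raw - bias) * uniformity_gain * radiometric_gain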

Background Removal

Temporal Median

Removes static background by subtracting temporal median of surrounding frames. Access via Algorithms → Background Removal → Temporal Median

Robust PCA

Separates low-rank background from sparse foreground using robust PCA. Access via Algorithms → Background Removal → Robust PCA

[Screenshot: Background removal]
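
To illustrate the temporal-median idea (a sketch of the technique, not VISTA's implementation), the background for each frame can be estimated as the per-pixel median over a window of surrounding frames and subtracted:

import numpy as np

def temporal_median_removal(images, window=11):
    # Subtract the per-pixel median of a centered window of frames
    half = window // 2
    out = np.empty_like(images)
    for i in range(len(images)):
        lo, hi = max(0, i - half), min(len(images), i + half + 1)
        out[i] = images[i] - np.median(images[lo:hi], axis=0)
    return out

images = np.random.randn(50, 64, 64).astype(np.float32)
foreground = temporal_median_removal(images)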

Enhancement

Frame Coaddition

Improves SNR by averaging multiple frames. Access via Algorithms → Enhancement → Coadd Frames

[Screenshot: Enhancement]
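
As a sketch of what coaddition does (not the GUI algorithm itself), averaging blocks of N consecutive frames reduces uncorrelated noise by roughly a factor of sqrt(N):

import numpy as np

images = np.random.randn(100, 256, 256).astype(np.float32)
block = 4  # coadd every 4 consecutive frames
usable = len(images) - len(images) % block
coadded = images[:usable].reshape(-1, block, 256, 256).mean(axis=1)
print(coadded.shape)  # (25, 256, 256)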

Saving and Exporting

Save Entire Dataset

To save imagery with all metadata and calibration:

  1. Select imagery in the Imagery Panel

  2. Click File → Save

  3. Choose output filename

  4. The file is saved in version 1.7 HDF5 format with all associated data

[Screenshot: Imagery export]

Export Specific Frames

To export a subset of frames or processed imagery:

  1. Select imagery in the Imagery Panel

  2. Click Export button in the panel

  3. Configure export options:

    • Frame range

    • Output format (HDF5, TIFF sequence, etc.)

    • Bit depth and scaling

  4. Click Save

Programmatic Export

import numpy as np

# Save to HDF5
from vista.imagery import save_imagery_hdf5
save_imagery_hdf5("output.h5", {sensor.name: [imagery]})

# Export frames as numpy array
frames_subset = imagery[10:50].images  # frames at indices 10-49
np.save("frames.npy", frames_subset)

# Export single frame as image
from PIL import Image
frame = imagery.images[0]
# Scale to 0-255 for 8-bit export
scaled = ((frame - frame.min()) / (frame.max() - frame.min()) * 255).astype(np.uint8)
Image.fromarray(scaled).save("frame_0.png")

Best Practices

Storage and Performance

  • Use chunking: HDF5 files created by VISTA use (1, H, W) chunking for efficient frame access

  • Compression: Consider enabling gzip compression for archival (slower but smaller); see the sketch after this list

  • Frame ordering: Keep frames sorted by frame number for faster lookups

  • Reasonable sizes: Very large datasets (>10,000 frames) may benefit from splitting
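
A minimal h5py sketch of an archival dataset with gzip compression and frame-by-frame chunking (file and dataset names are placeholders):

import h5py
import numpy as np

images = np.random.randn(100, 256, 256).astype(np.float32)
with h5py.File('archive.h5', 'w') as f:
    f.create_dataset('images', data=images,
                     chunks=(1, 256, 256),   # one frame per chunk
                     compression='gzip', compression_opts=4)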

Metadata Management

  • Descriptive names: Use clear, descriptive names for imagery datasets

  • Add descriptions: Use the description field to document processing history

  • Preserve calibration: Always include sensor calibration data when available

  • UUID tracking: UUIDs help track imagery across processing workflows

Coordinate Systems

  • Check sensor: Verify sensor has geolocation polynomials before using coordinate conversion

  • Frame alignment: Ensure polynomial frame numbers align with imagery frame numbers

  • Time synchronization: For multi-sensor data, verify time alignment across sensors
