Core Classes

The core classes provide STAC-compliant data models for working with satellite imagery metadata and assets.

STACItem

class open_geodata_api.core.items.STACItem(item_data: Dict, provider: str = 'unknown')

Bases: object

Universal STAC Item wrapper - focused on URL access and flexibility.

__init__(item_data: Dict, provider: str = 'unknown')
get(key, default=None)
to_dict()
copy()
get_asset_url(asset_key: str, signed: bool | None = None) -> str

Get ready-to-use asset URL with automatic provider handling.

get_all_asset_urls(signed: bool | None = None) -> Dict[str, str]

Get all asset URLs as a dictionary - ready for any raster package.

get_assets_by_type(asset_type: str = 'image/tiff', exact_match: bool = False) -> Dict[str, str]

🔧 FIXED: Get URLs for assets of a specific type with flexible matching.

Parameters:
  • asset_type – MIME type to search for (default: "image/tiff")

  • exact_match – If True, requires exact string match. If False, uses substring matching.

Returns:

Dictionary of {asset_key: url} for matching assets
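The difference between the two matching modes can be sketched on a plain assets dictionary. This is an illustration under assumptions, not the library's implementation: the helper name match_assets_by_type and the sample asset entries are hypothetical.

```python
from typing import Dict


def match_assets_by_type(assets: Dict[str, dict],
                         asset_type: str = "image/tiff",
                         exact_match: bool = False) -> Dict[str, str]:
    """Hypothetical helper: return {asset_key: url} for assets whose MIME type matches."""
    matches = {}
    for key, asset in assets.items():
        mime = asset.get("type", "")
        # Exact mode compares the full string; otherwise a substring is enough.
        is_match = (mime == asset_type) if exact_match else (asset_type in mime)
        if is_match:
            matches[key] = asset["href"]
    return matches


# Made-up sample assets, shaped like the "assets" member of a STAC item.
sample_assets = {
    "B04": {
        "href": "https://example.com/B04.tif",
        "type": "image/tiff; application=geotiff; profile=cloud-optimized",
    },
    "thumbnail": {"href": "https://example.com/thumb.png", "type": "image/png"},
    "metadata": {"href": "https://example.com/meta.xml", "type": "application/xml"},
}

# Substring matching finds the cloud-optimized GeoTIFF.
substring_hits = match_assets_by_type(sample_assets)
# Exact matching finds nothing, because no asset type is exactly "image/tiff".
exact_hits = match_assets_by_type(sample_assets, exact_match=True)
print(substring_hits)
print(exact_hits)
```

Substring matching is the more forgiving default here because providers often append parameters such as the GeoTIFF profile to the base MIME type.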

get_raster_assets(signed: bool | None = None) -> Dict[str, str]

🆕 NEW: Get all raster/image assets (COGs, TIFFs, etc.).

get_metadata_assets() -> Dict[str, str]

🆕 NEW: Get all metadata assets (XML, JSON, etc.).

list_asset_types() -> Dict[str, List[str]]

🆕 NEW: List all unique asset types and which assets have them.

get_band_urls(bands: List[str], signed: bool | None = None) -> Dict[str, str]

Get URLs for specific bands/assets.

list_assets() -> List[str]

Return list of available asset keys.

has_asset(asset_key: str) -> bool

Check if asset exists.

get_rgb_urls(signed: bool | None = None) -> Dict[str, str]

Get RGB band URLs (convenience method).

get_sentinel2_urls(signed: bool | None = None) -> Dict[str, str]

Get common Sentinel-2 band URLs (convenience method).

print_assets_info()

🔧 ENHANCED: Print detailed information about all available assets including types.

Represents a single satellite scene or product with metadata and associated assets.

Key Properties:

  • id: Unique identifier for the item

  • collection: Collection this item belongs to

  • properties: Metadata dictionary including datetime, cloud cover, etc.

  • assets: Dictionary of available assets (files)

  • bbox: Bounding box coordinates

  • provider: Data provider name

Usage Example:

# Create from search results
item = results.get_all_items()[0]

# Access metadata
print(f"Item ID: {item.id}")
print(f"Date: {item.properties['datetime']}")
print(f"Cloud cover: {item.properties['eo:cloud_cover']}%")

# Get asset URLs
red_url = item.get_asset_url('B04')
all_urls = item.get_all_asset_urls()

STACItemCollection

class open_geodata_api.core.collections.STACItemCollection(items: List[Dict], provider: str = 'unknown')

Bases: object

STAC Item Collection with bulk URL access, filtering, and conversion functions.

__init__(items: List[Dict], provider: str = 'unknown')

Initialize STAC Item Collection.

Parameters:
  • items – List of STAC item dictionaries

  • provider – Provider name (e.g., "planetary_computer", "earthsearch")

property items

Get all items as list of STACItem objects.

property raw_items

Get raw item dictionaries (for internal use).

get_available_bands() -> List[str]

🆕 ADDED: Get list of all available bands/assets across the collection.

Returns:

Sorted list of unique band/asset names available in the collection

get_all_urls(signed: bool | None = None) -> Dict[str, Dict[str, str]]

🆕 ADDED: Get all URLs in the requested format.

Format: {<product_id>: {<band_name>: <url>, <band_name>: <url>, …}, …}

Parameters:

signed – Whether to sign URLs (auto-detected by provider if None)

Returns:

Dictionary with product_id -> {band_name: url} mapping
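To show how the nested return value might be consumed, the sketch below flattens a mapping of the documented shape into (product_id, band_name, url) rows. The sample IDs and URLs are invented, and flatten_urls is a hypothetical helper, not part of the library.

```python
from typing import Dict, List, Tuple


def flatten_urls(all_urls: Dict[str, Dict[str, str]]) -> List[Tuple[str, str, str]]:
    """Hypothetical helper: turn {product_id: {band_name: url}} into flat rows."""
    rows = []
    for product_id, bands in all_urls.items():
        for band_name, url in bands.items():
            rows.append((product_id, band_name, url))
    return rows


# Invented data in the documented {product_id: {band_name: url}} format.
all_urls = {
    "scene_001": {
        "B04": "https://example.com/scene_001/B04.tif",
        "B03": "https://example.com/scene_001/B03.tif",
    },
    "scene_002": {"B04": "https://example.com/scene_002/B04.tif"},
}

rows = flatten_urls(all_urls)
for product_id, band_name, url in rows:
    print(product_id, band_name, url)
```

Flat rows of this kind are convenient when the URLs are handed to a download queue or written to a CSV file.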

get_band_urls(band_names: List[str] | None = None, asset_type: str = 'all', signed: bool | None = None) -> Dict[str, Dict[str, str]]

🆕 ADDED: Get URLs for specific bands or asset types with filtering options.

Parameters:
  • band_names – List of specific band names to include (None for all)

  • asset_type – Filter by asset type:
      - "all": All assets (default)
      - "image": Only image/tiff assets
      - "bands": Only spectral bands (B01, B02, etc.)
      - "visual": Only visual/RGB assets

  • signed – Whether to sign URLs (auto-detected by provider if None)

Returns:

Dictionary with product_id -> {band_name: url} mapping

to_simple_products_list(include_urls: bool = True, url_bands: List[str] | None = None) -> List[Dict[str, Any]]

🔧 UPDATED: Convert collection to simple products list with optional URLs.

Parameters:
  • include_urls – Whether to include URLs with href key

  • url_bands – Specific bands to include URLs for (None for all)

Returns:

List of simplified product dictionaries with optional URLs

to_list() -> List[Dict]

Convert collection to list of dictionaries.

to_dict() -> Dict[str, Any]

Convert collection to dictionary format.

to_geojson(filename: str | None = None) -> Dict[str, Any]

Convert collection to GeoJSON format.
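Since STAC items are GeoJSON Features, a collection maps naturally onto a GeoJSON FeatureCollection. The sketch below shows that wrapping on raw item dictionaries, with optional writing to a file. The function items_to_feature_collection is a hypothetical stand-in, and the library's actual output may carry additional fields.

```python
import json
from typing import Any, Dict, List, Optional


def items_to_feature_collection(items: List[Dict[str, Any]],
                                filename: Optional[str] = None) -> Dict[str, Any]:
    """Hypothetical stand-in: wrap STAC item dicts in a GeoJSON FeatureCollection."""
    feature_collection = {"type": "FeatureCollection", "features": list(items)}
    if filename is not None:
        # Only write to disk when a filename is given.
        with open(filename, "w", encoding="utf-8") as fh:
            json.dump(feature_collection, fh, indent=2)
    return feature_collection


# Invented minimal item; real STAC items carry more members (bbox, assets, links).
raw_items = [{
    "type": "Feature",
    "id": "scene_001",
    "geometry": {"type": "Point", "coordinates": [10.0, 50.0]},
    "properties": {"datetime": "2024-06-01T10:00:00Z"},
}]

geojson = items_to_feature_collection(raw_items)
print(geojson["type"], len(geojson["features"]))
```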

get_all_assets() -> Dict[str, List[str]]

Get all unique assets across all items.

get_assets_by_collection() -> Dict[str, List[str]]

Get assets grouped by collection.

to_products_dict() -> Dict[str, Dict[str, Any]]

Convert collection to products dictionary with detailed metadata.

get_common_bands(min_occurrence: float = 0.5) -> List[str]

Get commonly available bands/assets across the collection.
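One plausible reading of min_occurrence is the minimum fraction of items in which an asset key must appear. The sketch below implements that reading on plain lists of asset keys. This is an assumption about the semantics: common_bands is a hypothetical helper, and the source does not state whether the library's threshold is inclusive.

```python
from collections import Counter
from typing import List


def common_bands(item_assets: List[List[str]], min_occurrence: float = 0.5) -> List[str]:
    """Hypothetical helper: keys present in at least min_occurrence of the items."""
    if not item_assets:
        return []
    # Count each key once per item, even if an item lists it twice.
    counts = Counter(key for keys in item_assets for key in set(keys))
    total = len(item_assets)
    return sorted(key for key, n in counts.items() if n / total >= min_occurrence)


# Invented asset key lists for four items.
item_assets = [
    ["B02", "B03", "B04"],
    ["B03", "B04"],
    ["B04", "visual"],
    ["B04"],
]

half = common_bands(item_assets)             # keys in at least 50% of items
everywhere = common_bands(item_assets, 1.0)  # keys present in every item
print(half)
print(everywhere)
```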

get_assets_by_pattern(pattern: str, match_type: str = 'extension') -> List[str]

Get asset names that match the specified pattern.

get_assets_by_extension(extension: str) -> List[str]

Convenience method to get assets by file extension.

get_assets_by_mime_type(mime_type: str) -> List[str]

Convenience method to get assets by MIME type.

list_asset_extensions() -> Dict[str, List[str]]

List all unique file extensions and which assets have them.

filter_by_cloud_cover(max_cloud_cover: float) -> STACItemCollection

Filter items by cloud cover percentage.

filter_by_date_range(start_date: str | None = None, end_date: str | None = None, days_back: int | None = None, auto_fix_dates: bool = True) -> STACItemCollection

🔧 ENHANCED: Filter items by date range with automatic invalid date correction.

Parameters:
  • start_date – Start date (YYYY-MM-DD format) - optional if using days_back

  • end_date – End date (YYYY-MM-DD format) - optional if using days_back

  • days_back – Number of days back from today - alternative to start_date/end_date

  • auto_fix_dates – Whether to automatically fix invalid dates (default: True)

Returns:

New STACItemCollection with filtered items

Examples

# Using date range with auto-correction
filtered = collection.filter_by_date_range("2024-01-01", "2024-02-31")  # Auto-fixes to 2024-02-29

# Using days back (last 30 days)
filtered = collection.filter_by_date_range(days_back=30)

# Disable auto-correction
filtered = collection.filter_by_date_range("2024-01-01", "2024-02-31", auto_fix_dates=False)
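The auto-correction in the first example can be approximated by clamping the day to the last valid day of the month. This sketch uses only the standard library. The helper clamp_date is hypothetical, and the library's actual correction rules may differ.

```python
import calendar


def clamp_date(date_str: str) -> str:
    """Hypothetical helper: clamp an invalid YYYY-MM-DD date to the nearest valid one."""
    year, month, day = (int(part) for part in date_str.split("-"))
    # Keep the month within 1..12 so monthrange() does not raise.
    month = min(max(month, 1), 12)
    # monthrange() returns (weekday of first day, number of days in month).
    last_day = calendar.monthrange(year, month)[1]
    day = min(max(day, 1), last_day)
    return f"{year:04d}-{month:02d}-{day:02d}"


leap = clamp_date("2024-02-31")      # 2024 is a leap year
non_leap = clamp_date("2023-02-31")  # 2023 is not
print(leap)      # 2024-02-29
print(non_leap)  # 2023-02-28
```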

get_unique_collections() -> List[str]

Get list of unique collection names.

get_date_range() -> Dict[str, str]

Get date range of items in collection.

to_dataframe(include_geometry: bool = True) -> DataFrame

Convert collection to pandas/geopandas DataFrame.

to_geodataframe() -> GeoDataFrame

Convert collection to geopandas GeoDataFrame.

export_urls_json(filename: str, asset_keys: List[str] | None = None)

Export all URLs to JSON file for external processing.
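A sketch of what such an export could look like, assuming the file holds the nested {product_id: {asset_key: url}} mapping. The actual file layout written by the library is not documented here, and export_urls is a hypothetical helper.

```python
import json
import os
import tempfile
from typing import Dict, List, Optional


def export_urls(all_urls: Dict[str, Dict[str, str]], filename: str,
                asset_keys: Optional[List[str]] = None) -> Dict[str, Dict[str, str]]:
    """Hypothetical helper: write URLs to a JSON file, optionally keeping only some assets."""
    if asset_keys is not None:
        all_urls = {
            product_id: {key: url for key, url in bands.items() if key in asset_keys}
            for product_id, bands in all_urls.items()
        }
    with open(filename, "w", encoding="utf-8") as fh:
        json.dump(all_urls, fh, indent=2)
    return all_urls


# Invented URLs for a single product.
urls = {
    "scene_001": {
        "B04": "https://example.com/scene_001/B04.tif",
        "B03": "https://example.com/scene_001/B03.tif",
    },
}

out_path = os.path.join(tempfile.mkdtemp(), "urls.json")
exported = export_urls(urls, out_path, asset_keys=["B04"])

# Read the file back to confirm what was written.
with open(out_path, encoding="utf-8") as fh:
    reloaded = json.load(fh)
print(reloaded)
```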

print_collection_summary()

Print a comprehensive summary of the collection.

check_dependencies()

Check status of optional dependencies.

Collection of STAC items with bulk operations and data conversion capabilities.

Key Features:

  • Iterable container for multiple STACItems

  • Bulk URL retrieval across all items

  • DataFrame conversion for analysis

  • Filtering and subsetting operations

Usage Example:

# Create from search results
items = results.get_all_items()

# Collection operations
print(f"Found {len(items)} items")

# Convert to DataFrame
df = items.to_dataframe()

# Bulk URL retrieval
all_urls = items.get_band_urls(['B04', 'B03', 'B02'])

# Iteration
for item in items:
    print(f"Processing {item.id}")

STACAsset

class open_geodata_api.core.assets.STACAsset(asset_data: Dict)

Bases: object

Universal wrapper for STAC assets compatible with both Planetary Computer (PC) and EarthSearch.

__init__(asset_data: Dict)
get(key, default=None)
to_dict()
copy()

Represents a single asset (file) within a STAC item.

Key Properties:

  • href: URL to the asset file

  • type: MIME type of the asset

  • title: Human-readable title

  • roles: List of asset roles (e.g., 'data', 'thumbnail')

Usage Example:

# Access asset directly
asset = item.assets['B04']

print(f"Asset URL: {asset.href}")
print(f"Asset type: {asset.type}")
print(f"Asset title: {asset.title}")

# Get a signed URL if needed (requested through the item, which handles the provider)
signed_url = item.get_asset_url('B04', signed=True)

STACSearch

class open_geodata_api.core.search.STACSearch(search_results: Dict, provider: str = 'unknown', client_instance=None, original_search_params: Dict | None = None, search_url: str = None, verbose: bool = False)

Bases: object

Optimized STAC Search with smart fallback strategy and improved performance.

__init__(search_results: Dict, provider: str = 'unknown', client_instance=None, original_search_params: Dict | None = None, search_url: str = None, verbose: bool = False)
get_all_items() -> STACItemCollection

🚀 OPTIMIZED: Fast return for simple cases, fallback only when needed.

item_collection() -> STACItemCollection

Alias for get_all_items().

items()

🚀 OPTIMIZED: Return iterator over items with smart caching.

matched() -> int | None

Return total number of matched items.

total_items() -> int | None

Return total number of items.

search_params() -> dict | None

Return search parameters used for the query.

all_keys() -> List[str]

Return all keys from the search results.

list_product_ids() -> List[str]

🔧 FIXED: Return product IDs with simplified, reliable logic.

get_fallback_status() -> Dict[str, Any]

Get detailed fallback status information.

set_limit_enforcement(enforce: bool)

Control whether to enforce the original limit parameter.

Container for search results with pagination and metadata.

Key Features:

  • Lazy loading of search results

  • Pagination support

  • Search metadata and statistics

  • Result caching

Usage Example:

# Search returns STACSearch object
search_results = client.search(collections=['sentinel-2-l2a'])

# Access results
items = search_results.get_all_items()

# Check metadata
print(f"Total results: {search_results.matched()}")
print(f"Returned: {len(items)} items")

Common Patterns

Working with Multiple Items

# Process multiple items efficiently
items = results.get_all_items()

for item in items:
    # Check data quality
    cloud_cover = item.properties.get('eo:cloud_cover', 100)
    if cloud_cover < 20:
        # Get URLs for analysis
        urls = item.get_band_urls(['B08', 'B04'])  # NIR, Red
        print(f"Clear scene: {item.id}")

Provider-Agnostic Asset Access

def get_rgb_urls(item):
    """Get RGB URLs regardless of provider."""
    assets = item.list_assets()

    # Try different naming conventions
    if all(band in assets for band in ['B04', 'B03', 'B02']):
        return item.get_band_urls(['B04', 'B03', 'B02'])  # Planetary Computer naming
    elif all(band in assets for band in ['red', 'green', 'blue']):
        return item.get_band_urls(['red', 'green', 'blue'])  # EarthSearch naming
    else:
        print(f"Available assets: {assets}")
        return {}

Data Conversion

# Convert collection to DataFrame for analysis
df = items.to_dataframe()

# Filter by date
summer_items = df[df['datetime'].str.contains('2024-0[678]')]

# Group by month
monthly_counts = df.groupby(df['datetime'].str[:7]).size()
print("Monthly data availability:")
print(monthly_counts)

Error Handling

# Robust asset access
def safe_get_asset_url(item, asset_name):
    """Safely get asset URL with error handling."""
    try:
        return item.get_asset_url(asset_name)
    except KeyError:
        available = item.list_assets()
        print(f"Asset {asset_name} not found. Available: {available}")
        return None
    except Exception as e:
        print(f"Error getting URL for {asset_name}: {e}")
        return None