2. Adding and Configuring Custom Extensions for STAC-Items and STAC-Collections

There are two ways to add a custom extension to STAC-Items and STAC-Collections.

2.1. First approach:

If the extension already exists in the PySTAC package (refer to pySTAC extensions), it can be used directly by calling the extension class and applying it to the STAC-Item or STAC-Collection, as explained below. As an example, we add the PySTAC scientific extension to a STAC-Item. To do so, configure the tag_config.json file as follows. For further details on creating a tag_config.json file, refer to Creating the tag_config.json Configuration File: A Step-by-Step Guide.

{
    "scientific_extension": {
        "doi": {
            "tds2stac_mode_analyser": "str",
            "tds2stac_manual_variable": "10.1080/17550874.2023.2274839"
        },
        "citation": {
            "tds2stac_mode_analyser": "str",
            "tds2stac_manual_variable": "10.1080/17550874.2023.2274839"
        },
        "publications": {
            "tds2stac_mode_analyser": "list",
            "tds2stac_manual_variable": "[10.1080/17550874.2023.2274839, 'name']"
        }
    }
}

In the JSON file above, the extension is named “scientific_extension” and contains three keys: “doi”, “citation”, and “publications”. Each key takes a constant value supplied through tds2stac_manual_variable; doi and citation are declared with the str mode and publications with the list mode. Once the configuration has been processed, the harvesting_var dictionary contains the same three keys, each holding the value given in its tds2stac_manual_variable entry.

harvesting_var = {
    "doi": "10.1080/17550874.2023.2274839",
    "citation": "10.1080/17550874.2023.2274839",
    "publications": "[10.1080/17550874.2023.2274839, 'name']",
}
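As a rough illustration of the mapping just described, the sketch below collects the tds2stac_manual_variable of every key into a flat dictionary. The helper name manual_values is hypothetical, and this is not how TDS2STAC processes the file internally; it only mirrors the relationship between the configuration and harvesting_var shown above.

```python
import json

# Same structure as the "scientific_extension" block of tag_config.json above.
TAG_CONFIG = json.loads("""
{
    "scientific_extension": {
        "doi": {
            "tds2stac_mode_analyser": "str",
            "tds2stac_manual_variable": "10.1080/17550874.2023.2274839"
        },
        "citation": {
            "tds2stac_mode_analyser": "str",
            "tds2stac_manual_variable": "10.1080/17550874.2023.2274839"
        },
        "publications": {
            "tds2stac_mode_analyser": "list",
            "tds2stac_manual_variable": "[10.1080/17550874.2023.2274839, 'name']"
        }
    }
}
""")


def manual_values(config: dict, extension: str) -> dict:
    """Hypothetical helper: map each key of one extension to its manual value."""
    return {
        key: spec["tds2stac_manual_variable"]
        for key, spec in config[extension].items()
    }


harvesting_var = manual_values(TAG_CONFIG, "scientific_extension")
print(harvesting_var)
```

Parsing the configuration with json.loads is also a quick way to catch syntax problems, such as a trailing comma, before the file is handed to the integrator.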

At this stage, the values are added to the STAC-Items through the extension that already exists in the PySTAC package. The script below shows two equivalent ways of doing this: a class with a method, and a plain function.

Whether you define a function or a class method, it must accept two parameters. The first, item, is the STAC-Item object to which the extension is added. The second, harvesting_vars, is a dictionary of the variables that have already been harvested and are to be added to the STAC-Item.

from pystac.extensions.scientific import Publication, ScientificExtension


# First
class Scientific:
    """
    a class-based custom extension
    for the item via the defined extension
    in pystac
    """

    def item(self, item, harvesting_vars):
        # Build the publication list from the harvested values.
        item_publication = [
            Publication(
                harvesting_vars["publications"][0],
                harvesting_vars["publications"][1],
            )
        ]
        scientific = ScientificExtension.ext(item, add_if_missing=True)
        scientific.apply(
            doi=harvesting_vars["doi"],
            citation=harvesting_vars["citation"],
            publications=item_publication,
        )


# Second
def item(item, harvesting_vars):
    """
    a function-based custom extension
    for the item via the defined extension
    in pystac
    """
    # Build the publication list from the harvested values.
    item_publication = [
        Publication(
            harvesting_vars["publications"][0],
            harvesting_vars["publications"][1],
        )
    ]
    scientific = ScientificExtension.ext(item, add_if_missing=True)
    scientific.apply(
        doi=harvesting_vars["doi"],
        citation=harvesting_vars["citation"],
        publications=item_publication,
    )
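The two-parameter requirement can be checked before a hook is handed to the integrator. The sketch below uses only the standard library; the helper takes_item_and_vars and the example hooks are hypothetical names introduced for illustration and are not part of TDS2STAC.

```python
import inspect


def takes_item_and_vars(func) -> bool:
    """Hypothetical check: can func be called as func(item, harvesting_vars)?"""
    try:
        # bind() raises TypeError if two positional arguments do not fit.
        inspect.signature(func).bind(object(), {})
    except TypeError:
        return False
    return True


def function_hook(item, harvesting_vars):
    """Function-style hook with the two expected parameters."""


class ClassHook:
    def item(self, item, harvesting_vars):
        """Method-style hook; self is already bound on an instance."""


def broken_hook(item):
    """Missing the harvesting_vars parameter."""


checks = [
    takes_item_and_vars(function_hook),
    takes_item_and_vars(ClassHook().item),
    takes_item_and_vars(broken_hook),
]
print(checks)
```

The Scientific.item method and the item function defined in the script above have the same shape as the first two hooks in this sketch.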

To run the extension script defined above (the Scientific class or the item function), pass it to the TDS2STACIntegrator class as follows. There are two ways to do this.

  1. If the extension script is saved in a separate file named custom_extension.py, give the path to that file as the third element of the tuple so that it is loaded from that location.

TDS2STACIntegrator(
    "TDS_catalog_url",
    stac_dir="/path/to/stac_dir/",
    extension_properties={
        "item_extensions": [
            "common_metadata",
            (
                "scientific_extension",
                "item",  # or "Scientific.item"
                "/path/to/custom_extension.py",
            ),
        ]
    },
)

  2. Alternatively, if the TDS2STACIntegrator call is placed in the same script, after the extension code, the script’s path is not needed as the third element. In this case, the tuple has only two items.

TDS2STACIntegrator(
    "TDS_catalog_url",
    stac_dir="/path/to/stac_dir/",
    extension_properties={
        "item_extensions": [
            "common_metadata",
            (
                "scientific_extension",
                "item",  # or "Scientific.item"
            ),
        ]
    },
)
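To make the role of the tuple elements more concrete, the following sketch shows one general way a name such as "item" or "Scientific.item" could be resolved from a file path with importlib. How TDS2STAC resolves these tuples internally is not documented here, so the load_hook helper is an assumption made purely for illustration, and a plain dictionary stands in for the STAC-Item.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_hook(dotted_name: str, script_path: str):
    """Hypothetical loader: import script_path, resolve "func" or "Class.method"."""
    spec = importlib.util.spec_from_file_location("custom_extension", script_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    parts = dotted_name.split(".")
    target = getattr(module, parts[0])
    if len(parts) == 2:
        # "Scientific.item": instantiate the class, then take the bound method.
        target = getattr(target(), parts[1])
    return target


# Stand-in for custom_extension.py; a real hook would call PySTAC instead.
SCRIPT = '''
def item(item, harvesting_vars):
    item["doi"] = harvesting_vars["doi"]


class Scientific:
    def item(self, item, harvesting_vars):
        item["doi"] = harvesting_vars["doi"]
'''

results = []
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "custom_extension.py"
    path.write_text(SCRIPT)
    for name in ("item", "Scientific.item"):
        fake_item = {}  # plain dict standing in for a STAC-Item
        hook = load_hook(name, str(path))
        hook(fake_item, {"doi": "10.1080/17550874.2023.2274839"})
        results.append(fake_item)

print(results)
```

Both spellings end up calling code with the same (item, harvesting_vars) signature, which is why either can appear as the second element of the tuple.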

2.2. Second approach:

The second approach is to define a custom extension based on the list of STAC extensions maintained by the STAC extensions organization on GitHub, following the procedure described in the PySTAC documentation. As an example, the script below implements the contacts extension from that list.

from typing import Any, Dict, Literal, Union

import pystac
from pystac.extensions.base import (
    ExtensionManagementMixin,
    PropertiesExtension,
)
from pystac.utils import get_required, map_opt

CONTACTS = "contacts"
# contact

NAME = "name"
ORGANIZATION = "organization"
IDENTIFIER = "identifier"
EMAILS = "emails"
PHONES = "phones"
POSITION = "position"
LOGO = "logo"
ADDRESSES = "addresses"
LINKS = "links"
CONTACTINSTRUCTIONS = "contactInstructions"
ROLES = "roles"

# Info (ROLES, defined above, is shared with Contact)

VALUE = "value"

# Address

DELIVERYPOINT = "deliveryPoint"
CITY = "city"
ADMINISTRATIVEAREA = "administrativeArea"
POSTALCODE = "postalCode"
COUNTRY = "country"

# Link

HREF = "href"
REL = "rel"
TYPE = "type"
TITLE = "title"


class Info:
    properties: Dict[str, Any]

    def __init__(self, properties: Dict[str, Any]) -> None:
        self.properties = properties

    @property
    def value(self) -> str:
        return get_required(self.properties.get(VALUE), self, VALUE)

    @value.setter
    def value(self, v: str) -> None:
        self.properties[VALUE] = v

    @property
    def roles(self) -> list | None:
        return self.properties.get(ROLES)

    @roles.setter
    def roles(self, v: list | None) -> None:
        if v is None:
            self.properties.pop(ROLES, None)
        else:
            self.properties[ROLES] = v

    def to_dict(self) -> dict[str, Any]:
        return self.properties

    @staticmethod
    def from_dict(d: dict[str, Any]) -> "Info":
        # The constructor takes a single properties dictionary.
        return Info(d)


class Address:
    properties: Dict[str, Any]

    def __init__(self, properties: Dict[str, Any]) -> None:
        self.properties = properties

    @property
    def deliveryPoint(self) -> list[str] | None:
        return self.properties.get(DELIVERYPOINT)

    @deliveryPoint.setter
    def deliveryPoint(self, v: list[str]) -> None:
        self.properties[DELIVERYPOINT] = v

    @property
    def city(self):
        return self.properties.get(CITY)

    @city.setter
    def city(self, v: str) -> None:
        self.properties[CITY] = v

    @property
    def administrativeArea(self):
        return self.properties.get(ADMINISTRATIVEAREA)

    @administrativeArea.setter
    def administrativeArea(self, v: str) -> None:
        self.properties[ADMINISTRATIVEAREA] = v

    @property
    def postalCode(self):
        return self.properties.get(POSTALCODE)

    @postalCode.setter
    def postalCode(self, v: str) -> None:
        self.properties[POSTALCODE] = v

    @property
    def country(self):
        return self.properties.get(COUNTRY)

    @country.setter
    def country(self, v: str) -> None:
        self.properties[COUNTRY] = v

    def to_dict(self) -> dict[str, Any]:
        return self.properties

    @staticmethod
    def from_dict(d: dict[str, Any]) -> "Address":
        # The constructor takes a single properties dictionary.
        return Address(d)


class Link:
    properties: Dict[str, Any]

    def __init__(self, properties: Dict[str, Any]) -> None:
        self.properties = properties

    @property
    def href(self) -> str:
        return get_required(self.properties.get(HREF), self, HREF)

    @href.setter
    def href(self, v: str) -> None:
        self.properties[HREF] = v

    @property
    def rel(self) -> str:
        return get_required(self.properties.get(REL), self, REL)

    @rel.setter
    def rel(self, v: str) -> None:
        self.properties[REL] = v

    @property
    def type(self):
        return self.properties.get(TYPE)

    @type.setter
    def type(self, v: str) -> None:
        self.properties[TYPE] = v

    @property
    def title(self):
        return self.properties.get(TITLE)

    @title.setter
    def title(self, v: str) -> None:
        self.properties[TITLE] = v

    def to_dict(self) -> dict[str, Any]:
        return self.properties

    @staticmethod
    def from_dict(d: dict[str, Any]) -> "Link":
        # Return a Link (not an Info) wrapping the properties dictionary.
        return Link(d)


class Contact:
    properties: Dict[str, Any]

    def __init__(self, properties: Dict[str, Any]) -> None:
        self.properties = properties

    @property
    def name(self) -> str | None:
        return get_required(self.properties.get(NAME), self, NAME)

    @name.setter
    def name(self, v: str) -> None:
        self.properties[NAME] = v

    @property
    def organization(self) -> str | None:
        return get_required(
            self.properties.get(ORGANIZATION), self, ORGANIZATION
        )

    @organization.setter
    def organization(self, v: str) -> None:
        self.properties[ORGANIZATION] = v

    @property
    def identifier(self) -> str | None:
        return self.properties.get(IDENTIFIER)

    @identifier.setter
    def identifier(self, v: str) -> None:
        self.properties[IDENTIFIER] = v

    @property
    def position(self) -> str | None:
        return self.properties.get(POSITION)

    @position.setter
    def position(self, v: str) -> None:
        self.properties[POSITION] = v

    @property
    def logo(self) -> Link | None:
        return map_opt(Link.from_dict, self.properties.get(LOGO))

    @logo.setter
    def logo(self, v: Link | None) -> None:
        self.properties[LOGO] = map_opt(lambda link: link.to_dict(), v)

    @property
    def phones(self) -> list[Info] | None:
        return map_opt(
            lambda phones: [Info.from_dict(phone) for phone in phones],
            self.properties.get(PHONES),
        )

    @phones.setter
    def phones(self, v: list[Info] | None) -> None:
        self.properties[PHONES] = map_opt(
            lambda phones: [phone.to_dict() for phone in phones], v
        )

    @property
    def emails(self) -> list[Info] | None:
        return map_opt(
            lambda emails: [Info.from_dict(email) for email in emails],
            self.properties.get(EMAILS),
        )

    @emails.setter
    def emails(self, v: list[Info] | None) -> None:
        self.properties[EMAILS] = map_opt(
            lambda emails: [email.to_dict() for email in emails], v
        )

    @property
    def addresses(self) -> list[Address] | None:
        return map_opt(
            lambda addresses: [
                Address.from_dict(address) for address in addresses
            ],
            self.properties.get(ADDRESSES),
        )

    @addresses.setter
    def addresses(self, v: list[Address] | None) -> None:
        self.properties[ADDRESSES] = map_opt(
            lambda addresses: [address.to_dict() for address in addresses],
            v,
        )

    @property
    def links(self) -> list[Link] | None:
        return map_opt(
            lambda links: [Link.from_dict(link) for link in links],
            self.properties.get(LINKS),
        )

    @links.setter
    def links(self, v: list[Link] | None) -> None:
        self.properties[LINKS] = map_opt(
            lambda links: [link.to_dict() for link in links], v
        )

    @property
    def contactInstructions(self) -> str | None:
        return self.properties.get(CONTACTINSTRUCTIONS)

    @contactInstructions.setter
    def contactInstructions(self, v: str) -> None:
        self.properties[CONTACTINSTRUCTIONS] = v

    @property
    def roles(self) -> list[str] | None:
        return self.properties.get(ROLES)

    @roles.setter
    def roles(self, v: list[str]) -> None:
        self.properties[ROLES] = v

    def to_dict(self) -> dict[str, Any]:
        return self.properties

    @staticmethod
    def from_dict(d: dict[str, Any]) -> "Contact":
        # The constructor takes a single properties dictionary; nested
        # values are converted lazily by the property getters above.
        return Contact(d)


SCHEMA_URI: str = (
    "https://stac-extensions.github.io/contacts/v0.1.1/schema.json"
)


class ContactsExtension(
    PropertiesExtension,
    ExtensionManagementMixin[Union[pystac.Item, pystac.Collection]],
):
    name: Literal["contacts"] = "contacts"
    obj: pystac.STACObject

    def __init__(self, item: pystac.Item):
        self.item = item
        self.properties = item.properties

    @classmethod
    def get_schema_uri(cls) -> str:
        return SCHEMA_URI

    def apply(
        self,
        contacts: list[Contact],
    ) -> None:
        self.contacts = contacts

    @property
    def contacts(self) -> list[Contact] | None:
        return map_opt(
            lambda conts: [Contact.from_dict(cont) for cont in conts],
            self._get_property(CONTACTS, list[dict[str, Any]]),
        )

    @contacts.setter
    def contacts(self, v: list[Contact] | None) -> None:
        self._set_property(
            CONTACTS,
            map_opt(lambda conts: [cont.to_dict() for cont in conts], v),
        )

    @classmethod
    def ext(
        cls, obj: pystac.Item, add_if_missing: bool = False
    ) -> "ContactsExtension":
        if isinstance(obj, pystac.Item):
            cls.validate_has_extension(obj, add_if_missing)
            return ContactsExtension(obj)
        else:
            raise pystac.ExtensionTypeError(
                f"ContactsExtension does not apply to type '{type(obj).__name__}'"
            )

The remaining steps are the same as in the first approach.
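The classes in the script above all follow one pattern: each holds a plain dictionary and exposes its fields through properties, so that to_dict returns something that can be stored directly under the "contacts" key of the item properties. The standard-library sketch below illustrates that pattern with a simplified stand-in for Info. The contact values are placeholders, and in this sketch from_dict wraps the whole dictionary rather than unpacking individual fields.

```python
import json
from typing import Any, Dict, List, Optional


class Info:
    """Simplified stand-in for the Info class: a thin view over a dict."""

    def __init__(self, properties: Dict[str, Any]) -> None:
        self.properties = properties

    @property
    def value(self) -> str:
        return self.properties["value"]

    @property
    def roles(self) -> Optional[List[str]]:
        return self.properties.get("roles")

    def to_dict(self) -> Dict[str, Any]:
        return self.properties

    @staticmethod
    def from_dict(d: Dict[str, Any]) -> "Info":
        # Wrap the dictionary as-is; one properties dict, not one arg per field.
        return Info(d)


# Placeholder values; the field names follow the constants defined above.
work_email = Info({"value": "jane.doe@example.org", "roles": ["work"]})
contact = {
    "name": "Jane Doe",
    "organization": "Example Institute",
    "emails": [work_email.to_dict()],
    "roles": ["author"],
}
properties = {"contacts": [contact]}

# Read the nested entry back through the wrapper.
email = Info.from_dict(properties["contacts"][0]["emails"][0])
print(email.value, email.roles)
print(json.dumps(properties, indent=2))
```

Because the wrappers only reference the underlying dictionaries, the structure remains JSON-serialisable at every step.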