1. tds2stac package#

TDS2STAC

class tds2stac.CollectionHarvester(url: str, recognizer: str | None, subdirs: list | None = [], collection_tuples: list[tuple] | None = None, logger_properties: dict = {}, requests_properties: dict = {})[source]#

Bases: object

This class harvests data pertaining to Collections from TDS catalogs. Depending on the dataset scenario, it returns one of the following five variables: collection_id, collection_title, collection_description, collection_url, and collection_subdirs.

Parameters:
  • url (str) – TDS catalog URL address

  • recognizer (str) – status scenario number of Recognizer

  • subdirs (list) – subdirs is a list of url, id, title, and subdirs of a nested dataset

  • collection_tuples (list) – a list of tuples, each holding the STAC collection’s auto-generated ID and the user-defined ID, title, and description.

  • logger_properties (dict) – dictionary of logger properties

  • requests_properties (dict) – dictionary of requests properties

collection_id_desc_maker(url: str, collection_tuples: list[tuple] | None = None, recognizer_output: str | None = None)[source]#

A function for getting the collection ID and description from the TDS catalog URLs and the pre-defined collection_tuples for scenarios 4, 5, 6, 7, and 9.

Parameters:
  • url (str) – TDS catalog URL address

  • collection_tuples (list) – a list of tuples, each holding the STAC collection’s auto-generated ID and the user-defined ID, title, and description.

  • recognizer_output (str) – status scenario output of Recognizer class

collection_tuples: list[tuple] | None#

a list of tuples, each holding the STAC collection’s auto-generated ID and the user-defined ID, title, and description.

logger_properties: dict#

dictionary of logger properties, more information in Logger

recognizer: str | None#

status scenario output of Recognizer class

requests_properties: dict#

To obtain additional information on this topic, refer to the requests_properties. The default value is an empty dictionary.

subdirs: list | None#

subdirs is a list of url, id, title, and subdirs of a nested dataset

url: str#

TDS catalog URL address. Initial point of harvesting e.g. https://thredds.atmohub.kit.edu/thredds/catalog/caribic/IAGOS-CARIBIC_MS_files_collection_20231017/catalog.html (*)

class tds2stac.Datacube[source]#

Bases: object

This class is responsible for adding the datacube extension to the STAC Item.

Parameters:
  • item (pystac.Item) – The STAC Item to be extended.

  • harvesting_vars (dict) – The dictionary of the variables and dimensions of the dataset.

  • logger_properties (dict) – The dictionary of the logger properties.

item_extension(item, harvesting_vars, logger_properties: dict = {})[source]#

class tds2stac.ExistenceValidator(stac_dir: str = '/home/docs/checkouts/readthedocs.org/user_builds/tds2stac/checkouts/latest/docs', logger_properties: dict | None = {})[source]#

Bases: object

A class for verifying the main STAC catalog’s existence. This class is implemented in STACCreator.

Parameters:
  • stac_dir (str, optional) – Directory of the main STAC catalog (*)

  • logger_properties (dict, optional) – A dictionary of properties for logger. default is None.

logger_properties: dict | None#

A dictionary of properties for logger. default is None. You can look at keys in Logger class.

stac_dir: str#

Directory of the main STAC catalog. It can be a relative or absolute path.
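
The idea of such an existence check can be sketched as follows. This is a minimal illustration, not the implementation of ExistenceValidator: it assumes the main catalog is stored as a `catalog.json` file directly inside `stac_dir` (the usual pystac layout), and the helper name `main_catalog_exists` is hypothetical.

```python
import tempfile
from pathlib import Path


def main_catalog_exists(stac_dir: str) -> bool:
    """Return True if a catalog.json file sits directly in stac_dir (assumed layout)."""
    return (Path(stac_dir) / "catalog.json").is_file()


# Demonstration in a throwaway directory, so no real catalog is touched.
with tempfile.TemporaryDirectory() as tmp:
    before = main_catalog_exists(tmp)              # nothing has been written yet
    (Path(tmp) / "catalog.json").write_text("{}")  # placeholder catalog file
    after = main_catalog_exists(tmp)               # the file is now present

print(before, after)
```

Both relative and absolute paths work here, because `pathlib.Path` resolves relative paths against the current working directory.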

class tds2stac.ItemHarvester(url: str, elem: Element, harvesting_vars: dict, web_service_dict: dict | None, datetime_after: datetime | None = None, datetime_before: datetime | None = None, spatial_information: list | None = None, temporal_format_by_dataname: str | None = None, extension_properties: dict | None = None, linestring: bool = False, requests_properties: dict = {}, logger_properties: dict = {})[source]#

Bases: object

This class harvests information about an Item from TDS data catalogs. It ultimately returns a dictionary of harvesting variables, based on the type of dataset scenario and activated extensions.

Parameters:
  • url (str) – TDS catalog URL address

  • elem (Element) – XML element of the data in the dataset

  • harvesting_vars (dict) – dictionary of harvesting variables that is going to be filled

  • web_service_dict (dict) – web service that the user wants to harvest from

  • datetime_after (str) – only data dated after this datetime is harvested

  • datetime_before (str) – only data dated before this datetime is harvested

  • spatial_information (list) – Spatial information of 2D datasets e.g. [minx, maxx, miny, maxy] or 1D dataset e.g. [x,y]

  • temporal_format_by_dataname (str) – datetime format for datasets that have the datetime in their name, e.g. `e%y%m%d%H.%M%S%f` (optional)

  • extension_properties (dict) – dictionary of extension properties (optional)

  • linestring (bool) – set to True to create a LineString geometry instead of a Polygon (optional)

  • logger_properties (dict) – dictionary of logger properties

datetime_after: str | None#

only data dated after this datetime is harvested

datetime_before: str | None#

only data dated before this datetime is harvested

elem: Element#

XML element of the data in the dataset, i.e. the element of the XML file that is going to be harvested

extension_properties: dict | None#

dictionary of extension properties (optional)

harvesting_vars: dict#

dictionary of harvesting variables that is going to be filled

linestring: bool#

set to True to create a LineString geometry instead of a Polygon (optional)

logger_properties: dict#

dictionary of logger properties, more information in Logger

requests_properties: dict#

To obtain additional information on this topic, refer to the requests_properties. The default value is an empty dictionary.

spatial_information: list | None#

Spatial information of 2D datasets e.g. [minx, maxx, miny, maxy] or 1D dataset e.g. [x,y] (optional)
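
How the two accepted shapes of `spatial_information` could map to a geometry can be sketched as below. This is an illustration under stated assumptions, not the library's own code: the function name `geometry_from_spatial_information` is hypothetical, and the output is a plain GeoJSON-style dictionary.

```python
def geometry_from_spatial_information(spatial_information: list) -> dict:
    """Build a GeoJSON-style geometry from [x, y] or [minx, maxx, miny, maxy]."""
    if len(spatial_information) == 2:
        # 1D dataset: a single coordinate pair becomes a Point.
        x, y = spatial_information
        return {"type": "Point", "coordinates": [x, y]}
    if len(spatial_information) == 4:
        # 2D dataset: note the documented order is minx, maxx, miny, maxy.
        minx, maxx, miny, maxy = spatial_information
        ring = [
            [minx, miny],
            [maxx, miny],
            [maxx, maxy],
            [minx, maxy],
            [minx, miny],  # a polygon ring is closed by repeating the first vertex
        ]
        return {"type": "Polygon", "coordinates": [ring]}
    raise ValueError("expected [x, y] or [minx, maxx, miny, maxy]")


point = geometry_from_spatial_information([8.4, 49.0])
polygon = geometry_from_spatial_information([5.0, 15.0, 47.0, 55.0])
print(point["type"], polygon["type"])
```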

temporal_format_by_dataname: str | None#

datetime format for datasets that have the datetime in their name, e.g. `e%y%m%d%H.%M%S%f` (optional)
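
A short sketch of what such a format string expresses, assuming it is interpreted with the standard `strptime` directives of Python's `datetime` module. The data name `e21010112.3045123456` is a made-up example, not taken from a real catalog.

```python
from datetime import datetime

# Format from the documentation: a literal "e", then two-digit year, month, day,
# hour, a literal ".", then minute, second, and microseconds.
temporal_format_by_dataname = "e%y%m%d%H.%M%S%f"

# Hypothetical data name that follows this pattern.
data_name = "e21010112.3045123456"

parsed = datetime.strptime(data_name, temporal_format_by_dataname)
print(parsed.isoformat())
```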

url: str#

TDS catalog URL address. Initial point of harvesting e.g. https://thredds.atmohub.kit.edu/thredds/catalog/caribic/IAGOS-CARIBIC_MS_files_collection_20231017/catalog.html (*)

web_service_dict: dict | None#

web service that the user wants to harvest from

class tds2stac.JSONFileWebServiceListScraper(json_file: str, logger_properties: dict = {})[source]#

Bases: object

A class that collects every tds2stac_webservice_analyser entry in the tag_config.json file whose tds2stac_mode_analyser is set to get or check.

load_and_process_json()[source]#

class tds2stac.Logger(logger_properties: dict[str, Any] | None = {})[source]#

Bases: object

A class-based logger for TDS2STAC. It supports all the handlers from the standard Python logging library.

Parameters:

logger_properties (dict, optional) –

Logger properties. Defaults to dict(). It’s optional and has the following keys:

logger_msg (str, optional)

logger_handler (str, optional)

logger_name (str, optional)

logger_id (str, optional)

logger_level (str, optional)

logger_formatter (str, optional)

logger_handler_host (str, optional)

logger_handler_port (str, optional)

logger_handler_url (str, optional)

logger_handler_method (str, optional)

logger_handler_secure (bool, optional)

logger_handler_credentials (tuple, optional)

logger_handler_context (tuple, optional)

logger_handler_filename (str, optional)

logger_handler_mode (str, optional)

logger_handler_encoding (str, optional)

logger_handler_delay (bool, optional)

logger_handler_errors (str, optional)

logger_handler_mailhost (str, optional)

logger_handler_fromaddr (str, optional)

logger_handler_toaddrs (str, optional)

logger_handler_subject (str, optional)

logger_handler_timeout (str, optional)

Null_Handler()[source]#

This is a function to return a NullHandler

logger_properties: dict[str, Any] | None#

A dictionary that contains all the logger properties.

It is optional and it is set to None by default. The following keys are supported:

logger_msg (str, optional):

Logger message. Defaults to None. But it is required when you want to log a message.

logger_handler (str, optional):

Logger handler. Defaults to NullHandler. Check the following website for more information:

https://docs.python.org/3/library/logging.handlers.html#module-logging.handlers

logger_name (str, optional):

Logger name. Defaults to INSUPDEL4STAC. It’s required when you choose HTTPHandler as logger_handler.

logger_id (str, optional):

Logger id. Defaults to 1. It’s required when you choose HTTPHandler as logger_handler.

logger_level (str, optional):

Logger level. Defaults to DEBUG. It’s optional. For more information check the following website:

https://docs.python.org/3/library/logging.html#levels

logger_formatter (str, optional):

Logger format. Defaults to %(levelname)-8s %(asctime)s t %(filename)s @function %(funcName)s line %(lineno)s - %(message)s. For more information check the following website:

https://docs.python.org/3/library/logging.html#formatter-objects

logger_handler_host (str, optional):

Logger host. Sets the value to ‘None’ by default. It is required when HTTPHandler or SocketHandler is selected as the logger_handler. The logger_handler will be set to ‘NullHandler’ if HTTPHandler or SocketHandler is selected as the logger_handler value and neither logger_handler_host nor logger_handler_port is specified.

logger_handler_port (str, optional):

Logger port. Sets the value to ‘None’ by default. It is required when HTTPHandler or SocketHandler is selected as the logger_handler. The logger_handler will be set to ‘NullHandler’ if HTTPHandler or SocketHandler is selected as the logger_handler value and neither logger_handler_host nor logger_handler_port is specified.

logger_handler_url (str, optional):

Logger url. Sets the value to ‘None’ by default. It is required when HTTPHandler is selected as the logger_handler. The logger_handler will be set to ‘NullHandler’ if HTTPHandler is selected as the logger_handler value and logger_handler_url is not specified.

logger_handler_method (str, optional):

HTTP methods. It supports sending logging messages to a web server, using either GET or POST semantics. Sets the value to ‘None’ by default. It is required when HTTPHandler is selected as the logger_handler. The logger_handler will be set to ‘NullHandler’ if HTTPHandler is selected as the logger_handler value and logger_handler_method is not specified.

logger_handler_secure (bool, optional):

HTTP secure. Sets the value to ‘False’ by default. It is utilized when HTTPHandler or SMTPHandler are selected as the logger_handler. But it is optional in both logger handlers.

logger_handler_credentials (tuple, optional):

HTTP credentials. Sets the value to ‘None’ by default. It is utilized when HTTPHandler or SMTPHandler are selected as the logger_handler. But it is optional in both logger handlers.

logger_handler_context (tuple, optional):

HTTP context. Sets the value to ‘None’ by default. It is utilized when HTTPHandler is selected as the logger_handler, where it is optional.

logger_handler_filename (str, optional):

File name. Sets the value to ‘None’ by default. It is required when FileHandler or WatchedFileHandler are selected as the logger_handler. The logger_handler will be set to ‘NullHandler’ if FileHandler or WatchedFileHandler is selected as the logger_handler value and logger_handler_filename is not specified.

logger_handler_mode (str, optional):

File mode. Sets the value to ‘None’ by default. It is required when FileHandler or WatchedFileHandler are selected as the logger_handler. The logger_handler will be set to ‘NullHandler’ if FileHandler or WatchedFileHandler is selected as the logger_handler value and logger_handler_mode is not specified.

logger_handler_encoding (str, optional):

File encoding. Sets the value to ‘None’ by default. It is utilized when FileHandler or WatchedFileHandler are selected as the logger_handler. But it is optional in both logger handlers.

logger_handler_delay (bool, optional):

File delay. Sets the value to ‘False’ by default. It is utilized when FileHandler or WatchedFileHandler are selected as the logger_handler. But it is optional in both logger handlers.

logger_handler_errors (str, optional):

File errors. Sets the value to ‘None’ by default. It is utilized when FileHandler or WatchedFileHandler are selected as the logger_handler. But it is optional in both logger handlers.

logger_handler_mailhost (str, optional):

Mail host. Sets the value to ‘None’ by default. It is required when SMTPHandler is selected as the logger_handler. The logger_handler will be set to ‘NullHandler’ if SMTPHandler is selected as the logger_handler value and logger_handler_mailhost is not specified.

logger_handler_fromaddr (str, optional):

Mail from address. Sets the value to ‘None’ by default. It is required when SMTPHandler is selected as the logger_handler. The logger_handler will be set to ‘NullHandler’ if SMTPHandler is selected as the logger_handler value and logger_handler_fromaddr is not specified.

logger_handler_toaddrs (str, optional):

Mail to address. Sets the value to ‘None’ by default. It is required when SMTPHandler is selected as the logger_handler. The logger_handler will be set to ‘NullHandler’ if SMTPHandler is selected as the logger_handler value and logger_handler_toaddrs is not specified.

logger_handler_subject (str, optional):

Mail subject. Sets the value to ‘None’ by default. It is utilized when SMTPHandler is selected as the logger_handler, where it is optional.

logger_handler_timeout (str, optional):

Mail timeout. Sets the value to ‘None’ by default. It is utilized when SMTPHandler is selected as the logger_handler, where it is optional.
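
The fallback rule described for the keys above can be sketched as a small check. This is an illustration only, not the Logger class itself: the helper `effective_handler` and the `REQUIRED_KEYS` table are hypothetical, and the sketch takes the strict reading that any missing required key triggers the fallback to NullHandler (for the host and port keys the text above only mentions the case where both are missing).

```python
# Keys described above as "required" for each handler (transcribed from the key list).
REQUIRED_KEYS = {
    "HTTPHandler": [
        "logger_handler_host",
        "logger_handler_port",
        "logger_handler_url",
        "logger_handler_method",
    ],
    "SocketHandler": ["logger_handler_host", "logger_handler_port"],
    "FileHandler": ["logger_handler_filename", "logger_handler_mode"],
    "WatchedFileHandler": ["logger_handler_filename", "logger_handler_mode"],
    "SMTPHandler": [
        "logger_handler_mailhost",
        "logger_handler_fromaddr",
        "logger_handler_toaddrs",
    ],
}


def effective_handler(logger_properties: dict) -> str:
    """Return the handler name that would be used under the assumed fallback rule."""
    handler = logger_properties.get("logger_handler", "NullHandler")
    missing = [
        key
        for key in REQUIRED_KEYS.get(handler, [])
        if logger_properties.get(key) is None
    ]
    return "NullHandler" if missing else handler


file_logging = {
    "logger_handler": "FileHandler",
    "logger_handler_filename": "tds2stac.log",  # example file name
    "logger_handler_mode": "a",
    "logger_level": "DEBUG",
}
incomplete_mail_logging = {
    "logger_handler": "SMTPHandler",
    "logger_handler_mailhost": "localhost",  # fromaddr and toaddrs are missing
}

print(effective_handler(file_logging), effective_handler(incomplete_mail_logging))
```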

class tds2stac.NestedCollectionInspector(main_catalog_url: str, nested_number: int | None = None, logger_properties: dict = {}, requests_properties: dict = {})[source]#

Bases: object

This class generates Collection IDs, Titles, and their corresponding URLs for a given nested number originating from the Recognizer class in TDS. It only works for the nested scenarios 1, 2, 3, 8, and 9 of the Recognizer class. The output is a list of tuples: (Root collection URL, Collection ID, Collection Title, corresponding subset URLs)

Parameters:
  • main_catalog_url (str) – The URL of the TDS catalog

  • nested_number (int, optional) – Number of depth for nested datasets

  • logger_properties (dict, optional) – A dictionary for logger properties

  • requests_properties (dict, optional) – A dictionary for requests properties

aslist()[source]#

A function for returning the list of tuples
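
The following sketch shows one way the returned list of tuples could be turned into a starting point for the collection_tuples argument. It runs on made-up sample data rather than on a live inspector result; the helper name `collection_tuples_template`, the sample URLs, and the sample IDs are all hypothetical.

```python
def collection_tuples_template(inspector_output: list) -> list:
    """Map (root URL, ID, title, subset URLs) tuples to
    (auto-ID, user-ID, user-title, user-description) tuples.

    The auto-generated ID is reused as the user-defined ID, and the description
    is left empty so that it can be filled in by hand.
    """
    return [
        (collection_id, collection_id, title, "")
        for _root_url, collection_id, title, _subset_urls in inspector_output
    ]


# Hypothetical output in the documented tuple layout.
sample_output = [
    (
        "https://example.org/thredds/catalog/a/catalog.xml",
        "a_collection",
        "A collection",
        ["https://example.org/thredds/catalog/a/2020/catalog.xml"],
    ),
    (
        "https://example.org/thredds/catalog/b/catalog.xml",
        "b_collection",
        "B collection",
        [],
    ),
]

template = collection_tuples_template(sample_output)
print(template)
```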

end_point_url_extractor_dict(d: dict)[source]#

A function for extracting the end point URLs of a nested dictionary.

Parameters:

d (dict) – A nested dictionary

end_point_url_extractor_list(list_: list)[source]#

A function for extracting the end point URLs of a nested list.

Parameters:

list_ (list) – A nested list
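
As an illustration of what extracting end points from a nested structure means, here is a generic recursive sketch. It is not the library's implementation: the name `leaf_urls` is hypothetical, and it assumes that the end points are the string leaves of the nested dictionary or list.

```python
def leaf_urls(node) -> list:
    """Collect every string leaf of a nested dict/list structure, depth first."""
    if isinstance(node, str):
        return [node]
    # For a dictionary only the values are followed; a list is walked directly.
    children = node.values() if isinstance(node, dict) else node
    urls = []
    for child in children:
        urls.extend(leaf_urls(child))
    return urls


# Hypothetical nested structure with placeholder URLs.
nested = {
    "root": {
        "2020": "https://example.org/2020/catalog.xml",
        "2021": [
            "https://example.org/2021/a/catalog.xml",
            "https://example.org/2021/b/catalog.xml",
        ],
    },
    "extra": "https://example.org/extra/catalog.xml",
}

print(leaf_urls(nested))
```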

final_collections_details_returner(url: str)[source]#

A function for returning the URLs of input URL in First and Third cases in TDS

Parameters:

url (str) – The URL of the TDS catalog

logger_properties: dict#

A dictionary for logger properties. For more information see Logger

main_catalog_url: str#

The URL of the TDS catalog

n_level(d: dict, layer: int)[source]#

For decoding the generator object of to_level function. https://stackoverflow.com/a/68228562

Parameters:
  • d (dict) – A nested dictionary

  • layer (int) – The depth of the dictionary

nested_dict_returner(url: str, dict: dict)[source]#

A function for getting the nested dictionary of a given URL.

Parameters:
  • url (str) – The URL of the TDS catalog

  • dict (dict) – A nested dictionary

nested_number: int | None#

Number of depth for nested datasets

requests_properties: dict#

To obtain additional information on this topic, refer to the requests_properties. The default value is an empty dictionary.

to_level(d: dict, layer: int)[source]#

A function for getting the dictionary at a given depth. https://stackoverflow.com/a/68228562

Parameters:
  • d (dict) – A nested dictionary

  • layer (int) – The depth of the dictionary
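
The generator-plus-decoder pattern described for to_level and n_level can be sketched as follows. This is a generic illustration and may differ from both the library code and the linked answer; the names `values_at_depth` and `decode_level` are hypothetical.

```python
def values_at_depth(d: dict, layer: int):
    """Yield every value found exactly `layer` levels below the top of a nested dict."""
    if layer == 0:
        yield d
    elif isinstance(d, dict):
        for value in d.values():
            yield from values_at_depth(value, layer - 1)


def decode_level(d: dict, layer: int) -> list:
    """Turn the generator into a plain list, which is easier to inspect."""
    return list(values_at_depth(d, layer))


nested = {"a": {"b": {"c": 1}}, "x": {"y": {"z": 2}}}
print(decode_level(nested, 2))
```

Branches that end before the requested depth simply contribute nothing, so the function never raises on uneven nesting.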

class tds2stac.Recognizer(main_catalog_url: str, nested_check: bool = False, logger_properties: dict = {}, requests_properties: dict = {})[source]#

Bases: object

A class for recognizing nine possible scenarios in the management of TDS datasets. Each scenario is explained below.

First scenario: Just catalogRef tags are located directly under the dataset element tag.

https://thredds.imk-ifu.kit.edu/thredds/catalog/regclim/raster/global/era5/sfc/single/catalog.xml (nested)

Second scenario: catalogRef tags are not under a dataset element tag and come directly below the catalog.

https://thredds.imk-ifu.kit.edu/thredds/catalog/catalogues/sensor_catalog_ext.xml (nested)

Third scenario: One single dataset tag is located next to CatalogRef tags. All are under a dataset tag.

https://thredds.imk-ifu.kit.edu/thredds/catalog/regclim/raster/global/chirps/catalog.xml (nested)

Fourth scenario: An empty dataset.

https://thredds.imk-ifu.kit.edu/thredds/catalog/catalogues/bio_geo_chem_catalog_ext.xml or https://thredds.atmohub.kit.edu/thredds/catalog/snowfogs/catalog.xml

Fifth scenario: There are no catalogRef tags, only dataset tags, and all of them are under a dataset tag.

https://thredds.imk-ifu.kit.edu/thredds/catalog/climate/raster/global/chelsa/v1.2/catalog.html

Sixth scenario: A single dataset

https://thredds.imk-ifu.kit.edu/thredds/catalog/regclim/raster/global/era5/sfc/single/catalog.xml?dataset=regclim/raster/global/era5/sfc/single/era5_sfc_20210101.nc

Seventh scenario: An aggregated dataset

https://thredds.imk-ifu.kit.edu/thredds/catalog/catalogues/swabian_moses_2021.xml?dataset=swabian_moses_aggregation

Eighth scenario: A combination of catalogRef and dataset tags that is not under a dataset tag. It’s similar to the second scenario but with datasets.

https://thredds.imk-ifu.kit.edu/thredds/catalog/catalogues/transfer.xml

Ninth scenario: Several single dataset tags are located next to catalogRef tags. It’s similar to the third scenario but with more datasets.

https://thredds.imk-ifu.kit.edu/thredds/catalog/regclim/raster/global/hydrogfd/v3.0/catalog.xml (nested)
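
To make the structural differences between the scenarios concrete, the sketch below classifies a catalog by counting catalogRef and dataset tags. It is a simplified reading of the descriptions above and not the library's recognition_function: it only covers scenarios 1, 2, 3, 5, 8, and 9, returns None for everything else, and the function name `classify_catalog` and the inline sample catalogs are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace of THREDDS InvCatalog 1.0 documents.
NS = "{http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0}"


def classify_catalog(xml_text: str):
    """Return a scenario number based only on where catalogRef/dataset tags sit."""
    root = ET.fromstring(xml_text)
    top_refs = root.findall(NS + "catalogRef")      # direct children of <catalog>
    top_datasets = root.findall(NS + "dataset")
    if top_refs:
        # catalogRef tags directly below the catalog: scenario 2, or 8 with datasets.
        return 8 if top_datasets else 2
    if len(top_datasets) != 1:
        return None
    parent = top_datasets[0]
    refs = parent.findall(NS + "catalogRef")
    datasets = parent.findall(NS + "dataset")
    if refs and not datasets:
        return 1
    if refs and len(datasets) == 1:
        return 3
    if refs:
        return 9
    if datasets:
        return 5
    return None  # empty, single, or aggregated datasets need more than tag counts


HEADER = (
    '<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0" '
    'xmlns:xlink="http://www.w3.org/1999/xlink">'
)
FIRST = HEADER + """
  <dataset name="parent">
    <catalogRef xlink:href="a/catalog.xml" xlink:title="a" name=""/>
    <catalogRef xlink:href="b/catalog.xml" xlink:title="b" name=""/>
  </dataset>
</catalog>"""
SECOND = HEADER + """
  <catalogRef xlink:href="a/catalog.xml" xlink:title="a" name=""/>
</catalog>"""

print(classify_catalog(FIRST), classify_catalog(SECOND))
```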

Parameters:
  • main_catalog_url – TDS Catalog url to start harvesting

  • nested_check – An option for checking nested datasets in TDS (True or False)

  • auth – Authentication for TDS catalog e.g.(‘user’, ‘password’)

  • logger_properties – A dictionary for logger properties.

  • requests_properties – A dictionary for requests properties.

logger_properties: dict#

A dictionary for logger properties. For more information see Logger

main_catalog_url: str#

TDS Catalog url to start harvesting (*)

nested_check: bool#

An option for checking nested datasets in TDS (True or False) (optional)

nested_checker(url: str)[source]#

A function for returning the depth of nested datasets in TDS for scenarios 1, 3, and 9

nested_checker_exceptions(url: str)[source]#

A function for returning the depth of nested datasets in TDS for scenarios 2 and 8

recognition_function(url: str, xml_content)[source]#

A function for recognizing the scenario number of a TDS catalog

requests_properties: dict#

To obtain additional information on this topic, refer to the requests_properties. The default value is an empty dictionary.

class tds2stac.STACCreator[source]#

Bases: object

A class for creating the STAC-Catalog, -Collections, and -Items from TDS dataset catalogs.

STACCatalog(url: str, stac_id: str, stac_title: str | None, stac_desc: str | None, stac_dir: str, stac_existence: bool = False, logger_properties: dict = {}, requests_properties: dict = {})[source]#

A function for creating STAC catalog from TDS dataset catalog.

Parameters:
  • url – The URL of the TDS catalog.

  • stac_id – The ID of the STAC catalog.

  • stac_title – The title of the STAC catalog.

  • stac_desc – The description of the STAC catalog.

  • stac_dir – The directory of saving the STAC catalog.

  • stac_existence – If True, the STAC catalog already exists in the directory, so harvesting does not create a new STAC-Catalog and instead imports the new collections into the existing one. False by default.

  • logger_properties – The properties of the logger. For more information please check the Logger class.

  • requests_properties – The properties of the requests. For more information please check the requests_properties class.

STACCollection(catalog: Catalog, collection_id: str, collection_title: str, collection_description: str, stac_existence_collection: bool = False, logger_properties: dict = {}, extra_metadata: dict = {})[source]#

This is a function for creating STAC collection from harvested information from TDS dataset catalog. This function returns a dictionary with two keys:

  1. collection: The STAC collection

  2. existed_items_id_list: The list of the items that already exist in the STAC collection and it is going to be used for the harvesting process.

Parameters:
  • catalog – The STAC catalog.

  • collection_id – The ID of the STAC collection.

  • collection_title – The title of the STAC collection.

  • collection_description – The description of the STAC collection.

  • collection_scientific – The scientific extension of the STAC collection.

  • stac_existence_collection – If True, the STAC collection already exists in the catalog, so harvesting does not create a new STAC-Collection and instead imports the new items into the existing one. False by default.

  • logger_properties – The properties of the logger. For more information please check the Logger class.

STACItem(url: str, catalog: Catalog, harvesting_vars: dict, Recognizer_output: str | None, collection_id: str, aggregated_dataset_url: str | None = None, extension_properties: dict | None = None, asset_properties: dict | None = {}, logger_properties: dict = {}, extra_metadata: dict = {}, stac_existence_collection: bool = False, collection_bbox_existed: list | None = None, collection_interval_time_final_existed: list | None = None)[source]#

This is a function for creating STAC item from harvested data in TDS dataset catalog.

Parameters:
  • url – The URL of the TDS catalog.

  • catalog – The STAC catalog.

  • harvesting_vars – The harvested data from TDS catalog.

  • Recognizer_output – The output of the Recognizer class.

  • collection_id – The ID of the STAC collection.

  • aggregated_dataset_url – The URL of the aggregated dataset in which all of the data is located.

  • extension_properties – The properties of the extensions.

  • asset_properties – The properties of the assets.

  • logger_properties – The properties of the logger. For more information please check the Logger class.

SaveCatalog(catalog, catalog_dir, logger_properties: dict = {})[source]#

class tds2stac.Spatial[source]#

Bases: object

harvester(main_dict, linestring=None)[source]#

regulator(main_dict, spatial_information)[source]#

A function for regulating the spatial information of a catalog

class tds2stac.TDS2STACIntegrator(TDS_catalog: str, stac_dir: str = '/home/docs/checkouts/readthedocs.org/user_builds/tds2stac/checkouts/latest/docs', stac_id: str = 'TDS2STAC', stac_title: str | None = 'TDS2STAC', stac_description: str | None = None, stac_existence: bool = False, stac_existence_collection: bool = False, collection_tuples: list[tuple] | None = None, datetime_filter: list | None = None, aggregated_dataset_url: str | None = None, depth_number: int | None = None, limited_number: int | None = None, spatial_information: list | None = None, temporal_format_by_dataname: str | None = None, item_geometry_linestring: bool = False, webservice_properties: dict | None = {}, asset_properties: dict | None = {}, extension_properties: dict | None = {}, logger_properties: dict = {}, requests_properties: dict = {}, extra_metadata: dict = {})[source]#

Bases: object

This class is the central component of TDS2STAC. It harvests the TDS catalog and then generates the STAC-Catalog, -Collections, and -Items through the TDS catalogs, based on the user’s input. This class mainly defines all configurations related to harvesting and STAC creation. In the first step, it recognizes the scenario of the TDS catalog using Recognizer. If it is recognized as a nested collection, NestedCollectionInspector is responsible for determining the nested collection’s ID, Title, and URL of subdirectories. Other procedures follow in succession. For example, CollectionHarvester harvests the collection’s information and STACCreator creates the STAC-Catalog and -Collection. Then, ItemHarvester harvests the item’s information and STACCreator creates the STAC-Item and connects it to the related STAC-Collection. At the end, each STAC-Collection is connected to the main STAC-Catalog.

Parameters:
  • TDS_catalog (str) – The URL address of the TDS catalog that will be harvested.

  • stac_dir (str, optional) – Directory in which the created STAC catalogs are saved.

  • stac_id (str, optional) – STAC catalog ID. default value is ‘TDS2STAC’.

  • stac_title (str, optional) – STAC catalog Title. default value is ‘TDS2STAC’.

  • stac_description (str, optional) – STAC catalog description.

  • stac_existence (bool, optional) – Verifying the presence of the STAC catalog in order to update an existing catalog; if not, a new catalog will be generated.

  • stac_existence_collection (bool, optional) – Verifying the presence of the STAC Collection in order to update an existing catalog; if not, a new collection will be generated.

  • collection_tuples (list, optional) – A list of tuples. The elements of each tuple are the ID auto-generated by TDS2STAC and the user-defined ID, title, and description of the STAC-Collection, in that order: (auto-ID, user-ID, user-title, user-description).

  • datetime_filter (list, optional) – Datetime-based filtering of harvesting. It works based on the modified tag in each dataset at TDS.

  • aggregated_dataset_url (str, optional) – Dataset’s URL of each data entry in the Aggregated datasets of TDS.

  • depth_number (int, optional) – depth number of nested datasets if it is a nested collection. default value is 0.

  • limited_number (int, optional) – The objective is to reduce the quantity of harvested items in each collection. It is beneficial for developing and testing purposes.

  • spatial_information (list, optional) – Spatial information of 2D datasets e.g. [minx, maxx, miny, maxy] or 1D dataset e.g. [x,y]. default value is None.

  • temporal_format_by_dataname (str, optional) – A preferred datetime format for datasets that include the time period in their names. e.g “e%y%m%d%H.%M%S%f”

  • item_geometry_linestring (bool, optional) – Set True to make a LineString geometry for STAC-Items from wms service. Otherwise it makes Polygon geometry for the given Item. default value is False.

  • extension_properties (dict, optional) – A dictionary of properties for extensions. default is None. For more information about the keys, please refer to the extension_properties.

  • webservice_properties (dict, optional) – A dictionary of properties for web_service. default is None. For more information about the keys, please refer to the webservice_properties.

  • asset_properties (dict, optional) – A dictionary of properties for assets. default is None. For more information about the keys, please refer to the asset_properties.

  • logger_properties (dict, optional) – A dictionary of properties for logger. default is None.

  • requests_properties – A dictionary that modifies the requests to URLs. To obtain additional information on this topic, refer to the requests_properties. The default value is an empty dictionary.
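
The parameters above can be collected in a plain dictionary before the class is called, as in the sketch below. The helper names `build_integrator_kwargs` and `run_harvest` are hypothetical, the chosen values are examples only, and the sketch assumes that instantiating TDS2STACIntegrator with these keyword arguments starts the harvest. `run_harvest` is defined but not called here, because it needs the tds2stac package and network access.

```python
def build_integrator_kwargs(tds_catalog_url: str, stac_dir: str) -> dict:
    """Collect keyword arguments using only parameter names documented above."""
    return {
        "TDS_catalog": tds_catalog_url,
        "stac_dir": stac_dir,
        "stac_id": "TDS2STAC",
        "stac_title": "TDS2STAC",
        "stac_description": "Catalog harvested from a TDS server",  # example text
        "limited_number": 2,       # keep test runs small
        "datetime_filter": None,   # no time-based filtering
        "logger_properties": {"logger_handler": "NullHandler"},
    }


def run_harvest(kwargs: dict):
    """Start a harvest; requires the tds2stac package and a reachable TDS server."""
    from tds2stac import TDS2STACIntegrator  # imported lazily on purpose

    return TDS2STACIntegrator(**kwargs)


kwargs = build_integrator_kwargs(
    "https://thredds.atmohub.kit.edu/thredds/catalog/caribic/"
    "IAGOS-CARIBIC_MS_files_collection_20231017/catalog.html",
    "./stac",  # example output directory
)
print(sorted(kwargs))
```

Calling `run_harvest(kwargs)` would then hand the configuration to the class; keeping the import inside the function lets the configuration be built and inspected without the package installed.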

TDS_catalog: str#

TDS catalog URL address. Initial point of harvesting e.g. https://thredds.atmohub.kit.edu/thredds/catalog/caribic/IAGOS-CARIBIC_MS_files_collection_20231017/catalog.html

aggregated_dataset_url: str | None#

Dataset’s URL of each data entry in the Aggregated datasets of TDS. Default value is None. The HTTPServer service is not functional in an aggregated dataset. Therefore, to use this service as an asset in the STAC-Item, set aggregated_dataset_url, which links the individual datasets to the HTTPServer asset of the relevant Item.

asset_properties: dict | None#

A dictionary of properties for assets. default is None. When it’s None, keys look like the following example:

collection_thumbnail (str, optional):

A thumbnail asset for STAC-collection sourced from the Web Map Service (WMS) of the TDS. It can be chosen from wms, link, or None. The default value is set to None.

collection_overview (str, optional):

An overview asset for STAC-collection sourced from the Web Map Service (WMS) of the TDS. It can be chosen from wms, link, or None. The default value is set to None.

collection_thumbnail_link (str, optional):

This property is reliant upon the values of collection_thumbnail and collection_overview. When the value of either of these attributes is set to link, it allows for the inclusion of a hyperlink to an image for collection_thumbnail or collection_overview.

collection_overview_link (str, optional):

This property is reliant upon the values of collection_thumbnail and collection_overview. When the value of either of these attributes is set to link, it allows for the inclusion of a hyperlink to an image for collection_thumbnail or collection_overview.

collection_custom_assets (list, optional):

This is a list of asset dictionaries, each including the key, href, title, role (as a list), and media_type of the asset. The default value is set to None. For more information, refer to How to make a custom asset for STAC-Collection and STAC-Item.

item_thumbnail (bool, optional):

A thumbnail asset for STAC-Items sourced from the Web Map Service (WMS) of the TDS. The default value is set to False.

item_overview (bool, optional):

An overview asset for STAC-Items sourced from the Web Map Service (WMS) of the TDS. The default value is set to False.

item_getminmax_thumbnail (bool, optional):

The TDS offers a function that allows users to obtain the minimum and maximum values of the colorbar associated with an image through the use of metadata. The aforementioned attribute is contingent upon both the item_thumbnail and item_overview. The default value is set to False.

assets_list_allowed (list, optional):

This is a list of permissible web services that will be incorporated as assets in the STAC-Item. The WebServiceScraper class provides access to the list of available web services. Default value is None.

assets_list_avoided (list, optional):

This is a list of web services that will be excluded from the STAC-Item asset list. The WebServiceScraper class provides access to the list of available webservices. Default value is None.

explore_data (bool, optional):

By enabling the True setting, the inclusion of Godiva3 as an exploration asset will be implemented.

verify_explore_data (bool, optional):

This argument verifies the availability of the GetMetadata function, which retrieves the data needed to generate maps through the Web Map Service (WMS) protocol. Opening Godiva3 fails when this function does not work, so enabling this argument is advisable to avoid such errors.

jupyter_notebook (bool, optional):

This argument enables the inclusion of the Jupyter Notebook as an asset.
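
A possible asset_properties dictionary, together with a small consistency check for the link-dependent keys described above, might look like the sketch below. The values are examples, the image URL is a placeholder, and the helper `missing_link_keys` is hypothetical and not part of the package.

```python
def missing_link_keys(asset_properties: dict) -> list:
    """List the *_link keys that are needed but absent.

    A collection_thumbnail or collection_overview set to "link" needs the
    matching collection_thumbnail_link or collection_overview_link value.
    """
    missing = []
    for name in ("collection_thumbnail", "collection_overview"):
        link_key = name + "_link"
        if asset_properties.get(name) == "link" and not asset_properties.get(link_key):
            missing.append(link_key)
    return missing


asset_properties = {
    "collection_thumbnail": "link",
    "collection_thumbnail_link": "https://example.org/thumbnail.png",  # placeholder
    "collection_overview": "wms",
    "item_thumbnail": True,
    "item_overview": False,
    "item_getminmax_thumbnail": True,
    "explore_data": True,
    "verify_explore_data": True,
    "jupyter_notebook": False,
}

print(missing_link_keys(asset_properties))
```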

collection_tuples: list[tuple] | None#

STAC collection auto-generated ID, user-ID, user-Title and user-Description defined by user. To obtain the list of automatically generated collection IDs, run the NestedCollectionInspector on the given TDS Catalog and then fill in this argument. Warning - Identifiers should consist of only lowercase characters, numbers, ‘_’, and ‘-’. Default value None. e.g. (ID, Title, Description)
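
The identifier rule in the warning can be checked before harvesting with a small helper such as the one below. This is a sketch: the regular expression is derived from the wording of the warning, the function name `invalid_user_ids` is hypothetical, and the sample tuples are made up.

```python
import re

# Assumed pattern: only lowercase letters, digits, "_" and "-".
ID_PATTERN = re.compile(r"[a-z0-9_-]+")


def invalid_user_ids(collection_tuples: list) -> list:
    """Return the user-defined IDs (second tuple element) that break the rule."""
    return [
        entry[1]
        for entry in collection_tuples
        if not ID_PATTERN.fullmatch(entry[1])
    ]


collection_tuples = [
    ("auto_generated_id_1", "era5_sfc", "ERA5 surface", "Example description"),
    ("auto_generated_id_2", "Bad ID!", "Broken", "Upper case and spaces are rejected"),
]

print(invalid_user_ids(collection_tuples))
```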

datetime_filter: list | None#

Datetime-based filtering, e.g. ['2010-02-18T00:00:00.000Z','2020-02-22T00:00:00.000Z']. Default value None. Note that the filter is applied to the modified tag of each dataset in the TDS.
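As a rough illustration of how such a two-element filter can be evaluated against a dataset's modified timestamp, consider the sketch below. It uses only the standard library; the helper name and the choice of inclusive bounds are assumptions, not the library's confirmed logic.

```python
from datetime import datetime

# Format of the example timestamps shown above.
ISO_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def within_filter(modified: str, datetime_filter: list) -> bool:
    """Return True if `modified` lies between the two filter bounds (inclusive)."""
    start, end = (datetime.strptime(value, ISO_FORMAT) for value in datetime_filter)
    return start <= datetime.strptime(modified, ISO_FORMAT) <= end


datetime_filter = ["2010-02-18T00:00:00.000Z", "2020-02-22T00:00:00.000Z"]
print(within_filter("2015-06-01T12:00:00.000Z", datetime_filter))
```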

depth_number: int | None#

The depth refers to the number of nested dataset levels. This argument applies only when the collection is nested; otherwise it has no effect. Default value None (optional).

extension_properties: dict | None#

A dictionary of properties for extensions. Default is None.

item_extensions (list[str, tuple], optional):

The argument can consist of either a list of extension names (string) or a list of tuples containing three elements: the extension name, the function name or class name associated with the extension, and the Python script required for execution. For more explanation, refer to the Adding and Configuring Custom Extensions for STAC-Items and STAC-Collections.

collection_extensions (Union[list, tuple], optional):

It works the same way as the item_extensions argument. For more explanation, refer to the Adding and Configuring Custom Extensions for STAC-Items and STAC-Collections.

extra_metadata: dict#

A dictionary of extra metadata to add to the STAC-Collection and STAC-Items. It has two main keys: extra_metadata, a boolean, and extra_metadata_file, the path to the extra_metadata.json file. For more information about creating the extra_metadata.json file, refer to How to create extra_metadata.json file. By default, if ‘extra_metadata’ is set to True, the ‘extra_metadata_file’ key points to the ‘extra_metadata.json’ file located in the ‘sta2stac’ main directory.

item_geometry_linestring: Literal[False]#

The default value is False, and the default geometry type of a STAC-Item is Polygon. Items with a POINT geometry are detected automatically. To obtain a LineString geometry in STAC-Items from the WMS service, set this argument to True.

limited_number: int | None#

Limits the number of harvested items in each collection. It is useful for development and testing purposes. Default value None (optional).

logger_properties: dict#

A dictionary of properties for the logger. Default is None. See the Logger class for the available keys.

requests_properties: dict#

A dictionary of properties that adjust the requests to URLs. It contains the following keys:

verify (bool, optional):

If True, the SSL certificate is verified. By default it is False.

timeout (int, optional):

The timeout of the requests, in seconds. By default it is 10 seconds.

auth (tuple, optional):

A tuple containing the username and password for authentication. By default it is None.
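The three keys and their documented defaults can be summarised in a short sketch. The helper below is hypothetical and only shows how a partial user dictionary could be completed with the defaults listed above; it is not how TDS2STAC itself processes the argument.

```python
# Defaults as documented above: verify=False, timeout=10 seconds, auth=None.
DEFAULT_REQUESTS_PROPERTIES = {"verify": False, "timeout": 10, "auth": None}


def with_request_defaults(user_properties=None):
    """Return a copy of the defaults, overridden by any user-supplied keys."""
    merged = dict(DEFAULT_REQUESTS_PROPERTIES)
    merged.update(user_properties or {})
    return merged


# Placeholder credentials, for illustration only.
requests_properties = with_request_defaults({"timeout": 30, "auth": ("user", "secret")})
print(requests_properties)
```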

spatial_information: list | None#

Spatial information of 2D datasets, e.g. [minx, maxx, miny, maxy], or of 1D datasets, e.g. [x, y]. Default value None (optional).
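Note that the 2D ordering here is [minx, maxx, miny, maxy], which differs from the [minx, miny, maxx, maxy] order of a STAC bbox. The sketch below shows one way such a list maps onto a GeoJSON geometry; the function name is hypothetical and the conversion is an illustration, not the library's internal routine.

```python
def to_geojson_geometry(spatial_information):
    """Turn [x, y] into a Point, or [minx, maxx, miny, maxy] into a Polygon."""
    if len(spatial_information) == 2:
        x, y = spatial_information
        return {"type": "Point", "coordinates": [x, y]}
    if len(spatial_information) == 4:
        minx, maxx, miny, maxy = spatial_information
        # Closed ring, counter-clockwise, starting at the lower-left corner.
        ring = [[minx, miny], [maxx, miny], [maxx, maxy], [minx, maxy], [minx, miny]]
        return {"type": "Polygon", "coordinates": [ring]}
    raise ValueError("expected [x, y] or [minx, maxx, miny, maxy]")


print(to_geojson_geometry([5.0, 15.0, 47.0, 55.0]))
```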

stac_description: str | None#

STAC catalog description

stac_dir: str#

Directory of saving created STAC catalogs e.g. /path/to/stac/directory/

stac_existence: Literal[False]#

Verifies the existence of the STAC catalog. If the catalog exists in the directory, the existing catalog is updated; otherwise a new catalog is created. Default value False.

stac_existence_collection: Literal[False]#

Verifies the existence of STAC Collections. If the collection exists in the directory, the existing collection is updated; otherwise a new collection is created. Default value False.

stac_id: str#

STAC catalog ID. Default value TDS2STAC.

stac_title: str | None#

STAC catalog title. Default value TDS2STAC.

temporal_format_by_dataname: str | None#

A preferred datetime format for datasets that include the time period in their names, e.g. “e%y%m%d%H.%M%S%f”. Default value None (optional).
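The example format uses standard strptime directives, so its effect can be previewed with the standard library. The dataset name below is hypothetical, and applying the format to the whole name is an assumption about usage rather than a description of the library's internals.

```python
from datetime import datetime

temporal_format_by_dataname = "e%y%m%d%H.%M%S%f"

# Hypothetical dataset name: year 20, month 01, day 01, hour 12,
# then minute 30, second 45 and microseconds 123456 after the dot.
data_name = "e20010112.3045123456"

parsed = datetime.strptime(data_name, temporal_format_by_dataname)
print(parsed.isoformat())
```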

webservice_properties: dict | None#

A dictionary of properties for web_service. Default is None.

It has the following keys:

web_service_config_file (str, optional):

The default tag_config.json file is located in the main directory of the installed TDS2STAC. The user can, however, supply an alternative tag_config.json file to customize the settings by specifying the location of their own JSON file here. For further details on creating a tag_config.json file, refer to Creating the tag_config.json Configuration File: A Step-by-Step Guide. The default value is tag_config.json in the root directory of the installed app.

class tds2stac.Temporal[source]#

Bases: object

parse_datetime_with_fallback(datetime_str, primary_format, fallback_format, tzinfo)[source]#
regulator(main_dict, temporal_format_by_dataname, data_name)[source]#
safe_strip(value)[source]#
class tds2stac.Thumbnails[source]#

Bases: object

This class is used to create thumbnail images for STAC-Collections and STAC-Items.

collection(collection_thumbnail: str, collection_overview: str, services: Element, dataset: dict, harvesting_vars: dict, collection_id: str, url: str, catalog: Catalog, collection_thumbnail_link: str, collection_overview_link: str, logger_properties: dict = {})[source]#

A function to create thumbnail images for STAC-Collections.

Parameters:
  • collection_thumbnail (str) – The type of thumbnail image for STAC-Collections. It can be wms or link.

  • collection_overview (str) – The type of overview image for STAC-Collections. It can be wms or link.

  • services (list) – A list of services for STAC-Collections.

  • dataset (dict) – A dictionary of dataset information.

  • harvesting_vars (dict) – A dictionary of harvesting variables.

  • collection_id (str) – The ID of STAC-Collections.

  • url (str) – The URL of STAC-Catalog.

  • catalog (pystac.Catalog) – A STAC-Catalog.

  • collection_thumbnail_link (str) – The link to the thumbnail image for STAC-Collections when collection_thumbnail or collection_overview is set to link.

  • collection_overview_link (str) – The link to the overview image for STAC-Collections when collection_thumbnail or collection_overview is set to link.

  • logger_properties (dict) – A dictionary of logger properties. For more information, please see Logger class.

item(service: Element, dataset: dict, harvesting_vars: dict, url: str, item: Item, item_thumbnail: bool, item_overview: bool, item_getminmax_thumbnail: bool, logger_properties: dict = {})[source]#

A function to create thumbnail images for STAC-Items.

Parameters:
  • service (list) – A list of services for STAC-Items.

  • dataset (dict) – A dictionary of dataset information.

  • harvesting_vars (dict) – A dictionary of harvesting variables.

  • url (str) – The URL of STAC-Catalog.

  • item (pystac.Item) – A STAC-Item.

  • item_thumbnail (bool) – A boolean to create thumbnail image for STAC-Items.

  • item_overview (bool) – A boolean to create overview image for STAC-Items.

  • item_getminmax_thumbnail (bool) – A boolean to create thumbnail image for STAC-Items based on minmax.

  • logger_properties (dict) – A dictionary of logger properties. For more information, please see Logger class.

class tds2stac.Verifier[source]#

Bases: object

This class is responsible for verifying the properties of the dictionary arguments.

asset_properties(asset_properties: dict)[source]#

This function is responsible for refining the values of the asset_properties dictionary.

extension_properties(extension_properties: dict)[source]#

This function is responsible for refining the values of the extension_properties dictionary.

extra_metadata(extra_metadata: dict) dict[source]#

This function is responsible for refining the values of the extra_metadata dictionary.

logger_properties(logger_properties: dict) dict[source]#

This function is responsible for refining the values of the logger_properties dictionary.

requests_properties(requests_properties: dict) dict[source]#

This function is responsible for refining the values of the requests_properties dictionary.

webservice_properties(webservice_properties: dict)[source]#

This function is responsible for refining the values of the webservice_properties dictionary.

class tds2stac.WebServiceContentScraper(root: _Element, service_url: str, json_file: str, extensions_list: list, harvesting_vars: dict | None = None, logger_properties: dict = {})[source]#

Bases: object

This class relies on the settings specified in the tag_config.json file to harvest targeted information from a selected web service. For comprehensive instructions on configuring the tag_config.json file, refer to Creating the tag_config.json Configuration File: A Step-by-Step Guide.

Args:

root (etree._Element): The root of the XML-based web service

json_file (str): The path to the tag_config.json file

extensions_list (list): The list of extensions to be harvested from the web service (main keys in the tag_config.json file)

harvesting_vars (dict, optional): The dictionary of harvesting variables

logger_properties (dict, optional): The dictionary of the logger properties.

extensions_list: list#

The list of extensions to be harvested from the web service. Main keys in the tag_config.json file. For example item_datacube_extension and so on.

harvester(root, service_url, json_file, ext_name, harvesting_vars=None)[source]#
harvesting_vars: dict | None#

A dictionary whose keys are variable names and whose values are the results of harvesting.

json_file: str#

The path to the tag_config.json file

logger_properties: dict#

The dictionary of the logger properties. You can look at keys in Logger class.

root: _Element#

Etree root object of the XML-based web service

class tds2stac.WebServiceListScraper(url: str, logger_properties: dict = {}, requests_properties: dict = {})[source]#

Bases: object

A class for getting the list of available web services of a TDS catalog.

Args:

url (str): The catalog URL from TDS to provide its web services

logger_properties (dict, optional): The dictionary of the logger properties.

requests_properties (dict, optional): A dictionary that modifies the requests to URLs.

aslist()[source]#
logger_properties: dict#

The dictionary of the logger properties. You can look at keys in Logger class.

requests_properties: dict#

To obtain additional information on this topic, refer to the requests_properties. The default value is an empty dictionary.

url: str#

The URL of the TDS catalog.

1.1. Subpackages#

1.2. Submodules#