Utils API Docs#

utils#

pyearthtools.utils.dynamic_import(object_path)#

Provide dynamic import capability

Parameters:

object_path (str) – Path to import

Raises:

(ImportError, ModuleNotFoundError) – If the object cannot be imported

Returns:

Imported objects

Return type:

(Callable | ModuleType)
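
A minimal sketch of what importing by dotted path can look like, using only the standard library importlib. The name dynamic_import_sketch is hypothetical, and the real function may resolve paths differently.

```python
import importlib
import os.path


def dynamic_import_sketch(object_path):
    """Hypothetical helper: import a module, or an attribute of a module, by dotted path."""
    try:
        # The whole path may itself be an importable module.
        return importlib.import_module(object_path)
    except ImportError:
        # Otherwise treat the last component as an attribute of the parent module.
        module_path, _, attribute = object_path.rpartition(".")
        if not module_path:
            raise
        module = importlib.import_module(module_path)
        try:
            return getattr(module, attribute)
        except AttributeError as err:
            raise ImportError(f"Cannot import {object_path!r}") from err


join = dynamic_import_sketch("os.path.join")
json_module = dynamic_import_sketch("json")
```

Here the first call falls through to the attribute lookup, while the second imports a module directly.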

pyearthtools.utils.load(stream, **kwargs)#

Load pyearthtools file replacing defaults

Parameters:

stream (str | Path)

pyearthtools.utils.save(obj, path=None)#

Save pyearthtools objects

Parameters:

path (str | Path | None)

utils.config#

pyearthtools.utils.config.canonical_name(k, config)#

Return the canonical name for a key.

Handles user choice of ‘-’ or ‘_’ conventions by standardizing on whichever version was set first. If a key already exists in either hyphen or underscore form, the existing version is the canonical name. If neither version exists the original key is used as is.

Parameters:
  • k (str)

  • config (dict)

Return type:

str
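
The rule described above can be sketched as follows. This is an illustrative restatement under the name canonical_name_sketch (hypothetical), not necessarily the library's own code.

```python
def canonical_name_sketch(k, config):
    """Hypothetical restatement of the hyphen/underscore rule."""
    # An exact match is already canonical.
    if k in config:
        return k
    # Build the alternative spelling by swapping the separator convention.
    alt = k.replace("_", "-") if "_" in k else k.replace("-", "_")
    if alt in config:
        return alt
    # Neither form exists yet, so the key is used as is.
    return k


config = {"foo_bar": 1}
existing = canonical_name_sketch("foo-bar", config)
unknown = canonical_name_sketch("new-key", config)
```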

pyearthtools.utils.config.merge(*dicts)#

Update a sequence of nested dictionaries

This prefers the values in the latter dictionaries to those in the former

Examples

>>> a = {'x': 1, 'y': {'a': 2}}
>>> b = {'y': {'b': 3}}
>>> merge(a, b)
{'x': 1, 'y': {'a': 2, 'b': 3}}

See also

pyearthtools.utils.config.update

Parameters:

dicts (Mapping)

Return type:

dict
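
For intuition, a nested merge with "later wins" semantics might look like the sketch below. The names merge_sketch and _update_nested are hypothetical; the real function may treat edge cases differently.

```python
from collections.abc import Mapping


def _update_nested(old, new):
    """Recursively copy values from new into old, descending into mappings."""
    for key, value in new.items():
        if isinstance(value, Mapping):
            # Make sure there is a dict to descend into, then merge into it.
            if not isinstance(old.get(key), dict):
                old[key] = {}
            _update_nested(old[key], value)
        else:
            old[key] = value


def merge_sketch(*dicts):
    """Hypothetical helper: merge nested dictionaries, later ones taking priority."""
    result = {}
    for d in dicts:
        _update_nested(result, d)
    return result


a = {'x': 1, 'y': {'a': 2}}
b = {'y': {'b': 3}}
merged = merge_sketch(a, b)
```

The inputs are left unmodified because nested values are copied into fresh dictionaries.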

pyearthtools.utils.config.collect_yaml(paths=['/etc/pyearthtools', '/home/docs/checkouts/readthedocs.org/user_builds/pyearthtools/envs/latest/etc/pyearthtools', '/home/docs/.config/pyearthtools'])#

Collect configuration from yaml files

This searches through a list of paths, expands to find all yaml or json files, and then parses each file.

Parameters:

paths (Sequence[str])

Return type:

list[dict]

pyearthtools.utils.config.collect_env(env=None)#

Collect config from environment variables

This grabs environment variables of the form “pyearthtools_FOO__BAR_BAZ=123” and turns them into config variables of the form {"foo": {"bar_baz": 123}}. It transforms the key and value in the following way:

  • Lower-cases the key text

  • Treats __ (double-underscore) as nested access

  • Calls ast.literal_eval on the value

Parameters:

env (Mapping[str, str] | None)

Return type:

dict
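
The three transformations listed above can be sketched as below. The function name, the case-insensitive prefix match, and the fallback to the raw string when ast.literal_eval fails are assumptions of this sketch rather than documented behaviour.

```python
import ast


def collect_env_sketch(env, prefix="pyearthtools_"):
    """Hypothetical helper: turn prefixed environment variables into nested config."""
    result = {}
    for name, value in env.items():
        # Assumption: the prefix is matched case-insensitively.
        if not name.lower().startswith(prefix.lower()):
            continue
        key = name[len(prefix):].lower()  # lower-case the key text
        try:
            parsed = ast.literal_eval(value)  # e.g. "123" -> 123
        except (SyntaxError, ValueError):
            parsed = value  # assumption: keep the raw string if it is not a literal
        # Treat "__" as nested access.
        *parents, leaf = key.split("__")
        node = result
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = parsed
    return result


collected = collect_env_sketch({"pyearthtools_FOO__BAR_BAZ": "123", "HOME": "/root"})
```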

pyearthtools.utils.config.ensure_file(source, destination=None, comment=True)#

Copy file to default location if it does not already exist

This tries to copy a default configuration file to a default location if it does not already exist. It also comments out that file by default.

Parameters:
  • source (string, filename) – Source configuration file, typically within a source directory.

  • destination (string, directory) – Destination directory. Configurable by pyearthtools_CONFIG environment variable, falling back to ~/.config/pyearthtools.

  • comment (bool, True by default) – Whether or not to comment out the config file when copying.

Return type:

None

pyearthtools.utils.config.collect(paths=['/etc/pyearthtools', '/home/docs/checkouts/readthedocs.org/user_builds/pyearthtools/envs/latest/etc/pyearthtools', '/home/docs/.config/pyearthtools'], env=None)#

Collect configuration from paths and environment variables

Parameters:
  • paths (list[str]) – A list of paths to search for yaml config files

  • env (Mapping[str, str]) – The system environment variables

Returns:

config

Return type:

dict

See also

pyearthtools.utils.config.refresh

collect configuration and update into primary config

pyearthtools.utils.config.refresh(config={'data': {'experimental': False, 'future_warning': True, 'open': {'xarray': {'chunks': 'auto', 'combine_attrs': 'drop_conflicts', 'engine': 'netcdf4'}, 'xarray_mf': {'parallel': False}}, 'patterns': {'default_extension': '.nc'}, 'save': {'xarray': {'engine': 'netcdf4'}, 'zarr': {'compute': True}}, 'search_function': 'filesystem', 'series': {'warning_threshold': 5}}, 'logger': {'default': {'backupcount': 50, 'formatter': None, 'log_file_directory': None, 'log_file_name': None, 'logfile_logger_level': 'DEBUG', 'maxBytes': 128000000, 'stream_logger_level': 'WARNING'}}, 'models': {'assets': '~/.pyearthtools/models/assets/', 'cache': '~/tmp/pyearthtools/models/cache/', 'configs': None, 'imports': None, 'pattern': {'ForecastExpandedDateVariable': {'directory_resolution': 'year'}}, 'training_data_index': {'pattern': 'ForecastExpandedDateVariable'}}, 'pipeline': {'exceptions': {'default_ignore': [], 'max_filter': 10, 'max_iterator': 20}, 'parallel': {'dask': {'client': {'processes': False}, 'config': {}, 'start': True}, 'default': 'Serial', 'enabled': {'Delayed': True, 'Futures': False}}, 'repr': {'show_graph': True}, 'run_parallel': False}, 'utils': {'repr': None}}, defaults=[{'utils': {'repr': None}}, {'logger': {'default': {'backupcount': 50, 'formatter': None, 'log_file_directory': None, 'log_file_name': None, 'logfile_logger_level': 'DEBUG', 'maxBytes': 128000000, 'stream_logger_level': 'WARNING'}}}, {'data': {'experimental': False, 'future_warning': True, 'open': {'xarray': {'chunks': 'auto', 'combine_attrs': 'drop_conflicts', 'engine': 'netcdf4'}, 'xarray_mf': {'parallel': False}}, 'patterns': {'default_extension': '.nc'}, 'save': {'xarray': {'engine': 'netcdf4'}, 'zarr': {'compute': True}}, 'search_function': 'filesystem', 'series': {'warning_threshold': 5}}}, {'pipeline': {'exceptions': {'default_ignore': [], 'max_filter': 10, 'max_iterator': 20}, 'parallel': {'dask': {'client': {'processes': False}, 'config': {}, 'start': 
True}, 'default': 'Serial', 'enabled': {'Delayed': True, 'Futures': False}}, 'repr': {'show_graph': True}, 'run_parallel': False}}, {'models': {'assets': '~/.pyearthtools/models/assets/', 'cache': '~/tmp/pyearthtools/models/cache/', 'configs': None, 'imports': None, 'pattern': {'ForecastExpandedDateVariable': {'directory_resolution': 'year'}}, 'training_data_index': {'pattern': 'ForecastExpandedDateVariable'}}}], **kwargs)#

Update configuration by re-reading yaml files and env variables

This mutates the global pyearthtools.utils.config.config, or the config parameter if passed in.

This goes through the following stages:

  1. Clearing out all old configuration

  2. Updating from the stored defaults from downstream libraries (see update_defaults)

  3. Updating from yaml files and environment variables

  4. Automatically renaming deprecated keys (with a warning)

Note that some functionality only checks configuration once at startup and may not change behavior, even if configuration changes. It is recommended to restart your python process if convenient to ensure that new configuration changes take place.

Parameters:
  • config (dict)

  • defaults (list[Mapping])

Return type:

None

pyearthtools.utils.config.get(key, default=None, config={'data': {'experimental': False, 'future_warning': True, 'open': {'xarray': {'chunks': 'auto', 'combine_attrs': 'drop_conflicts', 'engine': 'netcdf4'}, 'xarray_mf': {'parallel': False}}, 'patterns': {'default_extension': '.nc'}, 'save': {'xarray': {'engine': 'netcdf4'}, 'zarr': {'compute': True}}, 'search_function': 'filesystem', 'series': {'warning_threshold': 5}}, 'logger': {'default': {'backupcount': 50, 'formatter': None, 'log_file_directory': None, 'log_file_name': None, 'logfile_logger_level': 'DEBUG', 'maxBytes': 128000000, 'stream_logger_level': 'WARNING'}}, 'models': {'assets': '~/.pyearthtools/models/assets/', 'cache': '~/tmp/pyearthtools/models/cache/', 'configs': None, 'imports': None, 'pattern': {'ForecastExpandedDateVariable': {'directory_resolution': 'year'}}, 'training_data_index': {'pattern': 'ForecastExpandedDateVariable'}}, 'pipeline': {'exceptions': {'default_ignore': [], 'max_filter': 10, 'max_iterator': 20}, 'parallel': {'dask': {'client': {'processes': False}, 'config': {}, 'start': True}, 'default': 'Serial', 'enabled': {'Delayed': True, 'Futures': False}}, 'repr': {'show_graph': True}, 'run_parallel': False}, 'utils': {'repr': None}}, override_with=None)#

Get elements from global config

If override_with is not None this value will be passed straight back. Useful for getting kwarg defaults from pyearthtools config.

Use ‘.’ for nested access

Examples

>>> from pyearthtools import config
>>> config.get('foo')
{'x': 1, 'y': 2}
>>> config.get('foo.x')
1
>>> config.get('foo.x.y', default=123)
123
>>> config.get('foo.y', override_with=None)
2
>>> config.get('foo.y', override_with=3)
3

Parameters:
  • key (str)

  • default (Any)

  • config (dict)

  • override_with (Any)

Return type:

Any
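
The dotted lookup and the override_with short-circuit can be sketched like this. get_sketch is a hypothetical name, and the config dictionary is passed explicitly here rather than taken from a global.

```python
def get_sketch(key, default=None, config=None, override_with=None):
    """Hypothetical helper: dotted-key lookup with a default and an override."""
    # A non-None override is passed straight back.
    if override_with is not None:
        return override_with
    result = config
    for part in key.split("."):
        try:
            result = result[part]
        except (KeyError, TypeError, IndexError):
            # Missing key, or an attempt to index into a non-container value.
            return default
    return result


settings = {"foo": {"x": 1, "y": 2}}
whole = get_sketch("foo", config=settings)
nested = get_sketch("foo.x", config=settings)
missing = get_sketch("foo.x.y", default=123, config=settings)
overridden = get_sketch("foo.y", config=settings, override_with=3)
```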

pyearthtools.utils.config.pop(key, default=None, config={'data': {'experimental': False, 'future_warning': True, 'open': {'xarray': {'chunks': 'auto', 'combine_attrs': 'drop_conflicts', 'engine': 'netcdf4'}, 'xarray_mf': {'parallel': False}}, 'patterns': {'default_extension': '.nc'}, 'save': {'xarray': {'engine': 'netcdf4'}, 'zarr': {'compute': True}}, 'search_function': 'filesystem', 'series': {'warning_threshold': 5}}, 'logger': {'default': {'backupcount': 50, 'formatter': None, 'log_file_directory': None, 'log_file_name': None, 'logfile_logger_level': 'DEBUG', 'maxBytes': 128000000, 'stream_logger_level': 'WARNING'}}, 'models': {'assets': '~/.pyearthtools/models/assets/', 'cache': '~/tmp/pyearthtools/models/cache/', 'configs': None, 'imports': None, 'pattern': {'ForecastExpandedDateVariable': {'directory_resolution': 'year'}}, 'training_data_index': {'pattern': 'ForecastExpandedDateVariable'}}, 'pipeline': {'exceptions': {'default_ignore': [], 'max_filter': 10, 'max_iterator': 20}, 'parallel': {'dask': {'client': {'processes': False}, 'config': {}, 'start': True}, 'default': 'Serial', 'enabled': {'Delayed': True, 'Futures': False}}, 'repr': {'show_graph': True}, 'run_parallel': False}, 'utils': {'repr': None}})#

Like get, but remove the element if found

Parameters:
  • key (str)

  • default (Any)

  • config (dict)

Return type:

Any

pyearthtools.utils.config.update_defaults(new, config={'data': {'experimental': False, 'future_warning': True, 'open': {'xarray': {'chunks': 'auto', 'combine_attrs': 'drop_conflicts', 'engine': 'netcdf4'}, 'xarray_mf': {'parallel': False}}, 'patterns': {'default_extension': '.nc'}, 'save': {'xarray': {'engine': 'netcdf4'}, 'zarr': {'compute': True}}, 'search_function': 'filesystem', 'series': {'warning_threshold': 5}}, 'logger': {'default': {'backupcount': 50, 'formatter': None, 'log_file_directory': None, 'log_file_name': None, 'logfile_logger_level': 'DEBUG', 'maxBytes': 128000000, 'stream_logger_level': 'WARNING'}}, 'models': {'assets': '~/.pyearthtools/models/assets/', 'cache': '~/tmp/pyearthtools/models/cache/', 'configs': None, 'imports': None, 'pattern': {'ForecastExpandedDateVariable': {'directory_resolution': 'year'}}, 'training_data_index': {'pattern': 'ForecastExpandedDateVariable'}}, 'pipeline': {'exceptions': {'default_ignore': [], 'max_filter': 10, 'max_iterator': 20}, 'parallel': {'dask': {'client': {'processes': False}, 'config': {}, 'start': True}, 'default': 'Serial', 'enabled': {'Delayed': True, 'Futures': False}}, 'repr': {'show_graph': True}, 'run_parallel': False}, 'utils': {'repr': None}}, defaults=[{'utils': {'repr': None}}, {'logger': {'default': {'backupcount': 50, 'formatter': None, 'log_file_directory': None, 'log_file_name': None, 'logfile_logger_level': 'DEBUG', 'maxBytes': 128000000, 'stream_logger_level': 'WARNING'}}}, {'data': {'experimental': False, 'future_warning': True, 'open': {'xarray': {'chunks': 'auto', 'combine_attrs': 'drop_conflicts', 'engine': 'netcdf4'}, 'xarray_mf': {'parallel': False}}, 'patterns': {'default_extension': '.nc'}, 'save': {'xarray': {'engine': 'netcdf4'}, 'zarr': {'compute': True}}, 'search_function': 'filesystem', 'series': {'warning_threshold': 5}}}, {'pipeline': {'exceptions': {'default_ignore': [], 'max_filter': 10, 'max_iterator': 20}, 'parallel': {'dask': {'client': {'processes': False}, 'config': 
{}, 'start': True}, 'default': 'Serial', 'enabled': {'Delayed': True, 'Futures': False}}, 'repr': {'show_graph': True}, 'run_parallel': False}}, {'models': {'assets': '~/.pyearthtools/models/assets/', 'cache': '~/tmp/pyearthtools/models/cache/', 'configs': None, 'imports': None, 'pattern': {'ForecastExpandedDateVariable': {'directory_resolution': 'year'}}, 'training_data_index': {'pattern': 'ForecastExpandedDateVariable'}}}])#

Add a new set of defaults to the configuration

It does two things:

  1. Add the defaults to a global collection to be used by refresh later

  2. Updates the global config with the new configuration. Old values are prioritized over new ones, unless the current value is the old default, in which case it’s updated to the new default.

Parameters:
  • new (Mapping)

  • config (dict)

  • defaults (list[Mapping])

Return type:

None

pyearthtools.utils.config.expand_environment_variables(config)#

Expand environment variables in a nested config dictionary

This function will recursively search through any nested dictionaries and/or lists.

Parameters:

config (dict, iterable, or str) – Input object to search for environment variables

Returns:

config

Return type:

same type as input

Examples

>>> expand_environment_variables({'x': [1, 2, '$USER']})
{'x': [1, 2, 'my-username']}
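
A recursive expansion of this kind can be sketched with os.path.expandvars. The name expand_sketch and the SKETCH_USER variable are hypothetical and exist only for this illustration.

```python
import os


def expand_sketch(config):
    """Hypothetical helper: expand environment variables in nested containers."""
    if isinstance(config, dict):
        return {key: expand_sketch(value) for key, value in config.items()}
    if isinstance(config, str):
        return os.path.expandvars(config)
    if isinstance(config, (list, tuple, set)):
        # Rebuild the same container type from the expanded elements.
        return type(config)(expand_sketch(value) for value in config)
    return config  # numbers, None, etc. are returned unchanged


os.environ["SKETCH_USER"] = "my-username"  # illustrative variable only
expanded = expand_sketch({"x": [1, 2, "$SKETCH_USER"]})
```
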
pyearthtools.utils.config.rename(deprecations={}, config={'data': {'experimental': False, 'future_warning': True, 'open': {'xarray': {'chunks': 'auto', 'combine_attrs': 'drop_conflicts', 'engine': 'netcdf4'}, 'xarray_mf': {'parallel': False}}, 'patterns': {'default_extension': '.nc'}, 'save': {'xarray': {'engine': 'netcdf4'}, 'zarr': {'compute': True}}, 'search_function': 'filesystem', 'series': {'warning_threshold': 5}}, 'logger': {'default': {'backupcount': 50, 'formatter': None, 'log_file_directory': None, 'log_file_name': None, 'logfile_logger_level': 'DEBUG', 'maxBytes': 128000000, 'stream_logger_level': 'WARNING'}}, 'models': {'assets': '~/.pyearthtools/models/assets/', 'cache': '~/tmp/pyearthtools/models/cache/', 'configs': None, 'imports': None, 'pattern': {'ForecastExpandedDateVariable': {'directory_resolution': 'year'}}, 'training_data_index': {'pattern': 'ForecastExpandedDateVariable'}}, 'pipeline': {'exceptions': {'default_ignore': [], 'max_filter': 10, 'max_iterator': 20}, 'parallel': {'dask': {'client': {'processes': False}, 'config': {}, 'start': True}, 'default': 'Serial', 'enabled': {'Delayed': True, 'Futures': False}}, 'repr': {'show_graph': True}, 'run_parallel': False}, 'utils': {'repr': None}})#

Rename old keys to new keys

This helps migrate older configuration versions over time

Parameters:
  • deprecations (Mapping[str, str | None])

  • config (dict)

Return type:

None

pyearthtools.utils.config.check_deprecations(key, deprecations={})#

Check if the provided value has been renamed or removed

Parameters:
  • key (str) – The configuration key to check

  • deprecations (Dict[str, str]) – The mapping of aliases

Return type:

str

Examples

>>> deprecations = {"old_key": "new_key", "invalid": None}
>>> check_deprecations("old_key", deprecations=deprecations)
FutureWarning: pyearthtools configuration key 'old_key' has been deprecated; please use "new_key" instead
>>> check_deprecations("invalid", deprecations=deprecations)
Traceback (most recent call last):
    ...
ValueError: pyearthtools configuration key 'invalid' has been removed
>>> check_deprecations("another_key", deprecations=deprecations)
'another_key'

Returns:

new – The proper key, whether the original (if no deprecation) or the aliased value

Return type:

str

Parameters:
  • key (str)

  • deprecations (Mapping[str, str | None])

See also

rename

class pyearthtools.utils.config.set(arg=None, config={'data': {'experimental': False, 'future_warning': True, 'open': {'xarray': {'chunks': 'auto', 'combine_attrs': 'drop_conflicts', 'engine': 'netcdf4'}, 'xarray_mf': {'parallel': False}}, 'patterns': {'default_extension': '.nc'}, 'save': {'xarray': {'engine': 'netcdf4'}, 'zarr': {'compute': True}}, 'search_function': 'filesystem', 'series': {'warning_threshold': 5}}, 'logger': {'default': {'backupcount': 50, 'formatter': None, 'log_file_directory': None, 'log_file_name': None, 'logfile_logger_level': 'DEBUG', 'maxBytes': 128000000, 'stream_logger_level': 'WARNING'}}, 'models': {'assets': '~/.pyearthtools/models/assets/', 'cache': '~/tmp/pyearthtools/models/cache/', 'configs': None, 'imports': None, 'pattern': {'ForecastExpandedDateVariable': {'directory_resolution': 'year'}}, 'training_data_index': {'pattern': 'ForecastExpandedDateVariable'}}, 'pipeline': {'exceptions': {'default_ignore': [], 'max_filter': 10, 'max_iterator': 20}, 'parallel': {'dask': {'client': {'processes': False}, 'config': {}, 'start': True}, 'default': 'Serial', 'enabled': {'Delayed': True, 'Futures': False}}, 'repr': {'show_graph': True}, 'run_parallel': False}, 'utils': {'repr': None}}, lock=<unlocked _thread.lock object>, **kwargs)#

Temporarily set configuration values within a context manager

Parameters:
  • arg (mapping or None, optional) – A mapping of configuration key-value pairs to set.

  • **kwargs – Additional key-value pairs to set. If arg is provided, values set in arg will be applied before those in kwargs. Double-underscores (__) in keyword arguments will be replaced with ., allowing nested values to be easily set.

  • config (dict)

  • lock (threading.Lock)

Examples

>>> import pyearthtools

Set 'foo.bar' in a context, by providing a mapping.

>>> with pyearthtools.utils.config.set({'foo.bar': 123}):
...     pass

Set 'foo.bar' in a context, by providing a keyword argument.

>>> with pyearthtools.utils.config.set(foo__bar=123):
...     pass

Set 'foo.bar' globally.

>>> pyearthtools.utils.config.set(foo__bar=123)
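
To illustrate how one object can act both as a global setter and as a context manager, here is a simplified sketch. SetSketch is a hypothetical class; it omits the lock and restores the previous state from a deep copy, which may differ from the real implementation.

```python
import copy


class SetSketch:
    """Hypothetical helper: apply values on creation, restore them on context exit."""

    def __init__(self, arg=None, config=None, **kwargs):
        self.config = config
        self._old = copy.deepcopy(config)  # snapshot used to revert
        # Values in arg are applied before those in kwargs.
        for key, value in (arg or {}).items():
            self._assign(key.split("."), value)
        for key, value in kwargs.items():
            # Double underscores stand in for "." in keyword arguments.
            self._assign(key.replace("__", ".").split("."), value)

    def _assign(self, keys, value):
        node = self.config
        for key in keys[:-1]:
            node = node.setdefault(key, {})
        node[keys[-1]] = value

    def __enter__(self):
        return self.config

    def __exit__(self, exc_type, exc, tb):
        # Revert to the snapshot taken at creation time.
        self.config.clear()
        self.config.update(self._old)
        return False


settings = {"foo": {"bar": 1}}
with SetSketch({"foo.bar": 123}, config=settings):
    inside = settings["foo"]["bar"]
after = settings["foo"]["bar"]
SetSketch(foo__baz=5, config=settings)  # no "with": the change persists
```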

utils.context#

class pyearthtools.utils.context.ChangeValue(object, key, value)#

Context Manager to change attribute of object and revert after

Example

>>> object.attribute = 'value'
>>> print(object.attribute) # 'value'
>>>
>>> with ChangeValue(object, key = 'attribute', value = 'NewValue'):
...     print(object.attribute) # 'NewValue'
>>>
>>> print(object.attribute) # 'value'

Update Attribute of an object

Parameters:
  • object (Any) – Object to update

  • key (str) – Attribute Name

  • value (Any) – Value to update key to

Raises:

AttributeError – If object has no attribute key
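
The change-and-revert behaviour can be sketched as a small context manager. ChangeValueSketch is a hypothetical stand-in, not the class itself.

```python
from types import SimpleNamespace


class ChangeValueSketch:
    """Hypothetical helper: set an attribute inside a context, then restore it."""

    def __init__(self, obj, key, value):
        if not hasattr(obj, key):
            raise AttributeError(f"{obj!r} has no attribute {key!r}")
        self.obj, self.key, self.value = obj, key, value

    def __enter__(self):
        self._original = getattr(self.obj, self.key)
        setattr(self.obj, self.key, self.value)
        return self.obj

    def __exit__(self, exc_type, exc, tb):
        # Restore the original value even if the body raised.
        setattr(self.obj, self.key, self._original)
        return False


thing = SimpleNamespace(attribute="value")
with ChangeValueSketch(thing, key="attribute", value="NewValue"):
    inside = thing.attribute
after = thing.attribute
```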

class pyearthtools.utils.context.Catch(exceptions, *excep, logger=None)#

Catch and ignore exceptions raised within scope of the context

Catch exceptions occurring within the context.

Can also log exceptions if given a logger.

Parameters:
  • exceptions (tuple[Type[Exception]] | Type[Exception]) – Types of exceptions to catch.

  • logger (logging.Logger | None, optional) – Logger to log exceptions to if given. Logs as debug. Defaults to None.

  • excep (Type[Exception])
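
A context manager that suppresses selected exceptions and optionally logs them at debug level can be sketched as follows. CatchSketch is a hypothetical name, and the log message format is an assumption.

```python
import logging


class CatchSketch:
    """Hypothetical helper: ignore the given exception types inside the context."""

    def __init__(self, exceptions, *excep, logger=None):
        if not isinstance(exceptions, tuple):
            exceptions = (exceptions,)
        self.exceptions = exceptions + excep
        self.logger = logger

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is not None and issubclass(exc_type, self.exceptions):
            if self.logger is not None:
                self.logger.debug("Ignored exception: %r", exc)
            return True  # returning True suppresses the exception
        return False  # anything else propagates


reached = False
with CatchSketch(KeyError, ZeroDivisionError, logger=logging.getLogger("sketch")):
    1 / 0
reached = True

propagated = False
try:
    with CatchSketch(KeyError):
        raise ValueError("not caught")
except ValueError:
    propagated = True
```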

class pyearthtools.utils.context.PrintOnError(msg)#

Print a message when an exception is raised

Parameters:

msg (str | Callable)

utils.data#

class pyearthtools.utils.data.Tesselator(kernel_size, stride=None, padding='reflect', coord_template=None, out_name='Reconstructed', ignore_difference=False)#

Data Tesselator.

Used to split a numpy or xarray object into patches of a given size, and given stride.

Provides methods to stitch the patches back together into the input object.

Create Tesselator

Parameters:
  • kernel_size (int) – Size of each individual kernel

  • stride (int, optional) – Distance between kernels. If None, set to kernel_size. Defaults to None.

  • padding (str, optional) – Padding operation, either str or function. Must be one of np.pad modes. If padding is None, patches will not consist of the whole data, and issues will arise with stitching. Defaults to “reflect”.

  • coord_template (xr.Dataset | xr.DataArray, optional) – Set coordinate template for stitch output. Defaults to None.

  • out_name (str, optional) – Name of dataArray outputted from stitch. Defaults to “Reconstructed”.

  • ignore_difference (bool, optional) – Quiet warnings about differences in shapes when undoing. Defaults to False.

patch(input_data, data_format=None, **kwargs)#

Patch incoming data into patches as configured

Parameters:
  • input_data (xr.DataArray | xr.Dataset | np.ndarray) – Data to Patch

  • data_format (str, optional) – Format of Data if not normal. Defaults to None.

  • **kwargs (Any, optional) – Extra keyword args to be passed to make_patches (pyearthtools.utils.data.tesselator._patching.patches.make_patches)

Returns:

Patches of data, with the first dimension being the squashed patch dim

Return type:

(np.ndarray)

stitch(input_data, data_format='TCHW', override=None, var_select=None, as_numpy=False)#

Stitch back together patches of data generated by this Tesselator

If original data was an xarray object, this will attempt to reconstruct it with dims & coords intact

Parameters:
  • input_data (np.ndarray) – Patches to stitch together, must have come from this Tesselator

  • data_format (str, optional) – Format of Data after patch dim. Defaults to “TCHW”.

  • override (dict, optional) – Override for coordinates for xarray. Defaults to None.

  • var_select (str, optional) – If only one variable is given back, use this to select it. Defaults to None.

  • as_numpy (bool, optional) – Whether to return only as numpy. Defaults to False.

Raises:

NotImplementedError – If offset to remove padding is negative

Returns:

Data in same format as it came in as, unless as_numpy == True

Return type:

(np.ndarray | xr.Dataset | xr.DataArray)
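
To give a feel for patching with a kernel size and stride, the sketch below splits a 2D numpy array into square patches. It is not the Tesselator implementation: patch_2d_sketch is hypothetical, it handles only 2D input, and padding only the bottom and right edges is an assumption of this sketch.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def patch_2d_sketch(array, kernel_size, stride=None, padding="reflect"):
    """Hypothetical helper: split a 2D array into (n_patches, k, k) patches."""
    stride = kernel_size if stride is None else stride
    height, width = array.shape
    # Assumption: pad bottom/right so the windows tile the array exactly.
    pad_h = (-(height - kernel_size)) % stride
    pad_w = (-(width - kernel_size)) % stride
    padded = np.pad(array, ((0, pad_h), (0, pad_w)), mode=padding)
    # All windows of the kernel shape, then keep every stride-th one.
    windows = sliding_window_view(padded, (kernel_size, kernel_size))
    windows = windows[::stride, ::stride]
    # Squash the two patch-grid dimensions into a single leading patch dim.
    return windows.reshape(-1, kernel_size, kernel_size)


data = np.arange(16).reshape(4, 4)
patches = patch_2d_sketch(data, kernel_size=2)
padded_patches = patch_2d_sketch(np.zeros((5, 5)), kernel_size=2)
```

With a 4x4 input and kernel_size=2 no padding is needed; the 5x5 input is padded to 6x6 before patching.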

class pyearthtools.utils.data.converter.xarrayConverter(warn=True)#

Stateful xarray converter.

This class keeps records of the attributes and coordinates of the incoming datasets, which are then used to rebuild the array.

This operates on a first in first out (FIFO) approach when storing and removing records. So if two datasets are converted, it is expected that they will be converted back to xarray in the same order.

Cannot be used directly, use either NumpyConverter or DaskConverter

Examples

>>> converter = NumpyConverter()
>>> np_data_1 = converter(dataset_1)
>>> np_data_2 = converter(dataset_2)
>>> converter(np_data_1) == dataset_1
True
>>> converter(np_data_2) == dataset_2
True
>>> np_data_2 = converter(dataset_2)
>>> np_data_1 = converter(dataset_1)
>>> converter(np_data_1) == dataset_1
False
>>> converter(np_data_2) == dataset_2
False

Converter to and from xarray objects

Parameters:

warn (bool, optional) – Warn on incorrect shape. Defaults to True.
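
The FIFO behaviour can be illustrated without xarray by using plain dictionaries as stand-ins for datasets. Everything in this sketch (the class, its method names, the dict layout) is hypothetical.

```python
from collections import deque


class FifoConverterSketch:
    """Hypothetical stand-in: 'datasets' are dicts with 'attrs' and 'values'."""

    def __init__(self):
        self._records = deque()

    def convert_from(self, dataset):
        # Remember the metadata, hand back only the raw values.
        self._records.append(dataset["attrs"])
        return list(dataset["values"])

    def convert_to(self, values, pop=True):
        # Rebuild using the oldest stored record (first in, first out).
        attrs = self._records.popleft() if pop else self._records[0]
        return {"attrs": attrs, "values": list(values)}


dataset_1 = {"attrs": {"name": "first"}, "values": [1, 2]}
dataset_2 = {"attrs": {"name": "second"}, "values": [3, 4]}

converter = FifoConverterSketch()
array_1 = converter.convert_from(dataset_1)
array_2 = converter.convert_from(dataset_2)
first_ok = converter.convert_to(array_1) == dataset_1
second_ok = converter.convert_to(array_2) == dataset_2

# Storing in one order and rebuilding in another mismatches the metadata.
array_2 = converter.convert_from(dataset_2)
array_1 = converter.convert_from(dataset_1)
swapped_ok = converter.convert_to(array_1) == dataset_1
```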

class pyearthtools.utils.data.converter.NumpyConverter(warn=True)#

Converter to and from xarray objects

Parameters:

warn (bool, optional) – Warn on incorrect shape. Defaults to True.

convert_from_xarray(data, replace=False)#

Convert a given dataset/s to numpy array/s (numpy.ndarray)

Reminder, this class operates with a FIFO approach, each incoming dataset’s records will be saved, and popped out when being rebuilt. If replace is True, the existing records are replaced instead.

Parameters:
  • data (tuple[xr.Dataset] | xr.Dataset) – data/s to convert into arrays

  • replace (bool, optional) – Whether to replace entries, instead of inserting them. Defaults to False

Raises:

TypeError – If invalid data passed

Returns:

Generated array/s from Dataset/s

Return type:

(np.ndarray | tuple[np.ndarray])

convert_to_xarray(data, pop=True)#

Convert array/s (numpy.ndarray) into Dataset/s (xarray.Dataset), inferring metadata from saved records.

Reminder, this class operates on a FIFO approach, records will be popped from the saved records, unless turned off.

Warning

If a tuple of datasets was passed to convert_from_xarray and they are different, it is best to pass a tuple to this function replicating the order.

Parameters:
  • data (np.ndarray) – array/s (numpy.ndarray) to convert back to Dataset/s (xarray.Dataset)

  • pop (bool, optional) – Whether to pop records from _records. Defaults to True

Returns:

Rebuilt Dataset/s (xarray.Dataset)

Return type:

(xr.Dataset | tuple[xr.Dataset])

class pyearthtools.utils.data.converter.DaskConverter(*args, **kwargs)#

Converter to and from xarray objects

Parameters:

warn (bool, optional) – Warn on incorrect shape. Defaults to True.

convert_from_xarray(data, replace=False)#

Convert a given dataset/s to dask arrays

Reminder, this class operates with a FIFO approach, each incoming dataset’s records will be saved, and popped out when being rebuilt. If replace is True, the existing records are replaced instead.

Parameters:
  • data (tuple[xr.Dataset] | xr.Dataset) – data/s to convert into arrays

  • replace (bool, optional) – Whether to replace entries, instead of inserting them. Defaults to False

Raises:

TypeError – If invalid data passed

Returns:

Generated array/s from Dataset/s

Return type:

(dask.array.Array | tuple[dask.array.Array, …])

convert_to_xarray(data, pop=True)#

Convert array/s (numpy.ndarray) into Dataset/s (xarray.Dataset), inferring metadata from saved records.

Reminder, this class operates on a FIFO approach, records will be popped from the saved records, unless turned off.

Warning

If a tuple of datasets was passed to convert_from_xarray and they are different, it is best to pass a tuple to this function replicating the order.

Parameters:
  • data (np.ndarray) – array/s (numpy.ndarray) to convert back to Dataset/s (xarray.Dataset)

  • pop (bool, optional) – Whether to pop records from _records. Defaults to True

Returns:

Rebuilt Dataset/s (xarray.Dataset)

Return type:

(xr.Dataset | tuple[xr.Dataset])

utils.decorators#

pyearthtools.utils.decorators.alias_arguments(**aliases)#

Set up aliases for parameters

Parameters:

**aliases (str | list[str]) – Dictionary pair, of true name to aliases. Values can be either a str or a list of strings.

Return type:

Callable[[C], C]

Examples

>>> @alias_arguments(response = ['rep', 'answer'])
... def function(response):
...     return response
>>> function('yes')
'yes'
>>> function(rep = 'maybe')
'maybe'
>>> function(hello = 'maybe')  # An error is raised
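
One way such a decorator could work is sketched below: invert the alias mapping once, then rename keyword arguments on each call. alias_arguments_sketch is a hypothetical name, and the error raised for unknown keywords here is simply Python's own TypeError, which may differ from the real decorator.

```python
import functools


def alias_arguments_sketch(**aliases):
    """Hypothetical helper: let callers use alternative keyword names."""
    # Invert {true_name: alias or [aliases]} into {alias: true_name}.
    lookup = {}
    for true_name, names in aliases.items():
        for alias in ([names] if isinstance(names, str) else names):
            lookup[alias] = true_name

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Rename known aliases, leave every other keyword untouched.
            renamed = {lookup.get(key, key): value for key, value in kwargs.items()}
            return func(*args, **renamed)

        return wrapper

    return decorator


@alias_arguments_sketch(response=["rep", "answer"])
def function(response):
    return response


try:
    function(hello="maybe")
    unknown_rejected = False
except TypeError:
    unknown_rejected = True
```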

pyearthtools.utils.decorators.invert_dictionary_list(dictionary)#

Parameters:

dictionary (dict)

Return type:

dict

class pyearthtools.utils.decorators.classproperty(fget=None, fset=None, fdel=None, doc=None)#

Make a method available as a property on the class
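
A class-level property is typically built as a descriptor. The sketch below shows the read-only case only; classproperty_sketch and the Example class are hypothetical, and the fset/fdel handling of the real class is not covered.

```python
class classproperty_sketch:
    """Hypothetical read-only descriptor evaluated against the class."""

    def __init__(self, fget):
        self.fget = fget

    def __get__(self, instance, owner=None):
        if owner is None:
            owner = type(instance)
        # Call the getter with the class rather than the instance.
        return self.fget(owner)


class Example:
    _name = "example"

    @classproperty_sketch
    def name(cls):
        return cls._name.upper()


from_class = Example.name
from_instance = Example().name
```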

utils.initialisation#

pyearthtools.utils.initialisation.dynamic_import(object_path)#

Provide dynamic import capability

Parameters:

object_path (str) – Path to import

Raises:

(ImportError, ModuleNotFoundError) – If the object cannot be imported

Returns:

Imported objects

Return type:

(Callable | ModuleType)

pyearthtools.utils.initialisation.load(stream, **kwargs)#

Load pyearthtools file replacing defaults

Parameters:

stream (str | Path)

pyearthtools.utils.initialisation.save(obj, path=None)#

Save pyearthtools objects

Parameters:

path (str | Path | None)

pyearthtools.utils.initialisation.update_contents(contents, **kwargs)#

Update contents

Looks for str values and attempts a str replacement defined by the kwargs.

A key inside the dictionary must be of the form __KEY__, with KEY being a str.

If ‘:’ follows the KEY part while still within ‘__*__’, anything following it is treated as the default value.

Parameters:

contents (str)

Return type:

str
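
The placeholder convention described above can be sketched with a regular expression. The pattern, the function name, and the choice to leave unmatched placeholders without a default untouched are all assumptions of this sketch.

```python
import re

# __KEY__ or __KEY:default__ (hypothetical pattern for this sketch)
_PLACEHOLDER = re.compile(r"__(\w+?)(?::(.*?))?__")


def update_contents_sketch(contents, **kwargs):
    """Hypothetical helper: substitute __KEY__ placeholders from kwargs."""

    def replace(match):
        key, default = match.group(1), match.group(2)
        if key in kwargs:
            return str(kwargs[key])
        if default is not None:
            return default
        # Assumption: placeholders with no value and no default are left as is.
        return match.group(0)

    return _PLACEHOLDER.sub(replace, contents)


text = "root: __ROOT__\nextension: __EXT:.nc__\nlevel: __LEVEL__"
updated = update_contents_sketch(text, ROOT="/data")
```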

class pyearthtools.utils.initialisation.InitialisationRecordingMixin#

Mixin to record initialisation arguments of the child class.

Also provides a repr from these initialisation args.

Children must call the following for the functionality to work:

    super().__init__()
    self.record_initialisation()

Properties:

_pyearthtools_initialisation (dict[pyearthtools_INIT_KEYS, Any]):

  • ‘class’ (str): Override for class location.

_pyearthtools_repr (dict[pyearthtools_REPR_KEYS, Any]):

  • ignore (Sequence[str]): Arguments to ignore from _initialisation.

  • expand (Sequence[str]): Arguments to expand from _initialisation.

  • expand_attr (Sequence[str]): Arguments to get from class and expand.

_desc_ (dict[str, Any]):

Description of object to display at the top of the repr. Use ‘singleline’ to change element not minimised next to Description.

_property (str):

Property to call on self to get back correct object when loaded in from yaml.

`to_repr_dict`

How to display object. Must return a dictionary.

copy(**overrides)#

Using recorded initialisation create a copy of self.

Parameters:
  • self (Self)

  • overrides (Any)

Return type:

Self

record_initialisation(ignore=None)#

Record initialisation of class

super().__init__() must be called before.

Parameters:

ignore (Sequence[str], optional) – Ignore arguments. Defaults to [].

to_repr_dict()#

Convert to dictionary ready for repr

Return type:

dict[str, Any]

update_initialisation(update=None, /, **upda)#

Update components of the initialisation dictionary

Parameters:
  • update (Optional[dict[str, Any]], optional) – Dictionary to update with. Defaults to None.

  • **upda (Any) – Kwarg form of update.
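
How a mixin might capture the arguments of a child's __init__ without being handed them explicitly can be sketched with frame inspection. This is an assumption about the general technique, not a description of the actual implementation; RecordingMixinSketch, _initialisation as used here, and the Scale class are hypothetical.

```python
import inspect


class RecordingMixinSketch:
    """Hypothetical mixin: record the arguments the child was created with."""

    def __init__(self):
        self._initialisation = {}

    def record_initialisation(self, ignore=None):
        ignore = set(ignore or ())
        # Look at the caller's frame, i.e. the child's __init__.
        frame = inspect.currentframe().f_back
        info = inspect.getargvalues(frame)
        # Skip the first argument (self) and anything explicitly ignored.
        recorded = {
            name: info.locals[name] for name in info.args[1:] if name not in ignore
        }
        if info.keywords:
            recorded.update(info.locals[info.keywords])
        self._initialisation = recorded

    def copy(self, **overrides):
        # Re-create the object from the recorded arguments plus any overrides.
        return type(self)(**{**self._initialisation, **overrides})

    def __repr__(self):
        args = ", ".join(f"{k}={v!r}" for k, v in self._initialisation.items())
        return f"{type(self).__name__}({args})"


class Scale(RecordingMixinSketch):
    def __init__(self, factor, offset=0):
        super().__init__()
        self.record_initialisation()
        self.factor = factor
        self.offset = offset


original = Scale(2, offset=1)
clone = original.copy(offset=5)
```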

class pyearthtools.utils.initialisation.Dumper(stream, default_style=None, default_flow_style=False, canonical=None, indent=None, width=None, allow_unicode=None, line_break=None, encoding=None, explicit_start=None, explicit_end=None, version=None, tags=None, sort_keys=True)#

pyearthtools yaml dumper

class pyearthtools.utils.initialisation.Loader(stream)#

Initialize the scanner.

utils.logger#

pyearthtools.utils.logger.initiate_logging(submodule)#

Set up a logger for a submodule of pyearthtools

Uses pyearthtools.config.logger to configure the levels and logging behaviour.

The configured logger is accessible through logging:

    logger = logging.getLogger(f"pyearthtools.{submodule}")

Parameters:

submodule (str) – Submodule to setup logger for

pyearthtools.utils.logger.reconfigure()#

utils.parameter#

pyearthtools.utils.parameter.search(function, verbose=False, **kwargs)#

Try calling the given function with every permutation of the given arguments

Suggested to use SingleParameter, ListParameter, RangeParameter to create the arguments

Parameters:
  • function (Callable) – Function to call

  • verbose (bool, optional) – Whether to print configs. Defaults to False.

Returns:

List of valid configurations

Return type:

(list[dict])
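
A search of this kind can be sketched with itertools.product. Two assumptions are made here: "every permutation" is read as every combination of one value per argument, and a configuration counts as valid when the call does not raise. search_sketch and divide are hypothetical names.

```python
import itertools


def search_sketch(function, verbose=False, **kwargs):
    """Hypothetical helper: collect the keyword combinations a function accepts."""
    names = list(kwargs)
    valid = []
    for values in itertools.product(*(kwargs[name] for name in names)):
        config = dict(zip(names, values))
        try:
            function(**config)
        except Exception:
            continue  # assumption: a raising call marks the config as invalid
        if verbose:
            print(config)
        valid.append(config)
    return valid


def divide(numerator, denominator):
    return numerator / denominator


valid = search_sketch(divide, numerator=[1], denominator=range(0, 3))
```

Each keyword argument is any iterable of candidate values, which is the role the parameter classes play in the real API.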

pyearthtools.utils.parameter.search_threaded(function, verbose=False, **kwargs)#

Threaded version of search

Try calling the given function with every permutation of the given arguments

Suggested to use SingleParameter, ListParameter, RangeParameter to create the arguments

Parameters:
  • function (Callable) – Function to call

  • verbose (bool, optional) – Whether to print configs. Defaults to False.

Returns:

List of valid configurations

Return type:

(list[dict])

class pyearthtools.utils.parameter.SingleParameter(item)#

Single parameter for searching

Examples

>>> list(SingleParameter((0,3)))
[(0, 3)]

class pyearthtools.utils.parameter.ListParameter(element_range, num_elements)#

Parameter which is a list, where each element is drawn from a range

Examples

>>> list(ListParameter.from_minmax(0, 3, 3))
[(0, 0, 0),
 (0, 0, 1),
 ...
 (2, 2, 1),
 (2, 2, 2)]
>>> list(ListParameter(['a','b','c'], 3))
[('a', 'a', 'a'),
 ('a', 'a', 'b'),
 ...
 ('c', 'c', 'b'),
 ('c', 'c', 'c')]

Parameters:
  • element_range (list)

  • num_elements (int)
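
The output shown in the examples matches a Cartesian product of the element range with itself, which can be sketched as below. ListParameterSketch is hypothetical, and the from_minmax argument order (minimum, exclusive maximum, number of elements) is inferred from the example rather than documented.

```python
import itertools


class ListParameterSketch:
    """Hypothetical stand-in: iterate over every tuple of num_elements choices."""

    def __init__(self, element_range, num_elements):
        self.element_range = list(element_range)
        self.num_elements = num_elements

    @classmethod
    def from_minmax(cls, minimum, maximum, num_elements):
        # Assumption: maximum is exclusive, as with the built-in range.
        return cls(range(minimum, maximum), num_elements)

    def __iter__(self):
        return itertools.product(self.element_range, repeat=self.num_elements)


numbers = list(ListParameterSketch.from_minmax(0, 3, 3))
letters = list(ListParameterSketch(["a", "b", "c"], 3))
```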

class pyearthtools.utils.parameter.RangeParameter(*args, **kwargs)#

Integer range of parameters

Examples

>>> list(RangeParameter(0,3))
[0, 1, 2]

utils.repr_utils#

pyearthtools.utils.repr_utils.provide_html(*objects, name=None, description=None, documentation_attr=None, info_attr=None, name_attr=None, backup_repr='Failed to create HTML repr', expanded=None)#

Create a html_repr from a list of objects.

Formatted like the xarray repr, with docstrings or documentation_attr being retrieved

Parameters:
  • name (str, optional) – Name of overall repr. Defaults to None.

  • documentation_attr (str, optional) – Attribute to retrieve as documentation. Defaults to None.

  • backup_repr (str, optional) – Fallback repr to use if creating the HTML repr fails. Defaults to “Failed to create HTML repr”.

  • description (dict)

  • info_attr (str)

  • name_attr (str)

  • expanded (bool)

Returns:

HTML repr of objects

Return type:

(str)

pyearthtools.utils.repr_utils.html(*objects, name=None, description=None, documentation_attr=None, info_attr=None, name_attr=None, backup_repr='Failed to create HTML repr', expanded=None)#

Create a html_repr from a list of objects.

Formatted like the xarray repr, with docstrings or documentation_attr being retrieved

Parameters:
  • name (str, optional) – Name of overall repr. Defaults to None.

  • documentation_attr (str, optional) – Attribute to retrieve as documentation. Defaults to None.

  • backup_repr (str, optional) – Fallback repr to use if creating the HTML repr fails. Defaults to “Failed to create HTML repr”.

  • description (dict)

  • info_attr (str)

  • name_attr (str)

  • expanded (bool)

Returns:

HTML repr of objects

Return type:

(str)

pyearthtools.utils.repr_utils.default(*objects, name=None, description=None, documentation_attr=None, info_attr=None, name_attr=None, backup_repr='Failed to create HTML repr', expanded=None)#

Parameters:
  • name (str | None)

  • description (dict | None)

  • documentation_attr (str | None)

  • info_attr (str | None)

  • name_attr (str | None)

  • backup_repr (str)

  • expanded (bool | None)

Return type:

str