deepsensor.data.task

class Task(task_dict)

Bases: dict

Task dictionary class.

Inherits from dict and adds methods for printing and modifying the data.

Parameters:

task_dict (dict) – Dictionary containing the task.

add_batch_dim()

Add a batch dimension to the arrays in the task dictionary.

Returns:

deepsensor.data.task.Task – Task with batch dimension added to the array elements.

cast_to_float32()

Cast the arrays in the task dictionary to float32.

Returns:

deepsensor.data.task.Task – Task with arrays cast to float32.

convert_to_tensor()

Convert to tensor object based on deep learning backend.

Returns:

deepsensor.data.task.Task – Task with arrays converted to deep learning tensor objects.

flatten_gridded_data()

Convert any gridded data in Task to flattened arrays.

Necessary for autoregressive (AR) sampling, which doesn’t yet permit gridded context sets.

Parameters:

task (Task)

Returns:

deepsensor.data.task.Task – …

mask_nans_nps()

Returns:

deepsensor.data.task.Task – …

mask_nans_numpy()

Replace NaNs with zeros and set a mask to indicate where the NaNs were.

Returns:

deepsensor.data.task.Task – Task with NaNs set to zeros and a mask indicating where the missing values are.
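
The masking idea can be illustrated with a NumPy-only sketch. The helper name `mask_nans` and the mask convention (True where a value was missing) are assumptions made for illustration; this is not DeepSensor's implementation, and the library's actual return format may differ.

```python
import numpy as np

def mask_nans(arr):
    """Hypothetical helper: return (filled, mask) for a float array."""
    mask = np.isnan(arr)               # True where a value was missing
    filled = np.where(mask, 0.0, arr)  # NaNs replaced with zeros
    return filled, mask

Y = np.array([[1.0, np.nan, 3.0]])
filled, mask = mask_nans(Y)
```

Keeping the mask alongside the zero-filled data lets downstream code distinguish a genuine zero from a missing value.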

op(f, op_flag=None)

Apply function f to the array elements of a task dictionary.

Useful for recasting to a different dtype or reshaping (e.g. adding a batch dimension).

Parameters:
  • f (callable) – Function to apply to the array elements of the task.

  • op_flag (str) – Flag to set in the task dictionary’s ops key.

Returns:

deepsensor.data.task.Task – Task with f applied to the array elements and op_flag set in the ops key.
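
A minimal sketch of the pattern described above, written against a plain dict rather than the real Task class. The function `apply_op`, the dict layout (lists of arrays under "X_c" and "Y_c"), the assumption that "ops" holds a list of flags, and the flag strings are all illustrative choices, not confirmed details of the library.

```python
import numpy as np

def apply_op(task, f, op_flag=None):
    """Hypothetical stand-in for Task.op operating on a plain dict."""
    out = {}
    for key, value in task.items():
        if isinstance(value, np.ndarray):
            out[key] = f(value)
        elif isinstance(value, list):
            # Apply f to arrays inside lists; copy other items unchanged
            out[key] = [f(v) if isinstance(v, np.ndarray) else v for v in value]
        else:
            out[key] = value
    if op_flag is not None:
        # Record the operation without mutating the input task (assumed list of flags)
        out["ops"] = list(task.get("ops", [])) + [op_flag]
    return out

task = {"X_c": [np.zeros((2, 5))], "Y_c": [np.ones((1, 5))], "ops": []}
task32 = apply_op(task, lambda a: a.astype(np.float32), op_flag="float32")
batched = apply_op(task32, lambda a: a[np.newaxis, ...], op_flag="batch_dim")
```

Here the two calls recast the arrays to float32 and then prepend a batch axis, the two uses the description mentions.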

remove_context_nans()

If NaNs are present in task["Y_c"], remove them (and the corresponding entries in task["X_c"]).

Returns:

deepsensor.data.task.Task – …
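
As a hedged illustration of NaN removal for a single off-grid context set, the sketch below assumes coordinates of shape (2, N) and values of shape (N_dim, N). The helper `drop_nan_observations` is hypothetical, and dropping an observation when any of its dimensions is NaN is an assumed policy, not necessarily what the library does.

```python
import numpy as np

def drop_nan_observations(X, Y):
    """Hypothetical helper: drop columns where any Y dimension is NaN."""
    keep = ~np.isnan(Y).any(axis=0)  # shape (N,), True for valid observations
    return X[:, keep], Y[:, keep]

X_c = np.array([[0.0, 0.5, 1.0],
                [0.0, 0.5, 1.0]])
Y_c = np.array([[10.0, np.nan, 30.0]])
X_clean, Y_clean = drop_nan_observations(X_c, Y_c)
```

The same boolean index is applied to both arrays so that coordinates and values stay aligned.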

remove_target_nans()

If NaNs are present in task["Y_t"], remove them (and the corresponding entries in task["X_t"]).

Returns:

deepsensor.data.task.Task – …

classmethod summarise_repr(k, v)

Summarise the task in a representation that can be printed.

Parameters:
  • cls (deepsensor.data.task.Task) – Task class.

  • k (str) – Key of the task dictionary.

  • v (object) – Value of the task dictionary.

Returns:

str – String representation of the task.

classmethod summarise_str(k, v)

append_obs_to_task(task, X_new, Y_new, context_set_idx)

Append a single observation to a context set in task.

Makes a deep copy of the data structure to avoid affecting the original object.

Parameters:
  • task (deepsensor.data.task.Task) – The task to modify.

  • X_new (array-like) – New observation coordinates.

  • Y_new (array-like) – New observation values.

  • context_set_idx (int) – Index of the context set to append to.

Returns:

deepsensor.data.task.Task – Task with new observation appended to the context set.
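
The copy-then-append behaviour can be sketched on a plain dict as follows. The function `append_obs`, the array shapes ((2, N) coordinates, (N_dim, N) values, with the new observation supplied as a single column), and concatenation along the last axis are assumptions for illustration rather than the library's code.

```python
import copy
import numpy as np

def append_obs(task, X_new, Y_new, context_set_idx):
    """Hypothetical stand-in operating on a plain dict."""
    task_with_new = copy.deepcopy(task)  # leave the caller's task untouched
    X_c = task_with_new["X_c"][context_set_idx]
    Y_c = task_with_new["Y_c"][context_set_idx]
    # Append the new observation as an extra column (assumed layout)
    task_with_new["X_c"][context_set_idx] = np.concatenate([X_c, X_new], axis=-1)
    task_with_new["Y_c"][context_set_idx] = np.concatenate([Y_c, Y_new], axis=-1)
    return task_with_new

task = {"X_c": [np.zeros((2, 3))], "Y_c": [np.zeros((1, 3))]}
new_task = append_obs(task, np.array([[0.5], [0.5]]), np.array([[1.0]]), 0)
```

Because of the deep copy, `task` still holds three observations after the call while `new_task` holds four.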

concat_tasks(tasks, multiple=1)

Concatenate a list of tasks into a single task containing multiple batches.

Parameters:
  • tasks (List[deepsensor.data.task.Task]) – List of tasks to concatenate into a single task.

  • multiple (int, optional) – Contexts are padded to the smallest multiple of this number that is greater than the number of contexts in each task. Defaults to 1 (padded to the largest number of contexts in the tasks). Setting to a larger number will increase the amount of padding but decrease the range of tensor shapes presented to the model, which simplifies the computational graph in graph mode.

Returns:

deepsensor.data.task.Task – Task containing multiple batches.

Raises:
  • ValueError – If the tasks have different numbers of target sets.

  • ValueError – If the tasks have different numbers of targets.

  • ValueError – If the tasks have different types of target sets (gridded/non-gridded).
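
The effect of `multiple` on the padded context size can be sketched with plain arithmetic. The helper `padded_context_size` is hypothetical, and it assumes the padded size is the smallest multiple that is at least as large as the biggest context set, which is the reading consistent with the stated default behaviour; how the library fills the padded entries is not covered here.

```python
import math

def padded_context_size(context_sizes, multiple=1):
    """Hypothetical helper: smallest multiple of `multiple` that fits every task."""
    largest = max(context_sizes)
    return math.ceil(largest / multiple) * multiple

sizes = [3, 7, 12]  # number of context points in each task
size_default = padded_context_size(sizes)             # pads to the largest task
size_coarse = padded_context_size(sizes, multiple=8)  # rounds up to a multiple of 8
```

With a larger `multiple`, batches built from differently sized tasks land on fewer distinct shapes, at the cost of extra padding.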

flatten_X(X)

Convert tuple of gridded coords to (2, N) array if necessary.

Parameters:

X (numpy.ndarray | Tuple[numpy.ndarray, numpy.ndarray]) –

Returns:

numpy.ndarray
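
One way to realise this conversion in NumPy is shown below. The helper `flatten_coords`, the use of 1-D coordinate vectors as input, and the "ij" meshgrid ordering are assumptions for illustration and may not match the library's internals.

```python
import numpy as np

def flatten_coords(X):
    """Hypothetical helper mirroring the described behaviour."""
    if isinstance(X, tuple):
        x1, x2 = X
        X1, X2 = np.meshgrid(x1, x2, indexing="ij")  # each of shape (N_x1, N_x2)
        return np.stack([X1.ravel(), X2.ravel()], axis=0)  # shape (2, N_x1 * N_x2)
    return X  # already a (2, N) array

x1 = np.array([0.0, 1.0])
x2 = np.array([10.0, 20.0, 30.0])
X_flat = flatten_coords((x1, x2))
```

Each column of `X_flat` is one (x1, x2) grid point, with x2 varying fastest.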

flatten_Y(Y)

Convert gridded data of shape (N_dim, N_x1, N_x2) to (N_dim, N_x1 * N_x2) array if necessary.

Parameters:

Y (numpy.ndarray | Tuple[numpy.ndarray, numpy.ndarray]) –

Returns:

numpy.ndarray
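
The reshape described above can be sketched in NumPy as follows. The helper `flatten_values` and the check on the number of dimensions are illustrative assumptions, not the library's implementation.

```python
import numpy as np

def flatten_values(Y):
    """Hypothetical helper: collapse the two grid axes of a 3-D array."""
    if Y.ndim == 3:
        n_dim = Y.shape[0]
        return Y.reshape(n_dim, -1)  # (N_dim, N_x1 * N_x2), row-major order
    return Y  # already flat

Y_grid = np.arange(6.0).reshape(1, 2, 3)  # N_dim=1, N_x1=2, N_x2=3
Y_flat = flatten_values(Y_grid)
```

Row-major reshaping keeps the last grid axis varying fastest, so values stay aligned with coordinates flattened in the same order.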