snorkel.classification.DictDataset
class snorkel.classification.DictDataset(name, split, X_dict, Y_dict)
Bases: torch.utils.data.dataset.Dataset
A dataset where both the data fields and the labels are stored as dictionaries.
- Parameters
name (str) – The name of the dataset (e.g., this will be used to report metrics on a per-dataset basis)
split (str) – The name of the split that the data in this object represents
X_dict (Dict[str, Any]) – A map from field name to values (e.g., {"tokens": …, "uids": …})
Y_dict (Dict[str, Tensor]) – A map from task name to its corresponding set of labels
- Raises
ValueError – All values in Y_dict must be of type torch.Tensor
__init__(name, split, X_dict, Y_dict)
Initialize self. See help(type(self)) for accurate signature.
- Return type
None
Methods
__init__(name, split, X_dict, Y_dict) – Initialize self.
from_tensors(X_tensor, Y_tensor, split[, …]) – Initialize a DictDataset from PyTorch Tensors.
classmethod from_tensors(X_tensor, Y_tensor, split, input_data_key='input_data', task_name='task', dataset_name='SnorkelDataset')
Initialize a DictDataset from PyTorch Tensors.
- Parameters
X_tensor (Tensor) – Input data of shape [num_examples, feature_dim]
Y_tensor (Tensor) – Labels of shape [num_samples, num_classes]
split (str) – Name of data split corresponding to this dataset.
input_data_key (str) – Name of data field to initialize in X_dict
task_name (str) – Name of task and corresponding label key in Y_dict
dataset_name (str) – Name of DictDataset to be initialized; see __init__ above.
- Returns
Class initialized with single task and label corresponding to input data
- Return type