snorkel.augmentation.RandomPolicy

class snorkel.augmentation.RandomPolicy(n_tfs, sequence_length=1, n_per_original=1, keep_original=True)

Bases: snorkel.augmentation.policy.sampling.MeanFieldPolicy

Naive random augmentation policy.

Samples sequences of transformation function (TF) indices of a specified length, uniformly at random from the total number of TFs. Uniform random sampling is a common baseline approach to data augmentation.

Parameters

n_tfs (int) – Total number of TFs
sequence_length (int) – Number of TFs to run on each data point
n_per_original (int) – Number of transformed data points per original
keep_original (bool) – Whether to keep the untransformed data point in the augmented data set. Even if the applied TFs modify a data point in place, the original data point remains unchanged.
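The guarantee described for keep_original can be illustrated with a minimal sketch. It assumes each TF sequence is applied to a deep copy of the data point; this is an assumption about one way to get that behavior, not a description of Snorkel's internals. The names `shout`, `exclaim`, and `apply_sequence` are hypothetical and are not part of Snorkel.

```python
import copy
from typing import Callable, Dict, List

DataPoint = Dict[str, str]
TF = Callable[[DataPoint], DataPoint]


def shout(x: DataPoint) -> DataPoint:
    """Hypothetical TF that mutates its input in place."""
    x["text"] = x["text"].upper()
    return x


def exclaim(x: DataPoint) -> DataPoint:
    """Hypothetical TF that mutates its input in place."""
    x["text"] += "!"
    return x


def apply_sequence(x: DataPoint, tfs: List[TF], sequence: List[int]) -> DataPoint:
    """Apply the TFs selected by `sequence`, in order, to a copy of `x`.

    Copying first is the assumption that keeps the original unchanged.
    """
    transformed = copy.deepcopy(x)
    for idx in sequence:
        transformed = tfs[idx](transformed)
    return transformed


tfs = [shout, exclaim]
original = {"text": "hello"}

# The empty sequence stands for "keep the original"; [0, 1] runs both TFs.
augmented = [apply_sequence(original, tfs, seq) for seq in [[], [0, 1]]]

print(original)   # {'text': 'hello'}
print(augmented)  # [{'text': 'hello'}, {'text': 'HELLO!'}]
```

The empty sequence yields an untouched copy, which is how an untransformed data point can sit alongside its transformed variants.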

n: Total number of TFs

n_per_original: See Parameters above

keep_original: See Parameters above

sequence_length: See Parameters above

__init__(n_tfs, sequence_length=1, n_per_original=1, keep_original=True)

Initialize the policy with the parameters described above.

Return type: None

Methods

`__init__`(n_tfs[, sequence_length, …])	Initialize self.
`generate`()	Generate a sequence of TF indices by sampling from distribution.
`generate_for_example`()	Generate all sequences of TF indices for a single example.

generate()

Generate a sequence of TF indices by sampling from the policy's distribution over TFs, which is uniform for RandomPolicy.

Returns: Indices of the TFs to run on a data point, in order.
Return type: List[int]
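The sampling step described above can be sketched with the standard library alone. This is an illustration of uniform sampling over TF indices, not Snorkel's implementation; `sample_tf_sequence` is a hypothetical helper, and drawing with replacement is an assumption.

```python
import random
from typing import List, Optional


def sample_tf_sequence(
    n_tfs: int,
    sequence_length: int = 1,
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Return `sequence_length` TF indices drawn uniformly from range(n_tfs).

    Hypothetical helper, not part of Snorkel. Indices are drawn with
    replacement (an assumption), so the same TF may appear more than once.
    """
    rng = rng or random.Random()
    return [rng.randrange(n_tfs) for _ in range(sequence_length)]


# Example: 3 available TFs, 2 applied per data point.
sequence = sample_tf_sequence(n_tfs=3, sequence_length=2, rng=random.Random(0))
print(sequence)  # two indices, each in {0, 1, 2}
```

Passing a seeded `random.Random` makes the draw reproducible, which helps when comparing augmentation runs.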

generate_for_example()

Generate all sequences of TF indices for a single example.

Generates n_per_original sequences, and adds an empty sequence if keep_original is True.

Returns: Sequences of TF indices to run on a data point, each sequence in order.
Return type: List[List[int]]
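As a rough sketch of the behavior described above, the function below builds n_per_original random sequences and adds an empty sequence when keep_original is True. The function name, the `seed` parameter, and the choice to place the empty sequence first are assumptions made for illustration; none of them is confirmed by this page.

```python
import random
from typing import List


def sequences_for_example(
    n_tfs: int,
    sequence_length: int = 1,
    n_per_original: int = 1,
    keep_original: bool = True,
    seed: int = 0,
) -> List[List[int]]:
    """Hypothetical stand-in for the documented behavior, not Snorkel code."""
    rng = random.Random(seed)
    # One uniformly sampled sequence of TF indices per requested variant.
    sequences = [
        [rng.randrange(n_tfs) for _ in range(sequence_length)]
        for _ in range(n_per_original)
    ]
    if keep_original:
        # An empty sequence applies no TFs, so the data point is kept as is.
        # Putting it first is an assumption about ordering.
        sequences.insert(0, [])
    return sequences


result = sequences_for_example(n_tfs=3, sequence_length=2, n_per_original=2)
print(len(result))  # 3: one empty sequence plus two sampled sequences
print(result[0])    # []
```

With keep_original=False, the same call returns only the two sampled sequences.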