snorkel.augmentation.PandasTFApplier
class snorkel.augmentation.PandasTFApplier(tfs, policy)
Bases: snorkel.augmentation.apply.core.BaseTFApplier
TF applier for a Pandas DataFrame.
Data points are stored as Series in a DataFrame. The TFs run on data points obtained via a pandas.DataFrame.iterrows call, which is single-process and can be slow for large DataFrames. For large datasets, consider DaskTFApplier or SparkTFApplier.
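The row-by-row access pattern described above can be illustrated with a minimal pandas-only sketch. This is not the applier's actual implementation; the `shout` function and the `text` and `label` columns are hypothetical and only show that `iterrows` hands over one Series at a time within a single process.

```python
import pandas as pd

# Hypothetical toy data set: one data point per row.
df = pd.DataFrame({"text": ["good movie", "bad plot"], "label": [1, 0]})


def shout(row: pd.Series) -> pd.Series:
    # Hypothetical transformation: copy the row, then upper-case its text.
    out = row.copy()
    out["text"] = out["text"].upper()
    return out


# iterrows yields (index, Series) pairs sequentially, so the work is done
# one row at a time in the calling process.
transformed = [shout(row) for _, row in df.iterrows()]
df_out = pd.DataFrame(transformed)
```

Because every row passes through Python one at a time, the cost grows linearly with the number of rows, which is why a distributed applier may be preferable for large inputs.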
__init__(tfs, policy)
Initialize self. See help(type(self)) for accurate signature.
- Return type
None
Methods
__init__(tfs, policy) – Initialize self.
apply(df[, progress_bar]) – Augment a Pandas DataFrame of data points using TFs and policy.
apply_generator(df, batch_size) – Augment a Pandas DataFrame of data points using TFs and policy in batches.
apply(df, progress_bar=True)
Augment a Pandas DataFrame of data points using TFs and policy.
- Parameters
df (DataFrame) – Pandas DataFrame containing data points to be transformed
progress_bar (bool) – Display a progress bar?
- Returns
Pandas DataFrame of data points in augmented data set
- Return type
pd.DataFrame
apply_generator(df, batch_size)
Augment a Pandas DataFrame of data points using TFs and policy in batches.
This method acts as a generator, yielding augmented data points for a given input batch of data points. This can be useful in a training loop when it is too memory-intensive to pregenerate all transformed examples.
- Parameters
df (DataFrame) – Pandas DataFrame containing data points to be transformed
batch_size (int) – Batch size for generator. Yields augmented data points for the next batch_size input data points.
- Returns
Pandas DataFrame of data points in augmented data set
- Return type
pd.DataFrame