snorkel.labeling.PandasLFApplier

class snorkel.labeling.PandasLFApplier(lfs)

Bases: snorkel.labeling.apply.core.BaseLFApplier

LF applier for a Pandas DataFrame.

Each data point is a row of the DataFrame and is passed to the LFs as a pandas Series. The LFs are executed via a pandas.DataFrame.apply call, which runs in a single process and can be slow for large DataFrames. For large datasets, consider DaskLFApplier or SparkLFApplier.
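The row-by-row execution described above can be pictured with a minimal, dependency-free sketch. It is not Snorkel's implementation: apply_lfs_rowwise, is_short_text, and the SimpleNamespace rows are hypothetical stand-ins for the applier, a second LF, and the Series objects that each LF receives.

```python
from types import SimpleNamespace


def is_big_num(x):
    # Same rule as the documented example LF: 1 if num > 42, else 0.
    return 1 if x.num > 42 else 0


def is_short_text(x):
    # Hypothetical second LF, added only to show a second column.
    return 1 if len(x.text) < 3 else 0


def apply_lfs_rowwise(rows, lfs):
    """Call every LF on every row, one row at a time, in a single process."""
    return [[lf(row) for lf in lfs] for row in rows]


# Stand-ins for the two DataFrame rows used in the documented example.
rows = [
    SimpleNamespace(num=10, text="hello"),
    SimpleNamespace(num=100, text="hi"),
]

label_matrix = apply_lfs_rowwise(rows, [is_big_num, is_short_text])
print(label_matrix)  # [[0, 0], [1, 1]]
```

Under this picture the work grows with the number of rows times the number of LFs, all on one process, which is why the distributed appliers are suggested for large datasets.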

Parameters

lfs (List[LabelingFunction]) – LFs that this applier executes on examples

Example

>>> import pandas as pd
>>> from snorkel.labeling import PandasLFApplier, labeling_function
>>> @labeling_function()
... def is_big_num(x):
...     return 1 if x.num > 42 else 0
>>> applier = PandasLFApplier([is_big_num])
>>> applier.apply(pd.DataFrame(dict(num=[10, 100], text=["hello", "hi"])))
array([[0], [1]])

__init__(lfs)

Initialize self. See help(type(self)) for accurate signature.

Return type

None

Methods

__init__(lfs)

Initialize self.

apply(df[, progress_bar])

Label Pandas DataFrame of data points with LFs.

apply(df, progress_bar=True)

Label Pandas DataFrame of data points with LFs.

Parameters
  • df (DataFrame) – Pandas DataFrame containing data points to be labeled by LFs

  • progress_bar (bool) – Whether to display a progress bar

Returns

Matrix of labels emitted by LFs

Return type

np.ndarray
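As a rough sketch of how the returned matrix might be read, the snippet below assumes one row per data point and one column per LF, which matches the shape of the example output earlier on this page. It uses nested lists instead of np.ndarray to stay dependency-free. The helper lf_coverage and the sample matrix are hypothetical, and treating -1 as "abstain" follows Snorkel's usual convention but is an assumption here, since this page does not state it.

```python
ABSTAIN = -1  # assumption: abstain value, not stated on this page


def lf_coverage(label_matrix):
    """Fraction of data points each LF labeled (i.e. did not abstain on)."""
    n_points = len(label_matrix)
    n_lfs = len(label_matrix[0])
    return [
        sum(row[j] != ABSTAIN for row in label_matrix) / n_points
        for j in range(n_lfs)
    ]


# Hypothetical label matrix: 4 data points (rows), 2 LFs (columns).
sample_matrix = [
    [0, -1],
    [1, 1],
    [-1, 1],
    [0, -1],
]

coverage = lf_coverage(sample_matrix)
print(coverage)  # [0.75, 0.5]
```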