snorkel.labeling.PandasLFApplier
class snorkel.labeling.PandasLFApplier(lfs)
Bases: snorkel.labeling.apply.core.BaseLFApplier
LF applier for a Pandas DataFrame.
Data points are stored as Series in a DataFrame. The LFs are executed via a pandas.DataFrame.apply call, which is single-process and can be slow for large DataFrames. For large datasets, consider DaskLFApplier or SparkLFApplier.
Parameters
lfs (List[LabelingFunction]) – LFs that this applier executes on examples
Example
>>> import pandas as pd
>>> from snorkel.labeling import PandasLFApplier, labeling_function
>>> @labeling_function()
... def is_big_num(x):
...     return 1 if x.num > 42 else 0
>>> applier = PandasLFApplier([is_big_num])
>>> applier.apply(pd.DataFrame(dict(num=[10, 100], text=["hello", "hi"])))
array([[0], [1]])
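The row-wise execution described above can be sketched with pandas and NumPy alone. This is a minimal illustration of labeling through DataFrame.apply with axis=1, not Snorkel's implementation: the helper apply_functions, the plain functions is_big_num and is_long_text, and the ABSTAIN constant are hypothetical names introduced only for this sketch.

```python
import numpy as np
import pandas as pd

ABSTAIN = -1  # assumed placeholder label for a function that raised


def is_big_num(x):
    # x is one row of the DataFrame, passed in as a pandas Series
    return 1 if x.num > 42 else 0


def is_long_text(x):
    # Raises TypeError when the text value is missing
    return 1 if len(x.text) > 3 else 0


def apply_functions(df, functions, fault_tolerant=False):
    """Label every row of df with every function, one row at a time."""

    def label_row(row):
        labels = []
        for f in functions:
            try:
                labels.append(f(row))
            except Exception:
                if not fault_tolerant:
                    raise
                labels.append(ABSTAIN)
        return labels

    # axis=1 hands each row to label_row; the result is a Series of lists
    rows = df.apply(label_row, axis=1)
    return np.array(rows.tolist(), dtype=int)


df = pd.DataFrame(dict(num=[10, 100], text=["hello", None]))
L = apply_functions(df, [is_big_num, is_long_text], fault_tolerant=True)
print(L)
```

Each row is processed sequentially in a single Python process, which is why this style of application scales poorly to large DataFrames.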
__init__(lfs)
Initialize self. See help(type(self)) for accurate signature.
Return type
None
Methods
__init__(lfs) – Initialize self.
apply(df[, progress_bar, fault_tolerant, …]) – Label Pandas DataFrame of data points with LFs.
apply(df, progress_bar=True, fault_tolerant=False, return_meta=False)
Label Pandas DataFrame of data points with LFs.
Parameters
df (DataFrame) – Pandas DataFrame containing data points to be labeled by LFs
progress_bar (bool) – Display a progress bar?
fault_tolerant (bool) – Output -1 if LF execution fails?
return_meta (bool) – Return metadata from apply call?
Return type
Union[ndarray, Tuple[ndarray, ApplierMetadata]]
Returns
np.ndarray – Matrix of labels emitted by LFs
ApplierMetadata – Metadata, such as fault counts, for the apply call