snorkel.labeling.PandasLFApplier

class snorkel.labeling.PandasLFApplier(lfs)
Bases: snorkel.labeling.apply.core.BaseLFApplier

LF applier for a Pandas DataFrame.
Data points are stored as Series in a DataFrame. The LFs are executed via a pandas.DataFrame.apply call, which is single-process and can be slow for large DataFrames. For large datasets, consider DaskLFApplier or SparkLFApplier.

Parameters
- lfs (List[LabelingFunction]) – LFs that this applier executes on examples

Example

>>> import pandas as pd
>>> from snorkel.labeling import PandasLFApplier, labeling_function
>>> @labeling_function()
... def is_big_num(x):
...     return 1 if x.num > 42 else 0
>>> applier = PandasLFApplier([is_big_num])
>>> applier.apply(pd.DataFrame(dict(num=[10, 100], text=["hello", "hi"])))
array([[0],
       [1]])
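
The description above says the LFs run through a single-process pandas.DataFrame.apply call. The sketch below illustrates that row-wise pattern with plain pandas and NumPy only. It is not Snorkel's implementation: the helper apply_row_wise and the function is_short_text are hypothetical names introduced here, and plain Python functions stand in for LabelingFunction objects.

```python
import numpy as np
import pandas as pd


# Plain functions standing in for labeling functions (illustrative only).
def is_big_num(x):
    return 1 if x.num > 42 else 0


def is_short_text(x):
    return 1 if len(x.text) < 3 else 0


def apply_row_wise(df, fns):
    """Hypothetical helper: run every function on every row, one row at a time."""
    # axis=1 hands each row to the lambda as a pandas Series;
    # result_type="expand" turns each returned list into one column per function.
    labels = df.apply(
        lambda row: [fn(row) for fn in fns], axis=1, result_type="expand"
    )
    return labels.to_numpy(dtype=int)


df = pd.DataFrame(dict(num=[10, 100], text=["hello", "hi"]))
L = apply_row_wise(df, [is_big_num, is_short_text])
print(L.shape)  # (2, 2): one row per data point, one column per function
```

Because every row is processed sequentially in one Python process, the runtime grows with the number of rows times the number of functions, which is why the description points to DaskLFApplier or SparkLFApplier for large datasets.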

__init__(lfs)
Initialize self. See help(type(self)) for accurate signature.

Return type
None

Methods

__init__(lfs): Initialize self.
apply(df[, progress_bar, fault_tolerant, …]): Label Pandas DataFrame of data points with LFs.

apply(df, progress_bar=True, fault_tolerant=False, return_meta=False)
Label Pandas DataFrame of data points with LFs.

Parameters
- df (DataFrame) – Pandas DataFrame containing data points to be labeled by LFs
- progress_bar (bool) – Display a progress bar?
- fault_tolerant (bool) – Output -1 if LF execution fails?
- return_meta (bool) – Return metadata from apply call?

Return type
Union[ndarray, Tuple[ndarray, ApplierMetadata]]

Returns
- np.ndarray – Matrix of labels emitted by LFs
- ApplierMetadata – Metadata, such as fault counts, for the apply call