snorkel.labeling.apply.dask.DaskLFApplier
class snorkel.labeling.apply.dask.DaskLFApplier(lfs)
Bases: snorkel.labeling.apply.core.BaseLFApplier
LF applier for a Dask DataFrame.
Dask DataFrames consist of partitions, each being a Pandas DataFrame. This allows for efficient parallel computation over DataFrame rows. For more information, see https://docs.dask.org/en/stable/dataframe.html
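The partition idea can be illustrated without Dask. The following is a minimal plain-Python sketch, not how DaskLFApplier is implemented: lists of dicts stand in for pandas partitions, a thread pool stands in for the Dask scheduler, and the names lf_has_link, lf_is_short, label_partition and label_in_partitions are hypothetical. It assumes the convention that -1 means an LF abstains.

```python
from concurrent.futures import ThreadPoolExecutor

ABSTAIN = -1  # assumed convention: -1 means "no label"


def lf_has_link(row):
    # Hypothetical LF: vote 1 if the text contains a link.
    return 1 if "http" in row["text"] else ABSTAIN


def lf_is_short(row):
    # Hypothetical LF: vote 0 if the text is shorter than 10 characters.
    return 0 if len(row["text"]) < 10 else ABSTAIN


def label_partition(rows, lfs):
    """Apply every LF to every row of a single partition."""
    return [[lf(row) for lf in lfs] for row in rows]


def label_in_partitions(rows, lfs, npartitions=2):
    """Split rows into contiguous partitions and label them in parallel."""
    size = max(1, -(-len(rows) // npartitions))  # ceiling division
    partitions = [rows[i:i + size] for i in range(0, len(rows), size)]
    with ThreadPoolExecutor() as pool:
        # pool.map preserves partition order, so row order is preserved too.
        results = pool.map(lambda part: label_partition(part, lfs), partitions)
        return [labels for part in results for labels in part]


rows = [
    {"text": "see http://x.io"},
    {"text": "hi"},
    {"text": "a long plain sentence"},
]
label_matrix = label_in_partitions(rows, [lf_has_link, lf_is_short])
print(label_matrix)  # one row per data point, one column per LF
```

Each partition is labeled independently, which is what makes row-wise work easy to distribute.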
__init__(lfs)
Initialize self. See help(type(self)) for accurate signature.
- Return type
None
Methods
__init__(lfs) – Initialize self.
apply(df[, scheduler, fault_tolerant]) – Label Dask DataFrame of data points with LFs.
apply(df, scheduler='processes', fault_tolerant=False)
Label Dask DataFrame of data points with LFs.
- Parameters
df (dask.dataframe.DataFrame) – Dask DataFrame containing data points to be labeled by LFs
scheduler (Union[str, dask.distributed.Client]) – A Dask scheduling configuration: either a string option or a Client. For more information, see https://docs.dask.org/en/stable/scheduling.html
fault_tolerant (bool) – If True, output -1 when LF execution fails
- Returns
Matrix of labels emitted by LFs
- Return type
np.ndarray
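The effect of fault_tolerant can be sketched in plain Python. This is an illustration of the documented behavior (emit -1 when an LF fails), not Snorkel's implementation; call_lf and lf_title_mentions_dask are hypothetical names, and the dict rows stand in for DataFrame rows.

```python
ABSTAIN = -1  # the value emitted when a fault-tolerant call fails


def call_lf(lf, x, fault_tolerant=False):
    """Call an LF on one data point, optionally swallowing errors."""
    if not fault_tolerant:
        return lf(x)  # any exception propagates to the caller
    try:
        return lf(x)
    except Exception:
        return ABSTAIN  # a failed LF yields -1 instead of raising


def lf_title_mentions_dask(x):
    # Hypothetical LF that assumes every row has a "title" field.
    return 1 if "dask" in x["title"].lower() else ABSTAIN


rows = [{"title": "Dask tips"}, {"body": "no title field"}]
labels = [call_lf(lf_title_mentions_dask, r, fault_tolerant=True) for r in rows]
print(labels)  # the second row has no "title", so its label is -1
```

With fault_tolerant left at False, the second row would raise a KeyError and stop labeling. Note that a -1 produced by a failure cannot be told apart from a -1 the LF returned on purpose.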