snorkel.preprocess.Preprocessor
class snorkel.preprocess.Preprocessor(name, field_names=None, mapped_field_names=None, pre=None, memoize=False, memoize_key=None)
Bases: snorkel.map.core.Mapper
Base class for preprocessors.
See snorkel.map.core.Mapper for details.
__init__(name, field_names=None, mapped_field_names=None, pre=None, memoize=False, memoize_key=None)
Initialize self. See help(type(self)) for accurate signature.
Return type: None
Methods
    __init__(name[, field_names, …])    Initialize self.
    reset()                             Reset the memoization cache.
    run(**kwargs)                       Run the mapping operation using the input fields.
__call__(x)
Run mapping function on input data point.
Deep copies the data point first so as not to make accidental in-place changes. If memoize is set to True, an internal cache is checked for results. If no cached results are found, the computed results are added to the cache.
Parameters: x (Any) – Data point to run mapping function on
Returns: Mapped data point of same format but possibly different fields
Return type: DataPoint
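The copy-then-cache behaviour described for __call__ can be illustrated with a small stand-in class. This is only a sketch under stated assumptions, not Snorkel's implementation: MemoizingMapperSketch, its _key helper, the n_runs counter, and the hard-coded lowercase mapping are all hypothetical names introduced for illustration, and the cache key (sorted field/value pairs) is an assumed choice.

```python
import copy
from types import SimpleNamespace


class MemoizingMapperSketch:
    """Hypothetical stand-in for the documented __call__ behaviour (not Snorkel code)."""

    def __init__(self, memoize=False):
        self.memoize = memoize
        self._cache = {}   # assumed cache layout: hashable key -> mapped data point
        self.n_runs = 0    # counts how often the mapping actually executes

    def reset(self):
        """Empty the memoization cache."""
        self._cache = {}

    def _key(self, x):
        # Assumption: identify a data point by its sorted (field, value) pairs.
        return tuple(sorted(vars(x).items()))

    def __call__(self, x):
        if self.memoize:
            key = self._key(x)
            if key in self._cache:
                return self._cache[key]
        # Work on a deep copy so the caller's object is never modified.
        x_mapped = copy.deepcopy(x)
        x_mapped.text_lower = x_mapped.text.lower()
        self.n_runs += 1
        if self.memoize:
            self._cache[key] = x_mapped
        return x_mapped


mapper = MemoizingMapperSketch(memoize=True)
x = SimpleNamespace(text="Hello World")
first = mapper(x)    # computed and stored in the cache
second = mapper(x)   # served from the cache
```

In this sketch the original x keeps only its text field, and the second call returns the cached object without re-running the mapping. Calling reset() empties the cache, so the next call computes the result again.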
run(**kwargs)
Run the mapping operation using the input fields.
The inputs to this function are fed by extracting the fields of the input data point using the keys of field_names. The output field names are converted using mapped_field_names and added to the data point.
Returns: A mapping from canonical output field names to their values.
Return type: Optional[FieldMap]
Raises: NotImplementedError – Subclasses must implement this method
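The way field_names and mapped_field_names frame a call to run can be sketched in plain Python. The helper apply_run below is hypothetical and is not part of Snorkel. It assumes that field_names maps each run argument name to an attribute of the data point, and that mapped_field_names maps each canonical output name to the name stored on the data point.

```python
from types import SimpleNamespace


def apply_run(x, run, field_names, mapped_field_names=None):
    """Hypothetical helper mimicking how run is fed and how its output is stored."""
    # Assumption: field_names maps run() argument names to data point attributes.
    inputs = {arg: getattr(x, attr) for arg, attr in field_names.items()}
    outputs = run(**inputs)
    if outputs is None:
        # The return type is Optional[FieldMap]; None means nothing to add.
        return None
    if mapped_field_names is not None:
        # Assumption: canonical output name -> name used on the data point.
        outputs = {new: outputs[old] for old, new in mapped_field_names.items()}
    # For brevity this writes onto x directly; no copy is made here.
    for name, value in outputs.items():
        setattr(x, name, value)
    return x


def run(text):
    """Example mapping operation returning canonical output field names."""
    return {"num_words": len(text.split())}


x = SimpleNamespace(body="the quick brown fox")
result = apply_run(
    x,
    run,
    field_names={"text": "body"},
    mapped_field_names={"num_words": "body_word_count"},
)
```

Here run only ever sees an argument called text and only ever returns num_words, while the data point reads from body and stores the result as body_word_count. Keeping run in canonical names is what would let the same operation be reused across data points with different field layouts.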