snorkel.labeling.lf.nlp_spark.spark_nlp_labeling_function

class snorkel.labeling.lf.nlp_spark.spark_nlp_labeling_function(name=None, resources=None, pre=None, text_field='text', doc_field='doc', language='en_core_web_sm', disable=None, memoize=True, memoize_key=None, gpu=False)

Bases: snorkel.labeling.lf.nlp.base_nlp_labeling_function

Decorator to define a SparkNLPLabelingFunction object from a function.

Parameters

name (Optional[str]) – Name of the LF
resources (Optional[Mapping[str, Any]]) – Labeling resources passed in to f via kwargs
pre (Optional[List[BaseMapper]]) – Preprocessors to run before SpacyPreprocessor is executed
text_field (str) – Name of data point text field to input
doc_field (str) – Name of data point field to output parsed document to
language (str) – spaCy model to load. See https://spacy.io/usage/models#usage
disable (Optional[List[str]]) – List of pipeline components to disable. See https://spacy.io/usage/processing-pipelines#disabling
memoize (bool) – Whether to memoize preprocessor outputs
memoize_key (Optional[Callable[[Any], Hashable]]) – Hashing function used as the memoization key (defaults to snorkel.map.core.get_hashable)
gpu (bool) – Whether to prefer GPU processing in spaCy
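
To make the roles of text_field, doc_field, memoize and memoize_key more concrete, here is a minimal pure-Python sketch of a preprocessing step that reads one field, writes the parsed result to another, and caches by key. It is not Snorkel's implementation: make_preprocessor and toy_parse are hypothetical names, whitespace splitting stands in for a spaCy model, and the default key here is simply the text itself rather than snorkel.map.core.get_hashable.

```python
from types import SimpleNamespace
from typing import Any, Callable, Dict, Hashable, Optional


def make_preprocessor(
    parse: Callable[[str], Any],
    text_field: str = "text",
    doc_field: str = "doc",
    memoize: bool = True,
    memoize_key: Optional[Callable[[Any], Hashable]] = None,
) -> Callable[[Any], Any]:
    """Hypothetical stand-in for the preprocessing step (not Snorkel's code)."""
    # Assumption for this sketch only: key on the raw text when no key is given.
    key_fn = memoize_key or (lambda x: getattr(x, text_field))
    cache: Dict[Hashable, Any] = {}

    def preprocess(x: Any) -> Any:
        key = key_fn(x)
        if memoize and key in cache:
            doc = cache[key]  # reuse the earlier parse
        else:
            doc = parse(getattr(x, text_field))
            if memoize:
                cache[key] = doc
        # Return a copy of the data point with the parsed document attached.
        fields = dict(vars(x))
        fields[doc_field] = doc
        return SimpleNamespace(**fields)

    return preprocess


calls = []  # records every time the (toy) parser actually runs


def toy_parse(text: str) -> list:
    calls.append(text)
    return text.split()


preprocess = make_preprocessor(toy_parse)
a = preprocess(SimpleNamespace(text="The movie was good."))
b = preprocess(SimpleNamespace(text="The movie was good."))
```

In this sketch the second call finds the key in the cache, so toy_parse runs only once. That saving is the reason to memoize when the parser is expensive.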

Example

>>> from snorkel.labeling.lf.nlp_spark import spark_nlp_labeling_function
>>> @spark_nlp_labeling_function()
... def has_person_mention(x):
...     person_ents = [ent for ent in x.doc.ents if ent.label_ == "PERSON"]
...     return 0 if len(person_ents) > 0 else -1
>>> has_person_mention
SparkNLPLabelingFunction has_person_mention, Preprocessors: [SpacyPreprocessor...]

>>> from pyspark.sql import Row
>>> x = Row(text="The movie was good.")
>>> has_person_mention(x)
-1
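
The example above only exercises the abstain branch. The sketch below runs the same function body against stand-in objects so that both branches are visible without loading a spaCy model or a Spark session. fake_point is a hypothetical helper, and the SimpleNamespace objects only imitate the doc.ents and label_ attributes that the function reads; they are not spaCy objects.

```python
from types import SimpleNamespace


def has_person_mention(x):
    # Same logic as the decorated example, left undecorated for illustration.
    person_ents = [ent for ent in x.doc.ents if ent.label_ == "PERSON"]
    return 0 if len(person_ents) > 0 else -1


def fake_point(*labels):
    """Build a stand-in data point whose doc.ents carry the given labels."""
    ents = [SimpleNamespace(label_=label) for label in labels]
    return SimpleNamespace(doc=SimpleNamespace(ents=ents))


with_person = has_person_mention(fake_point("PERSON", "ORG"))  # label 0
without_person = has_person_mention(fake_point("ORG"))         # abstain, -1
no_entities = has_person_mention(fake_point())                 # abstain, -1
```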

__init__(name=None, resources=None, pre=None, text_field='text', doc_field='doc', language='en_core_web_sm', disable=None, memoize=True, memoize_key=None, gpu=False)

Initialize self. See help(type(self)) for accurate signature.

Return type: None

Methods

__init__([name, resources, pre, text_field, …])

Initialize self.

__call__(f)

Wrap a function to create a BaseNLPLabelingFunction.

Parameters: f (Callable[…, int]) – Function that implements the core NLP LF logic
Returns: New BaseNLPLabelingFunction executing logic in wrapped function
Return type: BaseNLPLabelingFunction
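
The split between __init__ (configuration) and __call__(f) (wrapping) is the usual shape of a decorator that takes arguments. The following is a minimal sketch of that pattern, including resources being forwarded to the wrapped function as keyword arguments. ToyLabelingFunction and toy_labeling_function are hypothetical stand-ins, not Snorkel classes, and falling back to the function's own name when no name is given is an assumption of this sketch.

```python
from typing import Any, Callable, Mapping, Optional


class ToyLabelingFunction:
    """Hypothetical stand-in for the object the decorator returns."""

    def __init__(self, name: str, f: Callable[..., int],
                 resources: Optional[Mapping[str, Any]]) -> None:
        self.name = name
        self._f = f
        self._resources = dict(resources or {})

    def __call__(self, x: Any) -> int:
        # Resources reach the wrapped function as keyword arguments.
        return self._f(x, **self._resources)


class toy_labeling_function:
    """Hypothetical decorator class: configured in __init__, applied in __call__."""

    def __init__(self, name: Optional[str] = None,
                 resources: Optional[Mapping[str, Any]] = None) -> None:
        self.name = name
        self.resources = resources

    def __call__(self, f: Callable[..., int]) -> ToyLabelingFunction:
        # Assumption of this sketch: default the name to the function's name.
        name = self.name or f.__name__
        return ToyLabelingFunction(name, f, self.resources)


@toy_labeling_function(resources={"keywords": {"good", "great"}})
def mentions_keyword(x, keywords):
    # Return label 1 if any keyword occurs in the text, otherwise abstain (-1).
    return 1 if any(word in x.lower() for word in keywords) else -1
```

After decoration, mentions_keyword is an instance of the wrapper class rather than a plain function. This is why the decorator must be written with parentheses, as in @spark_nlp_labeling_function(): the parentheses build the configured decorator object, which is then called with the function.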