snorkel.labeling.lf.nlp_spark.spark_nlp_labeling_function

class snorkel.labeling.lf.nlp_spark.spark_nlp_labeling_function(name=None, resources=None, pre=None, text_field='text', doc_field='doc', language='en_core_web_sm', disable=None, memoize=True, memoize_key=None, gpu=False)

Bases: snorkel.labeling.lf.nlp.base_nlp_labeling_function

Decorator to define a SparkNLPLabelingFunction object from a function.

Parameters
  • name (Optional[str]) – Name of the LF

  • resources (Optional[Mapping[str, Any]]) – Labeling resources passed in to f via kwargs

  • pre (Optional[List[BaseMapper]]) – Preprocessors to run before SpacyPreprocessor is executed

  • text_field (str) – Name of data point text field to input

  • doc_field (str) – Name of data point field to output parsed document to

  • language (str) – spaCy model to load. See https://spacy.io/usage/models#usage

  • disable (Optional[List[str]]) – List of pipeline components to disable. See https://spacy.io/usage/processing-pipelines#disabling

  • memoize (bool) – Whether to memoize preprocessor outputs

  • memoize_key (Optional[Callable[[Any], Hashable]]) – Hashing function used for memoization (defaults to snorkel.map.core.get_hashable)
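The memoize and memoize_key parameters describe caching of preprocessor outputs keyed by a hashable value derived from each data point. The following is a minimal pure-Python sketch of that idea and is not Snorkel's implementation. The names memoized_preprocessor, fake_parse, and text_key are hypothetical and exist only for illustration.

```python
from typing import Any, Callable, Dict, Hashable


def memoized_preprocessor(
    parse: Callable[[Any], Any],
    key: Callable[[Any], Hashable],
) -> Callable[[Any], Any]:
    """Wrap `parse` so that each distinct key is parsed only once."""
    cache: Dict[Hashable, Any] = {}

    def wrapper(x: Any) -> Any:
        k = key(x)
        if k not in cache:
            # Expensive step, e.g. running an NLP pipeline over the text.
            cache[k] = parse(x)
        return cache[k]

    return wrapper


# Hypothetical stand-in for an expensive parser; it counts how often it runs.
calls = {"n": 0}


def fake_parse(x: dict) -> list:
    calls["n"] += 1
    return x["text"].split()


# Hypothetical key function: identify a data point by its text field.
def text_key(x: dict) -> Hashable:
    return x["text"]


parse_once = memoized_preprocessor(fake_parse, text_key)
first = parse_once({"text": "The movie was good."})
second = parse_once({"text": "The movie was good."})  # served from the cache
```

In this sketch the second call returns the cached object without running the parser again, which is the saving memoization is meant to provide when the same text appears more than once.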

Example

>>> @spark_nlp_labeling_function()
... def has_person_mention(x):
...     person_ents = [ent for ent in x.doc.ents if ent.label_ == "PERSON"]
...     return 0 if len(person_ents) > 0 else -1
>>> has_person_mention
SparkNLPLabelingFunction has_person_mention, Preprocessors: [SpacyPreprocessor...]
>>> from pyspark.sql import Row
>>> x = Row(text="The movie was good.")
>>> has_person_mention(x)
-1
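
The body of the LF in the example above reads only x.doc.ents and each entity's label_ attribute, so its logic can be checked without loading spaCy or Spark. The sketch below uses stand-in objects for that purpose. The names ABSTAIN, PERSON, has_person_mention_logic, and make_point are hypothetical; the only convention taken from Snorkel is that -1 means the LF abstains.

```python
from types import SimpleNamespace

ABSTAIN = -1  # Snorkel's convention for "no label"
PERSON = 0    # hypothetical name for the label value the example returns


def has_person_mention_logic(x) -> int:
    """Same logic as the example LF, written as a plain function."""
    person_ents = [ent for ent in x.doc.ents if ent.label_ == "PERSON"]
    return PERSON if len(person_ents) > 0 else ABSTAIN


def make_point(text, ents):
    """Build a fake data point whose `doc` field mimics a parsed document."""
    doc = SimpleNamespace(
        ents=[SimpleNamespace(text=t, label_=label) for t, label in ents]
    )
    return SimpleNamespace(text=text, doc=doc)


with_person = make_point("Jane Doe acted well.", [("Jane Doe", "PERSON")])
without_person = make_point("The movie was good.", [])

label_a = has_person_mention_logic(with_person)
label_b = has_person_mention_logic(without_person)
```

The stand-ins carry hand-written entities, so this checks only the labeling logic and says nothing about what a real spaCy model would extract.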

__init__(name=None, resources=None, pre=None, text_field='text', doc_field='doc', language='en_core_web_sm', disable=None, memoize=True, memoize_key=None, gpu=False)

Initialize the decorator. The arguments are the class parameters listed above.

Return type

None

Methods

__init__([name, resources, pre, text_field, …])

Initialize self.

__call__(f)

Wrap a function to create a BaseNLPLabelingFunction.

Parameters

f (Callable[…, int]) – Function that implements the core NLP LF logic

Returns

New BaseNLPLabelingFunction executing logic in wrapped function

Return type

BaseNLPLabelingFunction
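
The split between __init__ (which receives the options) and __call__ (which receives the function f) is the usual pattern for a decorator that takes arguments. The sketch below illustrates that pattern in plain Python under stated assumptions. It is not Snorkel's code, and lf_decorator, WrappedLF, and mentions_keyword are hypothetical names.

```python
from typing import Any, Callable, Mapping, Optional


class WrappedLF:
    """Hypothetical stand-in for the object the decorator returns."""

    def __init__(
        self, name: str, f: Callable[..., int], resources: Mapping[str, Any]
    ) -> None:
        self.name = name
        self._f = f
        self._resources = resources

    def __call__(self, x: Any) -> int:
        # Resources are passed to the wrapped function as keyword arguments.
        return self._f(x, **self._resources)


class lf_decorator:
    """Hypothetical decorator class: options in __init__, wrapping in __call__."""

    def __init__(
        self,
        name: Optional[str] = None,
        resources: Optional[Mapping[str, Any]] = None,
    ) -> None:
        self.name = name
        self.resources = resources or {}

    def __call__(self, f: Callable[..., int]) -> WrappedLF:
        # In this sketch, the function's own name is used when none is given.
        name = self.name or f.__name__
        return WrappedLF(name, f, self.resources)


@lf_decorator(resources={"keywords": ("good", "great")})
def mentions_keyword(x: str, keywords) -> int:
    return 0 if any(k in x for k in keywords) else -1


label = mentions_keyword("The movie was good.")
```

Because the decorator is an instance created with parentheses, it must be applied as @lf_decorator(...) even when no options are given, which matches the @spark_nlp_labeling_function() form used in the example on this page.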