snorkel.labeling.model.baselines.MajorityLabelVoter

class snorkel.labeling.model.baselines.MajorityLabelVoter(cardinality=2, **kwargs)[source]

Bases: snorkel.labeling.model.base_labeler.BaseLabeler

Majority vote label model.

__init__(cardinality=2, **kwargs)[source]

Initialize the majority vote label model. cardinality is the number of classes (default 2).

Return type

None

Methods

__init__([cardinality])

Initialize self.

load(source)

Load existing label model.

predict(L[, return_probs, tie_break_policy])

Return predicted labels, with ties broken according to policy.

predict_proba(L)

Predict probabilities using majority vote.

save(destination)

Save label model.

score(L, Y[, metrics, tie_break_policy])

Calculate one or more scores from user-specified and/or user-defined metrics.

load(source)[source]

Load existing label model.

Parameters

source (str) – Filename to load model from

Example

Load parameters saved in saved_label_model

>>> label_model.load('./saved_label_model.pkl')  # doctest: +SKIP

Return type

None

predict(L, return_probs=False, tie_break_policy='abstain')[source]

Return predicted labels, with ties broken according to policy.

Policies to break ties include:

  • “abstain”: return an abstain vote (-1)

  • “true-random”: randomly choose among the tied options

  • “random”: randomly choose among the tied options using a deterministic hash

NOTE: if tie_break_policy="true-random", repeated runs may produce slightly different results because ties can be broken differently each time.

Parameters
  • L (ndarray) – An [n,m] matrix with values in {-1,0,1,…,k-1}

  • return_probs (Optional[bool]) – Whether to return probs along with preds

  • tie_break_policy (str) – Policy to break ties when converting probabilistic labels to predictions

Return type

Union[ndarray, Tuple[ndarray, ndarray]]

Returns

  • np.ndarray – An [n,1] array of integer labels

  • (np.ndarray, np.ndarray) – An [n,1] array of integer labels and an [n,k] array of probabilistic labels
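
The three tie-break policies can be illustrated with a small plain-Python sketch. This is not Snorkel's implementation: the helper name break_ties, the tolerance tol, and the use of the row index as the deterministic choice for "random" are assumptions made for illustration (the documentation above only says a deterministic hash is used).

```python
import random

ABSTAIN = -1  # value returned when a tie is left unresolved


def break_ties(probs, tie_break_policy="abstain", tol=1e-5):
    """Turn rows of class probabilities into integer predictions.

    Hypothetical helper for illustration; not part of Snorkel's API.
    """
    preds = []
    for i, row in enumerate(probs):
        best = max(row)
        # All classes whose probability is within tol of the maximum are tied.
        tied = [k for k, p in enumerate(row) if best - p < tol]
        if len(tied) == 1:
            preds.append(tied[0])
        elif tie_break_policy == "abstain":
            preds.append(ABSTAIN)
        elif tie_break_policy == "random":
            # Assumption: derive a repeatable choice from the row index.
            preds.append(tied[i % len(tied)])
        elif tie_break_policy == "true-random":
            # Not repeatable across runs unless the random seed is fixed.
            preds.append(random.choice(tied))
        else:
            raise ValueError(f"Unknown tie_break_policy: {tie_break_policy}")
    return preds


probs = [[1.0, 0.0], [0.5, 0.5], [0.5, 0.5]]
print(break_ties(probs, "abstain"))
print(break_ties(probs, "random"))
```

With "abstain", the two tied rows stay unlabeled; with "random", the same input always yields the same predictions, which is what makes that policy reproducible.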

predict_proba(L)[source]

Predict probabilities using majority vote.

Assign votes by taking the majority vote across all labeling functions for each data point. In case of ties, the probability is split among the tied classes, so non-integer probabilities are possible.

Parameters

L (ndarray) – An [n, m] matrix of labels

Returns

An [n, k] array of probabilistic labels

Return type

np.ndarray

Example

>>> L = np.array([[0, 0, -1], [-1, 0, 1], [1, -1, 0]])
>>> maj_voter = MajorityLabelVoter()
>>> maj_voter.predict_proba(L)
array([[1. , 0. ],
       [0.5, 0.5],
       [0.5, 0.5]])
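
To make the example concrete, here is a minimal plain-Python sketch of the majority vote computation. It is an illustration, not Snorkel's code: the function name majority_vote_proba is hypothetical, nested lists stand in for ndarrays, and giving a uniform distribution to a row with no votes is an assumption.

```python
from collections import Counter

ABSTAIN = -1  # label value meaning "this labeling function did not vote"


def majority_vote_proba(L, cardinality=2):
    """Return one row of class probabilities per row of the label matrix L.

    Hypothetical helper for illustration; not part of Snorkel's API.
    """
    probs = []
    for row in L:
        # Count the non-abstain votes for each class.
        counts = Counter(v for v in row if v != ABSTAIN)
        if not counts:
            # Assumption: with no votes at all, every class is equally likely.
            probs.append([1.0 / cardinality] * cardinality)
            continue
        top = max(counts.values())
        winners = [k for k in range(cardinality) if counts.get(k, 0) == top]
        # Split the probability evenly among the classes with the most votes.
        probs.append(
            [1.0 / len(winners) if k in winners else 0.0 for k in range(cardinality)]
        )
    return probs


L = [[0, 0, -1], [-1, 0, 1], [1, -1, 0]]
print(majority_vote_proba(L))  # [[1.0, 0.0], [0.5, 0.5], [0.5, 0.5]]
```

The first row has two votes for class 0 and one abstain, so class 0 gets all of the probability; the other two rows each have one vote per class, which produces the 0.5/0.5 ties.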

save(destination)[source]

Save label model.

Parameters

destination (str) – Filename for saving model

Example

>>> label_model.save('./saved_label_model.pkl')  # doctest: +SKIP

Return type

None

score(L, Y, metrics=['accuracy'], tie_break_policy='abstain')[source]

Calculate one or more scores from user-specified and/or user-defined metrics.

Parameters
  • L (ndarray) – An [n,m] matrix with values in {-1,0,1,…,k-1}

  • Y (ndarray) – Gold labels associated with data points in L

  • metrics (Optional[List[str]]) – A list of metric names

  • tie_break_policy (str) – Policy to break ties when converting probabilistic labels to predictions

Returns

A dictionary mapping metric names to metric scores

Return type

Dict[str, float]
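
As a rough illustration of the returned dictionary, the sketch below computes accuracy from predictions and gold labels in plain Python. It is not Snorkel's scoring code: the function name score_sketch is hypothetical, and excluding data points whose prediction or gold label is an abstain (-1) is an assumption that should be checked against the library's behavior.

```python
ABSTAIN = -1


def score_sketch(preds, golds):
    """Return {"accuracy": value} over data points with non-abstain labels.

    Hypothetical helper for illustration; not part of Snorkel's API.
    """
    # Assumption: abstained predictions or gold labels are left out of the metric.
    pairs = [(p, y) for p, y in zip(preds, golds) if p != ABSTAIN and y != ABSTAIN]
    if not pairs:
        return {"accuracy": float("nan")}
    correct = sum(1 for p, y in pairs if p == y)
    return {"accuracy": correct / len(pairs)}


preds = [0, -1, 1, 1]  # e.g. predictions obtained with tie_break_policy="abstain"
golds = [0, 1, 1, 0]
print(score_sketch(preds, golds))
```

Under this assumption, the choice of tie_break_policy affects the score: leaving ties as abstains shrinks the set of data points being evaluated, while the random policies keep every data point in the metric.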