snorkel.labeling.model.baselines.MajorityLabelVoter

class snorkel.labeling.model.baselines.MajorityLabelVoter(cardinality=2, **kwargs)

Bases: snorkel.labeling.model.base_labeler.BaseLabeler

Majority vote label model.

__init__(cardinality=2, **kwargs)

Initialize the majority vote label model. cardinality is the number of classes k (default 2).

Return type: None

Methods

`__init__`([cardinality])	Initialize self.
`load`(source)	Load existing label model.
`predict`(L[, return_probs, tie_break_policy])	Return predicted labels, with ties broken according to policy.
`predict_proba`(L)	Predict probabilities using majority vote.
`save`(destination)	Save label model.
`score`(L, Y[, metrics, tie_break_policy])	Calculate one or more scores from user-specified and/or user-defined metrics.

load(source)

Load existing label model.

Parameters: source (str) – Filename to load model from

Example

Load parameters saved in saved_label_model

>>> label_model.load('./saved_label_model.pkl')  # doctest: +SKIP

Return type: None

predict(L, return_probs=False, tie_break_policy='abstain')

Return predicted labels, with ties broken according to policy.

Policies to break ties include:

“abstain” – return an abstain vote (-1)
“true-random” – randomly choose among the tied options
“random” – randomly choose among the tied options using a deterministic hash

NOTE: if tie_break_policy=“true-random”, repeated runs may give slightly different results because ties can be broken differently on each run.
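The three policies can be illustrated with a small standalone sketch. It is not the library's implementation: the helper name break_tie, the tolerance tol, and the use of the row index in place of the library's deterministic hash are all assumptions made for illustration.

```python
import random

ABSTAIN = -1  # label value the documentation uses for an abstain vote


def break_tie(probs, row_index, policy="abstain", rng=None, tol=1e-5):
    """Pick one label from a row of class probabilities (hypothetical helper)."""
    top = max(probs)
    # Classes whose probability is within tol of the maximum are the tied options.
    tied = [label for label, p in enumerate(probs) if abs(p - top) <= tol]
    if len(tied) == 1:
        return tied[0]
    if policy == "abstain":
        return ABSTAIN
    if policy == "true-random":
        # Non-deterministic unless the caller passes a seeded generator.
        return (rng or random).choice(tied)
    if policy == "random":
        # Assumption: the row index stands in for the library's deterministic hash,
        # so the same row always resolves to the same label.
        return tied[row_index % len(tied)]
    raise ValueError(f"unknown tie_break_policy: {policy!r}")
```

With this sketch, break_tie([0.5, 0.5], 1) returns -1, while the "random" policy returns the same label every time it is called for the same row.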

Parameters

L (ndarray) – An [n,m] matrix with values in {-1,0,1,…,k-1}
return_probs (Optional[bool]) – Whether to return probs along with preds
tie_break_policy (str) – Policy to break ties when converting probabilistic labels to predictions

Return type

Union[ndarray, Tuple[ndarray, ndarray]]

Returns

np.ndarray – An [n,1] array of integer labels
(np.ndarray, np.ndarray) – An [n,1] array of integer labels and an [n,k] array of probabilistic labels

predict_proba(L)

Predict probabilities using majority vote.

Assign each data point a label distribution by taking the majority vote across all labeling functions. When several classes tie for the most votes, they share the probability mass, so non-integer probabilities are possible.

Parameters: L (ndarray) – An [n, m] matrix of labels
Returns: An [n, k] array of probabilistic labels
Return type: np.ndarray

Example

>>> import numpy as np
>>> from snorkel.labeling.model.baselines import MajorityLabelVoter
>>> L = np.array([[0, 0, -1], [-1, 0, 1], [1, -1, 0]])
>>> maj_voter = MajorityLabelVoter()
>>> maj_voter.predict_proba(L)
array([[1. , 0. ],
       [0.5, 0.5],
       [0.5, 0.5]])
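The computation behind this example can be sketched in plain Python. This is an illustrative reimplementation under stated assumptions, not the library's code: the function name majority_vote_proba is hypothetical, and it works on nested lists rather than ndarrays.

```python
def majority_vote_proba(L, cardinality=2, abstain=-1):
    """Return one probability row per data point (illustrative sketch)."""
    probs = []
    for row in L:
        # Count the non-abstain votes cast for each class.
        counts = [0] * cardinality
        for vote in row:
            if vote != abstain:
                counts[vote] += 1
        # Every class that reaches the top count shares the probability mass equally.
        top = max(counts)
        winners = [1.0 if c == top else 0.0 for c in counts]
        total = sum(winners)
        probs.append([w / total for w in winners])
    return probs


L = [[0, 0, -1], [-1, 0, 1], [1, -1, 0]]
print(majority_vote_proba(L))  # [[1.0, 0.0], [0.5, 0.5], [0.5, 0.5]]
```

In this sketch, a row in which every labeling function abstains has all counts equal to zero, so it receives a uniform distribution over the k classes.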

save(destination)

Save label model.

Parameters: destination (str) – Filename for saving model

Example

>>> label_model.save('./saved_label_model.pkl')  # doctest: +SKIP

Return type: None

score(L, Y, metrics=['accuracy'], tie_break_policy='abstain')

Calculate one or more scores from user-specified and/or user-defined metrics.

Parameters

L (ndarray) – An [n,m] matrix with values in {-1,0,1,…,k-1}
Y (ndarray) – Gold labels associated with data points in L
metrics (Optional[List[str]]) – A list of metric names
tie_break_policy (str) – Policy to break ties when converting probabilistic labels to predictions

Returns

A dictionary mapping metric names to metric scores

Return type

Dict[str, float]
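As a rough picture of the returned dictionary, the sketch below computes accuracy from predicted and gold labels. The function name accuracy_score_sketch is hypothetical, and skipping data points whose prediction or gold label is an abstain (-1) is an assumption of this sketch rather than behavior stated on this page.

```python
def accuracy_score_sketch(preds, golds, abstain=-1):
    """Return {"accuracy": value} in the Dict[str, float] shape described above."""
    # Assumption: points with an abstained prediction or a missing gold label are skipped.
    pairs = [(p, g) for p, g in zip(preds, golds) if p != abstain and g != abstain]
    if not pairs:
        # Nothing left to score after filtering.
        return {"accuracy": float("nan")}
    correct = sum(1 for p, g in pairs if p == g)
    return {"accuracy": correct / len(pairs)}


print(accuracy_score_sketch([0, -1, 1, 1], [0, 1, 1, 0]))  # {'accuracy': 0.6666666666666666}
```

Here the second data point is dropped because the prediction is an abstain, leaving two correct predictions out of three.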