snorkel.labeling.MajorityClassVoter

class snorkel.labeling.MajorityClassVoter(cardinality=2, **kwargs)

Bases: snorkel.labeling.model.label_model.LabelModel

Majority class label model.

__init__(cardinality=2, **kwargs)

Initialize self. See help(type(self)) for accurate signature.

Return type: None

Methods

`__init__`([cardinality])	Initialize self.
`add_module`(name, module)	Adds a child module to the current module.
`apply`(fn)	Applies `fn` recursively to every submodule (as returned by `.children()`) as well as self.
`buffers`([recurse])	Returns an iterator over module buffers.
`children`()	Returns an iterator over immediate children modules.
`cpu`()	Moves all model parameters and buffers to the CPU.
`cuda`([device])	Moves all model parameters and buffers to the GPU.
`double`()	Casts all floating point parameters and buffers to `double` datatype.
`eval`()	Sets the module in evaluation mode.
`extra_repr`()	Set the extra representation of the module
`fit`(balance, *args, **kwargs)	Train majority class model.
`float`()	Casts all floating point parameters and buffers to float datatype.
`forward`(*input)	Defines the computation performed at every call.
`get_conditional_probs`()	Return the estimated conditional probabilities table.
`get_weights`()	Return the vector of learned LF weights for combining LFs.
`half`()	Casts all floating point parameters and buffers to `half` datatype.
`load`(source)	Load existing label model.
`load_state_dict`(state_dict[, strict])	Copies parameters and buffers from `state_dict` into this module and its descendants.
`modules`()	Returns an iterator over all modules in the network.
`named_buffers`([prefix, recurse])	Returns an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.
`named_children`()	Returns an iterator over immediate children modules, yielding both the name of the module as well as the module itself.
`named_modules`([memo, prefix])	Returns an iterator over all modules in the network, yielding both the name of the module as well as the module itself.
`named_parameters`([prefix, recurse])	Returns an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.
`parameters`([recurse])	Returns an iterator over module parameters.
`predict`(L[, return_probs, tie_break_policy])	Return predicted labels, with ties broken according to policy.
`predict_proba`(L)	Predict probabilities using majority class.
`register_backward_hook`(hook)	Registers a backward hook on the module.
`register_buffer`(name, tensor)	Adds a persistent buffer to the module.
`register_forward_hook`(hook)	Registers a forward hook on the module.
`register_forward_pre_hook`(hook)	Registers a forward pre-hook on the module.
`register_parameter`(name, param)	Adds a parameter to the module.
`save`(destination)	Save label model.
`score`(L, Y[, metrics, tie_break_policy])	Calculate one or more scores from user-specified and/or user-defined metrics.
`share_memory`()
`state_dict`([destination, prefix, keep_vars])	Returns a dictionary containing a whole state of the module.
`to`(*args, **kwargs)	Moves and/or casts the parameters and buffers.
`train`([mode])	Sets the module in training mode.
`type`(dst_type)	Casts all parameters and buffers to `dst_type`.
`zero_grad`()	Sets gradients of all model parameters to zero.

Attributes

dump_patches

fit(balance, *args, **kwargs)

Train majority class model.

Set class balance for majority class label model.

Parameters: balance (ndarray) – A [k] array of class probabilities
Return type: None

get_conditional_probs()

Return the estimated conditional probabilities table.

Return the estimated conditional probabilities table cprobs, where cprobs is an (m, k+1, k)-dim np.ndarray with:

cprobs[i, j, y] = P(lf_i = j-1 | Y = y)

where m is the number of LFs, k is the cardinality, y is a class index in {0, …, k-1}, and the j = 0 slice holds the conditional abstain probabilities P(lf_i = -1 | Y = y).

Returns: An [m, k + 1, k] np.ndarray conditional probabilities table.
Return type: np.ndarray
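
The indexing convention above can be illustrated with a small pure-Python sketch. It is not the library's estimator: `empirical_cprobs` is a hypothetical helper that simply counts votes against known gold labels, and it uses nested lists instead of an np.ndarray. Its only purpose is to show how the vote axis is shifted by one so that the abstain vote (-1) lands at index 0.

```python
from collections import Counter


def empirical_cprobs(L, Y, cardinality):
    """Hypothetical helper: table[i][j][y] = fraction of points with gold
    label y on which LF i voted j - 1 (so j = 0 is the abstain vote -1)."""
    m = len(L[0])
    class_counts = Counter(Y)
    # Shape (m, k + 1, k): one (k + 1) x k slice per labeling function.
    table = [
        [[0.0] * cardinality for _ in range(cardinality + 1)]
        for _ in range(m)
    ]
    for row, y in zip(L, Y):
        for i, vote in enumerate(row):
            table[i][vote + 1][y] += 1.0 / class_counts[y]
    return table


# Toy label matrix (4 data points, 3 LFs) with assumed gold labels.
L = [[0, 0, -1], [1, 1, -1], [0, 1, 0], [1, 1, 1]]
Y = [0, 1, 0, 1]
cprobs = empirical_cprobs(L, Y, cardinality=2)

# LF 2 abstains on half of the Y = 0 points, so P(lf_2 = -1 | Y = 0) is 0.5.
print(cprobs[2][0][0])
```

For each LF i and class y, the k + 1 entries cprobs[i][0][y] through cprobs[i][k][y] form a distribution over that LF's possible votes, so they sum to 1.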

get_weights()

Return the vector of learned LF weights for combining LFs.

Returns: [m,1] vector of learned LF weights for combining LFs.
Return type: np.ndarray

Example

>>> L = np.array([[1, 1, 1], [1, 1, -1], [-1, 0, 0], [0, 0, 0]])
>>> label_model = LabelModel(verbose=False)
>>> label_model.fit(L, seed=123)
>>> np.around(label_model.get_weights(), 2)  # doctest: +SKIP
array([0.99, 0.99, 0.99])

load(source)

Load existing label model.

Parameters: source (str) – Filename to load model from

Example

Load parameters saved in saved_label_model

>>> label_model.load('./saved_label_model.pkl')  # doctest: +SKIP

Return type: None

predict(L, return_probs=False, tie_break_policy='abstain')

Return predicted labels, with ties broken according to policy.

Policies to break ties include:

"abstain": return an abstain vote (-1)
"true-random": randomly choose among the tied options
"random": randomly choose among the tied options using a deterministic hash

NOTE: if tie_break_policy="true-random", repeated runs may give slightly different results because ties can be broken differently each time.

Parameters

L (ndarray) – An [n,m] matrix with values in {-1,0,1,…,k-1}
return_probs (Optional[bool]) – Whether to return probs along with preds
tie_break_policy (str) – Policy to break ties when converting probabilistic labels to predictions

Return type

Union[ndarray, Tuple[ndarray, ndarray]]

Returns

np.ndarray – An [n,1] array of integer labels
(np.ndarray, np.ndarray) – An [n,1] array of integer labels and an [n,k] array of probabilistic labels

Example

>>> L = np.array([[0, 0, -1], [1, 1, -1], [0, 0, -1]])
>>> label_model = LabelModel(verbose=False)
>>> label_model.fit(L)
>>> label_model.predict(L)
array([0, 1, 0])
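
The three tie-break policies can be sketched in plain Python for a single row of probabilistic labels. This is an illustration only: `probs_to_pred` is a hypothetical helper, the exact-equality tie test is a simplification, and the hashing scheme used for "random" is an assumption rather than the library's actual one.

```python
import random

ABSTAIN = -1


def probs_to_pred(probs, tie_break_policy="abstain", rng=None):
    """Hypothetical helper: turn one row of class probabilities into a label."""
    top = max(probs)
    tied = [c for c, p in enumerate(probs) if p == top]
    if len(tied) == 1:
        return tied[0]  # unique maximum, no tie to break
    if tie_break_policy == "abstain":
        return ABSTAIN
    if tie_break_policy == "true-random":
        # Non-deterministic unless a seeded generator is passed in.
        return (rng or random).choice(tied)
    if tie_break_policy == "random":
        # Assumed scheme: pick by a hash of the row, so reruns agree.
        return tied[hash(tuple(probs)) % len(tied)]
    raise ValueError(f"Unknown tie_break_policy: {tie_break_policy!r}")


print(probs_to_pred([0.9, 0.1]))                       # 0
print(probs_to_pred([0.5, 0.5]))                       # -1
print(probs_to_pred([0.5, 0.5], "random") in (0, 1))   # True
```

Only "true-random" can change between runs; "abstain" and "random" give the same answer for the same input every time.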

predict_proba(L)

Predict probabilities using majority class.

Assign majority class vote to each datapoint. In case of multiple majority classes, assign equal probabilities among them.

Parameters: L (ndarray) – An [n, m] matrix of labels
Returns: An [n, k] array of probabilistic labels
Return type: np.ndarray

Example

>>> L = np.array([[0, 0, -1], [-1, 0, 1], [1, -1, 0]])
>>> maj_class_voter = MajorityClassVoter()
>>> maj_class_voter.fit(balance=np.array([0.8, 0.2]))
>>> maj_class_voter.predict_proba(L)
array([[1., 0.],
       [1., 0.],
       [1., 0.]])
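
The behaviour described above (every data point receives the majority class from the class balance, with ties sharing the probability mass equally) can be sketched without snorkel or NumPy. `majority_class_proba` is a hypothetical stand-in written from that description, not the library's source.

```python
def majority_class_proba(balance, n_rows):
    """Hypothetical stand-in: n_rows identical rows that put all probability
    on the class(es) with the largest prior in `balance`."""
    top = max(balance)
    winners = [c for c, p in enumerate(balance) if p == top]
    share = 1.0 / len(winners)  # split equally among tied majority classes
    row = [share if c in winners else 0.0 for c in range(len(balance))]
    return [list(row) for _ in range(n_rows)]


print(majority_class_proba([0.8, 0.2], 3))  # [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
print(majority_class_proba([0.5, 0.5], 1))  # [[0.5, 0.5]]
```

In this sketch the label matrix contributes only its row count, which matches the example above, where three rows with different votes all map to class 0.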

save(destination)

Save label model.

Parameters: destination (str) – Filename for saving model

Example

>>> label_model.save('./saved_label_model.pkl')  # doctest: +SKIP

Return type: None

score(L, Y, metrics=['accuracy'], tie_break_policy='abstain')

Calculate one or more scores from user-specified and/or user-defined metrics.

Parameters

L (ndarray) – An [n,m] matrix with values in {-1,0,1,…,k-1}
Y (ndarray) – Gold labels associated with data points in L
metrics (Optional[List[str]]) – A list of metric names
tie_break_policy (str) – Policy to break ties when converting probabilistic labels to predictions

Returns

A dictionary mapping metric names to metric scores

Return type

Dict[str, float]

Example

>>> L = np.array([[1, 1, -1], [0, 0, -1], [1, 1, -1]])
>>> label_model = LabelModel(verbose=False)
>>> label_model.fit(L)
>>> label_model.score(L, Y=np.array([1, 1, 1]))
{'accuracy': 0.6666666666666666}
>>> label_model.score(L, Y=np.array([1, 1, 1]), metrics=["f1"])
{'f1': 0.8}
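
The two numbers in the example can be checked by hand. The sketch below assumes the fitted model predicts [1, 0, 1] for the three rows (each row's non-abstaining votes agree) and uses hypothetical helpers `accuracy` and `f1_binary`. It ignores abstain handling and is not the library's metric code.

```python
def accuracy(golds, preds):
    """Fraction of predictions that match the gold labels."""
    return sum(g == p for g, p in zip(golds, preds)) / len(golds)


def f1_binary(golds, preds, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(golds, preds))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


golds = [1, 1, 1]
preds = [1, 0, 1]  # assumed predictions for the label matrix in the example

print(round(accuracy(golds, preds), 4))   # 0.6667
print(round(f1_binary(golds, preds), 4))  # 0.8
```

Two of three predictions are correct, giving accuracy 2/3. Precision is 1 and recall is 2/3, so F1 is 0.8.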