Learning

Core Data Models

Core Objects for Learning + Inference

class snorkel.learning.LSTM[source]

Long Short-Term Memory.

class snorkel.learning.LogRegSKLearn[source]

Logistic regression model, implemented with scikit-learn.

class snorkel.learning.NoiseAwareModel(bias_term=False)[source]

Simple abstract base class for a model.

load(session, param_set_name)[source]

Load the Parameter (weight) values into self.w, given the ParameterSet name.

predict(X, b=0.5)[source]

Return a numpy array of elements in {-1, 0, 1} based on the predicted marginal probabilities.

save(session, param_set_name)[source]

Save the Parameter (weight) values, i.e. the model, as a new ParameterSet.

train(X, training_marginals, **hyperparams)[source]

Train the model; implementations must also set self.X_train and self.w.
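
A minimal usage sketch of this interface, assuming a concrete subclass such as LogRegSKLearn, candidate feature matrices F_train/F_test, probabilistic training labels train_marginals, and a database session; those variable names are illustrative, and only the method signatures come from the entries above.

from snorkel.learning import LogRegSKLearn

# Illustrative only: F_train, F_test, train_marginals, and session are assumed
# to exist already; the calls follow the signatures documented above.
model = LogRegSKLearn()

# Fit against the noise-aware (probabilistic) training labels.
model.train(F_train, train_marginals)

# Predictions in {-1, 0, 1}, thresholding the predicted marginals at b=0.5.
predictions = model.predict(F_test, b=0.5)

# Persist the learned weights as a named ParameterSet, then reload them later.
model.save(session, 'logreg_demo')
model.load(session, 'logreg_demo')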

snorkel.learning.exact_data(X, w, evidence=None)[source]

Calculate the exact conditional probabilities of the decision variables under the logistic regression model; see sample_data.

snorkel.learning.log_odds(p)[source]

This is the logit function, l = \log \frac{p}{1-p}.

snorkel.learning.odds_to_prob(l)[source]

This is the inverse logit function logit^{-1}:

l = \log \frac{p}{1-p}

\exp(l) = \frac{p}{1-p}

p = \frac{\exp(l)}{1 + \exp(l)}
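
For concreteness, the two helpers above can be checked numerically; the bodies below are written from the formulas shown here, not copied from the library.

import numpy as np

def log_odds(p):
    # logit(p) = log(p / (1 - p))
    return np.log(p / (1.0 - p))

def odds_to_prob(l):
    # logit^{-1}(l) = exp(l) / (1 + exp(l))
    return np.exp(l) / (1.0 + np.exp(l))

p = np.array([0.1, 0.5, 0.9])
assert np.allclose(odds_to_prob(log_odds(p)), p)  # the two maps invert each other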

snorkel.learning.sample_data(X, w, n_samples)[source]

Here we do Gibbs sampling over the decision variables o_j (representing our objects), which correspond to the columns of X. The model is just logistic regression, e.g.

P(o_j = 1 \mid X_{:,j}; w) = logit^{-1}(w \cdot X_{:,j})

This can be calculated exactly (see exact_data), so this is essentially a noisy version of the exact calculation.
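
A rough numpy sketch of what this sampling produces (an illustration, not the library's implementation): under plain logistic regression the o_j are independent given X and w, so Gibbs sampling reduces to independent Bernoulli draws from the exact marginals, which is why it is described as a noisy version of the exact calculation. The per-object counts t and f returned here are the kind of sample statistics that transform_sample_stats below appears to consume.

import numpy as np

def sample_data_sketch(X, w, n_samples, seed=0):
    # Draw n_samples of the decision variables o_j, where column j of X
    # describes object j and P(o_j = 1 | X_{:,j}; w) = logit^{-1}(w . X_{:,j}).
    rng = np.random.default_rng(seed)
    marginals = 1.0 / (1.0 + np.exp(-(w @ X)))       # exact marginal per column j
    draws = rng.random((n_samples, marginals.size)) < marginals
    t = draws.sum(axis=0)   # how often each o_j was sampled as 1 ("true")
    f = n_samples - t       # how often it was sampled as 0 ("false")
    return t, f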

snorkel.learning.transform_sample_stats(Xt, t, f, Xt_abs=None)[source]

Here we calculate the expected accuracy of each LF/feature (corresponding to the rows of X) with respect to the distribution of samples S:

E_S[ accuracy_i ] = E_{(t,f)}\left[ \frac{TP + TN}{TP + FP + TN + FN} \right]
                  = \frac{X_{i|x_{ij}>0} \cdot t - X_{i|x_{ij}<0} \cdot f}{t+f}
                  = \frac{1}{2}\left( \frac{X(t-f)}{t+f} + 1 \right)
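
Read off the last line, the expected accuracies have a direct numpy form. The sketch below assumes Xt has one row per LF/feature with entries in {-1, 0, 1}, that t and f are per-object counts of positive and negative samples, and that Xt_abs (|Xt|) normalizes the denominator; these assumptions are mine, so treat it as an illustration rather than the library's code.

import numpy as np

def expected_accuracies(Xt, t, f, Xt_abs=None):
    # Sketch of E_S[accuracy_i] = 1/2 * (X (t - f) / (t + f) + 1), row-wise.
    if Xt_abs is None:
        Xt_abs = np.abs(Xt)
    covered = Xt_abs.dot(t + f)   # samples on which each LF/feature voted at all (assumed role of Xt_abs)
    agreement = Xt.dot(t - f)     # net agreement between each row's votes and the samples
    return 0.5 * (agreement / covered + 1.0)  # maps agreement in [-1, 1] to an accuracy in [0, 1]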