Documentation
beta_nmf_minibatch.py
Contents
The beta_nmf_minibatch module includes the BetaNMF class, the fit function, and Theano functions to compute updates and cost.
class beta_nmf_minibatch.BetaNMF(data_shape, n_components=50, beta=2, n_iter=50, fixed_factors=None, cache1_size=0, batch_size=100, verbose=0, init_mode='random', W=None, H=None, solver='mu_batch', nb_batch_w=1, sag_memory=0)
BetaNMF class
Performs nonnegative matrix factorization with mini-batch multiplicative updates. GPGPU implementation based on Theano.
Parameters: data_shape : tuple composed of integers
The shape of the data to approximate
n_components : positive integer
The number of latent components for the NMF model
beta : arbitrary float (default 2)
The beta-divergence to consider (written out after this parameter list). Particular cases of interest are:
- beta=2 : Euclidean distance
- beta=1 : Kullback-Leibler divergence
- beta=0 : Itakura-Saito divergence
n_iter : positive integer
Number of iterations
fixed_factors : array of integers
Indexes of the factors that are kept fixed during the updates:
- [0] : corresponds to fixed H
- [1] : corresponds to fixed W
cache1_size : integer
Size (in frames) of the first data cache. The size is reduced to the closest multiple of the batch size. If set to zero, the algorithm tries to fit all the data in the cache.
batch_size : integer
Size (in frames) of the batch for batch processing. The batch size has an impact on the parallelisation and on the memory needed to store partial gradients (see Schmidt et al. [2]).
verbose : integer
The number of iterations to wait between two computations and printings of the cost
init_mode : string (default 'random')
- random : initialise the factors randomly
- custom : initialise the factors with custom values
W : array (optional)
Initial value for factor W when custom initialisation is used
H : array (optional)
Initial value for factor H when custom initialisation is used
solver : string (default 'mu_batch')
- mu_batch : mini-batch version of the MU updates (fully equivalent to standard NMF with MU)
nb_batch_w : integer (default 1)
Number of batches on which the W update is computed:
- 1 : greedy approaches [1]
sag_memory : integer (default 0)
Number of batches used to compute the average gradient:
- 0 : SG approaches
- nb_batches : SAG approaches
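For reference, the element-wise beta-divergence that these solvers minimise is, in its standard form:

d_\beta(x \mid y) =
\begin{cases}
\frac{1}{\beta(\beta-1)}\left(x^\beta + (\beta-1)\,y^\beta - \beta\,x\,y^{\beta-1}\right) & \beta \in \mathbb{R} \setminus \{0, 1\} \\
x \log \frac{x}{y} - x + y & \beta = 1 \\
\frac{x}{y} - \log \frac{x}{y} - 1 & \beta = 0
\end{cases}

The total cost is the sum of d_\beta over all entries of the data matrix and its low-rank approximation.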
References

[1] R. Serizel, S. Essid, and G. Richard. "Mini-batch stochastic approaches for accelerated multiplicative updates in nonnegative matrix factorisation with beta-divergence". Accepted for publication in Proc. of MLSP, p. 5, 2016.
[2] M. Schmidt, N. Le Roux, and F. Bach. "Minimizing finite sums with the stochastic average gradient". 2013. https://hal.inria.fr/hal-00860051/PDF/sag_journal.pdf

Attributes
nb_cache1 (integer) : number of caches needed to cover the full data
forget_factor (float) : forgetting factor for SAG
scores (array) : reconstruction cost and iteration time for each iteration
factors_ (list of arrays) : the estimated factors
w (theano tensor) : factor W
h_cache1 (theano tensor) : part of the factor H in cache1
x_cache1 (theano tensor) : data cache

Methods
check_shape() : Check that all the matrices have consistent shapes
fit(data[, cyclic, warm_start]) : Learns the NMF model
get_div_function() : Compile the Theano-based divergence function
get_gradient_mu_batch() : Compile the Theano-based gradient functions for mu
get_gradient_mu_sag() : Compile the Theano-based gradient functions for mu_sag algorithms
get_gradient_mu_sg() : Compile the Theano-based gradient functions for mu_sg algorithms
get_updates() : Compile the Theano-based update functions
init() : Initialise Theano variables to store the gradients
prepare_batch([randomize]) : Arrange data for batches
prepare_cache1([randomize]) : Arrange data to fill cache1
set_factors(data[, W, H, fixed_factors]) : Re-set Theano-based parameters according to the object attributes
transform(data[, warm_start]) : Project data X on the basis W
update_mu_batch_h(batch_ind, update_func, ...) : Update h for current batch with standard MU
update_mu_batch_w(update_func) : Update W with standard MU
update_mu_sag(batch_ind, update_func, grad_func) : Update current batch with SAG-based algorithms
fit(data, cyclic=False, warm_start=False)
Learns the NMF model.
Parameters: data : ndarray with nonnegative entries
The input array
cyclic : Boolean (default False)
Pick the samples cyclically
warm_start : Boolean (default False)
Start from previous values
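A minimal usage sketch (the data array and its shape are illustrative placeholders; frames are assumed to run along the first axis, as suggested by the frame-based batch_size):

import numpy as np
from beta_nmf_minibatch import BetaNMF

# Illustrative nonnegative data: 5000 frames of 513-dimensional features.
data = np.abs(np.random.randn(5000, 513))

# Mini-batch MU solver with the Kullback-Leibler divergence (beta=1),
# printing the cost every 10 iterations.
nmf = BetaNMF(data.shape, n_components=50, beta=1, n_iter=100,
              batch_size=100, solver='mu_batch', verbose=10)
nmf.fit(data)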
prepare_batch(randomize=True)
Arrange data for batches.
Parameters: randomize : boolean (default True)
Randomise the data (time-wise) before preparing batch indexes
prepare_cache1(randomize=True)
Arrange data to fill cache1.
Parameters: randomize : boolean (default True)
Randomise the data (time-wise) before preparing cache indexes
set_factors(data, W=None, H=None, fixed_factors=None)
Re-set Theano-based parameters according to the object attributes.
Parameters: W : array (optional)
Value for factor W when custom initialisation is used
H : array (optional)
Value for factor H when custom initialisation is used
fixed_factors : array (default None)
List of factors that are not updated, e.g.:
- fixed_factors = [0] -> H is not updated
- fixed_factors = [1] -> W is not updated
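As an illustration, a decomposition that keeps a pretrained dictionary fixed could look like the sketch below (all arrays are placeholders, and the factor orientations, H of shape (n_frames, n_components) and W of shape (n_features, n_components), are assumptions):

import numpy as np
from beta_nmf_minibatch import BetaNMF

data = np.abs(np.random.randn(2000, 513))    # placeholder nonnegative data
W_pre = np.abs(np.random.randn(513, 50))     # hypothetical pretrained bases
H_init = np.abs(np.random.randn(2000, 50))   # placeholder initial activations

nmf = BetaNMF(data.shape, n_components=50, beta=1, n_iter=50)
# Keep W fixed ([1]) so that only the activations H are re-estimated.
nmf.set_factors(data, W=W_pre, H=H_init, fixed_factors=[1])
nmf.fit(data)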
transform(data, warm_start=False)
Project data X on the basis W.
Parameters: data : array
The input data
warm_start : Boolean (default False)
Start from previous values
Returns: H : array
Activations
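For instance, to estimate activations for held-out data on bases learned from training data (placeholder arrays again):

import numpy as np
from beta_nmf_minibatch import BetaNMF

x_train = np.abs(np.random.randn(5000, 513))  # placeholder training data
x_test = np.abs(np.random.randn(1000, 513))   # placeholder test data

nmf = BetaNMF(x_train.shape, n_components=50, beta=1, n_iter=100)
nmf.fit(x_train)

# W learned on x_train is kept; only the activations for x_test are estimated.
h_test = nmf.transform(x_test)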
update_mu_batch_h(batch_ind, update_func, grad_func)
Update h for current batch with standard MU.
Parameters: batch_ind : array with 2 elements
batch_ind[0]: batch start
batch_ind[1]: batch end
update_func : Theano compiled function
Update function
grad_func : Theano compiled function
Gradient function
update_mu_batch_w(update_func)
Update W with standard MU.
Parameters: update_func : Theano compiled function
Update function
update_mu_sag(batch_ind, update_func, grad_func)
Update current batch with SAG-based algorithms.
Parameters: batch_ind : array with 2 elements
batch_ind[0]: batch start
batch_ind[1]: batch end
update_func : Theano compiled function
Update function
grad_func : Theano compiled function
Gradient function
base.py
Contents
The base module includes basic functions, such as functions to load data and annotations, to normalise matrices, and to generate nonnegative random matrices.
base.get_norm_col(w)
Returns the norm of a column vector.
Parameters: w: 1-dimensional array
vector to be normalised
Returns: norm: scalar
norm-2 of w
base.load_all_data(f_name, scale=True, rnd=False)
Get data from all sets stored in an H5FS file.
Parameters: f_name : String
file name
scale : Boolean (default True)
scale data to unit variance (scikit-learn function)
rnd : Boolean (default False)
randomize the data along time axis
Returns: data_dic : Dictionary
dictionary containing the data
x_train: numpy array
train data matrix
x_test: numpy array
test data matrix
x_dev: numpy array
dev data matrix
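A usage sketch (the file name is a placeholder, and the dictionary keys are assumed to match the field names listed above):

from base import load_all_data

data_dic = load_all_data('features.h5', scale=True, rnd=False)
x_train = data_dic['x_train']  # assumed key names, matching the fields above
x_test = data_dic['x_test']
x_dev = data_dic['x_dev']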
base.load_all_data_labels(f_name, scale=True, rnd=False)
Get data with labels, for all sets.
Parameters: f_name : String
file name
scale : Boolean (default True)
scale data to unit variance (scikit-learn function)
rnd : Boolean (default False)
randomize the data along time axis
Returns: data_dic : Dictionary
dictionary containing the data
x_train: numpy array
train data matrix
x_test: numpy array
test data matrix
x_dev: numpy array
dev data matrix
y_train: numpy array
train labels vector
y_test: numpy array
test labels vector
y_dev: numpy array
dev labels vector
base.load_all_data_labels_fids(f_name, scale=True, rnd=False)
Get data with labels and file ids for all sets.
Parameters: f_name : String
file name
scale : Boolean (default True)
scale data to unit variance (scikit-learn function)
rnd : Boolean (default False)
randomize the data along time axis
Returns: data_dic : Dictionary
dictionary containing the data
x_train: numpy array
train data matrix
x_test: numpy array
test data matrix
x_dev: numpy array
dev data matrix
y_train: numpy array
train labels vector
y_test: numpy array
test labels vector
y_dev: numpy array
dev labels vector
f_train: numpy array
train file ids vector
f_test: numpy array
test file ids vector
f_dev: numpy array
dev file ids vector
base.load_all_fids(f_name)
Get file ids for all sets.
Parameters: f_name : String
file name
Returns: fids_dic : Dictionary
dictionary containing the file ids
f_train: numpy array
train file ids vector
f_test: numpy array
test file ids vector
f_dev: numpy array
dev file ids vector
base.load_all_labels(f_name)
Get labels for all sets.
Parameters: f_name : String
file name
Returns: lbl_dic : Dictionary
dictionary containing the labels
y_train: numpy array
train labels vector
y_test: numpy array
test labels vector
y_dev: numpy array
dev labels vector
base.load_data(f_name, dataset, scale=True, rnd=False)
Get data from a specific set stored in an H5FS file.
Parameters: f_name : String
file name
dataset : String
name of the set to load (e.g., train, dev, test)
scale : Boolean (default True)
scale data to unit variance (scikit-learn function)
rnd : Boolean (default False)
randomize the data along time axis
Returns: data_dic : Dictionary
dictionary containing the data
data: numpy array
data matrix
base.load_data_labels(f_name, dataset, scale=True, rnd=False)
Get data with labels, for a particular set.
Parameters: f_name : String
file name
dataset : String
name of the set to load (e.g., train, dev, test)
scale : Boolean (default True)
scale data to unit variance (scikit-learn function)
rnd : Boolean (default False)
randomize the data along time axis
Returns: data_dic : Dictionary
dictionary containing the data
x: numpy array
data matrix
y: numpy array
labels vector
base.load_data_labels_fids(f_name, dataset, scale=True, rnd=False)
Get data with labels and file ids for a specific set.
Parameters: f_name : String
file name
dataset : String
name of the set to load (e.g., train, dev, test)
scale : Boolean (default True)
scale data to unit variance (scikit-learn function)
rnd : Boolean (default False)
randomize the data along time axis
Returns: data_dic : Dictionary
dictionary containing the data
x: numpy array
data matrix
y: numpy array
labels vector
f: numpy array
file ids vector
base.load_fids(f_name, dataset)
Get file ids for a specific set.
Parameters: f_name : String
file name
dataset : String
name of the set to load (e.g., train, dev, test)
Returns: fids_dic : Dictionary
dictionary containing the file ids
file_ids: numpy array
file ids vector
base.load_labels(f_name, dataset)
Get labels for a specific set.
Parameters: f_name : String
file name
dataset : String
name of the set to load (e.g., train, dev, test)
Returns: lbl_dic : Dictionary
dictionary containing the labels
labels: numpy array
labels vector
base.nnrandn(shape)
Generates a nonnegative ndarray of the given shape at random.
Parameters: shape : tuple
The shape
Returns: out : array of given shape
The non-negative random numbers
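One plausible implementation with the same behaviour, sketched in NumPy (taking absolute values of standard normal draws):

import numpy as np

def nnrandn(shape):
    """Generate a nonnegative ndarray by taking |N(0, 1)| draws."""
    return np.abs(np.random.randn(*shape))

W0 = nnrandn((513, 50))  # e.g., a random nonnegative initial dictionary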
base.norm_col(w, h)
Normalise the column vector w (Theano function). Apply the inverse normalisation to h such that w.h does not change.
Parameters: w: Theano vector
vector to be normalised
h: Theano vector
vector to be scaled by the inverse normalisation
Returns: w : Theano vector with the same shape as w
normalised vector (w/norm)
h : Theano vector with the same shape as h
h*norm
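A NumPy sketch of the same idea (the actual function operates on Theano variables):

import numpy as np

def norm_col(w, h):
    """Normalise w to unit norm-2 and apply the inverse scaling to h,
    so that the outer product of w and h is unchanged."""
    norm = np.linalg.norm(w)
    return w / norm, h * norm

w, h = norm_col(np.array([3.0, 4.0]), np.array([1.0, 2.0]))
# w -> [0.6, 0.8] (unit norm-2), h -> [5.0, 10.0]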
cost.py
Contents
The cost module gathers the cost functions used for the group NMF.
updates.py
Contents
The updates module gathers the update functions used for the mini-batch NMF.
updates.gradient_h(X, W, H, beta)
Compute the gradient of the beta-divergence with respect to the factor H.
Parameters: X: theano tensor
Data matrix to be decomposed
W: theano tensor
Factor matrix containing the bases of the decomposition
H: theano tensor
Factor matrix containing the activations of the decomposition
beta: theano scalar
Coefficient beta for the beta-divergence. Special cases:
- beta = 0: Itakura-Saito
- beta = 1: Kullback-Leibler
- beta = 2: Euclidean distance
Returns: grad_h: theano matrix
Gradient of the local beta-divergence with respect to H
updates.gradient_h_mu(X, W, H, beta)
Compute the gradient of the beta-divergence with respect to the factor H. Return positive and negative contributions, e.g., for multiplicative updates.
Parameters: X: theano tensor
Data matrix to be decomposed
W: theano tensor
Factor matrix containing the bases of the decomposition
H: theano tensor
Factor matrix containing the activations of the decomposition
beta: theano scalar
Coefficient beta for the beta-divergence. Special cases:
- beta = 0: Itakura-Saito
- beta = 1: Kullback-Leibler
- beta = 2: Euclidean distance
Returns: grad_h: theano matrix (T.stack(grad_h_pos, grad_h_neg))
grad_h_pos: Positive term of the gradient of the local beta-divergence with respect to H
grad_h_neg: Negative term of the gradient of the local beta-divergence with respect to H
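For the beta-divergence these two terms have a well-known closed form; a NumPy sketch, assuming the common orientation X ≈ W·H (the module's Theano graph may arrange the factors differently):

import numpy as np

def gradient_h_mu(X, W, H, beta):
    """Positive and negative parts of the gradient of
    d_beta(X | W @ H) with respect to H."""
    V = W @ H  # current approximation
    # For beta < 2, a small epsilon may be needed to avoid division by zero.
    grad_h_pos = W.T @ V ** (beta - 1)
    grad_h_neg = W.T @ (V ** (beta - 2) * X)
    return grad_h_pos, grad_h_neg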
updates.gradient_w(X, W, H, beta)
Compute the gradient of the beta-divergence with respect to the factor W.
Parameters: X: theano tensor
Data matrix to be decomposed
W: theano tensor
Factor matrix containing the bases of the decomposition
H: theano tensor
Factor matrix containing the activations of the decomposition
beta: theano scalar
Coefficient beta for the beta-divergence. Special cases:
- beta = 0: Itakura-Saito
- beta = 1: Kullback-Leibler
- beta = 2: Euclidean distance
Returns: grad_w: theano matrix
Gradient of the local beta-divergence with respect to W
updates.gradient_w_mu(X, W, H, beta)
Compute the gradient of the beta-divergence with respect to the factor W. Return positive and negative contributions, e.g., for multiplicative updates.
Parameters: X: theano tensor
Data matrix to be decomposed
W: theano tensor
Factor matrix containing the bases of the decomposition
H: theano tensor
Factor matrix containing the activations of the decomposition
beta: theano scalar
Coefficient beta for the beta-divergence. Special cases:
- beta = 0: Itakura-Saito
- beta = 1: Kullback-Leibler
- beta = 2: Euclidean distance
Returns: grad_w: theano matrix (T.stack(grad_w_pos, grad_w_neg))
grad_w_pos: Positive term of the gradient of the local beta-divergence with respect to W
grad_w_neg: Negative term of the gradient of the local beta-divergence with respect to W
updates.mu_update(factor, gradient_pos, gradient_neg)
Update the factor based on multiplicative rules.
Parameters: factor: theano tensor
The factor to be updated
gradient_pos: theano tensor
Positive part of the gradient with respect to the factor
gradient_neg: theano tensor
Negative part of the gradient with respect to the factor
Returns: factor: theano matrix
New value of the factor after the multiplicative update
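The rule itself is the classical element-wise ratio of the negative to the positive gradient term; a NumPy sketch:

import numpy as np

def mu_update(factor, gradient_pos, gradient_neg):
    """Multiplicative update: scale each entry of the factor by the
    ratio of the negative to the positive gradient term, which keeps
    the factor nonnegative (entries of gradient_pos assumed > 0)."""
    return factor * (gradient_neg / gradient_pos)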
updates.mu_update_h(X, W, H, beta)
Compute the gradient of the beta-divergence with respect to the factor H and update H with multiplicative rules.
Parameters: X: theano tensor
Data matrix to be decomposed
W: theano tensor
Factor matrix containing the bases of the decomposition
H: theano tensor
Factor matrix containing the activations of the decomposition
beta: theano scalar
Coefficient beta for the beta-divergence. Special cases:
- beta = 0: Itakura-Saito
- beta = 1: Kullback-Leibler
- beta = 2: Euclidean distance
Returns: H: theano matrix
New value of H updated with multiplicative updates
updates.update_grad_w(grad, grad_old, grad_new)
Update the global gradient for W.
Parameters: grad: theano tensor
The global gradient
grad_old: theano tensor
The previous value of the local gradient
grad_new: theano tensor
The new version of the local gradient
Returns: grad: theano tensor
New value of the global gradient
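In a SAG-style scheme [2], the aggregated gradient is refreshed by swapping the stale contribution of the current batch for its newly computed one; a sketch:

def update_grad_w(grad, grad_old, grad_new):
    """Replace one batch's contribution in the aggregated gradient:
    subtract its previous local gradient and add the new one."""
    return grad - grad_old + grad_new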