Documentation

beta_nmf.py

Contents

The beta_nmf module includes the BetaNMF class, its fit function, and the Theano functions used to compute the updates and the cost.

class beta_nmf.BetaNMF(data_shape, n_components=50, beta=2, n_iter=100, fixed_factors=None, verbose=0, cold_start=True)[source]

BetaNMF class

Performs nonnegative matrix factorization with Theano.

Parameters:

data_shape : tuple of integers

the shape of the data to approximate

n_components : positive integer (default 50)

the number of latent components for the NMF model

beta : arbitrary float (default 2)

the beta-divergence to consider; particular cases of interest are
  • beta=2 : Euclidean distance
  • beta=1 : Kullback-Leibler divergence
  • beta=0 : Itakura-Saito divergence

n_iter : positive integer (default 100)

number of iterations

fixed_factors : list (default None)

list of factors that are not updated

e.g. fixed_factors = [0] -> H is not updated

fixed_factors = [1] -> W is not updated

verbose : integer (default 0)

the frequency at which the score is computed and displayed (number of iterations between each computation)

Attributes

factors (list of arrays) The estimated factors (factors[0] = H, factors[1] = W)

Methods

check_shape() Check that all matrices have consistent shapes
fit(data[, warm_start]) Learn the NMF model
get_div_function() Compile the Theano-based divergence function
get_updates_functions() Compile the Theano-based update functions
score() Compute the factorisation score
set_factors(X[, fixed_factors]) Reset the factors
transform(X[, warm_start]) Project data X on the basis W
check_shape()[source]

Check that all matrices have consistent shapes

fit(data, warm_start=False)[source]

Learn the NMF model

Parameters:

data : ndarray with nonnegative entries

The input array

warm_start : Boolean (default False)

if True, start from the current factor values instead of reinitialising them
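For intuition, the fit procedure can be sketched in plain NumPy. The class itself compiles Theano functions; the multiplicative update rules below are the standard ones for the beta-divergence and are an assumption about the exact implementation:

```python
import numpy as np

def fit_nmf(X, n_components=3, beta=2.0, n_iter=100, seed=0):
    """NumPy sketch of the fit loop: alternate multiplicative updates of H and W."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = np.abs(rng.standard_normal((n, n_components)))   # bases (factors[1])
    H = np.abs(rng.standard_normal((n_components, m)))   # activations (factors[0])
    for _ in range(n_iter):
        # update activations, then bases, with the current model Y = WH
        Y = W.dot(H)
        H *= W.T.dot(X * Y ** (beta - 2)) / W.T.dot(Y ** (beta - 1))
        Y = W.dot(H)
        W *= (X * Y ** (beta - 2)).dot(H.T) / (Y ** (beta - 1)).dot(H.T)
    return W, H
```

For beta=2 these updates never increase the Euclidean cost, which is what `score()` would report in that case.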

get_div_function()[source]

Compile the Theano-based divergence function

get_updates_functions()[source]

Compile the Theano-based update functions

score()[source]

Compute the factorisation score

Returns:

out : float

factorisation score

set_factors(X, fixed_factors=None)[source]

Reset the factors

Parameters:

X : array

The input data

fixed_factors : list (default None)

list of factors that are not updated

e.g. fixed_factors = [0] -> H is not updated

fixed_factors = [1] -> W is not updated

transform(X, warm_start=False)[source]

Project data X on the basis W

Parameters:

X : array

The input data

warm_start : Boolean (default False)

if True, start from the previous activation values instead of reinitialising them

Returns:

H : array

Activations
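Conceptually, transform runs the same iteration as fit but with the basis W held fixed, so only the activations H are estimated. A plain-NumPy sketch under that assumption (the library compiles Theano functions instead):

```python
import numpy as np

def transform_sketch(X, W, beta=2.0, n_iter=100, seed=0):
    """NumPy sketch of transform: estimate activations H for X on a fixed basis W."""
    rng = np.random.default_rng(seed)
    H = np.abs(rng.standard_normal((W.shape[1], X.shape[1])))
    for _ in range(n_iter):
        # multiplicative update of H only; W is never touched
        Y = W.dot(H)
        H *= W.T.dot(X * Y ** (beta - 2)) / W.T.dot(Y ** (beta - 1))
    return H
```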

base.py

Contents

The base module includes basic functions such as the beta-divergence, a nonnegative random matrix generator, and load_data.

base.beta_div(X, W, H, beta)[source]

Compute beta divergence D(X|WH)

Parameters:

X : Theano tensor

data

W : Theano tensor

basis matrix

H : Theano tensor

activation matrix

beta : Theano scalar

the beta-divergence parameter

Returns:

div : Theano scalar

beta divergence D(X|WH)
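For reference, a plain-NumPy sketch of the beta-divergence, assuming the usual definition with its limit cases at beta=0 (Itakura-Saito) and beta=1 (Kullback-Leibler); the library version operates on Theano tensors instead:

```python
import numpy as np

def beta_div(X, W, H, beta):
    """Beta-divergence D(X|WH) under the standard definition
    (beta=2 -> half squared Euclidean, beta=1 -> KL, beta=0 -> IS)."""
    Y = W.dot(H)
    if beta == 0:                                    # Itakura-Saito
        return np.sum(X / Y - np.log(X / Y) - 1)
    if beta == 1:                                    # Kullback-Leibler
        return np.sum(X * np.log(X / Y) - X + Y)
    return np.sum((X ** beta + (beta - 1) * Y ** beta
                   - beta * X * Y ** (beta - 1)) / (beta * (beta - 1)))
```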

base.load_data(f_name, scale=True, rnd=True)[source]

Get data from an HDF5 file.

Parameters:

f_name : String

file name

scale : Boolean (default True)

scale data to unit variance (scikit-learn function)

rnd : Boolean (default True)

randomize the data along time axis

Returns:

data : dictionary

dictionary containing the data

x_train: numpy array

the training data matrix

base.nnrandn(shape)[source]

Generate a nonnegative random ndarray of the given shape

Parameters:

shape : tuple

The shape

Returns:

out : array of given shape

The nonnegative random numbers
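A minimal NumPy sketch of one plausible implementation, taking the absolute value of standard-normal draws (the actual generator may differ):

```python
import numpy as np

def nnrandn(shape):
    """Return a nonnegative random ndarray of the given shape
    by taking |x| of standard-normal draws."""
    return np.abs(np.random.randn(*shape))
```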

cost.py

Contents

The cost module groups the cost functions used for beta-NMF.

costs.beta_div(X, W, H, beta)[source]

Compute beta divergence D(X|WH)

Parameters:

X : Theano tensor

data

W : Theano tensor

basis matrix

H : Theano tensor

activation matrix

beta : Theano scalar

the beta-divergence parameter

Returns:

div : Theano scalar

beta divergence D(X|WH)

updates.py

Contents

The updates module groups the update functions used for beta-NMF.

updates.beta_H(X, W, H, beta)[source]

Update the activations for the beta-divergence

Parameters:

X : Theano tensor

data

W : Theano tensor

basis matrix

H : Theano tensor

activation matrix

beta : Theano scalar

the beta-divergence parameter

Returns:

H : Theano tensor

Updated version of the activations
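The standard multiplicative update for the activations (the Lee-Seung / Févotte-Idier rule, which keeps H nonnegative and does not increase the divergence for beta in [1, 2]) can be sketched in NumPy; whether this exact rule is compiled here is an assumption:

```python
import numpy as np

def beta_H(X, W, H, beta):
    """One multiplicative update of the activations H for the beta-divergence
    (standard rule; an assumption about the exact implementation)."""
    Y = W.dot(H)
    return H * W.T.dot(X * Y ** (beta - 2)) / W.T.dot(Y ** (beta - 1))
```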

updates.beta_W(X, W, H, beta)[source]

Update the bases for the beta-divergence

Parameters:

X : Theano tensor

data

W : Theano tensor

basis matrix

H : Theano tensor

activation matrix

beta : Theano scalar

the beta-divergence parameter

Returns:

W : Theano tensor

Updated version of the bases
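The matching multiplicative update for the bases, sketched in NumPy under the same assumption about the rule used:

```python
import numpy as np

def beta_W(X, W, H, beta):
    """One multiplicative update of the bases W for the beta-divergence
    (standard rule; an assumption about the exact implementation)."""
    Y = W.dot(H)
    return W * (X * Y ** (beta - 2)).dot(H.T) / (Y ** (beta - 1)).dot(H.T)
```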