Available Modules
List of all files, classes and methods available in the library.
dataset.py
class keras_wrapper.dataset.Data_Batch_Generator(set_split, net, dataset, num_iterations, batch_size=50, normalization=False, data_augmentation=True, mean_substraction=True, predict=False, random_samples=-1, shuffle=True)
Batch generator class. Retrieves batches of data.

generator()
Gets and processes the data.
Returns: generator yielding the data
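A minimal usage sketch, assuming an already-built Model_Wrapper instance `model` and a Dataset `ds` with a populated 'train' split (the attribute `len_train` and the yielded (X, Y) structure are assumptions about the library's internals)::

    # Sketch only: `model` and `ds` are assumed to exist already
    # (see Dataset.setInput / Dataset.setOutput below).
    from keras_wrapper.dataset import Data_Batch_Generator

    batch_size = 50
    train_gen = Data_Batch_Generator('train', model, ds,
                                     num_iterations=ds.len_train // batch_size,
                                     batch_size=batch_size,
                                     normalization=True,
                                     data_augmentation=True,
                                     mean_substraction=False)

    # generator() returns a Python generator that yields preprocessed batches,
    # suitable for fit_generator-style training loops. The (X, Y) tuple
    # structure is an assumption.
    data_gen = train_gen.generator()
    X_batch, Y_batch = next(data_gen)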
class keras_wrapper.dataset.Dataset(name, path, silence=False)
Class for defining instances of databases adapted for Keras. It includes several utility functions for easily managing data splits, image loading, mean calculation, etc.
build_vocabulary(captions, id, tokfun, do_split, min_occ=0, n_words=0)
Vocabulary builder for data of type 'text'.
Parameters:
- captions – Corpus sentences
- id – Dataset id of the text
- tokfun – Tokenization function to apply
- do_split – Whether to split sentences into words or use each full sentence as a class
- min_occ – Minimum number of occurrences required for a word to be included in the dictionary
- n_words – Maximum number of words to include in the dictionary
Returns: None
calculateTrainMean(id)
Calculates the mean of the data belonging to the training set split in each channel.
convert_3DLabels_to_bboxes(predictions, original_sizes, threshold=0.5, idx_3DLabel=0, size_restriction=0.001)
Converts a set of predictions of type 3DLabel to their corresponding bounding boxes.
Parameters:
- predictions – 3DLabels predicted by Model_Wrapper. If a list is provided, position 0 is assumed to correspond to the 3DLabels.
- original_sizes – Original width and height of the predicted images
- threshold – Minimum overlapping threshold for considering a prediction valid
Returns: predicted_bboxes, predicted_Y and predicted_scores for each image
convert_GT_3DLabels_to_bboxes(gt)
Converts a ground-truth (GT) list of 3DLabels to a set of bounding boxes.
Parameters:
- gt – List of Dataset outputs of type 3DLabel
Returns: [out_list, original_sizes], where out_list contains one entry per sample with the info [GT_bboxes, GT_Y], and original_sizes contains the original width and height of each image
getClassID(class_name, id)
Returns: the class id (int) for a given class string.
getX(set_name, init, final, normalization_type='0-1', normalization=False, meanSubstraction=True, dataAugmentation=True, debug=False)
Gets all the data samples stored between positions 'init' and 'final'.
# General parameters
Parameters:
- set_name – 'train', 'val' or 'test' set
- init – Initial position in the corresponding set split. Must be greater than or equal to 0 and smaller than 'final'.
- final – Final position in the corresponding set split.
- debug – If True, all data will be returned without preprocessing.
# 'raw-image', 'video', 'image-features' and 'video-features'-related parameters
- normalization – Indicates whether we want to normalize the data.
- normalization_type – Indicates the type of normalization applied. See the available types in self.__available_norm_im_vid for 'raw-image' and 'video', and in self.__available_norm_feat for 'image-features' and 'video-features'.
# 'raw-image' and 'video'-related parameters
- meanSubstraction – Indicates whether we want to subtract the training mean from the returned images (only applicable if normalization=True).
- dataAugmentation – Indicates whether we want to apply data augmentation to the loaded images (random flip and cropping).
Returns: X, list of input data variables from sample 'init' to 'final' belonging to the chosen 'set_name'
getXY(set_name, k, normalization_type='0-1', normalization=False, meanSubstraction=True, dataAugmentation=True, debug=False)
Gets the [X, Y] pairs for the next 'k' samples in the desired set (see the usage sketch below).
# General parameters
Parameters:
- set_name – 'train', 'val' or 'test' set
- k – Number of consecutive samples retrieved from the corresponding set.
- debug – If True, all data will be returned without preprocessing.
# 'raw-image', 'video', 'image-features' and 'video-features'-related parameters
- normalization – Indicates whether we want to normalize the data.
- normalization_type – Indicates the type of normalization applied. See the available types in self.__available_norm_im_vid for 'raw-image' and 'video', and in self.__available_norm_feat for 'image-features' and 'video-features'.
# 'raw-image' and 'video'-related parameters
- meanSubstraction – Indicates whether we want to subtract the training mean from the returned images (only applicable if normalization=True).
- dataAugmentation – Indicates whether we want to apply data augmentation to the loaded images (random flip and cropping).
Returns: [X, Y], list of input and output data variables of the next 'k' consecutive samples belonging to the chosen 'set_name'
Returns: [X, Y, [new_last, last, surpassed]] if debug==True
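A hedged sketch of manual batch retrieval, assuming `ds` is a Dataset whose 'train' split has already been populated with setInput/setOutput::

    # Pull two consecutive mini-batches of 32 samples from the 'train' split.
    X, Y = ds.getXY('train', 32,
                    normalization=True,
                    meanSubstraction=False,
                    dataAugmentation=True)

    # Successive calls advance internal counters, so this returns the next
    # 32 samples; resetCounters('train') would rewind them.
    X2, Y2 = ds.getXY('train', 32)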
getXY_FromIndices(set_name, k, normalization_type='0-1', normalization=False, meanSubstraction=True, dataAugmentation=True, debug=False)
Gets the [X, Y] pairs for the samples at positions 'k' in the desired set.
# General parameters
Parameters:
- set_name – 'train', 'val' or 'test' set
- k – Positions of the desired samples.
- debug – If True, all data will be returned without preprocessing.
# 'raw-image', 'video', 'image-features' and 'video-features'-related parameters
- normalization – Indicates whether we want to normalize the data.
- normalization_type – Indicates the type of normalization applied. See the available types in self.__available_norm_im_vid for 'raw-image' and 'video', and in self.__available_norm_feat for 'image-features' and 'video-features'.
# 'raw-image' and 'video'-related parameters
- meanSubstraction – Indicates whether we want to subtract the training mean from the returned images (only applicable if normalization=True).
- dataAugmentation – Indicates whether we want to apply data augmentation to the loaded images (random flip and cropping).
Returns: [X, Y], list of input and output data variables of the samples identified by the indices in 'k' belonging to the chosen 'set_name'
getY(set_name, init, final, normalization_type='0-1', normalization=False, meanSubstraction=True, dataAugmentation=True, debug=False)
Gets the output [Y] samples stored between positions 'init' and 'final'.
# General parameters
Parameters:
- set_name – 'train', 'val' or 'test' set
- init – Initial position in the corresponding set split. Must be greater than or equal to 0 and smaller than 'final'.
- final – Final position in the corresponding set split.
- debug – If True, all data will be returned without preprocessing.
# 'raw-image', 'video', 'image-features' and 'video-features'-related parameters
- normalization – Indicates whether we want to normalize the data.
- normalization_type – Indicates the type of normalization applied. See the available types in self.__available_norm_im_vid for 'raw-image' and 'video', and in self.__available_norm_feat for 'image-features' and 'video-features'.
# 'raw-image' and 'video'-related parameters
- meanSubstraction – Indicates whether we want to subtract the training mean from the returned images (only applicable if normalization=True).
- dataAugmentation – Indicates whether we want to apply data augmentation to the loaded images (random flip and cropping).
Returns: Y, list of output data variables from sample 'init' to 'final' belonging to the chosen 'set_name'
load3DLabels(bbox_list, nClasses, dataAugmentation, daRandomParams, img_size, size_crop, image_list)
Loads a set of outputs of type 3DLabel (used for detection).
Parameters:
- bbox_list – List of bboxes, labels and original sizes
- nClasses – Number of different classes to be detected
- dataAugmentation – Whether data augmentation is being applied
- daRandomParams – Random parameters applied on data augmentation (vflip, hflip and random crop)
- img_size – Resize applied to the input images
- size_crop – Crop size applied to the input images
- image_list – List of input images, used as identifiers into 'daRandomParams'
Returns: 3DLabels with shape (batch_size, width*height, classes)
load3DSemanticLabels(labeled_images_list, nClasses, classes_to_colour, dataAugmentation, daRandomParams, img_size, size_crop, image_list)
Loads a set of outputs of type 3DSemanticLabel (used for semantic segmentation).
Parameters:
- labeled_images_list – List of labeled images
- nClasses – Number of different classes to be detected
- classes_to_colour – Dictionary relating each class id to its corresponding colour in the labeled image
- dataAugmentation – Whether data augmentation is being applied
- daRandomParams – Random parameters applied on data augmentation (vflip, hflip and random crop)
- img_size – Resize applied to the input images
- size_crop – Crop size applied to the input images
- image_list – List of input images, used as identifiers into 'daRandomParams'
Returns: 3DSemanticLabels with shape (batch_size, width*height, classes)
loadFeatures(X, feat_len, normalization_type='L2', normalization=False, loaded=False, external=False, data_augmentation=True)
Loads and normalizes features.
Parameters:
- X – Features to load.
- feat_len – Length of the features.
- normalization_type – Normalization to apply to the features (see self.__available_norm_feat).
- normalization – Whether to normalize the features or not.
- loaded – Flag indicating whether these features have already been loaded.
- external – Boolean indicating whether the paths provided in 'X' are absolute paths to external images.
- data_augmentation – Whether to perform data augmentation (with mean=0.0, std_dev=0.01).
Returns: Loaded features as a numpy array
loadImages(images, id, normalization_type='0-1', normalization=False, meanSubstraction=True, dataAugmentation=True, daRandomParams=None, external=False, loaded=False)
Loads a set of images from disk.
Parameters:
- images – List of image string names, or list of matrices representing images (only if loaded==True)
- id – Identifier in the Dataset object of the data being loaded
- normalization_type – Type of normalization applied
- normalization – Whether to apply a 0-1 normalization to the images
- meanSubstraction – Whether to subtract the training mean
- dataAugmentation – Whether to apply data augmentation (random cropping and horizontal flip)
- daRandomParams – Dictionary with the results of random data augmentation provided by self.getDataAugmentationRandomParams()
- external – If True, the images will be loaded from an external database; in this case the list of images must contain absolute paths
- loaded – Set this option to True if 'images' is a list of matrices instead of a list of strings
loadMapping(path_list)
Loads a mapping of Source – Target words.
Parameters:
- path_list – Pickle object with the mapping
Returns: None
loadText(X, vocabularies, max_len, offset, fill, pad_on_batch, words_so_far, loading_X=False)
Text encoder: transforms samples from a text representation into a numerical one. It also masks the text (see the toy illustration below).
Parameters:
- X – Text to encode.
- vocabularies – Mapping word -> index
- max_len – Maximum length of the text.
- offset – Shifts the text to the right, adding the null symbol at the start.
- fill – 'start': the resulting vector will be filled with 0s at the beginning; 'end': it will be filled with 0s at the end.
- pad_on_batch – Whether sentences are padded to the maximum length of the minibatch, or to a fixed (max_text_length) length.
- words_so_far – Experimental feature. Use with caution.
- loading_X – Whether we are loading an input or an output of the model.
Returns: Text as sequences of numbers, plus a mask for each sentence.
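A toy numpy illustration of the offset/fill semantics described above; this is not the library's actual code, and the null-symbol id (2, taken from beam_search's null_sym default) is an assumption::

    import numpy as np

    sentence = [5, 9, 7]          # an already-indexed sentence
    max_len = 6

    # fill='end': pad with 0s at the end.
    encoded = np.zeros(max_len, dtype=int)
    encoded[:len(sentence)] = sentence          # -> [5, 9, 7, 0, 0, 0]

    # offset=1: shift right, inserting the null symbol at the start (as done
    # for the state_below input of sequential conditional models).
    shifted = np.zeros(max_len, dtype=int)
    shifted[0] = 2                              # assumed null symbol id
    shifted[1:1 + len(sentence)] = sentence     # -> [2, 5, 9, 7, 0, 0]

    # The mask marks real tokens vs. padding.
    mask = (encoded != 0).astype(int)           # -> [1, 1, 1, 0, 0, 0]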
loadVideos(n_frames, id, last, set_name, max_len, normalization_type, normalization, meanSubstraction, dataAugmentation)
Loads a set of videos from disk. (Untested!)
Parameters:
- n_frames – Number of frames per video
- id – Id of the data to load
- last – Last video loaded
- set_name – 'train', 'val' or 'test'
- max_len – Maximum length of the videos
- normalization_type – Type of normalization applied
- normalization – Whether to apply a 0-1 normalization to the images
- meanSubstraction – Whether to subtract the training mean
- dataAugmentation – Whether to apply data augmentation (random cropping and horizontal flip)
load_GT_3DSemanticLabels(gt, id)
Loads a GT list of 3DSemanticLabels stored as 2D matrices and reshapes them to Nx1 arrays.
Parameters:
- gt – List of Dataset outputs of type 3DSemanticLabel
- id – Id of the input/output being processed
Returns: out_list, containing a list of label images reshaped as Nx1 arrays
merge_vocabularies(ids)
Merges the vocabularies of a set of text inputs/outputs into a single one.
Parameters:
- ids – Identifiers of the inputs/outputs whose vocabularies will be merged
Returns: None
preprocessBinary(labels_list)
Preprocesses binary classes.
Parameters:
- labels_list – Binary label list, given as an instance of the class list.
Returns: Preprocessed labels.
preprocessCategorical(labels_list)
Preprocesses categorical data.
Parameters:
- labels_list – Label list, given as a path to a file or as an instance of the class list.
Returns: Preprocessed labels.
preprocessFeatures(path_list, id, set_name, feat_len)
Preprocesses features. Expects a path to a text file where each line contains a path to a .npy file storing a feature vector; alternatively, 'path_list' can be an instance of the class list.
Parameters:
- path_list – Path to a text file where each line contains a path to a .npy file storing a feature vector. Alternatively, an instance of the class list.
- id – Dataset id
- set_name – Set split name (possibly unused)
- feat_len – Length of the features. If all features have the same length, given as a number; otherwise, a list.
Returns: Preprocessed features
preprocessReal(labels_list)
Preprocesses real classes.
Parameters:
- labels_list – Label list, given as a path to a file or as an instance of the class list.
Returns: Preprocessed labels.
preprocessText(annotations_list, id, set_name, tokenization, build_vocabulary, max_text_len, max_words, offset, fill, min_occ, pad_on_batch, words_so_far)
Preprocesses the 'text' data type: builds the vocabulary (if necessary) and preprocesses the sentences. Also sets Dataset parameters.
Parameters:
- annotations_list – Path to the sentences to process.
- id – Dataset id of the data.
- set_name – Name of the current set ('train', 'val', 'test').
- tokenization – Tokenization to perform.
- build_vocabulary – Whether we should build a vocabulary for this text or not.
- max_text_len – Maximum length of the text. If max_text_len == 0, the full sentence is treated as a class.
- max_words – Maximum number of words to include in the dictionary.
- offset – Text shifting.
- fill – Whether to pad with zeros at the beginning or at the end of the sentences.
- min_occ – Minimum number of occurrences required for a word to be included in the dictionary.
- pad_on_batch – Whether sentences are padded to the maximum length of the minibatch, or to a fixed (max_text_length) length.
- words_so_far – Experimental feature. Should be ignored.
Returns: Preprocessed sentences.
resetCounters(set_name='all')
Resets the basic counter indices pointing to the next samples to read.
setClasses(path_classes, id)
Loads the list of classes of the dataset. Each line must contain a unique identifier of a class.
Parameters:
- path_classes – Path to a text file with the classes, or an instance of the class list.
- id – Dataset id
Returns: None
setInput(path_list, set_name, type='raw-image', id='image', repeat_set=1, required=True, overwrite_split=False, img_size=[256, 256, 3], img_size_crop=[227, 227, 3], use_RGB=True, max_text_len=35, tokenization='tokenize_basic', offset=0, fill='end', min_occ=0, pad_on_batch=True, build_vocabulary=False, max_words=0, words_so_far=False, feat_len=1024, max_video_len=26)
Loads a list which can contain all samples of the 'train', 'val' or 'test' set splits (specified by set_name). A combined usage sketch follows the setOutput entry below.
# General parameters
Parameters:
- path_list – Can either be a path to a text file containing the paths to the images, or a python list of paths.
- set_name – Identifier of the set split loaded ('train', 'val' or 'test').
- type – Identifier of the type of input being loaded (the accepted types can be seen in self.__accepted_types_inputs).
- id – Identifier of the input data loaded.
- repeat_set – Repeats the inputs given (useful when there are more outputs than inputs). Int or array of ints.
- required – Flag for optional inputs.
- overwrite_split – Indicates that we want to overwrite data with the given id that was already declared in the dataset.
# 'raw-image'-related parameters
- img_size – Size of the input images (any input image will be resized to this).
- img_size_crop – Size of the cropped zone (when dataAugmentation=False, the central crop is used).
# 'text'-related parameters
- tokenization – Type of tokenization applied (must be declared as a method of this class) (only applicable when type=='text').
- build_vocabulary – Whether a new vocabulary will be built from the loaded data or not (only applicable when type=='text').
- max_text_len – Maximum text length; the rest of the data will be padded with 0s (only applicable if the data is of type 'text').
- max_words – A maximum of 'max_words' words from the whole vocabulary will be kept, chosen by number of occurrences.
- offset – Number of timesteps the text is shifted to the right (for sequential conditional models, which take the previous output as input).
- fill – Whether to pad before or after the sequence.
- min_occ – Minimum number of occurrences allowed for the words in the vocabulary (default: 0).
- pad_on_batch – If True, the batch timestep size is set to the length of the largest sample + 1; otherwise, max_len is used as the fixed length.
- words_so_far – If True, each sample will be represented as the complete set of words up to the point defined by the timestep dimension (e.g. t=0 'a', t=1 'a dog', t=2 'a dog is', etc.).
# 'image-features' and 'video-features'-related parameters
- feat_len – Size of the feature vectors in each dimension. A list must be provided if the features are not vectors.
# 'video'-related parameters
- max_video_len – Maximum video length; the rest of the data will be padded with 0s (only applicable if the input data is of type 'video' or 'video-features').
setInputGeneral(path_list, split=[0.8, 0.1, 0.1], shuffle=True, type='raw-image', id='image')
DEPRECATED. Loads a single list of samples from which the train/val/test divisions will be derived.
Parameters:
- path_list – Path to the text file with the list of images.
- split – Percentage of images used for [training, validation, test].
- shuffle – Whether to randomly shuffle the input samples or not.
- type – Identifier of the type of input being loaded (the accepted types can be seen in self.__accepted_types_inputs).
- id – Identifier of the input data loaded.

setLabels(labels_list, set_name, type='categorical', id='label')
DEPRECATED

setList(path_list, set_name, type='raw-image', id='image')
DEPRECATED

setListGeneral(path_list, split=[0.8, 0.1, 0.1], shuffle=True, type='raw-image', id='image')
DEPRECATED
setOutput(path_list, set_name, type='categorical', id='label', repeat_set=1, overwrite_split=False, tokenization='tokenize_basic', max_text_len=0, offset=0, fill='end', min_occ=0, pad_on_batch=True, words_so_far=False, build_vocabulary=False, max_words=0, associated_id_in=None, sample_weights=False)
Loads a set of output data, usually (type=='categorical') referencing values in self.classes (starting from 0).
# General parameters
Parameters:
- path_list – Can either be a path to a text file containing the labels, or a python list of labels.
- set_name – Identifier of the set split loaded ('train', 'val' or 'test').
- type – Identifier of the type of output being loaded (the accepted types can be seen in self.__accepted_types_outputs).
- id – Identifier of the output data loaded.
- repeat_set – Repeats the outputs given (useful when there are more inputs than outputs). Int or array of ints.
- overwrite_split – Indicates that we want to overwrite data with the given id that was already declared in the dataset.
# 'text'-related parameters
- tokenization – Type of tokenization applied (must be declared as a method of this class) (only applicable when type=='text').
- build_vocabulary – Whether a new vocabulary will be built from the loaded data or not (only applicable when type=='text').
- max_text_len – Maximum text length; the rest of the data will be padded with 0s (only applicable if the output data is of type 'text'). Set to 0 if the whole sentence will be used as an output class.
- max_words – A maximum of 'max_words' words from the whole vocabulary will be kept, chosen by number of occurrences.
- offset – Number of timesteps the text is shifted to the right (for sequential conditional models, which take the previous output as input).
- fill – Whether to pad before or after the sequence.
- min_occ – Minimum number of occurrences allowed for the words in the vocabulary (default: 0).
- pad_on_batch – If True, the batch timestep size is set to the length of the largest sample + 1; otherwise, max_len is used as the fixed length.
- words_so_far – If True, each sample will be represented as the complete set of words up to the point defined by the timestep dimension (e.g. t=0 'a', t=1 'a dog', t=2 'a dog is', etc.).
# '3DLabel' or '3DSemanticLabel'-related parameters
- associated_id_in – Id of the input 'raw-image' associated with the given 3DLabels or 3DSemanticLabels.
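A hedged end-to-end sketch of building a small image-classification dataset with the calls above; the file layout and the 'image'/'label' ids are assumptions, not fixed by the library::

    from keras_wrapper.dataset import Dataset

    ds = Dataset('example_dataset', '/data/example', silence=False)

    # Declare the classes and the inputs/outputs of the 'train' split.
    ds.setClasses('/data/example/classes.txt', 'label')
    ds.setInput('/data/example/train_images.txt', 'train',
                type='raw-image', id='image',
                img_size=[256, 256, 3], img_size_crop=[227, 227, 3])
    ds.setOutput('/data/example/train_labels.txt', 'train',
                 type='categorical', id='label')

    # Optionally pre-compute the training mean used by meanSubstraction.
    ds.calculateTrainMean('image')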
setRawInput(path_list, set_name, type='file-name', id='raw-text', overwrite_split=False)
Loads a list which can contain all samples of the 'train', 'val' or 'test' set splits (specified by set_name).
# General parameters
Parameters:
- path_list – Can either be a path to a text file containing the paths to the images, or a python list of paths.
- set_name – Identifier of the set split loaded ('train', 'val' or 'test').
- type – Identifier of the type of input being loaded (the accepted types can be seen in self.__accepted_types_inputs).
- id – Identifier of the input data loaded.
- overwrite_split – Indicates that we want to overwrite data with the given id that was already declared in the dataset.
setRawOutput(path_list, set_name, type='file-name', id='raw-text', overwrite_split=False)
Loads a list which can contain all samples of the 'train', 'val' or 'test' set splits (specified by set_name).
# General parameters
Parameters:
- path_list – Can either be a path to a text file containing the samples, or a python list of samples.
- set_name – Identifier of the set split loaded ('train', 'val' or 'test').
- type – Identifier of the type of output being loaded.
- id – Identifier of the output data loaded.
- overwrite_split – Indicates that we want to overwrite data with the given id that was already declared in the dataset.
setSemanticClasses(path_classes, id)
Loads the list of semantic classes of the dataset, together with their corresponding colours in the GT image. Each line must contain a unique class identifier and its associated RGB colour representation, separated by commas.
Parameters:
- path_classes – Path to a text file with the classes and their colours.
- id – Input/output id.
Returns: None
setSilence(silence)
Changes the silence mode of the 'Dataset' instance.
setTrainMean(mean_image, id, normalization=False)
Loads a pre-calculated training mean image. 'mean_image' can either be:
- a numpy.array (complete image),
- a list with one value per channel, or
- a string with the path to the stored image.
Parameters:
- id – Identifier of the type of input whose training mean is being set.
shuffleTraining()
Applies a random shuffling to the training samples.
tokenize_CNN_sentence(caption)
Tokenization employed in the CNN_sentence package (https://github.com/yoonkim/CNN_sentence/blob/master/process_data.py#L97).

tokenize_aggressive(caption, lowercase=True)
Aggressive tokenizer for input/output data of type 'text':
- Removes punctuation
- Optional lowercasing
Parameters:
- caption – String to tokenize
- lowercase – Whether to lowercase the caption or not
Returns: Tokenized version of caption

tokenize_basic(caption, lowercase=True)
Basic tokenizer for input/output data of type 'text':
- Splits punctuation
- Optional lowercasing
Parameters:
- caption – String to tokenize
- lowercase – Whether to lowercase the caption or not
Returns: Tokenized version of caption

tokenize_icann(caption)
Tokenization used for the ICANN paper:
- Removes some punctuation (. , ")
- Lowercasing
Parameters:
- caption – String to tokenize
Returns: Tokenized version of caption

tokenize_montreal(caption)
Similar to tokenize_icann:
- Removes some punctuation
- Lowercasing
Parameters:
- caption – String to tokenize
Returns: Tokenized version of caption

tokenize_none(caption)
Does not tokenize the sentences; only strips them.
Parameters:
- caption – String to tokenize
Returns: Tokenized version of caption

tokenize_none_char(caption)
Character-level tokenization. Respects all symbols, separates the characters and inserts the <space> symbol for spaces. If an escaped character is found, it is converted back to the original one. List of escaped characters (following the Moses tokenizer): &amp; -> &, &#124; -> |, &lt; -> <, &gt; -> >, &apos; -> ', &quot; -> ", &#91; -> [, &#93; -> ].
Parameters:
- caption – String to tokenize
Returns: Tokenized version of caption

tokenize_questions(caption)
Basic tokenizer for VQA questions:
- Lowercasing
- Splits contractions
- Removes punctuation
- Converts number words to digits
Parameters:
- caption – String to tokenize
Returns: Tokenized version of caption

tokenize_soft(caption, lowercase=True)
Soft tokenizer:
- Removes very little punctuation
- Optional lowercasing
Parameters:
- caption – String to tokenize
- lowercase – Whether to lowercase the caption or not
Returns: Tokenized version of caption
class keras_wrapper.dataset.Homogeneous_Data_Batch_Generator(set_split, net, dataset, num_iterations, batch_size=50, maxlen=100, normalization=False, data_augmentation=True, mean_substraction=True, predict=False)
Retrieves batches of samples of the same length. Parts of the code borrowed from https://github.com/kelvinxu/arctic-captions/blob/master/homogeneous_data.py
keras_wrapper.dataset.create_dir_if_not_exists(directory)
Creates a directory if it doesn't exist.
Parameters:
- directory – Directory to create
Returns: None
keras_wrapper.dataset.loadDataset(dataset_path)
Loads a previously saved Dataset object (see the round-trip sketch after saveDataset below).
Parameters:
- dataset_path – Path to the stored Dataset to load
Returns: Loaded Dataset object
keras_wrapper.dataset.saveDataset(dataset, store_path)
Saves a backup of the current Dataset object.
Parameters:
- dataset – Dataset object to save
- store_path – Saving path
Returns: None
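A hedged sketch of the save/load round trip; the stored file name (typically 'Dataset_<name>.pkl' under store_path) is an assumption about the library's naming scheme::

    from keras_wrapper.dataset import saveDataset, loadDataset

    saveDataset(ds, '/data/example/datasets')
    ds_restored = loadDataset('/data/example/datasets/Dataset_example_dataset.pkl')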
cnn_model.py
keras_wrapper.cnn_model.CNN_Model
Alias of Model_Wrapper.
class keras_wrapper.cnn_model.Model_Wrapper(nOutput=1000, type='basic_model', silence=False, input_shape=[256, 256, 3], structure_path=None, weights_path=None, seq_to_functional=False, model_name=None, plots_path=None, models_path=None, inheritance=False)
Wrapper for Keras' models. It provides the following utilities:
- Training visualization module.
- Set of already-implemented CNNs for quick definition.
- Easy layer re-definition for finetuning.
- Model backups.
- Easy-to-use training and test methods.
BeamSearchNet(ds, parameters)
DEPRECATED, use predictBeamSearchNet() instead.
Empty(nOutput, input)
Creates an empty Model_Wrapper (can be externally defined).

GAP(nOutput, input)
Creates a GAP network for object localization as described in the paper: Zhou B, Khosla A, Lapedriza A, Oliva A, Torralba A. Learning Deep Features for Discriminative Localization. arXiv preprint arXiv:1512.04150. 2015 Dec 14.
Outputs:
- 'GAP/softmax' – output of the final softmax classification
- 'GAP/conv' – output of the generated convolutional maps

Identity_Layer(nOutput, input)
Builds a dummy Identity_Layer, which outputs exactly its input. Only used for passing the output of a previous stage to the next one (see Staged_Network).

One_vs_One(nOutput, input)
Builds a simple One_vs_One network with 3 convolutional layers (useful for ECOC models).

One_vs_One_Inception(nOutput=2, input=[224, 224, 3])
Builds a simple One_vs_One_Inception network with 2 inception layers (useful for ECOC models).

One_vs_One_Inception_v2(nOutput=2, input=[224, 224, 3])
Builds a simple One_vs_One_Inception_v2 network with 2 inception layers (useful for ECOC models).

Union_Layer(nOutput, input)
Network with just a dropout layer and a softmax layer, intended to serve as the final layer of an ECOC model.

VGG_16(nOutput, input)
Builds a VGG model with 16 layers.

VGG_16_FunctionalAPI(nOutput, input)
16-layer VGG model implemented with Keras' Functional API.

VGG_16_PReLU(nOutput, input)
Builds a VGG model with 16 layers and PReLU activations.

add_One_vs_One_Inception(input, input_shape, id_branch, nOutput=2, activation='softmax')
Builds a simple One_vs_One_Inception network with 2 inception layers on top of the current model (useful for ECOC_loss models).

add_One_vs_One_Inception_Functional(input, input_shape, id_branch, nOutput=2, activation='softmax')
Builds a simple One_vs_One_Inception network with 2 inception layers on top of the current model (useful for ECOC_loss models).

add_One_vs_One_Inception_v2(input, input_shape, id_branch, nOutput=2, activation='softmax')
Builds a simple One_vs_One_Inception_v2 network with 2 inception layers on top of the current model (useful for ECOC_loss models).
add_dense_block(in_layer, nb_layers, k=12, drop=0.2)
Adds a Dense Block.
# References
- Jegou S, Drozdzal M, Vazquez D, Romero A, Bengio Y. The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation. arXiv preprint arXiv:1611.09326. 2016 Nov 28.
Parameters:
- in_layer – Input layer to the dense block.
- nb_layers – Number of dense layers included in the dense block (see self.add_dense_layer() for information about the internal layers).
- k – Growth rate. Number of additional feature maps learned at each layer.
- drop – Dropout rate.
Returns: output layer of the dense block

add_dense_layer(in_layer, k, drop)
Adds a Dense Layer inside a Dense Block, composed of BN, ReLU, Conv and Dropout.
# References
- Jegou S, Drozdzal M, Vazquez D, Romero A, Bengio Y. The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation. arXiv preprint arXiv:1611.09326. 2016 Nov 28.
Parameters:
- in_layer – Input layer to the dense layer.
- k – Growth rate. Number of additional feature maps learned at each layer.
- drop – Dropout rate.
Returns: output layer
add_transitionup_block(x, skip_conn, skip_conn_shapes, out_dim, nb_filters_deconv, nb_layers, growth, drop)
Adds a Transition Up Block, consisting of Deconv, Skip Connection and Dense Block.
# References
- Jegou S, Drozdzal M, Vazquez D, Romero A, Bengio Y. The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation. arXiv preprint arXiv:1611.09326. 2016 Nov 28.
# Input layer parameters
Parameters:
- x – Input layer.
- skip_conn – List of layers to be used as skip connections.
- skip_conn_shapes – List of output shapes of the skip connection layers.
# Deconvolutional layer parameters
- out_dim – Output dimensionality of the deconvolutional layer [width, height].
- nb_filters_deconv – Number of deconvolutional filters to learn.
# Dense Block parameters
- nb_layers – Number of dense layers included in the dense block (see self.add_dense_layer() for information about the internal layers).
- growth – Growth rate. Number of additional feature maps learned at each layer.
- drop – Dropout rate.
Returns: output layer
basic_model(nOutput, input)
Builds a basic CNN model.

basic_model_seq(nOutput, input)
Builds a basic CNN model.
beam_search(X, params, null_sym=2)
Beam search method for Cond models (https://en.wikibooks.org/wiki/Artificial_Intelligence/Search/Heuristic_search/Beam_search). In a nutshell, the algorithm does the following (a self-contained sketch follows below):
1. k = beam_size
2. open_nodes = [[]] * k
3. while k > 0:
   3.1. Given the inputs, get (log) probabilities for the outputs.
   3.2. Expand each open node with all possible outputs.
   3.3. Prune and keep the k best nodes.
   3.4. If a sample has reached the <eos> symbol:
        3.4.1. Mark it as a final sample.
        3.4.2. k -= 1
   3.5. Build new inputs (state_below) and go to 3.1.
4. return final_samples, final_scores
Parameters:
- X – Model inputs
- params – Search parameters
- null_sym – <null> symbol
Returns: UNSORTED list of [k_best_samples, k_best_scores] (k: beam size)
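A generic beam-search sketch mirroring the loop above, over a step function `log_probs_fn(prefix)` that returns one log-probability per vocabulary item. This is not the library's implementation (which batches all hypotheses through the network at each time-step); it only illustrates the expand/prune/finish cycle::

    def beam_search_sketch(log_probs_fn, beam_size, eos_sym, max_len=30, null_sym=2):
        live = [([null_sym], 0.0)]        # open nodes: (prefix, cumulative log-prob)
        final_samples, final_scores = [], []
        k = beam_size
        for _ in range(max_len):
            if k <= 0 or not live:
                break
            candidates = []
            # 3.1 / 3.2: expand every open node with every possible output.
            for prefix, score in live:
                log_probs = log_probs_fn(prefix)
                for w, lp in enumerate(log_probs):
                    candidates.append((prefix + [w], score + lp))
            # 3.3: prune, keeping the k best nodes.
            candidates.sort(key=lambda c: c[1], reverse=True)
            live = []
            for prefix, score in candidates[:k]:
                if prefix[-1] == eos_sym:   # 3.4: finished hypothesis
                    final_samples.append(prefix)
                    final_scores.append(score)
                    k -= 1
                else:
                    live.append((prefix, score))
        # Hypotheses still open at max_len are returned too (unsorted).
        final_samples.extend(p for p, _ in live)
        final_scores.extend(s for _, s in live)
        return final_samples, final_scores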
checkParameters(input_params, default_params)
Validates a set of input parameters and uses the default ones if not specified.
decode_predictions(preds, temperature, index2word, sampling_type, verbose=0)
Decodes predictions.
Parameters:
- preds – Predictions codified as the output of a softmax activation function.
- temperature – Temperature for sampling.
- index2word – Mapping from word indices into word characters.
- sampling_type – 'max_likelihood' or 'multinomial'.
- verbose – Verbosity level, by default 0.
Returns: List of decoded predictions.

decode_predictions_beam_search(preds, index2word, alphas=None, heuristic=0, x_text=None, unk_symbol='<unk>', pad_sequences=False, mapping=None, verbose=0)
Decodes predictions coming from the beam search method.
Parameters:
- preds – Predictions codified as word indices.
- index2word – Mapping from word indices into word characters.
- pad_sequences – Whether a zero-pad should be applied to the input sequence.
- verbose – Verbosity level, by default 0.
Returns: List of decoded predictions.

decode_predictions_one_hot(preds, index2word, verbose=0)
Decodes predictions following a one-hot codification.
Parameters:
- preds – Predictions codified as one-hot vectors.
- index2word – Mapping from word indices into word characters.
- verbose – Verbosity level, by default 0.
Returns: List of decoded predictions.
ended_training()
Indicates whether the model has early-stopped.

log(mode, data_type, value)
Stores the train and val information for plotting the training progress.
Parameters:
- mode – 'train' or 'val'
- data_type – 'iteration', 'loss' or 'accuracy'
- value – Numerical value taken by the data_type

plot()
Plots the training progress information.
predictBeamSearchNet(ds, parameters)
Approximates, by beam search, the best predictions of the net on the chosen dataset splits.
Parameters:
- batch_size – Size of the batch.
- n_parallel_loaders – Number of parallel data batch loaders.
- normalization – Whether to apply data normalization on images/features (only if images/features are used as input).
- mean_substraction – Whether to apply mean data normalization on images (only if images are used as input).
- predict_on_sets – List of set splits for which the predictions will be extracted ['train', 'val', 'test'].
- optimized_search – Boolean indicating whether the model has the optimized beam search implemented (separate self.model_init and self.model_next models for reusing the information from previous timesteps).
The following attributes must be inserted into the model when building an optimized search model:
- ids_inputs_init – List of input variables to model_init (must match the inputs of the conventional model).
- ids_outputs_init – List of output variables of model_init (the model probs must be the first output).
- ids_inputs_next – List of input variables to model_next (the previous word must be the first input).
- ids_outputs_next – List of output variables of model_next (the model probs must be the first output, and the number of output variables must match the number of input variables).
- matchings_init_to_next – Dictionary from 'ids_outputs_init' to 'ids_inputs_next'.
- matchings_next_to_next – Dictionary from 'ids_outputs_next' to 'ids_inputs_next'.
Returns: predictions, a dictionary with set splits as keys and matrices of predictions as values.
predictNet(ds, parameters={}, postprocess_fun=None)
Returns the predictions of the net on the chosen dataset splits. The input 'parameters' is a dict() which may contain the following parameters:
Parameters:
- batch_size – Size of the batch.
- n_parallel_loaders – Number of parallel data batch loaders.
- normalize – Whether to apply data normalization on images/features (only if images/features are used as input).
- mean_substraction – Whether to apply mean data normalization on images (only if images are used as input).
- predict_on_sets – List of set splits for which the predictions will be extracted ['train', 'val', 'test'].
Additional parameters:
- postprocess_fun – Post-processing function applied to all predictions before returning the result. The output of the function must be a list of results, one per sample. If postprocess_fun is a list, the second element will be used as an extra input to the function.
Returns: predictions, a dictionary with set splits as keys and matrices of predictions as values (see the sketch below).
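A hedged usage sketch; the parameter values are illustrative, not defaults::

    preds = model.predictNet(ds, parameters={'batch_size': 50,
                                             'n_parallel_loaders': 8,
                                             'normalize': True,
                                             'mean_substraction': False,
                                             'predict_on_sets': ['test']})
    # One matrix of predictions per requested split.
    test_predictions = preds['test']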
predictOnBatch(X, in_name=None, out_name=None, expand=False)
Applies a forward pass and returns the predicted values.

predict_cond(X, states_below, params, ii)
Returns predictions on a batch, given the (static) input X and the current history (states_below) at time-step ii. WARNING: it is assumed that the current history (state_below) is the last input of the model! See the Dataset class for more information.
Parameters:
- X – Input context
- states_below – Batch of partial hypotheses
- params – Decoding parameters
- ii – Decoding time-step
Returns: Network predictions at time-step ii

predict_cond_optimized(X, states_below, params, ii, prev_out)
Returns predictions on a batch, given the (static) input X and the current history (states_below) at time-step ii. WARNING: it is assumed that the current history (state_below) is the last input of the model! See the Dataset class for more information.
Parameters:
- X – Input context
- states_below – Batch of partial hypotheses
- params – Decoding parameters
- ii – Decoding time-step
- prev_out – Output from the previous timestep, which will be reused by self.model_next (only applicable if the beam-search-specific models self.model_init and self.model_next are defined)
Returns: Network predictions at time-step ii

prepareData(X_batch, Y_batch=None)
Prepares the data for the model, depending on its type (Sequential, Model, Graph).
Parameters:
- X_batch – Batch of input data.
- Y_batch – Batch of output data.
Returns: Prepared data.
replaceLastLayers(num_remove, new_layers)
Replaces the last 'num_remove' layers in the model by the ones newly defined in 'new_layers'. Only valid for Sequential models. Use self.removeLayers(...) for Graph models.

replace_unknown_words(src_word_seq, trg_word_seq, hard_alignment, unk_symbol, heuristic=0, mapping=None, verbose=0)
Replaces unknown words in the target sentence according to some heuristic. Borrowed from: https://github.com/sebastien-j/LV_groundhog/blob/master/experiments/nmt/replace_UNK.py
Parameters:
- src_word_seq – Source sentence words
- trg_word_seq – Hypothesis words
- hard_alignment – Target-Source alignments
- unk_symbol – Symbol in trg_word_seq to replace
- heuristic – Heuristic (0, 1, 2)
- mapping – External alignment dictionary
- verbose – Verbosity level
Returns: trg_word_seq with the unknown words replaced
resumeTrainNet(ds, parameters, out_name=None)
DEPRECATED
Resumes the last training state of a stored model, also keeping its training parameters. Any parameter introduced through the argument 'parameters' will be replaced by the stored one.
Parameters:
- out_name – Name of the output node used to evaluate the network accuracy. Only applicable to Graph models.
sample(a, temperature=1.0)
Helper function to sample an index from a probability array.
Parameters:
- a – Probability array
- temperature – The higher the temperature, the flatter the probabilities, and hence the more random the outputs.
Returns: Sampled index
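A common implementation of this kind of helper, shown as a sketch (the library's exact numerics may differ)::

    import numpy as np

    def sample_sketch(a, temperature=1.0):
        # Sharpen (T < 1) or flatten (T > 1) the distribution in log-space,
        # renormalise, and draw one index from the resulting multinomial.
        a = np.asarray(a, dtype=np.float64)
        a = np.log(a + 1e-20) / temperature
        p = np.exp(a) / np.sum(np.exp(a))
        return int(np.argmax(np.random.multinomial(1, p, 1)))

    # Example: with a high temperature the draw is close to uniform.
    probs = [0.7, 0.2, 0.1]
    idx = sample_sketch(probs, temperature=2.0)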
sampling(scores, sampling_type='max_likelihood', temperature=1.0)
Samples words (each sample drawn from a categorical distribution), or picks the words that maximize the likelihood.
Parameters:
- scores – Array of size #samples x #classes; every entry determines a score for sample i having class j.
- sampling_type – 'max_likelihood' or 'multinomial'.
- temperature – Temperature for the predictions. The higher the temperature, the flatter the probabilities, and hence the more random the outputs.
Returns: Set of indices chosen as output, a vector of size #samples
setInputsMapping(inputsMapping)
Sets the mapping of the inputs from the format given by the dataset to the format expected by the model (a combined sketch follows the setOutputsMapping entry below).
Parameters:
- inputsMapping – Dictionary with the model inputs' identifiers as keys and the positions of the dataset inputs' identifiers as values. If the current model is Sequential, the keys must be ints denoting the desired input order (starting from 0). If it is a Model, the keys must be strs.
setName(model_name, plots_path=None, models_path=None, create_plots=False, clear_dirs=True)
Changes the name (identifier) of the Model_Wrapper instance.
Parameters:
- model_name – New model name
- plots_path – Path where the plots will be stored
- models_path – Path where the model will be stored
- create_plots – Whether plots will be stored or not
- clear_dirs – Whether the store_path directory will be erased or not
Returns: None
setOptimizer(lr=None, momentum=None, loss=None, metrics=None, decay=0.0, clipnorm=10.0, optimizer=None)
Sets a new optimizer for the CNN model.
Parameters:
- lr – Learning rate of the network.
- momentum – Momentum of the network (if None, then momentum = 1 - lr).
- loss – Loss function applied for optimization.
- metrics – List of Keras metrics used for evaluating the model. To specify different metrics for different outputs of a multi-output model, a dictionary such as metrics={'output_a': 'accuracy'} can also be passed.
- decay – Learning rate decay.
- clipnorm – Gradient clipping norm.
- optimizer – String identifying the type of optimizer used (default: SGD).
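A hedged usage sketch; the assumption here is that optimizer identifiers follow Keras' names::

    # Recompile the wrapped Keras model with a new optimizer and loss.
    model.setOptimizer(lr=0.001,
                       loss='categorical_crossentropy',
                       optimizer='adam',
                       metrics=['accuracy'],
                       clipnorm=10.0)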
setOutputsMapping(outputsMapping, acc_output=None)
Sets the mapping of the outputs from the format given by the dataset to the format expected by the model.
Parameters:
- outputsMapping – Dictionary with the model outputs' identifiers as keys and the positions of the dataset outputs' identifiers as values. If the current model is Sequential, the keys must be ints denoting the desired output order (in this case only one value can be provided). If it is a Model, the keys must be strs.
- acc_output – Name of the model's output used for calculating the accuracy of the model (only needed for Graph models).
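A hedged sketch combining both mappings for a functional (Model) net; the layer names 'image' and 'label' and their dataset positions are assumptions tied to the Dataset built earlier with setInput/setOutput::

    # Model input/output names as keys, dataset positions as values.
    model.setInputsMapping({'image': 0})
    model.setOutputsMapping({'label': 0}, acc_output='label')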
testNetSamples(X, batch_size=50)
Applies a forward pass on the provided samples and returns the predicted classes and probabilities.

testNet_deprecated(ds, parameters, out_name=None)
Applies a complete round of tests using the test set of the provided Dataset instance.
Parameters:
- out_name – Name of the output node used to evaluate the network accuracy. Only applicable to Graph models.
The available (optional) testing parameters are the following:
- batch_size – Size of the batch (number of images) applied on each iteration.
#### Data processing parameters
- n_parallel_loaders – Number of parallel data loaders allowed to work at the same time.
- normalization – Boolean indicating whether to 0-1 normalize the image pixel values.
- mean_substraction – Boolean indicating whether to subtract the training mean.

testOnBatch(X, Y, accuracy=True, out_name=None)
Applies a test on the provided samples and returns the resulting loss and accuracy (if accuracy=True).
Parameters:
- out_name – Name of the output node used to evaluate the network accuracy. Only applicable to Graph models.
trainNet(ds, parameters, out_name=None)
Trains the network on the given dataset 'ds' (a usage sketch follows the parameter list below).
Parameters:
- out_name – Name of the output node used to evaluate the network accuracy. Only applicable to Graph models.
The input 'parameters' is a dict() which may contain the following (optional) training parameters:
#### Visualization parameters
- report_iter – Number of iterations between each loss report.
- iter_for_val – Number of iterations between each validation test.
- num_iterations_val – Number of iterations applied on the validation dataset for computing the average performance (if None, all the validation data will be tested).
#### Learning parameters
- n_epochs – Number of epochs applied during training.
- batch_size – Size of the batch (number of images) applied on each iteration by the SGD optimization.
- lr_decay – Number of iterations elapsed between learning rate decreases.
- lr_gamma – Proportion of the learning rate kept at each decrease. It can also be a set of rules defined by a list, e.g. lr_gamma = [[3000, 0.9], ..., [None, 0.8]] means 0.9 until iteration 3000, ..., 0.8 until the end.
#### Data processing parameters
- n_parallel_loaders – Number of parallel data loaders allowed to work at the same time.
- normalize – Boolean indicating whether to 0-1 normalize the image pixel values.
- mean_substraction – Boolean indicating whether to subtract the training mean.
- data_augmentation – Boolean indicating whether to perform data augmentation (always False on validation).
- shuffle – Whether to shuffle the training data at the beginning of each epoch.
#### Other parameters
- save_model – Number of iterations between each model backup.
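A hedged training sketch; the parameter values are illustrative only::

    train_params = {'n_epochs': 10,
                    'batch_size': 50,
                    'report_iter': 50,
                    'iter_for_val': 1000,
                    'normalize': True,
                    'mean_substraction': False,
                    'data_augmentation': True,
                    'shuffle': True,
                    'save_model': 5000}
    model.trainNet(ds, train_params)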
keras_wrapper.cnn_model.loadModel(model_path, update_num, custom_objects={}, full_path=False)
Loads a previously saved Model_Wrapper object (see the round-trip sketch after saveModel below).
Parameters:
- model_path – Path to the Model_Wrapper object to load.
- update_num – Identifier of the number of iterations/updates/epochs elapsed.
- custom_objects – Dictionary of custom layers (i.e. the input to model_from_json).
Returns: Loaded Model_Wrapper
keras_wrapper.cnn_model.saveModel(model_wrapper, update_num, path=None, full_path=False, store_iter=True)
Saves a backup of the current Model_Wrapper object after it has been trained for 'update_num' iterations/updates/epochs.
Parameters:
- model_wrapper – Object to save.
- update_num – Identifier of the number of iterations/updates/epochs elapsed.
- path – Path where the model will be saved.
- full_path – Whether to save exactly to 'path', or to path + '/epoch_' + update_num.
- store_iter – Whether to store the current update_num.
Returns: None
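A hedged sketch of the save/load cycle, following the full_path description above (path + '/epoch_' + update_num when full_path=False)::

    from keras_wrapper.cnn_model import saveModel, loadModel

    saveModel(model, 10, path='/models/example')
    model_restored = loadModel('/models/example', 10)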
callbacks_keras_wrapper.py
beam_search_ensemble.py
utils.py
keras_wrapper.utils.bbox(img, mode='max')
Returns a bounding box covering all the non-zero area in the image. With mode='width_height', positions [2] and [3] of the result hold the width and height; with mode='max', they hold xmax and ymax. A sketch follows below.
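An illustrative numpy re-implementation (an assumption, not the library's exact code; in particular, positions [0] and [1] holding xmin and ymin, and the inclusive width/height computation, are assumptions)::

    import numpy as np

    def bbox_sketch(img, mode='max'):
        rows = np.any(img, axis=1)              # rows with non-zero pixels
        cols = np.any(img, axis=0)              # columns with non-zero pixels
        ymin, ymax = np.where(rows)[0][[0, -1]]
        xmin, xmax = np.where(cols)[0][[0, -1]]
        if mode == 'max':
            return xmin, ymin, xmax, ymax
        # 'width_height': inclusive span (assumption)
        return xmin, ymin, xmax - xmin + 1, ymax - ymin + 1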
keras_wrapper.utils.prepareGoogleNet_Food101(model_wrapper)
Prepares the GoogleNet model after its conversion from Caffe.

keras_wrapper.utils.prepareGoogleNet_Food101_ECOC_loss(model_wrapper)
Prepares the GoogleNet model for inserting an ECOC structure after removing the last part of the net.

keras_wrapper.utils.prepareGoogleNet_Food101_Stage1(model_wrapper)
Prepares the GoogleNet model for serving as the first stage of a Staged_Network.

keras_wrapper.utils.prepareGoogleNet_Stage2(stage1, stage2)
Removes the second part of the GoogleNet for inserting it into the second stage.