Modules

Unagi has three submodules: binarize, dataset, and train. Each submodule has a main function. Together they can be used to create a dataset, train the model on the created data, and then binarize images using the saved model weights.

unagi.binarize

The binarize module loads a pretrained model saved during training and runs model prediction, i.e., binarization, on the given images.

unagi.binarize.main(input_path: str = './input', output_path: str = './output', weights_path: Optional[str] = None, batchsize: int = 2) → List[numpy.ndarray]

Binarize images from the input directory and write them to the output directory.

Parameters:
    input_path (str, optional) – path to the input images (default is the input folder in the current directory).
    output_path (str, optional) – path where the output images are saved (default is the output folder in the current directory).
    weights_path (str or None, optional) – path to the weights file. If None, the default weights are loaded from the package root directory.
    batchsize (int, optional) – batch size to use for model prediction (default is 2).
Returns:	list of binary images in np.ndarray format
Return type:	List[numpy.ndarray]

Note

All input images should be PNG files with names like “sample_1.png”. Output image names end with “_bin”, for example “sample_1_bin.png”.
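The naming rule in the note can be sketched as follows. The helper name `binarized_name` is hypothetical and not part of unagi; it only illustrates how an output name could be derived from an input name under the stated convention.

```python
from pathlib import Path


def binarized_name(input_name: str) -> str:
    """Return the output file name for an input image name.

    Hypothetical helper, not part of unagi: it mirrors the documented
    convention that output names end with "_bin" before the extension.
    """
    path = Path(input_name)
    # Path.stem drops the final extension; Path.suffix keeps it (".png").
    return f"{path.stem}_bin{path.suffix}"


print(binarized_name("sample_1.png"))
```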

Example

unagi.binarize.main('input_path', 'output_path', batchsize=2)

unagi.dataset

The dataset module creates the training dataset. It takes a folder containing input images and their corresponding ground-truth images. Input image names should end with _in and ground-truth image names should end with _gt. Each input image and its ground-truth image should have the same file extension.
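The pairing convention described above can be illustrated with a short sketch. `pair_images` is a hypothetical helper, not a unagi function, and it assumes that a pair is matched on the name before the _in or _gt suffix plus the file extension.

```python
from pathlib import Path
from typing import Dict, List, Tuple


def pair_images(file_names: List[str]) -> List[Tuple[str, str]]:
    """Pair "<name>_in.<ext>" files with "<name>_gt.<ext>" files.

    Hypothetical helper, not part of unagi.
    """
    inputs: Dict[Tuple[str, str], str] = {}
    truths: Dict[Tuple[str, str], str] = {}
    for name in file_names:
        path = Path(name)
        # The key is (base name without suffix marker, extension).
        if path.stem.endswith("_in"):
            inputs[(path.stem[:-3], path.suffix)] = name
        elif path.stem.endswith("_gt"):
            truths[(path.stem[:-3], path.suffix)] = name
    # Keep only images that have both halves with the same extension.
    return [(inputs[key], truths[key]) for key in sorted(inputs) if key in truths]


pairs = pair_images(["a_in.png", "a_gt.png", "b_in.png", "b_gt.jpg"])
print(pairs)
```

In this sketch the "b" images are not paired because their extensions differ, which matches the requirement that both files share one extension.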

Tip

Consider saving the images in PNG format rather than JPG. JPG compression introduces gray-level pixels into binary images.

Images are cropped into smaller image parts based on the input size of the U-net model. The output folder contains two subfolders, in and gt. The in folder contains the input image parts and the gt folder contains the corresponding ground-truth image parts.

unagi.dataset.main(input_path: str = './input', output_path: str = './output', shuffle: bool = True, size_x: int = 128, size_y: int = 128, step_x: int = 128, step_y: int = 128, processes: int = 2) → None

Create train and ground-truth images suitable for unagi training.

Parameters:
    input_path (str, optional) – path to input images (default is os.path.join(".", "input")).
    output_path (str, optional) – path to created images (default is os.path.join(".", "output")).
    shuffle (bool, optional) – shuffle the newly created images (default is True).
    size_x (int, optional) – width of each image part (default is 128).
    size_y (int, optional) – height of each image part (default is 128).
    step_x (int, optional) – horizontal step between image parts; a step smaller than size_x makes parts overlap (default is 128).
    step_y (int, optional) – vertical step between image parts; a step smaller than size_y makes parts overlap (default is 128).
    processes (int, optional) – number of CPU cores to use (default is 2).
Returns:
Return type:	None

See also

split_img_overlay(), save_imgs()

Example

unagi.dataset.main('./input', './output', size_x=128, size_y=128, step_x=128, step_y=128)

unagi.dataset.save_imgs(imgs_in: List[numpy.ndarray], imgs_gt: List[numpy.ndarray], fname_in: str) → None

Save image parts to one folder.

Save all image parts to a folder named ‘(original image name) + _parts’.

Parameters:
    imgs_in (List[np.ndarray]) – list of input image arrays.
    imgs_gt (List[np.ndarray]) – list of ground-truth image arrays.
    fname_in (str) – file name of the original full image.
Returns:
Return type:	None

Example

unagi.dataset.save_imgs(in_img_list, gt_img_list, in_img)

unagi.dataset.shuffle_imgs(dname: str) → None

Shuffle input and ground-truth images.

(Useful if you are combining several datasets into one.)

Parameters:	dname (str) – name of the directory containing the images to shuffle
Returns:
Return type:	None

Example

unagi.dataset.shuffle_imgs(images_dir)

unagi.dataset.split_img_overlay(img: numpy.ndarray, size_x: int = 128, size_y: int = 128, step_x: int = 128, step_y: int = 128) → Tuple[List[numpy.ndarray], int, int]

Split an image into parts (smaller images) with optional overlap.

Parameters:
    img (np.ndarray) – input image array.
    size_x (int, optional) – width of each image part (default is 128).
    size_y (int, optional) – height of each image part (default is 128).
    step_x (int, optional) – horizontal step between image parts (default is 128).
    step_y (int, optional) – vertical step between image parts (default is 128).
Returns:
    list of image parts as numpy arrays
    border value along the width
    border value along the height
Return type:	Tuple[List[numpy.ndarray], int, int]

Note

The function walks through the whole image with a window of size size_x * size_y, moving by step_x and step_y, and collects all parts in a list. If the image dimensions are not multiples of the window dimensions, the image is padded with a border of suitable size. If step_x and step_y are smaller than size_x and size_y, the parts overlap; if they are larger, gaps are left between the parts.
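The window walk described in the note can be sketched without any image library by computing only the window coordinates. `tile_boxes` is a hypothetical helper, not the unagi implementation. It assumes that padding is added at the right and bottom edges, and its `pad_x` and `pad_y` values are only an illustration of what a border value might be; how unagi actually places the border is not stated here.

```python
import math
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def tile_boxes(width: int, height: int,
               size_x: int = 128, size_y: int = 128,
               step_x: int = 128, step_y: int = 128) -> Tuple[List[Box], int, int]:
    """Return window boxes plus the padding needed along width and height.

    Hypothetical sketch: padding is assumed to go on the right and bottom.
    """
    # Number of window positions so the last window reaches the image edge.
    nx = max(1, math.ceil((width - size_x) / step_x) + 1)
    ny = max(1, math.ceil((height - size_y) / step_y) + 1)
    # Extra pixels required so that the last window fits entirely.
    pad_x = (nx - 1) * step_x + size_x - width
    pad_y = (ny - 1) * step_y + size_y - height
    boxes = [
        (ix * step_x, iy * step_y, ix * step_x + size_x, iy * step_y + size_y)
        for iy in range(ny)
        for ix in range(nx)
    ]
    return boxes, pad_x, pad_y


boxes, pad_x, pad_y = tile_boxes(300, 200)
print(len(boxes), pad_x, pad_y)
```

With a step equal to the window size the boxes tile the padded image exactly; halving the step makes neighbouring boxes overlap by half a window.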

Example

unagi.dataset.split_img_overlay(img, 128, 128, 128, 128)

unagi.train

The train module trains the U-net model. The training dataset is split into train, validation, and test sets. Best-fitting weights are saved for each epoch, and model performance can be visualized on a set of images that are independent of the training set.
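A percentage-based split like the one described above can be sketched as follows. `split_dataset` is a hypothetical helper, not the unagi implementation; it assumes the three percentages must sum to 100 and that any rounding remainder goes to the test set.

```python
from typing import List, Sequence, Tuple


def split_dataset(names: Sequence[str],
                  train_split: int = 80,
                  val_split: int = 10,
                  test_split: int = 10) -> Tuple[List[str], List[str], List[str]]:
    """Split names into train, validation, and test lists by percentage.

    Hypothetical sketch, not part of unagi.
    """
    if train_split + val_split + test_split != 100:
        raise ValueError("split percentages must sum to 100")
    n_train = len(names) * train_split // 100
    n_val = len(names) * val_split // 100
    train = list(names[:n_train])
    val = list(names[n_train:n_train + n_val])
    # The test set takes whatever remains, which absorbs rounding.
    test = list(names[n_train + n_val:])
    return train, val, test


train, val, test = split_dataset([f"img_{i}" for i in range(10)])
print(len(train), len(val), len(test))
```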

A loss function can be selected from the available options, and the training data is augmented on the fly during training to make the model robust to distortions in the data.
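As an illustration of a Dice-style loss on binary masks, here is a minimal pure-Python sketch. `dice_loss_sketch` and its `smooth` term are assumptions made for this example; the exact formula used by unagi's loss functions may differ.

```python
from typing import Sequence


def dice_loss_sketch(y_true: Sequence[float],
                     y_pred: Sequence[float],
                     smooth: float = 1.0) -> float:
    """Return 1 - Dice coefficient for two flattened masks.

    Hypothetical sketch, not the unagi implementation. The smooth term
    avoids division by zero when both masks are empty.
    """
    intersection = sum(t * p for t, p in zip(y_true, y_pred))
    dice = (2.0 * intersection + smooth) / (sum(y_true) + sum(y_pred) + smooth)
    return 1.0 - dice


# Identical masks give a loss of zero; disjoint masks give a larger loss.
print(dice_loss_sketch([1, 1, 0, 0], [1, 1, 0, 0]))
print(dice_loss_sketch([1, 0], [0, 1]))
```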

class unagi.train.ParallelDataGenerator(fnames_in: List[str], fnames_gt: List[str], batch_size: int, augmentate: bool)

Bases: tensorflow.python.keras.utils.data_utils.Sequence

Generate images for training/validation/testing (parallel version).

Parameters:
    fnames_in (List[str]) – list of input image file names.
    fnames_gt (List[str]) – list of ground-truth image file names.
    batch_size (int) – number of images per generated batch.
    augmentate (bool) – whether to apply augmentation to each batch of images.

augmentate_batch(imgs_in: List[numpy.ndarray], imgs_gt: List[numpy.ndarray]) → Tuple[List[numpy.ndarray], List[numpy.ndarray]]

Generate ordered augmented batch of images, using Augmentor.

Parameters:
    imgs_in (List[numpy.ndarray]) – list of input images as arrays.
    imgs_gt (List[numpy.ndarray]) – list of ground-truth images as arrays.
Returns:
    list of input images after applying augmentation
    list of ground-truth images after applying augmentation
Return type:	Tuple[List[numpy.ndarray], List[numpy.ndarray]]

on_epoch_end()

Shuffle the images at the end of each epoch.
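The batch-generator pattern used by this class (a length, indexed batches, and a shuffle at the end of each epoch) can be sketched in plain Python. `PairedBatches` is a hypothetical stand-in that does not subclass the Keras Sequence class or load images; it only shows how input and ground-truth names might stay aligned while being shuffled.

```python
import math
import random
from typing import List, Optional, Tuple


class PairedBatches:
    """Minimal Sequence-style generator over (input, ground-truth) names.

    Hypothetical sketch, not part of unagi.
    """

    def __init__(self, fnames_in: List[str], fnames_gt: List[str],
                 batch_size: int, seed: Optional[int] = None) -> None:
        if len(fnames_in) != len(fnames_gt):
            raise ValueError("input and ground-truth lists differ in length")
        # Store names as pairs so shuffling can never separate them.
        self.pairs = list(zip(fnames_in, fnames_gt))
        self.batch_size = batch_size
        self._rng = random.Random(seed)

    def __len__(self) -> int:
        # Number of batches per epoch; the last batch may be smaller.
        return math.ceil(len(self.pairs) / self.batch_size)

    def __getitem__(self, idx: int) -> Tuple[List[str], List[str]]:
        chunk = self.pairs[idx * self.batch_size:(idx + 1) * self.batch_size]
        return [p[0] for p in chunk], [p[1] for p in chunk]

    def on_epoch_end(self) -> None:
        # Reorder the pairs so the next epoch sees a different batch order.
        self._rng.shuffle(self.pairs)


gen = PairedBatches(["a_in", "b_in", "c_in"], ["a_gt", "b_gt", "c_gt"], 2, seed=0)
gen.on_epoch_end()
print(len(gen), gen[0])
```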

unagi.train.main(input_path: str = './input', vis: str = './vis', debug: str = './train_logs', loss: Union[Callable[[Any, Any], float], str] = 'dice_coef_loss', epochs: int = 1, batchsize: int = 32, augmentate: bool = True, train_split: int = 80, val_split: int = 10, test_split: int = 10, weights_path: str = './bin_weights.hdf5', num_gpus: int = 1, extraprocesses: int = 0, queuesize: int = 10)

Train U-net with pairs of train and ground-truth images.

Parameters:
    input_path (str, optional) – input directory with in and gt subfolders to train on (default is os.path.join(".", "input")).
    vis (str, optional) – directory with images to use for training visualization (default is os.path.join(".", "vis")).
    debug (str, optional) – path where training logs are saved (default is os.path.join(".", "train_logs")).
    loss (str or function, optional) – loss function (default is dice_coef_loss, the Dice loss).
    epochs (int, optional) – number of epochs to train unagi (default is 1).
    batchsize (int, optional) – batch size to train unagi (default is 32).
    augmentate (bool, optional) – augment the original images for training unagi (default is True).
    train_split (int, optional) – train dataset split percentage (default is 80).
    val_split (int, optional) – validation dataset split percentage (default is 10).
    test_split (int, optional) – test dataset split percentage (default is 10).
    weights_path (str, optional) – path where the final weights are saved (default is os.path.join(".", "bin_weights.hdf5")).
    num_gpus (int, optional) – number of GPUs to use for training unagi (default is 1).
    extraprocesses (int, optional) – number of extra processes to use (default is 0).
    queuesize (int, optional) – number of batches to generate in the queue while training (default is 10).
Returns:
Return type:	None

Note

All training images should be in the “in” directory. All ground-truth images should be in the “gt” directory.

Example

unagi.train.main(input_dir, vis_dir, logs_dir, epochs=2, batchsize=4)