Modules

Unagi has three submodules: binarize, dataset and train. Each submodule has a main method. Together they can be used to create a dataset, train the model on the created data, and then binarize images using the saved model weights.

unagi.binarize

The binarize module loads a pretrained model saved during training and runs model prediction, i.e., binarization, on the given images.

unagi.binarize.main(input_path: str = './input', output_path: str = './output', weights_path: str | None = None, batchsize: int = 2) → List[ndarray]

Binarize images from input directory and write them to output directory.

Parameters:

input_path (str, optional) – input path for images (default is the input folder in the current directory).
output_path (str, optional) – output path to save images (default is the output folder in the current directory).
weights_path (str or None, optional) – path to the weights file. If None, the default weights are loaded from the package root directory.
batchsize (int, optional) – batch size to use in model prediction (default is 2).

Returns:

list of binary images in np.ndarray format

Return type:

List[numpy.ndarray]

Note

All input images should be PNG files, e.g. “sample_1.png”. All output image names end with “_bin”, e.g. “sample_1_bin.png”.
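
The naming rule in the note can be sketched as follows. The helper binarized_name is hypothetical and only mirrors the rule stated above; it is not part of the unagi API.

```python
from pathlib import Path


def binarized_name(input_name: str) -> str:
    """Return the output file name for a given input file name.

    Hypothetical helper: it mirrors the "_bin" naming rule from the note
    and is not taken from the unagi source code.
    """
    path = Path(input_name)
    # Keep the extension and append "_bin" to the file stem.
    return f"{path.stem}_bin{path.suffix}"


print(binarized_name("sample_1.png"))  # sample_1_bin.png
```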

Example

unagi.binarize.main('input_path', 'output_path', batchsize=2)

unagi.binarize.parse_args() → Namespace

Parse command-line arguments for binarize module.

unagi.dataset

The dataset module can be used to create the training dataset. It takes a folder containing input images and their corresponding ground-truth images. Each input image name should end with _in and each ground-truth image name should end with _gt. Input and ground-truth images should have the same file extension.
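
As an illustration of the naming convention, the sketch below pairs each _in file with its _gt counterpart. The function pair_images is hypothetical and assumes bare file names without directories; how unagi itself matches files is not shown in this reference.

```python
from pathlib import Path
from typing import List, Tuple


def pair_images(file_names: List[str]) -> List[Tuple[str, str]]:
    """Pair every '*_in' file with the '*_gt' file that shares its prefix
    and extension. Files without a matching partner are skipped."""
    available = set(file_names)
    pairs = []
    for name in sorted(file_names):
        path = Path(name)
        if not path.stem.endswith("_in"):
            continue
        # Swap the "_in" suffix for "_gt" and keep the same extension.
        base = path.stem[: -len("_in")]
        gt_name = f"{base}_gt{path.suffix}"
        if gt_name in available:
            pairs.append((name, gt_name))
    return pairs


files = ["page1_in.png", "page1_gt.png", "page2_in.png", "page2_gt.jpg"]
print(pair_images(files))  # [('page1_in.png', 'page1_gt.png')]
```

Here page2 is skipped because its two files have different extensions.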

Tip

Consider saving the images in PNG format rather than JPG. JPG compression is lossy, so a binary image saved as JPG will contain some gray-level pixels.

Images are cropped into smaller parts based on the input size of the U-net model. The output folder contains two subfolders, in and gt: the in folder holds the input image parts and the gt folder holds the corresponding ground-truth parts.

class unagi.dataset.ImageProcessor(size_x: int = 128, size_y: int = 128, step_x: int = 128, step_y: int = 128)

Bases: object

process_img(fname_in: str) → None

Read the input and ground-truth images, split them into parts, and save the parts.

Parameters:

fname_in (str) – input image name

Return type:

None
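
To make the roles of size_x, size_y, step_x and step_y concrete, here is a minimal sketch of how tile positions could be derived from them. The function tile_origins is hypothetical, and it assumes that partial tiles at the right and bottom edges are dropped; this reference does not say how ImageProcessor treats image edges.

```python
from typing import List, Tuple


def tile_origins(width: int, height: int,
                 size_x: int = 128, size_y: int = 128,
                 step_x: int = 128, step_y: int = 128) -> List[Tuple[int, int]]:
    """Top-left corners (x, y) of all full-size tiles that fit in the image.

    Assumption: tiles that would extend past the image border are dropped.
    """
    return [
        (x, y)
        for y in range(0, height - size_y + 1, step_y)
        for x in range(0, width - size_x + 1, step_x)
    ]


origins = tile_origins(width=384, height=256)
print(len(origins))  # 6
print(origins[:3])   # [(0, 0), (128, 0), (256, 0)]
```

With a step smaller than the tile size, neighbouring tiles overlap, which yields more training samples from the same image.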

unagi.train

The train module trains the U-net model. The dataset is split into train, validation and test sets for use in model training. The best-fitting weights are saved for each epoch, and model performance can be visualized on a few test images that are independent of the training set.

Loss functions can be selected from the available options, and the training data is augmented on the fly during training to make the model robust to distortions in the data.
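
For readers unfamiliar with the Dice loss, the sketch below shows the usual definition on flat sequences of pixel values in pure Python. It is a generic formulation, not the unagi implementation; the smoothing term smooth is an assumption commonly added to avoid division by zero.

```python
from typing import Sequence


def dice_coef(y_true: Sequence[float], y_pred: Sequence[float],
              smooth: float = 1.0) -> float:
    """Dice coefficient: 2 * |A ∩ B| / (|A| + |B|), with smoothing."""
    intersection = sum(t * p for t, p in zip(y_true, y_pred))
    return (2.0 * intersection + smooth) / (sum(y_true) + sum(y_pred) + smooth)


def dice_coef_loss(y_true: Sequence[float], y_pred: Sequence[float],
                   smooth: float = 1.0) -> float:
    """Loss is 1 - Dice, so a perfect overlap gives a loss of 0."""
    return 1.0 - dice_coef(y_true, y_pred, smooth)


print(dice_coef_loss([1, 0, 1, 1], [1, 0, 1, 1]))            # 0.0
print(round(dice_coef_loss([1, 1, 0, 0], [0, 0, 1, 1]), 3))  # 0.8
```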

class unagi.train.ParallelDataGenerator(fnames_in: List[str], fnames_gt: List[str], batch_size: int, augmentate: bool)

Bases: Sequence

Generate images for training/validation/testing (parallel version).

Parameters:

fnames_in (List[str]) – list of input images
fnames_gt (List[str]) – list of gt images
batch_size (int) – number of images in each generated batch
augmentate (bool) – whether to apply augmentation to each batch of images

augmentate_batch(imgs_in: List[ndarray], imgs_gt: List[ndarray]) → Tuple[List[ndarray], List[ndarray]]

Generate ordered augmented batch of images, using Augmentor.

Parameters:

imgs_in (List[numpy.ndarray]) – list of input images as arrays
imgs_gt (List[numpy.ndarray]) – list of gt images as arrays

Returns:

list of input images after applying augmentation, and list of gt images after applying augmentation

Return type:

Tuple[List[numpy.ndarray], List[numpy.ndarray]]

on_epoch_end() → None

Shuffles the images at the end of each epoch.
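
When shuffling, each input image must stay aligned with its ground-truth image. One way to do this is to shuffle a shared index list, as in the sketch below. The function shuffle_pairs and its seed parameter are hypothetical; the actual on_epoch_end implementation is not shown in this reference.

```python
import random
from typing import List, Optional, Tuple


def shuffle_pairs(fnames_in: List[str], fnames_gt: List[str],
                  seed: Optional[int] = None) -> Tuple[List[str], List[str]]:
    """Shuffle both lists with the same permutation so pairs stay aligned."""
    order = list(range(len(fnames_in)))
    random.Random(seed).shuffle(order)
    return [fnames_in[i] for i in order], [fnames_gt[i] for i in order]


ins = ["a_in.png", "b_in.png", "c_in.png"]
gts = ["a_gt.png", "b_gt.png", "c_gt.png"]
shuffled_in, shuffled_gt = shuffle_pairs(ins, gts, seed=0)
# Every input is still at the same position as its ground truth.
print(all(i.replace("_in", "_gt") == g
          for i, g in zip(shuffled_in, shuffled_gt)))  # True
```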

unagi.train.main(input_path: str = './input', vis: str = './vis', debug: str = './train_logs', loss: str = 'dice_coef_loss', epochs: int = 1, batchsize: int = 32, augmentate: bool = True, train_split: int = 80, val_split: int = 10, test_split: int = 10, weights_path: str = './bin_weights.hdf5', num_gpus: int = 1, extraprocesses: int = 0, queuesize: int = 10) → None

Train U-net with pairs of train and ground-truth images.

Parameters:

input_path (str, optional) – input dir with in and gt subfolders to train on (default is os.path.join(".", "input")).
vis (str, optional) – dir with images to use for training visualization (default is os.path.join(".", "vis")).
debug (str, optional) – path to save training logs (default is os.path.join(".", "train_logs")).
loss (str, optional) – loss function (default is dice_coef_loss - dice loss).
epochs (int, optional) – number of epochs to train unagi (default is 1).
batchsize (int, optional) – batchsize to train unagi (default is 32).
augmentate (bool, optional) – augment the original images for training unagi (default is True).
train_split (int, optional) – train dataset split percentage (default is 80).
val_split (int, optional) – validation dataset split percentage (default is 10).
test_split (int, optional) – test dataset split percentage (default is 10).
weights_path (str, optional) – path to save final weights (default is os.path.join(".", "bin_weights.hdf5")).
num_gpus (int, optional) – number of GPUs to use for training unagi (default is 1).
extraprocesses (int, optional) – number of extra processes to use (default is 0).
queuesize (int, optional) – number of batches to generate in queue while training (default is 10).

Return type:

None

Note

All training images should be in the “in” directory, and all ground-truth images in the “gt” directory.

Example

unagi.train.main(input, vis, logs_dir, epochs=2, batchsize=4)
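
The train_split, val_split and test_split percentages can be pictured with the sketch below. The function split_dataset is hypothetical. It assumes the three percentages sum to 100 and that rounding leftovers go to the test set; whether unagi enforces the same rules or shuffles before splitting is not stated here.

```python
from typing import List, Sequence, Tuple


def split_dataset(items: Sequence[str], train_split: int = 80,
                  val_split: int = 10, test_split: int = 10
                  ) -> Tuple[List[str], List[str], List[str]]:
    """Split items into train/validation/test parts by percentage."""
    if train_split + val_split + test_split != 100:
        raise ValueError("split percentages must sum to 100")
    n_train = len(items) * train_split // 100
    n_val = len(items) * val_split // 100
    train = list(items[:n_train])
    val = list(items[n_train:n_train + n_val])
    # The test set takes the remainder, which absorbs any rounding.
    test = list(items[n_train + n_val:])
    return train, val, test


names = [f"tile_{i}.png" for i in range(50)]
train, val, test = split_dataset(names)
print(len(train), len(val), len(test))  # 40 5 5
```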

unagi.train.parse_args() → Namespace

Parse command-line arguments for train module.

unagi.cli

Command-line interface for the unagi package. It can be used to create a dataset, train the model and binarize images.

$ unagi --help
usage: unagi [-h] [-v] {dataset,train,binarize} ...

command-line interface for Unagi package

optional arguments:
-h, --help            show this help message and exit
-v, --version         show package version and exit

available commands:
{dataset,train,binarize}
   dataset             Create dataset to train unagi model
   train               Train the unagi model
   binarize            Use the model weights to binarize images
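
A command layout like the one in the help output above can be built with argparse subparsers. The sketch below is an illustration only and is not the unagi.cli source; build_parser and PLACEHOLDER_VERSION are hypothetical names, and the real per-command options are omitted.

```python
import argparse

# Placeholder only: the real version string comes from the package.
PLACEHOLDER_VERSION = "0.0.0"


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with dataset, train and binarize subcommands."""
    parser = argparse.ArgumentParser(
        prog="unagi",
        description="command-line interface for Unagi package",
    )
    parser.add_argument(
        "-v", "--version", action="version",
        version=f"unagi {PLACEHOLDER_VERSION}",
        help="show package version and exit",
    )
    subparsers = parser.add_subparsers(dest="command",
                                       title="available commands")
    subparsers.add_parser("dataset", help="Create dataset to train unagi model")
    subparsers.add_parser("train", help="Train the unagi model")
    subparsers.add_parser("binarize",
                          help="Use the model weights to binarize images")
    return parser


args = build_parser().parse_args(["train"])
print(args.command)  # train
```

The selected subcommand is stored in args.command, which a main function can use to dispatch to the matching module.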

unagi.cli.get_version() → str

Get the version of the package to print in CLI.

Returns:

version of the package

Return type:

str

unagi.cli.main() → None

Main function for the CLI entry point.

Return type:

None