Master Keras’ ImageDataGenerator

Train deep learning models without worrying about memory

Pedro Vilares
Published in CodeX
7 min read · Jan 6, 2021

If you are a novice, eager data scientist looking to develop deep learning models for classification tasks using image data, you have come to the right place!

In this article, I will explain the easiest ways to properly train a deep learning model using the Keras API, more specifically the ImageDataGenerator class, and all the bits and bobs required to take advantage of your own data and data structure. This becomes extremely valuable when you work with so many images that holding them all in RAM just isn’t viable.

1. How is your data organized?

The data structure is very important to consider when training a deep learning model. Having a tailored and organized structure will certainly make your life easier, especially when using image data.

Depending on your image dataset, you might have loose images in a single folder, or a sub-folder for each class of your data, or even a sub-folder for each image, which is common when dealing with medical data, since each folder may represent a different patient. Whatever the case, Keras provides two main approaches to handle large image datasets.

Train and Validation sub-folders

Having your data separated into two different folders for model training and model validation is the most straightforward and natural type of organization. Inside the Train and Validation folders, there are more sub-folders, as many as the number of classes of your data, and finally, inside each class sub-folder, you can find the images. To make it all clearer, here is the layout with placeholder class names:

main_folder/
    Train/
        class_1/ (images)
        class_2/ (images)
    Validation/
        class_1/ (images)
        class_2/ (images)

There are two sub-folders inside the main folder, which in turn have sub-sub-folders corresponding to each class of the dataset. Some datasets already come organized this way. If yours doesn’t, the structure is relatively easy to build: define a training split (normally 0.8) and then distribute the images of each class accordingly (80% of your images will be used for training and 20% for validation).
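The split described above can be sketched with the standard library alone. This is a minimal sketch, not a definitive tool: the function name split_into_train_val, the seed and the folder names are all hypothetical, and it assumes the source folder already holds one sub-folder per class. The demo at the bottom uses empty placeholder files instead of real images.

```python
import random
import shutil
import tempfile
from pathlib import Path


def split_into_train_val(source_dir, dest_dir, train_split=0.8, seed=42):
    """Copy source_dir/<class>/* into dest_dir/Train/<class> and dest_dir/Validation/<class>."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    counts = {}
    for class_dir in sorted(Path(source_dir).iterdir()):
        if not class_dir.is_dir():
            continue
        images = sorted(p for p in class_dir.iterdir() if p.is_file())
        rng.shuffle(images)
        n_train = int(len(images) * train_split)
        splits = (("Train", images[:n_train]), ("Validation", images[n_train:]))
        for split_name, files in splits:
            target = Path(dest_dir) / split_name / class_dir.name
            target.mkdir(parents=True, exist_ok=True)
            for image_path in files:
                shutil.copy2(image_path, target / image_path.name)
        counts[class_dir.name] = (n_train, len(images) - n_train)
    return counts


# Throwaway demo: 10 empty placeholder "images" for each of two classes.
source = Path(tempfile.mkdtemp())
dest = Path(tempfile.mkdtemp())
for class_name in ("class_1", "class_2"):
    (source / class_name).mkdir()
    for i in range(10):
        (source / class_name / f"img_{i}.png").touch()

counts = split_into_train_val(source, dest, train_split=0.8)
print(counts)  # (training images, validation images) per class
```

Splitting each class separately keeps the class proportions roughly the same in both folders. The files are copied rather than moved, so the original dataset stays untouched.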

Having images organized by class and by training and validation folders is a very common organization scheme for image classification, and it is a go-to for large datasets.

Note: If your data is organized with a sub-folder for each image (e.g. medical data), you may want to consider the second approach, since moving every image out of its own folder can be a real pain.

Dataframe with Image Paths and corresponding Classes

If you are using an image dataset that comes organized in a particular manner and you are wondering how much of a hassle it is going to be to put together train and validation folders, worry not! There is an alternative that is just as functional.

Keras’ ImageDataGenerator allows for another approach that doesn’t require a training folder and validation folder with all the different classes. It requires, however, a dataframe with two columns: one containing the images’ full paths and the other containing the corresponding classes. This is particularly useful for datasets that provide a .csv file containing text/numeric features as well as image paths.

Example dataframe containing image paths and corresponding labels. Image by author

Note that you will need a dataframe for the training set as well as another one for the validation images!
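As a rough illustration of the two dataframes, the sketch below builds one dataframe of paths and labels and splits it 80/20 with Pandas. The column names image_path and label, the paths and the class names are all made up for the example; in practice the dataframe would more likely come from pd.read_csv() or from walking your own folders.

```python
import pandas as pd

# Hypothetical paths and labels, one sub-folder per patient.
df = pd.DataFrame(
    {
        "image_path": [f"/data/patient_{i:03d}/scan.png" for i in range(10)],
        "label": ["healthy", "sick"] * 5,
    }
)

# 80% of the rows for training, the remaining 20% for validation.
train_df = df.sample(frac=0.8, random_state=42)
val_df = df.drop(train_df.index)

# Reset the indices so each dataframe is numbered from 0.
train_df = train_df.reset_index(drop=True)
val_df = val_df.reset_index(drop=True)

print(len(train_df), len(val_df))
```

Note that the class labels are stored as strings, which is what the generators expect for classification.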

Now that your data is organized, you’re all set! Let’s start building those generators.

2. Building Image Generators

With the data in place, we can finally start importing the images and building batches of data.

The first step when building a generator is… you guessed it! Checking for dependencies. The main dependency is TensorFlow, plus Pillow, which Keras uses to load image files, and Pandas if you are using the dataframe approach. If you have not yet installed TensorFlow, you can install it through pip using the command prompt. Pandas comes pre-installed in some environments, such as Anaconda or Google Colab; otherwise, install it with pip as well.

After that, start your script/notebook with the necessary imports.
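A minimal set of imports could look like the following. It assumes a TensorFlow version in which ImageDataGenerator is still exposed under tensorflow.keras.preprocessing.image; the pip command in the comment is only a reminder of the installation step.

```python
# Installed beforehand from the command prompt, for example:
#   pip install tensorflow pandas pillow

import pandas as pd  # only needed for the dataframe approach
from tensorflow.keras.preprocessing.image import ImageDataGenerator
```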

Good! Now, regardless of your data structure, the next step is building an ImageDataGenerator object. According to the Keras documentation, it is possible to implement numerous data augmentation techniques, such as rotations, shifts, flips, zoom in/zoom out, etc., using the ImageDataGenerator object, as well as declaring a preprocessing function to be applied to each image. For now, we will build a simple ImageDataGenerator object. It is also possible to create more than one ImageDataGenerator object, if you intend to apply data augmentation techniques on your training set, but not on your validation or test sets, for example.

The generators themselves are created by calling one of two methods on the ImageDataGenerator object: flow_from_directory() for the sub-folder approach, or flow_from_dataframe() for the dataframe approach. Both accept many parameters, all listed in the Keras documentation. The most important ones to use in the flow_from_dataframe() method are:
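Here is a minimal sketch of both variants: a simple object that only rescales pixel values, and a second one with augmentation. The parameters are real ImageDataGenerator arguments, but the specific values are illustrative assumptions, not recommendations.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# A simple generator: only rescales pixel values from [0, 255] to [0, 1].
datagen = ImageDataGenerator(rescale=1.0 / 255)

# A separate object with augmentation, meant for the training set only.
augmented_datagen = ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=20,        # random rotations of up to 20 degrees
    width_shift_range=0.1,    # random horizontal shifts, up to 10% of the width
    height_shift_range=0.1,   # random vertical shifts, up to 10% of the height
    zoom_range=0.2,           # random zoom in/out of up to 20%
    horizontal_flip=True,     # random left-right flips
)
```

Creating the object does not load or transform anything yet; the settings are only applied when batches are requested.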

  • dataframe = a Pandas dataframe containing image paths and classes
  • directory = if the paths declared in the dataframe aren’t absolute paths, the directory where images are stored should be declared here.
  • x_col = column in the dataframe that contains image paths
  • y_col = column in the dataframe that contains image classes

In the flow_from_directory() method, there is only one specific argument:

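  • directory = the path of the folder that directly contains one sub-folder per class, i.e. the Train folder for the training generator and the Validation folder for the validation generator. Pointing it at the main folder would make Keras treat Train and Validation as the two classes.

Nevertheless, it is important to define some extra arguments that both methods share: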

  • target_size = a (height, width) tuple with the dimensions you want your images to be resized to. The default is (256,256), regardless of the size of your images, so I always declare the target dimensions.
  • color_mode = a string declaring the number of color channels. It defaults to ‘rgb’, so if you want grayscale or RGBA images, declare it to ‘grayscale’ or ‘rgba’.
  • class_mode = a string defining the type of classification your model does. If the output layer of your model has only one node and a sigmoid activation, use ‘binary’. If it has as many nodes as there are classes and a softmax activation, use ‘categorical’, which is the default.
  • batch_size = an integer that defines the number of images per batch. The default is 32.

If you have organized your image data into Train and Validation folders, use the flow_from_directory() method, calling it once for each folder.

Whereas if your data is organized according to a dataframe containing image paths and classes, you should use the flow_from_dataframe() method, calling it once for each dataframe.

Fantastic! It is actually amazing to think that you can take advantage of large image datasets without requiring a lot of memory, due to the peculiar design of the generators we’ve just created:
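The sketch below shows one way the folder-based generators might be created. So that it runs end to end, it first writes a tiny throwaway dataset of random-noise images; the class names cats and dogs, the image sizes and the batch size are all assumptions for the example. With real data, only the two flow_from_directory() calls are needed.

```python
import tempfile
from pathlib import Path

import numpy as np
from PIL import Image
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Throwaway dataset of random-noise images (replace `root` with your main folder).
root = Path(tempfile.mkdtemp())
rng = np.random.default_rng(0)
for split, n_images in (("Train", 8), ("Validation", 2)):
    for class_name in ("cats", "dogs"):
        class_dir = root / split / class_name
        class_dir.mkdir(parents=True)
        for i in range(n_images):
            pixels = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
            Image.fromarray(pixels).save(class_dir / f"{i}.png")

datagen = ImageDataGenerator(rescale=1.0 / 255)

# One generator per folder; `directory` is the folder holding the class sub-folders.
train_generator = datagen.flow_from_directory(
    directory=str(root / "Train"),
    target_size=(32, 32),
    color_mode="rgb",
    class_mode="binary",
    batch_size=4,
)
val_generator = datagen.flow_from_directory(
    directory=str(root / "Validation"),
    target_size=(32, 32),
    color_mode="rgb",
    class_mode="binary",
    batch_size=4,
    shuffle=False,  # keep a fixed order for validation
)

print(train_generator.class_indices)  # mapping from class name to integer label
```

The class names are taken from the sub-folder names and numbered in alphanumeric order.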
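For the dataframe approach, a comparable sketch follows. It also fabricates a small throwaway dataset, this time with one sub-folder per image; the column names image_path and label and the class names are hypothetical. Because the paths in the dataframe are absolute, the directory argument is left out.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd
from PIL import Image
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Throwaway random-noise images, one sub-folder per image.
image_dir = Path(tempfile.mkdtemp())
rng = np.random.default_rng(0)
rows = []
for i in range(10):
    folder = image_dir / f"patient_{i:03d}"
    folder.mkdir()
    path = folder / "scan.png"
    pixels = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
    Image.fromarray(pixels).save(path)
    rows.append({"image_path": str(path), "label": "sick" if i % 2 else "healthy"})

df = pd.DataFrame(rows)
train_df, val_df = df.iloc[:8], df.iloc[8:]  # one dataframe per set

datagen = ImageDataGenerator(rescale=1.0 / 255)

train_generator = datagen.flow_from_dataframe(
    dataframe=train_df,
    x_col="image_path",   # column with the image paths
    y_col="label",        # column with the classes (as strings)
    target_size=(32, 32),
    color_mode="rgb",
    class_mode="binary",
    batch_size=4,
)
val_generator = datagen.flow_from_dataframe(
    dataframe=val_df,
    x_col="image_path",
    y_col="label",
    target_size=(32, 32),
    color_mode="rgb",
    class_mode="binary",
    batch_size=4,
    shuffle=False,  # keep a fixed order for validation
)
```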

Even though they batch as many images as we want, they take little to no memory to create, because the batches are only generated when you try to access the generators: during the training phase, or when you call for a specific batch.

The first time I used image generators, I struggled to comprehend their structure, so I delved deep into them. Inspecting a batch is also worth doing simply to confirm that the generated images have the desired characteristics. A generated batch can be accessed by indexing the generator, for example generator[0] for the first batch.

Each batch is represented by a tuple of size 2, where the first item is an array containing the images and the second item is another array that holds the labels. So, if you want to check that ImageDataGenerator correctly picked up your images, I recommend plotting some of the generated images, for example the first image of the first batch.
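The following sketch indexes a generator and plots the first image of the first batch. It assumes matplotlib is installed and again uses a throwaway dataset of random-noise images with made-up class names, so the plotted image is just noise. The figure is saved to a file here; in a notebook you would typically call plt.show() instead.

```python
import tempfile
from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Throwaway dataset: 4 random-noise images for each of two classes.
root = Path(tempfile.mkdtemp())
rng = np.random.default_rng(0)
for class_name in ("cats", "dogs"):
    class_dir = root / "Train" / class_name
    class_dir.mkdir(parents=True)
    for i in range(4):
        pixels = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
        Image.fromarray(pixels).save(class_dir / f"{i}.png")

train_generator = ImageDataGenerator(rescale=1.0 / 255).flow_from_directory(
    directory=str(root / "Train"),
    target_size=(32, 32),
    class_mode="binary",
    batch_size=4,
)

print(len(train_generator))        # number of batches per epoch
images, labels = train_generator[0]  # the first batch is only built now
print(images.shape, labels.shape)  # (batch, height, width, channels) and (batch,)

# Plot the first image of the first batch together with its label.
plot_path = root / "first_image.png"
plt.imshow(images[0])
plt.title(f"label: {int(labels[0])}")
plt.axis("off")
plt.savefig(plot_path)
```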

3. Time to train!

Photo by Victor Freitas on Unsplash

Data ready, time to train the model! It is pretty easy to train a deep learning model using Keras and image generators. Assuming you have built a deep learning model already, you just have to compile it and train it. In the model.fit() method, you won’t need to declare the y and batch_size parameters, since these are already included in the generator objects. So, after compiling your architecture, a call to fit() only needs the training generator, the validation generator (passed as validation_data) and the number of epochs.

And that’s it! You can now train a model using as many images as you want without worrying about memory or input shapes, and all in a day’s work!
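As an illustration, the sketch below compiles and fits a deliberately tiny model on generators built from a throwaway random-noise dataset. The architecture, optimizer, class names and number of epochs are placeholder assumptions, so the resulting accuracy is meaningless; the point is only the shape of the compile() and fit() calls.

```python
import tempfile
from pathlib import Path

import numpy as np
from PIL import Image
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Throwaway dataset of random-noise images.
root = Path(tempfile.mkdtemp())
rng = np.random.default_rng(0)
for split, n_images in (("Train", 8), ("Validation", 2)):
    for class_name in ("cats", "dogs"):
        class_dir = root / split / class_name
        class_dir.mkdir(parents=True)
        for i in range(n_images):
            pixels = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
            Image.fromarray(pixels).save(class_dir / f"{i}.png")

datagen = ImageDataGenerator(rescale=1.0 / 255)
common = dict(target_size=(32, 32), class_mode="binary", batch_size=4)
train_generator = datagen.flow_from_directory(str(root / "Train"), **common)
val_generator = datagen.flow_from_directory(
    str(root / "Validation"), shuffle=False, **common
)

# Placeholder architecture: one node with sigmoid output, matching class_mode="binary".
model = models.Sequential(
    [
        layers.Input(shape=(32, 32, 3)),
        layers.Conv2D(8, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(1, activation="sigmoid"),
    ]
)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# No y and no batch_size: the generators already provide both.
history = model.fit(train_generator, validation_data=val_generator, epochs=2)
```

The input shape of the model has to match the target_size and color_mode given to the generators, here 32 x 32 pixels with 3 channels.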

I hope you understood the fundamentals of working with a large number of images and the potential of Keras’ image generators!

For any questions, feel free to leave a comment!

Have a great one! 😄

Pedro Vilares is a writer for CodeX, a master’s degree student and a deep learning enthusiast.