How to Build and Run a Neural Network Model on Edge Devices?

Mikal Shrestha
8 min read · Jan 11, 2023


Overall flow for running a NN model on Android

Hey folks, welcome to this story. In this story I will show you how to implement a neural network model and run it on edge mobile devices. For offline or real-time machine learning on edge devices, we can deploy a machine learning or deep learning model directly onto the device rather than sending data to a server for processing. This reduces latency and gives near-instant results. I will explain by walking through one example: image classification.

I think all of us know the usual approach: deploy the model behind an API that the edge devices call over the network. But can we deploy the model directly onto the edge devices and achieve better performance for some applications? The answer is yes. This is where TensorFlow Lite comes into the picture.

For those who don’t know, TensorFlow is an open-source framework developed by Google to build and run machine learning, deep learning, and other statistical and predictive analytics models. TensorFlow Lite is an open-source deep learning framework designed to run machine learning models on mobile and embedded devices. It allows developers to deploy models on devices with limited computational resources, such as smartphones and IoT devices.

Dataset

The Fruits 360 dataset is used in this example to train and test the model. It contains 90,380 images covering 131 classes of fruits and vegetables and is publicly available on Kaggle. You can check out more details on the Kaggle dataset page:
https://www.kaggle.com/datasets/moltean/fruits

Building and Training CNN Model

Let’s understand the Convolutional Neural Network (CNN) first. It is a network architecture for deep learning, used especially for image recognition and other tasks that process pixel data. Some of the terminology used in CNNs:

- Kernel (also filter or feature detector): used to extract features from the images.
- Stride: a parameter of the filter that controls how far it moves across the image at each step.
- Padding: the number of pixels added around an image before it is processed by the kernel.
- Pooling: a technique for down-sampling and generalizing the features extracted by convolutional filters, helping the network recognize features independent of their location in the image.
- Flatten: converts the resulting 2-dimensional pooled feature maps into a single long linear vector, which is fed as input to the fully connected layer that classifies the image.
The typical CNN model architecture:
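A minimal Keras sketch of that typical stack (illustrative layer sizes only, not the model we train below):

import tensorflow as tf

# convolution + pooling blocks extract features; flatten + dense layers classify
cnn = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, kernel_size=3, activation='relu',
                           input_shape=(100, 100, 3)),  # kernels/filters
    tf.keras.layers.MaxPooling2D(pool_size=2),          # pooling
    tf.keras.layers.Conv2D(64, kernel_size=3, activation='relu'),
    tf.keras.layers.MaxPooling2D(pool_size=2),
    tf.keras.layers.Flatten(),                          # flatten
    tf.keras.layers.Dense(128, activation='relu'),      # fully connected
    tf.keras.layers.Dense(131, activation='softmax'),   # one output per class
])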

There are other types of neural networks in deep learning, but for identifying and recognizing objects, CNNs are the architecture of choice. Ohh, that’s a lot about CNNs!

Let’s jump into the implementation part. Here I use the TensorFlow and Keras libraries. For the coding environment you can use Google Colab or a Jupyter notebook. Let’s create a new notebook in Google Colab and change the runtime to GPU or TPU.

The first step is to download the dataset from Kaggle. Here I use the Keras utils helper to download the dataset from the link, extract the zip, and store it. After extraction there will be two directories: one is Training and the other is Test. Images under the Training directory are used for model training, and images under the Test directory are used for model testing.

import os
import tensorflow as tf

# download the dataset from the Kaggle link and extract it
# (this is a time-limited signed URL; if it has expired, regenerate it from the dataset page)
data = tf.keras.utils.get_file(origin="https://storage.googleapis.com/kaggle-data-sets/5857/414958/bundle/archive.zip?X-Goog-Algorithm=GOOG4-RSA-SHA256&X-Goog-Credential=gcp-kaggle-com%40kaggle-161607.iam.gserviceaccount.com%2F20230110%2Fauto%2Fstorage%2Fgoog4_request&X-Goog-Date=20230110T001704Z&X-Goog-Expires=259200&X-Goog-SignedHeaders=host&X-Goog-Signature=773ba1adaf07ebbf2be894c4997884c7f05562b27afe4ba8b1f1336c35ebcc5b0a2beddca8712ac029124daa606c1590c305bcd0b4ef1abfa5b836e77a5d7f62a33dac1478b6eae61e5371e9d6d46e8f643418813d76b54333ee6a1300328d3fbc57a7f3edf974ffb89d35264d04103e89bc971d935334f5a1c1e0ea8d34ac244e7e64fe775bc9a03713e8d0d7583d9b582269751263e014d26094082159a3409a50eadf13f26a4f6d0f578579b2ce41f98d4d3f9e85d72d8c1c5c551c96cab5809bc4e650eb85cdd7295116f03b57d854eaa6294d52ee20a1d50c8ef8ffa318e248f94626e94bf29a571f3d1609fef550df0021992f37ac3fe5cbf78b3d7bd8",
                               fname="fruits-360.zip", extract=True)
base_dir, _ = os.path.splitext(data)

train_dir = os.path.join(base_dir, 'Training')
test_dir = os.path.join(base_dir, 'Test')

Now set up generators for the train and test sets. Keras ImageDataGenerator takes the original data, optionally transforms it at random, and returns batches containing the resulting data. The flow_from_directory() method of the ImageDataGenerator class takes the image directory, a batch size, and a target size. Batch size is the number of samples processed before the model is updated, and target size is the size the input images are resized to. flow_from_directory() lets you read images directly from the directory and augment them while the neural network is learning on the training data.

image_size = 128  # assumed value, matching the MobileNet variant discussed below
batch_size = 32   # assumed value; the original article does not state it

train_datagen = tf.keras.preprocessing.image.ImageDataGenerator()
train_generator = train_datagen.flow_from_directory(directory=train_dir,
                                                    target_size=(image_size, image_size),
                                                    batch_size=batch_size)

test_datagen = tf.keras.preprocessing.image.ImageDataGenerator()
test_generator = test_datagen.flow_from_directory(directory=test_dir,
                                                  target_size=(image_size, image_size),
                                                  batch_size=batch_size)

After setting up the train and test generators, we apply a transfer learning approach before training on the custom dataset. Transfer learning is a machine learning technique that stores knowledge gained while solving one problem and applies it to a different but related problem; basically, it is the reuse of a pre-trained model on a new problem. Here I use the MobileNet model for transfer learning.
MobileNet is a deep learning model designed to run efficiently on mobile devices. It was developed by Google and introduced in a 2017 research paper. The model is based on depthwise separable convolutions, which factorize a standard convolution into a depthwise convolution (which acts on each input channel separately) followed by a pointwise convolution (which mixes the outputs of the depthwise convolution). This allows a significant reduction in the number of parameters and the computational complexity of the model, making it well-suited for deployment on mobile devices with limited resources. MobileNet also uses other techniques, such as smaller filter sizes and fewer layers, to further reduce its computational requirements.
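To see why this saves parameters, here is a hedged Keras sketch (illustrative shapes, not MobileNet’s actual code) comparing a standard convolution to its depthwise separable equivalent for a 64-channel input:

import tensorflow as tf

# standard 3x3 convolution, 64 in / 128 out channels:
# 3*3*64*128 = 73,728 weights
standard = tf.keras.layers.Conv2D(128, kernel_size=3, padding='same')

# depthwise separable equivalent:
# depthwise 3*3*64 = 576 weights + pointwise 1*1*64*128 = 8,192 weights,
# roughly 8x fewer parameters for a similar receptive field
separable = tf.keras.Sequential([
    tf.keras.layers.DepthwiseConv2D(kernel_size=3, padding='same'),
    tf.keras.layers.Conv2D(128, kernel_size=1),  # pointwise (1x1) convolution
])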
The trainable property of the pre-trained model is set to False. This means its weights will not change during training; they are just used as-is. We use the version of MobileNet trained on 128×128 images because our input images are 100×100, which is nearest to 128×128.

# Transfer learning: use the pre-trained MobileNet model as a frozen feature extractor
IMG_SHAPE = (image_size, image_size, 3)
pre_trained_model = tf.keras.applications.MobileNet(input_shape=IMG_SHAPE, include_top=False)
pre_trained_model.trainable = False

That alone is not enough to classify our custom image dataset; we have to build our own model on top of the pre-trained MobileNet. The Sequential class is used to stack layers that fit our dataset: it accepts a list of layers, and the base model is the first item in that list. This means the new model includes all of MobileNet’s layers except its top (classification) layers. On top of the base, two new layers are added: a global average pooling layer and a dense layer. A dense layer is a simple layer of neurons in which each neuron receives input from all the neurons of the previous layer. Because our dataset here has 102 classes, the dense layer has 102 neurons. The compile() method is called on the new model returned by Sequential; it accepts the optimizer, loss function, and metrics. The model is trained for 50 epochs.

import numpy as np

# Create our own custom model on top of the frozen MobileNet base
model = tf.keras.Sequential([
    pre_trained_model,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(train_generator.num_classes,  # 102 in this run
                          activation='softmax')
])
model.summary()

model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

epochs = 50
steps_per_epoch = int(np.ceil(train_generator.n / batch_size))
test_steps = int(np.ceil(test_generator.n / batch_size))

# model.fit accepts generators directly in TF 2.x (fit_generator is deprecated)
model.fit(train_generator,
          steps_per_epoch=steps_per_epoch,
          epochs=epochs,
          validation_data=test_generator,
          validation_steps=test_steps)

The softmax activation function is applied to the output layer. It transforms the raw outputs of the neural network into a vector of probabilities, essentially a probability distribution over the classes, so the neuron with the highest value corresponds to the predicted class. The Adam optimizer is used to minimize the loss.
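As a concrete illustration (separate from the training code above), softmax turns raw scores into probabilities like this:

import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])  # raw model outputs (logits)
print(softmax(scores))              # ~[0.659, 0.242, 0.099], sums to 1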

You can also check out the full snippet of code:

Summary of model training epoch by epoch:

Converting Trained Model to TensorFlow Lite Format

After training, we need to save the model and convert it to TensorFlow Lite format using TFLiteConverter. Once the conversion completes, a model.tflite file is generated. Then we generate the labels for the respective training classes and save them as labels.txt.
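A minimal sketch of that conversion using the TF 2.x TFLiteConverter API; deriving the labels from the training generator’s class indices is an assumption on my part:

# convert the trained Keras model to TensorFlow Lite format
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)

# write one class label per line, ordered by the generator's class indices
labels = sorted(train_generator.class_indices, key=train_generator.class_indices.get)
with open('labels.txt', 'w') as f:
    f.write('\n'.join(labels))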

That’s it on the Google Colab side: we have the trained .tflite model and the image labels. Now let’s build the Android app and make use of that model.

Building an Android Application

For developing the Android app, the Kotlin programming language is used, with Android Studio as the IDE. First create a new project in Android Studio by filling in the project name, application ID, and other required fields. Then do the following steps.

  1. In the app/src/main location, create a new directory named “assets”.
  2. Add the model.tflite and labels.txt files into the assets folder.
  3. In the app-level build.gradle file, add the following dependencies to leverage TensorFlow Lite in Android (replace 2.x.x with a concrete release version):
implementation 'org.tensorflow:tensorflow-lite:2.x.x'
implementation 'org.tensorflow:tensorflow-lite-gpu:2.x.x'

4. Create a MainActivity.kt class and, in onCreate(), initialize the TensorFlow Lite interpreter by loading the model and labels from the assets folder.

val model = loadModelFile(assetManager, "model.tflite")
val labels = loadLines(context, "labels.txt")
val options = Interpreter.Options()
gpuDelegate = GpuDelegate()
options.addDelegate(gpuDelegate) // run inference on the GPU where available
val interpreter = Interpreter(model, options)
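loadModelFile and loadLines are helpers not shown in this story; a common implementation following the usual TensorFlow Lite Android pattern (the signatures here are assumptions matching the calls above) might look like:

import android.content.Context
import android.content.res.AssetManager
import java.io.FileInputStream
import java.nio.MappedByteBuffer
import java.nio.channels.FileChannel

// memory-map the model from assets so the interpreter can read it directly
fun loadModelFile(assetManager: AssetManager, filename: String): MappedByteBuffer {
    val fileDescriptor = assetManager.openFd(filename)
    FileInputStream(fileDescriptor.fileDescriptor).use { stream ->
        return stream.channel.map(
            FileChannel.MapMode.READ_ONLY,
            fileDescriptor.startOffset,
            fileDescriptor.declaredLength
        )
    }
}

// read the label file into a list, one label per line
fun loadLines(context: Context, filename: String): List<String> =
    context.assets.open(filename).bufferedReader().use { it.readLines() }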

5. After the camera captures an image, convert the bitmap into a ByteBuffer. The run() function of the Interpreter class takes the input ByteBuffer and an output array, and runs inference when the model has a single input and a single output. The index of the highest value in the output array maps to the label name of the image detected by the model, which is then populated into the view.

val byteBuffer = convertBitmapToByteBuffer(resizedImage)
val output = Array(1) { FloatArray(labels.size) }
val startTime = SystemClock.uptimeMillis()
interpreter?.run(byteBuffer, output)
val inferenceTime = SystemClock.uptimeMillis() - startTime
// pick the label with the highest probability
val best = output[0].withIndex().maxByOrNull { it.value }?.index ?: 0
val predictedLabel = labels[best]
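convertBitmapToByteBuffer is another helper the story doesn’t show; a sketch under the assumption that the model expects 128×128 RGB float input (the size and scaling must match the preprocessing used in training):

import android.graphics.Bitmap
import java.nio.ByteBuffer
import java.nio.ByteOrder

// pack a bitmap's pixels into a direct float ByteBuffer for the interpreter;
// inputSize and the [0, 1] scaling below are assumptions
fun convertBitmapToByteBuffer(bitmap: Bitmap, inputSize: Int = 128): ByteBuffer {
    val buffer = ByteBuffer.allocateDirect(4 * inputSize * inputSize * 3)
    buffer.order(ByteOrder.nativeOrder())
    val pixels = IntArray(inputSize * inputSize)
    bitmap.getPixels(pixels, 0, inputSize, 0, 0, inputSize, inputSize)
    for (pixel in pixels) {
        // extract R, G, B channels and scale each to [0, 1]
        buffer.putFloat(((pixel shr 16) and 0xFF) / 255.0f)
        buffer.putFloat(((pixel shr 8) and 0xFF) / 255.0f)
        buffer.putFloat((pixel and 0xFF) / 255.0f)
    }
    return buffer
}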
Android processing flow

Output

Tested on a real Android device, the app classifies images with pretty good accuracy. But this model can be optimized a lot more. You can also try it on your own custom dataset.
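If you want to shrink the model further, one common option (a sketch, not something this story itself did) is post-training dynamic-range quantization during conversion:

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # store weights as 8-bit
quantized_model = converter.convert()
with open('model_quant.tflite', 'wb') as f:
    f.write(quantized_model)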

That’s it, folks. Thanks for reading this story 📗. If you found it helpful, hit the clap button 😄, and if you are stuck on any point, feel free to ask below. Keep learning!!
