Deploying AI Models on the Edge Using OpenVINO

May 24, 2022

In this tutorial, we walk through deploying AI models on edge devices such as the Raspberry Pi, Intel UP Squared board, and Jetson Nano. We will train a simple binary image classification model using the Keras API on TensorFlow 2.0 and deploy it using OpenVINO and OpenCV.

Conversion of File Formats of Deep Learning Neural Network

Step 1: Go to Teachable Machine and train an image classification model

Step 2: Download the Keras model file (.h5 file type) from Teachable Machine

Downloading the model containing architecture and weights

Step 3: Convert the Keras model (.h5 file type) to a TensorFlow model (.pb graph file type) using the h5-to-pb model converter notebook (use Google Colab)

# Convert .h5 to a frozen .pb graph (the format accepted by OpenCV's TensorFlow importer)

# Upload the Keras model (.h5 file) to the Colab session first
import tensorflow as tf
from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2

model = tf.keras.models.load_model('keras_model.h5')

full_model = tf.function(lambda x: model(x))
full_model = full_model.get_concrete_function(
    tf.TensorSpec(model.inputs[0].shape, model.inputs[0].dtype))

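# Get frozen ConcreteFunction (variables are folded into constants)
frozen_func = convert_variables_to_constants_v2(full_model)

# Optional: list the operation names to locate the graph's input and output nodes
layers = [op.name for op in frozen_func.graph.get_operations()]
print(layers)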

# Save frozen graph from frozen ConcreteFunction to hard drive
tf.io.write_graph(graph_or_graph_def=frozen_func.graph,
                  logdir="./frozen_models",
                  name="tf_final.pb",
                  as_text=False)

A model saved in Keras format cannot be loaded directly by the OpenCV DNN module. It must first be converted into a TensorFlow graph, which the OpenCV DNN module accepts.

pb stands for protobuf. In TensorFlow, a frozen protobuf file contains the graph definition as well as the weights of the model. Thus, a pb file is all you need to be able to run a given trained model.
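One practical detail: the conversion cell above writes the graph to ./frozen_models/tf_final.pb, while the later snippets load 'tf_final.pb' from the working directory. The sketch below looks in both places and fails early with a readable message. The helper name find_model_file and its default search directories are assumptions made for illustration; they are not part of the original project.

```python
from pathlib import Path


def find_model_file(name="tf_final.pb", search_dirs=(".", "./frozen_models")):
    """Return the first non-empty file called `name` found in `search_dirs`.

    Hypothetical helper: the default directories mirror the paths used in
    this tutorial and should be adjusted to your own layout.
    """
    for directory in search_dirs:
        candidate = Path(directory) / name
        # Skip missing files and zero-byte files left by a failed export
        if candidate.is_file() and candidate.stat().st_size > 0:
            return candidate
    searched = ", ".join(str(d) for d in search_dirs)
    raise FileNotFoundError(f"{name} not found (or empty) in: {searched}")
```

The returned path can then be handed to OpenCV as a string, for example cv2.dnn.readNetFromTensorflow(str(find_model_file())).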

Step 4: Use the tf-cv.py file to run the model with the OpenCV DNN module

The OpenCV DNN module is used for inference with deep learning models trained in heavier frameworks such as Keras, TensorFlow, and PyTorch. For CPU inference, OpenCV is often faster than running the original framework, although the gain depends on the model and the hardware.

OpenCV requires two things for inference:

Model Weights
Model Architecture

Both are now available in the converted .pb file.

The model can either be loaded in its native format using cv2.dnn.readNetFromTensorflow('tf_final.pb') or converted into the required .xml and .bin files using the Model Optimizer.

Code with native TensorFlow format

import numpy as np
import cv2

net = cv2.dnn.readNetFromTensorflow('tf_final.pb')
label = ['Category1', 'Category2']

# Uncomment to run through OpenVINO's Inference Engine backend
# net.setPreferableBackend(cv2.dnn.DNN_BACKEND_INFERENCE_ENGINE)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)

vid_cap = cv2.VideoCapture(0)
if not vid_cap.isOpened():
    raise IOError("Webcam cannot be opened!")

# Network input size
inWidth = 224
inHeight = 224

while True:
    # Capture frames
    ret, frame = vid_cap.read()
    if not ret:
        break  # Stop if no frame could be read

    # Resize the frame and scale pixel values to [0, 1]
    blob = cv2.dnn.blobFromImage(frame,
                                 scalefactor=1.0 / 255,
                                 size=(inWidth, inHeight),
                                 mean=(0, 0, 0),
                                 swapRB=False,
                                 crop=False)
    net.setInput(blob)
    out = net.forward()
    out = out.flatten()
    classId = np.argmax(out)
    confidence = np.round(out[classId]*100,2)
    op = f'{label[classId]} - {confidence}%'
    print(op)

    cv2.putText(frame, op, (30,50), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 0, 255), 2, cv2.LINE_AA)
    # Overlay the inference time reported by OpenCV
    t, _ = net.getPerfProfile()
    l = 'Inference time: %.2f ms' % (t * 1000.0 / cv2.getTickFrequency())
    cv2.putText(frame, l, (0, 15), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0))
    cv2.imshow('frame', frame)
    if cv2.waitKey(1) == 27:  # Esc key
        break

# Release video capture object and close the window
vid_cap.release()
cv2.destroyAllWindows()
cv2.waitKey(1)
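The post-processing in the loop above (pick the highest score, turn it into a percentage, convert ticks to milliseconds) can be isolated into small functions that are easy to check without a webcam. This is a minimal sketch in plain Python: the names top_prediction and ticks_to_ms are hypothetical, and it assumes the network ends in a softmax so the scores are probabilities in [0, 1].

```python
def top_prediction(scores, labels):
    """Return (label, confidence_percent) for the highest score.

    Assumes `scores` are probabilities in [0, 1], one per label.
    """
    if len(scores) != len(labels):
        raise ValueError("expected exactly one score per label")
    # Index of the largest score, equivalent to np.argmax on a flat array
    class_id = max(range(len(scores)), key=lambda i: scores[i])
    return labels[class_id], round(float(scores[class_id]) * 100, 2)


def ticks_to_ms(ticks, tick_frequency):
    """Convert a tick count to milliseconds (ticks per second = tick_frequency)."""
    return ticks * 1000.0 / tick_frequency


print(top_prediction([0.12, 0.88], ['Category1', 'Category2']))
# ('Category2', 88.0)
```

In the loop, top_prediction(out.flatten().tolist(), label) would replace the argmax and rounding lines, and ticks_to_ms(t, cv2.getTickFrequency()) would replace the inline timing expression.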

OpenVINO optimizes running the model on specific hardware through Inference Engine plugins. Plugins are available for Intel hardware (CPUs, GPUs, VPUs, FPGAs). You can change the target by specifying CPU, MYRIAD, etc.

net.setPreferableBackend(cv2.dnn.DNN_BACKEND_INFERENCE_ENGINE)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)
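The two calls above can be wrapped so that the device is chosen by name. In this minimal sketch, select_target and TARGET_ATTRS are hypothetical names. The cv2.dnn constants it looks up are real OpenCV constants, but whether a given target actually works is an assumption that depends on your OpenCV build having Inference Engine support and on the hardware attached.

```python
# Map a short device name to the name of the matching cv2.dnn target constant.
TARGET_ATTRS = {
    "CPU": "DNN_TARGET_CPU",
    "GPU": "DNN_TARGET_OPENCL",
    "GPU_FP16": "DNN_TARGET_OPENCL_FP16",
    "MYRIAD": "DNN_TARGET_MYRIAD",
}


def select_target(net, dnn, device):
    """Route `net` through the Inference Engine backend on the named device.

    `dnn` is the cv2.dnn module, passed in so this helper does not import cv2.
    """
    key = device.upper()
    if key not in TARGET_ATTRS:
        raise ValueError(f"unknown device {device!r}; expected one of {sorted(TARGET_ATTRS)}")
    net.setPreferableBackend(dnn.DNN_BACKEND_INFERENCE_ENGINE)
    net.setPreferableTarget(getattr(dnn, TARGET_ATTRS[key]))
```

With the network loaded as before, select_target(net, cv2.dnn, "MYRIAD") would target a Neural Compute Stick, and select_target(net, cv2.dnn, "CPU") reproduces the two lines above.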

(Optional) Step 5: Convert the .pb model into .xml and .bin files using the Model Optimizer

I used Google Colab again for this step.

!pip install openvino-dev[tensorflow2]==2021.4.2
!mo_tf --input_model /content/frozen_models/tf_final.pb --input_shape "[1,224,224,3]" --data_type=FP16 --output_dir /content
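Outside a notebook, the same conversion can be scripted. The sketch below only assembles the argument list used in this step and hands it to subprocess; it assumes the mo_tf command from the openvino-dev install is on your PATH. The function names build_mo_command and run_model_optimizer are hypothetical.

```python
import subprocess


def build_mo_command(input_model, output_dir,
                     input_shape=(1, 224, 224, 3), data_type="FP16"):
    """Build the Model Optimizer argument list used in this tutorial."""
    # Model Optimizer expects the shape as a bracketed, comma-separated string
    shape = "[" + ",".join(str(dim) for dim in input_shape) + "]"
    return [
        "mo_tf",
        "--input_model", str(input_model),
        "--input_shape", shape,
        "--data_type", data_type,
        "--output_dir", str(output_dir),
    ]


def run_model_optimizer(input_model, output_dir, **kwargs):
    """Run the conversion; raises CalledProcessError if mo_tf exits non-zero."""
    subprocess.run(build_mo_command(input_model, output_dir, **kwargs), check=True)
```

Passing the arguments as a list avoids shell quoting problems with the bracketed shape string.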

OpenVINO™ toolkit introduces its own format of graph representation and its own operation set. A graph is represented with two files: an XML file and a binary file. This representation is commonly referred to as the Intermediate Representation or IR.
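Because the .xml half of the IR is plain XML, its topology can be inspected with the standard library. The sketch below parses a hand-written, heavily simplified stand-in for an IR file: real files produced by the Model Optimizer carry more attributes, ports, and shape data, and the layer names here are invented for illustration. The function name summarize_ir is hypothetical.

```python
import xml.etree.ElementTree as ET

# Toy stand-in for an IR .xml file (NOT real Model Optimizer output)
TOY_IR_XML = """<net name="toy" version="10">
  <layers>
    <layer id="0" name="input" type="Parameter"/>
    <layer id="1" name="dense" type="MatMul"/>
    <layer id="2" name="output" type="Result"/>
  </layers>
  <edges>
    <edge from-layer="0" from-port="0" to-layer="1" to-port="0"/>
    <edge from-layer="1" from-port="1" to-layer="2" to-port="0"/>
  </edges>
</net>"""


def summarize_ir(xml_text):
    """Return ([(layer_name, layer_type), ...], [(from_id, to_id), ...])."""
    root = ET.fromstring(xml_text)
    layers = [(layer.get("name"), layer.get("type")) for layer in root.iter("layer")]
    edges = [(int(edge.get("from-layer")), int(edge.get("to-layer")))
             for edge in root.iter("edge")]
    return layers, edges


layers, edges = summarize_ir(TOY_IR_XML)
print(layers)
print(edges)
```

The weights themselves live in the .bin file, which is raw binary data and is not readable this way.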

We chose FP16 (16-bit floating point) as the precision because it is the only one that works with the Intel Movidius Neural Compute Stick.
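To get a feel for what FP16 storage means for the weights, the sketch below rounds a Python float to the nearest half-precision value using the standard struct module. It illustrates the number format only and is not how the Model Optimizer performs its conversion; the name to_fp16 is hypothetical.

```python
import struct


def to_fp16(value):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    # 'e' is struct's format code for a 16-bit float
    return struct.unpack("<e", struct.pack("<e", value))[0]


print(to_fp16(0.5))  # 0.5 (exactly representable)
print(to_fp16(0.1))  # 0.0999755859375 (rounded to 10 mantissa bits)
```

Half precision keeps roughly three significant decimal digits, which is usually enough for classification weights but is worth keeping in mind if the FP16 model's confidences differ slightly from the original.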

Change the loading call from readNetFromTensorflow to readNet, passing the .xml and .bin files.

# Change this line
net = cv2.dnn.readNetFromTensorflow('tf_final.pb')

# Change to this line
net = cv2.dnn.readNet('tf_final.xml','tf_final.bin')

Inspired by : Click here

Github Link : https://github.com/diazoniclabs/openvino-teachable-machine

OpenVino : https://www.intel.com/content/www/us/en/developer/tools/openvino-toolkit/overview.html


Written by Diazonic Labs
