TensorFlow Inference of Visual Images, Orchestrated by Cloudify, with Intel Optimizations

The following blog is co-authored by Petar Torre, Solutions Architect at Intel.

This blog describes how Cloudify automates the deployment and monitoring of Machine Learning systems by orchestrating an Intel-optimized TensorFlow workload running inference with a pre-trained ResNet-50 model from the Intel Model Zoo.

In a nutshell, a container running a Jupyter Notebook with the Intel-optimized TensorFlow model is scheduled as a Kubernetes pod on K3s on AWS EC2. This scenario is orchestrated by Cloudify.

The overall design is shown in the figure below.

Quick Terminology Primer

Jupyter Notebooks: Software, standards, and services for interactive computing across multiple programming languages; web-based application suitable for capturing the whole computation process.

Machine Learning (ML) Inference: The process of running live data points into an ML model to calculate an output such as a single numerical score.

TensorFlow: Software library for machine learning and artificial intelligence. It can be used across a range of tasks but has a particular focus on training and inference of deep neural networks.

ResNet-50: 50-layer convolutional neural network (class of artificial neural network, most commonly applied to analyze visual imagery).

How It Works 

We use an AWS EC2 c6i.4xlarge instance with 16 vCPUs, based on 3rd Generation Intel Xeon Scalable processors and running CentOS Stream 9. The instance has a public IP address and a security group allowing access from your client and from Cloudify Manager to, at minimum, TCP ports 22 (SSH), 6443 (K3s API), and 8888 (Jupyter Notebook web server).
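
If you create the security group by hand, these rules can also be added with the AWS CLI. A minimal sketch, assuming a placeholder security group ID and that you narrow the source CIDR to your own networks:

# Placeholder values: replace the group ID and CIDR with your own.
SG=sg-0123456789abcdef0
for PORT in 22 6443 8888; do
  aws ec2 authorize-security-group-ingress \
    --group-id "$SG" --protocol tcp --port "$PORT" --cidr 203.0.113.0/24
done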

As a lightweight Kubernetes distribution we run K3s, which can also be deployed in resource-constrained environments, such as the far edge. A single K3s node is started with:

# Download and install k3s without enabling or starting the service
curl -sfL https://get.k3s.io | sudo INSTALL_K3S_SKIP_ENABLE=true sh -

# Start a single-node k3s server with a world-readable kubeconfig and a fixed join token
sudo K3S_KUBECONFIG_MODE="644" INSTALL_K3S_EXEC="--token 12345" /usr/local/bin/k3s server
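
Once the server is up, a quick sanity check (kubectl is bundled with the k3s binary):

# The single node should report Ready before workloads are scheduled.
sudo k3s kubectl get nodes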

For the application workload, we use a prepared image with Jupyter Notebook, intel/intel-optimized-tensorflow:tf-2.3.0-imz-2.2.0-jupyter-performance, from Docker Hub.

TensorFlow is a widely used machine learning framework in the deep learning arena, demanding efficient utilization of computational resources. To take full advantage of Intel Architecture and extract maximum performance, the TensorFlow framework has been optimized using oneAPI Deep Neural Network Library (oneDNN) primitives, a popular performance library for deep learning applications.
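
One way to confirm that oneDNN primitives are actually being executed is oneDNN's standard verbose logging. A minimal check, using a small ad-hoc convolution (the inline script is ours, purely illustrative):

# Any dnnl_verbose output proves oneDNN kernels are running.
export DNNL_VERBOSE=1
python3 -c "import tensorflow as tf; x = tf.random.normal([1, 224, 224, 3]); print(tf.nn.conv2d(x, tf.random.normal([3, 3, 3, 8]), strides=1, padding='SAME').shape)"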

The workload takes visual images as input and classifies them, running TensorFlow inference with a pre-trained ResNet-50 FP32 model from the Intel Model Zoo.

Cloudify Orchestration

Cloudify is an orchestrator that provisions both infrastructure and applications across multiple clouds, including AWS, GCP, and Azure. In this example, we use the AWS cloud.

Cloudify can create the infrastructure as well as provision K3s. On top of the K3s Kubernetes cluster, Cloudify provisions and manages the life cycle operations (LCO) of the pod that runs the Jupyter Notebook with the Intel-optimized oneAPI software and the Intel Model Zoo TensorFlow model.

Cloudify uses a YAML-based domain-specific language (DSL), called a Blueprint, to orchestrate workloads.

The next figure shows the Blueprint that provisions a pod on the Kubernetes node. In our case there is a single node, but with multiple nodes it could use node labels to select the right node with the required resources, for example support for specific AVX-512 instructions.
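
A hypothetical sketch of such pinning: label the capable node, then give the pod template a matching nodeSelector (the label key below is our own invention, not a standard one):

# Label the node that supports the required instructions (hypothetical key).
kubectl label node my-k3s-node demo.example.com/avx512=true
# The Deployment's pod template would then carry a matching selector, e.g.:
#   nodeSelector:
#     demo.example.com/avx512: "true"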

The TOSCA blueprint is:

# SPDX-License-Identifier: Apache-2.0
#
tosca_definitions_version: cloudify_dsl_1_3
imports:
  - 'https://cloudify.co/spec/cloudify/5.1.0/types.yaml'
  - 'plugin:cloudify-kubernetes-plugin?version= >=2.7.0'
inputs:
  validate_status:
    type: boolean
    display_label: Validate status
    default: false
dsl_definitions:
  client_config: &client_config
    configuration:
      manager_file_path: { get_secret: kubernetes_config_path }
node_templates:
  tensorflow-demo:
    type: cloudify.kubernetes.resources.FileDefinedResource
    properties:
      client_config: *client_config
      validate_resource_status:
        get_input: validate_status
      file:
        resource_path: resources/tensorflow-demo.yaml
        template_variables:
          JUPYTERNOTEBOOKPASSWORD: { get_secret: jupyter_notebook_password }
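
Note that the blueprint resolves the Jupyter password through get_secret, so when driving everything from the CLI, that secret has to exist on the manager first. A minimal sketch (the secret value is a placeholder):

# Create the secret the blueprint references; replace the value with your own.
cfy secrets create jupyter_notebook_password -s 'ChangeMe123'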

For this simple example, prepare the kube config file (replace PUBLICIP with the public IP address of your instance) and copy it into the Cloudify Manager image with:

sed "s/127.0.0.1/PUBLICIP/g" < /etc/rancher/k3s/k3s.yaml > kube_config
docker cp kube_config cfy_manager_local:/etc/cloudify/kube_config
docker exec -it cfy_manager_local sh -c "chown cfyuser:cfyuser /etc/cloudify/kube_config; chmod 600 /etc/cloudify/kube_config;"
cfy secrets create -s /etc/cloudify/kube_config -u kubernetes_config_path

The Jupyter pod definition, with its resources, labels, and selectors, can be modified to match the right EC2 instance running on the Intel Xeon processor. Cloudify provisions this pod on top of K3s and exposes a Kubernetes service:

# SPDX-License-Identifier: Apache-2.0
#
---
apiVersion: v1
kind: Namespace
metadata:
  name: tensorflow-demo
  labels:
    name: tensorflow-demo
---
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: tensorflow-demo
  name: tensorflow-demo
  labels:
    app.kubernetes.io/name: tensorflow-demo
    app.kubernetes.io/component: backend
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: tensorflow-demo
      app.kubernetes.io/component: backend
  replicas: 1
  template:
    metadata:
      labels:
        app.kubernetes.io/name: tensorflow-demo
        app.kubernetes.io/component: backend
    spec:
      containers:
      - name: tensorflow-demo
        image: intel/intel-optimized-tensorflow:tf-2.3.0-imz-2.2.0-jupyter-performance
        imagePullPolicy: Always
        ports:
        - containerPort: 8888
        command: [ "sh", "-c" ]
        args:
        - apt-get update -y;
          mkdir /root/.jupyter;
          python3 -c "import base64; from notebook.auth import passwd; print(base64.b64decode('eyAiTm90ZWJvb2tBcHAiOiB7ICJwYXNzd29yZCI6ICJQV0QiIH0gfQ==').decode('ascii').replace('PWD',passwd('{{ JUPYTERNOTEBOOKPASSWORD }}')))" > /root/.jupyter/jupyter_notebook_config.json;
          jupyter notebook --port=8888 --no-browser --ip="0.0.0.0" --allow-root;
---
apiVersion: v1
kind: Service
metadata:
  name: tensorflow-demo
  namespace: tensorflow-demo
  labels:
    app.kubernetes.io/name: tensorflow-demo
    app.kubernetes.io/component: backend
spec:
  ports:
  - name: jupyter-service
    port: 8888
    targetPort: 8888
  selector:
    app.kubernetes.io/name: tensorflow-demo
    app.kubernetes.io/component: backend
  #type: LoadBalancer
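
The container command above is worth unpacking: the Python one-liner writes /root/.jupyter/jupyter_notebook_config.json, substituting the hashed Jupyter password into a small JSON template that is carried as base64. You can decode the template yourself:

# Prints: { "NotebookApp": { "password": "PWD" } }
echo 'eyAiTm90ZWJvb2tBcHAiOiB7ICJwYXNzd29yZCI6ICJQV0QiIH0gfQ==' | base64 -d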

The figure below shows the Cloudify TensorFlow Jupyter Demo blueprint.

Create a deployment from that blueprint. When asked, enter the desired password for the Jupyter Notebook web UI.

The next figure shows how we execute the Install action for the TensorFlow deployment in Cloudify.
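
The same flow can be driven from the Cloudify CLI. A sketch, assuming the blueprint is saved as blueprint.yaml (the IDs are placeholders):

# Upload the blueprint, create a deployment from it, and run the install workflow.
cfy blueprints upload -b tensorflow-demo blueprint.yaml
cfy deployments create tensorflow-demo -b tensorflow-demo
cfy executions start install -d tensorflow-demo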

Cloudify’s deployment and execution task list is shown in the figure below.
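
Once the install workflow completes, the result can be verified from the K3s node:

# The pod should be Running and the service should expose port 8888.
kubectl get pods,svc -n tensorflow-demo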

Here, for simplicity, we don’t set up external ingress; instead we forward a port from the bastion VM to the exposed Kubernetes service with:

kubectl port-forward -n tensorflow-demo service/tensorflow-demo --address 0.0.0.0 8888:8888

Results

Connect to the Jupyter Notebook with a modern browser using a URL like http://cfy-tensorflow-demo:8888/, where cfy-tensorflow-demo points to the external IP of the bastion VM. When prompted, log in with the password you set earlier.

The available Jupyter Notebooks cover two different types of analysis:

  1. Notebook “benchmark_data_types_perf_comparison.ipynb”: compares the FP32 and int8 data types when running on Intel Optimization for TensorFlow
  2. Notebook “benchmark_perf_comparison.ipynb”: compares Intel Optimizations for TensorFlow with “stock TensorFlow” (note that stock TensorFlow from v2.5 also includes the oneDNN optimizations, which can be enabled by setting an environment variable; see the snippet after this list)
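
For reference, in stock TensorFlow 2.5 and later the oneDNN optimizations are toggled with this environment variable before starting Python:

# Enable oneDNN optimizations in stock TensorFlow 2.5+ (set to 0 to disable).
export TF_ENABLE_ONEDNN_OPTS=1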

Comparison of data types FP32 and int8

On the Notebook benchmark_data_types_perf_comparison.ipynb UI, run all cells (Cell → Run All in the classic Jupyter menu). While it runs, the notebook will show that the kernel is busy; wait until it is idle again.

In Cell 3.2 “Pick a Data Type” change data_type_index = 0 to data_type_index = 1.

Then run all cells below that cell (Cell → Run All Below).

Wait for the test to finish, and see the result in Step 7: a performance comparison of the two data types.

Comparison of Intel Optimizations for TensorFlow with “Stock TensorFlow”

On the Notebook benchmark_perf_comparison.ipynb UI, run all cells and wait for the run to finish (kernel idle), then switch to the other TensorFlow kernel (Kernel → Change kernel).

Run all cells again, wait for the run to finish, and see the comparison in Step 5.

Summary

  • In this blog, we described how Cloudify orchestrates an AI use case: a TensorFlow inference model optimized with the Intel oneAPI Deep Neural Network Library (oneDNN).
  • The inference runs on Kubernetes on an AWS EC2 instance, using a Jupyter Notebook as the tool for users to change inputs and see the results.
