Deployment to Production
Machine learning solutions cannot exist in isolation; they need to be surfaced to end users. Before a solution can be used, it must be deployed to an environment that users have access to.
Figure 11-1 shows the various ways a TensorFlow.js application can be deployed in a production environment.
Figure 11-1: Deployment of TensorFlow.js Solutions
As shown in the diagram above, users can access a machine learning solution when it is deployed as an application (web or desktop), as a web browser extension (Google Chrome or Microsoft Edge), or to a single-board computer or microcontroller (such as a Raspberry Pi or an Arduino). Each of these deployment environments is discussed below:
A web application is any application that is accessed through a web browser. This does not mean, however, that execution happens in the web browser as well: a web application deployment can either execute on a server in the cloud, or run in the web browser after the page has finished loading.
A machine learning solution developed using TensorFlow.js can be deployed to a server in a private data center or in a public cloud such as Microsoft Azure or Google Cloud Platform (GCP), with the output displayed to the end user in the web browser. In this configuration the model executes on the web server using Node.js as the runtime; this approach is covered in detail in Chapter 12.
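As a minimal sketch of the server-side option, the snippet below loads a saved model with tfjs-node and runs a prediction inside a plain Node.js HTTP handler. The model path, input shape, and port are assumptions made for illustration; they are not part of this chapter's sample code.

```javascript
// server.js — minimal sketch of server-side inference with tfjs-node.
// The model path, input values, and port are illustrative assumptions.
const http = require('http');
const tf = require('@tensorflow/tfjs-node');

let model;

async function start() {
  // Load a model previously saved with model.save('file://./model').
  model = await tf.loadLayersModel('file://./model/model.json');

  http.createServer((req, res) => {
    // For brevity, a fixed input is used; a real handler would parse request data.
    const input = tf.tensor2d([[0.1, 0.2, 0.3, 0.4]]);   // hypothetical 4-feature input
    const output = model.predict(input);
    res.setHeader('Content-Type', 'application/json');
    res.end(JSON.stringify({ prediction: Array.from(output.dataSync()) }));
    tf.dispose([input, output]);
  }).listen(8080);
}

start();
```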
The second option when deploying as a web application is client-based. While the solution is hosted on a server, it executes in the web browser without ever uploading the data to that server. This keeps the data secure: it never leaves the user's machine, whether the model is being trained or used to perform a prediction.
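The sketch below illustrates the client-based flow, assuming the page includes the TensorFlow.js script tag from a CDN and hosts the model files alongside it; the model path and input values are placeholders. Everything runs in the visitor's browser, so no input data is sent back to the server.

```javascript
// Runs in the browser after the page has loaded.
// Assumes <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs"></script>
// is already on the page and that model.json is hosted next to it (hypothetical path).
async function predictInBrowser() {
  // The model is downloaded to the browser; user data never leaves the machine.
  const model = await tf.loadLayersModel('model/model.json');
  const input = tf.tensor2d([[0.5, 0.1, 0.9, 0.3]]);   // hypothetical input
  const output = model.predict(input);
  console.log('Prediction:', await output.data());
  tf.dispose([input, output]);
}

window.addEventListener('load', predictInBrowser);
```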
There are scenarios in which applications developed using HTML, CSS, and JavaScript need to be deployed to users' desktops as Windows applications. Such web applications can be packaged and deployed as desktop applications using the Electron.js framework. More details are available at https://www.electronjs.org/
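A minimal Electron main process for wrapping such a page as a desktop application might look like the sketch below; the window size and the index.html file name are assumptions for illustration.

```javascript
// main.js — minimal sketch of an Electron entry point that wraps an
// existing TensorFlow.js web page as a desktop application.
const { app, BrowserWindow } = require('electron');

app.whenReady().then(() => {
  const win = new BrowserWindow({ width: 1024, height: 768 });
  // index.html is the same page that would otherwise be served to a browser.
  win.loadFile('index.html');
});

// Quit when all windows are closed (standard behavior on Windows/Linux).
app.on('window-all-closed', () => {
  if (process.platform !== 'darwin') app.quit();
});
```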
Note I did not cover mobile applications separately because JavaScript (and by extension TensorFlow.js) cannot be used to develop native mobile applications. Mobile devices can, however, access web applications through their web browsers. If a native machine learning solution needs to be developed for cellphones, TensorFlow Lite is a more viable choice. You can get more information at https://www.tensorflow.org/lite
Anything that can be done with TensorFlow.js in a web application can also be done in a web browser extension, because the machine learning code still runs as HTML and JavaScript, exactly as in a client-based web application: data never leaves the web browser, and the code executes in the same context as the web page.
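A browser extension is described by a manifest file; the sketch below is a Manifest V3 example with hypothetical names that injects a content script along with a locally bundled copy of TensorFlow.js (tf.min.js), which can then run a model against the current page just as a client-based web application would.

```json
{
  "manifest_version": 3,
  "name": "TFJS Example Extension",
  "version": "1.0",
  "description": "Runs a TensorFlow.js model against the current page (illustrative only).",
  "content_scripts": [
    {
      "matches": ["<all_urls>"],
      "js": ["tf.min.js", "content.js"]
    }
  ]
}
```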
Single-board computers and microcontrollers allow hobbyists to develop JavaScript applications for a host of devices; examples include the Raspberry Pi, Arduino, and NVIDIA Jetson Nano. Access to the underlying hardware is enabled through two runtime backends, tfjs-node and tfjs-backend-nodegl (see the sketch after the list below). The GitHub repositories for both libraries are:
TensorFlow backend for TensorFlow.js (https://github.com/tensorflow/tfjs/tree/master/tfjs-node)
Headless WebGL backend for TensorFlow.js (https://github.com/tensorflow/tfjs/tree/master/tfjs-backend-nodegl)
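The sketch below shows how these backends are typically brought in. The 'headless-nodegl' backend name follows the tfjs-backend-nodegl repository and should be treated as an assumption here, as should the choice of device for each option.

```javascript
// Option 1: tfjs-node registers a native 'tensorflow' backend automatically,
// e.g. on a Raspberry Pi:
//   const tf = require('@tensorflow/tfjs-node');

// Option 2: headless WebGL backend for GPU-capable boards such as the Jetson Nano.
const tf = require('@tensorflow/tfjs');
require('@tensorflow/tfjs-backend-nodegl');

async function main() {
  // Backend name taken from the tfjs-backend-nodegl repository; treat as an assumption.
  await tf.setBackend('headless-nodegl');
  console.log('Active backend:', tf.getBackend());
}

main();
```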
While it would not make sense to walk through deploying TensorFlow.js machine learning solutions to every environment and device covered in the previous section, the following sections describe the main operations that must be performed for any deployment.
Copying the content entails replicating all web pages, images, scripts, and other assets to a publicly accessible location on the web server so they can be reached over HTTP or HTTPS in a web browser. The replication can be automated with a deployment script and server, or done manually, provided the script, server, or programmer has access to the target web location or web server.
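As an illustration of the scripted route, the sketch below copies a build folder into a web server's document root using Node's built-in fs module (Node 16.7 or later); both paths are hypothetical and should be replaced with your build output and the publicly served folder.

```javascript
// deploy.js — minimal sketch of a deployment script (paths are hypothetical).
const fs = require('fs');
const path = require('path');

const source = path.resolve('dist');            // web pages, scripts, images, model files
const destination = '/var/www/html/tfjs-app';   // document root reachable over HTTP/HTTPS

// Recursively copy every file so it becomes accessible to the web server.
fs.cpSync(source, destination, { recursive: true });
console.log(`Copied ${source} -> ${destination}`);
```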
CORS is short for Cross-Origin Resource Sharing, the browser mechanism that blocks web pages from using resources hosted on servers that have not authorized cross-origin access. Forgetting to enable CORS on the server that hosts the model files is a very common mistake. In Microsoft Azure, CORS can be enabled in the following ways (follow the instructions from other cloud providers to enable CORS in their environments); a sketch for a self-hosted server follows the list:
Azure Portal
Azure Storage Explorer
Azure CLI (command-line interface)
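If the model files are served from your own Node.js server rather than cloud storage, the same effect is achieved by sending the appropriate response header. The snippet below is a minimal sketch using only Node's built-in http module; the port and the wildcard origin are assumptions and should be tightened for production.

```javascript
// Minimal sketch: allow cross-origin access to model files served by Node.
const http = require('http');
const fs = require('fs');
const path = require('path');

http.createServer((req, res) => {
  // The wildcard origin is an assumption; restrict it to your app's domain in production.
  res.setHeader('Access-Control-Allow-Origin', '*');

  const filePath = path.join(__dirname, req.url);   // e.g. /model.json, /weights.bin
  fs.readFile(filePath, (err, data) => {
    if (err) {
      res.statusCode = 404;
      res.end('Not found');
    } else {
      res.end(data);
    }
  });
}).listen(8081);
```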
The final step of the deployment process is to verify that everything works as expected by accessing the solution in a web browser over HTTP or HTTPS.
This short chapter covers the deployment steps for machine learning solutions using TensorFlow.js and describes the following:
The environments that TensorFlow.js solutions can be deployed to.
GitHub links to the runtime backends that support single-board computers and microcontrollers.
The steps for deploying a JavaScript machine learning solution.