Deploying and Hosting a Machine Learning Model Using Flask, Heroku and Gunicorn

One thing I’ve observed in many data science tutorials is that once a certain performance threshold is achieved on test data, the model is rarely deployed or pushed to production. The same is often true in the industry more broadly.

This tutorial aims to take modeling a step further by building a REST API and deploying the model into production. In addition to the REST API, we’re building a simple web application that predicts whether a piece of text belongs to any of these classes: atheism, computer graphics, medical science, Christianity, or politics.

In case you missed my previous post in this series (it was a while ago, so I’d recommend it), check it out here:

The tools that are going to help us achieve our goal are Flask, Gunicorn, Travis CI, & Heroku.

We begin by creating a project folder where we define all dependencies in the requirements.txt file. To ensure that all dependencies are isolated from our application, we create a virtual environment for our working space. Inside the requirements.txt file, let’s define a few dependencies for now and install them:
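As a starting point, the requirements.txt might contain something like this (a minimal, assumed set; we’ll add more dependencies such as gunicorn later):

```
flask
scikit-learn
pytest
```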

Setting up a virtual environment

We can create a virtual environment by running the commands below in the terminal:
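For example, with Python’s built-in venv module (assuming Python 3 is installed; the author may have used virtualenv instead):

```shell
# Create a virtual environment named "venv" in the project folder
python3 -m venv venv
```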

Activate the virtual environment by running:
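On macOS/Linux this is typically:

```
source venv/bin/activate
```

On Windows, use `venv\Scripts\activate` instead.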

Installing Dependencies

Run the command below to install the dependencies defined in the requirements.txt file:
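With the environment active:

```
pip install -r requirements.txt
```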

Here’s how the project directory structure should look:
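A plausible layout, based on the paths used later in this tutorial (names are assumed; yours may differ slightly):

```
app/
├── app.py
├── requirements.txt
├── prediction_map.json
├── templates/
│   └── index.html
├── static/
│   ├── css/main.css
│   └── img/2.jpeg
├── models/
│   └── model.pkl
└── data/
    └── news_train.pkl
```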

Now that the environment is set up, let’s delve into the various tools that will help us create our API.


Flask is a micro framework written in Python: it keeps its core small while providing the essential tools and libraries for building web applications. Flask is very easy to learn and start working with, as long as you understand Python.

Let’s create a simple test endpoint in Flask using the syntax below, in our Flask app file (here assumed to be app.py):

from flask import Flask

app = Flask(__name__)

@app.route('/')
def Test():
    return "This is working"

if __name__ == '__main__':
    app.run(debug=True)
What we’re doing here is importing the Flask class from the flask module, creating a Flask app, and defining an index route (‘/’). Now let’s run as follows:
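Assuming the file is named app.py, the app can be started with:

```
python app.py
```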

If everything was set up correctly, we should see output like the image below in our terminal:

Congrats! You’ve set up a basic Flask application. Now let’s see how we can serialize the model we built in part I and use it for predictions in our application.

Serializing the model

Serialization in machine learning means we’re saving the model in a file so that we can reuse it to make predictions, compare it with other models, or even save the hassle of training a model over and over again.

There are two primary ways of achieving this in scikit-learn: we either use the pickle module or the joblib module. Both are equally good, and in our case we’re going to use pickle to serialize and deserialize the model we built in the previous post.
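For completeness, the joblib route looks like this (a minimal sketch using a toy classifier, not the model from part I):

```python
from sklearn.linear_model import LogisticRegression
import joblib

# Train a toy model just to have something to serialize
clf = LogisticRegression().fit([[0.0], [1.0]], [0, 1])

# Persist it to disk, then load it back
joblib.dump(clf, 'model.joblib')
restored = joblib.load('model.joblib')
```

The loaded model behaves identically to the original one.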

We serialize the model by using the snippet below:

import pickle

# clf is the trained classifier from part I
with open('model.pkl', 'wb') as f:
    pickle.dump(clf, f)

In addition to serializing our model, we’re also going to serialize the training data using the same snippet for future purposes.
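A sketch of that round trip (using a stand-in list in place of the real 20-newsgroups training bunch):

```python
import pickle

# Stand-in for the training data from part I
news_train = ["first training document", "second training document"]

# Serialize the training data to disk
with open('news_train.pkl', 'wb') as f:
    pickle.dump(news_train, f)

# Deserialize it again
with open('news_train.pkl', 'rb') as f:
    restored = pickle.load(f)
```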

Inside our Flask app file, let’s create an index route that renders the homepage. index.html is a template file, located in the templates folder.

Below is the index route created to render the index.html file:

from flask import render_template

@app.route('/')
def Homepage():
    return render_template('index.html')

Here’s the code I put in the index.html file. It simply accepts text as input from the user, displays a success alert, and redirects to the prediction route when the form is submitted.

<!DOCTYPE html>
<html>
    <head>
        <title>Naive Bayes Classification</title>
        <link rel="stylesheet" href="" integrity="sha384-Gn5384xqQ1aoWXA+058RXPxPg6fy4IWvTNh0E263XmFcJlSAwiGgFAW/dAiS6JXm" crossorigin="anonymous">
        <script src="" integrity="sha384-KJ3o2DKtIkvYIK3UENzmM7KCkRr/rE9/Qpg6aAZGJwFDMVNA/GpGFF93hXpG5KkN" crossorigin="anonymous"></script>
        <script src="" integrity="sha384-ApNbgh9B+Y1QKtv3Rn7W3mgPxhU9K/ScQsAP7hUibX39j7fakFPskvXusvfa0b4Q" crossorigin="anonymous"></script>
        <script src="" integrity="sha384-JZR6Spejh4U02d8jOt6vLEHfe/JQGiRRSQQxSfFWpi1MquVdAyjUar5+76PVCmYl" crossorigin="anonymous"></script>
        <script src=""></script>
        <link href="{{ url_for('static', filename='css/main.css') }}" rel="stylesheet" type="text/css">
    </head>
    <body style="background-image: url('static/img/2.jpeg');">
      <div class="style-form-within">
        <form id="form" method="post" action="{{ url_for('Prediction') }}">
          <div class="form-group">
            <label>Enter text to predict</label>
            <input name="text_field" id="text_field" style="height: 200px;" type="text" class="form-control" placeholder="Enter text here">
          </div>
          <button id="button-a" type="submit" class="btn btn-primary">Submit</button>
        </form>
      </div>
      <script>
        // Keep the submit button disabled until the text field has content
        $("#button-a").attr('disabled', true);
        $("#text_field").on('keyup', function () {
          if ($(this).val().length != 0)
            $("#button-a").attr('disabled', false);
          else
            $("#button-a").attr('disabled', true);
        });
        // Show a success alert when the form is submitted
        $("#form").on('submit', function () {
          Swal.fire({
            icon: 'success',
            title: 'Submitted',
            showConfirmButton: true,
            timer: 20000
          });
        });
      </script>
    </body>
</html>
Now when the user submits text from the index.html file, we handle that in the /predict endpoint. The purpose of this endpoint is to accept the values submitted, load in the pickled model and the training dataset, and use CountVectorizer and TF-IDF to transform the values submitted by the user.

Note here that we’ve also defined a map of prediction indexes and their corresponding names in the prediction_map.json file.

After transforming the submitted text, the loaded model is applied, predictions are made, and the results are returned as JSON.
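In isolation, that fit/transform/predict sequence can be sketched with toy data (MultinomialNB stands in here for the classifier trained in part I):

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.naive_bayes import MultinomialNB

# Toy stand-ins for the 20-newsgroups training data
train_docs = ["the cpu renders graphics", "the doctor treats patients"]
train_labels = [0, 1]  # 0: computer graphics, 1: medical science

# Fit the vectorizers on the training text
count_vect = CountVectorizer()
tfidf = TfidfTransformer()
X_train = tfidf.fit_transform(count_vect.fit_transform(train_docs))
clf = MultinomialNB().fit(X_train, train_labels)

# Transform new text with the *fitted* vectorizers before predicting
user_text = "graphics on the cpu"
X_new = tfidf.transform(count_vect.transform([user_text]))
print(clf.predict(X_new)[0])  # → 0 (computer graphics)
```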

@app.route('/predict', methods=['POST'])
def Prediction():
    # Fetch input from the form and load the pickled artifacts
    # (requires: from flask import request, jsonify / import pickle, json)
    from_form = request.form['text_field']
    with open('data/news_train.pkl', 'rb') as f:
        news_train = pickle.load(f)
    with open('models/model.pkl', 'rb') as f:
        clf = pickle.load(f)
    with open('prediction_map.json', 'r') as pred_map:
        prediction_map = json.load(pred_map)

    # Refit the vectorizers on the training text; news_train.data
    # holds the raw training documents
    count_vect = CountVectorizer()
    tfidf_transformer = TfidfTransformer()
    cv_fit = count_vect.fit_transform(news_train.data)
    X_train_tfidf = tfidf_transformer.fit_transform(cv_fit)

    # Transform the submitted text with the fitted vectorizers and predict
    count_vect_data = count_vect.transform([from_form])
    tfidf_transformer_data = tfidf_transformer.transform(count_vect_data)
    prediction = clf.predict(tfidf_transformer_data)
    prediction_name = prediction_map.get(str(prediction[0]), "couldn't find name")
    response = {
        'status': 200,
        'prediction': prediction_name
    }
    return jsonify(response)
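Once the app is running locally, the endpoint can be exercised from the terminal, for example (port 5000 is Flask’s default; the input text is just an illustration):

```
curl -X POST -d "text_field=the sermon at church on sunday" http://localhost:5000/predict
```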

Voila! We’ve just finished building our API to serve the model. Now it’s time to push the model into production. We’re going to use Heroku, since it has a free tier for small applications. We’ll also set up a CI/CD pipeline so that any changes we push to GitHub are automatically deployed to the server.

Travis CI is going to be used for our CI/CD pipeline. This allows us to run automated tests and then deploy our application to Heroku when the tests pass. Feel free to set up an account, then select the GitHub repo you want all tests to run on before deployment to Heroku. Once you’re done, create a .travis.yml file in the root directory, in which we’ll define the Travis configuration.

language: python
dist: xenial
script:
  - cd ./app
  - pip install -r requirements.txt
  - pytest

What this does is define Python as the language and xenial as the distribution to use in our case. Also, because the app folder (the folder containing the project code) is in the root, we navigate to it first before installing dependencies and running the tests.

After all these steps are completed, push the changes to GitHub and watch Travis build successfully.

Heroku Deployment

As mentioned earlier, we’re going to use Heroku to host the application. Create an account on Heroku and then create an app. Once that’s done, navigate to the deploy tab. Under the deploy tab, select connection to GitHub and specify the corresponding repository.

Ensure the checkbox Wait for CI to pass before deploy is ticked. If everything runs successfully, we should see the screen below:

Since Flask’s built-in development server isn’t meant for production workloads, we’re going to use Gunicorn as our HTTP server.

In the root folder, we add a Procfile that contains this next line of code—it informs Heroku what to launch and how:
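Assuming the Flask instance is named app inside app.py, the Procfile would contain:

```
web: gunicorn app:app
```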

Once you’re done with that, log in to Heroku in the terminal. What we intend to do is add our Heroku API key to the .travis.yml configuration file before we push the code to GitHub. We can achieve this by running the following commands:
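The login step (requires the Heroku CLI):

```
heroku login
```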

followed by
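A common way to do this with the Travis CLI is the following, which fetches a Heroku API token and encrypts it straight into the config:

```
travis encrypt $(heroku auth:token) --add deploy.api_key
```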

You’ll see the API key automatically added to the .travis.yml file. Now specify the app name you created in Heroku and the provider. The config should look like this when done:

language: python
dist: xenial
script:
  - cd ./app
  - pip install -r requirements.txt
  - pytest
deploy:
  - provider: heroku
    api_key:
      secure: Hy/pqsm8y6u/pZNRqVGoXzQgpPHyTlG327R0ytr5mYDwkhwDTjTpIe8tkCpoRg9RdEhwpY6m/Ua33nRvOGtJCx7sgKgM/vmDpDFB3ogW/6uVRO+mxRJmtEQ1DAeHQaKMeTblW8TgJC0Z1zbihrc4VWwmI5cSz4UegFBK/eCLYa6dgJ+NnWzCVIfL+lNbXjfYlnMn950ukIuAv+JP+jBl00I6oDYpfqLCzYFASuxQedBHeV5cxpu6nioXEFhrx0J/QA4/x1nc8tZV8OT18jiLkZd99+whMSHk4UYbKbUhDPebEUIM9TazC/EW8hygLAfyin+SzCj4g70n36aCP85KYq0lfn392Wxod0MRAjCm5KduKCTWYvHVOzYk/BcEztrSzEQp6pLRezIz9jZCtVgu7cEk7Gn4xQGXpGFzB+xzx9t+MjVbVilZ8gxs+3IaRUh8ni34lzm1+anH7kgYbDfAyL6KgTlwwTacmFxSW4CkaeS5R04fkfs8D0qlwvvrENHldXI946KYnXaWgIfLjIA0Ux0zHE0btkxrhiak6L5S48/3c4J8nFG7f8EsaKm0nInWrsnXwGXeAlWE8ErFHWooWlOCdk/57KGT80rm2f4nDeRLoUDxDwKGu21+9qZ3E6cl9Xmfc9+XwIvim7Q01hZsgVazs/1q6NeVifg1iGYsDuA=
    app: naive-bayes-flask-app

If you’ve made it this far, congratulations! The very last step is to push all these changes to the GitHub repository, and the application will automatically deploy to the Heroku server. And now, you can watch your app in action.

I hope you enjoyed this ride. To see the full code and live application, feel free to click on the respective links below.

GitHub code:

And you can try out the final app in production here!
