Text Classification in iOS using TensorFlow Lite [A How-To Guide]

Written by khurram-shehzad | Published 2020/03/08
Tech Story Tags: text-classification | natural-language-processing | nlp | sentiment-analysis | ios-text-classification | tensorflowlite | ios-tensorflowlite | we-building

TLDR Text classification is the task of categorising text according to its content. Common applications include email spam detection, sentiment analysis and topic labelling. To integrate text classification in iOS, we will use a model pre-trained on the IMDB movie review dataset, with TensorFlow Lite as the underlying machine learning inference engine. In its simplest form, our app takes a sentence as input from the user and feeds it into a classification client, which classifies the text as either positive or negative depending on its content.

Text classification is the task of categorising text according to its content. It is a fundamental problem in the field of Natural Language Processing (NLP). Common applications of text classification include email spam detection, sentiment analysis and topic labelling.

For this post we restrict text classification to only two classes, also known as binary classification, e.g. classifying text as positive or negative. To integrate text classification in iOS, we will use a model pre-trained on the IMDB movie review dataset. We will use TensorFlow Lite as our underlying machine learning inference engine.
In its simplest form, our app takes a sentence as input from the user and feeds it into a classification client, which classifies the text as either positive or negative depending on its content.
Let's name our classification client TextClassificationClient. Here is an overview of the client class:
final class TextClassificationClient {
  // other stuff
  init?(modelFileInfo: FileInfo, labelsFileInfo: FileInfo, vocabFileInfo: FileInfo) {
    // we will initialise the tensorflowlite here
  }
}
The initialiser takes three parameters of type FileInfo, a tuple holding a file's name and extension. The three values we pass are created as follows:
let modelFileInfo = FileInfo(name: "text_classification", extension: "tflite")
let labelsFileInfo = FileInfo(name: "labels", extension: "txt")
let vocabFileInfo = FileInfo(name: "vocab", extension: "txt")
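The definition of FileInfo itself is not shown in this post. A minimal sketch, assuming it is a named tuple of a file name and extension, together with a hypothetical resourcePath helper (not from the original code) that looks the file up in the app bundle:

```swift
import Foundation

// Assumed definition: a named tuple with a file's base name and extension.
typealias FileInfo = (name: String, extension: String)

/// Hypothetical helper: returns the full path of a bundled resource,
/// or nil when the file is not part of the bundle.
func resourcePath(for fileInfo: FileInfo, in bundle: Bundle = .main) -> String? {
  return bundle.path(forResource: fileInfo.name, ofType: fileInfo.extension)
}

// Usage: resolve the labels file before reading it.
let labelsInfo: FileInfo = (name: "labels", extension: "txt")
if let path = resourcePath(for: labelsInfo) {
  print("Labels file found at \(path)")
} else {
  print("labels.txt is missing from the bundle")
}
```

Returning an optional path lets a failable initialiser such as init? bail out early when a file was not added to the app target.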
The model file is the trained model that TensorFlow Lite uses to perform inference. The labels file is a plain text file listing the classification classes, e.g. Positive and Negative. The vocab file is also a plain text file; it contains the words that were used during model training, each paired with its corresponding numeric index (the value the model's embedding layer expects), and the same mapping is used during inference.
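To make the two text files concrete, here is a minimal sketch of how their contents might be parsed. It assumes the labels file has one class name per line and that each vocab line has the form "word id" separated by a space; parseLabels and parseVocabulary are hypothetical names, not functions from the original project.

```swift
import Foundation

/// Parses a labels file: one class name per line, blank lines ignored.
func parseLabels(_ contents: String) -> [String] {
  return contents
    .split(whereSeparator: { $0.isNewline })
    .map { $0.trimmingCharacters(in: .whitespaces) }
    .filter { !$0.isEmpty }
}

/// Parses a vocabulary file where each line is assumed to be "<word> <id>".
func parseVocabulary(_ contents: String) -> [String: Int32] {
  var vocabulary = [String: Int32]()
  for line in contents.split(whereSeparator: { $0.isNewline }) {
    let parts = line.split(separator: " ")
    // Skip malformed lines rather than crashing on them.
    guard parts.count == 2, let id = Int32(parts[1]) else { continue }
    vocabulary[String(parts[0])] = id
  }
  return vocabulary
}
```

Both functions take the file contents as a string, so they can be fed from String(contentsOfFile:) in the app.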
Model training is outside the scope of this post and, as mentioned at the start, we will use a pre-trained IMDB model from here. For a detailed overview of model training, please see here.
To use TensorFlow Lite in an iOS app we have to integrate it. For this purpose we will use CocoaPods, the most commonly used dependency management tool for third-party dependencies in iOS apps.
Add pod 'TensorFlowLiteSwift' to your project's Podfile and run pod install to install and integrate TensorFlow Lite into the iOS app.
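Once the pod is installed, the client's initialiser can create the interpreter. The post does not show this step, so the following is only a sketch under stated assumptions: makeInterpreter is a hypothetical helper, and it uses the Interpreter(modelPath:) initialiser and allocateTensors() method from the TensorFlowLiteSwift pod.

```swift
import Foundation
import TensorFlowLite

/// Hypothetical helper: builds a ready-to-use interpreter for a bundled model.
func makeInterpreter(modelName: String, modelExtension: String) -> Interpreter? {
  guard let modelPath = Bundle.main.path(forResource: modelName, ofType: modelExtension) else {
    print("Model file \(modelName).\(modelExtension) not found in the app bundle.")
    return nil
  }
  do {
    let interpreter = try Interpreter(modelPath: modelPath)
    // Tensors must be allocated before any input can be copied in.
    try interpreter.allocateTensors()
    return interpreter
  } catch {
    print("Failed to create the interpreter: \(error)")
    return nil
  }
}
```

A failable init? like the one on TextClassificationClient could return nil whenever this helper does.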
Let's get back to our TextClassificationClient class. This class has a classify method that actually classifies the text. Below is the implementation of this method:
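func classify(text: String) -> [Result] {
  // Convert the text into the model's input: one row of token values.
  let input = tokenizeInputText(text: text)
  // Data(copyingBufferOf:) and [Float](unsafeData:) are assumed to be helper
  // extensions defined elsewhere in the project.
  let data = Data(copyingBufferOf: input[0])
  do {
    try interpreter.copy(data, toInputAt: 0)
    try interpreter.invoke()
    let outputTensor = try interpreter.output(at: 0)
    // Only float32 output is handled; anything else yields no results.
    guard outputTensor.dataType == .float32,
      let outputArray = [Float](unsafeData: outputTensor.data),
      outputArray.count >= labels.count
    else { return [] }
    // Pair each label with the confidence at the same index.
    var output = labels.enumerated().map { index, label in
      Result(id: "", title: label, confidence: outputArray[index])
    }
    // Highest confidence first.
    output.sort { $0.confidence > $1.confidence }
    return output
  } catch {
    print(error)
  }
  return []
}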
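The classify method above relies on two conversions, Data(copyingBufferOf:) and [Float](unsafeData:), that are not defined in this post. They are assumed here to be small project-level extensions rather than part of the TensorFlowLiteSwift API; a minimal sketch of what they might look like:

```swift
import Foundation

extension Data {
  /// Copies the raw bytes of an array of plain values (e.g. Float, Int32).
  init<T>(copyingBufferOf array: [T]) {
    self = array.withUnsafeBufferPointer { Data(buffer: $0) }
  }
}

extension Array {
  /// Rebuilds an array from raw bytes. Returns nil when the byte count is
  /// not a whole multiple of the element size. Only meant for plain value
  /// types such as Float or Int32.
  init?(unsafeData: Data) {
    guard unsafeData.count % MemoryLayout<Element>.stride == 0 else { return nil }
    self = unsafeData.withUnsafeBytes { (raw: UnsafeRawBufferPointer) in
      Array(raw.bindMemory(to: Element.self))
    }
  }
}
```

With conversions like these in place, the classify method can move values into and out of the interpreter as raw bytes.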
The classify method takes a string as its parameter and returns an array of Result values. The first line converts the string into tokens so it can be fed into the inference engine (the interpreter instance variable in this case). We run the interpreter on the tokenised text, read its output and return it. The Result type is defined below:
struct Result {
  let id: String
  let title: String
  let confidence: Float
}
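The tokenizeInputText call used by classify is not shown in this post. A minimal sketch of what such a step might do, assuming a fixed input length of 256, the special tokens <PAD>, <START> and <UNKNOWN> in the vocab file, and a float32 input tensor (all assumptions that must match the actual model, not values confirmed by the post):

```swift
import Foundation

/// Converts raw text into one fixed-length row of vocabulary IDs.
/// The vocabulary parameter and the default length are illustrative.
func tokenizeInputText(
  text: String,
  vocabulary: [String: Int32],
  sentenceLength: Int = 256
) -> [[Float32]] {
  let pad = Float32(vocabulary["<PAD>"] ?? 0)
  let unknown = Float32(vocabulary["<UNKNOWN>"] ?? 0)
  // Start with a row made entirely of padding, then overwrite from the front.
  var row = [Float32](repeating: pad, count: sentenceLength)
  var index = 0
  if let start = vocabulary["<START>"], index < sentenceLength {
    row[index] = Float32(start)
    index += 1
  }
  // Lowercase and split on anything that is not a letter or digit.
  let words = text.lowercased()
    .components(separatedBy: CharacterSet.alphanumerics.inverted)
    .filter { !$0.isEmpty }
  for word in words {
    guard index < sentenceLength else { break }
    // Words missing from the vocabulary map to the <UNKNOWN> ID.
    row[index] = vocabulary[word].map { Float32($0) } ?? unknown
    index += 1
  }
  return [row]
}
```

The result is wrapped in an outer array so that input[0] is the single row copied into the interpreter; text longer than the fixed length is truncated.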
Below is the screenshot of an inference
The complete working code of the app can be found in the GitHub repository here.

Published by HackerNoon on 2020/03/08