Internals of a chatbot engine: Intent Classification

This series started with the aim of understanding what happens underneath a chatbot engine. Since RasaNLU is an open-source framework, I could read through its code to understand the internals.

Demystifying RasaNLU

The first two parts of the series explain the major functionalities of any bot framework: Training and Deploying the chatbot. Part 3 aimed to arrive at a deep understanding of the machine learning aspects of Entity Extraction. This part deep-dives into intent classification.

Intent classification builds a machine learning model from preprocessed training data and classifies the user's text message into an intended action. For example, a message like "Help me find a Mexican restaurant in Chennai" should be mapped to an action called "restaurant_search", with "Mexican" and "Chennai" as search parameters. In the previous part we saw how to extract these parameters using Named Entity Recognition.

RasaNLU and Intent Classifiers

RasaNLU supports multiple intent classifiers, such as sklearn, mitie and tensorflow_embedding, which can be configured based on your use case. For the purpose of this blog we are going to stick with the sklearn intent classifier.

Training Sample

RasaNLU has a specific data structure for its training data. You can visualize it as a table of sentences, each associated with a tag (intent):

```
hi - greet
hello - greet
bye - end_conversation
goodbye - end_conversation
see you - end_conversation
Suggest me some mexican restaurants - restaurant_search
im looking for restaurants - restaurant_search
```

Preprocess

For ML models to understand our text data, it needs to be in a certain format. The preprocessing step takes care of this transformation. The aim of preprocessing is to arrive at a bunch of features and classes; a small runnable sketch of the whole step appears after the list below.

1. Tokenize text to words

```
"Suggest me some mexican restaurants"
["suggest", "me", "some", "mexican", "restaurants"]
```

2. Convert words to features

RasaNLU uses spacy word2vec to convert words to numbers. During this process the tokens are mapped into an N-dimensional space based on their similarity to other words in spacy's pre-trained corpus.

```
["suggest", "me", "some", "mexican", "restaurants"]
[1.40101, 1.3003, 0.45647, 1.8934, 1.67677]
```

3. Labeling features

Intents are categorized by a label given in the training data. These labels are usually the actions the intent is meant to perform; a few examples of intent labels are book_tickets, cancel_tickets, etc.

We have already converted the text samples from words to numbers for our ML model to understand. When it comes to labels we don't convert them into word vectors; instead each label is assigned a unique number. This process is called label encoding, and RasaNLU uses sklearn's LabelEncoder to perform this step:

```python
>>> labels
['greet', 'good_bye', 'restaurant_search', 'good_bye']
>>> le = LabelEncoder()
>>> y = le.fit_transform(labels)
>>> y  # classes get ids in sorted order: good_bye=0, greet=1, restaurant_search=2
[1, 0, 2, 0]
```
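To make these three steps concrete, here is a minimal sketch of how a feature matrix X and encoded labels y (the inputs the classifier below expects) could be produced. It assumes spaCy and a model with word vectors, such as en_core_web_md, are installed; the example data and variable names are illustrative and this is not RasaNLU's actual pipeline code.

```python
# Illustrative sketch of the preprocessing step; assumes a spaCy model with
# word vectors (e.g. en_core_web_md) is installed. Not RasaNLU's own code.
import numpy as np
import spacy
from sklearn.preprocessing import LabelEncoder

nlp = spacy.load("en_core_web_md")

training_examples = [
    ("hi", "greet"),
    ("bye", "end_conversation"),
    ("Suggest me some mexican restaurants", "restaurant_search"),
]

# Steps 1 and 2: tokenize each sentence and turn it into a fixed-size vector.
# doc.vector is the average of the word vectors of the tokens in the sentence.
X = np.array([nlp(text).vector for text, _ in training_examples])

# Step 3: encode the intent labels as integers.
le = LabelEncoder()
y = le.fit_transform([intent for _, intent in training_examples])

print(X.shape)               # (3, 300) with a 300-dimensional spaCy model
print(list(le.classes_), y)  # ['end_conversation', 'greet', 'restaurant_search'] [1 0 2]
```

The X and y built here are the kind of inputs that the training step in the next section works with.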
Classify Intents

The ML algorithm that RasaNLU uses to classify intents is the SVM, tuned with GridSearchCV. The advantage of GridSearchCV is that you can train SVMs with different configurations, and at the end of training it returns the trained SVM model with the best configuration.

SVM Configuration

The configuration defines the list of all possible parameters GridSearchCV would train the SVM on. GridSearchCV will create SVMs with all the combinations:

```python
import numpy as np

defaults = {"C": [1, 2, 5, 10, 20, 100],
            "kernels": ["linear"],
            "max_cross_validation_folds": 5}

C = defaults["C"]
kernels = defaults["kernels"]
tuned_parameters = [{"C": C, "kernel": [str(k) for k in kernels]}]

# Number of cross-validation folds, capped by the size of the smallest class
folds = defaults["max_cross_validation_folds"]
cv_splits = max(2, min(folds, np.min(np.bincount(y)) // 5))
```

Train

Create a GridSearchCV object with the defined configuration and fit the training data using its fit method. After fitting the data we will have an N-dimensional statistical model which can classify similar texts into one of the trained intents.

```python
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

clf = GridSearchCV(SVC(C=1, probability=True, class_weight='balanced'),
                   param_grid=tuned_parameters, n_jobs=1,
                   cv=cv_splits, scoring='f1_weighted', verbose=1)
clf.fit(X, y)
```

Testing

First let's test our ML model with one example from the training data. Since we already have its features extracted, we can feed them straight into clf.predict. The prediction returns a numerical value which needs to be decoded to arrive at the actual intent tag:

```python
>>> X_test = intent_examples[24].get("text_features").reshape(1, -1)
>>> pred_result = clf.predict(X_test)
>>> pred_result  # "show me few mexican restaurants"
[2]
>>> le.inverse_transform(pred_result)
['restaurant_search']
```

Apart from testing it on the training data, you also need to test it on a set of testing data, data the ML model has never seen before, to make sure it is not overfitting.

Deploy

One major thing we need to keep in mind while deploying the ML model is that the preprocessing pipeline should remain the same during training, testing and when it is in action. Any change in the pipeline parameters will have a huge impact on the end result. Now, how does RasaNLU keep these configurations intact? Check out Part 2, where we cover more about the metadata configuration.

With this we have come to the end of the Demystifying RasaNLU series on learning its internals. Hope you have enjoyed and learned a lot from this series. Do share your questions and comments below.