Intent Classification for Airline Travel Information System queries (ATIS dataset)

Description

Classify user questions addressed to an airline travel information system into 5 intent categories.

Predicted Entities

atis_abbreviation, atis_airfare, atis_airline, atis_flight, atis_ground_service

How to use

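import sparknlp
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler, LightPipeline
from sparknlp.annotator import UniversalSentenceEncoder, ClassifierDLModel

spark = sparknlp.start()

# Convert raw text into Spark NLP document annotations
document_assembler = DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

# Compute Universal Sentence Encoder embeddings for each document
use = UniversalSentenceEncoder.pretrained('tfhub_use', lang="en") \
    .setInputCols(["document"])\
    .setOutputCol("sentence_embeddings")

# Pretrained ATIS intent classifier
document_classifier = ClassifierDLModel.pretrained('classifierdl_use_atis', 'en') \
    .setInputCols(["document", "sentence_embeddings"]) \
    .setOutputCol("class")

nlp_pipeline = Pipeline(stages=[document_assembler, use, document_classifier])
# All stages are pretrained, so fitting on an empty DataFrame only assembles the pipeline
light_pipeline = LightPipeline(nlp_pipeline.fit(spark.createDataFrame([['']]).toDF("text")))

annotations = light_pipeline.fullAnnotate(['what is the price of flight from newyork to washington', 'how many flights does twa have in business class'])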

import nlu
nlu.load("en.classify.intent.airline").predict("""what is the price of flight from newyork to washington""")

Results

+--------------------------------------------------------+----------------+
| document                                               | class          |
+--------------------------------------------------------+----------------+
| what is the price of flight from newyork to washington | atis_airfare   |
| how many flights does twa have in business class       | atis_quantity  |
+--------------------------------------------------------+----------------+
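
To get from the `fullAnnotate` output to a table like the one above, the predicted label has to be pulled out of each annotation. The following is a minimal sketch, assuming `fullAnnotate` returns one dictionary per input text, keyed by output column name, whose values are lists of annotation objects exposing a `result` attribute. The helper name `extract_labels` is hypothetical, and the `demo` list is only a stand-in that mimics that shape (with labels copied from the table) so the snippet runs without a Spark session.

```python
from types import SimpleNamespace


def extract_labels(annotations, column="class"):
    """Return the predicted label(s) for each annotated text.

    `annotations` is assumed to be a list of dicts, one per input text,
    mapping an output column name to a list of objects with a `.result`.
    """
    return [[ann.result for ann in doc.get(column, [])] for doc in annotations]


# Stand-in data shaped like the assumed fullAnnotate output (not real model output).
demo = [
    {"class": [SimpleNamespace(result="atis_airfare")]},
    {"class": [SimpleNamespace(result="atis_quantity")]},
]

print(extract_labels(demo))
```

With a live pipeline, the same helper would be called as `extract_labels(annotations)` on the list produced in the usage example.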

Model Information

Model Name:	classifierdl_use_atis
Compatibility:	Spark NLP 2.7.1+
License:	Open Source
Edition:	Official
Input Labels:	[sentence_embeddings]
Output Labels:	[class]
Language:	en
Dependencies:	tfhub_use

Data Source

This model is trained on data obtained from https://www.kaggle.com/hassanamin/atis-airlinetravelinformationsystem

Benchmarking

                    precision    recall  f1-score   support

  atis_abbreviation      1.00      1.00      1.00        33
       atis_airfare      0.60      0.96      0.74        48
       atis_airline      0.69      0.89      0.78        38
        atis_flight      0.99      0.93      0.96       632
atis_ground_service      1.00      1.00      1.00        36

           accuracy                          0.93       787
          macro avg      0.86      0.96      0.90       787
       weighted avg      0.95      0.93      0.94       787
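
As a reading aid for the report, the sketch below recomputes the per-class F1 scores and the macro-averaged precision and recall from the per-class figures. It assumes the usual definitions (F1 as the harmonic mean of precision and recall, macro average as the unweighted mean over classes). Because the inputs are already rounded to two decimals, recomputed values can differ from the published ones in the last digit.

```python
# Per-class (precision, recall, support), copied from the report.
report = {
    "atis_abbreviation": (1.00, 1.00, 33),
    "atis_airfare": (0.60, 0.96, 48),
    "atis_airline": (0.69, 0.89, 38),
    "atis_flight": (0.99, 0.93, 632),
    "atis_ground_service": (1.00, 1.00, 36),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


f1_scores = {label: f1(p, r) for label, (p, r, _) in report.items()}
total_support = sum(support for _, _, support in report.values())

# Macro averages weight every class equally, regardless of support.
macro_precision = sum(p for p, _, _ in report.values()) / len(report)
macro_recall = sum(r for _, r, _ in report.values()) / len(report)

for label, score in f1_scores.items():
    print(f"{label:>20s}  f1={score:.2f}")
print(f"macro precision={macro_precision:.2f}  macro recall={macro_recall:.2f}")
print(f"total support={total_support}")
```

The gap between the macro and weighted averages reflects the class imbalance: `atis_flight` accounts for 632 of the 787 test examples, so it dominates the weighted figures.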
