Description
Classify user questions into 5 categories of an Airline Travel Information System (ATIS).
Predicted Entities
atis_abbreviation, atis_airfare, atis_airline, atis_flight, atis_ground_service
How to use
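import sparknlp
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler, LightPipeline
from sparknlp.annotator import UniversalSentenceEncoder, ClassifierDLModel
# Start (or reuse) a Spark session with Spark NLP loaded; skip if `spark` already exists
spark = sparknlp.start()
# Convert the raw "text" column into document annotations
document_assembler = DocumentAssembler()\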
.setInputCol("text")\
.setOutputCol("document")
use = UniversalSentenceEncoder.pretrained('tfhub_use', lang="en") \
.setInputCols(["document"])\
.setOutputCol("sentence_embeddings")
document_classifier = ClassifierDLModel.pretrained('classifierdl_use_atis', 'en') \
.setInputCols(["document", "sentence_embeddings"]) \
.setOutputCol("class")
nlp_pipeline = Pipeline(stages=[document_assembler, use, document_classifier])
# All stages are pretrained, so fitting on an empty DataFrame only assembles the pipeline
light_pipeline = LightPipeline(nlp_pipeline.fit(spark.createDataFrame([['']]).toDF("text")))
annotations = light_pipeline.fullAnnotate(['what is the price of flight from newyork to washington', 'how many flights does twa have in business class'])
import nlu
nlu.load("en.classify.intent.airline").predict("""what is the price of flight from newyork to washington""")
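The `annotations` value returned by `fullAnnotate` is a list with one dictionary per input text, keyed by output column, where each value is a list of annotation objects carrying a `result` attribute. A minimal sketch of pulling out the predicted label per text follows. The helper name `predicted_labels` and the `fake_annotations` stand-in are hypothetical, introduced only so the snippet runs without a Spark session.

```python
from types import SimpleNamespace

def predicted_labels(annotations, output_col="class"):
    """Return the first predicted label for each annotated text, or None if absent."""
    labels = []
    for row in annotations:
        preds = row.get(output_col, [])
        labels.append(preds[0].result if preds else None)
    return labels

# Stand-in for the output of light_pipeline.fullAnnotate(...) (assumption:
# real entries are Spark NLP Annotation objects, which also expose `.result`)
fake_annotations = [
    {"class": [SimpleNamespace(result="atis_airfare")]},
    {"class": []},
]
print(predicted_labels(fake_annotations))
```

With the real pipeline, the call would be `predicted_labels(annotations)`.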
Results
+--------------------------------------------------------+---------------+
| document                                               | class         |
+--------------------------------------------------------+---------------+
| what is the price of flight from newyork to washington | atis_airfare  |
| how many flights does twa have in business class       | atis_quantity |
+--------------------------------------------------------+---------------+
Model Information
Model Name: | classifierdl_use_atis |
Compatibility: | Spark NLP 2.7.1+ |
License: | Open Source |
Edition: | Official |
Input Labels: | [sentence_embeddings] |
Output Labels: | [class] |
Language: | en |
Dependencies: | tfhub_use |
Data Source
This model is trained on data obtained from https://www.kaggle.com/hassanamin/atis-airlinetravelinformationsystem
Benchmarking
                     precision    recall  f1-score   support
  atis_abbreviation       1.00      1.00      1.00        33
       atis_airfare       0.60      0.96      0.74        48
       atis_airline       0.69      0.89      0.78        38
        atis_flight       0.99      0.93      0.96       632
atis_ground_service       1.00      1.00      1.00        36
           accuracy                           0.93       787
          macro avg       0.86      0.96      0.90       787
       weighted avg       0.95      0.93      0.94       787
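As a sanity check on the summary rows, the macro and weighted averages can be recomputed from the per-class rows. The sketch below hard-codes the values from the benchmark table; the helper names `macro_avg` and `weighted_avg` are illustrative, and because the inputs are already rounded to two decimals, the results should agree with the reported rows only to within rounding.

```python
# Per-class (precision, recall, f1-score, support), copied from the benchmark table
report = {
    "atis_abbreviation":   (1.00, 1.00, 1.00, 33),
    "atis_airfare":        (0.60, 0.96, 0.74, 48),
    "atis_airline":        (0.69, 0.89, 0.78, 38),
    "atis_flight":         (0.99, 0.93, 0.96, 632),
    "atis_ground_service": (1.00, 1.00, 1.00, 36),
}

def macro_avg(rows, idx):
    """Unweighted mean of one metric over all classes."""
    return sum(r[idx] for r in rows) / len(rows)

def weighted_avg(rows, idx):
    """Mean of one metric weighted by each class's support."""
    total = sum(r[3] for r in rows)
    return sum(r[idx] * r[3] for r in rows) / total

rows = list(report.values())
total_support = sum(r[3] for r in rows)
print("total support:", total_support)
for name, idx in (("precision", 0), ("recall", 1), ("f1-score", 2)):
    print(f"{name:10s} macro={macro_avg(rows, idx):.3f} "
          f"weighted={weighted_avg(rows, idx):.3f}")
```

The macro average treats the five classes equally, so the weaker precision on atis_airfare and atis_airline pulls it down, while the weighted average is dominated by atis_flight, which accounts for 632 of the 787 test examples. For single-label classification, support-weighted recall equals accuracy by definition, which is why those two values coincide in the table.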