Description
Understand user commands and find relevant entities and actions and tag them to get a structured representation for automation.
Predicted Entities
playlist_owner
, served_dish
, track
, poi
, cuisine
, spatial_relation
, object_type
, facility
, album
, country
, geographic_poi
, location_name
, object_part_of_series_type
, object_select
, artist
, rating_value
, best_rating
, sort
, party_size_description
, party_size_number
, restaurant_name
, object_location_type
, playlist
, service
, city
, O
, genre
, movie_name
, current_location
, rating_unit
, restaurant_type
, condition_temperature
, condition_description
, entity_name
, movie_type
, object_name
, state
, year
, music_item
, timeRange
Live Demo Open in Colab Download Copy S3 URI
How to use
...
embeddings = WordEmbeddingsModel.pretrained("glove_100d", "en")\
.setInputCols("sentence", "token") \
.setOutputCol("embeddings")
ner = NerDLModel.pretrained("nerdl_snips_100d") \
.setInputCols(["sentence", "token", "embeddings"]) \
.setOutputCol("ner")
ner_converter = NerConverter()\
.setInputCols(['document', 'token', 'ner']) \
.setOutputCol('ner_chunk')
nlp_pipeline = Pipeline(stages=[document_assembler, sentencer, tokenizer, embeddings, ner, ner_converter])
l_model = LightPipeline(nlp_pipeline.fit(spark.createDataFrame([['']]).toDF("text")))
annotations = l_model.fullAnnotate('book a spot for nona gray myrtle and alison at a top-rated brasserie that is distant from wilson av on nov the 4th 2030 that serves ouzeri')
...
...
val embeddings = WordEmbeddingsModel.pretrained("glove_100d", "en")
.setInputCols(Array("sentence", 'token'))
.setOutputCol("embeddings")
val ner = NerDLModel.pretrained('nerdl_snips_100d')
.setInputCols(Array('sentence', 'token', 'embeddings')).setOutputCol('ner')
val ner_converter = NerConverter.setInputCols(Array('document', 'token', 'ner'))
.setOutputCol('ner_chunk')
val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings, ner, ner_converter))
val data = Seq("book a spot for nona gray myrtle and alison at a top-rated brasserie that is distant from wilson av on nov the 4th 2030 that serves ouzeri").toDF("text")
val result = pipeline.fit(data).transform(data)
...
import nlu
nlu.load("en.classify.snips").predict("""book a spot for nona gray myrtle and alison at a top-rated brasserie that is distant from wilson av on nov the 4th 2030 that serves ouzeri""")
Results
+------------------------------+------------------------+
| ner_chunk | label |
+------------------------------+------------------------+
| nona gray myrtle and alison | PARTY_SIZE_DESCRIPTION |
| top-rated | SORT |
| brasserie | RESTAURANT_TYPE |
| distant | SPATIAL_RELATION |
| wilson Macro-average | POI |
| nov the 4th 2030 | TIMERANGE |
| ouzeri | CUISINE |
+------------------------------+------------------------+
Model Information
Model Name: | nerdl_snips_100d |
Type: | ner |
Compatibility: | Spark NLP 2.7.3+ |
License: | Open Source |
Edition: | Official |
Input Labels: | [document, token, embeddings] |
Output Labels: | [ner] |
Language: | en |
Data Source
This model is trained on the NLU Benchmark SNIPS dataset https://github.com/MiuLab/SlotGated-SLU
Benchmarking
B-facility 3 0 0 1.0 1.0 1.0
B-poi 7 0 1 1.0 0.875 0.93333334
B-object_location_type 22 1 0 0.95652175 1.0 0.9777778
B-service 24 2 0 0.9230769 1.0 0.96000004
I-entity_name 53 2 1 0.96363634 0.9814815 0.9724771
B-genre 5 0 0 1.0 1.0 1.0
I-service 5 0 0 1.0 1.0 1.0
I-object_type 66 0 0 1.0 1.0 1.0
I-sort 9 0 0 1.0 1.0 1.0
I-city 19 1 0 0.95 1.0 0.9743589
B-music_item 102 2 2 0.9807692 0.9807692 0.9807692
I-movie_name 100 5 21 0.95238096 0.8264463 0.8849558
B-party_size_description 10 0 0 1.0 1.0 1.0
B-served_dish 10 3 2 0.7692308 0.8333333 0.8
B-object_type 161 8 1 0.9526627 0.99382716 0.9728097
B-playlist 123 6 6 0.95348835 0.95348835 0.95348835
B-restaurant_name 14 1 1 0.93333334 0.93333334 0.93333334
B-geographic_poi 11 0 0 1.0 1.0 1.0
B-condition_description 28 0 0 1.0 1.0 1.0
I-object_location_type 16 0 0 1.0 1.0 1.0
B-spatial_relation 70 3 1 0.9589041 0.9859155 0.9722222
I-party_size_description 35 0 0 1.0 1.0 1.0
I-poi 10 0 1 1.0 0.90909094 0.95238096
I-artist 111 4 1 0.9652174 0.9910714 0.9779735
B-condition_temperature 23 0 0 1.0 1.0 1.0
I-movie_type 16 0 0 1.0 1.0 1.0
I-object_part_of_series_type 0 0 1 0.0 0.0 0.0
B-city 60 1 0 0.9836066 1.0 0.9917355
I-location_name 29 0 1 1.0 0.96666664 0.9830508
B-album 0 2 10 0.0 0.0 0.0
I-genre 2 0 0 1.0 1.0 1.0
B-state 55 0 4 1.0 0.9322034 0.9649123
I-object_name 383 29 16 0.9296116 0.9598997 0.9445129
B-current_location 13 0 1 1.0 0.9285714 0.9629629
B-timeRange 102 8 5 0.92727274 0.95327103 0.9400922
B-sort 29 1 3 0.96666664 0.90625 0.9354838
I-timeRange 144 7 0 0.95364237 1.0 0.97627115
B-rating_unit 40 0 0 1.0 1.0 1.0
I-current_location 7 0 0 1.0 1.0 1.0
I-state 6 0 0 1.0 1.0 1.0
I-album 4 1 17 0.8 0.1904762 0.30769232
B-entity_name 31 4 2 0.8857143 0.93939394 0.9117647
B-object_name 134 22 13 0.85897434 0.91156465 0.88448846
B-playlist_owner 70 1 0 0.9859155 1.0 0.9929078
I-music_item 5 0 0 1.0 1.0 1.0
I-spatial_relation 41 2 1 0.95348835 0.97619045 0.9647058
I-country 25 1 0 0.96153843 1.0 0.98039216
B-rating_value 80 0 0 1.0 1.0 1.0
B-restaurant_type 64 0 1 1.0 0.9846154 0.9922481
I-playlist_owner 7 0 0 1.0 1.0 1.0
I-cuisine 1 0 0 1.0 1.0 1.0
B-track 7 10 2 0.4117647 0.7777778 0.5384615
B-movie_name 37 2 10 0.94871795 0.78723407 0.8604651
B-party_size_number 50 0 0 1.0 1.0 1.0
I-restaurant_type 7 0 0 1.0 1.0 1.0
B-year 24 1 0 0.96 1.0 0.9795918
B-location_name 23 0 1 1.0 0.9583333 0.9787234
B-object_part_of_series_type 11 1 0 0.9166667 1.0 0.95652175
B-country 43 4 1 0.9148936 0.97727275 0.94505495
I-playlist 218 4 13 0.981982 0.94372296 0.96247244
I-served_dish 2 1 2 0.6666667 0.5 0.57142854
I-track 19 29 2 0.39583334 0.9047619 0.5507246
B-artist 99 4 8 0.9611651 0.92523366 0.9428571
B-best_rating 43 0 0 1.0 1.0 1.0
I-restaurant_name 35 2 1 0.9459459 0.9722222 0.9589041
B-object_select 40 1 0 0.9756098 1.0 0.9876543
B-cuisine 12 1 2 0.9230769 0.85714287 0.8888889
B-movie_type 33 0 0 1.0 1.0 1.0
I-geographic_poi 33 0 0 1.0 1.0 1.0
tp: 3121 fp: 177 fn: 155 labels: 69
Macro-average prec: 0.91982585, rec: 0.9205297, f1: 0.9201776
Micro-average prec: 0.9463311, rec: 0.9526862, f1: 0.949498