Financial Chinese BERT Sentence Embeddings (Finance, BQCorpus)

Description

Pretrained BERT sentence embeddings model, adapted from Hugging Face and curated for scalability and production readiness with Spark NLP. sbert-chinese-qmc-finance-v1 is a Chinese financial-domain model trained on BQCorpus, a large-scale banking question-matching dataset, and is suited to question-matching scenarios in the financial field.


How to use

Python:

from sparknlp.annotator import BertSentenceEmbeddings
sentence_embeddings = BertSentenceEmbeddings.pretrained("sbert_chinese_qmc_finance_v1", "zh")\
  .setInputCols(["sentence"])\
  .setOutputCol("sbert_embeddings")

Scala:

import com.johnsnowlabs.nlp.embeddings.BertSentenceEmbeddings
val sentence_embeddings = BertSentenceEmbeddings.pretrained("sbert_chinese_qmc_finance_v1", "zh")
  .setInputCols("sentence")
  .setOutputCol("bert_sentence")

NLU:

import nlu
nlu.load("zh.embed_sentence.bert").predict("""Put your text here.""")
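
The snippets above only construct the annotator. As a rough sketch of how it might be used for question matching, the Python example below wraps it in a minimal pipeline and compares two sentence vectors by cosine similarity. The helper names embed_questions and cosine_similarity are hypothetical, as is the choice to feed whole documents from DocumentAssembler straight into the embeddings stage (treating each input question as one sentence). Cosine similarity is a common way to compare sentence embeddings, but this card does not prescribe a scoring method.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def embed_questions(spark, questions):
    """Return (text, embedding) pairs for a list of question strings.

    Hypothetical helper: assumes `spark` is a session created with
    sparknlp.start() and that the pretrained model can be downloaded.
    """
    # Imported lazily so cosine_similarity works without Spark NLP installed.
    from pyspark.ml import Pipeline
    from sparknlp.base import DocumentAssembler
    from sparknlp.annotator import BertSentenceEmbeddings

    # Assumption: each question is short enough to be treated as one sentence,
    # so the document column is named "sentence" and no sentence detector is used.
    document_assembler = (
        DocumentAssembler()
        .setInputCol("text")
        .setOutputCol("sentence")
    )
    sentence_embeddings = (
        BertSentenceEmbeddings.pretrained("sbert_chinese_qmc_finance_v1", "zh")
        .setInputCols(["sentence"])
        .setOutputCol("bert_sentence")
    )
    pipeline = Pipeline(stages=[document_assembler, sentence_embeddings])

    df = spark.createDataFrame([[q] for q in questions]).toDF("text")
    rows = (
        pipeline.fit(df)
        .transform(df)
        .select("text", "bert_sentence.embeddings")
        .collect()
    )
    # Each row holds one annotation, hence one embedding vector, per question.
    return [(row["text"], list(row["embeddings"][0])) for row in rows]
```

With a session from sparknlp.start(), calling embed_questions(spark, ["如何修改银行卡密码", "怎么更改银行卡的密码"]) and passing the two returned vectors to cosine_similarity gives a score between -1 and 1. The example questions are illustrative only, and any threshold for deciding that two questions match would need to be tuned on your own data.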

Model Information

Model Name: sbert_chinese_qmc_finance_v1
Compatibility: Spark NLP 4.2.4+
License: Open Source
Edition: Official
Input Labels: [sentence]
Output Labels: [bert_sentence]
Language: zh
Size: 384.0 MB
Case sensitive: true

References

https://huggingface.co/DMetaSoul/sbert-chinese-qmc-finance-v1