sparknlp.annotator.seq2seq.mistral_transformer
Contains classes for the MistralTransformer.
Module Contents#
Classes#
MistralTransformer: Mistral 7B
- class MistralTransformer(classname='com.johnsnowlabs.nlp.annotators.seq2seq.MistralTransformer', java_model=None)[source]#
Mistral 7B
Mistral 7B, a 7.3-billion-parameter model that stands out for its efficient and effective performance in natural language processing. It surpasses Llama 2 13B across all benchmarks and outperforms Llama 1 34B in many aspects, striking a balance between English language tasks and code comprehension, where it rivals the capabilities of CodeLlama 7B.
Mistral 7B introduces Grouped-query attention (GQA) for quicker inference, enhancing processing speed without compromising accuracy. This streamlined approach ensures a smoother user experience, making Mistral 7B a practical choice for real-world applications.
Additionally, Mistral 7B adopts Sliding Window Attention (SWA) to efficiently handle longer sequences at a reduced computational cost. This feature enhances the model’s ability to process extensive textual input, expanding its utility in handling more complex tasks.
In summary, Mistral 7B represents a notable advancement in language models, offering a reliable and versatile solution for various natural language processing challenges.
Pretrained models can be loaded with pretrained() of the companion object:

>>> mistral = MistralTransformer.pretrained() \
...     .setInputCols(["document"]) \
...     .setOutputCol("generation")

The default model is "mistral_7b", if no name is provided. For available pretrained models please see the Models Hub.

Input Annotation types: DOCUMENT
Output Annotation type: DOCUMENT
- Parameters:
- configProtoBytes
ConfigProto from tensorflow, serialized into byte array.
- minOutputLength
Minimum length of the sequence to be generated, by default 0
- maxOutputLength
Maximum length of output text, by default 20
- doSample
Whether or not to use sampling; use greedy decoding otherwise, by default False
- temperature
The value used to modulate the next-token probabilities, by default 1.0
- topK
The number of highest probability vocabulary tokens to keep for top-k-filtering, by default 50
- topP
Top cumulative probability for vocabulary tokens, by default 1.0
If set to float < 1, only the most probable tokens with probabilities that add up to topP or higher are kept for generation.
- repetitionPenalty
The parameter for repetition penalty, 1.0 means no penalty, by default 1.0
- noRepeatNgramSize
If set to int > 0, all ngrams of that size can only occur once, by default 0
- ignoreTokenIds
A list of token ids which are ignored in the decoder’s output, by default []
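The interaction of topK and topP can be illustrated with a small, self-contained sketch. The helper below is hypothetical (plain Python, not part of the Spark NLP API): tokens outside the top-k, or beyond the top-p cumulative mass, are dropped and the remaining probabilities are renormalised before sampling.

```python
def top_k_top_p_filter(probs, top_k=50, top_p=1.0):
    """Illustrative sketch of topK/topP filtering (not the Spark NLP
    implementation). `probs` maps token -> probability."""
    # Keep only the top_k most probable tokens.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    # Keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = {}, 0.0
    for token, p in ranked:
        kept[token] = p
        cumulative += p
        if cumulative >= top_p:
            break
    # Renormalise so the surviving probabilities sum to 1.
    total = sum(kept.values())
    return {token: p / total for token, p in kept.items()}

probs = {"the": 0.5, "a": 0.3, "cat": 0.15, "dog": 0.05}
# With top_p=0.8, only "the" and "a" survive (0.5 + 0.3 reaches the mass).
print(top_k_top_p_filter(probs, top_k=3, top_p=0.8))
```

With topP at its default of 1.0 the whole top-k set is kept, so the two parameters compose: topK caps the candidate count, topP caps the cumulative mass.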
Notes
This is a very computationally expensive annotator, especially on longer sequences. The use of an accelerator such as a GPU is recommended.
Example output (truncated):
Question: Leonardo Da Vinci invented the microscope? Answer: No, Leonardo Da Vinci did not invent the microscope. The first microscope was invented in the late 16th century, long after Leonardo
- setIgnoreTokenIds(value)[source]#
A list of token ids which are ignored in the decoder’s output.
- Parameters:
- valueList[int]
The words to be filtered out
- setConfigProtoBytes(b)[source]#
Sets configProto from tensorflow, serialized into byte array.
- Parameters:
- bList[int]
ConfigProto from tensorflow, serialized into byte array
- setMinOutputLength(value)[source]#
Sets minimum length of the sequence to be generated.
- Parameters:
- valueint
Minimum length of the sequence to be generated
- setMaxOutputLength(value)[source]#
Sets maximum length of output text.
- Parameters:
- valueint
Maximum length of output text
- setDoSample(value)[source]#
Sets whether or not to use sampling, use greedy decoding otherwise.
- Parameters:
- valuebool
Whether or not to use sampling; use greedy decoding otherwise
- setTemperature(value)[source]#
Sets the value used to modulate the next-token probabilities.
- Parameters:
- valuefloat
The value used to modulate the next-token probabilities
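What "modulating" means here can be sketched in a few lines of plain Python (illustrative names, not the Spark NLP API): logits are divided by the temperature before the softmax, so values below 1.0 sharpen the distribution and values above 1.0 flatten it.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Sketch of temperature scaling (not the Spark NLP implementation)."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
sharp = softmax_with_temperature(logits, temperature=0.5)
flat = softmax_with_temperature(logits, temperature=2.0)
# A lower temperature concentrates probability mass on the top token.
assert sharp[0] > flat[0]
```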
- setTopK(value)[source]#
Sets the number of highest probability vocabulary tokens to keep for top-k-filtering.
- Parameters:
- valueint
Number of highest probability vocabulary tokens to keep
- setTopP(value)[source]#
Sets the top cumulative probability for vocabulary tokens.
If set to float < 1, only the most probable tokens with probabilities that add up to topP or higher are kept for generation.
- Parameters:
- valuefloat
Cumulative probability for vocabulary tokens
- setRepetitionPenalty(value)[source]#
Sets the parameter for repetition penalty. 1.0 means no penalty.
- Parameters:
- valuefloat
The repetition penalty
References
See CTRL: A Conditional Transformer Language Model for Controllable Generation for more details.
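The CTRL-style penalty referenced above can be sketched as follows (illustrative helper, not the Spark NLP implementation): logits of tokens already present in the generated sequence are divided by the penalty when positive and multiplied by it when negative, making repeats less likely.

```python
def apply_repetition_penalty(logits, generated_ids, penalty=1.0):
    """Sketch of a CTRL-style repetition penalty over raw logits."""
    out = list(logits)
    for token_id in set(generated_ids):
        if out[token_id] > 0:
            out[token_id] /= penalty  # shrink positive logits of seen tokens
        else:
            out[token_id] *= penalty  # push negative logits further down
    return out

logits = [3.0, 1.0, -1.0]
# Token 0 was already generated, so its logit shrinks with penalty > 1.
print(apply_repetition_penalty(logits, generated_ids=[0], penalty=1.2))
```

With the default penalty of 1.0 the logits pass through unchanged, which matches "1.0 means no penalty" above.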
- setNoRepeatNgramSize(value)[source]#
Sets size of n-grams that can only occur once.
If set to int > 0, all ngrams of that size can only occur once.
- Parameters:
- valueint
Size of n-grams that may occur only once
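The blocking rule can be sketched with a hypothetical helper (plain Python, not the Spark NLP implementation): the last n-1 generated tokens form a prefix, and any token that has previously followed that prefix is banned as the next token.

```python
def banned_next_tokens(generated_ids, ngram_size):
    """Sketch of noRepeatNgramSize: tokens that would complete an
    n-gram already present in `generated_ids`."""
    if ngram_size <= 0 or len(generated_ids) < ngram_size - 1:
        return set()
    prefix = tuple(generated_ids[-(ngram_size - 1):]) if ngram_size > 1 else ()
    banned = set()
    for i in range(len(generated_ids) - ngram_size + 1):
        # If an earlier window starts with the same prefix, ban its last token.
        if tuple(generated_ids[i:i + ngram_size - 1]) == prefix:
            banned.add(generated_ids[i + ngram_size - 1])
    return banned

# With ngram_size=2, token 7 may not follow token 5 a second time.
print(banned_next_tokens([5, 7, 9, 5], ngram_size=2))  # {7}
```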
- static loadSavedModel(folder, spark_session, use_openvino=False)[source]#
Loads a locally saved model.
- Parameters:
- folderstr
Folder of the saved model
- spark_sessionpyspark.sql.SparkSession
The current SparkSession
- use_openvinobool, optional
Whether to use the OpenVINO engine to load the model, by default False
- Returns:
- MistralTransformer
The restored model
- static pretrained(name='mistral_7b', lang='en', remote_loc=None)[source]#
Downloads and loads a pretrained model.
- Parameters:
- namestr, optional
Name of the pretrained model, by default “mistral_7b”
- langstr, optional
Language of the pretrained model, by default “en”
- remote_locstr, optional
Optional remote address of the resource, by default None. Will use Spark NLP's repositories otherwise.
- Returns:
- MistralTransformer
The restored model