sparknlp.annotator.seq2seq.mistral_transformer
Contains classes for the MistralTransformer.
Module Contents
Classes
| MistralTransformer | Mistral 7B |
- class MistralTransformer(classname='com.johnsnowlabs.nlp.annotators.seq2seq.MistralTransformer', java_model=None)
- Mistral 7B
- Mistral 7B is a 7.3 billion-parameter model designed for efficient and effective natural language processing. It surpasses Llama 2 13B across all benchmarks and Llama 1 34B in various aspects, and it balances English language tasks with code comprehension, rivaling the capabilities of CodeLlama 7B in the latter.
- Mistral 7B uses grouped-query attention (GQA) for quicker inference, enhancing processing speed without compromising accuracy. This makes Mistral 7B a practical choice for real-world applications.
- Additionally, Mistral 7B adopts sliding window attention (SWA) to handle longer sequences at a reduced computational cost. This improves the model’s ability to process extensive textual input and expands its utility for more complex tasks.
- In summary, Mistral 7B represents a notable advancement in language models, offering a reliable and versatile solution for various natural language processing challenges.
- Pretrained models can be loaded with pretrained() of the companion object:
>>> mistral = MistralTransformer.pretrained() \
...     .setInputCols(["document"]) \
...     .setOutputCol("generation")
- The default model is "mistral_7b", if no name is provided. For available pretrained models please see the Models Hub.
| Input Annotation types | Output Annotation type |
| DOCUMENT | DOCUMENT |
- Parameters:
- configProtoBytes
- ConfigProto from tensorflow, serialized into byte array. 
- minOutputLength
- Minimum length of the sequence to be generated, by default 0 
- maxOutputLength
- Maximum length of output text, by default 20 
- doSample
- Whether or not to use sampling; use greedy decoding otherwise, by default False 
- temperature
- The value used to modulate the next token probabilities, by default 1.0 
- topK
- The number of highest probability vocabulary tokens to keep for top-k-filtering, by default 50 
- topP
- Top cumulative probability for vocabulary tokens, by default 1.0. If set to float < 1, only the most probable tokens with probabilities that add up to topP or higher are kept for generation.
- repetitionPenalty
- The parameter for repetition penalty, by default 1.0. A value of 1.0 means no penalty. 
- noRepeatNgramSize
- If set to int > 0, all n-grams of that size can only occur once, by default 0 
- ignoreTokenIds
- A list of token ids which are ignored in the decoder’s output, by default [] 
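Returning to the sliding window attention (SWA) mentioned in the model description: the idea can be illustrated with a small attention mask. This is a minimal sketch in plain Python, not the model's or Spark NLP's implementation. The function name `sliding_window_mask` and the convention that a position sees the last `window` positions including itself are assumptions made for illustration.

```python
def sliding_window_mask(seq_len, window):
    """Build a causal mask where mask[i][j] is True if position i may attend to j.

    Assumption for this sketch: each position sees at most the last `window`
    positions, counting itself, and never looks at future positions.
    """
    return [
        [i - window < j <= i for j in range(seq_len)]
        for i in range(seq_len)
    ]


mask = sliding_window_mask(seq_len=5, window=3)
for row in mask:
    # Print 1 for "may attend" and 0 for "masked out"
    print("".join("1" if allowed else "0" for allowed in row))
```

Each row contains at most `window` allowed positions, so the attention cost per token stays bounded as the sequence grows.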
 
 - Notes
- This is a very computationally expensive module, especially on longer sequences. The use of an accelerator such as a GPU is recommended.
 - References
- Paper Abstract:
- We introduce Mistral 7B v0.1, a 7-billion-parameter language model engineered for superior performance and efficiency. Mistral 7B outperforms Llama 2 13B across all evaluated benchmarks, and Llama 1 34B in reasoning, mathematics, and code generation. Our model leverages grouped-query attention (GQA) for faster inference, coupled with sliding window attention (SWA) to effectively handle sequences of arbitrary length with a reduced inference cost. We also provide a model fine-tuned to follow instructions, Mistral 7B – Instruct, that surpasses the Llama 2 13B – Chat model both on human and automated benchmarks. Our models are released under the Apache 2.0 license.
 - Examples
>>> import sparknlp
>>> from sparknlp.base import *
>>> from sparknlp.annotator import *
>>> from pyspark.ml import Pipeline
>>> documentAssembler = DocumentAssembler() \
...     .setInputCol("text") \
...     .setOutputCol("documents")
>>> mistral = MistralTransformer.pretrained("mistral_7b") \
...     .setInputCols(["documents"]) \
...     .setMaxOutputLength(50) \
...     .setOutputCol("generation")
>>> pipeline = Pipeline().setStages([documentAssembler, mistral])
>>> data = spark.createDataFrame([["My name is Leonardo."]]).toDF("text")
>>> result = pipeline.fit(data).transform(data)
>>> result.select("generation.result").show(truncate=False)
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|result                                                                                                                                                                                              |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|[Leonardo Da Vinci invented the microscope?\n Question: Leonardo Da Vinci invented the microscope?\n Answer: No, Leonardo Da Vinci did not invent the microscope. The first microscope was invented |
| in the late 16th century, long after Leonardo']                                                                                                                                                    |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
 - setIgnoreTokenIds(value)
- A list of token ids which are ignored in the decoder’s output.
- Parameters:
- value : List[int]
- The token ids to be filtered out 
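One common way to ignore token ids during decoding is to set their scores to negative infinity before the next token is chosen. The sketch below illustrates that idea in plain Python. It is an assumption about the general technique, not a description of how Spark NLP implements it, and `suppress_tokens` is a hypothetical helper name.

```python
def suppress_tokens(logits, ignore_token_ids):
    """Return a copy of logits where ignored token ids can never be selected."""
    ignored = set(ignore_token_ids)
    return [float("-inf") if i in ignored else x for i, x in enumerate(logits)]


# Token 1 has the highest raw score, but it is on the ignore list
masked = suppress_tokens([0.3, 1.2, 0.9], ignore_token_ids=[1])
best = max(range(len(masked)), key=lambda i: masked[i])
print(best)
```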
 
 
 - setConfigProtoBytes(b)
- Sets configProto from tensorflow, serialized into byte array.
- Parameters:
- b : List[int]
- ConfigProto from tensorflow, serialized into byte array 
 
 
 - setMinOutputLength(value)
- Sets minimum length of the sequence to be generated.
- Parameters:
- value : int
- Minimum length of the sequence to be generated 
 
 
 - setMaxOutputLength(value)
- Sets maximum length of output text.
- Parameters:
- value : int
- Maximum length of output text 
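To show how minimum and maximum output lengths typically interact during decoding, here is a minimal greedy loop in plain Python. It is a sketch under stated assumptions: `generate`, `toy_scores`, and the end-of-sequence id are hypothetical, and whether Spark NLP counts prompt tokens toward these lengths is not stated in this page.

```python
def generate(next_token_scores, eos_id, min_output_length=0, max_output_length=20):
    """Greedy decoding loop that honours minimum and maximum output lengths.

    next_token_scores is a hypothetical callable that maps the tokens
    generated so far to a list of scores, one per vocabulary entry.
    """
    output = []
    while len(output) < max_output_length:
        scores = list(next_token_scores(output))
        if len(output) < min_output_length:
            # End-of-sequence is not allowed until the minimum length is reached
            scores[eos_id] = float("-inf")
        token = max(range(len(scores)), key=lambda i: scores[i])
        if token == eos_id:
            break
        output.append(token)
    return output


def toy_scores(tokens):
    """Toy scorer for illustration: always prefers token 0 (EOS), then token 1."""
    return [3.0, 2.0, 1.0]


short = generate(toy_scores, eos_id=0)
forced = generate(toy_scores, eos_id=0, min_output_length=3, max_output_length=5)
capped = generate(toy_scores, eos_id=0, min_output_length=10, max_output_length=4)
```

In this toy setup, `short` stops immediately, `forced` is pushed to three tokens before end-of-sequence is accepted, and `capped` stops at the maximum even though the minimum has not been reached.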
 
 
 - setDoSample(value)
- Sets whether or not to use sampling; use greedy decoding otherwise.
- Parameters:
- value : bool
- Whether or not to use sampling; use greedy decoding otherwise 
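The difference between greedy decoding and sampling can be sketched in a few lines of plain Python. This is an illustration of the general technique only; `pick_next_token` is a hypothetical helper, not part of Spark NLP.

```python
import random


def pick_next_token(probs, do_sample=False, rng=None):
    """Pick a token index greedily, or by sampling from the distribution."""
    if not do_sample:
        # Greedy decoding: always take the most probable token
        return max(range(len(probs)), key=lambda i: probs[i])
    rng = rng or random.Random()
    # Sampling: draw one index with probability proportional to its weight
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


probs = [0.1, 0.6, 0.3]
greedy_choice = pick_next_token(probs)
sampled_choice = pick_next_token(probs, do_sample=True, rng=random.Random(0))
```

Greedy decoding is deterministic for a given distribution, while sampling can return any token with non-zero probability.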
 
 
 - setTemperature(value)
- Sets the value used to modulate the next token probabilities.
- Parameters:
- value : float
- The value used to modulate the next token probabilities 
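Temperature is usually applied by dividing the logits by the temperature before the softmax. The following plain-Python sketch shows the effect on a toy distribution. The function name `softmax_with_temperature` is an assumption for illustration and is not a Spark NLP API.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]
sharp = softmax_with_temperature(logits, temperature=0.5)
neutral = softmax_with_temperature(logits, temperature=1.0)
flat = softmax_with_temperature(logits, temperature=2.0)
```

Values below 1.0 concentrate probability on the top token, and values above 1.0 spread it more evenly.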
 
 
 - setTopK(value)
- Sets the number of highest probability vocabulary tokens to keep for top-k-filtering.
- Parameters:
- value : int
- Number of highest probability vocabulary tokens to keep 
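Top-k filtering keeps only the k highest-scoring tokens and removes the rest from consideration. A minimal plain-Python sketch follows; `top_k_filter` is a hypothetical name, and treating `k <= 0` as "no filtering" is an assumption of this sketch.

```python
def top_k_filter(logits, k):
    """Keep the k largest logits and set all others to -inf."""
    if k <= 0 or k >= len(logits):
        return list(logits)
    # Indices of the k highest-scoring tokens (ties resolved by position)
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    keep = set(ranked[:k])
    return [x if i in keep else float("-inf") for i, x in enumerate(logits)]


filtered = top_k_filter([1.5, 0.2, 3.1, -0.7, 2.4], k=2)
```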
 
 
 - setTopP(value)
- Sets the top cumulative probability for vocabulary tokens.
- If set to float < 1, only the most probable tokens with probabilities that add up to topP or higher are kept for generation.
- Parameters:
- value : float
- Cumulative probability for vocabulary tokens 
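Top-p (nucleus) filtering keeps the smallest set of most probable tokens whose probabilities add up to at least the threshold, then renormalizes. The plain-Python sketch below illustrates this; `top_p_filter` is a hypothetical helper and not part of Spark NLP.

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of most probable tokens whose mass reaches top_p."""
    if top_p >= 1.0:
        return list(probs)
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = set(), 0.0
    for i in order:
        keep.add(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    # Renormalize the surviving probabilities so they sum to 1
    kept_mass = sum(probs[i] for i in keep)
    return [p / kept_mass if i in keep else 0.0 for i, p in enumerate(probs)]


nucleus = top_p_filter([0.5, 0.3, 0.15, 0.05], top_p=0.75)
```

Here the two most probable tokens already cover 0.8 of the mass, so the remaining two are dropped.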
 
 
 - setRepetitionPenalty(value)
- Sets the parameter for repetition penalty. 1.0 means no penalty.
- Parameters:
- value : float
- The repetition penalty 
 - References
- See CTRL: A Conditional Transformer Language Model for Controllable Generation for more details. 
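A repetition penalty in the style of the CTRL paper lowers the scores of tokens that have already been generated. The sketch below uses a widely used variant that divides positive scores and multiplies negative ones, so that the score always decreases for a penalty above 1.0. That variant and the name `apply_repetition_penalty` are assumptions for illustration, not a statement of Spark NLP's implementation.

```python
def apply_repetition_penalty(logits, generated_ids, penalty=1.0):
    """Make tokens that were already generated less likely to be chosen again."""
    adjusted = list(logits)
    for token_id in set(generated_ids):
        score = adjusted[token_id]
        # For penalty > 1: dividing a positive score or multiplying a
        # negative one both push the score down
        adjusted[token_id] = score / penalty if score > 0 else score * penalty
    return adjusted


penalized = apply_repetition_penalty(
    [2.0, -1.0, 0.5], generated_ids=[0, 1, 1], penalty=2.0
)
unchanged = apply_repetition_penalty([2.0, -1.0, 0.5], generated_ids=[0, 1], penalty=1.0)
```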
 - setNoRepeatNgramSize(value)
- Sets size of n-grams that can only occur once.
- If set to int > 0, all n-grams of that size can only occur once.
- Parameters:
- value : int
- Size of n-grams that can only occur once 
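Preventing repeated n-grams usually means finding, at each step, the tokens that would complete an n-gram already present in the output and banning them. A minimal plain-Python sketch of that check follows; `banned_next_tokens` is a hypothetical helper name, not a Spark NLP function.

```python
def banned_next_tokens(generated_ids, ngram_size):
    """Return token ids that would complete an n-gram already in the output."""
    if ngram_size <= 0 or len(generated_ids) < ngram_size - 1:
        return set()
    # The last (n - 1) tokens form the prefix of the n-gram about to be completed
    prefix = tuple(generated_ids[len(generated_ids) - (ngram_size - 1):])
    banned = set()
    for start in range(len(generated_ids) - ngram_size + 1):
        ngram = tuple(generated_ids[start:start + ngram_size])
        if ngram[:-1] == prefix:
            banned.add(ngram[-1])
    return banned


# The bigram (5, 7) already occurred, so 7 may not follow the trailing 5
bigram_ban = banned_next_tokens([5, 7, 5], ngram_size=2)
# The trigram (1, 2, 3) already occurred, so 3 may not follow (1, 2)
trigram_ban = banned_next_tokens([1, 2, 3, 1, 2], ngram_size=3)
```

The banned ids can then be masked out before the next token is selected.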
 
 
 - static loadSavedModel(folder, spark_session, use_openvino=False)
- Loads a locally saved model.
- Parameters:
- folder : str
- Folder of the saved model 
- spark_session : pyspark.sql.SparkSession
- The current SparkSession 
- use_openvino : bool, optional
- Whether to load the model with the OpenVINO backend, by default False 
 
- Returns:
- MistralTransformer
- The restored model 
 
 
 - static pretrained(name='mistral_7b', lang='en', remote_loc=None)
- Downloads and loads a pretrained model.
- Parameters:
- name : str, optional
- Name of the pretrained model, by default "mistral_7b" 
- lang : str, optional
- Language of the pretrained model, by default "en" 
- remote_loc : str, optional
- Optional remote address of the resource, by default None. Will use Spark NLP's repositories otherwise. 
 
- Returns:
- MistralTransformer
- The restored model