sparknlp.common.properties
Contains classes for Annotator properties.
Module Contents
Classes
- HasEmbeddingsProperties: Components that take parameters. This also provides an internal param map to store parameter values attached to the instance.
Functions
- setTask: Sets the transformer's task, e.g. "summarize:".
- setMinOutputLength: Sets minimum length of the sequence to be generated.
- setMaxOutputLength: Sets maximum length of output text.
- setDoSample: Sets whether or not to use sampling, use greedy decoding otherwise.
- setTemperature: Sets the value used to modulate the next token probabilities.
- setTopK: Sets the number of highest probability vocabulary tokens to keep for top-k-filtering.
- setTopP: Sets the top cumulative probability for vocabulary tokens.
- setRepetitionPenalty: Sets the parameter for repetition penalty. 1.0 means no penalty.
- setNoRepeatNgramSize: Sets size of n-grams that can only occur once.
- Sets the beam size for beam search.
- Sets the number of sequences to return from the beam search.
- class HasEmbeddingsProperties
Components that take parameters. This also provides an internal param map to store parameter values attached to the instance.
New in version 1.3.0.
- setTask(self, value)
Sets the transformer’s task, e.g. "summarize:".
- Parameters:
- value : str
The transformer’s task
- setMinOutputLength(self, value)
Sets minimum length of the sequence to be generated.
- Parameters:
- value : int
Minimum length of the sequence to be generated
- setMaxOutputLength(self, value)
Sets maximum length of output text.
- Parameters:
- value : int
Maximum length of output text
- setDoSample(self, value)
Sets whether to use sampling; greedy decoding is used otherwise.
- Parameters:
- value : bool
Whether to use sampling; greedy decoding is used otherwise
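The difference between the two decoding modes can be sketched in plain Python. This is an illustration of the general idea only, not Spark NLP's implementation; the helper `pick_next_token` and its arguments are hypothetical names introduced for this example.

```python
import random

def pick_next_token(probs, do_sample, rng=None):
    """Choose the next token index from a probability distribution.

    Hypothetical helper for illustration; not part of Spark NLP.
    """
    if do_sample:
        # Sampling: draw an index at random, weighted by its probability.
        rng = rng or random.Random()
        return rng.choices(range(len(probs)), weights=probs, k=1)[0]
    # Greedy decoding: always take the most probable index.
    return max(range(len(probs)), key=lambda i: probs[i])

probs = [0.6, 0.3, 0.1]
greedy_choice = pick_next_token(probs, do_sample=False)
sampled_choice = pick_next_token(probs, do_sample=True, rng=random.Random(0))
```

Greedy decoding is deterministic for a given distribution, while sampling can return any index with non-zero probability.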
- setTemperature(self, value)
Sets the value used to modulate the next token probabilities.
- Parameters:
- value : float
The value used to modulate the next token probabilities
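As a rough sketch of how a temperature value typically modulates token probabilities, the example below divides raw scores by the temperature before applying a softmax. It assumes the common convention for temperature scaling and is not taken from Spark NLP's code; `softmax_with_temperature` is a hypothetical helper.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities after dividing by the temperature.

    Hypothetical helper for illustration; not part of Spark NLP.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]
sharp = softmax_with_temperature(logits, temperature=0.5)  # more peaked
flat = softmax_with_temperature(logits, temperature=2.0)   # more uniform
```

Under this convention, values below 1.0 concentrate probability on the top-scoring token and values above 1.0 spread it more evenly.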
- setTopK(self, value)
Sets the number of highest probability vocabulary tokens to keep for top-k-filtering.
- Parameters:
- value : int
Number of highest probability vocabulary tokens to keep
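Top-k filtering can be sketched as follows: keep the k most probable tokens, discard the rest, and renormalize. This is a minimal illustration of the general technique, not Spark NLP's implementation; `top_k_filter` is a hypothetical name.

```python
def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize; zero out the rest.

    Hypothetical helper for illustration; not part of Spark NLP.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # Indices of the k largest probabilities (ties go to the lower index).
    keep = set(sorted(range(len(probs)), key=lambda i: (-probs[i], i))[:k])
    kept_mass = sum(probs[i] for i in keep)
    return [p / kept_mass if i in keep else 0.0 for i, p in enumerate(probs)]

probs = [0.5, 0.3, 0.15, 0.05]
filtered = top_k_filter(probs, k=2)  # only the first two tokens survive
```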
- setTopP(self, value)
Sets the top cumulative probability for vocabulary tokens.
If set to float < 1, only the most probable tokens with probabilities that add up to topP or higher are kept for generation.
- Parameters:
- value : float
Cumulative probability for vocabulary tokens
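The cumulative-probability rule described above can be sketched in plain Python: tokens are taken in order of decreasing probability until their total reaches the threshold. This is an illustrative sketch of the general technique, not Spark NLP's implementation; `top_p_filter` is a hypothetical name.

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of most probable tokens whose mass reaches top_p.

    Hypothetical helper for illustration; not part of Spark NLP.
    """
    order = sorted(range(len(probs)), key=lambda i: (-probs[i], i))
    keep, cumulative = set(), 0.0
    for i in order:
        keep.add(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break  # the kept tokens now add up to top_p or higher
    # Renormalize the kept tokens so they form a distribution again.
    return [p / cumulative if i in keep else 0.0 for i, p in enumerate(probs)]

probs = [0.5, 0.3, 0.15, 0.05]
filtered = top_p_filter(probs, top_p=0.7)  # keeps the tokens with 0.5 and 0.3
```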
- setRepetitionPenalty(self, value)
Sets the parameter for repetition penalty. 1.0 means no penalty.
- Parameters:
- value : float
The repetition penalty
References
See "CTRL: A Conditional Transformer Language Model for Controllable Generation" for more details.
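A minimal sketch of a repetition penalty in the spirit of the CTRL paper: the scores of tokens that were already generated are discounted by the penalty. The handling of negative scores (multiplying rather than dividing) is an assumption borrowed from a common convention, and `apply_repetition_penalty` is a hypothetical helper, not Spark NLP's implementation.

```python
def apply_repetition_penalty(logits, generated_ids, penalty=1.0):
    """Lower the scores of tokens that were already generated.

    Hypothetical helper for illustration; not part of Spark NLP.
    """
    adjusted = list(logits)
    for token_id in set(generated_ids):
        score = adjusted[token_id]
        # Assumption: divide positive scores and multiply negative ones,
        # so a penalty above 1.0 always pushes the score down.
        adjusted[token_id] = score / penalty if score > 0 else score * penalty
    return adjusted

logits = [2.0, -1.0, 0.5]
penalized = apply_repetition_penalty(logits, generated_ids=[0, 1], penalty=2.0)
unchanged = apply_repetition_penalty(logits, generated_ids=[0, 1], penalty=1.0)
```

With a penalty of 1.0 the scores come back unchanged, which matches the "1.0 means no penalty" note above.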
- setNoRepeatNgramSize(self, value)
Sets size of n-grams that can only occur once.
If set to int > 0, all n-grams of that size can only occur once.
- Parameters:
- value : int
Size of n-grams that can only occur once
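One way to enforce this constraint is to compute, at each step, which next tokens would complete an n-gram that already occurs in the generated sequence. The sketch below illustrates that idea under stated assumptions; `banned_next_tokens` is a hypothetical helper, not Spark NLP's implementation.

```python
def banned_next_tokens(generated_ids, ngram_size):
    """Return tokens that would complete an n-gram already in the sequence.

    Hypothetical helper for illustration; not part of Spark NLP.
    """
    if ngram_size <= 0 or len(generated_ids) < ngram_size:
        return set()
    # The last (n - 1) tokens are the prefix of the n-gram about to be formed.
    prefix = tuple(generated_ids[len(generated_ids) - (ngram_size - 1):])
    banned = set()
    for start in range(len(generated_ids) - ngram_size + 1):
        ngram = tuple(generated_ids[start:start + ngram_size])
        if ngram[:-1] == prefix:
            banned.add(ngram[-1])  # emitting this token would repeat the n-gram
    return banned

# The sequence ends in (1, 2), and the trigram (1, 2, 3) already occurred,
# so token 3 is banned as the next token.
banned = banned_next_tokens([1, 2, 3, 1, 2], ngram_size=3)
```

A decoder would then exclude the banned tokens (for example by setting their probability to zero) before choosing the next token.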