sparknlp.annotator.param.evaluation_dl_params#
Module Contents#
Classes#
EvaluationDLParams: Components that take parameters. This also provides an internal param map to store parameter values attached to the instance.
- class EvaluationDLParams[source]#
Components that take parameters. This also provides an internal param map to store parameter values attached to the instance.
New in version 1.3.0.
- setVerbose(value)[source]#
Sets the level of verbosity during training.
- Parameters:
- value : int
Level of verbosity
- setValidationSplit(v)[source]#
Sets the proportion of the training dataset to be validated against the model on each epoch, by default 0.0 (off). The value should be between 0.0 and 1.0.
- Parameters:
- v : float
Proportion of the training dataset to be validated
- setEvaluationLogExtended(v)[source]#
Sets whether validation logs are extended, by default False. When enabled, the time and evaluation metrics for each label are displayed.
- Parameters:
- v : bool
Whether validation logs are extended
- setEnableOutputLogs(value)[source]#
Sets whether to use stdout in addition to Spark logs, by default False.
- Parameters:
- value : bool
Whether to use stdout in addition to Spark logs
- setOutputLogsPath(p)[source]#
Sets the folder path where training logs are saved.
- Parameters:
- p : str
Folder path where training logs are saved
- setTestDataset(path, read_as=ReadAs.SPARK, options={'format': 'parquet'})[source]#
Sets the path to a parquet file containing a test dataset. If set, the dataset is used to calculate statistics during training.
The parquet file must be a dataframe with the same columns as the model being trained. For example, if the model needs DOCUMENT, TOKEN and WORD_EMBEDDINGS as input (features) and NAMED_ENTITY as the label, then these columns also need to be present when saving the dataframe. The pre-processing steps applied to the training dataframe should also be applied to the test dataframe.
An example on how to create such a parquet file could be:
>>> # assuming preProcessingPipeline
>>> (train, test) = data.randomSplit([0.8, 0.2])
>>> preProcessingPipeline \
...     .fit(test) \
...     .transform(test) \
...     .write \
...     .mode("overwrite") \
...     .parquet("test_data")
>>> annotator.setTestDataset("test_data")
- Parameters:
- path : str
Path to test dataset
- read_as : str, optional
How to read the resource, by default ReadAs.SPARK
- options : dict, optional
Options for reading the resource, by default {"format": "parquet"}
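The setters above are typically used together when configuring a trainable annotator. A minimal sketch, assuming Spark NLP and PySpark are installed and a Spark session is active (NerDLApproach is one annotator that mixes in EvaluationDLParams; the input/output column names here are illustrative):

```python
# Hedged sketch: configuring the evaluation and logging parameters from this
# module on a trainable annotator. Guarded so it degrades gracefully when
# sparknlp/pyspark or an active Spark session is not available.
try:
    from sparknlp.annotator import NerDLApproach

    ner = (
        NerDLApproach()
        .setInputCols(["sentence", "token", "embeddings"])
        .setOutputCol("ner")
        .setLabelColumn("label")
        .setVerbose(1)                    # level of verbosity during training
        .setValidationSplit(0.2)          # validate on 20% of the training data each epoch
        .setEvaluationLogExtended(True)   # per-label metrics and timing in the logs
        .setEnableOutputLogs(True)        # mirror logs to stdout as well as Spark logs
        .setOutputLogsPath("ner_logs")    # folder where training logs are saved
    )
except Exception:
    # sparknlp/pyspark missing or no active Spark session in this environment
    ner = None
```

Because the setters return the annotator itself, they can be chained as shown before calling `fit()` on the training dataframe.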