sparknlp.annotator.cleaners.extractor#

Contains classes for Extractor.

Module Contents#

Classes#

Extractor

Base class for :py:class:`Model`s that wrap Java/Scala implementations.

class Extractor(classname='com.johnsnowlabs.nlp.annotators.cleaners.Extractor', java_model=None)[source]#

Base class for :py:class:`Model`s that wrap Java/Scala implementations. Subclasses should inherit this class before param mix-ins, because this sets the UID from the Java model.

name = 'Extractor'[source]#
inputAnnotatorTypes[source]#
outputAnnotatorType = 'chunk'[source]#
emailDateTimeTzPattern[source]#
emailAddress[source]#
ipAddressPattern[source]#
ipAddressNamePattern[source]#
mapiIdPattern[source]#
usPhoneNumbersPattern[source]#
imageUrlPattern[source]#
textPattern[source]#
extractorMode[source]#
index[source]#
setEmailDateTimeTzPattern(value)[source]#

Sets the date-time pattern for email timestamps, including time zone formatting.

Parameters:
value : str

Specifies the date-time pattern for email timestamps, including time zone formatting.

setEmailAddress(value)[source]#

Sets the pattern for email addresses.

Parameters:
value : str

Specifies the pattern for email addresses.
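
To illustrate the kind of value this setter takes, here is a minimal sketch that tries an email regular expression with Python's `re` module before handing it to the annotator. `EMAIL_PATTERN` and `find_emails` are hypothetical names, and the pattern is an assumption for illustration, not the library's default. The annotator itself runs on the JVM, so a pattern should also be checked against Java regex syntax.

```python
import re
from typing import List

# Hypothetical pattern for illustration only; NOT the Extractor default.
EMAIL_PATTERN = r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"


def find_emails(text: str) -> List[str]:
    """Return every substring of `text` that matches EMAIL_PATTERN."""
    return re.findall(EMAIL_PATTERN, text)


sample = "Contact alice@example.com or bob.smith@mail.example.org for details."
emails = find_emails(sample)
print(emails)
```

Under these assumptions, the same string could then be passed as `value` to `setEmailAddress`.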

setIpAddressPattern(value)[source]#

Sets the pattern for IP addresses.

Parameters:
value : str

Specifies the pattern for IP addresses.
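
As a rough sketch of what a custom IP address pattern might look like, the snippet below checks an IPv4 expression with Python's `re` module. `IPV4_PATTERN` and `find_ipv4` are hypothetical names and the pattern is an assumption, not the library's default.

```python
import re
from typing import List

# Hypothetical IPv4 pattern for illustration only; NOT the Extractor default.
# Each octet is limited to the range 0-255.
_OCTET = r"(?:25[0-5]|2[0-4]\d|1?\d?\d)"
IPV4_PATTERN = r"\b(?:" + _OCTET + r"\.){3}" + _OCTET + r"\b"


def find_ipv4(text: str) -> List[str]:
    """Return every IPv4-looking substring of `text`."""
    return re.findall(IPV4_PATTERN, text)


sample = "Received: from mail.example.com (192.168.0.1) by 10.0.0.254"
addresses = find_ipv4(sample)
print(addresses)
```

All groups are non-capturing so that `re.findall` returns whole matches rather than group contents.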

setIpAddressNamePattern(value)[source]#

Sets the pattern for IP addresses with names.

Parameters:
value : str

Specifies the pattern for IP addresses with names.

setMapiIdPattern(value)[source]#

Sets the pattern for MAPI IDs.

Parameters:
value : str

Specifies the pattern for MAPI IDs.

setUsPhoneNumbersPattern(value)[source]#

Sets the pattern for US phone numbers.

Parameters:
value : str

Specifies the pattern for US phone numbers.
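
A candidate phone number pattern can be tried out in plain Python before it is passed to this setter. The sketch below assumes a simple North American format; `US_PHONE_PATTERN` and `find_us_phones` are hypothetical names and the pattern is not the library's default.

```python
import re
from typing import List

# Hypothetical pattern for illustration only; NOT the Extractor default.
# Optional "+1" prefix, optional parentheses around the area code,
# and space, dot, or hyphen as separators.
US_PHONE_PATTERN = r"(?:\+?1[\s.-]?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}"


def find_us_phones(text: str) -> List[str]:
    """Return every substring of `text` that matches US_PHONE_PATTERN."""
    return re.findall(US_PHONE_PATTERN, text)


sample = "Call 215-867-5309 or (555) 123-4567 today."
phones = find_us_phones(sample)
print(phones)
```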

setImageUrlPattern(value)[source]#

Sets the pattern for image URLs.

Parameters:
value : str

Specifies the pattern for image URLs.
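
For illustration, an image URL pattern might be prototyped as follows. `IMAGE_URL_PATTERN`, `find_image_urls`, and the list of file extensions are assumptions made for this sketch, not the library's defaults.

```python
import re
from typing import List

# Hypothetical pattern for illustration only; NOT the Extractor default.
# Matches http(s) URLs that end in a common image file extension.
IMAGE_URL_PATTERN = r"https?://\S+?\.(?:png|jpe?g|gif|bmp)"


def find_image_urls(text: str) -> List[str]:
    """Return every image-looking URL in `text`, ignoring letter case."""
    return re.findall(IMAGE_URL_PATTERN, text, flags=re.IGNORECASE)


sample = "See https://example.com/images/logo.png and http://example.org/a/photo.JPG."
urls = find_image_urls(sample)
print(urls)
```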

setTextPattern(value)[source]#

Sets the pattern used when extracting the text after or before a match.

Parameters:
value : str

Specifies the pattern used when extracting the text after or before a match.

setExtractorMode(value)[source]#

Sets the extractor mode.

Parameters:
value : str

Specifies the extractor mode.

setIndex(value)[source]#

Sets the index of the pattern occurrence to use when extracting the text after or before it.

Parameters:
value : int

Specifies the index of the pattern occurrence to use when extracting the text after or before it.
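
The interaction between `textPattern` and `index` can be sketched in plain Python. This assumes that `index` is zero-based and selects which occurrence of the pattern anchors the extraction. That reading is an assumption, not confirmed library behavior, and `text_after` and `text_before` are hypothetical helpers, not part of Spark NLP.

```python
import re


def _nth_match(text: str, pattern: str, index: int) -> "re.Match":
    """Return the match at position `index` (zero-based) among all matches."""
    matches = list(re.finditer(pattern, text))
    if not 0 <= index < len(matches):
        raise ValueError(
            "index %d is out of range: pattern occurs %d time(s)" % (index, len(matches))
        )
    return matches[index]


def text_after(text: str, pattern: str, index: int = 0) -> str:
    """Hypothetical helper: text following the chosen occurrence of `pattern`."""
    return text[_nth_match(text, pattern, index).end():].strip()


def text_before(text: str, pattern: str, index: int = 0) -> str:
    """Hypothetical helper: text preceding the chosen occurrence of `pattern`."""
    return text[:_nth_match(text, pattern, index).start()].strip()


sample = "SPEAKER 1: Look at me. SPEAKER 2: I see you."
speaker = r"SPEAKER \d:"
print(text_after(sample, speaker, index=0))
print(text_after(sample, speaker, index=1))
print(text_before(sample, speaker, index=1))
```

Under this reading, a larger `index` moves the anchor to a later occurrence, so less text is returned after it and more before it.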