Bangla RobertaForSequenceClassification Cased model (from neuralspace)

Description

Pretrained RoBertaForSequenceClassification model, imported from Hugging Face and packaged for scalable, production-ready use with Spark NLP. The model, autotrain-citizen_nlu_bn-1370652766, is a Bangla model originally trained by neuralspace.

Predicted Entities

ReportingMissingPets, EligibilityForBloodDonationCovidGap, ReportingPropertyTakeOver, IntentForBloodReceivalAppointment, EligibilityForBloodDonationSTD, InquiryForDoctorConsultation, InquiryOfCovidSymptoms, InquiryForVaccineCount, InquiryForCovidPrevention, InquiryForVaccinationRequirements, EligibilityForBloodDonationForPregnantWomen, ReportingCyberCrime, ReportingHitAndRun, ReportingTresspassing, InquiryofBloodDonationRequirements, ReportingMurder, ReportingVehicleAccident, ReportingMissingPerson, EligibilityForBloodDonationAgeLimit, ReportingAnimalPoaching, InquiryOfEmergencyContact, InquiryForQuarantinePeriod, ContactRealPerson, IntentForBloodDonationAppointment, ReportingMissingVehicle, InquiryForCovidRecentCasesCount, InquiryOfContact, StatusOfFIR, InquiryofVaccinationAgeLimit, InquiryForCovidTotalCasesCount, EligibilityForBloodDonationGap, InquiryofPostBloodDonationEffects, InquiryofPostBloodReceivalCareSchemes, EligibilityForBloodReceiversBloodGroup, EligitbilityForVaccine, InquiryOfLockdownDetails, ReportingSexualAssault, InquiryForVaccineCost, InquiryForCovidDeathCount, ReportingDrugConsumption, ReportingDrugTrafficing, InquiryofPostBloodDonationCertificate, ReportingDowry, ReportingChildAbuse, ReportingAnimalAbuse, InquiryofPostBloodReceivalEffects, Eligibility For BloodDonationWithComorbidities, InquiryOfTiming, InquiryForCovidActiveCasesCount, InquiryOfLocation, InquiryofPostBloodDonationCareSchemes, ReportingTheft, InquiryForTravelRestrictions, ReportingDomesticViolence, InquiryofBloodReceivalRequirements
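Most of the labels above share a leading word (Reporting, Inquiry, Eligibility, Intent), so a downstream application may want to route predictions by that coarse family. The following is a minimal sketch of such a grouping, not part of Spark NLP or of the original model: the function name `intent_family` and the `FAMILY_PREFIXES` table are hypothetical, and the label spellings (including "Eligitbility" and the label containing spaces) are taken verbatim from the list above.

```python
# Hypothetical helper for illustration; not part of Spark NLP or the model.
# Maps a leading word of a predicted label to a coarse intent family.
FAMILY_PREFIXES = {
    "Reporting": "Reporting",
    "Inquiry": "Inquiry",
    "Eligibility": "Eligibility",
    "Eligitbility": "Eligibility",  # spelling used by one label in the list
    "Intent": "Intent",
}


def intent_family(label: str) -> str:
    """Return the coarse family of a predicted label, or "Other"."""
    compact = label.replace(" ", "")  # one label in the list contains spaces
    for prefix, family in FAMILY_PREFIXES.items():
        if compact.startswith(prefix):
            return family
    return "Other"  # e.g. ContactRealPerson, StatusOfFIR


if __name__ == "__main__":
    for name in ["ReportingTheft", "EligitbilityForVaccine", "StatusOfFIR"]:
        print(name, "->", intent_family(name))
```

Labels that do not start with one of the listed words, such as ContactRealPerson and StatusOfFIR, fall into "Other" in this sketch.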


How to use

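# Python
import sparknlp
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, RoBertaForSequenceClassification
from pyspark.ml import Pipeline

spark = sparknlp.start()

# Wrap the raw text column in a Spark NLP document annotation
documentAssembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

# Split each document into tokens
tokenizer = Tokenizer() \
    .setInputCols(["document"]) \
    .setOutputCol("token")

# Load the pretrained Bangla classifier
roberta_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_citizen_nlu_bn_1370652766","bn") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("class")

pipeline = Pipeline(stages=[documentAssembler, tokenizer, roberta_classifier])

data = spark.createDataFrame([["I love you!"], ["I feel lucky to be here."]]).toDF("text")

result = pipeline.fit(data).transform(data)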
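
// Scala
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline
import spark.implicits._ // needed for .toDS; assumes an active SparkSession named spark

// Wrap the raw text column in a Spark NLP document annotation
val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

// Split each document into tokens
val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

// Load the pretrained Bangla classifier
val roberta_classifier = RoBertaForSequenceClassification.pretrained("roberta_classifier_autotrain_citizen_nlu_bn_1370652766","bn")
    .setInputCols(Array("document", "token"))
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, roberta_classifier))

val data = Seq("I love you!").toDS.toDF("text")

val result = pipeline.fit(data).transform(data)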

# NLU (Python)
import nlu
nlu.load("bn.classify.roberta").predict("""I feel lucky to be here.""")
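After the pipeline runs, the predicted label for each input sits in the `class` output column, and its `result` field holds the label strings. The sketch below shows one way to reduce that to a single label per text and tally the labels. The helper name `first_labels` is hypothetical, and the rows are placeholder data standing in for collected output (they are not real predictions from this model), so the snippet runs without a Spark session.

```python
from collections import Counter


def first_labels(rows):
    """Pair each text with its first predicted label (None if there is none).

    rows: iterable of (text, labels) pairs, where labels is the list of
    strings found in the `class.result` column for that text.
    """
    return [(text, labels[0] if labels else None) for text, labels in rows]


# With a live Spark session, the rows could be collected along these lines:
#   collected = result.select("text", "class.result").collect()
#   rows = [(r["text"], r["result"]) for r in collected]
# Placeholder rows (not real model output) keep this sketch self-contained.
rows = [
    ("example input 1", ["ReportingTheft"]),
    ("example input 2", ["InquiryOfLocation"]),
    ("example input 3", []),
]

pairs = first_labels(rows)
counts = Counter(label for _, label in pairs if label is not None)

for text, label in pairs:
    print(text, "->", label)
print(counts.most_common())
```

Collecting to the driver is only reasonable for small samples; for large datasets the same selection can stay in the DataFrame.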

Model Information

Model Name: roberta_classifier_autotrain_citizen_nlu_bn_1370652766
Compatibility: Spark NLP 4.2.4+
License: Open Source
Edition: Official
Input Labels: [document, token]
Output Labels: [class]
Language: bn
Size: 312.2 MB
Case sensitive: true
Max sentence length: 128

References

  • https://huggingface.co/neuralspace/autotrain-citizen_nlu_bn-1370652766