package finisher
Type Members
- case class DocumentSimilarityRankerFinisher(uid: String) extends Transformer with DefaultParamsWritable with Product with Serializable
- case class GGUFRankingFinisher(uid: String) extends Transformer with DefaultParamsWritable with Product with Serializable
Finisher for AutoGGUFReranker outputs that provides ranking capabilities including top-k selection, sorting by relevance score, and score normalization.
This finisher processes the output of AutoGGUFReranker, which contains documents with relevance scores in their metadata. It provides several options for post-processing:
- Top-k selection: Select only the top k documents by relevance score
- Score thresholding: Filter documents by minimum relevance score
- Min-max scaling: Normalize relevance scores to 0-1 range
- Sorting: Sort documents by relevance score in descending order
- Ranking: Add rank information to document metadata
The finisher preserves the document annotation structure while adding ranking information to the metadata and optionally filtering/sorting the documents.
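The post-processing options listed above can be illustrated without Spark. The following is a minimal sketch in plain Scala, not the finisher's actual implementation: `Doc`, `RankingSketch`, the metadata keys `relevance_score` and `rank`, the order in which the steps are applied, and the handling of identical scores are all assumptions made for illustration.

```scala
// Hypothetical stand-in for a reranked document: text plus a metadata map.
final case class Doc(text: String, metadata: Map[String, String])

object RankingSketch {
  // Assumed metadata keys; the real finisher's key names may differ.
  val ScoreKey = "relevance_score"
  val RankKey = "rank"

  private def score(d: Doc): Double = d.metadata(ScoreKey).toDouble

  // Min-max scaling: map scores linearly onto the range [0, 1].
  def minMaxScale(docs: Seq[Doc]): Seq[Doc] =
    if (docs.isEmpty) docs
    else {
      val scores = docs.map(score)
      val lo = scores.min
      val range = scores.max - lo
      docs.map { d =>
        // Assumption: if all scores are identical, every score becomes 1.0.
        val scaled = if (range == 0.0) 1.0 else (score(d) - lo) / range
        d.copy(metadata = d.metadata.updated(ScoreKey, scaled.toString))
      }
    }

  // Assumed order: scale, threshold, sort descending, take top k, add rank.
  def rank(
      docs: Seq[Doc],
      topK: Option[Int],
      minScore: Option[Double],
      scale: Boolean): Seq[Doc] = {
    val scaled = if (scale) minMaxScale(docs) else docs
    val kept = minScore.fold(scaled)(t => scaled.filter(d => score(d) >= t))
    val sorted = kept.sortBy(d => -score(d))
    val top = topK.fold(sorted)(k => sorted.take(k))
    // Ranks are 1-based and stored as strings, like the other metadata values.
    top.zipWithIndex.map { case (d, i) =>
      d.copy(metadata = d.metadata.updated(RankKey, (i + 1).toString))
    }
  }
}

// Usage: four documents with raw relevance scores.
val docs = Seq(
  Doc("a", Map("relevance_score" -> "0.2")),
  Doc("b", Map("relevance_score" -> "0.8")),
  Doc("c", Map("relevance_score" -> "0.5")),
  Doc("d", Map("relevance_score" -> "0.0")))

val ranked = RankingSketch.rank(docs, topK = Some(2), minScore = Some(0.1), scale = true)
ranked.foreach(d => println(s"${d.metadata("rank")}: ${d.text}"))
```

In this sketch the threshold is applied to the scaled scores, so a `minScore` of 0.1 is relative to the batch rather than to the raw reranker output. Whether the finisher thresholds before or after scaling is not stated here and should be checked against its source.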
Example
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotators._
import com.johnsnowlabs.nlp.finisher._
import org.apache.spark.ml.Pipeline
import spark.implicits._

val document = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val reranker = AutoGGUFReranker
  .pretrained("bge_reranker_v2_m3-Q4_K_M")
  .setInputCols("document")
  .setOutputCol("reranked_documents")
  .setQuery("A man is eating pasta.")

val finisher = new GGUFRankingFinisher()
  .setInputCols("reranked_documents")
  .setOutputCol("ranked_documents")
  .setTopK(3)
  .setMinRelevanceScore(0.1)
  .setMinMaxScaling(true)

val pipeline = new Pipeline().setStages(Array(document, reranker, finisher))

val data = Seq(
  "A man is eating food.",
  "A man is eating a piece of bread.",
  "The girl is carrying a baby.",
  "A man is riding a horse."
).toDF("text")

val result = pipeline.fit(data).transform(data)

result.select("ranked_documents").show(truncate = false)
// Documents will be sorted by relevance with rank information in metadata
- uid
required uid for storing finisher to disk
Value Members
- object DocumentSimilarityRankerFinisher extends DefaultParamsReadable[DocumentSimilarityRankerFinisher] with Serializable
- object GGUFRankingFinisher extends DefaultParamsReadable[GGUFRankingFinisher] with Serializable