Introducing the Ettin Reranker Family
Six new state-of-the-art Sentence Transformers CrossEncoder rerankers are released, built on top of the Ettin ModernBERT encoders.

Six new Sentence Transformers CrossEncoder rerankers are being released today, each state of the art for its size. The models are built on top of the Ettin ModernBERT encoders and ship with a comprehensive dataset and the full training recipe.
The Ettin Reranker Family was trained by distillation: each student is fit with a pointwise MSE loss against scores from the teacher, mixedbread-ai/mxbai-rerank-large-v2, computed over the cross-encoder/ettin-reranker-v1-data dataset. That dataset is a subset of lightonai/embeddings-pre-training mixed with a reranked subset of lightonai/embeddings-fine-tuning. The result is six rerankers that perform strongly on MTEB(eng, v2) Retrieval when paired with google/embeddinggemma-300m.
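As a rough illustration of the pointwise MSE objective described above, the sketch below computes the loss between student and teacher scores for a small batch of query-document pairs. The function name and the example scores are hypothetical and chosen only for illustration; the released training recipe is not reproduced here.

```python
from typing import Sequence


def pointwise_mse(student_scores: Sequence[float], teacher_scores: Sequence[float]) -> float:
    """Mean squared error between student and teacher scores, one score per pair."""
    if len(student_scores) != len(teacher_scores):
        raise ValueError("student and teacher must score the same pairs")
    if len(student_scores) == 0:
        raise ValueError("need at least one scored pair")
    squared_errors = [(s - t) ** 2 for s, t in zip(student_scores, teacher_scores)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical scores for three (query, document) pairs.
teacher = [4.0, -1.0, 0.5]  # what a teacher reranker might output
student = [3.0, -1.0, 1.5]  # current predictions of the model being trained

loss = pointwise_mse(student, teacher)
print(f"pointwise MSE: {loss:.4f}")
```

Because the loss is computed per pair, each training example needs only a query, a document, and one teacher score; no grouping of documents by query is required.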
A reranker, also known as a pointwise cross-encoder, is a neural model that takes a query-document pair and outputs a single relevance score for that pair. In an information retrieval system, it is typically used to rescore a list of candidate documents so that the most relevant ones rise to the top.
Unlike embedding models that encode queries and documents separately, rerankers allow the two texts to attend to each other through every transformer layer, providing a more accurate but computationally expensive assessment. The released models are standard Sentence Transformers CrossEncoder models, making them easy to integrate into existing systems with just three lines of code. They support up to 8K tokens of context, thanks to ModernBERT's long-context pre-training, making them suitable for long-document reranking.
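To make the reranking step concrete, here is a minimal sketch of scoring every query-document pair and sorting by score. The helper names (rerank, cross_encoder_scorer, toy_overlap_scorer) are hypothetical. The toy scorer is only a stand-in so the example runs without downloading a model, while cross_encoder_scorer shows how a Sentence Transformers CrossEncoder could be plugged in through its predict method.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Pair = Tuple[str, str]
Scorer = Callable[[Sequence[Pair]], Sequence[float]]


def rerank(
    query: str,
    documents: Sequence[str],
    score_pairs: Scorer,
    top_k: Optional[int] = None,
) -> List[Tuple[str, float]]:
    """Score each (query, document) pair and return documents sorted by score, best first."""
    pairs = [(query, doc) for doc in documents]
    scores = [float(s) for s in score_pairs(pairs)]
    ranked = sorted(zip(documents, scores), key=lambda item: item[1], reverse=True)
    return ranked if top_k is None else ranked[:top_k]


def cross_encoder_scorer(model_name: str) -> Scorer:
    """Wrap a Sentence Transformers CrossEncoder as a scorer (requires sentence-transformers)."""
    # Imported lazily so the toy example below runs without the library installed.
    from sentence_transformers import CrossEncoder

    model = CrossEncoder(model_name)
    return model.predict


def toy_overlap_scorer(pairs: Sequence[Pair]) -> List[float]:
    """Stand-in scorer (not a real reranker): counts words shared by query and document."""
    return [float(len(set(q.lower().split()) & set(d.lower().split()))) for q, d in pairs]


docs = [
    "Bananas are yellow.",
    "Embedding models encode documents separately.",
    "Rerankers score each query and document pair jointly.",
]
ranked = rerank("how do rerankers score documents", docs, toy_overlap_scorer, top_k=2)
for doc, score in ranked:
    print(f"{score:.1f}  {doc}")
```

To try a real model, pass cross_encoder_scorer with the Hub id of one of the released checkpoints in place of toy_overlap_scorer; the exact checkpoint names are not listed in this article, so none is filled in here.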
The Ettin Reranker Family spans sizes from 17M to 1B parameters. Each model uses a 4-module classification head on top of the Ettin encoder, which mirrors ModernBertForSequenceClassification but is built from Sentence Transformers' modular components. Unpadded attention and sequence unpadding for variable-length inputs enable Flash Attention 2, which leads to significant speed improvements.
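The idea behind sequence unpadding can be sketched in a few lines: instead of padding every input to the longest sequence in the batch, the tokens are concatenated into one flat sequence and the boundaries are recorded as cumulative lengths. The function names and token ids below are hypothetical, and this is only an illustration of the layout that variable-length attention kernels typically consume, not the ModernBERT implementation.

```python
from typing import List, Sequence, Tuple


def unpad(sequences: Sequence[Sequence[int]]) -> Tuple[List[int], List[int], int]:
    """Concatenate variable-length sequences and record their boundaries.

    Returns the flat token list, the cumulative sequence lengths
    (starting at 0), and the length of the longest sequence.
    """
    flat: List[int] = []
    cu_seqlens = [0]
    for seq in sequences:
        flat.extend(seq)
        cu_seqlens.append(cu_seqlens[-1] + len(seq))
    max_seqlen = max((len(seq) for seq in sequences), default=0)
    return flat, cu_seqlens, max_seqlen


def padded_token_count(sequences: Sequence[Sequence[int]]) -> int:
    """Number of token slots a padded batch would occupy."""
    if not sequences:
        return 0
    return len(sequences) * max(len(seq) for seq in sequences)


# Hypothetical token ids for three inputs of different lengths.
batch = [[101, 7, 8, 102], [101, 9, 102], [101, 3, 4, 5, 6, 102]]
flat, cu_seqlens, max_seqlen = unpad(batch)

print("unpadded tokens:", len(flat))
print("padded tokens:  ", padded_token_count(batch))
print("boundaries:     ", cu_seqlens)
```

Sequence i occupies flat[cu_seqlens[i]:cu_seqlens[i + 1]], so no compute is spent on padding positions. The saving grows as input lengths within a batch become more uneven.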
Even the smallest model (17M) outperforms the 33M ms-marco-MiniLM-L12-v2, by +0.051 NDCG@10 on MTEB and +0.038 on NanoBEIR. The larger models are also competitive, and the 1B model closely tracks its teacher, the 1.54B mxbai-rerank-large-v2.
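For readers who want a feel for the NDCG@10 figures quoted above, the following sketch computes the metric for a single query. It assumes linear gain (the relevance label itself) and builds the ideal ranking from the labels in the ranked list only; benchmark tooling may use a different gain and considers all judged documents, so treat this as a simplified illustration with hypothetical labels.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top k positions (linear gain, log2 discount)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_relevances: Sequence[float], k: int = 10) -> float:
    """DCG of the given ranking divided by the DCG of the best possible ordering."""
    ideal = sorted(ranked_relevances, reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    if ideal_dcg == 0:
        return 0.0  # no relevant documents: define the score as 0
    return dcg_at_k(ranked_relevances, k) / ideal_dcg


# Hypothetical relevance labels, in ranked order, before and after reranking.
before_reranking = [0, 1, 0, 1]
after_reranking = [1, 1, 0, 0]

ndcg_before = ndcg_at_k(before_reranking)
ndcg_after = ndcg_at_k(after_reranking)
print(f"NDCG@10 before: {ndcg_before:.3f}")
print(f"NDCG@10 after:  {ndcg_after:.3f}")
```

The metric rewards placing relevant documents near the top, which is why moving the two relevant documents to positions one and two raises the score.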
The 17M model is the fastest reranker in the comparison at 7517 pairs per second, while the 1B model reaches 928 pairs per second. These speeds rely on bfloat16 and Flash Attention 2, both of which contribute significantly to the overall speedup. The training recipe behind the family is simple yet effective: pointwise MSE distillation from a single teacher reranker.
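To put the throughput figures above in perspective, the short calculation below converts pairs per second into the time needed to score a candidate list. The candidate count of 100 is a hypothetical choice, and the estimate assumes the benchmark throughput is sustained, which depends on hardware, batch size, and input length.

```python
def rerank_seconds(num_pairs: int, pairs_per_second: float) -> float:
    """Estimated time to score num_pairs at a sustained throughput."""
    if pairs_per_second <= 0:
        raise ValueError("throughput must be positive")
    return num_pairs / pairs_per_second


# Throughput figures quoted in the article (pairs per second).
THROUGHPUT = {"17M": 7517.0, "1B": 928.0}
CANDIDATES = 100  # hypothetical number of candidates per query

for name, pairs_per_second in THROUGHPUT.items():
    milliseconds = 1000 * rerank_seconds(CANDIDATES, pairs_per_second)
    print(f"{name}: about {milliseconds:.1f} ms to score {CANDIDATES} pairs")

speed_ratio = THROUGHPUT["17M"] / THROUGHPUT["1B"]
print(f"17M is about {speed_ratio:.1f}x faster than 1B")
```

Under these assumptions the smallest model scores 100 candidates in roughly 13 ms and the largest in roughly 108 ms, a gap of about 8x that can be weighed against the accuracy difference between the two.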
Source: Hugging Face