jina-reranker-v3 is a multilingual document reranking model designed for high-performance retrieval pipelines. It is described in the technical report “jina-reranker-v3: Last but Not Late Interaction for Document Reranking” (arXiv:2509.25085).
Model resources, deployment options, and usage examples are available on the Jina AI model pages and in the associated repositories.
jina-reranker-v3 has 0.6B parameters and is built for listwise document reranking. Bi-encoder retrieval systems embed queries and documents independently, and late-interaction systems such as ColBERT encode them separately and match only afterwards. jina-reranker-v3 instead processes the query and multiple candidate documents jointly within the same context window.
Its central architectural idea is “last but not late interaction.” The model applies causal self-attention across the query and candidate documents together, allowing cross-document and query-document interactions during encoding. It then extracts contextualized representations from the last token of each document, which are used for reranking. This gives the model listwise reasoning ability while remaining much smaller than large generative rerankers.
Key traits of jina-reranker-v3:
- Multilingual reranker: Supports multilingual retrieval and reranking scenarios.
- Listwise architecture: Processes multiple candidate documents jointly rather than scoring each pair independently.
- Last but not late interaction: Allows cross-document and query-document interaction during encoding.
- Efficient model size: Uses only 0.6B parameters while achieving strong reranking quality.
- Long-context support: Can process up to 64 documents simultaneously within a 131K-token context window.
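
To make the two limits in the last trait concrete, here is a minimal sketch of splitting a larger candidate list into passes that each respect both the document cap and the token budget. It assumes that "131K" means 131,072 tokens and uses a whitespace word count as a hypothetical stand-in for the model's tokenizer. `batch_candidates` and `count_tokens` are illustrative names, not part of any released API.

```python
from typing import Callable, List

MAX_DOCS = 64          # per-pass document limit stated above
MAX_TOKENS = 131_072   # assumption: "131K" read as 131,072 tokens


def count_tokens(text: str) -> int:
    """Hypothetical stand-in: whitespace word count, not the model's tokenizer."""
    return len(text.split())


def batch_candidates(
    query: str,
    docs: List[str],
    max_docs: int = MAX_DOCS,
    max_tokens: int = MAX_TOKENS,
    counter: Callable[[str], int] = count_tokens,
) -> List[List[str]]:
    """Greedily group documents so every batch fits both limits.

    The query is charged once per batch because it shares the window
    with the documents of that batch.
    """
    query_cost = counter(query)
    batches: List[List[str]] = []
    current: List[str] = []
    used = query_cost
    for doc in docs:
        cost = counter(doc)
        if query_cost + cost > max_tokens:
            raise ValueError("a single document does not fit in the window")
        # Close the current batch when either limit would be exceeded.
        if current and (len(current) >= max_docs or used + cost > max_tokens):
            batches.append(current)
            current, used = [], query_cost
        current.append(doc)
        used += cost
    if current:
        batches.append(current)
    return batches
```

Documents that land in different batches are not encoded together, so a real pipeline would probably prefer to trim the candidate list to a single pass where possible.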

Figure 1 (from the technical report) illustrates the architecture and workflow of jina-reranker-v3:
- A user query and multiple candidate documents are packed into the same context window.
- The model performs causal self-attention jointly across query and documents, enabling contextual interactions before scoring.
- For each candidate document, a contextual embedding is extracted from its last token position.
- These document-level representations are then used to compute relevance scores for reranking.
- The design enables listwise reranking behavior while remaining substantially smaller than many generative rerankers.
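
The workflow above can be sketched with a toy, dependency-free example. This is not the released model: the token embedding is a hypothetical character-based hash, the encoder is a single causal self-attention layer with identity projections, the query is assumed to come first in the packed sequence, and scoring by cosine similarity against the query's last-token state is an assumption (the text above says only that document representations are used to compute relevance scores). All function names are illustrative.

```python
import math
from typing import List, Tuple

Vector = List[float]


def embed_token(token: str, dim: int = 8) -> Vector:
    """Hypothetical deterministic embedding (toy character hash, not learned)."""
    vec = [0.0] * dim
    for i, ch in enumerate(token):
        vec[(ord(ch) + i) % dim] += 1.0
    return vec


def pack(query: str, docs: List[str]) -> Tuple[List[str], List[int], int]:
    """Pack query and non-empty documents into one sequence.

    Returns the tokens, each document's last-token index, and the
    query's last-token index.
    """
    tokens = query.split()
    query_end = len(tokens) - 1
    doc_ends = []
    for doc in docs:
        tokens.extend(doc.split())
        doc_ends.append(len(tokens) - 1)
    return tokens, doc_ends, query_end


def causal_self_attention(x: List[Vector]) -> List[Vector]:
    """Single-head attention with identity projections; position t sees only s <= t."""
    dim = len(x[0])
    scale = math.sqrt(dim)
    out = []
    for t in range(len(x)):
        logits = [sum(a * b for a, b in zip(x[t], x[s])) / scale for s in range(t + 1)]
        peak = max(logits)  # subtract the max for a numerically stable softmax
        weights = [math.exp(v - peak) for v in logits]
        total = sum(weights)
        out.append([sum(w * x[s][d] for s, w in enumerate(weights)) / total
                    for d in range(dim)])
    return out


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(p * q for p, q in zip(a, b))
    norm = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
    return dot / norm if norm else 0.0


def toy_rerank(query: str, docs: List[str]) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs sorted by descending score."""
    tokens, doc_ends, query_end = pack(query, docs)
    hidden = causal_self_attention([embed_token(t) for t in tokens])
    query_vec = hidden[query_end]
    # One contextual vector per document, taken at its last token position.
    scores = [cosine(hidden[end], query_vec) for end in doc_ends]
    return sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
```

Because the mask is causal, each document's last-token state depends on the query and on the documents packed before it, but not on those packed after it. This is what separates the approach from encoding every document in isolation.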
jina-reranker-v3 is intended for:
- Second-stage retrieval reranking after an initial candidate retrieval step.
- Multilingual search systems where queries and documents may appear in different languages.
- High-quality semantic search pipelines requiring stronger ranking than embedding-only retrieval.
- Listwise reranking scenarios where multiple candidate documents should be judged jointly.
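
The second-stage role described above can be sketched as a small retrieve-then-rerank pipeline. The first stage here is a toy word-overlap retriever, and `reranker` is a hypothetical callable that takes a query and the whole candidate list and returns one score per candidate. Neither reflects the actual jina-reranker-v3 interface.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical listwise interface: one call scores all candidates together.
Reranker = Callable[[str, Sequence[str]], List[float]]


def first_stage(query: str, corpus: Sequence[str], k: int) -> List[int]:
    """Toy retriever: rank corpus indices by number of shared lowercase words."""
    query_words = set(query.lower().split())
    overlap = [len(query_words & set(doc.lower().split())) for doc in corpus]
    # sorted() is stable, so ties keep their original corpus order.
    ranked = sorted(range(len(corpus)), key=lambda i: -overlap[i])
    return ranked[:k]


def search(
    query: str,
    corpus: Sequence[str],
    reranker: Reranker,
    k: int = 64,
    top_n: int = 5,
) -> List[Tuple[int, float]]:
    """Retrieve k candidates, rerank them jointly, return the top_n as (corpus index, score)."""
    candidate_ids = first_stage(query, corpus, k)
    candidates = [corpus[i] for i in candidate_ids]
    scores = reranker(query, candidates)
    order = sorted(range(len(candidates)), key=lambda j: scores[j], reverse=True)
    return [(candidate_ids[j], scores[j]) for j in order[:top_n]]
```

Any document the first stage drops is never seen by the reranker, so the final ranking can only be as good as the candidate set it receives.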
Limitations:
- jina-reranker-v3 is a reranker, not a first-stage retriever, so it is typically used after an initial candidate generation step.
- Its joint listwise processing can be more computationally demanding than simple embedding-based similarity search.
- The released model is licensed under CC BY-NC 4.0, which limits direct commercial use outside approved platforms or licensing arrangements.
- Practical quality depends on the quality of the candidate set coming from the upstream retriever.
BibTeX entry and citation info
@misc{wang2025jinarerankerv3lateinteractiondocument,
  title={jina-reranker-v3: Last but Not Late Interaction for Document Reranking},
  author={Feng Wang and Yuqing Li and Han Xiao},
  year={2025},
  eprint={2509.25085},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2509.25085},
}