Granite Embedding Multilingual R2: Open Apache 2.0 Multilingual Embeddings with 32K Context — Best Sub-100M Retrieval Quality

TL;DR

IBM has introduced two new multilingual embedding models, granite-embedding-97m-multilingual-r2 and granite-embedding-311m-multilingual-r2, under Apache 2.0. They offer improved retrieval quality across 200+ languages, including code support, and are designed for enterprise deployment.

IBM has released two new open-source multilingual embedding models, granite-embedding-97m-multilingual-r2 and granite-embedding-311m-multilingual-r2, under the Apache 2.0 license, aiming to improve multilingual retrieval and code understanding for enterprise applications.

The models are built on the ModernBERT architecture and support over 200 languages, 52 of which received explicit training for higher-quality retrieval. The 97M-parameter model scores 60.3 on the Multilingual MTEB Retrieval benchmark, outperforming previous models of similar size. The full-size 311M model scores 65.2, ranking second among open models under 500M parameters.

Both models handle context lengths up to 32,768 tokens, support code retrieval across nine programming languages, and are compatible with popular frameworks such as sentence-transformers, LangChain, and Haystack. They are optimized for CPU inference via ONNX and OpenVINO and require no task-specific tuning.
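To make the retrieval step concrete, here is a minimal sketch of how embedding-based search typically ranks documents once vectors exist. It assumes cosine similarity as the scoring function, which the article does not specify, and it uses toy 3-dimensional vectors in place of real model output. The function names `cosine` and `rank` are illustrative, not part of any Granite API.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)

def rank(query_vec, doc_vecs, top_k=3):
    """Return (doc_index, score) pairs, best match first."""
    scored = [(i, cosine(query_vec, vec)) for i, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy stand-ins for embeddings; real vectors would come from the model.
query = [1.0, 0.0, 0.0]
docs = [
    [0.0, 1.0, 0.0],   # orthogonal to the query
    [0.9, 0.1, 0.0],   # nearly parallel to the query
    [0.5, 0.5, 0.0],   # partially aligned
]

for index, score in rank(query, docs):
    print(index, round(score, 3))
```

In a multilingual setting the query and the documents can be in different languages; the ranking logic stays the same because both are compared in the shared embedding space.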

Why It Matters

This release addresses a key challenge in multilingual NLP—balancing model size with language coverage and retrieval quality. By offering high-performance, open-source models that support over 200 languages, IBM enables broader access and deployment in multilingual and cross-lingual applications, including search, retrieval-augmented generation, and code understanding. The enterprise-ready design emphasizes responsible data handling and deployment suitability.

Background

Prior models like XLM-RoBERTa provided multilingual support but had limitations in context length and retrieval accuracy. The R2 models are a ground-up rebuild using ModernBERT, which incorporates recent advances in transformer architecture, resulting in improved efficiency and performance. The release builds on IBM’s previous efforts to create scalable, responsible NLP tools for enterprise use, with a focus on broad language support and technical robustness.

“The Granite Embedding Multilingual R2 models significantly narrow the gap between size and performance in multilingual embeddings, supporting over 200 languages with enterprise-level quality.”

— IBM Research

“Our models are designed to be plug-and-play, requiring no task-specific tuning, and are compatible with existing frameworks to facilitate easy integration.”

— IBM Data Science Team

What Remains Unclear

It is not yet clear how these models will perform in large-scale, real-world enterprise deployments, or how they will compare with proprietary models in specific use cases. Further testing and user feedback are expected to clarify their practical effectiveness and limitations.

What’s Next

IBM plans to continue evaluating these models in diverse applications and may release updates or additional tools to enhance their deployment. Monitoring user feedback and benchmarking in real-world scenarios will be key next steps.

Key Questions

What are the main differences between the 97M and 311M models?

The 97M model is the compact option: it produces 384-dimensional embeddings and scores 60.3 on the Multilingual MTEB Retrieval benchmark, the best result among models of similar size. The 311M model scores 65.2 on the same benchmark and suits more demanding applications. Both models handle contexts up to 32,768 tokens.
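Embedding width translates directly into index size, so a rough worked calculation may help with capacity planning. The sketch below assumes vectors are stored as uncompressed 32-bit floats and uses a hypothetical corpus of one million documents; only the 384-dimension figure comes from the article.

```python
def index_bytes(num_vectors, dim, bytes_per_value=4):
    """Raw storage for dense vectors, ignoring index overhead and metadata."""
    return num_vectors * dim * bytes_per_value

def to_gib(num_bytes):
    """Convert bytes to gibibytes (2**30 bytes)."""
    return num_bytes / 2**30

# Assumption: 1,000,000 documents, one 384-dimensional float32 vector each.
raw = index_bytes(1_000_000, 384)
print(raw)                    # 1536000000
print(round(to_gib(raw), 2))  # 1.43
```

A wider embedding scales this figure linearly, which is one reason a compact model can be attractive for large corpora.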

Can these models be used for code retrieval?

Yes, both models support cross-lingual code retrieval across nine programming languages, making them suitable for technical and developer-focused applications.

Are these models ready for enterprise deployment?

Yes, they are designed to be enterprise-ready, with optimized inference options, broad language support, and compliance with responsible data handling practices.

What frameworks are compatible with these models?

They work out of the box with sentence-transformers, transformers, LangChain, LlamaIndex, Haystack, and Milvus, requiring only a one-line model name change for integration.
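As an illustration of the one-line swap, the sketch below keeps the model name in a small configuration object so that switching between the two sizes touches a single field. The repository ids are assumptions based on the model names in this article and should be checked against the published model cards. `EmbedderConfig` and `encode` are hypothetical helpers, and sentence-transformers is imported lazily so that the configuration logic runs without it installed.

```python
from dataclasses import dataclass, replace

# Assumed Hugging Face repository ids; verify before use.
SMALL_MODEL = "ibm-granite/granite-embedding-97m-multilingual-r2"
LARGE_MODEL = "ibm-granite/granite-embedding-311m-multilingual-r2"

@dataclass(frozen=True)
class EmbedderConfig:
    model_name: str
    batch_size: int = 32
    normalize: bool = True

def encode(config, texts):
    """Embed texts with sentence-transformers using the configured model."""
    # Imported here so the rest of the module works without the package.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(config.model_name)
    return model.encode(
        texts,
        batch_size=config.batch_size,
        normalize_embeddings=config.normalize,
    )

# Switching model size changes only the model_name field.
small_cfg = EmbedderConfig(model_name=SMALL_MODEL)
large_cfg = replace(small_cfg, model_name=LARGE_MODEL)
print(large_cfg)
```

Calling `encode(small_cfg, ["hello", "bonjour"])` would download the model on first use. One caveat is that any stored index has to be rebuilt after a swap, because vectors produced by different models are not comparable.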

Author

Artificial Intelligence
