TL;DR

IBM has introduced two new multilingual embedding models, granite-embedding-97m-multilingual-r2 and granite-embedding-311m-multilingual-r2, under Apache 2.0. They offer improved retrieval quality across 200+ languages, support code retrieval, and are designed for enterprise deployment.

IBM has released two new open-source multilingual embedding models, granite-embedding-97m-multilingual-r2 and granite-embedding-311m-multilingual-r2, under the Apache 2.0 license, aiming to improve multilingual retrieval and code understanding for enterprise applications.

The models are built on the ModernBERT architecture and support over 200 languages, 52 of which received explicit training for higher-quality retrieval. The 97M-parameter model scores 60.3 on the Multilingual MTEB Retrieval benchmark, outperforming previous models of similar size. The larger 311M model scores 65.2, ranking second among open models under 500M parameters.

Both models handle context lengths up to 32,768 tokens, support code retrieval across nine programming languages, and are compatible with popular frameworks such as sentence-transformers, LangChain, and Haystack. They are optimized for CPU inference via ONNX and OpenVINO and require no task-specific tuning.
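As a rough illustration of the retrieval workflow described above, the following Python sketch ranks documents against a query by cosine similarity. It is not IBM's reference code. The "ibm-granite/" prefix in MODEL_ID is an assumption (the article gives only the model names), and cosine, rank, embed, and search are hypothetical helper names. The embed function relies on the sentence-transformers SentenceTransformer class and downloads the model on first use.

```python
import math

# Assumed Hugging Face model ID: the article gives only the model name,
# so the "ibm-granite/" organization prefix is an assumption.
MODEL_ID = "ibm-granite/granite-embedding-97m-multilingual-r2"


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query_vec, doc_vecs):
    """Return document indices sorted from most to least similar to the query."""
    scores = [cosine(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


def embed(texts, model_id=MODEL_ID):
    """Embed texts with sentence-transformers (lazy import; downloads the model)."""
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_id)
    return model.encode(texts).tolist()


def search(query, docs, embed_fn=embed):
    """Rank docs against a query; embed_fn can be replaced, e.g. for offline tests."""
    vectors = embed_fn([query] + docs)
    order = rank(vectors[0], vectors[1:])
    return [docs[i] for i in order]
```

With sentence-transformers installed, a cross-lingual call might look like search("¿Cómo restablezco mi contraseña?", ["How to reset your password", "Shipping rates"]). Switching to the 311M model would only require changing MODEL_ID.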

Why It Matters

This release addresses a key challenge in multilingual NLP: balancing model size against language coverage and retrieval quality. By offering high-performance, open-source models that support over 200 languages, IBM enables broader access and deployment in multilingual and cross-lingual applications, including search, retrieval-augmented generation, and code understanding. The enterprise-ready design emphasizes responsible data handling and deployment suitability.

Background

Prior models like XLM-RoBERTa provided multilingual support but had limitations in context length and retrieval accuracy. The R2 models are a ground-up rebuild using ModernBERT, which incorporates recent advances in transformer architecture, resulting in improved efficiency and performance. The release builds on IBM’s previous efforts to create scalable, responsible NLP tools for enterprise use, with a focus on broad language support and technical robustness.

“The Granite Embedding Multilingual R2 models significantly narrow the gap between size and performance in multilingual embeddings, supporting over 200 languages with enterprise-level quality.”

— IBM Research

“Our models are designed to be plug-and-play, requiring no task-specific tuning, and are compatible with existing frameworks to facilitate easy integration.”

— IBM Data Science Team

What Remains Unclear

It is not yet clear how these models will perform in large-scale, real-world enterprise deployments, or how they will compare with proprietary models in specific use cases. Further testing and user feedback are expected to clarify their practical effectiveness and limitations.

What’s Next

IBM plans to continue evaluating these models in diverse applications and may release updates or additional tools to enhance their deployment. Monitoring user feedback and benchmarking in real-world scenarios will be key next steps.

Key Questions

What are the main differences between the 97M and 311M models?

The 97M model is the compact option: it produces 384-dimensional embeddings and scores 60.3 on the Multilingual MTEB Retrieval benchmark. The 311M model scores higher (65.2) and suits more demanding applications. Both handle contexts up to 32,768 tokens.
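To make the size trade-off concrete, here is a back-of-the-envelope Python sketch of how much raw storage a vector index needs at 384 dimensions. It assumes 4-byte float32 values and ignores any index overhead. Both are assumptions, and index_size_bytes and to_gib are hypothetical helper names.

```python
def index_size_bytes(num_vectors, dim=384, bytes_per_value=4):
    """Raw storage for num_vectors embeddings (float32 assumed, no index overhead)."""
    return num_vectors * dim * bytes_per_value


def to_gib(num_bytes):
    """Convert a byte count to GiB (2**30 bytes)."""
    return num_bytes / 2**30


# One million 384-dimensional float32 vectors:
# 1,000,000 * 384 * 4 = 1,536,000,000 bytes, roughly 1.43 GiB.
size = index_size_bytes(1_000_000)
print(size, round(to_gib(size), 2))
```

Passing a different dim or bytes_per_value gives the estimate for other embedding widths or quantized storage.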

Can these models be used for code retrieval?

Yes, both models support cross-lingual code retrieval across nine programming languages, making them suitable for technical and developer-focused applications.

Are these models ready for enterprise deployment?

Yes, they are designed to be enterprise-ready, with optimized inference options, broad language support, and compliance with responsible data handling practices.

What frameworks are compatible with these models?

They work out of the box with sentence-transformers, transformers, LangChain, LlamaIndex, Haystack, and Milvus, requiring only a one-line model name change for integration.
