Show HN: Semble – Code search for agents that uses 98% fewer tokens than grep

TL;DR

Semble is a code search tool built for AI agents. It claims about 98% lower token usage than a grep+read workflow, indexes repositories and answers queries in milliseconds, and runs entirely on CPU with no external services.

Semble, a new code search library tailored for AI agents, has been released, promising to deliver exact code snippets with approximately 98% fewer tokens than traditional grep+read methods.

Semble is built to let agents such as Claude and Codex run fast, precise code searches without external dependencies or a GPU. It indexes an average repository in about 250 milliseconds and answers queries in roughly 1.5 milliseconds, all on CPU. Benchmarks indicate that its retrieval quality is comparable to specialized transformer models, with an NDCG@10 score of 0.854. It can run as an MCP server or be invoked from the command line, and it accepts both local paths and git URLs. Because it returns only the relevant code chunks rather than whole files, it uses about 98% fewer tokens than a traditional grep-and-read workflow.
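
For readers unfamiliar with the NDCG@10 figure quoted above, the following is a minimal sketch of how such a score is commonly computed. It is not taken from Semble's benchmark code; the function names are hypothetical, and it makes the simplifying assumption that the ideal ranking is built only from the relevance labels of the results that were returned.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain of the first k results (linear gain).

    The result at 0-based position `rank` is discounted by log2(rank + 2),
    so the top result counts fully and later results count less.
    """
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """DCG of the ranking divided by the DCG of the best possible ordering.

    Assumption: the ideal ordering is derived from the same relevance labels,
    which ignores relevant items that were never retrieved.
    """
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    # Relevance labels for one query, in the order the results were returned.
    print(round(ndcg_at_k([1, 0, 1, 0, 0]), 3))
```

A score of 1.0 means every relevant result was ranked ahead of every irrelevant one; a benchmark figure is normally the average of this value over many queries.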

Why It Matters

If the reported figures hold, Semble could substantially reduce the cost and latency of code search for AI agents, making development and debugging workflows faster. Because it needs no external services or GPUs, it can run locally and be integrated into existing agent architectures, which may lower operational overhead.

Background

Existing code search tools for AI agents often rely on large transformer models or external services, which can be costly and slow. Semble addresses these limitations with a lightweight, CPU-only approach that is reported to maintain high retrieval accuracy. It responds to a growing need for efficient code search in AI workflows as codebases become larger and more complex.

“Semble indexes an average repo in about 250 ms and answers queries in roughly 1.5 ms, all on CPU, with 98% fewer tokens than grep+read.”

— Semble development team

“Semble achieves an NDCG@10 of 0.854, comparable to specialized transformer models, at a fraction of the size and cost.”

— Benchmark researchers

What Remains Unclear

It is not yet clear how Semble performs across very large or complex codebases, or how it compares in real-world debugging scenarios. Long-term stability and integration support are still to be tested in diverse environments.
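
One way to probe the open question above on your own repositories is to time a search function over a set of representative queries. The sketch below is a generic harness, not part of Semble: `search` stands for any callable you supply (for example a thin wrapper around whatever search tool you are evaluating), and the function name and returned keys are hypothetical.

```python
import statistics
import time
from typing import Callable, Dict, Iterable


def measure_latency_ms(
    search: Callable[[str], object],
    queries: Iterable[str],
    repeats: int = 5,
) -> Dict[str, float]:
    """Run every query `repeats` times and summarize wall-clock latency in ms."""
    samples = []
    for query in queries:
        for _ in range(repeats):
            start = time.perf_counter()
            search(query)  # the result is discarded; only timing is recorded
            samples.append((time.perf_counter() - start) * 1000.0)
    if not samples:
        raise ValueError("no queries were provided")
    samples.sort()
    # Simple index-based 95th percentile over the sorted samples.
    p95_index = min(len(samples) - 1, int(0.95 * len(samples)))
    return {
        "median_ms": statistics.median(samples),
        "p95_ms": samples[p95_index],
        "max_ms": samples[-1],
    }


if __name__ == "__main__":
    # Stand-in search function so the harness can be run on its own.
    corpus = ["def parse_config(path): ...", "class Index: ...", "def main(): ..."]
    print(measure_latency_ms(lambda q: [c for c in corpus if q in c], ["parse", "Index"]))
```

Reporting the 95th percentile and maximum alongside the median helps show whether latency stays low on large repositories or only on average.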

What’s Next

Next steps include broader adoption by AI development teams, integration into various agent frameworks, and further benchmarking in production settings. Updates may include enhancements to indexing speed, query accuracy, and support for additional code repositories.

Key Questions

How does Semble achieve such a high speed and low token usage?

Semble uses optimized indexing and retrieval algorithms that focus on returning only the most relevant code snippets, drastically reducing token consumption and response time.
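
To illustrate why returning only relevant snippets saves tokens, here is a toy comparison between reading every matching file in full and returning a small window around each match. This is not Semble's retrieval algorithm, which the source does not describe; it uses plain substring matching and a whitespace word count as a rough stand-in for tokens, and all function names are hypothetical.

```python
from typing import Dict


def count_tokens(text: str) -> int:
    """Crude token proxy: number of whitespace-separated words."""
    return len(text.split())


def grep_and_read(files: Dict[str, str], term: str) -> str:
    """Baseline: return the full contents of every file that mentions `term`."""
    return "\n".join(body for body in files.values() if term in body)


def chunk_search(files: Dict[str, str], term: str, context: int = 1) -> str:
    """Return only the matching lines plus `context` lines on either side.

    Overlapping windows are not merged, which is acceptable for a toy example.
    """
    snippets = []
    for body in files.values():
        lines = body.splitlines()
        for i, line in enumerate(lines):
            if term in line:
                lo, hi = max(0, i - context), min(len(lines), i + context + 1)
                snippets.append("\n".join(lines[lo:hi]))
    return "\n".join(snippets)


def token_savings(baseline: int, reduced: int) -> float:
    """Fraction of tokens saved relative to the baseline."""
    return 1 - reduced / baseline


if __name__ == "__main__":
    filler = ["pad = 0"] * 50
    target = ["def parse_config(path):", "    return open(path).read()"]
    files = {"app.py": "\n".join(filler + target + filler)}
    full = count_tokens(grep_and_read(files, "parse_config"))
    chunk = count_tokens(chunk_search(files, "parse_config"))
    print(full, chunk, f"{token_savings(full, chunk):.1%}")
```

The saving grows with file size: the baseline cost scales with the length of every matching file, while the snippet cost scales only with the number of matches.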

Can Semble be integrated with existing AI agents easily?

Yes. Semble can run as an MCP server or be invoked from the command line, and it is compatible with agents such as Claude and Codex.

Does Semble require external hardware or cloud services?

No, it runs entirely on CPU without needing API keys, GPUs, or external services, making it suitable for local deployment.

What are the limitations or open questions about Semble?

Performance on very large codebases or in complex search scenarios remains to be fully evaluated. Long-term stability and support across diverse environments are still to be confirmed.

Author

Artificial Intelligence
