TL;DR

Semble is a code search tool built for AI agents. Its developers report about 98% fewer tokens than a grep+read workflow, with indexing and queries that complete in milliseconds. It runs entirely on CPU and requires no external services.

Semble, a new code search library for AI agents, has been released. It aims to return the exact code snippets an agent needs while using roughly 98% fewer tokens than a traditional grep+read workflow.

Semble is built so that agents such as Claude and Codex can run fast, precise code searches without external dependencies or a GPU. According to its developers, it indexes an average repository in around 250 milliseconds and answers a query in approximately 1.5 milliseconds, all on CPU. Benchmarks indicate that its retrieval quality is comparable to that of specialized transformer models, with an NDCG@10 score of 0.854. It can run as an MCP server or be invoked from the command line, and it accepts both local paths and git URLs. Because it returns only the relevant code chunks rather than whole files, it uses up to 98% fewer tokens than a grep-and-read workflow.
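For readers unfamiliar with the metric, NDCG@10 scores how well the top ten results of a search are ordered, rewarding relevant results that appear near the top. The Python sketch below is a minimal illustration of the standard formula, not Semble's benchmark code. The `ranked` relevance labels are invented for the example, and the sketch normalizes against the best ordering of the retrieved list only, whereas full benchmarks usually normalize against every relevant item in the corpus.

```python
import math


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the first k results (rank 1 is undiscounted)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k=10):
    """DCG divided by the DCG of the same labels sorted from most to least relevant."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Hypothetical labels for one query: 1 = relevant result, 0 = irrelevant result.
ranked = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
score = ndcg_at_k(ranked)
print(f"NDCG@10 = {score:.3f}")
```

Under this normalization, a score of 1.0 means every relevant result was ranked ahead of every irrelevant one. A reported figure such as 0.854 is an average of this kind of score over many benchmark queries.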

Why It Matters

Cheaper, faster code search matters for AI agents because every token an agent reads adds cost and latency. Because Semble needs no external services or GPUs, it can run locally and be added to existing agent architectures, which could improve productivity and lower operational overhead.

Background

Existing code search tools for AI agents often rely on large transformer models or external services, which can be costly and slow. Semble addresses these limitations with a lightweight, CPU-only design that is reported to maintain high retrieval accuracy. It responds to a growing need for efficient code search in AI workflows as codebases become larger and more complex.

“Semble indexes an average repo in about 250 ms and answers queries in roughly 1.5 ms, all on CPU, with 98% fewer tokens than grep+read.”

— Semble development team

“Semble achieves an NDCG@10 of 0.854, comparable to specialized transformer models, at a fraction of the size and cost.”

— Benchmark researchers

What Remains Unclear

It is not yet clear how Semble performs across very large or complex codebases, or how it compares in real-world debugging scenarios. Long-term stability and integration support are still to be tested in diverse environments.

What’s Next

Next steps include broader adoption by AI development teams, integration into various agent frameworks, and further benchmarking in production settings. Updates may include enhancements to indexing speed, query accuracy, and support for additional code repositories.

Key Questions

How does Semble achieve such a high speed and low token usage?

Semble uses optimized indexing and retrieval algorithms that focus on returning only the most relevant code snippets, drastically reducing token consumption and response time.
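The effect of returning snippets instead of whole files can be illustrated with a toy comparison. The Python sketch below is a hypothetical example, not Semble's retrieval method or its benchmark: it uses plain substring matching as a stand-in for search, counts whitespace-separated words as a rough proxy for tokens, and runs on two made-up files (`auth.py` and `db.py`) padded with filler comments.

```python
def count_tokens(text):
    """Crude token proxy: whitespace-separated words (real tokenizers differ)."""
    return len(text.split())


def matching_snippets(body, query, context=1):
    """Return each matching line plus `context` following lines, not the whole file."""
    lines = body.splitlines()
    return [
        "\n".join(lines[i:i + 1 + context])
        for i, line in enumerate(lines)
        if query in line
    ]


# Two hypothetical files: a short function followed by 200 lines of filler.
filler = "# filler line\n" * 200
files = {
    "auth.py": "def login(user):\n    return check(user)\n" + filler,
    "db.py": "def connect():\n    return login_db()\n" + filler,
}
query = "login"

# grep+read baseline: the agent reads every file that contains a match.
baseline_tokens = sum(count_tokens(body) for body in files.values() if query in body)

# Snippet retrieval: the agent receives only the matching lines and their context.
snippet_tokens = sum(
    count_tokens(snippet)
    for body in files.values()
    for snippet in matching_snippets(body, query)
)

saved = 1 - snippet_tokens / baseline_tokens
print(f"baseline={baseline_tokens} snippets={snippet_tokens} saved={saved:.1%}")
```

The exact percentage here depends entirely on how much filler the toy files contain, so it says nothing about the reported 98% figure. It only shows why the savings grow as matched files get longer relative to the code that is actually relevant.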

Can Semble be integrated with existing AI agents easily?

Yes. Semble can run as an MCP server or be invoked from the command line, and it is described as compatible with agents such as Claude and Codex, with straightforward setup instructions.

Does Semble require external hardware or cloud services?

No, it runs entirely on CPU without needing API keys, GPUs, or external services, making it suitable for local deployment.

What are the limitations or open questions about Semble?

Performance on very large codebases or in complex search scenarios remains to be fully evaluated. Long-term stability and support across diverse environments are still to be confirmed.
