TL;DR

A user ran Qwen 3.5 9B (Q4) locally on a Mac M4 with 24GB of RAM at roughly 40 tokens per second, fast enough for coding assistance and basic research. The result shows what consumer hardware can do for local AI, within clear limits compared with state-of-the-art models.

A user has demonstrated that a mid-sized AI model can run locally on a Mac M4 with 24GB of RAM, handling basic research and task automation without internet access. The result matters because it points toward more accessible, privacy-conscious AI use on consumer hardware.

The experiment ran Qwen 3.5 9B (Q4) on a MacBook Pro with 24GB of RAM, using LM Studio to serve the model and OpenCode as the coding agent. The user reported about 40 tokens per second with thinking mode enabled, enough to support coding assistance and research. The model cannot match larger state-of-the-art models on complex, multi-step problem solving, but it performs well in interactive workflows where the user guides it step by step.
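
LM Studio serves loaded models through an OpenAI-compatible HTTP endpoint, by default at http://localhost:1234/v1, so any standard client can talk to the local model. A minimal sketch follows; the model identifier qwen3.5-9b is a placeholder, since the exact name depends on the build LM Studio downloads.

```python
# Minimal sketch: query a model served by LM Studio's local,
# OpenAI-compatible endpoint. Assumes `pip install openai` and that
# the server is running on its default port (1234).
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:1234/v1",  # LM Studio's local server
    api_key="lm-studio",                  # any non-empty string works locally
)

response = client.chat.completions.create(
    model="qwen3.5-9b",  # placeholder: use the identifier LM Studio lists
    messages=[
        {"role": "user", "content": "Summarize what a KV cache does."},
    ],
)
print(response.choices[0].message.content)
```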

Setup took real configuration effort: enabling ‘thinking mode’ and tuning sampling parameters such as temperature and top_p (a sketch of the request fields is shown below). The user noted that models in this class struggle with long, complex tasks and can get distracted or stuck, but they remain useful for basic automation and research. The user also compared tools such as Pi and OpenCode, with preference coming down to usability and default settings.
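
For tools that send raw HTTP, the same knobs appear as fields in the request body. A hedged sketch using the requests library; the temperature and top_p values are illustrative starting points, not settings the user reported.

```python
# Sketch: the same local endpoint via raw HTTP, with explicit sampling
# parameters spelled out in the payload.
import requests

payload = {
    "model": "qwen3.5-9b",  # placeholder identifier
    "messages": [{"role": "user", "content": "Plan a small refactor."}],
    "temperature": 0.6,     # lower = more deterministic output
    "top_p": 0.95,          # nucleus sampling cutoff
    "max_tokens": 1024,
}

r = requests.post(
    "http://localhost:1234/v1/chat/completions", json=payload, timeout=120
)
r.raise_for_status()
print(r.json()["choices"][0]["message"]["content"])
```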

Why It Matters

Running capable models on consumer-grade hardware reduces reliance on cloud services and keeps data on the device. It also makes AI experimentation accessible outside large data centers, though with clear limits compared with larger, more powerful models.

Background

Until recently, running large language models locally generally meant high-end servers or specialized hardware. Recent work has focused on quantizing models and tuning configurations to fit consumer machines. The experiment fits the broader trend toward democratizing AI access, showing that modest hardware can support useful AI functions with the right setup, though these models remain far from replacing state-of-the-art systems on complex, long-running tasks.

“It’s surprisingly good for something that can run on a 24GB Macbook Pro while leaving space for lots of other things running too.”

— Hacker News user

“While it’s not as capable as SOTA models, it encourages a more engaged workflow and offers a level of privacy and independence.”

— Hacker News user

What Remains Unclear

It is not yet clear how scalable or stable these setups are over extended use or for more complex tasks. The performance varies depending on configurations, and the user’s experience might differ with different models or hardware setups. Additionally, the long-term practicality and ease of setup for casual users remain uncertain.

What’s Next

Next steps include refining configuration settings, testing additional models, and exploring automation for setup. Further experimentation will determine how well these models can handle more demanding tasks and whether user-friendly tools can simplify the process for broader adoption.

Key Questions

Can I run these models on my own Mac with 24GB RAM?

Yes, with appropriate setup and configuration, models like Qwen 3.5 9B can run on a Mac M4 with 24GB RAM, supporting basic AI tasks without internet access.
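
A back-of-envelope calculation makes the fit plausible: 4-bit (Q4) quantization stores about half a byte per parameter, so the weights of a 9B model need roughly 4.5 GB, leaving room in 24GB for the KV cache, the OS, and other applications.

```python
# Rough memory estimate for a Q4-quantized 9B model. Actual usage is
# higher once the KV cache, activations, and runtime overhead are added.
params = 9e9                 # 9 billion parameters
bytes_per_param = 4 / 8      # 4-bit quantization: 0.5 bytes per weight
weights_gb = params * bytes_per_param / 1e9
print(f"Weights alone: ~{weights_gb:.1f} GB")  # ~4.5 GB of 24 GB
```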

What are the limitations of running local models on consumer hardware?

They are limited in handling complex, multi-step tasks, may get distracted or stuck, and cannot match the capabilities of large, state-of-the-art models. Performance and stability depend heavily on configuration and model choice.

Do I need technical expertise to set this up?

Yes, setting up local models requires configuring software like LM Studio or OpenCode, adjusting parameters, and managing dependencies, which may be challenging for non-technical users.
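
One check that removes a common source of confusion during setup: confirm the local server is actually running before pointing an agent at it. A small sketch, assuming LM Studio's default port; the endpoint follows the OpenAI API convention.

```python
# Sketch: verify the local OpenAI-compatible server is reachable and
# list the models it has loaded. Assumes LM Studio's default port.
import requests

try:
    r = requests.get("http://localhost:1234/v1/models", timeout=5)
    r.raise_for_status()
    for m in r.json().get("data", []):
        print("available:", m["id"])
except requests.ConnectionError:
    print("No server on localhost:1234. Start it from LM Studio first.")
```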

Will these models replace cloud AI services?

Currently, they are suitable for basic tasks and research but cannot replace cloud-based, high-performance models for complex or commercial applications.
