
Heavily upgraded M3 Ultra Mac Studio is great for AI projects

The Mac Studio is a great system for running AI models like DeepSeek locally. That is, if you're prepared to pay for the M3 Ultra and a lot of upgrades.

Apple Silicon is dominating the AI-capable PC market, with its suitability for machine learning workloads making it an attractive purchase. A new video shows that the M3 Ultra Mac Studio can offer people working in AI a considerable amount of performance to play with.

The video from Dave2D discusses the M3 Ultra Mac Studio, currently the fastest Mac in Apple's lineup. However, the configuration being tested is meant to demonstrate the extremes of the hardware, rather than a more modest and conventional loadout.

The version shown in the video uses an M3 Ultra chip with the upper-tier configuration of a 32-core CPU, an 80-core GPU, and the 32-core Neural Engine. It's also packed with 512GB of unified memory, the maximum amount available for the model, with a memory bandwidth of 819GB/s.

Local LLM usage

While the review notes that the model doesn't make a massive change to typical video content creator workflows, it does focus on the enormous amount of memory on offer, and the machine's ability to run massive Large Language Models (LLMs) used for AI applications.

The most obvious use is running an LLM locally, without needing to send requests out to a server farm. In settings such as a hospital, patient privacy requirements mean that keeping data on-site, where possible, is a better option than sending it off-site for processing.
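For illustration, here is a minimal sketch of what an entirely on-site query might look like, assuming a local Ollama server hosting the model on its default port. The model tag and prompt are placeholders, not details from the video.

    # Minimal sketch: querying a locally hosted LLM over localhost,
    # assuming an Ollama server on its default port. Nothing here
    # leaves the machine, which is the point for privacy-sensitive data.
    import json
    import urllib.request

    payload = json.dumps({
        "model": "deepseek-r1:671b",  # illustrative model tag
        "prompt": "Summarize this intake note: ...",
        "stream": False,
    }).encode("utf-8")

    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )

    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["response"])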

To test it out, DeepSeek R1 was loaded and run locally on the Mac Studio. This wasn't easily possible on previous Macs, as the 671-billion-parameter model requires just over 400 gigabytes of storage and a bit less than 450 gigabytes of video memory to function.

Since Apple Silicon uses unified memory, the top 512GB configuration is able to handle the massive footprint of the model. Though you can run lower-parameter versions on a Mac with less memory, only the highest memory configuration can run the biggest DeepSeek R1 model.
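The arithmetic behind those figures is straightforward. A rough back-of-envelope estimate, assuming the commonly distributed 4-bit quantization (exact numbers vary with the format and runtime overhead):

    # Back-of-envelope memory estimate for a 671-billion-parameter model,
    # assuming roughly 4 bits (0.5 bytes) per weight.
    params = 671e9
    bytes_per_param = 0.5
    weights_gb = params * bytes_per_param / 1e9
    print(f"Weights alone: ~{weights_gb:.0f} GB")  # ~336 GB
    # KV cache, activations, and runtime overhead push the working set
    # toward the roughly 450GB figure cited in the video.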

Indeed, during testing it was found that macOS by default limits the amount of unified memory that can be allocated as video memory to 384GB, so that cap had to be overridden before testing could begin.
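On Apple Silicon, this cap can be raised with macOS's iogpu.wired_limit_mb sysctl. A sketch of the command, with the value (450GB expressed in megabytes) purely illustrative; the setting resets on reboot:

    sudo sysctl iogpu.wired_limit_mb=460800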

When it was up and running, the Mac Studio churned through queries at approximately 17 to 18 tokens per second, a rate generally considered usable for the majority of LLM applications.
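Throughput like this is easy to verify yourself. A sketch assuming the same local Ollama endpoint as above, using the eval_count and eval_duration fields its API returns with each response:

    # Sketch: computing generation throughput from Ollama's response
    # metadata (eval_duration is reported in nanoseconds).
    import json
    import urllib.request

    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps({
            "model": "deepseek-r1:671b",  # illustrative model tag
            "prompt": "Explain unified memory in one paragraph.",
            "stream": False,
        }).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

    with urllib.request.urlopen(req) as resp:
        body = json.loads(resp.read())

    tokens_per_sec = body["eval_count"] / (body["eval_duration"] / 1e9)
    print(f"{tokens_per_sec:.1f} tokens/sec")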

While LLM performance is one factor of interest, another is power consumption, as running these models can require a lot of resources. On the Mac Studio, the model was observed drawing 160 to 180 watts during use.

This may sound like a lot, but it is relatively small compared with a custom-built PC using multiple GPUs for the same task. The video proposes that the power draw of such a hypothetical system could be ten times that of the Mac Studio.

An expensive option

Even though the Mac Studio with M3 Ultra seems like a great option for LLM usage and development, there is a big drawback in terms of cost.

The 512GB option is only available with the upper-tier M3 Ultra chip, which adds $1,500 to the base model's $3,999 price. Going from the base 96GB of memory to 512GB is a further $4,000 on top, bringing the total cost to $9,499 with the base 1TB of storage left untouched.

Dropping around $10,000 on a Mac for AI purposes is out of the question for most people. But it can be an option for businesses and corporations that have the finances and can more easily justify having a Mac Studio with M3 Ultra to run an on-site LLM.

Such operations may even consider creating a cluster of Macs, if they have the budget.


26 Comments

blastdoor 16 Years · 3700 comments

Even though the Mac Studio with M3 Ultra seems like a great option for LLM usage and development, there is a big drawback in terms of cost.


But compared to what? How much would you have to spend to do the same job on a PC?

4 Likes · 1 Dislike
RDW 3 Years · 8 comments

Well, DUH! Who knew if you throw a lot of money and spec out a machine to the fullest with tons of RAM and GPU and CPU cores, it would be a beast of a machine?

1 Like · 0 Dislikes
CarmB 5 Years · 97 comments

Considering how much is being made regarding the power needed to implement AI, the most important element of Apple's hardware is its efficiency. Could be a dramatic advantage going forward. 

4 Likes · 1 Dislike
tiredskills 1 Year · 77 comments

Why on earth would I want to run an AI model?  Locally or otherwise?

2 Likes · 5 Dislikes
brianus 19 Years · 180 comments

Why on earth would I want to run an AI model?  Locally or otherwise?

I’m sure this was meant to be snarky, but for me it’s a genuine question: what are the envisioned real world use cases? What might a business (even a home one) use a local LLM for?

The article mentions a hospital in the context of patient privacy, but what would that model actually be *doing*?

3 Likes · 1 Dislike