Apple’s Hidden AI Strategy

Waiting it out with token avoidance as a first principle

Thank you for tuning in to week 205 of the Lindahl Letter publication. A new edition arrives every Friday. This week the topic under consideration for the Lindahl Letter is, “Apple’s Hidden AI Strategy: Waiting it out with token avoidance as a first principle.”

Apple’s relative restraint in deploying large-scale generative AI isn’t just about privacy posturing or design philosophy. Maybe it is just the latest supply chain management initiative, this time aimed at managing tokens. It may reflect a deliberate avoidance of token-expensive cloud inference, an infrastructural and financial commitment that Apple has historically chosen not to make. The choice is akin to keeping supply chain costs down, and that kind of effort fits the company’s general operating model. Right now, absorbing heavy token usage would just eat profits.

Apple’s approach to “Apple Intelligence,” announced in 2024, hinges on three pillars:

  1. On-device first: Apple designed its models (small language models and transformer variants) to run locally on A17+ and M-series chips. This dramatically reduces reliance on cloud GPUs and token accounting. If you generate 200 tokens on your phone, there’s no inference cost to Apple. This method avoids cloud costs but shifts the load onto the device’s own hardware (see the routing sketch after this list).

  2. Private Cloud Compute: For tasks that exceed the capabilities of on-device models, Apple routes requests to its proprietary cloud using Secure Enclaves. But this only happens for high-value or infrequent tasks. That would include things like summarizing a document, generating email replies, or rewriting notes. This keeps cloud token loads minimal and predictable.

  3. Selective rollout: Apple isn’t putting generative models everywhere. The system isn’t always listening, and “AI” is offered as an opt-in assistant across Mail, Notes, Safari, and Siri. There’s no ChatGPT clone embedded system-wide, and certainly nothing ambient in the way Gemini could theoretically become.
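
To make the on-device-first routing concrete, here is a minimal sketch in Swift of how a request could stay local by default and escalate to a cloud fallback only when it exceeds a local budget. The `InferenceRequest`, `LocalModel`, `PrivateCloudCompute`, and `Router` types and the 512-token cutoff are hypothetical illustrations of the idea, not Apple’s actual APIs or thresholds.

```swift
// Hypothetical request description; not an Apple API.
struct InferenceRequest {
    let prompt: String
    let estimatedTokens: Int   // rough size of the expected completion
}

// Illustrative stand-ins for the two execution paths.
protocol InferenceBackend {
    func run(_ request: InferenceRequest) -> String
}

struct LocalModel: InferenceBackend {
    // On-device small language model: no per-token cost to the vendor.
    func run(_ request: InferenceRequest) -> String {
        "[on-device] \(request.prompt.prefix(32))"
    }
}

struct PrivateCloudCompute: InferenceBackend {
    // Cloud fallback: every token generated here shows up on someone's bill.
    func run(_ request: InferenceRequest) -> String {
        "[private cloud] \(request.prompt.prefix(32))"
    }
}

// On-device-first routing: only escalate when the task exceeds
// what the local model can plausibly handle.
struct Router {
    let local = LocalModel()
    let cloud = PrivateCloudCompute()
    let localTokenBudget = 512   // assumed cutoff, purely illustrative

    func handle(_ request: InferenceRequest) -> String {
        if request.estimatedTokens <= localTokenBudget {
            return local.run(request)   // zero marginal token cost
        }
        return cloud.run(request)       // rare, high-value escalation
    }
}

let router = Router()
print(router.handle(InferenceRequest(prompt: "Rewrite this note", estimatedTokens: 200)))
print(router.handle(InferenceRequest(prompt: "Summarize this long document", estimatedTokens: 4_000)))
```

The point of the sketch is that the expensive path is the exception, not the default: most requests never touch a metered backend at all.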

Looking at the bottom line and balance sheet concerns, you can see why Apple’s caution probably makes financial sense in the long run. Apple sells hardware, not compute. Even some of the cloud-forward vendors might operate at a loss at Apple scale. Unlike Google or Microsoft, Apple doesn’t have an economic engine tied to cloud usage. If it gave every iPhone user unlimited generative AI access via the cloud, it would have to subsidize trillions of tokens per year with no monetization in return. Nothing in that workflow has any ROI for Apple: the hardware is already a sunk cost, and the company has not offered a standalone monthly AI service. Apple lets everybody else spend billions on hardware, data centers, and electricity.
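
To put rough numbers on that cloud subsidy, here is a back-of-envelope sketch. The device count, daily token usage, and per-million-token inference cost below are assumptions chosen purely for illustration; none of them are reported figures.

```swift
import Foundation

// All inputs are illustrative assumptions, not reported figures.
let activeDevices = 1.5e9            // assumed iPhone-class devices in active use
let tokensPerDevicePerDay = 1_000.0  // assumed daily generative usage per device
let costPerMillionTokens = 2.0       // assumed blended cloud inference cost, USD

let tokensPerYear = activeDevices * tokensPerDevicePerDay * 365
let annualCostUSD = tokensPerYear / 1_000_000 * costPerMillionTokens

// Prints roughly 548 trillion tokens and about $1.1 billion per year
// under these made-up inputs.
print(String(format: "Tokens per year: %.0f trillion", tokensPerYear / 1e12))
print(String(format: "Annual inference cost: $%.1f billion", annualCostUSD / 1e9))
```

Even with deliberately conservative inputs, the bill scales linearly with how chatty a billion-plus devices decide to be, which is exactly the exposure Apple appears to be avoiding.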

Instead, Apple wants:

  • Efficiency over scale.

  • Local inference over cloud latency.

  • Sporadic usage over daily token floods.

In short: Apple is playing defense against the token apocalypse before it ever hits. That apocalypse will arrive when billions of devices become token hungry.

What’s next for the Lindahl Letter? New editions arrive every Friday. If you are still listening at this point and enjoyed this content, then please take a moment and share it with a friend. If you are new to the Lindahl Letter, then please consider subscribing. Make sure to stay curious, stay informed, and enjoy the week ahead!
