My iPhone 16 Pro Max produces garbage output when running MLX LLMs
journal.rafaelcosta.me262 points by rafaelcosta 11 hours ago
Methodology is one thing; I can't really agree that deploying an LLM to do sums is a great test. Almost as hilarious as asking "What's moon plus sun?"
But the phenomenon is another thing. Apple's numerical APIs are producing inconsistent results on a minority of devices. That is something worth Apple's attention.
(This is a total digression, so apologies)
My mind instantly answered that with "bright", which is what you get when you combine the sun and moon radicals to make 明(https://en.wiktionary.org/wiki/%E6%98%8E)
Anyway, that question is not without reasonable answers. "Full Moon" might make sense too. No obvious deterministic answer, though, naturally.
FTR the Full Moon was exactly 5 hours ago (It's not without humour that this conversation occurs on the day of the full moon :)
> What's moon plus sun?
Eclipse, obviously.
That’s sun minus moon. Moon plus sun is a wildly more massive, nuclear furnace of a moon that also engulfs the earth.
Reminds me of this AI word combination game recently shared on HN, with almost exactly these mechanics:
https://neal.fun/infinite-craft/
For the record, Sun+Moon is indeed eclipse.
>Moon plus sun is a wildly more massive, nuclear furnace of a moon that also engulfs the earth.
I just looked up the mass of the sun vs the mass of the moon (they differ by ~10^30 kg vs ~10^22 kg), and the elemental composition of the sun: the moon would entirely disappear into the insignificant digits of the trace elements, which are in the range of 0.01% of the sun. I could be off by orders of magnitude all over the place and it would still disappear.
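A quick back-of-the-envelope check in Swift, using approximate fact-sheet values (the exact figures are assumptions, and they don't matter at these scales):

    // Rough masses in kilograms (approximate fact-sheet values).
    let sunMass: Double = 1.99e30
    let moonMass: Double = 7.35e22

    // The moon is roughly 27 million times lighter than the sun...
    print(sunMass / moonMass)           // ~2.7e7

    // ...and even the ~0.01% of the sun that is trace elements
    // outweighs the whole moon by a factor of thousands.
    let traceElements = sunMass * 1e-4  // ~2e26 kg
    print(traceElements / moonMass)     // ~2.7e3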
This thread reminds me of Scribblenauts, the game where you conjure objects to solve puzzles by describing them. I suspect it was an inspiration for Baba Is You.
Scribblenauts was also an early precursor to modern GenAI/word embeddings. I constantly bring it up in discussions of the history of AI for this reason.
Not obvious. Astronomers are actively looking for signatures of exomoons around exoplanets. So "sun plus moon" could mean that too.
The OP said moon + sun, rather than sun + moon. We have no idea yet if celestial math is non-commutative.
Well, you find the signature by looking for a dip in the sun's luminosity. So minus might be the better relationship here.
I wish he would have tried on a different iPhone 16 Pro Max to see if the defect was specific to that individual device.
So true! And as any sane Apple user or the standard-template Apple Support person would have suggested (and as they actually suggest): did they try reinstalling the OS from scratch after resetting the data (after backing it up first, of course; preferably with a hefty iCloud+ plan)? Because that's the thing to do for such issues, and it's very easy.
Reinstalling the OS sucks. I need to pull all my bank cards out of my safe and re-add their CVVs to the wallet, and sometimes authenticate over the phone. And re-register my face. And log back in to all my apps. It can take an hour or so, except it's spread out over weeks as I open an app and realize I need to log in, a dozen times over.
Latest update at the bottom of the page.
"Well, now it's Feb. 1st and I have an iPhone 17 Pro Max to test with and... everything works as expected. So it's pretty safe to say that THAT specific instance of iPhone 16 Pro Max was hardware-defective."
Low-level numerical operation optimizations are often not reproducible. For example: https://www.intel.com/content/dam/develop/external/us/en/doc... (2013)
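A minimal Swift sketch of the effect: summing the same data left-to-right versus in four interleaved partial sums (a crude stand-in for how a SIMD or threaded reduction reassociates the adds) usually rounds differently:

    // One million small floats, summed two ways.
    let xs = (0..<1_000_000).map { _ in Float.random(in: 0..<1) }

    // Plain left-to-right accumulation.
    let forward = xs.reduce(0, +)

    // Four interleaved partial sums, combined at the end -- a crude
    // stand-in for a 4-lane SIMD reduction.
    var lanes = [Float](repeating: 0, count: 4)
    for (i, x) in xs.enumerated() { lanes[i % 4] += x }
    let vectorized = lanes.reduce(0, +)

    print(forward == vectorized)  // usually false: same data, different rounding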
But it's still surprising that the LLM doesn't work on the iPhone 16 at all. After all, LLMs are known for their tolerance to quantization.
Yes, "floating point accumulation doesn't commute" is a mantra everyone should have in their head, and when I first read this article, I was jumping at the bit to dismiss it out of hand for that reason.
But, what got me about this is that:
* every other Apple device delivered the same results
* Apple's own LLM silently failed on this device
To me, that behavior suggests an unexpected failure rather than a fundamental issue; it seems Bad (TM) that Apple would ship devices where their own LLM didn't work.
> floating point accumulation doesn't commute
It is commutative (except for NaN). It isn't associative though.
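A two-line Swift illustration of the difference:

    let a: Float = 1e8, b: Float = -1e8, c: Float = 1

    // Commutative: swapping the two operands never changes the rounded result.
    print(a + b == b + a)  // true

    // Not associative: grouping changes where the rounding error lands.
    print((a + b) + c)     // 1.0
    print(a + (b + c))     // 0.0, because b + c rounds back to -1e8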
I think it commutes even when one or both inputs are NaN? The output is always NaN.
NaNs are distinguishable. /Which/ NaN you get doesn't commute.
I guess at the bit level, but not at the level of computation? Anything that relies on bit patterns of NaNs behaving in a certain way (like how they propagate) is in dangerous territory.
> Anything that relies on bit patterns of NaNs behaving in a certain way (like how they propagate) is in dangerous territory.
Why? This is well specified by IEEE 754. Many runtimes (e.g. for JavaScript) use NaN boxing. Treating floats as a semi-arbitrary selection of rational numbers plus a handful of special values is /more/ correct than treating them as real numbers, but treating them as they are actually specified gives even more flexibility and power.
Can you show me where in the IEEE spec this is guaranteed?
My understanding is the exact opposite - that it allows implementations to return any NaN value at all. It need not be any that were inputs.
It may be that JavaScript relies on it and that has become more binding than the actual spec, but I don't think the spec actually guarantees this.
Edit: actually it turns out NaN-boxing does not involve arithmetic, which is why it works. I think my original point stands: if you are doing something that relies on how bit values of NaNs are propagated during arithmetic, you are on shaky ground.
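For anyone curious, a minimal sketch of the NaN-boxing idea (an illustrative layout, not any particular engine's): the payload goes in and out with pure bit operations, so the boxed value never passes through floating-point arithmetic:

    // Quiet double NaN with all payload bits clear.
    let quietNaNBits: UInt64 = 0x7FF8_0000_0000_0000

    // Stash a 48-bit value in the NaN's payload bits.
    func box(_ value: UInt64) -> Double {
        precondition(value < (1 << 48))
        return Double(bitPattern: quietNaNBits | value)
    }

    // Recover it by masking the payload bits back out.
    func unbox(_ d: Double) -> UInt64 {
        return d.bitPattern & 0x0000_FFFF_FFFF_FFFF
    }

    let boxed = box(0xBEEF)
    print(boxed.isNaN)             // true
    print(unbox(boxed) == 0xBEEF)  // true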
Don't have the spec handy, but specifically binary operations combining two NaN inputs must result in one of the input NaNs. For all of Intel SSE, AMD SSE, PowerPC, and ARM, the left-hand operand is returned if both are signaling or both are quiet. x87 does weird things (but when doesn't it?), and ARM does weird things when mixing signaling and quiet NaNs.
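You can poke at this from Swift; which payload survives below is hardware-dependent, so the exact bit patterns printed are not portable:

    // Two quiet NaNs with different payloads.
    let a = Float(nan: 1, signaling: false)
    let b = Float(nan: 2, signaling: false)

    // At the value level, addition still "commutes": NaN either way.
    print((a + b).isNaN, (b + a).isNaN)  // true true

    // At the bit level, which input's payload survives is up to the
    // hardware, so these two lines may print different patterns.
    print(String((a + b).bitPattern, radix: 16))
    print(String((b + a).bitPattern, radix: 16))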