AI in Mobile Apps: What Is Working Beyond the Hype
The integration of AI capabilities into mobile applications has followed the familiar hype cycle pattern: an initial period of breathless coverage about what AI would do for apps, followed by a quieter period of teams discovering which AI features users actually value and which are dismissed as gimmicks within the first week of use. The dust has not fully settled, but the outline of what works is becoming clear.
The AI features that have demonstrated durable user value are mostly not the ones that received the most attention during the hype phase. Large language model chatbots embedded in apps — the most visible AI feature of the 2023-2024 period — have retention profiles that most teams find disappointing. Users try them, find them useful or impressive in isolated interactions, and then forget to use them because the chat interface requires more effort than the specific task typically warrants.
On-Device AI and the Privacy Advantage
The most consequential development in mobile AI is not the cloud-based LLM features that dominated coverage but the improvement in on-device AI capabilities enabled by the Neural Processing Units in Apple’s A-series chips and Google’s Tensor chips. Processing that previously required sending user data to cloud servers — image classification, speech recognition, translation, autocomplete — can now run entirely on the device with meaningful quality.
The privacy implications are significant and increasingly important to users who have become more aware of data collection practices. A photo editing app that runs style transfer on the device, without ever sending the user’s photos to a server, has a privacy story that cloud-based alternatives cannot match. Apple’s privacy marketing has taught users to ask where their data goes, and on-device processing is the correct answer to that question.
Core ML on iOS, and ML Kit and TensorFlow Lite on Android, provide the inference infrastructure for on-device AI. The quality of models that fit within the memory and compute constraints of mobile devices has improved substantially as model compression and quantization techniques have matured. A model that required a cloud GPU two years ago can now run acceptably on a mid-range phone.
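The quantization mentioned above can be illustrated with a toy example. The sketch below applies symmetric int8 post-training quantization to a small list of weights; the helper names and numbers are illustrative assumptions, not the API of Core ML or TensorFlow Lite.

```python
# Minimal sketch of symmetric int8 post-training quantization, the kind of
# compression step that lets formerly cloud-only models fit on a phone.
# All values here are illustrative, not tied to any specific framework.

def quantize_int8(weights):
    """Map float weights to int8 values in [-127, 127] plus one scale factor."""
    max_abs = max(abs(w) for w in weights) or 1.0
    scale = max_abs / 127.0
    return [round(w / scale) for w in weights], scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in q]

weights = [0.8, -1.2, 0.05, 0.0, 1.19]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# int8 storage is 4x smaller than float32, at the cost of a small
# rounding error bounded by half the scale factor per weight.
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(q, round(scale, 5), round(max_err, 5))
```

The trade is the whole story in miniature: a 4x smaller model with a bounded per-weight error, which is why mid-range phones can now run models that recently required a cloud GPU.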
Features That Retained Users
The AI features with the strongest retention data share a characteristic: they improve an existing user workflow rather than creating a new one. AI-powered autocomplete in keyboard apps reduces the effort of text input for users who are already typing. AI-powered photo organization in gallery apps reduces the effort of finding photos for users who are already browsing. AI-powered transcript search in voice recording apps reduces the effort of reviewing recordings for users who are already recording.
The common thread is friction reduction in existing behavior. Users who are already doing something find it easier. This is different from AI features that require users to adopt a new behavior — opening a chat interface, formulating a prompt, evaluating an AI-generated response — which is a higher bar that most casual users do not clear with sufficient regularity for the feature to drive retention.
The LLM Integration Reality
Server-side LLM integration has produced genuinely valuable features in specific contexts. Writing assistance in productivity apps — helping users improve drafts, suggesting completions, summarizing long content — has demonstrated retention that novelty-driven features do not achieve, because it addresses a recurring task that users face regularly with enough friction that the assistance is consistently welcome.
The cost structure of server-side LLM inference is the limiting factor for most independent developers. The compute cost of processing user queries against a capable language model is not negligible at scale. Apps that offer unlimited LLM access within a subscription face economics that are sensitive to usage levels in ways that most subscription apps are not. The monetization model needs to account for the per-query cost in a way that traditional mobile app economics did not require.
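To make the usage sensitivity concrete, here is a hypothetical back-of-envelope margin model. Every number in it — subscription price, per-query cost, platform cut — is an assumed placeholder, not a real vendor rate.

```python
# Hypothetical per-subscriber economics for an app offering server-side
# LLM features under a flat subscription. All inputs are assumptions.

def monthly_margin(sub_price, queries_per_user, cost_per_query, platform_cut=0.30):
    """Net margin per subscriber after the store cut and inference cost."""
    revenue = sub_price * (1 - platform_cut)
    inference_cost = queries_per_user * cost_per_query
    return revenue - inference_cost

# The same plan can be profitable for a light user and unprofitable
# for a heavy one, which flat-rate app economics never had to model.
light = monthly_margin(sub_price=4.99, queries_per_user=50, cost_per_query=0.01)
heavy = monthly_margin(sub_price=4.99, queries_per_user=600, cost_per_query=0.01)
print(round(light, 2), round(heavy, 2))
```

Under these made-up numbers the light user yields a positive margin while the heavy user is a loss, which is why unlimited-LLM subscriptions need usage caps, tiers, or metering that traditional mobile subscriptions did not.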
Apple Intelligence — Apple’s branded AI system integrated into iOS, iPadOS, and macOS — provides a platform-level LLM capability that apps can access through system APIs. This shifts inference cost off the developer’s servers and onto the platform, giving developers access to capable on-device AI without standing up their own infrastructure. The quality and capability of platform-level AI will improve over successive hardware and software generations, progressively raising the floor for what mobile apps can do with AI at no per-query cost.
The apps that build AI features against user behavior rather than against AI capability will be the ones that users still engage with three years from now. The technology is not the constraint. The user behavior is.