Apple's researchers continue to focus on LLMs, with studies detailing the use of AI in UI prototype creation and a new dataset for image safety rating.
With Xcode 26.3, Apple introduced support for agentic coding tools to help developers plan, execute, and iterate on projects with the help of AI. In other words, Xcode offers built-in compatibility with popular AI coding agents, such as Anthropic's Claude Agent and OpenAI's Codex.
And that looks to be only the start of Apple's vibe-coding endeavors, as its latest research offers a new twist on generating UI designs with the help of AI. Apple is also exploring the use of AI in evaluating the safety of image content, among other things.
The Apple Machine Learning Blog details various ideas and advancements made possible by artificial intelligence. There's always a chance some of the described concepts will make their way to iOS, macOS, or utilities like Xcode.
SQUIRE: UI prototyping with component trees, powered by AI
In a study titled "SQUIRE: Interactive UI Authoring via Slot Query Intermediate Representations," Apple's researchers offer a new way of streamlining AI-assisted UI prototyping.
In essence, the research paper details the shortcomings of current vibe-coding approaches, particularly in UI development.
Per the study, many UI-generation tools powered by LLMs deliver a finished design based on a prompt, leaving little or no room for fine-tuning. The tools that do support minute adjustments don't always execute them as intended, causing issues for app developers.
This ultimately leaves developers stuck with a time-consuming trial-and-error loop. To mitigate these shortfalls, Apple's researchers devised Squire, which they describe as "a system designed for guided prototype exploration and refinement."
The system, powered by OpenAI's GPT-4o, offers versatility by giving developers more insight into what is being generated. Squire does this by offering a customizable component tree before delivering an AI-generated user interface prototype.
Developers are thus able to view or modify individual elements that make up, for example, a webpage. Through a dedicated option, developers can modify the font of a UI element, add additional layers to it with new information, or swap it for something else entirely.
Once the desired requirements have been met, Squire converts its component tree representation into working code, including HTML and CSS files. The tool is capable of delivering webpages, among other things.
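To make the idea of an editable component tree more concrete, here is a minimal Python sketch. It is not Squire's actual representation or code generator, which the study describes only at a high level here. The `Component` class, its fields, and the `render_html` function are hypothetical names invented for illustration. The sketch shows a tree being edited node by node (a font change and an element swap) before any HTML is produced.

```python
from dataclasses import dataclass, field
from html import escape


@dataclass
class Component:
    """One node in a hypothetical UI component tree (not Squire's real format)."""
    tag: str
    text: str = ""
    style: dict[str, str] = field(default_factory=dict)
    children: list["Component"] = field(default_factory=list)


def render_html(node: Component, indent: int = 0) -> str:
    """Recursively turn a component tree into an indented HTML string."""
    pad = "  " * indent
    # Inline CSS keeps the sketch short; a real tool would likely emit separate files.
    css = "; ".join(f"{prop}: {value}" for prop, value in node.style.items())
    style_attr = f' style="{escape(css)}"' if css else ""
    lines = [f"{pad}<{node.tag}{style_attr}>"]
    if node.text:
        lines.append(f"{pad}  {escape(node.text)}")
    for child in node.children:
        lines.append(render_html(child, indent + 1))
    lines.append(f"{pad}</{node.tag}>")
    return "\n".join(lines)


# Build a small tree, then edit individual nodes before generating code.
title = Component("h1", "Weekend Hikes", {"font-family": "Georgia"})
page = Component("main", children=[title, Component("p", "Trails near you.")])

title.style["font-family"] = "Helvetica"                 # change one element's font
page.children[1] = Component("button", "Find a trail")   # swap an element entirely

print(render_html(page))
```

Because each edit touches a single node, the rest of the tree stays as it was, which is the kind of scoped control the study contrasts with regenerating a whole design from a new prompt.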
The associated study let 11 frontend developers use Squire to create webpages for mobile devices. The participating developers "scored Squire positively for usability and general satisfaction."
The study concludes by explaining that there's "strong potential for code generation to be controlled in rapid UI prototyping tools by combining chat with explicitly scoped affordances." In other words, developers are given more control over the UI-generation process when a customizable component tree is present.
Squire's approach could, in theory, become part of a future Xcode update, depending on how Apple goes about vibe coding with its development utility.
SafetyPairs: Enhancing AI image safety rating with image pairs
With iOS 18, Apple made it possible to create images on an iPhone through local AI models. Image Playground lets users generate cartoon-like images of just about anything, all without an internet connection.
Apple's study titled "SafetyPairs: Isolating Safety Critical Image Features with Counterfactual Image Generation" examines how vision-language models can be trained to evaluate the safety of an image and its contents.
SafetyPairs is described as a "scalable framework for generating counterfactual pairs of images that differ only in the features relevant to the given safety policy, thus flipping their safety label."
In essence, Apple's researchers have constructed pairs of images with ever-so-slightly different features that impact the safety rating. In practice, this can mean one image contains inappropriate gestures, like a middle finger, or depictions of violence, while the other does not.
If an image pair contains buildings, flags, or vehicles, one of the images might show these objects burning, for instance. In total, 1,510 unique image pairs were created, and the individual images were classified as safe or unsafe by AI models.
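As a rough illustration of why paired images are useful for testing, the Python sketch below scores a classifier on counterfactual pairs and counts a pair as correct only when both the safe and the unsafe image are labeled properly. This is an assumed scoring rule chosen for illustration, not necessarily the metric used in the study. `ImagePair`, `pair_accuracy`, and `toy_classifier` are hypothetical names, and short text descriptions stand in for real images.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ImagePair:
    """A counterfactual pair; strings stand in for real images in this sketch."""
    safe_image: str
    unsafe_image: str


def pair_accuracy(pairs: list[ImagePair], is_unsafe: Callable[[str], bool]) -> float:
    """Return the fraction of pairs where BOTH images receive the right label."""
    if not pairs:
        raise ValueError("need at least one pair")
    correct = sum(
        1 for p in pairs
        if not is_unsafe(p.safe_image) and is_unsafe(p.unsafe_image)
    )
    return correct / len(pairs)


def toy_classifier(description: str) -> bool:
    """Hypothetical stand-in for a vision-language model: flags one keyword."""
    return "burning" in description


pairs = [
    ImagePair("a parked car", "a burning car"),
    ImagePair("a flag on a pole", "a burning flag"),
    ImagePair("a hand waving", "a hand making a rude gesture"),
]

print(pair_accuracy(pairs, toy_classifier))
```

Since the two images in a pair differ only in the safety-relevant feature, a miss on one of them points directly at the feature the model failed to pick up, which here is the gesture the keyword-based stand-in never looks for.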
Overall, Apple's researchers found that the SafetyPairs approach is "effective at highlighting weaknesses in state-of-the-art vision-language models, and can serve as a useful data augmentation strategy for training sample-efficient guard models."
The testing approach could be used to enhance the guardrails of utilities like Image Playground, or perhaps in the Photos app. Apple will preview its next generation of operating system updates at WWDC 2026, which starts on June 8.