When it comes to innovation and technology, Apple is without a doubt one of the strongest competitors. And we might soon see it at the forefront of AI thanks to its most recent research, which many consider groundbreaking.
Two recent papers published by the iPhone maker report significant advances in AI research. They introduce novel techniques for 3D avatars and for efficient language model inference. Such developments could open up countless possibilities and allow complex AI systems to run on consumer devices such as iPhones and iPads.
In the first paper, the authors present HUGS (Human Gaussian Splats), a novel method for creating animatable 3D avatars from short videos shot with just one camera, known as "monocular videos." Training takes as little as 30 minutes, and the method separates the static scene from the animated human avatar.
Using the 3D Gaussian splatting technique, HUGS builds its human model starting from a statistical body shape model called SMPL. The Gaussians are allowed to deviate from that body model, which makes it possible to capture fine details like hair and clothing. Meanwhile, a novel neural deformation module animates the Gaussians with linear blend skinning, which yields more realistic results.
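To make the skinning step more concrete, here is a minimal sketch of linear blend skinning in plain Python. It is a simplified 2D illustration and not the HUGS implementation: the function names, the two "bones," and the weights are all hypothetical, and in the paper the deformation is learned by a neural module rather than set by hand.

```python
import math

def rigid_transform(angle_deg, tx, ty):
    """Build a 2D rigid transform: a rotation matrix plus a translation."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return ((c, -s), (s, c)), (tx, ty)

def apply_transform(transform, point):
    """Rotate and then translate a 2D point."""
    (r0, r1), (tx, ty) = transform
    x, y = point
    return (r0[0] * x + r0[1] * y + tx, r1[0] * x + r1[1] * y + ty)

def linear_blend_skinning(point, weights, bone_transforms):
    """Move a point by the weighted average of its per-bone transformed positions."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("skinning weights must sum to 1")
    x_out, y_out = 0.0, 0.0
    for w, t in zip(weights, bone_transforms):
        px, py = apply_transform(t, point)
        x_out += w * px
        y_out += w * py
    return (x_out, y_out)

# Two hypothetical bones: one stays in place, one rotates 90 degrees about the origin.
bones = [rigid_transform(0, 0, 0), rigid_transform(90, 0, 0)]

# A point standing in for a Gaussian's center, influenced equally by both bones.
center = (1.0, 0.0)
posed = linear_blend_skinning(center, [0.5, 0.5], bones)
```

Each point follows a blend of the bones that influence it, which is why a surface near a joint bends smoothly instead of tearing when the avatar is moved into a new pose.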
All things considered, the resulting motion is well coordinated and avoids artifacts when the character or avatar is reposed. What you end up with, then, is a rendering from novel viewpoints of both the scene and the person in it.
The method is also faster and needs less training than previous avatar generation approaches. After just 30 minutes of system optimization on a typical GPU of the kind used for gaming, the results are remarkably photorealistic.
Additionally, the researchers report that it produces better results than those obtained with Vid2Avatar and NeuMan. Ultimately, it is researchers from Apple, the company that makes the iPhone, who are responsible for these impressive 3D modeling capabilities.
These studies reveal AI's enormous potential for creating avatars in the future. The idea of generating all of this with a single button press on your phone's camera is astounding.
Meanwhile, the second paper describes how Apple researchers are tackling the problem of running LLMs on devices with constrained memory. Modern high-end models such as OpenAI's GPT-4 are so large that inference is costly on consumer hardware.
The proposed approach minimizes the flow of data from flash storage into DRAM during the inference stage. It is impressive to see Apple build an inference cost model that accounts for how flash memory behaves. That model guides optimization in two key areas: reducing the volume of data transferred from flash, and reading data in larger, more contiguous chunks.
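This is achieved with two techniques called row-column bundling and windowing. The former stores matching rows and columns together so that data can be read from flash in larger blocks, whereas the latter reuses activations from recently processed tokens so that less data has to be loaded.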
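The two techniques can be illustrated with a small sketch. This is a toy model under stated assumptions, not Apple's implementation: "flash" is just a Python dictionary, the class `NeuronWindowCache`, the helper `bundle_rows_and_columns`, and the weight values are all hypothetical, and which neurons are active for each token is supplied by hand.

```python
from collections import deque

def bundle_rows_and_columns(up_columns, down_rows):
    """Row-column bundling: keep neuron i's up-projection column and
    down-projection row together so one read fetches both."""
    return {i: (col, row) for i, (col, row) in enumerate(zip(up_columns, down_rows))}

class NeuronWindowCache:
    """Windowing: keep in memory only the neurons active in the last `window` tokens."""

    def __init__(self, flash, window):
        self.flash = flash        # hypothetical stand-in for flash storage
        self.window = window
        self.recent = deque()     # active-neuron sets for the most recent tokens
        self.in_memory = {}       # neuron id -> bundled weights held in DRAM
        self.flash_reads = 0

    def step(self, active_neurons):
        active = set(active_neurons)
        # Load only the neurons that are not already in memory.
        for n in [n for n in active if n not in self.in_memory]:
            self.in_memory[n] = self.flash[n]
            self.flash_reads += 1
        # Slide the window forward by one token.
        self.recent.append(active)
        if len(self.recent) > self.window:
            self.recent.popleft()
        # Evict neurons that no token in the window uses any more.
        still_needed = set().union(*self.recent)
        for n in list(self.in_memory):
            if n not in still_needed:
                del self.in_memory[n]
        return [self.in_memory[n] for n in sorted(active)]

# Made-up weights for three neurons.
up_columns = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
down_rows = [[1.0, 1.1], [1.2, 1.3], [1.4, 1.5]]
flash = bundle_rows_and_columns(up_columns, down_rows)

cache = NeuronWindowCache(flash, window=2)
cache.step([0, 1])  # both neurons are read from flash
cache.step([1, 2])  # neuron 1 is reused, only neuron 2 is read
cache.step([2])     # nothing is read, neuron 0 falls out of the window
```

In this toy run, three tokens trigger only three reads in total, because consecutive tokens tend to share active neurons, and each read returns a neuron's column and row as one bundle.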
Innovations like these are critical to deploying LLMs in resource-constrained environments. The authors further stated that doing so would increase the models' utility and accessibility. And before long, this kind of optimization could result in sophisticated AI assistants that run smoothly on iPhones, iPads, and other Apple devices.
In summary, these papers demonstrate just how much Apple still leads the field in AI research and associated applications. Although the work is promising, some caution that applying the technology to a variety of products will demand care and significant responsibility.
In addition, the potential effects on society must be taken into account, and the issue of privacy protection must be assessed further.
If and when it is executed well, we can consider this a revolution in its own right, and Apple deserves praise for its efforts to advance the field of artificial intelligence.