Dec 02, 2024

Company / Press Release

Panasonic HD develops image generation AI “Diffusion-KTO” that can personalize images to your tastes based on your “likes” and “dislikes”

Osaka, Japan, December 2, 2024 – Panasonic R&D Company of America (PRDCA) and Panasonic Holdings Corporation (Panasonic HD), in collaboration with researchers at the University of California and others, have developed "Diffusion-KTO (Kahneman-Tversky Optimization)", an image generation AI that tunes the generation model with binary feedback from the user, such as a "like" or "dislike", so that it can readily generate images matching the user's purpose and preferences.

In recent years, image generation AI has been used in a wide range of fields, from creative to business applications. In addition to image quality, the ability to generate images that reflect the user's preferences and needs (personalization) has become an important factor in customer satisfaction. The newly developed Diffusion-KTO can efficiently generate high-quality, personalized images through a new approach that applies a "utility function" quantifying each individual's preferences and values. Our approach can reduce the annotation cost of a preference dataset by up to a factor of N compared with existing methods, where N denotes the dataset size.

This technology has been accepted for presentation at NeurIPS 2024 (The Thirty-Eighth Annual Conference on Neural Information Processing Systems), a top conference on AI and machine learning, which will be held in Vancouver, Canada from December 10 to 15, 2024.

Overview:

Figure 1: Image generation and adjustment procedure using Diffusion-KTO

Panasonic HD and PRDCA are working on research into the personalization of generative AI models. Recently, AI models that generate images from text have had a huge impact on society, and many people are already using them. However, challenges remain:
- The generative model itself is very complex and has many parameters;
- Multiple variables (color, shape, composition, etc.) are involved in setting user preferences.
As a result, it is not easy to adjust the parameters to create an image that the user likes, and users currently need to make full use of prompt engineering to obtain images that they are satisfied with.

Research is also underway on tuning generative models so that their images come closer to user preferences. However, this requires separately collecting data that records which of several similar images is preferred (pairwise comparison), and then tuning the model with a complex reward model based on reinforcement learning.

In response to this, Diffusion-KTO proposes a new approach that applies a utility function to quantify each individual's preferences based on simple binary feedback such as a "like" or "dislike." The utility function is designed based on prospect theory, a theory from behavioral economics which suggests that people weigh a loss more heavily than an equivalent gain. Binary feedback makes it possible to collect each user's preferences easily and efficiently, significantly reducing the cost and time of data collection. Furthermore, combining it with a utility function that models human decision-making makes it possible to efficiently generate high-quality images that are more in line with user preferences.
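The idea of scoring per-image binary feedback with a utility function can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the published training objective: the sigmoid-shaped utility, the `beta` scale, the `reference_point`, and all function names are hypothetical and chosen only for illustration. `log_p_model` and `log_p_ref` stand in for the log-likelihoods that the tuned model and the frozen base model assign to the same image.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def implicit_reward(log_p_model: float, log_p_ref: float, beta: float = 1.0) -> float:
    """How much more likely the tuned model makes an image than the base model.

    Hypothetical helper: beta is an assumed scaling factor.
    """
    return beta * (log_p_model - log_p_ref)


def utility(reward: float, liked: bool, reference_point: float = 0.0) -> float:
    """Assumed sigmoid utility of one image given a like/dislike label.

    A liked image yields high utility when its reward is above the reference
    point; a disliked image yields high utility when its reward is below it.
    """
    sign = 1.0 if liked else -1.0
    return sigmoid(sign * (reward - reference_point))


def mean_utility(feedback, beta: float = 1.0) -> float:
    """Average utility over (log_p_model, log_p_ref, liked) records."""
    total = sum(
        utility(implicit_reward(lm, lr, beta), liked) for lm, lr, liked in feedback
    )
    return total / len(feedback)


# Placeholder feedback records, invented for illustration only.
feedback = [
    (-10.0, -12.0, True),   # liked image that the tuned model favors
    (-15.0, -11.0, False),  # disliked image that the tuned model avoids
]
print(mean_utility(feedback))
```

Each record needs only one image and one label, which is what makes this kind of feedback cheaper to collect than pairwise comparisons. Training would then adjust the model to raise the mean utility; that optimization step is omitted here.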

In evaluation experiments, we found that Diffusion-KTO outperformed the base model (SD v1-5) *1, achieving a maximum win rate of 87.2%. In particular, human evaluators consistently preferred the images generated by Diffusion-KTO over those generated by the base model.
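For readers unfamiliar with the metric, a win rate can be computed as below. This is a sketch under an assumed definition (the share of prompts on which the tuned model's image scores strictly higher than the base model's image); the release does not spell out its calculation, and the function name and scores are hypothetical placeholders, not experimental data.

```python
def win_rate(candidate_scores, baseline_scores):
    """Fraction of prompts where the candidate scores strictly higher.

    Ties count as losses under this assumed definition.
    """
    if len(candidate_scores) != len(baseline_scores):
        raise ValueError("score lists must have the same length")
    if not candidate_scores:
        raise ValueError("score lists must not be empty")
    wins = sum(c > b for c, b in zip(candidate_scores, baseline_scores))
    return wins / len(candidate_scores)


# Placeholder per-prompt scores, invented for illustration only.
tuned = [0.9, 0.4, 0.7, 0.8]
base = [0.5, 0.6, 0.1, 0.2]
print(f"{win_rate(tuned, base):.1%}")  # prints 75.0%
```

The scores could come from an automatic preference scorer or from human votes; in either case a win rate of 50% would mean the two models are indistinguishable.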

Future Outlook:

Diffusion-KTO is an image generation AI that can adjust the generative model through just a simple interaction -- the user's binary feedback -- and generate images that suit the user's preferences. By applying this technology, it is possible to efficiently create datasets for AI training, which is essential for AI development. In principle, Diffusion-KTO can be applied not only to image generation but also to other generative models such as text generation and speech generation, making it possible to use it in many fields where personalization according to user preferences is required.

Panasonic HD will continue to accelerate the implementation of AI in society and promote research and development of AI technologies that will contribute to improving our customers' lives and workplaces.

*1 SD v1-5: Image Generation Model Stable Diffusion v1.5

Related Information:

“Aligning Diffusion Models by Optimizing Human Utility”
This research was carried out by Konstantinos Kallidromitis of PRDCA and Yusuke Kato and Kazuki Kozuka of Panasonic HD, in collaboration with Shufan Li, a PhD student at UCLA, and Akash Gokul, formerly of UC Berkeley.
https://arxiv.org/abs/2404.04465

Panasonic × AI website
https://tech-ai.panasonic.com/en/

Panasonic Robotics Hub website
https://tech.panasonic.com/global/robot/

About the Panasonic Group

Founded in 1918, and today a global leader in developing innovative technologies and solutions for wide-ranging applications in the consumer electronics, housing, automotive, industry, communications, and energy sectors worldwide, the Panasonic Group switched to an operating company system on April 1, 2022 with Panasonic Holdings Corporation serving as a holding company and eight companies positioned under its umbrella. The Group reported consolidated net sales of 8,496.4 billion yen for the year ended March 31, 2024. To learn more about the Panasonic Group, please visit: https://holdings.panasonic/global/


Issued:
Panasonic Holdings Corporation
