Text-to-image (T2I) models have gained widespread adoption among content creators and the general public. There is an increasing demand for T2I models to incorporate mechanisms that prevent the generation of specific artistic styles, thereby safeguarding intellectual property rights. Existing methods for style extraction typically require collecting custom datasets and training specialized models. This is resource-intensive, time-consuming, and often impractical for real-time applications. We present a novel, training-free framework that solves the style attribution problem using only the features produced by a diffusion model, without any external modules or retraining.
We denote this approach Introspective Style Attribution (IntroStyle) and show that it outperforms state-of-the-art models for style attribution. We also introduce Artistic Style Split (ArtSplit), a synthetic dataset designed to isolate artistic style and evaluate fine-grained style attribution performance. Our experimental results show that our method addresses the dynamic nature of artistic styles and the rapidly evolving landscape of digital art with no training overhead.
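To give a rough sense of the training-free idea, the Python sketch below (assuming Stable Diffusion v1.5 via the Hugging Face diffusers library) pools a U-Net mid-block activation from a single noised denoising pass into a style descriptor and compares two images with cosine similarity. The chosen layer, timestep, pooling, and similarity metric are illustrative assumptions, not the paper's exact feature extraction or scoring procedure.

# Minimal sketch: comparing images via diffusion (U-Net) features.
# Model name, feature layer, timestep, and metric are illustrative assumptions,
# not the exact IntroStyle procedure.
import torch
import torch.nn.functional as F
from diffusers import StableDiffusionPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5").to(device)

captured = {}
def hook(module, inputs, output):
    # Pool the spatial dimensions of the mid-block activation into a descriptor.
    captured["feat"] = output.mean(dim=(2, 3))
pipe.unet.mid_block.register_forward_hook(hook)

@torch.no_grad()
def style_descriptor(image, t=200):
    """image: (1, 3, 512, 512) tensor scaled to [-1, 1]."""
    latents = pipe.vae.encode(image.to(device)).latent_dist.sample()
    latents = latents * pipe.vae.config.scaling_factor
    noise = torch.randn_like(latents)
    timestep = torch.tensor([t], device=device)
    noisy = pipe.scheduler.add_noise(latents, noise, timestep)
    # Unconditional (empty-prompt) text embedding for a single denoising pass.
    ids = pipe.tokenizer("", padding="max_length",
                         max_length=pipe.tokenizer.model_max_length,
                         return_tensors="pt").input_ids.to(device)
    text_emb = pipe.text_encoder(ids)[0]
    pipe.unet(noisy, timestep, encoder_hidden_states=text_emb)
    return captured["feat"]

def style_similarity(img_a, img_b):
    # Cosine similarity between pooled diffusion features (illustrative metric).
    return F.cosine_similarity(style_descriptor(img_a), style_descriptor(img_b)).item()

In this sketch, higher similarity scores would indicate closer styles; ranking a gallery of reference images by this score against a query image is one plausible way to perform attribution under these assumptions.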
@article{kumar2024introstyle,
author = {Kumar, Anand and Mu, Jiteng and Vasconcelos, Nuno},
title = {IntroStyle: Training-Free Introspective Style Attribution using Diffusion Features},
journal = {arXiv preprint arXiv:2412.14432},
year = {2024},
}