Below you will find pages that utilize the taxonomy term “Text to Image”
2024
[Artificial Intelligence] Text-to-Image with StableDiffusionPipeline
In this post, we explore StableDiffusionPipeline for generating photorealistic images from text prompts. We start by setting up the environment and installing the necessary libraries. Then we dive into Textual Inversion, demonstrating how the model learns new concepts from images. We also explore Image-to-Image transformations, which showcase the pipeline's versatility. Additionally, we introduce Animagine XL 2.0, a model for high-resolution anime image creation, and provide sample code for using it. Lastly, we highlight Stable Diffusion XL, a powerful text-to-image model, and share a festive image generated with it.
2024
[Artificial Intelligence] Stable Diffusion: Text-to-Image Modeling Journey
This post explores Stable Diffusion, a latent text-to-image diffusion model. Diffusion models learn the patterns in a dataset through a forward (noising) process and a reverse (denoising) process, and generate new data by sampling. After illustrating applications in image tasks, the post walks through installing and using Stable Diffusion. It details image generation and modification using prompts, with examples and troubleshooting. Notably, it describes running into CUDA out-of-memory errors and resolving them by resizing the image. Overall, it combines theoretical background with practical implementation steps.
2024
[Artificial Intelligence] OpenVINO, Optimum-Intel, CPU: An Exploration in Model Optimization
This post looks at using OpenVINO and Optimum-Intel together: I detail the setup and execution of example code on my aging laptop. The focus is on applying Quantization-Aware Training and the Token Merging method to optimize the UNet model within the Stable Diffusion pipeline, showing how these open-source tools work together for deep learning model deployment. Note that the provided code is tailored for CPU-based inference because of the limitations of my old GeForce graphics card, which may make it useful for readers with similar hardware constraints. Dive into the world of optimized models and delightful Pokémon creation!