I have used Stable Diffusion before, and with the new version out I wanted to see what has changed in Stable Diffusion XL. I deployed it on AWS using SageMaker. My notebook is available on GitHub (see text_to_image_stability_xl).

Multi-prompting looks like a new feature, but it does not work the way I expected, and there is not much documentation for it.
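For anyone who has not tried multi-prompting yet, here is a minimal sketch of what a request with several weighted prompts might look like. The field names (`text_prompts`, `weight`, `cfg_scale`, `steps`, `seed`) follow the Stability REST API convention; that the SageMaker endpoint accepts exactly this shape is an assumption, and `build_multi_prompt_payload` is a hypothetical helper, not something taken from the notebook.

```python
import json


def build_multi_prompt_payload(weighted_prompts, width=1024, height=1024,
                               cfg_scale=7.0, steps=30, seed=0):
    """Build a JSON-serializable request body with several weighted prompts.

    weighted_prompts: list of (text, weight) pairs. By convention a positive
    weight steers the image toward the text and a negative weight away from it.
    All default values here are illustrative assumptions.
    """
    if not weighted_prompts:
        raise ValueError("at least one prompt is required")
    return {
        "text_prompts": [
            {"text": text.strip(), "weight": float(weight)}
            for text, weight in weighted_prompts
        ],
        "width": width,
        "height": height,
        "cfg_scale": cfg_scale,
        "steps": steps,
        "seed": seed,
    }


payload = build_multi_prompt_payload([
    ("portrait of a doctor, studio lighting", 1.0),
    ("blurry, extra fingers", -1.0),
])
print(json.dumps(payload, indent=2))
```

Keeping the weights explicit makes it easier to check whether a prompt is being ignored or is actually pulling the image in the direction you expect.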

Still, with my earlier prompts the new images are clearly much better, and it is simply fun to see what comes out! Hands still have more fingers than necessary, and the biases are the same as before.

Here are a few more. The prompts are labeled on the images themselves.

Recursive

Emotions and gender

Cultural references and biases

As in the previous version, a doctor is always a man. The men, the women, and their facial features are mostly European. The model does recognize cultural differences when asked, though. There also seems to be some filtering that blocks any request containing the word "child", so there are still no ugly children!

Faces

Emotions and facial expressions

Tangible objects

Intangibles

Famous people and places

Spanish

I tried Hindi, but it is not supported. Spanish gave decent results.

Prompts from a 6-year-old

I let my 6-year-old type in some prompts, and you can see where they went!

Fantasy and fiction

Generated prompts

With Stable Diffusion XL the results are remarkable, and these are just simple text prompts.

image_generation_prompts = [
    "Generate a portrait of an elderly man with a kind smile and wise eyes.",
    "Create a realistic image of a young woman with curly hair and piercing green eyes.",
    "Generate a portrait of a young man with a rugged look and a slight beard.",
    "Create an image of a middle-aged woman with short hair and a friendly expression.",
    "Generate a portrait of a woman with rosy cheeks and bright eyes, full of wonder and curiosity.",
    "Create an image of a person with a serious expression and strong facial features.",
    "Generate a portrait of a woman with a unique hairstyle and bold makeup.",
    "Create an image of a man with a beard and glasses, looking contemplative.",
    "Generate a portrait of a person with piercing blue eyes and a mysterious expression.",
    "Create an image of a person with a bright smile and sparkling eyes, radiating happiness and joy."
]

image_generation_prompts = [
    "Imagine you're flying on the back of a giant eagle, soaring over mountains and valleys, feeling the wind rushing through your hair.\n",
    "Picture yourself walking through a forest of glowing mushrooms, the light casting a surreal glow over everything around you.\n",
    "You find yourself in a magical library, where every book holds a world of its own. You can choose any book and step into its pages, living out the story as if it were real.\n",
    "You step through a shimmering portal into a world of floating islands and soaring airships, where the sky is an endless expanse of vibrant colors and you can explore to your heart's content.\n",
    "You're a guest at a royal ball, where you dance the night away with elegant and exotic creatures from all corners of the realm, the music carrying you away into a world of enchantment.\n",
    "You venture deep into a mysterious cavern, where you find a glowing crystal that shows you visions of distant lands and times, revealing secrets of the universe.\n",
    "You awaken in a magical garden, filled with strange and wondrous creatures, each with a unique power or gift to share with you.\n",
    "You find yourself on a vast plain, surrounded by a circle of standing stones, and you suddenly realize that you have the power to control the elements of nature - the wind, the rain, the sun, and the earth.\n",
    "You're on a quest to find a legendary artifact, journeying through dark forests, treacherous mountains, and ancient ruins, facing challenges and meeting allies along the way.\n",
    "You discover a hidden portal that takes you to a realm of dreams, where you can explore the subconscious mind and unlock the secrets of the human psyche.\n"
]

image_generation_prompts = [
    "Imagine you're standing in the heart of New York City's Times Square, surrounded by the bright lights and buzzing energy of one of the world's most famous destinations. The towering billboards and electronic displays flash advertisements and messages, while street performers and vendors add to the lively atmosphere. You can feel the pulse of the city as people from all over the world rush past you, each on their own mission.",  
    "Picture yourself walking through the ornate halls of Buckingham Palace, home to the British royal family and a symbol of centuries of history and tradition. You can hear the echo of your footsteps on the marble floors as you take in the opulent furnishings and intricate artwork. You might even catch a glimpse of a member of the royal family, as they move through the palace's private chambers.",  
    "You find yourself on the beaches of Rio de Janeiro, where you can dance the samba, play soccer on the sand, and soak up the vibrant culture of one of Brazil's most iconic cities. The warm sand and cool ocean breeze invite you to relax and enjoy the sun, while the sounds of music and laughter fill the air. You can taste the delicious local cuisine and join in the lively conversation with the friendly locals.",  
    "You're sitting in the front row of a concert by your favorite musician, feeling the energy and excitement of the crowd and losing yourself in the music. The lights and sound of the show are intense, and you can feel the bass vibrating through your body. You might even get a chance to meet the artist backstage after the show.",  
    "You're exploring the winding streets and hidden alleyways of Venice, Italy, admiring the historic architecture and charming canals of this timeless and romantic city. You can smell the aroma of freshly baked bread and hear the sound of church bells ringing in the distance. The gondolas and water taxis glide past you on the shimmering canals, adding to the magical atmosphere.",  
    "You're at the top of the Eiffel Tower, looking out over the City of Lights and marveling at the breathtaking views of one of the world's most famous landmarks. The city below spreads out before you like a patchwork quilt, with the Seine River winding through the heart of it. You can see the iconic buildings and monuments of Paris from a unique perspective, and feel the breeze as it rushes by you at this lofty height.",  
    "You find yourself in the heart of Tokyo, surrounded by the neon lights and vibrant culture of one of the world's most bustling and dynamic cities. The streets are packed with people, all moving with purpose and urgency. The food stalls and restaurants offer a dizzying array of options, while the anime shops and karaoke bars are testament to the city's unique blend of old and new.",  
    "You're at a Hollywood movie premiere, walking the red carpet and rubbing shoulders with some of the biggest names in the entertainment industry. The cameras flash as you strike a pose for the paparazzi, and you can hear the murmur of excited fans gathered behind the velvet ropes. The air is electric with anticipation as you make your way into the theater to watch the latest blockbuster film.",  
    "You're standing in front of the Great Pyramid of Giza, one of the most famous and awe-inspiring structures in human history, marveling at the scale and craftsmanship of this ancient wonder. The sun beats down on the sandy landscape as you take in the enormity of the pyramid, which has stood for more than 4,500 years. You can feel the weight of history and culture as you stand in its shadow.",  
    "You're sitting in the stands at the Super Bowl, watching the biggest football game of the year with millions of viewers tuning in around the world. The atmosphere is electric as the two teams take to the field, and you can feel the excitement building with every play. The halftime show is a spectacle of music and dance, while the commercials are some of the most talked-about of the year. You can taste the salty snacks and cold drinks as you cheer on your favorite team to victory."
]
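To run a list like the ones above through an endpoint, a loop along these lines could work. This is only a sketch: `generate_images` and `make_sagemaker_invoker` are hypothetical helpers, the request and response shapes (`text_prompts`, `artifacts`, `base64`) and the content types are assumptions based on the Stability REST API convention, and the `fake_invoke` stand-in is there only so the loop can be tried without AWS credentials.

```python
import base64
import json


def generate_images(prompts, invoke, out_prefix="image"):
    """Send each prompt to `invoke` and return {filename: image_bytes}.

    `invoke` is any callable that takes a request dict and returns a
    response dict (a stand-in for the real endpoint call).
    """
    results = {}
    for i, prompt in enumerate(prompts):
        # strip() drops the trailing "\n" that some generated prompts carry
        request = {"text_prompts": [{"text": prompt.strip(), "weight": 1.0}]}
        response = invoke(request)
        # Assumed response shape: {"artifacts": [{"base64": "..."}]}
        for j, artifact in enumerate(response.get("artifacts", [])):
            name = f"{out_prefix}_{i:02d}_{j}.png"
            results[name] = base64.b64decode(artifact["base64"])
    return results


def make_sagemaker_invoker(endpoint_name):
    """Return an `invoke` callable backed by a SageMaker endpoint."""
    import boto3  # imported lazily so the rest of the sketch runs without AWS

    runtime = boto3.client("sagemaker-runtime")

    def invoke(request):
        resp = runtime.invoke_endpoint(
            EndpointName=endpoint_name,
            ContentType="application/json",  # assumed content type
            Accept="application/json",       # assumed accept type
            Body=json.dumps(request),
        )
        return json.loads(resp["Body"].read())

    return invoke


def fake_invoke(request):
    """Offline stand-in: echoes the prompt text back as the 'image' bytes."""
    text = request["text_prompts"][0]["text"]
    return {"artifacts": [{"base64": base64.b64encode(text.encode()).decode()}]}


images = generate_images(["a red fox\n", "a blue whale\n"], fake_invoke)
print(sorted(images))
```

Against a real deployment you would pass `make_sagemaker_invoker("<your endpoint name>")` instead of `fake_invoke` and write each value in `images` to disk.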

My notebook text_to_image_stable_xl is in this GitHub repository.



If these topics interest you, get in touch; I would appreciate any feedback. If you would like to work on problems like these, you will usually find open roles too! Please reach out on LinkedIn.