We Hired an AI Art Generator for Our Blog and I'm Not Mad at It
Back in mid-September of last year, we launched a significant redesign of the Deepgram website. To say the process went smoothly would be a gross mischaracterization, but we eventually hit a good groove with it and in the end, the launch went off with barely a hitch. By mid-November, you could say I was finally relaxing about it.
Then Slack dinged with a DM from our CEO, Scott Stephenson. “Our blog needs an overhaul big time,” he said.
I’ll spare you the details and skip right to the gist of it. Basically, we needed to figure out a way to get the blog to feel more like the rest of the website. To understand what I mean by that, I have to give you a little background as to how the blog design ended up where it did.
When we set out to redesign the website, we understood the following:
Revamping the entire website was going to be a significant undertaking
We were operating on a tight timeline
Thus it was very much an exercise in “going to battle with the army you have, versus the army you want.”
Given that Deepgram’s design team is already tasked with shipping a lot of creative output for the company’s day-to-day needs, we knew we were operating within a set of pretty narrowly defined constraints:
We don’t have sufficient bandwidth to redesign hundreds of blog post images by the target launch date.
The images we do have are good, but they follow a specific style: silhouetted objects on a full white background.
White background images are a little too stark for a dark background so the blog will have a light background.
Decision made. Here’s what we ended up with.
Not bad for a first pass, but it's not quite what we were looking for.
The Jumping-off Point
All that was well and good (for the time being) but it was also clear that the rough draft of our blog redesign didn’t line up with the dark, sleek look and feel we’d cultivated throughout the rest of Deepgram.com.
We began looking for ways to unify them. First, we said, “Well, just how bad would it be to change the background but not the images?”
Okay. “What if we applied a nice gradient overlay to all the images?”
Then I noticed something strange on the live site. “Hey, there’s an image on the site we didn’t make.”
Turns out someone on the team needed an image in a pinch and decided to whip up an AI-generated image with DALL-E. Despite its consistency issues, it wasn’t bad.
As a leading deep learning company, Deepgram leans into using AI whenever it makes sense. Using text-to-image models to generate cover art for the company’s blog, at scale, felt very much in-scope. So, we took it for a spin.
Onboarding Our AI Image Generator
Making the transition to AI-generated images was going to be pointless if it was going to result in a hodge-podge of styles and look like a hot mess.
We needed to establish some parameters. Here’s where we started:
Images should work with light or dark backgrounds to give us more flexibility in the future.
Our main site has what you might call an “energetic minimalist” vibe, with a sprinkling of ‘space’ thematics (e.g., dark matter roots, exploring the unknown) that should be incorporated.
The main site is also mostly rich dark tones with pops of vaporwave color schemes. Let’s see if we can bring that more to the forefront with the blog.
Next, we explored a fair number of services. We looked at DALL-E, Jasper, Midjourney, Stable Diffusion, Craiyon, Fotor, StarryAI, NightCafe, and a few others. We had a few basic criteria that we used to evaluate our options:
Which sites will export an image large enough for our current blog header size?
Which sites make it easiest to get the desired style result?
Which sites produce images the fastest? If it takes an hour to produce one image, that’s not going to be much of a time savings in the effort to redo numerous images.
In the end, the finalists were DALL-E and Jasper. Jasper had the edge because it allowed for unlimited (at least for now) high-resolution downloads, all for a reasonable monthly fee. Especially during the initial push to re-do most of our old header images, this was helpful. Whether we’ll need such permissive download functionality on an ongoing basis is unclear, but it’s definitely nice to have.
For the prompts I was using, Jasper also generated the most attractive images. DALL-E’s outputs were a bit rough around the edges. With a little more finessing of my prompts, it’s possible these visual issues could have been resolved, but I was sufficiently happy with Jasper’s style and quality.
Jasper was also a lot faster at image generation which enabled me to generate more successive iterations of an image and home in on a particular visual theme, ultimately producing better results. Over the course of initial testing, I found myself only turning to DALL-E in cases where Jasper simply wasn’t recognizing my prompts at all, and I was willing to get a slightly less polished image for a more accurate subject. Even in the case of AI image generation, the familiar trade-offs between speed and quality remain.
Regarding the prompts, I found that the following yielded a pretty good result:
[object] on [or floating above] a desert with a starry sky background, blue green purple ambient light, synthwave, vaporwave
That, in combination with the more structured modifiers offered by Jasper (see image below), resulted in a nice clean image with vibrant color, a polished texture, a unique dreamlike quality, and the feeling of being located on an unknown planet. Energetic, minimalist space theme. Boom!
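The prompt pattern above is really a fill-in-the-blank template, so it can be sketched in a few lines of Python. This is only an illustration of the idea, not how Jasper works under the hood: the names `PROMPT_TEMPLATE` and `build_prompt` are made up for this example, and the resulting text would still be pasted into the image generator by hand.

```python
from typing import Sequence

# Hypothetical template mirroring the prompt pattern described in the post.
PROMPT_TEMPLATE = (
    "{subject} {placement} a desert with a starry sky background, "
    "blue green purple ambient light, synthwave, vaporwave"
)


def build_prompt(subject: str, floating: bool = False,
                 extras: Sequence[str] = ()) -> str:
    """Fill in the template for one blog image.

    subject  -- the object to depict, e.g. "a snakeskin speaker"
    floating -- use "floating above" instead of "on"
    extras   -- optional extra style words appended to the end
    """
    placement = "floating above" if floating else "on"
    prompt = PROMPT_TEMPLATE.format(subject=subject, placement=placement)
    if extras:
        prompt += ", " + ", ".join(extras)
    return prompt


if __name__ == "__main__":
    print(build_prompt("a snakeskin speaker"))
    print(build_prompt("a flask", floating=True, extras=["dreamlike"]))
```

Keeping the shared style words in one place makes it easier to hold a consistent look across a batch of images while only the subject changes.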
My initial tests bore out at least one hypothesis: that AI image generators can significantly accelerate the creative production of blog images. In roughly two hours, I was able to create and export 24 images, averaging a little less than 10 minutes of work per final image.
CEO Scott liked the results from Jasper, but he also had a few concerns, namely:
Consistency. Could we maintain the aesthetic over time? AI image generation is evolving so rapidly. Would the prompts we establish today still work for us 6 months from now?
Ownership. We won’t own the image and there’s nothing that says anyone else out there couldn’t use these images designed for our brand. (This is not unique to Jasper either; DALL-E’s terms of service also state that parent company OpenAI owns the images its model creates.)
Ever the ambitious technologist, Scott even floated the idea of Deepgram building its own AI image generator, but we decided that an off-the-shelf solution made the most sense. Barring the idea of making the team pivot from building Deepgram to DALL-Egram, this is how I addressed the two main issues:
My biggest concern was being able to plow through a few hundred images right away. If the prompts had to change in 6 months, I’d be fine with that and there’d be more time to explore/evolve them to get the desired results on an individual basis.
The blog images on our site right now are basically licensed stock images which we lightly customize for our specific needs—we don’t really own those either. We selected a style that could be easily updated to reflect our brand palette, but since nobody is doing a completely custom job there, I was less concerned.
I got the green light to move forward with Jasper! Our plan was that we’d start with the most recent 75-80 post images, launch and then systematically tackle a bit more each week. We were ready to roll!
And then, 3 days later, all my prompts stopped working.
So given that Scott’s prediction came true about 5 months and 27 days early, it was clear I had a bit more work to do. To give you an idea of what I’m talking about, previously a prompt of, “a snakeskin speaker sitting on a smooth desert with a starry sky background, synthwave, purple blue green ambient light, vaporwave, exciting, photography, Salvador Dalí, surrealism” resulted in this beauty:
Three days later, testing out the same prompts resulted in these toxic nightmares:
Similarly, the same prompts that delivered this serene scene:
now resulted in this angry, apocalyptic wasteland:
On top of that, I was hitting a lot more glitchiness with the app this time. Whatever alleged improvements were made to Jasper’s image generation model also brought a bevy of new issues: system downtime, keywords that were inexplicably blocked despite being completely safe-for-work, and much less predictability when it came to image quality.
The downtime, while annoying, was generally short-lived. The blocked keywords were usually fixed by slightly tweaking (or even re-ordering) existing words. This was annoying but not a dealbreaker. The style fix took a bit more effort.
I ended up having to change the “mood” setting from “energetic” to “calm,” but also did a good bit of trial and error on descriptive words. I tried adding things like “3D-modeled,” “smooth,” “soothing,” or “dreamlike.” Occasionally I had to remove the reference to a “starry sky” as it tended to make the images veer more nightmarish than dreamlike. All this prompt engineering took marginally more time to get the results I was looking for. Instead of my previously estimated 10 minutes per image, it was probably about 15. Not a huge deal.
Now that we’re past that initial push of images, we’re in the process of backfilling in smaller batches, as well as tackling images for new posts. There are a few takeaways worth sharing.
Overall I found AI image generation made it significantly faster to reproduce the old images as well as to create new ones on the tight turnaround times we’ve become accustomed to. It’s safe to say the process is making it easier to keep up with demand. It wasn’t just a time savings either. It felt like less mental energy was expended to create these images which left me feeling more engaged and energized to continue to tackle bigger projects elsewhere.
Another unexpected positive outcome was that I personally had an improved attitude about incoming blog image requests. What was previously a bit tedious became a fun activity. What will Jasper surprise me with today?!
It’s also opened up a new way of approaching the images for me. Previously I entered the process of stock image searching with, “what am I likely to find a good image of that I can work with?” Now it’s a bit more freeing. For example, instead of this previous image created to address building with Python and Flask:
I can now explore fun, less expected interpretations like a snakeskin-covered flask:
AI isn’t giving me ideas I wouldn’t have had before, nor is this something I couldn’t have done before with a Photoshop montage, but it is allowing me to do so a lot faster by saving me both the image manipulation time and the time spent sifting through numerous images looking for the right ones to use.
Overall the feedback from the team has been extremely positive and people seem excited to see what images we’re going to deliver for the site, but it’s far from a perfect solution. While AI image generation feels like magic, these are the things that weren’t magically fixed:
As I mentioned before, there isn’t a lot of consistency with the prompts. We’re constantly evolving or tweaking them to get the desired result. Coming up with new ways of saying the same thing can be a bit challenging. Their predictably unpredictable nature has demanded that we create a set of cool but sparse background images that can be supplemented with other elements manually when the prompts aren’t delivering the outputs we need. This also comes in handy when we need something extremely specific, like this 3D image of a neural network model:
Or we need to feature a particular logo:
It takes a little extra time but not a crazy amount. And keeping an arsenal of them on hand makes future needs go faster.
Type Is Not Its Type
It’s also not great with type. For example, when I wanted this image of a door opening in front of a welcome mat, “WELCOME” was just not happening so I changed the prompt to “doormat” and supplemented it with type myself.
Again, it's not a big deal. AI did 90% of the up-front work; I just added some finishing touches.
Inconsistent Object Placement
Even if you do get a good image to work with, sometimes the subject isn’t where you want it to be. For example, our header images use a horizontal crop, but the subject needs to sit in the center so that the focal point stays the same when the images get cropped down to square thumbnails. Sometimes you can use size and placement words in your prompts, like “small and centered,” but it doesn’t always work, and a little Photoshop may need to happen.
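To make the cropping constraint concrete, here is a rough Python sketch of the arithmetic involved. Everything in it is an assumption for illustration: the 1200x630 header size, the subject coordinates, and the helper names `center_square_box` and `subject_survives_crop` are hypothetical, and the thumbnail is assumed to be a simple center crop. The box tuples use the same (left, top, right, bottom) order that Pillow's `Image.crop` expects.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def center_square_box(width: int, height: int) -> Box:
    """Return the largest centered square inside a width x height image."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


def subject_survives_crop(subject: Box, width: int, height: int) -> bool:
    """True if the subject's bounding box fits fully inside the square crop."""
    left, top, right, bottom = center_square_box(width, height)
    s_left, s_top, s_right, s_bottom = subject
    return (s_left >= left and s_top >= top
            and s_right <= right and s_bottom <= bottom)


if __name__ == "__main__":
    # Hypothetical 1200x630 header image.
    print(center_square_box(1200, 630))                           # (285, 0, 915, 630)
    # A centered subject stays in the thumbnail...
    print(subject_survives_crop((400, 100, 800, 500), 1200, 630)) # True
    # ...while one hugging the left edge gets cut off.
    print(subject_survives_crop((50, 100, 450, 500), 1200, 630))  # False
```

A check like this only tells you whether a generated image needs repositioning; the fix itself is still a manual nudge in Photoshop.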
Minor Design Iterations: Mission Improbable
Sometimes you get a really cool image that just needs some altering but not the kind that Photoshop can fix. You just want to change a few minor things, but there’s no way to do that (at least right now in Jasper). With a human-created design, I can say to the designer, “This image is great, just change X, Y, and Z.” With AI, you make one slight change to your prompt and get a whole new image, not an iteration of what came before. It’s like pulling the arm of a slot machine every time. You can hope that third cherry is going to show up this time, but there’s a fair chance you’ll end up with a lemon and two 7’s instead.
Not Quite Ready for Mass Adoption Internally
There was a fleeting moment where we discussed the idea of allowing the team at large to use AI to generate images for their blog posts by giving them a set of established style prompts to work with. Unfortunately, there are too many design-specific things to watch out for and it’s just not consistent enough for widespread use while maintaining our design standards.
The Elephant in the Room
How do I, as a designer, feel about using AI to generate design work?
First, there’s the practical side. Honestly, I see it as another tool in my arsenal to deliver engaging work quickly. I’m old enough to remember the introduction of Photoshop, but 30-odd years later the world still needs talented photographers. The rest of us have just stopped wasting their time with menial edits.
The fact is, I’m part of a 2-person design team delivering images for 20 or so blog contributors—and that’s a fairly small portion of our output as a whole. We need to be able to keep up with demand and deliver something that is engaging from a visual standpoint without the time investment to create something wholly custom for each post. In that sense, it’s definitely a win.
Do I feel like I “created” these images?
At first my feeling was that it would be about the same as selecting images to work with from stock photo sites. You pick a subject that works and is available to you, make some tweaks to align with the brand aesthetic, and Bob's your uncle. The process might have been a bit different but I expected the emotional quality to be the same. While I definitely do not feel like I can call myself either a photographer or fine artist, I was surprised at how much of it I actually felt was a product of my creativity; of how I could choose to visually represent the writer’s content.
I felt less limited by a set of stock photos available to me. I could generate almost any image I wanted. The only limitation was my time and how long I was willing to spend crafting the right prompts. I was more excited about the output of the AI-generated images than I was about the stock image versions, and the feedback from the team showed they were excited as well.
I oddly didn’t feel like “anyone” could have created these images. But I did feel like anyone with a vision and an eye for design could create something cool with this. In the immortal words of Ratatouille’s Anton Ego, "Not everyone can be a great artist, but a great artist can come from anywhere."
Can an AI designer be “part of the team”?
Here’s the thing. Right now, the unpredictability of the designs is both a pro and a con. Yes, it’s more challenging to get the image to match the intention, but there’s an enjoyable element when it delivers something adjacent but possibly even more interesting. It almost feels “collaborative,” albeit in an unintentional way. That said, I imagine the eventual goal is for it to get more accurate, not more collaborative. Unlike the human designer on my team, it won’t take my idea and say, “Here’s what you asked for, but here’s something different that occurred to me in the process and I thought this might be an even better solution for what you’re trying to do.” In short: no, I don’t think AI is coming for our jobs anytime soon. AI image generation is a fantastic tool. But our ideas, our point of view, and the results of our collaborative efforts are where the real value lies.
If you have any feedback about this post, or anything else around Deepgram, we'd love to hear from you. Please let us know in our GitHub discussions.