Human vs. AI Feedback: Who Refines AI-Generated Images Better?
Introduction
AI tools like DALL·E make generating images easy—but crafting truly compelling visuals requires refinement. At JudgeMyImage.com, we tested two paths to that refinement: AI-generated feedback vs. real human feedback.
In this study, we began with an image of a futuristic city and created two refined versions: one guided by AI-generated suggestions, and another shaped by real human feedback. All three images were then evaluated by participants to determine which approach delivered the strongest aesthetic impact.
The goal? To explore a repeatable, data-driven method for enhancing AI-generated visuals—one that blends creative intuition, structured feedback, and measurable results.
Step 1: Generate Basic AI Image
We generated an image using OpenAI's DALL·E model via the ChatGPT (GPT-4-turbo) interface.
Prompt used:
"A breathtaking futuristic cityscape at sunset, with neon-lit skyscrapers featuring sleek, refined architecture. The skyline glows in deep purples, electric blues, and warm oranges blending into the horizon. Flying cars glide smoothly overhead, their lights trailing softly through the air. Rain-slick streets below mirror the vivid lights. A few elegantly dressed figures move through the scene—some under translucent umbrellas, others gazing upward, bathed in cinematic light. Their silhouettes and soft glows add human warmth to the high-tech environment. Low-lying fog drifts near the ground, and distant mountains rise beyond the city, blending nature with technology. The composition should feel serene, immersive, and perfectly balanced, with flawless perspective and no visual distortion."
Generated Image:

Futuristic City – Basic AI Image
Step 2: Revision Based on AI Feedback
We uploaded the basic AI image to ChatGPT (GPT-4-turbo) and used its “Rate My Image” feature to obtain AI-generated feedback on visual quality, composition, and overall appeal.
Prompt used:
"Analyze this image and give me a concise prompt that describes aesthetic improvements, to be applied to this existing image using DALL·E 3. The prompt should not describe a new image, but specify how this one could be adjusted to enhance its visual appeal — such as lighting, color, composition, background, or clarity — while preserving the subject and overall structure."
Resulting revision prompt:
"Enhance the image by increasing overall clarity and sharpness, especially in the foreground figures and reflective surfaces; adjust lighting to add subtle rim lighting around the characters for better separation from the background; slightly reduce the haze to improve midground building visibility; enrich the sunset gradient with deeper purples and oranges for more dramatic contrast; and refine the flying vehicles with clearer detail and lighting to emphasize their futuristic design."
Finally, we used DALL·E via ChatGPT (GPT-4-turbo) to refine the basic AI image using the revision prompt above.
Generated Image:

Futuristic City – Revision Based on AI Feedback
Step 3: Human Feedback on Basic AI Image
To obtain real human feedback, we conducted a verbal feedback study using our Image Survey Designer. Participants were shown the basic AI image and asked:
How would you make this image aesthetically more pleasing? Give at least two suggestions.
The responses were analyzed by TextResponseHub.com, which extracted qualitative insights. Highlights included:
- Add more vibrant or brighter colors (42%)
- Change or enhance specific elements (e.g., figures, buildings) (31%)
- Adjust lighting or brightness (27%)
Increase the contrast of the colors further; put glowing lights on the buildings and mirrors on the road, which makes the city architecture more colorful and attracts attention to the centerpiece of sight. Increase and sharpen the foreground characters to make them clearer and more detailed so they stick out but won’t change the moody effect and add to the balance of the overall picture.
The full report is available here.
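The theme percentages above are shares of coded suggestions. A minimal Python sketch of that kind of tally follows, assuming each suggestion has already been assigned to exactly one theme. The counts (11, 8, and 7) are hypothetical values chosen only because they round to the reported 42%, 31%, and 27%; they are not the study's raw data, and the theme labels are shorthand.

```python
from collections import Counter

def theme_shares(coded_suggestions):
    """Return each theme's share of all coded suggestions, as whole percentages."""
    counts = Counter(coded_suggestions)
    total = sum(counts.values())
    # most_common() lists the themes from most to least frequent.
    return {theme: round(100 * n / total) for theme, n in counts.most_common()}

# Hypothetical coding of 26 suggestions (not the study's raw data).
coded = ["colors"] * 11 + ["elements"] * 8 + ["lighting"] * 7
shares = theme_shares(coded)
print(shares)  # {'colors': 42, 'elements': 31, 'lighting': 27}
```

Because each share is rounded separately, the percentages will not always add up to exactly 100.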
Step 4: Revision Based on Human Feedback
Next, we used DALL·E via ChatGPT (GPT-4-turbo) to refine the basic AI image using the previously collected human feedback.
Prompt entered:
"This image underwent a human review, resulting in the following feedback report. Please create a new version of the image based on the suggestions provided. [Report Data]"
Generated Image:

Futuristic City – Revision Based on Human Feedback
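The [Report Data] placeholder in the Step 4 prompt stands for the feedback report. One way to assemble such a prompt programmatically is sketched below in Python. The function name, the bullet formatting of the report, and the choice to pass only the three headline themes are assumptions made for illustration; they do not describe how the report was actually supplied in this study.

```python
PROMPT_TEMPLATE = (
    "This image underwent a human review, resulting in the following "
    "feedback report. Please create a new version of the image based on "
    "the suggestions provided.\n\n{report}"
)

def build_revision_prompt(themes):
    """Fill the revision prompt with (suggestion, percent) pairs from a report."""
    # Most frequent suggestions first, so they lead the report.
    ordered = sorted(themes, key=lambda item: item[1], reverse=True)
    report = "\n".join(f"- {text} ({pct}%)" for text, pct in ordered)
    return PROMPT_TEMPLATE.format(report=report)

# Headline themes from the Step 3 report.
themes = [
    ("Adjust lighting or brightness", 27),
    ("Add more vibrant or brighter colors", 42),
    ("Change or enhance specific elements (e.g., figures, buildings)", 31),
]
prompt = build_revision_prompt(themes)
print(prompt)
```

Ordering the suggestions by frequency is a design choice, not a requirement; it simply puts the most common request at the top of the report.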
Step 5: Image Comparison
To identify the most successful image, we created an image selection study using our Image Survey Designer. Participants were shown all three images and asked:
Which image is the most aesthetically pleasing?
Read Full Report


👉 The winner: The image revised with human feedback (55%).
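Whether a 55% share is a reliable lead depends on how many people voted, which this summary does not state. The Python sketch below runs a chi-square goodness-of-fit test of the vote shares reported in this article (55%, 35%, 10%) against an even three-way split, for a few hypothetical sample sizes. The sample sizes and the 0.05 threshold are assumptions, not figures from the study.

```python
import math

def chi_square_uniform(counts):
    """Chi-square goodness-of-fit statistic against an even split."""
    total = sum(counts)
    expected = total / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)

def p_value_df2(stat):
    """Upper-tail p-value for 2 degrees of freedom (three options): exp(-x/2)."""
    return math.exp(-stat / 2)

SHARES = (0.55, 0.35, 0.10)  # human-revised, original, AI-revised

for n in (20, 40, 100):  # hypothetical participant counts
    counts = [round(share * n) for share in SHARES]
    stat = chi_square_uniform(counts)
    p = p_value_df2(stat)
    verdict = "differs from an even split" if p < 0.05 else "inconclusive"
    print(f"n={n}: votes={counts}, chi2={stat:.2f}, p={p:.4f} -> {verdict}")
```

This test only asks whether the votes are uneven overall. Comparing the human-revised image directly against the original would need a separate pairwise test, and the closed-form p-value used here is valid only for three options (2 degrees of freedom).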
Conclusions & Key Takeaways
- ✅ Human Feedback Delivers Clear Aesthetic Gains: The image revised with human feedback was selected as the most aesthetically pleasing by 55% of participants, well ahead of both the original (35%) and the AI-only revised version (10%). Human feedback focused on emotionally resonant improvements such as contrast, focal clarity, and storytelling presence.
- ⚙️ AI Feedback Falls Short: Despite offering technically precise suggestions (e.g., sharpening, lighting adjustments), the AI feedback led to a less aesthetically pleasing result: the AI-revised image was chosen less often than even the unrevised original (10% vs. 35%). On the evidence of this test, ChatGPT's image-evaluation feedback should not be relied upon on its own for image optimization.
- 🔁 Hybrid Approach Works Best: Our study illustrates a repeatable creative pipeline: Generate with AI ➝ Gather Human Feedback ➝ Refine with AI ➝ Validate with Humans. This loop combines AI's strength in generating and refining images with the human ability to judge aesthetic appeal.
- 📊 Data > Guesswork: With tools like JudgeMyImage.com, creators no longer have to rely on intuition alone. Visual decisions can be tested, measured, and improved through feedback studies that are fast, affordable, and user-friendly.
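The four-stage loop described in the takeaways (generate, gather human feedback, refine, validate) can be outlined in code. The Python sketch below is only a structural outline: every function passed in is a hypothetical placeholder for a manual step or an external tool (image generation, a feedback survey, a selection survey), the vote shares are made-up stand-ins, and no real API is called.

```python
from typing import Callable, Dict, List

def run_feedback_loop(
    prompt: str,
    generate: Callable[[str], str],
    gather_feedback: Callable[[str], List[str]],
    refine: Callable[[str, List[str]], str],
    validate: Callable[[List[str]], Dict[str, float]],
) -> str:
    """Generate -> gather human feedback -> refine -> validate with humans."""
    original = generate(prompt)
    suggestions = gather_feedback(original)
    revised = refine(original, suggestions)
    # Validation compares the candidates and returns a vote share for each.
    shares = validate([original, revised])
    return max(shares, key=shares.get)

# Stand-in callables so the outline runs; real steps would replace them.
winner = run_feedback_loop(
    "futuristic cityscape at sunset",
    generate=lambda p: "original.png",
    gather_feedback=lambda img: ["brighter colors", "sharper figures"],
    refine=lambda img, notes: "revised.png",
    validate=lambda imgs: {"original.png": 0.4, "revised.png": 0.6},
)
print(winner)  # revised.png
```

Keeping the original among the validated candidates matters: it lets the final survey show whether a revision actually improved on the starting image.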
What Next?
What makes this pipeline unique is the integration of real human feedback. While this study focused on AI-generated images, the same process applies to human-made art.
✅ Submit your AI-generated or human-made artwork
✅ Get real verbal feedback from real people
✅ Receive an insight-packed report
✅ Improve your visuals with confidence
Go to our Image Survey Designer, select Numerical & Verbal Feedback, and choose your open-ended question. As a launch offer, these evaluations start at just $13.20 (25 participants, 7-day delivery).