AI Notes August 2025

AI Tools for Creatives – August 2025 Roundup


August 2025 marked a seismic shift in the AI landscape with the official launch of GPT-5, Google's breakthrough with "nano-banana" image generation, and a wave of enterprise-focused tools that are reshaping how creatives work. From advanced reasoning models to game-changing video generation capabilities, this month delivered tools that move beyond novelty and into genuine day-to-day productivity.

General AI tools

GPT-5 — (Major Launch) OpenAI's flagship release combines GPT and o-series into one unified model with routing for speed or depth. Features advanced reasoning through "Thinking" mode (40% improvement over GPT-4), multimodal support, agent mode for autonomous workflows, and new personality themes: Cynic, Robot, Listener, and Nerd.

Claude Opus 4.1 — (Updated) Anthropic rolled out significant enhancements, scoring 74.5% on SWE-bench Verified and improving multi-file refactoring. Claude Code gained a /security-review command for code vulnerability scanning, and Anthropic also introduced a 1 million token context window (a 5x increase) for Claude Sonnet 4.

Perplexity — (Updated) Launched Perplexity Comet, a next-gen AI browser, and began revenue sharing with publishers to address content licensing concerns. Also introduced OpenTable integration for restaurant reservations.

Gemini 2.5 Deep Think — (New) Google's multi-agent reasoning model that explores multiple ideas in parallel, scoring 99.2% on competition math tests and reaching bronze-level International Mathematical Olympiad performance.

Microsoft Copilot — (Updated) Now running on GPT-5 with enhanced performance. Added Copilot 3D for turning images into 3D models, Copilot Appearance for animated expressions, and integration into Samsung TVs and monitors.

Creative AI tools

Adobe Acrobat Studio — (New) Revolutionary platform uniting Acrobat, Adobe Express, and AI agents with "PDF Spaces" that turn document collections into conversational knowledge hubs with customizable AI Assistants.

Google Gemini 2.5 Flash Image ("nano-banana") — (New) Breakthrough image editing model that tops LMArena charts with consistent character rendering, precise object manipulation, and the ability to transform Google Maps views into POV perspectives.

Leonardo AI Lucid Origin — (New) Token-based image generation model with increased vibrancy, diversity, Full HD output, and improved text rendering for graphic design applications.

Midjourney HD Video — (New) Midjourney's entry into video generation with dynamic sequences, adjustable camera angles, and smooth transitions from text prompts.

Higgsfield — (Updated) Added Upscale feature for boosting photo resolution up to 16x and video up to 4K, plus Higgsfield Assist with GPT-5 integration for creative prompting.

Ideogram Character Reference — (New) Addresses character consistency in AI images using single reference photos to maintain core features across different scenes and styles.

Adobe Firefly — Structure Reference continues to allow layout locking while varying creative content, plus collaborative Boards feature.

Adobe Illustrator — Generative Shape Fill tool remains a key feature for vector art creation.

Midjourney V7 — Enhanced video generation modes, improved logo creation, and new editor capabilities.

Leonardo AI — Real-time canvas with style references and 3D mesh export capabilities.

Story Diffusion — Maintains comic character consistency across panels.

DALL·E 3 — Inpainting and regional editing directly within ChatGPT.

Khroma — AI color scheme generator that learns user preferences.

AutoDraw — Quick sketch helper for wireframes.

Fontjoy — Font pairing generator.

Botika — AI fashion models for Shopify product images.

Magnific — AI upscaler with prompt-guided detail enhancement.

Presti — Realistic furniture placement in AI-generated rooms.

IC Light V2 — Text-guided portrait relighting tool.

Krea 3D Objects — Image to textured 3D mesh generation.

Flora — Team collaboration tool chaining text, images, and videos.

Stable Virtual Camera — Generates explorable 3D scenes from photographs.

Video AI tools

Google Vids — (New) Official rollout of AI-powered video editing platform integrated into Google Workspace with AI avatars, automated filler-word removal, and image-to-video generation.

Runway Aleph — (Updated) Enhanced text-based video editor supporting third-party models like Veo 3 within Chat Mode for greater control and flexibility.

Veo 3 — (Updated) Now features Image to Video capability with unprecedented control and quality, available in both Quality and Fast modes through Leonardo AI.

MovieFlo AI — (New) Professional video platform built by Lucasfilm and ILM veterans, offering a scene-based workflow with top video models (Veo, Kling, Runway) integrated.

Kling — Fast and controllable 1080p generative video tool for social media content.

Sora — OpenAI's Sora 2 is still anticipated, with promised fixes for motion issues and the addition of synchronized audio.

Scenario — Direct video generation control through frame sketching.

Seedance 1.0 — Narratively consistent video creation with multi-shot character cohesion.

Mirage (Decart) — Real-time world transformation and style transfer for video streams.

Moonvalley Marey — Hybrid 2D-to-3D filmmaking AI platform with camera motion control.

Dream Machine (Ray 2) — Cinematic AI-generated clips with realistic physics.

Amazon Nova Reel — Budget-friendly text-to-video generation.

Pika Pikaswaps — AI actor and background swaps via prompts.

Gemini Video Generator — AI-produced animated scenes.

Descript — Video/audio editor with overdub and eye contact correction.

Viggle — Motion transfer and greenscreen-ready character creation.

Synthesia — Avatar video creation with multilingual voiceovers.

FaceFusion — Open-source GPU-based face-swapping tool.

Deep Live Cam — Real-time deepfake streaming for VTubers.

Revid.ai — Auto video summarization for social shorts.

Riverside — Lossless recording with text-based editing.

Flux — Fast 1080p video generator with style control.

HeyGen — AI face swaps with multilingual voice control.

Arcads — Automated ad video creation platform.

Filmora AI — Video editing with auto cut, style transfer, and filler removal.

Keytake — Converts documents and URLs into explainer videos.

Music & Audio AI tools

ElevenLabs Music — (New) Major expansion beyond voice with commercial-use music generation, though it raises questions about artist rights and originality.

Suno V4.5 — Fast 2-track mixing with improved vocals.

Udio — AI music generation for songs and skits.

ElevenLabs — Text-to-speech and sound effects, now with music generation capabilities.

PlayHT — High-quality, low-latency voice API.

Artlist AI Voiceover — Multilingual voiceovers with Adobe Premiere support.

Hume Octave — Emotion-aware text-to-speech with controllable tone.

SoundHound Chat — Car assistant AI for voice commands and ordering.

ElevenLabs Bark — AI speech technology for pet-tech toys.

Amazon Nova Sonic — Expressive voice AI powered by AWS Bedrock.

Synthflow — AI tool for automating meeting bookings and CRM updates.

Shamaze — Generates bedtime stories in personalized AI voice.

UI Design AI tools

Figma "First Draft" — Relaunched AI layout generator.

Uizard — UI mockups from text prompts.

Galileo — Generates polished UI screens using AI.

Dora — No-code 3D site builder with WebGL export.

Webflow AI Site Builder — AI builds hosted sites from creative briefs.

Attention Insight — Predicts user gaze with heatmaps for UX improvement.

Operative.sh — Automated UX testing with screenshots.

Scene 2.0 — Whiteboard to website creation platform.

AI CSS Animations — Generates CSS animations from text prompts.

Same.dev — Clones websites into editable React code.

Firebase Studio — Builds full-stack apps from prompts.

Marketing AI tools

Pencil — (Updated) Now available on Google Cloud Marketplace, expanding enterprise accessibility.

Jasper — Brand voice control and team permissions.

Aha — AI influencer team management and ROI dashboard.

Virallyst — Caption rewriting, hook testing, and auto-posting.

Warmy — Domain warm-up service with updated deliverability guides.

Happenstance — Plain language AI network search.

AiSDR — LinkedIn automated meeting scheduler.

Reachy — Respectful LinkedIn outreach agent.

Keak — AI platform for landing pages with A/B test features.

No Code Builders (Vibe Coding)

Lovable — (Major Update) Multiplayer vibe coding platform projecting $1B ARR within 12 months and adding $8M ARR monthly. Its Agent Mode beta handles autonomous AI planning and editing.

Replit Agent — Prompt-driven app deployment.

V0 by Vercel — (Updated) Introduced Design Mode for easy UI element tweaking and Tailwind v4 support.

GitHub Copilot in VS Code — Integrates into the VS Code sidebar and terminal for natural-language coding assistance.

Emergent — (New) No-code mobile app builder powered by agentic AI.

Local Models and Tools to Run Them

Tools

Ollama — Open-source tool for running LLMs locally with command-line interface and API.

Chatbox — Cross-platform AI client for Windows, macOS, Linux, Android, iOS, and web.

Open WebUI — Self-hosted AI interface supporting Ollama and OpenAI-compatible APIs.

ComfyUI — Node-based interface for generative AI with visual workflow creation.

AUTOMATIC1111 — Browser-based interface for Stable Diffusion with advanced tools.
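
For anyone wondering what the "API" in the Ollama entry looks like in practice, here is a minimal sketch. It assumes Ollama is already running on its default local address (port 11434) and that a model has been pulled. The model name `llama3` and the helper names `build_generate_request` and `generate` are illustrative choices for this example, not part of Ollama itself.

```python
import json
import urllib.request

# Assumption: Ollama is serving on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for Ollama's generate endpoint (no network call yet)."""
    payload = {
        "model": model,    # a model you have already pulled, e.g. "llama3"
        "prompt": prompt,
        "stream": False,   # ask for one JSON reply instead of a token stream
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to the local Ollama server and return the generated text."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Non-streaming replies carry the generated text in the "response" field.
    return body["response"]
```

With the server running, something like `generate("llama3", "Suggest three color palettes for a jazz festival poster.")` should return the model's answer as a plain string. Only the Python standard library is used, so there is nothing extra to install.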

Models

OpenAI GPT-OSS — (New) Open-weight models (120B & 20B), optimized for reasoning and coding, Apache 2.0 licensed.

DeepSeek R1-Omni — 671B open-weights model with 200k context, free for research.

Llama 3 (8B/70B) — Llama 3.1 adds a 405B-parameter model, a 128K-token context window, and multilingual and coding improvements. Llama 3.2 brings vision-capable and edge-friendly models.

LG EXAONE Deep 32B — Laptop-friendly model scoring near GPT-4 on STEM tasks.

Ethical Models

FLite — Open-source 10B-parameter diffusion model trained on 80M licensed images from Freepik.

Bria AI — Compact open-source text-to-image model (4B parameters) built on licensed data.

Blunge — Ethical AI image generation protecting artists' rights through manual ownership checks.

Other AI tools

Microsoft MAI-Voice-1 & MAI-1-preview — (New) Microsoft's first in-house AI models, with MAI-Voice-1 generating one minute of audio in under one second.

OpenAI GPT Realtime — (New) Real-time API for building voice agents, with 82.8% accuracy on the Big Bench Audio reasoning benchmark.

Anthropic Claude for Chrome — (New) Chrome extension allowing Claude to handle browser tasks like cart management and form filling.

Gradio — Open-source Python library for building interactive ML web apps.

Wide Research by Manus — Tool for handling multiple research tasks concurrently.

Harvey — AI bot for contract review and legal due diligence.

CopyCat — Low-code browser automation from natural language instructions.

Exa Search — Hybrid semantic+keyword search API for docs and e-commerce.

Lambda Inference API — Pay-per-token gateway to major frontier models.

Zapier MCP — One prompt triggers 8,000+ SaaS actions for agents.

Kimi K2 — Chinese open-weight 1T-parameter LLM outperforming GPT-4 in coding/math.

Payman — Agent-driven hiring with secure payment.

Pinokio — One-click local deployment of AI apps.

Cursor v1.3 — Shared terminal and faster context-aware coding chat.

Browser Use — Library for headless browser automation in agents.

Databutton MCP — Drag-and-drop AI workflow builder.

Documenso — Open-source DocuSign alternative.

Terra Security — AI-driven penetration testing platform.

Agent Simulate — Synthetic user load testing for UX research.

Education AI tools

Google Gemini Guided Learning — (New) Interactive step-by-step tutoring with visual elements, flashcards, and video explanations, designed to compete with OpenAI's Study Mode.

OpenAI Study Mode — Prompts users for reasoning, stepping away from direct answers.

Claude for Education — Large-context tutor and worksheet generator.

OpenAI Academy — Courses on prompt engineering and AI safety.

AI Tutor by Roadmap.sh — Interactive study tool following coding/learning roadmaps.

TurboLearn — Note-taking, flashcards, and quizzes from various media.

NotebookLM Audio/Video Overviews — Podcast-style or visual learning summaries powered by AI.

Nvidia free AI courses — Hands-on training for AI and ML fundamentals.

Globe Explorer — AI-based interactive knowledge maps.

Class Central — Indexed catalogue of online AI courses.

University of Illinois AI — MBA-level AI specialization online.

Microsoft Generative AI Beginner — Twelve-lesson introductory curriculum.

Maven AI Bootcamps — Cohort courses on safety, prototyping, product.

Worth checking out

Halo X — (New) $249 AI glasses from Harvard dropouts that record, transcribe, and analyze conversations in real time, though their covert recording capabilities raise privacy concerns.

GeoSpy — (Controversy) AI tool that can pinpoint photo locations from visual details, being explored by LAPD for $5,000/year, raising surveillance concerns.

Figure 02 Helix — (New) Humanoid robot with AI model for complex tasks like folding laundry autonomously.

Devin 2.0 — Autonomous developer agent.

Convergence Parallel — Multi-agent orchestration framework.

Mistral OCR API — High-speed multilingual OCR API.

NotebookLM "Discover Sources" — Curated research companion.

Keytake — Converts documents into branded explainer videos.

Think pieces and resources

Google Prompt Engineering Guide — Best practices for prompt design.

State of AI 2025 — Concise trend graphs and insights.

Agent Survey 2025 — 264-page review on autonomous agents.

World Economic Forum Future Jobs Report — Skills and wage impact analysis.

Hugging Face SmolAgents Course — Build lightweight AI agents.

601 AI Income Ideas — Monetization guide for AI applications.

IBM Building AI-Powered Chatbots — Vendor-neutral course.

Elements of AI (University of Helsinki) — Free fundamentals of AI and ethics.

August 2025 Trends and Market Highlights

The GPT-5 Reality Check — While OpenAI's flagship launch brought advanced reasoning and a unified architecture, user backlash over the removal of GPT-4o and complaints about "colder" responses showed that a more powerful model doesn't always mean a better user experience.

The "Nano-Banana" Revolution — Google's mysterious Gemini 2.5 Flash Image model dominated social media and creative communities, proving that sometimes the biggest breakthroughs come from unexpected places.

Enterprise AI Goes Mainstream — From Adobe's PDF Spaces to Microsoft's in-house models, August marked the month when AI stopped being a novelty and became infrastructure for serious business applications.

The Vibe Coding Boom — Lovable's projection of $1B ARR within 12 months highlighted how natural language programming is moving from experimental to essential, with developers embracing "vibe coding" over traditional syntax.

AI Safety Gets Real — Following troubling reports of ChatGPT interactions with vulnerable users, major platforms began implementing concrete safety measures, moving beyond PR promises to actual protective features.

The Reasoning Revolution — Multiple labs launched reasoning-focused models (GPT-5, Deep Think, Claude Opus 4.1), signaling that the next battleground in AI isn't just speed or scale, but genuine problem-solving capability.

Privacy vs Progress Tensions — From Halo X's covert recording capabilities to GeoSpy's location tracking, August highlighted the growing tension between AI innovation and personal privacy rights.


If you spot any missing links, please DM or comment!

John Luba

Author & Content Creator