Adobe this week unveiled updates to its Firefly AI suite that focus heavily on audio generation. Two new features, Generate Soundtrack and Generate Speech, offer AI-generated music and voiceovers for content creators. Both tools are currently in beta, and Adobe is pitching them as giving creators more control and flexibility over AI-generated audio.
AI Music Licensing Simplified
A key selling point of Firefly’s Generate Soundtrack is its licensing: Adobe says any music created with the tool is cleared for commercial use, indefinitely. This addresses a major pain point for creators, who often face complex and expensive music licensing restrictions. According to Adobe’s head of AI audio, Jay LeBoeuf, the goal is to “remove the confusion” surrounding music rights in the digital age. The model is trained on licensed content, which is intended to avoid copyright problems such as takedowns or strikes on platforms like YouTube.
The system also includes safeguards; for example, prompts referencing specific artists (like Taylor Swift) are rejected to prevent unauthorized replication of copyrighted material.
Creating AI Soundtracks: A Step-by-Step Approach
Generate Soundtrack analyzes an uploaded video and suggests a prompt based on the video’s vibe, style, and purpose, which creators can refine as needed. Users can also adjust tempo, energy level, and duration to match their content. Within minutes, Firefly delivers four instrumental variations tailored to the video’s length (up to five minutes).
To get started:
- Open Firefly on the web.
- Click “Generate” then “Generate Soundtrack.”
- Upload your video.
- Review or edit the AI-generated prompt.
- Adjust energy, tempo, and duration.
- Generate and download the soundtrack.
Access to Firefly requires an Adobe plan; plans start at $10 per month.
AI Speech Generation: Fine-Tuning for Realism
Firefly’s Generate Speech tool offers a high degree of customization. Users can input text (up to 7,500 characters) and choose from 50 diverse voices, including nonbinary options, across 20 languages. The tool goes beyond simple text-to-speech by allowing users to add pauses, adjust tone, and correct pronunciations using a phonetic breakdown feature.
Adobe emphasizes the importance of lifelike speech for creators who may not be comfortable recording their own voiceovers. LeBoeuf explains that the aim is to empower “small business owners, educators, to everybody that really just has a story to tell.”
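For scripts longer than the 7,500-character input limit mentioned above, one practical workaround is to split the text into smaller pieces before pasting each one into the tool. The following is a minimal sketch of that idea in Python, not anything provided by Adobe: the function name `split_script` and the constant `MAX_CHARS` are hypothetical, and the sentence detection is a simple punctuation-based heuristic.

```python
import re

# Per-input character limit reported in the article for Generate Speech.
MAX_CHARS = 7500


def split_script(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters.

    Chunks break at sentence boundaries where possible. A single sentence
    longer than `limit` is cut at the limit as a fallback.
    """
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # Fallback: hard-cut any sentence that cannot fit in one chunk.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            # Adding this sentence would overflow, so start a new chunk.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk could then be generated as a separate voiceover clip and assembled in an editor. Splitting at sentence ends rather than at a fixed character offset keeps pauses between clips in natural places.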
Expanding Partnerships for AI Innovation
In addition to its own developments, Adobe is expanding partnerships with AI companies like ElevenLabs (for multilingual speech generation) and Topaz Labs, further integrating third-party AI models into its platform. The company is also rolling out a fifth-generation Firefly Image Model with improved photorealism and prompt-based editing, along with a beta Firefly video editor featuring a multitrack timeline for AI clip compilation.
Adobe’s expansion into AI audio marks a significant step in democratizing content creation, providing tools that address licensing concerns and enhance creative control. The company’s commitment to partnerships suggests a future where AI-powered media production becomes increasingly accessible and efficient.