WHERE MULTIMODAL AI FITS IN
Multimodal capability integrates smoothly with existing layers in the contact center.
• Self-service and virtual assistants: AI handles intake, visual understanding, and simple resolutions.
• Agent assist: AI provides real-time context and guidance to agents.
• Post-call summaries: AI documents both visual and verbal root causes.
• Proactive support with telemetry: As described earlier, AI detects early warning signs, interprets signals, and gives agents the ability to resolve issues before customers experience failure, moving operations from reactive to preventative.
FOUR PRACTICAL STEPS TO START
1. Pinpoint high-repeat-contact categories: Focus on issues where agents repeatedly ask what the customer is seeing.
2. Enable image or screenshot intake: This forms the foundation for multimodal understanding.
3. Train agents on AI-generated insights: Agents remain the decision-makers. AI enhances clarity.
4. Start with one journey: A single high-impact workflow builds momentum and proves value.
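For step 1, one way to surface candidate categories is to rank them by repeat-contact rate. The following is a minimal sketch, not a prescribed method: it assumes contact records are available as (customer ID, category) pairs, counts every contact after a customer's first in a category as a repeat, and ignores time windows. The function name and record layout are hypothetical.

```python
from collections import defaultdict


def repeat_rate_by_category(contacts):
    """Rank categories by share of contacts that are repeats.

    contacts: iterable of (customer_id, category) tuples, one per contact.
    Assumption: any contact after a customer's first in the same
    category counts as a repeat (no time window is applied).
    """
    per_customer = defaultdict(int)
    for customer_id, category in contacts:
        per_customer[(customer_id, category)] += 1

    totals = defaultdict(int)
    repeats = defaultdict(int)
    for (_, category), count in per_customer.items():
        totals[category] += count
        repeats[category] += count - 1  # contacts beyond the first

    rates = {cat: repeats[cat] / totals[cat] for cat in totals}
    # Highest repeat rate first
    return sorted(rates.items(), key=lambda kv: kv[1], reverse=True)


sample = [
    ("a", "wifi"), ("a", "wifi"), ("a", "wifi"), ("b", "wifi"),
    ("c", "billing"), ("d", "billing"),
]
ranking = repeat_rate_by_category(sample)
```

In this toy sample, the "wifi" category ranks first because half of its contacts are repeats. A real analysis would likely add a time window and filter for contacts where agents had to ask what the customer was seeing.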
THE FUTURE: AGENTIC MULTIMODAL SYSTEMS
Agentic systems built on multimodal AI are the next step for service, support, and FCR. They do not just perceive and summarize problems; they can take safe, reversible actions that shorten time to resolution and reduce operational load.
This shift is accelerating rapidly: Gartner predicts that by 2028, 33% of enterprise software applications will include agentic AI, up from less than 1% in 2024.
Here’s how to clearly differentiate the roles played by AI and by people.
What future AI agents will do:
• Interpret visuals, logs, and telemetry automatically.
• Run guided diagnostics without human intervention.
• Validate device states or configurations.
• Trigger safe workflows such as resets or permission checks.
• Resolve simple issues independently.
• Prepare full context packages for human agents on complex problems.
These capabilities will allow entire categories of low-complexity contacts to become fully autonomous, improving speed and reducing cost.
What human agents will focus on:
• Complex, emotionally sensitive, or high-consequence issues.
• Situations with multiple variables or unclear signals.
• Customer reassurance, negotiation, and expectation setting.
• Oversight and approval for agentic workflows.
• Relationship building and brand experience.
• Final decision-making in ambiguous cases.
This division of labor results in a high-efficiency model in which simple and mid-complexity issues are resolved autonomously by AI, and complex interactions are solved through collaboration between AI and humans, not by human effort alone.
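The division of labor described here can be sketched as a simple routing rule. This is an illustrative sketch under stated assumptions, not a reference design: the complexity labels, the confidence score, the 0.8 threshold, and every class and function name are hypothetical. It sends complex, emotionally sensitive, or unclear cases to a human with AI-prepared context, requires human approval before any non-reversible action, and lets AI resolve the rest on its own.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    AUTONOMOUS = "autonomous"
    AI_WITH_APPROVAL = "ai_with_human_approval"
    HUMAN_WITH_AI_CONTEXT = "human_with_ai_context"


@dataclass
class Contact:
    complexity: str              # "low", "mid", or "high" (hypothetical labels)
    emotionally_sensitive: bool
    signal_confidence: float     # 0.0 to 1.0, hypothetical clarity score
    action_reversible: bool      # can the proposed fix be undone safely?


def route_contact(contact: Contact, min_confidence: float = 0.8) -> Route:
    # Complex or emotionally sensitive issues stay with human agents,
    # who receive the AI-prepared context package.
    if contact.complexity == "high" or contact.emotionally_sensitive:
        return Route.HUMAN_WITH_AI_CONTEXT
    # Unclear signals also go to a human for the final decision.
    if contact.signal_confidence < min_confidence:
        return Route.HUMAN_WITH_AI_CONTEXT
    # Actions that cannot be reversed need human oversight and approval.
    if not contact.action_reversible:
        return Route.AI_WITH_APPROVAL
    # Clear, lower-complexity issues with reversible fixes run autonomously.
    return Route.AUTONOMOUS


password_reset = Contact("low", False, 0.95, True)
angry_outage = Contact("mid", True, 0.90, True)
firmware_wipe = Contact("mid", False, 0.90, False)
```

Checking the human-routing conditions first keeps the default conservative: a contact reaches the autonomous path only after every escalation condition has been ruled out.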
Agentic multimodal systems represent a natural continuation of the SEE-SAY-SOLVE model. Once AI can see and explain, the next logical step is allowing it to take carefully defined actions.
This is not speculation. The earliest forms of these systems are already emerging across advanced support operations.
CONCLUSION
For years, contact centers have attempted to improve FCR through better routing, training, and knowledge systems.
But the real barrier was not agent capability. Instead, it was a lack of information and context.
Customers experience problems visually. Agents troubleshoot verbally. Traditional AI sits in the middle and fails to connect the two worlds.
Multimodal AI finally provides that missing link. It replaces heuristic assumptions with deterministic data, gives agents the full picture instead of partial information, and enables resolution on the first attempt instead of repeated frustration.
And by doing so, multimodal AI becomes the foundation for the next frontier: agentic systems that autonomously resolve simple issues, while empowering agents to solve complex ones with more speed, more confidence, and more context than ever before.
For the first time, we have AI that can understand customers the way humans do: through sight, sound, and context. That is why multimodal AI is not just the future of FCR. It is the beginning of a fully agentic support ecosystem.
Ankit Talwar is Director of Product Management for AI at Dell Technologies and a Distinguished Fellow of the Soft Computing Research Society (SCRS). He designs and scales novel enterprise-grade AI systems across customer experience and large-scale supply chain operations. He can be reached at https://www.linkedin.com/in/ankittalwar/
JUNE 2026