Large Multimodal Models (LMMs)
Microsoft Tackles Voice-In, Text-Out: Microsoft’s Phi-4 Multimodal model can process text, images, and speech simultaneously
Microsoft debuted its first official large language model that responds to spoken input.