Computer Vision

Table comparing model performance on Mathvista, MMMU, ChartQA, DocVQA, and other tasks.

Mistral’s Vision-Language Contender: Mistral unveils Pixtral Large, a rival to top vision-language models

Mistral AI unveiled Pixtral Large, which rivals top models at processing combinations of text and images.
Grounding DINO animation depicting object detection with bounding boxes on images.

Object Detection for Small Devices: Grounding DINO 1.5, an edge device model built for faster, smarter object detection

An open source model is designed to perform sophisticated object detection on edge devices like phones, cars, medical equipment, and smart doorbells.

Landmine Recognition: AI supports specialists in battlefields by detecting landmines and other unexploded ordnance.

An AI system is scouring battlefields for landmines and other unexploded ordnance, enabling specialists to defuse them.
Amazon's Just Walk Out store

Amazon Rethinks Cashier-Free Stores: Amazon scales back its AI-powered "Just Walk Out" checkout service

Amazon withdrew Just Walk Out, an AI-driven checkout service, from most of its Amazon Fresh grocery stores...

High Yields for Small Farms: AI elevates chili farming in India with smarter yields.

Indian farmers used chatbots and computer vision to produce higher yields at lower costs. The state government of Telangana in South India partnered with agricultural aid organization Digital Green to provide AI tools to chili farmers. 

The Big Picture and the Details: I-JEPA, or how vision models understand the relationship between parts and the whole

A novel twist on self-supervised learning aims to improve on earlier methods by helping vision models learn how parts of an image relate to the whole.
Excerpt from Google Pixel 8 promotional video

Generative AI Calling: Google brings advanced computer vision and audio tech to Pixel 8 and 8 Pro phones.

Google’s new mobile phones put advanced computer vision and audio research into consumers’ hands. The Alphabet division introduced its flagship Pixel 8 and Pixel 8 Pro smartphones at its annual hardware-launch event. Both units feature AI-powered tools for editing photos and videos.

Vision Transformers Made Manageable: FlexiViT, the vision transformer that allows users to specify the patch size

Vision transformers typically process images in patches of fixed size. Smaller patches yield higher accuracy but require more computation. A new training method lets AI engineers adjust the tradeoff.
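
The tradeoff between patch size and computation can be illustrated with simple arithmetic. The sketch below is not FlexiViT itself; it only assumes a square image split into non-overlapping square patches, and the function name `patch_grid_stats` and the 224-pixel image size are hypothetical choices for illustration. It counts the tokens a vision transformer would process at each patch size and the number of token pairs that self-attention would compare.

```python
def patch_grid_stats(image_size: int, patch_size: int) -> dict:
    """Token count and self-attention pair count for a square image.

    Assumes non-overlapping square patches (an illustrative assumption).
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    tokens = patches_per_side * patches_per_side
    # Self-attention compares every token with every token,
    # so this count grows with the square of the number of tokens.
    return {
        "patch_size": patch_size,
        "tokens": tokens,
        "attention_pairs": tokens * tokens,
    }


# Hypothetical 224x224 input at three patch sizes.
for size in (32, 16, 8):
    print(patch_grid_stats(224, size))
```

Under these assumptions, halving the patch size quadruples the number of tokens and multiplies the attention pair count by 16, which is why smaller patches cost more to process.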
Security cameras somewhere around the Red Square in Moscow, Russia

From Pandemic to Panopticon: How Russia is using face recognition to punish dissidents.

Governments are repurposing Covid-focused face recognition systems as tools of repression. Russia’s internal security forces are using Moscow’s visual surveillance system, initially meant to help enforce pandemic-era restrictions, to crack down on anti-government...
Flowcharts show how a new contrastive learning approach uses metadata to improve AI image classifiers

Learning From Metadata: Descriptive Text Improves Performance for AI Image Classification Systems

Images in the wild may not come with labels, but they often include metadata. A new training method takes advantage of this information to improve contrastive learning.
A football player performs drills on the pitch while a computer vision system tracks and grades his movements

On the Ball: An AI-Powered App Lets Amateur Footballers Try Out for the Pros

AiSCOUT uses computer vision to grade amateur footballers and recommends those who score highest to representatives of professional teams.
A Cadillac SUV drives through one of UVeye's Atlas arches

Auto Diagnosis: AI-Powered Inspections Arrive at Dealers for GM and Volvo

A drive-through system from UVeye automatically inspects vehicles for dents, leaks, and low tire pressure.
Masked Auto-Encoder (MAE) explanation

Who Was That Masked Input? Pretraining Method Improves Computer Vision Performance

Researchers have shown that it’s possible to train a computer vision model effectively on around 66 percent of the pixels in each training image. New work used 25 percent, saving computation and boosting performance to boot.
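
To make the 25 percent figure concrete, here is a minimal sketch of random masking. It assumes the image has already been divided into patches and that masking is applied per patch, which the summary above does not spell out. The function name `random_patch_mask`, the count of 196 patches, and the fixed seed are hypothetical choices for illustration, not details from the work described.

```python
import random


def random_patch_mask(num_patches: int, keep_ratio: float, seed: int = 0):
    """Randomly split patch indices into a visible set and a masked set.

    keep_ratio is the fraction of patches left visible (assumed per-patch masking).
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    rng = random.Random(seed)
    order = list(range(num_patches))
    rng.shuffle(order)
    num_keep = max(1, int(num_patches * keep_ratio))
    visible = sorted(order[:num_keep])   # patches the model would see
    masked = sorted(order[num_keep:])    # patches hidden during training
    return visible, masked


# Hypothetical example: 196 patches with 25 percent kept visible.
visible, masked = random_patch_mask(196, 0.25)
print(len(visible), len(masked))
```

If a model processes only the visible patches, as this sketch assumes, it handles a quarter of the input per image, which is one way such a scheme could save computation.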
AI system recognizes normal chest x-rays

AI Enters the Radiology Department: ChestLink, an AI X-Ray Tool Approved by European Officials

The European Union approved for clinical use an AI system that recognizes normal chest X-rays. ChestLink is the first autonomous computer vision system to earn the European Economic Area’s CE mark for medical devices...
AI Research SuperCluster (RSC)

New Supercomputer on the Block: All about Meta's AI Research Supercluster

Facebook’s parent company is staking its future on a new compute cluster. Meta unveiled AI Research SuperCluster (RSC), which is designed to accelerate training of large models for applications like computer vision, natural language processing, and speech recognition.