Using GPT to debunk conspiracy theories, Nvidia’s new open vision-language models

Published: Sep 23, 2024
Reading time: 3 min read
[Image: A detailed underwater scene showing various whale species communicating through sound waves.]

Twice a week, Data Points brings you the latest AI news, tools, models, and research in brief. In today’s edition, you’ll find:

  • Kling 1.5 text-to-video model adds 1080p, editing tools
  • Codeforces restricts AI use in competition
  • Identifying whalesong using bioacoustic markers
  • A prize challenge for improving robotics world models

But first:

Research: Chatbots can help talk people out of conspiracies

A study of 2,190 Americans who believed in conspiracy theories found that personalized dialogues with GPT-4 significantly reduced belief in focal conspiracy theories by about 20 percent on average, with effects persisting for at least two months. This treatment proved effective across various topics and even among participants with deeply entrenched beliefs, challenging the notion that believers in conspiracies are impervious to evidence. The AI conversations also reduced beliefs in unrelated conspiracies and shifted conspiracy-related behavioral intentions. These findings suggest that tailored, in-depth counterarguments can be persuasive and demonstrate the potential for AI to be used responsibly in combating misinformation at scale. (Science)

Nvidia’s open vision-language models excel at OCR, image understanding

Nvidia researchers introduced NVLM 1.0, a new family of open multimodal large language models that rival leading proprietary and open models on vision-language tasks. After multimodal training, the model shows improved text-only performance relative to its language-model backbone, outperforming some competitors on that front. NVLM 1.0 achieves performance on par with leading models like GPT-4o and Llama 3-V across both vision-language and text-only tasks, while showing versatile capabilities in OCR, reasoning, localization, and coding. The model’s forthcoming release, including weights and code, offers developers a powerful new tool for advancing multimodal AI research and applications. (GitHub and arXiv)

Kling updates its high-definition text-to-video model

KLING AI unveiled its upgraded video model, KLING 1.5, which generates 1080p videos at the same price as the previous model’s 720p videos. KLING 1.5 boasts significant improvements in image quality, dynamic quality, and prompt relevance, offering what the company calls a 95 percent boost in performance over its predecessor (essentially a back-of-napkin metric). The company also introduced a new Motion Brush feature for KLING 1.0 that lets users precisely define the movement of elements within an image; KLING AI says Motion Brush and Camera Control will come to the 1.5 model soon. (KLING AI)

Codeforces limits AI use in coding competitions after o1 results

Coding competition site Codeforces announced new rules restricting the use of AI systems like ChatGPT, Gemini, and Claude in its programming contests. The rules allow limited AI use for tasks like translating problem statements and basic code completion, but they prohibit using AI to generate core logic, algorithms, or solutions to problems. Notably, OpenAI touted its o1 models’ success in solving Codeforces competition problems when it released them. (Codeforces)

Google’s new models identify rare whale calls

Google researchers developed a new whale bioacoustics model capable of identifying eight distinct whale species and multiple call types for two of those species. The model, which recognizes the recently attributed “Biotwang” calls of Bryde’s whales, can classify vocalizations across a broad acoustic range and has been used to label over 200,000 hours of underwater recordings. This breakthrough in whale vocalization classification could significantly enhance conservation efforts and ecological research by improving researchers’ ability to track whale populations in remote environments. (Google Research)

1X releases robotics model, launches prize challenge for world models that improve on it

Researchers at 1X trained an AI world model that can simulate diverse robot behaviors and object interactions based on real-world data. The model aims to solve challenges in evaluating general-purpose robots across many tasks and changing environments, potentially enabling more rigorous testing and development of robotic systems. To accelerate progress, the company released a dataset and launched a public competition with cash prizes for improving world models for robotics. (1X)


Still want to know more about what matters in AI right now?

Read last week’s issue of The Batch for in-depth analysis of news and research.

Last week, Andrew Ng highlighted the role of data engineering in AI and introduced a new professional certificate course series on Coursera.

“In this specialization, you’ll go through the whole data engineering lifecycle and learn how to generate, ingest, store, transform, and serve data. You’ll learn how to make necessary tradeoffs between speed, flexibility, security, scalability, and cost.”

Read Andrew’s full letter here.

Other top AI news and research stories we covered in depth: OpenAI’s latest model excels in math, science, and coding, though its reasoning process isn’t visible; SambaNova increased inference speeds for Meta’s Llama 3.1 405B model; Amazon enhanced its warehouse automation by acquiring Covariant’s model-building talent and tech; and researchers proposed a method to reduce memorization in large language models, addressing privacy concerns.



Subscribe to Data Points

Your accelerated guide to AI news and research