Nemotron models boost Llama’s speed but maintain accuracy NotebookLM “reads” audio and video

Published

Sep 30, 2024

Reading time

3 min read

Twice a week, Data Points brings you the latest AI news, tools, models, and research in brief. In today’s edition, you’ll find:

Hacking ChatGPT’s long-term memory function
U.S. trade commission targets companies who lie about AI
A new OpenAI model for screening text and images
Apple, Meta, and others hold off on AI Pact

But first:

Nemotron models use NAS, distillation to shrink Llama 3.1

NVIDIA created Llama 3.1-Nemotron-51B using Neural Architecture Search (NAS) and knowledge distillation, reducing Meta’s 70 billion parameters to 51 billion. The new model delivers 2.2 times faster inference compared to Llama 3.1-70B while maintaining similar accuracy, and fits on a single NVIDIA H100 GPU. Nemotron achieves 98.21% of Llama’s accuracy on the MMLU benchmark and outperforms it on MT Bench, while processing up to 6,472 tokens per second for text generation compared to base Llama’s 2,975 tokens per second. This methodology may allow AI developers to deploy powerful language models more cost-effectively and expand where and how they can be deployed. (NVIDIA)

NotebookLM uses Gemini to transcribe and summarize multiple media types

Google’s NotebookLM can now import YouTube URLs and audio files as source materials, leveraging Gemini 1.5’s multimodal capabilities to process text, audio, and video. The AI can transcribe audio, analyze video, and extract key information from multiple media formats, enabling users to create comprehensive study guides and parse sources more effectively. Google also introduced a feature that allows users to share NotebookLM’s Audio Overviews directly via public links, streamlining collaboration and knowledge sharing between users. (Google)

New chatbot memory exploit found, patched

Security researcher Johann Rehberger discovered a vulnerability in ChatGPT’s long-term memory feature that allowed attackers to plant false information and exfiltrate user data through indirect prompt injection. The exploit worked by tricking ChatGPT into storing malicious instructions or false information in a user’s long-term memory, which would then be referenced in all future conversations. Rehberger demonstrated the severity of the issue with a proof-of-concept that caused ChatGPT’s macOS app to send all user inputs and AI outputs to an attacker-controlled server. While OpenAI has patched the data exfiltration vector, researchers warn that planting false memories through untrusted content remains possible. (Ars Technica)

U.S. government cracks down on AI scams and fraud

The U.S. Federal Trade Commission took action against five companies for using or selling AI technology in ways that deceive customers. The agency’s “Operation AI Comply” targets businesses that use AI to mislead consumers, with FTC Chair Lina Khan emphasizing that AI companies remain subject to existing laws. The enforcement actions include settlements with companies like Rytr and DoNotPay, which made false claims about AI-powered services, and ongoing cases against three e-commerce businesses that promised unrealistic profits if they used the businesses’ AI tools. (The Hill and FTC)

OpenAI’s new GPT-4 based moderation model

OpenAI released a new AI moderation model called “omni-moderation-latest” that can analyze both text and images for multiple types of harmful content. The model is based on GPT-4 and offers improved accuracy compared to OpenAI’s earlier text-only moderation models, especially for non-English languages. The model also adds new harm categories, including “illicit” content, which covers advice on how to commit wrongdoing, whether or not that wrongdoing is violent. This free update to OpenAI’s Moderation API aims to help developers build safer applications as generated text and image volume grows rapidly. (OpenAI)

100 countries sign Europe’s voluntary AI Pact, but some tech giants will wait and see

The European Commission announced that over 100 companies had signed its AI Pact, an initiative encouraging voluntary pledges on AI development and deployment. The Pact aims to encourage compliance with the EU’s upcoming AI Act through early adoption of its requirements and information-sharing among signatories. While major tech companies like Microsoft and OpenAI have signed on, notable absences include Apple, Meta, NVIDIA, and Anthropic, some of whom have concerns about public scrutiny. (European Commission)

Still want to know more about what matters in AI right now?

Read last week’s issue of The Batch for in-depth analysis of news and research.

Last week, Andrew Ng discussed AI’s transformative potential in education, highlighting Coursera’s generative AI tools and the ongoing need for innovation in the field.

“Given society’s heightened need for education and AI’s potential to transform the field, I feel the opportunities for edtech at this moment are greater than at any moment over the past decade.”

Read Andrew’s full letter here.

Other top AI news and research stories we covered in depth: California passed new laws regulating deepfakes, a local move that could influence national and global legislation; Qwen 2.5 continues the trend of ever-improving open-source large language models; Lionsgate, the studio behind blockbuster franchises like The Hunger Games and John Wick, embraced video generation technology with the help of AI startup Runway; and a robot capable of playing table tennis beat human beginners while entertaining expert players.

Subscribe to Data Points