Serverless GPU Inference, Plus Fine-Tuning and Custom Models by OpenAI

Published

Apr 10, 2024

This week's top AI news and research stories featured the proliferation of coding agents, a study showing the most common uses for generative AI, the instability of Stability AI, and a transformer alternative called Mamba. But first:

Hugging Face and Cloudflare launch serverless GPU inference for open AI models
"Deploy on Cloudflare Workers AI" allows developers to build generative AI applications without the overhead of managing GPU infrastructure, reducing operating costs with a pay-per-use model. Models supported include Mistral 7B, Gemma 7B, Llama 2 13B, and DeepSeek Coder 6.7B, plus specialized varieties. (Read more at Hugging Face’s blog)

OpenAI updates its fine-tuning API and expands its Custom Models program
Key API updates for GPT-3.5 Turbo (and in early experimental access, GPT-4) include features for better training control, such as epoch-based checkpoints, a comparative UI for model evaluation, and comprehensive validation metrics. Additionally, the Custom Models program offers assisted fine-tuning, supporting organizations in deploying advanced model refinements for their unique needs. (Find the details at OpenAI’s blog)

Big Tech's hunt for AI training data stirs privacy concerns
Big technology companies are eagerly acquiring vast amounts of training data, reigniting interest in once-dominant platforms like Photobucket. Photobucket, now with just 2 million active users, is negotiating to license its content for model training, potentially earning billions in revenue. The market for "ethically sourced" training data, valued at roughly $2.5 billion and expected to grow to $30 billion in a decade, is not without controversy. Concerns are mounting over the privacy implications of repurposing personal data for AI training without explicit consent. (Read the news at Reuters)

Startup Hume introduces Empathic Voice Interface (EVI), an AI promising emotional intelligence
EVI claims to process the subtleties of human speech, such as tune, rhythm, and timbre, enabling it to generate empathic responses with an appropriate tone of voice. Developers are offered a comprehensive suite of tools for easy integration, including a WebSocket API, REST API, and SDKs for both TypeScript and Python, along with open source examples and a web widget. EVI's capabilities include advanced speech transcription, language response generation, expressive text-to-speech modeling, and empathic response to user expressions. (Learn more at Hume’s blog)

DALL·E now offers advanced editing tools for precision image customization
Users can now refine their generated images by selecting specific areas for editing and inputting descriptive changes via chat or a conversation panel. This enhancement includes the option to add new elements, remove unwanted objects, and alter characteristics of existing parts of an image, like changing an expression or converting an image to black and white. (Read more at OpenAI’s blog)

AI-generated book ads overrun Amazon Kindle lock screens
This shift marks a departure from the previously diverse recommendations that aligned with users' reading preferences. While Kindle's ad-supported model offers a discount on the device's purchase price in exchange for displaying ads, the recent surge of low-quality, AI-generated content, including dubious imitations of existing works, has led to user discontent. Some irrelevant and unpopular titles have raised questions about Amazon's ad selection algorithms and potential experimental promotion of AI-generated content. (Read the story at Futurism)

Cohere launches Command R+, a large language model optimized for enterprise
Command R+ performs similarly to top models on benchmark tests at costs lower than its top competitors. Cohere’s models offer a 128k-token context window and tools to optimize retrieval-augmented generation (RAG). In partnership with Microsoft Azure, Command R+ targets AI adoption in enterprise, serving a range of functions from customer relationship management to multilingual communication. Command R+ is now available on Azure and will soon be available on other platforms. (Read all the details about Command R+ at Cohere’s blog)

Claude introduces “Tool Use” for function calling
Anthropic introduces a beta feature for its Claude models, accessible via the Anthropic Messages API. Tool use allows users to enhance Claude's capabilities by connecting to external tools for real-time information retrieval and integration of third-party functionalities. This feature enables users to perform detailed, multi-step operations and execute complex commands with minimal coding. (Learn more at Anthropic’s blog)
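
To make the idea of tool use concrete, here is a minimal Python sketch of the application side of the exchange. It makes no network calls and does not use Anthropic's SDK. The tool (get_weather), its canned data, the registry, and the simulated model request are all hypothetical and exist only for illustration. The dictionary shapes (a tool described by name, description, and input_schema; a tool_use request answered with a tool_result) follow the general pattern of the beta as we understand it and should be checked against Anthropic's documentation.

```python
import json

# Hypothetical tool description. The name/description/input_schema layout is
# an assumption based on the tool-use beta; the tool itself is made up.
WEATHER_TOOL = {
    "name": "get_weather",
    "description": "Return the current temperature for a city.",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}


def get_weather(city: str) -> str:
    """Stand-in for a real weather lookup; returns canned data."""
    fake_data = {"Paris": "18 C", "Tokyo": "22 C"}
    return fake_data.get(city, "unknown")


# Maps tool names the model may request to local Python functions.
TOOL_REGISTRY = {"get_weather": get_weather}


def handle_tool_use(block: dict) -> dict:
    """Run the requested tool and wrap its output as a tool_result block."""
    func = TOOL_REGISTRY[block["name"]]
    output = func(**block["input"])
    return {
        "type": "tool_result",
        "tool_use_id": block["id"],
        "content": output,
    }


# Simulated request of the kind a model might emit; not real API output.
simulated_block = {
    "type": "tool_use",
    "id": "toolu_example",
    "name": "get_weather",
    "input": {"city": "Paris"},
}

result = handle_tool_use(simulated_block)
print(json.dumps(result))
```

In a real application, the tool description would be sent along with the user's message, and the tool_result would be returned to the model in a follow-up message so it can compose its final answer.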

UK and U.S. forge alliance on AI safety
The two nations signed a Memorandum of Understanding (MOU), setting the stage for a collaborative effort in developing advanced testing protocols for AI models. This partnership, endorsed by the UK's Technology Secretary Michelle Donelan and U.S. Commerce Secretary Gina Raimondo, arises from pledges made during the AI Safety Summit in November 2023. Both nations aim to share resources and personnel to expedite the creation of comprehensive evaluation suites for AI technologies. (Read the UK government’s press release)

Yahoo acquires AI news app Artifact
Artifact struggled to scale its user base, but its underlying AI technology, designed to curate and personalize news content, attracted Yahoo's interest. The acquisition aims to leverage Artifact's sophisticated content taxonomy and recommendation systems, boosting Yahoo News's personalization capabilities for its 185 million monthly visitors. While Artifact as an app will be discontinued, its technology is expected to impact Yahoo News and potentially other Yahoo platforms. (Read more at The Verge)

Washington judge rejects AI-enhanced video evidence in a homicide case
The judge cited concerns over the technology's transparency and potential to confuse jury members. The contested video, intended to bolster the defense of accused Joshua Puloka by enhancing cellphone footage of the 2021 shooting incident, was critiqued for altering original video data, raising questions about its reliability and accuracy. (Read the news at NBC News)

American Federation of Musicians (AFM) secures contract including AI protections
The deal includes compensation and other provisions for AI-generated music. Musicians whose work is utilized to prompt AI systems will receive enhanced compensation. Proponents say it is an important step in acknowledging the value and persistence of human creativity despite technological transformations. (Learn more at Variety)

Amazon Web Services (AWS) boosts startup support with free credits for AI model usage
AWS will offer up to $500,000 in credits to the latest Y Combinator startup cohort. This initiative is part of Amazon's effort to foster the startup ecosystem and encourage the choice of AWS for cloud services. Model providers covered include Anthropic, Meta, Mistral AI, and Cohere. The announcement comes on the heels of Amazon completing a significant $4 billion investment in Anthropic, solidifying a partnership wherein Anthropic will prioritize AWS for cloud services. (Read more details at Reuters)

Yum Brands, the parent company of Taco Bell, Pizza Hut, and KFC, embraces AI
The company advances toward an "AI-first" approach in its operations. The SuperApp, a tool designed to aid restaurant managers in operational tasks, is undergoing enhancements with generative AI. Additionally, Yum is exploring customer-facing AI applications, such as AI-driven voice ordering and image-recognition for drive-through optimization. (Read the report at The Wall Street Journal)

Chatbots outperform humans in persuasive debates, study finds
A study conducted by the Swiss Federal Institute of Technology in Lausanne showed that chatbots, specifically those powered by GPT-4, are more effective at persuading people in debates than human counterparts. In experiments involving 820 participants, individuals engaged in debates on various topics with either a human or a GPT-4-powered chatbot. The results showed an 81.7 percent higher likelihood of participants being swayed by the chatbots’ arguments when the AI had access to personal information from questionnaires. (Read more at New Scientist)

Microsoft and OpenAI plan to build Stargate, a $100 billion server farm and supercomputer
Sources say the planned data center will be multiple times larger and more powerful than Azure’s current units powering OpenAI’s models. Such a supercomputer center would cost twice what Microsoft has spent this year for all its capital expenditures for servers, buildings, and other equipment. A computer and server farm this large could require as much as 5 gigawatts of power, posing significant energy costs along with the need for AI chips. If approved, Stargate could launch as soon as 2028; smaller AI-dedicated data centers in Wisconsin are projected to launch in 2026. (Check out the report in The Information)
