SWE-Kit helps developers build their own assistants New Tencent open model outperforms Llama 3 405B

Published

Nov 8, 2024

Reading time

3 min read

Twice a week, Data Points brings you the latest AI news, tools, models, and research in brief. In today’s edition, you’ll find:

Oasis builds interactive Minecraft-style games in real-time
Microsoft releases system to coordinate AI agents
OpenAI’s new predicted outputs feature speeds up generation
GitHub launches Spark, a platform to build and host micro-apps

But first:

An open-source toolkit to create custom AI coding agents

Composio launched SWE-Kit, an open-source, customizable toolkit for building AI-powered coding agents that can handle pull requests, code analysis, or other elements of software development. The toolkit supports various language models, can integrate with different agentic frameworks, and includes features like Code RAG, Code Analyser, and Code LSP for seamless codebase interaction. SWE-Kit’s flexibility and ease of use, along with its ability to run in a Docker container, make it an attractive option for developers looking to create or customize AI coding assistants. (Composio)

Tencent unveils largest open mixture-of-experts AI model

Tencent released Hunyuan-Large, an open-weights AI model with 389 billion total parameters and 52 billion active parameters. The model outperforms similar-sized models on benchmarks like MMLU and MATH, demonstrating improved understanding and reasoning capabilities across various tasks. Tencent’s release aims to advance AI technology, but the model’s license restricts usage for EU users and large companies, and it has limitations on China-sensitive topics. (Hugging Face and arXiv)

Generative AI model creates interactive video games on the fly

Oasis, a new AI system from Etched and Decart, generates Minecraft-style open-world games that respond to keyboard and mouse inputs in real-time. The system currently runs at low resolution on powerful graphics cards, but its creators plan to use specialized chips to deliver high-quality video to many users simultaneously. Etched predicts AI-generated interactive video will become a major part of internet content within ten years. (Etched)

Microsoft tackles problems with teams of specialized agents

The company released Magentic-One, an open-source multi-agent system designed to handle tasks requiring complex coordination. The system employs an Orchestrator agent directing four specialized agents to perform tasks like web browsing and coding. Magentic-One matches top performers on multiple benchmarks and offers advantages over single-agent systems due to its modular design. (Microsoft)

Predicted Outputs can speed up LLM responses for minor changes

OpenAI introduced a feature called Predicted Outputs that can significantly reduce latency when making small changes to existing text or code. The feature allows developers to pass in existing content as a prediction, which is particularly useful for tasks like refactoring code with small modifications. Predicted Outputs are currently only supported by GPT-4o and GPT-4o-mini models and have some limitations, including incompatibility with certain API parameters like function calling and multiple completions. (OpenAI)

GitHub’s Spark lets users build custom apps without coding

GitHub introduced Spark, a platform that enables users to create small, personalized applications without coding. The system uses AI to convert natural language descriptions into functional apps, which can be accessed on desktop and mobile devices. Spark includes a managed runtime environment for app hosting, data storage, and AI model integration. Users can share their creations, allowing others to use or modify them, potentially fostering a community of personalized app developers. (GitHub)

Still want to know more about what matters in AI right now?

Read this week’s issue of The Batch for in-depth analysis of news and research.

This week, Andrew Ng reflected on the role of social media manipulation in recent elections, emphasizing that generative AI likely wasn’t the primary tool for spreading disinformation.

“Everyone has a role to play in protecting democracy, and in tech, part of our duty will be to make sure social media platforms are fair and defend them against manipulation by those who seek to undermine democracy.”

Read Andrew’s full letter here.

Other top AI news and research stories we covered in depth: Anthropic empowers Claude Sonnet 3.5 to operate desktop apps, with safety and security warnings; automation transforms U.S. shipping ports, heightening labor tensions as robots take on more tasks on the loading docks; a new study, COMPL-AI, assesses large language models’ compliance with the EU’s AI Act; and OpenAI’s MLE-bench introduces a new way to test AI coding agents by having them train algorithms.

Subscribe to Data Points