News

This wasn't a one-off. In a text-only version of the same test, Claude Opus 4 chose blackmail 96 percent of the time. Google's Gemini 2.5 Flash nearly matched that rate. OpenAI's GPT-4.1 and xAI's ...
Elon Musk has announced plans to retrain Grok with statements he calls "politically incorrect, but nonetheless factually true," claiming this will correct and expand all human knowledge. Previously, ...
11ai also supports custom MCP servers. Teams can connect internal tools or specialized software to 11ai through their own MCP servers, extending the assistant’s functionality to fit their workflows.
Apple executives have been talking internally about potentially buying AI startup Perplexity AI, according to a Bloomberg report. The idea is to grab both the technology and talent for Apple's own ...
Google has handed over the Agent2Agent (A2A) protocol to a new open source project led by the Linux Foundation with the aim of creating a uniform communication standard for AI agents from different ...
The assistant is powered by a new language model called "Mu," which Microsoft developed specifically for the task. Mu has 330 million parameters and uses an encoder-decoder architecture that, ...
Snake training outperforms math datasets in some areas: Training on Snake and rotation problems nudged the base model slightly ahead of MM-Eureka-Qwen-7B, a model specifically trained on math data, ...
MiniMax says it's working to improve generation speed and stability and to add new features beyond the current text-to-video and image-to-video options. Competing platforms like Runway already offer more ...
Mark Zuckerberg is personally setting up a new team of 50 experts, known as the "Superintelligence Group", to help Meta catch up in AI development. He is conducting the personnel interviews ...
Salesforce has launched CRMArena-Pro, a benchmark designed to evaluate AI agents in practical business situations, including multi-step conversations and data protection checks within CRM systems.
LAION and Intel have released Empathic-Insight, a suite of models and datasets that can analyze facial images and audio files across 40 emotion categories, covering not only emotional but also ...
The RELIC test works by giving an AI a formal grammar - essentially a precise rule set that defines an artificial language - along with a string of symbols. The model must then decide whether the ...
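The excerpt above is cut off, so what exactly the model must decide is not stated here. As a minimal sketch, assuming the task is to judge whether the grammar can generate the given string, the check below shows how such a membership question can be answered mechanically. The toy grammar, the rule tables, and the function name `generates` are hypothetical illustrations and are not taken from RELIC; the algorithm is the standard CYK procedure for grammars in Chomsky normal form.

```python
from itertools import product

# Hypothetical toy grammar in Chomsky normal form (not from RELIC).
# S -> A B | A T,  T -> S B,  A -> 'a',  B -> 'b'
# It generates strings of n 'a's followed by n 'b's, for n >= 1.
BINARY_RULES = {
    ("A", "B"): {"S"},
    ("A", "T"): {"S"},
    ("S", "B"): {"T"},
}
TERMINAL_RULES = {
    "a": {"A"},
    "b": {"B"},
}


def generates(symbols, start="S"):
    """Return True if the toy grammar derives `symbols` from `start` (CYK)."""
    n = len(symbols)
    if n == 0:
        return False  # this grammar has no rule producing the empty string

    # table[i][k] holds the nonterminals deriving symbols[i : i + k + 1]
    table = [[set() for _ in range(n)] for _ in range(n)]

    # Spans of length 1 come directly from the terminal rules.
    for i, sym in enumerate(symbols):
        table[i][0] = set(TERMINAL_RULES.get(sym, ()))

    # Longer spans are built by trying every split into two shorter spans.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[i][split - 1]
                right = table[i + split][length - split - 1]
                for pair in product(left, right):
                    table[i][length - 1] |= BINARY_RULES.get(pair, set())

    return start in table[0][n - 1]


if __name__ == "__main__":
    for candidate in ["ab", "aabb", "aab", "ba"]:
        print(candidate, generates(candidate))
```

A benchmark of this kind could compare a model's yes/no answer against a reference checker like this one; whether RELIC does so in this way is not stated in the excerpt.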