ChatGPT-maker OpenAI is preparing to release a new AI agent designed to perform tasks autonomously on a user's computer. According to reports, the agent, known internally as "Operator," will automate tasks such as writing code and booking travel. OpenAI plans to launch it in January 2025 as a research preview and to make it accessible to developers through the company's API.
This AI agent is part of OpenAI's broader initiative to develop tools that can handle tasks across the web. The upcoming tool will be able to automate various tasks directly through a web browser, making it a versatile assistant for users. OpenAI CEO Sam Altman recently shared insights into this development, suggesting that "agents" will be the next major breakthrough in AI, with improvements coming in future models.
The release of OpenAI’s AI agent comes at a time when other tech giants, like Google, are working on similar AI-driven tools. Google's "Project Jarvis," a tool powered by its Gemini model, is designed to automate tasks on the web, especially within Google Chrome. Jarvis will be able to interpret screenshots, click buttons, and input text on the user's behalf.
Microsoft, a key investor in OpenAI, has introduced Magentic-One, a framework for coordinating multiple AI agents on complex tasks. The system uses an "Orchestrator" agent that directs a set of specialized agents, each tailored to a specific function such as web browsing, file management, coding, or interacting with computer terminals. By letting the specialized agents work together under the Orchestrator's guidance, the system aims to make task execution more efficient and to simplify complex workflows.
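The orchestrator pattern described above can be illustrated with a minimal Python sketch. It is not Magentic-One's actual API: the `Orchestrator` class, the agent functions, and the plan format are all hypothetical names chosen for illustration, and each "agent" is a plain function standing in for a model-backed component.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


# Hypothetical specialist agents. Each one is just a function from an
# instruction string to a result string; a real agent would call a model.
def web_surfer(instruction: str) -> str:
    return f"[web] {instruction}"


def file_manager(instruction: str) -> str:
    return f"[file] {instruction}"


def coder(instruction: str) -> str:
    return f"[code] {instruction}"


@dataclass
class Orchestrator:
    """Illustrative coordinator that routes each plan step to one specialist."""

    agents: Dict[str, Callable[[str], str]]
    log: List[str] = field(default_factory=list)

    def run(self, plan: List[Tuple[str, str]]) -> List[str]:
        # Each plan entry is (agent_name, instruction).
        for agent_name, instruction in plan:
            agent = self.agents.get(agent_name)
            if agent is None:
                # No specialist registered under this name: record and move on.
                self.log.append(f"[skipped] no agent named {agent_name!r}")
                continue
            self.log.append(agent(instruction))
        return self.log


if __name__ == "__main__":
    orchestrator = Orchestrator({"web": web_surfer, "file": file_manager, "code": coder})
    for line in orchestrator.run([("web", "find flights"), ("code", "write a parser")]):
        print(line)
```

In this sketch the plan is supplied by the caller; in an orchestrated agent system, producing and revising that plan would itself be the Orchestrator's job.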
Anthropic has also entered the AI agent space with its new "computer use" feature, which became available in public beta in October 2024. This capability lets Claude take direct control of a computer and perform actions such as moving the cursor, clicking buttons, and typing text. The feature expands Claude's functionality, giving users a tool for automating computer-based tasks.
The growing interest in AI agents reflects companies’ efforts to find new revenue opportunities beyond traditional language models. According to reports, OpenAI is working on several agent-based projects, and "Operator," a tool designed to automate web browser tasks, is the first to near completion. The move is part of a broader trend of extending AI into more practical, task-oriented applications.
AI agents, such as those being developed by OpenAI and Google, are built on large language models and are capable of performing complex, multi-step tasks with little supervision. These agents go beyond traditional chatbots by not only responding to queries based on training data but also remembering past interactions and making decisions to plan and execute future actions. This allows them to offer more personalized and efficient task management.
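The difference between a chatbot and an agent, as described above, comes down to a loop that remembers what has been done and decides what to do next. The Python sketch below illustrates that loop under simplifying assumptions: the `SimpleAgent` class and its methods are hypothetical, the task arrives already broken into steps, and a simple rule stands in for the language model that would normally do the planning.

```python
from typing import List, Optional


class SimpleAgent:
    """Illustrative agent loop: remember past actions, pick the next one, act."""

    def __init__(self, steps: List[str]) -> None:
        self.steps = steps            # the multi-step task, already broken down
        self.memory: List[str] = []   # record of steps completed so far

    def next_step(self) -> Optional[str]:
        # "Planning" stand-in: choose the first step not yet in memory.
        for step in self.steps:
            if step not in self.memory:
                return step
        return None

    def act(self, step: str) -> str:
        # "Execution" stand-in: a real agent would click, type, or call a tool.
        self.memory.append(step)
        return f"done: {step}"

    def run(self, max_iterations: int = 10) -> List[str]:
        results: List[str] = []
        for _ in range(max_iterations):
            step = self.next_step()
            if step is None:
                break  # nothing left to do
            results.append(self.act(step))
        return results


if __name__ == "__main__":
    agent = SimpleAgent(["search flights", "compare prices", "book ticket"])
    print(agent.run())
```

Because memory persists on the object, calling `run()` a second time does no further work, which is the behavior that separates this loop from a stateless question-and-answer exchange.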