Friday 24 Jan 2025

(Jan 24): OpenAI is rolling out an artificial intelligence (AI) tool that can help book flights, plan grocery orders, and even complete purchases for users, joining a growing number of tech companies betting on so-called AI agents that act on a person’s behalf. 

The service, called 'Operator', can carry out a wide range of tasks by using the internet much in the way a human would, including navigating to a website, typing and clicking buttons, OpenAI said on Thursday. Operator’s software works by combining some of OpenAI’s computer-vision features with multi-step problem-solving capabilities meant to mimic how people reason, the company said. Bloomberg News first reported on OpenAI’s plans for Operator in November. 

Initially, OpenAI is releasing what it calls a “research preview” of Operator online to a limited number of US customers who pay US$200 (RM884.60) per month for the recently introduced ChatGPT Pro subscription. The company said it hopes to learn from Operator’s early users so it can improve the product, and plans to offer it to more paid customers over time. It also plans to release additional AI agents in the coming months, OpenAI chief executive officer Sam Altman said during a livestream introducing the product.

The Operator roll-out is part of a broader industry push towards agents, or AI software that can complete multi-step tasks for users with minimal supervision. OpenAI-backer Microsoft Corp and rival Anthropic have launched their own takes on agent software, as have a number of other start-ups. The companies hope such tools can save users time with their personal and professional tasks, and thereby live up to the long-held promise that AI will make people more productive. Altman previously said agents will be “the next giant breakthrough” for AI.

In a demonstration of the tool on Wednesday, Peter Welinder, OpenAI’s vice-president of product, and Yash Kumar, who leads product and engineering for Operator, showed how the tool could look for a restaurant reservation, or recognise the items on a handwritten list to prepare an online grocery order. Kumar said OpenAI partnered with a number of companies on the tool, including Instacart, OpenTable, Uber and StubHub, in part to ensure Operator works well on their websites.

After Kumar prompted Operator to use OpenTable to book a table at San Francisco restaurant Beretta around 7pm, the tool opened a remote browser window, went to OpenTable’s website and searched for the restaurant — but couldn’t initially find it. As it turned out, OpenTable was set to search for restaurants in Iowa, not California. But Kumar had previously instructed Operator to search within a certain San Francisco ZIP code for relevant queries, so on its own, the tool switched to searching OpenTable in San Francisco, and then shared a reservation for him to approve.

“We see a lot of potential for how this can evolve from small things to medium things to large things,” Kumar told Bloomberg News. He personally has been using Operator to do his grocery shopping, and to book reservations at tennis courts.

Operator can also carry out multiple tasks simultaneously. For example, a user might prompt the service to find a hotel in Vancouver that has Peloton bikes in its gym and then, before it has finished, ask Operator to find an American Girl doll bed on Craigslist. As long as a user has confirmed that Operator can carry out a transaction — such as buying a pair of leggings from an online store — and has input any required credentials and payment information, the tool should be able to complete a purchase, Kumar said.

OpenAI plans to eventually release the AI model behind Operator for developers to use to build their own agents.

With those capabilities, however, AI agents like Operator also present new safety and security risks, given the potential for AI to make mistakes or be misused by bad actors. It’s one thing for a chatbot to spit out an inaccurate response to a question about a historical event and quite another for an agent to make mistakes with someone’s credit card.

OpenAI said Operator is meant to turn down some tasks, such as actions involving banking and anything the company considers harmful. There are a number of actions the tool will not complete, and will instead alert a user to carry out, including logging in to websites, providing payment information and filling out CAPTCHAs, the company said. Operator should ask a user for approval before it does things like ordering something online. For some tasks, such as writing emails, the service will require a user to supervise it, OpenAI said. Users can also take control of any task that is in progress and pause it if need be.

“The user should always feel they are in control,” Kumar said.

Uploaded by Tham Yek Lee
