Anthropic’s new AI model can control your computer

Anthropic’s latest AI model, out now, can control your computer – “looking at a screen, moving a cursor, clicking, and typing text.”

The capability is available in the company’s upgraded Claude 3.5 Sonnet and new model, Claude 3.5 Haiku. It represents what Anthropic said was the “first frontier AI model to offer computer use in public beta.”

“We've built an API that allows Claude to perceive and interact with computer interfaces. This API enables Claude to translate prompts into computer commands. Developers can use it to automate repetitive tasks, conduct testing and QA, and perform open-ended research” the firm said.

See also: No LLMs aren’t about to “autonomously” hack your company

Initial behaviour is buggy, Anthropic cautioned today: “Claude 3.5 Sonnet's current ability to use computers is imperfect. Some actions that people perform effortlessly—scrolling, dragging, zooming—currently present challenges. So we encourage exploration with low-risk tasks.”

The company added in a pair of write-ups today: “On OSWorld, which evaluates AI models' ability to use computers like people do, Claude 3.5 Sonnet scored 14.9% in the screenshot-only category… nowhere near human-level skill (which is generally 70-75%), but it’s far higher than the 7.7% obtained by the next-best AI model in the same category.

“Since Claude can interpret screenshots from computers connected to the internet, it’s possible that it may be exposed to content that includes prompt injection attacks” it warned, sharing a reference implementation.

We're trying something fundamentally new.

Instead of making specific tools to help Claude complete individual tasks, we're teaching it general computer skills—allowing it to use a wide range of standard tools and software programs designed for people. pic.twitter.com/42u8VeTvXd
— Anthropic (@AnthropicAI) October 22, 2024

"Anthropic AI’s (Claude Sonnet 3.5) Computer Use is nuts. I’m going to mute the part of my brain that immediately jumps into security-dork threat modeling mode and instead, step back and realize just how natural an extension of the GenAI technologies this is and what this will mean. This is pretty amazing" commented Last Pass's Chief Secure Technology Oficer Chriofer Hoff: "The Security Dork part of my brain is screaming at me now, demanding proof-of-life that I, myself, have not been subsumed into the AI hype matrix…Can you imagine when this is available beyond just an API and the capability is integrated into the operating system?!"

"I just built [a] VM using the replit interface on my phone and instructed [
Anthropic's] Computer Use to scrape the presenters and agenda at ONUG today in NY and output the following result as json. Then trained my voice assistant with the json and can now create custom agendas based on interest, get a primer on the topics with questions to ask presenters etc." commented one IBM VP, James Walker, on LinkedIn. "It took about 10 minutes using my iPhone whilst multitasking...

"It’s a trivial use case but the ease of use, ability to fix problems, navigate websites etc is mind blowing. Only problem is that it is super heavy on tokens and I burned through my entitlement and got rate limited."

Haiku will be available via API, Bedrock, Vertex

Claude 3.5 Haiku meanwhile is the next generation of its fastest model and outperforms GPT-4o and the original Claude 3.5 Sonnet.

It will be released later this month said Anthropic and be available via first-party API, Amazon Bedrock, and Google Cloud’s Vertex AI – initially, it added, as a text-only model and with image input to follow.

Pre-deployment testing was conducted by the US AI Safety Institute (US AISI) and the UK Safety Institute (UK AISI) the company added.

More detail and reaction to follow. What are your views on the potential uses here? Early testers, Red Teamers,tinkerers, share thoughts

In other AI news, Radiohead singer Thom Yorke and author Kazuo Ishiguro have joined over 10,000 signatories to a campaign urging a ban on the “unlicensed use of creative works for training generative AI.”

British composer Ed Newton-Rex started the campaign.

He told The Guardian: “There are three key resources that generative AI companies need to build AI models: people, compute, and data.

“They spend vast sums on the first two – sometimes a million dollars per engineer, and up to a billion dollars per model.

“But they expect to take the third – training data – for free."

Anthropic is among those named in the many class action suits against AI companies on this issue; law firm Mishcon De Raya has a tracker here.