Skip to content

Search the site

Jailbreaking LLMs and abusing Copilot to “live off the land” of M365

Prompt injections to break safeguards on widely available LLMs meanwhile are also widely available.

jailbreaking llms lolcopilot

A team of security researchers have released an offensive security tool that allows users to abuse Copilot to” live-off-the-land” of Microsoft 365.

It can be used to custom-crafting spear-phishing emails in the compromised users’ style and exfiltrating data without their knowing.

Researchers at security startup Zenity showcased the “LOLCopilot” and its capabilities at the annual Black Hat security conference this week – it works with the default configuration in any M365 Copilot-enabled tenant.

They posted it on GitHub as part of the “Power-Pwn” offensive security kit; adding it to a growing plethora of freely available tools for attackers to use against AI-augmented environments or LLMs themselves – see, for example, Teamsphisher, which lets attackers use Teams to send a message across tenants to source sensitive information or execute malware.

Other hyperscaler products have also been found exploitable via prompt injection in recent months, including Google's AI studio.

Zenity CTO Michael Bargury and team demonstrated a range of attacks: how to use it to automate data gathering, like finding an O365 users’ top collaborators, find passwords, and even manipulate banking information. 

Using  prompt injection attacks, his team demonstrated how an attacker can take over Copilot remotely and get it to act as a “malicious insider.” 

Such an attacker can, for example, “tell Copilot to go to any site we wish (as long as it appears on Bing) and fetch [a watering hole-style backlink] back to present to the user [following a] completely innocent question.”

Bargur, a former senior security architect in Microsoft's Azure Security CTO office and now a project leader for the OWASP Low-Code/No-Code Top 10 Security Risks project as well as CTO of the startup, further demonstrated how an attacker could quietly insert an HTML tag into an email to replace a bank account number with that of the attacker.

"I can enter your conversation from the outside and take full control of all of the actions that the copilot does on your behalf and its input… Therefore, I'm saying this is the equivalent of remote code execution in the world of LLM apps” Bargury told Dark Reading’s Jeffrey Schwartz

See also: Meta’s new “CRAG” benchmark exposes depth of RAG challenge

Azure’s CTO Mark Russinovich earlier detailed Microsoft’s ongoing work on AI security here. Industry-wide efforts to harden LLMs against prompt injection and other attacks continues, but bypasses are easy to find. 

Anthropic says of its new Claude models, for example, that they were “developed be as trustworthy as they are capable. We have several dedicated teams that track and mitigate a broad spectrum of risks, ranging from misinformation and CSAM to biological misuse…”

But with a single (superficially incomprehensible) prompt injection found on social media The Stack could trigger the freely available Claude 3.5 sonnet model to spew out X posts for inciting racial hatred. We won't share the prompt nor the outputs. (Those with a robust appetite for filth wanting to see examples of an LLM gone rogue may be amused by outputs triggered by a Red Teamer and prolific LLM-breaker “Pliny the Prompter; they may also be offended.)

Need some help Red Teaming? Microsoft, which despite these issues with Copilot, has arguably been ahead of the curve on LLM security, has newly released a “Python Risk Identification Tool for generative AI” (PyRIT) – an “open access automation framework to empower security professionals and machine learning engineers to proactively find risks in their generative AI systems” which is available on GitHub under an MIT licence. 

As AI robotics company "Figure" meanwhile showcased a humanoid robot it dubbed "the world's most advanced AI hardware", "Pliny" commented: "We are one punched baby away from a global regulatory crackdown on this entire industry. Is it wise to embody jailbreakable LLMs given the current state of adversarial robustness? I sure hope the red teaming process is more extensive than 'OpenAI already tested that model and said we’re Zenitygood, ship it!' The stakes are in the stratosphere."

See also: Uber’s massive tech overhaul continues amid profit boost

Latest