Articles
Reading Between the Pixels: Failure Modes in Vision Language Models
6 min read
This post is Part 2 of a two-part series on multimodal typographic attacks. In Part 1 of “Reading Between the Pixels,” we demonstrated that text–image embedding distance correlates with typographic prompt injection success: conditions that push....
Defining Model Provenance: A Constitution for AI Supply Chain Safety and Security
5 min read
When it comes to AI models, one of the hardest questions to answer is deceptively simple: where did this model actually come from? We addressed part of this problem with Model Provenance Kit, an open-source tool that fingerprints models at the.....
Introducing Model Provenance Kit: Know Where Your AI Models Come From
7 min read
The importance of understanding a model’s origins has been a frequent topic of discussion among researchers and industry experts, and our own AI research confirms that AI supply chain security remains a weak link. Tracking where models come from....
Reading Between the Pixels: Assessing Prompt Injection Attack Success in Images
6 min read
This post is Part 1 of a two-part series on multimodal typographic attacks. This blog was written in collaboration between Ravi Balakrishnan, Amy Chang, Sanket Mendapara, and Ankit Garg. Modern generative AI models and agents increasingly treat...
Introducing the Cisco LLM Security Leaderboard: Bringing Transparency to AI Security
4 min read
Today, Cisco launched the LLM Security Leaderboard, a comprehensive resource for evaluating model risk and susceptibility to adversarial attacks. By providing transparent, adversarial evaluation signals, this leaderboard contextualizes model performance metrics against evaluations of how models handle malicious prompts, jailbreak attempts, and other manipulation strategies. The tool empowers organizations with a clear, objective understanding of model risk by mapping threats to our AI Safety and Security Framework taxonomy, and informs defense-in-depth approaches to AI deployments.
Identifying and remediating a persistent memory compromise in Claude Code
4 min read
We recently discovered a method to compromise Claude Code’s memory and maintain persistence beyond our immediate session into every project, every session, and even after reboots. In this post, we’ll break down how we were able to poison an AI.....
Cisco explores the expanding threat landscape of AI security for 2026 with its latest annual report
3 min read
Thank you to all of the contributors of the State of AI Security 2026, including Amy Chang, Tiffany Saade, Emile Antone, and the broader Cisco AI research team. As artificial intelligence (AI) technology and enterprise AI adoption advance at a rapid pace, the security landscape around it is expanding faster, leaving many defenders struggling to keep […]
AIUC-1 operationalizes Cisco’s AI Security Framework
1 min read
This blog is jointly written by Amy Chang, Hyrum Anderson, Rajiv Dattani, and Rune Kvist. We are excited to announce Cisco as a technical contributor to AIUC-1. The standard will operationalize Cisco’s Integrated AI Security and Safety Framework (AI Security Framework), enabling more secure AI adoption. AI risks are no longer theoretical. We have seen […]
Personal AI Agents like OpenClaw Are a Security Nightmare
4 min read
This blog is written in collaboration by Amy Chang, Vineeth Sai Narajala, and Idan Habler Over the past few weeks, Clawdbot (then renamed Moltbot, later renamed OpenClaw) has achieved virality as an open source, self-hosted personal AI assistant agent that runs locally and executes actions on the user’s behalf. The bot’s explosive rise is driven by […]
- 1
- 2