
Artificial Intelligence is no longer science fiction. AI tools such as OpenAI’s ChatGPT and GitHub’s Copilot are taking the world by storm. Employees are using them for everything from writing emails to proofreading reports to software development. AI tools generally come in two flavors. The first is a Q&A style where a user submits a “prompt” and gets a response (e.g., ChatGPT). The second offers autocomplete functionality that behaves like the predictive text on a phone (e.g., Copilot). While these new technologies are quite capable, they are evolving rapidly and are introducing new risks that organizations need to consider and manage.

A Dream Come True?

Let’s imagine that you are an employee in a company’s audit department. One of your recurring tasks is to run some database queries and put the results in an Excel spreadsheet. You decide to automate this task, but you don’t know how, so you ask an AI for help.

Asking OpenAI’s ChatGPT whether it can offer job automation advice.

The AI asks for the details of the job so it can give you some tips, which you provide.

Asking ChatGPT to help automate the creation of a spreadsheet using database content.

You quickly get a recommendation to use the Python programming language to connect to the database and do the work for you. You follow the recommendation and install Python on your work computer, but because you don’t know programming, you ask the AI to write the code for you.

Asking the AI to provide the Python programming code.

It is happy to do so and quickly generates code that you download to your work computer and begin to use. In ten minutes, you’ve become a developer and automated a task that previously took you several hours a week. Perhaps you will keep this new tool to yourself; you wouldn’t want your boss to fill up your newfound free time with more responsibilities.

The Reality

Now imagine you are a security stakeholder at the same company, and you are trying to understand the risks of AI technology use by employees. You now have employees with no developer training or programming experience installing developer tools. They are sharing confidential information with an uncontrolled cloud service, copying code from the Internet, and allowing internet-sourced code to communicate with your production databases. Since these individuals don’t have any development experience, they can’t understand what their code is doing, let alone apply any of your organization’s software policies and procedures. They certainly won’t be able to find any security vulnerabilities in the code. Additionally, you probably won’t be aware that this new code is running in your environment, and you won’t know where to find it for review. Software and dependency upgrades are also very unlikely, since that employee won’t understand the risks that outdated software poses.

4 Risks of AI

The risks identified in the scenario above can be simplified to two core issues:

  1. Users are running untrusted code on a corporate network while evading security controls and review.
  2. Users are sending confidential information to an untrusted third party.

These concerns extend beyond AI-assisted programming. When employees send business data to an AI, such as the context needed to help write an email or the contents of a sensitive report that needs review, confidential data might be leaked. Employees will also use these AI tools to generate unsanctioned document templates, spreadsheet formulas, and other potentially flawed content. These risks must be addressed before organizations can safely use AI. Here is a breakdown of the top risks:

1. You don’t control the service

Today’s popular tools are third-party services operated by the AI’s maintainers, and they should be treated like any other untrusted external service. Unless specific business agreements are in place with these organizations, they can access and use all data sent to them. Future versions of the AI may even be trained on this data, exposing it to additional parties. Further, vulnerabilities in the AI or data breaches at its maintainers can give malicious actors access to your data. This has already happened: a bug in ChatGPT leaked user data, and Samsung employees exposed sensitive data through the tool.

2. You can’t (fully) control its usage

While organizations have many ways to limit which websites and programs employees use on their work devices, personal devices are not so easily restricted. If employees access these tools from unmanaged personal devices on their home networks, it will be very difficult, or even impossible, to reliably block access.

3. AI generated content can contain flaws and vulnerabilities

Creators of these AI tools typically go to great lengths to make them accurate and unbiased; however, there is no guarantee that their efforts are completely successful. This means that any output from an AI should be reviewed and verified. The reason people often fail to do so is the bespoke nature of the AI’s responses: it uses the context of your conversation to make each response seem written just for you.

It’s difficult for humans to avoid creating bugs when writing software, and AI-written code can contain the same kinds of bugs. These bugs often introduce vulnerabilities that are exploitable by malicious actors. This is true even if the user is savvy enough to ask the AI to find vulnerabilities in the code.

A breakdown of the AI-generated code highlighting two anti-patterns that tend to cause security vulnerabilities. The first anti-pattern is hardcoded credentials, and the second is raw database queries.
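
To make those anti-patterns concrete, here is a hedged sketch of the kind of script the earlier scenario might produce. The libraries, connection string, table name, and credentials are illustrative assumptions rather than actual ChatGPT output, but both anti-patterns appear in it.

```python
# Hypothetical sketch of the kind of script an AI assistant might produce
# for the "database query to Excel" task above. Server, database, table,
# and credential values are illustrative placeholders, not real systems.
import pyodbc
from openpyxl import Workbook

# Anti-pattern 1: credentials hardcoded directly in the source.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=finance-db.internal;DATABASE=audit;"
    "UID=audit_user;PWD=SuperSecret123"
)

# Anti-pattern 2: a raw query string executed straight against production.
cursor = conn.cursor()
cursor.execute("SELECT * FROM quarterly_transactions")

# Dump every row into a spreadsheet.
wb = Workbook()
ws = wb.active
ws.append([column[0] for column in cursor.description])  # header row
for row in cursor.fetchall():
    ws.append(list(row))
wb.save("audit_report.xlsx")
```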

One example of a common AI-introduced vulnerability is hardcoded credentials. This flaw is not limited to AI; it is also one of the most common among human-authored code. Since an AI doesn’t understand a specific organization’s environment and policies, it won’t know how to follow best practices unless specifically asked to implement them. To continue the hardcoded credentials example, an AI won’t factor in that an organization uses a service to manage secrets such as passwords. And even if it is told to write code that works with a secret management system, it wouldn’t be wise to provide that system’s configuration details to a third-party service.
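
As a point of comparison, here is a minimal sketch of the same connection with the credentials pulled from the environment instead of the source code. The variable names are illustrative, and nothing about your secrets infrastructure needs to be shared with the AI to make this change.

```python
# A safer variant of the same connection: the secrets live outside the
# source code and are read from the environment at runtime (populated by
# your secrets manager or deployment tooling). Variable names are
# illustrative.
import os
import pyodbc

db_user = os.environ["AUDIT_DB_USER"]
db_password = os.environ["AUDIT_DB_PASSWORD"]

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=finance-db.internal;DATABASE=audit;"
    f"UID={db_user};PWD={db_password}"
)
```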

4. People will use AI content they don’t understand

There will be individuals who put faith in AI to do things they don’t understand themselves. It is like trusting a translator to accurately convey a message to someone who speaks a different language. This is especially risky on the software side of things.

Reading and understanding unfamiliar code is a key skill for any developer. However, there is a large difference between understanding the gist of a body of code and grasping its finer implementation details and intentions. This is often evident in code snippets that are considered “clever” or “elegant” rather than explicit.

When an AI tool generates software, there is a chance that the person requesting it will not fully grasp the code that is generated. This can lead to unexpected behavior that manifests as logic errors and security vulnerabilities. If large portions of a codebase are generated by an AI in one go, there could be entire products that aren’t truly understood by their owners.

A Few Recommendations

All of this isn’t to say that AI tools are dangerous and should be avoided. Here are a few things for you and your organization to consider that will make their use safer:

Set policies & make them known

Your first course of action should be to set a policy on the use of AI, including a list of allowed and disallowed AI tools. Once a direction has been set, notify your employees. If you’re allowing AI tools, provide restrictions and tips, such as reminders that confidential information must not be shared with third parties. Additionally, re-emphasize your organization’s software development policies to remind developers that they still need to follow industry best practices when using AI-generated code.

Provide guidance to users of AI

You should assume your non-technical employees will automate tasks using these new technologies, so provide training and resources on how to do it safely. A good starting point is to require that all code be stored in code repositories and to mandate security reviews; non-technical employees will need training in these areas. Code and dependency reviews are especially important, given recent critical vulnerabilities caused by common third-party dependencies (e.g., Log4Shell, CVE-2021-44228).

Use Defense in Depth

If you’re worried about AI-generated vulnerabilities, or about what will happen when non-developers start writing code, take steps to prevent common issues from magnifying in severity. For example, using multi-factor authentication reduces the risk of hardcoded credential misuse. Strong network security, monitoring, and access control mechanisms are key to a security posture that is resilient to many vulnerabilities and forms of attack. Perform frequent penetration tests to identify vulnerable and unmanaged software before attackers discover it.

Generate functions, not projects

Use these tools to generate code in small chunks, such as one function at a time. Avoid using them to create entire projects or large portions of your codebase at once, as this increases the likelihood of introducing vulnerabilities and makes flaws harder to detect. Smaller chunks are also easier to understand, and understanding generated code is a prerequisite for using it. Perform strict format and type validations on each function’s arguments, side effects, and output, as in the sketch below. This helps sandbox the generated code so it cannot negatively impact the system or access unnecessary data.
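
As a rough illustration of that validation step, here is a minimal sketch. The generated function is a hypothetical stand-in; the wrapper is the part you write and review yourself, and it makes bad inputs or outputs fail loudly instead of silently.

```python
# A minimal sketch of fencing in an AI-generated function with strict
# argument and output validation. The generated function is a stand-in
# (hypothetical); the validation wrapper is the part you own.

def generated_row_total(amounts):           # imagine this came from the AI
    return sum(amounts)

def checked_row_total(amounts):
    # Validate inputs before they reach the generated code.
    if not isinstance(amounts, list) or not all(
        isinstance(a, (int, float)) for a in amounts
    ):
        raise TypeError("amounts must be a list of numbers")

    result = generated_row_total(amounts)

    # Validate the output so an unexpected result fails loudly.
    if not isinstance(result, (int, float)):
        raise RuntimeError("unexpected result type from generated code")
    return result

print(checked_row_total([10.5, 20.0, 3.25]))  # 33.75
```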

Use Test-Driven Development

One of the advantages of test-driven development (TDD) is that you specify the expected inputs and outputs of a function before implementing it. This helps you decide what the expected behavior of a block of code should be. Used in conjunction with AI code generation, it leads to more understandable code and verifies that the code fits your assumptions. TDD lets you explicitly control the API and enforce your assumptions while still gaining the productivity increases.
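
As a small illustration of that workflow, here is a hedged sketch: the tests pin down the expected inputs and outputs first, and the implementation is what you would then ask the AI to produce against them. The function and test names are illustrative.

```python
# TDD-style sketch: the tests below were written first to define the
# expected behavior; the implementation is what the AI is asked to
# produce so that the tests pass. Names are illustrative.
import unittest

def normalize_account_id(raw):
    # Implementation requested from the AI after the tests were written.
    return raw.strip().upper().replace("-", "")

class TestNormalizeAccountId(unittest.TestCase):
    def test_strips_whitespace_and_dashes(self):
        self.assertEqual(normalize_account_id(" ab-12-34 "), "AB1234")

    def test_already_clean_input_is_unchanged(self):
        self.assertEqual(normalize_account_id("AB1234"), "AB1234")

if __name__ == "__main__":
    unittest.main()
```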

Final Thoughts

These risks and recommendations are nothing new, but the recent emergence and popularity of AI is cause for organizations to review and readdress them. As these tools continue to evolve, it is likely that many of these risks will diminish. AI response and code quality will improve. Future versions may implement additional controls, such as automatic security reviews before sharing code with users. Self-hosted AI utilities will become widely available. Until then, expect more options for business agreements with AI creators.

I am excited about the future of AI and believe that it will have a positive impact on business and technology. We have yet to see what influence it will have on society as a whole, but it’s probably not a stretch to say it will be significant.

If you are looking for help navigating the security implications of AI, let Cisco be your partner. With experts in AI and SDLC, and decades of experience designing and securing the most complex technologies and networks, Cisco CX is well positioned to be a trusted advisor for all your security needs.



Authors

Alec Gleason

Security Architect

CX Assessment and Penetration Team