Artificial intelligence (AI) usage continues to trend higher, finding prominence in a variety of applications. These include tools that are having a significant impact on how we communicate ideas, like OpenAI’s ChatGPT and Google Bard. As AI becomes part of our everyday world, our digital conversations must become more secure, and that means preventing data loss. Monitoring, assessing, and maintaining the confidentiality and integrity of critical information is now a necessity. Guarding against the potential exposure of information calls for resilient and adaptable controls that can keep pace with the ever-evolving threat landscape. But the big question is, how can we accomplish this?
The power of Data Loss Prevention functionality
Fortunately, there is a solution – and it revolves around Data Loss Prevention (DLP) functionality – a feature found in Cisco Umbrella, a cloud security platform that provides users a first line of defense against cybersecurity threats on the internet. DLP is an integral functionality within Umbrella that helps prevent sensitive data from being leaked outside an organization’s network. It uses intricate detection techniques to identify, monitor, and protect data-in-use (endpoint actions), data-in-motion (network traffic), and data-at-rest (data storage).
Umbrella multimode cloud DLP functionality analyzes outbound web traffic in-line and out-of-band to provide unified control over sensitive data leaving your organization. It’s easy to deploy and manage, with flexible policies incorporating pre-built, customizable data identifiers. With Umbrella multimode cloud DLP, you can accomplish the following.
- Inspect data in-line in real time with full SSL inspection via Secure Web Gateway (SWG) proxy.
- Use SaaS API-based scanning to inspect data at rest out-of-band, without the SWG proxy, but with near-real-time enforcement.
- Unify in-line and out-of-band policies and reporting in a single interface.
- Create flexible, customizable policies with 80+ pre-built dictionaries.
- Meet compliance requirements.
Applying DLP to ChatGPT interactions
ChatGPT, developed by OpenAI, holds immense potential for handling various tasks, from customer support to business operations. But an AI’s utility should not come at the cost of data security. DLP works by identifying sensitive data, such as personally identifiable information (PII), Federal Contract Information, Controlled Unclassified Information, and other types of sensitive data, to help prevent unauthorized access or sharing. When applied to ChatGPT, the DLP functionality can monitor and control data being sent to the AI system. And if a user attempts to input sensitive data, the DLP function can block this action.
Why is this important? In today’s age of digital transactions and interactions, the confidentiality, integrity, and privacy of data are critical. Umbrella DLP, when used in conjunction with AI applications like ChatGPT, helps keep sensitive data from being inadvertently shared or exposed. This is particularly crucial for government organizations that use AI applications for internal processes or customer interactions, as disclosure of data through either inadvertent sharing or insider misconduct could lead to regulatory compliance actions, reputational damage, and potentially a threat to national security.
DLP also contributes to a defense-in-depth culture of security within an organization. By implementing it, organizations show their commitment to data security, building trust and resiliency with clients and stakeholders while enhancing their overall cybersecurity posture.
How to create a Cisco Umbrella DLP rule for ChatGPT
Cisco Umbrella multimode cloud DLP functionality is easy to deploy and manage, with flexible policies incorporating pre-built, customizable data identifiers. But what is the best approach for integrating it with ChatGPT? Recently Chris Ireland, Cisco Technical Security Architect, set up Umbrella in his laboratory to find out. From his findings, he has offered us the following example of how to set up Umbrella to use DLP to protect PII with ChatGPT.
Step 1: Define your data classification
Within your Cisco Umbrella Console, navigate to “Policies” > “Policy Components” > “Data Classification”.
The DLP policy monitors or blocks content based on the rules configured for the policy. The rules use the following to determine what types of data should be monitored or blocked.
- Data identifiers describe the content the DLP monitors or blocks, including PII that may identify an individual (such as financial account numbers, medical records, passport or government identification numbers, or credit card numbers). Data identifiers can also describe certain content an organization may wish to monitor or block within its network traffic, such as discriminatory or aggressive content. Umbrella provides a collection of built-in data identifiers, plus you can create custom identifiers based on the built-in data identifiers.
- Data classifications are groups of data identifiers combined for the purpose of monitoring or blocking closely related content. For example, you can create a data classification that encompasses medically related content by including the built-in identifiers for ICD codes, drug names, prescription names, health conditions, and national drug code names. The classification, when applied to a rule in the DLP Policy, will monitor or block content matching those identifiers.
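To make the relationship between data identifiers and data classifications more concrete, here is a minimal Python sketch. It is not Umbrella's detection logic: the `DataIdentifier` and `DataClassification` classes are hypothetical, and the two regular expressions are simplified stand-ins for the much richer built-in identifiers. It only illustrates the idea that a classification is a named group of identifiers evaluated together.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class DataIdentifier:
    """Hypothetical stand-in for a built-in or custom data identifier."""
    name: str
    pattern: re.Pattern


@dataclass(frozen=True)
class DataClassification:
    """A named group of identifiers that are monitored or blocked together."""
    name: str
    identifiers: tuple

    def matches(self, text: str) -> list:
        """Return the name of every identifier whose pattern appears in the text."""
        return [i.name for i in self.identifiers if i.pattern.search(text)]


# Simplified, illustrative patterns only (assumptions, not Umbrella's rules).
SSN = DataIdentifier(
    "US Social Security Number", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
)
CARD = DataIdentifier(
    "Credit Card Number", re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b")
)

PII = DataClassification("PII", (SSN, CARD))

print(PII.matches("What can you tell me about SSN: 323-23-2323?"))
# ['US Social Security Number']
print(PII.matches("What can you tell me about Cisco Umbrella DLP capabilities?"))
# []
```

Grouping identifiers this way means a rule only needs to reference one classification, and the classification can grow over time without the rule changing.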
NEXT > Within the “Data Classification” screen, click the “Add” button to create a new Data Classification.
NEXT > Assign a “Data Classification Name” and a “Description” (optional) and select the “Data Identifiers” you want Cisco Umbrella to scan for from the list of built-in identifiers, or you can choose to create and assign custom identifiers (see Figure 1).
NEXT > When you’re finished assigning data identifiers to your data classification, click the “Save” button.
Figure 1: Add new data classification
Step 2: Assign a DLP Policy Rule
Within your Cisco Umbrella Console, navigate to “Policies” > “Data Loss Prevention Policy”.
NEXT > Within the “Data Loss Prevention Policy” dashboard, click the “Add Rule” button and select “Real Time Rule” to create a new rule (see Figure 2).
Figure 2: Data Loss Prevention policy dashboard
NEXT > Within the “Add New Real Time Rule” page, assign a “Rule Name” and a “Description” (optional), and select the “Severity” of the rule (see Figure 3).
Figure 3: Add new real time rule
NEXT > Scroll down the page until you get to the “Data Classifications” section and assign the Data Classification you created earlier (see Figure 4).
Figure 4: Data Classifications section
NEXT > Scroll down the page until you get to the “Identities” section and assign the Identity to which you want the DLP rule to apply (see Figure 5).
- Identity is an internet-capable entity that Umbrella protects through policies and monitors through reports. An identity can be a high-level entity within your organization, for example, an entire network. Or it can be very granular, like Active Directory security groups, specific Active Directory users, and/or Roaming Computers.
Figure 5: Identities section
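The range from "an entire network" down to "a specific user" can be pictured with a short Python sketch. Everything here is an illustrative assumption: the `Request` class, the three matcher helpers, and the sample addresses and group name are hypothetical and do not reflect how Umbrella resolves identities internally.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class Request:
    """Hypothetical view of a web request as seen by a policy engine."""
    source_ip: str
    user: str = ""
    groups: set = field(default_factory=set)


def network_identity(cidr):
    """High-level identity: every request coming from a network range."""
    net = ipaddress.ip_network(cidr)
    return lambda req: ipaddress.ip_address(req.source_ip) in net


def group_identity(group):
    """Granular identity: members of a directory security group."""
    return lambda req: group in req.groups


def user_identity(user):
    """Most granular identity: a single directory user."""
    return lambda req: req.user == user


def rule_applies(identities, req):
    """A rule is in scope if any of its selected identities matches."""
    return any(match(req) for match in identities)


selected = [network_identity("10.20.0.0/16"), group_identity("Finance")]

print(rule_applies(selected, Request("10.20.5.9")))                    # True
print(rule_applies(selected, Request("192.0.2.10", groups={"Finance"})))  # True
print(rule_applies(selected, Request("192.0.2.10", user="jdoe")))      # False
```

Starting with a narrow identity, such as a single test user or group, is a low-risk way to try a blocking rule before widening it to a whole network.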
NEXT > Scroll down the page until you get to the “Destinations” section and choose the option to “Select Destinations Lists and Applications for Inclusion”.
NEXT > Scroll down the list of available applications and select “OpenAI ChatGPT” and “OpenAI ChatGPT API” for inclusion (see Figure 6).
Figure 6: Destinations section
NEXT > Scroll down to the bottom of the page until you get to the “Action” section. From the drop-down menu, set the action to “Block” and click the “Save” button (see Figure 7). Your ChatGPT DLP rule is now complete.
Figure 7: Actions section
Step 3: Testing and End User Experience
Within a web browser, navigate to https://chat.openai.com/ to bring up the ChatGPT interface.
You’ll notice that any text submitted in the “Send a Message” box that does not contain PII, as defined by the ChatGPT DLP rule, is successfully transmitted, and the conversation is stored within the interface. In the following example (see Figure 8), the text “What can you tell me about Cisco Umbrella DLP capabilities?” was successfully transmitted and ChatGPT responded with pertinent information.
Figure 8: The ChatGPT interface
In the next example (see Figure 9), an attempt is made to submit the following PII text: “What can you tell me about SSN: 323-23-2323?” However, due to the presence of PII as defined by the ChatGPT DLP rule, Umbrella successfully blocked the submission. The conversation was not stored within the interface, and ChatGPT AI responded:
“An error occurred. Either the engine you requested does not exist or there was another issue processing your request. If this issue persists, please contact us through our help center at help.openai.com.”
Figure 9: Umbrella successfully blocked PII within ChatGPT
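The two test prompts above can be replayed against a toy model of the rule built in Step 2. This is only a sketch under stated assumptions: the `RealTimeRule` class and its `evaluate` method are hypothetical, the severity value is made up, and a single simplified SSN regular expression stands in for the data classification. The real inspection happens inside Umbrella's proxy, not in code like this.

```python
import re
from dataclasses import dataclass

# Simplified stand-in for the data classification assigned to the rule.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


@dataclass(frozen=True)
class RealTimeRule:
    """Hypothetical model of a real time DLP rule."""
    name: str
    severity: str
    destinations: frozenset
    action: str

    def evaluate(self, destination: str, text: str) -> str:
        """Return the rule's action on a match for an in-scope destination."""
        if destination not in self.destinations:
            return "Allow"  # traffic to other destinations is out of scope
        if SSN_PATTERN.search(text):
            return self.action
        return "Allow"


rule = RealTimeRule(
    name="ChatGPT DLP rule",
    severity="High",  # assumed value for illustration
    destinations=frozenset({"OpenAI ChatGPT", "OpenAI ChatGPT API"}),
    action="Block",
)

print(rule.evaluate(
    "OpenAI ChatGPT",
    "What can you tell me about Cisco Umbrella DLP capabilities?",
))  # Allow
print(rule.evaluate(
    "OpenAI ChatGPT",
    "What can you tell me about SSN: 323-23-2323?",
))  # Block
```

Note that the destination check comes first: the same SSN text sent to an application that is not selected in the rule would pass through this particular rule untouched.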
Step 4: Cisco Umbrella DLP Reporting
Within your Umbrella Console, navigate to “Reporting” > “Additional Reports” > “Data Loss Prevention” (see Figure 10).
- Data violations detected through the Real Time and SaaS API DLP rules are logged as part of the unified Events view of the DLP Report.
- Data violation log entries will display the Event Type, Severity, Identity or File Owner, Destination, Rule, Action, and the Date and Time stamp of the violation.
Figure 10: DLP reporting
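If you picture each log entry as a record with the fields listed above, triaging violations becomes a simple filter-and-sort exercise. The sketch below assumes a hypothetical `DlpEvent` class and invented sample values; it does not call any Umbrella reporting API and only mirrors the columns shown in the report.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class DlpEvent:
    """Hypothetical record mirroring the columns of the DLP report."""
    event_type: str   # "Real Time" or "SaaS API"
    severity: str
    identity: str     # identity or file owner
    destination: str
    rule: str
    action: str
    timestamp: datetime


def blocked_by_rule(events, rule_name):
    """Return blocked events for one rule, newest first."""
    hits = [e for e in events if e.rule == rule_name and e.action == "Block"]
    return sorted(hits, key=lambda e: e.timestamp, reverse=True)


# Invented sample data for illustration only.
events = [
    DlpEvent("Real Time", "High", "jdoe", "OpenAI ChatGPT",
             "ChatGPT DLP rule", "Block",
             datetime(2023, 8, 1, 14, 5, tzinfo=timezone.utc)),
    DlpEvent("Real Time", "Low", "asmith", "Example App",
             "Monitor rule", "Monitor",
             datetime(2023, 8, 1, 15, 30, tzinfo=timezone.utc)),
    DlpEvent("Real Time", "High", "asmith", "OpenAI ChatGPT API",
             "ChatGPT DLP rule", "Block",
             datetime(2023, 8, 2, 9, 0, tzinfo=timezone.utc)),
]

for event in blocked_by_rule(events, "ChatGPT DLP rule"):
    print(event.timestamp.isoformat(), event.identity, event.destination)
# 2023-08-02T09:00:00+00:00 asmith OpenAI ChatGPT API
# 2023-08-01T14:05:00+00:00 jdoe OpenAI ChatGPT
```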
Selecting the “…” link to the right of the DLP violation log entry will bring up additional event details, including contextual information about the DLP violation (see Figure 11).
Figure 11: Additional event details
ChatGPT is just the beginning
The combination of Cisco Umbrella’s SIG DLP functionality with AI applications like ChatGPT can be a key step forward for enhancing digital security on your network and for your users. By integrating AI with their existing or planned Cisco Umbrella security solution, government agencies of all sizes can leverage the vast potential of AI while helping keep their sensitive data secure. We should always remember that the role of AI is that of a helper, making our lives easier. That’s why keeping its use secure is essential and is quickly becoming top of mind for IT leaders in government.
Additional resources on Data Loss Prevention
- Discover more on Cisco Umbrella Data Loss Prevention capabilities
- Understanding FedRAMP: How Cisco Umbrella is Getting Authorization