Understanding the Assignment: A University IT Admin’s Roadmap for Secure AI Chatbot Deployment

If you manage IT for a university, you’ve likely seen not only students, but also professors and departments that are eager to use AI.  There are many exciting opportunities for language models to help with education.  For example, courses may wish to develop specific chatbots or for students to ask about course material.   Or perhaps, students can have question and answer sessions with language models on the subject, providing deeper engagement and understanding of the subject.  

This can be great for education, but it means that perhaps each chatbot is trained on specific course material, or developed with specific interactions in mind, based on the learning goal.   University system administrators may be seeing dozens or even hundreds of chatbots emerging.  This creates a special set of privacy and security concerns, as you need to have a framework and test environment across multiple apps.  University IT administrators need to set policies and frameworks for multiple courses, and then potentially monitor and test the results.  

Constraints and Concerns in the university environment

There are a few constraints and concerns that both make your job difficult. While these may feel normal to you. They are actually challenging and different from what might be expected in a corporate environment.

You are responsible for a range of educational technology, not just chatbots.  A lot of technology is needed to serve students and faculty.  AI apps and chatbots aren’t your only responsibility.  Your department may also be overseeing apps, LMS like Moodle, or developing custom research applications for academic labs, and so on.  This limits your ability to develop sophisticated protection mechanisms for the chatbots.  

Your staff  and resources are limited.   You have limited AI expertise on your staff. With so much market demand for AI expertise, it can be hard with a non-profit or educational budget to hire specifically AI specialists, and even harder to hire for AI experts with privacy and security expertise.  Furthermore, you have limited resources for new software or tools, and time is limited to build everything from the ground up.    

Getting professors to align is like herding cats.   Your college professors are amazing and brilliant.  Also, they like their independence,  and may go off and build things without checking in with you first.  It can be difficult to ensure that they are building chatbots that meet all your policies.  However, given your above constraints, you can’t chase them all down.  You need to find solutions that monitor and test the apps across the board, not one-by-one.

Reputation risk, academic freedoms, and student safety. Unfortunately, there are examples of LLMs and AI that take bad data from society and spit it back out in the form of offensive language or biased decisions.   At the same time, your university is concerned about protecting the students, educators, and staff from harm.  This can place special constraints on language.  You may particularly want to avoid course chatbots that perpetuate unfair or harmful bias and language. 

Students will push the limits of your systems.  Some of your students will test the limits and capabilities of the chatbots. Even students that are not malicious could be curious about the  boundaries. This means they may go beyond the intended use of the chatbot to see what responses they get.  This can have unintended consequences like overwhelming the model and creating a “model denial of service,” or perhaps it just uses up a bunch of tokens and creates an expensive bill.  Alternatively, they jailbreak the model to access secure information or get it to say offensive things.  

Due to your context and constraints, this article is organized around solutions, not problems. However, there are plenty of resources on the potential risks of LLM apps.  If you prefer to wrap your mind around those, see the box below. I’ve selected three resources on LLM app risks based on societal, security, and privacy harms.  

Three resources to learn more about potential LLM app risks.

  • Societal: A taxonomy of AI harms that includes fairness, accountability, and transparency.  The draft taxonomy by the US National Institute of Standards and Technology is a good place to start. https://www.nist.gov/system/files/documents/2021/10/15/taxonomy_AI_risks.pdf  This is a decent taxonomy even for non-Americans.  The EU AI Act is targeted at general purpose models, not LLM apps.  You might still enjoy reading that risk classification to get an idea of harms.  
  • Security: For potential security harms OWASP top 10 is a list of the most critical vulnerabilities found in applications utilizing Large Language Models.  https://owasp.org/www-project-top-10-for-large-language-model-applications/
  • Privacy: The Future of Privacy Forum has a nice explanation of LLMs and privacy risks. https://fpf.org/blog/lets-look-at-llms-understanding-data-flows-and-risks-in-the-workplace/

Pre-launch: Establish the data protection framework

Likely, the university already has a policy regarding sensitive data and how it can be used in apps and software. Usually, this is based on risk.  For example, the policy might (and should) state that student data and employee PII is typically considered high risk and can’t be used for such apps.  For chatbots that can only be accessed within the course or university setting, some low-risk data could be used.  For lower risk data, apps would still need to be protected and sandboxed.  

Most cloud and AI app service providers can also isolate and secure training data. I’ll oversimplify a bit.  These services claim that any data you use to fine-tune models or build your AI apps will not be used anywhere else.   This is great!  It means that you can create an app based on course material and it won’t show up in any other apps, models, or just generally released to the public.  

Cloud service providers statements on privacy and security for AI apps 

Establish policies around the input data to the model.   I’m assuming that your framework already handles some of this.  The data used to train or fine tune your LLM apps should only come from trusted sources, and should not include PII.  The OWASP LLM Top 10 describes training data poisoning https://llmtop10.com/llm03/ and supply chain vulnerabilities as concerns https://llmtop10.com/llm05/.   Furthermore, many societal concerns about fairness and bias arise due to the quality of the training data or data used to fine-tune the app.   Unfortunately, it’s difficult to monitor or test for this after the LLM app is built.  Make sure you have clear policies on reliability and quality of input data.   

If the university has a good data policy, and has set up the appropriate protections offered by the AI or cloud provider,  you are in a good place!   If you are a university administrator who hasn’t already developed a data risk framework or are unsure if the data is properly sandboxed, then I recommend starting there.   

However, even with the data privacy framework in place, there may still be privacy and security concerns to keep in mind with the educational AI apps.  The rest of this article focuses on on-going risks and how to prevent them.    When you already have the appropriate framework and controls around data flows in place, your team can focus on monitoring the chatbots on security and performance.   

Post-launch: monitor and test

Here are concrete steps to monitor chatbots and protect their security.  For each action item, I also tell you which of the OWASP Top 10 risks to LLM apps it addresses. Here, I propose the fixes that I think are most cost-effective and appropriate in a university environment.  

Keep an inventory

You will need a central place to store what AI apps are being built and used. Keep track of who owns the app, what it is supposed to do, and what features are in place to protect the app.  You may be able to use any other data management framework you have, or build a simple spreadsheet.  (We have a template spreadsheet available for free; just ask.) 

System Boundaries

Some AI apps are “simple” chatbots, and provide a question and response interface to people.   Other AI apps can do more, and may interact with other parts of your system or data, by design.  They may make function calls that edit or store data.  For example, an AI app might automatically grade an assignment.   The difference between the “simple” chatbots and more powerful AI apps is important for privacy and security.

For example, if an app assigns grades, students will be highly motivated to hack it for better grades.  (We may wish they spent more time on the assignment than on hacking, but we have to prepare for the worst).  Furthermore, you will need to insure that the system is fair, or at least as fair as a human grader.  Finally, access controls should be carefully designed to limit hacking and mistakes.  

Therefore, if you have an AI app with automatic decisions or function calls, you should do the following.  

  1. In your AI inventory, keep track of what the AI app can edit or change in the system.  
  2. Where possible, keep a human in the loop for crucial decisions (such as grades).
  3. Check the access and controls to limit what the app can do, and how often.  For example, are any plugins or extensions automatically called by the chatbot? Or perhaps the chatbot has extensive permissions which could be exploited? 

The following list of LLM App risks from OWASP are relevant considerations here.

Response Monitoring

This is probably the most important step for ongoing testing and monitoring of LLM apps in the university and course environment.    

Regularly test the response to prompts that might be offensive, or that test the boundaries of acceptable.   This can be done manually, and likely should be done occasionally by smart humans who understand the university context.   Additionally,  consider an automated test suite that can be run on all chatbots at regular intervals.   A test suite would include a suite of prompts, and your LLM apps responses are sent to an automated scoring system.  You then get some sense of how harmful, risky, offensive, or insecure the chatbot responses might be.   Existing tools to help with this are sometimes called AI Red Teams, and commercial solutions include Giskard.AI, Lakera.AI, and the PyRit python library from Microsoft.   

AI Red Teams can help check these risks. 

  • Prompt Injection:  The classic example of prompt injection is the prompt, “Forget all previous instructions.”  Red teams are well-suited for testing these malicious prompts that try to bypass safeguards on the models.  https://llmtop10.com/llm01/
  • Sensitive Information Disclosure:  This attack is trying to get the model to reveal proprietary, personal, or copyrighted information.  Red teaming can be well suited for designing prompts that would reveal such information.  https://llmtop10.com/llm06/
  • Overreliance: The vulnerability here is not just that the LLM invented stuff, but that it is treated as authoritative.  Red teams can test against false information and hallucinations or insecure code generation.  https://llmtop10.com/llm09/
  • Model Denial Of Service: Red teaming could test for resource-intense prompts that might cause a Denial of Service.  However, other options such as rate limiting and monitoring should be the first line of defense. https://llmtop10.com/llm04/

Summary

In conclusion, the rapid rise of LLM apps in the university presents both opportunities and challenges for university IT administrators. By implementing robust data protection frameworks, proactive response monitoring, strict input testing, and clear system boundaries, you can harness the power of AI while mitigating potential risks. 

As new technologies emerge and student expectations change, university IT administrators must remain vigilant and adaptable. By staying informed about the latest AI developments and proactively addressing potential risks, we can ensure that our universities remain at the forefront of educational innovation while upholding our commitment to student safety and academic integrity.  

If the practical implementation of these security measures seem daunting, we can help. Engaging a specialized consultant with experience in AI and cybersecurity can provide long-term return on investment.  We can provide tailored recommendations for tools and training specific to your university’s needs but also help identify emerging threats and develop proactive strategies to mitigate them. This collaboration can ensure that your AI chatbots are both secure and effective in the long-run, ultimately benefiting your educational mission.


Sign up for the newsletter and learn more!

  • Type your email

  • Check your email inbox

  • Click the confirmation link

Are you a data-driven business or tech company?  Looking for ways to improve your user data protection?  Want to know how we are building privacy into our tech stack?  

Sign up for our weekly mailing list.

Bonus: get a free PDF with 7 tips on usability and engineering for privacy. 

Schedule a Meet and Greet

Do you want to talk to us about whether this would benefit your data protection program? Let's have a 15-minute meet and greet.  

Call includes:

  • Honest answers to your questions about whether this service is a good fit for your org.

  • Adversary elicitation: To improve your threat models, we can brainstorm privacy adversaries.  

Call does not include:

  • Sales Pitch 

  • No strings attached: "It isn't for us" is an acceptable decision

About the author 

Rebecca

Dr. Rebecca Balebako is a certified privacy professional (CIPP/E, CIPT, Fellow of Information Privacy) who helped multiple organizations improve their privacy through research, analysis, and engineering. 

Our Vision

 We work together with companies to build data protection solutions that are lasting and valuable, thereby protecting privacy as a human right.  

Privacy by Default

respect

Quality Process

HEALTH

Inclusion

>