Associate Professor John Licato, Computer Science and Engineering and Associate Director for Outreach, USF Institute for AI+X has recently received an NSF Grant of $275,000 to develop tools to ensure AI adheres to human rules, ethics, and regulations. Generative AI, specifically Chatbots, function like a black box. This is both the source of their strength and their weakness, since ensuring accuracy, reliability, and security could become more difficult as the models become more advanced.
A category of errors dubbed AI Vulnerabilities is the focus of their current grant research. To use a turn of phrase, AI follows the letter of the law, not the spirit of the law, and so an AI Vulnerability is an unintended consequence of the rules we want it to follow. For example, a company develops an AI salesperson, however there are both implicit and explicit rules governing acceptable behavior when making a sale.
“The vulnerabilities we're focusing on are those that would cause the AI salesperson to violate one of these rules of conduct,” said Professor Licato. The nuances of human behavior and principles of ethical conduct can be difficult even for humans to interpret, but defining them mathematically can be shockingly difficult.
“The chatbot can respond differently to what a user says, but ideally, it should always stick to the rules it’s been given,” said Animesh Nighojkar, PhD. in Computer Science. “The problem is that these instructions can sometimes be flawed, unclear, or even misunderstood by the chatbot. The only way to ensure a chatbot behaves as intended is to test it.”
The proposed product verifies that a chatbot adheres to a set of rules like 'no swearing' or 'no promoting any form of discrimination.' The customer will send the chatbot, language model it’s built on, and any specific instructions along with the rule set, and their system will then interpret these rules into code and run a barrage of vulnerability tests.
AI is being integrated into every facet of society, yet there are many unresolved problems with the technology. The tools Professor Licato and Animesh are developing with the $275,000 NSF Grant will help reduce unintended consequences of AI. “Our unique advantage is our constantly updated database of weaknesses in AI chatbots,” said Animesh. The final output is a detailed report on the chatbot’s vulnerability score, and a list of suggestions to remedy the issues. “We help businesses ensure their chatbots don’t say or do anything they’re not supposed to,” said Animesh with an eager smile.