Talk: Do LLMs Exhibit Cybersecurity Misconceptions? 1/31 online

Evaluation of LLMs on CCI and CCA examinations

Do LLMs Show Cybersecurity Misconceptions?

Evaluation of LLMs Performance on Cybersecurity Concept Inventories

Shan Huang, UIUC

Joint work with Jeffrey Herman and Alan Sherman, et al.

12:00–1pm ET Friday, Jan. 31, 2025, online

We evaluated the performance of five LLMs (Llama a, GPT-3.5-turbo, GPT-4, GPT-4o, and GPT-O1) on two cybersecurity concept inventories: the Cybersecurity Concept Inventory (CCI) and the Cybersecurity Curriculum Assessment (CCA). Using a zero-shot setting to minimize external influencing factors, we compared the performance of these LLMs with that of previously studied students, and we conducted a qualitative analysis of GPT-O1's output to examine whether it exhibits misconceptions. Quantitative analysis reveals that, on both the CCI and the CCA, GPT-O1 significantly outperformed the other models and the students, correctly answering 92% of CCI and 72% of CCA test items. These results indicate GPT-O1's strong proficiency in foundational topics (CCI) but reveal its limitations in addressing these concepts in more technically advanced scenarios (CCA). Qualitative analysis of GPT-O1's reasoning patterns uncovered instances of insightful reasoning but also highlighted ways in which GPT-O1's answers reflect persistent student mistakes, such as biases, overgeneralizations, and logical inconsistencies. This work highlights the significant potential of GPT-O1 as a tool for introductory cybersecurity education, given its ability to provide detailed explanations and structured reasoning for novice learners.

Shan Huang is a Ph.D. candidate in Computer Science at the University of Illinois Urbana-Champaign. She is broadly interested in how educational games can improve student learning. Her current work includes improving student learning in cybersecurity with educational games and assessing student knowledge of cybersecurity concepts. Shan is also involved in various educational data mining projects.

UMBC Cybersecurity Institute


Posted: January 29, 2025, 8:42 AM
