churclillsquareconsulting-logo

AI Hacking 101

Introduction:

The AI Hacking 101 ILT teaches students the fundamentals of penetration testing AI/LLM based applications such as customer facing chatbots.

The course focuses on demonstrating how to detect and exploit common AI vulnerabilities such as:

  • Prompt Injection
  • Sensitive Information Disclosure
  • Improper Output Handling
  • System Prompt Leakage
  • Misinformation
  • Excessive Agency

Not only will students learn about these exploits, but they will also spend hands-on time in a custom-built environment exploiting and uncovering these vulnerabilities. The online lab features the TCM Vulnerable Chatbot, a customer service chatbot that can interact with customers’ tickets and improve its responses via Retrieval Augmented Generation (RAG) using the company’s knowledge base.

Objectives:

Course Outline:

1 – AI Fundamentals Review

  • A quick review of some of the fundamentals of AI such as how they operate and standard terms such as model parameters, temperature, top-p, inference, training, LLMs.

2 – AI Threat Model

  • Discuss the threat actors, assets, adversary goals and attack surfaces for modern AI applications and the specific AI application used in the course

3 – Reconnaissance, Model Mapping and Baseline Behavior and Fingerprinting

  • Demonstrate techniques for performing reconnaissance of AI applications with a specific focus on fingerprinting underlying AI models and their settings.

4 – Prompt Injection and Jailbreaking

  • Demonstrate common techniques for prompt injection and jail breaking

5 – Prompt Injection Tools and Resources

  • Show common tools and repositories of prompts used for prompt injection and jailbreaking

6 – Bypassing Common Protections

  • Showcase how to bypass common protections for prompt injection such as input/output filtering

7 – Testing for harmful output/hate speech/misinformation/off-topic content and resource drainage

  • Demonstrate tests for verifying the model responds correctly to requests for generating harmful or Off-topic content or attempts to waste resources.

8 – Data Exfiltration

  • Demonstrate how retrieval augmented generation works and vulnerabilities associated with it such as leakage of confidential material and PII.

9 – RAG and Vector DB Attacks

  • Demonstrate attacks the focus on the retrieval of documents and the ticket base, showcase vector poisoning attacks.

10 – Excessive Agency

  • Demonstrate how excessive agency in applications can be exploited and tested for.

Enroll in this course

811.16

Need Help Finding The Right Training Solution?

Our training advisors are here for you.

EUR Euro