
NVIDIA garak Tutorial: Build a Complete Defensive LLM Red-Teaming Workflow with Custom Probes and Detectors

AI News Desk

MarkTechPost

Jun 07, 2026


This tutorial analyzes NVIDIA garak as a practical framework for defensive LLM red-teaming, walking through a complete workflow with custom probes and detectors.


In this tutorial, we work through NVIDIA garak, a practical framework for defensive LLM red-teaming. We cover the entire workflow, from setting up garak to creating custom probes and detectors and exporting results in AVID format. Instead of running a single scan, we use garak end-to-end to understand how probes, detectors, generators, reports, and vulnerability scores work together in a complete LLM security testing workflow.

We start by importing the required libraries and creating a helper function to run shell commands directly from the notebook.
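A shell helper of this kind can be sketched with the standard library alone. The function name `run` and its return shape are illustrative assumptions, not the code from the original notebook.

```python
import subprocess
import sys


def run(cmd, check=False):
    """Run a command (list of arguments) and return (exit_code, combined_output)."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout so nothing is lost
        text=True,
        check=check,  # when True, a non-zero exit raises CalledProcessError
    )
    return result.returncode, result.stdout


# Example: call the current Python interpreter, which exists in any notebook kernel.
code, out = run([sys.executable, "-c", "print('shell helper works')"])
print(code, out.strip())
```

Passing the command as a list rather than a single string avoids shell quoting problems when arguments contain spaces.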

Next, we install garak, configure basic environment variables, and import the main garak modules needed for the tutorial. We also define a reusable function that runs garak programmatically and captures the path to the generated report.

We then inspect the garak plugin ecosystem by listing the available probes, detectors, generators, and buffs, and perform a quick dry run with the test generator to confirm that garak works without any external model or API key.
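One way to structure such a wrapper is to separate building the command line from locating the report path in the captured output. The sketch below is a guess at that shape, not the tutorial's actual function. The helper names `garak_cmd` and `find_report_path` are hypothetical. The flag names (`--model_type`, `--model_name`, `--probes`, `--generations`, `--list_probes`) and the `test.Blank` / `test.Test` plugin names are taken from garak's command-line documentation as best recalled and may differ between releases, so check `python -m garak --help` for the installed version. The assumption that the report path appears in the output as a token ending in `.report.jsonl` is also unverified.

```python
import re
import sys


def garak_cmd(model_type=None, model_name=None, probes=None, generations=None, extra=()):
    """Build an argv list for `python -m garak` (flag names are assumptions)."""
    argv = [sys.executable, "-m", "garak"]
    if model_type:
        argv += ["--model_type", model_type]
    if model_name:
        argv += ["--model_name", model_name]
    if probes:
        argv += ["--probes", probes]
    if generations is not None:
        argv += ["--generations", str(generations)]
    argv += list(extra)
    return argv


def find_report_path(output):
    """Return the last whitespace-free token ending in .report.jsonl, or None."""
    matches = re.findall(r"\S+\.report\.jsonl", output)
    return matches[-1] if matches else None


# Listing plugins and a dry run against the built-in test generator.
list_probes = garak_cmd(extra=["--list_probes"])
dry_run = garak_cmd(model_type="test.Blank", probes="test.Test")
print(" ".join(dry_run[1:]))
```

The resulting lists can be passed to `subprocess.run`, and the captured output handed to `find_report_path`. Paths containing spaces would need a stricter pattern.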

After that, we scan a real Hugging Face model and run a multi-probe scan to generate a richer report for analysis.

We load the generated garak report and prepare it for detailed analysis using pandas and NumPy. We first try garak's built-in report parser and, if that is unavailable, parse the JSONL report file manually. We then calculate safety scores and attack success rates and visualize vulnerabilities across different probe-detector combinations.
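The manual JSONL fallback and the rate calculation can be illustrated with the standard library alone. The tutorial itself uses pandas, and the resulting rows load directly into a DataFrame. This sketch assumes that evaluation records carry `entry_type` set to `"eval"` along with `probe`, `detector`, `passed`, and `total` fields. That matches the report layout as best recalled, but it should be checked against a real report. The sample records and the function name `summarize_evals` are hypothetical.

```python
import json
from collections import defaultdict


def summarize_evals(lines):
    """Aggregate eval records into pass rate and attack success rate per (probe, detector)."""
    totals = defaultdict(lambda: [0, 0])  # (probe, detector) -> [passed, total]
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines instead of aborting the analysis
        if entry.get("entry_type") != "eval":
            continue
        key = (entry["probe"], entry["detector"])
        totals[key][0] += entry["passed"]
        totals[key][1] += entry["total"]

    rows = []
    for (probe, detector), (passed, total) in sorted(totals.items()):
        pass_rate = passed / total if total else float("nan")
        rows.append({
            "probe": probe,
            "detector": detector,
            "passed": passed,
            "total": total,
            "pass_rate": pass_rate,                # share of outputs judged safe
            "attack_success_rate": 1 - pass_rate,  # share of outputs flagged
        })
    return rows


# Synthetic records for illustration only; not taken from a real scan.
sample = [
    '{"entry_type": "init"}',
    '{"entry_type": "eval", "probe": "demo.ProbeA", "detector": "demo.DetA", "passed": 3, "total": 4}',
    '{"entry_type": "eval", "probe": "demo.ProbeB", "detector": "demo.DetB", "passed": 4, "total": 4}',
    'not json',
]
rows = summarize_evals(sample)
for row in rows:
    print(row["probe"], row["detector"], row["attack_success_rate"])
```

With a real report, the same function can be called as `summarize_evals(open(report_path, encoding="utf-8"))`.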

We extract and examine sample hits, meaning outputs whose detector scores indicate potentially unsafe or vulnerable behavior.

We create a custom garak probe that uses fixed prompts and connect it to a custom detector. The detector, which flags outputs containing the word 'hello', is defined and saved inside garak's detector package. We then run the custom probe and detector against the test generator to verify that the extension works correctly.
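The scoring rule behind a keyword detector like this one, and the way hits follow from it, can be shown without garak installed. The sketch below is deliberately standalone: it does not subclass garak's detector base class, and the function names are hypothetical. It follows the convention that a score of 1.0 marks a hit and 0.0 a pass. The 0.5 cut-off is an assumed threshold, and treating missing outputs as 0.0 is a simplification.

```python
def hello_scores(outputs, keyword="hello"):
    """Return one score per output: 1.0 if the keyword occurs (case-insensitive), else 0.0."""
    return [
        1.0 if out is not None and keyword in out.lower() else 0.0
        for out in outputs
    ]


def collect_hits(prompt, outputs, threshold=0.5):
    """Pair each output with its score and keep those at or above the threshold."""
    scores = hello_scores(outputs)
    return [
        {"prompt": prompt, "output": out, "score": score}
        for out, score in zip(outputs, scores)
        if score >= threshold
    ]


# Fixed prompt with hand-written outputs, standing in for a generator's responses.
outputs = ["Hello there!", "I cannot help with that.", None, "well, hello again"]
print(hello_scores(outputs))
for hit in collect_hits("Say a greeting.", outputs):
    print(hit["score"], hit["output"])
```

Inside garak, the same rule would live in a detector class's scoring method and be applied to the outputs of each attempt. The exact base class and attribute names depend on the installed version.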

Finally, we export the garak report in AVID format and show a REST configuration template for connecting garak to an external model endpoint.

In conclusion, we have demonstrated a complete hands-on workflow for testing LLM behavior using NVIDIA garak. We run built-in probes, analyze safety scores and attack success rates, inspect concrete flagged outputs, and extend garak with a custom probe and detector. Exporting results in AVID format makes the workflow more useful for structured vulnerability reporting. The result is a foundation for evaluating models and building more advanced defensive red-teaming pipelines.
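For the REST configuration step mentioned earlier, a template might look like the following. The key names (`rest`, `RestGenerator`, `uri`, `method`, `headers`, `req_template_json_object`, `response_json`, `response_json_field`) and the `$INPUT` / `$KEY` placeholders follow garak's REST generator documentation as best recalled, so treat them as assumptions to verify against the installed version. The endpoint URL and the request and response field names are hypothetical.

```python
import json

# Hypothetical endpoint that accepts {"text": "<prompt>"} and returns {"text": "<reply>"}.
rest_config = {
    "rest": {
        "RestGenerator": {
            "name": "example endpoint",
            "uri": "https://example.com/v1/generate",         # placeholder URL
            "method": "post",
            "headers": {"Authorization": "Bearer $KEY"},       # $KEY: API key placeholder
            "req_template_json_object": {"text": "$INPUT"},    # $INPUT: prompt placeholder
            "response_json": True,
            "response_json_field": "text",                     # field holding the model reply
        }
    }
}

config_text = json.dumps(rest_config, indent=2)
print(config_text)
```

Saved to a file, a configuration like this is typically passed to garak alongside the REST generator type. The exact option for supplying a generator options file should be confirmed with `python -m garak --help`.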


Source: MarkTechPost