Techniques for Large Language Models (GPT, BERT, etc.)

16 techniques
Goals | Models | Data Types | Description
Prompt Sensitivity Analysis
Experimental
Architecture/neural Networks/transformer/llm
Paradigm/generative
+1
Text
Prompt Sensitivity Analysis systematically evaluates how variations in input prompts affect large language model...
Causal Mediation Analysis in Language Models
Mechanistic Interpretability
Architecture/neural Networks/transformer
Architecture/neural Networks/transformer/llm
+3
Text
Causal mediation analysis in language models is a mechanistic interpretability technique that systematically...
Feature Attribution with Integrated Gradients in NLP
Algorithmic
Architecture/neural Networks/transformer
Architecture/neural Networks/transformer/llm
+4
Text
Applies Integrated Gradients to natural language processing models to attribute prediction importance to individual...
Agent Goal Misalignment Testing
Testing
Architecture/model Agnostic
Architecture/neural Networks/transformer/llm
+3
Any
Agent goal misalignment testing identifies scenarios where AI agents pursue objectives in unintended ways or develop...
Chain-of-Thought Faithfulness Evaluation
Testing
Architecture/neural Networks/transformer/llm
Paradigm/generative
+1
Text
Chain-of-thought faithfulness evaluation assesses the quality and faithfulness of step-by-step reasoning produced by...
Constitutional AI Evaluation
Testing
Architecture/neural Networks/transformer/llm
Requirements/white Box
+1
Text
Constitutional AI evaluation assesses models trained to follow explicit behavioural principles or 'constitutions' that...
Embedding Bias Analysis
Algorithmic
Architecture/neural Networks
Architecture/neural Networks/transformer
+3
Text
Image
Embedding bias analysis examines learned representations to identify biases, spurious correlations, and problematic...
Few-Shot Fairness Evaluation
Testing
Architecture/neural Networks/transformer/llm
Paradigm/generative
+1
Text
Few-shot fairness evaluation assesses whether in-context learning with few-shot examples introduces or amplifies biases...
Hallucination Detection
Testing
Architecture/neural Networks/transformer
Architecture/neural Networks/transformer/llm
+2
Text
Hallucination detection identifies when generative models produce factually incorrect, fabricated, or ungrounded...
Jailbreak Resistance Testing
Testing
Architecture/neural Networks/transformer/llm
Requirements/black Box
Text
Jailbreak resistance testing evaluates LLM defences against techniques that bypass safety constraints. This involves...
Out-of-Domain Detection
Algorithmic
Architecture/model Agnostic
Architecture/neural Networks/transformer/llm
+2
Text
Out-of-domain (OOD) detection identifies user inputs that fall outside an AI system's intended domain or capabilities,...
Prompt Robustness Testing
Testing
Architecture/neural Networks/transformer/llm
Requirements/black Box
Text
Prompt robustness testing evaluates how consistently models perform when prompts undergo minor variations in wording,...
Prompt Injection Testing
Testing
Architecture/neural Networks/transformer/llm
Paradigm/generative
+1
Text
Prompt injection testing systematically evaluates LLMs and generative AI systems for vulnerabilities where malicious...
Retrieval-Augmented Generation Evaluation
Testing
Architecture/neural Networks/transformer/llm
Paradigm/generative
+1
Text
RAG evaluation assesses systems combining retrieval and generation by measuring retrieval quality, generation...
AI Agent Safety Testing
Testing
Architecture/neural Networks/transformer/llm
Paradigm/generative
+1
Any
AI agent safety testing evaluates autonomous AI agents that interact with external tools, APIs, and systems to ensure...
Toxicity and Bias Detection
Testing
Architecture/model Agnostic
Architecture/neural Networks/transformer/llm
+2
Text
Toxicity and bias detection uses automated classifiers and human review to identify harmful, offensive, or biased...
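
As an illustration of one entry above, prompt robustness testing can be sketched as a consistency check over small prompt rewrites. This is a minimal sketch under stated assumptions, not the catalogue's method: `make_variants`, `consistency_rate`, and `toy_model` are hypothetical names, the variant set is arbitrary, and `toy_model` is a deliberately brittle stand-in for a real LLM call.

```python
from typing import Callable, List


def make_variants(prompt: str) -> List[str]:
    """Return a few surface-level rewrites of the prompt (illustrative only)."""
    return [
        prompt.lower(),               # casing change
        prompt.upper(),               # casing change
        "  " + prompt + "  ",         # extra whitespace
        prompt.rstrip("?.!"),         # dropped trailing punctuation
        "Please answer: " + prompt,   # added polite prefix
    ]


def consistency_rate(model: Callable[[str], str], prompt: str) -> float:
    """Fraction of variants whose answer matches the answer to the original prompt."""
    baseline = model(prompt).strip().lower()
    variants = make_variants(prompt)
    matches = sum(model(v).strip().lower() == baseline for v in variants)
    return matches / len(variants)


def toy_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; assumed to be sensitive to casing."""
    return "Paris" if "capital of France" in prompt else "unknown"


if __name__ == "__main__":
    # The toy model fails on the two casing variants, so 3 of 5 variants agree.
    print(consistency_rate(toy_model, "What is the capital of France?"))  # 0.6
```

In practice the `model` argument would wrap a real inference call, and exact string matching would usually be replaced by a task-appropriate comparison such as a semantic-similarity or answer-extraction step.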