Techniques for Large Language Models (GPT, BERT, etc.)
16 techniques
| Technique | Goals | Models | Data Types | Description |
|---|---|---|---|---|
| Prompt Sensitivity Analysis | Experimental | Architecture/neural Networks/transformer/llm, Paradigm/generative, +1 | Text | Prompt Sensitivity Analysis systematically evaluates how variations in input prompts affect large language model... |
| Causal Mediation Analysis in Language Models | Mechanistic Interpretability | Architecture/neural Networks/transformer, Architecture/neural Networks/transformer/llm, +3 | Text | Causal mediation analysis in language models is a mechanistic interpretability technique that systematically... |
| Feature Attribution with Integrated Gradients in NLP | Algorithmic | Architecture/neural Networks/transformer, Architecture/neural Networks/transformer/llm, +4 | Text | Applies Integrated Gradients to natural language processing models to attribute prediction importance to individual... |
| Agent Goal Misalignment Testing | Testing | Architecture/model Agnostic, Architecture/neural Networks/transformer/llm, +3 | Any | Agent goal misalignment testing identifies scenarios where AI agents pursue objectives in unintended ways or develop... |
| Chain-of-Thought Faithfulness Evaluation | Testing | Architecture/neural Networks/transformer/llm, Paradigm/generative, +1 | Text | Chain-of-thought faithfulness evaluation assesses the quality and faithfulness of step-by-step reasoning produced by... |
| Constitutional AI Evaluation | Testing | Architecture/neural Networks/transformer/llm, Requirements/white Box, +1 | Text | Constitutional AI evaluation assesses models trained to follow explicit behavioural principles or 'constitutions' that... |
| Embedding Bias Analysis | Algorithmic | Architecture/neural Networks, Architecture/neural Networks/transformer, +3 | Text, Image | Embedding bias analysis examines learned representations to identify biases, spurious correlations, and problematic... |
| Few-Shot Fairness Evaluation | Testing | Architecture/neural Networks/transformer/llm, Paradigm/generative, +1 | Text | Few-shot fairness evaluation assesses whether in-context learning with few-shot examples introduces or amplifies biases... |
| Hallucination Detection | Testing | Architecture/neural Networks/transformer, Architecture/neural Networks/transformer/llm, +2 | Text | Hallucination detection identifies when generative models produce factually incorrect, fabricated, or ungrounded... |
| Jailbreak Resistance Testing | Testing | Architecture/neural Networks/transformer/llm, Requirements/black Box | Text | Jailbreak resistance testing evaluates LLM defences against techniques that bypass safety constraints. This involves... |
| Out-of-Domain Detection | Algorithmic | Architecture/model Agnostic, Architecture/neural Networks/transformer/llm, +2 | Text | Out-of-domain (OOD) detection identifies user inputs that fall outside an AI system's intended domain or capabilities,... |
| Prompt Robustness Testing | Testing | Architecture/neural Networks/transformer/llm, Requirements/black Box | Text | Prompt robustness testing evaluates how consistently models perform when prompts undergo minor variations in wording,... |
| Prompt Injection Testing | Testing | Architecture/neural Networks/transformer/llm, Paradigm/generative, +1 | Text | Prompt injection testing systematically evaluates LLMs and generative AI systems for vulnerabilities where malicious... |
| Retrieval-Augmented Generation Evaluation | Testing | Architecture/neural Networks/transformer/llm, Paradigm/generative, +1 | Text | RAG evaluation assesses systems combining retrieval and generation by measuring retrieval quality, generation... |
| AI Agent Safety Testing | Testing | Architecture/neural Networks/transformer/llm, Paradigm/generative, +1 | Any | AI agent safety testing evaluates autonomous AI agents that interact with external tools, APIs, and systems to ensure... |
| Toxicity and Bias Detection | Testing | Architecture/model Agnostic, Architecture/neural Networks/transformer/llm, +2 | Text | Toxicity and bias detection uses automated classifiers and human review to identify harmful, offensive, or biased... |
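
As a rough illustration of the Prompt Robustness Testing entry above, the sketch below measures how often a model gives the same answer across minor rewordings of one prompt. It is a minimal sketch, not the catalogue's method: `consistency_rate`, `toy_model`, and the example prompts are hypothetical names and data introduced here, and `toy_model` is a stand-in for a real LLM call.

```python
from collections import Counter
from typing import Callable, Iterable


def consistency_rate(model: Callable[[str], str], prompts: Iterable[str]) -> float:
    """Fraction of prompt variants whose normalized answer equals the most common answer."""
    answers = [model(p).strip().lower() for p in prompts]
    if not answers:
        raise ValueError("need at least one prompt variant")
    _, top_count = Counter(answers).most_common(1)[0]
    return top_count / len(answers)


def toy_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; deliberately brittle to all-caps input."""
    return "unknown" if prompt.isupper() else "Paris"


# Minor variations in casing and wording of the same question (illustrative data).
variants = [
    "What is the capital of France?",
    "what is the capital of france?",
    "What's the capital of France?",
    "WHAT IS THE CAPITAL OF FRANCE?",
]

rate = consistency_rate(toy_model, variants)
print(f"consistency across {len(variants)} variants: {rate:.2f}")
```

Exact-match agreement after lowercasing is a simplifying assumption; free-form generations would typically need a softer comparison, such as semantic similarity or a task-specific scorer.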