You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Prepare the initial draft of the "Related Work" section for a technical report on prompt leakage probing using the agentic framework AutoGen. This includes gathering and analyzing relevant literature on attacks on large language models (LLMs), with a specific focus on prompt leakage. The section should be structured to include an overview of attack types, notable studies, and identify gaps related to the use of agentic frameworks for probing.
Checklist
Gather Initial Literature
Find a comprehensive review article on LLM attacks (e.g., prompt leakage, prompt injection, jailbreak attacks).
Include key characteristics of each attack type with precise descriptions:
Prompt Leakage: Methods and motivations, e.g., accessing hidden or restricted prompt content.
Prompt Injection: Manipulating prompts to alter model behavior.
Jailbreak Attacks: Techniques for bypassing restrictions or ethical safeguards.
Identify and cite foundational or highly-cited papers for each attack type.
Analyze Literature
Summarize key findings from selected works.
Highlight gaps in existing research relevant to our focus on AutoGen and prompt leakage.
Include Varied Methods of Prompt Leakage
Search for papers detailing diverse approaches to prompt leakage (e.g., encoding strategies like Base64).
LLM-Generated Attacks on LLMs
Collect studies where LLMs have been used to design or execute attacks against other LLMs.
Explore Use of Agentic Frameworks
Investigate whether any prior works have used agentic frameworks, like AutoGen, for probing LLM attacks.
Note if such work has not been done before (expected outcome).
Draft Related Work Section
Write the initial draft, integrating findings and citations.
Add missing quotes to defence mechanisms
Notes
Focus on Prompt Leakage: Centralize on studies that explore prompt leakage in diverse contexts, including unconventional methods like Base64 encoding.
Agentic Frameworks Gap: This work aims to highlight the novelty of using AutoGen for probing attacks. Confirm the absence of similar prior work.
Quality Sources: Prioritize high-citation reviews and foundational papers for credibility and impact.
The text was updated successfully, but these errors were encountered:
Description
Prepare the initial draft of the "Related Work" section for a technical report on prompt leakage probing using the agentic framework AutoGen. This includes gathering and analyzing relevant literature on attacks on large language models (LLMs), with a specific focus on prompt leakage. The section should be structured to include an overview of attack types, notable studies, and identify gaps related to the use of agentic frameworks for probing.
Checklist
Notes
The text was updated successfully, but these errors were encountered: