Swiss-AI-Safety/swiss-summer-camp-23

Summer Camp 2023

Program

Day 1 - Intro

Technical

Run the notebooks in Google Colab:

  1. Exercise 1 PyTorch Introduction
  2. Exercise 2 Optimization
  3. Exercise 3 Einops Basics
  4. Exercise 4 Einops for Deep Learning
  5. Exercise 5 Bonus Hyperparameters

Conceptual

Introduction to AI Safety

Governance

Introduction to AIS and AI Governance

Day 2 - RL

Technical

  1. Theory, Reinforcement Learning: Overleaf | PDF | book reference
  2. Exercise, Deep Q-Learning: Google Colab
  3. Reading (Chapter 13 through the end of Section 13.3): RL Intro Policy Optimization
  4. Exercise, Policy Gradient: Google Colab

Conceptual

RL and problems

Day 3 - Transformers

Technical

Introduction to Transformers

Play with Tokenizer

Governance

2. Understanding the Ecosystem

Day 4 - Prediction and Interpretability

Conceptual

3. Prediction

Technical

Induction Circuits

Day 5 - RLHF and Adversarial attacks

Conceptual

4. Scalable Oversight

Technical: Policy Gradient and RLHF

Slides: Policy Gradient and RLHF (from page 23)

Ex 1: RLHF

Reading: Secrets of RLHF in Large Language Models Part I: PPO

Reading: Learning to summarize from human feedback

Technical: Adversarial attacks

Slides: Adversarial attacks introduction

Ex 2: Fast Gradient Sign Method notebook

Challenge: Gandalf Jailbreak challenge

Reading + code: LLM attacks automatic jailbreak

Day 6 - Governance Projects

Project

Day 7 - Conceptual Interpretability and Compute Governance

Conceptual

Interpretability

Governance

Compute Governance

Installation

You can run the notebooks in Google Colab or locally.

Run locally

To run the notebooks locally, clone the repository, create a conda environment, and install the dependencies:

git clone https://github.com/Swiss-AI-Safety/swiss-summer-camp-23.git
cd swiss-summer-camp-23
conda create --name SAIS python=3.9 -y
conda activate SAIS
conda install pytorch torchvision torchaudio cpuonly -c pytorch -y
pip install -r requirements.txt
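
After installing, a quick check from inside the activated environment can confirm that the packages are importable. The following is a minimal sketch, not part of the repository: the helper name `check_environment` is hypothetical, and the package list covers only the packages named in the conda command above (`requirements.txt` may pull in others).

```python
import importlib.util
import sys

# Packages named in the conda install command above. This list is an
# assumption for illustration; requirements.txt may add further packages.
EXPECTED_PACKAGES = ("torch", "torchvision", "torchaudio")


def check_environment(packages=EXPECTED_PACKAGES):
    """Return a dict mapping each package name to True if it can be found."""
    # find_spec locates a top-level module without importing it.
    return {name: importlib.util.find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}")
    for name, found in check_environment().items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

If any package is reported as missing, rerunning the install commands with the SAIS environment activated is the first thing to try.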
