This repository contains code relating to variations of the teacher-student framework. It can be used to investigate continual learning, transfer learning, and critical learning periods, among other learning regimes. Results from papers such as
should all be reproducible from this code.
The majority of the code is written in Python. Any version above Python 3.5 should work, although extensive testing has not been carried out (development was done in Python 3.9, so this will be the most reliable version). Other Python requirements can be satisfied by running (preferably in a virtual environment):
pip install -r requirements.txt
Finally, the package itself should be installed by running
pip install -e .
from the root of the repository.
Some parts, specifically the ODEs, are implemented in C++ (C++17). I plan to add a Python implementation to forgo these additional requirements (see TODOs below). In addition to a standard C++17 compiler, the Eigen package is needed for the linear algebra computations. Saving the Eigen header files in the root of the repository should be sufficient; see the Eigen documentation for more details.
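The planned Python implementation does not exist yet; purely as illustration, a pure-Python solver for the order-parameter ODEs could be built around a generic explicit-Euler step like the sketch below. All names here are hypothetical and do not correspond to the repository's actual API.

```python
import numpy as np

def euler_integrate(flow, state, dt, n_steps):
    """Integrate dstate/dt = flow(state) with the explicit Euler method.

    flow  -- callable mapping the current state (a dict of numpy arrays,
             e.g. order parameters) to its time derivative
    state -- initial state, e.g. {"Q": ..., "R": ..., "T": ...}
    """
    trajectory = [state]
    for _ in range(n_steps):
        derivative = flow(state)
        state = {key: value + dt * derivative[key] for key, value in state.items()}
        trajectory.append(state)
    return trajectory

# Toy usage: exponential decay of a single "order parameter" Q.
decay = lambda s: {"Q": -s["Q"]}
path = euler_integrate(decay, {"Q": np.array(1.0)}, dt=0.01, n_steps=100)
print(path[-1]["Q"])  # approximately exp(-1)
```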
The main interface of the code is the config.yaml file found in the experiments folder. In this file you can specify the parameters of an experiment, which can then be run using python run.py.
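For example, a sweep script could edit the configuration programmatically before each run. The sketch below is illustrative only: the config keys shown are hypothetical (consult config.yaml for the actual schema), and invoking run.py from the repository root is an assumption.

```python
import subprocess
import yaml  # PyYAML

# Load the experiment configuration (key names below are illustrative).
with open("experiments/config.yaml") as f:
    config = yaml.safe_load(f)

config["seed"] = 0  # hypothetical parameter override

with open("experiments/config.yaml", "w") as f:
    yaml.safe_dump(config, f)

# Launch the experiment (assumes run.py is invoked from the repository root).
subprocess.run(["python", "run.py"], check=True)
```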
A single run will produce a number of output files. By default these will be located in experiments/results/ under a folder named by the timestamp of the experiment. The files will include scalar data (e.g. generalisation errors and order parameters) for the network simulations and/or the ODE solutions, as well as plots of this data (under a subfolder named plots).
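Since the run folders are timestamped, the latest run can be located by sorting folder names. The snippet below is a sketch of this; the scalar-data file name is hypothetical, so inspect the run folder for the actual names and formats.

```python
from pathlib import Path
import pandas as pd

results_root = Path("experiments/results")
# Timestamped folder names sort chronologically, so max() picks the latest run.
latest_run = max((p for p in results_root.iterdir() if p.is_dir()),
                 key=lambda p: p.name)

# File name below is hypothetical; check the run folder for actual outputs.
errors = pd.read_csv(latest_run / "generalisation_errors.csv")
print(errors.head())
```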
Below is a summary of the models / experiment configurations that have been implemented, and those that are planned. For features not yet implemented, an asterisk (*) denotes that the feature has been completed but not yet integrated/pushed.
- Standard teacher-student framework with IID Gaussian inputs (see the sketch after this list).
- Hidden Manifold Model (HMM) where input data has non-trivial correlations.
- Multi-teacher extensions of the above.
- Teachers rotated in feature and/or readout space.
- Identical teachers.
- Teachers with a fraction of nodes shared and a fraction rotated.
- Interpolation between different projection matrices for HMM.
- Interleaved replay of previous teacher during training of second (networks only).
- Output noise to teachers.
- Input noise to student only (e.g. for critical learning).
- Frozen hidden units (e.g. for critical learning).
- Classification (currently only regression is implemented).
- RL Perceptron (Nish et al., 2023).
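For concreteness, the core of the standard setup (the first item above) can be sketched as follows, under common soft-committee-machine conventions (erf activations, 1/sqrt(N) pre-activation scaling, fixed readout weights). None of these names correspond to the repository's actual code; it is a minimal illustration only.

```python
import numpy as np
from scipy.special import erf

rng = np.random.default_rng(0)
N, M, K, P = 500, 2, 3, 10_000  # input dim, teacher units, student units, test samples

# First-layer weights; second-layer weights are fixed to one here for brevity.
teacher = rng.standard_normal((M, N))
student = rng.standard_normal((K, N))

def scm_output(weights, inputs):
    # Soft committee machine: sum of erf units with 1/sqrt(N) scaling.
    return erf(inputs @ weights.T / np.sqrt(N)).sum(axis=1)

# IID Gaussian inputs; Monte Carlo estimate of the generalisation error.
X = rng.standard_normal((P, N))
gen_error = 0.5 * np.mean((scm_output(student, X) - scm_output(teacher, X)) ** 2)
print(f"generalisation error ~ {gen_error:.4f}")
```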
The following features are not yet included in this repository but are certainly possible:
- *More than 2 teachers.
- *More than 2 layers.
- *Mean field scaling.
- *MNIST or other arbitrary datasets.
- *Symmetric initialisation for students.
- *Other teacher configurations (e.g. drifting).
- *Copying head at switch.
- *Path integral consolidation (Zenke, Poole).
Code for most of these exists already but has not been pushed to this repository, in an attempt to minimise complexity.
A limited number of integration tests have been implemented. They can be run with python -m unittest
from the root of the repository.
These integration tests exist primarily to ensure that previously supported use-cases remain unaffected when new models and features are implemented. They can take some time to complete.
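As a rough illustration of the style of test involved, a regression-style smoke test might look like the following. This is hypothetical: the actual test suite and entry points may be structured differently.

```python
import subprocess
import unittest

class TestDefaultExperiment(unittest.TestCase):
    """Illustrative smoke test; the repository's actual tests may differ."""

    def test_run_exits_cleanly(self):
        # The default configuration should still run end-to-end after changes.
        result = subprocess.run(["python", "run.py"], capture_output=True)
        self.assertEqual(result.returncode, 0)

if __name__ == "__main__":
    unittest.main()
```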