Hannah Wayment-Steele, last updated Jan 2, 2024
For experiments performed in ColabFold, path to zipped files of ColabFold output are provided.
The run_af2.py
script, which all the non-ColabFold predictions used, is contained in this repo at /scripts/run_AF2.py
.
Left: default KaiBTE MSA in ColabFold, 3 recycles, depicting highest pLDDT of 5 models.
data_sep2022/colabfold_outputs/2qke_orig_default_colabfold_jul2022.result.zip
Right: Closest 50 sequences of KaiBTE MSA in ColabFold, 3 recycles, depicting highest pLDDT of 5 models.
data_sep2022/colabfold_outputs/2qke_top50_seqs_jul2022.result.zip
AF-Cluster models of KaiBTE. Generated using run_af2.py, model 1, 3 recycles, MSAs from AF-Cluster routine (ClusterMSA.py
)
python run_af2.py AF_Cluster/data_sep2022/AF_Cluster/data_sep2022/00_KaiB/msas/2QKEE_00.a3m --model_num 1 --recycles 3 --output_dir .
Models of 4 KaiB variants. Generated using run_af2.py, model 1, 3 recycles, “closest-10” MSAs (closest 10 sequences in phylogenetic tree by edit distance).
python run_af2.py AF_Cluster/data_sep2022/02_KaiB_fromPhylTree/msas/WP_069333443.1.a3m --model_num 1 --recycles 3 --output_dir .
R. sphaeroides: WP_069333443
T. elongatus: WP_011056333
S. elongatus: WP_011242647
L. pneumophilia: WP_027265801
KaiBTV-4 model, generated using ColabFold and MSA of 10 closest sequences in phylogenetic tree. Model 1, 12 recycles.
data_sep2022/colabfold_outputs/kaibtv4_local_MSA.result.zip
Models of point mutants: generated using single sequences, model=1, 12 recycles
Fastas of single sequences are in pt_mutant_fastas_apr7.zip. First unzip:
unzip pt_mutant_fastas_apr7.zip
python run_af2.py AF_Cluster/data_sep2022/03_KaiBRS_and_pt_muts/indiv_fastas_apr7/V83D_Y35C.fasta --model_num 1 --recycles 12 --output_dir .
RfaH models. Generated using run_af2.py
, model 1, 3 recycles, MSAs from AF-Cluster routine (/scripts/ClusterMSA.py
)
python run_af2.py AF_Cluster/data_sep2022/04_OtherFoldswitchers/00_RfaH/msas/RFAH_000.a3m --model_num 1 --recycles 3 --output_dir .
Mad2 models. Generated using run_af2.py
, model 1, 3 recycles, MSAs from AF-Cluster routine (/scripts/ClusterMSA.py
)
python run_af2.py AF_Cluster/data_sep2022/01_Mad2/1S2H_msas/1S2H_000.a3m --model_num 1 --recycles 3 --output_dir .
AF-Cluster initial screen
Generated using run_af2.py
, model 1, 1 recycle, MSAs from AF-Cluster routine (ClusterMSA.py)
python run_af2.py AF_Cluster/data_sep2022/05_Thioredoxins/00_Mpt53/msas/1LU4A_0.a3m --model_num 1 --recycles 1 --output_dir .
More sampling on Mpt53
Generated using run_af2.py
, model 1, 3 recycles, MSAs from AF-Cluster routine (ClusterMSA.py)
python run_af2.py AF_Cluster/data_sep2022/05_Thioredoxins/00_Mpt53/msas/1LU4A_0.a3m --model_num 1 --recycles 3 --output_dir .
KaiB-TE (PDB: 2QKE), closest 50, closest 50-100
Generated using ColabFold, 3 recycles. 5 models depicted.
Closest 50: data_sep2022/colabfold_outputs/ColabFold output: data_sep2022/colabfold_outputs/2qke_top50_seqs_jul2022.result.zip
Closest 50-100: data_sep2022/colabfold_outputs/2qke_next50_aug2023.result.zip
Sweep of eps values, same method as Fig 1d,e,f
KaiB AF-Cluster models: same method/data as Fig 1d,e,f
Mutant models: same method/data as Fig 3b
RfaH in AF2, complete MSA
Generated using ColabFold, 3 recycles, depicting 5 models.
data_sep2022/colabfold_outputs/Rfah_3_recycles_default_colabfold.result.zip
Other fold-switchers
Selecase:
python run_af2.py AF_Cluster/data_sep2022/02_selecase/msas/4QHF_000.a3m --model_num 1 --recycles 3 --output_dir .
Lymphotactin:
python run_af2.py AF_Cluster/data_sep2022/03_lymphotactin/2JP1_msas/2JP1_000.a3m --model_num 1 --recycles 3 --output_dir .
CLIC1:
python run_af2.py AF_Cluster/data_sep2022/04_OtherFoldswitchers/04_CLIC1/1K0N_000.a3m --model_num 1 --recycles 3 --output_dir .
Enhanced sampling Mpt53: Same method as Fig. 5c
Homologues to Mpt53: Same method as Fig. 5c
G_A/G_B homologues: Performed using ColabDesign code, 0 recycles, 8 seeds, 5 models.
Colab notebook:
data_sep2022/colabfold_outputs/GAGB_mutations_in_colabdesign.ipynb
Comparing models from single sequence and “closest-10” MSAs.
“Closest-10” models are from the methods described in Fig. 2b (3 recycles, model 1). Single-sequence models were generated analogously (3 recycles, model 1).
First unzip single seq fastas:
unzip AF_Cluster/data_sep2022/00_KaiB/fastas_single_seq.zip
python run_af2.py AF_Cluster/data_sep2022/00_KaiB/fastas_single_seq/WP_069333443.1.fasta --model_num 1 --recycles 3 --output_dir .
10 tests of shuffling residues within columns.
These were performed in ColabFold with 3 recycles and model 1.
data_sep2022/colabfold_outputs/Supp_Discussion_fig1c_scrambled_msas/*.result.zip
Supplemental figure 2: Testing increased sampling on KaiBRS.
Performed in colabfold, using 16 seeds, 3 recycles, 5 models, use_dropout=True.
data_sep2022/colabfold_outputs/kaibrs_lots_of_sampling_supp_discussion_fig2.result.zip
Note: in this .zip, .pickle files were not uploaded due to file size.