-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathREADME.Rmd
230 lines (182 loc) · 8.81 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
<!-- devtools::build_readme() is a good way to build this -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
# use released flywire cell types even if you have access to in progress
# information
options(fafbseg.use_static_celltypes = TRUE)
```
<img src="man/figures/coconatfly-200.jpg" align="right" height="200" />
# coconatfly
<!-- badges: start -->
[](https://lifecycle.r-lib.org/articles/stages.html#experimental)
[](https://github.com/natverse/coconatfly/actions/workflows/R-CMD-check.yaml)
[](https://app.codecov.io/gh/natverse/coconatfly?branch=master)
<!-- badges: end -->
**coconatfly** enables comparative/integrative connectomics across Drosophila datasets.
The philosophy is to provide access to the most important
functions for connectome analysis in a way that is both convenient and
uniform across Drosophila datasets.
The package builds upon the [coconat](https://natverse.org/coconat/)
package which provides more basic and/or dataset agnostic functionality. In case
you were wondering, **coconat** stands for COmparative COnnectomics for the NATverse
and **coconatfly** enables this specifically for fly datasets.
Although the code is already in active use, especially for comparison of the
hemibrain and flywire datasets, it remains experimental.
Therefore the interface should not yet been relied upon. In particular,
it is quite likely that refactoring will abstract more functionality into
[coconat](https://natverse.org/coconat/) as time goes by in order to enable
more core functionality to be reused.
## Datasets
At present the following datasets are supported (dataset names used in the package in brackets):
1. Janelia hemibrain (**hemibrain**)
2. Female Adult Fly Brain - [FlyWire connectome](https://flywire.ai/) (**flywire**)
3. [Janelia male Ventral Nerve Cord](https://www.janelia.org/project-team/flyem/manc-connectome) (**manc**)
4. Wei Lee, John Tuthill and colleagues [Female Adult Nerve Cord](https://github.com/htem/FANC_auto_recon) (**fanc**)
5. Janelia Male CNS (**malecns**)
6. Janelia Male Optic Lobe (part of the malecns) (**opticlobe**)
7. Wei Lee and colleagues [Brain and Nerve Cord](https://github.com/jasper-tms/the-BANC-fly-connectome/wiki) (**banc**)
Datasets 1-4 and 6, 7 are either public (hemibrain, manc, flywire, opticlobe) or
access can be requested subject to agreeing to certain terms of use (fanc, banc).
The Male CNS dataset is currently undergoing
proofreading and annotation in a collaboration between the
[FlyEM](https://www.janelia.org/project-team/flyem) and
[Cambridge Drosophila Connectomics Group](https://flyconnecto.me). Release is
anticipated late 2024.
## Installation
We recommend installing the latest version of **coconatfly** from github like so:
```{r eval=FALSE}
install.packages('natmanager')
natmanager::install(pkgs = 'coconatfly')
```
but you may also choose a specific
[released version](https://github.com/natverse/coconatfly/releases)
for enhanced reproducibility:
```{r eval=FALSE}
natmanager::install(pkgs = 'coconatfly@v0.2.2')
```
Some of the datasets exposed by **coconatfly** require authentication for access
or are still being annotated in private pre-release. Please consult individual
package dependencies for authentication details and do not be surprised if you
do not have access to all datasets at this time.
For installation of private packages (currently restricted to the male cns
dataset being developed with our collaborators at the
[FlyEM Team at Janelia](https://www.janelia.org/project-team/flyem))
you will need a GITHUB_PAT (Personal Access Token - an alternative to
a username+password).
This code checks if you have a PAT GITHUB_PAT and offers to make one if necessary.
```{r eval=FALSE}
natmanager::check_pat()
```
## An example
First let's load the libraries we need
```{r, message=FALSE, warning=FALSE, results='hide'}
library(coconatfly)
library(dplyr)
```
Two important functions are `cf_ids()` which allows you to specify a set of
neurons from one or more datasets and `cf_meta()` which fetches information
about each neuron including its cell type.
For example let's fetch information about DA1 projection neurons:
```{r}
cf_meta(cf_ids('DA1_lPN', datasets = 'hemibrain'))
```
We can also do that for multiple brain datasets
```{r}
da1meta <- cf_meta(cf_ids('DA1_lPN', datasets = c('hemibrain', 'flywire')))
head(da1meta)
```
```{r}
da1meta %>%
count(dataset, side)
```
We can also fetch connectivity for these neurons:
```{r}
da1ds <- da1meta %>%
cf_partners(threshold = 5, partners = 'output')
head(da1ds)
```
```{r}
da1ds %>%
group_by(type, dataset, side) %>%
summarise(weight=sum(weight), npre=n_distinct(pre_id), npost=n_distinct(post_id))
```
Let's restrict that to types that are observed in both datasets. We do this
by counting how many distinct datasets exist for each type in the results.
```{r, }
da1ds.shared_types.wide <- da1ds %>%
filter(!(dataset=='hemibrain' & side=='L')) %>% # drop truncated hemibrain LHS
group_by(type) %>%
mutate(datasets_type=n_distinct(dataset)) %>%
filter(datasets_type>1) %>%
group_by(type, dataset, side) %>%
summarise(weight=sum(weight)) %>%
mutate(shortdataset=abbreviate_datasets(dataset)) %>%
tidyr::pivot_wider(id_cols = type, names_from = c(shortdataset,side),
values_from = weight, values_fill = 0)
da1ds.shared_types.wide
```
With the data organised like this, we can easily compare the connection strengths
between the cell types across hemispheres:
```{r flywire-left-vs-right}
library(ggplot2)
da1ds.shared_types.wide %>%
filter(type!='KCg-m') %>%
ggplot(data=., aes(fw_L, fw_R)) +
geom_point() +
stat_smooth(method = "lm", formula = y ~ x + 0) +
geom_abline(slope=1, linetype='dashed')
```
... and across datasets:
```{r flywire-vs-hemibrain}
da1ds.shared_types.wide %>%
filter(type!='KCg-m') %>%
ggplot(data=., aes(fw_R, hb_R)) +
geom_point() +
stat_smooth(method = "lm", formula = y ~ x + 0) +
geom_abline(slope=1, linetype='dashed')
```
## Across dataset connectivity clustering
Being able to fetch shared connectivity in a uniform format is a building block
for a range of analyses. For example, we can compare the connectivity of a set
of neurons that are believed to constitute the same cell type across multiple
datasets. Cosine similarity clustering seems to work very well for this purpose.
```{r lal-cosine-cluster}
cf_cosine_plot(cf_ids('/type:LAL0(08|09|10|42)', datasets = c("flywire", "hemibrain")))
```
Each row (and column) correspond to a single neuron. Rows are labelled by cell
type, dataset and hemisphere; due to truncation hemibrain neurons sometimes
exist in one hemisphere, sometimes both.
Notice that LAL009 and LAL010 neurons from each hemisphere co-cluster together
exactly as we would expect for a cell type conserved across brains. In contrast
LAL008 and LAL042 are intermingled; we believe that these constitute a single
cell type of two cells / hemisphere (i.e. they should not have been split into
two cell types in the hemibrain).
You can also see that cells from one hemibrain hemisphere often cluster slightly
oddly (e.g. `387687146`) - this is likely due to truncation of the axons or dendrites of these cells
or a paucity of partners from the left hand side of the hemibrain.
## Going further
We strongly recommend consulting the online manual visible at https://natverse.org/coconatfly/. In particular the vignette(s) listed at
https://natverse.org/coconatfly/articles provide full code and instructions
for a step by step walk through.
## Acknowledgements
Upon publication, please ensure that you appropriately cite all datasets that
you use in your analysis.
In addition in order to justify continued development of natverse tools in general and
coconatfly in particular, we would appreciate two citations for
* For the natverse: [Bates et al eLife 2020](https://doi.org/10.7554/eLife.53350)
* For coconatfly: [Schlegel et al Nature 2024](https://doi.org/10.1038/s41586-024-07686-5)
Should you make significant use of natverse packages in your paper
(e.g. multiple panels or >1 figure), we would also strongly
appreciate a statement like this in the acknowledgements that can be tracked by
our funders.
> Development of the natverse including the coconatfly and fafbseg packages has been supported
by the NIH BRAIN Initiative (grant 1RF1MH120679-01), NSF/MRC Neuronex2 (NSF 2014862/MC_EX_MR/T046279/1) and core funding from the Medical Research Council (MC_U105188491).