<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.1d1 20130915//EN"  "JATS-archivearticle1.dtd"><article article-type="research-article" dtd-version="1.1d1" xmlns:xlink="http://www.w3.org/1999/xlink"><front><journal-meta><journal-id journal-id-type="nlm-ta">elife</journal-id><journal-id journal-id-type="hwp">eLife</journal-id><journal-id journal-id-type="publisher-id">eLife</journal-id><journal-title-group><journal-title>eLife</journal-title></journal-title-group><issn publication-format="electronic">2050-084X</issn><publisher><publisher-name>eLife Sciences Publications, Ltd</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="publisher-id">01120</article-id><article-id pub-id-type="doi">10.7554/eLife.01120</article-id><article-categories><subj-group subj-group-type="display-channel"><subject>Research article</subject></subj-group><subj-group subj-group-type="heading"><subject>Neuroscience</subject></subj-group></article-categories><title-group><article-title>Expanding the olfactory code by in silico decoding of odor-receptor chemical space</article-title></title-group><contrib-group><contrib contrib-type="author" id="author-6204"><name><surname>Boyle</surname><given-names>Sean Michael</given-names></name><xref ref-type="aff" rid="aff1"/><xref ref-type="other" rid="par-1"/><xref ref-type="fn" rid="con1"/><xref ref-type="fn" rid="conf1"/></contrib><contrib contrib-type="author" id="author-6205"><name><surname>McInally</surname><given-names>Shane</given-names></name><xref ref-type="aff" rid="aff2"/><xref ref-type="fn" rid="con3"/><xref ref-type="fn" rid="conf3"/></contrib><contrib contrib-type="author" corresp="yes" id="author-6025"><name><surname>Ray</surname><given-names>Anandasankar</given-names></name><xref ref-type="aff" rid="aff1"/><xref ref-type="aff" rid="aff2"/><xref ref-type="aff" rid="aff3"/><xref ref-type="corresp" rid="cor1">*</xref><xref 
ref-type="fn" rid="con2"/><xref ref-type="fn" rid="conf2"/></contrib><aff id="aff1"><institution content-type="dept">Genetics, Genomics, and Bioinformatics Program</institution>, <institution>University of California, Riverside</institution>, <addr-line><named-content content-type="city">Riverside</named-content></addr-line>, <country>United States</country></aff><aff id="aff2"><institution content-type="dept">Department of Entomology</institution>, <institution>University of California, Riverside</institution>, <addr-line><named-content content-type="city">Riverside</named-content></addr-line>, <country>United States</country></aff><aff id="aff3"><institution content-type="dept">Institute of Integrative Genome Biology</institution>, <institution>University of California, Riverside</institution>, <addr-line><named-content content-type="city">Riverside</named-content></addr-line>, <country>United States</country></aff></contrib-group><contrib-group content-type="section"><contrib contrib-type="editor"><name><surname>Luo</surname><given-names>Liqun</given-names></name><role>Reviewing editor</role><aff><institution>Stanford University</institution>, <country>United States</country></aff></contrib></contrib-group><author-notes><corresp id="cor1"><label>*</label>For correspondence: <email>anand.ray@ucr.edu</email></corresp></author-notes><pub-date date-type="pub" publication-format="electronic"><day>01</day><month>10</month><year>2013</year></pub-date><pub-date pub-type="collection"><year>2013</year></pub-date><volume>2</volume><elocation-id>e01120</elocation-id><history><date date-type="received"><day>21</day><month>06</month><year>2013</year></date><date date-type="accepted"><day>26</day><month>08</month><year>2013</year></date></history><permissions><copyright-statement>© 2013, Boyle et al</copyright-statement><copyright-year>2013</copyright-year><copyright-holder>Boyle et al</copyright-holder><license 
xlink:href="http://creativecommons.org/licenses/by/3.0/"><license-p>This article is distributed under the terms of the <ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/3.0/">Creative Commons Attribution License</ext-link>, which permits unrestricted use and redistribution provided that the original author and source are credited.</license-p></license></permissions><self-uri content-type="pdf" xlink:href="elife01120.pdf"/><related-article ext-link-type="doi" id="ra1" related-article-type="commentary" xlink:href="10.7554/eLife.01605"/><abstract><object-id pub-id-type="doi">10.7554/eLife.01120.001</object-id><p>Coding of information in the peripheral olfactory system depends on two fundamental factors: the interaction of individual odors with subsets of the odorant receptor repertoire, and the mode of signaling (activation or inhibition) that an individual receptor–odor interaction elicits. We develop a cheminformatics pipeline that predicts receptor–odorant interactions from a large collection of chemical structures (&gt;240,000) for receptors that have been tested against a smaller panel of odorants (∼100). Using a computational approach, we first identify shared structural features from known ligands of individual receptors. We then use these features to screen in silico new candidate ligands from &gt;240,000 potential volatiles for several odorant receptors (Ors) in the <italic>Drosophila</italic> antenna. Functional experiments on 9 Ors support a high success rate (∼71%) for the screen, resulting in identification of numerous new activators and inhibitors. 
Such computational prediction of receptor–odor interactions has the potential to enable systems level analysis of olfactory receptor repertoires in organisms.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01120.001">http://dx.doi.org/10.7554/eLife.01120.001</ext-link></p></abstract><abstract abstract-type="executive-summary"><object-id pub-id-type="doi">10.7554/eLife.01120.002</object-id><title>eLife digest</title><p>Although our sense of smell is regarded as inferior to that of many other species, we can nevertheless distinguish between roughly 10,000 different odors. These are made up of molecules called odorants, each of which activates a specific subset of odorant receptors in the nose. However, much of what we know about this process has come from studying the fruit fly, <italic>Drosophila</italic>, which detects odors using receptors located mainly on its antennae.</p><p>The number of potential odorants in nature is vast, and only a tiny fraction of the interactions between odorants and receptors can be physically tested. To address this challenge, Boyle et al. have used a computational approach to study in depth the interactions between a subset of 24 odorant receptors in <italic>Drosophila</italic> antennae and 109 odorants.</p><p>After developing a method to identify structural features shared by the odorants that activate each receptor, Boyle et al. used this information to perform a computational (in silico) screen of more than 240,000 different odorant-like volatile compounds. For each receptor, they compiled a list of the 500 odorants predicted to interact most strongly with it. 
They then tested their predictions for a subset of the receptors by performing experiments in living flies, and found that roughly 71% of predicted compounds did indeed activate or inhibit their receptors, compared to only 10% of a control sample.</p><p>In addition to providing new insights into the nature of the interactions between odorants and their receptors, the computational screen devised by Boyle et al. could aid the development of novel insect repellents, or compounds that mask the odors used by disease-causing insects to identify their hosts. It could also be used in the future to develop novel flavors and fragrances.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01120.002">http://dx.doi.org/10.7554/eLife.01120.002</ext-link></p></abstract><kwd-group kwd-group-type="author-keywords"><title>Author keywords</title><kwd>odorant receptors</kwd><kwd>antenna</kwd><kwd>electrophysiology</kwd><kwd>cheminformatics</kwd></kwd-group><kwd-group kwd-group-type="research-organism"><title>Research organism</title><kwd><italic>D. 
melanogaster</italic></kwd></kwd-group><funding-group><award-group id="par-1"><funding-source><institution-wrap><institution>National Science Foundation</institution></institution-wrap></funding-source><award-id>IGERT</award-id><principal-award-recipient><name><surname>Boyle</surname><given-names>Sean Michael</given-names></name></principal-award-recipient></award-group><funding-statement>The funder had no role in study design, data collection and interpretation, or the decision to submit the work for publication.</funding-statement></funding-group><custom-meta-group><custom-meta><meta-name>elife-xml-version</meta-name><meta-value>2</meta-value></custom-meta><custom-meta specific-use="meta-only"><meta-name>Author impact statement</meta-name><meta-value>A computational method that can screen thousands of chemicals and predict which odorants will interact with specific odorant receptors in flies may ultimately aid the development of more effective insect repellents.</meta-value></custom-meta></custom-meta-group></article-meta></front><body><sec id="s1" sec-type="intro"><title>Introduction</title><p>The peripheral olfactory system is unparalleled in its ability to detect and discriminate amongst an extremely large number of volatile compounds in the environment. To detect this wide variety of volatiles, most organisms have evolved large families of receptor genes that typically encode 7-transmembrane proteins expressed in the olfactory neurons (<xref ref-type="bibr" rid="bib4">Buck and Axel, 1991</xref>; <xref ref-type="bibr" rid="bib10">Clyne et al., 1999</xref>; <xref ref-type="bibr" rid="bib15">de Bruyne and Baker, 2008</xref>; <xref ref-type="bibr" rid="bib53">Vosshall et al., 1999</xref>; <xref ref-type="bibr" rid="bib14">Dahanukar et al., 2005</xref>). Each volatile chemical in the environment is thought to interact with a specific subset of odorant receptors depending upon odor structure and binding sites on the receptor. 
The odor information detected and encoded by the peripheral olfactory neurons is subsequently processed, transformed and integrated in the central nervous system to generate specific behavioral responses that are critical for survival, such as finding food, finding mates and avoiding predators (<xref ref-type="bibr" rid="bib51">van der Goes van Naters and Carlson, 2006</xref>).</p><p>Currently there are two major rate-limiting steps in the analysis of peripheral coding in olfaction: only a very small proportion of chemical space can be systematically tested for its activity on odorant receptors, and only a very small fraction of the numerous odorant receptors has been tested for responses (<xref ref-type="bibr" rid="bib1">Araneda et al., 2000</xref>; <xref ref-type="bibr" rid="bib26">Hallem et al., 2004</xref>; <xref ref-type="bibr" rid="bib25">Hallem and Carlson, 2006</xref>; <xref ref-type="bibr" rid="bib40">Pelz et al., 2006</xref>; <xref ref-type="bibr" rid="bib33">Kreher et al., 2008</xref>; <xref ref-type="bibr" rid="bib42">Saito et al., 2009</xref>; <xref ref-type="bibr" rid="bib38">Mathew et al., 2013</xref>). The challenges in overcoming these rate-limiting steps are enormous. First, volatile chemical space is immense: more than 2000 odors in the environment have been catalogued from a small fraction of plant sources alone (<xref ref-type="bibr" rid="bib31">Knudsen et al., 2006</xref>). Second, the complete three-dimensional structures of the 7-transmembrane odorant receptor proteins have not yet been determined, so modeling of protein–odor interactions and sophisticated virtual screening methods are not yet possible except in rare instances (<xref ref-type="bibr" rid="bib50">Triballeau et al., 2008</xref>). 
In the decade since the first systematic study of 47 odorants on the <italic>Drosophila</italic> antenna in 2001 (<xref ref-type="bibr" rid="bib17">de Bruyne et al., 2001</xref>), additional studies have identified a total of only ∼250 novel activating odors (<xref ref-type="bibr" rid="bib16">de Bruyne et al., 1999</xref>; <xref ref-type="bibr" rid="bib17">de Bruyne et al., 2001</xref>; <xref ref-type="bibr" rid="bib18">Dobritsa et al., 2003</xref>; <xref ref-type="bibr" rid="bib22">Goldman et al., 2005</xref>; <xref ref-type="bibr" rid="bib26">Hallem et al., 2004</xref>; <xref ref-type="bibr" rid="bib25">Hallem and Carlson, 2006</xref>; <xref ref-type="bibr" rid="bib32">Kreher et al., 2005</xref>, <xref ref-type="bibr" rid="bib33">2008</xref>; <xref ref-type="bibr" rid="bib34">Kwon et al., 2007</xref>; <xref ref-type="bibr" rid="bib40">Pelz et al., 2006</xref>; <xref ref-type="bibr" rid="bib48">Stensmyr et al., 2003</xref>; <xref ref-type="bibr" rid="bib41">Turner and Ray, 2009</xref>; <xref ref-type="bibr" rid="bib52">van Naters and Carlson, 2007</xref>; <xref ref-type="bibr" rid="bib56">Yao et al., 2005</xref>; <xref ref-type="bibr" rid="bib43">Schmuker et al., 2007</xref>), which have been assembled and compared in an online database (<xref ref-type="bibr" rid="bib20">Galizia et al., 2010</xref>).</p><p>Here we overcome this challenge by designing a chemical-informatics platform that is effective and fast. To do so, we focused on one of the most comprehensive quantitative data sets available, in which the responses of 24 <italic>Drosophila</italic> odorant receptors to a panel of 109 odorants have been measured, providing a rich resource for structure-activity analyses (<xref ref-type="bibr" rid="bib25">Hallem and Carlson, 2006</xref>). We devised a method to identify molecular structural properties that are shared amongst the activating odorants for each receptor. 
We then utilize information about these shared molecular features of active odorants, that are presumably required for binding to a receptor, to perform in silico screens on a chemical space of &gt;240,000 chemicals, including a large collection of naturally occurring and biologically important odors, and identify the top 500 hits for each of the odorant receptors (Ors). We then use single-unit electrophysiology to validate a subset of predictions for 9 Ors in vivo and find that our method met an overall success rate of ∼71% in identifying novel ligands. This approach is specific since testing shows a low (10%) rate of finding ligands while using non-predicted odors. This approach allows us to create a computationally predicted peripheral coding map of a large chemical space, which substantially improves our ability to predict and investigate peripheral olfactory coding and provides a powerful tool for the discovery of novel ligands for Ors, some of which may be ecologically important or useful for behavior modification.</p></sec><sec id="s2" sec-type="results"><title>Results</title><sec id="s2-1"><title>Analysis of odorant structure</title><p>Since the structure of receptor protein complexes is not known, we analyzed receptor–odor interactions by applying the ‘similarity property principle’, which reasons that structurally similar molecules (e.g., activating odorants) are more likely to have similar properties (<xref ref-type="bibr" rid="bib29">Hendrickson, 1991</xref>; <xref ref-type="bibr" rid="bib37">Martin et al., 2002</xref>). Although this general approach has been useful in the area of pharmaceuticals (<xref ref-type="bibr" rid="bib37">Martin et al., 2002</xref>; <xref ref-type="bibr" rid="bib30">Keiser et al., 2009</xref>), receptor–odor analysis presents significant additional challenges. 
Not only are odorant molecules generally smaller in size than pharmaceuticals (average MW of known odors is ∼threefold lower than that of FDA-approved pharmaceuticals [<xref ref-type="bibr" rid="bib55">Wishart et al., 2008</xref>]) and therefore offer fewer structural features for differentiation, they are also detected by the receptors with specificity at extremely low concentrations in the volatile phase (<xref ref-type="bibr" rid="bib25">Hallem and Carlson, 2006</xref>; <xref ref-type="bibr" rid="bib33">Kreher et al., 2008</xref>). Additionally, odorant receptors are differentially tuned and can sometimes appear not to follow distinct structural rules: odors that look structurally different can strongly activate the same receptor, while odors that appear very similar may have very different levels of activity (<xref ref-type="bibr" rid="bib25">Hallem and Carlson, 2006</xref>). For example, while hexanal and γ-octalactone are structurally very different, they both strongly activate Or85b (<xref ref-type="bibr" rid="bib25">Hallem and Carlson, 2006</xref>). Conversely, while hexanal and pentanal are structurally very similar, they have very different activities against Or85b (<xref ref-type="bibr" rid="bib25">Hallem and Carlson, 2006</xref>).</p></sec><sec id="s2-2"><title>General measures of odorant similarity</title><p>Similarity in chemical structure can be described and measured quantitatively using multiple approaches; however, a single method may not be ideal for every application (<xref ref-type="bibr" rid="bib36">Maldonado et al., 2006</xref>). 
To test whether non-optimized approaches would be able to identify similarities in the shape of known activators, we compared four different approaches: Cerius2 (Accelrys Software Inc), Dragon (Talete), Maximum-Common-Substructure (MCS) (<xref ref-type="bibr" rid="bib6">Cao et al., 2008b</xref>), and atom-pair (AP) (<xref ref-type="bibr" rid="bib8">Carhart et al., 1985</xref>; <xref ref-type="bibr" rid="bib5">Cao et al., 2008a</xref>). Cerius2 and Dragon represent collections of 200 and 3224 molecular descriptors, respectively, that calculate values for a broad range of chemical properties such as molecular weight, functional group counts, and in the case of Dragon, three-dimensional relationships within molecules. The AP method compares shortest path distances between all atom pairs in a molecule. Lastly, MCS identifies the largest two-dimensional substructure shared by two compounds. Using each of these approaches, we computed distances between 109 odors that had previously been tested against 24 Ors from <italic>Drosophila melanogaster</italic> (<xref ref-type="bibr" rid="bib25">Hallem and Carlson, 2006</xref>). These represent most of the <italic>Or</italic> genes expressed in the <italic>Drosophila</italic> antenna (<xref ref-type="bibr" rid="bib25">Hallem and Carlson, 2006</xref>). Upon comparison, we find that none of the four methods was vastly superior and that the methods varied in their ability to group known activating odorants (‘actives’) close together in distance, as measured for each Or using a method called accumulative-percentage-of-actives (APoA) (<xref ref-type="bibr" rid="bib9">Chen and Reynolds, 2002</xref>) (‘Materials and methods’ and <xref ref-type="fig" rid="fig1s1">Figure 1—figure supplement 1</xref>) and the value of the area-under-the-curve (AUC). 
Ultimately, Dragon and Cerius2, which utilize a large number of diverse molecular descriptor values to describe each odor structure, performed better than AP or MCS, suggesting that a more diverse set of descriptors is better at explaining Or activity than two-dimensional measures alone (<xref ref-type="fig" rid="fig1">Figure 1B</xref>). AP and MCS were therefore excluded from further development.<fig-group><fig id="fig1" position="float"><object-id pub-id-type="doi">10.7554/eLife.01120.003</object-id><label>Figure 1.</label><caption><title>A receptor-optimized molecular descriptor approach has strong predictive power to find new ligands.</title><p>(<bold>A</bold>) Schematic of the cheminformatics pipeline used to identify novel ligands from a larger chemical space. (<bold>B</bold>) Plot of mean APoA values for 19 <italic>Drosophila</italic> Ors calculated using various methods including a previously identified set (<xref ref-type="bibr" rid="bib24">Haddad et al., 2008</xref>). (<bold>C</bold>) Receiver-operating-characteristic (ROC) curve representing computational validation of the ligand-predictive ability of the Or-optimization approach. 
(<bold>D</bold>) Hierarchical cluster analysis of the 109 odorants of the training set using Or-specific optimized descriptor sets to calculate distances in chemical space for odorant receptors with strong activators (green), and odorant receptors with no strong activators (yellow).</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01120.003">http://dx.doi.org/10.7554/eLife.01120.003</ext-link></p></caption><graphic xlink:href="elife01120f001"/></fig><fig id="fig1s1" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.01120.004</object-id><label>Figure 1—figure supplement 1.</label><caption><title>Analysis of APoA curves for individual odor receptors.</title><p>Plots of the mean APoA values obtained from various molecular descriptor methods demonstrate that optimized descriptor subsets generate the highest values. Previous = 32 Dragon descriptors selected in <xref ref-type="bibr" rid="bib24">Haddad et al. (2008)</xref>. Molecular descriptor methods were compared using the 109 compounds that were previously tested (<xref ref-type="bibr" rid="bib25">Hallem and Carlson, 2006</xref>).</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01120.004">http://dx.doi.org/10.7554/eLife.01120.004</ext-link></p></caption><graphic xlink:href="elife01120fs001"/></fig><fig id="fig1s2" position="float" specific-use="child-fig"><object-id pub-id-type="doi">10.7554/eLife.01120.005</object-id><label>Figure 1—figure supplement 2.</label><caption><title>Pharmacophores of active compounds for individual Ors.</title><p>Hierarchical cluster identical to <xref ref-type="fig" rid="fig1">Figure 1D</xref>. Known odorant activity is indicated using independent color gradient scales. Horizontal black bars underneath each cluster indicate members of the active cluster, a subset of which was used to generate pharmacophores using the Ligand Scout program (shown underneath each Or in two orientations). 
Yellow = hydrophobic region, red = hydrogen-bond acceptor, green/red = hydrogen-bond donor or acceptor depending upon pH.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01120.005">http://dx.doi.org/10.7554/eLife.01120.005</ext-link></p></caption><graphic xlink:href="elife01120fs002"/></fig></fig-group></p></sec><sec id="s2-3"><title>Identification of unique subsets of optimized descriptors for each <italic>Drosophila</italic> Or</title><p>Individual Ors respond to distinct subsets of ligands with some degree of overlap (<xref ref-type="bibr" rid="bib25">Hallem and Carlson, 2006</xref>; <xref ref-type="bibr" rid="bib33">Kreher et al., 2008</xref>). We reasoned that rather than using the entire Dragon or Cerius2 descriptor sets, which likely include a number of measurements for features irrelevant to ligand binding to an individual Or, judiciously selecting subsets of molecular descriptors suited to cluster activators for an individual receptor may be more effective at defining an Or-specific chemical space. To test this hypothesis, we used a Sequential-Forward-Selection (SFS) method to incrementally create unique optimized descriptor subsets for each Or from an initial combined set of 3424 descriptors from Dragon and Cerius2 (<xref ref-type="bibr" rid="bib54">Whitney, 1971</xref>) (‘Materials and methods’; <xref ref-type="fig" rid="fig1">Figure 1A</xref>). This optimization-based analysis was performed on the 19 Ors from the dataset with known activating odors, excluding Or82a, since it has only a single known strong activator (<xref ref-type="bibr" rid="bib25">Hallem and Carlson, 2006</xref>).</p><p>Not surprisingly, the composition of the optimized descriptor sets varied greatly between Ors: on average, only 13% of descriptors are shared between Ors (<xref ref-type="table" rid="tbl1">Table 1</xref>; <xref ref-type="supplementary-material" rid="SD1-data">Supplementary file 1A</xref>). 
Molecular descriptors can be categorized from 0 to 3 dimensions. Zero-dimensional (0-D) descriptors define features that can be viewed as not directly being shape dependent, such as molecular weight or vapor pressure. On the other end of the scale, three-dimensional (3-D) descriptors define features of molecules in three-dimensional space, such as the distance between two atoms of an odor molecule. Interestingly, we find an overwhelming preference for three-dimensional and two-dimensional descriptors compared to one-dimensional and zero-dimensional descriptors, suggesting that structural shape features are more important for receptor–odor interactions (<xref ref-type="table" rid="tbl1">Table 1</xref>; <xref ref-type="supplementary-material" rid="SD1-data">Supplementary file 1A</xref>). We find that Or-optimized descriptor sets were far superior at grouping together activating odors from the training set when compared to the non-optimized methods (Dragon, Cerius2, MCS, AP) and a previously identified collection of descriptors that were identified without receptor-specific optimization (<xref ref-type="bibr" rid="bib24">Haddad et al., 2008</xref>) (<xref ref-type="fig" rid="fig1">Figure 1B</xref>, <xref ref-type="fig" rid="fig1s1">Figure 1—figure supplement 1</xref>).<table-wrap id="tbl1" position="float"><object-id pub-id-type="doi">10.7554/eLife.01120.006</object-id><label>Table 1.</label><caption><p>Optimized molecular descriptor set compositions</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01120.006">http://dx.doi.org/10.7554/eLife.01120.006</ext-link></p></caption><table frame="hsides" rules="groups"><tbody><tr><td colspan="2"><bold>Descriptor class type counts for all Ors</bold></td></tr><tr><td> GETAWAY descriptors</td><td>75</td></tr><tr><td> 3D-MoRSE descriptors</td><td>66</td></tr><tr><td> 2D autocorrelations</td><td>44</td></tr><tr><td> Edge adjacency indices</td><td>44</td></tr><tr><td> 2D binary 
fingerprints</td><td>44</td></tr><tr><td> Functional group counts</td><td>43</td></tr><tr><td> Atom-centred fragments</td><td>37</td></tr><tr><td> WHIM descriptors</td><td>36</td></tr><tr><td> Topological charge indices</td><td>24</td></tr><tr><td> Atomtypes (Cerius2)</td><td>23</td></tr><tr><td> Burden eigenvalues</td><td>23</td></tr><tr><td> Molecular properties</td><td>23</td></tr><tr><td> Topological descriptors</td><td>22</td></tr><tr><td> Geometrical descriptors</td><td>18</td></tr><tr><td> 2D frequency fingerprints</td><td>11</td></tr><tr><td> RDF descriptors</td><td>8</td></tr><tr><td> Walk and path counts</td><td>6</td></tr><tr><td> Connectivity indices</td><td>5</td></tr><tr><td> Information indices</td><td>5</td></tr><tr><td> Topological (Cerius2)</td><td>4</td></tr><tr><td> Constitutional descriptors</td><td>3</td></tr><tr><td> Structural (Cerius2)</td><td>2</td></tr><tr><td> Randic molecular profiles</td><td>2</td></tr><tr><td colspan="2"><bold>Optimized descriptor analysis</bold></td></tr><tr><td> Average descriptor overlap between Ors</td><td>13%</td></tr><tr><td> Average number of descriptors per Or</td><td>29.9</td></tr><tr><td> Average number 3D descriptors per Or</td><td>10.8</td></tr><tr><td> Average number 2D descriptors per Or</td><td>12.2</td></tr><tr><td> Average number 1D descriptors per Or</td><td>6.6</td></tr><tr><td> Average number 0D descriptors per Or</td><td>0.3</td></tr><tr><td colspan="2"><bold>Descriptor dimensionality counts</bold></td></tr><tr><td> Number three dimensional descriptors</td><td>205</td></tr><tr><td> Number two dimensional descriptors</td><td>232</td></tr><tr><td> Number one dimensional descriptors</td><td>126</td></tr><tr><td> Number zero dimensional descriptors</td><td>5</td></tr><tr><td colspan="2"><bold>Descriptor Origin</bold></td></tr><tr><td> Number Dragon descriptors</td><td>539</td></tr><tr><td> Number Cerius descriptors</td><td>29</td></tr></tbody></table><table-wrap-foot><fn><p>Breakdowns of the molecular 
descriptor class type, dimensionality, origin, and average overlap for all optimized molecular descriptors selected for each Or.</p></fn></table-wrap-foot></table-wrap></p></sec><sec id="s2-4"><title>Computational validation of optimized descriptor sets</title><p>In order to validate the predictive ability of the <italic>Or-</italic>optimized method, we performed five independent trials of fivefold cross-validation followed by a Receiver-Operating-Characteristic (ROC) analysis, an established computational approach (<xref ref-type="bibr" rid="bib27">Hastie et al., 2001</xref>; <xref ref-type="bibr" rid="bib49">Tan et al., 2006</xref>) (‘Materials and methods’). Briefly, this involved withholding 20% of the 109 previously tested odors for a receptor. Descriptors were optimized using the remaining 80% of odors for training, and ligand predictions were subsequently performed on the 20% of odors that were withheld. This operation was repeated five times for each receptor, each time withholding a different 20% from the training set. The entire fivefold operation was repeated five times for each receptor and a mean ROC curve representing the prediction accuracy was determined. This analysis was possible for 12 <italic>Ors</italic> that had &gt;6 known ligands eliciting responses of &gt;100 spikes/s. The Area-Under-Curve (AUC) value (0.815), well above the 0.5 expected for random ranking, suggests that the <italic>Or-</italic>optimized descriptor sets are effective at predicting novel ligands (<xref ref-type="fig" rid="fig1">Figure 1C</xref>).</p><p>In addition to performing the fivefold cross-validation, we also clustered the 109 training odors independently for each Or, using distances calculated from the receptor-specific descriptor sets identified above. 
As expected, we find that activating odorants cluster tightly together for each Or (<xref ref-type="fig" rid="fig1">Figure 1D</xref>) and that the activating odors of an Or share sub-structures and pharmacophore features (<xref ref-type="fig" rid="fig1s2">Figure 1—figure supplement 2</xref>). In a few cases, such as for Or35a and Or98a, not all the highly activating compounds are clustered, suggesting the possibility of multiple or flexible binding sites, or imperfect selection of descriptors. Four of the Ors (Or2a, Or23a, Or43a and Or85f) have few known activators, none of which activate the receptors at &gt;150 spikes/s; however, our descriptor optimization approach is still able to cluster the few weak activators together (<xref ref-type="fig" rid="fig1">Figure 1D</xref>).</p></sec><sec id="s2-5"><title>High-throughput in silico screening of odorant receptors</title><p>Since Or-optimized descriptor sets can efficiently group strong activators in chemical space, we used them to rank untested compounds according to their distance from known activators for specific Ors. We assembled a natural odor library, which contains 3197 naturally occurring odors, and a library derived from PubChem (<xref ref-type="bibr" rid="bib2">Bolton et al., 2008</xref>), which contains &gt;240,000 compounds with molecular weights and atom type compositions similar to those of known volatiles (‘Materials and methods’). We then systematically screened both libraries in silico using the optimized descriptor sets of 19 <italic>D. melanogaster</italic> Ors. 
We identify the top 500 (0.2%) hits from this vast chemical library for each Or, the top ∼100 of which are reported in <xref ref-type="supplementary-material" rid="SD1-data">Supplementary file 1B</xref>.</p></sec><sec id="s2-6"><title>Electrophysiological validation of in silico screen and identification of agonists</title><p>To validate our in silico screen, we obtained a large number of untested odorants belonging to the top 500 predicted ligands for nine different Ors (141 total interactions tested; ∼11–23/Or) that were available from commercial sources at high purity and reasonable prices. The nine receptors were selected on the basis of previous functional mapping studies that enable us to unambiguously identify the antennal olfactory receptor neurons (ORNs) in which they are housed (<xref ref-type="bibr" rid="bib26">Hallem et al., 2004</xref>; <xref ref-type="bibr" rid="bib12">Couto et al., 2005</xref>). We systematically tested each predicted receptor–odor combination using single-unit electrophysiology to record from the ORNs to which these 9 Ors have been previously mapped (<xref ref-type="bibr" rid="bib26">Hallem et al., 2004</xref>; <xref ref-type="bibr" rid="bib12">Couto et al., 2005</xref>). We find that a majority of the predicted ligands evoked responses from the target ORNs; ∼71% evoked either activation (&gt;50 spikes/s above the spontaneous activity) or inhibition (&gt;50% reduction in spontaneous activity [inverse agonist activity]) (<xref ref-type="table" rid="tbl2">Table 2</xref>). These cutoffs were selected based on the study from which the training set was obtained and have been used in other studies that use this type of recording (<xref ref-type="bibr" rid="bib25">Hallem and Carlson, 2006</xref>; <xref ref-type="bibr" rid="bib33">Kreher et al., 2008</xref>). 
Interestingly, the mean vapor pressure of activating odors (11.84 Torr) is 7.5 times higher than of inactive odors (1.58 Torr), raising the possibility that some inactive odors may not be volatilized and delivered at adequate levels to the ORNs. Additionally, we find that ∼13% of the predicted compounds we tested showed an inhibitory effect on baseline activity of the respective neuron (<xref ref-type="table" rid="tbl2">Table 2</xref>). These inhibitors were identified by virtue of structural similarity to known activators suggesting that they may bind to similar sites on the receptor. Thus as an additional benefit our approach may provide a method to identify inhibitors as well. Such inhibitors would not only provide important tools to investigate mechanisms of odorant receptor inhibition but could also be used in blocking specific odor-mediated behaviors. Consistent with our observations three of the receptor–odor interactions had been previously identified independently as well, Or22a (<xref ref-type="bibr" rid="bib40">Pelz et al., 2006</xref>), and Or49b (<xref ref-type="bibr" rid="bib26">Hallem et al., 2004</xref>). 
The electrophysiological analysis provides the most important validation of our Or-optimized descriptor-based in silico screen.<table-wrap id="tbl2" position="float"><object-id pub-id-type="doi">10.7554/eLife.01120.007</object-id><label>Table 2.</label><caption><p>Predicted receptor–odor interactions validated as highly accurate using electrophysiology</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01120.007">http://dx.doi.org/10.7554/eLife.01120.007</ext-link></p></caption><table frame="hsides" rules="groups"><thead><tr><th>Classification</th><th>Or7a</th><th>Or10a</th><th>Or22a</th><th>Or47a</th><th>Or49b</th><th>Or59b</th><th>Or85a</th><th>Or85b</th><th>Or98a</th><th>Total</th></tr></thead><tbody><tr><td>Ligands (%)</td><td>88</td><td>31</td><td>86</td><td>39</td><td>27</td><td>91</td><td>92</td><td>87</td><td>100</td><td>71</td></tr><tr><td>Agonists (&gt;50 spikes/s) (%)</td><td>63</td><td>31</td><td>81</td><td>33</td><td>18</td><td>64</td><td>69</td><td>70</td><td>92</td><td>58</td></tr><tr><td>Agonists (&gt;100 spikes/s) (%)</td><td>31</td><td>13</td><td>62</td><td>11</td><td>9</td><td>45</td><td>48</td><td>48</td><td>67</td><td>37</td></tr><tr><td>Inverse agonists (%)</td><td>25</td><td>0</td><td>5</td><td>6</td><td>9</td><td>25</td><td>23</td><td>17</td><td>8</td><td>13</td></tr></tbody></table><table-wrap-foot><fn><p>Summary of prediction accuracy percentages obtained by electrophysiology validation. Ligands = Agonists (≥50 spikes/s) + Inverse agonists (&gt;50% reduction from baseline activity).</p></fn></table-wrap-foot></table-wrap></p></sec><sec id="s2-7"><title>Odor response spectra of individual Ors</title><p>Since we systematically analyzed responses of a large number of new odorants individually, we were able to characterize the odor-response spectra of several antennal ORN classes to these new ligands (<xref ref-type="fig" rid="fig2">Figure 2A</xref>). 
New activators are reported for every receptor, and inhibitors are identified for several. Two of the three receptors for which ligand predictions do not perform as well, Or10a and Or49b, detect aromatic compounds. Their poorer performance is explained by the scarcity of aromatic ligands in the initial training set (13/109 odorants). We find that &gt;85% of the predicted ligands activate odorant receptors Or7a, Or22a, Or59b, Or85a, Or85b, and Or98a (<xref ref-type="fig" rid="fig2">Figure 2A</xref>).<fig id="fig2" position="float"><object-id pub-id-type="doi">10.7554/eLife.01120.008</object-id><label>Figure 2.</label><caption><title>Electrophysiology validates that odorant receptor-optimized molecular descriptors can successfully identify new ligands for Drosophila.</title><p>Mean increase in response of neurons to 0.5-s stimulus of indicated odors (10<sup>−2</sup> dilution) predicted for each associated Or. Dashed lines indicate the activator threshold (50 spikes/s). <italic>Δ</italic>H: Or85b (ab3B) = flies lack expression of Or22a in neighboring neuron, thus all observed neuron activation is unambiguously caused by Or85b. N = 3, error bars = s.e.m.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01120.008">http://dx.doi.org/10.7554/eLife.01120.008</ext-link></p></caption><graphic xlink:href="elife01120f002"/></fig></p></sec><sec id="s2-8"><title>Specificity of in silico predicted ligands</title><p>We rigorously examined the rate of false negative predictions for each Or by systematically testing newly identified ligands of each Or against the other non-target receptors using electrophysiology. Of 504 non-target receptor–odor interactions tested, we found that only 10% evoked a response &gt;50 spikes/s and 3.7% evoked a response &gt;100 spikes/s (<xref ref-type="fig" rid="fig3">Figure 3A</xref>). 
This represents a high degree of specificity, especially considering that the Or-optimized descriptor method did not incorporate any additional computational screening to rule out non-target activators. Additionally, when we plot the percentage of odors that validated as activators when tested using electrophysiology (considering both predicted and non-target receptor–odor interactions), we find that activity is strongly related to predicted odor ranking (<xref ref-type="fig" rid="fig3">Figure 3B</xref>). Odors which rank closest to known activators for each Or, particularly within the top 500 hits, are far more likely to be activators than odors further away, and there is a drastic drop-off in activating odors present beyond the 1000 rank. We see the same trend if we plot mean activity of odors for the same ranking divisions. Highly ranked odors have a far higher mean activity than distantly ranked odors.<fig id="fig3" position="float"><object-id pub-id-type="doi">10.7554/eLife.01120.009</object-id><label>Figure 3.</label><caption><title>Predicted receptor–odor interactions are highly specific.</title><p>(<bold>A</bold>) Plot of activity (Top) for electrophysiologically tested receptor-odor interactions. (Bottom) Plot indicating locations of predicted receptor-odor combinations (green) and same odorants tested in non-target receptor-odor combinations (gray). 
(<bold>B</bold>) Plot of percentage of activating odors (&gt;50 spikes/s) considering all activating or inactive odors (&gt;0 spikes/s) across ranking bins for all odors tested using electrophysiology.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01120.009">http://dx.doi.org/10.7554/eLife.01120.009</ext-link></p></caption><graphic xlink:href="elife01120f003"/></fig></p></sec><sec id="s2-9"><title>Relationship between descriptor sets and Or sequence and activity</title><p>Since receptor-optimized descriptor sets and the predicted ligand space they define are a function of shared molecular features that a receptor may employ to recognize ligands, we were now in a position to determine how these characteristics correlate with receptor properties such as their known-activity profiles and amino acid sequences. We used hierarchical cluster analysis to create trees that represent the various receptors based on: shared descriptors selected; known activity-based relationships (<xref ref-type="bibr" rid="bib25">Hallem and Carlson, 2006</xref>); degree of overlap of predicted ligands; and amino acid sequence (<xref ref-type="fig" rid="fig4">Figure 4A</xref>; ‘Materials and methods’). We found that the maximum overlap in Or relationships is retained between the descriptor and the activity trees, and the descriptor and the cross activity trees with 11 out of 24 Ors present in subgroups that are common in both cases. However, only two subgroups (yellow and grey) are conserved across the three trees. The largest shared overlap existing in the descriptor tree suggests that the Or-optimized descriptors link the known and the predicted receptor–odor interactions and that our analysis may expand upon odorant receptor activity relationships beyond those previously known from the training data. 
We also found that the phylogenetic tree has fewer relationships conserved with each of the trees, consistent with previous observations (<xref ref-type="bibr" rid="bib26">Hallem et al., 2004</xref>) supporting the idea that, while the most conserved amino acid residues in the Ors provide the structure of the tree, they do not correlate strongly with ligand specificity.<fig id="fig4" position="float"><object-id pub-id-type="doi">10.7554/eLife.01120.010</object-id><label>Figure 4.</label><caption><title>Analysis of receptor–odor relationships and breadth of tuning.</title><p>(<bold>A</bold>) Hierarchical clusters created from Euclidean distance values between Drosophila Ors calculated using: (left to right) shared optimized descriptors; known activity to training set odors (<xref ref-type="bibr" rid="bib25">Hallem and Carlson, 2006</xref>); overlap across top 500 predicted ligands; and phylogenetic tree of receptors (<xref ref-type="bibr" rid="bib25">Hallem and Carlson, 2006</xref>). Sub clusters are shaded with colors or bars. (<bold>B</bold>) Frequency distribution of compounds from the &gt;240K library within the top 15% distance from highest active plotted to generate predicted breadth of tuning curves. Green arrows indicate relative distance of the furthest known activating compound determined by electrophysiology.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01120.010">http://dx.doi.org/10.7554/eLife.01120.010</ext-link></p></caption><graphic xlink:href="elife01120f004"/></fig></p></sec><sec id="s2-10"><title>Analysis of breadth of predictions for each Or in chemical space</title><p>Coding of odors in a large volatile space (&gt;240,000) by a receptor repertoire is virtually impossible to determine experimentally. 
However, based on the Or-optimized descriptor sets we computationally derived prediction frequency distributions for each of the <italic>Drosophila</italic> Ors in this large chemical space (<xref ref-type="fig" rid="fig4">Figure 4B</xref>). As expected, we find substantial variation in the distribution frequency of predicted ligands across different receptors. The predicted response profiles support previous observations made with smaller odor panels that the olfactory system can potentially detect thousands of volatile chemicals, many of which the organism may never have encountered in its chemical environment. Plant volatiles constituted a large portion of compounds that are predicted to be ligands for <italic>Drosophila</italic> Ors. To further analyze odor source representation, we classified odors that belong to top 500 prediction lists according to their source, if known, and find that Ors are not specialized for odors from a single source (<xref ref-type="fig" rid="fig5">Figure 5A</xref>).<fig id="fig5" position="float"><object-id pub-id-type="doi">10.7554/eLife.01120.011</object-id><label>Figure 5.</label><caption><title>Analysis of predicted natural odor sources and cross activation.</title><p>(<bold>A</bold>) (Left) The numbers of compounds present in the collected volatile library according to source. (Right) The numbers and sources of predicted ligands for the 19 Drosophila odor receptors/neurons within the top 500 predicted compounds. 
(<bold>B</bold>) Comparison of plots for percentage of receptors that are: (top left) activated by percentage of known odors from training set (<xref ref-type="bibr" rid="bib25">Hallem and Carlson, 2006</xref>); (bottom left) predicted to be activated by Natural compound library; (top right) predicted to be activated from &gt;240K library; and (bottom right) activated by ligands for 10 shared <italic>Ors</italic> in this study vs activated by comparable actives previously tested (<xref ref-type="bibr" rid="bib25">Hallem and Carlson, 2006</xref>).</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01120.011">http://dx.doi.org/10.7554/eLife.01120.011</ext-link></p></caption><graphic xlink:href="elife01120f005"/></fig></p></sec><sec id="s2-11"><title>Across-receptor activation patterns in <italic>Drosophila</italic></title><p>To study the ensemble activation patterns of odors predicted across all Ors, we analyzed the across-receptor activation patterns of the 3197 known compounds for nine receptors (Or7a, 10a, 22a, 47a, 49b, 59b, 85a, 85b, 98a). Surprisingly, we find that only 25% of compounds from the ‘natural’ odor library found in the top 500 predictions for each Or are predicted to activate multiple Ors (<xref ref-type="fig" rid="fig5">Figure 5B</xref>, lower left panel). If we consider compounds from the Pubchem library in the top 500 predicted activators for each receptor, we observe further reduction in the proportion of across-receptor activating compounds (<xref ref-type="fig" rid="fig5">Figure 5B</xref>, upper right). Consistent with this prediction we find that cross-activation by ligands functionally evaluated in this study for nine receptors is lower than that reported previously using ligands of comparable strength for the same nine receptors (<xref ref-type="bibr" rid="bib25">Hallem and Carlson, 2006</xref>) (<xref ref-type="fig" rid="fig5">Figure 5B</xref>, lower right panel). 
These data suggest that a number of natural odors may be detected by a few receptors, particularly at low concentrations.</p></sec></sec><sec id="s3" sec-type="discussion"><title>Discussion</title><p>A primary element of the olfactory code is information about odor identity, represented by the characteristic interaction of an odor with the ensemble of olfactory receptors in the nose. Here we report an in silico approach to systematically identify ligands from a vast chemical space for a large number of Ors expressed in the antenna of <italic>Drosophila</italic>. We demonstrate that our predictions are accurate using two different validation approaches: computational validation and functional validation using electrophysiology. The ranks of predicted ligands correlate strongly with electrophysiological activity.</p><p>Obtaining and testing odors using traditional methods is time- and cost-intensive. Electrophysiology and calcium imaging are time-consuming processes that require not only a great deal of time to perform, but also the purchase of each odor to be physically tested. Moreover, large plate-based combinatorial chemical libraries, which are commonly implemented in drug discovery in the pharmaceutical industry, are not available for volatile odors at reasonable cost. Since <italic>Drosophila</italic> is a premier model for understanding the neurobiology of olfaction, several laboratories over the last 12 years have together screened ∼250 odors, the activities of which have been compiled into a valuable database that standardizes across studies (<xref ref-type="bibr" rid="bib20">Galizia et al., 2010</xref>). 
In this study we screen &gt;240,000 chemicals and predict &gt;10,000 new ligands, which represents a substantial expansion of the known peripheral olfactory code for this important model organism and provides a system-level view of odor detection (<xref ref-type="fig" rid="fig6">Figure 6A</xref>).<fig id="fig6" position="float"><object-id pub-id-type="doi">10.7554/eLife.01120.012</object-id><label>Figure 6.</label><caption><title>Predicted odor space and network view of odor coding.</title><p>(<bold>A</bold>) Expansion of the peripheral olfactory code in this study: large increase in numbers of identified activators and inhibitors. The different sized circles represent the approximate ratio of numbers of previously known ligands (top circles), predicted ligands based on a cutoff of the top 500 predicted compounds per receptor and corrected to the validation success rate (lower, diffuse circles). (<bold>B</bold>) <italic>Drosophila</italic> receptor–odor network. Each known interaction (&gt;50 spikes/s) from this and previous studies (<xref ref-type="bibr" rid="bib25">Hallem and Carlson, 2006</xref>) is linked by a purple edge. Predicted receptor–odor interactions (top 500 hits) are linked by light-grey edges. All compounds are represented as small black circles and Ors are represented as large colored circles matching the colors used in (<xref ref-type="fig" rid="fig4">Figure 4A</xref>).</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01120.012">http://dx.doi.org/10.7554/eLife.01120.012</ext-link></p></caption><graphic xlink:href="elife01120f006"/></fig></p><p>The predicted ligands and prediction method will increase the speed of receptor–odor decoding and allow for interpretation of data at a large scale that is otherwise difficult to achieve. This could help address questions such as the breadth of receptor tuning, responses to odorants from natural sources, and the evolution of odor coding across a receptor repertoire. 
Additionally, using chemical informatics, it becomes possible to infer and prioritize for testing the network of odorant receptors that are activated by complex odor blends without the expensive and time-consuming process of purchasing and testing all possible odor and receptor combinations (<xref ref-type="fig" rid="fig6">Figure 6B</xref>).</p><p>Interestingly, our attempts to identify molecular descriptors that would differentiate agonists from inverse agonists were not successful with this data set. There are at least two possible explanations: the number of inverse agonists amongst the training odors may be insufficient, or the inverse agonists may act via the same binding sites as agonists and share many of the structural features of the activating odors, making them difficult to distinguish. We feel that this remains an important challenge to be overcome in the future with improved computational approaches or larger odor training sets.</p><p>A similar, yet much smaller, analysis applied chemical informatics to <italic>Drosophila</italic> olfactory neuron responses to 47 odorants and screened 21 untested compounds for ligands in <italic>Drosophila</italic> (<xref ref-type="bibr" rid="bib43">Schmuker et al., 2007</xref>). Although this study had a relatively modest success rate of ∼25% at predicting untested odorants as activators (by applying the same 50 spikes/s threshold for comparison), it highlighted that structure-based ligand prediction is a viable method for further development. In another interesting analysis, a Quantitative Structure Activity Relationship (QSAR) model was applied to describe odor-activity for <italic>Drosophila</italic> Ors. Using cheminformatics, important amino acid residues were identified using information from orthologous Or sequences, pointing to a potential odor-binding region, which was postulated to be 15 angstroms deep and 6 angstroms wide (<xref ref-type="bibr" rid="bib23">Guo and Kim, 2010</xref>). 
These studies, along with ours, suggest that computational approaches could have great utility in the study of sensory receptors. It will also be very interesting to use our method for making ligand predictions for structurally distinct receptors such as olfactory ionotropic glutamate receptors (IRs) and gustatory receptors (Grs) in insects, and olfactory and taste GPCRs in vertebrates.</p><p>Our approach is conservative and designed to search for novel odors that share structural features with odors from a previously tested panel. Odor molecules are limited in size as well, and may offer a limited scaffold such that novel isofunctional chemotype identification may not be as prevalent as has been seen in other examples of scaffold-hopping (<xref ref-type="bibr" rid="bib44">Schneider et al., 2006</xref>). However, although compounds that share similar values for the optimized descriptors are structurally similar in selected parts of the molecule, they may still differ structurally in other parts of the molecule. In the future, application of machine learning approaches, such as Support Vector Machines (SVMs), to the receptor-optimized molecular descriptor sets may be useful to further increase the predictive ability. Additionally, we could replace our SFS approach with sequential floating search techniques, which allow for removal, as well as addition, of descriptors in the growing optimized list.</p><p>Our predictions suggest that a number of odorants at low concentrations may be detected by only a few receptors. In the current model of combinatorial coding, emphasis is placed on the notion that combinations of several odorant receptors detect the majority of volatile chemicals, with the exception of pheromones and CO<sub>2</sub>. 
One possible explanation for this disparity could be that our predictions are fundamentally conservative in nature because we focus only on structurally similar ligands, whereas 7-transmembrane heteromeric receptors may also contain additional unexplored binding sites. Another possibility is that previously tested subsets of odors were potentially selected on the basis of strong responses in electroantennograms and behavior assays, which could bias selection of cross-activating odors. In fact, complex fruit odor blends activate fewer Ors than the number activated by individual odorants at comparable concentrations, as measured using electrophysiology (<xref ref-type="bibr" rid="bib25">Hallem and Carlson, 2006</xref>) and calcium imaging (<xref ref-type="bibr" rid="bib45">Semmelhack and Wang, 2009</xref>). The architecture of the olfactory code therefore appears to integrate two different models. On the one hand, most odors are detected by a few Ors from the repertoire, which may enhance the specificity of the olfactory system for detection of a large number of odors. On the other hand, 15–20% of odors are predicted to activate several Ors (up to 50%) at the same time, which may serve to aid the olfactory system in discriminating between fine concentration changes of important stimuli by having Ors tuned to low and high concentrations such as shown for Or42a and Or42b (<xref ref-type="bibr" rid="bib33">Kreher et al., 2008</xref>).</p><p>By identifying a large number of new ligands for each odorant receptor, we can also begin to systematically compare the ligand tuning profiles for each in the endogenous neurons vs the ‘empty neuron’ decoder system. 
If clear differences were identified, it could enable the identification of underlying reasons such as differences in levels of receptor expression in the neurons, or presence of different odorant binding proteins (OBPs) in the sensillum lymph.</p><p>This cheminformatics pipeline can also be applied for system-level analysis of other insects whose receptors and ORNs have been decoded such as mosquitoes (<xref ref-type="bibr" rid="bib7">Carey et al., 2010</xref>), and vertebrates such as mice and humans (<xref ref-type="bibr" rid="bib42">Saito et al., 2009</xref>). The search for novel insect repellents and attractants for species that transmit disease and destroy crops can be greatly assisted by a rational prioritization using such a cheminformatics approach.</p></sec><sec id="s4" sec-type="materials|methods"><title>Materials and methods</title><sec id="s4-1"><title>Virtual odor compound library</title><p>We assembled a subset of 3197 volatile compounds from annotated origins including plants (<xref ref-type="bibr" rid="bib31">Knudsen et al., 2006</xref>), insects (<xref ref-type="bibr" rid="bib19">El-Sayed, 2009</xref>), humans, and a fragrance collection (<xref ref-type="bibr" rid="bib47">Sigma-Aldrich, 2007</xref>) that may have additional fruit and floral volatiles (<xref ref-type="bibr" rid="bib57">Zeng et al., 1991</xref>; <xref ref-type="bibr" rid="bib11">Cork and Park, 1996</xref>; <xref ref-type="bibr" rid="bib58">Zeng et al., 1996</xref>; <xref ref-type="bibr" rid="bib39">Meijerink et al., 2000</xref>; <xref ref-type="bibr" rid="bib13">Curran et al., 2005</xref>; <xref ref-type="bibr" rid="bib31">Knudsen et al., 2006</xref>; <xref ref-type="bibr" rid="bib21">Gallagher et al., 2008</xref>; <xref ref-type="bibr" rid="bib35">Logan et al., 2008</xref>). We also assembled a subset of 241,150 odors from Pubchem, which have similar characteristics to known odor molecules. 
Compounds met the criteria of MW &lt;200 and being composed only of the following atoms (C, O, N, H, I, Cl, S, F).</p></sec><sec id="s4-2"><title>Calculation of 3D conformations</title><p>The three-dimensional structures were predicted for compounds through use of the Omega2 software package (<xref ref-type="bibr" rid="bib3">Bostrom et al., 2003</xref>; <xref ref-type="bibr" rid="bib28">Hawkins et al., 2010</xref>). The lowest-energy 3D conformer that Omega2 identified for each compound in our Pubchem and natural compound libraries was stored for use in molecular descriptor calculation.</p></sec><sec id="s4-3"><title>Calculation of molecular descriptors</title><p>Commercially available software packages Cerius2, Accelrys (200 descriptors) and Dragon, Talete (3224 descriptors) were used to calculate molecular descriptors from three-dimensional molecular structures. Descriptor values were normalized across compounds to standard scores by subtracting the mean value for each descriptor type and dividing by the standard deviation. Molecular descriptors that did not show variation in values across the compounds were removed. Maximum Common Substructures were determined using an existing algorithm (<xref ref-type="bibr" rid="bib6">Cao et al., 2008b</xref>). Atom Pairs were computed from the version implemented in ChemmineR (<xref ref-type="bibr" rid="bib5">Cao et al., 2008a</xref>).</p></sec><sec id="s4-4"><title>Classification of active compounds</title><p>Since we were interested in identifying descriptors which best described activating compounds, we needed to first determine which compounds to classify as ‘active’ based on their electrophysiology activity for the receptor being studied. All of the training odors were clustered using hierarchical clustering by activity individually for each Or. The resulting tree was then used to select the branch containing the majority of activating odors (&gt;50 spikes/s). 
The activity threshold was therefore set as the lowest spikes/s activity of any odor present in the selected branch.</p></sec><sec id="s4-5"><title>Determination of Or-optimized descriptor subsets</title><p>A compound-by-compound activity distance matrix was calculated using training odor activity data for each of the Ors (<xref ref-type="bibr" rid="bib25">Hallem and Carlson, 2006</xref>). A separate compound-by-compound descriptor distance matrix was calculated using the 3424 descriptor values for training odors calculated by Dragon and Cerius2. Activating compounds for each Or were identified individually through activity thresholds, as described above. The compound-by-compound activity (CbCA) and compound-by-compound descriptor (CbCD) distance matrices were compared by correlation for each actively classified compound, considering its distances to all other compounds. The goal was to identify molecular descriptors that best correlated with activity. To achieve this we applied a sequential forward selection (SFS) approach to identify optimal descriptors for each Or (<xref ref-type="bibr" rid="bib54">Whitney, 1971</xref>). The SFS functioned by iteratively building a list of molecular descriptors for a single Or by maximally increasing the correlation between the CbCA and CbCD matrices. In the first iteration the values for each single molecular descriptor were used to create CbCD matrices. The rows corresponding to activating compounds were compared to the same rows of the CbCA matrix by correlation. The descriptor which best described the activity (that is, the one yielding the highest correlation between descriptor and activity) was retained. In the second iteration the best single descriptor was combined with all possible descriptors and correlations were calculated again, resulting in a best two-descriptor combination. 
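As an illustration only, and not the code used in this study, the greedy selection loop of this section can be sketched in Python. The function names (<monospace>pearson</monospace>, <monospace>distance_rows</monospace>, <monospace>sequential_forward_selection</monospace>), the use of Euclidean distance and Pearson correlation, and the <monospace>max_size</monospace> safety cap are assumptions of the sketch; descriptors are assumed to be already normalized and supplied as one list of values per compound. <preformat preformat-type="code">
```python
import math

def pearson(xs, ys):
    """Pearson correlation; returns -1.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return -1.0
    return sxy / math.sqrt(sxx * syy)

def distance_rows(table, columns, rows):
    """Euclidean distances from each compound in `rows` to every compound,
    using only the listed columns of `table` (one list per compound)."""
    out = []
    for i in rows:
        for other in table:
            out.append(math.sqrt(sum((table[i][c] - other[c]) ** 2 for c in columns)))
    return out

def sequential_forward_selection(descriptors, activity, active_rows, max_size=20):
    """Greedy SFS: in each iteration add the descriptor that gives the highest
    correlation between the active rows of the descriptor and activity distance
    matrices; stop when no addition improves on the previous step."""
    activity_table = [[a] for a in activity]
    target = distance_rows(activity_table, [0], active_rows)
    selected, best = [], float("-inf")
    while max_size > len(selected):
        scored = []
        for j in range(len(descriptors[0])):
            candidate = selected + [j]  # a descriptor may be chosen more than once
            score = pearson(target, distance_rows(descriptors, candidate, active_rows))
            scored.append((score, j))
        top_score, top_j = max(scored, key=lambda s: s[0])
        if not top_score > best:
            break
        selected.append(top_j)
        best = top_score
    return selected, best
```
</preformat> In this sketch the function returns the ordered list of chosen descriptor indices together with the final correlation, so the order in which descriptors were added is retained. 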
The process was continued in this fashion to iteratively search for additional descriptors, with each iteration aiming to further increase the correlation value. In this manner, the size of the optimized descriptor set increases by one in each iteration, as the best descriptor set from the previous step is combined with all possible descriptors to find the next best descriptor. This process is halted when all possible descriptor additions in an iteration fail to improve the correlation value from the previous step. Molecular descriptors can be selected multiple times for each Or, effectively creating weights for descriptors, as a descriptor that was selected twice will have double the importance when predicting activity of the odor libraries. This whole process is run independently for each Or, resulting in unique descriptor sets that are optimized for each Or.</p></sec><sec id="s4-6"><title>Calculation of accumulative percentage of actives (APoA)</title><p>The accumulative percentage of actives is calculated for each descriptor set individually as previously described (<xref ref-type="bibr" rid="bib9">Chen and Reynolds, 2002</xref>). Compounds are ranked according to their distance from each known activator, using distances computed from the Or-optimized descriptor values, resulting in one set of ranked compound distances from each activating odor. Moving down the list for each of these rankings, the ratio of the number of activating compounds observed to the total number of compounds inspected is calculated; this ratio is the APoA. APoA values are averaged across all activating compound rankings for each receptor, creating a single set of mean values representing the APoA for a single Or and descriptor set. Using this approach, APoA mean values are calculated for each of the 24 Ors separately for each descriptor set used, including Or-optimized sets, all Dragon descriptors, all Cerius2 descriptors, Atom Pair, and Maximum Common Substructure. 
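The APoA calculation described above can be illustrated with a short Python sketch. It is a hedged reconstruction rather than the original implementation: the function names (<monospace>apoa_curve</monospace>, <monospace>mean_apoa</monospace>) are hypothetical, the input is assumed to be a precomputed compound-by-compound distance matrix, and excluding each query activator from its own ranking is an assumption of the sketch. <preformat preformat-type="code">
```python
def apoa_curve(distances_from_query, is_active):
    """Running fraction of actives while walking down one ranking.
    distances_from_query[i] is the distance of compound i from the query activator."""
    order = sorted(range(len(is_active)), key=lambda i: distances_from_query[i])
    curve, seen_active = [], 0
    for inspected, i in enumerate(order, start=1):
        seen_active += 1 if is_active[i] else 0
        curve.append(seen_active / inspected)  # actives observed / compounds inspected
    return curve

def mean_apoa(distance_matrix, is_active):
    """Average the per-activator curves position by position.
    Assumption: each query activator is left out of its own ranking."""
    n = len(is_active)
    curves = []
    for q in range(n):
        if not is_active[q]:
            continue
        others = [i for i in range(n) if i != q]
        curves.append(apoa_curve([distance_matrix[q][i] for i in others],
                                 [is_active[i] for i in others]))
    return [sum(col) / len(col) for col in zip(*curves)]
```
</preformat> For example, with four compounds at positions 0, 1, 5 and 6 on a line, of which the first two are active, both activator rankings meet the other activator first, so the mean curve starts at 1 and falls as the inactive compounds are inspected. 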
The area-under-the-curve (AUC) scores were calculated by approximation of the integral under each plotted APoA line.</p></sec><sec id="s4-7"><title>Clustering Ors by most common descriptors</title><p>The first 20 descriptors selected by our optimized descriptor selection algorithm for each Or were used to create a binary presence/absence matrix, with each row representing an Or and each column value specifying the presence or absence of a specific descriptor. This matrix was then converted into an Or-by-Or Euclidean distance matrix and clustered using hierarchical clustering and complete linkage.</p></sec><sec id="s4-8"><title>Clustering compounds by activity of Or</title><p>The responses of each of the Ors that had previously been tested against a panel of compounds were converted into an Or-by-Or Euclidean distance matrix (<xref ref-type="bibr" rid="bib25">Hallem and Carlson, 2006</xref>). Ors were clustered using hierarchical clustering and complete linkage. Specifically, this was achieved by creating a compound-by-compound distance matrix using the differences in activity between compounds tested on a single Or. Hierarchical clustering was then performed using each Or distance matrix, and the sub cluster that contained the most compounds was identified.</p></sec><sec id="s4-9"><title>Clustering Ors by predicted ligand space</title><p>Percentages of overlapping predictions within the top 500 predicted compounds were calculated pair-wise for all Ors. Euclidean distances were calculated from the similarity between Ors.</p></sec><sec id="s4-10"><title>Calculation of Or prediction distribution frequencies</title><p>Initially, all extreme outliers were removed from the dataset for each Or. On average 5.82 compounds were removed for each Or, resulting in a mean dataset reduction of 0.0024%. Next, all compounds whose distance was &gt;3 standard deviations from the strongest activating compound were removed to reduce outliers. Distribution frequencies were produced for each Or. 
All compound distances were converted into a percentage of the distance of the most distant compound for each Or. Frequencies of compounds in the top 15% were plotted.</p></sec><sec id="s4-11"><title>Or-ligand interaction map</title><p>The Or-ligand interaction map was developed using Cytoscape (<xref ref-type="bibr" rid="bib46">Shannon et al., 2003</xref>). Each predicted Or-ligand interaction from the top 500 predicted ligands for all of the Ors listed was used to calculate the map. All predicted interactions are labeled in purple. In addition, all interactions identified in this study and the previous study (<xref ref-type="bibr" rid="bib25">Hallem and Carlson, 2006</xref>) were included and labeled in gray. All compounds are represented as small black circles and Ors are represented as large colored circles. Or names are provided at the upper right corner of each Or.</p></sec><sec id="s4-12"><title>Computational validation of <italic>Drosophila</italic> receptor–odor predictions</title><p>We performed five independent fivefold cross-validations. For each independent validation, the dataset was divided into five equal-sized partitions containing roughly 22 compounds each. During each run, one of the partitions is selected for testing, and the remaining four sets are used for training. The training process is repeated five times, with each unique odorant set being used as the test set exactly once. For every training iteration, a unique set of descriptors was calculated from the training compound set. These descriptors were then used to calculate distances of the test set compounds to the closest activating compound, exactly as is done to predict ligands in our ligand discovery pipeline. Once test set compounds have been ranked by distance, from closest to furthest, to a known activating compound in the training set, a receiver operating characteristics (ROC) analysis is used to analyze the performance of our computational ligand prediction approach. 
Using ROC analysis, we were able to determine our predictive ability for the 12 receptors. This validation could be performed only on receptors for which sufficient training odors had previously been identified. We consider this to consist of at least one very strongly activating known odor (&gt;150 spikes/s) and at least five strongly activating odors (&gt;100 spikes/s), thus allowing for at least one activating odor for each of the five test sets in the cross-validation (DmOr7a, DmOr9a, DmOr10a, DmOr22a, DmOr35a, DmOr43b, DmOr12, DmOr59b, DmOr67a, DmOr67c, DmOr85b, DmOr98a). Test set validations for all 12 Ors were combined and a single ROC curve representing an average across all Ors was plotted (<xref ref-type="fig" rid="fig1">Figure 1C</xref>).</p></sec><sec id="s4-13"><title>Electrophysiology</title><p>Extracellular single-sensillum electrophysiology was performed as before (<xref ref-type="bibr" rid="bib18">Dobritsa et al., 2003</xref>; <xref ref-type="bibr" rid="bib25">Hallem and Carlson, 2006</xref>; <xref ref-type="bibr" rid="bib17">de Bruyne et al., 2001</xref>) with a few modifications. Diagnostic odorants were used to distinguish individual classes of ORNs in sensilla (ab1-ab7) and therefore unequivocally identify the target Or-expressing ORN for testing (<xref ref-type="bibr" rid="bib17">de Bruyne et al., 2001</xref>; <xref ref-type="bibr" rid="bib26">Hallem et al., 2004</xref>). 50 μl of odor at 10<sup>−2</sup> dilution in paraffin oil was applied to a cotton wool-plugged odor cartridge. Due to variability in the temporal kinetics of the response across various odors, the counting window was shortened to 250 ms from the start of the odor stimulus.</p></sec></sec></body><back><ack id="ack"><title>Acknowledgements</title><p>We would like to thank Thomas Girke and Y Cao for assistance with cheminformatics; Jocelyn Millar for providing chemicals; and Anupama Dahanukar for critical reading of the manuscript. 
SMB is supported by an NSF IGERT grant in Chemical Genomics.</p></ack><sec sec-type="additional-information"><title>Additional information</title><fn-group content-type="competing-interest"><title>Competing interests</title><fn fn-type="conflict" id="conf1"><p>SMB: Listed as an inventor in patent applications filed by the University of California, Riverside.</p></fn><fn fn-type="conflict" id="conf2"><p>AR: Holds equity in an insect research company (Olfactor Labs) and is listed as an inventor in patent applications filed by the University of California, Riverside.</p></fn><fn fn-type="conflict" id="conf3"><p>The other author declares that no competing interests exist.</p></fn></fn-group><fn-group content-type="author-contribution"><title>Author contributions</title><fn fn-type="con" id="con1"><p>SMB, Conception and design, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article</p></fn><fn fn-type="con" id="con2"><p>AR, Conception and design, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article</p></fn><fn fn-type="con" id="con3"><p>SM, Acquisition of data, Analysis and interpretation of data</p></fn></fn-group></sec><sec sec-type="supplementary-material"><title>Additional files</title><supplementary-material id="SD1-data"><object-id pub-id-type="doi">10.7554/eLife.01120.013</object-id><label>Supplementary file 1.</label><caption><p>(<bold>A</bold>) Optimized descriptor sets for each <italic>Drosophila</italic> Or. For each optimized descriptor, the number of occurrences, symbol, brief description, class, and dimensionality are listed. A summary of the total number of descriptors selected for the receptor repertoire is provided at the beginning. Descriptors are listed in the order in which they were selected into the optimized set, such that the descriptors selected first are more important. Weights indicate the number of times a descriptor was selected in an optimized descriptor set. 
(<bold>B</bold>) Top 100 predicted compounds for each <italic>Drosophila</italic> Or. Chemical names or PubChem compound IDs (CIDs), SMILES strings, and distances of the top ∼100 predicted compounds for each Or. All distances represent the minimum distance based on optimized descriptors to the previously known strongest active compound listed in the gray cells for that particular Or.</p><p><bold>DOI:</bold> <ext-link ext-link-type="doi" xlink:href="10.7554/eLife.01120.013">http://dx.doi.org/10.7554/eLife.01120.013</ext-link></p></caption><media mime-subtype="xlsx" mimetype="application" xlink:href="elife01120s001.xlsx"/></supplementary-material></sec><ref-list><title>References</title><ref id="bib1"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Araneda</surname><given-names>RC</given-names></name><name><surname>Kini</surname><given-names>AD</given-names></name><name><surname>Firestein</surname><given-names>S</given-names></name></person-group><year>2000</year><article-title>The molecular receptive range of an odorant receptor</article-title><source>Nat Neurosci</source><volume>3</volume><fpage>1248</fpage><lpage>55</lpage><pub-id pub-id-type="doi">10.1038/81774</pub-id></element-citation></ref><ref id="bib2"><element-citation publication-type="book"><person-group person-group-type="author"><name><surname>Bolton</surname><given-names>EE</given-names></name><name><surname>Wang</surname><given-names>Y</given-names></name><name><surname>Thiessen</surname><given-names>PA</given-names></name><name><surname>Bryant</surname><given-names>SH</given-names></name></person-group><year>2008</year><article-title>PubChem: integrated platform of small molecules and biological activities</article-title><source>Annual reports in computational chemistry</source><publisher-loc>Washington DC</publisher-loc><publisher-name>American Chemical Society</publisher-name></element-citation></ref><ref id="bib3"><element-citation 
publication-type="journal"><person-group person-group-type="author"><name><surname>Bostrom</surname><given-names>J</given-names></name><name><surname>Greenwood</surname><given-names>JR</given-names></name><name><surname>Gottfries</surname><given-names>J</given-names></name></person-group><year>2003</year><article-title>Assessing the performance of OMEGA with respect to retrieving bioactive conformations</article-title><source>J Mol Graph Model</source><volume>21</volume><fpage>449</fpage><lpage>62</lpage><pub-id pub-id-type="doi">10.1016/S1093-3263(02)00204-8</pub-id></element-citation></ref><ref id="bib4"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Buck</surname><given-names>L</given-names></name><name><surname>Axel</surname><given-names>R</given-names></name></person-group><year>1991</year><article-title>A novel multigene family may encode odorant receptors: a molecular-basis for odor recognition</article-title><source>Cell</source><volume>65</volume><fpage>175</fpage><lpage>87</lpage><pub-id pub-id-type="doi">10.1016/0092-8674(91)90418-X</pub-id></element-citation></ref><ref id="bib5"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Cao</surname><given-names>Y</given-names></name><name><surname>Charisi</surname><given-names>A</given-names></name><name><surname>Cheng</surname><given-names>LC</given-names></name><name><surname>Jiang</surname><given-names>T</given-names></name><name><surname>Girke</surname><given-names>T</given-names></name></person-group><year>2008a</year><article-title>ChemmineR: a compound mining framework for R</article-title><source>Bioinformatics</source><volume>24</volume><fpage>1733</fpage><lpage>4</lpage><pub-id pub-id-type="doi">10.1093/bioinformatics/btn307</pub-id></element-citation></ref><ref id="bib6"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Cao</surname><given-names>Y</given-names></name><name><surname>Jiang</surname><given-names>T</given-names></name><name><surname>Girke</surname><given-names>T</given-names></name></person-group><year>2008b</year><article-title>A maximum common substructure-based algorithm for searching and predicting drug-like compounds</article-title><source>Bioinformatics</source><volume>24</volume><fpage>i366</fpage><lpage>74</lpage><pub-id pub-id-type="doi">10.1093/bioinformatics/btn186</pub-id></element-citation></ref><ref id="bib7"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Carey</surname><given-names>AF</given-names></name><name><surname>Wang</surname><given-names>GR</given-names></name><name><surname>Su</surname><given-names>CY</given-names></name><name><surname>Zwiebel</surname><given-names>LJ</given-names></name><name><surname>Carlson</surname><given-names>JR</given-names></name></person-group><year>2010</year><article-title>Odorant reception in the malaria mosquito <italic>Anopheles gambiae</italic></article-title><source>Nature</source><volume>464</volume><fpage>66</fpage><lpage>77</lpage><pub-id pub-id-type="doi">10.1038/nature08834</pub-id></element-citation></ref><ref id="bib8"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Carhart</surname><given-names>RE</given-names></name><name><surname>Smith</surname><given-names>DH</given-names></name><name><surname>Venkataraghavan</surname><given-names>R</given-names></name></person-group><year>1985</year><article-title>Atom pairs as molecular-features in structure activity studies: definition and applications</article-title><source>J Chem Inf Comput Sci</source><volume>25</volume><fpage>64</fpage><lpage>73</lpage><pub-id pub-id-type="doi">10.1021/ci00046a002</pub-id></element-citation></ref><ref id="bib9"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Chen</surname><given-names>X</given-names></name><name><surname>Reynolds</surname><given-names>CH</given-names></name></person-group><year>2002</year><article-title>Performance of similarity measures in 2D fragment-based similarity searching: comparison of structural descriptors and similarity coefficients</article-title><source>J Chem Inf Comput Sci</source><volume>42</volume><fpage>1407</fpage><lpage>14</lpage><pub-id pub-id-type="doi">10.1021/ci025531g</pub-id></element-citation></ref><ref id="bib10"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Clyne</surname><given-names>PJ</given-names></name><name><surname>Warr</surname><given-names>CG</given-names></name><name><surname>Freeman</surname><given-names>MR</given-names></name><name><surname>Lessing</surname><given-names>D</given-names></name><name><surname>Kim</surname><given-names>J</given-names></name><name><surname>Carlson</surname><given-names>JR</given-names></name></person-group><year>1999</year><article-title>A novel family of divergent seven-transmembrane proteins: candidate odorant receptors in Drosophila</article-title><source>Neuron</source><volume>22</volume><fpage>327</fpage><lpage>38</lpage><pub-id pub-id-type="doi">10.1016/S0896-6273(00)81093-4</pub-id></element-citation></ref><ref id="bib11"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Cork</surname><given-names>A</given-names></name><name><surname>Park</surname><given-names>KC</given-names></name></person-group><year>1996</year><article-title>Identification of electrophysiologically-active compounds for the malaria mosquito, <italic>Anopheles gambiae</italic>, in human sweat extracts</article-title><source>Med Vet Entomol</source><volume>10</volume><fpage>269</fpage><lpage>76</lpage><pub-id pub-id-type="doi">10.1111/j.1365-2915.1996.tb00742.x</pub-id></element-citation></ref><ref 
id="bib12"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Couto</surname><given-names>A</given-names></name><name><surname>Alenius</surname><given-names>M</given-names></name><name><surname>Dickson</surname><given-names>BJ</given-names></name></person-group><year>2005</year><article-title>Molecular, anatomical, and functional organization of the Drosophila olfactory system</article-title><source>Curr Biol</source><volume>15</volume><fpage>1535</fpage><lpage>47</lpage><pub-id pub-id-type="doi">10.1016/j.cub.2005.07.034</pub-id></element-citation></ref><ref id="bib13"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Curran</surname><given-names>AM</given-names></name><name><surname>Rabin</surname><given-names>SI</given-names></name><name><surname>Prada</surname><given-names>PA</given-names></name><name><surname>Furton</surname><given-names>KG</given-names></name></person-group><year>2005</year><article-title>Comparison of the volatile organic compounds present in human odor using SPME-GC/MS</article-title><source>J Chem Ecol</source><volume>31</volume><fpage>1607</fpage><lpage>19</lpage><pub-id pub-id-type="doi">10.1007/s10886-005-5801-4</pub-id></element-citation></ref><ref id="bib14"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Dahanukar</surname><given-names>A</given-names></name><name><surname>Hallem</surname><given-names>EA</given-names></name><name><surname>Carlson</surname><given-names>JR</given-names></name></person-group><year>2005</year><article-title>Insect chemoreception</article-title><source>Curr Opin Neurobiol</source><volume>15</volume><fpage>423</fpage><lpage>30</lpage><pub-id pub-id-type="doi">10.1016/j.conb.2005.06.001</pub-id></element-citation></ref><ref id="bib15"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>de 
Bruyne</surname><given-names>M</given-names></name><name><surname>Baker</surname><given-names>TC</given-names></name></person-group><year>2008</year><article-title>Odor detection in insects: volatile codes</article-title><source>J Chem Ecol</source><volume>34</volume><fpage>882</fpage><lpage>97</lpage><pub-id pub-id-type="doi">10.1007/s10886-008-9485-4</pub-id></element-citation></ref><ref id="bib16"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>de Bruyne</surname><given-names>M</given-names></name><name><surname>Clyne</surname><given-names>PJ</given-names></name><name><surname>Carlson</surname><given-names>JR</given-names></name></person-group><year>1999</year><article-title>Odor coding in a model olfactory organ: the Drosophila maxillary palp</article-title><source>J Neurosci</source><volume>19</volume><fpage>4520</fpage><lpage>32</lpage></element-citation></ref><ref id="bib17"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>de Bruyne</surname><given-names>M</given-names></name><name><surname>Foster</surname><given-names>K</given-names></name><name><surname>Carlson</surname><given-names>JR</given-names></name></person-group><year>2001</year><article-title>Odor coding in the Drosophila antenna</article-title><source>Neuron</source><volume>30</volume><fpage>537</fpage><lpage>52</lpage><pub-id pub-id-type="doi">10.1016/S0896-6273(01)00289-6</pub-id></element-citation></ref><ref id="bib18"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Dobritsa</surname><given-names>AA</given-names></name><name><surname>van der Goes van 
Naters</surname><given-names>W</given-names></name><name><surname>Warr</surname><given-names>CG</given-names></name><name><surname>Steinbrecht</surname><given-names>RA</given-names></name><name><surname>Carlson</surname><given-names>JR</given-names></name></person-group><year>2003</year><article-title>Integrating the molecular and cellular basis of odor coding in the Drosophila antenna</article-title><source>Neuron</source><volume>37</volume><fpage>827</fpage><lpage>41</lpage><pub-id pub-id-type="doi">10.1016/S0896-6273(03)00094-1</pub-id></element-citation></ref><ref id="bib19"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>El-Sayed</surname><given-names>A</given-names></name></person-group><year>2009</year><article-title>The Pherobase: database of insect pheromones and semiochemicals</article-title><ext-link ext-link-type="uri" xlink:href="http://www.pherobase.com/">http://www.pherobase.com/</ext-link></element-citation></ref><ref id="bib20"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Galizia</surname><given-names>CG</given-names></name><name><surname>Munch</surname><given-names>D</given-names></name><name><surname>Strauch</surname><given-names>M</given-names></name><name><surname>Nissler</surname><given-names>A</given-names></name><name><surname>Ma</surname><given-names>SW</given-names></name></person-group><year>2010</year><article-title>Integrating heterogeneous odor response data into a common response model: a DoOR to the complete olfactome</article-title><source>Chem Senses</source><volume>35</volume><fpage>551</fpage><lpage>63</lpage><pub-id pub-id-type="doi">10.1093/chemse/bjq042</pub-id></element-citation></ref><ref id="bib21"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Gallagher</surname><given-names>M</given-names></name><name><surname>Wysocki</surname><given-names>J</given-names></name><name><surname>Leyden</surname><given-names>JJ</given-names></name><name><surname>Spielman</surname><given-names>AI</given-names></name><name><surname>Sun</surname><given-names>X</given-names></name><name><surname>Preti</surname><given-names>G</given-names></name></person-group><year>2008</year><article-title>Analyses of volatile organic compounds from human skin</article-title><source>Br J Dermatol</source><volume>159</volume><fpage>780</fpage><lpage>91</lpage><pub-id pub-id-type="doi">10.1111/j.1365-2133.2008.08748.x</pub-id></element-citation></ref><ref id="bib22"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Goldman</surname><given-names>AL</given-names></name><name><surname>van Naters</surname><given-names>WV</given-names></name><name><surname>Lessing</surname><given-names>D</given-names></name><name><surname>Warr</surname><given-names>CG</given-names></name><name><surname>Carlson</surname><given-names>JR</given-names></name></person-group><year>2005</year><article-title>Coexpression of two functional odorant receptors in one neuron</article-title><source>Neuron</source><volume>45</volume><fpage>661</fpage><lpage>6</lpage><pub-id pub-id-type="doi">10.1016/j.neuron.2005.01.025</pub-id></element-citation></ref><ref id="bib23"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Guo</surname><given-names>S</given-names></name><name><surname>Kim</surname><given-names>J</given-names></name></person-group><year>2010</year><article-title>Dissecting the molecular mechanism of drosophila odorant receptors through activity modeling and comparative analysis</article-title><source>Proteins</source><volume>78</volume><fpage>381</fpage><lpage>99</lpage><pub-id 
pub-id-type="doi">10.1002/prot.22556</pub-id></element-citation></ref><ref id="bib24"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Haddad</surname><given-names>R</given-names></name><name><surname>Khan</surname><given-names>R</given-names></name><name><surname>Takahashi</surname><given-names>YK</given-names></name><name><surname>Mori</surname><given-names>K</given-names></name><name><surname>Harel</surname><given-names>D</given-names></name><name><surname>Sobel</surname><given-names>N</given-names></name></person-group><year>2008</year><article-title>A metric for odorant comparison</article-title><source>Nat Methods</source><volume>5</volume><fpage>425</fpage><lpage>9</lpage><pub-id pub-id-type="doi">10.1038/nmeth.1197</pub-id></element-citation></ref><ref id="bib25"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Hallem</surname><given-names>EA</given-names></name><name><surname>Carlson</surname><given-names>JR</given-names></name></person-group><year>2006</year><article-title>Coding of odors by a receptor repertoire</article-title><source>Cell</source><volume>125</volume><fpage>143</fpage><lpage>60</lpage><pub-id pub-id-type="doi">10.1016/j.cell.2006.01.050</pub-id></element-citation></ref><ref id="bib26"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Hallem</surname><given-names>EA</given-names></name><name><surname>Ho</surname><given-names>MG</given-names></name><name><surname>Carlson</surname><given-names>JR</given-names></name></person-group><year>2004</year><article-title>The molecular basis of odor coding in the Drosophila antenna</article-title><source>Cell</source><volume>117</volume><fpage>965</fpage><lpage>79</lpage><pub-id pub-id-type="doi">10.1016/j.cell.2004.05.012</pub-id></element-citation></ref><ref id="bib27"><element-citation publication-type="book"><person-group 
person-group-type="author"><name><surname>Hastie</surname><given-names>T</given-names></name><name><surname>Tibshirani</surname><given-names>R</given-names></name><name><surname>Friedman</surname><given-names>JH</given-names></name></person-group><year>2001</year><source>The elements of statistical learning: data mining, inference, and prediction: with 200 full-color illustrations</source><publisher-loc>New York</publisher-loc><publisher-name>Springer</publisher-name></element-citation></ref><ref id="bib28"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Hawkins</surname><given-names>PCD</given-names></name><name><surname>Skillman</surname><given-names>AG</given-names></name><name><surname>Warren</surname><given-names>GL</given-names></name><name><surname>Ellingson</surname><given-names>BA</given-names></name><name><surname>Stahl</surname><given-names>MT</given-names></name></person-group><year>2010</year><article-title>Conformer generation with OMEGA: algorithm and validation using high quality structures from the protein Databank and Cambridge structural database</article-title><source>J Chem Inf Model</source><volume>50</volume><fpage>572</fpage><lpage>84</lpage><pub-id pub-id-type="doi">10.1021/ci100031x</pub-id></element-citation></ref><ref id="bib29"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Hendrickson</surname><given-names>JB</given-names></name></person-group><year>1991</year><article-title>Concepts and applications of molecular similarity - Johnson, Ma, Maggiora, Gm</article-title><source>Science</source><volume>252</volume><fpage>1189</fpage><pub-id pub-id-type="doi">10.1126/science.252.5009.1189</pub-id></element-citation></ref><ref id="bib30"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Keiser</surname><given-names>MJ</given-names></name><name><surname>Setola</surname><given-names>V</given-names></name><name><surname>Irwin</surname><given-names>JJ</given-names></name><name><surname>Laggner</surname><given-names>C</given-names></name><name><surname>Abbas</surname><given-names>AI</given-names></name><name><surname>Hufeisen</surname><given-names>SJ</given-names></name><etal/></person-group><year>2009</year><article-title>Predicting new molecular targets for known drugs</article-title><source>Nature</source><volume>462</volume><fpage>175</fpage><lpage>81</lpage><pub-id pub-id-type="doi">10.1038/nature08506</pub-id></element-citation></ref><ref id="bib31"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Knudsen</surname><given-names>JT</given-names></name><name><surname>Eriksson</surname><given-names>R</given-names></name><name><surname>Gershenzon</surname><given-names>J</given-names></name><name><surname>Stahl</surname><given-names>B</given-names></name></person-group><year>2006</year><article-title>Diversity and distribution of floral scent</article-title><source>Bot Rev</source><volume>72</volume><fpage>1</fpage><lpage>120</lpage><pub-id pub-id-type="doi">10.1663/0006-8101(2006)72[1:DADOFS]2.0.CO;2</pub-id></element-citation></ref><ref id="bib32"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Kreher</surname><given-names>SA</given-names></name><name><surname>Kwon</surname><given-names>JY</given-names></name><name><surname>Carlson</surname><given-names>JR</given-names></name></person-group><year>2005</year><article-title>The molecular basis of odor coding in the Drosophila larva</article-title><source>Neuron</source><volume>46</volume><fpage>445</fpage><lpage>56</lpage><pub-id pub-id-type="doi">10.1016/j.neuron.2005.04.007</pub-id></element-citation></ref><ref id="bib33"><element-citation 
publication-type="journal"><person-group person-group-type="author"><name><surname>Kreher</surname><given-names>SA</given-names></name><name><surname>Mathew</surname><given-names>D</given-names></name><name><surname>Kim</surname><given-names>J</given-names></name><name><surname>Carlson</surname><given-names>JR</given-names></name></person-group><year>2008</year><article-title>Translation of sensory input into behavioral output via an olfactory system</article-title><source>Neuron</source><volume>59</volume><fpage>110</fpage><lpage>24</lpage><pub-id pub-id-type="doi">10.1016/j.neuron.2008.06.010</pub-id></element-citation></ref><ref id="bib34"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Kwon</surname><given-names>JY</given-names></name><name><surname>Dahanukar</surname><given-names>A</given-names></name><name><surname>Weiss</surname><given-names>LA</given-names></name><name><surname>Carlson</surname><given-names>JR</given-names></name></person-group><year>2007</year><article-title>The molecular basis of CO2 reception in Drosophila</article-title><source>Proc Natl Acad Sci USA</source><volume>104</volume><fpage>3574</fpage><lpage>8</lpage><pub-id pub-id-type="doi">10.1073/pnas.0700079104</pub-id></element-citation></ref><ref id="bib35"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Logan</surname><given-names>JG</given-names></name><name><surname>Birkett</surname><given-names>MA</given-names></name><name><surname>Clark</surname><given-names>SJ</given-names></name><name><surname>Powers</surname><given-names>S</given-names></name><name><surname>Seal</surname><given-names>NJ</given-names></name><name><surname>Wadhams</surname><given-names>LJ</given-names></name><etal/></person-group><year>2008</year><article-title>Identification of human-derived volatile chemicals that interfere with attraction of <italic>Aedes aegypti</italic> 
mosquitoes</article-title><source>J Chem Ecol</source><volume>34</volume><fpage>308</fpage><lpage>22</lpage><pub-id pub-id-type="doi">10.1007/s10886-008-9436-0</pub-id></element-citation></ref><ref id="bib36"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Maldonado</surname><given-names>AG</given-names></name><name><surname>Doucet</surname><given-names>JP</given-names></name><name><surname>Petitjean</surname><given-names>M</given-names></name><name><surname>Fan</surname><given-names>BT</given-names></name></person-group><year>2006</year><article-title>Molecular similarity and diversity in chemoinformatics: from theory to applications</article-title><source>Mol Divers</source><volume>10</volume><fpage>39</fpage><lpage>79</lpage><pub-id pub-id-type="doi">10.1007/s11030-006-8697-1</pub-id></element-citation></ref><ref id="bib37"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Martin</surname><given-names>YC</given-names></name><name><surname>Kofron</surname><given-names>JL</given-names></name><name><surname>Traphagen</surname><given-names>LM</given-names></name></person-group><year>2002</year><article-title>Do structurally similar molecules have similar biological activity?</article-title><source>J Med Chem</source><volume>45</volume><fpage>4350</fpage><lpage>8</lpage><pub-id pub-id-type="doi">10.1021/jm020155c</pub-id></element-citation></ref><ref id="bib38"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Mathew</surname><given-names>D</given-names></name><name><surname>Martelli</surname><given-names>C</given-names></name><name><surname>Kelley-Swift</surname><given-names>E</given-names></name><name><surname>Brusalis</surname><given-names>C</given-names></name><name><surname>Gershow</surname><given-names>M</given-names></name><name><surname>Samuel</surname><given-names>AD</given-names></name><etal/></person-group><year>2013</year><article-title>Functional diversity among sensory receptors in a Drosophila olfactory circuit</article-title><source>Proc Natl Acad Sci USA</source><volume>110</volume><fpage>E2134</fpage><lpage>43</lpage><pub-id pub-id-type="doi">10.1073/pnas.1306976110</pub-id></element-citation></ref><ref id="bib39"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Meijerink</surname><given-names>J</given-names></name><name><surname>Braks</surname><given-names>MAH</given-names></name><name><surname>Brack</surname><given-names>AA</given-names></name><name><surname>Adam</surname><given-names>W</given-names></name><name><surname>Dekker</surname><given-names>T</given-names></name><name><surname>Posthumus</surname><given-names>MA</given-names></name><etal/></person-group><year>2000</year><article-title>Identification of olfactory stimulants for <italic>Anopheles gambiae</italic> from human sweat samples</article-title><source>J Chem Ecol</source><volume>26</volume><fpage>1367</fpage><lpage>82</lpage><pub-id pub-id-type="doi">10.1023/A:1005475422978</pub-id></element-citation></ref><ref id="bib40"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Pelz</surname><given-names>D</given-names></name><name><surname>Roeske</surname><given-names>T</given-names></name><name><surname>Syed</surname><given-names>Z</given-names></name><name><surname>De 
Bruyne</surname><given-names>M</given-names></name><name><surname>Galizia</surname><given-names>CG</given-names></name></person-group><year>2006</year><article-title>The molecular receptive range of an olfactory receptor in vivo (<italic>Drosophila melanogaster</italic> Or22A)</article-title><source>J Neurobiol</source><volume>66</volume><fpage>1544</fpage><lpage>63</lpage><pub-id pub-id-type="doi">10.1002/neu.20333</pub-id></element-citation></ref><ref id="bib42"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Saito</surname><given-names>H</given-names></name><name><surname>Chi</surname><given-names>Q</given-names></name><name><surname>Zhuang</surname><given-names>H</given-names></name><name><surname>Matsunami</surname><given-names>H</given-names></name><name><surname>Mainland</surname><given-names>JD</given-names></name></person-group><year>2009</year><article-title>Odor coding by a mammalian receptor repertoire</article-title><source>Sci Signal</source><volume>2</volume><fpage>ra9</fpage><pub-id pub-id-type="doi">10.1126/scisignal.2000016</pub-id></element-citation></ref><ref id="bib43"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Schmuker</surname><given-names>M</given-names></name><name><surname>De Bruyne</surname><given-names>M</given-names></name><name><surname>Hahnel</surname><given-names>M</given-names></name><name><surname>Schneider</surname><given-names>G</given-names></name></person-group><year>2007</year><article-title>Predicting olfactory receptor neuron responses from odorant structure</article-title><source>Chem Cent J</source><volume>1</volume><fpage>11</fpage><pub-id pub-id-type="doi">10.1186/1752-153X-1-11</pub-id></element-citation></ref><ref id="bib44"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Schneider</surname><given-names>G</given-names></name><name><surname>Schneider</surname><given-names>P</given-names></name><name><surname>Renner</surname><given-names>S</given-names></name></person-group><year>2006</year><article-title>Scaffold-hopping: how far can you jump?</article-title><source>QSAR Comb Sci</source><volume>25</volume><fpage>1162</fpage><lpage>71</lpage><pub-id pub-id-type="doi">10.1002/qsar.200610091</pub-id></element-citation></ref><ref id="bib45"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Semmelhack</surname><given-names>JL</given-names></name><name><surname>Wang</surname><given-names>JW</given-names></name></person-group><year>2009</year><article-title>Select <italic>Drosophila</italic> glomeruli mediate innate olfactory attraction and aversion</article-title><source>Nature</source><volume>459</volume><fpage>218</fpage><lpage>23</lpage><pub-id pub-id-type="doi">10.1038/nature07983</pub-id></element-citation></ref><ref id="bib46"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Shannon</surname><given-names>P</given-names></name><name><surname>Markiel</surname><given-names>A</given-names></name><name><surname>Ozier</surname><given-names>O</given-names></name><name><surname>Baliga</surname><given-names>NS</given-names></name><name><surname>Wang</surname><given-names>JT</given-names></name><name><surname>Ramage</surname><given-names>D</given-names></name><etal/></person-group><year>2003</year><article-title>Cytoscape: a software environment for integrated models of biomolecular interaction networks</article-title><source>Genome Res</source><volume>13</volume><fpage>2498</fpage><lpage>504</lpage><pub-id pub-id-type="doi">10.1101/gr.1239303</pub-id></element-citation></ref><ref id="bib47"><element-citation publication-type="book"><person-group
person-group-type="author"><collab>Sigma-Aldrich</collab></person-group><year>2007</year><source>Flavors and fragrances 2007-2008 catalog</source><publisher-loc>Milwaukee, WI</publisher-loc><publisher-name>Sigma-Aldrich Fine Chemicals Company</publisher-name></element-citation></ref><ref id="bib48"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Stensmyr</surname><given-names>MC</given-names></name><name><surname>Giordano</surname><given-names>E</given-names></name><name><surname>Balloi</surname><given-names>A</given-names></name><name><surname>Angioy</surname><given-names>AM</given-names></name><name><surname>Hansson</surname><given-names>BS</given-names></name></person-group><year>2003</year><article-title>Novel natural ligands for Drosophila olfactory receptor neurones</article-title><source>J Exp Biol</source><volume>206</volume><fpage>715</fpage><lpage>24</lpage><pub-id pub-id-type="doi">10.1242/jeb.00143</pub-id></element-citation></ref><ref id="bib49"><element-citation publication-type="book"><person-group person-group-type="author"><name><surname>Tan</surname><given-names>P-N</given-names></name><name><surname>Steinbach</surname><given-names>M</given-names></name><name><surname>Kumar</surname><given-names>V</given-names></name></person-group><year>2006</year><source>Introduction to data mining</source><publisher-loc>Boston</publisher-loc><publisher-name>Pearson Addison Wesley</publisher-name></element-citation></ref><ref id="bib50"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Triballeau</surname><given-names>N</given-names></name><name><surname>van
Name</surname><given-names>E</given-names></name><name><surname>Laslier</surname><given-names>G</given-names></name><name><surname>Cai</surname><given-names>D</given-names></name><name><surname>Paillard</surname><given-names>G</given-names></name><name><surname>Sorensen</surname><given-names>PW</given-names></name><etal/></person-group><year>2008</year><article-title>High-potency olfactory receptor agonists discovered by virtual high-throughput screening: molecular probes for receptor structure and olfactory function</article-title><source>Neuron</source><volume>60</volume><fpage>767</fpage><lpage>74</lpage><pub-id pub-id-type="doi">10.1016/j.neuron.2008.11.014</pub-id></element-citation></ref><ref id="bib41"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Turner</surname><given-names>SL</given-names></name><name><surname>Ray</surname><given-names>A</given-names></name></person-group><year>2009</year><article-title>Modification of CO(2) avoidance behaviour in Drosophila by inhibitory odorants</article-title><source>Nature</source><volume>461</volume><fpage>277</fpage><lpage>81</lpage><pub-id pub-id-type="doi">10.1038/nature08295</pub-id></element-citation></ref><ref id="bib51"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>van der Goes van Naters</surname><given-names>W</given-names></name><name><surname>Carlson</surname><given-names>JR</given-names></name></person-group><year>2006</year><article-title>Insects as chemosensors of humans and crops</article-title><source>Nature</source><volume>444</volume><fpage>302</fpage><lpage>7</lpage><pub-id pub-id-type="doi">10.1038/nature05403</pub-id></element-citation></ref><ref id="bib52"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>van 
Naters</surname><given-names>WVG</given-names></name><name><surname>Carlson</surname><given-names>JR</given-names></name></person-group><year>2007</year><article-title>Receptors and neurons for fly odors in Drosophila</article-title><source>Curr Biol</source><volume>17</volume><fpage>606</fpage><lpage>12</lpage><pub-id pub-id-type="doi">10.1016/j.cub.2007.02.043</pub-id></element-citation></ref><ref id="bib53"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Vosshall</surname><given-names>LB</given-names></name><name><surname>Amrein</surname><given-names>H</given-names></name><name><surname>Morozov</surname><given-names>PS</given-names></name><name><surname>Rzhetsky</surname><given-names>A</given-names></name><name><surname>Axel</surname><given-names>R</given-names></name></person-group><year>1999</year><article-title>A spatial map of olfactory receptor expression in the Drosophila antenna</article-title><source>Cell</source><volume>96</volume><fpage>725</fpage><lpage>36</lpage><pub-id pub-id-type="doi">10.1016/S0092-8674(00)80582-6</pub-id></element-citation></ref><ref id="bib54"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Whitney</surname><given-names>AW</given-names></name></person-group><year>1971</year><article-title>Direct method of nonparametric measurement selection</article-title><source>IEEE Trans Comput</source><volume>C 20</volume><fpage>1100</fpage><lpage>3</lpage><pub-id pub-id-type="doi">10.1109/T-C.1971.223410</pub-id></element-citation></ref><ref id="bib55"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Wishart</surname><given-names>DS</given-names></name><name><surname>Knox</surname><given-names>C</given-names></name><name><surname>Guo</surname><given-names>AC</given-names></name><name><surname>Cheng</surname><given-names>D</given-names></name><name><surname>Shrivastava</surname><given-names>S</given-names></name><name><surname>Tzur</surname><given-names>D</given-names></name><etal/></person-group><year>2008</year><article-title>DrugBank: a knowledgebase for drugs, drug actions and drug targets</article-title><source>Nucleic Acids Res</source><volume>36</volume><fpage>D901</fpage><lpage>6</lpage><pub-id pub-id-type="doi">10.1093/nar/gkm958</pub-id></element-citation></ref><ref id="bib56"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Yao</surname><given-names>CA</given-names></name><name><surname>Ignell</surname><given-names>R</given-names></name><name><surname>Carlson</surname><given-names>JR</given-names></name></person-group><year>2005</year><article-title>Chemosensory coding by neurons in the coeloconic sensilla of the Drosophila antenna</article-title><source>J Neurosci</source><volume>25</volume><fpage>8359</fpage><lpage>67</lpage><pub-id pub-id-type="doi">10.1523/JNEUROSCI.2432-05.2005</pub-id></element-citation></ref><ref id="bib57"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Zeng</surname><given-names>XN</given-names></name><name><surname>Leyden</surname><given-names>JJ</given-names></name><name><surname>Lawley</surname><given-names>HJ</given-names></name><name><surname>Sawano</surname><given-names>K</given-names></name><name><surname>Nohara</surname><given-names>I</given-names></name><name><surname>Preti</surname><given-names>G</given-names></name></person-group><year>1991</year><article-title>Analysis of characteristic odors from human male axillae</article-title><source>J Chem
Ecol</source><volume>17</volume><fpage>1469</fpage><lpage>92</lpage><pub-id pub-id-type="doi">10.1007/BF00983777</pub-id></element-citation></ref><ref id="bib58"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Zeng</surname><given-names>XN</given-names></name><name><surname>Leyden</surname><given-names>JJ</given-names></name><name><surname>Spielman</surname><given-names>AI</given-names></name><name><surname>Preti</surname><given-names>G</given-names></name></person-group><year>1996</year><article-title>Analysis of characteristic human female axillary odors: qualitative comparison to males</article-title><source>J Chem Ecol</source><volume>22</volume><fpage>237</fpage><lpage>57</lpage><pub-id pub-id-type="doi">10.1007/BF02055096</pub-id></element-citation></ref></ref-list></back><sub-article article-type="article-commentary" id="SA1"><front-stub><article-id pub-id-type="doi">10.7554/eLife.01120.014</article-id><title-group><article-title>Decision letter</article-title></title-group><contrib-group content-type="section"><contrib contrib-type="editor"><name><surname>Luo</surname><given-names>Liqun</given-names></name><role>Reviewing editor</role><aff><institution>Stanford University</institution>, <country>United States</country></aff></contrib></contrib-group></front-stub><body><boxed-text><p>eLife posts the editorial decision letter and author response on a selection of the published articles (subject to the approval of the authors). An edited version of the letter sent to the authors after peer review is shown, indicating the substantive concerns or comments; minor concerns are not usually shown. Reviewers have the opportunity to discuss the decision before the letter is sent (see <ext-link ext-link-type="uri" xlink:href="http://elife.elifesciences.org/review-process">review process</ext-link>). 
Similarly, the author response typically shows only responses to the major concerns raised by the reviewers.</p></boxed-text><p>Thank you for sending your work entitled “Expanding the olfactory code by in silico decoding of odor-receptor chemical space” for consideration at <italic>eLife</italic>. Your article has been favorably evaluated by a Senior editor and 3 reviewers, one of whom is a member of our Board of Reviewing Editors.</p><p>The Reviewing editor and the other reviewers discussed their comments before we reached this decision, and the Reviewing editor has assembled the following comments to help you prepare a revised submission.</p><p>The manuscript by Boyle et al. describes the use of a computational approach to identify new receptor-odor pairs for <italic>Drosophila</italic> based on analysis of a previously published small pool of known agonists and antagonists on a panel of 19 Drosophila odorant receptors. The achievements include: 1) by validating 71% of the 141 experimentally tested odor-receptor pairs, the authors expanded the existing dataset by a significant amount (it would be useful for the authors to determine the precise number and add it to the revised text); 2) the 19 Ors <bold>×</bold> top 500 odorants matrix could constitute a much larger set of odor-receptor pairs that have a good chance of being authentic; 3) the same method can in principle be applied to other organisms, after experimentally determining a reasonably rich matrix of receptor-odor pairs. Overall the manuscript is of high technical quality. We would therefore like to invite the authors to revise the manuscript by addressing the following specific critiques.</p><p>1) The authors should make more explicit the limitations of the study in the Abstract and conclusions; their current summary creates false expectations. First, they can only predict more ligands based on receptors that have already been experimentally tested against a large number of ligands.
Second, they predicted ligands for only 19 Ors; although technically it may constitute the “majority” of ORNs that utilize Ors, it is certainly not a “majority” of ∼50 ORN types in adult Drosophila (as a significant fraction of ORNs utilize Irs). Third, the validated 71% came from only 9 Ors; it is unclear whether this can be generalized to the other untested Or classes. Fourth, there is no evidence that any of the agonists or antagonists identified in silico are better (higher affinity) or more likely to be the true biologically relevant ligands for these receptors than those identified in small chemical libraries based on ecological reasoning (i.e., the training set).</p><p>2) The use of the optimized descriptor sets needs to be defined more explicitly. In <xref ref-type="table" rid="tbl1">Table 1</xref> what are the numbers given for each descriptor? If these are classes of descriptors then how many descriptors from the original set of over 3,000 are actually being used, and what are they? And how do they appear to be relevant to odor quality? One of the issues with the similar analysis performed by Haddad was that the descriptors (there determined by a PCA analysis) seemed to have little obvious relevance to odor quality. Is the optimized method presented here an improvement on that? The description of the SFS approach does not provide any detail as to how each incremental descriptor was chosen to grow the set. Was this done by a purely statistical or machine learning method or did the investigators use some intuition regarding the likelihood of the descriptor to be relevant to odor character?</p><p>3) Regarding the paragraph ‘Producing a systems level view of receptor activity for the <italic>Drosophila</italic> antenna’, either this very intriguing analysis should go in the Discussion or a deeper analysis is required for the reader. The network drawn in Figure 6A is impossible to interpret.
The authors might choose a natural blend, or even an artificial one, and show us the network for that particular group of odors. That would be clearer and more useful, especially if it shows something new or unexpected. Showing all the interactions creates an attractive graphic, but not one that is informative.</p><p>4) The authors should provide more raw data from their analyses.</p></body></sub-article><sub-article article-type="reply" id="SA2"><front-stub><article-id pub-id-type="doi">10.7554/eLife.01120.015</article-id><title-group><article-title>Author response</article-title></title-group></front-stub><body><p><italic>1A) “…they can only predict more ligands based on receptors that have already been experimentally tested against a large number of ligands.</italic>”</p><p>We agree with the critique that we are only able to predict ligands for receptors for which a training odor set has been made available. We had included this information within the Results section, but have now incorporated this into our Abstract and Discussion sections.</p><p><italic>1B) “…they predicted ligands for only 19 Ors; although technically it may constitute the “majority” of ORNs that utilize Ors, it is certainly not a “majority” of ∼50 ORN types in adult Drosophila (as a significant fraction of ORNs utilize Irs).</italic>”</p><p>We agree that the wording could come across as ambiguous, which is not our intention. We have modified the text to clarify the distinction and included a short discussion of the potential for predictions from other classes of chemoreceptors in the Discussion section.</p><p><italic>1C) “…the validated 71% came from only 9 Ors; it is unclear whether this can be generalized to the other untested Or classes.</italic>”</p><p>The rationale for selecting the 9 receptors for testing was accessibility to electrophysiology and unambiguous identification using a diagnostic odor panel.
We have modified the text in the Abstract to accurately present the experimental data and clarify that 9 Ors were validated.</p><p><italic>1D) “…there is no evidence that any of the agonists or antagonists identified in silico are better (higher affinity) or more likely to be the true biologically relevant ligands for these receptors than those identified in small chemical libraries based on ecological reasoning (i.e., the training set).</italic>”</p><p>We agree with the critique. Our focus in this analysis was to create a method that could identify a large number of active compounds, a goal we achieved. We expect this will be useful for hypothesis generation and potentially for identifying stronger, or ecologically and behaviorally important, odors. We anticipate that the availability of a large number of candidate ligands will also be useful in behavioral disruption programs for pest and disease vector species, since it will allow a researcher to judiciously select affordable, pleasant-smelling and environmentally safe chemicals for applications.</p><p><italic>2A) “In</italic> <xref ref-type="table" rid="tbl1"><italic>Table 1</italic></xref> <italic>what are the numbers given for each descriptor? If these are classes of descriptors then how many descriptors from the original set of over 3,000 are actually being used, and what are they? And how do they appear to be relevant to odor quality?</italic>”</p><p>The numbers in <xref ref-type="table" rid="tbl1">Table 1</xref> give, as an overview, the total number of molecular descriptors from each class that were identified by our approach. We agree that it would be much more informative to provide the exact molecular descriptor sets that were optimized for each receptor. We have now created a new table that provides molecular descriptor symbols, weights, descriptions, classes, and descriptive dimensionality for each Or-optimized set used in our study. This provides a wealth of useful data.
While several of these descriptors are very specific and represent high-dimensional graph-based theory, a number of the selected descriptors are easily understood, such as functional group counts and atom-type descriptors. Through this table readers will be able to identify which functional groups and 2D fingerprints are most important for a particular receptor, and specialists will be able to utilize them in their own analyses, such as the prediction pipeline we have created.</p><p><italic>2B) “The description of the SFS approach does not provide any detail as to how each incremental descriptor was chosen to grow the set.</italic>”</p><p>We apologize for being unclear and have clarified this in the Materials and methods section.</p><p><italic>3) “…this very intriguing analysis should go in the Discussion or a deeper analysis is required for the reader.</italic>”</p><p>We agree and thank you for your suggestion. We have moved this section to the Discussion.</p><p><italic>4) “The authors should provide more raw data from their analyses.</italic>”</p><p>We thank you for this suggestion and agree that this manuscript would benefit from more raw data. We have incorporated several new supplemental tables and figures. Newly incorporated data include: optimized molecular descriptor sets for each predicted Or, including name, weight, class, description, and dimensionality; top 100 predicted compounds for each of the receptors analyzed; APoA plots for individual Ors; and pharmacophore structures for active compounds for each Or.</p></body></sub-article></article>