Feature/issue 9 #29

Merged
merged 5 commits into from
Nov 9, 2023
3 changes: 1 addition & 2 deletions CHANGELOG.md
Expand Up @@ -6,6 +6,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]
### Added
- Issue 9 - Create API Usage Documentation
- Issue 4 - User guide for how to run database load script manually
- Issue 12 - Move all constants to separate constants file
- Issue 6 - Pylint and Flake8 configured
Expand All @@ -23,5 +24,3 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### Removed
### Fixed
### Security


34 changes: 34 additions & 0 deletions docs/_config.yml
@@ -0,0 +1,34 @@
# Book settings
# Learn more at https://jupyterbook.org/customize/config.html

title: Hydrocron
author: Physical Oceanography Distributed Active Archive Center (PO.DAAC)
logo: hydrocron_logo.png

# Force re-execution of notebooks on each build.
# See https://jupyterbook.org/content/execute.html
execute:
execute_notebooks: force

# Define the name of the latex output file for PDF builds
latex:
latex_documents:
targetname: book.tex

# Add a bibtex file so that we can create citations
bibtex_bibfiles:
- references.bib

# Information about where the book exists on the web
repository:
url: https://github.com/executablebooks/jupyter-book # Online location of your book
path_to_book: docs # Optional path to your book, relative to the repository root
branch: master # Which branch of the repository should be used when creating links (optional)

# Add GitHub buttons to your book
# See https://jupyterbook.org/customize/config.html#add-a-link-to-your-repository
html:
favicon: hydrocron_logo.png
use_issues_button: true
use_repository_button: true
home_page_in_navbar: false # Whether to include your home page in the left Navigation Bar
30 changes: 30 additions & 0 deletions docs/_toc.yml
@@ -0,0 +1,30 @@
# Table of contents
# Learn more at https://jupyterbook.org/customize/toc.html

format: jb-book
defaults:
titlesonly: true
root: intro
parts:
- caption: GETTING STARTED
chapters:
- file: overview
- file: examples
- caption: API ENDPOINTS
chapters:
- file: timeseries
- caption: IMPORTANT USAGE NOTES
chapters:
- file: time
- caption: RESOURCES
chapters:
- url: https://github.com/podaac/hydrocron
title: GitHub Repository
- url: https://podaac.jpl.nasa.gov/SWOT
title: SWOT Mission Page
- url: https://www.swordexplorer.com/
title: SWOT River Database (SWORD) Explorer
- url: https://podaac.jpl.nasa.gov
title: PO.DAAC Website
- url: https://podaac.github.io/tutorials/
title: PO.DAAC Cookbook
17 changes: 17 additions & 0 deletions docs/examples.md
@@ -0,0 +1,17 @@
# Examples

## Get time series CSV for river reach

Search for a single river reach by reach ID.

/timeseries?feature=reach&feature_id=71224300643&output=csv&start_time=2023-08-01T00:00:00&end_time=2023-10-31T00:00:00

This will return a CSV file, e.g.:

## Get time series GEOJSON for river node

Search for a single river node by node ID.

/timeseries?feature=node&feature_id=45243800160101&output=geojson&start_time=2023-08-01T00:00:00&end_time=2023-10-31T00:00:00

This will return GeoJSON, e.g.:
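Queries like the two above can also be built programmatically. The sketch below only constructs the request URL; the base URL is a hypothetical placeholder, not the actual Hydrocron host:

```python
from urllib.parse import urlencode

# Hypothetical base URL -- substitute the real Hydrocron host here.
BASE_URL = "https://example.com/hydrocron/v1"

def build_timeseries_url(feature, feature_id, start_time, end_time, output):
    """Construct a /timeseries query URL from its parameters."""
    params = {
        "feature": feature,
        "feature_id": feature_id,
        "start_time": start_time,
        "end_time": end_time,
        "output": output,
    }
    return f"{BASE_URL}/timeseries?{urlencode(params)}"

url = build_timeseries_url(
    "reach", "71224300643",
    "2023-08-01T00:00:00", "2023-10-31T00:00:00", "csv",
)
print(url)
```

Note that `urlencode` percent-escapes the colons in the timestamps, which is valid in a query string.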
Binary file added docs/hydrocron_logo.png
11 changes: 11 additions & 0 deletions docs/intro.md
@@ -0,0 +1,11 @@
# Hydrocron Documentation

Hydrocron is an API that repackages hydrology datasets from the Surface Water and Ocean Topography (SWOT) satellite into formats that make time-series analysis easier.

SWOT data is archived as individually timestamped shapefiles, which would otherwise require users to perform potentially thousands of file I/O operations per river feature to view the data as a time series. Hydrocron makes this possible with a single API call.

Original SWOT data is archived at NASA's [Physical Oceanography Distributed Active Archive Center (PO.DAAC)](https://podaac.jpl.nasa.gov/SWOT).

Datasets included in Hydrocron:

- [SWOT Level 2 River Single-Pass Vector Data Product, Version 1.1](https://podaac.jpl.nasa.gov/dataset/SWOT_L2_HR_RiverSP_1.1)
21 changes: 21 additions & 0 deletions docs/overview.md
@@ -0,0 +1,21 @@
# Overview

Hydrocron has two main API endpoints:

- [timeseries/](/timeseries), which returns all of the timesteps for a single feature ID, and

- timeseriesSubset/, which returns all of the timesteps for all of the features within a given GeoJSON polygon (not yet released)

## Feature ID

The main timeseries endpoint allows users to search by feature ID.

River reach and node ID numbers are defined in the [SWOT River Database (SWORD)](https://doi.org/10.1029/2021WR030054),
and can be browsed using the [SWORD Explorer Interactive Dashboard](https://www.swordexplorer.com/).

SWOT may observe lakes and rivers that do not have an ID in the prior databases. In those cases, hydrology features are added to the Unassigned Lakes data product.
Hydrocron does not currently support Unassigned rivers and lakes.

## Limitations

Data return size is limited to 6 MB. If your query response is larger than this, a 413 error will be returned.
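One way to work around the 6 MB limit is to split a long time range into shorter windows and issue one query per window. A minimal sketch (the 30-day window size is an arbitrary choice, not part of the API):

```python
from datetime import datetime, timedelta

def split_time_range(start, end, days=30):
    """Yield (window_start, window_end) pairs covering [start, end]
    in chunks of at most `days` days."""
    fmt = "%Y-%m-%dT%H:%M:%S"
    t0, t1 = datetime.strptime(start, fmt), datetime.strptime(end, fmt)
    step = timedelta(days=days)
    while t0 < t1:
        window_end = min(t0 + step, t1)
        yield t0.strftime(fmt), window_end.strftime(fmt)
        t0 = window_end

windows = list(split_time_range("2023-08-01T00:00:00", "2023-10-31T00:00:00"))
```

Each `(window_start, window_end)` pair can then be used as the `start_time`/`end_time` of a separate query, and the partial results concatenated client-side.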
56 changes: 56 additions & 0 deletions docs/references.bib
@@ -0,0 +1,56 @@
---
---

@inproceedings{holdgraf_evidence_2014,
address = {Brisbane, Australia, Australia},
title = {Evidence for {Predictive} {Coding} in {Human} {Auditory} {Cortex}},
booktitle = {International {Conference} on {Cognitive} {Neuroscience}},
publisher = {Frontiers in Neuroscience},
author = {Holdgraf, Christopher Ramsay and de Heer, Wendy and Pasley, Brian N. and Knight, Robert T.},
year = {2014}
}

@article{holdgraf_rapid_2016,
title = {Rapid tuning shifts in human auditory cortex enhance speech intelligibility},
volume = {7},
issn = {2041-1723},
url = {http://www.nature.com/doifinder/10.1038/ncomms13654},
doi = {10.1038/ncomms13654},
number = {May},
journal = {Nature Communications},
author = {Holdgraf, Christopher Ramsay and de Heer, Wendy and Pasley, Brian N. and Rieger, Jochem W. and Crone, Nathan and Lin, Jack J. and Knight, Robert T. and Theunissen, Frédéric E.},
year = {2016},
pages = {13654},
file = {Holdgraf et al. - 2016 - Rapid tuning shifts in human auditory cortex enhance speech intelligibility.pdf:C\:\\Users\\chold\\Zotero\\storage\\MDQP3JWE\\Holdgraf et al. - 2016 - Rapid tuning shifts in human auditory cortex enhance speech intelligibility.pdf:application/pdf}
}

@inproceedings{holdgraf_portable_2017,
title = {Portable learning environments for hands-on computational instruction using container-and cloud-based technology to teach data science},
volume = {Part F1287},
isbn = {978-1-4503-5272-7},
doi = {10.1145/3093338.3093370},
abstract = {© 2017 ACM. There is an increasing interest in learning outside of the traditional classroom setting. This is especially true for topics covering computational tools and data science, as both are challenging to incorporate in the standard curriculum. These atypical learning environments offer new opportunities for teaching, particularly when it comes to combining conceptual knowledge with hands-on experience/expertise with methods and skills. Advances in cloud computing and containerized environments provide an attractive opportunity to improve the efficiency and ease with which students can learn. This manuscript details recent advances towards using commonly-available cloud computing services and advanced cyberinfrastructure support for improving the learning experience in bootcamp-style events. We cover the benefits (and challenges) of using a server hosted remotely instead of relying on student laptops, discuss the technology that was used in order to make this possible, and give suggestions for how others could implement and improve upon this model for pedagogy and reproducibility.},
booktitle = {{ACM} {International} {Conference} {Proceeding} {Series}},
author = {Holdgraf, Christopher Ramsay and Culich, A. and Rokem, A. and Deniz, F. and Alegro, M. and Ushizima, D.},
year = {2017},
keywords = {Teaching, Bootcamps, Cloud computing, Data science, Docker, Pedagogy}
}

@article{holdgraf_encoding_2017,
title = {Encoding and decoding models in cognitive electrophysiology},
volume = {11},
issn = {16625137},
doi = {10.3389/fnsys.2017.00061},
abstract = {© 2017 Holdgraf, Rieger, Micheli, Martin, Knight and Theunissen. Cognitive neuroscience has seen rapid growth in the size and complexity of data recorded from the human brain as well as in the computational tools available to analyze this data. This data explosion has resulted in an increased use of multivariate, model-based methods for asking neuroscience questions, allowing scientists to investigate multiple hypotheses with a single dataset, to use complex, time-varying stimuli, and to study the human brain under more naturalistic conditions. These tools come in the form of “Encoding” models, in which stimulus features are used to model brain activity, and “Decoding” models, in which neural features are used to generate a stimulus output. Here we review the current state of encoding and decoding models in cognitive electrophysiology and provide a practical guide toward conducting experiments and analyses in this emerging field. Our examples focus on using linear models in the study of human language and audition. We show how to calculate auditory receptive fields from natural sounds as well as how to decode neural recordings to predict speech. The paper aims to be a useful tutorial to these approaches, and a practical introduction to using machine learning and applied statistics to build models of neural activity. The data analytic approaches we discuss may also be applied to other sensory modalities, motor systems, and cognitive systems, and we cover some examples in these areas. In addition, a collection of Jupyter notebooks is publicly available as a complement to the material covered in this paper, providing code examples and tutorials for predictive modeling in python. The aim is to provide a practical understanding of predictive modeling of human brain data and to propose best-practices in conducting these analyses.},
journal = {Frontiers in Systems Neuroscience},
author = {Holdgraf, Christopher Ramsay and Rieger, J.W. and Micheli, C. and Martin, S. and Knight, R.T. and Theunissen, F.E.},
year = {2017},
keywords = {Decoding models, Encoding models, Electrocorticography (ECoG), Electrophysiology/evoked potentials, Machine learning applied to neuroscience, Natural stimuli, Predictive modeling, Tutorials}
}

@book{ruby,
title = {The Ruby Programming Language},
author = {Flanagan, David and Matsumoto, Yukihiro},
year = {2008},
publisher = {O'Reilly Media}
}
3 changes: 3 additions & 0 deletions docs/requirements.txt
@@ -0,0 +1,3 @@
jupyter-book
matplotlib
numpy
16 changes: 16 additions & 0 deletions docs/time.md
@@ -0,0 +1,16 @@
# Handling Time

SWOT source data is organized to include all of the features from the prior river and lake databases that the satellite crosses over during each pass of a continent.
If for any reason SWOT does not record an observation of a prior database feature during a pass, the source data will contain fill values for all observed fields, including the time of observation.

To retain times where there was a satellite pass but no observation was made, Hydrocron queries on the *start time of the range of observations included in the pass over the continent during the cycle of interest*. For example, if it takes 10 seconds for the satellite to pass over North America, three different river reaches observed during that pass may have observation times recorded at 2 seconds, 5 seconds, and 9 seconds. However, Hydrocron uses the range start time of 0 seconds (the beginning of the 10-second window for the pass over the continent), with a buffer of -30 seconds applied to the start_time and +30 seconds applied to the end_time specified in the query.

## Example

| reach_id | time | pass_start_time | wse | ... |
|-------------|---------------------|--------------------|----------|-----|
| 71224100223 | 2023-08-01T12:30:45 |2023-08-01T12:30:30 | 316.8713 | |
| 71224100234 | 2023-08-01T12:30:42 |2023-08-01T12:30:30 | 286.2983 | |
| 71224100283 | no_data |2023-08-01T12:30:30 | -999999999999.0000| |

In this case, querying Hydrocron using a start_time of 2023-08-01T12:30:00 will return all three features, because it is the pass start time that is used in the query. The returned data will include the actual observation time, including the no_data value for the feature that was not observed.
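The ±30-second buffer described above can be sketched as follows. This is an illustration of the documented behavior, not Hydrocron's actual implementation:

```python
from datetime import datetime, timedelta

BUFFER = timedelta(seconds=30)
FMT = "%Y-%m-%dT%H:%M:%S"

def buffered_window(start_time, end_time):
    """Widen a query window by 30 s on each side, mirroring how
    Hydrocron matches queries against pass start times."""
    t0 = datetime.strptime(start_time, FMT) - BUFFER
    t1 = datetime.strptime(end_time, FMT) + BUFFER
    return t0.strftime(FMT), t1.strftime(FMT)

lo, hi = buffered_window("2023-08-01T12:30:00", "2023-08-01T12:45:00")
```

With this widened window, a pass whose start time is 2023-08-01T12:30:30 (as in the table above) still matches a query whose start_time is 2023-08-01T12:30:00.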
37 changes: 37 additions & 0 deletions docs/timeseries.md
@@ -0,0 +1,37 @@
# timeseries

Get time series data from SWOT observations for reaches and nodes

## Parameters

feature : string
Type of feature being requested. Either "Reach" or "Node"

feature_id : string
ID of the feature to retrieve in format CBBTTTSNNNNNN (e.g. 74297700000000)

start_time : string
Start time of the timeseries (e.g. 2023-08-04T00:00:00Z)

end_time : string
End time of the timeseries

output : string
Format of the data returned. Must be one of ["csv", "geojson"]

fields : string
The fields to return. Defaults to "feature_id, time_str, wse, geometry"

## Returns

CSV or GeoJSON file containing the data for the selected feature and time period.

## Responses

200 : OK

400 : The specified URL is invalid (does not exist)

404 : An entry with the specified region was not found

413 : Your query returns too much data
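A client can branch on these status codes when processing responses. A minimal sketch, where the descriptions simply paraphrase the list above:

```python
def describe_status(code):
    """Map a /timeseries HTTP status code to a short, human-readable
    description based on the documented responses."""
    messages = {
        200: "OK -- response body contains the requested CSV or GeoJSON",
        400: "Invalid URL -- check the endpoint path and parameters",
        404: "No entry found for the specified region",
        413: "Response too large -- narrow the time range and retry",
    }
    return messages.get(code, f"Unexpected status {code}")
```

For a 413 in particular, retrying with a shorter time range (or fewer requested fields) keeps the response under the 6 MB limit.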