Es item #111
Merged
Changes from 55 commits
2a811f1
Removed unused interface
carlvitzthum 856d510
Added datastore param to PickStorage methods; move PickStorage to sto…
carlvitzthum 684c752
Added datastore parameters to connection.py
carlvitzthum d5d9a56
initial test
carlvitzthum 874832d
Fixed a couple imports, improvd test_pick_storage
carlvitzthum af3deff
request.datastore moved to storage.py. Misc fixes
carlvitzthum 69fb912
Disable some annoying loggers, improve PickStorage, couple test-relat…
carlvitzthum 2746e15
Confirm self.read has value in PickStorage.storage
carlvitzthum f90ea44
small test fix
carlvitzthum 22da12c
Revised register_storage function to better handle existing PickStorage
carlvitzthum d900bf8
Use new register storage with esstorage and mpindexer
carlvitzthum ad4938d
test changes
carlvitzthum 4b3e665
Test fix
carlvitzthum 3d6bc0e
Storage reconfiguration and changes for ES-based items
carlvitzthum 1b7c93a
Resolve merge conflicts with snovault v1.3.2 and refactor a bit of na…
carlvitzthum e03eb7c
Merge branch 'es_item' of https://github.com/4dn-dcic/snovault into e…
carlvitzthum 6726132
Fix for get_by_uuid direct, add TestingLinkTargetElasticSearch
carlvitzthum 46dd00b
test_create_es_item_without_es
carlvitzthum 1651282
A couple more misc test fixes
carlvitzthum 2e1dc8c
Fix to PickStorage.find_uuids_linked_to_item
carlvitzthum 4717d1e
Fix collection name
carlvitzthum 55604b5
One more small fix
carlvitzthum c8fa44e
Messy, but got something working. Cleanup is needed, especially for r…
carlvitzthum 680516f
Refactoring, simplifying, fixing tests
carlvitzthum 8832f7d
Fully remove linkFrom
carlvitzthum e69a02e
Test embedding with TestingLinkTargetElasticSearch
carlvitzthum 00aab68
Misc cleanup
carlvitzthum d43b3a7
small test fix
carlvitzthum 71bca4f
Polishing crud_views and connection, added agg_items to ES item tests
carlvitzthum 968aab5
Doc changes to cached_views.py
carlvitzthum be1628c
doc updates for esstorage.py
carlvitzthum 816d6ee
Slight refactor to mpindexer
carlvitzthum f3607df
Final doc refactors
carlvitzthum 3294856
Resolve merge
carlvitzthum 18ea498
Small fix for indexing-info when item is not yet indexed
carlvitzthum ac44fc2
Slight refactor of purge_uuid to remove from ES before DB
carlvitzthum 097c6f7
Refactored docs a bit and only include updated ones
carlvitzthum 7ed4990
Some progress on docs
carlvitzthum dada23c
Filled out storage overview doc
carlvitzthum 16583f1
Small doc-related changes
carlvitzthum 6374b9b
Added some placeholder docs and made rst formatting consistent
carlvitzthum 397e17b
Fix merge conflict in esstorage.py
carlvitzthum 5ce08f3
Correctly format inline code
carlvitzthum cb780c3
Change ES item designation to AbstractCollection.properties_datastore
carlvitzthum 5b02900
Fixes for links/uuids for ES items, as well as adjustment to properti…
carlvitzthum a40bc2e
Check request.datastore first in PickStorage.storage; adjustments for…
carlvitzthum b692fe8
Doc changes for properties_datastore
carlvitzthum 11fbbd5
Test and version updates
carlvitzthum 1994b3a
Small fixes and refactors related to default properties_datastore=dat…
carlvitzthum bdc0b11
Addressed a couple of Will's PR comments
carlvitzthum 540f067
Refactor TestingLinkTargetElasticSearch tests
carlvitzthum 919c3f1
Handle ES-based collections in create mapping
carlvitzthum a654902
Use new Collcection.default_properties_datastore for uuid cache inval…
carlvitzthum e3c46c3
Resolve merge
carlvitzthum a46ebd6
More docs
carlvitzthum a8e2449
small review changes
willronchetti 1a575cb
Merge branch 'master' into es_item
willronchetti dc1f627
fix import
willronchetti
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
Elasticsearch Indexing | ||
====================== | ||
|
||
**Work in progress!** | ||
|
||
Indexing is the process of building a complete document that contains multiple views of an item, then putting that document into Elasticsearch (ES). This happens whenever an item is created or changed. Indexing is one of the backbones of Snovault: it enables searching of data and fast reads of complex item views, which are "cached" by using ES as a read storage. | ||
|
||
.. image:: img/indexing.png | ||
|
||
Figure 1: Diagram of the indexing process. | ||
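As a rough illustration of the process described above, the following minimal sketch builds one document from several views of an item and stores it in an index. Every name here (``InMemoryIndex``, ``build_document``, ``index_item``, and the view names) is hypothetical, and the dictionary-backed index merely stands in for Elasticsearch. It is not Snovault's actual indexer code.

```python
import copy


class InMemoryIndex:
    """Hypothetical stand-in for an Elasticsearch index (a plain dict)."""

    def __init__(self):
        self._docs = {}

    def put(self, uuid, document):
        # Overwrite any earlier document, as re-indexing a changed item would.
        self._docs[uuid] = copy.deepcopy(document)

    def get(self, uuid):
        return self._docs.get(uuid)


def build_document(item, views):
    """Combine several views of one item into a single document.

    `views` maps a view name to a function that renders the item.
    """
    document = {"uuid": item["uuid"]}
    for name, render in views.items():
        document[name] = render(item)
    return document


def index_item(index, item, views):
    """Build the document for `item` and store it in `index`."""
    document = build_document(item, views)
    index.put(item["uuid"], document)
    return document


# Illustrative views only (an assumption); the real views are defined by Snovault.
EXAMPLE_VIEWS = {
    "properties": lambda item: dict(item["properties"]),
    "summary": lambda item: {"title": item["properties"].get("title", "")},
}
```

Calling ``index_item`` again after an item changes replaces the stored document, which mirrors the statement that indexing runs whenever an item is created or changed.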
|
||
Code | ||
----------------- | ||
* `indexer.py <https://github.com/4dn-dcic/snovault/blob/master/src/snovault/elasticsearch/indexer.py>`_: index endpoint and initialization, Indexer class | ||
* `mpindexer.py <https://github.com/4dn-dcic/snovault/blob/master/src/snovault/elasticsearch/mpindexer.py>`_: MPIndexer class and helper functions | ||
* `indexer_queue.py <https://github.com/4dn-dcic/snovault/blob/master/src/snovault/elasticsearch/indexer_queue.py>`_: QueueManager and endpoints for queueing and checking indexing | ||
* `indexing_views.py <https://github.com/4dn-dcic/snovault/blob/master/src/snovault/indexing_views.py>`_: index-data view and some other related endpoints |
File renamed without changes.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,75 +1,25 @@ | ||
Snovault Documentation | ||
======================== | ||
Snovault | ||
======================== | ||
|
||
Snovault is a JSON-LD Database Framework that serves as the backend for the 4DN Data portal and CGAP. | ||
|
||
|Build status|_ | ||
|
||
.. |Build status| image:: https://travis-ci.org/4dn-dcic/snovault.svg?branch=master | ||
.. _Build status: https://travis-ci.org/4dn-dcic/snovault | ||
|
||
Installation Instructions | ||
========================= | ||
|
||
Currently these are for Mac OSX using homebrew. If using linux, install dependencies with a different package manager. | ||
|
||
Step 0: Install Xcode (from App Store) and homebrew: http://brew.sh:: | ||
|
||
Step 1: Verify that homebrew is working properly:: | ||
|
||
$ sudo brew doctor | ||
|
||
|
||
Step 2: Install or update dependencies:: | ||
|
||
$ brew install libevent libmagic libxml2 libxslt openssl postgresql graphviz python3 | ||
$ brew install freetype libjpeg libtiff littlecms webp # Required by Pillow | ||
$ brew tap homebrew/versions | ||
$ brew install elasticsearch@5.6 | ||
|
||
If you need to update dependencies:: | ||
|
||
$ brew update | ||
$ brew upgrade | ||
|
||
Step 3: Run buildout:: | ||
|
||
$ python3 bootstrap.py --buildout-version 2.9.5 --setuptools-version 36.6.0 | ||
$ bin/buildout | ||
|
||
NOTE: | ||
If you have issues with postgres or the python interface to it (psycogpg2) you probably need to install postgresql | ||
via homebrew (as above) | ||
If you have issues with Pillow you may need to install new xcode command line tools: | ||
- First update Xcode from AppStore (reboot) | ||
$ xcode-select --install | ||
If you are running macOS Mojave, you may need to run the below command as well: | ||
$ sudo installer -pkg /Library/Developer/CommandLineTools/Packages/macOS_SDK_headers_for_macOS_10.14.pkg -target / | ||
|
||
|
||
|
||
If you wish to completely rebuild the application, or have updated dependencies: | ||
$ make clean | ||
|
||
Then goto Step 3. | ||
|
||
|
||
Running tests | ||
============= | ||
|
||
To run specific tests locally:: | ||
|
||
$ bin/test -k test_name | ||
|
||
To run with a debugger:: | ||
|
||
$ bin/test --pdb | ||
|
||
Specific tests to run locally for schema changes:: | ||
Snovault is a JSON-LD Database Framework that serves as the backend for the `4DN Data portal <https://github.com/4dn-dcic/fourfront>`_ and `CGAP <https://github.com/dbmi-bgm/cgap-portal>`_. It is a very divergent fork of the work of the same name written by the ENCODE team at Stanford University. `See here <https://github.com/ENCODE-DCC/snovault>`_ for the original version. | ||
|
||
$ bin/test -k test_load_workbook | ||
Since Snovault is used for multiple deployments across a couple of projects, we use `GitHub releases <https://github.com/4dn-dcic/snovault/releases>`_ to version it. The releases page also acts as a changelog. | ||
|
||
Run the Pyramid tests with:: | ||
To get started, read the following documentation on setting up and developing Snovault: | ||
|
||
$ bin/test | ||
.. toctree:: | ||
:titlesonly: | ||
local_installation | ||
testing | ||
resources | ||
storage_overview | ||
traversal | ||
resource_views | ||
es_mapping | ||
es_indexing | ||
snowflakes |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,45 @@ | ||
Local Installation | ||
================== | ||
|
||
These instructions are currently for macOS using Homebrew. If using Linux, install the dependencies with a different package manager. | ||
|
||
Snovault is known to work with Python 3.6.x and will not work with Python 3.7 or greater. If part of the HMS team, it is recommended to use Python 3.4.3, since that's what is running on our servers. A good tool to manage multiple Python versions is `pyenv <https://github.com/pyenv/pyenv>`_. It is best practice to create a fresh Python virtualenv using one of these versions before proceeding to the following steps. | ||
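The version constraint above can be checked before starting the installation. This is a minimal sketch; the function name ``python_too_new`` is hypothetical, and it tests only the stated upper bound (3.7 or greater will not work), since no lower bound is given here.

```python
import sys


def python_too_new(version_info=None):
    """Return True if the interpreter is Python 3.7 or greater.

    `version_info` defaults to the running interpreter; a tuple such as
    (3, 6, 9) can be passed instead for checking another version.
    """
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor); the patch level does not matter here.
    return tuple(version_info[:2]) >= (3, 7)


if __name__ == "__main__":
    if python_too_new():
        print("This Python is 3.7 or greater; switch versions before continuing.")
    else:
        print("Python version is below 3.7.")
```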
|
||
Step 0: Obtain AWS keys. These will need to be added to your environment variables or through the AWS CLI (installed later in this process). | ||
|
||
Step 1: Verify that homebrew is working properly:: | ||
|
||
$ sudo brew doctor | ||
I did
I think we should change this line to say just
|
||
|
||
|
||
Step 2: Install or update dependencies:: | ||
|
||
$ brew install libevent libmagic libxml2 libxslt openssl postgresql graphviz | ||
$ brew install freetype libjpeg libtiff littlecms webp # Required by Pillow | ||
$ brew tap homebrew/versions | ||
I recommend that we just remove this line. I don't actually think it's needed. |
||
$ brew install elasticsearch@5.6 | ||
|
||
If you need to update dependencies:: | ||
|
||
$ brew update | ||
$ brew upgrade | ||
|
||
Step 3: Run buildout:: | ||
|
||
$ python3 bootstrap.py --buildout-version 2.9.5 --setuptools-version 36.6.0 | ||
$ bin/buildout | ||
|
||
NOTE: | ||
If you have issues with postgres or the python interface to it (psycopg2) you probably need to install postgresql | ||
via homebrew (as above) | ||
If you have issues with Pillow you may need to install new xcode command line tools: | ||
- First update Xcode from AppStore (reboot) | ||
$ xcode-select --install | ||
If you are running macOS Mojave, you may need to run the below command as well: | ||
$ sudo installer -pkg /Library/Developer/CommandLineTools/Packages/macOS_SDK_headers_for_macOS_10.14.pkg -target / | ||
|
||
|
||
If you wish to completely rebuild the application, or have updated dependencies: | ||
$ make clean | ||
|
||
Then go to Step 3. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
Resource Views | ||
=========================== | ||
|
||
**Work in progress!** | ||
|
||
This document outlines the different base resource views and their sources. It may be worth first reading the `traversal <https://snovault.readthedocs.io/en/latest/traversal.html>`_ and `storage <https://snovault.readthedocs.io/en/latest/storage_overview.html>`_ documentation. | ||
|
||
**TODO: outline each resource view with context=Item.** | ||
|
||
**TODO: Include relationship to storage and traversal (context and embed.py)** |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
Resources | ||
=========================== | ||
|
||
**Work in progress!** | ||
|
||
This document outlines different classes that compose a base Snovault item. Code is located in the following files: | ||
|
||
- `resources.py <https://github.com/4dn-dcic/snovault/blob/master/src/snovault/resources.py>`_: Root, AbstractCollection, Collection, Item classes | ||
- `typeinfo.py <https://github.com/4dn-dcic/snovault/blob/master/src/snovault/typeinfo.py>`_: AbstractTypeInfo, TypeInfo, TypesTool | ||
- `config.py <https://github.com/4dn-dcic/snovault/blob/master/src/snovault/config.py>`_: CollectionsTool, collection and abstract_collection decorators | ||
|
||
**TODO: outline the role of each resource class. Include a complete example** |
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,21 +1,20 @@ | ||
================ | ||
Snowflakes | ||
================ | ||
|
||
General | ||
^^^^^^^^ | ||
----------------- | ||
|
||
Snowflakes used to be the front-end component of Snovault meant to serve as a demo. Since we at 4DN have our own Snovault-backed applications (Fourfront, CGAP), Snowflakes has been entirely removed from our version of Snovault. It is still present in ENCODE's version, which you can find `here <https://github.com/ENCODE-DCC/snovault>`_. | ||
|
||
Removing Snowflakes from Snovault proved more challenging than one may expect. Some parts of snowflakes were actually required for snovault to run, such as ``root.py``. These files have all been migrated into Snovault. | ||
|
||
Testing | ||
^^^^^^^^ | ||
----------------- | ||
|
||
In addition, several relevant tests that lived in Snowflakes have been migrated into Snovault. These include only tests that are specific to Snovault and are not covered by existing Fourfront/CGAP testing. Properly configuring the tests proved challenging: the test framework, as previously configured, intertwined Snowflakes and Snovault in such a way that Snovault tests could not function without the presence of Snowflakes. | ||
|
||
To fix this, several aspects of the tests have been refactored. We now load test schemas from files and have migrated many of the relevant fixtures from Snowflakes. ``config.py`` also required changes to account for behavior Snovault expected that it inherited from Snowflakes due to how includes work in PyTest. | ||
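Loading test schemas from files, as mentioned above, might look roughly like the following. This is a sketch under assumptions: the helper name ``load_schemas`` and the one-JSON-file-per-schema layout are hypothetical and are not taken from the Snovault test suite.

```python
import json
from pathlib import Path


def load_schemas(schema_dir):
    """Load every ``*.json`` file in `schema_dir`, keyed by file stem.

    Files with other extensions are ignored. The directory layout is an
    assumption made for illustration.
    """
    schemas = {}
    for path in sorted(Path(schema_dir).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            schemas[path.stem] = json.load(handle)
    return schemas
```

A test fixture could call such a helper once and hand the resulting dictionary to the tests, so the schemas live in files rather than in another package.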
|
||
Test coverage for Snovault should still be fairly strong, especially when combined with that of Fourfront/CGAP. Some indexing tests are marked as flaky as we've found they experience intermittent failures. Updating how we clear the SQS queue has also helped to remidy this issue. | ||
Test coverage for Snovault should still be fairly strong, especially when combined with that of Fourfront/CGAP. Some indexing tests are marked as flaky as we've found they experience intermittent failures. Updating how we clear the SQS queue has also helped to remedy this issue. | ||
|
||
netsettler marked this conversation as resolved.
|
||
One issue of note that was not solved involves a logging-related test that appears to pass locally and fail on Travis. The associated test is ``test_indexing_logging``. This test makes an index POST to the application and checks that the correct log message was emitted. The log message itself is emitted, but for some reason it is truncated on Travis. Even spinning up Travis on an identical container could not reproduce the issue. The relevant line is marked in the test file. | ||
One issue of note that was not solved involves a logging-related test that appears to pass locally and fail on Travis. The associated test is ``test_indexing_logging``. This test makes an index POST to the application and checks that the correct log message was emitted. The log message itself is emitted, but for some reason it is truncated on Travis. Even spinning up Travis on an identical container could not reproduce the issue. The relevant line is marked in the test file. |
I bet most of these instructions would be better if we borrowed the instructions I just created for Fourfront. (We could do that in a separate PR sometime. Not a blocker here, though.)