More than 15,000 documents govern how the Department of Defense (DoD) operates. These documents live in different repositories, often on different networks, are discoverable to different communities, are updated independently, and evolve rapidly. No single capability has existed for navigating this vast universe of governing requirements and guidance documents, which has left the Department unable to make evidence-based, data-driven decisions. Today, GAMECHANGER offers a scalable solution: an authoritative corpus that serves as a single trusted repository of all statutory and policy-driven requirements, built on artificial intelligence (AI)-enabled technologies.
Fundamentally changing the way in which the DoD navigates its universe of requirements and makes decisions
GAMECHANGER aspires to be the Department’s trusted solution for evidence-based, data-driven decision-making across the universe of DoD requirements by:
- Building the DoD’s authoritative corpus of requirements and policy to drive search, discovery, understanding, and analytic capabilities
- Operationalizing cutting-edge technologies, algorithms, models and interfaces to automate and scale the solution
- Fusing best practices from industry, academia, and government to advance innovation and research
- Engaging the open-source community to build generalizable and replicable technology
See LICENSE.md (including licensing intent - INTENT.md) and CONTRIBUTING.md
The following should be done in a macOS or Linux environment (including WSL on Windows):
- Install Google Chrome and ChromeDriver
- https://chromedriver.chromium.org/getting-started
- after a successful installation you should be able to run the following from the shell:
chromedriver --version
- Install Miniconda or Anaconda (Miniconda is much smaller)
- https://docs.conda.io/en/latest/miniconda.html
- after a successful installation you should be able to run the following from the shell:
conda --version
- Create a Python 3.6 conda environment for the GAMECHANGER crawlers:
conda create -n gc-crawlers python=3.6
- Clone the repo and change into that dir:
git clone https://github.com/dod-advana/gamechanger-crawlers.git
cd gamechanger-crawlers
- Activate the conda environment and install requirements:
conda activate gc-crawlers
pip install --upgrade pip setuptools wheel
pip install -r ./docker/minimal-requirements.txt
- That's it.
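The two version checks in the setup steps above can also be scripted. The following is a minimal Python sketch and is not part of the repository: the names `REQUIRED_TOOLS` and `find_missing_tools` are hypothetical, and the check only confirms that each executable is on the `PATH` (via the standard library's `shutil.which`). It does not verify that the installed ChromeDriver matches your Chrome version.

```python
import shutil

# Command-line tools the setup steps rely on (hypothetical grouping,
# taken from the commands shown in this guide).
REQUIRED_TOOLS = ("chromedriver", "conda", "git")


def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names of tools that cannot be found on PATH.

    `which` is injectable so the check can be exercised without
    touching the real environment.
    """
    return [tool for tool in tools if which(tool) is None]


def describe_missing(missing):
    """Build a short human-readable summary of the check result."""
    if not missing:
        return "All required tools were found on PATH."
    return "Missing from PATH: " + ", ".join(missing)
```

Calling `print(describe_missing(find_missing_tools()))` from the activated environment gives a quick summary before moving on to running a crawler.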
- Follow the environment setup guide above if you have not already
- Change to the gamechanger crawlers directory and export the repository path to the PYTHONPATH environment variable:
cd /path/to/gamechanger-crawlers
export PYTHONPATH="$(pwd)"
- Create an empty directory for the crawler file outputs:
CRAWLER_DATA_ROOT=/path/to/download/location
mkdir -p "$CRAWLER_DATA_ROOT"
- Create an empty previous manifest file:
touch "$CRAWLER_DATA_ROOT/prev-manifest.json"
- Run the desired crawler spider from the gamechanger-crawlers directory (in this example we will use executive_orders_spider.py):
scrapy runspider dataPipelines/gc_scrapy/gc_scrapy/spiders/executive_orders_spider.py \
  -a download_output_dir="$CRAWLER_DATA_ROOT" \
  -a previous_manifest_location="$CRAWLER_DATA_ROOT/prev-manifest.json" \
  -o "$CRAWLER_DATA_ROOT/output.json"
- After the crawler finishes running, you should have all files downloaded into the crawler output directory
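For repeated runs, the quickstart steps above can be wrapped in a small Python helper. This is a sketch under stated assumptions, not a script shipped with the repository: the function names (`prepare_data_root`, `build_crawl_command`, `run_crawler`, `list_downloads`) are hypothetical, and `list_downloads` assumes that anything in the output directory other than `prev-manifest.json` and `output.json` is a downloaded file. The spider path and the `-a`/`-o` arguments are copied from the command shown above.

```python
import os
import subprocess
from pathlib import Path

# Spider path as used in the quickstart example.
SPIDER = "dataPipelines/gc_scrapy/gc_scrapy/spiders/executive_orders_spider.py"


def prepare_data_root(data_root):
    """Create the output directory and an empty previous-manifest file."""
    root = Path(data_root)
    root.mkdir(parents=True, exist_ok=True)   # same effect as `mkdir -p`
    manifest = root / "prev-manifest.json"
    manifest.touch(exist_ok=True)             # same effect as `touch`
    return manifest


def build_crawl_command(data_root, spider=SPIDER):
    """Return the `scrapy runspider` invocation as an argument list."""
    root = Path(data_root)
    return [
        "scrapy", "runspider", spider,
        "-a", "download_output_dir={}".format(root),
        "-a", "previous_manifest_location={}".format(root / "prev-manifest.json"),
        "-o", str(root / "output.json"),
    ]


def run_crawler(repo_dir, data_root, spider=SPIDER):
    """Run the spider from the repository root with PYTHONPATH set to it."""
    prepare_data_root(data_root)
    env = dict(os.environ, PYTHONPATH=str(repo_dir))
    return subprocess.run(
        build_crawl_command(data_root, spider),
        cwd=str(repo_dir),
        env=env,
        check=True,  # raise CalledProcessError if scrapy exits non-zero
    )


def list_downloads(data_root):
    """List files under the data root, skipping the two bookkeeping files.

    Assumption: everything else in the directory was written by the crawler.
    """
    bookkeeping = {"prev-manifest.json", "output.json"}
    root = Path(data_root)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.name not in bookkeeping
    )
```

With the conda environment active, `run_crawler("/path/to/gamechanger-crawlers", "/path/to/download/location")` performs the same steps as the shell commands, and `list_downloads("/path/to/download/location")` shows what was written. Passing the command as a list avoids shell quoting issues when the download path contains spaces.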