The WebAssistant monitors websites (thinktanks) for the publication of new articles using configurable scrapers. Scrapers are executed using a headless Chrome browser. Found publications are saved in a PostgreSQL database and added to an Elasticsearch index. Monitors let you define filters that aggregate publications and automatically send Citavi .ris files to configured receivers by email.
Please note that most of the application is currently available in German only.
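To illustrate what a .ris export contains, here is a minimal sketch of serializing one publication as a RIS record. It is not the project's actual export code: the `Publication` class, its fields, and the `to_ris` function are hypothetical names chosen for this example, and the generic reference type `GEN` is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Publication:
    # Hypothetical, simplified record; not the model used by the application.
    title: str
    url: str
    year: str
    authors: List[str] = field(default_factory=list)


def to_ris(publication: Publication) -> str:
    """Serialize one publication as a RIS record.

    Each RIS line is a two-letter tag, two spaces, a hyphen, a space, and the value.
    """
    lines = ["TY  - GEN"]  # reference type comes first; GEN (generic) is assumed here
    lines += [f"AU  - {author}" for author in publication.authors]  # one AU line per author
    lines.append(f"TI  - {publication.title}")
    lines.append(f"PY  - {publication.year}")
    lines.append(f"UR  - {publication.url}")
    lines.append("ER  - ")  # end-of-record marker closes the entry
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = Publication(
        title="Example article",
        url="https://example.org/article",
        year="2023",
        authors=["Doe, Jane"],
    )
    print(to_ris(example), end="")
```

Several records can be concatenated into one file, since every record is delimited by its own `TY` and `ER` lines.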
This repository comes with a Docker Compose setup that helps to set up the requirements for running the project locally. The following containers will be started:
- swp: The main Django/React application, including Playwright and a headless Chrome browser. Restarts when code is changed.
- db: The PostgreSQL database, exposed on port 5432 in case you want to access it directly.
- redis: The Redis message broker.
- elasticsearch: A single-node Elasticsearch instance.
- celery: The Celery task queue.
- bootstrap: Runs one-off tasks such as updating translations and running database migrations.
- frontend: Runs npm and webpack in watch mode to automatically rebuild the frontend when code is changed.
Please note that the Docker Compose setup is not yet recommended for live deployments.
Build the Docker image (this needs to be repeated whenever major dependencies change):
docker-compose build --no-cache
With the image built, Docker Compose can be used to start the services.
docker-compose up
Once the containers are running, you should create a superuser:
docker-compose exec swp python manage.py createsuperuser
Finally, you can log in at http://localhost:8000.
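The first start can take a while before the application answers. As a convenience, the following sketch polls the login URL until the server responds. It is not part of the repository; the function name `wait_for_server` and the default timeout values are assumptions made for this example.

```python
import http.client
import time
import urllib.error
import urllib.request


def wait_for_server(url: str = "http://localhost:8000",
                    timeout: float = 60.0,
                    interval: float = 2.0) -> bool:
    """Return True once `url` answers an HTTP request, False after `timeout` seconds."""
    # The target is local, so any system-wide proxy settings are bypassed.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with opener.open(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            # The server answered, even if with an error status code.
            return True
        except (OSError, http.client.HTTPException):
            # Connection refused, reset, or timed out: the server is not ready yet.
            time.sleep(interval)
    return False


if __name__ == "__main__":
    if wait_for_server():
        print("Server is up.")
    else:
        print("Server did not respond in time.")
```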
The Docker Compose setup generally performs any required actions automatically when the code is changed or the application is used. However, it may sometimes be necessary to trigger certain actions manually. This can be done by running the manage.py script in the swp container.
Rebuild the Elasticsearch index:
docker-compose exec swp python manage.py search_index --rebuild
Run a specific scraper (identified by its ID):
docker-compose exec swp python manage.py scrape <id>
More commands are listed when running manage.py without arguments:
docker-compose exec swp python manage.py
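Since all of the commands above share the same prefix, a small wrapper can save typing. The following is a minimal sketch and not part of the repository; `build_manage_command` and `run_manage` are hypothetical helper names. The `-T` option of `docker-compose exec` disables pseudo-TTY allocation, which is useful when the wrapper is called from a script rather than an interactive terminal.

```python
import subprocess
from typing import List


def build_manage_command(*args: str, service: str = "swp", tty: bool = True) -> List[str]:
    """Build the argument list for running manage.py inside a Compose service."""
    command = ["docker-compose", "exec"]
    if not tty:
        command.append("-T")  # no pseudo-TTY, e.g. when run from cron or CI
    command += [service, "python", "manage.py", *args]
    return command


def run_manage(*args: str, tty: bool = True) -> int:
    """Run a management command in the swp container and return its exit code."""
    return subprocess.run(build_manage_command(*args, tty=tty), check=False).returncode


if __name__ == "__main__":
    # Equivalent to: docker-compose exec swp python manage.py search_index --rebuild
    print(" ".join(build_manage_command("search_index", "--rebuild")))
```

Calling `run_manage("scrape", "<id>")` with a real scraper ID would then correspond to the scrape command shown above.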
To set up the project in IntelliJ IDEA:
- Select File > New > Project from existing Sources...
- In the dialog that pops up, select the folder you have cloned the repository into.
- When prompted to select the Project SDK, use the + in the top left corner to select New Python SDK...
- Select Docker Compose in the left menu and select swp as the Service on the right side.
To add a run configuration, go to Run > Edit Configurations... and click the + in the top left corner of the dialog, then select Django. Use 0.0.0.0 as the Host. Select Use SDK of module. Edit the environment variables and set DJANGO_SETTINGS_MODULE to swp.settings.dev. You should now be able to start the Django development server by running the configuration. Code completion will only work if the Docker container has been started at least once.
To enable running Django management commands, right-click on the project root and click Open Module Settings. Select the manage.py located in the project root. Select swp/settings/dev.py as the settings file. You should now be able to run management commands via Tools > Run manage.py Task.
To get proper coding assistance for import paths, set the webpack configuration file in Preferences > Languages & Frameworks > JavaScript > WebPack.
To use our lint rules, defined in .eslintrc.js for JavaScript and in .stylelintrc.js for style sheets, you have to activate these tools in the preferences.
- To activate ESLint, set the lint configuration in Preferences > Languages & Frameworks > JavaScript > Code Quality Tools > ESLint to automatic.
- To activate Stylelint, go to Preferences > Languages & Frameworks > Style Sheets > Stylelint and enable it.