The Apollo-Tools Resource Manager (ARM) is a multi-user resource management tool for the edge-cloud continuum. It simplifies the management and unifies the monitoring of resources, and it enables automatic application deployments across all registered resources.
The ARM provides the following key features:
- Manage and monitor AWS Lambda functions, Amazon EC2 instances, OpenFaaS instances and K8S clusters.
- Filter resources based on Service Level Objectives (SLO) and store a subset of these resources as Resource Ensembles.
- Automatically deploy functions (Python 3.8, Java 11) and services (Docker containers) on the supported platforms.
- Continuously monitor registered resources.
- Optionally send alerts for resources that are part of active application deployments.
The ARM depends on the following software and tools:
- Java 11
- Docker
- Gradle 7.3.2
- Terraform 1.4.5
- OpenFaaS Cli 0.15.9
- Node 16.13.0
- PostgreSQL 14.10
- VictoriaMetrics v1.96.0
- Grafana 10.2.2
- Metrics Server v0.7.0 (enabled on all registered k8s resources)
- Node Exporter 1.7.0 (set up on all OpenFaaS resources)
- Install the Metrics Server on K8s resources; installation instructions can be found here
- Install the Node Exporter on OpenFaaS resources; an example bash script that installs the OpenFaaS CLI and the Node Exporter can be found here. A rough sketch of both installations is shown below.
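These commands are one common way to install the two upstream components and are not ARM-specific; the instructions and example script linked above remain authoritative, and the linux-amd64 archive is an assumption about the target host:

```
# Metrics Server v0.7.0 on a k8s resource (upstream manifest):
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.7.0/components.yaml

# Node Exporter 1.7.0 on an OpenFaaS resource (assumes linux-amd64):
wget https://github.com/prometheus/node_exporter/releases/download/v1.7.0/node_exporter-1.7.0.linux-amd64.tar.gz
tar xzf node_exporter-1.7.0.linux-amd64.tar.gz
./node_exporter-1.7.0.linux-amd64/node_exporter &
```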
For local development, the required services can be started with Docker Compose:

```
cd ./local-dev
docker compose up
```
Copy config.example.json to ./backend/conf/config.json and adjust the values to your requirements:
Name | Description | Type |
---|---|---|
db_host | the host url of the PostgreSQL database | string |
db_port | the port of the PostgreSQL database | int |
db_user | the user to access the PostgreSQL database | string |
db_password | the password to access the PostgreSQL database | string |
max_retries | the number of retries for database operations if a serialization error occurs | int |
retry_delay_millis | the time span in milliseconds to wait until a database operation is retried after a serialization error occurred | int |
api_port | the port of the REST API of the ARM | int |
build_directory | the directory where all build files are stored for the automatic resource deployment | string |
dind_directory | the docker-in-docker volume path of the build directory. For local development "" should be used. | string |
upload_persist_directory | the directory where uploaded files are persisted after successful validation | string |
upload_temp_directory | the directory where uploaded files are temporarily stored | string |
max_file_size | the maximum size of an uploaded file in bytes | long |
jwt_secret | the secret that is used for the encryption of the JSON Web Tokens | string |
jwt_algorithm | the algorithm that is used for the JWT signature | string |
token_minutes_valid | the period of validity of the JSON Web Tokens in minutes | int |
ensemble_validation_period | the time period in minutes between validations performed on all existing ensembles | int |
docker_insecure_registries | the insecure registries that are accessed for OpenFaaS and EC2 deployments | list(string) |
kube_config_secrets_name | the name of the secret that contains the kube-configs of the registered k8s instances | string |
kube_config_secrets_namespace | the namespace of the kube-config secrets | string |
kube_config_directory | the path of the directory where the kube-configs to access registered k8s resources are stored | string |
kube_api_timeout_seconds | the timeout in seconds for requests that are sent to the k8s API | int |
kube_image_pull_secrets | the names of the secrets that contain access credentials for private docker registries. These secrets are only used for deployments to k8s resources and must be present on every registered k8s resource that should access them. | list(string) |
monitoring_push_url | the push url of the external monitoring system (VictoriaMetrics) | string |
monitoring_query_url | the query url of the external monitoring system (VictoriaMetrics) | string |
latency_monitoring_count | the number of echo requests to send per latency test | int |
kube_monitoring_period | the time period in seconds between monitoring updates for all registered k8s resources | int |
openfaas_monitoring_period | the time period in seconds between monitoring updates for all registered OpenFaaS resources | int |
region_monitoring_period | the time period in seconds between monitoring updates for all registered regions | int |
aws_price_monitoring_period | the time period in seconds between monitoring updates of the AWS price list API | int |
file_cleanup_period | the time period between file clean-ups of failed deployments. This is necessary for deployments where the ARM was not able to automatically clean up the files. | int |
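As a rough illustration, a config.json could look like the sketch below. It only shows a subset of the fields from the table above, all values are placeholders that must be adapted to your environment, and config.example.json remains the authoritative reference (the ports, directories and jwt algorithm shown here are assumptions, not documented defaults):

```json
{
  "db_host": "localhost",
  "db_port": 5432,
  "db_user": "postgres",
  "db_password": "postgres",
  "max_retries": 5,
  "retry_delay_millis": 1000,
  "api_port": 8888,
  "build_directory": "build",
  "dind_directory": "",
  "upload_persist_directory": "upload/persist",
  "upload_temp_directory": "upload/temp",
  "max_file_size": 10000000,
  "jwt_secret": "**********",
  "jwt_algorithm": "HS256",
  "token_minutes_valid": 1080,
  "ensemble_validation_period": 60,
  "monitoring_push_url": "http://localhost:8428",
  "monitoring_query_url": "http://localhost:8428"
}
```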
- Copy ./backend/conf/config.example.json to ./backend/conf/config.json and replace the existing values with the desired values.
- Build and run the backend from the terminal:

  ```
  cd ./backend
  ../gradlew build
  ../gradlew run
  ```
Alternatively, the backend can be run from an IDE:
- Open the project with File -> Open... and select settings.gradle.kts.
- Copy ./backend/conf/config.example.json to ./backend/conf/config.json and replace the existing values with the desired values.
- Select the gradle task rm/application/run in the gradle sidebar.
- Adjust the values in .env.local and .env to your requirements (an example sketch follows the table):
Name | Description |
---|---|
NEXT_PUBLIC_API_URL | the url of the Rest-API of the backend |
NEXT_PUBLIC_POLLING_DELAY | the polling delay used for updating the status of a selected resource reservation |
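A minimal .env.local sketch might look like the following; both values are placeholders, the host/port assumes the backend's api_port, and the polling delay is assumed to be in milliseconds:

```
# Placeholder values; adjust to your setup.
# URL of the backend REST API (host and port assumed).
NEXT_PUBLIC_API_URL=http://localhost:8888
# Polling delay for deployment status updates (unit assumed to be milliseconds).
NEXT_PUBLIC_POLLING_DELAY=5000
```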
Then install the dependencies and start the frontend:

```
cd ./frontend
npm install
npm run dev
```
The directory kubernetes contains the following files that can be used to deploy the ARM on a Kubernetes cluster:
File | Description |
---|---|
rm-api.yaml | Deploys the backend |
rm-db.yaml | Deploys a PostgreSQL database |
rm-gui.yaml | Deploys the frontend |
rm-kube-secret | Deploys kubeconfigs that are necessary to monitor the registered k8s resources |
rm-monitoring.yaml | Deploys VictoriaMetrics and Grafana |
Make sure that the database and secret are deployed and running before you deploy the backend. The files contain comments about properties and can be adjusted to your requirements.
Deployments can be executed with:

```
kubectl apply -f ./PATH/TO/DEPLOYMENT/FILE.yaml
```
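For example, assuming the files listed above are applied from the repository root and respecting the required order (database and secret before the backend), a full deployment could look like this:

```
# Database and kube-config secret must be running before the backend:
kubectl apply -f ./kubernetes/rm-db.yaml
kubectl apply -f ./kubernetes/rm-kube-secret.yaml   # file name assumed; see the table above
# Monitoring, backend and frontend:
kubectl apply -f ./kubernetes/rm-monitoring.yaml
kubectl apply -f ./kubernetes/rm-api.yaml
kubectl apply -f ./kubernetes/rm-gui.yaml
```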
To use your own container images, the following steps have to be performed:
- Build and push your images:

  ```
  docker build -t ~username~/rm-api:latest ./backend --push
  docker build -t ~username~/rm-gui:latest ./frontend --push
  ```

- Change the images used in rm-api.yaml and rm-gui.yaml to your newly created images.
- Apply the deployments with kubectl apply as mentioned above.
The ARM supports the deployment of Functions and Services.
Functions represent the implementation of a serverless function, have to be written in either Java 11 or Python 3.8, and can be deployed to three different platforms:
- AWS Lambda
- Amazon EC2 with OpenFaaS as serverless computing platform
- Any self-managed device with a running instance of OpenFaaS
Services represent container images and can be deployed on any Kubernetes instance that can be accessed and is registered at the ARM. Currently only self-managed devices with a running Kubernetes instance are supported.
At faas-templates you can find templates for all supported function runtimes. Each runtime directory contains a README.md that explains how to implement a function in the respective language. For example implementations, go to faas-examples. The example functions are ready to deploy using the ARM and provide additional guidance for function developers.
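As a rough, hypothetical sketch (the handler name, signature and input format shown here are assumptions; the templates in faas-templates and their READMEs are authoritative), a Python 3.8 function typically receives a JSON-like input and returns a JSON-serializable result:

```python
# Hypothetical sketch only: the real handler interface is defined by the
# templates in faas-templates (see the runtime-specific README.md).
def main(json_input: dict) -> dict:
    # Read an assumed numeric input field and return a JSON-serializable result.
    value = int(json_input.get("input1", 0))
    return {"result": value * 2}
```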
The following section describes the basic functionalities of the ARM using the frontend. The documentation of the REST API is available as an OpenAPI 3.0 specification. For the optimal reading experience it is advised to open the specification in a tool like Swagger Editor, Postman, or an IDE-specific plugin that can display OpenAPI 3.0 specifications. The specification is available at the path resource-manager.yaml.
To access the ARM as a new user, a new account has to be created by an existing account that has the admin role. After the account has been created, the user can log in with the credentials specified in the previous step. At the path /accounts/profile it is possible to update the password, add cloud credentials and register Virtual Private Clouds (VPCs).
Important: To deploy resources to AWS, the user has to store valid cloud credentials in the ARM. If the deployments include virtual machines (Amazon EC2), it is also necessary to register a VPC (Virtual Private Cloud) in the ARM for each region in which the user wants to deploy them. Both tasks can be done on the profile page in the frontend. For k8s deployments it is also necessary to assign a namespace to the user's account. This has to be done by accounts that have the admin role.
At the path /resources/new-resource users can register new resources and at /resources/resources all existing resources can be listed. Depending on the resource platform, there are some required metrics/properties that must be added to a resource after its creation to qualify it for deployments. Nodes of K8s resources are created automatically by the monitoring service of the ARM. They cannot be registered manually.
At the path /functions/new-function users can register new functions and at /functions/functions all existing functions can be listed. It is possible to create private and public functions. Both can only be modified by the creator, but public functions can be used for deployments by everyone.
At the path /services/new-service users can register new services and at /services/services all existing services can be listed. It is possible to create private and public services. Both can only be modified by the creator, but public services can be used for deployments by everyone.
At the path /ensembles/new-ensemble users can register new resource ensembles and at /ensembles/ensembles all existing ensembles can be listed. Ensembles are private and can only be viewed by their creator. An ensemble consists of a list of service level objectives and a list of resources. A service level objective defines a limit for a certain metric that has to be fulfilled by all resources that are part of the resource ensemble; for example, an objective could require that a metric such as availability stays above a chosen threshold. When creating a new resource ensemble, all resources have to fulfill the specified service level objectives. After the creation, the ensemble can be validated manually. In the ensemble details, invalid resources are highlighted with a red background. Additionally, all registered resource ensembles are validated periodically. The validation period can be defined with ensemble_validation_period.
At the path /deployments/new-deployment users can create a new deployment and at /deployments/deployments all existing deployments can be listed. Deployments are private and can only be viewed by their creator. Existing deployments can have the status NEW, DEPLOYED, TERMINATING, TERMINATED or ERROR. The status of a deployment depends on the deployment status of the resources that are part of the deployment. Opening the details of a deployment displays the status of all resources as well as all logs that were created during deployment/termination. During deployment and termination the deployments endpoint is polled at a predefined interval (configurable with the .env variable NEXT_PUBLIC_POLLING_DELAY).
If a new deployment contains a resource with EC2 or OpenFaaS as the destination platform, users must provide valid docker credentials for a docker registry that is reachable by all resources. Important: Do not use your actual password for this. You can create and use an access token with read and write permissions instead and delete the token after all resources have been deployed. In addition, resources with OpenFaaS as platform are self-managed and are required to have a running instance of OpenFaaS that is accessible by the ARM.
If a deployment only contains container resources, users don't have to provide any additional credentials. The only requirement for deployments on container resources is one namespace per resource that has to be assigned to the user's account by an admin.
The suggested way to access all of these routes in the GUI is the sidebar, which links to every page described above.
For the benchmarking of the ARM, we implemented a benchmarking tool in Python that can be found at benchmark. The results of the benchmarks can be found at benchmark-results. The analysis of these results was implemented in Jupyter Notebooks and is located at rm-analysis. The raw data can be found in the other directories.