Outlier Detection System
JSON messages of the following form are pushed to Kafka:
{"publisher": "publisher-id", "time": "2015-11-03 15:03:30.352", "readings": [ 1, 13, 192, 7, 8, 99, 1014, 4]}
The system provides two services for finding outliers among the median readings: a consumer, which stores the median of each message's readings in a database, and a REST service, which retrieves a requested number of the most recent records for a publisher and marks the outliers.
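To make the consumer's job concrete, here is a minimal sketch of computing the median of one message's readings. The class and method names (`MedianSketch`, `median`) are hypothetical and not taken from the project's sources, and the even-count rule (mean of the two middle values) is an assumption.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; the real consumer may compute this differently.
class MedianSketch {

    // Returns the median of the readings. For an even count this sketch
    // assumes the mean of the two middle values.
    static double median(int[] readings) {
        if (readings.length == 0) {
            throw new IllegalArgumentException("readings must not be empty");
        }
        int[] sorted = readings.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        if (sorted.length % 2 == 1) {
            return sorted[mid];
        }
        // Widen to long so the sum cannot overflow for large readings.
        return ((long) sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    public static void main(String[] args) {
        // Readings from the sample message above.
        int[] readings = {1, 13, 192, 7, 8, 99, 1014, 4};
        System.out.println(median(readings)); // prints 10.5
    }
}
```

For the sample message, the sorted readings are 1, 4, 7, 8, 13, 99, 192, 1014, so the value stored would be (8 + 13) / 2 = 10.5 under this assumption.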
- Consumer
- Web
- Tests (one per service, covering the happy path of the whole flow)
- Java 1.8+
- Maven
- Docker for Kafka, Zookeeper, MySQL
docker run --net=host -d --name=zookeeper -e ZOOKEEPER_CLIENT_PORT=2181 confluentinc/cp-zookeeper
docker run --net=host -d -p 9092:9092 --name=kafka -e KAFKA_ZOOKEEPER_CONNECT=localhost:2181 -e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://localhost:9092 -e KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1 confluentinc/cp-kafka
docker run --name readings-db --net=host -p 3306:3306 -e MYSQL_ROOT_PASSWORD=m4st3r -e MYSQL_DATABASE=readings_db -e MYSQL_USER=readingsuser -e MYSQL_PASSWORD=readingsus3r -d mysql
cd outlier-detection-system
mvn clean package
- Start consumer:
java -jar readings-consumer/target/readings-consumer-0.0.1-SNAPSHOT.jar
- Start web:
java -jar outliers-web/target/outliers-web-0.0.1-SNAPSHOT.jar
- Publish some readings:
docker run --net=host --rm confluentinc/cp-kafka bash -c "echo '{\"publisher\":\"pub1\",\"time\":\"2019-12-03 13:08:03.040\",\"readings\":[7,8,9]}' | kafka-console-producer --request-required-acks 1 --broker-list localhost:9092 --topic outliers"
- Get the most recent records for the specified publisher (pub1), with outliers marked as true:
curl -XGET "localhost:8080/publishers/pub1/outliers?limit=10"
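This README does not say which rule the web service uses to decide that a stored median is an outlier. As an illustration only, the sketch below flags values that lie more than a chosen number of median absolute deviations (MAD) from the median of the retrieved records. The rule itself, the class name `OutlierSketch`, the method `markOutliers`, the threshold of 3 and the sample values are all assumptions, not the project's actual implementation.

```java
import java.util.Arrays;

// Hypothetical sketch; the project's real outlier rule may differ.
class OutlierSketch {

    // Median of a non-empty array (mean of the two middle values for an even count).
    static double median(double[] values) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    // Marks values[i] as an outlier when its distance from the median
    // exceeds threshold * MAD. If MAD is 0, any value that differs from
    // the median is marked.
    static boolean[] markOutliers(double[] values, double threshold) {
        boolean[] flags = new boolean[values.length];
        if (values.length == 0) {
            return flags;
        }
        double med = median(values);
        double[] deviations = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            deviations[i] = Math.abs(values[i] - med);
        }
        double mad = median(deviations);
        for (int i = 0; i < values.length; i++) {
            flags[i] = deviations[i] > threshold * mad;
        }
        return flags;
    }

    public static void main(String[] args) {
        // Made-up stored medians for one publisher, for illustration only.
        double[] medians = {8.0, 10.5, 9.0, 7.5, 120.0};
        System.out.println(Arrays.toString(markOutliers(medians, 3.0)));
        // prints [false, false, false, false, true]
    }
}
```

A median-based rule is used here because a single extreme value shifts the median and MAD far less than it shifts a mean and standard deviation; other rules, such as an interquartile-range fence, would fit the same flow.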