SecurNet is a network security project that implements a Network Intrusion Detection System (NIDS) to strengthen network defenses. The project involves data preprocessing, feature selection, machine-learning-based log classification, and a Streamlit dashboard for insightful visualization of key metrics.
The project begins with the collection of network logs, which are sent to a Kafka topic named "logs" for initial preprocessing. The first Python file handles this task, preparing the data for feature selection.
A second Python file retrieves the preprocessed data from the "logs" Kafka topic, performs additional preprocessing, and sends the refined data to another Kafka topic named "logsprocessed."
The third Python file retrieves data from the "logsprocessed" Kafka topic. It passes the logs through a trained machine learning model to classify them into categories: Background, Normal, or Botnet. The results are then sent to the "logslabelled" Kafka topic.
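The consume → process → produce pattern shared by these middle stages looks roughly like the sketch below. It is a minimal illustration only: it assumes the kafka-python client, JSON-encoded log records, a joblib-saved scikit-learn model, and hypothetical feature/column names, not the project's actual code.

```python
# Sketch of the classification stage: consume from "logsprocessed",
# classify each log, and produce the result to "logslabelled".
# Model path, feature columns, and message format are assumptions.
import json
import joblib
from kafka import KafkaConsumer, KafkaProducer

consumer = KafkaConsumer(
    "logsprocessed",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

model = joblib.load("Model/model.pkl")        # hypothetical model path
FEATURES = ["duration", "packets", "bytes"]   # hypothetical feature columns
LABELS = {0: "Background", 1: "Normal", 2: "Botnet"}

for message in consumer:
    log = message.value
    row = [[log[f] for f in FEATURES]]
    pred = model.predict(row)[0]
    log["label"] = LABELS.get(pred, str(pred))
    producer.send("logslabelled", log)
```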
Apache Pinot acts as a consumer, ingesting data from the "logslabelled" Kafka topic and storing it in a database. This ensures efficient storage and retrieval of labeled log data.
The final component is a Streamlit dashboard that fetches data from Apache Pinot. The dashboard displays key metrics and insights derived from the labeled log data. This visualization aids in better defending against network attacks by providing a real-time overview of network security.
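As a rough illustration of how the dashboard could pull data from Pinot, here is a minimal Streamlit sketch using the pinotdb client. The table name (taken from the transcript config files referenced in the setup steps), the "label" column, and the broker port are assumptions; the real app.py may query and plot differently.

```python
# Minimal dashboard sketch, assuming the pinotdb client, a Pinot table named
# "transcript", and a "label" column on the labelled logs.
import pandas as pd
import streamlit as st
from pinotdb import connect

# Connect to the Pinot broker (started on port 7001 in the setup steps below).
conn = connect(host="localhost", port=7001, path="/query/sql", scheme="http")
cursor = conn.cursor()
cursor.execute("SELECT label, COUNT(*) AS cnt FROM transcript GROUP BY label")
df = pd.DataFrame(cursor.fetchall(), columns=["label", "cnt"])

st.title("SecurNet - Network Traffic Overview")
st.bar_chart(df.set_index("label")["cnt"])
```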
To set up and run the SecurNet project, follow these steps:
- Clone the repository:
git clone https://github.com/yourusername/SecurNet.git
cd SecurNet
- Then download the required files from here: LINK, and move them into the SecurNet folder.
The preprocessing.py file cleans the raw log data and prepares it for training, producing a prepro.csv file. MLmodeltraining.py then uses this processed log data to train the model.
- First, run the preprocessing.py file.
- It will generate a CSV file in a folder named outprepro.
- Rename the generated CSV file to prepro.csv.
- Now run the MLmodeltraining.py file. This will save the trained model in the Model folder, ready to be used; a rough sketch of this step is shown below.
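This sketch assumes prepro.csv has a "Label" target column and that a scikit-learn classifier saved with joblib is acceptable; the actual MLmodeltraining.py may use a different model and column names.

```python
# Rough sketch of the training step: load prepro.csv, fit a classifier,
# and persist it to the Model folder. Column and file names are assumptions.
import joblib
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

df = pd.read_csv("prepro.csv")
X = df.drop(columns=["Label"])   # "Label" is an assumed target column name
y = df["Label"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))

joblib.dump(clf, "Model/model.pkl")  # hypothetical file name inside the Model folder
```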
Here we simulate log data arriving in real time by reading a CSV file of raw log data and sending it to Kafka in chunks of 10 rows.
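A minimal version of that simulation could look like the sketch below: read the raw-log CSV with pandas in 10-row chunks and publish each row to the "logs" topic. The file name, JSON encoding, and one-second pause are assumptions.

```python
# Sketch of the real-time simulation: stream a raw-log CSV to the "logs"
# topic in chunks of 10 rows. File name and pacing are illustrative.
import json
import time
import pandas as pd
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

for chunk in pd.read_csv("raw_logs.csv", chunksize=10):   # hypothetical file name
    for record in chunk.to_dict(orient="records"):
        producer.send("logs", record)
    producer.flush()
    time.sleep(1)   # pause to mimic logs arriving over time
```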
The flow of the log data can be seen below:
To run the project, follow the steps below.
NOTE: Run each of the following commands in a separate terminal.
- Run Apache ZooKeeper and Kafka in separate terminals, one after the other, with the following commands:
zookeeper-server-start /opt/homebrew/etc/zookeeper/zoo.cfg
kafka-server-start /opt/homebrew/etc/kafka/server.properties
- Create the Kafka topics "logs", "logsprocessed", and "logslabelled":
kafka-topics --create --topic logs --bootstrap-server localhost:9092
kafka-topics --create --topic logsprocessed --bootstrap-server localhost:9092
kafka-topics --create --topic logslabelled --bootstrap-server localhost:9092
- Start the Apache Pinot Controller, Broker, and Server:
pinot-admin StartController -zkAddress localhost:2181 -clusterName PinotCluster -controllerPort 9001
pinot-admin StartBroker -zkAddress localhost:2181 -clusterName PinotCluster -brokerPort 7001
pinot-admin StartServer -zkAddress localhost:2181 -clusterName PinotCluster -serverPort 8001 -serverAdminPort 8011
- Send the table schema and table config to Apache Pinot.
pinot-admin AddTable \
-schemaFile files_config/transcript_schema.json \
-tableConfigFile files_config/transcript_table_realtime.json \
-controllerPort 9001 -exec
- Start 0.py, 1.py, and 2.py in three separate terminals, one after the other.
- Open the Apache Pinot dashboard to see the data being ingested ----> Link
- Run the Streamlit app to see the dashboard:
streamlit run app.py