RANGER-5175: Functional Test Case Support for KMS API and HDFS Encryption #547

Draft · wants to merge 17 commits into base: master
4 changes: 4 additions & 0 deletions PyTest-KMS-HDFS/pytest.ini
@@ -0,0 +1,4 @@
[pytest]
markers =
cleanEZ: clean up the encryption zone
createEZ: create encryption zone
60 changes: 60 additions & 0 deletions PyTest-KMS-HDFS/readme.md
@@ -0,0 +1,60 @@
# KMS API & HDFS Encryption Pytest Suite


This test suite validates REST API endpoints for KMS (Key Management Service) and tests HDFS encryption functionalities including key management and file operations within encryption zones.

**test_kms:** test cases covering KMS API functionality

**test_hdfs:** test cases covering HDFS encryption
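As a quick orientation, here is a hedged sketch of how the suite's KMS REST calls are composed. The base URL, headers, and `user.name` parameter mirror the constants defined in `test_hdfs/test_config.py`; the endpoint paths follow the standard Hadoop KMS REST API, and the helper functions are illustrative only:

```python
# Sketch only: endpoint helpers matching the Hadoop KMS REST API layout.
# BASE_URL, HEADERS and PARAMS mirror constants from test_hdfs/test_config.py.
BASE_URL = "http://localhost:9292/kms/v1"
HEADERS = {"Content-Type": "application/json", "Accept": "application/json"}
PARAMS = {"user.name": "keyadmin"}  # KMS authorizes calls as this user

def create_key_url() -> str:
    # POST here with a JSON body such as {"name": ..., "cipher": ..., "length": ...}
    return f"{BASE_URL}/keys"

def key_metadata_url(key_name: str) -> str:
    # GET here returns the key's metadata (cipher, length, versions, ...)
    return f"{BASE_URL}/key/{key_name}/_metadata"
```

The actual requests in the suite are sent with the `requests` package pinned in `requirements.txt`.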

## 📂 Directory Structure

```
test_directory/
├── test_kms/                 # Tests on the KMS API
│   ├── test_keys.py          # Key creation and key name validation
│   ├── test_keyDetails.py    # getKeyName, getKeyMetadata, getKeyVersion checks
│   ├── test_keyOps.py        # Key operations: roll-over, generate DEK, decrypt EDEK
│   ├── conftest.py           # Reusable fixtures and setup
│   ├── utils.py              # Utility methods
│   └── readme.md
├── test_hdfs/                # Tests on the HDFS encryption cycle
│   ├── test_encryption.py    # Full HDFS encryption cycle testing
│   ├── test_config.py        # Stores all constants and HDFS commands
│   ├── conftest.py           # Sets up the environment
│   └── readme.md
├── pytest.ini                # Registers custom pytest markers
├── requirements.txt
└── README.md                 # This file
```

## ⚙️ Setup Instructions
Bring up the KMS container and any dependent containers using Docker.

Create a virtual environment and install the required packages from `requirements.txt`.
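A minimal sketch of the virtual-environment step (the `.venv` name is a convention, not mandated by the suite; run this from the `PyTest-KMS-HDFS` directory so `requirements.txt` is found):

```shell
# Create and activate an isolated virtual environment
python3 -m venv .venv
. .venv/bin/activate

# Install the pinned test dependencies, if requirements.txt is present here
if [ -f requirements.txt ]; then
    pip install -r requirements.txt
fi
```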

## Run test cases

**Navigate to the PyTest-KMS-HDFS directory.**

**To run the tests in the test_kms folder:**
> pytest -vs test_kms/

To run with an HTML report included:
> pytest -vs test_kms/ --html=kms-report.html

**To run the tests in the test_hdfs folder:**
> pytest -vs -k "test_encryption"

or

> pytest -vs test_hdfs/

To run with an HTML report included:
> pytest -vs test_hdfs/ --html=hdfs-report.html

## 📌 Notes

Ensure Docker containers for KMS and HDFS are running before executing tests.

Reports generated using --html can be viewed in any browser for detailed test results.
20 changes: 20 additions & 0 deletions PyTest-KMS-HDFS/requirements.txt
@@ -0,0 +1,20 @@
annotated-types==0.7.0
certifi==2025.1.31
charset-normalizer==3.4.1
docker==7.1.0
idna==3.10
iniconfig==2.0.0
Jinja2==3.1.6
MarkupSafe==3.0.2
packaging==24.2
pluggy==1.5.0
pydantic==2.11.0
pydantic_core==2.33.0
pytest==8.3.5
pytest-html==4.1.1
pytest-metadata==3.1.1
python-on-whales==0.76.1
requests==2.32.3
typing-inspection==0.4.0
typing_extensions==4.13.0
urllib3==2.3.0
80 changes: 80 additions & 0 deletions PyTest-KMS-HDFS/test_hdfs/conftest.py
@@ -0,0 +1,80 @@
import time

import docker
import pytest

from test_config import (HADOOP_CONTAINER, HDFS_USER, KMS_PROPERTY,
                         CORE_SITE_XML_PATH, SET_PATH_CMD)

# Set up the Docker client
client = docker.from_env()


@pytest.fixture(scope="module")
def hadoop_container():
    # Get the running Hadoop container instance
    return client.containers.get(HADOOP_CONTAINER)


def configure_kms_property(hadoop_container):
    # Check whether the KMS property already exists
    check_cmd = f"grep 'hadoop.security.key.provider.path' {CORE_SITE_XML_PATH}"
    exit_code, _ = hadoop_container.exec_run(check_cmd, user='root')

    if exit_code != 0:
        # Insert the KMS property just before the closing </configuration> tag
        insert_cmd = f"sed -i '/<\\/configuration>/i {KMS_PROPERTY}' {CORE_SITE_XML_PATH}"
        exit_code, output = hadoop_container.exec_run(insert_cmd, user='root')
        print(f"KMS property inserted. Exit code: {exit_code}")

        # Debug: show the updated file
        _, file_content = hadoop_container.exec_run(f"cat {CORE_SITE_XML_PATH}", user='root')
        print("Updated core-site.xml:\n", file_content.decode())

        # Restart the container to apply the config changes
        print("Restarting Hadoop container to apply changes...")
        hadoop_container.restart()
        time.sleep(10)  # Wait for the container to fully restart
        print("Hadoop container restarted and ready.")
    else:
        print("KMS provider already present. No need to update config.")


def ensure_user_exists(hadoop_container, username):
    print(f"Ensuring {username} user exists...")
    exit_code, _ = hadoop_container.exec_run(f"id -u {username}", user='root')

    if exit_code != 0:
        # Create the user if not already present
        exit_code, output = hadoop_container.exec_run(f"useradd {username}", user='root')
        print(f"{username} user created. Exit code: {exit_code}")

        # Add the user to the hadoop group
        exit_code, output = hadoop_container.exec_run(f"usermod -aG hadoop {username}", user='root')
        print(f"Permissions assigned to {username}. Exit code: {exit_code}")
    else:
        print(f"{username} user already exists. No need to create.")


# Automatically set up the environment before tests run
@pytest.fixture(scope="module", autouse=True)
def setup_environment(hadoop_container):
    # Make sure /opt/hadoop/bin is on PATH inside the container
    hadoop_container.exec_run(SET_PATH_CMD, user='root')

    configure_kms_property(hadoop_container)
    ensure_user_exists(hadoop_container, "keyadmin")

    # Leave safe mode so HDFS accepts writes
    print("Exiting HDFS Safe Mode...")
    hadoop_container.exec_run("hdfs dfsadmin -safemode leave", user=HDFS_USER)

    yield  # Run tests

    # Post-test cleanup
    print("Tests completed.")
91 changes: 91 additions & 0 deletions PyTest-KMS-HDFS/test_hdfs/readme.md
@@ -0,0 +1,91 @@
# Main directory for testing the HDFS encryption cycle

## Structure
```
test_hdfs/
├── test_encryption.py    # Full HDFS encryption cycle testing
├── test_config.py        # Stores all constants and HDFS commands
├── conftest.py           # Sets up the environment
└── utils.py              # Utility methods
```

---

## Extra Features

- **Markers:**
Markers have been used to selectively run specific test cases, improving test efficiency and organization.

---

### `setup_environment`

Handled in the `conftest.py` file.
Before running the test cases, some environment configuration is needed:
- HDFS must communicate with KMS to fetch key details.
- Specific KMS properties are added to the `core-site.xml` file.
- Containers are restarted to apply the changes effectively.
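For reference, the property inserted into `core-site.xml` (taken from `KMS_PROPERTY` in `test_config.py`) looks like this when formatted:

```xml
<property>
  <name>hadoop.security.key.provider.path</name>
  <value>kms://http@host.docker.internal:9292/kms</value>
</property>
```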

---

### Utility Methods

- **get_error_logs:**
Fetches logs from both KMS and HDFS containers. Helps in identifying issues when errors or exceptions occur during testing.

- **run_command:**
Executes all necessary HDFS commands inside the containers.
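A hedged sketch of what `run_command` boils down to. The real helper lives in `utils.py` and targets a live Docker container; `StubContainer` below is a stand-in used only to show the shape of docker-py's `exec_run` return value (an exit code plus raw bytes):

```python
# Illustrative only: StubContainer mimics docker-py's Container.exec_run,
# which returns (exit_code, output_bytes).
class StubContainer:
    def exec_run(self, cmd, user="hdfs"):
        return 0, f"ran: {cmd}".encode()

def run_command(container, cmd, user="hdfs"):
    # Execute a command in the container and decode its output for assertions
    exit_code, output = container.exec_run(cmd, user=user)
    return exit_code, output.decode().strip()

code, out = run_command(StubContainer(), "hdfs dfs -ls /")
```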

---

## `test_encryption.py`

Handles the **full HDFS encryption cycle**, including setup, positive and negative test scenarios, and cleanup.

### Main Highlights:
- Encryption Zone (EZ) creation in HDFS.
- Granting permissions to specific users for read/write operations within the EZ.
- Validating read/write attempts by unauthorized users inside the EZ.


## Test Cases

### ✅ Positive Test Cases

1. **test_create_key:**
Creates an Encryption Zone (EZ) key, which is required to create an EZ.

2. **test_create_encryption_zone:**
Creates an Encryption Zone (EZ) using an existing EZ key.

3. **test_grant_permissions:**
Grants read-write permissions to a specific user (e.g., HIVE) within the EZ.

4. **test_hive_user_write_read:**
Performs write and read operations inside the EZ using the authorized HIVE user.

---

### ❌ Negative Test Cases

1. **test_unauthorized_write:**
Attempts to write inside the EZ using an unauthorized user (e.g., HBASE). Validates expected denial of access.

2. **test_unauthorized_read:**
Attempts to read inside the EZ using an unauthorized user. Validates expected denial of access.

---

### 🧹 Cleanup

- **test_cleanup:**
Cleans up the Encryption Zone and all files created during testing.
Deletes the EZ key created earlier.
Ensures the test environment is reset for clean re-runs.

---

## Summary

This test suite ensures that **HDFS encryption and access control mechanisms** function as expected, validating both authorized and unauthorized access scenarios while maintaining a clean and reusable test environment.
57 changes: 57 additions & 0 deletions PyTest-KMS-HDFS/test_hdfs/test_config.py
@@ -0,0 +1,57 @@
HDFS_USER = "hdfs"
HIVE_USER = "hive"
KEY_ADMIN = "keyadmin"
HEADERS = {"Content-Type": "application/json", "Accept": "application/json"}
PARAMS = {"user.name": "keyadmin"}
BASE_URL = "http://localhost:9292/kms/v1"
HADOOP_CONTAINER = "ranger-hadoop"
KMS_CONTAINER = "ranger-kms"

# KMS config that needs to be added to core-site.xml; add more properties if needed
KMS_PROPERTY = """<property><name>hadoop.security.key.provider.path</name><value>kms://http@host.docker.internal:9292/kms</value></property>"""

CORE_SITE_XML_PATH = "/opt/hadoop/etc/hadoop/core-site.xml"

# Ensure PATH is set for /opt/hadoop/bin
SET_PATH_CMD="echo 'export PATH=/opt/hadoop/bin:$PATH' >> /etc/profile && export PATH=/opt/hadoop/bin:$PATH"

HADOOP_NAMENODE_LOG_PATH="/opt/hadoop/logs/hadoop-hdfs-namenode-ranger-hadoop.example.com.log"
KMS_LOG_PATH="/var/log/ranger/kms/ranger-kms-ranger-kms.example.com-root.log"


# HDFS Commands----------------------------------------------------
CREATE_KEY_COMMAND = "hadoop key create my_key -size 128 -provider kms://http@host.docker.internal:9292/kms"

VALIDATE_KEY_COMMAND = "hadoop key list -provider kms://http@host.docker.internal:9292/kms"

CREATE_EZ_COMMANDS = [
"hdfs dfs -mkdir /secure_zone2",
"hdfs crypto -createZone -keyName my_key -path /secure_zone2",
"hdfs crypto -listZones"
]

GRANT_PERMISSIONS_COMMANDS = [
"hdfs dfs -chmod 700 /secure_zone2",
"hdfs dfs -chown hive:hive /secure_zone2"
]

HIVE_CREATE_FILE_COMMAND = 'bash -c \'echo "Hello, this is a third file!" > /home/hive/testfile2.txt && ls -l /home/hive/testfile2.txt\''

HIVE_ACTIONS_COMMANDS = [
"hdfs dfs -put /home/hive/testfile2.txt /secure_zone2/",
"hdfs dfs -ls /secure_zone2/",
"hdfs dfs -cat /secure_zone2/testfile2.txt"
]

UNAUTHORIZED_WRITE_COMMAND = 'hdfs dfs -put /home/hbase/hack.txt /secure_zone2/'

UNAUTHORIZED_READ_COMMAND = "hdfs dfs -cat /secure_zone2/testfile2.txt"

CLEANUP_COMMANDS = [
"hdfs dfs -rm /secure_zone2/testfile2.txt",
"hdfs dfs -rm -R /secure_zone2"
]
KEY_DELETION_CMD = "bash -c \"echo 'Y' | hadoop key delete my_key -provider kms://http@host.docker.internal:9292/kms\""

