
Convert llm-load-test to package-able format #78

Open
sjmonson wants to merge 9 commits into base: main
Conversation

@sjmonson (Member) commented Jan 16, 2025

This PR converts the llm-load-test repo into a more packaging-friendly format.

  • Moved source files to src/llm_load_test.
  • Replaced requirements.txt with a pyproject.toml, per Python packaging guidelines.
  • Configured PDM as the build system (this is a personal preference; other options are hatch, setuptools, poetry, etc.).
  • Added a load_test.py script to the project root that calls the package's main entrypoint, for legacy compatibility (see the sketch below).
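For illustration, a minimal sketch of what such a root-level shim could look like. It is not the exact file from this PR; it assumes the package exposes llm_load_test.load_test.main() (as noted in the change summary below) and that main() parses sys.argv itself and returns an exit status:

```python
#!/usr/bin/env python3
"""Legacy entrypoint kept at the repository root for backwards compatibility.

Sketch only: it simply delegates to the packaged entrypoint so existing
`python load_test.py ...` invocations keep working after the restructuring.
"""
import sys

from llm_load_test.load_test import main

if __name__ == "__main__":
    # Assumes main() reads sys.argv and returns an exit code.
    sys.exit(main())
```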

Note: This PR fixes an issue with the linting config where certain files were missed; as a result, many new linting errors now surface.

Partially Implements #76
Closes #75
Closes #77

Summary by CodeRabbit

Release Notes

  • New Features

    • Added a command-line interface (CLI) for load testing large language models
    • Introduced a modular plugin system for testing different LLM runtimes and APIs
  • Documentation

    • Updated README with new installation and usage instructions
    • Updated Python version requirement to 3.10+
  • Chores

    • Migrated project configuration to pyproject.toml
    • Restructured project module imports
    • Updated dependency management approach
  • Build

    • Added PDM (Python Development Master) configuration
    • Configured package build and distribution settings

@npalaska (Collaborator) left a comment


Looks like a good step towards modularizing the tool; I'll give it a try locally.

Comment on lines +91 to +92
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
Collaborator:

Would this file need to be generated locally to install all the required dependencies?

@sjmonson (Member, Author):

I just pulled those lines from the GitHub Python .gitignore template; AFAIK any PDM configuration can be done through the pyproject.toml file.

However, we probably won't need any build-system configuration, since our build process does not require any external library support.
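For illustration, a minimal pyproject.toml sketch of the kind of setup being discussed: the [tool.pdm] table replaces any separate .pdm.toml, and the console script provides the `load-test` command referenced in the README changes. The version, entry-point target, and (empty) dependency list are placeholders, not the actual contents of this PR:

```toml
[project]
name = "llm-load-test"
version = "0.1.0"
requires-python = ">=3.10"
dependencies = []  # placeholder; the real list lives in the PR's pyproject.toml

[project.scripts]
# Installs a `load-test` console command; the module:function target is assumed.
load-test = "llm_load_test.load_test:main"

[build-system]
requires = ["pdm-backend"]
build-backend = "pdm.backend"

[tool.pdm]
# Project-wide PDM settings can live here instead of a separate .pdm.toml.
```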

@ashishkamra

@coderabbitai review


coderabbitai bot commented Jan 30, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.


coderabbitai bot commented Jan 30, 2025

Walkthrough

The pull request introduces a comprehensive restructuring of the llm-load-test project, focusing on improving package organization, dependency management, and project structure. The changes include creating a pyproject.toml for dependency and build configuration, updating import paths to use absolute imports within the llm_load_test package, modifying the .gitignore to support PDM (Python Development Master), and updating the README and installation instructions. The project is now set up to be more easily installable and maintainable.

Changes

| File | Change Summary |
| --- | --- |
| .gitignore | Added PDM-related entries to ignore configuration and build files |
| README.md | Updated Python version requirement, installation, and usage instructions |
| load_test.py | Converted to a legacy adapter delegating to llm_load_test.load_test.main() |
| pyproject.toml | Added new project configuration with dependencies, build system, and entry point |
| requirements.txt | Removed explicit dependencies; added an editable package install |
| src/llm_load_test/plugins/* | Updated import paths to use absolute imports from llm_load_test |
| tox.ini | Modified linter commands to search for Python files in src/ |

Assessment against linked issues

| Objective | Addressed | Explanation |
| --- | --- | --- |
| Linter runs in subdirectories [#75] | ✅ | |
| Make project package-able [#77] | ✅ | |

Poem

🐰 A package reborn, with structure so neat,
Dependencies dancing to a PDM beat
Imports now absolute, imports now clear
Load testing made simple, give a rabbit cheer!
🚀 Package magic, our code takes flight!



@coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (4)
src/llm_load_test/load_test.py (2)

16-46: Consider using monotonic time for more robust timing.

In run_main_process, you rely on time.time(), which can jump if the system clock changes. For load testing, using time.monotonic() is often safer and less error-prone.

-    start_time = time.time()
-    current_time = start_time
-    while (current_time - start_time) < duration:
+    start_time = time.monotonic()
+    current_time = start_time
+    while current_time - start_time < duration:
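As a self-contained illustration of that pattern (not code from this PR), a duration-bounded loop driven by the monotonic clock:

```python
import time


def run_for(duration_s: float, step) -> None:
    """Call step() repeatedly until duration_s seconds of monotonic time elapse.

    time.monotonic() is unaffected by wall-clock adjustments (NTP, manual
    changes), so the loop can neither end early nor overrun its deadline
    because of a clock jump.
    """
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        step()


if __name__ == "__main__":
    run_for(1.0, lambda: time.sleep(0.1))
```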

153-157: Add test coverage for main concurrency loop.

The loop in main that adjusts concurrency (for n_users in concurrency:) is central to the tool’s functionality. Consider adding or expanding automated tests to ensure the logic handles edge cases (e.g., concurrency set to [0], concurrency set to very large values) without issues.

Would you like me to propose a new test file or add tests inline to an existing suite?
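A hedged sketch of what such edge-case coverage might look like, reusing the config-driven main(argv) interface assumed in the test proposal further down this thread; the config schema, concurrency values, and exit behavior are assumptions, not verified against the actual code:

```python
# tests/test_concurrency_edge_cases.py (sketch)
import pytest

from llm_load_test.load_test import main

CONFIG_TEMPLATE = """
plugin: "dummy"
plugin_options:
  streaming: false
dataset:
  type: "simple"
  input_text: "test query"
  input_tokens: 2
  output_tokens: 5
load_options:
  concurrency: {concurrency}
  duration: 1
"""


@pytest.mark.parametrize("concurrency", [0, 1, 64])
def test_concurrency_edge_cases(tmp_path, concurrency):
    # Write a throwaway config for each concurrency value and run the tool end to end.
    config = tmp_path / "config.yaml"
    config.write_text(CONFIG_TEMPLATE.format(concurrency=concurrency))
    with pytest.raises(SystemExit) as exc_info:
        main(["-c", str(config)])
    assert exc_info.value.code == 0
```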

tox.ini (1)

32-32: Consider re-enabling pylint to maintain consistent code quality.

Pylint is commented out in [testenv:linters]. Having both flake8 and pylint active would help catch a wider range of issues. Plan to re-enable it in your follow-up PR to ensure thorough lint coverage.

README.md (1)

27-32: Ensure usage instructions remain consistent.

The README usage lines reference two different commands: python load_test.py and load-test. Consider aligning them for clarity so users see a single, consistent command form.

-Run the tool with the following command: `python load_test.py -c config.yaml`
+Run the tool with the following command: `load-test -c config.yaml`
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 85b6c66 and b88d7f7.

⛔ Files ignored due to path filters (1)
  • pdm.lock is excluded by !**/*.lock
📒 Files selected for processing (13)
  • .gitignore (1 hunks)
  • README.md (3 hunks)
  • load_test.py (1 hunks)
  • pyproject.toml (1 hunks)
  • requirements.txt (1 hunks)
  • src/llm_load_test/generation_pb2_grpc.py (2 hunks)
  • src/llm_load_test/load_test.py (1 hunks)
  • src/llm_load_test/plugins/dummy_plugin.py (1 hunks)
  • src/llm_load_test/plugins/hf_tgi_plugin.py (1 hunks)
  • src/llm_load_test/plugins/openai_plugin.py (1 hunks)
  • src/llm_load_test/plugins/tgis_grpc_plugin.py (1 hunks)
  • src/llm_load_test/utils.py (1 hunks)
  • tox.ini (1 hunks)
✅ Files skipped from review due to trivial changes (9)
  • src/llm_load_test/utils.py
  • requirements.txt
  • src/llm_load_test/generation_pb2_grpc.py
  • src/llm_load_test/plugins/openai_plugin.py
  • src/llm_load_test/plugins/dummy_plugin.py
  • src/llm_load_test/plugins/hf_tgi_plugin.py
  • src/llm_load_test/plugins/tgis_grpc_plugin.py
  • load_test.py
  • pyproject.toml
🧰 Additional context used
🪛 GitHub Actions: linters
src/llm_load_test/load_test.py

[warning] 10-12: Import statements are in wrong order and missing proper grouping

🔇 Additional comments (5)
.gitignore (2)

88-96: LGTM! PDM configuration looks good.

The PDM-related ignore patterns align with the project's transition to a packageable format:

  • .pdm.toml: Correctly ignored as it contains project-wide configurations
  • .pdm-python: Correctly ignored as it's environment-specific
  • .pdm-build/: Correctly ignored as it contains build artifacts

The configuration follows PDM's best practices and includes helpful documentation links.


89-90: Consider tracking pdm.lock in version control.

The .gitignore comment recommends including pdm.lock in version control, and the ignore entry for it is commented out, so the lock file will be tracked; make sure it is actually committed. The lock file helps ensure reproducible builds by pinning exact dependency versions, which is particularly important for a load-testing tool where consistent behavior across environments is crucial.

Run this script to check if pdm.lock exists and contains dependency information:

✅ Verification successful

Keep the pdm.lock entry commented out in .gitignore and commit the lock file to version control

The pdm.lock file exists and contains critical dependency information including exact versions, hashes, and requirements. For a load testing tool, tracking this file ensures reproducible builds and consistent behavior across environments.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if pdm.lock exists and verify its contents
if [ -f "pdm.lock" ]; then
    echo "pdm.lock exists. Contents:"
    cat pdm.lock
else
    echo "pdm.lock not found. Consider creating it with 'pdm lock'"
fi

Length of output: 40820

src/llm_load_test/load_test.py (1)

160-170: Confirm safe termination of subprocesses.

The KeyboardInterrupt and generic Exception handlers both call stop_procs and stop_test. Verify that user processes don’t leave behind dangling threads or objects when an exception occurs mid-test. If needed, add final cleanup or a robust teardown to ensure consistent resource deallocation.
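A minimal sketch of the kind of teardown being suggested: a try/finally that always signals, joins, and, if needed, terminates workers. This is generic multiprocessing code, not the project's actual stop_procs/stop_test helpers:

```python
import multiprocessing as mp


def _worker(stop_q):
    # Placeholder worker; real user processes live in the llm_load_test package.
    stop_q.get()  # block until a stop signal arrives


def run_with_teardown(n_procs):
    """Start workers and guarantee cleanup even if the test body raises."""
    stop_q = mp.Queue()
    procs = [mp.Process(target=_worker, args=(stop_q,)) for _ in range(n_procs)]
    for p in procs:
        p.start()
    try:
        pass  # test body: distribute queries, gather results, etc.
    finally:
        for _ in procs:           # one stop signal per worker
            stop_q.put(None)
        for p in procs:
            p.join(timeout=5)     # bounded wait...
            if p.is_alive():
                p.terminate()     # ...then escalate so nothing is left dangling
                p.join()


if __name__ == "__main__":
    run_with_teardown(2)
```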

tox.ini (1)

26-26: Consider cross-platform compatibility for linting.

The command uses bash -c 'find src/ -name "*.py"', which may fail on non-Unix systems lacking bash. If you wish to support Windows, consider a platform-agnostic approach like a Python one-liner or adding an equivalent script.
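One possible platform-agnostic alternative (a sketch, not part of this PR) is a tiny Python helper that tox can call instead of bash/find; alternatively, flake8 can simply be pointed at the src/ directory, which it walks recursively on any platform:

```python
# list_py_files.py — print every Python file under src/, portably.
from pathlib import Path

if __name__ == "__main__":
    for path in sorted(Path("src").rglob("*.py")):
        print(path)
```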

README.md (1)

7-7: Clarify Python version requirements.

You’ve updated the requirement to “Python 3.10 or newer.” Ensure that your pyproject.toml and any CI configuration also reflect 3.10 as the minimum version so that users aren’t misled.

Comment on lines +10 to +12
from llm_load_test.dataset import Dataset

from llm_load_test import logging_utils

⚠️ Potential issue

Resolve import ordering to match linter rules.

The linter warns that these imports are not grouped or ordered properly. Typically, we group imports by standard library, third-party libraries, and local modules. Try consolidating the local modules in one block to satisfy style guidelines.

Apply the following diff to reorder imports and fix the pipeline warning:

 import logging
 import logging.handlers
 import multiprocessing as mp
 import sys
 import time

-from llm_load_test.user import User
-from llm_load_test.dataset import Dataset
-
 from llm_load_test import logging_utils
 from llm_load_test import utils
+
+from llm_load_test.dataset import Dataset
+from llm_load_test.user import User

Committable suggestion skipped: line range outside the PR's diff.



coderabbitai bot commented Jan 30, 2025

Caution

An unexpected error occurred while opening a pull request: Not Found - https://docs.github.com/rest/git/refs#get-a-reference

@sjmonson (Member, Author) commented:

@coderabbitai Please propose some unit tests


coderabbitai bot commented Jan 30, 2025

Tip

For best results, initiate chat on the files or code changes.

@sjmonson Here's a proposal for unit tests. I'll break them down into multiple test files:

# tests/test_load_test.py
import pytest
from unittest.mock import Mock, patch
from multiprocessing import Queue
from llm_load_test.load_test import run_main_process, gather_results, stop_procs, create_procs

def test_run_main_process():
    # Mock dependencies
    dataset = Mock()
    dataset.get_next_n_queries.return_value = ["query1", "query2"]
    dataset_q = Queue()
    stop_q = Queue()
    
    # Test normal execution
    with patch('time.time') as mock_time, patch('time.sleep'):
        mock_time.side_effect = [0, 0.5, 1, 1.5, 2]  # Simulate time progression
        run_main_process(2, 1, dataset, dataset_q, stop_q)
        
        # Verify dataset queue was filled
        assert dataset_q.qsize() == 0  # Queue should be empty after cleanup
        assert stop_q.qsize() == 1  # Stop signal should be sent

def test_gather_results():
    # Mock pipes
    pipe1, pipe2 = Mock(), Mock()
    pipe1.recv.return_value = ["result1", "result2"]
    pipe2.recv.return_value = ["result3"]
    
    results = gather_results([pipe1, pipe2])
    assert len(results) == 3
    assert "result1" in results
    assert "result3" in results

def test_stop_procs():
    # Mock processes and queues
    proc1, proc2 = Mock(), Mock()
    dataset_q = Queue()
    stop_q = Queue()
    
    # Add some items to queues
    dataset_q.put("item1")
    
    stop_procs([proc1, proc2], dataset_q, stop_q)
    
    # Verify cleanup
    assert proc1.join.called
    assert proc2.join.called
    assert dataset_q.empty()
    assert stop_q.empty()
# tests/test_plugins.py
import pytest
from llm_load_test.plugins.dummy_plugin import DummyPlugin
from llm_load_test.result import RequestResult

def test_dummy_plugin_request():
    # Test non-streaming request
    plugin = DummyPlugin({"streaming": False})
    query = {
        "text": "hello world",
        "output_tokens": 5,
        "input_tokens": 2
    }
    
    result = plugin.request_http(query, user_id=1)
    assert isinstance(result, RequestResult)
    assert result.output_text == "dlrow olleh"  # Reversed text
    assert result.output_tokens == 5
    
def test_dummy_plugin_streaming():
    # Test streaming request
    plugin = DummyPlugin({"streaming": True})
    query = {
        "text": "hello world",
        "output_tokens": 5,
        "input_tokens": 2
    }
    
    result = plugin.streaming_request_http(query, user_id=1)
    assert isinstance(result, RequestResult)
    assert result.ack_time is not None
    assert result.first_token_time is not None
    assert result.end_time > result.start_time
# tests/test_result.py
import pytest
from llm_load_test.result import RequestResult

def test_request_result_calculation():
    result = RequestResult(user_id=1, input_id="test1", input_tokens=10)
    result.start_time = 1000
    result.ack_time = 1001
    result.first_token_time = 1002
    result.end_time = 1005
    result.output_tokens = 20
    
    result.calculate_results()
    
    assert result.time_to_first_token == 2  # 1002 - 1000
    assert result.total_time == 5  # 1005 - 1000
    assert result.tokens_per_second == 4  # 20 tokens / 5 seconds

def test_request_result_error_handling():
    result = RequestResult(user_id=1, input_id="test1", input_tokens=10)
    result.error_code = 500
    result.error_text = "Internal Server Error"
    
    assert result.has_error()
    assert result.error_code == 500
# tests/test_integration.py
import pytest
from llm_load_test.load_test import main
from llm_load_test.plugins.dummy_plugin import DummyPlugin

@pytest.fixture
def mock_config(tmp_path):
    config_file = tmp_path / "test_config.yaml"
    config_file.write_text("""
plugin: "dummy"
plugin_options:
  streaming: false
dataset:
  type: "simple"
  input_text: "test query"
  input_tokens: 2
  output_tokens: 5
load_options:
  concurrency: 2
  duration: 1
""")
    return str(config_file)

def test_complete_flow(mock_config):
    with pytest.raises(SystemExit) as exc_info:
        main(["-c", mock_config, "--log_level", "debug"])
    assert exc_info.value.code == 0

These tests cover:

  1. Core functionality:
    • Process management
    • Queue handling
    • Result gathering
  2. Plugin system:
    • Plugin interface
    • Request handling
    • Streaming vs non-streaming
  3. Result handling:
    • Calculations
    • Error scenarios
  4. Integration:
    • Complete flow with configuration
    • Error handling

Additional tests should be added for:

  1. Other plugin implementations (OpenAI, HF TGI, TGIS gRPC)
  2. Dataset handling
  3. Edge cases and error scenarios
  4. Configuration parsing
  5. Logging functionality

Would you like me to generate tests for any specific component in more detail?

🔗 Analysis chain
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Find all test files
echo "=== Existing test files ==="
fd "test.*\.py$"

# Look at the main functionality
echo -e "\n=== Main functionality (load_test.py) ==="
cat src/llm_load_test/load_test.py

# Look at plugin implementations
echo -e "\n=== Plugin implementations ==="
fd ".*plugin\.py$" --exec cat {}

Length of output: 33540


✏️ Learnings added
Learnt from: sjmonson
URL: https://github.com/openshift-psap/llm-load-test/pull/78

Timestamp: 2025-01-30T22:26:44.892Z
Learning: The llm-load-test project uses pytest for unit testing, with tests organized by component (core functionality, plugins, results, integration).

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

@sjmonson (Member, Author) commented:

Seems ok... would have to try these tests

Development

Successfully merging this pull request may close these issues:

  • Make llm-load-test package-able
  • Linter is not run file in subdirectories

3 participants