Skip to content

Generating and checking

Vladimir Turov edited this page Oct 29, 2021 · 9 revisions

Table of contents

  1. About
  2. General testing process
    1. generate
    2. run
    3. check
  3. TestCase class
    1. Input
    2. Arguments
    3. Files
    4. Time limit
    5. Attach
  4. Generate method
  5. Check method
    1. Regular check method
    2. Without check method
    3. Different check methods

About

In the hs-test library, tests are a set of test cases. One test case is an object of the TestCase class. The generate method should generate a list of objects of the TestCase class and return it. It's a low-level mechanism of writing tests.

In the previous sections, you saw tests with dynamically generated input, but it is possible to create tests with static predetermined input. If the program to test is rather simple you can use this kind of tests to check it.

Dynamic testing has some benefits over tests with static input:

  1. You are able to test small parts of the output compared to checking the whole output. Parsing the whole output to determine if the program written correctly may be more error-prone rather than parsing small pieces of the output.
  2. Some programs aren't testable with static input because the input may be dependent on the previous output (for example, Tic-Tac-Toe game).

This method of testing revealed serious weaknesses and limitations for testing complex programs. Use static input only when the program to be tested is very simple. In other cases, it's strongly recommended to use dynamic testing.

General testing process

You can see the abstract testing algorithm in the code block below.

tests = generate()
for test in tests:
    output = run(test)
    result = check(output, test.attach)
    if not result.correct:
        break

So, you can see that there are three functions here - generate, run, check. Let's consider them below.

generate

The generate function is intended for creating test cases. This function returns the list of test cases. Each test case contains enough information for this separate test. For example, what data should be submitted to the standard input of the program or what files should be created before the start of this test, with which command-line arguments the program should be launched.

Note that the content of this function will be different for each stage of each project. That's why this is the first function you should implement while writing tests for a project.

run

This function runs user code.

First, it launches the main method of the program being tested with the command-line arguments listed in the test case.

Second, it puts the text saved in the test case in the standard input before running the main method. It looks like the user is entering something into the program from the keyboard, but in fact, it is predefined in the test case.

This method also creates the necessary files before starting the test (for example, the stage description says you should read the data from the file). After the main method is executed, these files are deleted.

Also, this function makes sure that the values of all variables in the program are the same each time. For example, in Java, there are static variables that keep their value between method executions. In Python, there are variables inside the module, which also do not change their value between main file executions, because modules are imported only once during the whole period of the program execution. So, this method also changes all the variables in the program to the original state as if before running the first test.

This method returns what the program printed to the console, i.e. the standard output.

You don't need to implement this function, it is always the same for all the stages of all the projects, so it's already written.

check

This function is intended to check the correctness of the student program. It takes two parameters - the output of the program for the current test and attach to the test (this attach should also be added to the test case in the generate function).

The function returns a special object containing 2 fields - correct and feedback. We will consider them later.

As you can see from the algorithm above, the first failed test stops testing.

Note that the content of this function will be different for each stage of each project. That's why this is the second function you should implement while writing tests for a project.

TestCase class

One test case is a single run of the user program. Before creating an object of the TestCase class, it is necessary to import it.

from hstest import TestCase

To parameterize a test case, uses named arguments.

TestCase(
    stdin=text,
    attach=object,
    files={src: content, src2: content2}
)

Input

The first and most frequently used parameter is stdin. It is intended to simulate input using the keyboard.

Let's look at this example:

TestCase(stdin="15\n23")

In this case, user's program below will run normally, count two numbers, and print their sum as if the user had entered these numbers using the keyboard. If you run this program directly, you will actually have to enter these two numbers from the keyboard. And you can use the stdin method to prepare this data in advance.

line = input()
first, second = line.split()
print(int(first) + int(second))

Arguments

The test case can also be parameterized by command line arguments. These arguments are passed to the program when starting the program from the console. For example:

$ python renamer.py --from old --to new --all

A Python program will get a list of six arguments: [renamer.py, --from, old, --to, new, --all] - the first parameter in Python is always a path to the launched file.

An example of how to set up a command-line arguments for test cases can be seen below:

TestCase(args=["--from", "old", "--to", "new", "--all"])

An example of using command line arguments in user code can be seen below:

import sys

print(len(sys.argv))
for arg in sys.argv:
    print(arg)

Files

You can also create external files for testing. Before starting the test case, these files are created on the file system so that the student's program can read them. After the user program finishes, these will be deleted from the file system.

An example of how to set up external files for test cases can be seen below:

TestCase(
    files={
        'file1.txt': 'File content',
        'file2.txt': 'Another content'
    }
)

Below you can see example of using these external files in user's code:

with open('file1.txt') as f:
    print(f.read())

Time limit

By default, the user program has 15 seconds to do all the necessary work. When the time runs out, the program shuts down and the user is shown a message that the time limit has been exceeded.

To use time limit, set the time_limit argument. Notice, that you need to specify the limit in milliseconds (so, by default it's 15000). If you want to disable time limit, make it zero or negative, for example -1.

TestCase(time_limit=10000)

Attach

You can also attach any object to the test case. This is an internal object, it will be available during checking inside the check method. You can put in the test input, feedback, anything that can help you to check this particular test case.

Below you can see how to attach the object to the test case.

from hstest import *


class SampleTest(StageTest):
    def generate(self) -> List[TestCase]:
        return [
            TestCase(attach=('Sample feedback', 1)),
            TestCase(attach=('Another feedback', 2))
        ]

    def check(self, reply: str, attach) -> CheckResult:
        feedback, count = attach
        if str(count) not in reply:
            return CheckResult.wrong(feedback)
        return CheckResult.correct()


if __name__ == '__main__':
    SampleTest().run_tests()

Generate method

You can generate these test cases within the generate method. It must return a list. You can see an example of using the method in the example of Attach (previous example).

Check method

Regular check method

The check method is used to check the user's solution. This method has 2 parameters - reply and attach. The first parameter is the output of the student program to the standard output. Below you can see examples of student programs that print data to the standard output.

print('Hello world!')
print('Second line also')

Everything that the user's program has outputted to the standard output is passed as the first argument to the check method. Therefore, in the following examples, the reply variable will be equal to "Hello world!\nSecond line also\n".

This method must return the CheckResult object. If the user's solution is incorrect, it is necessary to explain why - for example, in the example below in the check method it is checked that the user printed exactly 2 lines, and if the user not printed 2 lines, then inform the user what went wrong during checking.

from hstest import *


class SampleTest(StageTest):
    def generate(self) -> List[TestCase]:
        return [TestCase()]

    def check(self, reply: str, attach) -> CheckResult:
        if len(reply.strip().split()) == 2:
            return CheckResult.correct()
        return CheckResult.wrong("You should output exactly two lines")


if __name__ == '__main__':
    SampleTest().run_tests()

Instead of returning CheckResult object you may throw WrongAnswer or TestPassed error to indicate about the result of the test. It can be useful if you are under a lot of methods and don't want to continue checking everything else.

Since throwing any exceptions in the check method is prohibited (it indicates that something went wrong in tests and tests should be corrected), a lot of tests forced to be written like this:

... deep into function calls
    if some_parse_problem:
        raise Exception(feedback)
...

... check method
try:
    grids = Grid.parse(out)
except Exception ex:
    return CheckResult.wrong(ex.message)
...

But it is allowed to throw WrongAnswer and TestPassed errors. Now you can write this code in a more understandable way without many unnecessary try/catch constructions.

from hstest import WrongAnswer

... deep into function calls
    if some_parse_problem:
        raise WrongAnswer(feedback)
...

... check method
grids = Grid.parse(out)

Without check method

Sometimes test cases do not require a complex check method. For example, when the program output should be checked exactly. In this case, this method will always look like this:

def check(self, reply: str, expected: str):
    is_correct = reply.strip() == expected.strip()
    return CheckResult(is_correct, self.feedback)

Therefore, there is a class SimpleTestCase. The constructor of this class accepts two arguments - input data and expected output data.

Let's look at an example of tests for a program that should sum up all the numbers that go into the standard input.

from hstest import *


class SumTest(StageTest):
    def generate(self):
        return [
            SimpleTestCase(stdin="2 3", stdout="5", feedback=''),
            SimpleTestCase(stdin="1 2 3 4", stdout="10", feedback=''),
            SimpleTestCase(stdin="7", stdout="7", feedback=''),
            SimpleTestCase(stdin="0\n1 2", stdout="3", feedback='')
        ]


if __name__ == '__main__':
    SumTest().run_tests()

Here you can see that with the help of the SimpleTestCase class you do not have to write the check method yourself. One of the user's approaches to this task may look like this:

print(sum(map(int, input().split())))

And after the checking the student gets: Wrong answer in test #4 and doesn't understand what is wrong with the solution. In fact, the solution expects all numbers to be in the first line, although the task says about all the numbers to be input, including the other lines. To do this, you can set a special feedback, which will appear next to the Wrong answer in test #4, so that the student can understand where the program error is. An updated, more correct example of the tests can be seen below:

from hstest import *


class SumTest(StageTest):
    def generate(self):
        return [
            SimpleTestCase(
                stdin="2 3", stdout="5", 
                feedback='The numbers were: 2 3'),

            SimpleTestCase(
                stdin="1 2 3 4", stdout="10", 
                feedback='There are 4 numbers in the input'),

            SimpleTestCase(
                stdin="7", stdout="7", 
                feedback='Input may contain a single number, isn\'t it?'),

            SimpleTestCase(
                stdin="0\n1 2", stdout="3", 
                feedback='Numbers in this test appear in multiple lines')
        ]


if __name__ == '__main__':
    SumTest().run_tests()

You can mix tests that do not require a check method and regular tests. For normal tests, you also need to override the check method. In the example below, the check method is called only in the fifth test, the first four tests are tested as in the example above. Suppose the task says that if there are no numbers in the standard input, you should output an error message containing the word error.

from hstest import *


class SumTest(StageTest):
    def generate(self):
        return [
            SimpleTestCase(
                stdin="2 3", stdout="5", 
                feedback='The numbers were: 2 3'),

            SimpleTestCase(
                stdin="1 2 3 4", stdout="10", 
                feedback='There are 4 numbers in the input'),

            SimpleTestCase(
                stdin="7", stdout="7", 
                feedback='Input may contain a single number, isn\'t it?'),

            SimpleTestCase(
                stdin="0\n1 2", stdout="3", 
                feedback='Numbers in this test appear in multiple lines'),

            TestCase()
        ]
    
    def check(self, reply, attach):
        if 'error' in reply.lower():
            return CheckResult.correct()
        return CheckResult.wrong(
            'If there are no numbers in the standard input ' +
            'you should print an error message'
        )


if __name__ == '__main__':
    SumTest().run_tests()

Different check methods

Testing by check method is launched by default for test cases. But you can redefine this behavior by providing the test case with your own verification method. Thus, some test cases may check the student's solution using one method, while others may check the student's solution using another test method.

To change the standard check method, you need to set the check_function argument and pass another method. This can be seen in the example below. Let's imagine that you need to write tests for a program that adds two numbers if the numbers are input, otherwise it will concatenate two lines. In the example below, the first two test cases are tested using the check_integer_sum method; the next three test cases are tested using the check_string_concat method.

from hstest import *


class SumTest(StageTest):
    def generate(self):
        return [
            TestCase(
                stdin="12\n34", attach="12\n34",
                check_function=self.check_integer_sum
            ),
            TestCase(
                stdin="43\n23", attach="43\n23",
                check_function=self.check_integer_sum
            ),
            TestCase(
                stdin="qw\ner", attach="qw\ner",
                check_function=self.check_string_concat
            ),
            TestCase(
                stdin="12\new", attach="12\new",
                check_function=self.check_string_concat
            ),
            TestCase(
                stdin="qw\n12", attach="qw\n12",
                check_function=self.check_string_concat
            ),
        ]

    def check_integer_sum(self, reply: str, attach):
        first, second = map(int, attach.split())

        try:
            replied = int(reply.strip())
        except ValueError:
            return CheckResult.wrong(
                "Your program didn't output a number!"
            )
        
        if first + second != replied:
            return CheckResult.wrong(
                "You should output a sum of numbers " +
                    "when both numbers is integers"
            )

        return CheckResult.correct()

    def check_string_concat(self, reply: str, attach):
        words = attach.split()

        if reply.strip() != words[0] + words[1]:
            return CheckResult.wrong(
                "You should output a concatenation of words " +
                "when at least one of the words in not integer"
            )

        return CheckResult.correct()