[Bug]: LLMRerank Default Prompt will fail and cause an error if an unrelated line containing a comma is included in the answer #17352
Comments
Hey there, @okirmis! I'm here to help you with any bugs, questions, or contributions you have in mind. Let's work together to solve the issue you're facing.
To resolve the IndexError, default_parse_choice_select_answer_fn can be modified to check each token for a colon before splitting it:

    import re
    from typing import List, Tuple


    def default_parse_choice_select_answer_fn(
        answer: str, num_choices: int, raise_error: bool = False
    ) -> Tuple[List[int], List[float]]:
        """Default parse choice select answer function."""
        answer_lines = answer.split("\n")
        answer_nums = []
        answer_relevances = []
        for answer_line in answer_lines:
            line_tokens = answer_line.split(",")
            if len(line_tokens) != 2:
                if not raise_error:
                    continue
                else:
                    raise ValueError(
                        f"Invalid answer line: {answer_line}. "
                        "Answer line must be of the form: "
                        "answer_num: <int>, answer_relevance: <float>"
                    )
            try:
                # Check if the token contains a colon before splitting
                if ":" not in line_tokens[0]:
                    if raise_error:
                        raise ValueError(
                            f"Invalid answer line: {answer_line}. "
                            "Answer line must be of the form: "
                            "answer_num: <int>, answer_relevance: <float>"
                        )
                    continue
                answer_num = int(line_tokens[0].split(":")[1].strip())
            except (IndexError, ValueError):
                if not raise_error:
                    continue
                else:
                    raise ValueError(
                        f"Invalid answer line: {answer_line}. "
                        "Answer line must be of the form: "
                        "answer_num: <int>, answer_relevance: <float>"
                    )
            if answer_num > num_choices:
                continue
            # Check if the token contains a colon before splitting
            if ":" not in line_tokens[1]:
                if raise_error:
                    raise ValueError(
                        f"Invalid answer line: {answer_line}. "
                        "Answer line must be of the form: "
                        "answer_num: <int>, answer_relevance: <float>"
                    )
                continue
            # Extract just the first run of digits after the colon
            _answer_relevance = re.findall(r"\d+", line_tokens[1].split(":")[1].strip())[0]
            # Append both values together so the two lists stay aligned
            answer_nums.append(answer_num)
            answer_relevances.append(float(_answer_relevance))
        return answer_nums, answer_relevances

This modification ensures that the function checks for the presence of a colon before attempting to split the string, which helps prevent the IndexError.
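The failure that the colon check guards against can be reproduced without LlamaIndex installed. The sketch below is only an illustration under stated assumptions: SAMPLE_ANSWER is a hypothetical LLM answer (the original output is not shown in this issue), and unguarded_relevance_token and parse_line are made-up helpers that mirror the two split calls, not the library's actual function.

```python
from typing import Optional, Tuple

# Hypothetical LLM answer: two well-formed lines, a blank line, and an
# unrelated sentence that happens to contain exactly one comma.
SAMPLE_ANSWER = (
    "Doc: 2, Relevance: 8\n"
    "Doc: 1, Relevance: 5\n"
    "\n"
    "The other documents are off topic, so they are omitted."
)


def unguarded_relevance_token(line: str) -> str:
    """Mirror the unguarded expression, which assumes a colon in token 1."""
    line_tokens = line.split(",")
    return line_tokens[1].split(":")[1].strip()


def parse_line(line: str) -> Optional[Tuple[int, float]]:
    """Return (doc_number, relevance), or None if the line is malformed."""
    line_tokens = line.split(",")
    if len(line_tokens) != 2:
        return None
    # Guard: both tokens need a colon before indexing the split result.
    if ":" not in line_tokens[0] or ":" not in line_tokens[1]:
        return None
    try:
        num = int(line_tokens[0].split(":")[1].strip())
        # Simplification: plain float() instead of a regex digit search.
        relevance = float(line_tokens[1].split(":")[1].strip())
    except ValueError:
        return None
    return num, relevance


lines = SAMPLE_ANSWER.split("\n")

# The unrelated sentence splits into two comma tokens, so it passes the
# length check, but its second token has no colon.
try:
    unguarded_relevance_token(lines[3])
except IndexError as exc:
    print(f"unguarded: IndexError: {exc}")

parsed = [result for result in map(parse_line, lines) if result is not None]
print(parsed)  # [(2, 8.0), (1, 5.0)]
```

With the guard in place, the blank line and the unrelated sentence are skipped instead of aborting the whole parse.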
Bug Description
Similar to #11092, an IndexError is still thrown when the LLM's answer in LLMRerank contains at least one unrelated line that also contains a comma. In my case the answer was:
While the first two lines and the third (empty) line are processed correctly, the line after them raises an error in default_parse_choice_select_answer_fn, in the expression line_tokens[1].split(":")[1], because line_tokens[1] does not contain a colon.
Version
0.12.8
Steps to Reproduce
This can be reproduced using the example code in https://docs.llamaindex.ai/en/stable/examples/node_postprocessor/LLMReranker-Gatsby/, but I cannot share the exact input data.
However, manually calling default_parse_choice_select_answer_fn with the LLM answer mentioned above will throw an IndexError.
Relevant Logs/Tracebacks