UnicodeDecodeError or SystemError when using f-string with lambda and non-ASCII characters #130618

Open
gaesa opened this issue Feb 27, 2025 · 3 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-unicode type-bug An unexpected behavior, bug, or error

Comments

@gaesa

gaesa commented Feb 27, 2025

Bug report

Bug description:

An f-string whose replacement field contains a call with lambda arguments that return non-ASCII strings raises either a UnicodeDecodeError or a SystemError. Which error is raised, and whether any error is raised at all, depends on small changes to the lambdas and to the string contents.

Steps to Reproduce

def test1(foo, bar):
    return ""

def test2():
    return f"{test1(
        foo=lambda: '、、、、、、、、、、、、、、、、、',
        bar=lambda: 'abcdefghijklmnopqrstuvwxyz 123456789 123456789',
    )}"

Run the code with python <file.py>, which triggers the following error:

UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 28-29: unexpected end of data
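
For readers unfamiliar with the "unexpected end of data" wording, here is a minimal sketch of what that message means at the Python level. It does not show what the lexer does internally. It only shows that decoding UTF-8 bytes that stop in the middle of a multi-byte character (such as '、', which takes three bytes) fails with the same reason. The variable names are illustrative.

```python
# '、' is U+3001 and takes three bytes in UTF-8.
encoded = "、".encode("utf-8")

# Drop the last byte so the data ends in the middle of a character.
truncated = encoded[:-1]

reason = None
try:
    truncated.decode("utf-8")
except UnicodeDecodeError as exc:
    # exc.reason holds the short explanation shown at the end of the message.
    reason = exc.reason
    print(exc)
```

The position range in the message refers to byte offsets within the data being decoded, not to character offsets in the source file.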

Expected Behavior

The code should execute without raising any errors.

Actual Behavior

Each modification below was applied on its own to the original code, with the following results:

  1. Removing one character from the first string (foo): No error.
  2. Removing all characters from the first string: SystemError: Negative size passed to PyUnicode_New.
  3. Removing foo= (i.e., not passing foo by keyword): No error.
  4. Removing lambda: (i.e., making either argument or both into a str type instead of a Callable[[], str]): No error.
  5. Removing any character from the second string (bar): No error.
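
The five variations above can be checked in one run with a small harness. This is a sketch under a few assumptions: the script in the report only defines functions and never calls them, so the error presumably arises while the source is compiled, and `compile()` on a string is assumed to exercise the same code path as `python <file.py>`. The helpers `build_source` and `classify` are hypothetical names introduced here. The outcome depends on the interpreter version, so no expected output is shown.

```python
FOO_TEXT = '、、、、、、、、、、、、、、、、、'
BAR_TEXT = 'abcdefghijklmnopqrstuvwxyz 123456789 123456789'


def build_source(foo_arg, bar_arg):
    """Return the reproducer source with the two call arguments filled in."""
    return (
        'def test1(foo, bar):\n'
        '    return ""\n'
        '\n'
        'def test2():\n'
        '    return f"{test1(\n'
        f'        {foo_arg},\n'
        f'        {bar_arg},\n'
        '    )}"\n'
    )


def classify(source):
    """Compile the source and describe the outcome as a short string."""
    try:
        compile(source, "<variant>", "exec")
    except Exception as exc:  # includes SyntaxError on interpreters before 3.12
        return f"{type(exc).__name__}: {exc}"
    return "no error"


foo_lambda = f"foo=lambda: '{FOO_TEXT}'"
bar_lambda = f"bar=lambda: '{BAR_TEXT}'"

# One entry per case in the list above, plus the unmodified reproducer.
variants = {
    "baseline": build_source(foo_lambda, bar_lambda),
    "1: foo shortened by one character": build_source(
        f"foo=lambda: '{FOO_TEXT[:-1]}'", bar_lambda),
    "2: foo emptied": build_source("foo=lambda: ''", bar_lambda),
    "3: foo passed positionally": build_source(
        f"lambda: '{FOO_TEXT}'", bar_lambda),
    "4: plain strings instead of lambdas": build_source(
        f"foo='{FOO_TEXT}'", f"bar='{BAR_TEXT}'"),
    "5: bar shortened by one character": build_source(
        foo_lambda, f"bar=lambda: '{BAR_TEXT[:-1]}'"),
}

results = {name: classify(source) for name, source in variants.items()}
for name, outcome in results.items():
    print(f"{name}: {outcome}")
```

Building the variants from one template keeps the indentation and trailing commas identical to the report, which matters because the failure appears sensitive to the exact length of the source text.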

Other Relevant Information

The bug does not reproduce in Python 3.13.1, but it does reproduce in Python 3.13.2 and 3.12.9.

CPython versions tested on:

3.13

Operating systems tested on:

Linux

Linked PRs

@gaesa gaesa added the type-bug An unexpected behavior, bug, or error label Feb 27, 2025
@encukou encukou added interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-unicode labels Feb 27, 2025
@vstinner
Member

It seems to be a parser bug.

cc @lysnikolaou @pablogsal

@pablogsal
Member

On it

@pablogsal
Member

The problem was introduced by:

❯ git bisect bad
a3797492179c249417a06d2499a7d535d453ac2c is the first bad commit
commit a3797492179c249417a06d2499a7d535d453ac2c (HEAD)
Author: Miss Islington (bot) <31488909+miss-islington@users.noreply.github.com>
Date:   Wed Jan 22 01:50:22 2025 +0100

    [3.13] gh-129093: Fix f-string debug text sometimes getting cut off when expression contains `!` (GH-129159) (#129163)

    gh-129093: Fix f-string debug text sometimes getting cut off when expression contains `!` (GH-129159)
    (cherry picked from commit 767cf708449fbf13826d379ecef64af97d779510)

    Co-authored-by: Tomas R <tomas.roun8@gmail.com>

 Lib/test/test_fstring.py                                                          | 18 ++++++++++++++++++
 Misc/NEWS.d/next/Core_and_Builtins/2025-01-21-23-35-41.gh-issue-129093.0rvETC.rst |  2 ++
 Parser/lexer/lexer.c                                                              |  4 +---
 3 files changed, 21 insertions(+), 3 deletions(-)
 create mode 100644 Misc/NEWS.d/next/Core_and_Builtins/2025-01-21-23-35-41.gh-issue-129093.0rvETC.rst
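
The thread does not show how the bisect was driven. As a rough sketch, a check like the following could be used with `git bisect run`, which treats exit code 0 as good, 125 as "skip this commit", and other codes from 1 to 127 as bad. The function name `bisect_exit_code` is hypothetical, and the sketch assumes that the failure surfaces when the reproducer is compiled.

```python
def bisect_exit_code(source: str) -> int:
    """Map the outcome of compiling `source` to a `git bisect run` exit code."""
    try:
        compile(source, "<repro>", "exec")
    except (UnicodeDecodeError, SystemError):
        return 1    # one of the two failures from the report: commit is bad
    except SyntaxError:
        return 125  # source cannot be compiled for another reason: skip
    return 0        # compiled cleanly: commit is good
```

To use it, a script would read the reproducer from a file, pass its text to `bisect_exit_code`, and hand the result to `sys.exit`. That script would then be run with the freshly built interpreter at each bisect step.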

pablogsal added a commit to pablogsal/cpython that referenced this issue Feb 27, 2025
…strings (pythonGH-130638)

(cherry picked from commit e06bebb)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>