tests/test_blackd.py runs out of fds on systems with high nproc (is blackd leaking fds?) #4504

Open
mgorny opened this issue Oct 26, 2024 · 2 comments
Labels: T: bug (Something isn't working)

mgorny commented Oct 26, 2024

Describe the bug

When running the test suite on a machine with a high nproc (i.e. a large number of CPUs/cores; we have 80 on arm64 and 256 on sparc), it suddenly runs out of file descriptors partway through tests/test_blackd.py. The remaining blackd tests fail, and pytest then hangs when it is supposed to exit.

To Reproduce

  1. Get a system with a high nproc (perhaps some mocking would work?).
  2. Run tox -e py312-ci (the non-CI jobs use xdist, which works around the problem).

Expected behavior

Test suite passing.

Environment

  • Black's version: 53a2190
  • OS and Python version: Gentoo Linux arm64, 3.12.7

Additional context

To avoid obscuring the issue, here's a minimal log:

$ python -m pytest tests/test_blackd.py --maxfail=2
========================================================= test session starts =========================================================
platform linux -- Python 3.12.7, pytest-8.3.3, pluggy-1.5.0
rootdir: /home/mgorny/black
configfile: pyproject.toml
plugins: xdist-3.6.1, cov-5.0.0
collected 20 items                                                                                                                    

tests/test_blackd.py ........FF

============================================================== FAILURES ===============================================================
_________________________________________ BlackDTestCase.test_blackd_request_needs_formatting _________________________________________

self = <tests.test_blackd.BlackDTestCase testMethod=test_blackd_request_needs_formatting>

    async def test_blackd_request_needs_formatting(self) -> None:
        response = await self.client.post("/", data=b"print('hello world')")
>       self.assertEqual(response.status, 200)
E       AssertionError: 500 != 200

tests/test_blackd.py:38: AssertionError
---------------------------------------------------------- Captured log call ----------------------------------------------------------
ERROR    root:__init__.py:163 Exception during handling a request
Traceback (most recent call last):
  File "/home/mgorny/black/src/blackd/__init__.py", line 124, in handle
    formatted_str = await loop.run_in_executor(
                          ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/asyncio/base_events.py", line 863, in run_in_executor
    executor.submit(func, *args), loop=self)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/concurrent/futures/process.py", line 831, in submit
    self._start_executor_manager_thread()
  File "/usr/lib/python3.12/concurrent/futures/process.py", line 770, in _start_executor_manager_thread
    self._launch_processes()
  File "/usr/lib/python3.12/concurrent/futures/process.py", line 797, in _launch_processes
    self._spawn_process()
  File "/usr/lib/python3.12/concurrent/futures/process.py", line 807, in _spawn_process
    p.start()
  File "/usr/lib/python3.12/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
                  ^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/multiprocessing/context.py", line 282, in _Popen
    return Popen(process_obj)
           ^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/usr/lib/python3.12/multiprocessing/popen_fork.py", line 65, in _launch
    child_r, parent_w = os.pipe()
                        ^^^^^^^^^
OSError: [Errno 24] Too many open files
WARNING  asyncio:base_events.py:1981 Executing <Task pending name='Task-120' coro=<RequestHandler.start() running at /home/mgorny/black/.tox/py312/lib/python3.12/site-packages/aiohttp/web_protocol.py:534> wait_for=<Future pending cb=[Task.task_wakeup()] created at /usr/lib/python3.12/asyncio/base_events.py:449> created at /home/mgorny/black/.tox/py312/lib/python3.12/site-packages/aiohttp/web_protocol.py:319> took 0.168 seconds
____________________________________________ BlackDTestCase.test_blackd_request_no_change _____________________________________________

self = <tests.test_blackd.BlackDTestCase testMethod=test_blackd_request_no_change>

    async def get_application(self) -> web.Application:
>       return blackd.make_app()

tests/test_blackd.py:34: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
src/blackd/__init__.py:92: in make_app
    executor = ProcessPoolExecutor()
/usr/lib/python3.12/concurrent/futures/process.py:754: in __init__
    self._call_queue = _SafeQueue(
/usr/lib/python3.12/concurrent/futures/process.py:175: in __init__
    super().__init__(max_size, ctx=ctx)
/usr/lib/python3.12/multiprocessing/queues.py:43: in __init__
    self._rlock = ctx.Lock()
/usr/lib/python3.12/multiprocessing/context.py:68: in Lock
    return Lock(ctx=self.get_context())
/usr/lib/python3.12/multiprocessing/synchronize.py:169: in __init__
    SemLock.__init__(self, SEMAPHORE, 1, 1, ctx=ctx)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <Lock(owner=unknown)>, kind = 1, value = 1, maxvalue = 1

    def __init__(self, kind, value, maxvalue, *, ctx):
        if ctx is None:
            ctx = context._default_context.get_context()
        self._is_fork_ctx = ctx.get_start_method() == 'fork'
        unlink_now = sys.platform == 'win32' or self._is_fork_ctx
        for i in range(100):
            try:
>               sl = self._semlock = _multiprocessing.SemLock(
                    kind, value, maxvalue, self._make_name(),
                    unlink_now)
E                   OSError: [Errno 24] Too many open files

/usr/lib/python3.12/multiprocessing/synchronize.py:57: OSError
======================================================= short test summary info =======================================================
FAILED tests/test_blackd.py::BlackDTestCase::test_blackd_request_needs_formatting - AssertionError: 500 != 200
FAILED tests/test_blackd.py::BlackDTestCase::test_blackd_request_no_change - OSError: [Errno 24] Too many open files
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 2 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
===================================================== 2 failed, 8 passed in 1.82s =====================================================
^CException ignored in atexit callback: <function _exit_function at 0xffff9a292c00>
Traceback (most recent call last):
  File "/usr/lib/python3.12/multiprocessing/util.py", line 360, in _exit_function
    p.join()
  File "/usr/lib/python3.12/multiprocessing/process.py", line 149, in join
    res = self._popen.wait(timeout)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/multiprocessing/popen_fork.py", line 43, in wait
    return self.poll(os.WNOHANG if timeout == 0.0 else 0)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/multiprocessing/popen_fork.py", line 27, in poll
    pid, sts = os.waitpid(self.pid, flag)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyboardInterrupt: 

(Note that I had to ^C it, as it was hanging.)

My initial guess was that the ProcessPoolExecutor is not cleaned up when main() finishes, but hacking in a .shutdown() call doesn't seem to help. Passing max_workers= does help (the tests pass with values up to 29 here).

This system uses the default ulimit -n of 1024.
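For illustration, the max_workers= workaround could look roughly like the sketch below. The names `bounded_worker_count` and `make_executor` and the default cap of 8 are hypothetical choices for this example, not anything blackd currently provides; only `ProcessPoolExecutor(max_workers=...)` and `os.cpu_count()` are real standard-library APIs.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def bounded_worker_count(cap: int = 8) -> int:
    """Return the CPU count clamped to the range [1, cap].

    Hypothetical helper: the cap of 8 is an arbitrary illustration value.
    """
    cpus = os.cpu_count() or 1  # cpu_count() may return None
    return max(1, min(cpus, cap))


def make_executor(cap: int = 8) -> ProcessPoolExecutor:
    """Create a pool whose size does not scale with very large nproc."""
    return ProcessPoolExecutor(max_workers=bounded_worker_count(cap))
```

This only limits how many descriptors each executor needs; it does not explain why descriptors from earlier tests appear to stay open.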

mgorny added the "T: bug" label on Oct 26, 2024
gentoo-bot pushed a commit to gentoo/gentoo that referenced this issue Oct 26, 2024
Use pytest-forked to workaround fd leaks in blackd that cause the test
suite to fail and hang on systems with high nproc (i.e. our arm64
and sparc devboxes).

Bug: psf/black#4504
Signed-off-by: Michał Górny <mgorny@gentoo.org>

JelleZijlstra (Collaborator) commented

Possibly related to python/cpython#124706. I found that adding some gc.collect() calls helped with that, but didn't end up making that change in Black itself.

mgorny (Author) commented Oct 26, 2024

Actually, I've tried adding gc.collect() after the shutdown() call and that didn't help either.

Oh, and with Python 3.13 it's easier to reproduce:

PYTHON_CPU_COUNT=80 python -m pytest tests/test_blackd.py
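One way to check whether descriptors really outlive an executor is to count the process's open fds before and after using one. The following is a rough diagnostic sketch, not part of Black's test suite: `count_open_fds` and `fds_retained_by_executor` are hypothetical names, and the counting relies on /proc/self/fd (Linux) or /dev/fd being available.

```python
import gc
import os
from concurrent.futures import ProcessPoolExecutor


def count_open_fds() -> int:
    """Count this process's open fds by listing its fd directory.

    The listing itself briefly opens one directory fd, so the absolute
    value is offset by a constant; differences between calls are what matter.
    """
    for fd_dir in ("/proc/self/fd", "/dev/fd"):
        try:
            return len(os.listdir(fd_dir))
        except OSError:
            continue
    raise RuntimeError("no fd directory available on this platform")


def fds_retained_by_executor(max_workers: int) -> int:
    """Return how many fds remain open after one executor is used and shut down."""
    before = count_open_fds()
    # Leaving the with-block calls shutdown(wait=True) on the executor.
    with ProcessPoolExecutor(max_workers=max_workers) as executor:
        executor.submit(os.getpid).result()
    gc.collect()  # drop unreachable objects that may still hold pipes
    return count_open_fds() - before
```

Calling `fds_retained_by_executor(80)` from under an `if __name__ == "__main__":` guard and printing the result should show whether a single create/use/shutdown cycle leaves descriptors behind on a given Python version.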
