resource: fallback to sysconf when failed to detect memory size from hwloc for branch-2025.1 #8

Merged


@syuu1228 syuu1228 commented Feb 3, 2025

This is a backport of scylladb/seastar#2624.


On the Fedora 41 AMI, on some aarch64 instances such as m7gd.16xlarge, Seastar programs such as Scylla fail to start up with the following error message:

$ /opt/scylladb/bin/scylla --log-to-stdout 1
WARNING: debug mode. Not for benchmarking or production
hwloc/linux: failed to find sysfs cpu topology directory, aborting linux discovery.
scylla: seastar/src/core/resource.cc:683: resources seastar::resource::allocate(configuration &): Assertion `!remain' failed.

It seems that hwloc fails to initialize because /sys/devices/system/cpu/cpu0/topology/ is not available on the instance.
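
For anyone who wants to check whether an instance is affected, here is a minimal sketch that tests for the sysfs directory named above. It is only an illustration and not part of this patch; `topology_dir_present` is a hypothetical helper name, and the check uses plain C++17 `std::filesystem`, not hwloc.

```cpp
#include <filesystem>
#include <string>
#include <system_error>

// Hypothetical helper (not part of Seastar or hwloc): returns true when the
// given path exists and is a directory. The default is the sysfs directory
// mentioned in the hwloc warning.
inline bool topology_dir_present(
        const std::string& path = "/sys/devices/system/cpu/cpu0/topology") {
    std::error_code ec;
    // The error_code overload does not throw; it returns false on any error,
    // including a missing path.
    return std::filesystem::is_directory(path, ec);
}
```

If `topology_dir_present()` returns false on a machine, that machine is probably in the situation described here, although hwloc may look at more than this one directory.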

I debugged src/core/resource.cc to find out why the assertion fired, and found that alloc_from_node() fails because node->total_memory is 0. This is likely caused by the hwloc initialization failure described above.

I also found that calculate_memory() goes wrong because machine->total_memory is also 0.

To avoid the error in such environments, we need to fix up the memory size in both machine->total_memory and node->total_memory. We can use sysconf(_SC_PAGESIZE) * sysconf(_SC_PHYS_PAGES) for this, just as the non-hwloc version of allocate() does.
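
A rough sketch of the fallback idea, for illustration only. It is not the actual diff; the helper names `physical_memory_from_sysconf` and `memory_or_fallback` are hypothetical, and how the value is assigned to each node inside allocate() is not shown here. Note that `_SC_PHYS_PAGES` is a Linux/glibc extension and not part of POSIX.

```cpp
#include <unistd.h>
#include <cstdint>

// Hypothetical helper: total physical memory in bytes computed as
// sysconf(_SC_PAGESIZE) * sysconf(_SC_PHYS_PAGES), or 0 if either query fails.
inline std::uint64_t physical_memory_from_sysconf() {
    const long page_size = ::sysconf(_SC_PAGESIZE);
    const long phys_pages = ::sysconf(_SC_PHYS_PAGES);
    if (page_size <= 0 || phys_pages <= 0) {
        return 0;  // sysconf returns -1 on error
    }
    return static_cast<std::uint64_t>(page_size) *
           static_cast<std::uint64_t>(phys_pages);
}

// Hypothetical helper: keep the size reported by topology discovery unless it
// is 0, in which case fall back to the sysconf-based value.
inline std::uint64_t memory_or_fallback(std::uint64_t detected) {
    return detected != 0 ? detected : physical_memory_from_sysconf();
}
```

With something like this, a zero machine->total_memory or node->total_memory could be replaced by `memory_or_fallback(...)` before the allocation logic runs, while values that hwloc did report are left untouched.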

Fixes scylladb/scylladb#22382
Related scylladb/scylla-pkg#4797

(cherry picked from commit b0a9f89)

@avikivity avikivity merged commit a350b5d into scylladb:branch-2025.1 Feb 3, 2025
15 checks passed
@avikivity

Submodule update queued for 2025.1.
