[Bug] Core Dump in mirror_replay Test Suite During Execution
#782
Comments
FYI: Non debug builds produce the following:
|
CI seems to have lost isolation2 that |
@yjhjstz & @avamingli I hope to bring it and others online soon. I am able to run two of the isolation2 tests and did notice there are failures (output differences). They can be seen here. https://github.com/edespino/cloudberry/actions/runs/12364538041 |
This test is currently causing core dumps when run as part of the greenplum_schedule. To prevent this from blocking other testing while we investigate the root cause:
- Created new fixme_schedule containing only mirror_replay
- Removed mirror_replay from greenplum_schedule
- Added installcheck-fixme make target to run problematic tests in isolation

Issue: apache#782
Hi, at a glance, that's a case we should fix. Please feel free to create the PR bringing isolation2 back if that is the only failing case. I will help you fix the diffs there. (I am on vacation today; I will perhaps be back tomorrow.) |
diff -I HINT: -I CONTEXT: -I GP_IGNORE: -U3 /__w/cloudberry/cloudberry/src/test/isolation2/expected/parallel_retrieve_cursor/explain.out /__w/cloudberry/cloudberry/src/test/isolation2/results/parallel_retrieve_cursor/explain.out
--- /__w/cloudberry/cloudberry/src/test/isolation2/expected/parallel_retrieve_cursor/explain.out 2024-12-16 17:38:39.620082360 -0800
+++ /__w/cloudberry/cloudberry/src/test/isolation2/results/parallel_retrieve_cursor/explain.out 2024-12-16 17:38:39.628082370 -0800
@@ -113,40 +113,40 @@
QUERY PLAN
___________
Seq Scan on pg_catalog.pg_class
- Output: oid, relname, relnamespace, reltype, reloftype, relowner, relam, relfilenode, reltablespace, relpages, reltuples, relallvisible, reltoastrelid, relhasindex, relisshared, relpersistence, relkind, relnatts, relchecks, relhasrules, relhastriggers, relhassubclass, relrowsecurity, relforcerowsecurity, relispopulated, relreplident, relispartition, relisivm, relrewrite, relfrozenxid, relminmxid, relacl, reloptions, relpartbound
-GP_IGNORE:(3 rows)
+ Output: oid, relname, relnamespace, reltype, reloftype, relowner, relam, relfilenode, reltablespace, relpages, reltuples, relallvisible, reltoastrelid, relhasindex, relisshared, relpersistence, relkind, relnatts, relchecks, relhasrules, relhastriggers, relhassubclass, relrowsecurity, relforcerowsecurity, relispopulated, relreplident, relispartition, relisivm, relisdynamic, relrewrite, relfrozenxid, relminmxid, relacl, reloptions, relpartbound
+GP_IGNORE:(4 rows)
help to add |
* Enhance Build Pipeline with Debug and Core Analysis Support

  Adds comprehensive debug build support and automated core dump analysis to the Cloudberry build pipeline. Key features:
  - Debug build capability with preserved symbols and debug-specific RPMs
  - Automated core dump detection and analysis during test execution
  - Core file correlation with test failures
  - Enhanced test result reporting with core dump status
  - Improved artifact management for debug builds

  The changes enable better debugging of test failures and provide more detailed information about process crashes during testing.

* test: Move mirror_replay test to separate schedule due to core dumps

  This test is currently causing core dumps when run as part of the greenplum_schedule. To prevent this from blocking other testing while we investigate the root cause:
  - Created new fixme_schedule containing only mirror_replay
  - Removed mirror_replay from greenplum_schedule
  - Added installcheck-fixme make target to run problematic tests in isolation

  Issue: #782

* test: Mark mirror_replay cores as warnings

  When enable_check_core is disabled, the test should proceed with a warning rather than failing. Modified the core file check and summary to mark mirror_replay with a warning status in these cases. This complements the previous isolation of this test into fixme_schedule, allowing testing to proceed while we investigate the underlying core dump issue.
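The warning-versus-failure rule described in the last commit can be illustrated with a minimal shell sketch. This is not the pipeline's actual code: the function names `count_core_files` and `core_check_status`, the `core*` file-name pattern, and the `true`/`false` values for `enable_check_core` are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the core-check rule; not taken from the Cloudberry pipeline.

# Count files whose names start with "core" directly inside a directory.
# The "core*" pattern is an assumption; real names depend on kernel.core_pattern.
count_core_files() {
    find "$1" -maxdepth 1 -type f -name 'core*' | wc -l | tr -d ' '
}

# Map (number of cores, enable_check_core setting) to a result status.
#   no cores                      -> pass
#   cores and checking enabled    -> fail
#   cores and checking disabled   -> warning
core_check_status() {
    local core_count="$1" enable_check_core="$2"
    if [ "$core_count" -eq 0 ]; then
        echo "pass"
    elif [ "$enable_check_core" = "true" ]; then
        echo "fail"
    else
        echo "warning"
    fi
}

core_check_status 0 true     # prints "pass"
core_check_status 2 true     # prints "fail"
core_check_status 2 false    # prints "warning"
```

Keeping the status decision in one small function makes it easy to report the same result in both the per-test check and the final summary.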
@edespino can you help to bring |
Yes I will |
@yjhjstz FYI: installcheck-cbdb-parallel is now live: https://github.com/apache/cloudberry/actions/runs/12502691175 |
@edespino please help to set |
@yjhjstz This is already configured: cloudberry/src/test/regress/GNUmakefile Line 217 in a03d2b8
|
Apache Cloudberry version
main branch
What happened
The mirror_replay test suite is consistently generating a core dump during execution. This test is part of the greenplum_schedule running under the ic-good-opt-off (make -C src/test/regress installcheck-good) test matrix configuration. From the core dump's stack trace, the issue occurs specifically during the append-only segment file handling in the startup process.
Environment
Project: Apache Cloudberry
Test Suite: mirror_replay
Schedule: greenplum_schedule
Test Matrix Config: ic-good-opt-off
Build Type: Debug build with the following configuration:
Stack Trace
The core dump stack trace indicates the crash occurs during append-only segment file handling:
Impact
What you think should happen instead
Analysis
How to reproduce
Ensure your system is capable of generating core files. Execute the following dev test execution command:
make -C src/test/regress installcheck-good
Issue reproduces consistently without additional steps
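As a minimal sketch of the first step, the following checks whether the current shell is allowed to write core files before the test run is started. The helper name `limit_allows_cores` is hypothetical and not part of the Cloudberry tree, and the `/proc/sys/kernel/core_pattern` path applies to Linux only.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check; not part of the Cloudberry source tree.

# Succeed if a `ulimit -c` value permits core files:
# "unlimited" or any positive number; 0 or anything else does not.
limit_allows_cores() {
    case "$1" in
        unlimited) return 0 ;;
        ''|*[!0-9]*) return 1 ;;
        *) [ "$1" -gt 0 ] ;;
    esac
}

# Try to raise the soft limit for this shell and its children.
# This fails when the hard limit is lower, so only warn in that case.
ulimit -c unlimited 2>/dev/null || echo "warning: could not raise the core size limit" >&2

if limit_allows_cores "$(ulimit -c)"; then
    echo "core dumps enabled (limit: $(ulimit -c))"
else
    echo "core dumps disabled; a crash will not leave a core file"
fi

# On Linux, kernel.core_pattern decides where core files are written.
if [ -r /proc/sys/kernel/core_pattern ]; then
    echo "core_pattern: $(cat /proc/sys/kernel/core_pattern)"
fi

# Then, from the top of the source tree, in the same shell:
#   make -C src/test/regress installcheck-good
```

The limit has to be raised in the same shell that launches the tests, because `ulimit` settings are inherited only by child processes.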
Operating System
Rocky Linux 9 (should be platform independent)
Anything else
Additional Context
The error occurs during the append-only truncate replay operation (ao_truncate_replay), suggesting potential issues with either:
Are you willing to submit PR?
Code of Conduct