-
Notifications
You must be signed in to change notification settings - Fork 136
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
checkberry gpcheckperf series updates are used to solve the problem that cbdb 1.5.2 version gpcheckperf does not display disk information #430
Conversation
|
I recommend rebasing your code on the latest upstream/main as some of the commits made on May 10th appear to be weird. |
88ea742
to
bde5625
Compare
After rebasing, the formats look good to me. Thanks. |
@August-beaulo Please don't change original commit, and add a new commit for your own logic. |
879e7e2
to
ea4e51a
Compare
conflicts need to be resolved |
conflicts resolved |
11ae324
to
0f90844
Compare
Security vulnerability was found in the scp program shipped with the openssh-clients package and CVSS score for this security vulnerability is 7.8 (https://access.redhat.com/security/cve/cve-2020-15778). Recommended action was to replace scp with rsync so, * Replaced scp with rsync in gpdb cm utilities and utility files * Renamed gpscp utility to gpsync after replacing scp with rsync in utility code
Gpcheckperf was throwing an exception when run with -f and -V option together. This was happening at with -V option, gpssh command outpuot is having few extra lines which are causing trouble while parsing the output. With this change, provided flag to skip verbose mode when running ssh command and used this non-verbose SSH mode to execute the command when getting host-name. Corrected run-time errors due to python3 in gpcheckperf. Also added the test case to cover the scenario relating the host file and -V option.
* gpcheckperf: Fix memory calculation function Currently, there is a bug in the `getMemory()` function in `gpcheckperf` because of the way we check the return code of the `run()` method which is called inside `getMemory()`. The `run()` method returns an integer value of `zero` in case of success and a `non-zero` value if it fails. We are checking this value using the condition `if not ok` which is incorrect because when the `run()` method succeeds (`ok = 0`), the condition would result as `False` causing the `getMemory()` function to assume that the `run()` method failed but in reality, it did not. A simple fix would be to change the condition from `if not ok` to `if ok != 0` to check for any possible failure from the `run()` method. Further, the way `getMemory()` handles errors is also incorrect. It just returns `0` whenever there is an error which can lead to the incorrect file size being used to perform disk performance tests. This is because the gpcheckperf utility internally calls `multidd` command to perform disk performance tests which also accepts a file size parameter with the -S option which when not set (or is equal to 0), uses its own default value of `2 * memory_size`, but instead of dividing it from the number of input directories, it uses this value for each directory i.e. the total file size value would be `2 * memory_size * no_of_input_directories`. Due to this, sometimes the user can meet File System Full error when the number of input directories is big. Hence we need to properly handle the error in the `getMemory()` function and exit from the code instead of just returning `0` Before: ``` $ gpcheckperf -d /tmp/test1 -d /tmp/test2 -d /tmp/test3 -h localhost -rd disk write avg time (sec): 5.46 disk write tot bytes: 12884901888 --> is equal to 2 * memory_size * no_of_input_directories disk write tot bandwidth (MB/s): 2250.55 disk write min bandwidth (MB/s): 2250.55 [localhost] disk write max bandwidth (MB/s): 2250.55 [localhost] ``` After: ``` $ gpcheckperf -d /tmp/test1 -d /tmp/test2 -d /tmp/test3 -h localhost -rd disk write avg time (sec): 1.87 disk write tot bytes: 4295000064 --> is equal to 2 * memory_size disk write tot bandwidth (MB/s): 2190.39 disk write min bandwidth (MB/s): 2190.39 [localhost] disk write max bandwidth (MB/s): 2190.39 [localhost] ``` Also added unit test cases to test the getMemory() function outputs and added a main section to the gpcheckperf code.
In gpcheckperf, a remaining "python" reference exists. This commit renames it to "python3" and allows it to run when the symbolic link /usr/bin/python pointing to /usr/bin/python3 does not exist. Tl;dr - An attempt was made to update all Greenplum utilities to reference 'python3'. A reference was left unchanged in gpcheckperf. Python 3 installations by default do not create the symbolic link /usr/bin/python --> /usr/bin/python3. This has been observed in Python 3 installation3 on Rockylinux 8 & 9 and Ubuntu 20.04 & 22.04. In the case of gpcheckperf, when invoked, the output below reveals the "Error" a user receives when /usr/bin/python does not exist: ``` gpadmin@cdw:~$ /usr/local/gp7/bin/gpcheckperf -h cdw -d /data -r d -S 1 -v -------------------- SETUP 2023-03-12T17:26:27.967030 -------------------- [Info] verify python interpreter exists [Info] /usr/local/gp7/bin/gpssh -h cdw 'python -c print' -------------------- TEARDOWN -------------------- [Info] /usr/local/gp7/bin/gpssh -h cdw 'rm -rf /data/gpcheckperf_$USER' [Error] unable to find python interpreter on some hosts verify PATH variables on the hosts gpadmin@cdw:~$ ```
Parity changes from 6X PR: https://github.com/greenplum-db/gpdb/pull/15192 Changes are as follows: 1. Controlling verbosity of gpssh() function through verbose parameter 2. Updating gpssh calls with verbose parameter 3. Removing non-required checks in print*Results() 4. Removing regex removal for b' and \r characters when getting hostname as it is fixed in gpssh. 5. Re-structured test cases for gpcheckperf
Issue: After updating gpssh to return string output, gpcheckperf failed to parse results. RCA: String before gpssh changes: "'multidd total bytes '" String after gpssh fix: "multidd total bytes '" In parseMultiDDResult(), str.Find() before gpssh changes were returning index 1 as it was getting string starting with "'". Post gpssh changes, substring is as the beginning of the string hence returning index 0. Fix: str.find() returns -1 if the substring is not found otherwise returns index. As the substring searched in parseMultiDDResult() is appearing at index zero, corrected condition to include index zero.
* gpcheckperf: add buffer size parameter (#14848) Before this patch, while running gpcheckperf utility the buffer was set by default for the underlying gpnetbenchClient utility as 32Kb. It led to problem with receiving annoying and misleading warnings about connections between hosts. Use '--buffer-size' flag with size in kilobytes to set buffer size, which will be used at gpnetbenchClient. It is an optional parameter. The default value is 8Kb.
Changes to fix gpcheckperf failure with an exception when sometime time command output has a comma separator instead of a dot. Steps to reproduce issue: $export LC_ALL=de_DE_utf8 $time sleep 1 real 0m1,021s user 0m0,001s sys 0m0,005s Fix: added check if comma present in the time output, replace it with a dot and continue parsing. Testing: Added unit tests to check the output of parseMultiDDResult() in case of comma and dot.
0f90844
to
d3f0ac7
Compare
b1ab2bd
to
7d5f64b
Compare
7d5f64b
to
bd72162
Compare
This commit updates the README.md file by replacing all occurrences of "Greenplum" with "cdbd". This commit includes the following changes: 1. README.md: - Replaced all instances of "Greenplum" with "Cloudberry" to reflect the new naming convention. 2. Makefile: - Added new libraries to GPDEMO_LIBS. - Added new targets for `gpdemo` and `gpshrink`. - Included a command to create the directory '$(DESTDIR)$(bindir)/lib/gpdemo': 3.gpshrink: -update Scp -> Sync $(MKDIR_P) '$(DESTDIR)$(bindir)/lib/gpdemo' Related to Issue#id <apache#421>
bd72162
to
b559eec
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
fix #421 cbdb 1.5.2 version gpcheckperf does not display disk information
Change logs
Changes to fix gpcheckperf failure with an exception when sometime time command output
has a comma separator instead of a dot.
Before this patch, while running gpcheckperf utility the buffer was set by default for the underlying gpnetbenchClient utility as 32Kb. It led to problem with receiving annoying and misleading warnings about connections between hosts.
Use '--buffer-size' flag with size in kilobytes to set buffer size, which will be used at gpnetbenchClient. It is an optional parameter. The default value is 8Kb.
Update gpcheckperf makefile README.md from greenplum 2 cbdb & add gpshrink & gpdemo based on the greenplum gpcheckperf command parameter and function call update.
etc.
Why are the changes needed?
The open source database gp has made a series of updates to the gpcheckperf tool, including
Does this PR introduce any user-facing change?
no
How was this patch tested?
Contributor's Checklist
Here are some reminders and checklists before/when submitting your pull request, please check them:
make installcheck
make -C src/test installcheck-cbdb-parallel
cloudberrydb/dev
team for review and approval when your PR is ready🥳