Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

checkberry gpcheckperf series updates are used to solve the problem that cbdb 1.5.2 version gpcheckperf does not display disk information #430

Merged
merged 9 commits into from
May 28, 2024

Conversation

August-beaulo
Copy link
Contributor

fix #421 cbdb 1.5.2 version gpcheckperf does not display disk information


Change logs

  1. Changes to fix gpcheckperf failure with an exception when sometime time command output
    has a comma separator instead of a dot.

  2. Before this patch, while running gpcheckperf utility the buffer was set by default for the underlying gpnetbenchClient utility as 32Kb. It led to problem with receiving annoying and misleading warnings about connections between hosts.
    Use '--buffer-size' flag with size in kilobytes to set buffer size, which will be used at gpnetbenchClient. It is an optional parameter. The default value is 8Kb.

  3. Update gpcheckperf makefile README.md from greenplum 2 cbdb & add gpshrink & gpdemo based on the greenplum gpcheckperf command parameter and function call update.

etc.

Why are the changes needed?

The open source database gp has made a series of updates to the gpcheckperf tool, including

  1. Replace scp with rsync
  2. .Fixing gpcheckperf failure on -V with -f optio
  3. gpcheckperf: Fix memory calculation function
  4. gpcheckperf - Update a Python 3 reference of "python" to "python3"
  5. gpcheckperf: incorporating parity changes from 6X PR
  6. gpcheckperf: fixing string parsing in parseMultiDDResult()
  7. gpcheckperf: add buffer size parameter
  8. Fixing if the time command has comma in the output

Does this PR introduce any user-facing change?

no

How was this patch tested?

Contributor's Checklist

Here are some reminders and checklists before/when submitting your pull request, please check them:

  • Make sure your Pull Request has a clear title and commit message. You can take git-commit template as a reference.
  • Sign the Contributor License Agreement as prompted for your first-time contribution(One-time setup).
  • Learn the coding contribution guide, including our code conventions, workflow and more.
  • List your communication in the GitHub Issues or Discussions (if has or needed).
  • Document changes.
  • Add tests for the change
  • Pass make installcheck
  • Pass make -C src/test installcheck-cbdb-parallel
  • Feel free to request cloudberrydb/dev team for review and approval when your PR is ready🥳

@CLAassistant
Copy link

CLAassistant commented May 10, 2024

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 5 committers have signed the CLA.

✅ August-beaulo
❌ Annu149
❌ piyushc01
❌ jnihal
❌ red1452
You have signed the CLA already but the status is still pending? Let us recheck it.

@tuhaihe
Copy link
Member

tuhaihe commented May 10, 2024

I recommend rebasing your code on the latest upstream/main as some of the commits made on May 10th appear to be weird.

@tuhaihe
Copy link
Member

tuhaihe commented May 10, 2024

I recommend rebasing your code on the latest upstream/main as some of the commits made on May 10th appear to be weird.

After rebasing, the formats look good to me. Thanks.

@my-ship-it
Copy link
Contributor

@August-beaulo Please don't change original commit, and add a new commit for your own logic.

@avamingli avamingli added the cherry-pick cherry-pick upstream commts label May 13, 2024
@August-beaulo August-beaulo force-pushed the gpcheckperf branch 2 times, most recently from 879e7e2 to ea4e51a Compare May 15, 2024 08:08
@avamingli
Copy link
Contributor

conflicts need to be resolved

@August-beaulo
Copy link
Contributor Author

conflicts need to be resolved

conflicts resolved

@August-beaulo August-beaulo force-pushed the gpcheckperf branch 2 times, most recently from 11ae324 to 0f90844 Compare May 17, 2024 08:48
Annu149 and others added 8 commits May 17, 2024 17:10
Security vulnerability was found in the scp program shipped with the
openssh-clients package and CVSS score for this security vulnerability
is 7.8 (https://access.redhat.com/security/cve/cve-2020-15778).
Recommended action was to replace scp with rsync so,

 * Replaced scp with rsync in gpdb cm utilities and utility files
 * Renamed gpscp utility to gpsync after replacing scp with rsync in
   utility code
Gpcheckperf was throwing an exception when run with -f and -V option together.
This was happening at with -V option, gpssh command outpuot is having few
extra lines which are causing trouble while parsing the output. With this change, provided flag to skip verbose mode when running ssh command and used this non-verbose SSH mode to execute the command when getting host-name.
Corrected run-time errors due to python3 in gpcheckperf.
Also added the test case to cover the scenario relating the host file and -V option.
* gpcheckperf: Fix memory calculation function

Currently, there is a bug in the `getMemory()` function in `gpcheckperf` because of the way we check the return code of the `run()` method which is called inside `getMemory()`. The `run()` method returns an integer value of `zero` in case of success and a `non-zero` value if it fails.

We are checking this value using the condition `if not ok` which is incorrect because when the `run()` method succeeds (`ok = 0`), the condition would result as `False` causing the `getMemory()` function to assume that the `run()` method failed but in reality, it did not.

A simple fix would be to change the condition from `if not ok` to `if ok != 0` to check for any possible failure from the `run()` method.

Further, the way `getMemory()` handles errors is also incorrect. It just returns `0` whenever there is an error which can lead to the incorrect file size being used to perform disk performance tests. This is because the gpcheckperf utility internally calls `multidd` command to perform disk performance tests which also accepts a file size parameter with the -S option which when not set (or is equal to 0), uses its own default value of `2 * memory_size`, but instead of dividing it from the number of input directories, it uses this value for each directory i.e. the total file size value would be `2 * memory_size * no_of_input_directories`. Due to this, sometimes the user can meet File System Full error when the number of input directories is big.

Hence we need to properly handle the error in the `getMemory()` function and exit from the code instead of just returning `0`

Before:
```
$ gpcheckperf -d /tmp/test1 -d /tmp/test2 -d /tmp/test3 -h localhost -rd
 disk write avg time (sec): 5.46
 disk write tot bytes: 12884901888     --> is equal to 2 * memory_size * no_of_input_directories
 disk write tot bandwidth (MB/s): 2250.55
 disk write min bandwidth (MB/s): 2250.55 [localhost]
 disk write max bandwidth (MB/s): 2250.55 [localhost]
 ```
After:
```
$ gpcheckperf -d /tmp/test1 -d /tmp/test2 -d /tmp/test3 -h localhost -rd
 disk write avg time (sec): 1.87
 disk write tot bytes: 4295000064     --> is equal to 2 * memory_size
 disk write tot bandwidth (MB/s): 2190.39
 disk write min bandwidth (MB/s): 2190.39 [localhost]
 disk write max bandwidth (MB/s): 2190.39 [localhost]
```

Also added unit test cases to test the getMemory() function outputs and added a main section to the gpcheckperf code.
In gpcheckperf, a remaining "python" reference exists. This commit
renames it to "python3" and allows it to run when the symbolic link
/usr/bin/python pointing to /usr/bin/python3 does not exist.

Tl;dr - An attempt was made to update all Greenplum utilities to
reference 'python3'. A reference was left unchanged in gpcheckperf.

Python 3 installations by default do not create the symbolic link
/usr/bin/python --> /usr/bin/python3. This has been observed in Python
3 installation3 on Rockylinux 8 & 9 and Ubuntu 20.04 & 22.04.

In the case of gpcheckperf, when invoked, the output below reveals the
"Error" a user receives when /usr/bin/python does not exist:

```
gpadmin@cdw:~$ /usr/local/gp7/bin/gpcheckperf -h cdw -d /data -r d -S 1 -v
--------------------
  SETUP 2023-03-12T17:26:27.967030
--------------------
[Info] verify python interpreter exists
[Info] /usr/local/gp7/bin/gpssh -h cdw 'python -c print'
--------------------
  TEARDOWN
--------------------
[Info] /usr/local/gp7/bin/gpssh -h cdw 'rm -rf  /data/gpcheckperf_$USER'
[Error] unable to find python interpreter on some hosts
        verify PATH variables on the hosts
gpadmin@cdw:~$
```
Parity changes from 6X PR: https://github.com/greenplum-db/gpdb/pull/15192
Changes are as follows:
    1. Controlling verbosity of gpssh() function through verbose parameter
    2. Updating gpssh calls with verbose parameter
    3. Removing non-required checks in print*Results()
    4. Removing regex removal for b' and \r characters when getting hostname
       as it is fixed in gpssh.
    5. Re-structured test cases for gpcheckperf
Issue:
After updating gpssh to return string output, gpcheckperf failed to parse results.

RCA:
String before gpssh changes: "'multidd total bytes '"
String after gpssh fix: "multidd total bytes '"
In parseMultiDDResult(), str.Find() before gpssh changes were returning index 1
as it was getting string starting with "'".
Post gpssh changes, substring is as the beginning of the string hence returning index 0.

Fix:
str.find() returns -1 if the substring is not found otherwise returns index.
As the substring searched in parseMultiDDResult() is appearing at index zero,
corrected condition to include index zero.
* gpcheckperf: add buffer size parameter (#14848)

Before this patch, while running gpcheckperf utility the buffer was set by
default for the underlying gpnetbenchClient utility as 32Kb. It led to problem
with receiving annoying and misleading warnings about connections between
hosts.

Use '--buffer-size' flag with size in kilobytes to set buffer size,
which will be used at gpnetbenchClient.
It is an optional parameter. The default value is 8Kb.
Changes to fix gpcheckperf failure with an exception when sometime time command output
has a comma separator instead of a dot.
Steps to reproduce issue:

$export LC_ALL=de_DE_utf8
$time sleep 1
real	0m1,021s
user	0m0,001s
sys	0m0,005s

Fix: added check if comma present in the time output, replace it with a dot and continue parsing.

Testing: Added unit tests to check the output of parseMultiDDResult() in case of comma and dot.
@August-beaulo August-beaulo force-pushed the gpcheckperf branch 2 times, most recently from b1ab2bd to 7d5f64b Compare May 21, 2024 07:17
This commit updates the README.md file by replacing all occurrences of
"Greenplum" with "cdbd".

This commit includes the following changes:

1. README.md:
   - Replaced all instances of "Greenplum" with "Cloudberry" to reflect
 the new naming convention.

2. Makefile:
   - Added new libraries to GPDEMO_LIBS.
   - Added new targets for `gpdemo` and `gpshrink`.
   - Included a command to create the directory '$(DESTDIR)$(bindir)/lib/gpdemo':
3.gpshrink:
   -update Scp -> Sync
 $(MKDIR_P) '$(DESTDIR)$(bindir)/lib/gpdemo'

Related to Issue#id <apache#421>
Copy link
Contributor

@avamingli avamingli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@avamingli avamingli merged commit 6b5c6a2 into apache:main May 28, 2024
9 of 10 checks passed
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
cherry-pick cherry-pick upstream commts
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug] cbdb 1.5.2 version gpcheckperf does not display disk information
9 participants