factor: add -h/--exponents option #4710

Ideflop · 2023-04-05T19:49:29Z

This pull request adds support for the -h/--exponents option to the factor function, which was recently introduced in GNU coreutils 9.2.

With this option, the factors are printed in the form p^e, rather than repeating the prime p, e times.

For example :

$ ./factor -h 3000
3000: 2^3 3 5^3

This is implemented with a new module that sets a mutable global variable when the option is provided. The Display implementation for Factors has been updated to use this global variable when printing factors. Tests have been added to cover the new option.

This pull request Fixes #4682

src/uu/factor/src/cli.rs

Ideflop · 2023-04-06T13:25:27Z

can we do it in a different way? like adding an parameter/option to factor

Yes

I have removed the unsafe code, and now the function factor takes a parameter print_exponents, and the struct Factor has print_exponents as a field. So the function fmt now checks if the field print_exponents is true.

sylvestre · 2023-04-07T08:16:17Z

much better, thanks

did you bench the performances ?

Ideflop · 2023-04-07T14:05:00Z

did you bench the performances ?

Yes, there are no performance differences.

I have added workplace and upgraded the rand_chacha crate from version 2.2 to 3.1 in the Cargo.toml file, as it was required for compilation. I have also added benchmark tests for when print_exponents is set to true. The rest of the changes are based on clippy suggestions.

sylvestre · 2023-04-07T19:40:49Z

Yes, there are no performance differences.

Sorry, I meant GNU vs yours :)

Ideflop · 2023-04-08T08:54:59Z

Running the following command: seq 2 $((10 ** 8)) | factor 10 times, I obtained the following results for the average time taken:

For GNU factor: 29436 ms
For old uu-factor: 90937 ms
For new uu-factor: 91032 ms

Running the same command with the -h option:

For GNU factor: 28965 ms (1.7% improvement over GNU factor)
For new uu-factor: 89776 ms (1.4% improvement over new uu-factor)

anastygnome · 2023-04-09T13:36:39Z

src/uu/factor/src/factor.rs

-pub struct Factors(RefCell<Decomposition>);
+pub struct Factors {
+    decomposition: RefCell<Decomposition>,
+    print_exponents: bool,


Hey, nice PR!

Hmm, you shouldn't get this inside the factor struct, as Rust already has an integrated mechanism to use an alternate format when there are two ways to convert to a string.

This is showcased in the Formatter documentation

This way, we avoid adding a boolean to the factor struct, which is a bit weird (factors don't change), and can cause a misalignment in memory that reduces performance.

You can use it like so in print_factors_str:

if print_exponents { writeln!(factors_buffer, "{}:{:#}", x, factor(x))?; } else { writeln!(factors_buffer, "{}:{}", x, factor(x))?; }

and modify the implementation of fmt like so :

impl fmt::Display for Factors { fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { let include_exponents = f.alternate(); if include_exponents { write!(f, " {p}^{exp}",) } else { write!(f, " {p}".repeat(*exp))?; } } } }

I have used repeat to avoid unnecessary write! calls

@sylvestre what do you think?

Much better indeed

@anastygnome clever approach, I learned something new :)

@cakebaker Yeah, I stumbled upon that while looking for something the other day.
As a sidenote, please comment such clever tricks if you ever use it in code
Glad you learned something new, and happy Easter!

I ran benchmark tests using the command seq 2 $((10 ** 8)) | factor. The average elapsed time for executing this 10 times with the alternate() function and repeat() was 108422 ms, which is 30% slower compared to using the for loop with the alternate() function, which had an average elapsed time of 83422 ms.
So, in this situation, it appears that continuing with the for loop is better as it seems to be faster.

@Ideflop, I don't know how you benched it, but a simple timing is a trap on those apps, there is great variability across runs, and it turns out that both methods are the same, really

hyperfine --shell=zsh "for i in {1..10000} ; do ./target/release/factor_fmt $i; done" "for i in {1..10000} ; do ./target/release/factor_for $i; done" Benchmark 1: for i in {1..10000} ; do ./target/release/factor_fmt ; done Time (mean ± σ): 16.859 s ± 8.278 s [User: 8.346 s, System: 8.771 s] Range (min … max): 10.162 s … 30.896 s 10 runs Benchmark 2: for i in {1..10000} ; do ./target/release/factor_for ; done Time (mean ± σ): 15.710 s ± 1.880 s [User: 8.575 s, System: 7.346 s] Range (min … max): 12.902 s … 17.785 s 10 runs Summary 'for i in {1..10000} ; do ./target/release/factor_for ; done' ran 1.07 ± 0.54 times faster than 'for i in {1..10000} ; do ./target/release/factor_fmt ; done'

So this is a matter of taste, it seems :)

Indeed, there was a problem. I obtained the following results using hyperfine:

hyperfine --shell=bash "for i in {1..10000} ; do ./coreutils-old factor $i; done" "for i in {1..10000} ; do ./coreutils-for factor $i; done" "for i in {1..10000} ; do ./coreutils-repeat factor $i; done" Benchmark 1: for i in {1..10000} ; do ./coreutils-old factor ; done Time (mean ± σ): 19.188 s ± 1.018 s [User: 10.589 s, System: 8.938 s] Range (min … max): 18.167 s … 21.210 s 10 runs Benchmark 2: for i in {1..10000} ; do ./coreutils-for factor ; done Time (mean ± σ): 18.754 s ± 1.206 s [User: 10.459 s, System: 8.651 s] Range (min … max): 17.593 s … 21.410 s 10 runs Benchmark 3: for i in {1..10000} ; do ./coreutils-repeat factor ; done Time (mean ± σ): 19.741 s ± 1.005 s [User: 10.860 s, System: 9.217 s] Range (min … max): 18.215 s … 21.357 s 10 runs Summary 'for i in {1..10000} ; do ./coreutils-for factor ; done' ran 1.02 ± 0.09 times faster than 'for i in {1..10000} ; do ./coreutils-old factor ; done' 1.05 ± 0.09 times faster than 'for i in {1..10000} ; do ./coreutils-repeat factor ; done'

Using repeat is 5% slower (not 30%) compared to the for loop. Furthermore, the for loop was faster than the current implementation of uutils factor. At the moment, I will stick with the for loop, therefore.

Nice :) do mind the tolerance though, 5% plus or minus 9% means nothing, the difference is negligible just as in my test. Did you do the tests with a release build? (compiled with --release)?

Yes, I have tested it with the release branch. After conducting further tests, I found no difference between the current implementation of factor and the new one. However, I observed that using repeat was always slightly slower. Therefore, I have decided to keep the for loop instead of replacing it with repeat because the current implementation of factor from uutils is slower than the GNU version of factor.

Ideflop · 2023-04-11T21:18:56Z

I have removed the print_exponent inside the struct fields and used the alternate function as @anastygnome suggested ( see #4710 (comment) ).
I kept the for loop because it is faster than the repeat() function.

anastygnome

Great!

Can you just rebase it to make less noise in the PR? Maybe squash the first few?

Ideflop · 2023-04-12T16:32:06Z

Sure, I will rebased the PR .

anastygnome · 2023-04-14T09:10:09Z

@tertsdiepraam lgtm

sylvestre · 2023-04-14T19:20:07Z

Thanks!

factor: add -h/--exponents option

0c13bae

sylvestre reviewed Apr 6, 2023

View reviewed changes

src/uu/factor/src/cli.rs Outdated Show resolved Hide resolved

sylvestre force-pushed the factor-add-exponents-option branch from 898c837 to 0a2dd31 Compare April 7, 2023 08:15

anastygnome reviewed Apr 9, 2023

View reviewed changes

anastygnome approved these changes Apr 12, 2023

View reviewed changes

uutils deleted a comment from github-actions bot Apr 12, 2023

Factor: Rebase, add -h/--exponents options

f4e2a19

Ideflop force-pushed the factor-add-exponents-option branch from 6223ec8 to f4e2a19 Compare April 14, 2023 08:53

sylvestre merged commit 6a54d82 into uutils:main Apr 14, 2023

Ideflop deleted the factor-add-exponents-option branch April 14, 2023 21:00

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

factor: add -h/--exponents option #4710

factor: add -h/--exponents option #4710

Ideflop commented Apr 5, 2023

Ideflop commented Apr 6, 2023

sylvestre commented Apr 7, 2023

Ideflop commented Apr 7, 2023

sylvestre commented Apr 7, 2023

Ideflop commented Apr 8, 2023

anastygnome Apr 9, 2023 •

edited

Loading

anastygnome Apr 9, 2023

sylvestre Apr 9, 2023

cakebaker Apr 9, 2023

anastygnome Apr 9, 2023

Ideflop Apr 10, 2023

anastygnome Apr 10, 2023 •

edited

Loading

Ideflop Apr 10, 2023

anastygnome Apr 12, 2023 •

edited

Loading

Ideflop Apr 12, 2023 •

edited

Loading

Ideflop commented Apr 11, 2023

anastygnome left a comment •

edited

Loading

Ideflop commented Apr 12, 2023 •

edited

Loading

anastygnome commented Apr 14, 2023

sylvestre commented Apr 14, 2023

factor: add -h/--exponents option #4710

factor: add -h/--exponents option #4710

Conversation

Ideflop commented Apr 5, 2023

Ideflop commented Apr 6, 2023

sylvestre commented Apr 7, 2023

Ideflop commented Apr 7, 2023

sylvestre commented Apr 7, 2023

Ideflop commented Apr 8, 2023

anastygnome Apr 9, 2023 • edited Loading

Choose a reason for hiding this comment

anastygnome Apr 9, 2023

Choose a reason for hiding this comment

sylvestre Apr 9, 2023

Choose a reason for hiding this comment

cakebaker Apr 9, 2023

Choose a reason for hiding this comment

anastygnome Apr 9, 2023

Choose a reason for hiding this comment

Ideflop Apr 10, 2023

Choose a reason for hiding this comment

anastygnome Apr 10, 2023 • edited Loading

Choose a reason for hiding this comment

Ideflop Apr 10, 2023

Choose a reason for hiding this comment

anastygnome Apr 12, 2023 • edited Loading

Choose a reason for hiding this comment

Ideflop Apr 12, 2023 • edited Loading

Choose a reason for hiding this comment

Ideflop commented Apr 11, 2023

anastygnome left a comment • edited Loading

Choose a reason for hiding this comment

Ideflop commented Apr 12, 2023 • edited Loading

anastygnome commented Apr 14, 2023

sylvestre commented Apr 14, 2023

anastygnome Apr 9, 2023 •

edited

Loading

anastygnome Apr 10, 2023 •

edited

Loading

anastygnome Apr 12, 2023 •

edited

Loading

Ideflop Apr 12, 2023 •

edited

Loading

anastygnome left a comment •

edited

Loading

Ideflop commented Apr 12, 2023 •

edited

Loading