-
-
Notifications
You must be signed in to change notification settings - Fork 189
Conversation
Hm, lots of reported issues by Psalm. |
Is it possible to configure cs-fixer so that it spits out the offending line?
|
Simply run php-cs-fixer on your machine like so:
To be honest, though, I would simply run
and be done with it :) |
Got i! Looks like it wants to remove the ConQAT Apache license header? |
Yes. |
Is this legally correct? IIRC, BSD license is more permissive than Apache 2.0. |
I would think so. You wrote new PHP code, you are not copying code as-is from another project. |
Depends on your definition of "new". :) The code is manually transpiled from the original Java version. From my discussions on the #fsf IRC channel, if seemed like a good idea to keep it. |
Not done, need to update tests. |
OK, please run the Github Action suite. I did not add the SuffixtreeStrategy to the old test file, since it behaved slightly different. Main problem was how it behaves when all tokens repeat in a small pattern, e..g with fixtures a.php, b.php and c.php. Instead I added a new test for edit distance inside a clone. One todo is to make EditDistanceTest work with three files. For some reason, the report behaviour was different with three files than of two. It might be related to the part of the algorithm I didn't transpile (yet), GappedCloneConsumer, which detects clones with gaps, as part of a post-process step. I do think the new algorithm is still useful without it, though. |
Fix pushed. Didn't have xdebug installed locally, so missed the coverage issue. |
Codecov Report
@@ Coverage Diff @@
## master #199 +/- ##
=============================================
+ Coverage 29.19% 65.79% +36.59%
- Complexity 121 293 +172
=============================================
Files 12 22 +10
Lines 387 877 +490
=============================================
+ Hits 113 577 +464
- Misses 274 300 +26
Continue to review full report at Codecov.
|
You want me to act on the Codecov Report? Maybe also run the CI automatically at push? It's free, AFAIK. |
Hm, the last failing 8.1 test is due to phpunit phar issue, right? I'll just wait and see. :) |
Any updates about phpunit and PHP 8.1? |
Looks like it. |
Thanks! Sure, will look at those issues. |
I wish! Since PHP does not support generics, this is how you tell Psalm the types of the templates. |
The current index "+1" works in the "normal" case, but it's pretty east to "freak out" the algorithm and get nonsensical results, especially when you lower the token count. In the original implementation, they had an extra post-processing step that I didn't port over. It's a little bit open which type of code-base should be considered "normal", I guess. Maybe I can run a "call to action" and ask people to test the algorithm, and tell me how it reacts. So far, I've only used it on parts of the Yii framework and our own code-base (fairly big, lots of legacy code). Hm, or maybe I can run it on a couple of most popular composer packages and see... |
Hm, seems like there's a bug that causes an infinite loop now. Didn't see that before. |
Oh, it stopped, but it took like 15 minutes... Phew. It found duplicates in the symfony/string library successfully, among phpcpd's dependencies, e.g.
One issue was this:
So it doesn't report the scanned lines properly. |
Full description of algorithm here: https://www.cqse.eu/fileadmin/content/news/publications/2009-do-code-clones-matter.pdf
Original source available here: https://www.cqse.eu/en/news/blog/conqat-end-of-life/
Possible to detect Type 3 clones.
Example run:
Question:
To be able to analyze more than one file, the input should not be "file" but a list of tokens. Maybe there's a memory saving mechanism in place right now, where only the hashes are saved between analysis? Which works for the Rabin-Karp default algorithm, but maybe not for the new one. Have to double check.