Toom/FFT multiplication #227
05bdfa4
to
c962e52
So my local astyle 3.1 (current) said it was already well formatted, but the older astyle 2.06 had a different opinion about that.
IMO that's a regression in the current astyle!
I think it might also be good to add these pari/gp programs as separate scripts in a subdirectory for comparison. Some superficial review comments:
Anything wrong with:
Yes, the tuning program contains testing code (it is already comparing the new code against the old one for benchmarking, so it wasn't a lot of work to add a That program is included in "All checks have passed" means the new functions are correct ("as far as tested" and so on; the usual caveats apply here, too, of course).
The FFT functions have an additional run-time test, but that is just casting out nines. It catches the point where the rounding errors start to appear, but it is not strictly deterministic. If you remember from primary school: it has false positives.
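To illustrate why a casting-out-nines check can pass a wrong result, here is a minimal sketch in C. It is not the run-time test from this PR: the names `mod9`, `nines_check` and `nines_demo` are hypothetical, and it works on plain machine integers instead of `mp_int`. It only shows the principle that a residue check can prove a product wrong but can never prove it right.

```c
#include <assert.h>
#include <stdint.h>

/* Residue modulo 9; congruent to the decimal digit sum ("casting out nines"). */
uint32_t mod9(uint64_t x)
{
    return (uint32_t)(x % 9u);
}

/* Returns 1 if `product` is consistent with a * b modulo 9, 0 otherwise.
 * A 0 proves the product is wrong; a 1 only means no error was detected. */
int nines_check(uint32_t a, uint32_t b, uint64_t product)
{
    return ((mod9(a) * mod9(b)) % 9u) == mod9(product);
}

/* Hypothetical demonstration, not part of the PR. */
void nines_demo(void)
{
    uint32_t a = 123456789u;
    uint32_t b = 987654321u;
    uint64_t good = (uint64_t)a * b;

    assert(nines_check(a, b, good));         /* the correct product passes        */
    assert(!nines_check(a, b, good + 1u));   /* an off-by-one error is caught     */
    assert(nines_check(a, b, good + 9u));    /* an error of exactly 9 is not seen */
}
```

Any error that is a multiple of the modulus slips through, which is the weakness described above.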
Yes, and it has been done in the aforementioned program. T-C-[2,3,4](mul|sqr) and FFT(mul|sqr) are tested individually.
It has been stripped down quite a bit already ;-)
Yes. The two commits (TC and FFT) are more or less independent and can be separated, but it would need quite a bit of work to do so. An even finer grid would be a lot of work, though, and to be honest, I cannot see the advantage that would justify the work involved in making just TC and FFT completely independent. OK, making small logical units has a lot of advantages, but ripping things apart for the sake of ripping them apart is not something I can support with a clean conscience. Also: even if I separate them, all the glue ( I cannot just C&P the ten tests from
I don't use unfree code, out of principle! But you mean Public Domain code, which is a very different kind of beast, mainly because it doesn't really exist anymore in most of the jurisdictions of the "developed world". For some reason (money?) the lobbyists were able to persuade the governments that Public Domain is something very icky.
No code has been copied, all of it has been written from scratch. That's the reason it took me so long to prepare it for LTM. My private T-C implementations had code from Bodrato and Zimmermann in them, both GPL-3, which I had to replace with my own, derived from the descriptions in the papers cited in my code. It's mostly the interpolation part that is of any interest; the rest is just optimizing for memory by reusing variables and for calculations by reusing results. Just shoving things to and fro until it looks good and runs well ;-) It is also highly simplified now. You might take a look at the implementation of T-C-3 by Zimmermann for contrast ;-)
A lot of it is from Jörg Arndt's book "Matters Computational", including the idea to optimize the Hartley transform (actually, the idea to use the Hartley transform in the first place) and his way of avoiding the bit-reverse part. I was not able to use any of his code for LTM because it is GPL-3, only the descriptions and pseudocode in his book. That leaves the question whether it was OK for me, legally, to use Jörg Arndt's book in that manner. While I was looking for a paper online, Google offered me a piece of software from Harvard for atmospheric modelling whose authors used J. Arndt's book to implement the Hartley transform, too. Their license, although very close to PD, is still a license and not PD, so I couldn't even use that. (I think I made a note in I will give it a try and ask Jörg himself, but I don't have a lot of hope for an answer; his inbox is most probably full of questions regarding his book *sigh*
The license is also the reason I didn't offer the fast-division part of it now. No, found it.
It was indeed a patch meant for LTM and the year fits, too. It was less than 10 years ago that I wrote my version, and it is also straight from the paper; even the variable names are the same. I doubt I took anything from it (I would have made a note at a prominent place if I did), but I'll put a link to the Sourceforge archive in the head of my file, just to be sure.
The Newton division seems to need some work. It is correct, but benchmarking showed some odd slowdowns at unexpected places, so I most probably put some kind of bummer in it somewhere and have not yet found the time to look for it. Fast division, B-Z at least, is necessary to implement fast number conversion, which is the last addition I would like LTM to have so I can restart my work on LibTomFloat. I stopped it when I found out that I forgot to implement the guard bits for proper rounding but was sure I already did, which was definitely one of the bigger forehead-flattening "Aaargh!" moments in my programming life ;-)
b12dedf
to
6e2f86d
I did and got an answer:
Just re-assign everything you need(*) to whatever license. Suggest to put "by permission of the author" or something to that effect. (*) includes code from the fxt lib not printed, localized FHT comes to mind.
I "stamped" every file with his permission where there is a chance that I used any of his stuff,
6e2f86d
to
c23ead9
@minad et al.: found some time this evening to shovel the tests in Is it because Also: call me stingy if you want, but I don't like to use precious entropy for the PRNG if it is not needed at all for an individual test that just needs a bag of mixed bits, like the tests for the multiplication functions. Another disadvantage is the fact that it is not repeatable. So I would like to use a deterministic PRNG as the source of non-cryptographic bits for the non-cryptographic tests. My go-to function for that purpose is the 64-bit variant of Bob Jenkins' PRNG. It has a good mix, good avalanche and is sufficiently fast. That together with a fixed seed (
Maybe it got somehow lost on the way but I think it should be there. If needed, add it.
Yes, we could add a deterministic PRNG for this purpose. I recently used xoroshiro, which is simpler than Jenkins.
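A deterministic source of test bits along the lines discussed above could look like the following sketch. It follows the 64-bit variant of Bob Jenkins' small non-cryptographic PRNG as far as I recall his published description (rotation constants 7, 13, 37 and 20 warm-up rounds), so check it against his original before relying on it. The seed value and the function `repeatability_demo` are made up for illustration and are not from this PR.

```c
#include <assert.h>
#include <stdint.h>

/* State of the generator: four 64-bit words. */
typedef struct {
    uint64_t a, b, c, d;
} ranctx;

#define ROT64(x, k) (((x) << (k)) | ((x) >> (64 - (k))))

/* One step of the generator; returns 64 non-cryptographic bits. */
uint64_t ranval(ranctx *x)
{
    uint64_t e = x->a - ROT64(x->b, 7);
    x->a = x->b ^ ROT64(x->c, 13);
    x->b = x->c + ROT64(x->d, 37);
    x->c = x->d + e;
    x->d = e + x->a;
    return x->d;
}

/* Seed the state and discard the first 20 outputs to mix the seed in. */
void raninit(ranctx *x, uint64_t seed)
{
    int i;
    x->a = 0xf1ea5eedu;
    x->b = x->c = x->d = seed;
    for (i = 0; i < 20; i++) {
        (void)ranval(x);
    }
}

/* Hypothetical check: the same fixed seed always yields the same sequence. */
void repeatability_demo(void)
{
    ranctx r1, r2;
    int i;
    raninit(&r1, 0x1234567890abcdefu);   /* arbitrary fixed seed (assumption) */
    raninit(&r2, 0x1234567890abcdefu);
    for (i = 0; i < 100; i++) {
        assert(ranval(&r1) == ranval(&r2));
    }
}
```

With a fixed seed, a failing test can be re-run with exactly the same operands, which is not possible when the bits come from the system's entropy source.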
I'm still no fan of "adding private headers to what-should-be API tests", but until we've got this pulled apart properly I agree to do it like that.
@czurnieden If you have time, you might want to update your PRs to the current state on develop. After this consolidation work, I personally feel better about adding functionality. Furthermore, you can test private functions in the unit tests. Also, some of your work, like the prime sieve, goes in the right direction concerning #243.
Sorry for the silence, but I had to update my machine because LTS stopped last month. It failed completely (but my machine was a mess; it would have made me wonder if it had worked flawlessly ;-) ), so I had to install a new one, am still plucking my stuff out of the backups, cursing the Gnome and systemd developers, and trying to find out what's amiss.
Not at all, but that shall not matter.
I think it cannot be done with a simple rebase now, probably needs more work, hence a bit more time than what is usual with me.
List, please, so I can concentrate on what is wanted.
OK, so prime sieve first. I had some questions regarding the naming and scope of some of the constants and types I need for the sieve. Did you make a decision, or do you want me to try my own teeth at it? BTW: why did you change the title? Is it a hint for me re. cutting my stuff into logical parts? ;-)
The usual issues ;)
My focus until now was (or is) working on consolidation of the code base (testing, API, ...). So I am mostly interested in additions which improve/generalize what we have now. Also throwing away (or deprecating) things which got obsolete etc. But this is my personal preference; you have your own goals, I guess. So there is nothing barring us from adding more useful stuff. Since you are also adding more special cases to mp_mul, I hope we can get #262 or something like it. I hope we can add stuff without degrading the quality of what we have now. Or even better - if you discover some old warts while working on new stuff, make a PR improving on those!
I think we have pretty good guidelines now? Generally put everything into tommath_private if possible.
:)
@czurnieden Can we close this one in favor of the split-up PRs you are proposing now? For example #265
Yes.
You can also close ;)
Toom-Cook 4, 5 and FFT (radix-2 Hartley transform, nothing fancy)
etc/tune has also been updated to tune the additional algorithms. Toom-Cook-3 has been updated to allow to "proof" the algorithm in the same way as Toom-Cook 4 and 5: they contain comments that can be fed into pari/gp.
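The kind of check those pari/gp comments make possible can also be sketched in C on small coefficients: evaluate both operands at five points, multiply pointwise, interpolate, and compare with the schoolbook result. The sketch below assumes the point set 0, 1, -1, -2 and infinity with a Bodrato-style interpolation sequence; the PR's actual point set and sequence may differ, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Coefficients c[0..4] of a(x)*b(x) for quadratic a and b, using only
 * five multiplications (assumed points: 0, 1, -1, -2, infinity). */
void toom3_coeffs(const int64_t a[3], const int64_t b[3], int64_t c[5])
{
    /* Evaluation and pointwise multiplication. */
    int64_t w0   = a[0] * b[0];
    int64_t w1   = (a[0] + a[1] + a[2]) * (b[0] + b[1] + b[2]);
    int64_t wm1  = (a[0] - a[1] + a[2]) * (b[0] - b[1] + b[2]);
    int64_t wm2  = (a[0] - 2 * a[1] + 4 * a[2]) * (b[0] - 2 * b[1] + 4 * b[2]);
    int64_t winf = a[2] * b[2];

    /* Interpolation; every division below is exact. */
    int64_t r3 = (wm2 - w1) / 3;
    int64_t r1 = (w1 - wm1) / 2;
    int64_t r2 = wm1 - w0;
    r3 = (r2 - r3) / 2 + 2 * winf;
    r2 = r2 + r1 - winf;
    r1 = r1 - r3;

    c[0] = w0; c[1] = r1; c[2] = r2; c[3] = r3; c[4] = winf;
}

/* Reference: plain schoolbook polynomial multiplication (nine multiplications). */
void schoolbook_coeffs(const int64_t a[3], const int64_t b[3], int64_t c[5])
{
    int i, j;
    for (i = 0; i < 5; i++) {
        c[i] = 0;
    }
    for (i = 0; i < 3; i++) {
        for (j = 0; j < 3; j++) {
            c[i + j] += a[i] * b[j];
        }
    }
}

/* Returns 1 if both methods agree for the given coefficients, 0 otherwise. */
int toom3_matches(int64_t a0, int64_t a1, int64_t a2,
                  int64_t b0, int64_t b1, int64_t b2)
{
    int64_t a[3], b[3], t[5], s[5];
    int i;
    a[0] = a0; a[1] = a1; a[2] = a2;
    b[0] = b0; b[1] = b1; b[2] = b2;
    toom3_coeffs(a, b, t);
    schoolbook_coeffs(a, b, s);
    for (i = 0; i < 5; i++) {
        if (t[i] != s[i]) {
            return 0;
        }
    }
    return 1;
}
```

A symbolic check in pari/gp covers all inputs at once; this numeric version only spot-checks the interpolation for the values it is called with.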