Skip to content

Commit

Permalink
CHANGELOG formatting fixes
Browse files Browse the repository at this point in the history
  • Loading branch information
piskvorky committed Oct 19, 2020
1 parent 4afd863 commit cf33e57
Showing 1 changed file with 14 additions and 12 deletions.
26 changes: 14 additions & 12 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,27 +1,30 @@
Changes
=======

## 4.0.0beta, Unreleased FIXME 2020-10-??
## 4.0.0beta, FIXME 2020-10-??

**⚠️ Gensim 4.0 contains breaking API changes! See the [Migration guide](https://github.com/RaRe-Technologies/gensim/wiki/Migrating-from-Gensim-3.x-to-4) to update your existing Gensim 3.x code and models.**

Gensim 4.0 is a major release with lots of performance & robustness improvements and a [new website](https://radimrehurek.com/gensim_4.0.0).
Gensim 4.0 is a major release with lots of performance & robustness improvements and a new website.

### Main highlights (see also *👍 Improvements* below)

* Massively optimized popular algorithms the community has grown to love: [fastText](https://radimrehurek.com/gensim/models/fasttext.html), [word2vec](https://radimrehurek.com/gensim/models/word2vec.html), [doc2vec](https://radimrehurek.com/gensim/models/doc2vec.html), [phrases](https://radimrehurek.com/gensim/models/phrases.html):
a. **Efficiency**.

a. **Efficiency**

| model | 3.8.3: wall time / peak RAM / throughput | 4.0.0: wall time / peak RAM / throughput |
|----------|------------|--------|
| fastText | 2.9h / 4.11 GB / 822k words/s | 2.3h / **1.26 GB** / 914k words/s |
| word2vec | 1.7h / 0.36 GB / 1685k words/s | **1.2h** / 0.33 GB / 1762k words/s |

In other words, **fastText now needs 3x less RAM** (and is faster); word2vec has 2x faster init (and needs less RAM, and is faster); detecting collocation phrases is 2x faster. ([4.0 benchmarks](https://github.com/RaRe-Technologies/gensim/issues/2887#issuecomment-711097334)))

b. **Robustness**. We fixed a bunch of long-standing bugs by refactoring the internal code structure (see 🔴 Bug fixes below)

c. **Simplified OOP model** for easier model exports and integration with TensorFlow, PyTorch &co.

Most of these updates come to you transparently aka "for free", but see [Migration guide](https://github.com/RaRe-Technologies/gensim/wiki/Migrating-from-Gensim-3.x-to-4) for some changes that break the old Gensim 3.x API. **Update your code accordingly**.
These improvements come to you transparently aka "for free", but see [Migration guide](https://github.com/RaRe-Technologies/gensim/wiki/Migrating-from-Gensim-3.x-to-4) for some changes that break the old Gensim 3.x API. **Update your code accordingly**.

* Dropped a bunch of externally contributed modules: summarization, pivoted TFIDF normalization, FIXME.
- Code quality was not up to our standards. Also there was no one to maintain them, answer user questions, support these modules.
Expand All @@ -39,7 +42,7 @@ This is the direction we'll keep going forward: less kitchen-sink of "latest aca

### Why pre-release?

This 4.0.0beta pre-release is for users who want the cutting edge models: extra performance, bug fixes. Plus users who want to help out, by testing and providing feedback: code, documentation, workflows… Please let us know on the [mailing list](https://groups.google.com/forum/#!forum/gensim)!
This 4.0.0beta pre-release is for users who want the **cutting edge performance and bug fixes**. Plus users who want to help out, by **testing and providing feedback**: code, documentation, workflows… Please let us know on the [mailing list](https://groups.google.com/forum/#!forum/gensim)!

Install the pre-release with:

Expand Down Expand Up @@ -73,12 +76,12 @@ Production stability is important to Gensim, so we're improving the process of *
* [#2968](https://github.com/RaRe-Technologies/gensim/pull/2968): Migrate tutorials & how-tos to 4.0.0, by [@piskvorky](https://github.com/piskvorky)
* [#2899](https://github.com/RaRe-Technologies/gensim/pull/2899): Clean up of language and formatting of docstrings, by [@piskvorky](https://github.com/piskvorky)
* [#2899](https://github.com/RaRe-Technologies/gensim/pull/2899): Added documentation for NMSLIB indexer, by [@piskvorky](https://github.com/piskvorky)
* [#2832](https://github.com/RaRe-Technologies/gensim/pull/2832): Clear up LdaModel documentation - remove claim that it accepts CSC matrix as input, by [@FyzHsn](https://github.com/FyzHsn)
* [#2832](https://github.com/RaRe-Technologies/gensim/pull/2832): Clear up LdaModel documentation by [@FyzHsn](https://github.com/FyzHsn)
* [#2871](https://github.com/RaRe-Technologies/gensim/pull/2871): Clarify that license is LGPL-2.1, by [@pombredanne](https://github.com/pombredanne)
* [#2896](https://github.com/RaRe-Technologies/gensim/pull/2896): Make docs clearer on `alpha` parameter in LDA model, by [@xh2](https://github.com/xh2)
* [#2897](https://github.com/RaRe-Technologies/gensim/pull/2897): Update Hoffman paper link for Online LDA, by [@xh2](https://github.com/xh2)
* [#2910](https://github.com/RaRe-Technologies/gensim/pull/2910): Refresh docs for run_annoy tutorial, by [@piskvorky](https://github.com/piskvorky)
* [#2935](https://github.com/RaRe-Technologies/gensim/pull/2935): Fix "generator" language in word2vec docs, by [@polm](https://github.com/polm))
* [#2935](https://github.com/RaRe-Technologies/gensim/pull/2935): Fix "generator" language in word2vec docs, by [@polm](https://github.com/polm)

### :red_circle: Bug fixes

Expand All @@ -93,21 +96,20 @@ Production stability is important to Gensim, so we're improving the process of *

### :warning: Removed functionality & deprecations

* [#6](https://github.com/RaRe-Technologies/gensim-wheels/pull/6): No more binary wheels for x32 platforms, by [menshikh-iv](https://github.com/menshikh-iv)
* [#2899](https://github.com/RaRe-Technologies/gensim/pull/2899): Renamed overly broad `similarities.index` to the more appropriate `similarities.annoy`, by [@piskvorky](https://github.com/piskvorky)
* [#2958](https://github.com/RaRe-Technologies/gensim/pull/2958): Remove gensim.summarization subpackage, docs and test data, by [@mpenkov](https://github.com/mpenkov)
* [#6](https://github.com/RaRe-Technologies/gensim-wheels/pull/6): No more binary wheels for x32 platforms, by [menshikh-iv](https://github.com/menshikh-iv)
* [#2926](https://github.com/RaRe-Technologies/gensim/pull/2926): Rename `num_words` to `topn` in dtm_coherence, by [@MeganStodel](https://github.com/MeganStodel)
* [#2937](https://github.com/RaRe-Technologies/gensim/pull/2937): Remove Keras dependency, by [@piskvorky](https://github.com/piskvorky)
* Removed all code, methods, attributes and functions marked as deprecated in 3.8.3.
* Removed all code, methods, attributes and functions marked as deprecated in [Gensim 3.8.3](https://github.com/RaRe-Technologies/gensim/releases/tag/3.8.3).

---


## :warning: 3.8.x will be the last Gensim version to support Py2.7. Starting with 4.0.0, Gensim will only support Py3.5 and above.


## 3.8.3, 2020-05-03

**:warning: 3.8.x will be the last Gensim version to support Py2.7. Starting with 4.0.0, Gensim will only support Py3.5 and above.**

This is primarily a bugfix release to bring back Py2.7 compatibility to gensim 3.8.

### :red_circle: Bug fixes
Expand Down

0 comments on commit cf33e57

Please # to comment.