
Bump charset-normalizer from 2.1.1 to 3.0.0 #1

Merged
merged 1 commit into main from dependabot/pip/charset-normalizer-3.0.0 on Nov 3, 2022

Conversation

dependabot[bot]
Contributor

@dependabot dependabot bot commented on behalf of github Nov 3, 2022

Bumps charset-normalizer from 2.1.1 to 3.0.0.

Release notes

Sourced from charset-normalizer's releases.

Version 3.0.0

3.0.0 (2022-10-20)

Added

  • Extend the capability of explain=True: when cp_isolation contains at most two entries (minimum one), the Mess-detector results are logged in detail
  • Support for alternative language frequency sets in charset_normalizer.assets.FREQUENCIES
  • Add parameter language_threshold to from_bytes, from_path and from_fp to adjust the minimum expected coherence ratio (see the sketch after this list)
  • normalizer --version now specifies whether the current version provides the extra speedup (meaning a mypyc-compiled wheel)
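
A minimal sketch of how the new options could be combined, assuming the 3.0.0 from_bytes signature; the payload bytes, isolated codecs and threshold value below are illustrative only:

from charset_normalizer import from_bytes

# Illustrative payload; any bytes object works here.
payload = "Некоторый пример текста.".encode("cp1251")

# explain=True logs the Mess-detector details (cp_isolation has two entries here);
# language_threshold raises the minimum coherence ratio a match must reach.
results = from_bytes(
    payload,
    cp_isolation=["cp1251", "utf_8"],
    explain=True,
    language_threshold=0.2,
)
best_guess = results.best()
if best_guess is not None:
    print(best_guess.encoding, best_guess.language)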

Changed

  • Build with static metadata (not pyproject.toml yet)
  • Make language detection stricter
  • Optional: the module md.py can be compiled with Mypyc to provide an extra speedup of up to 4x over v2.1

Fixed

  • CLI with option --normalize failing when a full file path was used
  • TooManyAccentuatedPlugin inducing false positives in the mess detection when too few alpha characters were fed to it
  • Sphinx warnings when generating the documentation

Removed

  • Coherence detector no longer returns 'Simple English'; it returns 'English' instead
  • Coherence detector no longer returns 'Classical Chinese'; it returns 'Chinese' instead
  • Breaking: Methods first() and best() from CharsetMatch
  • UTF-7 will no longer appear as "detected" without a recognized SIG/mark (it is unreliable and conflicts with ASCII)
  • Breaking: Class aliases CharsetDetector, CharsetDoctor, CharsetNormalizerMatch and CharsetNormalizerMatches (see the migration sketch below)
  • Breaking: Top-level function normalize
  • Breaking: Properties chaos_secondary_pass, coherence_non_latin and w_counter from CharsetMatch
  • Support for the backport unicodedata2

This is the last version (3.0.x) to support Python 3.6. We plan to drop it in 3.1.x.
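
Because the class aliases, the top-level normalize function and CharsetMatch.first()/best() are gone, v2-style call sites need to move to the plain functions. A minimal migration sketch, with a made-up file name and the old v2-style lines kept as comments for contrast:

from charset_normalizer import from_path

# v2-style code might have relied on the removed pieces, e.g.:
#   from charset_normalizer import CharsetDetector, normalize
#   normalize("./legacy-file.txt")
# In 3.0.0 the same intent is expressed with the plain functions:
results = from_path("./legacy-file.txt")  # illustrative path
best_guess = results.best()  # best() lives on the result set, not on CharsetMatch
if best_guess is not None:
    print(best_guess.encoding)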

Version 3.0.0rc1

This is the last pre-release. If everything goes well, I will publish the stable tag.

3.0.0rc1 (2022-10-18)

Added

  • Extend the capability of explain=True: when cp_isolation contains at most two entries (minimum one), the Mess-detector results are logged in detail
  • Support for alternative language frequency sets in charset_normalizer.assets.FREQUENCIES
  • Add parameter language_threshold to from_bytes, from_path and from_fp to adjust the minimum expected coherence ratio

Changed

  • Build with static metadata using 'build' frontend
  • Make language detection stricter

Fixed

  • CLI with option --normalize failing when a full file path was used
  • TooManyAccentuatedPlugin inducing false positives in the mess detection when too few alpha characters were fed to it

Removed

... (truncated)

Changelog

Sourced from charset-normalizer's changelog.

3.0.0 (2022-10-20)

Added

  • Extend the capability of explain=True: when cp_isolation contains at most two entries (minimum one), the Mess-detector results are logged in detail
  • Support for alternative language frequency sets in charset_normalizer.assets.FREQUENCIES
  • Add parameter language_threshold to from_bytes, from_path and from_fp to adjust the minimum expected coherence ratio
  • normalizer --version now specifies whether the current version provides the extra speedup (meaning a mypyc-compiled wheel)

Changed

  • Build with static metadata using 'build' frontend
  • Make the language detection stricter
  • Optional: the module md.py can be compiled with Mypyc to provide an extra speedup of up to 4x over v2.1

Fixed

  • CLI with option --normalize failing when a full file path was used
  • TooManyAccentuatedPlugin inducing false positives in the mess detection when too few alpha characters were fed to it
  • Sphinx warnings when generating the documentation

Removed

  • Coherence detector no longer returns 'Simple English'; it returns 'English' instead
  • Coherence detector no longer returns 'Classical Chinese'; it returns 'Chinese' instead
  • Breaking: Methods first() and best() from CharsetMatch
  • UTF-7 will no longer appear as "detected" without a recognized SIG/mark (it is unreliable and conflicts with ASCII)
  • Breaking: Class aliases CharsetDetector, CharsetDoctor, CharsetNormalizerMatch and CharsetNormalizerMatches
  • Breaking: Top-level function normalize
  • Breaking: Properties chaos_secondary_pass, coherence_non_latin and w_counter from CharsetMatch
  • Support for the backport unicodedata2

3.0.0rc1 (2022-10-18)

Added

  • Extend the capability of explain=True: when cp_isolation contains at most two entries (minimum one), the Mess-detector results are logged in detail
  • Support for alternative language frequency sets in charset_normalizer.assets.FREQUENCIES
  • Add parameter language_threshold to from_bytes, from_path and from_fp to adjust the minimum expected coherence ratio

Changed

  • Build with static metadata using 'build' frontend
  • Make the language detection stricter

Fixed

  • CLI with option --normalize failing when a full file path was used
  • TooManyAccentuatedPlugin inducing false positives in the mess detection when too few alpha characters were fed to it

Removed

  • Coherence detector no longer returns 'Simple English'; it returns 'English' instead
  • Coherence detector no longer returns 'Classical Chinese'; it returns 'Chinese' instead

3.0.0b2 (2022-08-21)

Added

... (truncated)

Upgrade guide

Sourced from charset-normalizer's upgrade guide.

Guide to upgrade your code from v1 to v2

  • If you are only using the legacy detect function, that is it: you have nothing to do.

Detection

Before

from charset_normalizer import CharsetNormalizerMatches

results = CharsetNormalizerMatches.from_bytes(
    '我没有埋怨,磋砣的只是一些时间。'.encode('utf_32')
)

After

from charset_normalizer import from_bytes

results = from_bytes(
    '我没有埋怨,磋砣的只是一些时间。'.encode('utf_32')
)

Methods that once were staticmethods of the class CharsetNormalizerMatches are now plain functions; from_fp, from_bytes and from_path are concerned.

Those staticmethods are scheduled to be removed in version 3.0.
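
For completeness, a short sketch of the plain-function calls, using made-up file names; from_path and from_fp behave the same way as the former staticmethods:

from charset_normalizer import from_path, from_fp

# Same detection, expressed as plain functions rather than staticmethods.
results = from_path("./sample.txt")  # illustrative path
with open("./sample.txt", "rb") as handle:  # from_fp expects an open binary file object
    results_from_fp = from_fp(handle)

best_guess = results.best()
if best_guess is not None:
    print(best_guess.encoding)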

Commits
  • 0ec52ef Version 3.0.0 (#223)
  • db134f3 Update python-publish.yml
  • 690f74c 🔧 pass --no-isolation through CIBW_CONFIG_SETTINGS --build-option
  • 20996c3 ⬆️ cibuildwheel v2.11.1 (fix-tag)
  • 24f366c ⬆️ cibuildwheel v2.11.1
  • 33b7327 🔧 update universal-wheel stage (missing build pkg)
  • 544595d Merge pull request #209 from Ousret/3.0
  • 6367d53 📝 Missing CHANGELOG entry and add language_threshold to docs::advanced...
  • b15f416 📝 Update CHANGELOG.md
  • f8e1153 📝 Adjust speedup docs section
  • Additional commits viewable in compare view

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [charset-normalizer](https://github.com/Ousret/charset_normalizer) from 2.1.1 to 3.0.0.
- [Release notes](https://github.com/Ousret/charset_normalizer/releases)
- [Changelog](https://github.com/Ousret/charset_normalizer/blob/master/CHANGELOG.md)
- [Upgrade guide](https://github.com/Ousret/charset_normalizer/blob/master/UPGRADE.md)
- [Commits](jawah/charset_normalizer@2.1.1...3.0.0)

---
updated-dependencies:
- dependency-name: charset-normalizer
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot dependabot bot added the dependencies Pull requests that update a dependency file label Nov 3, 2022
@Ghalbeyou Ghalbeyou merged commit 03f40cb into main Nov 3, 2022
@dependabot dependabot bot deleted the dependabot/pip/charset-normalizer-3.0.0 branch November 3, 2022 12:54
@Ghalbeyou Ghalbeyou restored the dependabot/pip/charset-normalizer-3.0.0 branch November 3, 2022 12:57
@Ghalbeyou Ghalbeyou deleted the dependabot/pip/charset-normalizer-3.0.0 branch November 3, 2022 12:58