Metadata-Version: 2.1
Name: charset-normalizer
Version: 3.3.2
Summary: The Real First Universal Charset Detector. Open, modern and actively maintained alternative to Chardet.
Home-page: https://github.com/Ousret/charset_normalizer
Author: Ahmed TAHRI
Author-email: ahmed.tahri@cloudnursery.dev
License: MIT
Project-URL: Bug Reports, https://github.com/Ousret/charset_normalizer/issues
Project-URL: Documentation, https://charset-normalizer.readthedocs.io/en/latest
Keywords: encoding,charset,charset-detector,detector,normalization,unicode,chardet,detect
Classifier: Development Status :: 5 - Production/Stable
Classifier: License :: OSI Approved :: MIT License
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: Implementation :: PyPy
Classifier: Topic :: Text Processing :: Linguistic
Classifier: Topic :: Utilities
Classifier: Typing :: Typed
Requires-Python: >=3.7.0
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: unicode_backport

<h1 align="center">Charset Detection, for Everyone 👋</h1>

<p align="center">
<sup>The Real First Universal Charset Detector</sup><br>
<a href="https://pypi.org/project/charset-normalizer">
<img src="https://img.shields.io/pypi/pyversions/charset_normalizer.svg?orange=blue" />
</a>
<a href="https://pepy.tech/project/charset-normalizer/">
<img alt="Download Count Total" src="https://static.pepy.tech/badge/charset-normalizer/month" />
</a>
<a href="https://bestpractices.coreinfrastructure.org/projects/7297">
<img src="https://bestpractices.coreinfrastructure.org/projects/7297/badge">
</a>
</p>
<p align="center">
<sup><i>Featured Packages</i></sup><br>
<a href="https://github.com/jawah/niquests">
<img alt="Static Badge" src="https://img.shields.io/badge/Niquests-HTTP_1.1%2C%202%2C_and_3_Client-cyan">
</a>
<a href="https://github.com/jawah/wassima">
<img alt="Static Badge" src="https://img.shields.io/badge/Wassima-Certifi_Killer-cyan">
</a>
</p>
<p align="center">
<sup><i>In other languages (unofficial ports, by the community)</i></sup><br>
<a href="https://github.com/nickspring/charset-normalizer-rs">
<img alt="Static Badge" src="https://img.shields.io/badge/Rust-red">
</a>
</p>

> A library that helps you read text from an unknown charset encoding.<br /> Motivated by `chardet`,
> I'm trying to resolve the issue by taking a new approach.
> All IANA character set names for which the Python core library provides codecs are supported.

<p align="center">
>>>>> <a href="https://charsetnormalizerweb.ousret.now.sh" target="_blank">👉 Try Me Online Now, Then Adopt Me 👈 </a> <<<<<
</p>

This project offers you an alternative to **Universal Charset Encoding Detector**, also known as **Chardet**.

| Feature                                          | [Chardet](https://github.com/chardet/chardet) | Charset Normalizer                                                                                 | [cChardet](https://github.com/PyYoshi/cChardet) |
|--------------------------------------------------|:---------------------------------------------:|:--------------------------------------------------------------------------------------------------:|:-----------------------------------------------:|
| `Fast`                                           | ❌                                            | ✅                                                                                                 | ✅                                              |
| `Universal**`                                    | ❌                                            | ✅                                                                                                 | ❌                                              |
| `Reliable` **without** distinguishable standards | ❌                                            | ✅                                                                                                 | ✅                                              |
| `Reliable` **with** distinguishable standards    | ✅                                            | ✅                                                                                                 | ✅                                              |
| `License`                                        | LGPL-2.1<br>_restrictive_                     | MIT                                                                                                | MPL-1.1<br>_restrictive_                        |
| `Native Python`                                  | ✅                                            | ✅                                                                                                 | ❌                                              |
| `Detect spoken language`                         | ❌                                            | ✅                                                                                                 | N/A                                             |
| `UnicodeDecodeError Safety`                      | ❌                                            | ✅                                                                                                 | ❌                                              |
| `Whl Size (min)`                                 | 193.6 kB                                      | 42 kB                                                                                              | ~200 kB                                         |
| `Supported Encoding`                             | 33                                            | 🎉 [99](https://charset-normalizer.readthedocs.io/en/latest/user/support.html#supported-encodings) | 40                                              |

<p align="center">
<img src="https://i.imgflip.com/373iay.gif" alt="Reading Normalized Text" width="226"/><img src="https://media.tenor.com/images/c0180f70732a18b4965448d33adba3d0/tenor.gif" alt="Cat Reading Text" width="200"/>
</p>

*\*\* : They are clearly using specific code for a specific encoding even if it covers most of the encodings in use*<br>
Did you get there because of the logs? See [https://charset-normalizer.readthedocs.io/en/latest/user/miscellaneous.html](https://charset-normalizer.readthedocs.io/en/latest/user/miscellaneous.html)

## ⚡ Performance

This package offers better performance than its counterpart Chardet. Here are some numbers.

| Package                                       | Accuracy | Mean per file (ms) | File per sec (est) |
|-----------------------------------------------|:--------:|:------------------:|:------------------:|
| [chardet](https://github.com/chardet/chardet) |   86 %   |       200 ms       |     5 file/sec     |
| charset-normalizer                            | **98 %** |     **10 ms**      |    100 file/sec    |

| Package                                       | 99th percentile | 95th percentile | 50th percentile |
|-----------------------------------------------|:---------------:|:---------------:|:---------------:|
| [chardet](https://github.com/chardet/chardet) |     1200 ms     |     287 ms      |      23 ms      |
| charset-normalizer                            |     100 ms      |      50 ms      |      5 ms       |

Chardet's performance on larger files (1MB+) is very poor. Expect a huge difference on large payloads.

> Stats are generated using 400+ files using default parameters. For more details on the files used, see the GHA workflows.
> And yes, these results might change at any time. The dataset can be updated to include more files.
> The actual delays depend heavily on your CPU capabilities. The factors should remain the same.
> Keep in mind that the stats are generous and that Chardet accuracy versus ours is measured using Chardet's initial capability
> (e.g. supported encodings). Challenge them if you want.

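If you want a rough feel for those numbers on your own corpus, a minimal timing sketch (the glob pattern and file set below are placeholders, not the benchmark corpus used above):

```python
import time
from glob import glob

from charset_normalizer import from_path

files = glob("./data/*.srt")  # hypothetical corpus, substitute your own files

start = time.perf_counter()
for path in files:
    from_path(path).best()  # full detection pass per file
elapsed = time.perf_counter() - start

print(f"{len(files)} files in {elapsed:.2f} s "
      f"({elapsed / max(len(files), 1) * 1000:.1f} ms per file on average)")
```
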
## ✨ Installation

Using pip:

```sh
pip install charset-normalizer -U
```

## 🚀 Basic Usage

### CLI
This package comes with a CLI.

```
usage: normalizer [-h] [-v] [-a] [-n] [-m] [-r] [-f] [-t THRESHOLD]
                  file [file ...]

The Real First Universal Charset Detector. Discover originating encoding used
on text file. Normalize text to unicode.

positional arguments:
  files                 File(s) to be analysed

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         Display complementary information about file if any.
                        Stdout will contain logs about the detection process.
  -a, --with-alternative
                        Output complementary possibilities if any. Top-level
                        JSON WILL be a list.
  -n, --normalize       Permit to normalize input file. If not set, program
                        does not write anything.
  -m, --minimal         Only output the charset detected to STDOUT. Disabling
                        JSON output.
  -r, --replace         Replace file when trying to normalize it instead of
                        creating a new one.
  -f, --force           Replace file without asking if you are sure, use this
                        flag with caution.
  -t THRESHOLD, --threshold THRESHOLD
                        Define a custom maximum amount of chaos allowed in
                        decoded content. 0. <= chaos <= 1.
  --version             Show version information and exit.
```

```bash
normalizer ./data/sample.1.fr.srt
```

or

```bash
python -m charset_normalizer ./data/sample.1.fr.srt
```

Since version 1.4.0 the CLI produces an easily usable stdout result in JSON format.

```json
{
    "path": "/home/default/projects/charset_normalizer/data/sample.1.fr.srt",
    "encoding": "cp1252",
    "encoding_aliases": [
        "1252",
        "windows_1252"
    ],
    "alternative_encodings": [
        "cp1254",
        "cp1256",
        "cp1258",
        "iso8859_14",
        "iso8859_15",
        "iso8859_16",
        "iso8859_3",
        "iso8859_9",
        "latin_1",
        "mbcs"
    ],
    "language": "French",
    "alphabets": [
        "Basic Latin",
        "Latin-1 Supplement"
    ],
    "has_sig_or_bom": false,
    "chaos": 0.149,
    "coherence": 97.152,
    "unicode_path": null,
    "is_preferred": true
}
```
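
If you want to consume that report from a script rather than by eye, a minimal sketch (assuming `normalizer` is on your PATH and the sample path exists on your machine):

```python
import json
import subprocess

# Run the CLI and parse its JSON report (the file path here is hypothetical).
completed = subprocess.run(
    ["normalizer", "./data/sample.1.fr.srt"],
    capture_output=True,
    text=True,
    check=True,
)
report = json.loads(completed.stdout)
print(report["encoding"], report["language"], report["is_preferred"])
```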

### Python
*Just print out normalized text*
```python
from charset_normalizer import from_path

results = from_path('./my_subtitle.srt')

print(str(results.best()))
```
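
A slightly longer sketch, if you want more than the decoded text (the subtitle path is hypothetical; the attributes shown belong to the public `CharsetMatch` result):

```python
from charset_normalizer import from_path

best_guess = from_path('./my_subtitle.srt').best()  # hypothetical file

if best_guess is None:
    print("No suitable encoding could be found")
else:
    print(best_guess.encoding)   # e.g. "cp1252"
    print(best_guess.language)   # most probable language, e.g. "French"
    text = str(best_guess)       # the payload decoded to a Unicode str
```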

*Upgrade your code without effort*
```python
from charset_normalizer import detect
```

The above code will behave the same as **chardet**. We ensure that we offer the best (reasonable) BC result possible.
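
For instance, a minimal sketch of that drop-in interface (the sample string is invented; `detect` takes bytes and returns a chardet-style dict):

```python
from charset_normalizer import detect

payload = "Bonjour, voici un exemple accentué.".encode("cp1252")

result = detect(payload)
# Same keys as chardet: the detected encoding may vary between close candidates.
print(result["encoding"], result["language"], result["confidence"])
```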

See the docs for advanced usage: [readthedocs.io](https://charset-normalizer.readthedocs.io/en/latest/)

## 😇 Why

When I started using Chardet, I noticed that it was not suited to my expectations, and I wanted to propose a
reliable alternative using a completely different method. Also! I never back down on a good challenge!

I **don't care** about the **originating charset** encoding, because **two different tables** can
produce **two identical rendered strings.**
What I want is to get readable text, the best I can.

In a way, **I'm brute forcing text decoding.** How cool is that? 😎

Don't confuse the package **ftfy** with charset-normalizer or chardet. ftfy's goal is to repair Unicode strings, whereas charset-normalizer converts a raw file in an unknown encoding to Unicode.

## 🍰 How

- Discard all charset encoding tables that could not fit the binary content.
- Measure noise, or the mess once opened (by chunks) with a corresponding charset encoding.
- Extract matches with the lowest mess detected.
- Additionally, we measure coherence / probe for a language.

**Wait a minute**, what are noise/mess and coherence according to **YOU?**

*Noise:* I opened hundreds of text files, **written by humans**, with the wrong encoding table. **I observed**, then
**I established** some ground rules about **what is obvious** when **it seems like** a mess.
I know that my interpretation of what is noise is probably incomplete; feel free to contribute in order to
improve or rewrite it.

*Coherence:* For each language on earth, we have computed ranked letter appearance occurrences (the best we can). So I thought
that intel is worth something here. I use those records against decoded text to check if I can detect intelligent design.
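
To watch those two measurements on a payload of your own, a small sketch using the public API (the sample string is invented; `chaos` and `coherence` expose the mess and coherence ratios mentioned above):

```python
from charset_normalizer import from_bytes

payload = "Ceci est un exemple très simple, accentué juste ce qu'il faut.".encode("cp1252")

# explain=True temporarily attaches a verbose logging handler, so the
# per-chunk mess measurement and language probing become visible in the logs.
best_guess = from_bytes(payload, explain=True).best()

print(best_guess.encoding)    # e.g. "cp1252"
print(best_guess.chaos)       # mess ratio, lower is better
print(best_guess.coherence)   # language coherence ratio, higher is better
```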

## ⚡ Known limitations

- Language detection is unreliable when the text contains two or more languages sharing identical letters. (e.g. HTML (English tags) + Turkish content (sharing Latin characters))
- Every charset detector heavily depends on sufficient content. In common cases, do not bother running detection on very tiny content.

## ⚠️ About Python EOLs

**If you are running:**

- Python >=2.7,<3.5: Unsupported
- Python 3.5: charset-normalizer < 2.1
- Python 3.6: charset-normalizer < 3.1
- Python 3.7: charset-normalizer < 4.0

Upgrade your Python interpreter as soon as possible.

## 👤 Contributing

Contributions, issues and feature requests are very much welcome.<br />
Feel free to check the [issues page](https://github.com/ousret/charset_normalizer/issues) if you want to contribute.

## 📝 License

Copyright © [Ahmed TAHRI @Ousret](https://github.com/Ousret).<br />
This project is [MIT](https://github.com/Ousret/charset_normalizer/blob/master/LICENSE) licensed.

Character frequencies used in this project © 2012 [Denny Vrandečić](http://simia.net/letters/)

## 💼 For Enterprise

Professional support for charset-normalizer is available as part of the [Tidelift
Subscription][1]. Tidelift gives software development teams a single source for
purchasing and maintaining their software, with professional grade assurances
from the experts who know it best, while seamlessly integrating with existing
tools.

[1]: https://tidelift.com/subscription/pkg/pypi-charset-normalizer?utm_source=pypi-charset-normalizer&utm_medium=readme

# Changelog
All notable changes to charset-normalizer will be documented in this file. This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).

## [3.3.2](https://github.com/Ousret/charset_normalizer/compare/3.3.1...3.3.2) (2023-10-31)

### Fixed
- Unintentional memory usage regression when using large payload that match several encoding (#376)
- Regression on some detection case showcased in the documentation (#371)

### Added
- Noise (md) probe that identify malformed arabic representation due to the presence of letters in isolated form (credit to my wife)

## [3.3.1](https://github.com/Ousret/charset_normalizer/compare/3.3.0...3.3.1) (2023-10-22)

### Changed
- Optional mypyc compilation upgraded to version 1.6.1 for Python >= 3.8
- Improved the general detection reliability based on reports from the community

## [3.3.0](https://github.com/Ousret/charset_normalizer/compare/3.2.0...3.3.0) (2023-09-30)

### Added
- Allow to execute the CLI (e.g. normalizer) through `python -m charset_normalizer.cli` or `python -m charset_normalizer`
- Support for 9 forgotten encoding that are supported by Python but unlisted in `encoding.aliases` as they have no alias (#323)

### Removed
- (internal) Redundant utils.is_ascii function and unused function is_private_use_only
- (internal) charset_normalizer.assets is moved inside charset_normalizer.constant

### Changed
- (internal) Unicode code blocks in constants are updated using the latest v15.0.0 definition to improve detection
- Optional mypyc compilation upgraded to version 1.5.1 for Python >= 3.8

### Fixed
- Unable to properly sort CharsetMatch when both chaos/noise and coherence were close due to an unreachable condition in \_\_lt\_\_ (#350)

## [3.2.0](https://github.com/Ousret/charset_normalizer/compare/3.1.0...3.2.0) (2023-06-07)

### Changed
- Typehint for function `from_path` no longer enforce `PathLike` as its first argument
- Minor improvement over the global detection reliability

### Added
- Introduce function `is_binary` that relies on main capabilities, and optimized to detect binaries
- Propagate `enable_fallback` argument throughout `from_bytes`, `from_path`, and `from_fp` that allow a deeper control over the detection (default True)
- Explicit support for Python 3.12
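
As a reference for that entry, a minimal sketch of calling `is_binary` (the byte strings below are invented examples):

```python
from charset_normalizer import is_binary

# Accepts raw bytes, a file path, or an open binary file object.
print(is_binary(b"\x00\x5f\x2b\x1f"))                            # likely True
print(is_binary("Ceci n'est pas un binaire.".encode("utf_8")))   # likely False
```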

### Fixed
- Edge case detection failure where a file would contain 'very-long' camel cased word (Issue #289)

## [3.1.0](https://github.com/Ousret/charset_normalizer/compare/3.0.1...3.1.0) (2023-03-06)

### Added
- Argument `should_rename_legacy` for legacy function `detect` and disregard any new arguments without errors (PR #262)

### Removed
- Support for Python 3.6 (PR #260)

### Changed
- Optional speedup provided by mypy/c 1.0.1

## [3.0.1](https://github.com/Ousret/charset_normalizer/compare/3.0.0...3.0.1) (2022-11-18)

### Fixed
- Multi-bytes cutter/chunk generator did not always cut correctly (PR #233)

### Changed
- Speedup provided by mypy/c 0.990 on Python >= 3.7

## [3.0.0](https://github.com/Ousret/charset_normalizer/compare/2.1.1...3.0.0) (2022-10-20)

### Added
- Extend the capability of explain=True when cp_isolation contains at most two entries (min one), will log in details of the Mess-detector results
- Support for alternative language frequency set in charset_normalizer.assets.FREQUENCIES
- Add parameter `language_threshold` in `from_bytes`, `from_path` and `from_fp` to adjust the minimum expected coherence ratio
- `normalizer --version` now specify if current version provide extra speedup (meaning mypyc compilation whl)

### Changed
- Build with static metadata using 'build' frontend
- Make the language detection stricter
- Optional: Module `md.py` can be compiled using Mypyc to provide an extra speedup up to 4x faster than v2.1

### Fixed
- CLI with opt --normalize fail when using full path for files
- TooManyAccentuatedPlugin induce false positive on the mess detection when too few alpha character have been fed to it
- Sphinx warnings when generating the documentation

### Removed
- Coherence detector no longer return 'Simple English' instead return 'English'
- Coherence detector no longer return 'Classical Chinese' instead return 'Chinese'
- Breaking: Method `first()` and `best()` from CharsetMatch
- UTF-7 will no longer appear as "detected" without a recognized SIG/mark (is unreliable/conflict with ASCII)
- Breaking: Class aliases CharsetDetector, CharsetDoctor, CharsetNormalizerMatch and CharsetNormalizerMatches
- Breaking: Top-level function `normalize`
- Breaking: Properties `chaos_secondary_pass`, `coherence_non_latin` and `w_counter` from CharsetMatch
- Support for the backport `unicodedata2`

## [3.0.0rc1](https://github.com/Ousret/charset_normalizer/compare/3.0.0b2...3.0.0rc1) (2022-10-18)

### Added
- Extend the capability of explain=True when cp_isolation contains at most two entries (min one), will log in details of the Mess-detector results
- Support for alternative language frequency set in charset_normalizer.assets.FREQUENCIES
- Add parameter `language_threshold` in `from_bytes`, `from_path` and `from_fp` to adjust the minimum expected coherence ratio

### Changed
- Build with static metadata using 'build' frontend
- Make the language detection stricter

### Fixed
- CLI with opt --normalize fail when using full path for files
- TooManyAccentuatedPlugin induce false positive on the mess detection when too few alpha character have been fed to it

### Removed
- Coherence detector no longer return 'Simple English' instead return 'English'
- Coherence detector no longer return 'Classical Chinese' instead return 'Chinese'

## [3.0.0b2](https://github.com/Ousret/charset_normalizer/compare/3.0.0b1...3.0.0b2) (2022-08-21)

### Added
- `normalizer --version` now specify if current version provide extra speedup (meaning mypyc compilation whl)

### Removed
- Breaking: Method `first()` and `best()` from CharsetMatch
- UTF-7 will no longer appear as "detected" without a recognized SIG/mark (is unreliable/conflict with ASCII)

### Fixed
- Sphinx warnings when generating the documentation

## [3.0.0b1](https://github.com/Ousret/charset_normalizer/compare/2.1.0...3.0.0b1) (2022-08-15)

### Changed
- Optional: Module `md.py` can be compiled using Mypyc to provide an extra speedup up to 4x faster than v2.1

### Removed
- Breaking: Class aliases CharsetDetector, CharsetDoctor, CharsetNormalizerMatch and CharsetNormalizerMatches
- Breaking: Top-level function `normalize`
- Breaking: Properties `chaos_secondary_pass`, `coherence_non_latin` and `w_counter` from CharsetMatch
- Support for the backport `unicodedata2`

## [2.1.1](https://github.com/Ousret/charset_normalizer/compare/2.1.0...2.1.1) (2022-08-19)

### Deprecated
- Function `normalize` scheduled for removal in 3.0

### Changed
- Removed useless call to decode in fn is_unprintable (#206)

### Fixed
- Third-party library (i18n xgettext) crashing not recognizing utf_8 (PEP 263) with underscore from [@aleksandernovikov](https://github.com/aleksandernovikov) (#204)

## [2.1.0](https://github.com/Ousret/charset_normalizer/compare/2.0.12...2.1.0) (2022-06-19)

### Added
- Output the Unicode table version when running the CLI with `--version` (PR #194)

### Changed
- Re-use decoded buffer for single byte character sets from [@nijel](https://github.com/nijel) (PR #175)
- Fixing some performance bottlenecks from [@deedy5](https://github.com/deedy5) (PR #183)

### Fixed
- Workaround potential bug in cpython with Zero Width No-Break Space located in Arabic Presentation Forms-B, Unicode 1.1 not acknowledged as space (PR #175)
- CLI default threshold aligned with the API threshold from [@oleksandr-kuzmenko](https://github.com/oleksandr-kuzmenko) (PR #181)

### Removed
- Support for Python 3.5 (PR #192)

### Deprecated
- Use of backport unicodedata from `unicodedata2` as Python is quickly catching up, scheduled for removal in 3.0 (PR #194)

## [2.0.12](https://github.com/Ousret/charset_normalizer/compare/2.0.11...2.0.12) (2022-02-12)

### Fixed
- ASCII miss-detection on rare cases (PR #170)

## [2.0.11](https://github.com/Ousret/charset_normalizer/compare/2.0.10...2.0.11) (2022-01-30)

### Added
- Explicit support for Python 3.11 (PR #164)

### Changed
- The logging behavior have been completely reviewed, now using only TRACE and DEBUG levels (PR #163 #165)

## [2.0.10](https://github.com/Ousret/charset_normalizer/compare/2.0.9...2.0.10) (2022-01-04)

### Fixed
- Fallback match entries might lead to UnicodeDecodeError for large bytes sequence (PR #154)

### Changed
- Skipping the language-detection (CD) on ASCII (PR #155)

## [2.0.9](https://github.com/Ousret/charset_normalizer/compare/2.0.8...2.0.9) (2021-12-03)

### Changed
- Moderating the logging impact (since 2.0.8) for specific environments (PR #147)

### Fixed
- Wrong logging level applied when setting kwarg `explain` to True (PR #146)

## [2.0.8](https://github.com/Ousret/charset_normalizer/compare/2.0.7...2.0.8) (2021-11-24)
### Changed
- Improvement over Vietnamese detection (PR #126)
- MD improvement on trailing data and long foreign (non-pure latin) data (PR #124)
- Efficiency improvements in cd/alphabet_languages from [@adbar](https://github.com/adbar) (PR #122)
- call sum() without an intermediary list following PEP 289 recommendations from [@adbar](https://github.com/adbar) (PR #129)
- Code style as refactored by Sourcery-AI (PR #131)
- Minor adjustment on the MD around european words (PR #133)
- Remove and replace SRTs from assets / tests (PR #139)
- Initialize the library logger with a `NullHandler` by default from [@nmaynes](https://github.com/nmaynes) (PR #135)
- Setting kwarg `explain` to True will add provisionally (bounded to function lifespan) a specific stream handler (PR #135)

### Fixed
- Fix large (misleading) sequence giving UnicodeDecodeError (PR #137)
- Avoid using too insignificant chunk (PR #137)

### Added
- Add and expose function `set_logging_handler` to configure a specific StreamHandler from [@nmaynes](https://github.com/nmaynes) (PR #135)
- Add `CHANGELOG.md` entries, format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/) (PR #141)

## [2.0.7](https://github.com/Ousret/charset_normalizer/compare/2.0.6...2.0.7) (2021-10-11)
### Added
- Add support for Kazakh (Cyrillic) language detection (PR #109)

### Changed
- Further, improve inferring the language from a given single-byte code page (PR #112)
- Vainly trying to leverage PEP263 when PEP3120 is not supported (PR #116)
- Refactoring for potential performance improvements in loops from [@adbar](https://github.com/adbar) (PR #113)
- Various detection improvement (MD+CD) (PR #117)

### Removed
- Remove redundant logging entry about detected language(s) (PR #115)

### Fixed
- Fix a minor inconsistency between Python 3.5 and other versions regarding language detection (PR #117 #102)

## [2.0.6](https://github.com/Ousret/charset_normalizer/compare/2.0.5...2.0.6) (2021-09-18)
### Fixed
- Unforeseen regression with the loss of the backward-compatibility with some older minor of Python 3.5.x (PR #100)
- Fix CLI crash when using --minimal output in certain cases (PR #103)

### Changed
- Minor improvement to the detection efficiency (less than 1%) (PR #106 #101)

## [2.0.5](https://github.com/Ousret/charset_normalizer/compare/2.0.4...2.0.5) (2021-09-14)
### Changed
- The project now comply with: flake8, mypy, isort and black to ensure a better overall quality (PR #81)
- The BC-support with v1.x was improved, the old staticmethods are restored (PR #82)
- The Unicode detection is slightly improved (PR #93)
- Add syntax sugar \_\_bool\_\_ for results CharsetMatches list-container (PR #91)

### Removed
- The project no longer raise warning on tiny content given for detection, will be simply logged as warning instead (PR #92)

### Fixed
- In some rare case, the chunks extractor could cut in the middle of a multi-byte character and could mislead the mess detection (PR #95)
- Some rare 'space' characters could trip up the UnprintablePlugin/Mess detection (PR #96)
- The MANIFEST.in was not exhaustive (PR #78)

## [2.0.4](https://github.com/Ousret/charset_normalizer/compare/2.0.3...2.0.4) (2021-07-30)
### Fixed
- The CLI no longer raise an unexpected exception when no encoding has been found (PR #70)
- Fix accessing the 'alphabets' property when the payload contains surrogate characters (PR #68)
- The logger could mislead (explain=True) on detected languages and the impact of one MBCS match (PR #72)
- Submatch factoring could be wrong in rare edge cases (PR #72)
- Multiple files given to the CLI were ignored when publishing results to STDOUT. (After the first path) (PR #72)
- Fix line endings from CRLF to LF for certain project files (PR #67)

### Changed
- Adjust the MD to lower the sensitivity, thus improving the global detection reliability (PR #69 #76)
- Allow fallback on specified encoding if any (PR #71)

## [2.0.3](https://github.com/Ousret/charset_normalizer/compare/2.0.2...2.0.3) (2021-07-16)
### Changed
- Part of the detection mechanism has been improved to be less sensitive, resulting in more accurate detection results. Especially ASCII. (PR #63)
- According to the community wishes, the detection will fall back on ASCII or UTF-8 in a last-resort case. (PR #64)

## [2.0.2](https://github.com/Ousret/charset_normalizer/compare/2.0.1...2.0.2) (2021-07-15)
### Fixed
- Empty/Too small JSON payload miss-detection fixed. Report from [@tseaver](https://github.com/tseaver) (PR #59)

### Changed
- Don't inject unicodedata2 into sys.modules from [@akx](https://github.com/akx) (PR #57)

## [2.0.1](https://github.com/Ousret/charset_normalizer/compare/2.0.0...2.0.1) (2021-07-13)
### Fixed
- Make it work where there isn't a filesystem available, dropping assets frequencies.json. Report from [@sethmlarson](https://github.com/sethmlarson). (PR #55)
- Using explain=False permanently disable the verbose output in the current runtime (PR #47)
- One log entry (language target preemptive) was not show in logs when using explain=True (PR #47)
- Fix undesired exception (ValueError) on getitem of instance CharsetMatches (PR #52)

### Changed
- Public function normalize default args values were not aligned with from_bytes (PR #53)

### Added
- You may now use charset aliases in cp_isolation and cp_exclusion arguments (PR #47)

## [2.0.0](https://github.com/Ousret/charset_normalizer/compare/1.4.1...2.0.0) (2021-07-02)
### Changed
- 4x to 5 times faster than the previous 1.4.0 release. At least 2x faster than Chardet.
- Accent has been made on UTF-8 detection, should perform rather instantaneous.
- The backward compatibility with Chardet has been greatly improved. The legacy detect function returns an identical charset name whenever possible.
- The detection mechanism has been slightly improved, now Turkish content is detected correctly (most of the time)
- The program has been rewritten to ease the readability and maintainability. (Using static typing)
- utf_7 detection has been reinstated.

### Removed
- This package no longer require anything when used with Python 3.5 (Dropped cached_property)
- Removed support for these languages: Catalan, Esperanto, Kazakh, Basque, Volapük, Azeri, Galician, Nynorsk, Macedonian, and Serbocroatian.
- The exception hook on UnicodeDecodeError has been removed.

### Deprecated
- Methods coherence_non_latin, w_counter, chaos_secondary_pass of the class CharsetMatch are now deprecated and scheduled for removal in v3.0

### Fixed
- The CLI output used the relative path of the file(s). Should be absolute.

## [1.4.1](https://github.com/Ousret/charset_normalizer/compare/1.4.0...1.4.1) (2021-05-28)
### Fixed
- Logger configuration/usage no longer conflict with others (PR #44)

## [1.4.0](https://github.com/Ousret/charset_normalizer/compare/1.3.9...1.4.0) (2021-05-21)
### Removed
- Using standard logging instead of using the package loguru.
- Dropping nose test framework in favor of the maintained pytest.
- Choose to not use dragonmapper package to help with gibberish Chinese/CJK text.
- Require cached_property only for Python 3.5 due to constraint. Dropping for every other interpreter version.
- Stop support for UTF-7 that does not contain a SIG.
- Dropping PrettyTable, replaced with pure JSON output in CLI.

### Fixed
- BOM marker in a CharsetNormalizerMatch instance could be False in rare cases even if obviously present. Due to the sub-match factoring process.
- Not searching properly for the BOM when trying utf32/16 parent codec.

### Changed
- Improving the package final size by compressing frequencies.json.
- Huge improvement over the largest payloads.

### Added
- CLI now produces JSON consumable output.
- Return ASCII if given sequences fit. Given reasonable confidence.

## [1.3.9](https://github.com/Ousret/charset_normalizer/compare/1.3.8...1.3.9) (2021-05-13)

### Fixed
- In some very rare cases, you may end up getting encode/decode errors due to a bad bytes payload (PR #40)

## [1.3.8](https://github.com/Ousret/charset_normalizer/compare/1.3.7...1.3.8) (2021-05-12)

### Fixed
- Empty given payload for detection may cause an exception if trying to access the `alphabets` property. (PR #39)

## [1.3.7](https://github.com/Ousret/charset_normalizer/compare/1.3.6...1.3.7) (2021-05-12)

### Fixed
- The legacy detect function should return UTF-8-SIG if sig is present in the payload. (PR #38)

## [1.3.6](https://github.com/Ousret/charset_normalizer/compare/1.3.5...1.3.6) (2021-02-09)

### Changed
- Amend the previous release to allow prettytable 2.0 (PR #35)

## [1.3.5](https://github.com/Ousret/charset_normalizer/compare/1.3.4...1.3.5) (2021-02-08)

### Fixed
- Fix error while using the package with a python pre-release interpreter (PR #33)

### Changed
- Dependencies refactoring, constraints revised.

### Added
- Add python 3.9 and 3.10 to the supported interpreters

MIT License

Copyright (c) 2019 TAHRI Ahmed R.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.