| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| cleaner code, better packaging, and some upgrades source code.tar.gz | 2020-08-29 | 246.4 kB | |
| cleaner code, better packaging, and some upgrades source code.zip | 2020-08-29 | 326.0 kB | |
| README.md | 2020-08-29 | 3.4 kB | |
| Totals: 3 Items | | 575.8 kB | 0 |
New and Changed:
- Expanded text statistics and refactored into a sub-package (PR [#307])
  - Refactored `text_stats` module into a sub-package with the same name and top-level API, but restructured under the hood for better consistency
  - Improved performance, API, and documentation on the main `TextStats` class, and improved documentation on many of the individual stats functions
  - Added new readability tests for texts in Arabic (Automated Arabic Readability Index), Spanish (µ-legibility and perspicuity index), and Turkish (a language-specific formulation of Flesch Reading Ease)
  - Breaking change: Removed `TextStats.basic_counts` and `TextStats.readability_stats` attributes, since typically only one or a couple are needed for a given use case; also, some of the readability tests are language-specific, which meant bad results could get mixed in with good ones
- Improved and standardized some code quality and performance (PRs [#305], [#306])
  - Standardized error messages via top-level `errors.py` module
  - Replaced `str.format()` with f-strings (almost) everywhere, for performance and readability
  - Fixed a whole mess of linting errors, significantly improving code quality and consistency
- Improved package configuration and maintenance (PRs [#298], [#305], [#306])
  - Added automated GitHub workflows for building and testing the package, linting and formatting, publishing new releases to PyPI, and building documentation (and ripped out Travis CI)
  - Added a makefile with common commands for dev work, plus instructions
  - Adopted the new `pyproject.toml` package configuration standard; updated and streamlined `setup.py` and `setup.cfg` accordingly; and removed `requirements.txt`
  - Moved all source code into a `/src` directory, for technical reasons
  - Added `mypy`-specific config file to reduce output noisiness when type-checking
- Improved and moved package documentation (PR [#309])
  - Moved the docs site back to ReadTheDocs (https://textacy.readthedocs.io)! Pardon the years-long detour into GitHub Pages...
  - Enabled markdown-based documentation using `recommonmark` instead of `m2r`, and migrated all "narrative" docs from `.rst` to equivalent `.md` files
  - Added auto-generated summary tables to many sections of the API Reference, to help users get an overview of functionality and better find what they're looking for; also added auto-generated section heading references
  - Tidied up and further standardized docstrings throughout the code
- Kept up with the Python ecosystem
  - Trained a v1.1 language identifier model using `scikit-learn==0.23.0`, and bumped the upper bound on that dependency's version accordingly
  - Updated and parametrized many tests using modern `pytest` functionality (PR [#306])
  - Got `textacy` versions 0.9.1 and 0.10.0 up on `conda-forge` (Issue [#294])
- Added spectral seriation as a term-ordering technique when making a "Termite" visualization by taking advantage of `pandas.DataFrame` functionality, and otherwise tidied up the default for nice-looking plots (PR [#295])
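
For readers unfamiliar with the `str.format()` to f-string change listed above, here is a minimal sketch of what such a rewrite can look like for a centralized error message. The helper names, the message wording, and the language codes are hypothetical and chosen only for illustration; they are not taken from the package's `errors.py`.

```python
from typing import Any, Collection


def value_invalid_msg_format(name: str, value: Any, valid_values: Collection[Any]) -> str:
    """Older style (hypothetical helper): build the message with str.format()."""
    return "`{}` value = {} is invalid; value must be one of {}.".format(
        name, value, valid_values
    )


def value_invalid_msg(name: str, value: Any, valid_values: Collection[Any]) -> str:
    """Newer style (hypothetical helper): the same message written as an f-string."""
    return f"`{name}` value = {value} is invalid; value must be one of {valid_values}."


# Illustrative inputs only; both helpers should produce identical text.
valid_langs = ("ar", "es", "tr")
old = value_invalid_msg_format("lang", "xx", valid_langs)
new = value_invalid_msg("lang", "xx", valid_langs)
assert old == new
print(new)
```

Keeping message templates in one place like this means call sites can raise, for example, `ValueError(value_invalid_msg(...))` and stay consistent with one another, while the f-string form keeps each placeholder next to the name it interpolates.
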
Fixed:
- Corrected an incorrect and misleading reference in the quickstart docs (Issue [#300], PR [#302])
- Fixed a bug in the `delete_words()` augmentation transform (Issue [#308])
Contributors:
Special thanks to @tbsexton, @marius-mather, and @rmax for their contributions! 💐