Name                          Modified     Size
Hazm 0.9 source code.tar.gz   2023-05-20   5.7 MB
Hazm 0.9 source code.zip      2023-05-20   5.8 MB
README.md                     2023-05-20   2.0 kB
Totals: 3 items, 11.5 MB, 0 downloads/week

Added

  • Windows compatibility by using Python-crfsuite instead of Wapiti. @E-Ghafour.
  • Pretrained Chunker and POSTagger models with Python-crfsuite. @E-Ghafour.
  • New parameters in Normalizer for better text processing. @sir-kokabi.
  • Three regex patterns in Normalizer to fix ZWNJs and spacing issues. @sir-kokabi.
  • 400 non-standard Unicode characters to be replaced in Normalizer. @sir-kokabi.
  • 40,000+ new words to improve Lemmatizer and Tokenizer. @sir-kokabi.
  • train function for Word2vec and Sent2vec modules in Embedding. @E-Ghafour.
  • Implement keywordExtraction with the embedRank approach as a sample of Hazm usage. @E-Ghafour.
  • Support Universal tags in POSTagger. @E-Ghafour.
  • Support universal POS mapper in PeykareReader & DadeganReader (#239). @phsfr.
  • PersianPlainTextReader to process raw text datasets (#120). @mhbashari.
  • Support EZ tag in PeykareReader. @E-Ghafour.
  • Slash & back-slash (/ \) support in Tokenizer (#102). @elahimanesh.
  • Conjugation class to handle verb conjugation. @sir-kokabi.
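
The ZWNJ and spacing fixes mentioned in the list above can be illustrated with a minimal sketch. The three patterns below are hypothetical and written only for illustration; they are not Hazm's actual Normalizer rules, and the function name `fix_zwnj_and_spacing` is an assumption. The third pattern is deliberately naive: it would also join the standalone noun می to the following word.

```python
import re

ZWNJ = "\u200c"  # zero-width non-joiner

# Hypothetical patterns for illustration; NOT Hazm's actual Normalizer rules.
PATTERNS = [
    # 1. Collapse a run of ZWNJs into a single one.
    (re.compile(f"{ZWNJ}{{2,}}"), ZWNJ),
    # 2. Drop a ZWNJ that touches whitespace or a string boundary.
    (re.compile(rf"(?<=\s){ZWNJ}|{ZWNJ}(?=\s)|^{ZWNJ}|{ZWNJ}$"), ""),
    # 3. Attach the verbal prefixes می / نمی to the next word with a ZWNJ.
    (re.compile(r"\b(ن?می) (?=[\u0622-\u06CC])"), r"\1" + ZWNJ),
]


def fix_zwnj_and_spacing(text: str) -> str:
    """Apply each pattern in order and return the cleaned text."""
    for pattern, replacement in PATTERNS:
        text = pattern.sub(replacement, text)
    return text


if __name__ == "__main__":
    fixed = fix_zwnj_and_spacing("می روم")
    # The space after the prefix has been replaced by a ZWNJ.
    print(fixed == "می" + ZWNJ + "روم")
```

Running the patterns in a fixed order keeps each one simple: the third pattern never has to consider stray or repeated ZWNJs, because the first two have already removed them.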

Fixed

  • Improve the accuracy of POSTagger and Chunker. @E-Ghafour.
  • Improve InformalNormalizer (#219). @riasati.
  • Fix PEP 8 issues (#135). @hadifar.
  • Fix some test issues. @sir-kokabi @E-Ghafour.
  • Fix Stemmer issues with multiple suffixes. @sir-kokabi.
  • Fix various reported issues.
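
To show what handling multiple suffixes means for a stemmer, here is a minimal sketch that strips suffixes repeatedly until none applies. The suffix list, the `min_stem` parameter, and the function name `strip_suffixes` are assumptions made for illustration; this is not Hazm's Stemmer implementation, and such a naive loop will over-stem some words.

```python
ZWNJ = "\u200c"  # zero-width non-joiner

# Hypothetical suffix list, longer suffixes first; NOT Hazm's actual list.
SUFFIXES = ("ترین", "تر", "ها", "ان", "ات", "ی")


def strip_suffixes(word: str, min_stem: int = 2) -> str:
    """Strip known suffixes one at a time until no suffix matches."""
    stripped = True
    while stripped:
        stripped = False
        for suffix in SUFFIXES:
            # Candidate stem, with any ZWNJ left dangling at the end removed.
            stem = word[: -len(suffix)].rstrip(ZWNJ)
            if word.endswith(suffix) and len(stem) >= min_stem:
                word = stem
                stripped = True
                break  # restart from the longest suffix
    return word


if __name__ == "__main__":
    # Two stacked suffixes (ها then ی) are removed in successive passes.
    print(strip_suffixes("کتاب" + ZWNJ + "های") == "کتاب")
```

A single pass would leave the first suffix in place; looping until nothing changes is what lets stacked suffixes come off one at a time.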

Changed

  • Drop Python 2 support and migrate all code to Python 3. @sir-kokabi.
  • Use data_maker function instead of patterns in SequenceTagger. @E-Ghafour.
  • Refactor IOBTagger and POSTagger to be compatible with data_maker. @E-Ghafour.
  • Change می روم to می‌روم in example (#203). @SMSadegh19.
  • Overhaul the project structure and GitHub repo. @sir-kokabi.

Download Pretrained models

Full Changelog: https://github.com/roshan-research/hazm/compare/v0.8.2...v0.9

Source: README.md, updated 2023-05-20