2005-11-11 Németh László <nemethl@gyorsposta.hu>:
        * src/hunspell/affixmgr.*: fix Unicode MAP errors (sorted only n-1
          characters instead of n ones in UTF-16 MAP character lists).
          Bug reported by Rene Engelhard.

        * src/hunspell/affixmgr.*: fix infinite COMPOUND matching (default char
          type is unsigned on PowerPC, s390 and ARM platforms and it will never
          be negative). Bug reported by Rene Engelhard.
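
          The effect described above can be imitated outside C. The sketch
          below is only an illustration: it assumes a countdown loop of the
          form `for (char c = start; c >= 0; c--)`, which is a hypothetical
          shape and not taken from affixmgr.cxx, and emulates 8-bit
          wraparound for a signed and an unsigned char.

```python
def countdown_iterations(start, signed, limit=1000):
    """Count iterations of `for (char c = start; c >= 0; c--)` with an 8-bit char.

    Returns None when the loop has still not ended after `limit` iterations.
    """
    c = start
    iterations = 0
    while c >= 0:
        iterations += 1
        if iterations > limit:
            return None  # the condition `c >= 0` never became false
        c -= 1
        if signed:
            c = (c + 128) % 256 - 128  # wrap into -128..127
        else:
            c %= 256                   # wrap into 0..255, never negative
    return iterations
```

          With a signed char the loop ends after start + 1 iterations; with
          an unsigned char the value wraps from 0 to 255 and the loop never
          ends on its own.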
        
	* src/hunspell/{affixmgr,suggestmgr}.cxx: fix bad ONLYINCOMPOUND
          word suggestions.
	* tests/onlyincompound.sug: empty test file to check this fix.
          Bug reported by Björn Jacke.

        * src/hunspell/affixmgr.cxx: fix backtracking in COMPOUND pattern matching.
        * tests/compound6.*: test files to check this fix.
        
        * csutil.cxx: set bigger range types in flag_qsort() and flag_bsearch().
        
        * affixmgr.hxx: set better type for cont_classes[] Boolean data (short -> char)
        
        * configure.ac, tests/automake.am: set platform specific XFAIL test
          (flagutf8.test on ARM platform)

2005-11-09 Németh László <nemethl@gyorsposta.hu>:
improvements:
        * src/hunspell/affixmgr.*: new and improved affix file parameters:

        - COMPOUND definitions: compound patterns with regexp-like matching.
          See manual and test files: tests/compound*.*
          Suggested by Bram Moolenaar.
          Also useful for simple word-level lexical scanning, for example
          analysing numbers or words with numbers (OOo Issue #53643):
          http://qa.openoffice.org/issues/show_bug.cgi?id=53643
          Examples: tests/compound{4,5}.*.

        - NOSUGGEST flag: words marked with the NOSUGGEST flag are not suggested.
          Proposed flag for vulgar and obscene words (OOo Issue #55498).
          Example: tests/nosuggest.*.
          Problem reported by bobharvey at OOo:
          http://qa.openoffice.org/issues/show_bug.cgi?id=55498

        - KEEPCASE flag: forbid capitalized and uppercased forms of words
          marked with the KEEPCASE flag. Useful for special orthographies
          (measurements and currency often keep their case in uppercased
          texts) and other writing systems (e.g. keeping lower case of IPA
          characters).

        - CHECKCOMPOUNDCASE: forbid upper case characters at word boundaries in compounds.
          Examples: tests/checkcompoundcase* and tests/germancompounding.*

        - FLAG UTF-8: New flag type: Unicode character encoded with UTF-8.
          Example: tests/flagutf8.*.
          Rationale: Unicode character type can be more readable
          (in a Unicode text editor) than `long' or `num' flag type.
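
          As a rough illustration of the CHECKCOMPOUNDCASE rule listed above,
          the following sketch rejects a compound boundary when an upper case
          letter touches it. The function name is hypothetical, and special
          treatment of hyphens or other separators is deliberately left out,
          so this is a simplified reading and not the library's code.

```python
def violates_checkcompoundcase(left, right):
    """Return True if an upper case letter touches the boundary between
    two compound parts (simplified reading of CHECKCOMPOUNDCASE)."""
    # Slicing keeps the check safe for empty strings.
    return left[-1:].isupper() or right[:1].isupper()
```

          For example, the parts "foo" and "Bar" would be rejected, while
          "foo" and "bar" would pass.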

bug fixes:
        * src/hunspell/hunspell.cxx: accept numbers and numbers with separators (i53643)
          Bug reported by skelet at OOo:
          http://qa.openoffice.org/issues/show_bug.cgi?id=53643

	* src/hunspell/csutil.cxx: fix casing data in ISO 8859-13 character table.

        * src/hunspell/csutil.cxx: add ISO-8859-15 character encoding (i54980)
          Rationale: ISO-8859-15 is the default encoding of the French OpenOffice.org
          dictionary. ISO-8859-15 is a modified version of ISO-8859-1
          (latin-1) character encoding with French œ ligatures and euro
	  symbol. Problem reported by cbrunet at OOo in OOo Issue 54980:
          http://qa.openoffice.org/issues/show_bug.cgi?id=54980

        * src/hunspell/affixmgr.cxx: fix zero-byte malloc after a bad affix header.
          Patch by Harri Pitkänen.

	* src/hunspell/suggestmgr.cxx: fix bad NEEDAFFIX word suggestion
          in ngram suggestions. Reported by Daniel Naber and Friedel Wolff.

        * src/hunspell/hashmgr.cxx: fix bad white space checking in affix files.
          src/hunspell/{csutil,affixmgr}.cxx: add other white space separators.
          Problems with tabulators reported by Frederik Fouvry.

        * src/hunspell/*: replace system-dependent <license.*> #include
          parameters with quoted ones. Problem reported by Dafydd Jones.

	* src/hunspell/hunspell.cxx: fix missing morphological analysis of dot(s)
          Reported by Trón Viktor.

changes:
	* src/hunspell/affixmgr.cxx: rename PSEUDOROOT to NEEDAFFIX.
	  Suggested by Bram Moolenaar.

        * src/hunspell/suggestmgr.hxx: Increase default maximum of 
          ngram suggestions (3->5). Suggested by Kevin Hendricks.

        * src/hunspell/htypes.hxx: Increase MAXDELEN for long affix flags.

        * src/hunspell/suggestmgr.cxx: modify (perhaps fix) Unicode map suggestion.
          Failure of the tests/maputf test on the ARM platform reported by Rene Engelhard.

        * src/hunspell/{affentry.cxx,atypes.hxx}: remove [PREFIX] and
          MISSING_DESCRIPTION messages from morphological analysis.
          Problems reported by Trón Viktor.

	* tests/germancompounding.{aff,good}: Add "Computer-Arbeit" test word.
          Suggested by Daniel Naber.

        * doc/man/hunspell.4: Proof-reading patch by Goldman Eleonóra.
          
        * doc/man/hunspell.4: Fix bad affix example (replace `move' with `work').
          Bug reported by Frederik Fouvry.
          
        * tests/*: new test files:
          affixes.*: simple affix compression example from Hunspell 4 manual page
          checkcompoundcase.*, checkcompoundcase2.*, checkcompoundcaseutf.*
          compound.*, compound2.*, compound3.*, compound4.*, compound5.*
          compoundflag.* (former compound.*)
          flagutf8.*: test for FLAG UTF-8
          germancompounding.*: simplification with CHECKCOMPOUNDCASE.
          germancompoundingold.* (former germancompounding.*)
          i53643.*: check numbers with separators
          i54980.*: ISO8859-15 test
          keepcase.*: test for KEEPCASE
          needaffix*.* (former pseudoroot*.* tests)
          nosuggest.*: test for NOSUGGEST

2005-09-19 Németh László <nemethl@gyorsposta.hu>:
	* src/hunspell/suggestmgr.cxx: improved ngram suggestion:
        - detect swapped characters that are not adjacent (pernament -> permanent)
          Rationale: the ngram method has a significant error with non-adjacent
          swapped characters, especially when the swap is in the middle of the word.
        - suggest uppercase forms (unesco -> UNESCO, siggraph's -> SIGGRAPH's)
        - suggest only the ngram swap character and uppercase forms, if they exist.
          Rationale: swap character and casing equivalence give much better
          suggestions than any other (weighted) ngram suggestions.
        - add uppercase suggestion (PERMENANT -> PERMANENT)
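
          The swap and uppercase ideas in the list above can be sketched as
          follows. This is a brute-force illustration with hypothetical
          function names and a plain set standing in for the dictionary; it
          does not reproduce the logic of suggestmgr.cxx.

```python
def swap_suggestions(word, dictionary):
    """Try every pair of positions (adjacent or not) and keep the swaps
    that produce a dictionary word."""
    found = []
    chars = list(word)
    for i in range(len(chars)):
        for j in range(i + 1, len(chars)):
            if chars[i] == chars[j]:
                continue  # swapping equal letters changes nothing
            chars[i], chars[j] = chars[j], chars[i]
            candidate = "".join(chars)
            chars[i], chars[j] = chars[j], chars[i]  # undo the swap
            if candidate in dictionary and candidate not in found:
                found.append(candidate)
    return found


def uppercase_suggestion(word, dictionary):
    """Return the all-uppercase form if the dictionary contains it."""
    upper = word.upper()
    return upper if upper in dictionary else None
```

          With a dictionary holding "permanent" and "UNESCO", the misspelling
          "pernament" is repaired by swapping the letters at positions 3 and
          5, and "unesco" maps to "UNESCO".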
          
        * src/hunspell/*: complete comparison with MySpell 3.2 (in OOo beta 2):
        - affixmgr.cxx: add missing numrep initialization
        - hashmgr.cxx: add_word(): don't allocate temporary records
        - hunspell.cxx: in suggest():
          - check capitalized words first (better sug. order for proper names),
          - check pSMgr->suggest() return value
          - make the pSMgr->suggest() call non-optional in HUHCAP
        - csutil.cxx: fix bad KOI8-U -> koi8r_tbl reference in enc_entry encds
	- csutil.cxx: fix casing data in ISO 8859-2, Windows 1251 and KOI8-U
          encoding tables. Bug reported by Dmitri Gabinski.

        * src/hunspell/affixmgr.*: improved compound word and other features
        - generalize hu_HU specific compound word features with new affix file
          parameters, suggested by Bram Moolenaar:
        - CHECKCOMPOUNDDUP: forbid word duplication in compounds (e.g. foo|foo)
        - CHECKCOMPOUNDTRIPLE: forbid triple letters in compounds (e.g. foo|obar)
        - CHECKCOMPOUNDPATTERN: forbid patterns at word boundaries in compounds
        - CHECKCOMPOUNDREP: using REP replacement table, forbid presumably bad
          compounds (useful for languages with unlimited number of compounds)
        - ONLYINCOMPOUND flag works also with words (see tests/onlyincompound.*)
          Suggested by Daniel Naber, Björn Jacke, Trón Viktor & Bram Moolenaar.
        - PSEUDOROOT works also with prefixes and prefix + suffix combinations
          (see tests/pseudoroot5.*). Suggested by Trón Viktor.
        - man/hunspell.4: updated man page
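
          The CHECKCOMPOUNDDUP and CHECKCOMPOUNDTRIPLE rules listed above can
          be illustrated with the two examples given (foo|foo and foo|obar).
          The sketch assumes the compound has already been split into parts;
          the function names are hypothetical.

```python
def has_duplicate_parts(parts):
    """CHECKCOMPOUNDDUP idea: the same word twice in a row, e.g. foo|foo."""
    return any(a == b for a, b in zip(parts, parts[1:]))


def has_triple_letter(parts):
    """CHECKCOMPOUNDTRIPLE idea: three equal letters across a boundary,
    e.g. foo|obar -> 'ooo'."""
    for a, b in zip(parts, parts[1:]):
        # At most two letters from each side, so any run of three equal
        # letters in this window necessarily spans the boundary.
        window = a[-2:] + b[:2]
        for k in range(len(window) - 2):
            if window[k] == window[k + 1] == window[k + 2]:
                return True
    return False
```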

        * src/hunspell/affixmgr.*: fix incomplete prefix handling with twofold
          suffixes (delete unnecessary contclasses[] conditions in
          prefix_check_twosfx() and prefix_check_twosfx_morph()).
          Bug reported by Trón Viktor.
        
        * src/hunspell/affixmgr.*: complete also *_morph() functions with
          conditions of new Hunspell features (circumfix, pseudoroot etc.).

        * src/hunspell/suggestmgr.cxx:
        - fix missing suggestions for words with crossed prefix and suffix
        - fix redundant non compound word checking
        - fix losing suggestions problem. Bug reported by Dmitri Gabinski.

        * src/hunspell/dictmgr.*:
        - add new dictionary manager for Hunspell UNO module
          Problems with eo_ANY Esperanto locale reported by Dmitri Gabinski.

	* src/hunspell/*: use precise constant sizes for 8-bit and 16-bit character
          arrays with MAXWORDUTF8LEN and MAXSWUTF8L macros.

	* src/hunspell/affixmgr.cxx: fix bad MAXNGRAMSUGS parameter handling

	* src/hunspell/affixmgr.cxx, src/tools/{un}munch.*: fix GCC 4.0 warnings
          on fgets(), reported by Dvornik László

	* po/hu.po: improved translation by Dvornik László

        * tests/test.sh: improved test environment
        - add suggestion testing (see tests/*.sug)
        - add memory debugging environment, based on the excellent Valgrind debugger.
          Usage on Linux and experimental platforms of Valgrind:
          VALGRIND=memcheck make check
        - rename test_hunmorph to test.sh

        * tests/*: new tests:
        - base.*: base example based on MySpell's checkme.lst.
        - map{,utf}.*, rep{,utf}: MAP and REP suggestion examples
        - tests on new CHECKCOMPOUND, ONLYINCOMPOUND and PSEUDOROOT features
        - i54633.*: capitalized suggestion test for Issue 54633 from OOo's Issuezilla
        - i35725.*: improved ngram suggestion test for Issue 35725

2005-08-26 Németh László <nemethl@gyorsposta.hu>:
improvements:

	* src/hunspell/suggestmgr.cxx:
	  Unicode support in related character map suggestion
	
	* src/hunspell/suggestmgr.cxx: Unicode support in ngram suggestion

	* src/hunspell/{suggestmgr,affixmgr,hunspell}.cxx: improve ngram suggestion.
	  Fix http://qa.openoffice.org/issues/show_bug.cgi?id=35725. See release
          notes for examples. This problem reported by beccablain at OOo.
        - ngram suggestions now are case insensitive (see `Permenant' bug in Issuezilla)
        - weight ngram suggestions (with the longest common subsequence algorithm,
          also considering lengths of bad word and suggestion, identical first
          letters and almost completely identical character positions)
        - set strict affix congruency in expand_rootword(). Now ngram suggestions
          are good for languages with rich morphology and also better for English.
          Rationale: affixed forms of the first ngram suggestion
          very often suppress the second and subsequent root word suggestions. But
          errors in affixes are less common, and can be fixed without suggestions.
          We must prefer the more informative second and subsequent root word
          suggestions instead of the suggestions for bad affixes.
        - a better suggestion may not be substring of a less good suggestion
	  Rationale: Suggesting affixed forms of a root word is
          unnecessary, when root word has got better weighted ngram value.
          (Checking substrings is a good approximation for this refinement.)
	- fewer ngram suggestions (default maximum is 3 instead of 10)
          Rationale: users need considerable extra effort to check a lot of bad
          ngram suggestions, nine times out of ten unnecessarily. It is very
          distracting, because ngram suggestions can be very different.
          MySpell and Hunspell usually give one or two suggestions with
          the old suggestion algorithms (maximum is 15), but the ngram algorithm
          often gives the maximum number of suggestions. With strict affix
          congruency and other refinements, the good suggestion is usually
          among the first three elements.
	- new affix parameter: MAXNGRAMSUGS
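
          To make the weighting idea above more concrete, here is a minimal
          sketch that ranks candidates by the length of their longest common
          subsequence with the misspelled word, ignoring case. The real
          weighting also considers word lengths, first letters and character
          positions, which this sketch leaves out; the function names are
          hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two strings."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def rank_candidates(bad_word, candidates):
    """Order candidates by descending LCS with the bad word (case-insensitive)."""
    key = bad_word.lower()
    return sorted(candidates, key=lambda c: -lcs_length(key, c.lower()))
```

          For "Permenant", the candidate "permanent" shares a subsequence of
          length 7 and is ranked ahead of "pavement", which shares only 6.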

        * src/hunspell/*: support agglutinative languages with rich prefix
	  morphology or with right-to-left writing system (for example, Turkic
	  and Austronesian languages with (modified) Arabic scripts).
	- new affix parameter: COMPLEXPREFIXES
	  Set twofold prefix stripping (but single suffix stripping)
	* src/hunspell/affixmgr.cxx:
	- speed up prefix loading with tree sorting algorithm.
	* tests/complexprefixes.*, tests/complexprefixesutf.*:
	  Coptic example posted by Moheb Mekhaiel

	* src/hunspell/hashmgr.cxx: check size attribute in dic file
	  suggested by Daniel Naber
	  Rationale: with a missing size attribute Hunspell allocates a hash table
          that is too small and slower, and can lose the first dictionary word.

	* src/hunspell/affixmgr.cxx: check stripping characters and condition
	  compatibility in affix rules (bugs detected in cs_CZ, es_ES, es_NEW,
          es_MX, lt_LT, nn_NO, pt_PT, ro_RO and sk_SK dictionaries). See release
          notes of Hunspell 1.0.9 in NEWS.
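
          A consistency check of this kind might look like the sketch below
          for suffix rules. It assumes conditions made of literal characters,
          bracket classes such as [^aeiou] and the dot wildcard, and it
          compares the stripping characters with the end of the condition.
          The function names are hypothetical and the actual check in
          affixmgr.cxx may differ.

```python
import re


def condition_tokens(condition):
    """Split a condition such as '[^aeiou]y' into one token per position."""
    return re.findall(r"\[[^\]]*\]|.", condition)


def suffix_rule_is_consistent(strip, condition):
    """Return False if the stripped ending can never satisfy the condition."""
    if strip in ("0", "") or condition == ".":
        return True  # nothing is stripped, or the condition accepts anything
    tokens = condition_tokens(condition)
    # Walk backwards from the end of the word, one position at a time.
    for ch, token in zip(reversed(strip), reversed(tokens)):
        if token.startswith("[") or token == ".":
            pattern = token
        else:
            pattern = re.escape(token)
        if not re.fullmatch(pattern, ch):
            return False
    return True
```

          A rule that strips "y" under the condition "[^aeiou]y" passes,
          while stripping "y" under "[^aeiou]e" is flagged, because a word
          cannot end in both "y" and "e".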

	* src/hunspell/affixmgr.cxx: check unnecessary fields in affix rules
          (bugs detected in ro_RO and sv_SE dictionaries). See release notes.

	* src/hunspell/affixmgr.cxx: remove redundant condition checking
	  in affix rules with stripping characters (redundancy in OpenOffice.org
	  dictionaries reported by Eleonóra Goldman)
          Rationale: this is a small optimization, but it was excellent for
          detecting bad ngram affixation caused by bad or weak affix conditions.

	* tests/germancompounding.aff: improve compound definition
	- use dash prefix instead of language specific tokenizer
          Rationale: a uniform approach is the right way to check and analyze
	  compound words. Language-specific word breaking is deprecated; word-like
          word pairs need sophisticated grammar checking
          (for example, Hungarian has a substandard but accepted
          syntax with a dash for word pairs: cats, dogs -> kutyák-macskák, like
          cats/dogs in English).

	* test Hunspell with 54 OpenOffice.org dictionaries: see release notes

bug fixes:

	* src/hunspell/suggestmgr.*: add time limit to exponential
	  algorithm of the related character map suggestion
	  Rationale: a long word in agglutinative languages or a special pattern
          (for example a horizontal rule) made of map characters can `crash' the
          spell checker.
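
          The following sketch shows why the search is exponential and how a
          deadline bounds it. Each character that belongs to a group of
          related characters multiplies the number of candidates, so the
          generator stops as soon as the deadline has passed. The data layout
          (a dict from a character to its whole group) and the function names
          are assumptions made for this illustration.

```python
import time


def map_candidates(word, related, deadline, pos=0):
    """Yield variants of `word` with characters replaced by related ones."""
    if time.monotonic() > deadline:
        return  # out of time: stop quietly instead of hanging
    if pos == len(word):
        yield word
        return
    # Characters without a group map only to themselves.
    for ch in related.get(word[pos], word[pos]):
        variant = word[:pos] + ch + word[pos + 1:]
        yield from map_candidates(variant, related, deadline, pos + 1)


def map_suggest(word, related, dictionary, time_limit=0.1):
    """Collect dictionary words reachable by related-character replacement."""
    deadline = time.monotonic() + time_limit
    return [c for c in map_candidates(word, related, deadline)
            if c in dictionary and c != word]
```

          A word of n map characters with groups of size k yields k**n
          candidates, which is why a long run of such characters (for
          example a horizontal rule) needs the cut-off.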

        * src/hunspell/affentry.cxx: add() functions: fix bad word generation
          checking stripping characters (see similar bug in unmunch)

	* src/hunspell/affixmgr.cxx: parse_file(): fix unconditional getNext()
	  call for ~AffixMgr() when affix file is corrupt.

	* src/hunspell/affixmgr.*: AffixMgr(), parse_cpdsyllable(): fix missing
	  string duplications for ~AffixMgr() when affix file is corrupt.

	* src/hunspell/affixmgr.*: parse_affix(): fix fprintf() call when affix
	  file is corrupt. Bug reported by Daniel Naber.

	* suggestmgr.cxx: replace single usage of 'strdup' with 'mystrdup'
	  patch by Chris Halls (debian.org)

	* src/hunspell/makefile.mk: add makefile.mk for compiling in OpenOffice.org
	  See README in Hunspell UNO module.
	  Problems with separated compiling reported by Rene Engelhard

	* src/hunspell/hunspell.cxx: fix pseudoroot support
	- search for a non-pseudoroot homonym in check()
	* tests/pseudoroot4.*: test this fix

	* src/tools/unmunch.c: fix bad word generation when conditions
	  are shorter or incompatible with stripping characters in affix rules

	* src/tools/unmunch.c: fix mychomp() for de_AT.dic and other dic files
	  without last new line character.

other changes:
	* src/hunspell/suggestmgr.*: erase ACCENT suggestion
	  Rationale: ACCENT suggestion was the same as Kevin Hendricks's map
	  suggestion algorithm, but with a less good interface in affix file.

	* src/hunspell/suggestmgr.*: combine cycle number limit
	  in badchar(), and forgotchar() with a time limit.

	* src/hunspell/affixmgr.*: remove NOMAPSUGS affix parameter

	* src/hunspell/{suggestmgr,hunspell}.*: strip periods from
	  suggestions (restore MySpell's original behaviour)
	  Rationale: OpenOffice.org has an automatic period handling mechanism
	  and suggestions look better without periods.
	- new affix file parameter: SUGSWITHDOTS
	  Add period(s) to suggestions, if input word terminates in period(s).
          (No need for OpenOffice.org dictionaries.)
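
          The period handling described above can be sketched like this: by
          default suggestions are returned without trailing periods, and with
          SUGSWITHDOTS enabled the periods of the input word are appended
          again. The function name and the boolean parameter are hypothetical
          stand-ins for the affix file setting.

```python
def finish_suggestions(word, suggestions, sugs_with_dots=False):
    """Strip trailing periods from suggestions; optionally re-add the
    periods that terminated the input word."""
    stripped = word.rstrip(".")
    dots = word[len(stripped):]  # the trailing period(s) of the input
    cleaned = [s.rstrip(".") for s in suggestions]
    if sugs_with_dots:
        return [s + dots for s in cleaned]
    return cleaned
```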

	* tests/germancompounding.aff: improve bad German affix in affix example
	  (computeren->computern). Suggested by Daniel Naber.

	* src/tools/example.cxx: add Myspell's example

	* src/tools/munch.cxx: add Myspell's munch

	* man{,/hu}/hunspell.4: refresh manual pages

2005-08-01 Németh László <nemethl@gyorsposta.hu>:
	* add missing MySpell files and features:
        - add MySpell license.readme, README and CONTRIBUTORS ({license,README,AUTHORS}.myspell)
        - add MySpell unmunch program (src/tools/unmunch.c)
        - add licenses to source (src/hunspell/license.{myspell,hunspell})
        - port MAP suggestion (with imperfect UTF-8 support)
        - add NOSPLITSUGS affix parameter
        - add NOMAPSUGS affix parameter

        * src/man/man.4: MAP, COMPOUNDPERMITFLAG, NOSPLITSUGS, NOMAPSUGS

	* src/hunspell/aff{entry,ixmgr}.cxx:
          - improve compound word support
	  - new affix parameter: COMPOUNDPERMITFLAG (see manual)
	* src/tests/compoundaffix{,2}.*: examples for COMPOUNDPERMITFLAG
	* src/tests/germancompounding.*: new solution for German compounding
	  Problems with German compounding reported by Daniel Naber

        * src/hunspell/hunspell.cxx: fix German uppercase word spelling
          with the spellsharps() recursive algorithm.
	  Default recursive depth is 5 (MAXSHARPS).
	* src/tests/germansharps*: extended German sharp s tests
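
          The recursive idea behind spellsharps() can be sketched as follows:
          an uppercase German word is lowercased, and each "ss" may stand for
          either "ss" or "ß", so both readings are tried up to a fixed depth.
          Only the name MAXSHARPS and its default of 5 come from the entry
          above; the rest is a hypothetical reconstruction, not the code in
          hunspell.cxx.

```python
MAXSHARPS = 5  # default recursion depth named in the entry above


def sharp_s_variants(word, start=0, depth=0):
    """Yield every variant of `word` with some 'ss' rewritten as 'ß'."""
    pos = word.find("ss", start)
    if pos == -1 or depth >= MAXSHARPS:
        yield word
        return
    # Reading 1: this 'ss' stands for 'ß'.
    replaced = word[:pos] + "ß" + word[pos + 2:]
    yield from sharp_s_variants(replaced, pos + 1, depth + 1)
    # Reading 2: this 'ss' really is 'ss'.
    yield from sharp_s_variants(word, pos + 1, depth + 1)


def check_uppercase(word, dictionary):
    """Accept an uppercase word if any sharp s reading is in the dictionary."""
    return any(v in dictionary for v in sharp_s_variants(word.lower()))
```

          With this sketch, "STRASSE" is accepted by a dictionary containing
          "straße", and "MASSE" is accepted by one containing "masse".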

	* src/tools/hunspell.cxx: fix fatal memory bug in non-interactive
          subshells without HOME environment variable
	  Bug detected with PHP by András Izsók.

2005-07-22 Németh László <nemethl@gyorsposta.hu>:
	* src/hunspell/csutil.hxx: utf16_u8()
        - fix 3-byte UTF-8 character conversion
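
          For reference, the standard 3-byte UTF-8 layout that such a
          conversion has to produce is shown below. This is a generic
          illustration of the encoding, not the body of utf16_u8(); the
          function name is hypothetical and surrogate code points are not
          treated specially.

```python
def encode_3byte_utf8(code_point):
    """Encode a code point in U+0800..U+FFFF as three UTF-8 bytes."""
    if not 0x0800 <= code_point <= 0xFFFF:
        raise ValueError("not a 3-byte UTF-8 code point")
    return bytes([
        0xE0 | (code_point >> 12),          # 1110xxxx: top 4 bits
        0x80 | ((code_point >> 6) & 0x3F),  # 10xxxxxx: middle 6 bits
        0x80 | (code_point & 0x3F),         # 10xxxxxx: low 6 bits
    ])
```

          The euro sign U+20AC, for instance, becomes the bytes E2 82 AC.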

2005-07-21 Németh László <nemethl@gyorsposta.hu>:
	* src/hunspell/csutil.hxx: hunspell_version() for OOo UNO module

2005-07-19 Németh László <nemethl@gyorsposta.hu>:
        * renaming:
          - src/morphbase -> src/hunspell
          - src/hunspell, src/hunmorph -> src/tools
          - src/huntokens -> src/parsers

        * src/tools/hunstem.cxx: add stemmer example

2005-07-18 Németh László <nemethl@gyorsposta.hu>:
        * configure.ac: --with-ui, --with-readline configure options
        * src/hunspell/hunspell.cxx: fix conditional compiling

        * src/hunspell/hunspell.cxx: create the HunSPELL.bak temporary file
          in the same directory as the checked file.

        * src/morphbase/morphbase.cxx:

            - handling German sharp s (ß)

            - fix (temporarily) analyize()

        * tests: a lot of new tests

	* po/, intl/, m4/: add gettext from GNU hello
	
	* po/hu.po: add Hungarian translation

	* doc/, man/: rename doc to man

2005-07-04 Németh László <nemethl@gyorsposta.hu>:
        * src/morphbase/hashmgr.cxx: set FLAG attribute instead of FLAG_NUM and FLAG_LONG
        
        * doc/hunspell.4: manual in English

2005-06-30 Németh László <nemethl@gyorsposta.hu>:
	* src/morphbase/csutil.cxx: add character tables from csutil.cxx of OOo 1.1.4

	* src/morphbase/affentry.cxx: fix Unicode condition checking

	* tests/{,utf}compound.*: tests compounding

2005-06-27 Németh László <nemethl@gyorsposta.hu>:
	* src/morphbase/*: fix Unicode compound handling

2005-06-23 Halácsy Péter:
        * src/hunmorph/hunmorph.cxx: delete spelling error message and suggest_auto() call

2005-06-21 Németh László <nemethl@gyorsposta.hu>:
        * src/morphbase: Unicode support
        * tests/utf8.*: SET UTF-8 test
        
        * src/morphbase: checking and fixing with Valgrind
        Memory handling error reported by Ferenc Szidarovszky

2005-05-26  Németh László <nemethl@gyorsposta.hu>:
	* suggestmgr.cxx: fix stemming
	* AUTHORS, COPYING, ChangeLog: set CC-LGPL free software license

2004-05-25  Varga Dániel  <daniel@all.hu>
	* src/stemtool: new subproject

2005-05-25  Halácsy Péter  <peter@halacsy.com>
	* AUTHORS, COPYING: set CC Attribution license

2004-05-23  Varga Dániel  <daniel@all.hu>
	* src: - modifications for compiling with Visual C++
	
	* src/hunmorph/csutil.cxx: correcting header of flag_qsort(),
	* src/hunmorph/*: correct csutil include

2005-05-19  Németh László <nemethl@gyorsposta.hu>
	* csutil.cxx: fix loop condition in lineuniq()
        bug reported by Viktor Nagy (nagyv nyelvtud hu).
	
	* morphbase.cxx: handle PSEUDOROOT with zero affixes
        bug reported by Viktor Nagy (nagyv nyelvtud hu).
	* tests/zeroaffix.*: add zeroaffix tests

2005-04-09  Németh László <nemethl@gyorsposta.hu>
	* config.h.in: reset with autoheader
	
	* src/hunspell/hunspell.cxx: set version

2005-04-06  Németh László <nemethl@gyorsposta.hu>
        * tests: tests
        
        * src/morphbase:
        New optional parameters in affix file:
        - PSEUDOROOT: forbids a root word while still allowing its suffixed forms.
        - COMPOUNDWORDMAX: max. words in compounds (default is no limit)
        - COMPOUNDROOT: marks compounds in the dictionary for handling special compound rules
        - remove COMPOUNDWORD, ONLYROOT

2005-03-21  Németh László <nemethl@gyorsposta.hu>
	* src/morphbase/*:
        - 2-byte flags, FLAG_NUM, FLAG_LONG 
        - CIRCUMFIX: marked suffixes and prefixes can only occur together
        - ONLYINCOMPOUND for fogemorphemes (Swedish, Danish) or Fuge-elements (German)
        - COMPOUNDBEGIN: allow marked roots, and roots with a marked suffix, at the beginning of compounds
        - COMPOUNDMIDDLE: like before, but middle of compounds
        - COMPOUNDEND: like before, but end of compounds
        - remove COMPOUNDFIRST, COMPOUNDLAST
