The XMLmind Spell Checker component comes with a command-line tool called the "Dictionary Builder", which creates compiled dictionaries from plain word lists.
To build a compiled dictionary, the builder takes as input the following data:
An expanded word list, which is a simple text file with words separated by whitespace (spaces, tabs, or newlines). The builder accepts several such lists and simply merges them.
A hints file that defines language-specific properties, plus hints for the spell checker engine to work in a smarter way.
The syntax and the semantics of hints files are described below.
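To illustrate the expanded word list format described above, here is a minimal Python sketch of how several whitespace-separated lists could be merged. It is not the builder's actual code; the function names `parse_word_list` and `merge_word_lists` are hypothetical, and the sketch assumes that duplicates across lists collapse into a single entry.

```python
def parse_word_list(text):
    """Split the content of one word list on any whitespace (spaces, tabs, newlines)."""
    return text.split()


def merge_word_lists(texts):
    """Merge the contents of several word lists into one set of words.

    Assumption: a word present in more than one list is kept only once.
    """
    merged = set()
    for text in texts:
        merged.update(parse_word_list(text))
    return merged


if __name__ == "__main__":
    first = "apple banana\ncherry"
    second = "banana\tdate"
    print(sorted(merge_word_lists([first, second])))
```

In practice each string would be the content of one word list file read from disk.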
It is also possible to specify common prefixes as a word list file: a set of prefixes that can be prepended to any word, provided the AllowPrefixes option is enabled. For example, if you define "super-" as a common prefix, any legal word prefixed with "super-" is accepted. This feature should be used with caution, which is why an option can disable it.
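The effect of common prefixes can be sketched as follows. This is only an illustration of the rule stated above, not the spell checker engine's implementation; the function `is_accepted` and its `allow_prefixes` parameter are hypothetical stand-ins for the engine's lookup and the AllowPrefixes option.

```python
def is_accepted(word, dictionary, prefixes, allow_prefixes=True):
    """Return True if the word is legal, directly or through a common prefix.

    Assumption: a prefixed word is legal when the part that remains after
    removing the prefix is itself a dictionary word.
    """
    if word in dictionary:
        return True
    if not allow_prefixes:
        # Prefix handling disabled: only plain dictionary words are legal.
        return False
    return any(
        word.startswith(prefix) and word[len(prefix):] in dictionary
        for prefix in prefixes
    )


if __name__ == "__main__":
    dictionary = {"market", "star"}
    prefixes = {"super-"}
    print(is_accepted("super-market", dictionary, prefixes))
    print(is_accepted("super-market", dictionary, prefixes, allow_prefixes=False))
```

The sketch also shows why the feature calls for caution: any dictionary word becomes legal with any declared prefix, whether or not the combination exists in the language.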
Additionally, the builder accepts another word list file that defines the most frequent words in the language. It is used to "boost" these frequent words toward the top of the suggestion lists, which gives statistically better suggestions.
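One simple way to picture this boosting is a stable reordering of a suggestion list. The sketch below is an assumption about the general idea only; the real engine's ranking is not described here, and the function `boost_frequent` is hypothetical.

```python
def boost_frequent(suggestions, frequent_words):
    """Move frequent words ahead of the others, keeping the original order otherwise.

    sorted() is stable, and False sorts before True, so frequent words
    (key False) come first without being reordered among themselves.
    """
    return sorted(suggestions, key=lambda word: word not in frequent_words)


if __name__ == "__main__":
    suggestions = ["thee", "the", "then"]
    frequent = {"the", "then"}
    print(boost_frequent(suggestions, frequent))  # ['the', 'then', 'thee']
```

A stable sort is used so that the engine's own ordering still decides between two suggestions of the same frequency class.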