How to make custom dictionary for Hunspell [closed]

5

31

I have a question about building a custom dictionary for hunspell. I'm using a general English dictionary and affix file right now. How can I add user-specified words to that dictionary for each of my users?

Challah answered 26/9, 2011 at 21:38 Comment(2)
Just for reference, for those who are looking for a place to start: github.com/karandesai28/… - Dissenter
Switch to Aspell. It looks much better documented. Given the poor selection of answers to your question and almost nothing on the web, I am switching... - Evenhanded
19

Create your own word list and affix file for your language if none exists. For Papiamentu, the native language of Curaçao, there was no such dictionary. I had a hard time finding out how to create these files, so I documented the process here: http://www.suares.com/index.php?page_id=25&news_id=233

Nigeria answered 20/2, 2013 at 15:29 Comment(2)
Hey @waldir, great job you're doing. Can you please explain the "frequency list of characters" in more detail? What is the input file and what is the output file? Does "words" correspond to the word-list file, and where should I put the results, and under what name? That part is not clear. Which is better, the first method or the second? - Klara
@AndrésChandía I didn't write this answer, I just edited it to fix the markdown. You should contact the original writer of this answer instead (user1250098). Try here: suares.com/index.php?topic=contact - Sklar
6

I'm trying to do the same but haven't found enough information to begin yet.

However, you may want to look at the man page "hunspell - format of Hunspell dictionaries and affix files".

UPDATE

If you are working with .NET, you can download the Hunspell .NET port. Using it is fairly easy too.

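// Assumes the NHunspell package, where Load takes the affix file and the dictionary file together.
using NHunspell;

using (var bee = new Hunspell())
{
    bee.Load("path_to_en_US.aff", "path_to_en_US.dic");
    bee.Add("my_custom_word1");   // added words last only as long as this object
    bee.Add("my_custom_word2");
    var suggestions = bee.Suggest("misspel_word");   // candidate corrections
}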
Kiefer answered 22/12, 2011 at 23:47 Comment(1)
Can we process dictionary files somehow? Arabic is too complex for me to work out by hand, but I need to get all the words and related words from the dictionary. - Outbreak
4

The secret to getting hunspell to work (at least for me) was to figure out the locations it would search that were owned by me, and put the custom dictionaries there. Also bear in mind that the dictionaries are in a specific format, so you need to obey those rules.
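
For orientation, a .dic file is plain text: the first line holds an approximate word count, and each following line holds one word, optionally followed by a slash and affix flags that the .aff file defines. Here is a minimal Python sketch of reading that layout. The function name `parse_dic`, the sample words, and the flags `DGS` are hypothetical, and the sketch ignores escaped slashes and morphological fields.

```python
def parse_dic(text):
    """Split .dic text into (declared_count, [(word, flags), ...])."""
    # Drop blank lines and surrounding whitespace.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    # The first line is the (approximate) number of entries.
    declared = int(lines[0])
    entries = []
    for ln in lines[1:]:
        # "work/DGS" -> word "work", flags "DGS"; no slash -> empty flags.
        word, _, flags = ln.partition("/")
        entries.append((word, flags))
    return declared, entries


# Hypothetical sample content, not taken from a real dictionary.
sample = "3\nhello\nwork/DGS\nCuraçao\n"
count, entries = parse_dic(sample)
print(count, entries)
```

A plain custom word list needs no flags at all, which is why the steps in this answer only add the count line.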

Running hunspell -D will show you the search path. On macOS, mine includes /Users/scott/Library/Spelling, so I created that directory and put my dictionary there. Let's say you want to call your dictionary mydict and your input file of words is called dict.txt. We'll use the path I just showed.

First, copy the default .aff file. You will see it in the output of hunspell -D, as described above. For me, it is listed as /Library/Spelling/en_US, so:

cp /Library/Spelling/en_US.aff /Users/scott/Library/Spelling/mydict.aff

Then, every time you update your input list (dict.txt), do this:

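DICT=/Users/scott/Library/Spelling/mydict.dic
cd ~/doc/dict
# Sort and de-duplicate the word list
sort -u dict.txt > dict.in
# The first line of a .dic file is the word count; read from stdin so wc prints no filename
wc -l < dict.in | tr -d ' ' > "$DICT"
# Then one word per line
cat dict.in >> "$DICT"
rm dict.in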
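
The same build step can be sketched in Python for cases where the word list comes from an application rather than a text file. This is only an illustration under assumptions: `build_dic`, the sample words, and the temporary output directory are hypothetical, and the file is written as UTF-8, which has to match the SET line of the .aff file you copied.

```python
from pathlib import Path
import tempfile


def build_dic(words):
    """Return .dic text: a count line, then sorted unique words."""
    # Strip whitespace, drop empty entries, and de-duplicate.
    unique = sorted({w.strip() for w in words if w.strip()})
    return "\n".join([str(len(unique)), *unique]) + "\n"


# Hypothetical input; in practice this would be the contents of dict.txt.
raw_words = ["hunspell", "papiamentu", "hunspell", "", "affix"]
dic_text = build_dic(raw_words)

# Written to a temporary directory here; a real setup would use a
# directory from the `hunspell -D` search path instead.
dic_path = Path(tempfile.mkdtemp()) / "mydict.dic"
dic_path.write_text(dic_text, encoding="utf-8")
print(dic_text)
```

Note that Python sorts by code point, so the order can differ from what `sort` produces under your locale.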

To run hunspell, just specify both dictionaries. So for me, because I want a list of misspellings, I use

hunspell -l -d mydict,en_US <filename>
Shoreless answered 9/6, 2018 at 16:51 Comment(1)
You can use the -p option, and then you only need the sorted list of words: cat dict.txt | sort -u > custom_words. Then run hunspell -l -p custom_words and it will use the default dictionary, but also include the custom words from your file. No need to copy the .aff file. - Lucre
2

I am implementing this type of feature as well. Once you've created the Hunspell object with an associated dictionary you can add individual words to it.

Keep in mind, though, that these words are only available for as long as the Hunspell object is alive. Every time you create a new object, you have to add all the user-defined words again.
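
One way to handle this per user is to keep each user's words in a plain text file and replay them into every new spell-checker object. The Python sketch below assumes only that the checker exposes an `add(word)` method. `FakeChecker` is a hypothetical stand-in for a real Hunspell binding, and the file name `alice.txt` is made up.

```python
from pathlib import Path
import tempfile


class FakeChecker:
    """Stand-in for a Hunspell binding; it only records added words."""

    def __init__(self):
        self.words = set()

    def add(self, word):
        self.words.add(word)


def load_user_words(checker, path):
    """Replay a user's saved words into a fresh checker; return how many."""
    if not path.exists():
        return 0
    count = 0
    for line in path.read_text(encoding="utf-8").splitlines():
        word = line.strip()
        if word:
            checker.add(word)
            count += 1
    return count


def remember_word(checker, path, word):
    """Add a word to the live checker and append it to the user's file."""
    checker.add(word)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(word + "\n")


user_file = Path(tempfile.mkdtemp()) / "alice.txt"
first = FakeChecker()
remember_word(first, user_file, "papiamentu")

# A new object starts empty, so the saved words are loaded again.
second = FakeChecker()
loaded = load_user_words(second, user_file)
print(loaded, second.words)
```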

Openminded answered 21/3, 2013 at 19:16 Comment(0)
-1

Have a look at the OpenOffice documentation:

http://www.openoffice.org/lingucomponent/

especially this document: http://www.openoffice.org/lingucomponent/dictionary.html

It's a good starting point.

Ramentum answered 5/5, 2014 at 6:6 Comment(1)
I didn't downvote you, but I do want to point out that when a user asks for help, sending them to the documentation is not what they're after. (I say this as a programmer; programmers are also users, though many programmers ignore this in their insolence.) Users don't care how something works as long as it works. Rather than pointing them to what they have probably already seen, give them an example, something to work from. That's what they're after. Yes, documentation is often ignored, but that's not the point here. - Misdoing
