Spell checking a file using command line, non-interactively

I have a large text file containing many misspelled English words. I'm looking for a way to fix this file using a command-line spell checker on Linux. I found some ways to do this, but according to my searches all of them work interactively: on encountering a misspelled word, they suggest some corrections and the user has to choose one of them. Since my file is rather large and contains many wrong words, I can't edit it this way. I am looking for a way to tell the spell checker to replace every wrong word with its first candidate. Is there any way to do this? Does (a/hun)spell have an option for this?

Regards.

Statfarad answered 17/9, 2012 at 4:37 Comment(9)
GNU Emacs spell-checking mode seems to fit the bill, since you can replace all misspelled occurrences at once. - Radiography
So, I have to open the file in Emacs? - Statfarad
Can I open a 200MB file in Emacs and spell-check it without any problem? - Statfarad
Yes you can (assuming you have several gigabytes of RAM, and a recent Emacs). - Radiography
And may I add and use my own dictionary? I mean, is it possible to give Emacs a user-developed dictionary and have it use that as well? - Statfarad
Yes, you can add your own dictionary. - Radiography
How can I add my own dictionary? - Statfarad
Auto-correcting non-interactively is not supported by sensible spell-checkers because it would just replace the (misspelled) correct words with perfectly spelled nonsense. Who would want that? Or is this for some prank, or some "reductio ad absurdum" demonstration? - Ranjiv
Did my answer below answer your question? Any comments? I'm asking because you didn't accept any answer. - Rosol

If you don't need it to replace every wrong word, but simply point out the errors and print suggestions in a non-interactive manner, you can use ispell:

$ ispell -a < file.txt | grep ^\& > errors.txt

I'm unfortunately not aware of any standard Linux utility that does what you're requesting from the command line, although the emacs suggestion in the comments above comes close.
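
For what it's worth, the `ispell -a` output can be pushed one step further. Each `&` line has the form `& word count offset: suggestion, suggestion, ...`, so the first suggestion can be extracted mechanically. The sketch below turns that output into a sed script. The function name `suggestions_to_sed` and the file names `fix.sed` and `fixed.txt` are made up for illustration, the `\<` and `\>` word-boundary markers are a GNU sed extension, and it assumes the flagged words contain no characters that are special to sed. Treat it as a starting point rather than a finished tool:

```shell
#!/bin/sh
# Read `ispell -a` output on stdin and print one sed substitution per
# distinct flagged word, using the first suggestion as the replacement.
# Only lines starting with "&" carry suggestions; everything else is skipped.
suggestions_to_sed() {
    awk '/^&/ {
        word = $2                      # the flagged word
        sub(/^[^:]*: /, "")            # strip "& word count offset: "
        split($0, cands, ", ")         # suggestions are comma-separated
        if (!(word in seen)) {         # emit each word only once
            seen[word] = 1
            printf "s/\\<%s\\>/%s/g\n", word, cands[1]
        }
    }'
}

# Hypothetical usage:
#   ispell -a < file.txt | suggestions_to_sed > fix.sed
#   sed -f fix.sed file.txt > fixed.txt
```

Writing the substitutions to a file first leaves room to read through `fix.sed` and delete the rules that look wrong before anything is applied.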

Impolicy answered 9/3, 2014 at 16:33 Comment(0)

You can experiment with commands like these:

yes 0 | script -c 'ispell text.txt' /dev/null

or:

yes 1 | script -c 'aspell check text.txt' /dev/null

But keep in mind that the results can be poor even for simple things:

$ echo The quik broown fox jmps over the laazy dogg > text.txt
$ yes 0 | script -c 'ispell text.txt' /dev/null
Script started, file is /dev/null
Script done, file is /dev/null
$ cat text.txt
The quick brown fox amps over the lazy dog

It seems to be even worse with aspell, so it's probably better to go with ispell.
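
Since the corrections can be poor, it may help to list exactly which words were changed. The following is a minimal sketch, assuming you kept a copy of the original file before running the spell checker. The function name `changed_words` and the file name `text.orig.txt` are hypothetical. It compares the two files word by word, so it only lines up when no correction splits or joins words:

```shell
#!/bin/sh
# Print "old -> new" for every word position where the two files differ.
# Assumes both files contain the same number of words.
changed_words() {
    old_words=$(mktemp) || return 1
    new_words=$(mktemp) || return 1
    # Put one word per line so the files can be compared side by side.
    tr -s '[:space:]' '\n' < "$1" > "$old_words"
    tr -s '[:space:]' '\n' < "$2" > "$new_words"
    paste "$old_words" "$new_words" |
        awk -F '\t' '$1 != $2 { print $1 " -> " $2 }'
    rm -f "$old_words" "$new_words"
}

# Hypothetical usage:
#   cp text.txt text.orig.txt
#   yes 0 | script -c 'ispell text.txt' /dev/null
#   changed_words text.orig.txt text.txt
```

For the example sentence above, this would list five changes, including `jmps -> amps`, which makes the bad replacements easy to spot.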

You need the script command because some commands, like ispell, don't want to be scripted. Normally you would pipe the output of yes 0 into a command to simulate pressing the "0" key repeatedly, but some commands detect that they are being scripted and refuse to cooperate:

$ yes 0 | ispell text.txt
Can't deal with non-interactive use yet.
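
I haven't checked how ispell makes this decision, but programs typically do it by testing whether their standard input is a terminal. A small illustration of that kind of check (the function name `stdin_kind` is made up):

```shell
#!/bin/sh
# Report whether standard input is attached to a terminal.
# `[ -t 0 ]` is true only when file descriptor 0 is a terminal.
stdin_kind() {
    if [ -t 0 ]; then
        echo "terminal"
    else
        echo "not a terminal"
    fi
}

# Hypothetical usage:
#   stdin_kind            # run directly from an interactive shell
#   yes 0 | stdin_kind    # stdin is a pipe here
```

When stdin is a pipe, as with `yes 0 | ...`, the check fails, which would explain the refusal above.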

Fortunately they can be fooled with the script command:

$ yes 0 | script -c 'ispell text.txt' /dev/null
Script started, file is /dev/null
Script done, file is /dev/null

You can use a file other than /dev/null to log the output:

$ yes 0 | script -c 'ispell text.txt' out.txt
Script started, file is out.txt
Script done, file is out.txt
$ cat out.txt 
Script started on Tue 02 Feb 2016 09:58:09 PM CET

Script done on Tue 02 Feb 2016 09:58:09 PM CET
Rosol answered 2/2, 2016 at 20:59 Comment(0)
