Academic-Writing-Check

Checks for passive voice, weasel words, duplicate words, typographical errors, and words Strunk & White don't like.

Primary language: Perl

NEW

Basic Vim integration support

Emacs Integration

HOWTO run on Windows

Introduction

This script attempts to find common errors in academic writing. It is focused on academic writing in LaTeX, but most checks should work on any ASCII text. We don't attempt any sort of LaTeX parsing currently (maybe someday).

Currently the script tries to find the following issues:

  • passive : Passive voice, colored red by default.
  • dups : Duplicate words such as 'the the', even when split across two lines, colored purple by default.
  • weasel : Weasel words such as 'various' and 'many', colored green by default.
  • abbr : Incorrectly written abbreviations such as 'i.e' and 'et. al.', colored blue by default.
  • typography : Common typography errors, such as a \footnote placed before punctuation, numbers without commas, URLs not typeset with \url, and others, colored yellow by default.
  • strunk : Issues that Strunk and White discuss in their classic, The Elements of Style. Currently covers only a subset of the words from Chapter IV. Colored cyan by default.
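To make the dups check concrete, here is a minimal Python sketch of the idea. It is not the script's own code (the script is Perl); the function name find_duplicate_words is hypothetical, and the sketch assumes that words are runs of letters compared case-insensitively and that the previous word is remembered across line breaks.

```python
import re

def find_duplicate_words(lines):
    """Return (line_number, word) pairs where a word repeats the word before it.

    Assumption: the previous word is carried across line breaks, so 'the'
    ending line 1 and 'the' starting line 2 counts as a duplicate.
    """
    hits = []
    previous = None
    for number, line in enumerate(lines, start=1):
        # Treat every run of ASCII letters as one word; punctuation is ignored.
        for word in re.findall(r"[A-Za-z]+", line):
            lowered = word.lower()
            if lowered == previous:
                hits.append((number, word))
            previous = lowered
    return hits
```

Because punctuation is ignored, this sketch would also flag legitimate repeats such as "that, that", so a real check probably needs more care than this.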

The script accepts options via the standard UNIX style:

 --no-{option} 

where {option} is one of the names that start each item in the list above (passive, dups, weasel, abbr, typography, strunk). The script also skips lines beginning with %, which are LaTeX comments. It prints the filename and line number, with the offending text marked in color.

You can also pass -d to disable all checks; each check you want must then be enabled explicitly. Thus -d --abbr looks only for abbr errors.
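The way these flags combine can be sketched in Python as follows. This is only an illustration of the rules described above, not the script's actual option handling; the function name enabled_checks is hypothetical, and the sketch assumes that -d takes effect first regardless of where it appears on the command line.

```python
CHECKS = ("passive", "dups", "weasel", "abbr", "typography", "strunk")

def enabled_checks(args):
    """Work out which checks are active for a list of command-line flags."""
    # Assumption: -d clears everything first, wherever it appears in args.
    enabled = set() if "-d" in args else set(CHECKS)
    for arg in args:
        if arg.startswith("--no-") and arg[5:] in CHECKS:
            enabled.discard(arg[5:])   # --no-{option} switches a check off
        elif arg.startswith("--") and arg[2:] in CHECKS:
            enabled.add(arg[2:])       # --{option} switches a check on
    return enabled
```

With this model, enabled_checks(["-d", "--abbr"]) leaves only abbr active, and enabled_checks(["--no-passive"]) leaves everything except passive.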

Colors can be specified with

--{option}_color={color} --def_color={color}

where {option} is one of the options above, and {color} is one of

('black','red','green','yellow','blue','purple','cyan','white').

You can also prefix the color name with dark to get a darker shade. The def_color option sets the color of unmarked text.

Thus,

--passive_color=darkgreen

marks passive voice in dark green.
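For readers curious how such color names could translate into terminal output, here is a small Python sketch using ANSI escape codes. The mapping is an assumption made for illustration: plain names use the bright foreground codes (90-97) and the dark prefix selects the standard ones (30-37). The script may choose its shades differently, and the function name ansi_code is hypothetical.

```python
COLORS = ("black", "red", "green", "yellow", "blue", "purple", "cyan", "white")

def ansi_code(color):
    """Translate a color name such as 'red' or 'darkgreen' into an ANSI escape.

    Assumption: plain names map to the bright codes (90-97) and the 'dark'
    prefix selects the standard codes (30-37).
    """
    dark = color.startswith("dark")
    name = color[4:] if dark else color
    if name not in COLORS:
        raise ValueError(f"unknown color: {color}")
    base = 30 if dark else 90
    # The position in COLORS matches the ANSI ordering (purple is magenta).
    return f"\033[{base + COLORS.index(name)}m"
```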

The script can be called in multiple ways:

  • ./checkwriting <files>
  • ./checkwriting <directory> : In this case the script uses all *.tex and *.bbl files in the directory. If it doesn't find any, it waits for input on stdin.
  • ./checkwriting : With no files, the script waits for diff-style input on stdin. I use it this way often: after making some changes to the manuscript, run git diff | ./checkwriting and you only have to look at the new errors.
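The diff mode depends on picking out only the lines a change adds. A rough Python sketch of that step, assuming git-style unified diff input, is shown below. It is not taken from the script, and the name added_lines is hypothetical.

```python
import re

# Hunk header such as '@@ -3,2 +3,3 @@'; group 1 is the first new-file line.
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")

def added_lines(diff_text):
    """Yield (filename, line_number, text) for every line a git diff adds."""
    filename = None
    line_number = 0
    in_hunk = False
    for line in diff_text.splitlines():
        header = HUNK_HEADER.match(line)
        if line.startswith("diff --git "):
            in_hunk = False                      # a new file section starts
        elif not in_hunk and line.startswith("+++ "):
            path = line[4:]
            filename = path[2:] if path.startswith("b/") else path
        elif header:
            line_number = int(header.group(1))
            in_hunk = True
        elif in_hunk and line.startswith("+"):
            yield filename, line_number, line[1:]
            line_number += 1
        elif in_hunk and line.startswith(" "):
            line_number += 1                     # context line in the new file
        # Removed lines ('-') do not advance the new file's numbering.
```

Only the yielded lines would then be handed to the checks, which is why old errors in untouched text stay out of the report.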

Notes on the warnings

Some of the warnings are obvious, some aren't. The non-obvious ones are discussed here.

  • The typography warning "add a @" is about telling LaTeX where a sentence ends. LaTeX assumes that a period ends a sentence unless it follows a capital letter, in which case it assumes the period marks an abbreviation. So to let LaTeX know that 'iOS.' really is the end of a sentence, write 'iOS\@.'
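A heuristic for spotting these cases might look like the Python sketch below. It is an assumption about how such a check could work, not the script's actual rule, and the name sentence_end_candidates is hypothetical. It flags a period that directly follows a capital letter and is followed by whitespace plus another capital letter, or by the end of the line.

```python
import re

# A capital letter right before a period, followed by whitespace and another
# capital letter, or by the end of the line. A period already written as \@.
# does not match, because the character before it is '@'.
CAPITAL_BEFORE_PERIOD = re.compile(r"[A-Z]\.(?=\s+[A-Z]|\s*$)")

def sentence_end_candidates(line):
    """Return offsets of periods that may need a LaTeX sentence-end marker."""
    return [match.start() + 1 for match in CAPITAL_BEFORE_PERIOD.finditer(line)]
```

This will also flag initials such as 'J. Smith', where the period really is an abbreviation, so the results are candidates to review rather than definite errors.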

Acks

The original idea and code for this came from Matt Might's blog.

Here are some other links that might be useful (and might be integrated into awc someday):

And to put it all in perspective, Stephen Fry's monologue on Language

Tip: If you want to pipe the output, less -R is useful to maintain the colors.