This project aims to recreate the classic Unix command line tool wc (word count). The Unix wc tool is a powerful utility used for counting the number of lines, words, characters, and bytes in text files. This project follows the Unix philosophy of writing simple parts connected by clean interfaces, resulting in a tool that does one thing but does it well.
- Count the number of bytes in a file (-c option).
- Count the number of lines in a file (-l option).
- Count the number of words in a file (-w option).
- Count the number of characters in a file (-m option), with support for multibyte characters depending on the locale.
- Default behavior without options to count lines, words, and bytes.
- Ability to read from standard input if no filename is specified, enabling integration with other command line tools.
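The four counts listed above can be sketched in a few lines of Python. This is a minimal illustration, not the actual contents of ccwc.py: the function names are hypothetical, and the character count assumes UTF-8 input rather than consulting the locale as the real wc does.

```python
# Hypothetical helpers illustrating the four counts; names are assumptions,
# not taken from ccwc.py.

def count_bytes(data: bytes) -> int:
    """Raw byte count (the -c option)."""
    return len(data)


def count_lines(data: bytes) -> int:
    """Number of newline bytes (the -l option), which is how Unix wc counts lines."""
    return data.count(b"\n")


def count_words(data: bytes) -> int:
    """Number of whitespace-separated tokens (the -w option)."""
    return len(data.split())


def count_chars(data: bytes, encoding: str = "utf-8") -> int:
    """Decoded character count (the -m option); UTF-8 is assumed here."""
    return len(data.decode(encoding, errors="replace"))


# Two accented letters take two bytes each in UTF-8, so bytes exceed characters.
sample = "héllo wörld\nsecond line\n".encode("utf-8")
print(count_lines(sample), count_words(sample), count_chars(sample), count_bytes(sample))
# prints: 2 4 24 26
```

Working on bytes and decoding only for the character count keeps the byte count exact while still showing why -c and -m can differ on multibyte text.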
- Any standard IDE or text editor for Python development (e.g., Visual Studio Code, PyCharm).
- Python 3.x installed on your system.
- Clone the repository to your local machine.
- Save a text file (e.g., test.txt) in the project directory to test the functionalities of the tool.
Run the tool from the command line using the Python interpreter. Here are some example commands:
- Count bytes: python ccwc.py -c test.txt
- Count lines: python ccwc.py -l test.txt
- Count words: python ccwc.py -w test.txt
- Count characters: python ccwc.py -m test.txt
- Default count (lines, words, bytes): python ccwc.py test.txt
- Read from stdin: cat test.txt | python ccwc.py -l
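One way the commands above could be handled is with argparse from the standard library. The sketch below is an assumption about structure, not the actual ccwc.py: the names parse_args and read_input and the show_* attributes are hypothetical. It shows the two behaviors the examples rely on: falling back to lines, words, and bytes when no option is given, and reading standard input when no filename is given.

```python
import argparse
import sys


def parse_args(argv=None):
    """Parse wc-style flags; the attribute names are illustrative assumptions."""
    parser = argparse.ArgumentParser(prog="ccwc")
    parser.add_argument("-c", dest="show_bytes", action="store_true", help="count bytes")
    parser.add_argument("-l", dest="show_lines", action="store_true", help="count lines")
    parser.add_argument("-w", dest="show_words", action="store_true", help="count words")
    parser.add_argument("-m", dest="show_chars", action="store_true", help="count characters")
    parser.add_argument("file", nargs="?", help="input file; standard input if omitted")
    args = parser.parse_args(argv)
    # With no flags, fall back to the default: lines, words, and bytes.
    if not (args.show_bytes or args.show_lines or args.show_words or args.show_chars):
        args.show_lines = args.show_words = args.show_bytes = True
    return args


def read_input(path=None):
    """Return the raw bytes of the file, or of standard input when path is None."""
    if path is None:
        return sys.stdin.buffer.read()
    with open(path, "rb") as handle:
        return handle.read()
```

Reading through sys.stdin.buffer keeps the input as bytes, so a byte count on piped data is not affected by text decoding.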
The development of this tool was divided into several steps, each focusing on adding a specific feature:
- Step One: Implement byte count functionality.
- Step Two: Add support for counting lines.
- Step Three: Introduce word count capability.
- Step Four: Implement character count, considering multibyte characters.
- Step Five: Combine byte, line, and word count for default option.
- Final Step: Enable reading from standard input.