The word count command (wc) counts the bytes, characters, words, or lines in standard input, a file, or several files.

With no options, wc prints the line, word, and byte counts for its input, in that order. Say we have a file called foo.txt with the following contents:

foo
foo

If we run wc against this file we will get:

$ wc foo.txt
2 2 8 foo.txt
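Each of the three numbers can also be requested on its own with the -l, -w, and -c options. The following is a minimal sketch, not a definitive recipe: it assumes foo.txt holds two lines that each end in a newline (consistent with the 8 bytes reported above), and the temporary directory and variable names are illustrative only.

```shell
# Work in a throwaway directory so nothing existing is overwritten.
tmpdir=$(mktemp -d)

# Recreate the sample file: two lines, each ending in a newline (assumed).
printf 'foo\nfoo\n' > "$tmpdir/foo.txt"

# Reading from standard input keeps the file name out of the output;
# tr strips the padding that some wc implementations add.
lines=$(wc -l < "$tmpdir/foo.txt" | tr -d ' ')
words=$(wc -w < "$tmpdir/foo.txt" | tr -d ' ')
bytes=$(wc -c < "$tmpdir/foo.txt" | tr -d ' ')

echo "lines=$lines words=$words bytes=$bytes"   # lines=2 words=2 bytes=8

rm -r "$tmpdir"
```

Redirecting the file into wc rather than naming it as an argument is a convenient way to get a bare number for use in scripts.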

Notice, however, that the byte count is two more than you might expect. A quick glance at the file shows only 6 visible characters, each of them a single byte, so we would expect 6 bytes. Running wc with the character-specific option “-m” gives the same result:

$ wc -m foo.txt
8 foo.txt

The same thing happens whether we use the “-c” option for bytes or the “-m” option for characters. This is because wc also counts the unseen newline character at the end of each line. Even if we echo a line of text to wc, we get the same effect:

$ echo foobar | wc -m
7
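One way to see the extra character for yourself is to dump the bytes that echo actually writes. This is only a sketch, assuming the standard od utility is available; the exact column spacing of its output varies between systems, so it is not reproduced here.

```shell
# od -c prints each byte of its input as a character, using escapes
# such as \n for bytes that are otherwise invisible.
echo foobar | od -c
```

The last character on the first line of the dump should appear as \n, which is the seventh character that wc reports.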

If you want a count of only the visible characters, you have to take the newlines into account. We can remove the newline from echo's output easily enough with tr:

$ echo foobar | tr -d '\n' | wc -m
6

Even easier, we can use echo’s “-n” option, which suppresses the trailing newline that echo would otherwise print:

$ echo -n foobar | wc -m
6
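Not every echo implementation treats -n as an option (POSIX leaves the behaviour implementation-defined), so in portable scripts printf is a common substitute. A minimal sketch, with count as an illustrative variable name:

```shell
# printf prints only what the format asks for, so no newline is added here.
# tr strips the padding that some wc implementations add.
count=$(printf '%s' foobar | wc -m | tr -d ' ')

echo "$count"   # 6
```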

But if you want wc to count the characters in a file without the newlines, one straightforward approach is to strip them with tr first:

$ cat foo.txt | tr -d '\n' | wc -m
6
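The same result can be reached in a couple of other ways. The sketch below is one possibility rather than the only approach: it assumes every line of foo.txt ends in a single newline (no carriage returns), and the temporary directory and variable names are illustrative only.

```shell
# Recreate the sample file in a throwaway directory (assumed contents).
tmpdir=$(mktemp -d)
printf 'foo\nfoo\n' > "$tmpdir/foo.txt"

# Option 1: redirect the file into tr directly, without cat.
stripped=$(tr -d '\n' < "$tmpdir/foo.txt" | wc -m | tr -d ' ')

# Option 2: total characters minus one newline per line.
chars=$(wc -m < "$tmpdir/foo.txt" | tr -d ' ')
lines=$(wc -l < "$tmpdir/foo.txt" | tr -d ' ')
visible=$((chars - lines))

echo "$stripped $visible"   # 6 6

rm -r "$tmpdir"
```

The subtraction works because wc -l counts newline characters, so it reports exactly how many of the counted characters are newlines.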
