The word count command (wc) counts the bytes, characters, words, or lines in standard input, a file, or several files.
With no options, wc prints the line, word, and byte counts, in that order, for its input. Say we have a file called foo.txt with the following contents:
foo
foo
If we run wc against this file we will get:
$ wc foo.txt
2 2 8 foo.txt
Notice, however, that the byte count is two more than you might expect. A quick glance at the file shows only 6 visible characters, which suggests it should hold only 6 bytes. Running wc with the character-specific option “-m” gives the same result:
$ wc -m foo.txt
8 foo.txt
The count is 8 whether we use the “-c” option for bytes or the “-m” option for characters; the two agree here because every character in the file is a single byte. The extra two come from the unseen newline character at the end of each line. Echoing a line of text to wc shows the same effect, because echo appends a newline to its output:
$ echo foobar | wc -m
7
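To make the hidden newlines visible, one option is to dump the file byte by byte with od. The following is a minimal sketch. It assumes foo.txt holds exactly the two lines shown earlier and recreates it with printf, overwriting any existing foo.txt in the current directory.

```shell
# Recreate the example file: two lines, each ending in a newline.
# (Assumption: this matches the foo.txt used in the article.)
printf 'foo\nfoo\n' > foo.txt

# od -c prints every byte; each newline shows up as \n.
od -c foo.txt

# Read the file on standard input so wc prints no file name.
# tr strips the padding that some wc implementations add.
bytes=$(wc -c < foo.txt | tr -d ' ')
echo "$bytes"
```

The dump should show six letters and two \n entries, which accounts for the total of 8.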
If you want a count of only the visible characters, you have to take the newlines into account. We can strip the newline from echo’s output easily enough with tr:
$ echo foobar | tr -d '\n' | wc -m
6
Actually, we can do it even more easily with echo’s “-n” option, which suppresses the trailing newline that echo normally adds:
$ echo -n foobar | wc -m
6
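Not every shell's echo handles “-n” the same way, so where portability matters printf may be the safer choice. A small sketch of the same count using printf:

```shell
# printf writes its arguments exactly as given; it adds no trailing
# newline unless the format string contains one.
count=$(printf '%s' foobar | wc -m | tr -d ' ')
echo "$count"
```

The trailing tr only removes the leading spaces that some wc implementations print; it does not affect the count.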
But if you want to pass a file to wc and get a count that ignores the newlines, you’re probably stuck using tr:
$ cat foo.txt | tr -d '\n' | wc -m
6
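The pipeline above can be written without cat by redirecting the file into tr. Another possible approach, sketched here, is to subtract the newline count reported by “-l” from the character count. Both assume foo.txt has the two-line contents used throughout; the file is recreated here and overwrites any existing foo.txt in the current directory.

```shell
# Recreate the example file (assumed contents: two lines of "foo").
printf 'foo\nfoo\n' > foo.txt

# Variant 1: feed the file to tr by redirection, then count.
chars=$(tr -d '\n' < foo.txt | wc -m | tr -d ' ')

# Variant 2: total characters minus the number of newlines.
# wc -l counts newline characters, so the difference is the
# number of non-newline characters.
total=$(wc -m < foo.txt)
lines=$(wc -l < foo.txt)
alt=$((total - lines))

echo "$chars $alt"
```

The second variant avoids rewriting the stream, at the cost of reading the file twice.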