What's in a word

The trouble with word counts

If I'm honest, I have to admit that I was very tempted to call this post “f*** words!”. Why? Because I'm a translator and translators are paid by the word. And that makes me angry every time I think about it.

What is a word anyway?

“Easy to answer,” says my friend who is a linguist: “a word is the smallest element that may be uttered in isolation with semantic or pragmatic content”. A wonderful definition, but the trouble starts when you try to count the little buggers, because they come in all shapes and sizes. Let’s look at a typical example of the kind of text that we, as a technical translation company, translate by the million every year. Just a warning: it is not an easy text to read, although it might be of use as bedtime reading if you have trouble falling asleep! In any case, you might find that you have to read it over and over before it reveals its intricate meaning. This particular text is an extract from a car manufacturer’s spare parts catalogue:





In addition to the semantic difficulties it presents, how you count the words of such a text is of vital importance as it has a big impact on what we as a translation company get paid for translating it into another language. According to MS Word, this text has 26 words. However, if you put a space after all the characters in the text that are squeezed in as separators between words (plus signs, commas) or used as elements to mark omissions (e.g. the slash in “O/Guide” for “operator’s guide”), the word count increases dramatically to 50 words!

This is just one example, but it is very telling in terms of how word counts have no relation whatsoever to the work required to translate the text. Nevertheless, the arbitrary use of separators or other characters between words is only part of the problem.
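To see how much a few squeezed-in separators can move the number, here is a minimal Python sketch. The sample line is invented for illustration (it is not the catalogue extract discussed above), the function names are hypothetical, and counting whitespace-separated runs is only a rough stand-in for what a word processor does.

```python
import re

# Hypothetical catalogue-style line, invented for illustration only.
SAMPLE = "Cover+gasket,front O/Guide,LH+RH door"


def count_whitespace_words(text: str) -> int:
    """Count runs of non-whitespace characters (a rough word-processor-style count)."""
    return len(text.split())


def space_after_separators(text: str) -> str:
    """Insert a space after each '+', ',' or '/' that is directly followed by text."""
    return re.sub(r"([+,/])(?=\S)", r"\1 ", text)


if __name__ == "__main__":
    spaced = space_after_separators(SAMPLE)
    print(count_whitespace_words(SAMPLE))  # 3
    print(spaced)                          # Cover+ gasket, front O/ Guide, LH+ RH door
    print(count_whitespace_words(spaced))  # 8
```

The text itself has not changed, and neither has the work needed to translate it, yet the count goes from 3 to 8 in this made-up case.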

Why do they all count differently?

If you have ever wondered why word counts differ between different translation service providers, it may be linked to certain phenomena that appear in written language:

  • hyphenated words: computer-assisted
  • contractions: do not vs don’t
  • abbreviations, acronyms, units of measurement: e.m.p., f.s.a., q.d.a.m., mg/dL
  • numbers, dates, mathematical formulas, etc.

These items are counted differently depending on the word processing application or computer-assisted translation system used for quoting, and there is no standard that prescribes how they must be counted. But hang on, it gets even more confusing.
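The effect of the missing standard can be sketched with two made-up counting rules applied to the items listed above. Neither rule is claimed to match any particular word processor or translation tool; both are assumptions chosen only to show that equally plausible rules disagree.

```python
import re


def count_by_whitespace(text: str) -> int:
    """Rule A (assumed): anything between spaces is one word."""
    return len(text.split())


def count_by_alnum_runs(text: str) -> int:
    """Rule B (assumed): every unbroken run of letters or digits is one word."""
    return len(re.findall(r"[A-Za-z0-9]+", text))


if __name__ == "__main__":
    for item in ["computer-assisted", "do not", "don't", "mg/dL", "e.m.p."]:
        print(f"{item!r}: rule A = {count_by_whitespace(item)}, "
              f"rule B = {count_by_alnum_runs(item)}")
```

Under these two rules, only "do not" gets the same count (2) from both. "computer-assisted", "don't" and "mg/dL" are one word under rule A and two under rule B, and "e.m.p." is one word under rule A and three under rule B.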


If you are learning German, you may have come across the word for Danube steamship company captain, as an example of how long some German words can get. This one clocks in at 29 letters, but it is not just languages that readily compound or agglutinate, like German, Hungarian, Finnish or Farsi, that allow for the creation of long words. Take the famous “Supercalifragilisticexpialidocious” (34), or the wonderful “Antidisestablishmentarianism” (28) as proof that English can produce long words too. To cut a long story short, in certain languages and especially in technical texts, compound terms abound, but when it comes to paying the translator, each one is paid as a single word. That is, if the word count is based on the source language, as is commonly the case nowadays.

It’s characters we need

There are other, more accurate ways to measure the work of a translator. One of them is the character or line count. A character is a character is a character, i.e. there is not much room for interpretation. The only question is whether blank spaces should count as characters, and the answer is a straight “yes, they should!”. After all, it takes the same effort to press the spacebar on the keyboard as it does to type any other character. Do you know what the most widely used machine translation engine, Google Translate, uses as its invoicing unit? Yes, you're right, it is characters, not some shoddy word count.
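A character count is also trivial to reproduce, which is part of its appeal. The sketch below is a minimal Python illustration with hypothetical function names. It assumes that "character" means one Unicode code point, which is what Python's len() counts, and it lets you switch blank spaces on or off.

```python
def count_characters(text: str, include_spaces: bool = True) -> int:
    """Count characters, optionally leaving out whitespace."""
    if include_spaces:
        return len(text)
    return sum(1 for ch in text if not ch.isspace())


def count_words(text: str) -> int:
    """Count whitespace-separated runs, for comparison."""
    return len(text.split())


if __name__ == "__main__":
    for text in ["do not", "Antidisestablishmentarianism"]:
        print(f"{text!r}: {count_words(text)} word(s), "
              f"{count_characters(text)} characters with spaces, "
              f"{count_characters(text, include_spaces=False)} without")
```

Here "do not" comes out as 2 words and 6 characters (5 without the space), while "Antidisestablishmentarianism" is a single word but 28 characters. The character count tracks the amount of text far more closely than the word count does.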

In fact, invoicing by character was the prevailing method for translation invoicing in German-speaking countries before the Anglo-Saxon “furlong per fortnight” word counting system started to gain ground. This was caused by some of the bigger translation companies from the UK and US gaining a share in the German-speaking market and customers wanting to be able to compare the quotes of the newcomers to those of the older companies. In any case, the move to word counts is a terrible regression. It’s like going back to measuring liquids by kilderkin or textiles by the ell.

It seems odd that an industry has adopted a non-standardised, arbitrary, undefinable means of quantifying the service it provides to its customers. Are there hidden, vested interests? Is this a conspiracy? I don’t know, but I think we would all be better off, language providers and customers alike, if we left furlongs and firkins in the past and started to use characters instead.

Nothing is perfect

Counting by character is not the answer to everything when it comes to measuring the work that goes into translation. Both word and character prices are a means of measuring an aggregate service: project management, translation & revision, proof-of-concept, terminology management, quality assurance, etc. All of these activities or just some of them may be needed to produce a translation. As a buyer, you should always make sure you know what service level you are paying for when you compare quotes.*

However, counting characters instead of words would mean we gain in transparency, accountability and comparability in the industry. And that you could call progress, right?

* Learn more about translation service scopes in our blog post The best translation company in the world.

