Kilobytes, Kibibytes, and Why 1 TB ≈ 931 GiB Transcript

You just got yourself a shiny new 1 terabyte hard disk drive. You’re all excited, until Windows tells you that there are only 931 GB. What the funk is going on?

In most of science, the standard convention is to think in terms of powers of 10. So under the Système International d’Unités or SI, the prefix kilo denotes 10³ or 1,000. The prefix mega is 10⁶ or a million; giga is 10⁹ or a billion; tera is 10¹² or a trillion. And so on and so forth.[1]

Back in the day, when some computer dudes first started talking about the kilobyte, they somehow preferred to think in terms of powers of 2. And so by kilobyte, they did NOT mean 1,000 bytes, as would have been the case, if they had simply obeyed the SI convention. Instead, by kilobyte, what they meant was 2¹⁰ or 1,024 bytes.[2]

Similarly, a megabyte meant 2²⁰ bytes and not a million bytes. A gigabyte meant 2³⁰ bytes and not a billion bytes. And so on and so forth.

At the same time, just to keep things confusing, there was also a small minority who preferred to stick with the SI convention and so take the kilobyte to mean 1,000 bytes, a megabyte to mean a million bytes, and so on and so forth.

After several decades of confusion, a new international convention was finally introduced in December 1998.[3]

Henceforth, a kilobyte would refer strictly to 1,000 bytes, in accordance with the SI. A new unit, called the kibibyte, or KiB in symbols, would be 1,024 bytes.

Similarly, a megabyte would refer strictly to 1,000 kilobytes or 1 million bytes, while a mebibyte, or MiB in symbols, would be 1,024 kibibytes or 2²⁰ bytes. And so on and so forth.
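To make the two conventions concrete, here is a minimal Python sketch. The dictionary names `SI` and `IEC` are just illustrative labels chosen for this example and do not come from any library.

```python
# Decimal (SI) prefixes: each step up is a factor of 1,000.
SI = {"kB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}

# Binary (IEC) prefixes: each step up is a factor of 1,024.
IEC = {"KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}

# Print the two families side by side.
for (si_name, si_bytes), (iec_name, iec_bytes) in zip(SI.items(), IEC.items()):
    print(f"1 {si_name} = {si_bytes:,} bytes; 1 {iec_name} = {iec_bytes:,} bytes")
```

Note how the gap between the two columns widens with each step up.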

And so from 1998 onwards, everyone in the world strictly followed the new international convention and everyone lived happily ever after. JUST KIDDING. LOL.

Chaos and confusion reign to this very day. This is because adoption of the new international convention has been mixed. There are many who still cling stubbornly to the old convention. “KB is 1024 bytes, damnit.” “Long live KB1024.”[4]

Amongst these diehards, we have Microsoft Windows, which continues to use the symbols GB to mean 2³⁰ bytes and TB to mean 2⁴⁰ bytes.[5]

In contrast, hard-disk manufacturers are only too happy to use the new standard, with GB and TB meaning 10⁹ and 10¹² bytes. This is because when they advertise their hard disk drives as having 1 TB of capacity, some of you will likely be misled into believing that you’re getting 1 tebibyte or 2⁴⁰ bytes, when in fact you’re really getting only about 91% of that.
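For the curious, the size of the shortfall at each prefix can be checked with a few lines of Python. This is only a sketch, and the function name `decimal_share` is made up for this example.

```python
def decimal_share(power10, power2):
    """Fraction of the binary unit (2**power2) covered by the decimal unit (10**power10)."""
    return 10**power10 / 2**power2

# (prefix, decimal exponent, binary exponent)
for name, p10, p2 in [("kilo", 3, 10), ("mega", 6, 20), ("giga", 9, 30), ("tera", 12, 40)]:
    print(f"{name}: {decimal_share(p10, p2):.1%}")
# kilo: 97.7%
# mega: 95.4%
# giga: 93.1%
# tera: 90.9%
```

The bigger the prefix, the bigger the gap between what the label says and what a binary reading would suggest.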

Indeed, in the US, there have already been at least three class action lawsuits on this very matter. As usual, the only big winners from the lawsuits were the lawyers.[6] The consumer has gotten nothing, except some lousy fine print somewhere clarifying that 1 gigabyte is a billion bytes and 1 terabyte is a trillion bytes.[7]

So, now you know why your 1 terabyte hard disk has only 931 GB. The reason is that your hard disk manufacturer considers 1 terabyte to be a trillion bytes, whereas Windows considers 1 GB to be 2³⁰ or 1,073,741,824 bytes. And hence, 1 trillion bytes (10¹² ÷ 2³⁰ ≈ 931.3) is only about 931 of the GB that Windows is referring to.
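The same arithmetic as a small Python sketch. The names `TB`, `GIB`, and `windows_gb` are invented for this example; it simply assumes, as described above, that Windows divides the byte count by 2³⁰ and labels the result GB.

```python
TB = 10**12   # what the drive manufacturer means by 1 TB
GIB = 2**30   # what Windows means when it shows "1 GB"

def windows_gb(num_bytes):
    """Convert a byte count to the 'GB' figure Windows would display (really GiB)."""
    return num_bytes / GIB

print(f"{windows_gb(1 * TB):.2f}")  # 931.32
```

Swap in `2 * TB` or `4 * TB` to see what a larger drive would show.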

“La conclusion! What exactly is a kilobyte? Is it 1,000 bytes or 1,024 bytes?”

It’s jolly simple really. If you’re an old geezer and you have only two fingers, then you can continue to insist that a kilobyte is 1,024 bytes.

For everyone else, please go with the times. A kilobyte is 1,000 bytes. And a kibibyte is 1,024 bytes.

Footnotes

[1] Le Système International d’Unités (The International System of Units), 8th edition, 2006 [2014 update], p. 121. PDF from BIPM webpage [backup]. Although it had many precedents (dating back to at least the French Revolution), the SI officially began only in 1960. New editions occasionally add new decimal prefixes. For example, yotta was added only in 1991.

[2] See Bruce Barrow, “A Lesson in Megabytes”, IEEE Standards Bearer, January 1997 [PDF].

[3] According to a US National Institute of Standards and Technology webpage [PDF backup], “In December 1998 the International Electrotechnical Commission (IEC), the leading international organization for worldwide standardization in electrotechnology, approved as an IEC International Standard names and symbols for prefixes for binary multiples for use in the fields of data processing and data transmission.”

[4] These quotes are from the Stack Overflow question “Do you use ‘kibibyte’ as a unit of measurement in your programs?” [PDF backup]. The heated question, answers, and comments on this page are indicative of the widespread disagreement over whether 1 kB = 1,024 B or 1,000 B.

[5] Here’s a 2009 blog post [PDF backup] by Raymond Chen (a Microsoft employee) rationalizing Microsoft’s decision to ignore the new convention.

[6] It is stated in Vroegh v. Eastman Kodak (Cal.App.1st, 2007) [PDF backup] that the trial court had “awarded plaintiffs’ counsel $2,377,998.61 in fees and costs.” In Cho v. Seagate (177 Cal.App.4th 734, 2009) [PDF backup], Seagate “agreed not to challenge an application for attorney fees of up to $1.75 million, costs of up to $35,500, and an incentive fee of $5,000 for Cho.” In Safier v. Western Digital (Case No. 05-03353 BZ., N.D. Cal. Mar 17, 2006) [PDF backup], class counsel’s attorneys were awarded “fees of up to $485,000 and expenses of up to $15,000”.

[7] This was one of the settlement benefits, at least in Safier v. Western Digital (ibid).