Tuesday, March 27, 2012

File Sizes vs Disk Space Used

This week a friend of mine complained about an Android app that was using far more storage than he thought it should need. It was taking up over 30MB of his memory card when most comparable apps use far less.

He sent me a copy of the files so I could take a look.

The first thing I noticed was that running the du command on my copy of the files showed that they consumed only about 4MB of space, far less than the 30MB he reported. So why the dramatic difference?

We tend to forget (or at least I do) that filesystems allocate space in fixed-size blocks, not in bytes.

If the filesystem's block size is 2K (2048 bytes), then a 1-byte file will physically consume 2048 bytes of disk space. That is space which is no longer available to other files or applications.
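
The rounding described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name allocated_bytes is made up for this example, and it assumes an empty file takes no data blocks and ignores any per-file metadata the filesystem keeps.

```python
def allocated_bytes(file_size: int, block_size: int) -> int:
    """Bytes a file occupies once its size is rounded up to whole blocks.

    Assumption: an empty file uses no data blocks, and filesystem
    metadata (directory entries, inodes, etc.) is not counted.
    """
    if file_size <= 0:
        return 0
    # Ceiling division using integers only: number of blocks needed.
    blocks = -(-file_size // block_size)
    return blocks * block_size


if __name__ == "__main__":
    print(allocated_bytes(1, 2048))     # 1-byte file on 2K blocks
    print(allocated_bytes(2048, 2048))  # exactly one full block
    print(allocated_bytes(2049, 2048))  # one byte over spills into a second block
```

A file that is even one byte larger than a multiple of the block size takes a whole extra block.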

The default block size on HFS (OS X / iOS) is 4K (4096 bytes).
My friend had a FAT32 memory card with a block size of 32K (32768 bytes). (*)
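
If you want to see what block size your own system reports, something like the following may help. It is a rough sketch that assumes a Unix-like system (os.statvfs is not available on Windows), and the value reported as f_frsize is not guaranteed to match the filesystem's actual allocation unit on every filesystem. The helper name reported_block_size is just for this example.

```python
import os


def reported_block_size(path: str = ".") -> int:
    """Fundamental block size the OS reports for the filesystem holding `path`.

    Assumption: Unix-like system. f_frsize is what statvfs reports and
    may differ from the true allocation unit on some filesystems.
    """
    return os.statvfs(path).f_frsize


if __name__ == "__main__":
    print(reported_block_size("."), "bytes per block (as reported)")
```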

Now, the files consisted of roughly 1000 images, each of which was approximately 1K of data. The total data size, then, is about 1MB.

But when each image is stored as a separate file in whole blocks, you get waste, and the waste grows as the block size grows:

Number of 1K images * block size = total disk space consumed
1000 * 4KB = 4000KB (about 3.9 MB)
1000 * 32KB = 32000KB (about 31.25 MB)
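
The same arithmetic as a small Python sketch, in case you want to try other file counts or block sizes. The figures (1000 files of 1K each) are the approximations from this post, not exact measurements, and the function name total_on_disk is invented for the example.

```python
KIB = 1024
MIB = 1024 * KIB


def total_on_disk(file_count: int, file_size: int, block_size: int) -> int:
    """Total bytes consumed when every file is rounded up to whole blocks."""
    blocks_per_file = -(-file_size // block_size)  # integer ceiling division
    return file_count * blocks_per_file * block_size


if __name__ == "__main__":
    file_count, file_size = 1000, 1 * KIB  # approximate figures from the post
    data = file_count * file_size
    for block_size in (4 * KIB, 32 * KIB):
        used = total_on_disk(file_count, file_size, block_size)
        wasted = 100 * (used - data) / used
        print(f"{block_size // KIB}K blocks: {used / MIB:.2f} MB on disk, "
              f"{wasted:.0f}% of it padding")
```

With these numbers, three quarters of the space is padding at 4K blocks, and roughly 97% at 32K blocks.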

Something to keep in mind when considering tradeoffs of how you store data.

References:
(*) According to Microsoft, FAT32 partitions that are 32GB or larger have a default block size of 32K. (Note that a block is called a "cluster" in FAT terminology.)

