I’ve written frequently about data here on the Tidy Bytes Blog so far, but some of you might feel unsure about what that term actually means. What is data? How is it stored? What’s the difference between a hard drive and a USB drive? What about external storage? How can I begin to identify my data if I’m not even sure what data is? The basic concepts of digital data composition and storage can easily feel arcane and unreachable to anyone who hasn’t dug into those topics.
In this week’s Tidy Tuesday post, we’ll go over some of the general ideas and terms that are important to understand for any digital organization project. After all, it’s hard to decide what to do with your data when the options don’t even make sense!
What is Data, Exactly?
Ask a scientist this question, and you’ll probably get an answer that’s too generic to do any good within the context of digital organization. For our purposes, “data” is any digital information that makes up distinct items or entities that we access through electronic means.
The list below contains some examples of data:
- A picture or video taken on a camera or smartphone
- A document written using Microsoft Word
- An email in your “Friends” folder
- A note in Evernote, Microsoft OneNote, or your smartphone’s Notes app
- An individual post on a social media site
- A bookmark in your web browser
- An appointment entered on your electronic calendar
- A file you can access through File Explorer on Windows, or Finder on macOS
Although we could go into even more detail from a technical standpoint, it’s not helpful to dwell there for the purpose of organization. However, we’ll still touch on it in the section below, because it’s impossible not to.
How is Data Stored, Logically?
It’s helpful to think of logical data storage in layers. As you’ll shortly see, we can ignore some of the more detailed layers when we’re focusing on digital organization.
- Layer 1: BITS
All digital data is stored fundamentally as bits of binary data, or a long stream of zeros and ones. A single “bit” is just that: a solitary 0 or 1.
- Layer 2: BYTES
If you string eight bits together, that makes a byte. The mathematically inclined among you might notice that having eight bits per byte means a single byte can only be one of 256 different possible values, i.e. 2⁸. Why only (and exactly) 8 bits? It has to do with the efficient design of the physical hardware that makes computers good at special types of mathematics. If you’re curious, check out “Why are there 8 bits in a byte?” from Computerphile on YouTube; it’s quite entertaining.
- Layer 3: FILES
When you string a bunch of bytes together, you get what we generally refer to as a file. A file is a collection of bytes that logically belong together as the smallest useful block of data the computer (or phone or tablet) can interact with.
- Layer 4: FILESYSTEM 🠔 This is the layer we care about
Once you have enough data to make one or more files, you need some way to structure them with names, timestamps, and hopefully some kind of sensible hierarchy so you don’t just have thousands or millions of files all in a giant heap. The filesystem is the magic that makes this possible, by allowing not only file naming and identification but also multiple levels of folders for us to put files into.
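If you like to see ideas in action, here is a minimal Python sketch of the first three layers. It is only an illustration: the sample text, the UTF-8 encoding, the file name note.txt, and the helper names possible_values and to_bits are all assumptions made up for this example.

```python
import tempfile
from pathlib import Path

BITS_PER_BYTE = 8


def possible_values(bits: int) -> int:
    """Number of distinct values a group of `bits` bits can hold."""
    return 2 ** bits


def to_bits(data: bytes) -> str:
    """Render each byte as eight 0/1 characters, separated by spaces."""
    return " ".join(format(byte, "08b") for byte in data)


# Layer 2: one byte (eight bits) has 256 possible values.
print(possible_values(BITS_PER_BYTE))  # 256

# Layers 1 and 2: text becomes bytes, and each byte is eight bits.
note = "Tidy".encode("utf-8")  # UTF-8 is an assumed encoding
print(list(note))     # [84, 105, 100, 121]
print(to_bits(note))  # 01010100 01101001 01100100 01111001

# Layer 3: a file is those same bytes, stored together under a name.
with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / "note.txt"  # hypothetical file name
    path.write_bytes(note)
    print(path.read_bytes() == note)  # True
    print(path.stat().st_size)        # 4 (one byte per letter here)
```

The file lives in a temporary folder that is deleted automatically, so running this leaves nothing behind on your computer.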
There are literally hundreds of different filesystem implementations in existence, each specially designed with certain features and use cases in mind. Fortunately for us, however:
- There are only a few commonly used on modern computers, tablets, and smartphones
- Your device’s operating system (e.g. Windows, macOS, Linux, iOS, Android) usually controls (or narrowly limits) which filesystem is used
- They almost all work basically the same way for naming and structuring your data
- Many operating systems provide some level of compatibility with other filesystems
I’m oversimplifying a little here for convenience. There are ways to store bytes together that aren’t technically the same as traditional files and don’t require traditional filesystems. But in those cases, we either have no control over organization (e.g. social media posts as a flat chronological list) or else we’re presented with something that is practically similar to files in a filesystem (e.g. notes in a notes app, where we can create named subfolders and note titles).
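To see what the filesystem layer adds on top of raw bytes, the Python sketch below builds a tiny folder hierarchy and then asks the filesystem for each file's name, size, and last-modified timestamp. The folder names, file names, and the describe helper are hypothetical, chosen only for illustration.

```python
import tempfile
from datetime import datetime
from pathlib import Path


def describe(root: Path) -> list[tuple[str, int, datetime]]:
    """List each file under `root` as (relative path, size in bytes, modified time)."""
    rows = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            info = path.stat()  # metadata kept by the filesystem
            rows.append((
                path.relative_to(root).as_posix(),
                info.st_size,
                datetime.fromtimestamp(info.st_mtime),
            ))
    return rows


# Build a small, throwaway hierarchy of folders and files.
with tempfile.TemporaryDirectory() as folder:
    root = Path(folder)
    (root / "Photos" / "2023").mkdir(parents=True)
    (root / "Documents").mkdir()
    (root / "Photos" / "2023" / "beach.jpg").write_bytes(b"\x00" * 10)
    (root / "Documents" / "budget.txt").write_text("rent, groceries", encoding="utf-8")

    for name, size, modified in describe(root):
        print(f"{name:25} {size:>4} bytes  modified {modified:%Y-%m-%d %H:%M}")
```

The file contents are just bytes, as before. The names, the nesting of folders, and the timestamps all come from the filesystem, which is why this is the layer that matters for organization.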
How is Data Stored, Physically?
Regardless of what filesystem (or filesystem-like implementation) you’ve got, that data needs to reside on something physical. Although it often feels like digital data has no physical presence, there are in fact real, tangible electronic devices that hold every last bit of data in existence.
For decades, the most common type of data storage was magnetic media, due to its relatively low cost. Examples of magnetic media storage include floppy disks, mechanical hard disks, and cassette tapes. These devices store bits of digital data as magnetic patterns on tape or spinning platters. Magnetic media has certain downsides (such as power consumption), limitations (such as read/write access speed), and risks (such as magnetic or electromagnetic destruction), but its low price has kept it in common use even today for certain applications where the trade-offs allow for it.
Today, nearly all modern devices use what is somewhat cryptically called non-volatile memory or solid-state storage. This is in contrast to the mechanical hard disk drives (HDDs) above, which are full of moving parts. Solid-state drives (SSDs) use a special kind of semiconductor memory called flash memory, which allows extremely rapid reading and writing of data and keeps all of the written data intact even when the power is off. This type of memory has existed for a long time, but only in recent years has technology reached the point where it is cost-effective to use it for high-capacity storage on personal computers.
The small USB (Universal Serial Bus) devices you have on a keychain or in your pocket, often called flash drives or simply USB drives, also use flash memory. They usually hold less data than the drives built into computers.
A third major category of physical data storage is optical media, which includes CDs and DVDs. These provided a reasonably compact method for data storage that was both inexpensive and high-capacity by the standards of the day, compared to the capacity of floppy disks and the (lack of) portability of hard disks. These days, as both magnetic and solid-state media have become ever cheaper, optical media sees far less use.
The “Evolution of Data Storage” video from Blaster Technology on YouTube provides an illuminating five-minute jaunt through data storage history.
Do One Thing
Take five minutes and see if you can identify a small portion of your data as being either actual files in a filesystem (often seen using File Explorer on Windows or Finder on a Mac) or the logical equivalent.
For fun, you might even try to determine which filesystem your devices use. A quick online search about “[your operating system] filesystem” will probably turn up relevant results.
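If you are comfortable running a short script, the Python sketch below is one possible way to get a quick picture of the files in a folder by counting them by file extension. It is entirely optional, and the function names count_by_extension and print_summary are made up for this example.

```python
from collections import Counter
from pathlib import Path


def count_by_extension(root: Path) -> Counter:
    """Count the files under `root`, grouped by lowercase file extension."""
    counts = Counter()
    for path in root.rglob("*"):
        try:
            is_file = path.is_file()
        except OSError:
            continue  # skip anything we are not allowed to inspect
        if is_file:
            counts[path.suffix.lower() or "(no extension)"] += 1
    return counts


def print_summary(root: Path, top: int = 10) -> None:
    """Print the `top` most common file extensions under `root`."""
    for extension, count in count_by_extension(root).most_common(top):
        print(f"{extension:16} {count}")
```

To try it, call print_summary(Path.home() / "Documents"), assuming you have a Documents folder in your home folder; any other folder path works the same way. The script only reads file names and does not change anything.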
Next week, we’ll look at another type of storage: the cloud!