If you’re like me, you’ve probably spent the last decade hearing about blockchain technology and all the ways it’ll change the world. And at some point, you’ve probably wondered, “Hey… what the heck even is a blockchain?”
So, like Gandalf giving Bilbo a few Tolkien-themed cryptocoins and sending him on an adventure (pretty sure that’s how the story goes), The Verge told me to try to learn about and demystify the tech that underlies everything from cryptocurrencies to NFTs. Possibly because my editors want to drive me to the point where I build an actual red string board.
So, shall we begin?
What is blockchain?
I’ve read 1,000 analogies trying to explain the blockchain. Could you give me another one?
You can think of a blockchain like an obsessive club filled with members who love to keep track of things. The club has a ton of complicated rules to make sure that every member writes down the exact same set of records about what happens each day (whether it’s bird sightings, or beer tastings, or flower sales) and that once data is recorded and accepted, it becomes exponentially more difficult to change as more and more records are added on top of it. Then, usually, outsiders can come by and check out all their records and go, “Oh, wow, a cardinal flew by at 10AM in front of Mike’s house. Cool.”
And, of course, there’s an unwritten rule that says you can never stop talking about being in the club.
So what’s the point?
At their core, blockchains let you agree about data with strangers on the internet.
Public blockchains provide a place to put information that anyone can add to, that no one can change, and that isn’t controlled by any single person or entity. (Generally, at least; we’ll deal with the caveats and exceptions later.) Instead of one company or person keeping track of everything, that responsibility is spread out to everyone on the network.
These properties are often described with very technical-sounding language like “distributed ledger,” “peer-to-peer,” and “cryptographically hashed,” but these are the basic properties that those words describe.
We’ll get into the technical side of how all that is done a bit later on, but there are probably a few basics we should cover first.
Blocks are what store data on the blockchain — and it’s up to whoever’s making the blockchain to determine what kind of data they store. I could, if I wanted to, create a blockchain where each block stored the entire text of The Great Gatsby. Would it be efficient? No. Would it be dumb? Yes. Have I done it? Also yes.
For normal cryptocurrencies, though, blocks contain the records of valid transactions that have taken place on the network. I sent you a MitchellCoin? Put it in a block. You sent me 10 MitchellCoins in return? That’s so kind of you! That’s in the block, too. For cryptocurrencies, you can imagine blocks as boxes of receipts.
Let’s say I just made a new blockchain: the first block would be there, shiny and new, but lonely. Then, the second block would come along and say, “the block before me is the first block.” The next block would say “the block before me is the second block,” and so on, creating a chain (of blocks).
Hello, I’m a blockchain expert, and this is a massive oversimplification. You haven’t mentioned orphaned blocks, block times, forks —
Okay yes, blockchain systems are very complex, as you’d expect for a system that needs to be able to handle millions of people using it, worldwide.
You definitely could get into galaxy-brain level discussions so thick with jargon that you’d need an entire article just to point you to the proper dictionary, but the extremely basic version is that there are a bunch of blocks that point back to each other in a line. Once a block is made and accepted onto the chain, it can’t be removed without extreme effort. You can only add new blocks. We’ll get into why that is and how the process works in just a bit.
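If it helps to see the "blocks pointing back at each other" idea in code, here is a minimal sketch in Python. The `Block` class and `add_block` function are names made up for this illustration, and each block just remembers the position of the one before it; real blockchains use hashes for that pointer, which comes up later.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Block:
    index: int                # position in the chain, starting at 0
    data: str                 # whatever this chain stores (receipts, bird sightings...)
    previous: Optional[int]   # index of the block before this one; None for the first block


def add_block(chain: List[Block], data: str) -> Block:
    """Append a block that points back at the current last block."""
    previous = chain[-1].index if chain else None
    block = Block(index=len(chain), data=data, previous=previous)
    chain.append(block)
    return block


chain: List[Block] = []
add_block(chain, "first block")
add_block(chain, "second block")
add_block(chain, "third block")

for block in chain:
    print(block.index, block.data, "-> points back to", block.previous)
```

The only operation on offer is appending, which mirrors the "you can only add new blocks" rule.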
Wait, we’re just talking about Bitcoin and cryptocurrencies here, right?
Well, yes and no. Cryptocurrencies are built using blockchain technology, and they’re by far the most well-known usage of the tech. At this point, you’ve probably heard of at least three cryptocurrencies: Bitcoin, Ethereum, and Dogecoin. All three run on their own, separate blockchains, and there’s way more where those came from, just in the cryptocurrency space alone.
So blockchains are used by people looking to get rich, but Online?
Well, the oodles of money being thrown around is what gets a lot of attention, but blockchain technology isn’t just limited to financial purposes. Technically, anyone can make a blockchain to keep track of anything, so there could really be infinite blockchains. (I even made a very silly one while writing this article.) There are also companies that run their own blockchains, but we’ll get into that later.
Hi, Mitchell, it’s your mom who, as you know, doesn’t keep up with tech news. Why should I care about blockchain?
Hi, mom! Great to talk to you again! How was your kayaking trip?
People talk about blockchain a lot, saying that it’s going to revolutionize everything, and that it could be the next internet. I know you weren’t, as you say, born yesterday, so you can tell that those claims may be just a bit grandiose. But there is still the possibility of it supporting fascinating new companies, apps, and systems — lucrative ones, at that — and having a basic understanding of the tech will let you know who’s a huckster and who may actually have interesting ideas.
Or maybe blockchains could accelerate the climate disaster destroying the Earth.
Nah, you don’t really need to worry about that. I’ve been told it’s perfectly fine, and the problem will go away any day now. [Sweating profusely (because of the climate change)]
What is blockchain’s impact on climate change?
I’m still worrying about it.
That may be because you’ve seen stories about how some cryptocurrencies use more energy than Switzerland or Libya, or you’ve heard that Bill Gates is worried about them. There are so many facets to the discussion about crypto’s energy use that it would take several articles to cover them (though one of my colleagues does have an excellent deep dive into the controversy), but it is safe to say that blockchains have a reputation for being environmentally unfriendly.
Part of the reason for that is a system called “proof of work,” which many blockchains (especially cryptocurrencies) employ for security and trust purposes. If a blockchain uses proof of work to validate blocks, then it requires a lot of computing power to complete transactions. Since computers need energy to run, transactions end up using a lot of energy.
It is worth noting that it doesn’t have to be this way: blockchains themselves don’t inherently use a ton of energy, and there are alternatives to proof of work. We’ll get into why that is a bit later. But, at the moment anyway, most of the applications of blockchain technology that people are familiar with, like Bitcoin and Ethereum, use proof of work.
To understand why the proof of work model needs computers to work so hard, we first have to understand how the other parts of blockchain technology operate.
Okay, so what does the blockchain look like? Is it a website? An app? An interactive VR experience?
Blockchains start out life as a completely empty list, with no information at all. Then, the creators will create something called the Genesis Block, which is just the first block in the chain. Unlike every other block, it doesn’t point back to anything. People can then add information to that list over time — what that information looks like, though, depends on what the blockchain is meant for: if it’s a cryptocurrency blockchain, it’ll be a bunch of transactions. If it’s a blockchain meant for tracking lettuce (which you could do, if you really wanted), it’ll probably look a bit different.
If you had to visualize what a blockchain actually looks like, imagine a bunch of receipts ordered into boxes, which are all tied together. Every so often, a new box is added, containing the receipts that were gathered since the last box was added to the chain.
In this example, the receipts are transactions, and the boxes are blocks. Managing the transactions as they happen, before they make it on to the blockchain, is a network of computers, commonly called nodes, that are running a special piece of software they use to communicate with each other.
What are the nodes saying?
Well, when users do any sort of transaction or change, they’re sending out messages to the entire network, for which the nodes are listening. Let’s use a made-up cryptocurrency named, completely randomly, MitchellCoin. If I wanted to send someone five MitchellCoins, I would broadcast that out.
So what’s stopping me from broadcasting out the message that everyone’s given me all their MitchellCoin? Besides my stand-up morals, of course.
When the nodes see a message, they do some checks on it: namely, they check to make sure that it was digitally signed by me, to confirm an impersonator isn’t spending my money, and that the message hasn’t been tampered with since I signed it. How the actual signature is made is a pretty complex process, but the end result is a message that is verifiably sent by a specific person and almost impossible to forge (unlike a real signature). This prevents unscrupulous people from falsely claiming that someone else sent them MitchellCoin.
Nodes will also check to make sure the transaction is valid (say, by checking I actually have five MitchellCoins to spend, or that the person adding a shipment of lettuce to the blockchain is authorized to do so).
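The "do I actually have five MitchellCoins" check could look something like the Python sketch below. It assumes a node keeps a simple table of balances (the `balances` dictionary and both function names are invented for this example), and it skips the signature check entirely. Real cryptocurrencies track ownership in more elaborate ways.

```python
from typing import Dict


def is_valid_transfer(balances: Dict[str, int], sender: str, amount: int) -> bool:
    """Return True if the amount is positive and the sender holds at least that many coins."""
    return amount > 0 and balances.get(sender, 0) >= amount


def apply_transfer(balances: Dict[str, int], sender: str, recipient: str, amount: int) -> bool:
    """Move coins only if the transfer passes the check; report whether it went through."""
    if not is_valid_transfer(balances, sender, amount):
        return False
    balances[sender] -= amount
    balances[recipient] = balances.get(recipient, 0) + amount
    return True


balances = {"mitchell": 5, "reader": 0}
first_attempt = apply_transfer(balances, "mitchell", "reader", 5)
second_attempt = apply_transfer(balances, "mitchell", "reader", 1)

print(first_attempt)    # True: Mitchell had five coins to send
print(second_attempt)   # False: Mitchell has nothing left to spend
print(balances)
```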
Uh, is that it? After the node does its verification, the transaction is done?
Transactions don’t go through right away. They have to wait for the next block to be added to the chain — a time period that can differ by blockchain. After a block is created and becomes part of the blockchain, all the transactions that are contained in it will become part of the blockchain, too. The process of competing to create that block is known as “mining.”
How the block is mined depends on the model that the blockchain operates on, which we’ll get into in a bit. After a mining node has created a block, it’ll broadcast it out to the world. The other nodes will check to make sure it’s a valid block, then add it to their own ledgers. It’s possible for multiple blocks to be created at once, but eventually the network will end up building more blocks on top of one than the other, making that block part of the official chain.
That seems… pretty easy to mess with?
It does, but blockchains have a few features to prevent tampering. To understand how they do that, you have to understand hashing —
Hahahah, blaze it.
No, it’s not a weed thing — though the confusion is understandable, given how Bitcoin was, for a time, widely associated with buying drugs on the dark web.
Wait, why would people buy drugs using a tech where every transaction is publicly available?
While the raw data of the Bitcoin blockchain is public, it doesn’t include your personal identifying information (or, at least, it shouldn’t). You will have a unique address to identify you as an entity on the blockchain, because you can’t just say “Hey, I’ve got 15 BTC to spend” without some way of identifying who you are and how many coins you have, but that doesn’t include data like your name or address. (If any purchase or a pattern of purchases reveals your identity, though, it’s all out in the open.) The most high-profile cases were in the early days before governments started regulating cryptocurrency exchanges, but the government still announces regular busts of organizations that try to launder Bitcoin for use in illegal marketplaces.
Okay. Sorry, you were talking about hashing?
Alright, buckle in — this is going to get complicated.
Hashing is a cryptographic technique that’s been essential to all sorts of computing since the 1950s and ‘60s, and blockchains use it to prevent tampering. In blockchains, hashes basically act as unique tags that prevent someone from changing data in a block, or even swapping in a fake block.
Hashing lets you create a string of characters (called the “hash”) from any piece of data. You put a bunch of data in (an entire block) and get a short, fixed-length piece of data out (the hash) that is, for practical purposes, unique to that input.
To confirm nothing gets tampered with, each block stores the hash of the block before it. That way, if there’s ever a discrepancy between the two places the hash is stored, you’ll know something’s gone wrong (more accurately, your computer will know — you don’t have to manually check the chain yourself).
Hashes have a few important properties:
- They will always be the same given the same piece of data
- They will completely change if any part of that data changes, even by the slightest amount
- It’s very easy to double check that a given hash came from a given piece of data, but very difficult to tell what data was given just from the hash
Right, let’s do a quick example. Let’s pretend that when we run the word “blockchain” through our hashing algorithm, we get “ef7797” as a result (in reality, hashes are much longer). If we run “blocchain” through, which is only one letter different, we get “8e809e.”
If we wanted to make sure that we’re looking at the same data that was originally hashed, it would be easy as long as we were using the same program to create our hashes. Running “blocchain” through the hashing program will always result in “8e809e,” no matter who’s doing it. But it would take a very long time to go in reverse: if I wanted to know what someone put into the hashing program in order to get “9ed142,” I’d just have to make guesses until I found the specific word that produces that hash.
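You can try this yourself with Python's built-in `hashlib` module. The sketch below uses SHA-256 as the hashing algorithm, which is an assumption made for the demo; the helper names `sha256_hex` and `guess_input` are made up here.

```python
import hashlib
from typing import Iterable, Optional


def sha256_hex(text: str) -> str:
    """Hash a piece of text and return the result as a string of hex characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def guess_input(target_hash: str, candidates: Iterable[str]) -> Optional[str]:
    """Going in reverse means guessing: try each candidate until one hashes to the target."""
    for candidate in candidates:
        if sha256_hex(candidate) == target_hash:
            return candidate
    return None


first = sha256_hex("blockchain")
again = sha256_hex("blockchain")
typo = sha256_hex("blocchain")

print(first)            # the same input always produces the same hash
print(typo)             # one letter different, completely different hash
print(first == again)   # True
print(len(first))       # 64 hex characters for SHA-256

# Checking a guess is quick; finding the right guess is the slow part.
print(guess_input(typo, ["blockchain", "blokchain", "blocchain"]))
```

With only three candidates the guessing is instant. The point is that there is no shortcut besides trying inputs one at a time.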
I’m still coming up with a lot of weed jokes, but not coming up with how this relates to blockchain.
Each block in the chain contains within it the hash of the previous block, which is just what the hashing algorithm spits out when given the piece of data that is the block. If anything about that block were to change (say, a transaction in it, or even the entire block itself), the block’s hash would change, breaking the chain: the next block, which contains within it the hash of the previous block, would say “Hey, the block I’m pointing back to isn’t the same one I was pointing to when I was created! Something’s wrong!”
This all adds up to a system where anyone checking the chain can tell whether something has been changed at any point. If it had, the altered block’s hash would no longer match the copy stored in the next block, and patching that up would change the hash of every block after it.
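Here is a rough Python sketch of that tamper check. Blocks are plain dictionaries, SHA-256 stands in for whatever hash a real chain uses, and the function names (`block_hash`, `build_chain`, `first_broken_link`) are invented for the example.

```python
import hashlib
import json
from typing import Dict, List


def block_hash(block: Dict) -> str:
    """Hash a block's full contents, including its stored pointer to the previous block."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def build_chain(payloads: List[str]) -> List[Dict]:
    """Build a chain where each block stores the hash of the block before it."""
    chain: List[Dict] = []
    for payload in payloads:
        previous_hash = block_hash(chain[-1]) if chain else None
        chain.append({"data": payload, "previous_hash": previous_hash})
    return chain


def first_broken_link(chain: List[Dict]) -> int:
    """Return the index of the first block whose stored hash no longer matches, or -1."""
    for i in range(1, len(chain)):
        if chain[i]["previous_hash"] != block_hash(chain[i - 1]):
            return i
    return -1


chain = build_chain(["Mitchell pays reader 1", "reader pays Mitchell 10", "Mitchell pays Gandalf 3"])
untouched = first_broken_link(chain)
print(untouched)   # -1: every stored hash matches

chain[0]["data"] = "Mitchell pays reader 1000"   # quietly edit an old record
tampered = first_broken_link(chain)
print(tampered)    # 1: block 1 notices the block behind it has changed
```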
By the way, the hashes that blockchain uses are specifically cryptographic hashes. That’s part of the reason for the crypto- prefix that shows up on words like “cryptocurrency.”
I think I get it, but could you provide a snazzy illustration just in case?
So how does everyone agree on which version of the blockchain is correct?
Like how are we checking that these hashes match up?
The exact answer depends on what blockchain you’re talking about, but each one has something called a “consensus algorithm.” Basically, each blockchain decides how it wants to decide what the canonical truth is — generally, it’s based on the chain that has had the most work put into it. In a proof-of-work based blockchain, that means the chain with the most blocks: since every block requires work to mine, the longest chain will be the one with the most work put into it and will therefore be the official chain. (There are some alternative ways of doing it, however, which we’ll touch on later).
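The "longest chain wins" rule is simple enough to sketch in a few lines of Python. This is a simplification: blocks are just labels, the function name `pick_canonical` is made up, and a real node would validate every block and compare the total work behind each chain rather than just counting blocks.

```python
from typing import List, Sequence


def pick_canonical(chains: Sequence[List[str]]) -> List[str]:
    """Return the longest candidate chain; on a tie, keep the one seen first."""
    return max(chains, key=len)


# Two versions of history that split after block "b1".
chain_a = ["genesis", "b1", "b2a"]
chain_b = ["genesis", "b1", "b2b", "b3b"]

winner = pick_canonical([chain_a, chain_b])
print(winner)   # ['genesis', 'b1', 'b2b', 'b3b']
```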
What if I wanted to attack this? Like if I spent 5,000 MitchellCoins, how would I change the record to say that I still had those coins?
It would be extremely painful (for your computer, that is). So first what you’d have to do is change the block where that happened. You’d then have to recompute the hash for that block, and every block that came after… And you’d not only have to do that on one computer (which would be hard enough, for reasons we’ll go into in a second), but on enough computers to drown out everyone else who was mining legitimately.
That sounds very hard.
That’s the point. That’s how you can have these things exist in public, yet still be reasonably sure that no one is messing with the record. Attacks can and do happen, but when so much computing power is required to pull one off, it’s hard to do without someone noticing.
The math changes, however, if there are very few people mining a particular coin. If MitchellCoin were a real thing, and only a few people were mining it on their home computers, it wouldn’t be that hard, or that expensive, for someone to amass 51 percent of the computing power.
The word “blockchain” is starting to feel fake…
Ah, that would be the semantic satiation kicking in. That’s the name for the feeling where you’ve heard a word so many times that it loses all meaning. It’s not surprising, given how many times I’ve used the word “blockchain” here. Let’s do it a few more times, just to make sure: blockchain blockchain blockchain.
So wait, how does this come together to actually make a block?
Well, when two nodes love each other very much…
Sorry. So how a block gets made, or “mined,” depends on the blockchain itself. One of the most popular systems is called “proof of work.”
Proof of work systems are… complex, but we’ve already covered most of what we need to know to understand them. Basically, the blockchain will have certain rules for what it wants hashes to look like for blocks. Let’s say, for instance, that the MitchellCoin blockchain requires the first five characters of the hash to all be the letter a (so that it’s constantly screaming, like I am).
When a mining node wants to create a block, it would take all the data in the block, plus a special number called a nonce, and run it through the hashing algorithm. If the hash doesn’t start out as “aaaaa,” it would increase the nonce by one, and start again.
So basically, your computer is just… guessing numbers until it gets to the hash it wants?
Pretty much, yeah. And the hashes are huge — I’ve been using just a couple of characters as examples, but in general the hashes are 60+ characters long. On average, your computer will have to make a ton of guesses before it finds one that meets the criteria. But, again, while it takes us a long time to figure out an appropriate hash, it takes almost no time at all to check to make sure that our data actually does hash out to what we say it does.
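The guessing game fits in a few lines of Python. This sketch assumes SHA-256 and a rule that the hash must start with a run of the letter a, like the made-up MitchellCoin rule. The demo asks for three a's rather than five so it finishes quickly, and the names `hash_block` and `mine` are invented for the example.

```python
import hashlib
from typing import Tuple


def hash_block(block_data: str, nonce: int) -> str:
    """Hash the block's data together with a nonce."""
    return hashlib.sha256(f"{block_data}|{nonce}".encode("utf-8")).hexdigest()


def mine(block_data: str, prefix: str = "aaa") -> Tuple[int, str]:
    """Try nonces 0, 1, 2, ... until the hash starts with the required prefix."""
    nonce = 0
    while True:
        digest = hash_block(block_data, nonce)
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


block_data = "Mitchell pays reader 5 MitchellCoin"
nonce, digest = mine(block_data, prefix="aaa")
print(nonce)    # how many guesses came before the winning one
print(digest)

# Checking the answer takes a single hash, no matter how long the search took.
print(hash_block(block_data, nonce) == digest)   # True
```

Each extra required character multiplies the expected number of guesses by 16 (the number of possible hex characters), which is how a rule like this can be used to tune the difficulty.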
Hi, it’s me, your computer that’s mining for crypto. Why are you making me work so haaarrrddd?
Oh no, the AI revolution has begun.
But really, the difficulty is an important part of the system, because it dictates the security of the block, as well as defining how blocks are made. As we noted before, if you wanted to change a record, you’d have to recompute the hash for that block and each subsequent block, and also win the right to mine each of those blocks. The same is also true for double spends, which is where you try to undo a transaction so you can spend those coins again. The odds of you being able to double spend coins and then create enough blocks afterward to make a chain long enough to be recognized as legitimate aren’t great. And, if you have enough computing power to tilt those odds in your favor, it’d likely be more profitable to just mine legitimate blocks instead.
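To put rough numbers on "aren't great," here is a simplified model in Python, borrowed from the gambler's-ruin calculation in the original Bitcoin paper. If an attacker controls a share q of the mining power and the honest miners control p = 1 - q, the chance of ever catching up from z blocks behind is (q/p)^z when q is less than p, and certain otherwise. The function name is made up, and the model ignores a lot of real-world detail.

```python
def catch_up_probability(attacker_share: float, blocks_behind: int) -> float:
    """Chance an attacker ever erases a deficit of `blocks_behind` blocks (simplified model)."""
    if not 0.0 <= attacker_share <= 1.0:
        raise ValueError("attacker_share must be between 0 and 1")
    if blocks_behind < 0:
        raise ValueError("blocks_behind cannot be negative")
    honest_share = 1.0 - attacker_share
    if attacker_share >= honest_share:
        return 1.0   # with half the power or more, catching up is only a matter of time
    return (attacker_share / honest_share) ** blocks_behind


for share in (0.10, 0.30, 0.45, 0.51):
    print(share, catch_up_probability(share, blocks_behind=6))
```

The odds fall off quickly as the deficit grows, unless the attacker holds about half the network's computing power.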
So getting back to the energy thing…
Right, so when you’re creating, or mining, blocks, each guess you make takes time and electricity, whether it’s right or not. And that adds up. But, as mentioned before, that’s what makes the blockchain secure: it would take a lot of time and energy to rewrite the record.
Sounds like blockchains are really dumb and wasteful then! Throw them in the trash!
[chanting] Trash! Trash! Trash! Trash!
Well the good news is that, while proof of work may be popular, it’s not the only way to do things. There are also proof of stake systems, where, instead of solving puzzles, people put up crypto as collateral to get a chance at being the next person to mine a block and to be asked to validate blocks mined by others. If they validate malicious blocks, they’ll lose some or all of that money, depending on the blockchain’s rules. Proof of stake blockchains require way less energy, because mining a block doesn’t require making millions of guesses: those with stakes are randomly or algorithmically chosen to create a block, and they won’t need specialized, ultra-powerful hardware to do so.
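One simple way to picture the "randomly chosen" part is a lottery where a bigger stake buys more tickets. The Python sketch below assumes exactly that, plus a flat penalty for misbehaving; the stake amounts, names, and the `choose_validator` and `slash` functions are all hypothetical, and real proof of stake chains each have their own, more involved rules.

```python
import random
from typing import Dict


def choose_validator(stakes: Dict[str, int], rng: random.Random) -> str:
    """Pick one participant at random, weighted by how much they have staked."""
    names = list(stakes)
    weights = [stakes[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


def slash(stakes: Dict[str, int], validator: str, fraction: float) -> int:
    """Take away a fraction of a validator's stake as a penalty; return the amount lost."""
    penalty = int(stakes[validator] * fraction)
    stakes[validator] -= penalty
    return penalty


stakes = {"alice": 60, "bob": 30, "carol": 10}
rng = random.Random(42)   # fixed seed so the demo is repeatable

picks = [choose_validator(stakes, rng) for _ in range(10_000)]
for name in stakes:
    print(name, picks.count(name))   # roughly proportional to each stake

lost = slash(stakes, "bob", 0.5)     # bob signed off on a bad block
print(lost, stakes)
```

Picking a name from a weighted list takes one random draw, compared with the pile of hash guesses proof of work needs per block.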
If proof of stake makes it easy to mine, what would keep people from wanting to mess with it?
Well, an argument for proof of stake is that it incentivizes miners to actually care about the currency, since they have to be HODLers. Messing with the blockchain would likely reduce confidence in it — making it, and your stake, less valuable. This is in contrast to proof of work miners, who could immediately sell their coins and keep on mining without having to worry too much about the value or stability of the currency.
There has been talk for a while of moving to proof of stake, especially on the Ethereum blockchain, but the upgrade is still in a very early stage. It’s worth noting, though, that blockchains don’t necessarily have to use proof of work or proof of stake: there are other alternative consensus algorithms as well, and blockchains that aren’t public or used for currencies can create blocks in completely different ways that are way more efficient.
I think my brain has kind of melted.
Yeah, as I said it’s a pretty complex system. The good news is that, if you want to use the blockchain, you don’t actually have to know exactly how the system works — just like you don’t have to know how the banking system works to be able to swipe a credit card.
Speaking of credit cards, hold on a moment. I think I lost my wallet, I could swear I put it somewhere…
That sounds annoying, but imagine if you had a wallet that you could not only lose, but forget the password to as well. Remind me to talk about how those work sometime.
Ah, I found it. Okay, back to the blockchain. Can I just trust anything that’s on it?
Oh, no, I wouldn’t recommend that at all. The blockchain provides a way to verify, with a reasonable degree of certainty, that the data you’re looking at hasn’t been altered. But it doesn’t do much to help you determine whether the data was true when it was entered. There are private enterprise blockchains where every user is known and has specific permissions, but public blockchains are an entirely different beast.
For example, say I wanted to sell space rocks and claimed to prove their authenticity using blockchain technology. Even if I figured out a way to provide certificates of authenticity that lived on the blockchain and were indisputably tied to the physical rock I sent you, the blockchain wouldn’t do anything to help you if the “space rock” was actually just a pebble I got from my backyard.
(Please note: I completely made this up as an example. Any resemblance to someone running a scam with blockchain and space rocks is purely coincidental. Also, if you are doing that… don’t.)
Are blockchains just useful for cryptocurrency? Or are there other uses?
While cryptocurrencies obviously get all the hype and coverage, there’s tons of experimentation being done with blockchains in a bunch of different fields. Walmart has used the blockchain to track produce from the farm to its stores (and provide easy accountability if there’s a disease outbreak); there are experiments in creating and selling web addresses, or domains, on top of the blockchain; and there was talk at the beginning of the pandemic about tracking supplies and COVID-19 immunity using the blockchain. It does, though, remain to be seen if any of these systems actually catch on and become essential, or if they end up like all those businesses that sprang up in the mid-2010s that said they would use the blockchain without any real idea of what that meant.
There are also, of course, NFTs...
I recognize NFTs! Those are on the blockchain?
They are! Many NFTs exist on the Ethereum blockchain, which has specific features that allow for them. Yes, that does mean that you can do multiple things at once on a single blockchain — it just depends on how the data is set up.
Will you marry me on the blockchain?
Aw, okay. Well, will blockchain revolutionize voting / currency / inventory systems / news?
There are many blockchain boosters who like to say that the tech is the future of everything, and that it’ll be as big as the internet. However, as with anyone who’s telling you how great something they’re deeply invested in is, you should probably take what they say with 0.001 Saltcoin.
If a space would benefit in some way from being decentralized, or if everyone needs to share a known-truthful record, then yes, there is a chance blockchain could be a future tech. But if not, then there’s not a ton of benefit to using the technology over, say, a regular database. Blockchains are just a tool like any other — one of IBM’s fellows told me that when it creates blockchains for enterprises, the blockchain is really a small part of a larger IT system that also involves things like databases and other legacy programs. In other words, most of the time companies aren’t just throwing out their old systems and moving to blockchains, they’re integrating them in a way that makes sense.
As for voting specifically? Well, there’s certainly some interest in that area — a bill proposed in Alaska looks to move the state’s voting system to the blockchain, and a few other places have experimented with the idea. But at least one early effort has shown the increased risks that come with applying new and perhaps unneeded tech to voting.
For my part, I tend to agree with educational YouTuber Tom Scott on the matter of voting systems using blockchain to do electronic voting — even if the blockchain made voting completely trustworthy (which wouldn’t necessarily be the case), you also have to prove to the general public why it’s trustworthy. As 2020 showed, that can be hard to do with low-tech systems, much less ones that require explainers that are, like, a million words long.
Honestly, this all really sounds like something I’d like to watch an animated TV show about. Maybe from one of the people behind Rick and Morty?
Well, I’d only be interested in it if it had an obvious blockchain-based cash grab tie-in.
I have really good news for you (and bad news for everyone with… taste).
I want to put something on the blockchain.
So it’s actually not a ton of work to make your own blockchain from scratch. There’s some coding involved, to be sure, but it’s honestly not anything that couldn’t be figured out with a few days of research (and some basic programming knowledge).
There’s also no rule that says you have to create your own blockchain — some blockchains, like Ethereum, let you build on top of them, allowing you to take advantage of blockchain technology without having to create your own network.
But the biggest question you should ask yourself before diving into any of that is, of course: does my thing really need to be on the blockchain? Am I trying to fit a square peg into a round hole, potentially using way more energy than is necessary for my application?
No, I really need my thing to run on the blockchain! The fate of nations depends on it!
It’s definitely possible that you’re working on a specific problem that just needs blockchain technology! But if it’s that important then, uh, you really shouldn’t just be learning all this! I wish you the best of luck, though — see you on the ‘chain.
(By the way, no one says “see you on the ‘chain.” That was a test to see if you’re truly laser eyes enough.)
Nope, sorry. This has been long enough, we’ll have to get into that elsewhere.