To understand Blockchain deeply, let us first talk about the concept of a cryptographic Hash, which is loosely a digital fingerprint of a message.
A Hash function takes a string as input and returns a fixed-size alphanumeric string. The output string is known as the Hash of the input message. (A Hash is not quite the same thing as a Digital Signature, which additionally involves a secret key; for this discussion we only need the Hash.) The important point to note here is that the Hash function is “irreversible”: given an input string, it is easy to compute the Hash, but given only the Hash, it is virtually impossible to recover the input string. Further, it is also virtually impossible to find 2 different inputs that have the same Hash.
hash1 = Hash(input1)
hash2 = Hash(input2)
Here, what we are saying is the following:
- It is easy to compute hash1 from input1 and hash2 from input2.
- It is virtually impossible to compute input1 given the value of hash1. Similarly for input2 and hash2.
- It is virtually impossible to find distinct input1 and input2 such that hash1 = hash2.
Such Hashing functions are carefully designed by cryptographers after years of research. Most programming languages have a built-in library function to compute the Hash of a particular input string.
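To make this concrete, here is a minimal sketch in Python using SHA-256 from the standard hashlib module. The choice of SHA-256 and the helper name sha256_hex are assumptions made for illustration; the discussion above does not depend on any particular Hash function.

```python
import hashlib


def sha256_hex(message: str) -> str:
    """Return the SHA-256 Hash of a string as 64 hexadecimal characters."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


hash1 = sha256_hex("X paid $100 to Y")
hash2 = sha256_hex("Y paid $100 to X")

# The output length is fixed, no matter how long the input is.
print(len(hash1), len(hash2))

# The same input always produces the same Hash.
print(hash1 == sha256_hex("X paid $100 to Y"))

# Swapping two names in the input produces a completely different Hash.
print(hash1 == hash2)
```

Note that there is no function here that goes from hash1 back to the input string. That is the “irreversible” property at work.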
Why are we talking about the Hash function?
Well, Blockchain as a concept relies heavily on Hashing. The idea is that in a Blockchain, we have an ordered chain of blocks such that each block contains the following information:
- Hash of the previous block.
- List of transactions.
- Hash of itself.
Let us take an example. Consider the following simple block: [0, “X paid $100 to Y”, 91b452].
Here, since this is the first block of the Blockchain, the Hash of the previous block is 0. The list of transactions contains just 1 transaction – X paid $100 to Y. The Hash of itself is computed as follows:
hash_itself = Hash(List of transactions, Hash of the previous block)
Basically, we combine the List of transactions and the Hash of the previous block as a single input string and feed it to the Hash function to get the hash_itself value.
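As a rough sketch, the computation of hash_itself could look like the following in Python. The function name block_hash, the "|" separator, and the use of SHA-256 are assumptions made for this example. Also, a real Hash function produces a much longer value than the short illustrative Hashes such as 91b452 used in this article.

```python
import hashlib


def block_hash(transactions: list[str], previous_hash: str) -> str:
    """Combine the previous Hash and the transactions into one string and Hash it."""
    # Assumption: the two parts are joined with "|" and the transactions
    # with ", ". Any fixed, unambiguous format would work equally well.
    payload = previous_hash + "|" + ", ".join(transactions)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# The first block of the chain: its previous Hash is "0".
genesis_transactions = ["X paid $100 to Y"]
genesis = ["0", genesis_transactions, block_hash(genesis_transactions, "0")]

print(genesis)
```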
Such blocks, where the Hash of the previous block is 0, are termed Genesis blocks. A Genesis block is basically the very first block in a Blockchain.
Now, suppose we want to add some more blocks to this Blockchain. Let us have block1 = [91b452, “Y paid $20 to Z, X paid $10 to P”, 8ab32k].
Here, 91b452 is nothing but the Hash of the previous block (the Genesis block). There are 2 transactions:
- Y paid $20 to Z
- X paid $10 to P
Finally, we have the hash_itself value which is basically Hash(“Y paid $20 to Z, X paid $10 to P”, 91b452). This turns out to be 8ab32k.
Represented pictorially, our Blockchain looks like the following:
[0, “X paid $100 to Y”, 91b452] <- [91b452, “Y paid $20 to Z, X paid $10 to P”, 8ab32k]
What’s so special about this “data structure”?
Well, suppose someone were to tamper with the Blockchain by altering the transaction in the Genesis block, say changing “X paid $100 to Y” to “Y paid $100 to X”. The Hash of the Genesis block would then no longer be 91b452. As a result, it would no longer match the previous-block Hash stored in block1 (remember, the first value of each block is the Hash of its parent block), and the chain becomes invalid. The same holds for every block in the Blockchain: as soon as we modify a block, the Hash links of all subsequent blocks break. Therefore, any change to past data is easy to detect, which is what makes a Blockchain tamper-evident.
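The tampering scenario can be sketched in Python as follows. This is only an illustration under stated assumptions: the names Block, compute_hash, make_block and first_invalid_block are made up for this example, SHA-256 stands in for the Hash function, and the "|" separator is an arbitrary choice.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Block:
    previous_hash: str
    transactions: list[str]
    hash_itself: str


def compute_hash(previous_hash, transactions):
    # Assumption: combine both parts into a single string, then Hash it.
    payload = previous_hash + "|" + ", ".join(transactions)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def make_block(previous_hash, transactions):
    return Block(previous_hash, list(transactions),
                 compute_hash(previous_hash, transactions))


def first_invalid_block(chain):
    """Return the index of the first block that fails a check, or None if all pass."""
    for i, block in enumerate(chain):
        # Check 1: the stored Hash must match the block's actual contents.
        if block.hash_itself != compute_hash(block.previous_hash, block.transactions):
            return i
        # Check 2: the block must point at the Hash of its parent block.
        expected_previous = "0" if i == 0 else chain[i - 1].hash_itself
        if block.previous_hash != expected_previous:
            return i
    return None


genesis = make_block("0", ["X paid $100 to Y"])
block1 = make_block(genesis.hash_itself, ["Y paid $20 to Z", "X paid $10 to P"])
chain = [genesis, block1]
print(first_invalid_block(chain))  # None: the untouched chain is valid

# Tamper with the Genesis block without updating its stored Hash.
genesis.transactions[0] = "Y paid $100 to X"
print(first_invalid_block(chain))  # 0: the Genesis block no longer matches its Hash

# Even if the attacker recomputes the Genesis Hash, block1 still stores the old one.
genesis.hash_itself = compute_hash(genesis.previous_hash, genesis.transactions)
print(first_invalid_block(chain))  # 1: the link from block1 to its parent is broken
```

To hide the change, the attacker would have to recompute the Hash of every block after the one that was modified.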