# Measuring the Complexity of Bitcoin

I looked at the distribution of transactions in one hour of the blockchain (blocks 269609 to 269618). I parsed these blocks into two sets of transactions, keyed by originating and by receiving addresses. I examined both sets independently and together, attempting to fit each of the three trials to a Gamma distribution. The receiving addresses proved to be the easiest set of data to interpret, and they did not follow a Gamma distribution. Instead, the receiving transactions followed a Log-Normal distribution. Figure 1 shows the data and the resulting fit. The data in Figure 1 were sorted into bins that are 2 dB wide: every decade of transaction size spans 5 bins, the first bin is $10^{-6}$ BTC, and 1 BTC is the 30th bin.

Figure 1: Receiving address distribution of bitcoin transactions, blocks 269609 to 269618, with Log-Normal fit
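The 2 dB binning can be sketched in a few lines of Python. This is only an illustration, not the code used to produce Figure 1: the function name `bin_index` is hypothetical, and it assumes a zero-based index in which index 0 starts at $10^{-6}$ BTC, so that 1 BTC lands at index 30.

```python
import math

BINS_PER_DECADE = 5    # 2 dB per bin means 5 bins per factor of 10
FIRST_BIN_DECADE = -6  # index 0 starts at 10**-6 BTC (zero-based indexing is an assumption)

def bin_index(amount_btc):
    """Return the zero-based 2 dB bin index for a transaction amount in BTC."""
    if amount_btc <= 0:
        raise ValueError("amount must be positive")
    # Number of decades between the amount and the lower edge of the first bin.
    decades_above_floor = math.log10(amount_btc) - FIRST_BIN_DECADE
    return math.floor(BINS_PER_DECADE * decades_above_floor)
```

Amounts below $10^{-6}$ BTC produce negative indices under this convention, so a histogram built this way would need to clip or discard them.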

The Log-Normal distribution is very common, especially in the biological sciences. Its parameters are simple to estimate from the arithmetic mean and variance:

$p\left(X\right)\sim LN\left(\mu,\sigma\right)$

$\mu=\mathrm{ln}\left[\frac{E\left(X\right)^2}{\sqrt{\mathrm{Var}\left(X\right)+E\left(X\right)^2}}\right]$

$\sigma^2=\mathrm{ln}\left[1+\frac{\mathrm{Var}\left(X\right)}{E\left(X\right)^2}\right]$
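As a minimal sketch of this moment-based estimate, the function below converts an arithmetic mean and variance into Log-Normal parameters. It uses the standard moment relations, in which the denominator inside the logarithm for $\mu$ is $\sqrt{\mathrm{Var}\left(X\right)+E\left(X\right)^2}$. The function name is hypothetical.

```python
import math

def lognormal_params_from_moments(mean, variance):
    """Return (mu, sigma2) of a Log-Normal with the given arithmetic mean and variance."""
    if mean <= 0 or variance < 0:
        raise ValueError("mean must be positive and variance non-negative")
    # sigma^2 = ln[1 + Var(X) / E(X)^2]
    sigma2 = math.log(1.0 + variance / mean**2)
    # mu = ln[E(X)^2 / sqrt(Var(X) + E(X)^2)], equivalent to ln E(X) - sigma^2 / 2
    mu = math.log(mean**2 / math.sqrt(variance + mean**2))
    return mu, sigma2
```

A quick sanity check is a round trip: pick $\mu$ and $\sigma^2$, compute $E\left(X\right)=e^{\mu+\sigma^2/2}$ and $\mathrm{Var}\left(X\right)=\left(e^{\sigma^2}-1\right)E\left(X\right)^2$, and confirm that the function returns the values you started with.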

Recall that $\frac{\mathrm{Var}\left(X\right)}{E\left(X\right)^2}\propto\frac{1}{N}$ where $N$ is the number of degrees of freedom of the system.

The entropy of the Log-Normal distribution is:

$S=\frac{1}{2}+\frac{1}{2}\mathrm{ln}\left(2\pi\right)+\frac{1}{2}\mathrm{ln}\left(\sigma^2\right)-\frac{\sigma^2}{2}+\mathrm{ln}\langle X\rangle$ (1)

where $\langle X\rangle=E\left(X\right)$.
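Equation (1) is easy to evaluate directly. The sketch below (function names are hypothetical) computes it from $\langle X\rangle$ and $\sigma^2$, and also gives the more familiar form $\mu+\frac{1}{2}\mathrm{ln}\left(2\pi e\sigma^2\right)$ for comparison, using $\mu=\mathrm{ln}\langle X\rangle-\sigma^2/2$.

```python
import math

def lognormal_entropy(mean, sigma2):
    """Differential entropy from equation (1), in nats, given <X> and sigma^2."""
    return (0.5 + 0.5 * math.log(2.0 * math.pi) + 0.5 * math.log(sigma2)
            - sigma2 / 2.0 + math.log(mean))

def lognormal_entropy_mu_form(mu, sigma2):
    """Same entropy written in terms of mu: mu + 0.5 * ln(2 * pi * e * sigma^2)."""
    return mu + 0.5 * math.log(2.0 * math.pi * math.e * sigma2)
```

The two forms agree whenever `mu == math.log(mean) - sigma2 / 2`, which is a convenient check on the algebra.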

We find that the entropy of the system is an increasing function of $N$ for all $\sigma^2\geq 1$.

Differentiating (1), we have

$\mathrm{d}S=\frac{1}{\langle X\rangle}\mathrm{d}\langle X\rangle+\frac{1}{2}\left(\frac{1}{\sigma^2}-1\right)\mathrm{d}\sigma^2$ (2)

This finding provides a simple method to calculate the action of bitcoin, $T=\langle X\rangle$ (3)

and a metric of the number of independent actors within the community, $\sigma^2=\mathrm{ln}\left[1+\frac{C}{N}\right]$

where $C$ is a constant of proportionality.
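The dependence of the entropy on $N$ can be worked out explicitly from (2) and the relation $\sigma^2=\mathrm{ln}\left[1+\frac{C}{N}\right]$. As a sketch, treating $N$ as continuous and holding $\langle X\rangle$ fixed (both are assumptions made here for illustration):

$$\frac{\mathrm{d}\sigma^2}{\mathrm{d}N}=-\frac{C}{N\left(N+C\right)}$$

$$\left.\frac{\partial S}{\partial N}\right|_{\langle X\rangle}=\frac{1}{2}\left(\frac{1}{\sigma^2}-1\right)\frac{\mathrm{d}\sigma^2}{\mathrm{d}N}=\frac{C}{2N\left(N+C\right)}\left(1-\frac{1}{\sigma^2}\right)$$

For $C>0$ the prefactor is positive, so the right-hand side is non-negative exactly when $\sigma^2\geq 1$, which corresponds to $N\leq\frac{C}{e-1}$.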

I fit the data in Figure 1 by estimating the parameters from the logarithms of the $n$ transaction values $X_i$:

$\mu=\frac{1}{n}\sum_{i=1}^{n}\mathrm{ln}\left[X_i\right]$

$\sigma^2=\frac{1}{n}\sum_{i=1}^{n}\left(\mathrm{ln}\left[X_i\right]\right)^2-\mu^2$
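These two estimators translate directly into code. The following is a minimal sketch rather than the script used for Figure 1; the function name `fit_lognormal` is hypothetical and the input is assumed to be a list of strictly positive transaction amounts.

```python
import math

def fit_lognormal(values):
    """Estimate (mu, sigma2) from the mean and mean square of ln(X_i)."""
    if not values:
        raise ValueError("need at least one value")
    if any(x <= 0 for x in values):
        raise ValueError("all values must be positive to take logarithms")
    logs = [math.log(x) for x in values]
    n = len(logs)
    mu = sum(logs) / n
    # Mean of the squared logs minus the squared mean (population variance of ln X).
    sigma2 = sum(v * v for v in logs) / n - mu * mu
    return mu, sigma2
```

The "mean of squares minus square of mean" form can lose precision when $\sigma^2$ is small compared with $\mu^2$. Summing $\left(\mathrm{ln}\left[X_i\right]-\mu\right)^2$ in a second pass gives the same estimate with less cancellation.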

This approach seemed to provide the best results. The outcome is highly sensitive to how the parameters are estimated. I am still working on developing this concept, as it has implications in other fields. For example, I tested whether the income distribution of the US is Log-Normal. Figure 2 shows the result: the maroon curve is the data and the blue curve is the fitted model.

Figure 2: Plot of the Average Wage Index of the United States (maroon) and the Log-Normal fit (blue) for 2012

There are a number of items that need to be worked out, and as such these findings are still incomplete. The next step is to examine the entirety of the blockchain to see how well this model holds up. Additionally, changes in the money supply (the number of bitcoins in existence) need to be accounted for in the model. However, without examining the data, such model building is impossible. If anyone has the ability to parse the blockchain, any assistance would be appreciated. Thank you.

### 3 responses

1. Kell says:

for the blockchain parsing, there are several tools available. perhaps abe would be best suited for your purposes? https://github.com/bitcoin-abe/bitcoin-abe

• Cal Abel says:

Thank you! I will see about compiling it.