Big Data
About 30% of the data we have access to is generated in real time. Are we over-informed? Is there more data available than we can consume? I don't think so, and I explain why below.
Today we live in the Big Data era. Without data, artificial intelligence could not have experienced the acceleration we have witnessed. In fact, many of the algorithms and artificial intelligence techniques were developed decades ago, but only now are they bearing fruit, because they can leverage Big Data.
The growth of data has been exponential, with 90% of all data created in the last two years. For next year it is estimated that the Digital Universe will reach 175 zettabytes. To put that in perspective, imagine representing it with 128 GB tablets like the ones we have at home: 175 zettabytes would be equivalent to 25 stacked columns of those tablets, each column as tall as the distance from the Earth to the Moon, which is 384,400 km.
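The tablet analogy can be checked with back-of-the-envelope arithmetic. The sketch below assumes a tablet thickness of about 7 mm, a figure not given in the text:

```python
# Back-of-the-envelope check of the tablet analogy.
# Assumption (not from the text): each 128 GB tablet is ~7 mm thick.
ZETTABYTE = 1e21            # bytes
total_bytes = 175 * ZETTABYTE
tablet_bytes = 128e9        # 128 GB per tablet
tablet_thickness_m = 0.007  # assumed ~7 mm

n_tablets = total_bytes / tablet_bytes          # ~1.37 trillion tablets
column_height_m = 384_400 * 1000                # Earth-Moon distance in metres
tablets_per_column = column_height_m / tablet_thickness_m
n_columns = n_tablets / tablets_per_column
print(round(n_columns))  # roughly 25 columns
```

With that assumed thickness, the result lands at about 25 columns, matching the figure above.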
So, how can we consume all this information? What works best for me is to follow a few guidelines:
1. Extracting data that has value: keep in mind that only an estimated 0.5% of data is currently being analyzed, so we have a long way to go. Data quality also matters: the data must be truthful.
2. Expanding the spectrum of data: we tend to use only structured data, but only about 20% of data is structured (spreadsheets, so to speak); the rest is either semi-structured (HTML, for example) or unstructured (social media posts, satellite images, etc.).
3. Using new tools: there is a tendency to rely on linear models for prediction (linear regression, for example), but in reality the relationships in financial data are usually not linear.
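On the second guideline: semi-structured data like HTML can be turned into structured values with a small parser. A minimal sketch using Python's standard library, where the HTML snippet and the ticker/percentage format are hypothetical examples, not a real data feed:

```python
from html.parser import HTMLParser

# Hypothetical semi-structured input: a list of tickers and daily moves.
html_doc = "<ul><li>AAPL: 1.2%</li><li>MSFT: -0.4%</li></ul>"

class ListExtractor(HTMLParser):
    """Collect (ticker, percent) pairs from <li> elements."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._in_li = False

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self._in_li = True

    def handle_endtag(self, tag):
        if tag == "li":
            self._in_li = False

    def handle_data(self, data):
        if self._in_li:
            ticker, pct = data.split(": ")
            self.items.append((ticker, float(pct.rstrip("%"))))

parser = ListExtractor()
parser.feed(html_doc)
print(parser.items)  # [('AAPL', 1.2), ('MSFT', -0.4)]
```

The same idea scales to scraped pages or API responses: once the values are extracted, they can sit alongside the structured data we already analyze.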
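On the third guideline: a linear model fitted to a non-linear relationship leaves most of the signal in the residuals. A minimal sketch on synthetic data (an illustrative quadratic relationship, not a real financial series), comparing a linear fit with a non-linear one:

```python
import numpy as np

# Synthetic data with a quadratic (non-linear) relationship plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 500)
y = x ** 2 + rng.normal(0, 0.05, 500)

# Linear fit: y ≈ a*x + b (the tool we tend to reach for first).
a, b = np.polyfit(x, y, 1)
mse_linear = np.mean((y - (a * x + b)) ** 2)

# Quadratic fit: captures the true shape of the relationship.
c2, c1, c0 = np.polyfit(x, y, 2)
mse_quad = np.mean((y - (c2 * x ** 2 + c1 * x + c0)) ** 2)

print(mse_linear, mse_quad)  # the non-linear fit has far lower error
```

On this data the linear model's error is dominated by the curvature it cannot express; the quadratic fit reduces the error to roughly the noise level. The same reasoning motivates non-linear tools (tree ensembles, neural networks) on financial data.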
By way of conclusion, the Big Data era has been a great milestone and has allowed many of the artificial intelligence techniques that were envisioned decades ago to become reality today.