The article below was originally posted on the Jethro blog.
BI-on-Hadoop in 2016 – The Elephant in the Room
Last year it seemed like every organization was talking about harnessing the power of big data to gain the crucial business insights required to make data-driven decisions. Blending big data sets and analyzing them at the finest resolution has become standard practice for people and groups alike. However, the pursuit of gathering and analyzing big data has ushered in a new set of challenges. Even as data sets grow ever larger and more complex, people still expect to interact with the data using their BI and visual data discovery tools (Tableau, Qlik, MicroStrategy, etc.) at the speed of thought. In our “world of now,” no one likes to try to drill down into their data only to sit through agonizing minutes waiting for queries to stream back from Hadoop to their data visualization or BI tool.
In 2016 we are going to see individuals and organizations demanding that the data discovery process accommodate larger and more varied datasets and allow for more complex analysis, all at interactive speeds. Business users are demanding self-service tools that require no IT assistance and give them greater flexibility in exploring their data to derive actionable business insights. The rigid partial extracts and predefined cubes of yesterday are not going to quench the business user’s thirst for complex data discovery and analysis. IT will no longer be able to keep up with business users’ demands, leaving them to seek out a more flexible and sustainable solution.
BI ON HADOOP HAS ITS OWN CHALLENGES
Hadoop by design was not intended for interactive BI use, and SQL-on-Hadoop tools scramble to serve the full range of analytic and ETL use cases. Each SQL-on-Hadoop solution has its own unique properties and best use cases. For the specific needs of BI and data discovery, however, Jethro takes the cake. Live TPC-DS benchmarks run through Tableau and Qlik have shown that Jethro has the fastest query response time. Even on 2.9 billion rows of data, Jethro enables business users to analyze data interactively at the speed of thought. Jethro also gives BI users the most flexibility: its index-based architecture indexes every single column, removing the need for restrictive predefined extracts and cubes.
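To make the idea of indexing every column more concrete, here is a minimal sketch in Python. It is not Jethro's implementation, which this article does not describe. The function names and the sample rows are hypothetical and chosen only for illustration. The sketch shows the general technique: each column gets an inverted index from value to row ids, so a filter on any combination of columns becomes a set intersection rather than a full scan or a lookup in a predefined cube.

```python
from collections import defaultdict


def build_column_indexes(rows):
    """Index every column: column name -> value -> set of row ids."""
    indexes = defaultdict(lambda: defaultdict(set))
    for row_id, row in enumerate(rows):
        for column, value in row.items():
            indexes[column][value].add(row_id)
    return indexes


def matching_row_ids(indexes, filters):
    """Return the ids of rows that satisfy every `column == value` filter."""
    id_sets = [indexes[column].get(value, set())
               for column, value in filters.items()]
    if not id_sets:
        # This sketch treats an empty filter as "nothing requested".
        return set()
    # Intersect starting from the smallest set to keep the work small.
    id_sets.sort(key=len)
    result = set(id_sets[0])
    for ids in id_sets[1:]:
        result &= ids
    return result


# Hypothetical sample data standing in for a fact table.
rows = [
    {"region": "EMEA", "product": "A", "sales": 100},
    {"region": "EMEA", "product": "B", "sales": 250},
    {"region": "APAC", "product": "A", "sales": 75},
    {"region": "EMEA", "product": "A", "sales": 40},
]

indexes = build_column_indexes(rows)
ids = matching_row_ids(indexes, {"region": "EMEA", "product": "A"})
total = sum(rows[i]["sales"] for i in ids)
print(sorted(ids), total)
```

Because every column is indexed, the user can filter on any column without someone having decided in advance which dimensions to include. A production system would need compressed, disk-resident indexes and support for range predicates, none of which this sketch attempts.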
Certain BI tools, like MicroStrategy, can tap directly into Hadoop via native connectivity with HDFS, but performance problems surface when dealing with large and complex data sets. Users of these tools would benefit from an indexing and caching layer that accelerates queries against Hadoop.
BI INGESTING DATA FROM HADOOP AND OTHER SOURCES
Data will always exist in multiple places and come from a variety of sources, even as Hadoop becomes more widely adopted. To allow for boundless data discovery at the highest granularity, SQL-on-Hadoop solutions will need to ingest data from both Hadoop and non-Hadoop sources. Jethro grants BI tools access to any data required, as it ingests data from any data source, including traditional structured data.
2016 will be about addressing the elephant in the room and finding ways to overcome the BI-on-Hadoop hurdles in order to empower business users to discover, visualize, and analyze their data and derive the priceless data-driven business insights that propel business forward.