The modern data landscape is so much more diverse than it was in the past, and the modern data warehouse needs to increase its flexibility to keep up. In the modern warehouse, it’s not enough to just source all the data; it’s equally important to source all data types as well. Data professionals need to derive insights from various systems of engagement (social media, mobile), systems of record (ERP systems, CRM systems, databases), and the Internet of Things, which means sourcing unstructured, semi-structured and structured data. These needs are driving a rapid evolution away from the familiar enterprise data warehouse (EDW) and toward a new, more flexible solution: the logical data warehouse (LDW).
The LDW abstracts data access so applications do not need to change to gain insights and value from data across the LDW. The challenge here is the availability of tools that enable this fluid movement of data between traditional analytics/relational databases and the various Hadoop implementations that are already or will soon be a part of most organizations’ analytics ecosystems.
How IBM Fluid Query Helps
IBM Fluid Query powers the logical data warehouse, giving users the ability to combine numerous types of data from various sources in a fast, agile manner to drive analytics and deeper insight without having to understand how to connect multiple data stores, use different syntaxes or APIs, or change their applications. Fluid Query provides several benefits:
- It allows you to provision data for better use by application developers, data scientists, and business users.
- It masks complexity so that data consumers can concentrate on deriving insights from all available data sources.
IBM Fluid Query realizes these benefits by allowing a data store to route a query (or even part of a query) to the correct data store within the LDW, letting you query disparate data sets as if they were one. This approach gives you several important advantages:
- It moves the query to the data, not the data to the query.
- It enables query access between an IBM PureData System for Analytics (Netezza) appliance and Hadoop platforms, relational data stores, and Spark data.
IBM Fluid Query has several common use cases in which it can be leveraged:
- Using Hadoop as a “Day 0” archive for data discovery, analytics, and exploration with the IBM Fluid Query connection providing data movement to/from the PureData System for Analytics.
- Accessing structured data from familiar sources like DB2, dashDB (cloud data warehouse), Oracle, Teradata, Microsoft, and other PureData System for Analytics appliances.
- Running multi-temperature queries combining “hot” data located in the PureData System for Analytics appliance with “cool” data from Hadoop.
- Creating an alternative to a “Day 0” Hadoop archive that moves data from the PureData System for Analytics to Hadoop for capacity relief, exploratory analytics, database backup, or disaster recovery.
- Moving archive data to Hadoop for querying while also leveraging IBM Big SQL on BigInsights or Hive to locally query the data on Hadoop.
What’s New in IBM Fluid Query 1.6
Open Access with a Generic Connector
Users can now use a generic query connector to access any relational database via JDBC.
Added Flexibility for Fast Data Movement
Users can now import databases to Hadoop and append data to a populated table on Hadoop for incremental capability. No more rebuilds are required for data refresh.
Support for Compressed Read from BigSQL on IBM BigInsights
IBM Fluid Query now has the ability to read compressed data in Hadoop file systems such as Big Insights, Cloudera, and Hortonworks.
IBM Fluid Query is included as a free feature in the Netezza Platform Software (NPS) for PureData System for Analytics appliances.
We briefly covered the features of and use cases for IBM Fluid Query in this article. For further details, look out for our upcoming articles about big data and information management. Contact Ironside to explore how you can optimize data and analytics for your business.