What causes advanced analytics and AI initiatives to fail? Some of the main reasons include not having the right compute infrastructure, not having a foundation of trusted data, choosing the wrong solution or technology for the task at hand, and lacking staff with the right skill sets. Many organizations deploy minimum viable products (MVPs) but fail to scale them successfully across their business. The solution? Outsourcing elements of analytics and AI strategy to ensure success and gain true value.

64% of leaders surveyed said they lacked the in-house capabilities to support data-and-analytics initiatives. 

It’s essential to implement a data-driven culture across your organization if you’re looking to adopt advanced analytics. One of the keys to a data-driven culture is having staff with skills that align with your initiatives. In our study, 64 out of 100 leaders identified a lack of staff with the right skills as a barrier to adopting advanced analytics within their organization. Even organizations that do have the right skill sets face another barrier: retaining that talent. This is where outsourcing comes in.

Borrowing the right talent for only as long as you need it can be an efficient path forward.

Outsourcing parts of your analytics journey means you’re going directly to the experts in the field. Instead of spending time and money searching for the right person both technically and culturally, outsourcing allows you to “borrow” that talent. The company you choose to outsource to has already vetted their employees and done the heavy lifting for you. With outsourcing, you can trust that your organization is working with professionals with the skill sets you need.

Aside from securing professionals with the correct skill sets, there are plenty of other benefits to outsourcing your organization’s analytics needs. Professionals with the skill sets necessary for advanced analytics and AI initiatives can be very expensive. Outsourcing provides a cost-effective way to achieve the same goal. Rather than paying the full-time salary and benefits of a data science or analytics professional, an organization can test the value of these ventures on a project-by-project basis and then evaluate the need for a long-term investment.

Freeing full-time employees to make the most of their institutional knowledge.

Another benefit of outsourcing analytics is the increased productivity and focus of your organization’s full-time employees. By outsourcing your organization’s analytics, your full-time employees will naturally have more bandwidth to focus on other high-priority tasks and initiatives. Rather than spending their time on what the outsourcing company is now handling, full-time employees can dedicate their time to work that requires institutional knowledge or other tasks not suited to a third party. It’s a win-win for your organization: your analytics needs are being handled, and your full-time staff is more focused and still productive.

There are many areas of analytics that an organization can outsource. These include, but are not limited to, viability assessments; prioritization of use cases; ongoing monitoring, performance, maintenance, and governance of a solution; and implementation and deployment of an MVP or use case. In the words of Brian Platt, Ironside’s Practice Director of Data Science, “A partner with advanced analytics and data science capabilities can rapidly address AI challenges with skills and experience that are hard to develop in-house.”

Mid-tier organizations need the right talent and tools to successfully realize the value of their data and analytics framework in the cloud. The Corinium report shows that many companies are increasingly prepared to work with cloud consulting partners to access the skills and capabilities they require. 

 Areas that mid-market leaders consider outsourcing.

Overall, more and more data leaders are turning to outsourcing to help fill the gaps and expedite their organization’s analytics journey. Outsourcing services can help your organization reach analytics goals in many different areas, not just AI and Advanced Analytics. 

Organizations rely on outsourcing in key areas like these:

  • Developing a data and analytics cloud roadmap
  • Assessing advanced analytics use cases (figure shows 68% would consider outsourcing)
  • Implementation and deployment of an MVP or use case (figure shows 43% outsource)
  • Developing and maintaining data pipelines
  • Documenting and assessing your BI and overall analytics environment(s)
  • Migrating your reporting environment from one technology to another
  • Overall management and monitoring of analytics or AI platform (figure shows 42% are already outsourcing)

When your company plugs into the right skill sets and processes, there’s nothing between you and a successful data-and-analytics transformation.

Take a look at the full whitepaper to learn more: Data Leadership: Top Cloud Analytics Mistakes – and How to Avoid Them

Contact Ironside Group today to accelerate your Advanced Analytics and AI Strategies.

In today’s world, data is everything and the pressure to go digital is constant. Data-driven decision making is critical for organizations across industries to stay competitive. As a result, companies are aiming to achieve a maturity level that enables them to make decisions based on “collective facts” instead of facts from siloed datasets.

Organizations need to enrich their existing data ecosystems with third-party datasets (weather, financial market, geospatial and others), which can offer huge value through additional insights from the consolidated datasets. They also need to be able to analyze both operational and transactional data on the same analytics platforms. On top of that, the speed at which businesses want to explore and leverage different data domains for insights changes rapidly, so your data analytics strategy must be nimble and agile enough to adapt quickly.
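As a purely illustrative sketch of what such enrichment can look like in practice, the snippet below joins internal sales data with a purchased weather feed using pandas; the file names, columns, and join keys are hypothetical and not from the original article.

```python
import pandas as pd

# Hypothetical internal dataset and third-party weather feed.
sales = pd.read_csv("internal_sales.csv", parse_dates=["order_date"])    # order_id, store_zip, order_date, revenue
weather = pd.read_csv("vendor_weather.csv", parse_dates=["obs_date"])    # zip_code, obs_date, avg_temp_f, precip_in

# Join the external dataset onto the internal one by location and date,
# so a single consolidated view supports weather-aware analysis.
enriched = sales.merge(
    weather,
    left_on=["store_zip", "order_date"],
    right_on=["zip_code", "obs_date"],
    how="left",
)

# A quick consolidated insight neither dataset supports on its own:
# average revenue on rainy vs. dry days.
print(enriched.groupby(enriched["precip_in"] > 0.1)["revenue"].mean())
```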

Companies must review and revise their data management strategies to gain competitive advantage, maximize benefits and reduce technical debt.

Our research with 100 data and analytics leaders from various companies shows that 42% feel the hub-and-spoke model is ideal for their business, 38% feel the “decentralized” model suits their organization, and the remaining 20% believe the “centralized” model is the best fit for their needs.

Three Data Models and How They Differ

80% of leaders surveyed agree that a fully centralized data and analytics operating model is not ideal for their organization.

It’s becoming clear that leaders across industries are moving away from centralized models. Developing and implementing an effective hub-and-spoke model empowers business users, frees data specialists to focus on high-value initiatives, and gives management trustworthy metrics that can help drive impressive outcomes.

For a successful transition to or adoption of the hub-and-spoke model, stakeholder alignment and approval is paramount. The model ultimately needs to be accepted by the business functions, which in turn will define the data domains and own the data products that will be delivered. Clearly defining the targeted outcomes will help secure the business buy-in that’s required, as modern data management strategies and architecture are no longer driven by IT alone.

There are seemingly endless considerations and many challenges on the path to designing, building, implementing, and fostering widespread adoption of a hub-and-spoke data model. But you don’t need to go it alone. Find a partner who can meet you where you are and help get you to where you need to be. If you have specific questions about your data model or data management strategy, let’s make time to talk.

To learn more, read the whitepaper: Data Leadership: Top Cloud Analytics Mistakes – and How to Avoid Them

Data governance. Different stakeholders emphasize this concept differently. IT representatives have always taken a more restrictive and cautious approach toward enterprise data access, while business users continue asking for more and more data to become available.

As enterprise leaders continue to be energized by the transformational promise of the cloud, the need for a renewed strategy around who gets access to what data and how becomes obvious. In fact, 100 data and analytics leaders from middle-market companies reported in a survey that the eight mistakes most disruptive to their enterprise data strategy all have to do with data governance.

Data governance as a continual process

So what is data governance? Its central purpose is to improve trust in the data lifecycle. Trust in understanding where the data came from, how it’s been moved, if it’s been changed, and who can see it. How is this trust built? Through organizational policies and procedures that govern how data is managed at the enterprise.

What’s worrisome is that no phrase comes up more often with customers than, “I cannot trust the data.” The organization’s “currency” has become not only seemingly worthless, but also costly. Leaders and analysts alike cannot rely on what they’re seeing, and decisions are being made blindly. Time is wasted arguing over which metrics are correct, and the cycle is unending.

So, how can an organization avoid this total disruption of their cloud data strategy? Stop treating data governance as an end product, something to be completed, rather than a continual process. Data & Analytics leaders believe this is the biggest mistake enterprises make today: attempting to implement a whole data governance framework in one “big bang” and viewing it as a “one and done” project.

Data management touches all aspects of an organization, which means a cultural shift is needed, and that takes time to implement. A comprehensive program should be rolled out with a modular approach, both tactically and behaviorally.

Furthermore, effective data governance is never done. There will always be new data, new definitions, new technologies and new issues that arise over time. Without well-understood and trusted processes for addressing each of these areas, organizations run the risk of their program falling flat.

Building the right team is critical

While conversations about analytics tend to be saturated by technology, it is people who make data governance work. A centralized team will define individual roles and responsibilities, ranging all the way from executive leadership to data stewardship. Business subject matter experts will work with data stewards to define key business processes and metrics, ensuring consistent data definitions across all areas of the organization. Data engineers and stewards are the key cogs in a governance operating model, which is why failing to establish these data owners in the business is considered a top risk to data strategy execution.

Technology powers successful implementation

While technology shouldn’t be the initial focus, leveraging modern governance tools will be essential for automating and scaling processes. 42% of the surveyed leaders were concerned that data governance policies would not succeed if introduced without the tools to implement them effectively.

These tools can range from spreadsheets to data catalogs to enterprise governance SaaS platforms. Choosing the right technology is entirely dependent on where you are on the journey. Since a full data governance implementation is exhaustive, companies can quickly become overwhelmed by attempting to accomplish too much, too soon. This is why our recommendation is typically labeled “agile data governance.” Start with a focused set of initiatives (dictionary, quality, lineage) and then leverage technology to automate and monitor these procedures. This will enable teams to tackle other governance disciplines (master data management, metadata management) without being stretched too thin. It is the proper blending of people and technology that drives successful data governance.
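As a purely illustrative sketch of the “start focused, then automate” idea, the snippet below shows the kind of lightweight, scriptable data-quality check a team might schedule before investing in a full governance platform; the table, columns, and thresholds are hypothetical.

```python
import pandas as pd

# Hypothetical extract of a governed customer table.
customers = pd.read_csv("customers_extract.csv")

# A small, versionable set of quality rules: the sort of checks an
# agile governance program might automate and monitor first.
checks = {
    "no_duplicate_ids": customers["customer_id"].is_unique,
    "email_mostly_populated": customers["email"].notna().mean() >= 0.98,
    "valid_state_codes": customers["state"].str.len().eq(2).all(),
}

failed = [name for name, passed in checks.items() if not passed]
if failed:
    # In practice this result would feed a data catalog, dashboard, or alert.
    print(f"Data quality checks failed: {failed}")
else:
    print("All data quality checks passed.")
```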

To see the full graphic, download the whitepaper, Data Leadership: Top Cloud Analytics Mistakes – and How to Avoid Them

Training and data literacy translate to business value

Once the key personnel, processes, and technology are in place within your data governance organization, scaling literacy and democratizing data is how you turn all this hard work into business value. Educating users on the available governance documentation and technology enables them to quickly get up to speed on why, where, and how to access data. These training sessions are technical, but they also help establish a data culture. Leaders must evangelize the governance program and generate buy-in from users by reinforcing why these data best practices are needed, ensuring they aren’t circumvented moving forward.

Consolidated, centralized access to data slows the value-creation process. When too much of the analytics workload falls on IT resources, it restricts an organization from building competitive advantages effectively. Securely distributing and democratizing data to users across the business cultivates a smarter, faster organization. These technical and business users can spend more time on data analysis and less time dealing with data bottlenecks.

Typically, self-service and governance interests appear at odds with one another: one is about openness, while the other is about control. In truth, the opposite is the case; governance empowers self-service to succeed. The more literate an enterprise is in its data and its lifecycle, the more confidently it can build value from it.

Protecting customer data and complying with regulations

Establishing rules and standards for data privacy, protection and security has always been a key tenet of the IT operating model. Global, state, and industry regulations have further tightened the data-handling standards that organizations must meet. And when we discuss data governance with our customers, the first thing they’re concerned about is remaining compliant and protecting all sensitive data.

Nonetheless, the last mistake highlighted by the surveyed leaders was overlooking data protection and privacy impact assessments. One might assume this would be considered a higher risk if leaders weren’t already confident in the compliance- and protection-minded culture that has existed for many years within their IT departments. But even if certain portions of the business are well versed in the protection of personally identifiable information, as data access is opened up to more users, more education will be required.

Viewing governance as a path to larger returns

As the appetite for data consumption increases, so should diligence in governance; it is needed now more than ever. Organizations should stop viewing governance initiatives as costs to the bottom line and start viewing them as mechanisms to generate larger returns from their most important asset: their data. In Breaking away: The secrets to scaling analytics, McKinsey identified what it defined as “breakaway companies” and found this group is “…twice as likely to report strong data-governance practices.”

The organizational excuses for deprioritizing data management in favor of other analytics initiatives are becoming less acceptable and more risky. External sources can be a catalyst for mid-market companies, providing the leadership and experience required to build the internal governance functions they need. There are many avenues to getting started with data governance; organizations just need to take one.

To learn more, read the whitepaper: Data Leadership: Top Cloud Analytics Mistakes – and How to Avoid Them

Your data needs are different from those of any other client we’ve worked with. Plus, they’re ever-changing. 

That’s why we’re fluid in our approach to creating your framework and why we ensure fluidity in the framework itself. 

Whether your current investment in assessments, governance, and technology is heavy or light, we can meet you where you are, optimize what you have, and help you move confidently forward. 

These steps are all necessary, but don’t happen in a strict sequence. Each of them is an iterative process — taking small steps, looking at the results, then choosing the next improvement. You need to start with assessment and governance — unless you already have some progress in those areas. 

Analytics are constantly evolving, and the Modern Analytics Framework is designed to evolve more readily as users discover new insights, new data, and new value for existing data. There will be constant re-assessment of the desired future state, modifications to your data governance goals and policies, design of data zones, and implementation of analytics and automated data delivery. Making these changes small and manageable is a key goal of the Modern Analytics Framework.

Can we ask you a few questions?

The better we understand your current state, the better we can speak to your specific needs. 

If you’d like to gain some insight into how your organization can move most effectively toward a Modern Analytics Framework, please schedule a time with Geoff Speare, our practice director.

Geoff’s Calendar
GSpeare@IronsideGroup.com
O 781-652-5758  |  484-553-1814

Get our comprehensive guide.

Learn about our proven, streamlined approach to taking your current analytics framework from where it is to where it needs to be, for less cost and in less time than you might imagine.

Download the eBook now

Check out the rest of the series.

In the same way that Software as a Service eliminates the need to install applications on your local network, Data as a Service lets you avoid storing and processing data on your network. Instead, you can leverage the power of cloud-based platforms that can handle high-speed data processing at scale. Combine that with the ready availability of low-cost cloud storage, and it’s easy to appreciate why so many organizations are turning to Data as a Service. 

One key component of a modern analytics framework.

In Ironside’s Modern Analytics Framework, Data as a Service is one of three key components.

How can Data as a Service serve your organization?

We know your time is valuable. So, let us speak to your specific needs. 

Schedule a time with Geoff Speare, our practice director:

Geoff’s Calendar
GSpeare@IronsideGroup.com
O 781-652-5758  |  484-553-1814

Get our comprehensive guide.

Learn about our proven, streamlined approach to taking your current analytics framework from where it is to where it needs to be, for less cost and in less time than you might imagine.

Download the eBook now

Check out the rest of the series.

If you rely solely on a data warehouse as your repository, you have to put all your data in the warehouse, regardless of how valuable it is. Updating a data warehouse is comparatively costly, and it takes a lot of time and effort, which usually leads to long delays between requests being made and fulfilled. Analytics users may then turn to other, less efficient means to get their work done.

If you rely solely on a data lake, you have the opposite problem: all the data is there, but it can be very hard to find and transform it into a format useful for analytics. The data lake drastically reduces the cost of ingesting data, but it does not address issues such as data quality, alignment with related data, and transformation into more valuable formats. High-value data may reside here but not get used.

When you have a system of repositories with different levels of structure and analysis, and a value-based approach for assigning data to those repositories, you can invest more refinement and analytics resources in higher-value data.

Striking the right balance between refinement and analytics is key. Performing analytics on unrefined data is a more costly, time-consuming process. When you can identify value upfront, you can invest in refining your high-value data, making analytics a faster, more cost-efficient process. 
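As a purely illustrative sketch (not Ironside’s actual methodology), the snippet below shows one way a simple value score could route datasets to repositories with different levels of refinement; the scoring inputs and thresholds are invented for the example.

```python
from dataclasses import dataclass

@dataclass
class Dataset:
    name: str
    weekly_users: int        # how many analysts touch it in a typical week
    decision_critical: bool  # feeds executive or operational decisions

def target_zone(ds: Dataset) -> str:
    """Assign a dataset to a repository tier based on a simple value score."""
    score = ds.weekly_users + (50 if ds.decision_critical else 0)
    if score >= 50:
        return "curated warehouse (refined, modeled, governed)"
    if score >= 10:
        return "conformed lake zone (lightly standardized)"
    return "raw lake zone (ingested as-is, refined on demand)"

for ds in [
    Dataset("orders", weekly_users=120, decision_critical=True),
    Dataset("clickstream", weekly_users=15, decision_critical=False),
    Dataset("legacy_logs", weekly_users=1, decision_critical=False),
]:
    print(f"{ds.name} -> {target_zone(ds)}")
```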

Our value-based approach can help deliver higher ROI from all your data.

This value-based approach also helps your modern analytics framework better meet the needs of your knowledge workers. For example, analysts can jump straight into complex analysis, rightly assuming that high-value data is always up to date. In addition, automated value delivery distributes high-value data in ways users can act on.

Let’s invest in a conversation.

We want to hear about your current framework and your changing needs. 

Schedule a time with Geoff Speare, our practice director:

Geoff’s Calendar
GSpeare@IronsideGroup.com
O 781-652-5758  |  484-553-1814

Get our comprehensive guide.

Learn about our proven, streamlined approach to taking your current analytics framework from where it is to where it needs to be, for less cost and in less time than you might imagine.

Download the eBook now

Check out the rest of the series.

Today’s companies have no shortage of data. In fact, they have more than ever before. And they know that, hiding in that data, are insights for better business decisions and competitive superiority. 

But even with all the investments they’ve already made, in everything from data marts and warehouses to operational data stores, companies suspect the most valuable insights may remain hidden. And they’re often right. If your current analytics framework is no longer meeting your needs, the signs are all around you.

Top indicators that it’s time to evolve your current framework

  1. Frustrated users can’t find the data or insights they need to drive better decision making. 
  2. Users rely on self-created spreadsheets not accessible to others. 
  3. Analysts spend more time on manual updates than on actionable insights. 
  4. IT-owned analytic assets take too long to update, reducing their usefulness. 
  5. High cost to manage data curtails innovation. 
  6. Your framework is already overwhelmed by the data sources you have, leaving no room for new ones. 
  7. Value leakage — in the form of data you aren’t acting on — grows every day.

So, what’s stopping you?

In working with companies across virtually all major industries, we’ve encountered just about every obstacle that keeps companies trapped in a data analytics environment that no longer meets their needs. Here are the most common concerns we hear, and how Ironside helps to address them.

“We can’t walk away from the investment we’ve already made in our current framework.”

We get that. It’s why a core part of our approach involves meeting you where you are, and helping you move forward from there – not from the starting line. You keep what you have and invest in ways that will give you the greatest ROI based on your needs.

“We don’t have the budget for a big increase in analytics spending.”

We find that a lot of companies actually spend more than they need to by treating all their data equally. It all goes to the data warehouse, where processing costs are high. With our value-based approach, you could end up reducing your spend. 

“We don’t have the time or resources to take this on right now.”

We can function as an add-on to your existing team, so that they won’t be overwhelmed by even more to do. Plus, the new architecture could drastically reduce manual tasks that are taking up their time—freeing them up to focus more on generating game-changing insights.

Three framework components will help you reach new levels of data analytics.

Let’s make time for a conversation.

We want to hear about your current framework and your changing needs. 

Schedule a time with Geoff Speare, our practice director:

Geoff’s Calendar
GSpeare@IronsideGroup.com
O 781-652-5758  |  M 484-553-1814

Get our comprehensive guide.

Learn about our proven, streamlined approach to taking your current analytics framework from where it is to where it needs to be, for less cost and in less time than you might imagine.

Download the eBook now

Check out the rest of the series.

The Ironside Take30 webinar series premiered on April 16th, 2020, with the goal of sharing expert dialog on a variety of data and analytics topics with a wide range of audiences. The series has three primary dialog categories, each hosted by a BI Expert, Data Scientist, or Data Advisor. In the past year, we’ve shared best practices with over 200 companies ranging from Fortune 50 to small businesses. Our success has been measured by participants returning and telling a colleague; on average, each unique company attended over six Take30 sessions.

While some Data Advisor sessions are more technical, the focus is on describing concepts and tools at a less detailed level. We want to give people a sense of how rapidly the data and analytics environment is changing. To that end, the Data Advisor series worked with our partners, including IBM, AWS, Snowflake, Matillion, Precisely and Trifacta, to bring demonstrations of their tools and discuss the impact of their capabilities. We talked about the rapid expansion both of data and of the solution space to move, structure, and analyze that data. 

Most importantly, we had a special series on the Modern Analytics Framework, Ironside’s vision for a unified approach to insight generation that puts the right data in the hands of the right people. Regardless of your industry, your tools, or your use cases, you need a way to keep your data, users, and processes organized.

“What do you need in your data warehouse?” used to be the chief question asked when thinking about data for analytics. That time is past. Now, a data warehouse is just one possible source of analytics. Most organizations have so much data that building a warehouse to contain all of it would be impossible. At the same time, data lakes have emerged as a popular option. They can easily hold vast amounts of data regardless of structure or source. But that doesn’t mean the data is easy to analyze. 

And just as there’s no longer a single question to ask about structuring data, there’s no longer just one voice asking the question. Data scientists and data engineers are among the many personas that have emerged as consumers of data. Each has their own toolset(s) and preferences for how data should look for their purposes. 

All of this diversity demands a more distributed approach to ingesting, transforming and storing data – as well as robust and flexible governance to manage it. All of the topics covered in the past year of Take30 with a Data Advisor touch on these points, and on Ironside’s goal to help you make better decisions using your data.  Here are five of the 27 Data Advisor sessions we hosted this year: 

  • Modern Analytics Framework: Series Summary (7/30/20-9/1/21) – This 6-part series covers all aspects of Ironside’s Modern Analytics Framework: overall concepts, assessment and design, governance, identification of user personas, implementation, and usage. Whether you are looking to upgrade your existing analytics environment or to create one for the first time, this is an essential series, and one that Ironside will be expanding on in 2021.

  • Snowflake as a Data Wrangling Sandbox (6/3/20) – Snowflake is a tremendous cloud database platform for data storage and transformation. Its complete separation of compute and storage allows for many usage scenarios and, most importantly, for easy scalability based on data volume and consumption. Nirav Valia (from Ironside’s Information Management practice) presents one common Snowflake use case: using Snowflake as a data wrangling sandbox. Data wrangling typically involves unpredictable consumption patterns and the creation of new data sets, as an analyst seeks to discover new insights or answer new questions by manipulating data. Snowflake’s power and flexibility easily handle these types of activities without requiring up-front investment or significant recurring costs. It’s easy to create transformations, let them run, then let the data sit until it is needed again; a minimal sketch of this pattern appears after this list. (If you are interested in Snowflake, also consider our later Take30 Snowflake: Best Practices (9/24/20), including commentary from a Snowflake engineer.)

  • What is Analytics Modernization? How can Data Prep Accelerate It? (with Trifacta) (2/4/21) – Toward the end of our first year of Take 30s, we held a panel, hosted by Monte Montemayor of Trifacta, around data prep and accelerating analytics modernization. As I mentioned earlier, there is a tremendous amount of data available today – but getting it analytics-ready is a huge challenge. Tools like Trifacta (known as Advanced Data Prep, or ADP, in the IBM world) are extremely useful for giving analysts and business users the ability to visualize and address data quality issues in an automated fashion. This is useful for data science, dashboarding, data warehouses – any place where data is consumed. (If you are interested in Data Prep, check out IBM Advanced Data Prep in Action (7/8/20) and Data Wrangling made Simple and Flexible with Trifacta (5/6/20))

  • A Data Warehousing Perspective: What is IBM Cloud Pak™ for Data? (5/27/20) – IBM has created a single platform for data and analytics that works across cloud vendors and on-premises. If you want to be able to shift workloads between local nodes and the cloud easily, this is the solution for you. In this Take30, we provide an overview of the technologies that make Cloud Pak for Data possible, and how you can take advantage of them. (We also have a session, Netezza is back, and in the cloud (7/23/20), discussing Netezza, one of the many technologies available on the Cloud Pak platform.)

  • A Data Strategy to Empower Analytics in Higher Ed (7/1/20) – Occasionally, we have the opportunity to host an industry-specific Take30, and where possible, we have clients join us. Northeastern University joins this Higher Ed-focused Take30 to discuss the approach they took with Ironside in developing a multi-year roadmap. It was geared toward increasing the “democratization of data and analytics” by establishing the organizational foundation, technology stack, and governance plan necessary to grow self-service throughout the institution. Our discussion highlights the particular challenges of a decentralized, highly autonomous structure and shares the value of a data science pilot in the admissions area, executed during the strategy engagement to generate tangible results.
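As promised above, here is a purely illustrative sketch of the Snowflake sandbox pattern (not code from the session): it spins up a small auto-suspending warehouse and a zero-copy clone for wrangling. The account, credentials, and table names are hypothetical.

```python
import snowflake.connector

# Hypothetical connection details; in practice these come from a secrets manager.
conn = snowflake.connector.connect(
    account="my_account", user="analyst", password="***",
    role="WRANGLER", database="SANDBOX", schema="PUBLIC",
)
cur = conn.cursor()

# A small warehouse that suspends after 60 seconds of inactivity keeps
# sandbox compute costs close to zero between wrangling sessions.
cur.execute("""
    CREATE WAREHOUSE IF NOT EXISTS WRANGLE_WH
      WITH WAREHOUSE_SIZE = 'XSMALL' AUTO_SUSPEND = 60 AUTO_RESUME = TRUE
""")
cur.execute("USE WAREHOUSE WRANGLE_WH")

# A zero-copy clone gives the analyst a private copy of a production table
# to experiment on without duplicating storage.
cur.execute("CREATE OR REPLACE TABLE ORDERS_WRANGLE CLONE PROD.SALES.ORDERS")

cur.execute("SELECT COUNT(*) FROM ORDERS_WRANGLE")
print(cur.fetchone()[0])

cur.close()
conn.close()
```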

2020 was a unique year. At Ironside, it gave us the opportunity to reach out to customers in a new way – one that we are continuing into 2021. We look forward to more detailed sessions on the Modern Analytics Framework, and on trends and tools that we see gaining prominence. 

After a year of delivering these sessions, we’ve realized that customers are not only looking for specific solutions, but for a sense of where the analytics world is going. Which cloud platforms make the most sense? What transformation and data wrangling tools are the most useful? Should I redesign my warehouse or just add a data lake? We look forward to exploring those and other questions with you.

By 2022, 35% of large organizations will be either sellers or buyers of data via formal online data marketplaces, up from 25% in 2020. With AI and ML supplementing existing data sources, there is always more value to be derived from large quantities of data.

For years, the data management industry has been talking about the ever-growing volume, velocity, and variety of data. For traditional analytics, the challenge has been how to reduce the data used in reporting and BI: how to separate the noise from the signal, how to prioritize the most relevant and accurate data, and how to make a company’s universe of data usable to an increasingly self-service user population. This notion of having too much data is well-founded: much of the data in an organization isn’t readily useful for traditional analytics. Data may be incomplete, inaccurate, too granular, unavailable, or simply not useful for a particular use case. In implementing AI and ML, however, it turns out that having as much data as possible, from as many sources as possible, is one of the most important ingredients in building a successful model.

In traditional analytics, the user decides which data is most useful to their analysis and, in so doing, taints their results through their own intentional omissions and unintentional biases. But, in AI/ML (and especially when we’re leveraging Automated Machine Learning (AML) technologies), we really can’t have too much good data. We can throw massive amounts of data at the problem and let AML ascertain what’s relevant and helpful, and what isn’t. We want lots of data, and unfortunately we usually don’t actually have enough.

In a recent project, we met a customer who, like most, believed they had all the data they needed to accurately predict insurance loss risk: they knew their customers, their properties, various demographics, payment histories, and so on. So we built a loss prediction model for them and got good results. The customer was very pleased.

Then we decided to train the model with a combination of internal and third-party data to see whether there would be a difference. We loaded several sets of data that significantly enriched the customer’s already voluminous customer and property data. The result was a 25% increase in the efficacy of the AI model, which, as any data scientist will tell you, is a massive improvement. And the cost of that data was a drop in the bucket relative to the scope of the larger budget.
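As a purely illustrative sketch (not the actual project code), comparing a model trained on internal features alone against one trained on internal plus third-party enrichment features could look like the following with scikit-learn; the file, column names, and metric are hypothetical.

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

# Hypothetical policy-level dataset containing both internal and purchased columns.
df = pd.read_csv("policies_with_enrichment.csv")

internal_cols = ["property_age", "coverage_amount", "prior_claims", "payment_delinquencies"]
enriched_cols = internal_cols + ["flood_zone_score", "local_crime_index", "parcel_elevation"]

model = GradientBoostingClassifier(random_state=0)

# Cross-validated AUC with and without the third-party enrichment features.
baseline = cross_val_score(model, df[internal_cols], df["had_loss"], cv=5, scoring="roc_auc").mean()
enriched = cross_val_score(model, df[enriched_cols], df["had_loss"], cv=5, scoring="roc_auc").mean()

print(f"Internal features only:      AUC = {baseline:.3f}")
print(f"With third-party enrichment: AUC = {enriched:.3f}")
```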

My message to customers facing these issues has evolved; I now encourage them to seek out more data than they already have. The inclusion of external data at marginal cost can drive substantial improvements in the quality of models and outputs. And many data vendors have made it easier to test, acquire, and parse data for where it is most impactful. The bottom line is that, in the area of AI, more is definitely better, and you can never be too data-rich. 

Ironside and our partner Precisely recently published a white paper on data enrichment for data science, which you can download here.