
Understanding the Bardess Data Science Maturity Curve: Phase Two “Emerging”

Hannah Smalltree, Cazena & Daniel Parton, Lead Data Scientist, Bardess

The Data Science Maturity Curve

The Data Science Maturity Curve is a tool developed by our partners at Bardess Group. I recently interviewed Daniel Parton, Lead Data Scientist with Bardess, for more insight into each stage of the curve.

The Bardess Data Science Maturity Curve



Data Science Maturity Curve – Phase Two: Emerging

As organizations mature their data science programs, they move out of Phase 1 and into Phase 2, labeled the “emerging” stage on the Bardess Data Science Maturity Curve. During this growth period, companies generally hire or allocate teams. Some hire data scientists, while others expand the roles of existing analysts.

This is often how a lot of companies get started with data science, Parton said. “They’re maybe not ready to kind of completely bite the bullet and hire a whole data science team, but they have some existing analysts who start to work on some data science projects.”

During this stage, technology support may be nascent or growing, Parton acknowledged. Data scientists may be downloading tools and datasets to their laptops, or starting to use the cloud. This exploratory stage is part of the maturation process, but it’s also when things can be very risky, Parton explained.

Some may upload data to the cloud, which could be a security risk or a compliance problem. Lack of integration can also be an issue, whether with the cloud vendor for security or with the business. Business integration is critical.

Integration with business workflows is particularly important at this stage. This might mean integration with key business processes or even business applications.

Parton gave an example of a prediction model:

“It’s one thing to build a model and make some predictions,” Parton explained. “But how do you then integrate that so you are getting fresh predictions every night or every hour?”

This is where your data stack and data pipeline become very important for outcomes. For input, you’ll want recent, accurate data to feed and optimize models. For output, you’ll want accurate predictions surfaced at the right place and time, ideally in the right application or business process. Trying to do this manually can be very difficult, if not impossible, to scale.
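To make the input and output sides concrete, here is a minimal Python sketch of a scoring job that could run every night or every hour. It is an illustration only, not something Parton or Bardess described: the names `Record`, `Prediction`, `select_fresh`, `run_scoring_job`, and `toy_model` are hypothetical, the 24-hour freshness window is an assumption, and the "model" is a stand-in for whatever a team has actually trained.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable, List, Sequence


@dataclass
class Record:
    """One input row (hypothetical schema, for illustration only)."""
    customer_id: str
    updated_at: datetime
    features: Sequence[float]


@dataclass
class Prediction:
    """One output row, stamped with the time it was scored."""
    customer_id: str
    score: float
    scored_at: datetime


def select_fresh(records: Sequence[Record], now: datetime,
                 max_age: timedelta) -> List[Record]:
    """Input side: keep only records updated within the freshness window."""
    return [r for r in records if now - r.updated_at <= max_age]


def run_scoring_job(records: Sequence[Record],
                    model: Callable[[Sequence[float]], float],
                    sink: Callable[[List[Prediction]], None],
                    now: datetime,
                    max_age: timedelta = timedelta(hours=24)) -> List[Prediction]:
    """Score fresh records and hand the predictions to a downstream sink."""
    fresh = select_fresh(records, now, max_age)
    predictions = [Prediction(r.customer_id, model(r.features), now) for r in fresh]
    # Output side: the sink stands in for a business application or process.
    sink(predictions)
    return predictions


def toy_model(features: Sequence[float]) -> float:
    """Placeholder model (assumption): the mean of the features."""
    return sum(features) / len(features)


now = datetime(2024, 1, 2, tzinfo=timezone.utc)
records = [
    Record("a", now - timedelta(hours=2), [1.0, 3.0]),  # recent, gets scored
    Record("b", now - timedelta(days=3), [0.9, 0.9]),   # stale, gets skipped
]
delivered: List[Prediction] = []  # stands in for the target application
run_scoring_job(records, toy_model, delivered.extend, now)
```

In practice a scheduler such as cron or a workflow orchestrator would trigger a job like this on a timer, and the sink would write to a database or application rather than an in-memory list. Automating that loop is what replaces the manual steps that are so hard to scale.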

But getting it right helps move organizations to Phase 3, the “functional” phase. This is where many organizations land after establishing a team and experiencing some early wins. However, this phase introduces another set of challenges…read on to learn more.

Watch this Data Science Maturity Curve interview excerpt to hear Daniel share more about “Phase Two – Emerging” and why technology is important for operationalizing data science and growing impact. Hear more advice about data science platforms and technology.
