Blog

At AWS re:Invent today, Cazena announced a concept called AppCloud that allows enterprises to attach innovative analytic or machine learning applications to their enterprise data in the cloud.

Read more ›

Microsoft’s Azure Data Lake Store (ADLS) is a highly scalable storage solution that boasts the ability to store trillions of files, including files a petabyte in size. It is also 3x cheaper than running HDFS on Azure’s Standard Storage solution. We have been eager to see what it can do. Cloudera recently published an analysis, and we did some complementary benchmarking of popular SQL-on-Hadoop tools. In this Cazena Engineering blog post, we present our findings and assess the price-performance of ADLS vs. HDFS.

Read more ›

September 25, 2017

Cazena: The EZ-PaaS for Big Data

By: Prat Moghe

As enterprises seek to migrate and manage their production analytic workloads in the public cloud, we increasingly hear of teams considering PaaS offerings (such as AWS EMR, Redshift, Azure HDI) as the first stop for implementation. But many don’t realize that PaaS provides only a foundational analytic capability in the cloud. Significant additional work is needed to migrate and manage analytics in production.

Read more ›

A telecommunications company called Cazena to learn about our Data Lake as a Service, hoping that it would help them deliver an at-risk project. It definitely would: it’s a cloud use case we know well, with a solution designed accordingly.

Read more ›

When we started the Cazena journey, we commissioned a big data survey and research analysis from GigaOm. Our goal was to discover the greatest challenges for organizations adopting cloud services for analytics, data science or big data. The top challenge? Security, unsurprisingly.
 

Read more ›

Cazena partner Sentier Health Informatics weighs in on their experiences navigating the complicated world of cloud security in a regulated industry. Sentier is focused on healthcare, life sciences, and insurance organizations that are quickly discovering that judicious use of public cloud services is critical for competitive advantage in predictive analytics.
 

Read more ›

As part of our ongoing series on Productionizing Hadoop and Spark in the cloud, we explore performance optimization, and how companies scale and tune for the best performance. We also discuss what’s required for production-grade deployments, often an underestimated part of the process.
 

Read more ›

In the first installment of our series on The Hidden Challenges of Putting Hadoop and Spark in Production, we explore the infrastructure selection process. This series was inspired by a recent Gartner survey, which estimates that only 14% of Hadoop deployments are in production. We’re not surprised....
 

Read more ›

Leaders of data science and advanced analytics groups keep mentioning one theme: all are focused on making their teams as productive as possible. The resources for these teams are notoriously hard to find. So, naturally, team leaders want to ensure that these scarce, highly skilled workers have everything they need to be efficient. Here are the most common pitfalls we hear about. Do you agree?

Read more ›
