Microsoft’s Azure Data Lake Store (ADLS) is a highly scalable storage solution that boasts the ability to store trillions of files, including files a petabyte in size. It is also 3x cheaper than running HDFS on Azure’s Standard Storage solution. We have been eager to see what it can do. Cloudera recently published an analysis, and we did some complementary benchmarking of popular SQL-on-Hadoop tools. In this Cazena Engineering blog post, we present our findings and assess the price-performance of ADLS vs. HDFS.
As enterprises seek to migrate and manage their production analytic workloads in the public cloud, we increasingly hear of teams considering PaaS offerings (such as AWS EMR, Redshift, Azure HDI) as the first stop for implementation. But many don’t realize that PaaS provides only a foundational analytic capability in the cloud. Significant additional work is needed to migrate and manage analytics in production.
A telecommunications company called Cazena to learn about our Data Lake as a Service, hoping that it would help them deliver an at-risk project. It definitely would: it’s a cloud use case we know well, with a solution designed accordingly.
As part of our ongoing series on Productionizing Hadoop and Spark in the cloud, we explore performance optimization, and how companies scale and tune for the best performance. We also discuss what’s required for production-grade deployments, often an underestimated part of the process.
There is an interesting theme among the leaders of data science and advanced analytics groups: all are focused on making their teams as productive as possible. People for these teams are notoriously hard to find. So, naturally, team leaders want to ensure that these scarce, highly skilled workers have everything they need to be efficient. Here are the most common pitfalls we hear about. Do you agree?
Over the past few years, I have observed a deepening organizational divide in large data-driven companies. On one hand, IT and data owners have their hands full managing their current data infrastructure and platforms.
Japanese rock gardens, or zen gardens, were first constructed centuries ago at temples as aids to meditation. Also called “dry landscapes,” zen gardens are designed as miniature models of natural landscapes. This practice of artfully modeling the world in miniature seemed like a beautiful analogy to launch our new Data Science Sandbox as a Service…
In the past, protecting and securing enterprise data was simpler, handled mainly through basic perimeter-based devices like firewalls and intrusion protection services. As more and more enterprises look to migrate or augment their big data clusters with the cloud, the number of access points to their data continues to grow exponentially. For the modern enterprise, perimeters are almost gone. Thorough security and compliance measures for this newly distributed data are now a top priority for CISOs and security teams, and the topic has been well covered in several recent articles around the web.
For our upcoming webinar, we’re proud to feature guest speaker Mike Gualtieri, Forrester VP and principal analyst, an industry favorite. Why do we like him – especially on these topics? Well, as an industry analyst, Mike has a fascinating coverage area (bio), which includes big data and IoT strategy, Hadoop/Spark, predictive analytics, streaming analytics, prescriptive analytics, machine learning, data science, Artificial Intelligence, and emerging technologies.
In an upcoming webinar, we'll get up to date on Cloud Data Warehousing with guest speaker Noel Yuhanna, Forrester principal analyst; Prat Moghe (Cazena founder & former Netezza SVP); and... you! Noel will present a succinct overview of cloud data warehousing, a market he's tracked as an analyst since its inception. Then, we've planned lots of time for your questions and an interactive discussion, moderated by yours truly.
This week Cazena made a major announcement, un-coincidentally timed with Strata + Hadoop World. We seriously enhanced our Data Lake as a Service, which is based on Cloudera Enterprise, runs on Microsoft Azure or AWS, and includes many new features for data science. Read more here. It’s been exciting to see the momentum in the Big Data as a Service category, and I loved sharing the news at Strata. Walking through the buzzy hum of expo floor conversations, I overheard the same terms over and over.
37 marketing emails from retailers landed in my inbox today before noon. (For real.) Now, don’t shed a tear for me; I’m fine with it. Retail emails go into their own folder, where they don’t distract me until I’m ready. Plenty of them will never be opened, depending both on the subject line and my schedule. Suffice it to say, I’m a fan of retail, often calling myself a “professional” shopper. I like to find exactly the right item, ideally with some efficiency. I like sales. I like to know about new products. So, I appreciate (and reward!) companies that use analytics wisely.
Next week, I’ll be speaking about big data technologies at the Cloud Expo in New York City, June 7 – 9. The “expo” label is inspiring, reminding me of world expos throughout history. These huge national events feature exhibits, performances and activities, often with global, futuristic themes. The first event like this was “The Great Exhibition” in London in 1851. In the following decades, events became so popular that the Bureau of International Expositions formed in 1928 to oversee the calendar and consistency of expos.
I am excited to share that Gartner Inc. recently named Cazena in its Cool Vendors in DBMS report for 2016! We’re proud to be included in the research.
Each year, Gartner identifies "Cool Vendors" in key technology areas and publishes a series of research reports highlighting pioneering vendors and their products and services. This year, analysts named Cazena’s Big Data as a Service, along with four other vendors, in the report.
Strata+Hadoop World (Spring Edition) is just around the corner. This year, I’m speaking about best practices for enterprise adoption of the cloud for big data. Along the way, I’ll share some real-world stories from our collaborations with enterprises. The cloud project that looked easy, started great, and eventually took three years to reach production? My colleagues call it the IT zombie apocalypse. The innocent analyst who inadvertently caused the $57,650 AWS bill? Funny, if they’re not on your team.
You always remember your first…database. You didn’t quite know what you were doing and it was a bit awkward, but you figured it out and eventually ran your first query. Whether you built it, maintained it, used it or cursed it, I'm guessing that you have at least one memorable database in your past. Can you plot it on our new infographic below and share it?
Rob Ramrath is that rare breed - a long-lived CIO whose team is intimately involved in creating value for customers. Rob has led IT at Bose for 17 years, the last 10 as CIO. Survival and success required transforming his IT organization for the digital connected world. Rob and I sat down in his office to discuss what he’s learned over a remarkable career. Five lessons jumped out at me.
It can be hard to present the first session of the day at a tech conference, especially on day two, after the previous night’s parties. Frankly, I worried no one would come. On Twitter I threatened to bring my ukulele and sing parody songs about the cloud - “Rockin’ The PaaS-bah” or “IaaS a Rock” (IaaS an island?). But when a good crowd of attendees streamed in just before the start time, sans beers or concert t-shirts, I decided to stick with the planned presentation and spare them my songwriting skills.
In the 15th century, innovative Scottish farmers seeking a beer-alternative discovered distilling, a new way to harvest the life or “spirits” from barley. Malting, mashing, fermenting and boiling the grain produced a steamy cloud that rained down its concentrated essence into an alcoholic, amber liquid now known as scotch whisky. It was arguably the earliest-ever case of a new cloud producing positive, measurable value!
Earlier this week, our founder Prat authored an article on InsideBigData about planning for data lakes. He shared five big questions to consider. Experience tells us this is necessary. Far too many people jump into data lakes hastily, and not just as a justification to overuse water-related puns. It’s more often an excuse to splash around with Hadoop, which is an understandable impulse given all the hype.
It’s exciting to see Big Data as a Service, or “BDaaS” as it’s inevitably abbreviated, start to take off. Numerous media outlets have covered the trend’s potential and challenges. Social media (namely, Twitter) continues to share a Forbes article about Big Data as a Service by Bernard Marr, author and industry expert. Marr estimates an impressive $30 billion valuation of the market by 2021. Yet, he explained:
Last week, Michael Copeland of Andreessen Horowitz interviewed me and Peter Levine, General Partner at Andreessen Horowitz, on the topic of big data moving from on-prem infrastructure to the cloud.
It was a fun chat after we literally took over Peter’s office while he was attending a board meeting! In the interview, Peter, Michael and I discuss how:
We founded Cazena with a vision to provide Big Data on Demand: data made available for access and analysis simply and immediately. Today is an exciting milestone. After two years of development, we are officially announcing Cazena’s Big Data as a Service. This service essentially wires up enterprises for Big Data on Demand. Let me explain.
Manjit Singh, CIO of The Clorox Company, has successfully led enterprises through scale and transformation. Previously he was an executive at Box and CIO at Las Vegas Sands and Chiquita Brands. Manjit is also an early leader in big data. Recently he shared with me his perspective on the challenges of the CIO role and the transformative potential of the cloud and big data.
Three observations stood out.
Our CEO Prat Moghe recently wrote an article for Multichannel Merchant on findings from research we commissioned. To learn more about what is motivating retailers to move their big data analytics to the cloud, read the article here: Retailers Are Marching Toward the Big Data Analytics Cloud
A few days ago we released the results of our Enterprise Big Data survey, led by GigaOm Research Director for Big Data and Analytics Andrew Brust. The report provides a clear view into how people are really thinking about Big Data, within all the context and constraints of operating a large enterprise.
Some time back, I had lunch with the CTO of a well-established financial services enterprise. I posed the question: what do you guys think about doing big data processing in the cloud? That created a fairly violent reaction, followed by a less violent discussion, which eventually left me with a few really interesting points and questions.
WALTHAM, Massachusetts (November 10, 2014) – Gigaom Research today announced the findings of a study that reveals that the majority of enterprises are looking to leverage public cloud resources for their big data analytic needs. The report, “How enterprises will use the cloud for big data analytics,” also suggests that potential migrations to the cloud are being delayed due to a number of significant concerns, chief among them security, compliance and complexity.
If you had told me only a year ago that Cazena would launch today with a team of five big data and infrastructure superstars, I would have laughed you off. But that’s exactly what has happened. Jit Saxena, Jim Baum, Peter Levine, Steve Papa and Ed Anderson have joined forces with me to announce our latest venture, Cazena.
WALTHAM, Mass. (Oct. 20, 2014) – Cazena today announced it has assembled a team of technology luminaries to take on the next challenge in big data. To execute its vision, Cazena has raised $8 million in Series A funding led by Andreessen Horowitz and North Bridge Venture Partners.