Abstract: Learn how we built, tested and delivered a near-real-time data pipeline using Apache Spark in the cloud in two weeks, and still saw our families. We faced a looming deadline and real-time analytics requirements. Using a cloud-based platform with Spark and Impala running on Microsoft Azure, and armed with a few hundred lines of Python code, we designed, tested and deployed an end-to-end data pipeline and analytics infrastructure in two weeks. The project had its challenges, both technical and operational; hear what we learned along the way and our tips for success.