Build versus buy: it is the age-old question that businesses regularly face. As an engineer, my first reaction is, “I can build that!” For most engineers, this is true. With some effort, you might be able to build your own cloud data lake for analytics and machine learning. But at what cost?
For a moment, let’s compare the process of building your own “do it yourself” (DIY) data lake to the process of generating your own electricity. Electricity is very convenient to use – simply plug your device into the wall receptacle and you’re done. It’s rare to think about how electricity gets to that plug from the grid – that is, unless you have a power outage and see a worker on the electrical pole!
With a little know-how, it’s fairly easy to build an electrical source yourself. You could start small and purchase a portable generator that can supply a household with electricity. Simply fill it with gasoline, connect it to your home electrical panel and off you go.
Until you need more gasoline, that is. Or the generator needs routine maintenance, or it breaks down. Because it’s your only source of electricity, it is up to you to safely build, maintain and monitor the equipment powering your home.
Ultimately you have to decide: is your time better spent creating and maintaining the supply? Or is it better to just buy electricity and spend your time on other things?
Now, let’s apply this analogy to cloud data lakes. Building your own cloud data lake does not, in itself, give your company a competitive advantage. The advantage you seek lies in the analytics and data science results, not in the data lake infrastructure.
This blog series will focus on the challenges and hidden costs that you will have to address when building your own cloud data lake for analytics and machine learning.