AWS recently released its newest one-day training course, Building a Serverless Data Lake, and we want to answer some questions about the course and share details that go beyond the course outline. But first:
What exactly is a data lake?
A data lake provides storage for massive amounts of data of any type. It also delivers enormous processing power and the ability to handle a virtually unlimited number of tasks at the same time.
According to AWS, there are three defining characteristics of a data lake:
- It can collect everything by providing unlimited storage.
- It provides the ability to dive in anywhere to enable different departments or even third-party customers to search and analyze data.
- It enables flexible access, supporting multiple data access patterns for querying, in-memory processing and more.
The benefits extend even further. Data lakes let you examine your data in new ways and extract business value from it. By separating analytics from storage, you can reduce costs. Data lakes also allow you to use the most advanced business intelligence (BI) and big data tools, as well as analyze streaming data in real time.
Perhaps most importantly, data lakes retain all your data for years. You may find yourself collecting data that you have no current use for. While it isn’t providing business value now, you may discover a business need for it in the future, and a data lake allows you to revisit all of the data you have collected.
What is the goal of this course?
We know that most organizations, especially those using the power of the cloud, are collecting large amounts of data. Whether that data is structured or unstructured, questions remain, such as:
- How will this remain secure?
- What tools can I use to process all the data we’ve collected?
- What can I build to enable people to use this data effectively?
- How can we analyze this data in near real time?
Building a Serverless Data Lake on AWS aims to address these questions so companies can get the most bang for their buck from the data they’re collecting.
What will you learn?
This course will help you understand how to make better use of your data, and it does so through instructor-led, hands-on training. You’ll have access to one of our experts, who has personal experience using the tools. You’ll be able to “play” safely in a lab environment while learning skills you can apply in your real AWS environment when you return to work.
Who should take it?
From solutions architects to big data developers, data architects and those who like to get their hands dirty analyzing data (data analysis practitioners), this course will help a wide range of professionals maximize the tools and services within their AWS environments.
AWS Enterprise Solutions Architect Sebastian Menant spoke about this course at an AWS Summit in 2016.
“Beware! We want to make sure you build the appropriate data lake,” Menant said. “You’ll want good design and good management, because if you don’t, this will turn into a data swamp, something we want to avoid!”
To get the most out of this course as a student, you’ll want experience with Amazon Elastic Compute Cloud (EC2) and Amazon Simple Storage Service (S3), ideally some knowledge of a programming or scripting language, and familiarity with Linux operating systems (OS) and command-line interfaces.
How will this course impact your job?
Building a Serverless Data Lake on AWS will advance your ability to build a solid metadata index and data lake architecture, follow the “ingest, store, process and analyze” workflow, and choose the best data processing tools for the business use case at hand. The instructor will also cover best practices for deployment and operations so you’re ready to build a serverless data lake on AWS as soon as you return to work.
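If the “ingest, store, process and analyze” workflow and the idea of a metadata index feel abstract, here is a minimal Python sketch of the concept. It is not taken from the course material and it does not call any AWS service: the dictionaries `object_store` and `metadata_index`, the function names and the sample records are all hypothetical stand-ins that we made up for illustration.

```python
import json

# Hypothetical in-memory stand-ins. A real data lake would use managed
# storage and catalog services instead of Python dictionaries.
object_store = {}    # key -> raw bytes (the "store" step)
metadata_index = {}  # key -> descriptive metadata about each object


def ingest(key, record, source):
    """Ingest one record: keep the raw payload and index its metadata."""
    payload = json.dumps(record).encode("utf-8")
    object_store[key] = payload
    metadata_index[key] = {
        "source": source,
        "size_bytes": len(payload),
        "fields": sorted(record),
    }


def find_keys(field):
    """Use the metadata index to find objects that contain a given field."""
    return sorted(k for k, meta in metadata_index.items() if field in meta["fields"])


def process(keys):
    """Process: decode the raw payloads back into dictionaries."""
    return [json.loads(object_store[k].decode("utf-8")) for k in keys]


def analyze(records, field):
    """Analyze: compute the average of a numeric field."""
    values = [r[field] for r in records]
    return sum(values) / len(values)


# Everything is collected, even data with no obvious use yet.
ingest("sales/2024-01.json", {"region": "east", "revenue": 120.0}, source="crm")
ingest("sales/2024-02.json", {"region": "west", "revenue": 80.0}, source="crm")
ingest("logs/app.json", {"level": "info", "message": "started"}, source="app")

revenue_keys = find_keys("revenue")
average_revenue = analyze(process(revenue_keys), "revenue")
print(revenue_keys)
print(average_revenue)
```

The point of the sketch is the separation of concerns: raw data is stored untouched, the index records only what is needed to find it later, and processing and analysis run on whatever subset the index returns.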