Using Kudu with Apache Spark and Apache Flume Training Video

A Practical Training Course That Teaches Real World Skills

In this project-based video tutorial series on using Kudu with Apache Spark and Apache Flume, you'll quickly gain relevant skills for real-world applications.

Follow along with our expert instructor in this training course to get:

  • Concise, informative and broadcast-quality Using Kudu with Apache Spark and Apache Flume training videos delivered to your desktop
  • The ability to learn at your own pace with our intuitive, easy-to-use interface
  • A quick grasp of even the most complex Using Kudu with Apache Spark and Apache Flume subjects, because they're broken into simple, easy-to-follow tutorial videos

Practical working files reinforce each lesson and help you retain what you learn, so you'll know the exact steps to follow in your own projects.

SKU: 02470 | Duration: 0.5 hours - 6 tutorial videos | Date Released: 2017-03-10
Works on: Windows PC or Mac | Format: Download | Instructor: Ryan Bosshart

Course Description

Apache Kudu, the breakthrough storage technology, is often used in conjunction with other Hadoop ecosystem frameworks for data ingest, processing, and analysis. This is a practical, hands-on course that shows you how Kudu works with four of those frameworks: Apache Spark, Spark SQL, MLlib, and Apache Flume.

You'll use the Kudu-Spark module with Spark and Spark SQL to create, move, and update data between Kudu and Spark, then use Apache Flume to stream events into a Kudu table, and finally query that table using Apache Impala. The course is designed for learners with some experience using Hadoop ecosystem components such as HDFS, Hive, Spark, or Impala.

* Get hands-on experience with Kudu and add more tools to your Big Data toolbox

* Learn how to move data between Kudu tables and Spark apps using the Kudu-Spark module

* Understand how to stream and analyze data in real time with Flume and Kudu

* Create a movie ratings predictor using Spark MLlib and save the predicted values into Kudu

* See how these open source tools combine to create simple and fast data engineering pipelines

Ryan Bosshart is a Principal Systems Engineer at Cloudera, where he leads a specialized team focused on Hadoop ecosystem storage technologies such as HDFS, HBase, and Kudu. An architect and builder of large-scale distributed systems since 2006, Ryan is co-chair of the Twin Cities Spark and Hadoop User Group. He speaks about Hadoop technologies at conferences throughout North America and holds a degree in computer science from Augsburg College.
