Big Data is here! Hadoop is one of the most popular new frameworks because it offers:
- huge storage & processing capacity
- expandability
- redundancy (fault tolerance)
- open source/low cost.
By combining Hadoop with SAS, you get the Big Data benefits of the Hadoop ecosystem plus the unmatched analytical capabilities of the SAS System.
Course details
Introduction to SAS and Hadoop
Duration
2 days
Date
TBA
Price
AU$2,200 + GST
Overview
This course teaches you how to use SAS programming methods to read, write, and manipulate Hadoop data. The Base SAS methods covered include reading and writing raw data with the DATA step, managing the Hadoop file system, and executing MapReduce and Pig code from SAS with PROC HADOOP. The course also covers the SAS/ACCESS Interface to Hadoop methods, LIBNAME access and SQL pass-through, for reading and writing Hive or Cloudera Impala table structures. Finally, the course gives a brief overview, without going into detail, of additional SAS and Hadoop technologies, including DS2, High-Performance Analytics, SAS LASR Analytic Server, and In-Memory Statistics, along with the computing infrastructure and data access methods that support them.
Learn how to:
- access Hadoop distributions using the LIBNAME statement and the SQL Pass-Through Facility
- create and use SQL procedure Pass-Through queries
- use options and efficiency techniques for optimizing data access performance
- join data using the SQL procedure and the DATA step
- read and write Hadoop files with the FILENAME statement
- execute Hadoop commands with PROC HADOOP
- use Base SAS procedures with Hadoop.
Who should attend
SAS programmers who need to access data in Hadoop from within SAS.
Prerequisites
Before attending this course, you should be comfortable programming in SAS and Structured Query Language (SQL). You can gain SAS programming experience from the SAS Programming 1: Essentials course and SQL experience from the SAS SQL 1: Essentials course. A working knowledge of Hadoop is helpful.
Software Addressed
This course addresses Base SAS and SAS/ACCESS software, specifically the SAS/ACCESS Interface to Hadoop.