Cloudera Certification Training for Apache Hadoop Administrators

Course Summary

This three-day hands-on training course is for system administrators and others responsible for managing Apache Hadoop clusters in production or development environments.

You Will Learn:

  • How the Hadoop Distributed File System and MapReduce work
  • What hardware configurations are optimal for Hadoop clusters
  • What network considerations to take into account when building out your cluster
  • How to configure Hadoop’s options for best cluster performance
  • How to configure the FairScheduler to provide service-level agreements for multiple users of a cluster
  • How to maintain and monitor your cluster
  • How to load data into the cluster from dynamically generated files using Flume, and from relational database management systems using Sqoop
  • What system administration issues exist with other Hadoop projects such as Hive, Pig, and HBase

Audience

This course is designed for people with at least a basic level of Linux system administration experience. Prior knowledge of Hadoop is not required.

Additional Notes

Download the full agenda for Cloudera’s Administrator Training for Apache Hadoop.

Hands-On Exercises

Throughout the course, hands-on labs help students build their knowledge and apply the concepts being discussed.

Certification Exam

Following the training, attendees will take an exam which leads to the Cloudera Certified Administrator for Apache Hadoop (CCAH) credential.


Information

  • 2012 dates: June 25 to 27, October 22 to 24, December 10 to 12
  • Type: Inter-company and in-house
  • Location: Paris, or as requested by the client
  • Duration: 3 days
  • Price: 1995 excl. tax (HT)

Introduction

An Introduction To Hadoop And HDFS

  • Why Hadoop?
  • HDFS
  • MapReduce
  • Hive, Pig, HBase, and Other Ecosystem Projects
  • Hands-On Exercise

Planning Your Hadoop Cluster

  • General Planning Considerations
  • Choosing The Right Hardware
  • Network Considerations
  • Configuring Nodes

Configuring and Deploying Your Cluster

  • Deployment Types
  • Installing Hadoop
  • Using Cloudera Manager for Easy Installation
  • Typical Configuration Parameters
  • Configuring Rack Awareness
  • Using Configuration Management Tools
  • Hands-On Exercise

Managing and Scheduling Jobs

  • Managing Running Jobs
  • Hands-On Exercise
  • The FIFO Scheduler
  • The FairScheduler
  • Configuring the FairScheduler
  • Hands-On Exercise

Cluster Maintenance

  • Checking HDFS Status
  • Hands-On Exercise
  • Copying Data Between Clusters
  • Adding and Removing Cluster Nodes
  • Rebalancing the Cluster
  • Hands-On Exercise
  • NameNode Metadata Backup
  • Cluster Upgrading

Cluster Monitoring and Troubleshooting

  • General System Monitoring
  • Managing Hadoop’s Log Files
  • Using the NameNode and JobTracker Web UI
  • Hands-On Exercise
  • Cluster Monitoring with Ganglia
  • Common Troubleshooting Issues
  • Benchmarking Your Cluster

Populating HDFS From External Sources

  • An Overview of Flume
  • Hands-On Exercise
  • An Overview of Sqoop
  • Best Practices for Importing Data

Installing And Managing Other Hadoop Projects

  • Hive
  • Pig
  • HBase

Conclusion

Cloudera Certified Administrator for Apache Hadoop Exam