Big Data Training

Forscher Education's Big Data Training Program delivers the focused product knowledge and technical skills required to perform day-to-day management operations in enterprise environments, whether within your own or your customer’s organization.

“Demonstrate your expertise with the most sought-after technical skills. Big data success requires professionals who can prove their mastery with the tools and techniques of the Hadoop stack.”

MODULE 1: AN INTRODUCTION TO BIG DATA 

  • What is Big Data?
  • Types of big data
  • Why analyse Big Data?
  • The need to analyse new, more complex data sources
  • Industry use cases – Popular big data analytic applications
  • What is Data Science?
  • Data Warehousing and BI Versus Big Data
  • Popular patterns for Big Data technologies

MODULE 2: AN INTRODUCTION TO BIG DATA ANALYTICS

  • Types of Big Data analytical workloads
  • Streaming data analytics at high velocity
  • Exploratory analysis of multi-structured data
  • Complex analysis of structured data
  • Graph analytics
  • Challenges when managing and analysing big data

MODULE 3: BIG DATA PLATFORMS AND STORAGE OPTIONS

  • The new multi-platform analytical ecosystem
  • Beyond the data warehouse – Hadoop, NoSQL, analytical RDBMSs and NewSQL DBMSs
  • NoSQL DBMSs
  • Key Value stores, Document DBMSs, Column Family DBMSs and Graph databases
  • An introduction to Hadoop and the Hadoop Stack
  • HDFS, MapReduce, Pig & Hive
  • Hadoop 2.0 and the Spark framework
  • SQL on Hadoop options
  • The Big Data Marketplace
  • Hadoop distributions – Cloudera, Hortonworks, MapR, IBM BigInsights, Microsoft HDInsight, Pivotal HD
  • Big Data Appliances – Oracle Big Data Appliance, IBM PureData System for Hadoop, HP HAVEn, Teradata Aster Discovery Server
  • NoSQL databases, e.g. DataStax, Neo4j, YarcData, MongoDB, Riak
  • Analytical databases and DW appliances, e.g. Teradata, Exasol, IBM PureData, Oracle Exadata, SAP HANA, Kognitio, Actian ParAccel

MODULE 4: BIG DATA INTEGRATION AND GOVERNANCE IN A MULTI-PLATFORM ANALYTICAL ENVIRONMENT

  • Types of Big Data
  • Connecting to Big Data sources, e.g. web logs, clickstream, sensor data, and multi-structured content
  • Supplying consistent data to multiple analytical platforms
  • Loading Big Data – what’s different about loading HDFS, Hive & NoSQL Vs analytical relational databases
  • Change data capture – what’s possible
  • Data warehouse offload
  • Tools for ELT processing on Hadoop – The Enterprise Data Refinery
  • ETL tools Vs Pig Vs self-service DI/DQ
  • Dealing with data quality in a Big Data environment
  • Parsing unstructured data

MODULE 5: TOOLS AND TECHNIQUES FOR ANALYSING BIG DATA

  • Data Science projects
  • Creating Sandboxes for Data Science projects
  • Options for analysing unstructured content – Text analytics, custom MapReduce code and MapReduce developer tools
  • Using R as an analytical language for Big Data
  • Text analysis and visualisation, Sentiment analysis and visualisation
  • Clickstream analysis and visualisation
  • Analysing big data using MapReduce, BI tools and applications for Hadoop, e.g. Datameer, Karmasphere, Platfora, IBM Customer Insight
  • Exploratory graph analysis and visualisations
  • Big data analytics – query performance enablers

MODULE 6: INTEGRATING BIG DATA ANALYTICS INTO THE ENTERPRISE

  • Integrating Big Data platforms with traditional DW/BI environments – what’s involved
  • Integrating stream processing with Hadoop and Analytical DW Appliances
  • Integrating Hadoop with DW Appliances and Enterprise Data Warehouses
  • Tying together front end tools
  • Options for implementing multi-platform analytics
  • Cross-platform analytical workflows
  • The role of Data Virtualisation in a Big Data environment
  • Multi-platform optimisation

 
