- Getting Started with Greenplum for Big Data Analytics
- Sunila Gollapudi
- 2025-02-22 07:02:40
What this book covers
Chapter 1, Big Data, Analytics, and Data Science Life Cycle, defines the core aspects of Big Data and introduces readers to standard analytical techniques. It covers the philosophy of data science, with a detailed overview of the standard life cycle and its steps in a business context.
Chapter 2, Greenplum Unified Analytics Platform (UAP), describes the architecture of the Greenplum Unified Analytics Platform (UAP) and its application in the context of Big Data analytics. It covers both the appliance and the software parts of the platform. Greenplum UAP combines the ability to process structured and unstructured data with a productivity engine and a social network engine that removes the barriers between data science teams. The chapter also describes tools and frameworks that integrate into the platform, such as R, Weka, and MADlib.
Chapter 3, Advanced Analytics – Paradigms, Tools, and Techniques, introduces standard analytic paradigms with a deep dive into some core techniques such as simulations, clustering, data mining, text analytics, decision trees, association rules, linear and logistic regression, and so on. R programming, Weka, and in-database analytics using MADlib are introduced in this chapter.
Chapter 4, Implementing Analytics with Greenplum UAP, covers the implementation aspects of a data science project using the Greenplum analytics platform. It provides a detailed guide to loading structured and unstructured data into, and unloading it from, Greenplum and HD, along with an approach to integrating Informatica PowerCenter, R, Hadoop, Weka, and MADlib with Greenplum. Chorus and other Greenplum-specific in-database analytic options are also discussed.