Data Preparation with KNIME

1.  Using the 003001_SimpleFlow_with_Statistics workflow, if we are interested only in the 35-45 working age group, what is the mean capital-loss?
Options: 152.6; 134.9; 187.7
2.  If we ask the same question but limit it to people with doctorate degrees, what would be the mean capital-loss?
Options: 208.6; 203.4; 201.9; 203.7
3.  Using the 003002_StandardProcessing workflow and editing the Nominal Value Row Filter node (#7), what is the highest Hours-per-week for the…

Coursera : Spark Lesson 3

Check all true statements about the Directed Acyclic Graph Scheduler.
Options: Each transformation is executed as soon as it is called on an RDD; The DAG is managed by the cluster manager; A DAG is used to track dependencies of each partition of each RDD; If a partition is lost, the DAG is traversed forward to check what other steps are affected
Why is building a DAG necessary in Spark but not in…

Data Science Basics

Find below the results for this quiz.
1.  What is Data Science?
Options: Multidisciplinary; Just another way to describe statisticians; A relatively new discipline; Extraction of knowledge from large volumes of data
2.  Which one of the V’s below does NOT describe one of the 4 major characteristics of Big Data?
Options: Volume; Variety; Velocity; Viscosity; Veracity
3.  Correlation implies Causality?
Options: True; False
4.  According to McKinsey’s “Big Data Report”, by 2018 what is the forecast…

