If you ran the following SQL statement in the Quickstart VM through the Hue interface, using the Hive query editor:
SELECT product_name, product_price
FROM products
WHERE ( product_price > 10)
ORDER BY product_price DESC
what would be the correct answer for the most expensive product?
SOLE E25 Elliptical
SOLE E35 Elliptical
SOLE F85 Treadmill
Spalding Beast 60″ Glass portable Basketball
In the sample_08 data set – who had the highest salary in 2008?
(hint: look at the example SQL query from the hands-on exercise)
In the sample_08 data set – who had the lowest salary above $50000 in 2008?
(hint: look at the example SQL query from the hands-on exercise – ASC vs. DESC)
Food service managers
Postal service clerks
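The ASC vs. DESC hint can be illustrated with a minimal HiveQL sketch. It assumes the sample_08 table exposes columns named description and salary, as in the sample data that ships with Hue; check the column names in the Metastore before running it, since they are an assumption here and not taken from the exercise itself.

```sql
-- Sketch only: column names (description, salary) are assumed.
-- ASC sorts from lowest to highest, so the first row is the
-- lowest salary that still exceeds the threshold.
SELECT description, salary
FROM sample_08
WHERE salary > 50000
ORDER BY salary ASC
LIMIT 1;
```

Switching ASC to DESC and dropping the WHERE clause would instead return the highest salary in the table.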
If you only had $10 – what is the most expensive product you could afford to buy from the products table?
Clicgear Rovic Shoe Brush
$10 gift card
Toronto FC Team Color Soccer Bracelet
adidas Brazuca 2014 Mini Soccer Ball
LIJA Women’s golf Beanie
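One way to approach a budget question like this is to filter on the price and sort in descending order. The sketch below reuses the products table and the product_name and product_price columns from the first question; the choice of <= (so that an item priced at exactly 10 is included) is an assumption about how the question should be read.

```sql
-- Sketch only: keep products priced at 10 or less,
-- then list the most expensive of those first.
SELECT product_name, product_price
FROM products
WHERE product_price <= 10
ORDER BY product_price DESC
LIMIT 1;
```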
In this assignment we will use the Bay Area bike share data set. First, open a browser inside the VM and download both the year 1 and year 2 data sets from:
You can save them on your desktop or at a location of your choice within the Quickstart VM. (Additional information can be found in the reading titled “Uploading Bike Share data into Hive”.)
Within the Hue window, click on “Data Browsers” and then on “Metastore Tables”. There, under the database, click on “Create a new table from a file”. Choose 201402_trip_data.csv from the folder containing the files you downloaded from the website above. Use the default values for the data import, except for Duration: change its type from Tinyint to Int.
Now that you have a new table in Hadoop, you can query the data. (hint: refresh the database list to show the newly updated database in the Hive query view)
Which startstation has the longest trip duration?
Davis at Jackson
University and Emerson
Park at Olive
Powell Street BART
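A query along the following lines could be used to explore the imported trip data. The table name `201402_trip_data` is hypothetical (Hue derives the name at import time, so check the Metastore for the actual one), and the column names startstation and duration are assumed from the wording of the question and the import step above.

```sql
-- Sketch only: table name is hypothetical; backticks are used
-- because the assumed name starts with a digit.
-- Sorting by duration in descending order puts the longest trip first.
SELECT startstation, duration
FROM `201402_trip_data`
ORDER BY duration DESC
LIMIT 1;
```

This reads the question as asking for the single longest trip. If it is meant per station instead, grouping by startstation and taking MAX(duration) would be the alternative.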