Week 2: Hive Assignment

[premium_content]
1. If you were to run the following SQL statement in the Hive query editor, through the Hue interface of the Quickstart VM:

SELECT product_name, product_price
FROM products
WHERE product_price > 10
ORDER BY product_price DESC
LIMIT 1000

which product would the result show as the most expensive?

SOLE E25 Elliptical

SOLE E35 Elliptical

SOLE F85 Treadmill

Spalding Beast 60″ Glass portable Basketball

1 point
2. In the sample_08 data set, which occupation had the highest salary in 2008?

(hint: look at the example SQL query from the hands-on exercise)

Chief Executives

Anesthesiologists

Surgeons

Lawyers
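
One way to approach this question is to sort the table by salary in descending order and read off the top rows. The query below is only a sketch: it assumes the Hue sample table sample_08 exposes the columns description and salary, so check the column names in the Metastore browser before running it.

```sql
-- Sketch only: the column names description and salary are assumed,
-- not confirmed by the assignment text.
SELECT description, salary
FROM sample_08
ORDER BY salary DESC   -- highest salary first
LIMIT 5;               -- a few rows are enough to inspect the top of the list
```

If the table contains summary rows that aggregate several occupations, they may need to be skipped when reading the result.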

1 point
3. In the sample_08 data set, which occupation had the lowest salary above $50000 in 2008?

(hint: look at the example SQL query from the hands-on exercise, and compare ASC with DESC)

Food service managers

Millwrights

Postal service clerks

Interior designers
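
The hint about ASC vs. DESC suggests filtering first and then sorting in ascending order, so that the smallest qualifying salary comes first. A minimal sketch, again assuming the columns description and salary in sample_08:

```sql
-- Sketch only: the column names description and salary are assumed.
SELECT description, salary
FROM sample_08
WHERE salary > 50000   -- keep only salaries strictly above $50000
ORDER BY salary ASC    -- lowest remaining salary first
LIMIT 5;
```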

1 point
4. If you had only $10, what is the most expensive product you could afford to buy from the products table?

Clicgear Rovic Shoe Brush

$10 gift card

Toronto FC Team Color Soccer Bracelet

adidas Brazuca 2014 Mini Soccer Ball

LIJA Women’s golf Beanie
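
This question can be explored with a small variation of the query from question 1: reverse the price filter and keep the descending sort. The sketch below reuses the column names from that query and assumes that a product priced at exactly $10 counts as affordable.

```sql
-- Sketch only: treating a price of exactly 10 as affordable is an assumption.
SELECT product_name, product_price
FROM products
WHERE product_price <= 10      -- everything that costs at most $10
ORDER BY product_price DESC    -- most expensive affordable product first
LIMIT 10;
```

If several products share the top price, compare the returned rows with the answer options.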

1 point
5. In this assignment we will use the Bay Area bike share data set. First, open a browser inside the VM and download both the year 1 and year 2 data sets from:

http://www.bayareabikeshare.com/datachallenge

You can save them on your desktop or in any other location within the Quickstart VM. (Additional information can be found in the reading titled “Uploading Bike Share data into Hive”.)

Within the Hue window, click on “Data Browsers” and then on “Metastore Tables”. There, under the database, click on “Create a new table from a file”. Choose the 201402_trip_data.csv file from the folder that you downloaded from the website above. Use the default values for the data import, except for the Duration column: change its type from Tinyint to Int.

Now that you have a new table in Hadoop, you can query the data. (hint: refresh the database list so that the newly created table shows up in the Hive query view)
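
Before answering the question, it can help to confirm that the import worked as intended. The statements below are a sketch: trip_data is a placeholder for whatever table name Hue assigned during the import. The reason for changing the Duration type is that a Hive TINYINT only holds values from -128 to 127, which is too small for trip durations.

```sql
-- Sketch only: trip_data is a placeholder table name, replace it with
-- the name shown in the Metastore browser after the import.
DESCRIBE trip_data;   -- the duration column should be listed as int

SELECT *
FROM trip_data
LIMIT 5;              -- quick look at a few imported rows
```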

Which start station has the longest trip duration?

Davis at Jackson

University and Emerson

Park at Olive

Powell Street BART
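
A possible query for this question sorts the trips by duration and looks at the start station of the longest one. This is a sketch under assumptions: trip_data is a placeholder table name, and start_station and duration are assumed to be the column names Hue derived from the CSV header.

```sql
-- Sketch only: the table name trip_data and the column names
-- start_station and duration are assumptions, adjust them to your import.
SELECT start_station, duration
FROM trip_data
ORDER BY duration DESC   -- longest trip first
LIMIT 5;
```

This reads the question as asking for the single longest trip. If it is meant per station, grouping by start station and taking MAX(duration) would be the alternative.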

[/premium_content]