Skip to main content
Lab : Hive Queries

Objectives:

To understand and be able to write simple HQL queries.

Due date:

Tuesday, Sep 18 at 8:30 pm (in class exercise).

Tools for the assignment

For this assignment, you will use your Cloudera Live Hive query engine.

What to turn in:

You need to turn in a word document containing the Screenshot of HQL query and the query result for steps 4 – 11 below.

Lab Description

In this lab you are asked to write some HQL queries. The data in this database is already in your Cloudera QuickStart VM that we installed last week.  The database consists of six tables as shown in the db schema – namely categories, customers, departments, order_items, orders and products.

  1. Launch Cloudera QuickStart VM. Open firefox browser and Launch Hue UI.
  2. Click on button on Hue select Files go to /user/hive/warehouse you should see folders for each of the tables mentioned above.
  3. Click on button on Hue select tables select categories Explore metadata related categories table
  4. Run the following query that you ran in Impala in the previous lab. Note the amount of time it takes to give you the result first time.
-- Most popular product categories
select c.category_name, count(order_item_quantity) as count
from order_items oi
inner join products p on oi.order_item_product_id = p.product_id
inner join categories c on c.category_id = p.product_category_id
group by c.category_name
order by count desc
limit 10;
  1. Run the following query that you ran in Impala in the previous lab. Note the amount of time it takes to give you the result first time.
-- top 10 revenue generating products
select p.product_id, p.product_name, r.revenue
from products p inner join
(select oi.order_item_product_id, sum(cast(oi.order_item_subtotal as float)) as revenue
from order_items oi inner join orders o
on oi.order_item_order_id = o.order_id
where o.order_status <> 'CANCELED'
and o.order_status <> 'SUSPECTED_FRAUD'
group by order_item_product_id) r
on p.product_id = r.order_item_product_id
order by r.revenue desc
limit 10;
  1. Create a new HQL query to List all the customers who bought at least one ‘Electronics’ product
  2. List first name and last name of all the customers who purchased products that has ‘Under Armour' in product name.
  3. List top 10 customers with largest total purchase – list first name, last name and total sales for each top customer   
  4. For every customer, find the total number of orders created. Sort the answers in decreasing order of the total number of orders.
  5. Find all the customers who have made at least 10 orders
  6. Find all the customers who have purchased at least 25 distinct products
  7. Submit word document with screenshot of query and query result to Black Board.

Comments

Popular posts from this blog

A Comprehensive Evaluation of the Internal Consulting Process: Steps and Considerations

Introduction Internal consulting has emerged as a critical function within organizations, offering in-house expertise to solve complex business problems and drive change. It closely mirrors external consulting in methodology but is differentiated by the consultant's intimate knowledge of the organization and a vested interest in its long-term success. This article aims to evaluate the key steps involved in the internal consulting process, offering insights into each phase's significance and challenges. Steps in the Internal Consulting Process The internal consulting process can generally be segmented into five distinct stages: Initial Assessment, Data Collection and Analysis, Solution Development, Implementation, and Evaluation. Below is an evaluation of each step: Step 1: Initial Assessment Objective: To understand the problem or opportunity area and define the scope of the project. Significance: A well-defined scope ensures that the consulting project stays focused and manage...

The Evolving Landscape of Consulting Practice: Changes and Implications

Introduction Consulting is a field that thrives on its ability to adapt to market demands and emerging trends. As businesses evolve due to technological advancements, shifts in consumer behavior, and fluctuations in global markets, consulting practices must keep pace. This article explores some of the significant changes currently transforming the consulting industry and discusses their implications for both consultants and clients. Technological Disruption Data Analytics and Artificial Intelligence Consulting firms are increasingly integrating data analytics and artificial intelligence into their service offerings. These technologies allow consultants to offer data-driven insights that can significantly enhance strategic decision-making. This evolution means consultants now need skills in data interpretation and analysis, alongside their traditional expertise in business strategy. Virtual Consulting Platforms The advent of digital platforms enables consulting services to be offered re...

The Imperative of Transition Structure in Implementing Change and A Model for Effective Transition

Introduction Organizational change is an inevitable phenomenon in the dynamic business landscape of today. While the conception of change is significant, its successful implementation is even more crucial. One key factor that often determines the success of implementing change is the presence of a well-designed transition structure. This article aims to discuss the necessity of having a transition structure in place and proposes a model to effectively guide the transition during organizational change. The Need for a Transition Structure Aligning Stakeholders Any significant change involves a variety of stakeholders, from senior management to front-line employees. A transition structure ensures that all parties are aligned, understand their roles, and are committed to the objectives of the change. Mitigating Risks Change often comes with risks, such as resistance from employees, potential loss in productivity, or lapses in quality. A structured approach can help mitigate these risks by ...