Skip to content

Latest commit

 

History

History
89 lines (61 loc) · 3.25 KB

File metadata and controls

89 lines (61 loc) · 3.25 KB

Power BI Assignment: Data Cleaning, Modeling, and Dashboard Preparation

The purpose of this assignment is to work with a raw and "dirty" dataset to gain hands-on experience with Power BI’s core processes for data preparation, modeling, and reporting.

1. Downloading the Dataset

Download the "Raw Booking Data" dataset to your computer from the link below:

Kaggle Dataset:
https://www.kaggle.com/datasets/aliosmanozpinar/raw-booking-data

If you don't have a Kaggle account, you may need to create one quickly.

2. Loading Data into Power BI

  • Open Power BI Desktop.
  • Follow the steps: Get Data → Excel/CSV to load the downloaded dataset into Power BI.
  • Check the data types in the data preview screen.

3. Data Cleaning (Power Query)

In the Power Query Editor, complete the following tasks:

a. Identify Dirty Data

  • Missing values
  • Incorrectly formatted date/number fields
  • Inconsistent text entries (e.g., country, city, product names in different formats)

b. Clean Missing Values or Fill Them Using Appropriate Methods

  • Fill using average/median
  • Complete with the most frequent value (mode)
  • Remove the relevant row/column if filling is not possible
  • You can use AI-powered table matching with your data to fill in some values (like country and city names)

c. Correct Data Types

  • Convert date fields to "Date"
  • Convert currency and numeric amounts to "Decimal / Whole Number"
  • Change categorical fields to "Text" type

d. Remove Unnecessary Columns

  • You can remove columns that do not contribute to the analysis (a justification is required).

4. Data Modeling: Creating Dimension & Fact Tables

Based on the cleaned data, you are expected to create a suitable star schema model.

You can use the following structure as an example:

Fact Table

  • Reservation / transaction details
  • The main table connecting to dimensions like Date, Customer, Product, Location

Dimension Tables

  • DimDate
  • DimCustomer
  • DimHotel
  • etc.

Each dimension table must:

  • Contain a unique identifier (ID).
  • Organize repetitive categorical information.
  • Establish a one-to-many relationship with the fact table.

5. Creating the Data Model and Relationships

  • Arrange all tables in the Model View.
  • Place the fact table in the center and dimension tables around it to create a correct star schema.
  • Ensure One-to-Many relationships are established correctly.
  • If necessary, adjust the cross-filter direction to Single Direction.

6. Preparation for Dashboard

You are not required to design a dashboard for this assignment. However, you must prepare your model for dashboard creation. For this:

  • Mark the Date table and set it as the "Mark as Date Table".
  • You can prepare appropriate DAX measures for calculations (optional).
  • Rename all tables in an organized manner.

Submission Requirements

At the end of the assignment, you must submit the following:

  1. Power BI (.pbix) file containing the cleaned data model.
  2. A short document (1–2 pages) explaining the transformations you performed:
  • How you handled missing values.
  • The dimension/fact structure you created.
  • The relationships you established.