The purpose of this assignment is to work with a raw and "dirty" dataset to gain hands-on experience with Power BI’s core processes for data preparation, modeling, and reporting.
Download the "Raw Booking Data" dataset to your computer from the link below:
Kaggle Dataset:
https://www.kaggle.com/datasets/aliosmanozpinar/raw-booking-data
If you don't have a Kaggle account, you may need to create one quickly.
- Open Power BI Desktop.
- Follow the steps: Get Data → Excel/CSV to load the downloaded dataset into Power BI.
- Check the data types in the data preview screen.
In the Power Query Editor, complete the following tasks:
✅ a. Identify Dirty Data
- Missing values
- Incorrectly formatted date/number fields
- Inconsistent text entries (e.g., country, city, product names in different formats)
✅ b. Clean Missing Values or Fill Them Using Appropriate Methods
- Fill using average/median
- Complete with the most frequent value (mode)
- Remove the relevant row/column if filling is not possible
- You can use AI-powered table matching with your data to fill in some values (like country and city names)
✅ c. Correct Data Types
- Convert date fields to "Date"
- Convert currency and numeric amounts to "Decimal / Whole Number"
- Change categorical fields to "Text" type
✅ d. Remove Unnecessary Columns
- You can remove columns that do not contribute to the analysis (a justification is required).
Based on the cleaned data, you are expected to create a suitable star schema model.
You can use the following structure as an example:
✅ Fact Table
- Reservation / transaction details
- The main table connecting to dimensions like Date, Customer, Product, Location
✅ Dimension Tables
- DimDate
- DimCustomer
- DimHotel
- etc.
Each dimension table must:
- Contain a unique identifier (ID).
- Organize repetitive categorical information.
- Establish a one-to-many relationship with the fact table.
- Arrange all tables in the Model View.
- Place the fact table in the center and dimension tables around it to create a correct star schema.
- Ensure One-to-Many relationships are established correctly.
- If necessary, adjust the cross-filter direction to Single Direction.
You are not required to design a dashboard for this assignment. However, you must prepare your model for dashboard creation. For this:
- Mark the Date table and set it as the "Mark as Date Table".
- You can prepare appropriate DAX measures for calculations (optional).
- Rename all tables in an organized manner.
At the end of the assignment, you must submit the following:
- Power BI (.pbix) file containing the cleaned data model.
- A short document (1–2 pages) explaining the transformations you performed:
- How you handled missing values.
- The dimension/fact structure you created.
- The relationships you established.