Hotel System C++ — File-Based Storage Engine Project

🏨 1. Project Title

Hotel System C++: Variable-Length File Organization with Multi-Index Retrieval

🧭 2. Professional Project Overview

This project is a C++ hotel management system built as a file-organization engine rather than a DBMS-backed application. It persists Rooms, Guests, and Reservations using delimiter-based variable-length records, maintains in-memory primary/secondary indexes, and resolves records using physical file offsets.

The system intentionally exposes storage-engine internals that are typically hidden by database layers, making it an excellent educational reference for understanding how low-level data persistence works under the hood.

❓ 3. Motivation and Problem Statement

The system models the core concerns of a storage engine in a compact implementation meant for teaching:

Record serialization/deserialization with custom delimiter protocols
Variable-length record storage avoiding fixed-width padding waste
Logical deletion and free-space reuse via availability lists
Primary and secondary indexing with binary search optimization
Offset-based direct access for O(1) record retrieval after index lookup

The Problem Solved

Efficient retrieval and mutation of structured hotel data without SQL databases, ORM layers, or any external dependencies — pure C++ STL and file I/O only.

🎯 4. System Objectives

✅ Persist all entities to disk in plain text files (human-readable format).
✅ Support full CRUD operations: insert, search, update, and delete.
✅ Keep lookups fast by maintaining in-memory indexes (primary + secondary).
✅ Reuse deleted space efficiently using first-fit availability lists.
✅ Keep index files and main data files synchronized across runtime and shutdown.
✅ Demonstrate real storage-engine patterns used by actual database systems.

✨ 5. Core Features

Feature	Implementation Detail
📝 Variable-Length Records	`|` field delimiters + `#` record terminators; size computed at runtime
🔍 Primary Indexing	`id → offset` mapping with binary search — O(log n) lookup
🏷️ Secondary Indexing	Non-unique key support for room type, guest name, and nationality
🗑️ Logical Deletion	`*` marker preserves record boundaries; enables future space reuse
♻️ First-Fit Reuse	Avail-list tracks deleted slots for efficient reallocation
✏️ In-Place Updates	Overwrite when new ≤ old size; relocate (delete + rewrite) otherwise
💾 Immediate Persistence	Indexes flushed after every operation — minimizes data-loss window
💰 Reservation Pricing	Total computed as the nightly rate for the room type × number of nights

Secondary Index Coverage

Rooms: room_type → room_id (filter by room class)
Guests: full_name → personal_id (search by name, supports duplicates)
Guests: nationality → personal_id (group/report by country)

🏗️ 6. Architecture Overview

┌─────────────────────────────────────────────────────────────────┐
│                    UI Layer (main.cpp)                          │
│                  Console menus & user interaction               │
└──────────────────────────┬──────────────────────────────────────┘
                           │
┌──────────────────────────▼──────────────────────────────────────┐
│              Data Collection Layer (DataCollector.h)            │
│           Input validation & workflow orchestration             │
└──────────────────────────┬──────────────────────────────────────┘
                           │
        ┌──────────────────┼──────────────────┐
        │                  │                  │
┌───────▼──────────┐ ┌─────▼──────────┐ ┌────▼─────────────────┐
│   RAM/Index Layer│ │  Storage Layer │ │   Free-Space Layer   │
│ (RAM_Manager.h)  │ │ (File_Handler/)│ │   (Avail_list.h)     │
│                  │ │                │ │                      │
│ • Sorted vectors │ │ • RoomHandler  │ │ • Track deleted slot │
│ • Binary search  │ │ • GuestHandler │ │   offsets & sizes    │
│ • Index mutations│ │ • ReserveHndlr │ │ • First-fit alloc    │
│ •Memory↔file sync│ │ • Serialize    │ │ • Binary metadata    │
│                  │ │ /Deserialize   │ │   files              │
└──────────────────┘ └────────────────┘ └──────────────────────┘
        │                  │                  │
        └──────────────────┼──────────────────┘
                           │
                  ┌────────▼────────┐
                  │   Disk Files    │
                  │   (12 files)    │
                  └─────────────────┘

Module Responsibilities

Layer	Team Member	Responsibilities
UI Layer	Nora	Menu flow, user interaction, I/O presentation, navigation
Collection Layer	Mahmoud	Input validation, multi-entity workflows, business rule enforcement
RAM/Index Layer	Nada	In-memory sorted index vectors, binary search, memory↔file synchronization
Storage Layer	Farouk	File I/O, record serialization/deserialization, offset management, logical deletion
Free-Space Layer	Farouk	Deleted-slot tracking, first-fit reuse algorithm, avail-list persistence
Configuration	Shared	Delimiters, file paths, constants (`Constants.h`)

🧱 7. Internal System Architecture

mermaid
flowchart TD
    UI[main.cpp menus] --> COL[DataCollector operations]
    COL --> RAM[RAM_Manager indexes]
    COL --> RH[RoomHandler]
    COL --> GH[GuestHandler]
    COL --> RVH[ReserveHandler]
    RH --> AV1[Avail_list room]
    GH --> AV2[Avail_list guest]
    RVH --> AV3[Avail_list reservation]
    RH --> FILES[Main + Index Files]
    GH --> FILES
    RVH --> FILES
    RAM --> IDX[Index Files on disk]

Data Flow Sequence

mermaid
sequenceDiagram
    participant User
    participant UI as main.cpp
    participant Ops as DataCollector
    participant RAM as RAM_Manager
    participant H as Handlers
    User->>UI: choose operation
    UI->>Ops: invoke workflow
    Ops->>RAM: search/validate via indexes
    Ops->>H: read/write/delete by offset
    H->>RAM: update indexes
    H->>H: flush files

🗂️ 8. Layered Design Explanation

Interaction Layer (Nora)

Console-based menu system that routes user selections into business operations. Handles all user-facing input/output presentation and navigation flow.

Collection/Coordination Layer (Mahmoud)

The orchestrator layer that:

Validates all user inputs against business rules
Ties together multi-entity workflows (e.g., reservation requires valid guest AND available room)
Triggers the appropriate handlers after validation passes
Manages cross-entity side effects (room status toggling on reservation create/cancel)

RAM/Index Management Layer (Nada)

The memory subsystem responsible for:

Maintaining sorted index vectors in RAM for all entity types
Performing binary search on primary indexes (O(log n))
Scanning secondary indexes for non-unique key lookups
Mutating index entries on insert/update/delete
Flushing index state to disk files and reloading on startup

Storage/File System Layer (Farouk)

The lowest-level persistence layer handling:

Physical file read/write operations using <fstream>
Record serialization (writing delimited fields) and deserialization (parsing)
File offset calculation and seeking
Logical deletion marking (* prefix)
End-of-file positioning for append operations

💾 9. Storage Engine Design

The engine stores each entity in a set of coordinated files:

File Type	Purpose	Count per Entity
Main Data File	Serialized variable-length records	1
Primary Index File	`ID → offset` mapping (sorted)	1
Secondary Index File(s)	`key → ID` mappings (non-unique)	0–2
Avail-List File	Free-space metadata (binary)	1

The Core Principle: Offset Indirection

Traditional approach:  Scan entire file linearly → O(n)
Our approach:          ID → [Binary Search Index] → Offset → [Direct Seek] → Record → O(log n) + O(1)

ID lookup goes through indexes to obtain the exact file byte position, then reads exactly one record from the main file — no scanning required.

🗃️ 10. File Organization Strategy

Configured in Constants.h:

Rooms (4 files)

File	Contents
`Room-main.txt`	Serialized room records
`Room-primary.txt`	Primary index: `id|offset`
`Room-secondary.txt`	Secondary index: `room_type|room_id`
`Avail-list-room.txt`	Binary free-space metadata

Guests (5 files)

File	Contents
`Guest-main.txt`	Serialized guest records
`Guest-primary.txt`	Primary index: `id|offset`
`Guest-fullname.txt`	Secondary index: `name|id`
`Guest-nationality.txt`	Secondary index: `nationality|id`
`Avail-list-guest.txt`	Binary free-space metadata

Reservations (3 files)

File	Contents
`Reservation-main.txt`	Serialized reservation records
`Reservation-primary.txt`	Primary index: `id|offset`
`Avail-list-reserve.txt`	Binary free-space metadata

Total: 12 files (9 core data/index files + 3 avail-list files for free-space management)

📏 11. Variable-Length Record Architecture

Records are written as concatenated text segments with delimiters. No padding, no fixed-width fields — length is derived dynamically from field content sizes plus separator overhead at runtime.

Update Strategies

Condition	Strategy	Description
`new_size ≤ old_size`	In-place overwrite	Rewrite bytes at existing offset; no index offset change needed
`new_size > old_size`	Relocation	Mark old record as deleted (`*`), write new record at reused slot or EOF, update index with new offset

This dual-strategy approach minimizes unnecessary file fragmentation while ensuring data integrity.
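
The size check and the in-place branch can be sketched as follows. This is a minimal illustration, not the project's handler code: `UpdateStrategy`, `chooseStrategy`, and `overwriteInPlace` are hypothetical names, and a `std::iostream` stands in for the main data file.

```cpp
#include <cstddef>
#include <iostream>
#include <sstream>
#include <string>

enum class UpdateStrategy { InPlace, Relocate };

// Apply the documented rule: overwrite when the new record fits, relocate otherwise.
UpdateStrategy chooseStrategy(std::size_t oldSize, std::size_t newSize) {
    return newSize <= oldSize ? UpdateStrategy::InPlace : UpdateStrategy::Relocate;
}

// Overwrite the record at `offset`; the caller has already checked that it fits.
void overwriteInPlace(std::iostream& file, std::streamoff offset,
                      const std::string& record) {
    file.seekp(offset);
    file.write(record.data(), static_cast<std::streamsize>(record.size()));
    file.flush();
}
```

When the new record is shorter, the tail bytes of the old record stay in the file after the new `#`. Offset-based reads stop at the first `#`, so they are unaffected; how the real handlers treat those leftover bytes is not stated here.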

🔤 12. Delimiter-Based Serialization Protocol

Symbol	Meaning	Position
`|`	Field separator	Between fields within a record
`#`	Record terminator	End of each record
`*`	Deletion marker	First byte of a logically-deleted record

Serialization Example (Conceptual)

// Writing a room record:
write(id);        put('|');
write(room_type); put('|');
write(room_view); put('|');
write(room_status); put('#');

// Result: "42|double|garden-view|available#"

The handlers serialize each field manually with fstream::write and put, and deserialize with getline(..., delimiter), which gives complete control over the on-disk text format.
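
A runnable version of the same idea, using string streams instead of a file so it can be tried in isolation. The `Room` struct and both function names are assumptions made for this sketch; only the delimiters and field order come from the protocol above.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical in-memory shape of a room; field order follows the record layout.
struct Room {
    int id;
    std::string type;
    std::string view;
    std::string status;
};

// Build "id|type|view|status#" using the documented delimiters.
std::string serializeRoom(const Room& r) {
    std::ostringstream out;
    out << r.id << '|' << r.type << '|' << r.view << '|' << r.status << '#';
    return out.str();
}

// Parse one record; returns false if a field is missing or the ID is not numeric
// (which is also the case for a record whose first byte is the '*' marker).
bool deserializeRoom(const std::string& record, Room& r) {
    std::istringstream in(record);
    std::string idText;
    if (!std::getline(in, idText, '|')) return false;
    if (!std::getline(in, r.type, '|')) return false;
    if (!std::getline(in, r.view, '|')) return false;
    if (!std::getline(in, r.status, '#')) return false;
    try {
        r.id = std::stoi(idText);
    } catch (const std::exception&) {
        return false;
    }
    return true;
}
```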

🧹 13. Logical Deletion Mechanism

Delete does NOT physically remove bytes from the file (which would require shifting all subsequent data). Instead:

Step 1: Seek to record's byte offset in main file
Step 2: Overwrite first byte with '*' (deletion marker)
Step 3: Push (offset, size) tuple into entity's availability list
Step 4: Remove corresponding entries from ALL RAM index vectors (primary + secondary)
Step 5: Persist updated index files and avail-list metadata to disk

Benefits: marking a record deleted is a single-byte write at a known offset (O(1) once the offset is resolved); the freed space becomes immediately reusable; no file compaction is needed at runtime.
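
Steps 1 to 3 can be sketched as below. `FreeSlot` and `markDeleted` are hypothetical names, and a `std::iostream` stands in for the main file; index cleanup and persistence (steps 4 and 5) are left to the caller.

```cpp
#include <cstddef>
#include <iostream>
#include <sstream>
#include <vector>

// Hypothetical free-slot entry: where the hole starts and how many bytes it spans.
struct FreeSlot {
    std::streamoff offset;
    std::size_t size;
};

// Mark the record at `offset` as deleted and remember its space for reuse.
void markDeleted(std::iostream& file, std::streamoff offset, std::size_t size,
                 std::vector<FreeSlot>& availList) {
    file.seekp(offset);                    // step 1: jump to the record
    file.put('*');                         // step 2: overwrite only the first byte
    file.flush();
    availList.push_back({offset, size});   // step 3: register the hole
}
```

Only one byte changes on disk, so the `|` and `#` boundaries of the deleted record stay intact.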

📍 14. Offset Management System

Offset Generation

Offsets are generated by seeking to end-of-file:

file.seekp(0, ios::end);
offset = file.tellp();  // Current position = next write location

Offset Storage

Primary index stores offsets as short (16-bit signed, so the largest addressable byte offset is 32,767 and each main file is capped at about 32 KB)
Offsets represent byte positions within the main data file

Complete Lookup Path

User enters ID
  → Binary search primary index vector in RAM     [O(log n)]
    → Retrieve stored offset value
      → seekg() to that offset in main file        [O(1)]
        → Read exactly one record (until '#' found) [O(1)]
          → Deserialize fields by splitting on '|'
            → Return structured entity data
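
The last three steps of this path (seek, read to `#`, split on `|`) could look like the sketch below. Both function names are illustrative, and any `std::istream` can play the role of the main file.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Seek to `offset` and read one record up to (not including) the '#' terminator.
std::string readRecordAt(std::istream& file, std::streamoff offset) {
    file.clear();                 // reset EOF/fail flags left by earlier reads
    file.seekg(offset);
    std::string record;
    std::getline(file, record, '#');
    return record;
}

// Split a record body into its fields on '|'.
std::vector<std::string> splitFields(const std::string& record) {
    std::vector<std::string> fields;
    std::istringstream in(record);
    std::string field;
    while (std::getline(in, field, '|')) fields.push_back(field);
    return fields;
}
```

The read cost depends only on the length of that one record, not on the size of the file.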

Reuse Path

Deleted slots stored in avail-list can be reclaimed if a new/updated record fits:

New record needs N bytes
  → Scan avail-list for first entry where size ≥ N    [O(k), k = free slots]
    → If found: reuse that offset (overwrite '*' with new data)
    → If not found: append at EOF (new offset from tellp)
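
A first-fit scan over the avail-list might be written as follows. `FreeSlot` and `takeFirstFit` are names invented for this sketch, and the `-1` return value for "append at EOF" is an assumption borrowed from the index-search convention.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical free-slot entry (offset of the hole and its size in bytes).
struct FreeSlot {
    long offset;
    std::size_t size;
};

// Return the offset of the first slot with size >= needed and remove it from
// the list; return -1 when nothing fits, meaning the caller appends at EOF.
long takeFirstFit(std::vector<FreeSlot>& availList, std::size_t needed) {
    for (auto it = availList.begin(); it != availList.end(); ++it) {
        if (it->size >= needed) {
            long offset = it->offset;
            availList.erase(it);
            return offset;
        }
    }
    return -1;
}
```

The sketch hands out the whole slot even when the record is smaller; whether the real avail-list splits a slot and keeps the remainder is not covered here.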

🔑 15. Primary Indexing Strategy

Structure

Format: id|offset
Example: 42|1056
Meaning: Entity with ID=42 is stored at byte offset 1056 in main file

Implementation Details

Implemented for all three entities (Room, Guest, Reservation)
Stored in-memory as vector<PrimaryIndex> — always kept sorted
SearchPrimaryIndex() uses binary search → returns offset or -1
On every insert/update/delete: mutate RAM vector → flush to disk file
Loaded from disk into RAM at program startup

Why Binary Search?

Linear scan:   O(n) comparisons — slow for large datasets
Binary search: O(log n) comparisons, e.g., 10,000 records → at most 14 comparisons
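
One way to express the lookup with the standard library is `std::lower_bound`. The struct layout and the function name below are assumptions for illustration; the actual SearchPrimaryIndex() may be written differently.

```cpp
#include <algorithm>
#include <vector>

// Assumed entry shape; the offset is a short as described above.
struct PrimaryIndex {
    int id;
    short offset;
};

// Binary search over a vector sorted by id; returns the offset or -1.
int searchPrimaryIndex(const std::vector<PrimaryIndex>& index, int id) {
    auto it = std::lower_bound(
        index.begin(), index.end(), id,
        [](const PrimaryIndex& entry, int key) { return entry.id < key; });
    if (it != index.end() && it->id == id) return it->offset;
    return -1;
}
```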

🧩 16. Secondary Indexing Strategy

Structure

Format: key|id
Example (Room):   double|42
Example (Guest):  Ahmed Mohamed Ali|10
Example (Guest):  egypt|10

Secondary Index Configuration

Entity	Secondary Key(s)	Purpose	Uniqueness
Room	`room_type`	Filter rooms by category (single/double/suit)	Non-unique (many rooms share a type)
Guest	`full_name`	Search guests by name	Non-unique (duplicate names allowed)
Guest	`nationality`	Group/report guests by country	Non-unique

Lookup Algorithm (Current)

Input: secondary key value (e.g., "double")
  → Linear scan of secondary index vector               [O(n)]
    → Collect ALL matching IDs (may be multiple)         [m matches, m ≤ n]
      → For each matched ID:
          → Binary search primary index                 [O(log n) each]
            → Get offset → read record from main file   [O(1) each]
              
Total: O(n + m · log n) — worst case O(n · log n) when m = n

Note: Future improvement would replace linear scan with binary/range search on sorted secondary keys.
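
The sketch below shows the current linear scan next to one possible form of the range search mentioned in the note. `SecondaryIndex` fields and both function names are assumed for illustration; the second function only works if the vector is sorted by key.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Assumed entry shape for a secondary index: key -> primary ID.
struct SecondaryIndex {
    std::string key;
    int id;
};

// Current approach: linear scan collecting every ID whose key matches.
std::vector<int> findIdsByKey(const std::vector<SecondaryIndex>& index,
                              const std::string& key) {
    std::vector<int> ids;
    for (const auto& entry : index)
        if (entry.key == key) ids.push_back(entry.id);
    return ids;
}

// Possible improvement: std::equal_range finds the run of equal keys in
// O(log n) when the vector is sorted by key.
std::vector<int> findIdsByKeySorted(const std::vector<SecondaryIndex>& index,
                                    const std::string& key) {
    auto byKey = [](const SecondaryIndex& a, const SecondaryIndex& b) {
        return a.key < b.key;
    };
    auto range = std::equal_range(index.begin(), index.end(),
                                  SecondaryIndex{key, 0}, byKey);
    std::vector<int> ids;
    for (auto it = range.first; it != range.second; ++it) ids.push_back(it->id);
    return ids;
}
```

Each returned ID still needs a primary-index lookup to reach its offset, so the m · log n term remains in either version.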

🔗 17. Entity Relationship Model

                    ┌─────────────┐
                    │    Room     │
                    │─────────────│
                    │ room_id (PK)│
                    │ room_type   │
                    │ room_view   │
                    │ room_status │◄──────────────────────┐
                    └──────┬──────┘                       │
                           │                              │
                           │ 1                            │1
                           │                              │
                    ┌──────▼──────────────────────────────▼──────┐
                    │              Reservation                   │
                    │────────────────────────────────────────────│
                    │ reservation_id (PK)                        │
                    │ room_id (FK ──→ Room)                      │
                    │ guest_id (FK ──→ Guest)                    │
                    │ check_in                                   │
                    │ check_out                                  │
                    │ total_price                                │
                    └──────────────────────┬─────────────────────┘
                                           │
                                           │ 1
                                           │
                                    ┌──────▼──────┐
                                    │    Guest    │
                                    │─────────────│
                                    │ personal_id │◄──(PK)
                                    │ full_name   │
                                    │ nationality │
                                    └─────────────┘

Relationship Rules

Relationship	Cardinality	Constraint
Reservation → Room	Many-to-One	Room must exist; must be `available` at creation time
Reservation → Guest	Many-to-One	Guest must exist before reservation can be created
Room ↔ Guest	Independent	No direct relationship; linked only through Reservation

Side Effects on Relationships

Creating reservation: Sets room.status = 'reserved'
Cancelling reservation: Logically deletes reservation record + sets room.status = 'available'

🔄 18. Data Persistence Lifecycle

┌─────────────────────────────────────────────────────────────────────┐
│  STARTUP                                                            │
│  ────────                                                           │
│  1. Program launches                                                │
│  2. LoadAllIndexesFromDisk() called                                 │
│  3. Each handler opens/creates its data + index files               │
│  4. Avail-list files loaded into RAM vectors                        │
│  5. System ready for user input                                     │
├─────────────────────────────────────────────────────────────────────┤
│  RUNTIME OPERATIONS                                                 │
│  ─────────────────                                                  │
│  6. User triggers CRUD operation via menu                           │
│  7. DataCollector validates input                                   │
│  8. Handler mutates main file + RAM indexes                         │
│  9. Changes flushed IMMEDIATELY to disk (per-operation persist)     │
│  10. Steps 6-9 repeat until user exits                              │
├─────────────────────────────────────────────────────────────────────┤
│  SHUTDOWN                                                           │
│  ────────                                                           │
│  11. Destructors invoked for all handlers                           │
│  12. Final flush of all index vectors to disk files                 │
│  13. Final flush of all avail-lists to disk files                   │
│  14. All file streams closed cleanly                                │
│  15. Program terminates                                             │
└─────────────────────────────────────────────────────────────────────┘
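
The load-at-startup and flush-after-mutation steps for a primary index can be sketched as a round trip through a stream. The entry separator inside the index file is not specified in this document, so the sketch assumes one `id|offset` entry per line; `flushIndex` and `loadIndex` are hypothetical names.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Assumed entry shape for the primary index.
struct PrimaryIndex {
    int id;
    short offset;
};

// Rewrite the whole index as "id|offset" lines (assumed layout).
void flushIndex(std::ostream& out, const std::vector<PrimaryIndex>& index) {
    for (const auto& e : index) out << e.id << '|' << e.offset << '\n';
    out.flush();
}

// Rebuild the in-memory vector from the same layout; malformed lines are skipped.
std::vector<PrimaryIndex> loadIndex(std::istream& in) {
    std::vector<PrimaryIndex> index;
    std::string line;
    while (std::getline(in, line)) {
        std::string::size_type bar = line.find('|');
        if (bar == std::string::npos) continue;
        int id = std::stoi(line.substr(0, bar));
        short offset = static_cast<short>(std::stoi(line.substr(bar + 1)));
        index.push_back({id, offset});
    }
    return index;
}
```

Because the vector is written in sorted order, it comes back sorted and is ready for binary search without an extra sort.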

🧠 19. Memory Layer vs Storage Layer

Aspect	RAM Layer (`RAM_Manager`)	Storage Layer (Handlers)
Data Structure	Sorted `vector<PrimaryIndex>`, `vector<SecondaryIndex>`	Raw file streams (`fstream`)
Operations	Binary search, vector insertion/deletion	`seekg`, `seekp`, `read`, `write`, `getline`
Search Method	Binary search on sorted vectors	Direct offset access (no searching)
Mutation	Insert/erase elements in vectors	Overwrite bytes at specific offsets
Persistence Trigger	Handler calls flush after mutation	Immediate `write` to file
Lifetime	Exists only during program execution	Persists across program runs

The Bridge Pattern (`DataCollector`)

User Request
    ↓
DataCollector
    ↓ (uses RAM layer to validate/find)
RAM_Manager.search(index, key) → returns offset or -1
    ↓ (passes offset to storage layer)
Handler.readByOffset(offset) → returns parsed record
    ↓
Result returned to user

📚 20. Detailed Entity Documentation

Each entity follows the same architectural pattern: ID-based primary indexing + offset-based record access, with optional secondary indexes for non-ID query paths.

🛏️ 21. Rooms Module

Primary Key: room_id (unique identity; auto-generated 1–100 on first run)
Secondary Key: room_type (supports filtering by class of room: single/double/suit)

Room Files

File	Format	Example
Main	`id|room_type|room_view|room_status#`	`42|double|garden-view|available#`
Primary Index	`id|offset`	`42|1056`
Secondary Index	`room_type|room_id`	`double|42`

Internal Behavior

Auto-Generation (First Run Only)

When the system detects no existing room data, it automatically generates 100 rooms:

ID Range	Room Type	Count
1 – 40	`single`	40 rooms
41 – 80	`double`	40 rooms
81 – 100	`suit`	20 rooms

View distribution is determined by ID ranges (garden-view vs sea-view).
Initial status for all auto-generated rooms: available.
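
The ID-range rule from the table can be captured in a few lines. `RoomSeed`, `roomTypeForId`, and `generateRooms` are illustrative names; the view assignment is omitted because its exact ID ranges are not listed here.

```cpp
#include <string>
#include <vector>

// Map a room ID (1..100) to its type using the documented ranges.
std::string roomTypeForId(int id) {
    if (id <= 40) return "single";
    if (id <= 80) return "double";
    return "suit";
}

// Hypothetical seed record; room_view is left out on purpose (see lead-in).
struct RoomSeed {
    int id;
    std::string type;
    std::string status;
};

// Produce the 100 initial rooms, all marked available.
std::vector<RoomSeed> generateRooms() {
    std::vector<RoomSeed> rooms;
    for (int id = 1; id <= 100; ++id)
        rooms.push_back({id, roomTypeForId(id), "available"});
    return rooms;
}
```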

State Machine (room_status)

              Reservation Created
┌───────────┐ ──────────────────────▶ ┌──────────┐
│ available │                         │ reserved │
└───────────┘ ◀────────────────────── └──────────┘
              Reservation Cancelled

Search Paths

Query Type	Path	Complexity
By Room ID	Primary index binary search → offset → direct read	O(log n)
By Room Type	Secondary index scan → matching IDs → primary index lookups → reads	O(n + m·log n)

👤 22. Guests Module

Primary Key: personal_id (user-provided, must be unique)
Secondary Keys: full_name, nationality (both non-unique)

Guest Files

File	Format	Example
Main	`id|full_name|nationality#`	`10|Ahmed Mohamed Ali|egypt#`
Primary Index	`id|offset`	`10|2080`
Name Secondary	`name|id`	`Ahmed Mohamed Ali|10`
Nationality Secondary	`nationality|id`	`egypt|10`

Why Two Secondary Indexes?

Index	Use Case	Multi-value Support
Name Index	Find guest(s) by name	✅ Yes — multiple guests can share a name (one-to-many entries)
Nationality Index	Report/group guests by country	✅ Yes — many guests per nationality

Validation Rules

Rule	Enforcement
Duplicate personal ID	❌ Blocked — checked via primary index before insert
Duplicate full name	✅ Allowed — handled via one-to-many secondary index entries
Nationality = "Israel"	❌ Rejected — case-insensitive check blocks all variants ("israel", "ISRAEL", "IsRaEl", etc.)

📅 23. Reservations Module

Primary Key: reservation_id (user-provided, must be unique)
No secondary indexes — reservations are always looked up by ID.

Reservation Files

File	Format	Example
Main	`reservation_id|room_id|guest_id|check_in|check_out|total_price#`	`1|42|10|2026-05-01|2026-05-04|2400.000000#`
Primary Index	`id|offset`	`1|3200`

Lifecycle (7-Step Workflow)

Step 1: Validate reservation_id is unique (via primary index)
           │
           ▼
Step 2: Validate guest_id exists (via guest primary index)
           │
           ▼
Step 3: Validate room_id exists AND room_status == "available"
           │
           ▼
Step 4: Validate date format (YYYY-MM-DD) and check_out > check_in
           │
           ▼
Step 5: Compute total_price = days × rate(room_type)
           │
           ▼
Step 6: PERSIST reservation record → Set room.status = "reserved"
           │
           ▼
Step 7: On CANCEL: logically delete reservation → Set room.status = "available"

🧾 24. Record Serialization Examples

# ─── Active Records ──────────────────────────────────────────────

Room:        42|double|garden-view|available#
Guest:       10|Ahmed Mohamed Ali|egypt#
Reservation: 1|42|10|2026-05-01|2026-05-04|2400.000000#

# ─── Logically Deleted Record (notice leading *) ─────────────────

Deleted:     *2|double|sea-view|reserved#
             ↑
             Deletion marker overwrites first byte

Visual Breakdown of a Room Record

42 | double | garden-view | available #
│  │ │      │ │           │ │         └─ record terminator
│  │ │      │ │           │ └─ room_status
│  │ │      │ │           └─ field separator
│  │ │      │ └─ room_view
│  │ │      └─ field separator
│  │ └─ room_type
│  └─ field separator
└─ id

📂 25. Main File Structure Summary

Entity	Field Count	Structure
Room-main	4 fields	`id | room_type | room_view | room_status #`
Guest-main	3 fields	`id | full_name | nationality #`
Reservation-main	6 fields	`reservation_id | room_id | guest_id | check_in | check_out | total_price #`

🗄️ 26. Index File Structures Summary

Index Type	Structure	Stored As	Search Algorithm
Primary (all entities)	`id | offset`	Sorted vector	Binary search — O(log n)
Room Secondary	`room_type | room_id`	Sorted vector	Linear scan — O(n)
Guest Name Secondary	`name | guest_id`	Sorted vector	Linear scan — O(n)
Guest Nationality Secondary	`nationality | guest_id`	Sorted vector	Linear scan — O(n)

➕ 27. Insert Operation Workflow

┌─────────────────────────────────────────────────────────────┐
│  1. COLLECT INPUT                                           │
│     Gather all field values from user via console prompts   │
├─────────────────────────────────────────────────────────────┤
│  2. VALIDATE                                                │
│     • Check uniqueness (primary index)                      │
│     • Check foreign keys exist (for reservations)           │
│     • Apply business rules (e.g., nationality block)        │
├─────────────────────────────────────────────────────────────┤
│  3. COMPUTE SIZE                                            │
│     Calculate serialized byte length of the new record      │
├─────────────────────────────────────────────────────────────┤
│  4. ALLOCATE SPACE                                          │
│     • Scan avail-list for first-fit slot (size ≥ needed)    │
│     • OR append at EOF if no suitable slot exists           │
├─────────────────────────────────────────────────────────────┤
│  5. WRITE RECORD                                            │
│     Serialize and write to main file at allocated offset    │
├─────────────────────────────────────────────────────────────┤
│  6. UPDATE INDEXES                                          │
│     • Insert into primary index vector (sorted position)    │
│     • Insert into secondary index vector(s) if applicable   │
├─────────────────────────────────────────────────────────────┤
│  7. PERSIST                                                 │
│     Flush updated indexes + avail-list (if slot was reused) │
└─────────────────────────────────────────────────────────────┘
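
Step 6 depends on the primary index staying sorted after every insert. A minimal sketch with `std::lower_bound` is shown below; `PrimaryIndex` fields and `insertSorted` are assumed names, and the duplicate check mirrors the uniqueness rule from step 2.

```cpp
#include <algorithm>
#include <vector>

// Assumed entry shape for the primary index.
struct PrimaryIndex {
    int id;
    short offset;
};

// Insert while keeping the vector sorted by id; returns false on a duplicate id.
bool insertSorted(std::vector<PrimaryIndex>& index, PrimaryIndex entry) {
    auto pos = std::lower_bound(
        index.begin(), index.end(), entry.id,
        [](const PrimaryIndex& e, int key) { return e.id < key; });
    if (pos != index.end() && pos->id == entry.id) return false;
    index.insert(pos, entry);   // shifts later entries, so O(n) in the worst case
    return true;
}
```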

🔍 28. Search Operation Workflow

Primary Search (by ID)

Input: entity ID
  │
  ▼
Binary search RAM primary index vector
  │
  ├── Found → return offset
  │
  └── Not found → return -1 (entity doesn't exist)
  │
  ▼ (if found)
main_file.seekg(offset)
  │
  ▼
Read record until '#' encountered
  │
  ▼
Split fields by '|' → construct entity object → return to user
  
Complexity: O(log n) + O(1) = O(log n)

Secondary Search (by non-ID key)

Input: secondary key value (e.g., room_type="double")
  │
  ▼
Linear scan secondary index vector
  │
  ▼
Collect ALL matching IDs → [id₁, id₂, ..., idₘ]
  │
  ▼ (for each matched ID)
Binary search primary index → get offset → read record
  │
  ▼
Return collection of matching entity objects

Complexity: O(n) scan + m × O(log n) lookup = O(n + m·log n)

✏️ 29. Update Operation Workflow

┌─────────────────────────────────────────────────────────────┐
│  1. LOCATE OLD RECORD                                       │
│     Find current offset via primary index binary search     │
├─────────────────────────────────────────────────────────────┤
│  2. READ OLD DATA                                           │
│     Seek to offset → read current record → deserialize      │
├─────────────────────────────────────────────────────────────┤
│  3. SERIALIZE NEW DATA                                      │
│     Build new record string from updated field values       │
├─────────────────────────────────────────────────────────────┤
│  4. SIZE COMPARISON                                         │
│                                                             │
│     ┌──────────────────────────────────────────┐            │ 
│     │  new_size <= old_size ?                  │            │
│     ├──────────────────────────────────────────┤            │
│     │  YES → IN-PLACE OVERWRITE                │            │
│     │  • Seek to existing offset               │            │
│     │  • Write new record bytes                │            │
│     │  • Patch secondary indexes if key changed│            │
│     │  • Offset stays the SAME                 │            │
│     ├──────────────────────────────────────────┤            │
│     │  NO  → RELOCATION                        │            │
│     │  • Mark old record with '*' (delete)     │            │
│     │  • Push old (offset,size) to avail-list  │            │
│     │  • Allocate new slot (reuse or append)   │            │
│     │  • Write new record at new offset        │            │
│     │  • Update primary index with NEW offset  │            │
│     │  • Rebuild secondary index entries       │            │
│     └──────────────────────────────────────────┘            │
├─────────────────────────────────────────────────────────────┤
│  5. PERSIST ALL CHANGES                                     │
│     Flush mutated indexes + avail-list to disk              │
└─────────────────────────────────────────────────────────────┘

🗑️ 30. Delete Operation Workflow

┌─────────────────────────────────────────────────────────────┐
│  1. RESOLVE OFFSET                                          │
│     Binary search primary index → get byte offset           │
├─────────────────────────────────────────────────────────────┤
│  2. MARK DELETED                                            │
│     Seek to offset → write '*' as first byte                │
│     Record still physically present but logically removed   │
├─────────────────────────────────────────────────────────────┤
│  3. REGISTER FREE SPACE                                     │
│     Push (offset, record_size) tuple into entity avail-list │
│     This slot is now available for future first-fit reuse   │
├─────────────────────────────────────────────────────────────┤
│  4. CLEANUP PRIMARY INDEX                                   │
│     Remove entry from primary index vector                  │
├─────────────────────────────────────────────────────────────┤
│  5. CLEANUP SECONDARY INDEXES                               │
│     Remove ALL secondary index entries referencing this ID  │
├─────────────────────────────────────────────────────────────┤
│  6. PERSIST                                                 │
│     Flush updated primary index file                        │
│     Flush updated secondary index file(s)                   │
│     Flush updated avail-list file                           │
└─────────────────────────────────────────────────────────────┘
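
Steps 4 and 5 amount to erasing every entry that references the deleted ID. The sketch below uses the erase-remove idiom; the struct layouts and `removeFromIndexes` are assumptions for illustration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Assumed entry shapes for the two index kinds.
struct PrimaryIndex {
    int id;
    short offset;
};
struct SecondaryIndex {
    std::string key;
    int id;
};

// Drop the ID from the primary index and from one secondary index.
// std::remove_if keeps the relative order, so sorted vectors stay sorted.
void removeFromIndexes(std::vector<PrimaryIndex>& primary,
                       std::vector<SecondaryIndex>& secondary, int id) {
    primary.erase(std::remove_if(primary.begin(), primary.end(),
                                 [id](const PrimaryIndex& e) { return e.id == id; }),
                  primary.end());
    secondary.erase(std::remove_if(secondary.begin(), secondary.end(),
                                   [id](const SecondaryIndex& e) { return e.id == id; }),
                    secondary.end());
}
```

For guests, the same call would be repeated for the name index and the nationality index.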

✅ 31. Validation Rules (Complete Reference)

Rule	Entity	Check	Failure Action
Unique ID	All	Primary index lookup	Block insert with error message
Guest exists	Reservation	Guest primary index lookup	Block — "Guest not found"
Room exists	Reservation	Room primary index lookup	Block — "Room not found"
Room available	Reservation	Check `room_status == "available"`	Block — "Room already reserved"
Date format	Reservation	Regex/pattern match `YYYY-MM-DD`	Block — "Invalid date format"
Date order	Reservation	`check_out > check_in`	Block — "Check-out must be after check-in"
Nationality block	Guest	Case-insensitive compare to `"Israel"`	Block — "Nationality not allowed"

💲 32. Reservation Pricing Logic

Price Table (defined in code as `ROOM_PRICES` map)

Room Type	Daily Rate (EGP/Night)
`single`	500
`double`	800
`suit`	1,500

Computation Formula

days        = (check_out_date - check_in_date) converted to whole days
daily_rate  = ROOM_PRICES[room_type]
total_price = days × daily_rate
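
A self-contained sketch of this formula follows. The rates come from the price table above; the value type of `ROOM_PRICES`, the helper names, and the civil-date conversion used to count nights are assumptions of this example, and the dates are expected to have passed format validation already.

```cpp
#include <map>
#include <string>

// Rates from the price table (EGP per night); double is an assumed value type.
const std::map<std::string, double> ROOM_PRICES = {
    {"single", 500.0}, {"double", 800.0}, {"suit", 1500.0}};

// Days since 1970-01-01 for a Gregorian date (standard civil-date arithmetic).
long daysFromCivil(int y, int m, int d) {
    y -= m <= 2;                                   // treat Jan/Feb as end of prior year
    const long era = (y >= 0 ? y : y - 399) / 400;
    const long yoe = y - era * 400;                // year of era, 0..399
    const long doy = (153 * (m + (m > 2 ? -3 : 9)) + 2) / 5 + d - 1;
    const long doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    return era * 146097 + doe - 719468;
}

// Whole nights between two validated YYYY-MM-DD strings.
long nightsBetween(const std::string& checkIn, const std::string& checkOut) {
    auto toDays = [](const std::string& s) {
        return daysFromCivil(std::stoi(s.substr(0, 4)), std::stoi(s.substr(5, 2)),
                             std::stoi(s.substr(8, 2)));
    };
    return toDays(checkOut) - toDays(checkIn);
}

// total_price = days × daily_rate; throws std::out_of_range for an unknown type.
double totalPrice(const std::string& roomType, const std::string& checkIn,
                  const std::string& checkOut) {
    return nightsBetween(checkIn, checkOut) * ROOM_PRICES.at(roomType);
}
```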

Pricing Examples

Scenario	Room Type	Check-In	Check-Out	Nights	Total Price
Weekend getaway	double	2026-05-01	2026-05-04	3	3 × 800 = 2,400
Business trip	single	2026-06-10	2026-06-12	2	2 × 500 = 1,000
Luxury stay	suit	2026-07-01	2026-07-07	6	6 × 1,500 = 9,000

⚙️ 33. Internal Processing Lifecycle

The per-operation processing sequence is the one shown in the Data Flow Sequence diagram in Section 7.

🛠️ 34. Engineering Decisions

Decision	Rationale	Impact
Primary index as offset map	Eliminates need for full-file scans on lookup	Fast O(log n) retrieval; requires index maintenance overhead
Variable-length records	Avoids wasted space from fixed-width padding	More compact storage; more complex update logic (size comparison needed)
Logical deletion	Keeps writes simple; preserves record boundaries; enables space reuse	No O(n) byte-shifting on delete; eventual fragmentation without compaction
Separate secondary index files	Supports flexible non-ID queries without changing main-file format	Extra storage + sync complexity; powerful multi-path search capability
Immediate persistence after operations	Minimizes data-loss window between mutation and disk write	Lower throughput from the extra writes; a crash can lose at most the operation in progress, though this is not a transactional guarantee
In-memory sorted vectors for indexes	Enables binary search; simple implementation	RAM usage scales linearly with dataset size

🚀 35. Performance Considerations

Operation	Time Complexity	Space Complexity	Notes
Primary key lookup	O(log n)	O(1) extra	Binary search in RAM + single file seek
Secondary key lookup	O(n + m·log n)	O(m) results	Linear scan + per-match primary lookup
Record fetch (by known offset)	O(1)	O(1)	Single `fseek` + read until `#`
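Insert (append path)	O(log n) index search	O(1)	EOF write is O(1); inserting into the sorted index vector can shift up to n entries
Insert (reuse path)	O(log n + k)	O(1)	k = free slots scanned in avail-list, plus the same index insertion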
Update (in-place)	O(log n)	O(1)	Index lookup + overwrite
Update (relocation)	O(log n + k)	O(1)	Delete + insert combined
Delete	O(log n)	O(1)	Index lookup + mark + cleanup

Legend: n = total records, m = number of secondary matches (m ≤ n), k = free slots in avail-list
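
The first and third rows of the table (primary key lookup, then record fetch by offset) can be sketched as follows. This is an illustration under assumptions, not the project's code: `IndexEntry`, `find_offset`, and `read_record_at` are hypothetical names, and the sketch uses `seekg` on a C++ stream where the table mentions `fseek`. The `#` terminator and the `*` deletion marker are taken from this README.

```cpp
#include <algorithm>
#include <cassert>
#include <istream>
#include <optional>
#include <sstream>  // std::istringstream can stand in for a data file in quick checks
#include <string>
#include <vector>

struct IndexEntry {  // hypothetical layout of one primary-index entry
    int id;
    long offset;
};

// O(log n): binary search over entries kept sorted by id.
std::optional<long> find_offset(const std::vector<IndexEntry>& index, int id) {
    auto it = std::lower_bound(index.begin(), index.end(), id,
        [](const IndexEntry& e, int key) { return e.id < key; });
    if (it == index.end() || it->id != id) return std::nullopt;
    return it->offset;
}

// One seek, then read up to (not including) the '#' record terminator.
std::optional<std::string> read_record_at(std::istream& file, long offset) {
    assert(offset >= 0);
    file.clear();        // drop any EOF flag left by a previous read
    file.seekg(offset);
    std::string record;
    if (!std::getline(file, record, '#')) return std::nullopt;
    if (!record.empty() && record[0] == '*') return std::nullopt;  // logically deleted
    return record;
}
```

Taking a `std::istream&` rather than a file name lets the same function run against an `std::ifstream` in the application and an in-memory stream in tests.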

📈 36. Scalability Discussion

Current Design Scope

✅ Well-suited for: Small to medium datasets (up to ~10,000 records per entity), educational purposes, embedded systems, single-user desktop applications

Scaling Limitations

Limitation	Root Cause	Impact	Potential Fix
ID/Offset ceiling	`short` type (16-bit signed)	IDs capped at 32,767; an offset stored as `short` can address only the first 32,767 bytes of a file	Migrate to `int` or `size_t`
Secondary lookup bottleneck	Linear scan of secondary index	Degraded performance with large datasets	Implement binary/range search on sorted secondary keys
Full index rewrites	Entire index vector rewritten on every mutation	Unnecessary I/O for large indexes	Differential/delta updates
File fragmentation	Relocation leaves "holes" (deleted records)	Larger file sizes than data warrants	Periodic compaction utility
No concurrency support	Single-threaded design	Cannot handle simultaneous users	Add file locking or move to client-server architecture
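
The "binary/range search" fix proposed for the secondary lookup bottleneck could look like the sketch below. It assumes the secondary index is held as a vector of (key, id) pairs sorted by key; `SecondaryEntry`, `KeyLess`, and `ids_for_key` are hypothetical names, not identifiers from the code base.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using SecondaryEntry = std::pair<std::string, int>;  // (key, id), hypothetical layout

// Heterogeneous comparator so equal_range can compare an entry with a bare key.
struct KeyLess {
    bool operator()(const SecondaryEntry& e, const std::string& k) const { return e.first < k; }
    bool operator()(const std::string& k, const SecondaryEntry& e) const { return k < e.first; }
};

// O(log n) to locate the run of matching keys, then O(m) to copy the m ids.
std::vector<int> ids_for_key(const std::vector<SecondaryEntry>& sorted_index,
                             const std::string& key) {
    // Precondition check; compiled out when NDEBUG is defined.
    assert(std::is_sorted(sorted_index.begin(), sorted_index.end(),
        [](const SecondaryEntry& a, const SecondaryEntry& b) { return a.first < b.first; }));
    auto [first, last] =
        std::equal_range(sorted_index.begin(), sorted_index.end(), key, KeyLess{});
    std::vector<int> ids;
    for (auto it = first; it != last; ++it) ids.push_back(it->second);
    return ids;
}
```

Each returned id would still go through the primary index, so the overall secondary lookup would drop from O(n + m·log n) to O(log n + m·log n).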

⚖️ 37. Design Tradeoffs

✅ Advantages

Benefit	Explanation
Transparent file format	All data human-readable; debuggable with any text editor
Low conceptual overhead	No ORM magic, no query parser — everything explicit
Educational value	Exposes real database internals usually hidden by abstraction layers
Zero external dependencies	Pure C++17 standard library only; no third-party packages to install or link
Predictable performance	No query optimizer vagaries — complexity is deterministic

⚠️ Tradeoffs

Tradeoff	Mitigation
Manual consistency management	Careful ordering of operations in handler code
Duplicate logic across handlers	Shared utilities in common headers
Eventual fragmentation	Avail-list reuse mitigates; compaction planned for future
Platform-specific directory creation	Documented issue; portable replacement planned

🚨 Known Consistency Edge Case

Some room update/delete code paths do not fully clean up stale room_type secondary index entries when a room's type changes. This can leave orphaned room_type → id mappings that point to a room that no longer exists or that now has a different type. The primary index remains authoritative, so records fetched by ID are unaffected, but searches by room type can return ghost entries.
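
One way to close this gap is to treat a type change as "remove the old mapping, then add the new one" in a single helper. The sketch below is an assumption about how such a fix could look, not the project's code: `TypeEntry` and `retag_room` are hypothetical names, and the index is assumed to be a vector of (room_type, room_id) pairs sorted by type and then id.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using TypeEntry = std::pair<std::string, int>;  // (room_type, room_id), hypothetical layout

// Replaces the (old_type, id) mapping with (new_type, id) and keeps the vector sorted.
// Returns false when no mapping for old_type existed, which signals a stale index.
bool retag_room(std::vector<TypeEntry>& index, int room_id,
                const std::string& old_type, const std::string& new_type) {
    assert(std::is_sorted(index.begin(), index.end()));
    const TypeEntry stale{old_type, room_id};
    auto it = std::lower_bound(index.begin(), index.end(), stale);
    const bool found = (it != index.end() && *it == stale);
    if (found) index.erase(it);  // drop the mapping that would become a ghost entry

    const TypeEntry fresh{new_type, room_id};
    auto pos = std::lower_bound(index.begin(), index.end(), fresh);
    if (pos == index.end() || *pos != fresh) index.insert(pos, fresh);
    return found;
}
```

A delete path would need only the first half: erase the (room_type, id) pair before the primary entry is removed.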

🤔 38. Why File Organization Was Chosen

This implementation intentionally avoids database abstractions to demonstrate foundational concepts:

Database Concept	Our Analog	Learning Value
B-Tree index	Sorted vector + binary search	Understand why indexes are sorted
Row ID / Tuple identifier	File byte offset	Learn physical addressing
MVCC / soft delete	Logical deletion marker (`*`)	See how deletes work internally
Free space map / FSM	Availability list	Understand space reuse mechanisms
WAL / write-ahead log	Immediate flush policy	Appreciate durability guarantees
Query planner execution	Manual index selection path	See how query optimization works

These are simplified analogs of mechanisms used in PostgreSQL, MySQL, SQLite, and other production databases, made visible and easy to inspect.

🧪 39. Technical Challenges Encountered

Challenge	Solution Approach
Index/file consistency across CRUD	Every mutation updates the main file and all affected indexes within the same operation, so they are never left out of step between operations
Safe variable-length updates	Size comparison before write; relocation fallback prevents data corruption
Cross-entity side effects	Reservation handler explicitly toggles room status; DataCollector coordinates the multi-step transaction
Deleted-space reuse with boundary safety	Avail-list stores (offset, size) tuples; first-fit ensures new record fits entirely within reclaimed slot
Date parsing and arithmetic	Custom date validation and day-difference calculation in workflow layer
Pricing integration	Room-type price lookup embedded in reservation creation flow
Nationality filtering	Explicit case-insensitive block list checked at insertion time
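
The first-fit reuse described in the "Deleted-space reuse" row can be sketched as below. The (offset, size) pairing comes from the table; the names `FreeSlot` and `take_first_fit` are hypothetical, and the sketch assumes a reused slot is consumed whole, with any leftover bytes not tracked. The real handler may treat the remainder differently.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

struct FreeSlot {  // hypothetical avail-list entry
    long offset;       // where the deleted record starts in the main file
    std::size_t size;  // bytes available at that offset
};

// First-fit: take the first slot large enough for the new record and remove it
// from the list. An empty result means the caller should append at end of file.
std::optional<long> take_first_fit(std::vector<FreeSlot>& avail, std::size_t needed) {
    assert(needed > 0);
    for (auto it = avail.begin(); it != avail.end(); ++it) {
        if (it->size >= needed) {  // the record must fit entirely inside the slot
            const long offset = it->offset;
            avail.erase(it);
            return offset;
        }
    }
    return std::nullopt;
}
```

The scan stops at the first match, so its cost is bounded by the number of free slots k rather than by the number of records.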

🌱 40. Future Improvements Roadmap

High Priority

Widen numeric types: Replace short with int/size_t for IDs and offsets (remove 32K ceiling)
Optimize secondary indexes: Implement binary or range search instead of linear scan, which reduces the scan component of a secondary lookup from O(n) to O(log n) (O(log n + m·log n) overall, including the per-match primary lookups)
Referential integrity guards: Prevent deleting a guest or room that has active reservations (currently allowed — could cause orphaned foreign keys)

Medium Priority

Compaction/defragmentation utility: Rebuild main files by copying only live records, resetting all offsets
Vocabulary normalization: Standardize status terms (reserved vs booked vs occupied) consistently across all modules
Portable filesystem API: Replace system("if not exist ...") with C++17 <filesystem> for cross-platform directory creation
Fix include paths: Convert absolute Windows paths in main.cpp to relative includes for out-of-the-box builds on all platforms
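
For the portable filesystem item, a replacement for the shell-based directory check might look like the following sketch. `ensure_data_dir` is a hypothetical helper name; `std::filesystem::create_directories` and `std::filesystem::is_directory` are standard C++17 calls.

```cpp
#include <cassert>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Creates the directory and any missing parents.
// Returns true when the directory exists afterwards, whether or not it was just created.
bool ensure_data_dir(const fs::path& dir) {
    assert(!dir.empty());
    std::error_code ec;
    fs::create_directories(dir, ec);  // not an error if the directory already exists
    return !ec && fs::is_directory(dir, ec);
}
```

At startup this could be called once per entity folder, for example `ensure_data_dir("HotelSYS/Data/Room")`. Note that GCC 8 requires linking with `-lstdc++fs` to use `<filesystem>`; GCC 9 and later do not.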

Low Priority (Nice-to-Have)

CMake build configuration: Replace manual g++ commands with a proper build system
Automated unit test suite: Test CRUD operations, edge cases, and consistency invariants
Export/import functionality: JSON or CSV export for data portability
Search history/recent queries: Cache frequently searched values
Statistics dashboard: Show occupancy rates, revenue totals, guest nationality breakdowns

👥 41. Team Contributions

Farouk — Storage Systems Engineer 🗄️

File handling architecture and design patterns
Persistent storage layer implementation
Hard disk interaction and stream management
Record writing/reading (serialization/deserialization)
Offset management and addressing scheme
File organization architecture decisions
Low-level storage operations and bit-level concerns
Availability list design and implementation

Nada — Memory Systems Engineer 🧠

RAM layer design and data structure selection
Runtime memory handling strategies
In-memory index/data structure implementation
Sorted vector maintenance algorithms
Data movement between memory and files
Binary search implementation and optimization
Index synchronization logic

Mahmoud — Integration Engineer 🔗

Data collection layer architecture
Input acquisition and parsing workflows
Data preparation before persistence
Collector workflow logic and orchestration
Business rule validation and enforcement
Cross-entity coordination (reservation ↔ room/guest)
Error messaging and user guidance

Nora — UX/UI Engineer 🖥️

User interface design and layout
User interaction flow and navigation
Input/output presentation formatting
Usability and interaction handling
Console menu hierarchy and routing
Display formatting and readability
User experience polish

🏁 42. Conclusion

This project represents a practical, from-scratch storage engine implemented as a hotel management system. It demonstrates that the fundamental mechanisms underlying modern databases — persistence, indexing, record management, and lifecycle workflows — can be built directly on files with careful engineering.

What Makes This Implementation Valuable

Aspect	What You Learn
Transparency	Every byte on disk is inspectable; no black boxes
Performance reasoning	Understand WHY operations have their complexity characteristics
Tradeoff awareness	See real engineering choices and their consequences
Systems thinking	Connect high-level operations to low-level file mechanics
Debugging skills	Diagnose issues at the file/hex level, not just application level

Whether you're studying database internals, preparing for systems programming interviews, or building your own embedded data solution, this project provides a concrete reference implementation of storage-engine fundamentals.

⏱️ 43. Time Complexity Comparison (Complete Reference)

Operation	Primary Index Path	Secondary Index Path	Notes
Lookup by ID	O(log n)	—	Binary search in RAM primary index, then direct file read
Lookup by secondary key	—	O(n + m·log n)	O(n) linear scan + O(log n) primary lookup per match; worst-case O(n·log n) when m = n
Record fetch (after offset resolved)	O(1)	O(1)	Single `fseek` + sequential read until `#` terminator
Insert (append path)	O(log n)*	O(log n)*	Index insertion cost dominates; file write is O(1)
Insert (reuse path)	O(log n + k)*	O(log n + k)*	Plus O(k) avail-list scan for first-fit slot
Update (in-place)	O(log n)	—	Index lookup + overwrite at known offset
Update (relocation)	O(log n + k)*	—	Delete (log n) + Insert (log n + k) combined
Delete	O(log n)*	O(log n)*	Index lookup + mark + index removal
Space reuse (first-fit)	O(k)	O(k)	k = number of free slots in avail-list vector

* Counts only the binary-search component. Inserting into or erasing from a sorted vector is O(n) in the worst case because the elements after the affected position must shift.

Legend: n = total index entries, m = number of secondary matches (m ≤ n), k = free slots in avail-list
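
The two cost components of index insertion are easy to see in code. The sketch below is illustrative only; `PrimaryEntry` and `insert_sorted` are hypothetical names, and rejecting duplicate ids is an assumption about the desired behavior.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct PrimaryEntry {  // hypothetical layout of one primary-index entry
    int id;
    long offset;
};

// Inserts while keeping the vector sorted by id. Returns false on a duplicate id.
bool insert_sorted(std::vector<PrimaryEntry>& index, PrimaryEntry entry) {
    assert(entry.offset >= 0);
    // O(log n): binary search for the insertion point.
    auto pos = std::lower_bound(index.begin(), index.end(), entry.id,
        [](const PrimaryEntry& e, int key) { return e.id < key; });
    if (pos != index.end() && pos->id == entry.id) return false;
    // O(n) worst case: every element after pos moves one place to the right.
    index.insert(pos, entry);
    return true;
}
```

When new ids are always larger than existing ones, the insertion point is the end of the vector and no elements shift, so the shifting cost matters mainly for out-of-order inserts.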

🗂️ 44. Project Structure

Hotel-System-CPP/
│
├── Final_Project/                    # ═══ Source Code ═══
│   │
│   ├── main.cpp                      # Entry point, UI menus, main() function
│   ├── DataCollector.h               # Input validation, workflow orchestration
│   ├── RAM_Manager.h                 # In-memory index structures, search algorithms
│   ├── Avail_list.h                  # Free-space tracking, first-fit allocation
│   ├── Constants.h                   # Delimiters, filenames, paths, configuration
│   │
│   ├── File_Handler/                 # ═══ Entity-Specific Handlers ═══
│   │   ├── RoomHandler.h             # Room CRUD, serialization, file I/O
│   │   ├── GuestHandler.h            # Guest CRUD, serialization, file I/O
│   │   └── ReserveHandler.h          # Reservation CRUD, serialization, file I/O
│   │
│   └── HotelSYS/Data/                # ═══ Runtime Data Files (auto-generated) ═══
│       │
│       ├── Room/
│       │   ├── Room-main.txt              # Room records (variable-length)
│       │   ├── Room-primary.txt           # Primary index: id→offset
│       │   ├── Room-secondary.txt         # Secondary index: room_type→id
│       │   └── Avail-list-room.txt        # Binary free-space metadata
│       │
│       ├── Guest/
│       │   ├── Guest-main.txt             # Guest records
│       │   ├── Guest-primary.txt          # Primary index: id→offset
│       │   ├── Guest-fullname.txt         # Secondary index: name→id
│       │   ├── Guest-nationality.txt      # Secondary index: nationality→id
│       │   └── Avail-list-guest.txt       # Binary free-space metadata
│       │
│       └── Reservation/
│           ├── Reservation-main.txt       # Reservation records
│           ├── Reservation-primary.txt    # Primary index: id→offset
│           └── Avail-list-reserve.txt     # Binary free-space metadata
│
├── structure/                       # ═══ Documentation ═══
│   └── structure.txt                # Additional structural notes
│
├── .gitignore
├── LICENSE
└── README.md                        # This file

Total: 8 source files + 12 data files = 20 files

🚀 45. Quick Start Guide

Prerequisites

✅ C++17 compatible compiler (GCC 7+, Clang 5+, MSVC 2017+)
✅ Git (for cloning the repository)
✅ Terminal/command prompt access

Step 1: Clone the Repository

git clone https://github.com/Farouk0Elsayed/Hotel-System-CPP.git
cd Hotel-System-CPP

Step 2: Resolve Include Paths ⚠️ (Required)

Known Issue (documented, fix planned — see Section 40):

Final_Project/main.cpp currently contains absolute Windows paths in its #include directives. These will fail on any machine except the original developer's.

Solution — replace absolute paths with relative includes:

// ❌ BEFORE (absolute Windows path — won't work on your machine):
#include "C:\\Users\\Farouk\\Documents\\Hotel-System-CPP\\Final_Project\\DataCollector.h"

// ✅ AFTER (relative path — works everywhere):
#include "DataCollector.h"

Apply this fix to all #include directives in main.cpp. Alternatively, configure your IDE/build system's include search paths to contain ./Final_Project and ./Final_Project/File_Handler.

Step 3: Build the Project

g++ -std=c++17 \
    -I./Final_Project \
    -I./Final_Project/File_Handler \
    -o hotel ./Final_Project/main.cpp

Note: This assumes Step 2 is completed. This project uses a header-only implementation pattern where main.cpp is the sole translation unit — compiling it generates the complete executable.

Step 4: Run the Application

./hotel        # Linux/macOS
hotel.exe      # Windows

You'll be presented with a console menu system for managing rooms, guests, and reservations.

🧰 46. Tech Stack & Concepts Covered

Language & Libraries

Technology	Usage
C++17	Core language standard
STL Containers	`vector<T>` for index storage, `string` for text handling
STL Algorithms	`std::sort`, `std::lower_bound`, `std::find`
`<fstream>`	`ifstream`, `ofstream` for all file I/O operations
`<iostream>`	Console input/output for UI layer

Storage Concepts Demonstrated

Concept	Where Applied
Variable-length records	All main data files
Delimiter-based serialization	`\|` field separators, `#` record terminators
Logical deletion (soft delete)	`*` marker in first byte
Primary indexing	`id → offset` for all entities
Secondary indexing	Non-unique key lookups (room type, name, nationality)
Offset-based direct access	`fseek` + read by byte position
Availability lists (free-space management)	First-fit slot reuse
In-place update optimization	Size comparison before write decision

Domain Logic Concepts

Concept	Implementation
Reservation lifecycle	Create → active → cancel → deleted
Room availability state machine	available ↔ reserved transitions
Dynamic pricing	Rate lookup by room type × duration
Referential integrity (partial)	Foreign key validation on reservation creation
Business rule enforcement	Nationality blocking, uniqueness constraints

👥 47. Authors

GitHub	Role
https://github.com/Farouk0Elsayed	Storage Systems Engineer
https://github.com/nada1102006	Memory Systems Engineer
https://github.com/7okaa1	Integration Engineer
https://github.com/NoraAlaa97	UX/UI Engineer

Built with dedication and precision by

Farouk 🗄️ · Nada 🧠 · Mahmoud 🔗 · Nora 🖥️

🏨 Hotel System C++ — A Storage Engine, Not Just a Homework Assignment

Repository · Version 1.0.0 · MIT License

File	Format	Example
Main	`id\|full_name\|nationality#`	`10\|Ahmed Mohamed Ali\|egypt#`
Primary Index	`id\|offset`	`10\|2080`
Name Secondary	`name\|id`	`Ahmed Mohamed Ali\|10`
Nationality Secondary	`nationality\|id`	`egypt\|10`
