This project was completed under the supervision of Dr. Ayaz ul Hassan Khan as part of the "GPU Programming & Architecture" course at King Fahd University of Petroleum and Minerals, by Haya Alhuraib and Yasmin Alshalabi.
This project implements Latent Dirichlet Allocation (LDA), a probabilistic topic modeling algorithm, across four different computing platforms to compare performance:
- Sequential CPU Implementation (C++)
- OpenACC GPU Implementation (C++ with OpenACC directives)
- CUDA Python Implementation (Python with Numba/CUDA)
- CUDA C/C++ Implementation (Native CUDA with Thrust)
The goal is to benchmark and analyze the performance differences between CPU and GPU implementations of LDA topic modeling, and to compare different GPU programming paradigms.
NOTE: After selecting the dataset you want to process, make sure to update the dataset filename in the preprocessing script
- Input: CSV files with text data
- Processing:
- Tokenization and cleaning (lowercase, remove short words, stopwords)
- Vocabulary building
- Bag-of-Words (BoW) representation
- Output:
  - `training_bow.csv`: document-word-count matrix
  - `vocab_map.txt`: vocabulary mapping (word_id → word)
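The preprocessing steps above (tokenize, clean, build vocabulary, emit BoW rows) can be sketched in plain Python. The stopword list, minimum word length, and function name are illustrative, not the repo's actual choices:

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is"}  # illustrative subset

def build_bow(docs, min_len=3):
    """Tokenize and clean each document, build a vocabulary,
    and return (doc_id, word_id, count) rows as in training_bow.csv."""
    tokenized = []
    for text in docs:
        tokens = re.findall(r"[a-z]+", text.lower())          # lowercase + tokenize
        tokens = [t for t in tokens
                  if len(t) >= min_len and t not in STOPWORDS]  # drop short/stop words
        tokenized.append(tokens)
    # word -> word_id, as written to vocab_map.txt
    vocab = {w: i for i, w in enumerate(sorted({t for d in tokenized for t in d}))}
    rows = []
    for d, tokens in enumerate(tokenized):
        for w, c in sorted(Counter(tokens).items()):
            rows.append((d, vocab[w], c))
    return vocab, rows
```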
- Pure C++ implementation running on CPU
- Uses standard C++ libraries (`<random>`, `<chrono>`, etc.)
- Serial Gibbs sampling for topic assignment
- Output: `Nw_k_matrix.csv`, `Nd_k_matrix.csv`, `corpus_tokens.csv`
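The serial collapsed Gibbs sampling loop, with the word-topic (`Nw_k`) and document-topic (`Nd_k`) count matrices named after the output files above, can be sketched as follows. The hyperparameters `alpha` and `beta`, the iteration count, and the sampling details are illustrative assumptions, not the repo's actual values:

```python
import numpy as np

def gibbs_lda(tokens, doc_ids, V, K, iters=50, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampler over a flat token array.
    tokens[i] is the word id of token i, doc_ids[i] its document id."""
    rng = np.random.default_rng(seed)
    D = doc_ids.max() + 1
    z = rng.integers(0, K, size=len(tokens))           # random initial topics
    Nw_k = np.zeros((V, K)); Nd_k = np.zeros((D, K)); Nk = np.zeros(K)
    for w, d, t in zip(tokens, doc_ids, z):
        Nw_k[w, t] += 1; Nd_k[d, t] += 1; Nk[t] += 1
    for _ in range(iters):
        for i, (w, d) in enumerate(zip(tokens, doc_ids)):
            t = z[i]
            Nw_k[w, t] -= 1; Nd_k[d, t] -= 1; Nk[t] -= 1   # remove current token
            # unnormalized conditional p(topic | rest)
            p = (Nw_k[w] + beta) / (Nk + V * beta) * (Nd_k[d] + alpha)
            t = rng.choice(K, p=p / p.sum())               # sample new topic
            z[i] = t
            Nw_k[w, t] += 1; Nd_k[d, t] += 1; Nk[t] += 1   # add token back
    return z, Nw_k, Nd_k
```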
- C++ with OpenACC directives for GPU offloading
- Uses `#pragma acc` directives for parallelization
- Includes a custom xorshift64* PRNG for the GPU
- Memory management with `malloc`/`free`
- Compilation: `nvc++ -acc -gpu=managed,cc70`
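The xorshift64* PRNG used on the GPU keeps one 64-bit state per thread; a plain-Python sketch of one step (masking emulates the uint64 overflow that C gets for free) looks like this:

```python
MASK = (1 << 64) - 1  # emulate unsigned 64-bit wraparound

def xorshift64star(state):
    """One xorshift64* step: returns (new_state, 64-bit output)."""
    state ^= state >> 12
    state ^= (state << 25) & MASK
    state ^= state >> 27
    state &= MASK
    # output is the state scrambled by an odd multiplicative constant
    return state, (state * 0x2545F4914F6CDD1D) & MASK

def uniform01(state):
    """Advance the state and map the output to [0, 1)."""
    state, out = xorshift64star(state)
    return state, out / 2**64
```

Threading the state through each call keeps the generator deterministic per seed, which is what makes GPU runs reproducible without a global RNG.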
- Python implementation using Numba CUDA
- Kernels for zero initialization, count building, and topic sampling
- Uses Numba's random number generators
- Automatic block/thread configuration
- Dependencies: `numpy`, `numba`
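The automatic block/thread configuration amounts to launching one thread per element with a ceiling-divided grid; a minimal sketch (the 256-thread block size is an assumption, not necessarily the repo's choice):

```python
import math

def launch_config(n, threads_per_block=256):
    """1-D launch: ceil(n / threads_per_block) blocks, one thread per element.
    Each kernel then guards with `if i < n` since the last block over-covers."""
    blocks = math.ceil(n / threads_per_block)
    return blocks, threads_per_block
```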
- Native CUDA with Thrust library
- Uses Thrust device vectors and algorithms
- Custom CUDA kernels for count building
- Functor-based topic sampling with Thrust
- Compilation: `nvcc -arch=sm_75`