Large File Data Indexing and Word Search using Trie

This Java-based project reads a large text file and efficiently indexes every word using a Trie (Prefix Tree) data structure. Once indexed, users can search for any word to check if it exists in the file and how many times it appears.

Features

Reads large files line-by-line using BufferedReader
Uses Trie for fast word insertion and lookup
Strips punctuation and matches words case-insensitively
Interactive CLI: search for words or exit anytime
Shows word frequency if present in the file
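
To make the trie-based features above concrete, here is a minimal sketch of a trie that keeps an occurrence count at the node where each word ends. The class name `WordTrie` and its methods `insert` and `count` are hypothetical and chosen for illustration; the project's actual classes in `src` may be organized differently.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical trie that stores an occurrence count at the node ending each word. */
final class WordTrie {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        int count; // how many times a word ending at this node was inserted
    }

    private final Node root = new Node();

    /** Adds one occurrence of the word; the work is proportional to the word length. */
    void insert(String word) {
        if (word.isEmpty()) {
            return; // nothing to index
        }
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            node = node.children.computeIfAbsent(word.charAt(i), c -> new Node());
        }
        node.count++;
    }

    /** Returns how many times the word was inserted, or 0 if it was never seen. */
    int count(String word) {
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            node = node.children.get(word.charAt(i));
            if (node == null) {
                return 0; // path breaks off, so the word is absent
            }
        }
        return node.count; // 0 when the query is only a prefix of indexed words
    }
}
```

Both operations walk one node per character, so lookup cost depends on the length of the query rather than on the number of words indexed. A `HashMap` per node is one possible choice; a fixed 26-slot array would also work if only the letters a-z are stored.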

How It Works

The program prompts for a file name (supports relative paths).
Reads the file, extracts valid words, and inserts them into the trie.
Accepts user input to search words interactively.
Returns the count of each word's occurrence or a not-found message.
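
The steps above could be sketched roughly as follows. Everything named here is an assumption made for illustration: the class `IndexFlow`, the methods `normalize`, `indexWords` and `searchLoop`, the rule that a valid word consists only of the letters a-z, the `exit` command, and the wording of the output messages. The index itself is passed in as a `Consumer` (for insertion) and a `ToIntFunction` (for lookup), so the sketch does not depend on any particular trie class.

```java
import java.io.BufferedReader;
import java.io.PrintStream;
import java.util.Locale;
import java.util.Scanner;
import java.util.function.Consumer;
import java.util.function.ToIntFunction;

/** Hypothetical outline of the index-then-search flow. */
final class IndexFlow {
    /** Lowercases a token and drops everything except a-z (an assumed rule). */
    static String normalize(String token) {
        return token.toLowerCase(Locale.ROOT).replaceAll("[^a-z]", "");
    }

    /**
     * Streams the input one line at a time and hands each non-empty
     * normalized word to the sink. I/O errors surface as UncheckedIOException.
     */
    static void indexWords(BufferedReader reader, Consumer<String> sink) {
        reader.lines().forEach(line -> {
            for (String token : line.split("\\s+")) {
                String word = normalize(token);
                if (!word.isEmpty()) {
                    sink.accept(word);
                }
            }
        });
    }

    /** Answers queries until input ends or the user types "exit" (assumed command). */
    static void searchLoop(Scanner in, PrintStream out, ToIntFunction<String> countOf) {
        while (in.hasNextLine()) {
            String query = in.nextLine().trim();
            if (query.equalsIgnoreCase("exit")) {
                break;
            }
            String word = normalize(query);
            if (word.isEmpty()) {
                continue; // nothing searchable in this query
            }
            int count = countOf.applyAsInt(word);
            out.println(count > 0
                    ? "'" + word + "' appears " + count + " time(s)."
                    : "'" + word + "' was not found.");
        }
    }
}
```

Queries go through the same `normalize` step as the indexed text, which is what makes the search case-insensitive. Any index that offers an insert operation and a count operation could be supplied as the sink and the lookup.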

Technologies Used

Java 11+
Trie Data Structure
BufferedReader
Scanner

Future Enhancements

Show suggestions for near matches (fuzzy search)
Export indexed data as a report
GUI integration using JavaFX or Swing
Add support for multiple files or file types

Notes

Input file must be in .txt format.
Words are normalized: lowercase and stripped of punctuation.
The file path must be valid; if the file cannot be opened, the program exits gracefully.
Due to GitHub's file size limitations, the test dataset (170,000+ rows) has not been uploaded. However, the project has been successfully tested on this large data file locally.
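
As an illustration of the file-handling notes above, the following sketch opens a `.txt` file and reports a problem instead of crashing when the path is wrong. The class `SafeOpen`, the method `open`, the error messages, and the choice of UTF-8 are all assumptions; the project may handle these cases differently.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;
import java.util.Optional;

/** Hypothetical helper that opens the input file without throwing on a bad path. */
final class SafeOpen {
    /** Returns a reader for a readable .txt file, or empty after printing the reason. */
    static Optional<BufferedReader> open(String fileName) {
        if (!fileName.toLowerCase(Locale.ROOT).endsWith(".txt")) {
            System.err.println("Expected a .txt file: " + fileName);
            return Optional.empty();
        }
        try {
            // Relative paths are resolved against the current working directory.
            Path path = Paths.get(fileName);
            // UTF-8 is an assumed encoding for the input file.
            return Optional.of(Files.newBufferedReader(path, StandardCharsets.UTF_8));
        } catch (IOException | InvalidPathException e) {
            System.err.println("Could not open " + fileName + ": " + e.getMessage());
            return Optional.empty();
        }
    }
}
```

A caller can check `isPresent()` and return from `main` when the result is empty, which ends the program without a stack trace.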

Author

Monika
B.Tech, CSE (Data Science)
LinkedIn: https://www.linkedin.com/in/monika-nahadiya-a99558289/
Email: monikanahadiya@gmail.com
