Deep neural networks are a class of machine learning models inspired by the structure and function of the human brain. They consist of multiple layers of interconnected nodes (neurons) that automatically learn hierarchical representations of data. By stacking these layers, deep learning models excel at recognizing complex patterns, enabling breakthroughs in computer vision, natural language processing, speech recognition, and many other domains.
This repository is dedicated to learning, experimenting with, and applying existing neural network architectures, as well as building projects on top of them. It also serves as a space for exploring new optimization techniques, experimenting with innovative tools, and potentially designing custom architectures.
I am genuinely curious and deeply passionate about AI—almost to the point of being crazy about it. This repository reflects that enthusiasm and serves as both a learning ground and a creative outlet for my exploration of neural networks.
The main goal of this repository is to document and summarize all of the AI/ML projects I have worked on. Each project lives in its own subdirectory as an independent implementation, experiment, or idea. Projects include:
- Implementations of existing neural network architectures
- Projects built on top of established models
- Explorations of optimization strategies and training improvements
- Experiments with new tools and frameworks
- Attempts at creating original architectures
The repository is organized as follows:
- Each subdirectory is a separate project
- Independent codebases for easy navigation and experimentation
- Clear separation of ideas, tools, and implementations
This repository is not just a collection of code but a journey through deep learning, covering everything from fundamental applications to innovative approaches that may lead to new architectures or optimization methods. It embodies my curiosity, persistence, and excitement for advancing in the world of AI.
This project goes beyond standard digit classification to tackle the recognition of variable-length sequences of one to three handwritten digits. It documents the complete machine learning engineering lifecycle: starting with a baseline CNN, diagnosing overfitting with a validation set, and systematically applying advanced techniques to reach high performance.
- Architecture: a deep, multi-head CNN with Dropout for regularization (see the model sketch after this list).
- Key Techniques: Data Augmentation, Adaptive Learning Rate Scheduling (`ReduceLROnPlateau`), and Early Stopping (see the training sketch after this list).
- Outcome: a final 99.25% exact sequence match accuracy on the test set (results may vary from run to run).
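The exact layer layout is not spelled out above, so here is a minimal sketch, assuming a Keras/TensorFlow setup with a shared convolutional trunk and one softmax head per digit position. The 11th "blank" class for absent digits, the input size, and all hyperparameters are illustrative assumptions, not the project's actual values.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_HEADS = 3     # assumed maximum sequence length (1-3 digits)
NUM_CLASSES = 11  # assumed: digits 0-9 plus a "blank" class for absent positions

def build_multi_head_cnn(input_shape=(64, 64, 1)):
    """Shared conv trunk feeding one classification head per digit position."""
    inputs = layers.Input(shape=input_shape)

    # Convolutional trunk: conv -> pool blocks, with Dropout for regularization.
    x = inputs
    for filters in (32, 64, 128):
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        x = layers.MaxPooling2D()(x)
        x = layers.Dropout(0.25)(x)

    x = layers.Flatten()(x)
    x = layers.Dense(256, activation="relu")(x)
    x = layers.Dropout(0.5)(x)

    # One softmax head per digit position; a sequence counts as an exact match
    # only when every head predicts its position correctly.
    outputs = [
        layers.Dense(NUM_CLASSES, activation="softmax", name=f"digit_{i}")(x)
        for i in range(NUM_HEADS)
    ]
    return models.Model(inputs, outputs)

model = build_multi_head_cnn()
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",  # applied to each head
    metrics=["accuracy"],
)
```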
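Likewise, a hedged sketch of how the listed training techniques could be wired together using standard Keras callbacks and preprocessing layers; the patience values, augmentation ranges, and the `train_ds`/`val_ds` dataset names are assumptions for illustration.

```python
import tensorflow as tf

# Callbacks for the two listed techniques; patience/factor values are assumed.
callbacks = [
    tf.keras.callbacks.ReduceLROnPlateau(
        monitor="val_loss", factor=0.5, patience=3, min_lr=1e-5
    ),
    tf.keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=8, restore_best_weights=True
    ),
]

# Light augmentation suited to digits: small rotations, shifts, and zooms.
# (No horizontal flips, which would turn digits into unreadable glyphs.)
augment = tf.keras.Sequential([
    tf.keras.layers.RandomRotation(0.05),
    tf.keras.layers.RandomTranslation(0.1, 0.1),
    tf.keras.layers.RandomZoom(0.1),
])

# Hypothetical tf.data pipelines yielding (image, (label_0, label_1, label_2)):
# train_ds = train_ds.map(lambda x, y: (augment(x, training=True), y))
# model.fit(train_ds, validation_data=val_ds, epochs=100, callbacks=callbacks)
```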