caio-moliveira/README.md

Hi 👋, I'm Caio Oliveira

Senior AI Engineer · Data Engineer · Analytics Engineer

Building production-grade AI systems that turn data and documents into decisions.


🧠 About Me

I’m an AI Engineer and Data Engineer specializing in end-to-end intelligent automation systems that combine data engineering, LLMs, RAG architectures, and orchestration frameworks.

My journey started when I left Brazil to study Computing & IT in Dublin, where I built a strong technical foundation while developing adaptability and a global mindset. Today, I work at the intersection of AI, data platforms, and real-world business problems, transforming traditional workflows into scalable, AI-driven ecosystems.

I’ve led and implemented large-scale AI solutions in the public sector, including intelligent agents capable of processing millions of documents autonomously, enabling levels of efficiency and transparency that were previously unattainable.

Alongside my industry work, I’m also a technical instructor and mentor, helping engineers transition into Data Engineering and AI Engineering, with a strong focus on practical, production-ready systems.


🚀 What I Work On

  • 🤖 AI Agents & LLM Systems

    • RAG pipelines (chunking, embeddings, retrieval strategies)
    • Agent-based architectures with memory and tools
    • OCR + NLP for large-scale document intelligence
  • 🧱 Data Engineering

    • ETL / ELT pipelines (batch & real-time)
    • Data modeling and analytics engineering
    • API-first data products
  • ☁️ Cloud-Native & Scalable Architectures

    • Containerized services
    • Orchestrated workflows
    • Vector databases and hybrid search
  • 🎓 Education & Mentorship

    • AI & Data Engineering bootcamps
    • Technical workshops (RAG, LLMs, Vector DBs)
    • Mentoring engineers transitioning into data & AI roles

🛠️ Tech Stack

🧠 AI Engineering & LLM Ecosystem

  • RAG Architectures · AI Agents · Retrieval Systems
  • Vector Search & Hybrid Retrieval
  • Embeddings Pipelines · Semantic Search
  • OCR + NLP Document Processing

🐍 Languages & Backend Frameworks


📊 Data Engineering & Analytics

  • ETL / ELT Pipelines
  • Data Modeling & Analytics Engineering
  • Batch + Real-Time Data Processing

🗄️ Databases, Storage & Vector Infrastructure

  • Relational & NoSQL Databases
  • Caching Layers & High-Performance Retrieval
  • Vector Databases & Semantic Indexing

☁️ Cloud, DevOps & Infrastructure

  • Containerized Microservices
  • Cloud-Native AI Systems
  • CI/CD & Production Deployments

Pinned

  1. sales-pipeline-project (Public)

    This project demonstrates an end-to-end data pipeline, integrating cloud storage, data processing, and real-time visualization. It serves as a practical foundation for similar data engineering and…

    Python 13 3

  2. dbt-snowflake-project (Public)

    This repository implements a fully automated data pipeline integrating AWS S3, Snowflake, dbt, Apache Airflow, and Streamlit. It handles data ingestion, transformation, and visualization, providing…

    Python 1 1

  3. airflow-astro-project (Public)

    This project demonstrates using the Astro CLI to manage and deploy an Airflow DAG that calls an external API and stores the data in PostgreSQL. This setup exemplifies how o…

    Python 1

  4. read-files-to-dataframe (Public)

    This project showcases how object-oriented principles like hierarchy, polymorphism, and encapsulation can improve data handling and processing. Python, SQLite, and Poetry for dependency management,…

    Python

  5. databricks-duckdb-1billion-rows (Public)

    Python

  6. ai-agent-service (Public)

    An AI agent that connects to your database and answers your questions about the data, while also generating the query behind each answer!

    Python 6 4