VS Code extension for viewing and handling large datasets. Opens JSONL, Parquet, and CSV files without crashing, and includes 16 production LLM tokenizers for chat completion data.
Updated Feb 5, 2026 - TypeScript
Multi-tenant Postgres-compatible database on object storage. 12× cheaper disk than Postgres, native vector search, per-tenant isolation. Built in Rust.
Production-style PySpark ETL pipeline processing 100K+ e-commerce records with optimized joins, feature engineering, and scalable Parquet outputs.
Fine-tuning DistilGPT2 on the EmpatheticDialogues dataset to create an emotionally intelligent chatbot. Features custom attention calibration and a Streamlit-based interface for wellness support.