skeval

Semantic Evaluation Layer for LLMs

skeval is a lightweight library for evaluating how well Large Language Models (LLMs) understand and generate different sentence types, such as facts, emotions, opinions, and instructions.

🚀 Motivation

Most LLM evaluation focuses on:

Accuracy
BLEU / ROUGE scores
Reasoning benchmarks

But real-world language understanding also requires:

Distinguishing facts from opinions
Detecting emotions
Identifying intent and instruction

skeval fills this gap by providing a semantic classification and evaluation layer.

🧠 What It Does

Classifies sentences into categories:
- Fact
- Emotion
- Opinion
- Instruction
- (extendable)
Evaluates LLM outputs based on:
- Classification accuracy
- Confusion between categories
- Per-class metrics
Works with:
- LLM outputs
- Custom datasets
- Benchmark pipelines
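
The three evaluation outputs listed above can be illustrated in plain Python. This is a minimal sketch, not skeval's actual implementation: the helper names `accuracy` and `per_class_metrics` and the sample predictions are assumptions made for illustration only.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_metrics(y_true, y_pred):
    """Precision, recall, and F1 for each label (hypothetical helper)."""
    labels = sorted(set(y_true) | set(y_pred))
    # Count every (true, predicted) pair; off-diagonal pairs are confusions.
    pairs = Counter(zip(y_true, y_pred))
    metrics = {}
    for label in labels:
        tp = pairs[(label, label)]
        fp = sum(pairs[(other, label)] for other in labels if other != label)
        fn = sum(pairs[(label, other)] for other in labels if other != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


# Illustrative data: one opinion is misclassified as a fact.
y_true = ["fact", "emotion", "opinion", "instruction"]
y_pred = ["fact", "emotion", "fact", "instruction"]

print(accuracy(y_true, y_pred))
print(per_class_metrics(y_true, y_pred)["fact"])
```

In this toy case the single opinion-to-fact confusion lowers the precision of "fact" while its recall stays perfect, which is the kind of category-level signal that overall accuracy alone hides.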

📦 Features

Modular architecture (classifier, evaluator, metrics)
Custom evaluation metrics for semantic types
Compatible with LLM pipelines
Extensible label taxonomy
Command-line interface (planned)

🏗️ Project Structure

skeval/
│
├── src/skeval/
│   ├── classifier/
│   ├── evaluator/
│   ├── metrics/
│   └── dataset/
│
├── data/
│   ├── raw/
│   └── processed/
│
├── tests/
├── scripts/
├── docs/
└── notebooks/

⚙️ Installation

git clone https://github.com/skeval-ai/skeval.git
cd skeval
pip install -e .

🧪 Example Usage

from skeval.classifier import SentenceClassifier
from skeval.evaluator import Evaluator

sentences = [
    "Water boils at 100 degrees Celsius",
    "I feel sad today",
    "I think this movie is amazing",
    "Please close the door",
]
labels = ["fact", "emotion", "opinion", "instruction"]

classifier = SentenceClassifier(embed_dim=64)
classifier.train(sentences, labels, epochs=20)

predictions = classifier.predict([
    "The sky is blue",
    "I am so happy",
    "I believe dogs are better than cats",
    "Turn off the lights",
])

# Gold labels for the four test sentences above, in the same order.
expected = ["fact", "emotion", "opinion", "instruction"]

evaluator = Evaluator()
results = evaluator.evaluate(predictions, expected)
print(results)

📊 Example Output

{
  "accuracy": 0.75,
  "per_class": {"fact": {"precision": 1.0, "recall": 1.0, "f1-score": 1.0, ...}, ...},
  "macro_avg": {"precision": ..., "recall": ..., "f1-score": ...},
  "weighted_avg": {"precision": ..., "recall": ..., "f1-score": ...},
  "confusion_matrix": [[...], ...],
  "labels": ["emotion", "fact", "instruction", "opinion"]
}
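
To read the confusion matrix, each row and column has to be matched against the "labels" list. The sketch below assumes that rows are true labels and columns are predicted labels, both in "labels" order (a common convention, but not confirmed by this README). The matrix values and the `top_confusions` helper are hypothetical and are not the elided values from the output above.

```python
def top_confusions(matrix, labels):
    """List (true, predicted, count) for every off-diagonal cell, largest first."""
    confusions = []
    for i, row in enumerate(matrix):
        for j, count in enumerate(row):
            if i != j and count > 0:
                confusions.append((labels[i], labels[j], count))
    return sorted(confusions, key=lambda c: c[2], reverse=True)


labels = ["emotion", "fact", "instruction", "opinion"]

# Hypothetical matrix: assumed rows = true labels, columns = predicted labels.
matrix = [
    [1, 0, 0, 0],  # emotion
    [0, 1, 0, 0],  # fact
    [0, 0, 1, 0],  # instruction
    [0, 1, 0, 0],  # opinion, predicted as fact
]

for true_label, predicted_label, count in top_confusions(matrix, labels):
    print(f"{true_label} -> {predicted_label}: {count}")
```

With this illustrative matrix, three of the four diagonal cells are filled, so the accuracy is 0.75 and the only confusion reported is opinion predicted as fact.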

📚 Documentation

Full documentation (Sphinx-based) is available in the docs/ directory.

To build locally:

cd docs
make html

🧠 Future Roadmap

Multi-label classification (mixed sentences)
Sarcasm detection
Benchmark dataset release
Integration with LLM evaluation tools
Command-line interface (CLI)

🤝 Contributing

Contributions are welcome!

Please read CONTRIBUTING.md before submitting a PR.

📄 License

This project is licensed under the MIT License.

⚠️ Disclaimer

This project is for research and educational purposes. It does not guarantee perfect semantic understanding and should not be used for critical decision-making systems without validation.

⭐ Acknowledgments

Inspired by the need for better semantic evaluation in modern LLM systems.

🔥 Tagline

“Not just what the model says—but what it means.”
