Red Chamber Dream Knowledge Platform

The Red Chamber Dream Knowledge Platform is a digital humanities research workflow project that uses Dream of the Red Chamber as its primary example. It is more than a website about the novel: it is a reusable digital humanities framework for transforming literary texts into searchable, analyzable, visualizable, and citable research data.

This repository provides a methodological overview, conceptual data architecture, research scripts, citation guidelines, and license information for digital humanities work on classical Chinese fiction.

Demo:

https://textoria.cclin.cc/red-chamber-dream

Project Overview

This project demonstrates an annotation-oriented corpus architecture for converting literary texts into inspectable, reproducible, and extensible research data.

The project emphasizes:

Digital Humanities: preserving literary texts as citable and reproducible research data.
NLP: using segmentation, named entity recognition, and rule-based reinforcement.
Annotation: combining automatic processing, authority tables, aliases, and human review.
Knowledge Graph: modeling characters, places, buildings, identities, motifs, and relationships.
Hybrid Workflow: allowing computational output and scholarly interpretation to coexist.
Static Publication: converting research data into static website-ready outputs.
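
The annotation idea above (authority tables plus aliases, with human review afterwards) can be illustrated with a minimal sketch. It is not the repository's implementation: the table layout, the function names `build_alias_index` and `annotate`, the `reviewed` flag, and the sample sentence are all assumptions made for illustration. The sketch resolves each surface form to a canonical person and prefers the longest match, so that a full name is not tagged as its shorter alias.

```python
# Hypothetical authority table: canonical name -> known aliases.
# The layout and entries are illustrative, not the project's actual data.
AUTHORITY = {
    "賈寶玉": ["寶玉", "寶二爺"],
    "林黛玉": ["黛玉", "林姑娘"],
}


def build_alias_index(authority):
    """Map every surface form (canonical name or alias) to its canonical name."""
    index = {}
    for canonical, aliases in authority.items():
        index[canonical] = canonical
        for alias in aliases:
            index[alias] = canonical
    return index


def annotate(text, alias_index):
    """Scan left to right, trying longer surface forms before shorter ones."""
    forms = sorted(alias_index, key=len, reverse=True)
    mentions = []
    pos = 0
    while pos < len(text):
        for form in forms:
            if text.startswith(form, pos):
                mentions.append({
                    "start": pos,
                    "end": pos + len(form),
                    "surface": form,
                    "canonical": alias_index[form],
                    "reviewed": False,  # left for a human reviewer to confirm
                })
                pos += len(form)
                break
        else:
            pos += 1  # no surface form starts here; move on one character
    return mentions


# Invented sample sentence (not a quotation from the novel).
mentions = annotate("寶玉見了林姑娘，賈寶玉笑了", build_alias_index(AUTHORITY))
```

Each mention keeps its character offsets and its original surface form, so a reviewer can trace a normalized entity back to the exact span in the text.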

Research Goals

This project addresses questions such as:

How can classical fiction be transformed from plain text into a structured corpus?
How can characters, places, buildings, identities, and motifs be annotated and reviewed?
How can automatic NER, authority tables, and rule-based reinforcement form an interpretable annotation pipeline?
How can character co-occurrence networks and curated semantic relationships coexist?
How can digital humanities data be rebuilt, cited, preserved, and presented?
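
One possible answer to the co-occurrence question is to keep computed and curated edges as separate layers on the same person pair. The sketch below assumes a hypothetical input shape (reading-unit IDs mapped to the persons annotated in them) and hypothetical relation labels; none of these names come from the repository's scripts.

```python
from collections import Counter
from itertools import combinations

# Hypothetical input: reading-unit ID -> canonical persons annotated in that unit.
units = {
    "ch003-p001": ["賈寶玉", "林黛玉"],
    "ch003-p002": ["賈寶玉", "林黛玉", "賈母"],
    "ch008-p001": ["賈寶玉", "薛寶釵"],
}

# Hypothetical curated relations entered by a researcher: (person, person, label).
curated = [
    ("賈寶玉", "林黛玉", "cousin"),
    ("賈母", "賈政", "parent"),
]


def cooccurrence_edges(units):
    """Count how many reading units each unordered pair of persons shares."""
    counts = Counter()
    for persons in units.values():
        for a, b in combinations(sorted(set(persons)), 2):
            counts[(a, b)] += 1
    return counts


def merge_layers(cooc, curated):
    """Store the computed weight and the curated labels side by side per pair."""
    edges = {}
    for pair, weight in cooc.items():
        edges[pair] = {"cooccurrence": weight, "relations": []}
    for a, b, relation in curated:
        pair = tuple(sorted((a, b)))
        edge = edges.setdefault(pair, {"cooccurrence": 0, "relations": []})
        edge["relations"].append(relation)
    return edges


edges = merge_layers(cooccurrence_edges(units), curated)
```

Because the two layers are stored under separate keys, a curated relation with no co-occurrence evidence (weight 0) and a frequent co-occurrence with no curated label both remain visible rather than overwriting each other.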

Repository Contents

This repository includes:

a digital humanities workflow overview
an overview of annotation-oriented corpus architecture
scripts related to corpus construction, annotation, motif analysis, network modeling, and relationship modeling
academic citation and Research Attribution guidelines
license terms (GPL-3.0-or-later for code; see License for documentation and data)

Researchers may adapt the scripts according to their own corpus, annotation rules, and research questions.

Conceptual Workflow

plain text
  -> corpus segmentation
  -> tokenization
  -> named entity recognition
  -> authority table normalization
  -> annotation review
  -> motif and thematic tagging
  -> social network / relationship modeling
  -> research database
  -> static digital humanities website

Getting Started

When adapting this workflow, first clarify the following research layers:

corpus segmentation: textual hierarchy and citable reading units
authority normalization: methods for resolving persons and their aliases to normalized entities
annotation workflow: NER, motifs, and human review
relationship modeling: semantic character relations, kinship, marriage, service relations, or other research-defined relations
presentation outputs: derived materials for search, statistics, network views, and display layers

For a public data architecture overview, see DATA_SCHEMA.md.
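
As an illustration of the first layer (textual hierarchy and citable reading units), here is a minimal sketch. It assumes that chapter headings start with 第…回 and that each non-empty line is one paragraph; the ID scheme (`ch001-p002`) and the field names are hypothetical and are not taken from DATA_SCHEMA.md.

```python
import re

# Assumption: a chapter heading starts with 第 + Chinese numerals + 回.
CHAPTER_HEADING = re.compile(r"^第[〇零一二三四五六七八九十百]+回")


def segment(plain_text):
    """Split plain text into chapter and paragraph units with stable, citable IDs."""
    units = []
    chapter_no = 0
    paragraph_no = 0
    for line in plain_text.splitlines():
        line = line.strip()
        if not line:
            continue  # blank lines only separate paragraphs
        if CHAPTER_HEADING.match(line):
            chapter_no += 1
            paragraph_no = 0
            units.append({
                "unit_id": f"ch{chapter_no:03d}",
                "level": "chapter",
                "text": line,
            })
        else:
            # Text before the first heading is filed under chapter 000.
            paragraph_no += 1
            units.append({
                "unit_id": f"ch{chapter_no:03d}-p{paragraph_no:03d}",
                "level": "paragraph",
                "text": line,
            })
    return units


# Placeholder text, not quoted from the novel.
SAMPLE = """第一回 (chapter title)
(paragraph A)

(paragraph B)
第二回 (chapter title)
(paragraph C)
"""

units = segment(SAMPLE)
unit_ids = [u["unit_id"] for u in units]
```

Every later layer (entity mentions, motifs, relations) can then point at a `unit_id`, which is what makes an annotation citable down to the paragraph.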

Run the Example Workflow

This repository includes a minimal example for testing the workflow scripts:

python3 build_tables.py
python3 build_ner_seed_tables.py
python3 build_person_authority.py
python3 build_ner_tables.py
python3 build_motif_tables.py
python3 build_person_relationships.py
python3 build_person_occurrence_summary.py
python3 build_person_social_network.py
python3 build_annotated_texts.py
python3 build_basic_annotation_browser.py

Outputs are written to:

public_output/

The example demonstrates a basic path from text segmentation through entity annotation, motif annotation, person relationships, and co-occurrence networks to simple HTML review pages.
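
For convenience, the ten commands above could be wrapped in a small runner. This is a sketch, not a script shipped with the repository: the names `build_commands` and `run_all` are hypothetical, and it assumes the scripts are meant to run in the listed order from the repository root.

```python
import subprocess
import sys

# Script names as listed above; running them in this order is an assumption.
SCRIPTS = (
    "build_tables.py",
    "build_ner_seed_tables.py",
    "build_person_authority.py",
    "build_ner_tables.py",
    "build_motif_tables.py",
    "build_person_relationships.py",
    "build_person_occurrence_summary.py",
    "build_person_social_network.py",
    "build_annotated_texts.py",
    "build_basic_annotation_browser.py",
)


def build_commands(scripts=SCRIPTS, python=sys.executable):
    """Return one command (as an argument list) per script, in order."""
    return [[python, name] for name in scripts]


def run_all(repo_dir="."):
    """Run every script in order; stop at the first one that fails."""
    for command in build_commands():
        print("running", command[1])
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(command, cwd=repo_dir, check=True)
```

Calling `run_all()` from the repository root would execute the whole chain and halt on the first failing step, so a later script never runs against incomplete tables.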

Citation

If you use, cite, adapt, or refer to this project's methodology, data workflow, annotation workflow, knowledge graph design, or display architecture in a paper, research project, teaching material, website, software, database, digital humanities platform, or derived system, please cite the source.

Recommended citation:

Chun-Cheng Lin. Red Chamber Dream Knowledge Platform: A Digital Humanities Workflow and Knowledge Graph Construction Demonstration for Classical Chinese Fiction. GitHub repository, 2026.
https://github.com/cclintw/red-chamber-dream

For full citation formats and methodological attribution guidelines, see CITATION.md.

Research Attribution

This project is a demonstration platform for a digital humanities research workflow. If you cite, adapt, or extend any of the following, please retain methodological attribution:

corpus construction from text to structured corpus layers
annotation workflows combining authority tables, aliases, and rule-based reinforcement
annotation methods for characters, places, buildings, identities, and motifs
conceptual design of genealogy views, co-occurrence networks, and knowledge graphs
annotation-oriented corpus architecture
hybrid workflow for classical fiction research

License

This project uses layered licensing to distinguish code, documentation, and research data:

Code: GPL-3.0-or-later.
Documentation: CC BY 4.0.
Data: CC BY-NC 4.0, unless a specific data file or source states otherwise.

Code licensing does not waive the need for scholarly attribution of the research methodology, documentation architecture, schema design, annotation workflow, and knowledge graph workflow. If you cite, adapt, migrate, or extend this project's research workflow and methodology, please retain appropriate attribution.

Documentation

DATA_SCHEMA.md: public data architecture overview.
CITATION.md: academic citation formats, Research Attribution, and methodological citation guidelines.
LICENSE: license terms.
