Solr reindex Jun 2019 #9

@cdrini

  • Step 1: Create a local postgres copy of the database
    • 1a: Create the postgres instance
    • 1b: Populate the postgres instance
  • Step 2: Populate solr
    • 2a: Setup
    • 2b: Insert works & orphaned editions
    • 2c: Insert authors
    • 2d: Insert subjects
  • Step 3: Final Sync
    • 3a: Run solrupdater

| Step | Time taken |
| --- | --- |
| Step 1: Create a local postgres copy of the database | 03:30 |
| -- 1a: Create the postgres instance | 00:01 |
| -- 1b: Populate the postgres instance | 04:27 |
| ---- Downloading the dump | 00:03 |
| ---- Counting Rows | 00:06 |
| ---- Sleeping between chunks | 00:18 |
| ---- Actual import | 02:05 |
| ---- Creating indices | 01:00 |
| Step 2: Populate solr | |
| -- 2a: Setup | 00:05 |
| -- 2b: Insert works & orphaned editions | |
| ---- Offset startup | 00:05 |
| ---- Works/orphans import (6 cores) | 26:45 |
| -- 2c: Insert authors | 11:30 |
| ---- Offset startup | 00:05 |
| ---- Authors import (6 cores) | 11:15 |
| -- 2d: Insert subjects (in parallel w 2b) | 03:34 |
| Step 3: Final Sync | |
| -- 3a: Run solrupdater | |
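
To cross-check a parent step against its sub-steps, the timings above can be totalled with a few lines of Python. This is only a sketch: it assumes the values are `hh:mm` (the issue does not state the unit), and the helper names `to_minutes` and `to_stamp` are made up for illustration.

```python
def to_minutes(stamp: str) -> int:
    """Convert an 'hh:mm' string to minutes (the hh:mm reading is an assumption)."""
    hours, minutes = stamp.split(":")
    return int(hours) * 60 + int(minutes)


def to_stamp(total_minutes: int) -> str:
    """Format a number of minutes back into 'hh:mm'."""
    return f"{total_minutes // 60:02d}:{total_minutes % 60:02d}"


# Sub-steps of 1b, copied from the timing table.
step_1b = {
    "Downloading the dump": "00:03",
    "Counting Rows": "00:06",
    "Sleeping between chunks": "00:18",
    "Actual import": "02:05",
    "Creating indices": "01:00",
}

total_1b = sum(to_minutes(t) for t in step_1b.values())
print(to_stamp(total_1b))  # 03:32
```

The same two helpers work for any other group of rows, so a subtotal can be compared against the figure recorded for its parent step.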

Numbers Validation (old solr dump from 8 May 2019)

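| Type | # in postgres | # in old solr | # in new solr | psql diff | solr diff |
| --- | --- | --- | --- | --- | --- |
| Works | 18081999 | | | | |
| Orphans | 3735145 | | | | |
| Authors | 6980217 | | | | |
| Subjects | 0 | | | | |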
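
Once the solr counts are known, the two diff columns can be filled in mechanically. The sketch below assumes "psql diff" means new solr minus postgres and "solr diff" means new solr minus old solr; the issue does not define either column, and `count_diff` and `validate_row` are hypothetical helper names. The example call uses toy values, not measurements from this reindex.

```python
from typing import Dict, Optional


def count_diff(new_count: Optional[int], reference: Optional[int]) -> Optional[int]:
    """Return new_count - reference, or None while either count is still unknown."""
    if new_count is None or reference is None:
        return None
    return new_count - reference


def validate_row(
    in_postgres: Optional[int],
    in_old_solr: Optional[int],
    in_new_solr: Optional[int],
) -> Dict[str, Optional[int]]:
    """Compute both diff columns for one row (column meanings are assumed)."""
    return {
        "psql diff": count_diff(in_new_solr, in_postgres),
        "solr diff": count_diff(in_new_solr, in_old_solr),
    }


# Toy values only, chosen to make the arithmetic easy to follow.
print(validate_row(in_postgres=100, in_old_solr=95, in_new_solr=98))
# {'psql diff': -2, 'solr diff': 3}
```

A row whose solr counts have not been collected yet yields `None` for both diffs rather than a misleading number.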
