Clone all the repositories in a working directory.
We're going to be running docker from this working directory,
so babel-local-dev has access to the other repositories.
First clone this repository:
git clone git@github.com:hathitrust/babel.git babelThen run:
cd babel
./setup.shThis will check out the other repositories along with their submodules.
There's a lot, because we're replicating running on the dev servers with
debug_local=1 enabled.
In your workdir:
docker compose --profile frontend build
docker compose --profile backend build
Build the CSS and JavaScript for firebird-common and pt:
#build firebird-common
docker compose run firebird /htapps/babel/firebird-common/bin/build.sh
# build pt/firebird
docker compose run page-turner /htapps/babel/pt/bin/build.shWithin the Babel project, we have the option to stand up the frontend, backend, or everything.
- To stand up the frontend and core services for development, use the following command.
docker compose --profile frontend up- To run the backend and core services for development, run the following command
docker compose --profile backend upIn your browser:
- catalog:
http://localhost:8080/Search/Home - catalog solr:
http://localhost:9033 - full-text solr:
http://localhost:8983
PageTurner & imgsrv:
http://localhost:8080/cgi/pt?id=test.pd_openhttp://localhost:8080/cgi/imgsrv/cover?id=test.pd_openhttp://localhost:8080/cgi/imgsrv/image?id=test.pd_open&seq=1http://localhost:8080/cgi/imgsrv/html?id=test.pd_open&seq=1http://localhost:8080/cgi/imgsrv/download/pdf?id=test.pd_open&seq=1&attachment=0
mysql is exposed at 127.0.0.1:3307. The default username & password with write
access is mdp-admin / mdp-admin (needless to say, do not use this image in
production!)
mysql -h 127.0.0.1 -p 3307 -u mdp-admin -pHuzzah!
- catalog runs + php
- babel cgi apps run under apache in a single container
- imgsrv plack/psgi process runs in its own container
- apache proxies to imgsrv & catalog
There are two sets of tests:
docker compose run perl-testThere is a helper script to set up the environment for unauthenticated and authenticated tests with playwright.
To run all playwright tests:
bin/run_playwright_tests.sh allIf a particular suite fails, the tests will stop; you can examine the traces with
docker compose up -d playwright_reportsthen going to http://localhost:9323.
You can run a particular set of tests with:
Tests with an unauthenticated user (most of them):
bin/run_playwright_tests.sh unauthedTests for print-disabled access:
bin/run_playwright_tests.sh ssd_userTests for resource sharing:
bin/run_playwright_tests.sh resource_sharing_userThe easiest way to do this (for internal developers) is to copy a ZIP and METS from production:
First set the HT_REPO_HOST environment variable to somewhere you can scp from:
export HT_REPO_HOST=somebody@whatever.hathitrust.orgThen:
./stage_item_scp.sh uc2.ark:/13960/t4mk66f1dThis will download the item via scp as well as its catalog metadata, stage it to the local repository, and index it in the local full-text index. You should then be able to view it via for example http://localhost:8080/cgi/pt?id=uc2.ark:/13960/t4mk66f1d
With access to the ht_text_pd dataset (this should work from the Library VPN
but is also available to other qualified
individuals,
it is also possible to stage only the full-text for the item using the HTRC
datasets. This
is potentially faster, since you only need to transfesr the full-text rather
than the images. This allows full-text search and access via e.g.
http://localhost/cgi/ssd?id=HTID, but PageTurner won't work (because the zip
doesn't contain the images.)
bash ./stage_item_rsync.sh uc2.ark:/13960/t4mk66f1d
reset_database.sh: If you need to reset or update the database or solr
schema, you will need to make sure the persistent volumes for them are removed
so that when you restart the containers they will get a fresh copy of the
schema. The reset_database.sh script will do this.
mysql_sdr.sh: This will connect to the ht database running inside the mysql
container.
You can simulate various authenticated scenarios by setting environment
variables in Apache. There is appropriate setup for a variety of scenarios and
user types in configuration files under apache-cgi/auth, and a helper script
switch_auth.sh to allow you to pick a particular scenario and configure the
local Apache server to use it.
There are several synthetic items in the sample data that can be tested with authenticated access:
test.ic_currently_held: in-copyright; holdings API indicates it is currently heldtest.ic_not_current: in-copyright; holdings API indicates it is withdrawntest.ic_not_held: in-copyright; holdings API indicates it is not held at all
By default, all access will be geolocated as from the US. See info about the test geoip databases for more information about how for testing non-US access.
- link to documentation for important tasks - e.g. running apps under debugging, updating css/js, etc
- easy mechanism to generate placeholder volumes in
imgsrv-sample-datathat correspond to the records in the catalog