data-ambassade/federatedsearch

NOTICE: THIS REPOSITORY CONTAINS THE DOCUMENTATION AND CODE OF THE PROTOTYPE FOR FEDERATED SEARCH. NO FURTHER DEVELOPMENT.

Federated search

Design and development of a prototype with a customizable user interface (frontend) and a federation service (backend API) for federated search, with the following characteristics:

  • Builds on existing federation services and search engines
  • Standards-based and model-driven: the exchange of search results between frontend and backend complies with a linked data model and is configurable
  • Metadata at the source: search results link directly to the source, so no copying or harvesting of metadata is required
  • AI ready: the backend service can serve as a knowledge graph for an LLM
  • Open source

The objective of this development is to deliver a prototype that helps people set up federated search based on open source software, with minimal dependency on developers or specific applications. It is easy to set up (Docker) and easy to deploy and adapt within an existing infrastructure. This prototype was developed as part of research by the Grenzeloos Datalandschap. See the report on this research (in Dutch): https://grenzeloosdatalandschap.nl/kennis/samen-versnellen-praktijkgids-federatieve-catalogus/

The development was split into the following phases:

  • lexical search: finds results based purely on the words entered
  • semantical & contextual search

Please see this table to understand the differences between lexical and semantical search.

[image: table comparing lexical and semantical search]
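To make the lexical phase concrete, here is a minimal sketch of word-based matching. It is not the prototype's implementation: the function name `lexical_match` and the record fields `title` and `description` are assumptions chosen for illustration.

```python
def lexical_match(query: str, records: list[dict]) -> list[dict]:
    """Return the records that contain every word of the query (case-insensitive)."""
    terms = query.lower().split()
    if not terms:
        return []
    hits = []
    for record in records:
        # 'title' and 'description' are hypothetical field names.
        text = f"{record.get('title', '')} {record.get('description', '')}".lower()
        words = set(text.split())
        if all(term in words for term in terms):
            hits.append(record)
    return hits


records = [
    {"title": "Air quality measurements", "description": "Hourly sensor data"},
    {"title": "Atmospheric pollution levels", "description": "Particulate matter per city"},
]

hits = lexical_match("air quality", records)
print([r["title"] for r in hits])
```

Only the first record matches, because the second one does not contain the literal words "air" and "quality", even though it covers a closely related topic. Closing that gap is what the semantical phase is about.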

Key functionality

The frontend is a WordPress plug-in consisting of a query/result screen and a detail screen ('More Info'). The detail screen shows the details of a specific search result and refers to the actual source of the metadata. If no reference is supplied, a detail page is shown with the metadata that was used for the federated search.

The federation backend service API (implemented in Node-RED) receives search requests from the frontend and returns results that conform to DCAT-AP. This API service routes the query to existing endpoints (REST, GraphQL, SPARQL). The results from these endpoints are translated into JSON-LD with a DCAT2 compliant structure.
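The translation step can be sketched as follows. This is an illustration in Python, not the Node-RED flow from this repository: the input field names (`title`, `summary`, `url`) and the function names are assumptions. The `dcat:` and `dct:` terms and their namespace IRIs are the standard DCAT and Dublin Core ones.

```python
import json

# Standard namespace IRIs for DCAT and Dublin Core terms.
CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
}


def to_dcat_dataset(record: dict) -> dict:
    """Map one raw endpoint record (hypothetical field names) to a dcat:Dataset node."""
    node = {"@type": "dcat:Dataset", "dct:title": record["title"]}
    if record.get("summary"):
        node["dct:description"] = record["summary"]
    if record.get("url"):
        # Link to the source instead of copying its metadata.
        node["dcat:landingPage"] = {"@id": record["url"]}
    return node


def to_dcat_catalog(records: list[dict]) -> dict:
    """Wrap the translated records in a dcat:Catalog with a JSON-LD context."""
    return {
        "@context": CONTEXT,
        "@type": "dcat:Catalog",
        "dcat:dataset": [to_dcat_dataset(r) for r in records],
    }


raw_results = [
    {
        "title": "Air quality measurements",
        "summary": "Hourly sensor data",
        "url": "https://example.org/datasets/air-quality",
    },
    {"title": "Road network"},  # no source reference supplied
]

catalog = to_dcat_catalog(raw_results)
print(json.dumps(catalog, indent=2))
```

The second record has no `dcat:landingPage`, which corresponds to the case where the frontend falls back to a detail page built from the metadata used in the search.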

federated search diagram

Two modes for federation

The frontend can call the federated search API of OpenCatalogi directly (marked with B). In that case the Node-RED backend is not used, and the results from the OpenCatalogi API are translated inside the frontend into a JSON-LD DCAT2 compliant structure. This option can be activated with a parameter in the WordPress plug-in of the Federated Search. In the mode marked with A, the frontend sends its query to the federation endpoint, and Node-RED serves as a router to the catalogs.
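The choice between the two modes can be sketched as a small routing function. The real frontend is a WordPress plug-in, so this Python version is only an illustration: the setting name `use_opencatalogi_directly`, both URLs, and the `q` query parameter are placeholders, not values taken from this repository.

```python
from dataclasses import dataclass
from urllib.parse import urlencode


@dataclass
class PluginSettings:
    # Hypothetical name for the plug-in parameter that switches modes.
    use_opencatalogi_directly: bool
    # Placeholder URLs; substitute the real endpoints of your deployment.
    federation_url: str = "https://federation.example.org/search"
    opencatalogi_url: str = "https://opencatalogi.example.org/api/search"


def build_request(settings: PluginSettings, query: str) -> tuple[str, bool]:
    """Return (request URL, whether the frontend must translate the response itself)."""
    params = urlencode({"q": query})  # 'q' is an assumed parameter name
    if settings.use_opencatalogi_directly:
        # Mode B: call OpenCatalogi directly; the frontend translates to JSON-LD.
        return f"{settings.opencatalogi_url}?{params}", True
    # Mode A: the federation endpoint routes the query and returns JSON-LD.
    return f"{settings.federation_url}?{params}", False


url_a, translate_a = build_request(PluginSettings(False), "air quality")
url_b, translate_b = build_request(PluginSettings(True), "air quality")
print(url_a, translate_a)
print(url_b, translate_b)
```

Returning the translation flag alongside the URL keeps the mode decision in one place, so the rest of the frontend only needs to check whether a response still has to be converted.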

Semantical & contextual search

The original prototype for searching with natural language used PromptQL as a chat frontend. It has been replaced by Tawk.io, a simpler and more flexible chatbot, which is integrated into the WordPress website as a plug-in. Contextual search is not implemented yet. It can be simulated by adding contextual information to the prompt of the chatbot.
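As an illustration of simulating context through the prompt, the sketch below prepends contextual information to a question before it is sent to a chatbot. The function name `build_prompt`, the prompt layout, and the context keys are assumptions. The sketch does not call any chatbot API.

```python
def build_prompt(question: str, context: dict[str, str]) -> str:
    """Prepend contextual information (hypothetical keys) to the user's question."""
    if not context:
        return question
    lines = [f"- {key}: {value}" for key, value in context.items()]
    return "Context:\n" + "\n".join(lines) + "\n\nQuestion: " + question


prompt = build_prompt(
    "Which datasets cover flooding?",
    {"role": "urban planner", "region": "Utrecht"},
)
print(prompt)
```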

Gartner on semantics

See below Gartner's perspective on semantics (source: Gartner, "What Data and Analytics Leaders Should Know Before Implementing a Data Catalog", Jason Medd, Data & Analytics Summit, 11-13 May 2025).

[image: Gartner's perspective on semantics]
