NCSTRL Documentation

Replacing the Dienst built-in search engine

The Dienst server contains a simple but adequate search engine for bibliographic data. The search engine's data takes the form of inverted indexes, which are created by build-inverted-index.pl and stored on disk in the directory Indexer/Indexes. At server startup, or on receipt of a USR2 signal, these indexes are read into memory as associative arrays. All database searches use these associative arrays.
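
As a rough illustration of the in-memory structure described above, the following Python sketch builds an inverted index as a dictionary that maps each word to the set of docids containing it. It is not the Dienst code (the server's own files are Perl scripts); the record layout, the names tokenize, build_inverted_index and lookup, and the sample docids are all assumptions made for the example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(records, field):
    """Map each word found in `field` to the set of docids containing it.

    `records` is a dict of docid -> dict of field name -> text
    (a hypothetical layout, not the on-disk Dienst index format).
    """
    index = defaultdict(set)
    for docid, fields in records.items():
        for word in tokenize(fields.get(field, "")):
            index[word].add(docid)
    return dict(index)


def lookup(index, word):
    """Return the sorted docids whose field contains `word`."""
    return sorted(index.get(word.lower(), set()))


# Hypothetical sample data, for illustration only.
SAMPLE_RECORDS = {
    "EXAMPLE/TR-1": {"title": "Distributed Digital Libraries"},
    "EXAMPLE/TR-2": {"title": "Indexing Digital Documents"},
}
TITLE_INDEX = build_inverted_index(SAMPLE_RECORDS, "title")
```

A search for a single title word is then one dictionary lookup, for example lookup(TITLE_INDEX, "digital"), which is why keeping the indexes in memory makes searches cheap once the server has started.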

The system assumes that bibliographic data is available in RFC 1807-formatted ASCII bibliography files. However, the only RFC 1807-specific code in the server is in the file Indexer/parse_bib_file.pl. It would not be difficult to use a different bibliography format, as long as that format provides the supported search fields (described below).
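
To make the format dependency concrete, here is a minimal Python sketch of a bibliography parser. It is not a port of Indexer/parse_bib_file.pl. It assumes the RFC 1807 line layout in which each field starts a line as TAG:: value, lines without a tag continue the previous field, and END:: closes a record. The function name parse_bib_records and the sample record are made up for the example.

```python
import re

# A field line looks like "TITLE:: some text" (tag, two colons, value).
FIELD_RE = re.compile(r"^([A-Z][A-Z0-9-]*)::\s*(.*)$")


def parse_bib_records(text):
    """Split RFC 1807-style text into records (dict of tag -> list of values)."""
    records = []
    current = {}
    last_tag = None
    for line in text.splitlines():
        match = FIELD_RE.match(line)
        if match:
            tag, value = match.group(1), match.group(2).strip()
            if tag == "END":
                # END:: closes the record that is being collected.
                if current:
                    records.append(current)
                current, last_tag = {}, None
                continue
            # Repeated tags (e.g. several AUTHOR lines) accumulate in a list.
            current.setdefault(tag, []).append(value)
            last_tag = tag
        elif last_tag is not None and line.strip():
            # Continuation line: append it to the most recent value.
            joined = current[last_tag][-1] + " " + line.strip()
            current[last_tag][-1] = joined.strip()
    return records


# Hypothetical record, for illustration only.
SAMPLE = """BIB-VERSION:: CS-TR-v2.1
ID:: EXAMPLE//TR-1
ENTRY:: January 1, 1996
TITLE:: Distributed Digital
  Libraries
AUTHOR:: Doe, Jane
AUTHOR:: Roe, Richard
ABSTRACT:: A made-up abstract.
END:: EXAMPLE//TR-1
"""
RECORDS = parse_bib_records(SAMPLE)
```

A parser for a different bibliography format would only need to produce the same kind of per-record field mapping, which is the sense in which the format-specific code is confined to one file.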

All communication between the server and the search engine flows through the subroutines in the file Indexer/indexer_interface.pl. The components of the interface are described below. Using a new search engine would require replacing the calls within these subroutines to the Dienst database engine with semantically equivalent calls to the new search engine.

Tagged_Search
Description: Perform a fielded search on the database. The input is an associative array whose keys are the fields to search on and whose values are the search criteria for the respective fields. The supported fields are:
Rules for bibliographic keyword matching: Words in the three bibliographic keyword fields (author, title, abstract) are matched to bibliographic entries according to the following rules:
Arguments:
Returns: a string that is a status message if errors occurred in the search.
Get_Bib_Data
Description: Get bibliographic data for a document.
Arguments:
Returns: a string that is a status message if errors occurred in the retrieval.
Get_All_IDs
Description: Get a list of all the docids in the database, sorted in lexicographic order.
Arguments:
Returns: a string that is a status message if errors occurred.
Get_All_Authors
Description: Get a list of all author names in the database. The order is undefined.
Arguments:
Returns: a string that is a status message if errors occurred.
Get_Indexer_Status
Description: Return an HTML document describing the status of the indexer (number of records, etc.). The contents and format of this status document are undefined; it is simply displayed.
Arguments:
Returns: none
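
The interface above could be backed by a different engine along the following lines. This Python sketch is only a model of the five entry points, not the code in Indexer/indexer_interface.pl. The class name InMemoryBackend, the record layout, the (status, data) return pairs with an empty status string meaning success, the restriction to the three keyword fields, and the whole-word matching rule are all assumptions, since the actual argument lists and matching rules are not reproduced here.

```python
import re

# Assumption: only the three keyword fields named in the text are searchable.
KEYWORD_FIELDS = ("author", "title", "abstract")


def _words(value):
    """Return the set of lowercase words in a string or list of strings."""
    text = " ".join(value) if isinstance(value, list) else value
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class InMemoryBackend:
    """Hypothetical replacement engine backed by a plain dict of records."""

    def __init__(self, records):
        # docid -> {"author": [names], "title": str, "abstract": str}
        self.records = records

    def tagged_search(self, query):
        """Fielded search: each field must contain all of its query words."""
        unknown = sorted(f for f in query if f not in KEYWORD_FIELDS)
        if unknown:
            return "unsupported field: " + ", ".join(unknown), []
        hits = [
            docid for docid, rec in self.records.items()
            if all(_words(terms) <= _words(rec.get(field, ""))
                   for field, terms in query.items())
        ]
        return "", sorted(hits)

    def get_bib_data(self, docid):
        """Return the bibliographic record for one document."""
        if docid not in self.records:
            return "no such document: " + docid, None
        return "", self.records[docid]

    def get_all_ids(self):
        """Return every docid in lexicographic order."""
        return "", sorted(self.records)

    def get_all_authors(self):
        """Return every author name; the order is not meaningful."""
        authors = {a for rec in self.records.values()
                   for a in rec.get("author", [])}
        return "", list(authors)

    def get_indexer_status(self):
        """Return a small HTML status page."""
        return ("<html><body><p>%d records indexed</p></body></html>"
                % len(self.records))


# Hypothetical sample data, for illustration only.
BACKEND = InMemoryBackend({
    "EXAMPLE/TR-2": {"author": ["Roe, Richard"],
                     "title": "Indexing Digital Documents", "abstract": ""},
    "EXAMPLE/TR-1": {"author": ["Doe, Jane"],
                     "title": "Distributed Digital Libraries", "abstract": ""},
})
```

With such a backend in place, a call like BACKEND.tagged_search({"title": "digital", "author": "doe"}) plays the role of Tagged_Search, and the server side would need no changes beyond the interface subroutines that forward to it.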

Any comments or questions?
Contact us at help@ncstrl.org.