clarin.sru.fcs.server.search

class clarin.sru.fcs.server.search.Layer(id: str, result_id: str, type: str, encoding: ContentEncoding, qualifier: str | None = None, alt_ValueInfo: str | None = None, alt_ValueInfo_url: str | None = None)[source]

Bases: object

This class is used to hold information about a layer that is made available by the endpoint.

class ContentEncoding(value)[source]

Bases: str, Enum

The content encoding policy for a Layer.

VALUE = 'value'

Value information is encoded as element content in this layer.

EMPTY = 'empty'

No additional value information is encoded for this layer.

id: str

The identifier of the layer

result_id: str

The unique URI that is used in the Advanced Data View to refer to this layer

type: str

The type identifier for the layer

encoding: ContentEncoding

The content encoding for this layer

qualifier: str | None = None

An optional layer qualifier to be used in FCS-QL to refer to this layer or None. Defaults to None.

alt_ValueInfo: str | None = None

Additional information about the layer, or None. Defaults to None.

alt_ValueInfo_url: str | None = None

An additional URI pointing to more information about the layer, or None. Defaults to None.
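As a rough illustration of how these fields fit together, the sketch below builds one layer description. To stay runnable without the package installed, it declares a local stand-in that mirrors the documented constructor signature; with the real library you would import Layer from clarin.sru.fcs.server.search instead. The identifier, URI, and type values are made up for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


@dataclass
class Layer:
    """Local stand-in mirroring the documented signature (not the library class)."""

    class ContentEncoding(str, Enum):
        VALUE = "value"
        EMPTY = "empty"

    id: str
    result_id: str
    type: str
    encoding: ContentEncoding
    qualifier: Optional[str] = None
    alt_ValueInfo: Optional[str] = None
    alt_ValueInfo_url: Optional[str] = None


# All values below are hypothetical and only show the shape of the data.
lemma_layer = Layer(
    id="lemma-1",
    result_id="http://example.org/fcs/layer/lemma-1",
    type="lemma",
    encoding=Layer.ContentEncoding.VALUE,
)

# Because ContentEncoding derives from str, members compare equal to their
# string values, which is convenient when reading them from configuration.
print(lemma_layer.encoding == "value")  # True
```

The optional fields are left at their None defaults here; a qualifier would only be needed if several layers of the same type have to be told apart in FCS-QL.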

class clarin.sru.fcs.server.search.DataView(identifier: str, mimetype: str, deliveryPolicy: DeliveryPolicy)[source]

Bases: object

This class is used to hold information about a data view that is implemented by the endpoint.

class DeliveryPolicy(value)[source]

Bases: str, Enum

Enumeration to indicate the delivery policy of a data view.

SEND_BY_DEFAULT = 'send-by-default'

The data view is sent automatically by the endpoint.

NEED_TO_REQUEST = 'need-to-request'

A client must explicitly request the data view from the endpoint.

identifier: str

A unique short identifier for the data view

mimetype: str

The MIME type of the data view

deliveryPolicy: DeliveryPolicy

The delivery policy for this data view
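The following sketch shows how the delivery policy might be used to pick out the data views an endpoint sends without being asked. It uses a local stand-in that mirrors the documented DataView signature rather than the library class. The MIME type given for the Generic Hits view is assumed to be the one from the CLARIN-FCS specification and should be checked against it; the second view is invented for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


@dataclass
class DataView:
    """Local stand-in mirroring the documented signature (not the library class)."""

    class DeliveryPolicy(str, Enum):
        SEND_BY_DEFAULT = "send-by-default"
        NEED_TO_REQUEST = "need-to-request"

    identifier: str
    mimetype: str
    deliveryPolicy: DeliveryPolicy


views: List[DataView] = [
    # Generic Hits view; MIME type assumed from the CLARIN-FCS specification
    DataView("hits", "application/x-clarin-fcs-hits+xml",
             DataView.DeliveryPolicy.SEND_BY_DEFAULT),
    # Hypothetical extra view that clients must ask for explicitly
    DataView("example", "application/x-example+xml",
             DataView.DeliveryPolicy.NEED_TO_REQUEST),
]

# Keep only the views that are delivered automatically.
default_ids = [
    view.identifier
    for view in views
    if view.deliveryPolicy is DataView.DeliveryPolicy.SEND_BY_DEFAULT
]
print(default_ids)  # ['hits']
```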

class clarin.sru.fcs.server.search.ResourceInfo(pid: str, title: Dict[str, str], description: Dict[str, str] | None, landing_page_uri: str | None, languages: List[str], available_DataViews: List[DataView], available_Layers: List[Layer] | None = None, sub_Resources: List[ResourceInfo] | None = None)[source]

Bases: object

This class implements a resource info record, which provides supplementary information about a resource that is available at the endpoint.

pid: str

The persistent identifier of the resource

title: Dict[str, str]

The title of the resource represented as a map with pairs of language code and title

description: Dict[str, str] | None

The description of the resource represented as a map with pairs of language code and description or None if not applicable

landing_page_uri: str | None

A URI to the landing page of the resource or None if not applicable

languages: List[str]

The languages represented within this resource, given as a list of ISO 639-3 three-letter language codes

available_DataViews: List[DataView]

The list of available data views for this resource

available_Layers: List[Layer] | None = None

The list of layers available for Advanced Search or None if not applicable

sub_Resources: List[ResourceInfo] | None = None

A list of resources subordinate to this resource or None if not applicable

get_title(language: str) str | None[source]

Get the title of the resource for a specific language code.

Parameters:

language – the language code (ISO 639-3 three-letter language code)

Returns:

Optional[str] – the title for the language code or None if no title for this language code exists

get_description(language: str) str | None[source]

Get the description of the resource for a specific language code.

Parameters:

language – the language code (ISO 639-3 three-letter language code)

Returns:

Optional[str] – the description for the language code or None if no description for this language code exists

has_available_Layers() bool[source]

Check if any layers are available for Advanced Search.

Returns:

bool – True if any layer for Advanced Search is available, False otherwise

has_sub_Resources() bool[source]

Determine, if this resource has sub-resources.

Returns:

bool – True if the resource has sub-resources, False otherwise
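To make the lookup methods concrete, here is a minimal sketch of how a resource record with language-keyed titles could behave. It is a simplified stand-in, not the library class: data views, layers, languages, and the landing page are omitted, description is given a default for brevity, and the method bodies are assumptions about the behavior described above. The PIDs and titles are invented.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ResourceInfo:
    """Simplified stand-in with only the fields this example needs."""

    pid: str
    title: Dict[str, str]
    description: Optional[Dict[str, str]] = None
    sub_Resources: Optional[List["ResourceInfo"]] = None

    def get_title(self, language: str) -> Optional[str]:
        # Assumed behavior: plain dictionary lookup, None when missing
        return self.title.get(language)

    def get_description(self, language: str) -> Optional[str]:
        # description itself may be None, so guard before the lookup
        if self.description is None:
            return None
        return self.description.get(language)

    def has_sub_Resources(self) -> bool:
        # None and an empty list both count as "no sub-resources"
        return bool(self.sub_Resources)


child = ResourceInfo(pid="hdl:example/child", title={"eng": "Child corpus"})
parent = ResourceInfo(
    pid="hdl:example/parent",
    title={"eng": "Parent corpus", "deu": "Elternkorpus"},
    sub_Resources=[child],
)

print(parent.get_title("deu"))      # Elternkorpus
print(parent.get_title("fra"))      # None
print(parent.has_sub_Resources())   # True
```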

class clarin.sru.fcs.server.search.EndpointDescription[source]

Bases: object

An interface for abstracting resource endpoint descriptions. This interface allows you to provide a version of an endpoint description tailored to your environment.

The implementation of this interface must be thread-safe.

VERSION_1 = 1

Constant for endpoint description version number for FCS 1.0

VERSION_2 = 2

Constant for endpoint description version number for FCS 2.0

PID_ROOT = 'root'

Constant for a (synthetic) persistent identifier identifying the top-most (= root) resources in the resource inventory.

abstract destroy() None[source]

Destroy the resource info inventory. Use this method for any cleanup the resource info inventory needs to perform upon termination, e.g. closing persistent database connections.

abstract get_version() int[source]

Get the version number of this endpoint description. Valid versions are 1 for FCS 1.0 and 2 for FCS 2.0.

Returns:

int – the version number for this endpoint description

abstract is_version(version: int) bool[source]

Check if this endpoint description is in a certain version.

Parameters:

version – the version to check for

Returns:

bool – True if the version number matches

abstract get_capabilities() List[str][source]

Get the list of capabilities supported by this endpoint. The list contains the appropriate URIs defined by the CLARIN-FCS specification to indicate support for certain capabilities. This list must always contain at least http://clarin.eu/fcs/capability/basic-search for the Basic Search capability.

The implementation of this method must be thread-safe.

Returns:

List[str] – the list of capabilities supported by this endpoint
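The requirement that the Basic Search capability is always present can be checked before a capability list is handed to an endpoint description. The helper below is a hypothetical sketch and not part of the library; it only uses the capability URI quoted above and follows the TypeError/ValueError convention that EndpointDescriptionBase documents for its constructor.

```python
from typing import List, Optional

# URI quoted in the documentation above
CAPABILITY_BASIC_SEARCH = "http://clarin.eu/fcs/capability/basic-search"


def check_capabilities(capabilities: Optional[List[str]]) -> List[str]:
    """Hypothetical helper: reject lists lacking the Basic Search capability."""
    if capabilities is None:
        raise TypeError("capabilities must not be None")
    if CAPABILITY_BASIC_SEARCH not in capabilities:
        raise ValueError("capabilities must include " + CAPABILITY_BASIC_SEARCH)
    # Return a copy so later changes by the caller do not leak in
    return list(capabilities)


checked = check_capabilities([CAPABILITY_BASIC_SEARCH])
print(len(checked))  # 1
```

Returning a copy is one simple way to keep the stored list stable when get_capabilities is called from several threads.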

abstract get_supported_DataViews() List[DataView][source]

Get the list of data views supported by this endpoint. This list must always contain an entry for the Generic Hits (HITS) data view.

The implementation of this method must be thread-safe.

Returns:

List[DataView] – the list of data views supported by this endpoint

abstract get_supported_Layers() List[Layer][source]

Get the list of layers that are supported in Advanced Search by this endpoint.

The implementation of this method must be thread-safe.

Returns:

List[Layer] – the list of layers supported in Advanced Search by this endpoint

abstract get_ResourceInfos(pid: str) List[ResourceInfo] | None[source]

Get a list of all resources sub-ordinate to a resource identified by a given persistent identifier.

The implementation of this method must be thread-safe.

Parameters:

pid – the persistent identifier of the superior resource

Returns:

List[ResourceInfo] – a list of all sub-ordinate ResourceInfo or None if not applicable

Raises:

SRUException – if an error occurred

class clarin.sru.fcs.server.search.EndpointDescriptionBase(version: int, capabilities: List[str], supported_DataViews: List[DataView], supported_Layers: List[Layer] | None)[source]

Bases: EndpointDescription

An abstract base class for implementing endpoint descriptions. It already implements the methods required for capabilities and supported data views.

Constructor.

Parameters:
  • version – version of this endpoint description

  • capabilities – a list of capabilities supported by this endpoint

  • supported_DataViews – a list of data views that are supported by this endpoint

  • supported_Layers – a list of layers that are supported by this endpoint

Raises:
  • TypeError – if arguments are invalid (None)

  • ValueError – if argument values are not allowed

get_version() int[source]

Get the version number of this endpoint description. Valid versions are 1 for FCS 1.0 and 2 for FCS 2.0.

Returns:

int – the version number for this endpoint description

is_version(version: int) bool[source]

Check if this endpoint description is in a certain version.

Parameters:

version – the version to check for

Returns:

bool – True if the version number matches

get_capabilities() List[str][source]

Get the list of capabilities supported by this endpoint. The list contains the appropriate URIs defined by the CLARIN-FCS specification to indicate support for certain capabilities. This list must always contain at least http://clarin.eu/fcs/capability/basic-search for the Basic Search capability.

The implementation of this method must be thread-safe.

Returns:

List[str] – the list of capabilities supported by this endpoint

get_supported_DataViews() List[DataView][source]

Get the list of data views supported by this endpoint. This list must always contain an entry for the Generic Hits (HITS) data view.

The implementation of this method must be thread-safe.

Returns:

List[DataView] – the list of data views supported by this endpoint

get_supported_Layers() List[Layer][source]

Get the list of layers that are supported in Advanced Search by this endpoint.

The implementation of this method must be thread-safe.

Returns:

List[Layer] – the list of layers supported in Advanced Search by this endpoint

class clarin.sru.fcs.server.search.SimpleEndpointDescription(version: int, capabilities: List[str], supported_DataViews: List[DataView], supported_Layers: List[Layer], resources: List[ResourceInfo], pid_case_sensitive: bool)[source]

Bases: EndpointDescriptionBase

A very simple implementation of an endpoint description that is initialized from static information supplied at construction time. Mostly used together with SimpleEndpointDescriptionParser, but it is agnostic to how the static list of resource info records is generated.

Constructor.

Parameters:
  • version – version of this endpoint description

  • capabilities – a list of capabilities supported by this endpoint

  • supported_DataViews – a list of data views that are supported by this endpoint

  • supported_Layers – a list of layers supported for Advanced Search by this endpoint or None

  • resources – a static list of resource info records

  • pid_case_sensitive – True if persistent identifiers should be compared case-sensitively, False otherwise

Raises:

TypeError – if resources are None

destroy() None[source]

Destroy the resource info inventory. Use this method for any cleanup the resource info inventory needs to perform upon termination, e.g. closing persistent database connections.

get_ResourceInfos(pid: str) List[ResourceInfo] | None[source]

Get a list of all resources sub-ordinate to a resource identified by a given persistent identifier.

The implementation of this method must be thread-safe.

Parameters:

pid – the persistent identifier of the superior resource

Returns:

List[ResourceInfo] – a list of all sub-ordinate ResourceInfo or None if not applicable

Raises:

SRUException – if an error occurred
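One plausible way to implement this lookup over a static resource tree is a depth-first search, with the synthetic root PID mapping to the top-level list. The sketch below is an assumption about the general approach, not the library's code: Node is a hypothetical stand-in for ResourceInfo, and the case_sensitive parameter is added here to mirror the pid_case_sensitive constructor flag.

```python
from dataclasses import dataclass
from typing import List, Optional

PID_ROOT = "root"  # same value as EndpointDescription.PID_ROOT


@dataclass
class Node:
    """Hypothetical stand-in for ResourceInfo with only the needed fields."""

    pid: str
    sub_Resources: Optional[List["Node"]] = None


def find_recursive(items: Optional[List[Node]], pid: str,
                   case_sensitive: bool = True) -> Optional[Node]:
    """Depth-first search for the node whose pid matches."""
    for item in items or []:
        if case_sensitive:
            same = item.pid == pid
        else:
            same = item.pid.lower() == pid.lower()
        if same:
            return item
        found = find_recursive(item.sub_Resources, pid, case_sensitive)
        if found is not None:
            return found
    return None


def get_resource_infos(resources: List[Node], pid: str,
                       case_sensitive: bool = True) -> Optional[List[Node]]:
    """Return the children of the resource with the given pid."""
    if pid == PID_ROOT:
        return resources  # the root PID addresses the top-level resources
    match = find_recursive(resources, pid, case_sensitive)
    return match.sub_Resources if match is not None else None


tree = [Node("A", [Node("B", [Node("C")])])]
print([n.pid for n in get_resource_infos(tree, "A")])  # ['B']
```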

find_recursive(items: List[ResourceInfo] | None, pid: str) ResourceInfo | None[source]

class clarin.sru.fcs.server.search.SimpleEndpointSearchEngineBase[source]

Bases: SRUAuthenticationInfoProviderFactory, SRUSearchEngine

A base class for implementing a simple search engine to be used as a CLARIN-FCS endpoint.

init(config: SRUServerConfig, query_parser_registry_builder: Builder, params: Dict[str, str]) None[source]

This method should not be overridden. Perform your custom initialization in the do_init method instead.

See also

do_init, SRUSearchEngine.init

destroy() None[source]

This method should not be overridden. Perform your custom cleanup in the do_destroy method instead.

See also

do_destroy, SRUSearchEngine.destroy

create_SRUAuthenticationInfoProvider(params: Dict[str, str]) SRUAuthenticationInfoProvider | None[source]

Create an authentication info provider.

explain(config: SRUServerConfig, request: SRURequest, diagnostics: SRUDiagnosticList) SRUExplainResult | None[source]

Handle an explain operation. Implementing this method is optional, but it is required if the writeExtraResponseData block of the SRU response needs to be filled. The arguments for this operation are provided by the SRURequest object.

The implementation of this method must be thread-safe.

Parameters:
  • config – the SRUServerConfig object that contains the endpoint configuration

  • request – the SRURequest object that contains the request made to the endpoint

  • diagnostics – the SRUDiagnosticList object for storing non-fatal diagnostics

Returns:

SRUExplainResult – an SRUExplainResult object, or None if the search engine does not want to provide write_extra_response_data

Raises:

SRUException – if a fatal error occurred

scan(config: SRUServerConfig, request: SRURequest, diagnostics: SRUDiagnosticList) SRUScanResultSet | None[source]

Handle a scan operation. This implementation provides support for CLARIN-FCS resource enumeration. If you want to provide custom scan behavior for a different index, override the do_scan method.

abstract create_EndpointDescription(config: SRUServerConfig, query_parser_registry_builder: Builder, params: Dict[str, str]) EndpointDescription[source]

abstract do_init(config: SRUServerConfig, query_parser_registry_builder: Builder, params: Dict[str, str]) None[source]

Initialize the search engine. This initialization should be tailored to your environment and needs.

Parameters:
  • config – the SRUServerConfig object for this search engine

  • query_parser_registry_builder – the SRUQueryParserRegistry.Builder object to be used for this search engine. Use it to register additional query parsers with the SRUServer

  • params – additional parameters from the server configuration

Raises:

SRUConfigException – if an error occurred

do_destroy() None[source]

Destroy the search engine. Override this method for any cleanup the search engine needs to perform upon termination.
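The split between init/destroy and do_init/do_destroy follows the template method pattern: the base class owns the fixed steps and calls hooks that subclasses fill in. The sketch below illustrates only that pattern. EngineBase is a hypothetical, heavily reduced stand-in for SimpleEndpointSearchEngineBase; the real methods also take config and query_parser_registry_builder arguments, and the parameter name "index" is invented.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class EngineBase(ABC):
    """Hypothetical, much simplified stand-in for the documented base class."""

    def init(self, params: Dict[str, str]) -> None:
        # Fixed framework steps would run here, followed by the subclass hook
        self.do_init(params)

    def destroy(self) -> None:
        # Delegate cleanup to the overridable hook
        self.do_destroy()

    @abstractmethod
    def do_init(self, params: Dict[str, str]) -> None:
        """Subclasses put their custom initialization here."""

    def do_destroy(self) -> None:
        # Default hook: nothing to clean up
        pass


class MyEngine(EngineBase):
    def __init__(self) -> None:
        self.events: List[str] = []

    def do_init(self, params: Dict[str, str]) -> None:
        self.events.append("opened " + params.get("index", "default"))

    def do_destroy(self) -> None:
        self.events.append("closed")


engine = MyEngine()
engine.init({"index": "corpus-a"})
engine.destroy()
print(engine.events)  # ['opened corpus-a', 'closed']
```

Keeping init and destroy untouched means the framework's own setup and teardown always run, whatever the subclass does in its hooks.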

do_scan(config: SRUServerConfig, request: SRURequest, diagnostics: SRUDiagnosticList) SRUScanResultSet[source]

[Deprecated] Handle a scan operation. The default implementation is a no-op. Override this method if you want to provide custom behavior.

Parameters:
  • config – the SRUServerConfig object that contains the endpoint configuration

  • request – the SRURequest object that contains the request made to the endpoint

  • diagnostics – the SRUDiagnosticList object for storing non-fatal diagnostics

Returns:

SRUScanResultSet – an SRUScanResultSet object, or None if this operation is not supported by this search engine

Raises:

SRUException – if a fatal error occurred