Metadata schema

This annex provides an overview of the metadata schema used for the European Language Grid, called ELG-SHARE or, in short, the ELG schema. We describe the basic concepts, provide links to the full schema documentation, and finally present the minimal version of the schema, which consists only of the required and recommended elements 1.

Basic concepts

The following figure shows the main notions upon which the ELG schema builds, using the tool/service as the illustrative case.

ELG schema basic concepts (use case: tool/service)

The main concepts include:

  • MetadataRecord: It corresponds to the catalogue item, and records information concerning the registration process, such as who created the item and when, whether it was harvested from another catalogue, who is responsible for its curation (updates), etc.

  • DescribedEntity: It corresponds to any entity that can be described by a metadata record. It can be a Language Resource, a Person, an Organization, etc. (cf. Types of catalogue items and the green box in the above image). The LanguageResource class is further distinguished into one of four resource types: ToolService, Corpus, LexicalConceptualResource and LanguageDescription 2. A Language Resource is described through a set of metadata elements common to all types, plus a further set specific to each of these four types.

  • Distribution: It corresponds to the physical form in which a Language Resource is made available through the catalogue, e.g. as a downloadable file or as a form accessed via an interface.
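
To make the relationships between these concepts more concrete, the following is a minimal Python sketch, not the actual schema. Only the class names MetadataRecord, DescribedEntity, LanguageResource, Person, Distribution and the four resource type names come from the description above. All field names (name, form, creator, curator, harvested_from, distributions) are hypothetical, and attaching the distributions directly to the Language Resource is a simplifying assumption; the XSD remains the authoritative definition.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class ResourceType(Enum):
    """The four resource types a LanguageResource can take."""
    TOOL_SERVICE = "ToolService"
    CORPUS = "Corpus"
    LEXICAL_CONCEPTUAL_RESOURCE = "LexicalConceptualResource"
    LANGUAGE_DESCRIPTION = "LanguageDescription"


@dataclass
class Distribution:
    """Physical form in which a resource is made available."""
    form: str  # hypothetical field, e.g. "downloadable file"


@dataclass
class DescribedEntity:
    """Anything that a metadata record can describe."""
    name: str  # hypothetical field


@dataclass
class Person(DescribedEntity):
    """One of the non-resource entity kinds."""


@dataclass
class LanguageResource(DescribedEntity):
    """Common elements plus a type-specific part (reduced here to the type)."""
    resource_type: ResourceType
    # Assumption: distributions hang directly off the resource.
    distributions: List[Distribution] = field(default_factory=list)


@dataclass
class MetadataRecord:
    """The catalogue item: registration details plus the described entity."""
    described_entity: DescribedEntity
    creator: str                          # hypothetical: who created the record
    curator: str                          # hypothetical: who maintains it
    harvested_from: Optional[str] = None  # hypothetical: source catalogue, if any


# Illustrative record for a tool/service, mirroring the use case in the figure.
record = MetadataRecord(
    described_entity=LanguageResource(
        name="Example tokenizer",
        resource_type=ResourceType.TOOL_SERVICE,
        distributions=[Distribution(form="web service")],
    ),
    creator="alice",
    curator="bob",
)
```

The sketch keeps the registration information (MetadataRecord) separate from the thing being described (DescribedEntity), which is why the same record structure can wrap a Person or Organization as well as a Language Resource.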

Full schema documentation

You can find the full schema XSD, its documentation, and templates and examples of metadata records for all resource types in the ELG-SHARE schema Git repository.

You can browse the full schema documentation here:

1

To register a metadata record at the ELG platform, the recommended elements do not have to be filled in. However, they increase the visibility and usability of the item, and providers are encouraged to fill them in. The ELG interactive editor contains both the mandatory and recommended elements. The full schema is currently supported through the upload of metadata records.

2

In the ELG catalogue and editor, the term Language description is substituted with its subclasses, namely Model, Grammar and Uncategorized language description.