Minimal elements for all language resources and technologies

This page describes the minimal metadata elements common to all language resources and technologies.


resourceName

Path MetadataRecord.DescribedEntity.LanguageResource.resourceName

Data type multilingual string

Optionality Mandatory

Explanation & Instructions

Introduces a human-readable name or title by which the resource is known

This is the “brand name” of your resource; try to use a name that is unique.

Example

<ms:resourceName xml:lang="en">GATE: English Named Entity Recognizer</ms:resourceName>

resourceShortName

Path MetadataRecord.DescribedEntity.LanguageResource.resourceShortName

Data type multilingual string

Optionality Recommended

Explanation & Instructions

Introduces a short form (e.g., abbreviation, acronym, etc.) used to refer to a language resource

Example

<ms:resourceShortName xml:lang="en">annie-named-entity-recognizer</ms:resourceShortName>

description

Path MetadataRecord.DescribedEntity.LanguageResource.description

Data type multilingual string

Optionality Mandatory

Explanation & Instructions

Introduces a short free-text account that provides information about the resource (e.g., service function, contents of a data resource, technical information, etc.)

Example

<ms:description xml:lang="en">Identifies names of persons, locations, organizations, as well as money amounts, time and date expressions in English texts automatically. </ms:description>

LRIdentifier

Path MetadataRecord.DescribedEntity.LanguageResource.LRIdentifier

Data type string with attribute

Optionality Recommended when applicable

Explanation & Instructions

A string (e.g., PID, DOI, internal to an organization, etc.) used to uniquely identify a language resource

You must also use the attribute LRIdentifierScheme to specify the identifier scheme (e.g., DOI, Handle, …)

If the resource is already described in another resource and has a PID, please add it with the appropriate attribute.

Example

<ms:LRIdentifier ms:LRIdentifierScheme="http://w3id.org/meta-share/meta-share/elg">ELG id automatically assigned</ms:LRIdentifier>


version

Path MetadataRecord.DescribedEntity.LanguageResource.version

Data type string

Optionality Mandatory

Explanation & Instructions

Associates a language resource with a pattern that indicates its version; the recommended way is to follow the semantic versioning guidelines (http://semver.org) and use a numeric pattern of the form major_version.minor_version.patch

If no version is provided, the system will automatically assign the resource a ‘v1.0.0 (automatically assigned)’ value

Example

<ms:version>v8.6</ms:version>
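
The recommended major_version.minor_version.patch pattern can be checked before a record is submitted. The following is a minimal Python sketch, not part of the schema or of any official tooling: the function name is hypothetical, the optional leading "v" is an assumption based on the automatically assigned 'v1.0.0' value, and pre-release or build suffixes allowed by semantic versioning are deliberately not covered.

```python
import re

# Assumption: an optional leading "v" is tolerated, mirroring the
# automatically assigned 'v1.0.0' value; semver itself defines no prefix.
_VERSION_PATTERN = re.compile(r"v?(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)")


def follows_recommended_pattern(version: str) -> bool:
    """Return True if the string looks like major.minor.patch."""
    return _VERSION_PATTERN.fullmatch(version) is not None


if __name__ == "__main__":
    # Print each candidate with the result of the check.
    for candidate in ("1.0.0", "v1.0.0", "v8.6"):
        print(candidate, follows_recommended_pattern(candidate))
```

Since the data type is a plain string, a two-part value such as v8.6 is presumably still accepted; the check only flags that it departs from the recommended three-part pattern.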

additionalInfo

Path MetadataRecord.DescribedEntity.LanguageResource.additionalInfo

Data type component

Optionality Mandatory

Explanation & Instructions

Introduces a point that can be used for further information (e.g. a landing page with a more detailed description of the resource or a general email that can be contacted for further queries)

It’s a recommended practice to give at least a landing page (landingPage) or a general email address (email); if you want, you can also specify a contact person (see full schema for contactPerson)

Example

<ms:additionalInfo>
            <ms:landingPage>https://provider.example.com/product</ms:landingPage>
</ms:additionalInfo>

<ms:additionalInfo>
            <ms:email>product@example.com</ms:email>
</ms:additionalInfo>

keyword

Path MetadataRecord.DescribedEntity.LanguageResource.keyword

Data type multilingual string

Optionality Mandatory

Explanation & Instructions

Introduces a word or phrase considered important for the description of a language resource, person or organization and thus used to index or classify it

You can repeat the element if you want to add more keywords. Keywords are used for discovery purposes, so try to use words or phrases that you think users will search for when looking for resources similar to yours.

Example

<ms:keyword xml:lang="en">Named entity recognition</ms:keyword>
<ms:keyword xml:lang="en">person</ms:keyword>
<ms:keyword xml:lang="en">location</ms:keyword>
<ms:keyword xml:lang="en">fake news</ms:keyword>
<ms:keyword xml:lang="en">tweets</ms:keyword>
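
When keywords come from a list, the repeated elements can be generated rather than typed by hand. The sketch below uses Python's standard xml.etree.ElementTree; the helper name is hypothetical, and the namespace URI bound to the ms prefix is an assumption that should be checked against the schema you validate with.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI for the "ms" prefix; verify it against the schema.
MS_NS = "http://w3id.org/meta-share/meta-share/"
# The xml: prefix is predefined, so xml:lang needs no extra declaration.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

ET.register_namespace("ms", MS_NS)


def keyword_elements(keywords, lang="en"):
    """Build one ms:keyword element per keyword, each carrying xml:lang."""
    elements = []
    for text in keywords:
        element = ET.Element(f"{{{MS_NS}}}keyword", {XML_LANG: lang})
        element.text = text
        elements.append(element)
    return elements


if __name__ == "__main__":
    for element in keyword_elements(["Named entity recognition", "tweets"]):
        print(ET.tostring(element, encoding="unicode"))
```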

domain

Path MetadataRecord.DescribedEntity.LanguageResource.domain

Data type component

Optionality Recommended

Explanation & Instructions

Identifies the domain according to which a resource is classified

You must fill in the categoryLabel element with a free-text value. If you prefer to add a value from an established controlled vocabulary, you can also use the DomainIdentifier element (with the attribute DomainClassificationScheme set to the appropriate value).

Example

<ms:domain>
        <ms:categoryLabel xml:lang="en">EDUCATION &amp; COMMUNICATIONS</ms:categoryLabel>
        <ms:DomainIdentifier ms:DomainClassificationScheme="http://w3id.org/meta-share/meta-share/EUROVOC">32</ms:DomainIdentifier>
</ms:domain>

<ms:domain>
        <ms:categoryLabel xml:lang="en">health</ms:categoryLabel>
</ms:domain>
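
The rule that a label is always required, while the identifier and its scheme are optional but belong together, can be sketched in Python as follows. This is an illustration only: the helper name is hypothetical, the namespace URI for the ms prefix is an assumption, and the pairing check reflects the instructions above rather than any official validator.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI for the "ms" prefix; verify it against the schema.
MS_NS = "http://w3id.org/meta-share/meta-share/"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

ET.register_namespace("ms", MS_NS)


def domain_element(label, lang="en", identifier=None, scheme=None):
    """Build ms:domain with a required label and an optional identifier."""
    domain = ET.Element(f"{{{MS_NS}}}domain")
    category = ET.SubElement(domain, f"{{{MS_NS}}}categoryLabel", {XML_LANG: lang})
    category.text = label
    if identifier is not None:
        # An identifier is only meaningful together with its scheme.
        if scheme is None:
            raise ValueError("DomainIdentifier needs a DomainClassificationScheme")
        scheme_attr = f"{{{MS_NS}}}DomainClassificationScheme"
        domain_id = ET.SubElement(domain, f"{{{MS_NS}}}DomainIdentifier", {scheme_attr: scheme})
        domain_id.text = identifier
    return domain


if __name__ == "__main__":
    element = domain_element(
        "EDUCATION & COMMUNICATIONS",
        identifier="32",
        scheme="http://w3id.org/meta-share/meta-share/EUROVOC",
    )
    # The serializer escapes "&" as "&amp;" automatically.
    print(ET.tostring(element, encoding="unicode"))
```

Building the element through a library also avoids hand-escaping characters such as the ampersand in the label.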

resourceProvider

Path MetadataRecord.DescribedEntity.LanguageResource.resourceProvider

Data type component

Optionality Recommended

Explanation & Instructions

The person/organization responsible for providing, curating, maintaining and making available (publishing) the resource

The resource provider is very similar to the publisher of scientific articles; it can be an individual or an organization.

For organizations, you must add the name of the organization (organizationName) and, if possible, the website.

For persons, you must add the given name and surname and, if possible, an email address or an identifier (such as ORCID id) to help uniquely identify them.

Example

<ms:resourceProvider>
            <ms:Organization>
                    <ms:actorType>Organization</ms:actorType>
                    <ms:organizationName xml:lang="en">Organization</ms:organizationName>
                    <ms:website>https://provider.org/</ms:website>
            </ms:Organization>
</ms:resourceProvider>

<ms:resourceProvider>
            <ms:Person>
                    <ms:actorType>Person</ms:actorType>
                    <ms:surname xml:lang="en">Smith</ms:surname>
                    <ms:givenName xml:lang="en">John</ms:givenName>
            </ms:Person>
</ms:resourceProvider>

publicationDate

Path MetadataRecord.DescribedEntity.LanguageResource.publicationDate

Data type date

Optionality Recommended

Explanation & Instructions

Specifies the date when a language resource has been made available to the public

Publication date is important for citation purposes, just as for scientific articles. If this is the first time your resource is published, please use the same date as for metadataCreationDate. If the resource has been previously published in another repository, please add the date it was first provided there.

Example

<ms:publicationDate>2015-12-17</ms:publicationDate>
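
A date value can be checked before it is written into a record. The sketch below assumes the plain YYYY-MM-DD form used in the example above (the schema's date type may permit more, such as a time zone suffix); the function name is hypothetical.

```python
import re
from datetime import date

# Assumption: only the zero-padded YYYY-MM-DD form from the example is accepted.
_DATE_PATTERN = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})")


def parse_publication_date(value: str) -> date:
    """Parse a YYYY-MM-DD string, raising ValueError if it is not a real date."""
    match = _DATE_PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(f"not in YYYY-MM-DD form: {value!r}")
    year, month, day = (int(part) for part in match.groups())
    # date() itself rejects impossible dates such as 2015-02-30.
    return date(year, month, day)


if __name__ == "__main__":
    print(parse_publication_date("2015-12-17").isoformat())
```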

resourceCreator

Path MetadataRecord.DescribedEntity.LanguageResource.resourceCreator

Data type component

Optionality Recommended

Explanation & Instructions

Links a resource to the person, group or organisation that has created the resource

The element is important for citation and acknowledgement purposes.

For organizations, you must add the name of the organization (organizationName) and, if possible, the website.

For persons, you must add the given name and surname and, if possible, an email address or an identifier (such as ORCID id) to help uniquely identify them.

Example

<ms:resourceCreator>
            <ms:Organization>
                    <ms:actorType>Organization</ms:actorType>
                    <ms:organizationName xml:lang="en">example organization</ms:organizationName>
                    <ms:website>https://provider.org/</ms:website>
            </ms:Organization>
</ms:resourceCreator>

<ms:resourceCreator>
            <ms:Person>
                    <ms:actorType>Person</ms:actorType>
                    <ms:surname xml:lang="en">Smith</ms:surname>
                    <ms:givenName xml:lang="en">John</ms:givenName>
            </ms:Person>
</ms:resourceCreator>

fundingProject

Path MetadataRecord.DescribedEntity.LanguageResource.fundingProject

Data type component

Optionality Recommended when applicable

Explanation & Instructions

Links a language resource to the project that has funded its creation, enrichment, extension, etc.

Funding information is important for acknowledgement purposes.

For projects, you must provide the name of the project (projectName) and, if possible, a website (website) and/or an identifier (ProjectIdentifier).

Example

<ms:fundingProject>
            <ms:projectName xml:lang="en">European Language Resource Coordination LOT3</ms:projectName>
            <ms:ProjectIdentifier ms:ProjectIdentifierScheme="http://w3id.org/meta-share/meta-share/other">SMART 2015/1091 - 30-CE-0816766/00-92</ms:ProjectIdentifier>
            <ms:website>http://www.lr-coordination.eu</ms:website>
</ms:fundingProject>

intendedApplication

Path MetadataRecord.DescribedEntity.LanguageResource.

Data type component

Optionality Recommended

Explanation & Instructions

Specifies an LT application for which the language resource has been created or for which it can be used or is recommended to be used

The element is important for discovery purposes.

You can use the element LTClassRecommended with one of the recommended values from the LT taxonomy (class ‘Function’ of the OMTD-SHARE ontology at http://w3id.org/meta-share/omtd-share/), or add a free text at the LTClassOther element.

You can repeat the element if the resource can be used for various applications. For instance, a part-of-speech tagger can be used as a component for Named entity recognition, for sentiment analysis, etc.

Example

<ms:intendedApplication>
            <ms:LTClassRecommended>http://w3id.org/meta-share/omtd-share/NamedEntityRecognition</ms:LTClassRecommended>
</ms:intendedApplication>

<ms:intendedApplication>
            <ms:LTClassRecommended>http://w3id.org/meta-share/omtd-share/SentimentAnalysis</ms:LTClassRecommended>
</ms:intendedApplication>

<ms:intendedApplication>
            <ms:LTClassOther>face recognition</ms:LTClassOther>
</ms:intendedApplication>

isDocumentedBy

Path MetadataRecord.DescribedEntity.LanguageResource.

Data type component

Optionality Recommended

Explanation & Instructions

Links a language resource to a document (e.g., research paper describing its contents or its use in a project, user manual, etc.) or any other form of documentation (e.g., a URL with support information) that is related to the resource

You can use this element to add

  • supporting documentation (user manuals, training material, etc.) for the installation and use of your resource

  • scientific publications that describe the resource.

If you want, you can use one of the more fine-grained relations to documents (see full schema).

You can repeat the element if you want to add more documents.

You must fill in the title element with the title of the document (or even an entire bibliographic record). When available, it’s also recommended to add the DocumentIdentifier with the DOI of the document, or any other link to the document; if you do, use the attribute DocumentIdentifierScheme to indicate the identifier type.

Example

<ms:isDocumentedBy>
        <ms:title xml:lang="en">Product User Manual</ms:title>
        <ms:DocumentIdentifier ms:DocumentIdentifierScheme="http://purl.org/spar/datacite/url">https://www.company.org/product.pdf</ms:DocumentIdentifier>
</ms:isDocumentedBy>