Provide Language Resources and Technologies¶
Minimal requirements¶
To share a Language Resource or Language Technology tool/service through the ELG platform, you must
- register on the platform and ask (by email to contact@european-language-grid.eu) to be granted “provider” permissions;
- describe your LRT according to the ELG metadata schema (at least the minimal version) and upload this description to the platform, and
- provide access to it, as described in the respective sections for Language Technologies (Provide a functional LT service) and Language Resources (Provide a Language Resource).
Minimal version - List of elements common to all LRTs¶
The following section lists the metadata elements of the minimal version that are common to all LRTs, presented in the order required by the schema XSD. Dedicated sections present the additional elements that are specific to each resource type.
For a quick guide to the ELG template, see Template - Explanations.
Note
For this release, you MUST provide an ELG-compliant XML file. Upcoming releases will also include a metadata editor and other functionalities supporting an easy import of metadata records.
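Before uploading, a quick local check can catch the most common problems in a hand-written XML file. The following is a minimal sketch, not a replacement for validation against the schema XSD: it only tests that the file is well-formed and that the elements marked (M) in this section appear somewhere in the record. The namespace URI bound to the `ms` prefix is not stated in this section, so the value used here is an assumption, and `missing_mandatory` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI bound to the "ms" prefix; verify it against the ELG schema.
MS_NS = "http://w3id.org/meta-share/meta-share/"

# Elements marked (M) in this section, besides the MetadataRecord root itself.
MANDATORY = ["resourceName", "description", "version", "additionalInfo", "keyword"]


def missing_mandatory(xml_text):
    """Return the names of mandatory elements that are absent from the record.

    Raises xml.etree.ElementTree.ParseError if xml_text is not well-formed XML
    (for example, when an opening and a closing tag do not match).
    """
    root = ET.fromstring(xml_text)
    return [
        name
        for name in MANDATORY
        if root.find(f".//{{{MS_NS}}}{name}") is None
    ]
```

An empty list means that every mandatory element was found; it does not guarantee that the elements are in the order or position the schema requires.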
MetadataRecord (M)¶
Path MetadataRecord
Data type component
Optionality Mandatory
Explanation & Instructions
A set of formalized structured information used to describe the contents, structure, function, etc. of an entity, usually according to a specific set of rules (metadata schema)
The MetadataRecord element includes a set of administrative data. The main elements required for metadata records registered by individuals are:
- metadataCreator: the person that has created the metadata record; you must provide at least the given name and surname of the person
- metadataCurator: the person that will be assigned the responsibility to update the metadata record when imported in the ELG database; you must provide at least the given name and surname of the person
- metadataCreationDate: the date when the metadata record was created
- compliesWith: for ELG metadata records, this must always be the ELG-SHARE metadata schema
Example
<ms:MetadataRecord>
<ms:MetadataRecordIdentifier ms:MetadataRecordIdentifierScheme="http://w3id.org/meta-share/meta-share/elg">default id</ms:MetadataRecordIdentifier>
<ms:metadataCreationDate>2020-02-28</ms:metadataCreationDate>
<ms:metadataCurator>
<ms:actorType>Person</ms:actorType>
<ms:surname xml:lang="en">Smith</ms:surname>
<ms:givenName xml:lang="en">John</ms:givenName>
</ms:metadataCurator>
<ms:compliesWith>http://w3id.org/meta-share/meta-share/ELG-SHARE</ms:compliesWith>
<ms:metadataCreator>
<ms:actorType>Person</ms:actorType>
<ms:surname xml:lang="en">Brown</ms:surname>
<ms:givenName xml:lang="en">George</ms:givenName>
</ms:metadataCreator>
</ms:MetadataRecord>
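If you generate records programmatically, the administrative part shown above can be built with Python's standard library. This is only a sketch under stated assumptions: the namespace URI for the `ms` prefix is assumed (it is not given in this section), and `build_record_header` and `add_person` are hypothetical helper names. Children are added in the same order as in the example, since the schema prescribes element order.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Assumption: namespace URI bound to the "ms" prefix; verify it against the ELG schema.
MS_NS = "http://w3id.org/meta-share/meta-share/"
ET.register_namespace("ms", MS_NS)

# Qualified name of the standard xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def ms(tag):
    """Return the namespace-qualified name for an element or attribute."""
    return f"{{{MS_NS}}}{tag}"


def add_person(parent, role, surname, given_name):
    """Append a person (e.g. metadataCreator) with surname and given name."""
    person = ET.SubElement(parent, ms(role))
    ET.SubElement(person, ms("actorType")).text = "Person"
    ET.SubElement(person, ms("surname"), {XML_LANG: "en"}).text = surname
    ET.SubElement(person, ms("givenName"), {XML_LANG: "en"}).text = given_name
    return person


def build_record_header(creator, curator, created):
    """Build the administrative elements in the order used in the example."""
    record = ET.Element(ms("MetadataRecord"))
    identifier = ET.SubElement(
        record,
        ms("MetadataRecordIdentifier"),
        {ms("MetadataRecordIdentifierScheme"): "http://w3id.org/meta-share/meta-share/elg"},
    )
    identifier.text = "default id"
    ET.SubElement(record, ms("metadataCreationDate")).text = created.isoformat()
    add_person(record, "metadataCurator", *curator)
    ET.SubElement(record, ms("compliesWith")).text = (
        "http://w3id.org/meta-share/meta-share/ELG-SHARE"
    )
    add_person(record, "metadataCreator", *creator)
    return record


# Usage: (surname, given name) pairs taken from the example above.
record = build_record_header(("Brown", "George"), ("Smith", "John"), date(2020, 2, 28))
print(ET.tostring(record, encoding="unicode"))
```

The resulting element still lacks the description of the resource itself, so it is a starting point rather than a complete record.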
resourceName (M)¶
Path MetadataRecord.DescribedEntity.LanguageResource.resourceName
Data type multilingual string
Optionality Mandatory
Explanation & Instructions
Introduces a human-readable name or title by which the resource is known
This is the “brand name” of your resource; try to use a name that is unique.
Example
<ms:resourceName xml:lang="en">GATE: English Named Entity Recognizer</ms:resourceName>
resourceShortName (R)¶
Path MetadataRecord.DescribedEntity.LanguageResource.resourceShortName
Data type multilingual string
Optionality Recommended
Explanation & Instructions
Introduces a short form (e.g., abbreviation, acronym, etc.) used to refer to a language resource
Example
<ms:resourceShortName xml:lang="en">annie-named-entity-recognizer</ms:resourceShortName>
description (M)¶
Path MetadataRecord.DescribedEntity.LanguageResource.description
Data type multilingual string
Optionality Mandatory
Explanation & Instructions
Introduces a short free-text account that provides information about the resource (e.g., service function, contents of a data resource, technical information, etc.)
Example
<ms:description xml:lang="en">Identifies names of persons, locations, organizations, as well as money amounts, time and date expressions in English texts automatically. </ms:description>
LRIdentifier (RA)¶
Path MetadataRecord.DescribedEntity.LanguageResource.LRIdentifier
Data type string
Optionality Recommended when applicable
Explanation & Instructions
A string (e.g., PID, DOI, internal to an organization, etc.) used to uniquely identify a language resource
You must also use the attribute LRIdentifierScheme to specify the identifier scheme (e.g., DOI, Handle, …).
If the resource is already described in another resource and has a PID, please add it with the appropriate attribute.
For metadata records imported manually at ELG, you can use the value ‘ELG id automatically assigned’; the value will be automatically replaced by an ELG identifier once imported in the ELG database.
Example
<ms:LRIdentifier ms:LRIdentifierScheme="http://w3id.org/meta-share/meta-share/elg">ELG id automatically assigned</ms:LRIdentifier>
logo (R)¶
Path MetadataRecord.DescribedEntity.LanguageResource.logo
Data type URL
Optionality Recommended
Explanation & Instructions
Links to a URL with an image file containing a symbol or graphic object used to identify the entity
The logo is like a brand name for the resource; it is displayed next to the resource name in the catalogue.
Example
<ms:logo>https://gate.ac.uk/plugins/gau-0.1/images/logo-gate.png</ms:logo>
version (M)¶
Path MetadataRecord.DescribedEntity.LanguageResource.version
Data type string
Optionality Mandatory
Explanation & Instructions
Associates a language resource with a pattern that indicates its version; the recommended way is to follow the semantic versioning guidelines (http://semver.org) and use a numeric pattern of the form major_version.minor_version.patch
If no version is provided, the system will automatically assign the resource a ‘v1.0.0 (automatically assigned)’ value
Example
<ms:version>v8.6</ms:version>
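To see whether a version value follows the recommended major_version.minor_version.patch pattern, a small check such as the one below can help. It is a sketch that covers only the three numeric components of semantic versioning (no pre-release or build suffixes), and `is_major_minor_patch` is a hypothetical helper name. Since the data type of the element is a plain string, a value that fails this check departs from the recommendation but is not necessarily rejected.

```python
import re

# Three dot-separated non-negative integers without leading zeros,
# i.e. the core "major.minor.patch" part of semantic versioning.
SEMVER_CORE = re.compile(r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)")


def is_major_minor_patch(version):
    """Return True if version has exactly the form major.minor.patch."""
    return SEMVER_CORE.fullmatch(version) is not None
```

For instance, `is_major_minor_patch("1.0.0")` returns `True`, while `is_major_minor_patch("v8.6")` returns `False` because of the leading letter and the missing patch number.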
additionalInfo (M)¶
Path MetadataRecord.DescribedEntity.LanguageResource.additionalInfo
Data type component
Optionality Mandatory
Explanation & Instructions
Introduces a point that can be used for further information (e.g. a landing page with a more detailed description of the resource or a general email that can be contacted for further queries)
It’s a recommended practice to give at least a landing page (landingPage) or a general email address (email); if you want, you can also specify a contact person (see full schema for contactPerson).
Example
<ms:additionalInfo>
<ms:landingPage>https://provider.example.com/product</ms:landingPage>
</ms:additionalInfo>
<ms:additionalInfo>
<ms:email>product@example.com</ms:email>
</ms:additionalInfo>
keyword (M)¶
Path MetadataRecord.DescribedEntity.LanguageResource.keyword
Data type multilingual string
Optionality Mandatory
Explanation & Instructions
Introduces a word or phrase considered important for the description of a language resource, person or organization and thus used to index or classify it
You can repeat the element if you want to add more keywords. Keywords are used for discovery, so choose words or phrases that users are likely to search for when looking for resources similar to yours.
Example
<ms:keyword xml:lang="en">Named entity recognition</ms:keyword>
<ms:keyword xml:lang="en">person</ms:keyword>
<ms:keyword xml:lang="en">location</ms:keyword>
<ms:keyword xml:lang="en">fake news</ms:keyword>
<ms:keyword xml:lang="en">tweets</ms:keyword>
domain (R)¶
Path MetadataRecord.DescribedEntity.LanguageResource.domain
Data type component
Optionality Recommended
Explanation & Instructions
Identifies the domain according to which a resource is classified
You must fill in the categoryLabel element with a free text value. If you prefer to add a value from an established controlled vocabulary, you can also use the DomainIdentifier element (with the attribute DomainClassificationScheme set to the appropriate value).
Example
<ms:domain>
<ms:categoryLabel xml:lang="en">EDUCATION &amp; COMMUNICATIONS</ms:categoryLabel>
<ms:DomainIdentifier ms:DomainClassificationScheme="http://w3id.org/meta-share/meta-share/EUROVOC">32</ms:DomainIdentifier>
</ms:domain>
<ms:domain>
<ms:categoryLabel xml:lang="en">health</ms:categoryLabel>
</ms:domain>
resourceProvider (R)¶
Path MetadataRecord.DescribedEntity.LanguageResource.resourceProvider
Data type component
Optionality Recommended
Explanation & Instructions
The person/organization responsible for providing, curating, maintaining and making available (publishing) the resource
The resource provider is very similar to the publisher of scientific articles; it can be an individual or an organization.
For organizations, you must add the name of the organization (organizationName) and, if possible, the website.
For persons, you must add the given name and surname and, if possible, an email address or an identifier (such as ORCID id) to help uniquely identify them.
Example
<ms:resourceProvider>
<ms:Organization>
<ms:actorType>Organization</ms:actorType>
<ms:organizationName xml:lang="en">Organization</ms:organizationName>
<ms:website>https://provider.org/</ms:website>
</ms:Organization>
</ms:resourceProvider>
<ms:resourceProvider>
<ms:Person>
<ms:actorType>Person</ms:actorType>
<ms:surname xml:lang="en">Smith</ms:surname>
<ms:givenName xml:lang="en">John</ms:givenName>
</ms:Person>
</ms:resourceProvider>
publicationDate (R)¶
Path MetadataRecord.DescribedEntity.LanguageResource.publicationDate
Data type date
Optionality Recommended
Explanation & Instructions
Specifies the date when a language resource has been made available to the public
Publication date is important for citation purposes, just as for scientific articles. If this is the first time your resource is published, please use the same date as for metadataCreationDate. If the resource has been previously published in another repository, please add the date it was first provided there.
Example
<ms:publicationDate>2015-12-17</ms:publicationDate>
resourceCreator (R)¶
Path MetadataRecord.DescribedEntity.LanguageResource.resourceCreator
Data type component
Optionality Recommended
Explanation & Instructions
Links a resource to the person, group or organisation that has created the resource
The element is important for citation and acknowledgement purposes.
For organizations, you must add the name of the organization (organizationName) and, if possible, the website.
For persons, you must add the given name and surname and, if possible, an email address or an identifier (such as ORCID id) to help uniquely identify them.
Example
<ms:resourceCreator>
<ms:Organization>
<ms:actorType>Organization</ms:actorType>
<ms:organizationName xml:lang="en">example organization</ms:organizationName>
<ms:website>https://provider.org/</ms:website>
</ms:Organization>
</ms:resourceCreator>
<ms:resourceCreator>
<ms:Person>
<ms:actorType>Person</ms:actorType>
<ms:surname xml:lang="en">Smith</ms:surname>
<ms:givenName xml:lang="en">John</ms:givenName>
</ms:Person>
</ms:resourceCreator>
fundingProject (RA)¶
Path MetadataRecord.DescribedEntity.LanguageResource.fundingProject
Data type component
Optionality Recommended when applicable
Explanation & Instructions
Links a language resource to the project that has funded its creation, enrichment, extension, etc.
Funding information is important for acknowledgement purposes.
For projects, you must provide the name of the project (projectName) and, if possible, a website (website) and/or an identifier (ProjectIdentifier).
Example
<ms:fundingProject>
<ms:projectName xml:lang="en">European Language Resource Coordination LOT3</ms:projectName>
<ms:ProjectIdentifier ms:ProjectIdentifierScheme="http://w3id.org/meta-share/meta-share/other">SMART 2015/1091 - 30-CE-0816766/00-92</ms:ProjectIdentifier>
<ms:website>http://www.lr-coordination.eu</ms:website>
</ms:fundingProject>
intendedApplication (R)¶
Path MetadataRecord.DescribedEntity.LanguageResource.
Data type component
Optionality Recommended
Explanation & Instructions
Specifies an LT application for which the language resource has been created or for which it can be used or is recommended to be used
The element is important for discovery purposes.
You can use the element LTClassRecommended with one of the recommended values from the LT taxonomy (class ‘Function’ of the OMTD-SHARE ontology at http://w3id.org/meta-share/omtd-share/), or add a free text value in the LTClassOther element.
You can repeat the element if the resource can be used for various applications. For instance, a part-of-speech tagger can be used as a component for named entity recognition, for sentiment analysis, etc.
Example
<ms:intendedApplication>
<ms:LTClassRecommended>http://w3id.org/meta-share/omtd-share/NamedEntityRecognition</ms:LTClassRecommended>
</ms:intendedApplication>
<ms:intendedApplication>
<ms:LTClassRecommended>http://w3id.org/meta-share/omtd-share/SentimentAnalysis</ms:LTClassRecommended>
</ms:intendedApplication>
<ms:intendedApplication>
<ms:LTClassOther>face recognition</ms:LTClassOther>
</ms:intendedApplication>
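When a resource serves several applications, the repeated elements can be generated from two lists: class names from the LT taxonomy and free-text labels. The sketch below assumes the namespace URI for the `ms` prefix (not stated in this section) and uses a hypothetical helper name, `intended_applications`. It does not check that a class name actually exists in the OMTD-SHARE ontology.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI bound to the "ms" prefix; verify it against the ELG schema.
MS_NS = "http://w3id.org/meta-share/meta-share/"
ET.register_namespace("ms", MS_NS)

# Base URI of the OMTD-SHARE ontology, as given in the text above.
OMTD_BASE = "http://w3id.org/meta-share/omtd-share/"


def intended_applications(recommended=(), other=()):
    """Return one intendedApplication element per application.

    recommended: class names from the LT taxonomy, e.g. "SentimentAnalysis"
    other: free-text labels for applications missing from the taxonomy
    """
    elements = []
    for class_name in recommended:
        app = ET.Element(f"{{{MS_NS}}}intendedApplication")
        child = ET.SubElement(app, f"{{{MS_NS}}}LTClassRecommended")
        child.text = OMTD_BASE + class_name
        elements.append(app)
    for label in other:
        app = ET.Element(f"{{{MS_NS}}}intendedApplication")
        ET.SubElement(app, f"{{{MS_NS}}}LTClassOther").text = label
        elements.append(app)
    return elements


# Usage: the three applications from the example above.
apps = intended_applications(
    recommended=["NamedEntityRecognition", "SentimentAnalysis"],
    other=["face recognition"],
)
for app in apps:
    print(ET.tostring(app, encoding="unicode"))
```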
isDocumentedBy (R)¶
Path MetadataRecord.DescribedEntity.LanguageResource.
Data type component
Optionality Recommended
Explanation & Instructions
Links a language resource to a document (e.g., research paper describing its contents or its use in a project, user manual, etc.) or any other form of documentation (e.g., a URL with support information) that is related to the resource
You can use this element to add
- supporting documentation (user manuals, training material, etc.) for the installation and use of your resource
- scientific publications that describe the resource.
If you want, you can use one of the more fine-grained relations to documents (see full schema).
You can repeat the element if you want to add more documents.
You must fill in the title element with the title of the document (or even an entire bibliographic record). When available, it’s also recommended to add the DocumentIdentifier element with the DOI of the document, or any other link to the document; if you do, use the attribute DocumentIdentifierScheme to indicate the identifier type.
Example
<ms:isDocumentedBy>
<ms:title xml:lang="en">Product User Manual</ms:title>
<ms:DocumentIdentifier ms:DocumentIdentifierScheme="http://purl.org/spar/datacite/url">https://www.company.org/product.pdf</ms:DocumentIdentifier>
</ms:isDocumentedBy>