Hound-fm/metadata-standard

Metadata Standard

Overview

Using standardised metadata descriptions makes datasets:

- More discoverable
- Easily syndicated
- Transferable
- Easily combined with other datasets

Ultimately, this makes it easier for datasets to be used in real-world situations to add value.

Learn more: Metadata standards for open data

Metadata

Metadata is structured information about a stream or channel separate from the content itself (title, language, media type, etc.). It is stored in the blockchain as the value property of a claim.

ℹ️ This document covers only specific areas for improvement. For everything else, please read the complete metadata specification.

Metadata fields

These are all the metadata fields mentioned in this document:

| Name | Description | Required |
| --- | --- | --- |
| license | A valid SPDX license identifier or English acronym | Required |
| license_url | A valid URL for the actual license | Not required |
| description | A simple description of the content. It can include nested metadata (YFM) | Not required |

Licensed content

Copyright is a law that gives the owner of a work (for example, a book, movie, picture, song or website) the right to say how other people can use it. These rights include the right to:

- Reproduce the work.
- Prepare derivative works.
- Distribute copies.
- Perform and display the work publicly.

It helps protect authors from other people copying their works without permission and/or for commercial purposes.

Including this information in the metadata helps prevent unintentional copyright infringement and makes it easy for everyone to discover, share, reuse or remix content legally.

Why use an identifier and not the license name?

Identifiers are short strings, so they take less space and are easy for other software to process.

By providing a short identifier, users can efficiently refer to a license without having to redundantly reproduce the full license.

They also help with typos and multilingual content. For example, take a look at these two license names:

- Attribution-NonCommercial-ShareAlike 4.0 International
- Attribution - Pas d’Utilisation Commerciale - Partage dans les Mêmes Conditions 4.0 International

Unless you can read and understand both languages (English and French), it is difficult to tell whether they refer to the same license or to different ones.

Example using the correct format:

{ "license": "CC-BY-NC-SA-4.0" }

Learn more: https://spdx.org/licenses/

All Rights Reserved identifier:

There is no identifier registered for "All rights reserved" on the SPDX License list, but you can use the ARR acronym instead of the legacy string.

Example using the correct format:

{ "license": "ARR" }

Public domain

For public domain content, it is recommended to use the CC0-1.0 SPDX license identifier or the English acronym PD instead of the legacy string "Public domain".

Example using the correct format:

{ "license": "CC0-1.0" }

License url

With a valid SPDX license identifier there is no need to provide a URL, and the license_url field can be ignored. However, if your content is published under a license that is not registered on the SPDX License List, please include a valid URL for it.

Example using the correct format:

{ "license_url": "http://domain.com/custom_license/1.0/archive.txt" }
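The rule above can be sketched as a small check. This is only an illustration: `check_license`, `KNOWN_SPDX_IDS` and `ACRONYMS` are hypothetical names, the identifier set is a tiny sample of the SPDX License List, and treating the acronyms as not needing a URL is an assumption rather than something this document states.

```python
# Illustrative sample only; the real SPDX License List is much longer.
KNOWN_SPDX_IDS = {"CC0-1.0", "CC-BY-4.0", "CC-BY-NC-SA-4.0", "MIT"}
# English acronyms this document accepts in place of an SPDX identifier.
ACRONYMS = {"ARR", "PD"}


def check_license(metadata: dict) -> list[str]:
    """Return a list of problems with the license fields (empty if none)."""
    license_id = metadata.get("license")
    if not license_id:
        return ["license is required"]
    if license_id in KNOWN_SPDX_IDS or license_id in ACRONYMS:
        return []  # license_url can be ignored in this case
    # Unregistered license: expect a URL pointing at the license text.
    url = metadata.get("license_url") or ""
    if not (isinstance(url, str) and url.startswith(("http://", "https://"))):
        return ["license_url is needed for a license not on the SPDX list"]
    return []
```

A real implementation would load the full identifier list from SPDX instead of hard-coding a sample.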

Legacy strings

Legacy strings are supported for compatibility with previously published metadata, and they will be deprecated in the future. You should use the English acronym instead.

| Name | Legacy string |
| --- | --- |
| PD | Public Domain |
| ARR | All rights reserved, Copyrighted |
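As a rough sketch, software reading old claims could map the legacy strings to their acronyms before further processing. The function name `normalize_license` is hypothetical, and matching without regard to case or extra whitespace is an assumption, not part of this document.

```python
# Legacy strings from the table above, keyed in lower case.
LEGACY_TO_ACRONYM = {
    "public domain": "PD",
    "all rights reserved": "ARR",
    "copyrighted": "ARR",
}


def normalize_license(value: str) -> str:
    """Map a legacy license string to its acronym; return other values trimmed."""
    # Collapse runs of whitespace and ignore case before the lookup.
    key = " ".join(value.split()).lower()
    return LEGACY_TO_ACRONYM.get(key, value.strip())
```

For example, `normalize_license("Public Domain")` returns `"PD"`, while an SPDX identifier such as `"CC0-1.0"` passes through unchanged.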

Tags

⚠️ Work in progress, we need more help and feedback from the community.

Extending the metadata

Some types of content require very specific metadata that is not provided by the current metadata schema. Since most platforms interpret the description field as markdown, it is possible to include nested metadata within this field using YAML or JSON front matter.

Front matter is metadata located at the top of the markdown file.

Front matter examples:

YAML

---
key: value
---

Additional content ( usually as markdown format )...

JSON

{ "key": "value" }

Additional content ( usually as markdown format )...
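A minimal sketch of how a client might separate front matter from the rest of a description, assuming YAML front matter is delimited by `---` lines and JSON front matter is a single object at the very start. `split_front_matter` is a hypothetical helper; it only isolates the raw block and leaves YAML parsing to whichever YAML library the application already uses.

```python
import json


def split_front_matter(description: str):
    """Return (kind, raw_front_matter, body); kind is 'yaml', 'json' or None."""
    text = description.lstrip()
    lines = text.splitlines()
    if lines and lines[0].strip() == "---":
        # Look for the closing '---' after the opening one.
        for i in range(1, len(lines)):
            if lines[i].strip() == "---":
                raw = "\n".join(lines[1:i])
                body = "\n".join(lines[i + 1:]).strip()
                return "yaml", raw, body
        return None, "", description  # unclosed block: treat as plain text
    if text.startswith("{"):
        try:
            # raw_decode stops at the end of the first JSON value.
            _, end = json.JSONDecoder().raw_decode(text)
        except json.JSONDecodeError:
            return None, "", description
        return "json", text[:end], text[end:].strip()
    return None, "", description
```

Returning the original description untouched when no well-formed block is found keeps invalid front matter from affecting how the content is displayed.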

The nested metadata included in the front matter block should be minimal and only used if the current metadata fields don't provide enough information.

Nested metadata keys should follow a specific naming convention and never try to replace the currently available metadata fields.

Nested metadata values should only include common data types such as strings or numbers.

If the nested metadata has an invalid syntax, format or structure, or does not provide any relevant information, it should be ignored.
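These rules could be applied to already-parsed front matter roughly as follows. This is a sketch under assumptions: `clean_nested_metadata` and `RESERVED_FIELDS` are hypothetical names, the reserved set lists only the fields named in this document (the complete specification defines more), and rejecting booleans is a strict reading of "strings or numbers".

```python
# Only the fields named in this document; the full specification has more.
RESERVED_FIELDS = {"license", "license_url", "description"}


def clean_nested_metadata(nested) -> dict:
    """Keep only entries that respect the rules above; ignore everything else."""
    if not isinstance(nested, dict):
        return {}  # invalid structure: ignore the whole block
    cleaned = {}
    for key, value in nested.items():
        # Nested keys must not shadow existing metadata fields.
        if not isinstance(key, str) or key in RESERVED_FIELDS:
            continue
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (str, int, float)):
            continue
        cleaned[key] = value
    return cleaned
```

For example, `{"isbn": "123", "pages": 10, "license": "MIT", "tags": ["a"]}` is reduced to `{"isbn": "123", "pages": 10}`.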

Schema.org

Schema.org is a collaborative, community activity with a mission to create, maintain, and promote schemas for structured data on the Internet, on web pages, in email messages, and beyond.

https://schema.org/

Software or applications should validate the nested metadata against a clear, predefined schema before processing or interacting with it in any other way. Schema.org provides the preferred structured data schemas for extending the claim metadata. See the list of available schemas.
