Commit

Merge remote-tracking branch 'origin/master'
danDiamo committed Jun 16, 2022
2 parents 10522fb + ae61933 commit 1a928d0
Showing 3 changed files with 39 additions and 42 deletions.
4 changes: 2 additions & 2 deletions README.md
@@ -3,8 +3,8 @@ component-id: folk_ngram_analysis
name: FONN - FOlk N-gram aNalysis
description: Work-in-progress on pattern extraction and melodic similarity tools, with an associated test corpus of monophonic Irish folk tunes.
type: Repository
release-date: 19/05/2022
release-number: v0.5-dev
release-date: 15/06/2022
release-number: v0.6-dev
work-package:
- WP3
licence: CC BY 4.0, https://creativecommons.org/licenses/by/4.0/
75 changes: 36 additions & 39 deletions corpus/README.md
@@ -1,10 +1,10 @@
---
component-id: cre_corpus
name: Ceol Rince na hÉireann MIDI corpus
brief-description: A corpus of 1,224 monophonic instrumental Irish traditional dance tunes.
brief-description: A corpus of 1,195 monophonic instrumental Irish traditional dance tunes.
type: Corpus
release-date: 8/12/2021
release-number: v0.4-dev
release-date: 15/06/2022
release-number: v0.6-dev
work-package:
- WP3
licence: CC BY 4.0, https://creativecommons.org/licenses/by/4.0/
@@ -19,19 +19,19 @@ credits:
---


## About dataset
## About the dataset

**Corpus title:** _Ceol Rince na hÉireann_

**Source:** [Black, B 2020, _The Bill Black Irish tune archive homepage_, viewed 5 January 2021.][1]
**Source:** [Black, B 2020, _The Bill Black Irish tune archive homepage_, viewed 5 January 2021.](http://www.capeirish.com/webabc)

**Contents:** 1,224 traditional Irish dance tunes, each of which is represented as a monophonic MIDI file.
**Contents:** 1,195 traditional Irish dance tunes, represented in [MIDI](https://github.com/polifonia-project/folk_ngram_analysis/tree/master/corpus/MIDI) and [ABC Notation](https://github.com/polifonia-project/folk_ngram_analysis/tree/master/corpus/abc).

Between 1963 and 1999, Irish State publishing companies Oifig an tSolatáthair and An Gúm issued five printed volumes of tunes from the collections of Breadán Breathnach (1912-1985) under the series title _Ceol Rince na hÉireann_ (Dance Music of Ireland, hereafter _CRÉ_). The five volumes of _CRÉ_ contain 1,208 traditional tunes, a subset of Breathnach's more extensive personal collection of 5,000+ melodies. The collection has been transcribed into ABC notation by American traditional music researcher Bill Black, and made freely available online via his [personal website][1]. Addition of alternative tune versions and variation in numbering of unique melodies has resulted in a total of 1,224 tunes in the Bill Black ABC corpus. This resource has been used in previous research work, for example it makes up part of a larger aggregated corpus used in the [_Tunepal_][2] Music Information Retrieval app. We have created a new cleaned and annotated MIDI version of the corpus, from which feature sequence data can be extracted and analysed via Polifonia's [FONN][3] music pattern analysis toolkit.
Between 1963 and 1999, Irish State publishing companies Oifig an tSoláthair and An Gúm issued five printed volumes of tunes from the collections of Breandán Breathnach (1912-1985) under the series title _Ceol Rince na hÉireann_ (Dance Music of Ireland, hereafter _CRÉ_). The five volumes of _CRÉ_ contain 1,208 traditional tunes, a subset of Breathnach's more extensive personal collection of 5,000+ melodies. The collection has been transcribed into ABC notation by American traditional music researcher Bill Black, and made freely available online via his [personal website](http://www.capeirish.com/webabc). Addition of alternative tune versions and variation in numbering of unique melodies has resulted in a total of 1,224 tunes in the Bill Black ABC corpus. This resource has been used in previous research work; for example, it makes up part of a larger aggregated corpus used in the [_Tunepal_](https://tunepal.org/index.html) Music Information Retrieval app. We have created a new cleaned and annotated version of the corpus, from which feature sequence data can be extracted and analysed via Polifonia's [FONN](https://github.com/polifonia-project/folk_ngram_analysis) music pattern analysis toolkit.

NOTE: Please see [corpus_stats.ipynb][11] for a Jupyter notebook exploring the corpus data.
NOTE: Please see [corpus_demo.ipynb](https://github.com/polifonia-project/folk_ngram_analysis/blob/master/corpus/corpus_demo.ipynb) for a Jupyter notebook exploring the corpus data.

Deliverable 3.2 of the Polifonia project will describe the context and research in more detail. It will be published on [Cordis](https://cordis.europa.eu/project/id/101004746/it).
Deliverable 3.3 of the Polifonia project will describe the context and research in more detail. It will be published on [Cordis](https://cordis.europa.eu/project/id/101004746/it).


## About corpus pre-processing methodology
@@ -40,57 +40,54 @@ Bill Black's ABC version of the _CRÉ_ collection has been manually edited and a
* Removal of alternative tune versions, so that the ABC collection more accurately reflects the original print collection.
* Removal of non-valid ABC notation characters.
* Editing of repeat markers to ensure accurate MIDI output.
* Conversion to MIDI via EasyABC software.
* Manual assignment of root note (as chromatic pitch class) for every piece of music in the corpus. This data is stored in the file [roots.csv][4], which is used to derive key-invariant secondary feature sequence data from the MIDI files.
* Manual assignment of root note (as chromatic pitch class) for every piece of music in the corpus. This data is stored in [roots.csv]( https://github.com/polifonia-project/folk_ngram_analysis/tree/master/corpus/roots.csv), which is used to derive key-invariant secondary feature sequence data from the MIDI files.
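
As a rough illustration of what "key-invariant" can mean here, the sketch below re-expresses MIDI note numbers as chromatic intervals above the assigned root pitch class. The function name `to_key_invariant` and the example pitches are hypothetical; this is an assumption about the general idea, not necessarily how FONN derives its secondary feature sequences.

```python
def to_key_invariant(midi_pitches, root):
    """Map MIDI note numbers to chromatic intervals (0-11) above the root pitch class.

    `root` is a chromatic pitch class as stored in roots.csv (C=0 ... B=11).
    Hypothetical helper for illustration only.
    """
    if not 0 <= root <= 11:
        raise ValueError("root must be a pitch class in the range 0..11")
    return [(pitch - root) % 12 for pitch in midi_pitches]


# The same melodic fragment in two keys (illustrative values, not corpus data):
in_d = to_key_invariant([62, 66, 69, 74], root=2)  # D4, F#4, A4, D5 with root D
in_g = to_key_invariant([67, 71, 74, 79], root=7)  # G4, B4, D5, G5 with root G
print(in_d, in_g)
```

Both fragments reduce to the same sequence, which is why a per-tune root annotation makes tunes in different keys directly comparable.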


## Description of the data

```
corpus/
-MIDI/
-1,224 monophonic MIDI files (.mid)
-1,195 monophonic MIDI (.mid) files, one representing each tune.
-abc/
-1 ABC Notation corpus file (.abc) containing scores for all 1,195 tunes.
-roots.csv
-README.md
-LICENSE.md
```

Each melody in the corpus is represented as a monophonic MIDI file, named per the melody title. There are 1,224 files in total, stored in the [./MIDI][4] directory.
- The ```corpus``` directory contains roots.csv, this README.md, and a LICENSE.md file.

The [corpus][6] root directory contains a [roots.csv][5] file, this readme, and a LICENSE.md file.
Roots.csv holds two columns with one row per each MIDI file in the corpus:
'title': MIDI file title
'root': expert-assigned root note of each melody, represented as a [chromatic pitch class][7] (i.e.: An integer value from C=0 through B=11).
- roots.csv holds two columns, with one row per MIDI file in the corpus:
- 'title': MIDI file name (tune title)
- 'root': expert-assigned root note of each melody, represented as a [chromatic pitch class](https://en.wikipedia.org/wiki/Pitch_class) (i.e. an integer value from C=0 through B=11).

<img width="400" alt="image" src="https://user-images.githubusercontent.com/78231894/142916162-9ace1c42-ceae-412f-95df-98ce34acd359.png">
<br><br>
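
A minimal sketch of reading this table with the Python standard library is given below. It assumes roots.csv has a header row naming the 'title' and 'root' columns; the sample rows, the `load_roots` helper, and the sharp-only note spellings are all hypothetical and used only for illustration.

```python
import csv
import io

# Note names indexed by chromatic pitch class (C=0 ... B=11); sharps chosen arbitrarily.
PITCH_CLASS_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def load_roots(csv_file):
    """Return a {title: root pitch class} mapping from an open roots.csv-style file.

    Assumes a header row with 'title' and 'root' columns (an assumption, not verified).
    """
    reader = csv.DictReader(csv_file)
    return {row["title"]: int(row["root"]) for row in reader}


# In-memory stand-in for roots.csv; these titles and values are made up.
sample = io.StringIO("title,root\nHypothetical Reel,2\nHypothetical Jig,7\n")
roots = load_roots(sample)
for title, root in roots.items():
    print(f"{title}: pitch class {root} ({PITCH_CLASS_NAMES[root]})")
```

To use it on the real file, the `io.StringIO` stand-in could be replaced with `open("roots.csv", newline="")`, adjusted to wherever the corpus was downloaded.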

To extract feature sequence data from the MIDI corpus, please download the corpus data and run [setup_corpus.main()][9] from folk_ngram_analysis component. Please see [folk_ngram_analysis readme][8] for further information.
- To convert the corpus from ABC Notation to MIDI format, please download the corpus data and run the FONN [abc_ingest.py](https://github.com/polifonia-project/folk_ngram_analysis/blob/master/abc_ingest.py) script. Please see the [FONN README.md](https://github.com/polifonia-project/folk_ngram_analysis/blob/master/README.md) for further information.

- To extract feature sequence data from the MIDI corpus, please download the corpus data and run the FONN [setup_corpus.py](https://github.com/danDiamo/music_pattern_analysis/blob/master/setup_corpus.py) script. Please see the [FONN README.md](https://github.com/polifonia-project/folk_ngram_analysis/blob/master/README.md) for further information.


## Online repository link<br>
https://github.com/polifonia-project/folk_ngram_analysis/tree/master/corpus
## Attribution

## Authors
If you use the code in this repository, please cite this software as follows:
```
@software{diamond_fonn_2022,
address = {Galway, Ireland},
title = {{FONN} - {FOlk} {N}-gram {aNalysis}},
shorttitle = {{FONN}},
url = {https://github.com/polifonia-project/folk_ngram_analysis},
publisher = {National University of Ireland, Galway},
author = {Diamond, Danny and Shahid, Abdul and McDermott, James},
year = {2022},
}
```

* Danny Diamond
* Dr. Abdul Shahid Khattak
* Dr. James McDermott
* Dr Mathieu d'Aquin
## License

This work is licensed under CC BY 4.0, https://creativecommons.org/licenses/by/4.0/


## License
This project is licensed under the MIT License - see [LICENSE.md][10] file for details

[1]: http://www.capeirish.com/webabc
[2]: https://tunepal.org/index.html
[3]: https://github.com/polifonia-project/folk_ngram_analysis
[4]: https://github.com/polifonia-project/folk_ngram_analysis/tree/master/corpus/MIDI
[5]: https://github.com/danDiamo/music_pattern_analysis/blob/master/corpus/roots.csv
[6]: https://github.com/polifonia-project/folk_ngram_analysis/tree/master/corpus
[7]: https://en.wikipedia.org/wiki/Pitch_class
[8]: https://github.com/polifonia-project/folk_ngram_analysis/blob/master/README.md
[9]: https://github.com/danDiamo/music_pattern_analysis/blob/master/setup_corpus/setup_corpus.py
[10]: https://github.com/polifonia-project/folk_ngram_analysis/blob/master/corpus/license.md
[11]: https://github.com/polifonia-project/folk_ngram_analysis/blob/master/corpus/corpus_stats.ipynb
2 changes: 1 addition & 1 deletion root_note_detection/README.md
@@ -9,7 +9,7 @@ type: Repository

release-date: 20/05/2022

release-number: v0.5-dev
release-number: v0.6-dev

work-package:
- WP3