UPDATES in process
Statement of Purpose
This Python 3 program extracts data from MARC8 authority files and writes it to a CSV file so that the data can be manipulated in a spreadsheet. The program extracts data from only one type of authority file at a time. An optional keyword can be specified to search all fields and output only those records that contain the keyword.
It uses the specialized Python library pymarc to handle the MARC format, along with several standard Python libraries.
The current functions cover the extraction of:
- general headings from Library of Congress Subject Headings
- general subdivision headings from Library of Congress Subject Headings
- form subdivision headings from Library of Congress Subject Headings
- headings from Library of Congress Children's Subject Headings (CYAC)
- headings from Library of Congress Genre/Form Terms
- headings from Library of Congress Demographic Group Terms
- headings from Library of Congress Medium of Performance Terms
A copy of the Library of Congress Subject Headings authority file in MARC8 is freely available via the MARC Distribution Services.
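The reading and keyword-filtering steps described above can be sketched roughly as follows. This is a minimal illustration, not the code in subjauth.py: the function names `record_matches` and `read_matching` are hypothetical, and the case-insensitive comparison is an assumption, since the README does not say how the keyword is matched. It relies on pymarc's `MARCReader`, `Record.get_fields()` and `Field.value()`.

```python
def record_matches(record, keyword):
    """Return True if the text of any field in the record contains the keyword."""
    needle = keyword.lower()  # assumption: matching is case-insensitive
    # get_fields() with no tags returns every field of the record
    return any(needle in field.value().lower() for field in record.get_fields())


def read_matching(path, keyword=None):
    """Yield records from a MARC file, optionally keeping only keyword matches."""
    from pymarc import MARCReader  # third-party library: pip install pymarc

    with open(path, "rb") as handle:
        # to_unicode=True asks pymarc to convert MARC-8 text to Unicode strings
        for record in MARCReader(handle, to_unicode=True):
            if record is None:
                continue  # skip anything pymarc could not parse into a record
            if keyword is None or record_matches(record, keyword):
                yield record
```

Keeping the match test in its own function means it can be tried on any object that offers `get_fields()` and `value()`, without needing a MARC file on hand.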
How to run the program
python subjauth.py <input file path> -type [sh | fd | gd | dg | gf | sj | mp] -o <csv path> [-key <keyword string>]
| Option | Explanation |
| --- | --- |
| -type sh | Subject authority records |
| -type fd | Subject authority records for form subdivisions |
| -type gd | Subject authority records for general subdivisions |
| -type sj | Children's Subject authority records |
| -type gf | Genre/Form authority records |
| -type dg | Demographic Group Terms authority records |
| -type mp | Medium of Performance Terms authority records |
| -o | Output location and filename for csv file |
| -key | Authority records Keyword search (phrases in quotes) |
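For readers who want to see how a command line of this shape can be parsed, here is a small sketch using the standard `argparse` module. It is an assumption about structure, not a copy of subjauth.py: the names `RECORD_TYPES`, `build_parser`, `record_type`, `output` and `keyword` are hypothetical, and `-type` and `-o` are treated as required only because the usage line shows them without brackets.

```python
import argparse

# Type codes and descriptions copied from the options table
RECORD_TYPES = {
    "sh": "Subject authority records",
    "fd": "Subject authority records for form subdivisions",
    "gd": "Subject authority records for general subdivisions",
    "sj": "Children's Subject authority records",
    "gf": "Genre/Form authority records",
    "dg": "Demographic Group Terms authority records",
    "mp": "Medium of Performance Terms authority records",
}


def build_parser():
    """Build a parser that accepts the options listed in the table."""
    parser = argparse.ArgumentParser(prog="subjauth.py")
    parser.add_argument("input", help="path to the MARC8 authority file")
    # argparse accepts single-dash long options such as -type and -key
    parser.add_argument("-type", dest="record_type", required=True,
                        choices=sorted(RECORD_TYPES),
                        help="type of authority file to process")
    parser.add_argument("-o", dest="output", required=True,
                        help="output location and filename for the csv file")
    parser.add_argument("-key", dest="keyword", default=None,
                        help="keyword to search for (phrases in quotes)")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(RECORD_TYPES[args.record_type], "->", args.output)
```

Restricting `-type` with `choices` makes argparse reject an unknown code before any file is opened.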
The CSV file
The CSV file contains three columns:
- the Library of Congress Control Number (LCCN)
  - MARC field 010 $a
- the text of the heading
  - MARC field 1XX $a (with possible additional subfields) for headings in LCSH, LCGFT, LCDGT, CYAC, and LCMPT
  - MARC field 185 $v (with possible additional subfields $v or $x) for LCSH form subdivisions
  - MARC field 180 $x (with possible additional subfields $v or $x) for LCSH general subdivisions
- scope note, if one exists
  - MARC field 680 ($i and possible $a subfields)
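The column layout above could be produced along the following lines. This is a hedged sketch rather than the program's actual code: the helper names (`subfield_text`, `first_field`, `to_row`, `write_csv`) are hypothetical, and so are the header labels, the `--` separator between heading subfields, and the choice to read only subfields $a, $v and $x from the first 1XX field. The record objects are assumed to behave like pymarc records, offering `get_fields()`, `tag` and `get_subfields()`.

```python
import csv


def subfield_text(field, *codes, sep=" "):
    """Join the values of the given subfield codes, in the order they occur."""
    if field is None:
        return ""
    return sep.join(value.strip() for value in field.get_subfields(*codes))


def first_field(record, tag):
    """Return the first field with this tag, or None if the record has none."""
    fields = record.get_fields(tag)
    return fields[0] if fields else None


def to_row(record):
    """Build the three CSV columns for one authority record."""
    # The heading is taken from the first 1XX field in the record
    heading = next(
        (f for f in record.get_fields() if f.tag.startswith("1")), None
    )
    return [
        subfield_text(first_field(record, "010"), "a"),
        subfield_text(heading, "a", "v", "x", sep="--"),  # assumed separator
        subfield_text(first_field(record, "680"), "i", "a"),
    ]


def write_csv(records, handle):
    """Write a header row and one row per record to an open text handle."""
    writer = csv.writer(handle)
    writer.writerow(["LCCN", "Heading", "Scope note"])  # hypothetical labels
    for record in records:
        writer.writerow(to_row(record))
```

When writing to a real file, opening it with `open(path, "w", newline="", encoding="utf-8")` lets the `csv` module control line endings, which avoids blank rows when the file is opened in a spreadsheet on Windows.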
Contributors
- Melanie Polutta: @bibliotecaria
- Bobbi Fox: @bobbi-SMR