This repository has been archived by the owner on Apr 30, 2021. It is now read-only.

bibtex @mastersthesis is not handled correctly #98

Closed
fibsifan opened this issue Jan 9, 2015 · 12 comments

@fibsifan

fibsifan commented Jan 9, 2015

I use pandoc-citeproc to convert BibTeX to YAML (pandoc-citeproc -y refs.bib > refs.yaml) and afterwards I put the refs.yaml in the list of pandoc input files. When I have a @mastersthesis reference in my bibliography, it gets converted as follows:

@mastersthesis{Mustermann:1998,
  type = {Diplomarbeit},
  author = {Max Mustermann},
  title = {{Max Mustermanns Diplomarbeit}},
  school = {Musteruniversität Musterstadt},
  year = {1998},
}

becomes

---
references:
- author:
  - family: Mustermann
    given: Max
  id: Mustermann:1998
  issued:
    date-parts:
    - - 1998
  title: Max Mustermanns Diplomarbeit
  type: thesis
  genre: "Master’s thesis"
  publisher: "Musteruniversität Musterstadt"
...

I would rather have "Diplomarbeit" in the genre field, since that seems to be the field that gets printed. How can this be fixed without having to correct every entry in the YAML file? Is this a pandoc-citeproc issue at all?

@jgm
Owner

jgm commented Jan 9, 2015

It's a bibtex conversion issue. The string "Master's thesis" is
inserted by the resolveKey' function. This function has the capacity
to be localized -- it takes a parameter for the locale -- but currently
only English values have been defined.

       "inpreparation" -> "in preparation"
       "submitted"     -> "submitted"
       "forthcoming"   -> "forthcoming"
       "inpress"       -> "in press"
       "prepublished"  -> "pre-published"
       "mathesis"      -> "Master’s thesis"
       "phdthesis"     -> "PhD thesis"
       "candthesis"    -> "Candidate thesis"
       "techreport"    -> "technical report"
       "resreport"     -> "research report"
       "software"      -> "computer software"
       "datacd"        -> "data CD"
       "audiocd"       -> "audio CD"
       "patent"        -> "patent"
       "patentde"      -> "German patent"
       "patenteu"      -> "European patent"
       "patentfr"      -> "French patent"
       "patentuk"      -> "British patent"
       "patentus"      -> "U.S. patent"
       "patreq"        -> "patent request"
       "patreqde"      -> "German patent request"
       "patreqeu"      -> "European patent request"
       "patreqfr"      -> "French patent request"
       "patrequk"      -> "British patent request"
       "patrequs"      -> "U.S. patent request"
       "countryde"     -> "Germany"
       "countryeu"     -> "European Union"
       "countryep"     -> "European Union"
       "countryfr"     -> "France"
       "countryuk"     -> "United Kingdom"
       "countryus"     -> "United States of America"
       "newseries"     -> "new series"
       "oldseries"     -> "old series"

If you can edit this and insert German equivalents to the English
phrases, we can add German localization.
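
A minimal sketch of how such a localized lookup could fall back to English when a translation is missing. This is not the actual code in Bibtex.hs: the `Lang` type, the table names, and `resolveTerm` are hypothetical stand-ins, and the single German entry is borrowed from a translation list posted later in this thread.

```haskell
import Control.Applicative ((<|>))
import Data.Char (toLower)
import Data.Maybe (fromMaybe)

-- Hypothetical stand-in for the locale parameter: language code, country code.
data Lang = Lang String String

type Terms = [(String, String)]

-- A few of the English terms listed above.
englishTerms :: Terms
englishTerms =
  [ ("mathesis",  "Master's thesis")
  , ("phdthesis", "PhD thesis")
  , ("inpress",   "in press")
  ]

-- Per-language tables keyed by language code (illustrative, incomplete).
localizedTerms :: [(String, Terms)]
localizedTerms =
  [ ("de", [("phdthesis", "Dissertation")]) ]

-- Try the requested language first, then English, then return the key as given.
resolveTerm :: Lang -> String -> String
resolveTerm (Lang code _) k = fromMaybe k (localized <|> english)
  where
    key       = map toLower k
    localized = lookup code localizedTerms >>= lookup key
    english   = lookup key englishTerms

main :: IO ()
main = do
  putStrLn (resolveTerm (Lang "de" "DE") "phdthesis")  -- found in the German table
  putStrLn (resolveTerm (Lang "de" "DE") "mathesis")   -- falls back to English
  putStrLn (resolveTerm (Lang "en" "US") "unknownkey") -- unknown keys pass through
```

With this shape, a partially translated language still yields a usable string for every key, so translations can be contributed incrementally.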


@fibsifan
Author

fibsifan commented Jan 9, 2015

I'll have a look at the translations this weekend.
But even if this were translated ("mathesis" -> "Masterarbeit"), I would have no way to cite a Diplomarbeit. In BibTeX, type = {Diplomarbeit} can be used in a @mastersthesis entry (@phdthesis would work too, I think) to accomplish that.
That's my problem: the type field does not seem to be used.

@njbart
Contributor

njbart commented Jan 9, 2015

@fibsifan: You could try the biblatex entry type @thesis with type = {Diplomarbeit}. This is mapped to CSL YAML genre: Diplomarbeit as expected.

@jgm: It seems this reveals a small pandoc-citeproc bug. Apparently in bibtex mastersthesis and phdthesis entries, the content of the type field, if present, should replace the default value of “Master’s thesis” and “PhD thesis” just as it does for biblatex thesis entries.

From the 1988 bibtex specs, http://mirrors.ctan.org/biblio/bibtex/base/btxdoc.pdf:

“The MASTERSTHESIS and PHDTHESIS entry types now take an optional type field. For example you can get the standard styles to call your reference a ‘Ph.D. dissertation’ instead of the default ‘PhD thesis’ by including a type = "{Ph.D.} dissertation" in your database entry.”
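
The behaviour described in the bibtex documentation amounts to "an explicit type field wins, otherwise use the default for the entry type". A minimal sketch of that rule, assuming a simplified, hypothetical `Entry` record with lowercase field names; it does not reflect pandoc-citeproc's internal data types.

```haskell
import Control.Applicative ((<|>))
import Data.Char (toLower)

-- Hypothetical, simplified bibliography entry: entry type plus field/value pairs.
-- Field names are assumed to be lowercased already.
data Entry = Entry
  { entryType :: String
  , fields    :: [(String, String)]
  }

-- Default genre implied by the entry type alone.
defaultGenre :: String -> Maybe String
defaultGenre t = case map toLower t of
  "mastersthesis" -> Just "Master's thesis"
  "phdthesis"     -> Just "PhD thesis"
  _               -> Nothing

-- An explicit type field takes precedence over the default.
genre :: Entry -> Maybe String
genre e = lookup "type" (fields e) <|> defaultGenre (entryType e)

main :: IO ()
main = do
  let diplom = Entry "mastersthesis" [("type", "Diplomarbeit")]
      plain  = Entry "mastersthesis" []
  print (genre diplom)  -- the type field is used
  print (genre plain)   -- the default for @mastersthesis is used
```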

@fibsifan
Author

Thank you @nickbart1980.

@jgm: I am not sure about the first 4 and the last 2. phdthesis could also be translated as Doktorarbeit. When I use LaTeX with

\usepackage[ngerman]{babel}
\usepackage[numbers,round]{natbib}
\bibliographystyle{alphadin}

it is listed as "Diss."

"inpreparation" -> "in Vorbereitung"
"submitted" -> "eingereicht"
"forthcoming" -> "im Erscheinen"
"inpress" -> "in Druck"
"prepublished" -> "vorveröffentlicht"
"mathesis" -> "Masterarbeit"
"phdthesis" -> "PhD-Dissertation"
"candthesis" -> "Kandidatur-Dissertation"
"techreport" -> "technischer Bericht"
"resreport" -> "Forschungsbericht"
"software" -> "Software"
"datacd" -> "Daten-CD"
"audiocd" -> "Audio-CD"
"patent" -> "Patent"
"patentde" -> "Deutsches Patent"
"patenteu" -> "Europäisches Patent"
"patentfr" -> "Französisches Patent"
"patentuk" -> "Britisches Patent"
"patentus" -> "US-Patent"
"patreq" -> "Patentantrag"
"patreqde" -> "Deutscher Patentantrag"
"patreqeu" -> "Europäischer Patentantrag"
"patreqfr" -> "Französischer Patentantrag"
"patrequk" -> "Britischer Patentantrag"
"patrequs" -> "US-Patentantrag"
"countryde" -> "Deutschland"
"countryeu" -> "Europäische Union"
"countryep" -> "European Union"
"countryfr" -> "Frankreich"
"countryuk" -> "Vereinigtes Königreich"
"countryus" -> "Vereinigte Staaten von Amerika"
"newseries" -> "neue Folge"
"oldseries" -> "alte Folge"

@njbart
Contributor

njbart commented Jan 13, 2015

I'd simply borrow the biblatex translations; adapted from my local /usr/local/texlive/2014/texmf-dist//tex/latex/biblatex/lbx/german.lbx:

"inpreparation" -> "in Vorbereitung"
"submitted" -> "eingereicht"
"forthcoming" -> "im Erscheinen"
"inpress" -> "im Druck"
"prepublished" -> "Vorveröffentlichung"
"mathesis" -> "Magisterarbeit"
"phdthesis" -> "Dissertation"
-- "candthesis" -> "" -- missing
"techreport" -> "Technischer Bericht"
"resreport" -> "Forschungsbericht"
"software" -> "Computer-Software"
"datacd" -> "CD-ROM"
"audiocd" -> "Audio-CD"
"patent" -> "Patent"
"patentde" -> "deutsches Patent"
"patenteu" -> "europäisches Patent"
"patentfr" -> "französisches Patent"
"patentuk" -> "britisches Patent"
"patentus" -> "US-Patent"
"patreq" -> "Patentanmeldung"
"patreqde" -> "deutsche Patentanmeldung"
"patreqeu" -> "europäische Patentanmeldung"
"patreqfr" -> "französische Patentanmeldung"
"patrequk" -> "britische Patentanmeldung"
"patrequs" -> "US-Patentanmeldung"
"countryde" -> "Deutschland"
"countryeu" -> "Europäische Union"
"countryep" -> "Europäische Union"
"countryfr" -> "Frankreich"
"countryuk" -> "Großbritannien"
"countryus" -> "USA"
"newseries" -> "neue Folge"
"oldseries" -> "alte Folge"

from /usr/local/texlive/2014/texmf-dist//tex/latex/biblatex/lbx/french.lbx:

"inpreparation" -> "en préparation"
"submitted" -> "soumis"
"forthcoming" -> "à paraître"
"inpress" -> "sous presse"
"prepublished" -> "prépublié"
"mathesis" -> "mémoire de master"
"phdthesis" -> "thèse de doctorat"
"candthesis" -> "thèse de candidature"
"techreport" -> "rapport technique"
"resreport" -> "rapport scientifique"
"software" -> "logiciel"
"datacd" -> "cédérom"
"audiocd" -> "disque compact audio"
"patent" -> "brevet"
"patentde" -> "brevet allemand"
"patenteu" -> "brevet européen"
"patentfr" -> "brevet français"
"patentuk" -> "brevet britannique"
"patentus" -> "brevet américain"
"patreq" -> "demande de brevet"
"patreqde" -> "demande de brevet allemand"
"patreqeu" -> "demande de brevet européen"
"patreqfr" -> "demande de brevet français"
"patrequk" -> "demande de brevet britannique"
"patrequs" -> "demande de brevet américain"
"countryde" -> "Allemagne"
"countryeu" -> "Union européenne"
"countryep" -> "Union européenne"
"countryfr" -> "France"
"countryuk" -> "Royaume-Uni"
"countryus" -> "États-Unis"
"newseries" -> "nouvelle série"
"oldseries" -> "ancienne série"

Other biblatex language files (including brazilian.lbx, catalan.lbx, croatian.lbx, czech.lbx, danish.lbx, dutch.lbx, finnish.lbx, greek.lbx, icelandic.lbx, italian.lbx, norsk.lbx, norwegian.lbx, nynorsk.lbx, polish.lbx, portuges.lbx, portuguese.lbx, russian.lbx, slovene.lbx, spanish.lbx, swedish.lbx) could easily be adapted, too, though not all of them provide the complete set of terms.

EDIT: removed duplicate lines

@njbart
Contributor

njbart commented Mar 22, 2015

Tried to patch Bibtex.hs by adding

resolveKey' (Lang "fr" "FR") k =
  case map toLower k of
       "inpreparation" -> "en préparation"
       "submitted"     -> "soumis"
       "forthcoming"   -> "à paraître"
       "inpress"       -> "sous presse"
       "prepublished"  -> "prépublié"
       "mathesis"      -> "mémoire de master"
       "phdthesis"     -> "thèse de doctorat"
       "candthesis"    -> "thèse de candidature"
       "techreport"    -> "rapport technique"
       "resreport"     -> "rapport scientifique"
       "software"      -> "logiciel"
       "datacd"        -> "cédérom"
       "audiocd"       -> "disque compact audio"
       "patent"        -> "brevet"
       "patentde"      -> "brevet allemand"
       "patenteu"      -> "brevet européen"
       "patentfr"      -> "brevet français"
       "patentuk"      -> "brevet britannique"
       "patentus"      -> "brevet américain"
       "patreq"        -> "demande de brevet"
       "patreqde"      -> "demande de brevet allemand"
       "patreqeu"      -> "demande de brevet européen"
       "patreqfr"      -> "demande de brevet français"
       "patrequk"      -> "demande de brevet britannique"
       "patrequs"      -> "demande de brevet américain"
       "countryde"     -> "Allemagne"
       "countryeu"     -> "Union européenne"
       "countryep"     -> "Union européenne"
       "countryfr"     -> "France"
       "countryuk"     -> "Royaume-Uni"
       "countryus"     -> "États-Unis"
       "newseries"     -> "nouvelle série"
       "oldseries"     -> "ancienne série"
       _               -> k

… but I'm confused as to what settings are required to activate this. locale: fr-FR in a document's YAML metadata switches CSL terms to French, but leaves the biblatex terms above unchanged. Adding -V locale: fr-FR to the command line does not seem to work at all, not even for the CSL terms.
I'm also confused by the documentation: http://johnmacfarlane.net/pandoc/README.html only mentions lang (which doesn't even seem to affect CSL terms) but not locale.
I feel it'd be best if we had just one variable for this, e.g. locale with values en-US, fr-FR, etc. which would also set the correct babel language for LaTeX etc.
A command line switch for pandoc-citeproc -y and -j to enable setting the locale to be used would also be nice.

@jgm
Owner

jgm commented May 6, 2015

@nickbart1980 - in bibtex conversion, pandoc-citeproc looks to the LANG environment variable.
Also, you can use -M locale=fr-FR to set the locale metadata variable from the command line.
(-V only affects variables passed to templates.)
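
A minimal sketch of how a POSIX-style LANG value such as fr_FR.UTF-8 could be split into a language code and a country code. The `parseLang` helper is hypothetical and only illustrates the idea; pandoc-citeproc's own handling of LANG may differ.

```haskell
import System.Environment (lookupEnv)

-- Split a value like "fr_FR.UTF-8" into ("fr", "FR").
-- Anything after '.' (encoding) or '@' (modifier) is dropped;
-- a missing country part becomes the empty string.
parseLang :: String -> (String, String)
parseLang s =
  let base         = takeWhile (`notElem` ".@") s
      (lang, rest) = break (== '_') base
  in (lang, drop 1 rest)

main :: IO ()
main = do
  env <- lookupEnv "LANG"          -- Nothing if LANG is unset
  print (fmap parseLang env)
  print (parseLang "fr_FR.UTF-8")
```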

@jgm
Owner

jgm commented May 6, 2015

OK, I think this is all working now.

jgm closed this as completed May 6, 2015

@njbart
Contributor

njbart commented May 6, 2015

Well, it doesn't seem to work here yet. Neither locale: fr-FR in the metadata nor -M locale=fr-FR on the command line does the trick: both generate French guillemets around the title (as expected) but output “PhD thesis”. “Thèse de doctorat” is printed only when the export LANG=fr_FR.UTF-8 line is uncommented.

Example:

#!/bin/sh

#export LANG=fr_FR.UTF-8

echo "@thesis{item1,
    Author = {Student, Stu},
    Date = {2013},
    Institution = {Institution},
    Location = {Location},
    Title = {Title},
    Type = {phdthesis},
}
" > test-thesis.bib

pandoc -F pandoc-citeproc -t markdown-citations << EOT

@item1

---
locale: fr-FR
bibliography: test-thesis.bib
...

EOT

pandoc -F pandoc-citeproc -t markdown-citations  -M locale=fr-FR << EOT

@item1

---
bibliography: test-thesis.bib
...
EOT

@jgm
Owner

jgm commented May 6, 2015

Yes, for bibtex conversion only LANG matters; locale in metadata is
ignored. I'll look into fixing that.
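
One plausible precedence for such a fix, sketched under the assumption that a locale given in the metadata should override LANG and that en-US is the final default. The thread does not show what the referenced commit actually does, so `chooseLocale` is purely illustrative.

```haskell
import Control.Applicative ((<|>))
import Data.Maybe (fromMaybe)

-- Pick the locale for bibtex conversion (assumed precedence):
--   1. the locale from the document metadata, if present (e.g. "fr-FR");
--   2. otherwise LANG, normalized from "fr_FR.UTF-8" to "fr-FR";
--   3. otherwise "en-US".
chooseLocale :: Maybe String -> Maybe String -> String
chooseLocale metaLocale envLang =
  fromMaybe "en-US" (metaLocale <|> fmap fromPosix envLang)
  where
    -- Drop the encoding/modifier suffix and use '-' instead of '_'.
    fromPosix = map dashify . takeWhile (`notElem` ".@")
    dashify c = if c == '_' then '-' else c

main :: IO ()
main = do
  putStrLn (chooseLocale (Just "fr-FR") (Just "en_US.UTF-8"))
  putStrLn (chooseLocale Nothing (Just "fr_FR.UTF-8"))
  putStrLn (chooseLocale Nothing Nothing)
```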

jgm added a commit that referenced this issue May 6, 2015
@jgm
Owner

jgm commented May 6, 2015

@nickbart1980 this last commit should do the trick.

@njbart
Contributor

njbart commented May 7, 2015

Yes it does. Great.
