Add cleanup action for "LaTeX to LaTeX aware Unicode" #8715

Open
JasonGross opened this issue Apr 23, 2022 · 2 comments


@JasonGross

Problem:

  • There is no cleanup action for converting (old) bibliographic data that is (still) formatted in LaTeX with non-Unicode characters into Unicode-aware LaTeX formatting (newer LaTeX setups, e.g. current LaTeX2e releases, can now read most Unicode characters).
  • Current workarounds include converting from LaTeX to Unicode and then back to LaTeX, while manually checking whether any characters were wrongly converted. This is inefficient and takes a long time.

Desired Solution:

  • Create a cleanup action for "LaTeX to Unicode aware LaTeX".

Example workflow:

  1. Have the following entry (BEFORE using the cleanup action):

    @Article{Testkey,
      author   = {Testauthor},
      title    = {Bibliographic data that can be read by LaTeX engines},
      a = {Here is a backslashed percentage sign \% and it should be excluded from conversion},
      b = {Here is a \textcopyright{} and it should be converted to Unicode}, 
    }
    

    (Comment: with the inputenc package, LaTeX treats © in the source as equivalent to \textcopyright{}. When using the "LaTeX to Unicode aware LaTeX" cleanup action, the result of the conversion should therefore also be ©.)

  2. Use cleanup action "LaTeX to Unicode aware LaTeX"

  3. AFTER using the cleanup action, the following result should emerge:

    @Article{Testkey,
      author   = {Testauthor},
      title    = {Bibliographic data that can be read by LaTeX engines},
      a = {Here is a backslashed percentage sign \% and it should be excluded from conversion},
      b = {Here is a © and it should be converted to Unicode}, 
    }
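
The before/after entries above could be reproduced with something like the following Python sketch. It is only an illustration of the requested behavior, not JabRef code: the function name `latex_to_unicode_aware_latex` and the three-entry mapping table are hypothetical, and a real cleanup action would need a much larger table.

```python
import re

# Hypothetical, deliberately tiny mapping from command names to Unicode.
# A real cleanup action would need a far larger table.
LATEX_TO_UNICODE = {
    "textcopyright": "\u00a9",   # ©
    "textregistered": "\u00ae",  # ®
    "ss": "\u00df",              # ß
}

# A backslash followed by either a command name (letters, optionally
# terminated by an empty group "{}") or exactly one other character,
# such as the "%" in "\%".
TOKEN = re.compile(r"\\(?:([A-Za-z]+)(?:\{\})?|.)", re.DOTALL)


def latex_to_unicode_aware_latex(text):
    """Replace known LaTeX commands with Unicode; leave everything else as is."""

    def replace(match):
        name = match.group(1)
        if name is not None and name in LATEX_TO_UNICODE:
            return LATEX_TO_UNICODE[name]
        # Escaped symbols (\%, \&, ...) and unknown commands stay untouched.
        return match.group(0)

    return TOKEN.sub(replace, text)


field_a = r"Here is a backslashed percentage sign \% and it should be excluded from conversion"
field_b = r"Here is a \textcopyright{} and it should be converted to Unicode"

print(latex_to_unicode_aware_latex(field_a))
print(latex_to_unicode_aware_latex(field_b))
```

Because the pattern consumes a backslash together with the character after it, `\%` is matched as one token and returned unchanged. The sketch does not handle the space that LaTeX swallows after a command written without `{}`; that case would need extra care.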
    

"Special Symbols" that would need to be excluded from conversion:

  • The list should be similar to the symbols mentioned in Add integrity check for LaTeX special characters #8712.
  • At the very least, Page 15 (Table 1), which lists the escapable special characters in LaTeX.
  • Maybe also Page 15 Table 2 and Page 16 Table 3.
  • There might be a lot more, but I am not knowledgeable enough to list them here. If you know of any, just post them in this thread.
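
As a starting point for discussion, an exclusion list could be expressed roughly as below. This is a sketch under assumptions, not the list from the referenced tables: `ESCAPABLE_SPECIALS` holds the seven characters that are commonly escaped with a backslash in LaTeX, `MUST_STAY_AS_COMMAND` holds three commands whose Unicode counterparts (`\`, `~`, `^`) are themselves special in LaTeX source, and the helper name `is_excluded` is made up for this example.

```python
import re

# Assumption: the seven characters commonly escaped with a backslash.
ESCAPABLE_SPECIALS = set("#$%&_{}")

# Assumption: commands whose plain-character counterpart (\, ~, ^) is
# itself special in LaTeX source, so converting them would break the entry.
MUST_STAY_AS_COMMAND = {"textbackslash", "textasciitilde", "textasciicircum"}

# A single LaTeX token: backslash + command name, or backslash + one character.
TOKEN = re.compile(r"\\(?:([A-Za-z]+)|(.))", re.DOTALL)


def is_excluded(token):
    """Return True if this LaTeX token should be skipped by the conversion."""
    match = TOKEN.fullmatch(token)
    if match is None:
        return False
    name, symbol = match.groups()
    if name is not None:
        return name in MUST_STAY_AS_COMMAND
    return symbol in ESCAPABLE_SPECIALS


for token in [r"\%", r"\_", r"\textbackslash", r"\textcopyright"]:
    print(token, is_excluded(token))
```

Keeping the list as plain data makes it easy to extend when more symbols are reported in this thread.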

Additional Information

Originally posted by @ThiloteE in #8490 (comment)

@zkl-ai
Contributor

zkl-ai commented Apr 30, 2022

Hello, can I take this issue? I have done something related to cleanup actions.

@ThiloteE
Member

Sure you can!

Projects
Status: Free to take
Status: Normal priority
Development

No branches or pull requests

4 participants