Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

use correct options in specs #1511

Merged
merged 13 commits into from
Jul 2, 2019
Merged

use correct options in specs #1511

merged 13 commits into from
Jul 2, 2019

Conversation

UziTech
Copy link
Member

@UziTech UziTech commented Jul 1, 2019

Marked version: master

Description

I am trying to implement the changes from @Feder1co5oave in #1407

  • CommonMark tests are running with gfm: false
  • GFM tests include CommonMark tests but are using gfm: true
  • Original tests are running with pedantic: true
  • BREAKING CHANGE Move fences to CommonMark
  • BREAKING CHANGE Move tables to GFM
  • Improve heading, lheading, and paragraph

These changes improve CommonMark compliance significantly:

--------------------------------------------------------
|                        CommonMark                    |
|                                         before  after|
| Tabs                                       91%   91% |
| Precedence                                100%  100% |
| Thematic breaks                           100%  100% |
| ATX headings                               83%   94% |
| Setext headings                            81%   89% |
| Indented code blocks                      100%  100% |
| Fenced code blocks                         97%   97% |
| HTML blocks                               100%  100% |
| Link reference definitions                 89%   89% |
| Paragraphs                                100%  100% |
| Blank lines                               100%  100% |
| Block quotes                               88%   92% |
| List items                                 73%   73% |
| Lists                                      50%   58% |
| Inlines                                   100%  100% |
| Backslash escapes                          85%   85% |
| Entity and numeric character references    76%   76% |
| Code spans                                 91%   91% |
| Emphasis and strong emphasis               64%   64% |
| Links                                      78%   78% |
| Images                                     68%   68% |
| Autolinks                                  79%  100% |
| Raw HTML                                   90%   90% |
| Hard line breaks                          100%  100% |
| Soft line breaks                          100%  100% |
| Textual content                           100%  100% |
--------------------------------------------------------

Issues

closes #1407
fixes #1510

Contributor

  • Test(s) exist to ensure functionality and minimize regression

Committer

In most cases, this should be a different person than the contributor.

  • Draft GitHub release notes have been updated.
  • CI is green (no forced merge required).
  • Merge PR

@UziTech UziTech requested review from styfle, davisjam and joshbruce July 1, 2019 14:01
@UziTech
Copy link
Member Author

UziTech commented Jul 1, 2019

I also agree with @Feder1co5oave that we should remove the tables option. I can't see why anyone would want GFM without tables.

hr: /^ {0,3}((?:- *){3,}|(?:_ *){3,}|(?:\* *){3,})(?:\n+|$)/,
heading: /^ *(#{1,6}) *([^\n]+?) *(?:#+ *)?(?:\n+|$)/,
nptable: noop,
heading: /^ {0,3}(#{1,6}) +([^\n]*?)(?: +#+)? *(?:\n+|$)/,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks OK

table: noop,
lheading: /^([^\n]+)\n {0,3}(=|-){2,} *(?:\n+|$)/,
paragraph: /^([^\n]+(?:\n(?!hr|heading|lheading| {0,3}>|<\/?(?:tag)(?: +|\n|\/?>)|<(?:script|pre|style|!--))[^\n]+)*)/,
lheading: /^([^\n]+)\n {0,3}(=+|-+) *(?:\n+|$)/,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

lheading: /^([^\n]+)\n {0,3}(=+|-+) *(?:\n+|$)/,
// regex template, placeholders will be replaced according to different paragraph
// interruption rules of commonmark and the original markdown spec:
_paragraph: /^([^\n]+(?:\n(?!hr|heading|lheading|blockquote|fences|list|html)[^\n]+)*)/,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

fences: /^ {0,3}(`{3,}|~{3,})([^`\n]*)\n(?:|([\s\S]*?)\n)(?: {0,3}\1[~`]* *(?:\n+|$)|$)/,
paragraph: /^/,
heading: /^ *(#{1,6}) +([^\n]+?) *#* *(?:\n+|$)/
nptable: /^ *([^|\n ].*\|.*)\n *([-:]+ *\|[-| :]*)(?:\n((?:.*[^>\n ].*(?:\n|$))*)\n*|$)/,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super-linear.

  1. The construct .*\|.* can be exploited through a run of |||...|. Can we replace with [^|]*|[^|]* ?
  2. I believe the same problem applies later in the (?:.*[^>\n ].* section.

paragraph: /^/,
heading: /^ *(#{1,6}) +([^\n]+?) *#* *(?:\n+|$)/
nptable: /^ *([^|\n ].*\|.*)\n *([-:]+ *\|[-| :]*)(?:\n((?:.*[^>\n ].*(?:\n|$))*)\n*|$)/,
table: /^ *\|(.+)\n *\|?( *[-:]+[-| :]*)(?:\n((?: *[^>\n ].*(?:\n|$))*)\n*|$)/
Copy link
Contributor

@davisjam davisjam Jul 2, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same genre of problem as the nptable.

if (/^ *\|(.+)\n *\|?( *[-:]+[-| :]*)(?:\n((?: *[^>\n ].*(?:\n|$))*)\n*|$)/.exec(' | \n' + ' '.repeat(50000))) {
  console.log('match');
}

(Note, here and elsewhere I'm just checking regexes, not full exploitability).

Copy link
Contributor

@davisjam davisjam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To discuss: super-linear regexes.

Not sure if the regexes are new or just re-org'd (in which case we should note them but they shouldn't block this PR).

@UziTech
Copy link
Member Author

UziTech commented Jul 2, 2019

I updated the fences regex but nptable and table kept breaking tests when I was trying to fix them.

nptable and table were just moved from block.tables anyway.

The only changed regexes were heading, lheading, and paragraph

Copy link
Contributor

@davisjam davisjam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM then

@UziTech
Copy link
Member Author

UziTech commented Jul 2, 2019

I did remove the tables option since this PR would make it do nothing anyway.

@Lmillan123

This comment was marked as spam.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

md title parse error
4 participants