-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Extend images regex to also match the appendix after file extension #409
base: master
Are you sure you want to change the base?
Conversation
The original code tries to match several URLs in a buffer, with this change that no longer works: (let ((haystack "foo http://xxx.jpg <https://bar.jpg?>http://baz.jpg")
(needle "\\(https?://[^ ]*?\\.\\(?:png\\|jpg\\|jpeg\\|svg\\|gif\\).*\\)"))
(with-temp-buffer
(insert haystack)
(goto-char (point-min))
(while (re-search-forward needle nil t)
(message "%S" (match-string-no-properties 1)))))
;; "http://xxx.jpg <https://bar.jpg?>http://baz.jpg" Generally, |
That's what I was wondering to but I thought it only works per line so how can white space by an issue.
That sounds like it make sense. |
>Instead, it should be delimited by URL delimiters (whitespace, double
>quote, angular brackets). The existing regex could be rewritten using rx
>if you prefer.
What is rx in this case?
|
S-expression notation for regular expressions. Try |
Some images have an appendix after the file extension making the builtin regex fail. An example would be something like https://example.com/foo.svg?size=800x600. A fix for this is to match anything until a delimiters like "'>) or :space:tab:
aa569ac
to
e6efcd6
Compare
@@ -74,7 +74,7 @@ See `enable-circe-display-images'." | |||
:group 'circe) | |||
|
|||
(defcustom circe-display-images-image-regex | |||
"\\(https?://[^ ]*?\\.\\(?:png\\|jpg\\|jpeg\\|svg\\|gif\\)\\)" | |||
"\\(https?://[^ ]*?\\.\\(?:png\\|jpg\\|jpeg\\|svg\\|gif\\)[^\]\)>\"\' \n]*\\)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason you use rx is because it generates optimal regex and avoids typical errors, such as needlessly escaping characters that are not considered special in a character alternative. The URL delimiters would be matched by (rx (* (not (any "<>\"\t\r\n "))))
for example. Parentheses, square brackets and single quotes are not supposed to be there.
Additionally, that regex should be used after the "https?://" part. rx allows avoiding duplication here if the non-macro form is used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, I pushed my fixes to the existing regex that I already saved before working on replacing it with rx.
Some images have an appendix after the file extension making the
builtin regex fail.
An example would be something like https://example.com/foo.svg?size=800x600.