-
Notifications
You must be signed in to change notification settings - Fork 727
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Add a kakquote function to the prelude of shell blocks #3340
Comments
@Screwtapello How about quoting each argument, instead of splatting? kak_escape() {
for argument do
printf "'"
printf '%s' "$argument" | sed "s/'/''/g"
printf "'"
printf ' '
done
} |
@Screwtapello I’m using the following pattern for my scripts.
kak_escape() {
for argument do
printf "'"
printf '%s' "$argument" | sed "s/'/''/g"
printf "'"
printf ' '
done
}
declare-option -hidden str chronic_path %sh(dirname "$kak_source")
provide-module chronic %{
define-command chronic -params .. -docstring 'Pipe to Chronic' %{
evaluate-commands -save-regs '|' %{
set-register | %sh{
. "$kak_opt_chronic_path/prelude.sh"
kak_quoted_arguments=$(kak_escape "$@")
printf 'chronic %s' "$kak_quoted_arguments"
}
execute-keys '|<ret>'
}
}
}
require-module chronic |
I like the modularity of having printf "eval -try-client %s echo -markup {Information}%s" \
"$(kakquote "$kak_client")"
"$(kakquote "$msg")" ...but now that I think about it, I could still do that with a quoter that quoted each argument individually. In fact, if I quoted each argument individually, I could do this: kakquote eval -try-client "$kak_client" echo -markup {Information}"$msg" ...which is a lot nicer, although it quotes more than strictly necessary. My one fear with processing arguments individually is that it might be a bit more expensive than quoting the whole string at once. But rather than guess wildly, I decided to write a little benchmark: log() { echo "$*" >&2; }
test_quoter() {
actual=$("$1" "Don't Let's Start")
expected="'Don''t Let''s Start' "
if [ "$actual" != "$expected" ]; then
log "Quoter $1 failed: got $actual expected $expected"
return 1
fi
}
benchmark_quoter() {
quoter="$1"
timeout=5
(
trap "printf '%s\\t%5d\\n' \$quoter \$i >&2; exit" TERM
i=0
while true; do
"$1" "Don't" "Let's" "Start" >/dev/null
i=$(( i + 1 ))
done
) &
sleep "$timeout"
kill $!
wait
}
single_sed_quoter() {
printf "'"
printf "%s" "$*" | sed "s/'/''/g"
printf "' "
}
single_flat_sed_quoter() {
printf "'%s' " "$(printf "%s" "$*" | sed "s/'/''/g")"
}
multi_sed_quoter() {
for each; do
printf "'"
printf "%s" "$each" | sed "s/'/''/g"
printf "' "
done
}
single_builtin_quoter() {
text="$*"
printf "'"
while [ -n "$text" ]; do
case "$text" in
*"'"*)
printf "%s" "${text%%\'*}"
printf "''"
text=${text#*\'}
;;
*)
printf "%s" "$text"
text=""
;;
esac
done
printf "' "
}
multi_builtin_quoter() {
for text; do
printf "'"
while [ -n "$text" ]; do
case "$text" in
*"'"*)
printf "%s" "${text%%\'*}"
printf "''"
text=${text#*\'}
;;
*)
printf "%s" "$text"
text=""
;;
esac
done
printf "' "
done
}
quoters="
single_sed_quoter
single_flat_sed_quoter
multi_sed_quoter
single_builtin_quoter
multi_builtin_quoter
"
for each in $quoters; do
test_quoter "$each" && benchmark_quoter "$each"
done
Note that Here's the results for whatever version of
My original
Overall, I think |
Note that If the numbers above are the amount of times a given function has run in 5s, then here is the time spent on one iteration (first dataset only):
The other implementations all take less than 5ms to run, so knowing which one is N times faster than others is not relevant, because a single iteration of those implementations represents a negligible time loss, considering we're in the shell world.
|
I'd argue that ~5ms is much too expensive for something as trivial as escaping a string. Consider that it's not uncommon to generate option lists and to have to escape each element. OpenBSD's sh is the only platform shell that I've seen which doesn't have printf as builtin, and it would be trivial (although not obvious) for an OpenBSD user to set KAKOUNE_POSIX_SHELL to |
I have some new measurements. I was pointed to @occivink's quoting function in his kakoune-snippets package, which double-escapes apostrophes for some reason, but I edited it to match the other implementations: occivink_builtin_quoter()
{
rest="$*"
printf "'"
while :; do
beforequote="${rest%%"'"*}"
if [ "$rest" = "$beforequote" ]; then
printf %s "$rest"
break
fi
printf "%s''" "$beforequote"
rest="${rest#*"'"}"
done
printf "' "
} I also figured out how to escape a value entirely within single_all_sed_quoter() {
printf "%s" "$*" | sed -e "s/'/''/g; 1s/^/'/; \$s/\$/' /"
}
multi_all_sed_quoter() {
for arg; do
printf "%s" "$arg" | sed -e "s/'/''/g; 1s/^/'/; \$s/\$/' /"
done
} Results for these new implementations, on the same laptop as before, best of 3 runs:
|
Apparently once you have a benchmarking harness, you can't stop coming up with new variants to benchmark. I have some new variations based on
The new versions look like this: single_builtin_quoter_all_backslashes_ntmp() {
text="$*"
printf "'"
while true; do
case "$text" in
*\'*)
printf "%s''" "${text%%\'*}"
text=${text#*\'}
;;
*)
printf "%s' " "$text"
break
;;
esac
done
}
multi_builtin_quoter_all_backslashes_ntmp() {
for text; do
printf "'"
while true; do
case "$text" in
*\'*)
printf "%s''" "${text%%\'*}"
text=${text#*\'}
;;
*)
printf "%s' " "$text"
break
;;
esac
done
done
} (the "ntmp" in the name is short for "no test, merged printfs") Benchmarking them (as before):
Converting to ms for comparison with lenormf's table, that's:
Basically, the new multi-quoter is as fast as the old single-quoter, and the new single-quoter is even faster. My current position:
|
@Screwtapello Does it slow it down to make the implementation a bit more readable? # Prototype:
# kak_escape [text…]
#
# Description:
# Similar to `shell_escape` you may find in other programming languages,
# `kak_escape` escapes each argument so that it can be safely passed to Kakoune.
#
# Implementation:
# Single quotes each argument and doubles the single quotes inside.
#
# Note:
# The resulted text should be used unquoted and is not intended for use in double quotes, nor in single quotes.
#
# Example:
# kak_escape evaluate-commands -try-client "$KAKOUNE_CLIENT" 'echo Tchou' | kak -p "$KAKOUNE_SESSION"
#
kak_escape() {
for text do
printf "'"
while true; do
case "$text" in
*"'"*)
head=${text%%"'"*}
tail=${text#*"'"}
printf "%s''" "$head"
text=$tail
;;
*)
printf "%s' " "$text"
break
;;
esac
done
done
printf '\n'
} |
It does slow things down a bit, although you'll have to decide whether it's significant enough to worry about.
|
@Screwtapello Do you think a final newline character should be added to allow multiple invocations of the function? Example: kak_escape set-register / "$regex"
kak_escape echo -markup "$message" |
Printing a final newline makes that use-case much more convenient, and it wouldn't prevent people from saying I've been using |
This escaping is so common. I could imagine an |
Build up the string & only printf onceI think you can get even more performance by building up the string incrementally and then
Results
TestingI put a snippet up (which also accepts
|
Wow. I was pretty sure I'd squeezed all the juice out of that function already, but clearly not! I had assumed that printing was cheap (because stdout buffering) and string concatenation was expensive (because O(N²)) but apparently it's the other way around in this use-case. That's good to know! |
The devil with But actually it gets even more interesting. By pre-collapsing the strings a-la occivink, and avoiding variables completely (
The results are some 50% faster compared to
|
... but of course, |
Since Kakoune's quoting system was reworked, it's pretty easy to reliably quote Kakoune strings by just doubling apostrophes. In shell, it looks something like this:
However, while working on #3336, I've copied and pasted this fragment into three or four shell blocks already; I expect it (or something like it) is already present in a bunch of other scripts too.
We already inject environment variables into shell blocks; how about we prepend this helpful and near-universally-useful function as well?
(it should be possible to implement this in pure POSIX shell without
sed
, but it would be too complex. If avoiding a fork turns out to be a performance improvement, it'd be nice to make the change once in Kakoune's shell prelude, and not have to copy/paste the change into thousands of existing Kakoune scripts)The text was updated successfully, but these errors were encountered: