Add char #290

Open: puripuri2100 wants to merge 5 commits into master

@puripuri2100 puripuri2100 commented Sep 23, 2021

Close #407

I added a char type and some functions for it.

The char type is implemented with OCaml's Uchar.t type.

ref: https://github.com/gfngfn/SATySFi/projects/1#card-54377937

List of added functions:

  • char-to-string : char -> string
  • char-to-unicode-point : char -> int
  • char-of-unicode-point : int -> char
  • char-same : char -> char -> bool

See the tests/char.saty file for how to use them.
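
For readers who do not know SATySFi, the intended behavior of the four primitives listed above can be sketched in Python. This is only an illustration: the `Char` class and the snake_case function names are hypothetical stand-ins, and rejecting surrogates and out-of-range integers at construction time is an assumption about how the PR behaves, not something the description states.

```python
from dataclasses import dataclass

MAX_CODE_POINT = 0x10FFFF
SURROGATES = range(0xD800, 0xE000)  # U+D800..U+DFFF


@dataclass(frozen=True)
class Char:
    """Hypothetical stand-in for the PR's char type: one Unicode scalar value."""
    scalar: int

    def __post_init__(self):
        # Assumption: invalid integers are rejected when the char is built.
        in_range = 0 <= self.scalar <= MAX_CODE_POINT
        if not in_range or self.scalar in SURROGATES:
            raise ValueError(f"not a Unicode scalar value: {self.scalar:#x}")


def char_of_unicode_point(n: int) -> Char:
    """Mirrors char-of-unicode-point : int -> char."""
    return Char(n)


def char_to_unicode_point(c: Char) -> int:
    """Mirrors char-to-unicode-point : char -> int."""
    return c.scalar


def char_to_string(c: Char) -> str:
    """Mirrors char-to-string : char -> string."""
    return chr(c.scalar)


def char_same(a: Char, b: Char) -> bool:
    """Mirrors char-same : char -> char -> bool (plain scalar-value equality)."""
    return a.scalar == b.scalar
```

With these definitions, `char_to_string(char_of_unicode_point(0x3042))` yields the one-character string for U+3042, and `char_of_unicode_point(0xD800)` raises an error.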

@puripuri2100

🤔🤔🤔

### output ###
# Error: Unable to set up the cache root directory:
# Unix.Unix_error(Unix.EEXIST, "mkdir",
# "/Users/runner/.cache/dune/db/files/v4")

https://github.com/gfngfn/SATySFi/runs/3682988795?check_suite_focus=true#step:5:5926

@na4zagin3

na4zagin3 commented Sep 23, 2021

  1. *-unicode-point should be renamed to *-unicode-scalar-value, because “Unicode point” is not an established term (see “Unicode scalar value”). Or did you mean “Unicode code point”?
  2. Is *-same conventional? I would expect char-same to be named char-equal or similar, because it is the equality defined on char. (I'd love to hear @gfngfn’s opinion.)
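
The difference between the two terms in point 1 can be made concrete with a short Python sketch. The function names are hypothetical; the ranges are the ones the Unicode Standard defines: code points run from U+0000 to U+10FFFF, and scalar values are the code points that are not surrogates (U+D800 to U+DFFF).

```python
def is_code_point(n: int) -> bool:
    """True for any Unicode code point, surrogates included."""
    return 0 <= n <= 0x10FFFF


def is_scalar_value(n: int) -> bool:
    """True for code points that are not surrogates (U+D800..U+DFFF)."""
    return is_code_point(n) and not (0xD800 <= n <= 0xDFFF)
```

A surrogate such as 0xD800 is therefore a code point but not a scalar value, which is why the two candidate names describe different sets of integers.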

@na4zagin3

na4zagin3 commented Sep 23, 2021

This is just my two cents.

Can this be implemented as a macro like ~(char @`2`) with char : input-position * string -> char?

Otherwise, is it possible to introduce a more generic literal syntax (like Scala’s string interpolation, C++’s user-defined literals, the Lisp family’s reader macros, or SRFI-10) instead of one specific to char?

For example, define a new syntax @⟨ident_tag⟩⟨string-literal⟩ (e.g., @char`a`) which will be parsed as ~(⟨ident_tag⟩ @⟨string-literal⟩), where ⟨ident_tag⟩ should be a function with type input-position * string -> char. If the string is not valid, the function calls abort-with-message. It may be better to introduce a separate namespace for tags, defined with a new syntax let-literal @⟨ident_tag⟩ ⟨args⟩ = ⟨expr⟩.
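
The tagged-literal idea can be sketched outside SATySFi as a registry that maps a tag name to a function taking a source position and the literal's text. Everything in the Python sketch below is hypothetical: `let_literal`, `LITERAL_TAGS`, and `expand_tagged_literal` are illustrative names, a `(line, column)` tuple stands in for input-position, and raising `ValueError` stands in for abort-with-message.

```python
from typing import Any, Callable, Dict, Tuple

Position = Tuple[int, int]  # hypothetical (line, column) stand-in for input-position
LiteralTag = Callable[[Position, str], Any]

# Separate namespace for literal tags, as suggested for let-literal.
LITERAL_TAGS: Dict[str, LiteralTag] = {}


def let_literal(name: str):
    """Register a function as the handler for the tag `name`."""
    def register(fn: LiteralTag) -> LiteralTag:
        LITERAL_TAGS[name] = fn
        return fn
    return register


@let_literal("char")
def char_tag(pos: Position, text: str) -> str:
    # Reject anything that is not exactly one character.
    if len(text) != 1:
        line, col = pos
        raise ValueError(
            f"{line}:{col}: char literal needs exactly one character, got {text!r}"
        )
    return text


def expand_tagged_literal(tag: str, pos: Position, text: str) -> Any:
    """Roughly what a parser would do when it sees @tag`text` at `pos`."""
    return LITERAL_TAGS[tag](pos, text)
```

Here `expand_tagged_literal("char", (1, 1), "a")` returns `"a"`, while a two-character literal is rejected with a message that carries the position.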

@na4zagin3

na4zagin3 commented Sep 23, 2021

Here are another two cents. Another option would be “not to introduce a literal syntax for char at all”. I believe casual users shouldn’t have to care what a Unicode scalar value is. I'm not a big fan of user-facing APIs that require char arguments; they will likely not take Unicode equivalence into account.
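
The Unicode equivalence concern can be illustrated with Python's standard `unicodedata` module: the same user-perceived character may be one scalar value or several, so equality on single scalar values does not see canonical equivalence. The helper name `canonically_equal` is hypothetical.

```python
import unicodedata

precomposed = "\u00e9"   # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # U+0065 followed by U+0301 COMBINING ACUTE ACCENT


def canonically_equal(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)
```

The two strings render identically, yet the first has length 1 and the second has length 2, so a comparison on individual scalar values treats them as different while `canonically_equal` treats them as the same.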

@y-yu y-yu mentioned this pull request Oct 19, 2021
@puripuri2100

Thank you for pointing out the naming of the primitive functions.

Let me explain why I introduced the literal syntax for the char type.
I would like to use the char type when parsing strings, so I expect it to have two properties:

  • Guaranteed to be exactly one character
  • Pattern matching is possible

Currently, I have to pattern match on Unicode scalar values instead:
https://github.com/puripuri2100/SATySFi-json/blob/master/src/json.satyg#L78
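
The readability cost of matching on integers can be shown with a small Python sketch. It is not taken from the linked file; both function names and the three token classes are made up for illustration.

```python
def classify_by_scalar_value(cp: int) -> str:
    """Matching on integers: the reader has to know the code chart."""
    if cp == 0x22:
        return "quote"
    if cp == 0x5C:
        return "backslash"
    if cp == 0x2F:
        return "slash"
    return "other"


def classify_by_char(c: str) -> str:
    """Matching on one-character values: the pattern shows the character."""
    if len(c) != 1:
        raise ValueError("expected exactly one character")
    if c == '"':
        return "quote"
    if c == "\\":
        return "backslash"
    if c == "/":
        return "slash"
    return "other"
```

Both functions agree on every input character, but only the second one can be checked by eye, and only the second one enforces the "exactly one character" property.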

@na4zagin3
Copy link
Contributor

Hmm, then could we introduce a macro function char : string -> int and make it available in match clauses?

val f x =
  match x with
  | ~(char `/`) -> lex-string (str-stack^`"`) line (column + 2) ys

or with a new macro syntax @⟨ident_tag⟩⟨string-literal⟩,

val f x =
  match x with
  | @char`/` -> lex-string (str-stack^`"`) line (column + 2) ys

Otherwise, we can extend matching with view patterns

val char c =
  match string-length c with
  | 0 -> ``
  | 1 -> string-sub c 0 1
  | _ -> ``
  end

val f x =
  match x with
  | (char -> `\`) -> lex-string (str-stack^`"`) line (column + 2) ys

or extractors

val match Char c =
  match string-length c with
  | 0 -> None
  | 1 -> Some (string-sub c 0 1)
  | _ -> None
  end

val f x =
  match x with
  | Char(`\`) -> lex-string (str-stack^`"`) line (column + 2) ys
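
The extractor variant above can be approximated in Python to show what such a pattern would compute. This is only a sketch: `extract_char` and `is_backslash` are hypothetical names, `None` plays the role of the None case, and Python's `len` counts code points, which may differ from what string-length counts in SATySFi.

```python
from typing import Optional


def extract_char(s: str) -> Optional[str]:
    """Return the string itself if it holds exactly one character, else None."""
    return s if len(s) == 1 else None


def is_backslash(s: str) -> bool:
    # Roughly the test a Char(...) pattern on a backslash would perform:
    # run the extractor first, then compare its result with the sub-pattern.
    return extract_char(s) == "\\"
```

The extractor runs before the comparison, so an empty or multi-character string simply fails to match instead of raising an error.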

@gfngfn gfngfn added this to the v0.1.0 milestone Apr 3, 2024
Successfully merging this pull request may close these issues.

Introduce base type char for Unicode code points
3 participants