When the sync folder path contains Chinese characters, files are synced to an unexpected path #576

Closed
binsee opened this issue Oct 30, 2020 · 2 comments · Fixed by rime/librime#806

binsee commented Oct 30, 2020

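Weasel version: 0.14.3
OS version: Windows 10 x64

Problem description

sync_dir is set in installation.yaml so that the configuration is synced to a specified directory.
However, when the specified directory path contains Chinese (non-ASCII) characters, the path actually synced to is a garbled one, not the path given in the config file.

Details

  • Config file: installation.yaml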
# encoding: utf-8

distribution_code_name: Weasel
distribution_name: "小狼毫"
distribution_version: 0.14.3
install_time: "Sat Oct 31 01:25:18 2020"
installation_id: "MyPC"
rime_version: 1.5.3
sync_dir: 'E:\Documents\坚果云\RimeSync'
  • Run a sync

  • Path actually synced to: E:\Documents\鍧氭灉浜慭RimeSync\MyPC

  • If the config file is saved with ANSI encoding instead, the sync goes to the expected path

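Analysis

Testing shows the following:

// UTF-8 encoded
E:\Documents\坚果云\RimeSync

// The same bytes displayed as ANSI
E:\Documents\鍧氭灉浜慭RimeSync
  • rime's YAML files are UTF-8 encoded by default, so installation.yaml is also saved as UTF-8 by default when it is edited.
  • librime reads the path from installation.yaml, but treats it as ANSI-encoded when it builds the path.
  • When the path string contains Chinese and is UTF-8 encoded, using it directly as ANSI produces a garbled path.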
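
The mismatch above can be illustrated at the byte level. The following is a minimal sketch, assuming the ANSI code page is GBK (CP936), which the report does not state explicitly. It only models how a GBK-style decoder groups bytes into one-byte and two-byte units; it does not map units to characters. The function names are hypothetical and are not part of librime.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Groups bytes the way a GBK (CP936) decoder frames them: a byte in
// 0x81..0xFE is taken as a lead byte and consumes the following byte;
// any other byte is a single-byte unit.
std::vector<std::string> SplitAsGbkUnits(const std::string& bytes) {
  std::vector<std::string> units;
  for (std::size_t i = 0; i < bytes.size();) {
    const unsigned char b = static_cast<unsigned char>(bytes[i]);
    const bool is_lead = b >= 0x81 && b <= 0xFE;
    const std::size_t len = (is_lead && i + 1 < bytes.size()) ? 2 : 1;
    units.push_back(bytes.substr(i, len));
    i += len;
  }
  return units;
}

// UTF-8 bytes of the tail of the reported sync_dir: three Chinese
// characters (3 bytes each), a backslash, then "RimeSync".
std::string SampleUtf8Tail() {
  return "\xE5\x9D\x9A\xE6\x9E\x9C\xE4\xBA\x91\\RimeSync";
}

// True if the backslash separator survives as its own single-byte unit.
bool HasStandaloneBackslash(const std::vector<std::string>& units) {
  for (const auto& unit : units) {
    if (unit == "\\") return true;
  }
  return false;
}
```

Under this framing the nine UTF-8 bytes plus the backslash pair up into five two-byte units, so the separator is absorbed as the trailing byte of the fifth unit. That is consistent with the garbled path in the report, which shows five unexpected characters and no backslash before RimeSync.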

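Temporary workarounds

  • Do not use Chinese characters in the sync folder path
  • Save installation.yaml with ANSI encoding

Additional notes

Since weasel performs the sync by calling librime's API, the problematic code is in librime, so this issue should perhaps be moved to librime.
However, I am not familiar with C++ and suspect it might be a build-related problem or something else, so I filed it under weasel. Maintainers, please judge whether it should be moved to librime.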

lotem (Member) commented Nov 2, 2020

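I agree with the analysis and the temporary workarounds.
librime uses UTF-8 encoding throughout; conversion to the platform encoding should be handled by the frontend.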

Qeynos commented Dec 18, 2023

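In addition, extension dictionaries whose file names contain Chinese cannot be referenced either. Given that this is an input method used mainly for typing Chinese, that is a little odd.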

lotem added a commit to lotem/librime that referenced this issue Feb 3, 2024
Follow @fxliang 's PR, use `u8path` on Windows to convert UTF-8 string
to Windows native path.
Closes rime#804

Fixes rime/weasel#576
Fixes rime/weasel#1080

BREAKING CHANGE: `installation.yaml` should be UTF-8 encoded.

Previously on Windows, the file could be written in local encoding to
enable paths with non-ASCII characters. It should be updated to UTF-8
after this change.
lotem added a commit to lotem/librime that referenced this issue Feb 4, 2024
graphemecluster pushed a commit to TypeDuck-HK/librime that referenced this issue Mar 18, 2024
refactor: convert path to native encoding on Windows

feat(rime_api): provide secure version of path getter functions `RimeApi::get_*_dir_s`.

Follow @fxliang 's PR, use `u8path` on Windows to convert UTF-8 string
to Windows native path.

Closes rime#804
Fixes rime/weasel#576
Fixes rime/weasel#1080

BREAKING CHANGE: Most `string` filenames in APIs are changed to `path`;
`installation.yaml` should be UTF-8 encoded.

Previously on Windows, the file could be written in local encoding to
enable paths with non-ASCII characters. It should be updated to UTF-8
after this change.

Details of the code refactor

Wrap `std::filesystem::path` in a thin wrapper class `rime::path` which calls `std::filesystem::u8path` in the constructor on Windows.

Operator `/=` and `/` are also overloaded to convert the right operand from UTF-8 string to native path.

Follow these rules to apply correct conversion between `string` and `rime::path`:

- construct `rime::path` with UTF-8 encoded string;
- get native string by `path::u8string`;
- to extract UTF-8 string from `path`, for example to find schema ID from file name, call `path::u8string`;
- avoid implicit conversion from string, which results in `std::filesystem::path` without performing UTF-8 to native conversion;
- explicitly construct `rime::path` from `std::filesystem::path` before append operation, to ensure the overloaded operator with string conversion is used.
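
The rules above can be sketched as a small wrapper. This is an illustrative sketch only, not librime's actual `rime::path`: the names `Utf8Path` and `FromNative` are hypothetical, and composition is used here instead of whatever structure the real class has. It relies on `std::filesystem::u8path`, which exists in C++17 and is deprecated (but still available) in C++20.

```cpp
#include <filesystem>
#include <string>
#include <utility>

namespace fs = std::filesystem;

// Hypothetical stand-in for a UTF-8 aware path wrapper.
class Utf8Path {
 public:
  Utf8Path() = default;

  // Explicit, so a plain string is never silently treated as a
  // native-encoded path; the input is always interpreted as UTF-8.
  explicit Utf8Path(const std::string& utf8) : native_(fs::u8path(utf8)) {}

  // Wraps an already-native path before any append, so that later
  // appends go through the UTF-8 converting operators below.
  static Utf8Path FromNative(fs::path native) {
    Utf8Path result;
    result.native_ = std::move(native);
    return result;
  }

  // Appends a UTF-8 component, converting it to the native form first.
  Utf8Path& operator/=(const std::string& utf8) {
    native_ /= fs::u8path(utf8);
    return *this;
  }

  friend Utf8Path operator/(Utf8Path lhs, const std::string& utf8) {
    lhs /= utf8;
    return lhs;
  }

  // Native path, suitable for passing to std::filesystem functions.
  const fs::path& native() const { return native_; }

  // UTF-8 text, for example to derive an ID from a file name.
  // The copy handles both std::string (C++17) and std::u8string (C++20).
  std::string u8string() const {
    const auto encoded = native_.u8string();
    return std::string(encoded.begin(), encoded.end());
  }

 private:
  fs::path native_;
};
```

With such a wrapper, a `sync_dir` value read from a UTF-8 `installation.yaml` would be passed to the explicit constructor, and an installation ID such as `MyPC` would be appended with `/`, so every string that enters the path is converted from UTF-8 in one place.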