Allow attribute DISABLE-COMPRESSIONS in UNICODE collations #6915
Comments
I've changed the title of this issue so that it can be treated as an improvement instead of a bug. The DISABLE-COMPRESSIONS attribute will be allowed for UNICODE collations; without compressions (contractions), the problem with search keys does not exist. Of course, that will change the sort keys, so one must weigh the pros and cons. Usage will be:
|
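A minimal sketch of what such a declaration might look like, assuming the standard Firebird `CREATE COLLATION` syntax with a specific-attributes string. The collation name `UNICODE_CSCZ_CS_NOCOMP` and the `LOCALE=cs_CZ` attribute are illustrative assumptions, not taken from the thread, and the attribute is only accepted by server builds that include this improvement:

```sql
-- Hypothetical collation name; LOCALE=cs_CZ is assumed for a Czech setup.
-- DISABLE-COMPRESSIONS=1 turns off contractions, which changes sort keys.
CREATE COLLATION UNICODE_CSCZ_CS_NOCOMP
    FOR UTF8
    FROM UNICODE
    CASE SENSITIVE
    'LOCALE=cs_CZ;DISABLE-COMPRESSIONS=1';
```

Because the attribute changes sort keys, any index on a column that uses such a collation is built under the new ordering, not the locale's dictionary order.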
Hi, Speed is now OK. Unfortunately, we cannot use the "DISABLE-COMPRESSIONS=1" option because it changes the sorting behavior, which is then incorrect.
From https://firebirdsql.org/refdocs/langrefupd21-ddl-collation.html: "Disables compressions (aka contractions). Compressions cause certain character sequences to be sorted as atomic units, e.g. Spanish c+h as a single character ch."
How to simulate the problem:
create collation UNICODE_CSCZ_CS
CREATE TABLE TEST_UNICODE_COLLATE (
INSERT INTO TEST_UNICODE_COLLATE (FIELD_WIN1250_PXW_CSY_CS, FIELD_UTF8_UNICODE_CSCZ_CI, FIELD_UTF8_UNICODE_CSCZ_CS)
INSERT INTO TEST_UNICODE_COLLATE (FIELD_WIN1250_PXW_CSY_CS, FIELD_UTF8_UNICODE_CSCZ_CI, FIELD_UTF8_UNICODE_CSCZ_CS)
INSERT INTO TEST_UNICODE_COLLATE (FIELD_WIN1250_PXW_CSY_CS, FIELD_UTF8_UNICODE_CSCZ_CI, FIELD_UTF8_UNICODE_CSCZ_CS)
INSERT INTO TEST_UNICODE_COLLATE (FIELD_WIN1250_PXW_CSY_CS, FIELD_UTF8_UNICODE_CSCZ_CI, FIELD_UTF8_UNICODE_CSCZ_CS)
INSERT INTO TEST_UNICODE_COLLATE (FIELD_WIN1250_PXW_CSY_CS, FIELD_UTF8_UNICODE_CSCZ_CI, FIELD_UTF8_UNICODE_CSCZ_CS)
SELECT FIELD_WIN1250_PXW_CSY_CS
SELECT FIELD_UTF8_UNICODE_CSCZ_CI
SELECT FIELD_UTF8_UNICODE_CSCZ_CS |
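The sorting concern can be illustrated with a small, self-contained sketch. This is not the reporter's script: the table `TEST_CH_ORDER`, its columns, the collation names, the `LOCALE=cs_CZ` attribute and the sample words are all assumptions chosen for illustration. In Czech, the digraph "ch" is a letter of its own that sorts after "h", so a collation without contractions would be expected to misplace words beginning with "ch":

```sql
-- Hypothetical names throughout; LOCALE=cs_CZ is an assumption.
CREATE COLLATION CZ_WITH_CONTR FOR UTF8 FROM UNICODE
    CASE SENSITIVE 'LOCALE=cs_CZ';
CREATE COLLATION CZ_NO_CONTR FOR UTF8 FROM UNICODE
    CASE SENSITIVE 'LOCALE=cs_CZ;DISABLE-COMPRESSIONS=1';
COMMIT;

-- The same words are stored twice, once under each collation.
CREATE TABLE TEST_CH_ORDER (
    WORD_CONTR    VARCHAR(20) CHARACTER SET UTF8 COLLATE CZ_WITH_CONTR,
    WORD_NO_CONTR VARCHAR(20) CHARACTER SET UTF8 COLLATE CZ_NO_CONTR
);
COMMIT;

INSERT INTO TEST_CH_ORDER VALUES ('chata', 'chata');
INSERT INTO TEST_CH_ORDER VALUES ('cihla', 'cihla');
INSERT INTO TEST_CH_ORDER VALUES ('hora',  'hora');
COMMIT;

-- Czech dictionary order places "ch" after "h": cihla, hora, chata.
SELECT WORD_CONTR FROM TEST_CH_ORDER ORDER BY WORD_CONTR;

-- Without contractions "ch" is compared as "c" then "h",
-- so "chata" would be expected to sort before "cihla".
SELECT WORD_NO_CONTR FROM TEST_CH_ORDER ORDER BY WORD_NO_CONTR;
```

Comparing the two result sets side by side shows whether a given server build keeps the locale's ordering once contractions are disabled.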
I explained the problem and the trade-off. Your initial test was also a bit artificial, since it searched for values starting with a single letter. Depending on your data volume (or your artificial test data), you will have problems with any implementation. |
Adriano, the problem is that the trade-off is unacceptable for certain languages. So it is more of a band-aid workaround for the "speed problem" than a real solution. |
@asfernandes, have you considered a better solution? Changing sort keys is really unacceptable for any language that uses contractions (like Czech), so users of such languages don't have a real fix: the choice is between very bad performance and incorrect sorting. It would be much appreciated if a better solution appeared in the next maintenance release. |
Written by @javihonza on the firebird-support list https://groups.google.com/g/firebird-support/c/VCXnWp0IZVw:
Hello,
we have a speed problem when using a national COLLATE on UTF8 columns.
If we use UTF8 without COLLATE, the speed problem does not arise. The problem also does not occur when using ANSI with a national COLLATE.
The speed problem occurs with both CASE SENSITIVE and CASE INSENSITIVE collations. Tested on FB3 (UCI 6.9) and FB 4 (default UCI).
The speed problem prevents us from switching to Unicode (ANSI -> UTF8).
Whom should we contact to get the problem solved?
How to simulate the problem: