-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Specify collation on columns #19275
Comments
@arrkaye Can you point to where this was closed before? |
|
@roji This is 19275. :-) |
Oops, I was referring to #6577 but on second look, that issue is about specifying collation (and other things) on the database at creation time, rather than as on the column here. So not at a dup.. |
@roji Agreed, but my conclusion from thinking about this last night, backed up by the discussions on the other issue today, is that we need to take a holistic approach to collations. Ideally we should be able to run efficient queries when the collation and query are compatible, and throw good, provider-specific exception messages for other cases indicating how to get to efficient queries for other cases.
You already did! |
EF.Collate! |
Thinking about this in general, per-operation collation support (as opposed to column/database collation settings) is a bit scary... Assuming we add this to SqlBinaryExpression (in the extreme case we can make SqlExpression IAnnotatable 😨), preserving these across all transformations is not going to be easy, can see lots of bugs coming. |
EF.Collate is quite similar to EF.StoreType. |
Right, but we already have type mapping on SqlExpression, and a whole inference infrastructure for composing expressions with regards to type mapping... But if you think this is simple 😃 |
I am not saying it is simple but we would need to do it regardless ;) |
Then possibly related: #6717 (comment) |
Tracking this via #19866 and linked issues. |
Reopening to track column collation specification. |
Column specs with both collation and generated: MySQLBoth orderings work: CREATE TABLE BothWork
(
name1 VARCHAR(32) COLLATE utf8mb4_german2_ci AS ('hello'),
name2 VARCHAR(32) AS ('hello') COLLATE utf8mb4_german2_ci
); SqliteBoth orderings work: CREATE TABLE works (
id INT,
name1 TEXT COLLATE NOCASE AS ('hello') STORED,
name2 TEXT AS ('hello') STORED COLLATE NOCASE
); PostgreSQLBoth orderings work: CREATE TABLE works
(
name1 TEXT COLLATE "en-US-x-icu" GENERATED ALWAYS AS ('hello') STORED,
name2 TEXT GENERATED ALWAYS AS ('hello') STORED COLLATE "en-US-x-icu"
); SQL Server... wait for it... wait for it... Only one ordering works (likely because of SQL Server's peculiar syntax around generated columns). CREATE TABLE works
(
id INT,
name AS 'hello' COLLATE German_PhoneBook_CI_AS
);
CREATE TABLE doesntwork
(
id INT,
name1 COLLATE German_PhoneBook_CI_AS AS 'hello'
); |
* Added metadata and migration support for column- and database- level collations. * Scaffolding implemented for SQL Server. Sqlite seems to not be reporting the collation in pragma_table_info (need newer version?) Closes dotnet#19275 Closes dotnet#6577
* Added metadata and migration support for column- and database- level collations. * Scaffolding implemented for SQL Server. Sqlite seems to not be reporting the collation in pragma_table_info (need newer version?) Closes dotnet#19275 Closes dotnet#6577
We're also lacking this feature for quite some time now and almost ran into some serious security issues due to this. We have columns containing cryptographic hashes and IDs encoded in Base64. Due to the lack of support for column collations by EF we modified affected columns within CodeFirst migrations. At a later point in time we applied some other modification to the column and created another CodeFirst migration, this time without manual modification. As EF didn't know about the collation, it created migration code that didn't consider this aspect and we almost moved back to case insensitive collation, opening vulnerabilities for collision attacks. I think, having base64-encoded fields or other columns where collation matters, is not so uncommon. Please add this feature! |
@minichma thanks for the issue link. This feature is being added for 5.0 - a PR is already out, so you should be able to try this in an upcoming preview. Out of sheer curiosity, any specific reason to store BASE64 in a database text field rather than the binary data directly in a binary/varbinary? That would avoid any text-processing issues (such as collation, but there are others), and also be a more efficient. |
@roji In some cases we see IDs that happen to be composed from base64 string. But regardless of the way they have been composed, they are just plain strings but in contrast to (e.g.) GUIDs, they require case sensitive treatment. The shape of the IDs often has historic reasons and is not something up to the DB designer. In other cases we see key material where the standard way of storing it is base64 encoded, like it's the case with PEM-encoded keys. But even when talking about plain byte arrays, storing them string-encoded often has benefits over storing as varbinary, mainly because on C# side |
Thanks for the added info @minichma, makes sense! |
Here's my solution:
The enum type:
Usage example:
|
@marmagni in case you're not aware, EF Core 5.0 has 1st-class support for collations, so you don't have to resort to the above; see the docs. |
Hi,
This has been raised before and closed, but I would like to emphasis how useful it is.
I want to be able to specify the collation on a column, perhaps through an annotation and have EF Core respect and generate the column as such in the migration.
I have temporarily worked around this by adding some SQL to ALTER the column manually in the migration, but this is a slightly tenuous solution as any subsequent ALTER on the column will reset it back to default.
I posit that setting a column as case-sensitive in the DB is a usual and valid use-case and ideally would like it to be supported in EFCore.
Thank you for considering it.
Ark
The text was updated successfully, but these errors were encountered: