-
Notifications
You must be signed in to change notification settings - Fork 1.8k
fix(NODE-3803): Fix _id typing on collection create operations #3077
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
fix(NODE-3803): Fix _id typing on collection create operations #3077
Conversation
- Update the inferType helper type to correctly infer optional _id types - Update the OptionalId helper to make the _id optional in all cases - Create a new helper to require _id if the _id is defined on the schema
- Remove union type for optional _id fields - Add tests for insertX operations to test custom defined _id types
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
@baileympearson @dariakp I don't understand the intended API usage after this change. Consider this example: old API usage (before this PR) interface Book {
_id: ObjectId;
title: string
}
const books: Collection<Book> = db.collection('books');
// Let's add a book. Normally books must have an _id,
// but during creation the service will generate the _id for us,
// so we omit that field for this particular operation:
const newBook: WithoutId<Book> = {
title: 'My Book'
};
await books.insertOne(newBook);
// Later if I retrieve the book, its _id field is present:
const foundBook: Book | null = await books.findOne({ title: 'My Book' });
if (foundBook) {
console.log('The book's ID is: ' + foundBook._id);
} The above code worked fine prior to this PR. However after upgrading the new API usage (after this PR) interface Book {
// "_id defined as optional means that the user is expecting to rely on
// a pkFactory method to generate a consistent id"
_id?: ObjectId;
title: string
} But then the type of This doesn't make sense. The Am I misunderstanding something? (Maybe we just need to expand the unit tests to better indicate the intended usage for this API.) |
@octogonz Thanks for reaching out. The intended usage for the API is to not specify the |
@dariakp Thanks for explaining. I've removed the However the result is a bit awkward: Basically everywhere that used to say e.g. In practice, this effect cascades across the code base, because you never know where a function may need to read the And in theory, "book" objects always have an ID. So it seems odd to declare I wonder if you might reconsider this API design choice. |
@octogonz We do appreciate the concern here, but as I highlighted in the linked comment:
The purpose of the collection type is to inform the driver of the expected document shape across all supported operations, hence this is currently the only viable api choice that would allow us to correctly assert _id types on inserts. To mitigate the transition pain, you could keep the _id on |
This seems like a regression to me, its very frustrating. I'm reading the thought process behind it and its not good logic.
insertOne(doc: OptionalId<TSchema>): Promise<InsertOneResult<TSchema>>; Boom done. I need to reference the Did you all test this out on a real code base before implementing it? I can't really see how this choice would have gotten past the prototype phase if it had actually been applied to a codebase even a single time. This whole notion that you shouldn't require it to be defined on your schemas because you're using auto id's on creation seems to completely overlook the fact that you'll need to reference the My workaround is to just cast it when inserting, even though that sucks too for different reasons: const book = WithoutId<Book> {
title: 'How to Design APIs'
}
await books.insertOne(book as Book) Please reconsider the comments of @octogonz because he's 100% right and this feels like a regression, what was there before was better. |
Looking at the code in this PR in more detail, I would just recommend straight up reverting it. |
Please see reply here: #3100 (comment) tl;dr There are users who need the id to be required on inserts. What you pass in as the collection type is the insert shape, and regardless of the insert shape, finds will always return a document with a correctly typed required id field, therefore there is no need to cast anything in ANY of the individual methods (read or write) if you accurately specify the id as optional in the collection insert schema. interface IdPet {
_id: ObjectId;
name: string;
age: number;
}
const database = client.db("<your database>");
const collection = db.collection<OptionalId<IdPet>>("<your collection>");
collection.insertOne({
name: "Spot",
age: 2
});
const puppy = collection.findOne(); // type of find result is IdPet
// puppy: IdPet
const puppyId = collection.findOne()._id; //_id is defined and accessible
// puppyId: ObjectId We have more examples in our docs: https://www.mongodb.com/docs/drivers/node/master/fundamentals/typescript/ |
Ok not gonna lie that's pretty confusing, in my schema its not optional, that's probably the source of the confusion. If I set it to optional on the schema then its optional on the insert but included in the results. Doesn't make sense but it does work. |
I agree that it is counterintuitive, but it should make sense if you think of it from the perspective of what a typescript schema in the driver can actually enforce. When you specify a typescript schema for the collection, the driver has no way to know whether those types are actually accurate in the database itself - it is not uncommon for more than one microservice to share a collection and modify different attributes. So, you aren't expected to describe the exact shape of the data that exists in the database, instead, you are expected to tell the driver what fields your service will be interacting with (i.e., reading and writing). *(see note) Ultimately, the collection schema is the "fields you care about when using the driver methods" schema with an indication of which fields you care about always and which only sometimes.
The _id field is only special in that the driver knows that documents without an _id field cannot exist, so it always at least allows the _id field on inserts and knows that it is always present on finds with the default type of ObjectId OR the type you specify in your schema - even if you only sometimes or never care about interacting with it. However, for the users who always care about interacting with it, we allow the tightening of that condition from optional to required on inserts, since there is no harm in allowing you to set a property you ultimately do not use, but there is harm in not enforcing the requirement of a property you care about personally setting. *Note: In fact, you could have objects of different shapes living side by side in the same collection, and if you have a module in your application that you wish to only work with a specific type of object, then you could specify only that object type as the collection schema used by that module and would get typescript-backed safety for operations performed by the driver inside that module. |
I guess my point was that reading should be favored over writing. In other words, this is the important basic scenario: const foundBook: Book | null = await books.findOne({ title: 'My Book' });
if (foundBook) {
console.log('The book's ID is: ' + foundBook._id);
} In this scenario we expect By contrast, writing is the messier scenario: const newBook: WithoutId<Book> = {
title: 'My Book'
};
await books.insertOne(newBook); The clunky
In the old design, those users would accomplish that by using In the new implementation, this policy is controlled by the export declare type OptionalUnlessRequiredId<TSchema> = TSchema extends {
_id: any;
} ? TSchema : OptionalId<TSchema>;
But the effect is more like this:
By adding a feature to help the minority, it seems like things got worse for the majority. Isn't there some better way to solve this? Like maybe instead of @justinmchase It might help to create a proper GitHub issue rather than discussing this problem in a closed PR thread. |
@octogonz I appreciate the thoughtful feedback here. Opening a ticket in our Jira project: https://jira.mongodb.org/projects/NODE would be the best way to keep track of the discussion and potential changes. Write schema is preferred over read schema because the write schema is what has the highest risk of leading to data inconsistency. However, if it truly is a major pain point for many users and there are suggestions for a better implementation, we are always open to hearing them. |
Here's a ticket : https://jira.mongodb.org/browse/NODE-4470 |
Description
This PR improves type inference around the
_id
fields for schemas used by theCollection
class.What is changing?
This PR updates the type inference that extracts the type of the
_id
field from schemas toSee https://jira.mongodb.org/browse/NODE-3761.
This PR also updates the parameter types for write operations on the
collection
class to match the following behavior (discussed in this slack thread)Is there new documentation needed for these changes?
Yes (I think)
What is the motivation for this change?
Improved typing of the
_id
field on schemas results in a better developer experience for typescript consumers of the node driver.Double check the following
npm run check:lint
script<type>(NODE-xxxx)<!>: <description>