Skip to content
This repository has been archived by the owner on Jan 25, 2023. It is now read-only.

Don't cache npm packages with version * or which are missing from package-lock #518

Open
MikeBeas opened this issue Jan 11, 2021 · 5 comments
Labels
init/monorepo-support type: bug code to address defects in shipped code

Comments

@MikeBeas
Copy link

MikeBeas commented Jan 11, 2021

Hi all, the community team sent me your way this suggestion.

One issue I’m running into a decent amount is related to the build cache. I maintain a private node package that contains a lot of core functionality shared between my sites, like UI components, a bit of shared logic, and an API service to talk to my backend. I mean, who wants to write that stuff more than once? Not me!

Because I fully control this package and don’t have to worry about unexpected breaking changes, I don’t feel I need to lock it to a specific version, so I keep the package out of my package-lock and set the version to * in my package.json.

I did some updates to all my sites and to my core package the other day, and after doing some beta branch deploys with extra logging enabled, I found that they were still running the old version of the core package. One was several versions out of date. This is a pretty common thing for me to find.

The purpose of marking a package as using * (and keeping it out of package-lock) is to make sure it’s always running the very latest version at build time. While caching may be useful for other packages which are listed in the package-lock, caching those that aren’t creates unexpected behavior.

I understand the reason for the build cache is to help aleviate some of the strain on the system when building sites, but it seems to me that a configuration like mine, where the package is set to use version * or where a package is specifically kept out of package-lock, should be pretty rare and put a relatively small strain on the system overall if those packages were reinstalled every time. This is especially true when you consider that the current way to get around this is to wait for the new build to start, cancel it (or wait until it completes) then clear the entire build cache and re-run the build.

It would probably end up being more efficient on your end to allow packages with this configuration to avoid the build cache. Consider the two possible workflows:

  • Push changes to git
  • Build starts and finishes
  • “Ah, this isn’t right.”
  • Clear entire build cache and rebuild
  • Result: two full builds (maybe 1.5 if I cancel), plus the entire cache being cleared and all packages being reinstalled from scratch

versus

  • Push changes to git
  • Netlify reinstalls the one package with this specific config, use the cache for the rest
  • “This looks good.”
  • Result: One build with one package install, and everything else coming out of the cache

And while you might think this setup would incentivize people to use * versions as a means of bypassing the build cache whenever they want, it really wouldn’t since doing so for all packages would actually lead to a ton of breaking changes and be a very bad idea for them. This would only ever make sense for a subset of packages fully controlled by the owners of the sites on which they’re being used which always need to be at their latest version. (This would also allow us to push up a change to the package and then trigger a rebuild via build hook without having to login to the Netflify dashboard and clear the cache, another helpful bonus!)

As an engineer myself, I realize “this would probably only ever be used by a small subset of users” is the dumbest possible argument for why a feature should be implemented, but in this case I think it makes sense since you’d see at least some of the build-rebuild cycle listed above go away.

@JGAntunes
Copy link
Contributor

Hey @MikeBeas 👋 thank you for reporting this. I believe you're essentially being affected by the way we're currently caching the node_moduleswhen using npm:

if install_deps package.json $NODE_VERSION $NETLIFY_CACHE_DIR/package-sha

So this is essentially related to #510.

We're in the process of revising the way we handle npm install and node_modules caching. We don't have a specific timeline for this yet, as we're still actively discussing the best approach that would not cause a huge disruption to our build system, but I just wanted to let you know that we're actively looking into it 👍 we'll do our best to keep you posted.

@JGAntunes JGAntunes added init/monorepo-support type: bug code to address defects in shipped code labels Feb 24, 2021
@MikeBeas
Copy link
Author

@JGAntunes Sounds good,thanks for the info! Looking forward to seeing what you guys are working on.

@denis-sokolov
Copy link

The proposal may conflict with other expected meanings of the same syntax. A dependency on * means “any version would do, and if there’s no specific version detailed in the lockfile, it can make sense to install an older available from the cache.

@MikeBeas, perhaps your use case can be alleviated by adding an explicit install command to your build commands. Where you currently have something like npm run build, try npm install your-package@latest && npm run build. The “latest” keyword in npm will always deliver the latest version no matter what is currently in node_modules.

@MikeBeas
Copy link
Author

MikeBeas commented Mar 3, 2021

@denis-sokolov I don't disagree with you that adding this to my build script might solve the problem (and thanks for that! I can't believe I hadn't thought of it). But if I'm not mistaken, many (most?) build systems don't use a build cache like Netlify does, so in a lot of cases, the expected (or at least assumed) behavior for "install any version" might actually be to install the latest, especially for users who are used to doing things outside Netlify and are switching over.

I would also think that if a project had a particular version specified in the lockfile, it would be considered contradictory to using *, since the lockfile is indicating that not any version will do after all, and the lockfile version would be used.

If a project has no version in the lockfile and no specified version, the assumption by most users, in my opinion, would be that the package will reinstall at the latest version in many or most build systems outside Netlify. I think Netlify should probably match this.

@denis-sokolov
Copy link

I am less convinced than you are about what most users would expect. For instance, off the top search result in Google, Travis caches dependencies by default. It seems like a natural optimization to make. It also matches the behavior for developers on their machines which re-use existing node_modules until package.json or lockfile are updated.

In light of the lack of clear consensus on what we expect, I, personally, would fall back on the meaning of package.json dependencies field, where * does mean any version is accepted. And you’re right there’s some duplication and possibility of contradiction between the package.json file and the lockfile, but that stems from packages needing to live in the context of the consumers having broader versions of packages for deduplication.

# for free to subscribe to this conversation on GitHub. Already have an account? #.
Labels
init/monorepo-support type: bug code to address defects in shipped code
Projects
None yet
Development

No branches or pull requests

3 participants