Document solution to "docs last update" date being rendered incorrectly when published through Vercel #10031
Comments
Disagree to document it in the options docs or error message, agree to document it in deployment docs. Each CI has its own configuration for clone depth and we can't exhaustively document them all, and the deployment docs are one of the few places where we give a nod to the exact options of each platform. |
This doesn't seem to work for us. I tried adding the […] It would be ideal if there were a fix for this that didn't require changing any settings in Vercel. |
The logic behind "docs last update" is that a search through the git commit history is performed to identify the latest commit that changed the file. If the git repo is cloned with depth 1, the entire history has only one level, and all files appear to be updated on the same date. There is no fix for this in Docusaurus; you need to ensure that the environment where you run the build is able to do a deep clone of the repo. I think that this should be explicitly explained in the Deployment page. |
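To make the lookup described above concrete, here is a minimal sketch of how a "last update" timestamp could be read from git history. The helper name `last_commit_date` is hypothetical, and this is not necessarily how Docusaurus implements it; it only uses `git log` with the standard `%ct` format placeholder.

```shell
# Sketch only: last_commit_date is a hypothetical helper, not a Docusaurus API.
# Prints the Unix timestamp of the most recent commit that touched the file.
last_commit_date() {
  file="$1"
  # -1 keeps only the latest matching commit;
  # %ct is the committer date as a Unix timestamp.
  git log -1 --format=%ct -- "$file"
}

# Usage (inside a git work tree):
#   last_commit_date docs/intro.md
```

In a clone made with `--depth=1`, the only commit git can see is the tip, so this returns the same timestamp for every file, which matches the symptom reported here.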
Documentation is great, but people will still skip it and report issues over time, or have to dig through the documentation to understand how to fix their problem. I think we should focus on improving the DX instead. We could fail fast in the CI with a good error message so that users are immediately aware of the problem. Apparently we can know if a repository is a shallow clone:
https://git-scm.com/docs/git-rev-parse#Documentation/git-rev-parse.txt---is-shallow-repository
So I guess if we are trying to read the git commit date anywhere (docs, blog, sitemap...), and the repository is shallow (checked at most once / memoized), then we can throw with a link to some GitHub discussion or documentation page, eventually providing an escape hatch (env variable) for those who want a failsafe. |
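As a rough illustration of the fail-fast idea, a build script could guard on `git rev-parse --is-shallow-repository` before reading any commit dates. The function name `assert_full_history` and the environment variable `DOCS_ALLOW_SHALLOW_CLONE` are assumptions made up for this sketch; nothing with these names exists in Docusaurus.

```shell
# Sketch only: assert_full_history and DOCS_ALLOW_SHALLOW_CLONE are
# hypothetical names used for illustration.
assert_full_history() {
  # Prints "true" when the repository is a shallow clone.
  if [ "$(git rev-parse --is-shallow-repository)" = "true" ]; then
    # Escape hatch: let the build continue, with a warning.
    if [ "${DOCS_ALLOW_SHALLOW_CLONE:-}" = "true" ]; then
      echo "warning: shallow clone detected, last update dates will be inaccurate" >&2
      return 0
    fi
    echo "error: shallow clone detected, cannot read per-file git history" >&2
    return 1
  fi
  return 0
}

# Usage (before the build step):
#   assert_full_history || exit 1
```

A real implementation would also memoize the result so the check runs at most once per build, as suggested above.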
Hey, I tried to analyze the problem better, here's what I found out.
Different clone methods
There are various ways to clone a repository according to this GitHub blog post:
What we need
- Deep clone works but is slow
- Shallow clone works but doesn't show the expected lastUpdate date
- Blobless clone works, but is a bit slower than a shallow clone
- Treeless clone leads to errors at build time:
[cause]: Error: Failed to retrieve the git history for file "/Users/sebastienlorber/Desktop/git/treeless/website/_dogfooding/_docs tests/tests/category-links/readme.mdx" with exit code 128: error: unable to open .git/objects/pack/pack-864c8b60dc58207d6541ece50d3fc6e975aeea55.idx: No such file or directory
fatal: unable to rename temporary '*.idx' file to '.git/objects/pack/pack-864c8b60dc58207d6541ece50d3fc6e975aeea55.idx'
fatal: fetch-pack: invalid index-pack output
fatal: could not fetch 70ea7acbec861fc1820abe4ca6910a2456533f2b from promisor remote
at getFileCommitDate (/Users/sebastienlorber/Desktop/git/treeless/packages/docusaurus-utils/lib/gitUtils.js:56:15)
So, my conclusion is that the blobless clone method offers the best tradeoff: it can show lastUpdate and is relatively fast.
Benchmark
I ran a benchmark in the cloud to see the clone performance impact of each alternative on our own repo:
hyperfine --runs 5 \
"rm -rf default && git clone https://github.com/facebook/docusaurus.git default" \
"rm -rf shallow && git clone --depth=1 https://github.com/facebook/docusaurus.git shallow" \
"rm -rf treeless && git clone --filter=tree:0 https://github.com/facebook/docusaurus.git treeless" \
"rm -rf blobless && git clone --filter=blob:none https://github.com/facebook/docusaurus.git blobless"
Benchmark 1: rm -rf default && git clone https://github.com/facebook/docusaurus.git default
Time (mean ± σ): 36.956 s ± 3.152 s [User: 41.199 s, System: 4.402 s]
Range (min … max): 34.624 s … 41.291 s 5 runs
Benchmark 2: rm -rf shallow && git clone --depth=1 https://github.com/facebook/docusaurus.git shallow
Time (mean ± σ): 2.424 s ± 0.241 s [User: 0.988 s, System: 0.548 s]
Range (min … max): 2.219 s … 2.809 s 5 runs
Benchmark 3: rm -rf treeless && git clone --filter=tree:0 https://github.com/facebook/docusaurus.git treeless
Time (mean ± σ): 3.359 s ± 0.214 s [User: 1.198 s, System: 0.596 s]
Range (min … max): 3.120 s … 3.672 s 5 runs
Benchmark 4: rm -rf blobless && git clone --filter=blob:none https://github.com/facebook/docusaurus.git blobless
Time (mean ± σ): 4.785 s ± 0.710 s [User: 2.188 s, System: 0.800 s]
Range (min … max): 4.294 s … 5.934 s 5 runs
Summary
rm -rf shallow && git clone --depth=1 https://github.com/facebook/docusaurus.git shallow ran
1.39 ± 0.16 times faster than rm -rf treeless && git clone --filter=tree:0 https://github.com/facebook/docusaurus.git treeless
1.97 ± 0.35 times faster than rm -rf blobless && git clone --filter=blob:none https://github.com/facebook/docusaurus.git blobless
15.25 ± 2.00 times faster than rm -rf default && git clone https://github.com/facebook/docusaurus.git default
As we can see:
Hosting platforms
Netlify
Apparently, Netlify does a blobless clone by default: https://answers.netlify.com/t/please-confirm-repo-clones-are-not-shallow/86587
It's great because it's relatively fast and we still have access to the git history to compute the last update date.
Vercel
Vercel does not give us the full git history by default, so I think it's safe to assume they use a shallow clone. There's a […]
I'm trying to figure out if we can do blobless clones on Vercel, which would be better for us and other docs frameworks like Nextra, Fumadocs and others that read the git file history.
GitHub Actions
Users will usually use the checkout action: https://github.com/actions/checkout
There are both a […] I also saw a […]
That's all for now, I plan to edit this comment with new findings. |
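For platforms that hand the build a shallow clone, one possible workaround is to deepen the history from inside the build script. This is a sketch under assumptions: `ensure_full_history` is a hypothetical helper, and whether the fetch succeeds (or is fast enough) on a given host has not been verified here. `git fetch --unshallow` itself is a standard git option.

```shell
# Sketch only: ensure_full_history is a hypothetical helper name.
# Converts a shallow clone into a complete one before the docs build runs.
ensure_full_history() {
  # The guard matters: --unshallow errors out on a repository
  # that already has its complete history.
  if [ "$(git rev-parse --is-shallow-repository)" = "true" ]; then
    git fetch --quiet --unshallow
  fi
}

# Usage (as the first step of a build command):
#   ensure_full_history && npm run build
```

This downloads the full history, so it pays roughly the cost of a deep clone; it does not give the speed benefit of a blobless clone measured in the benchmark above.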
Note: I also ran tests on Vercel using our own repo, which is quite large. It works with the default: https://vercel.com/lorbersebastiens-projects/docusaurus-clone-deep/DcQ2Td96mPGgLD61tHtQzwd9gVAU
But not when using […]
Similar to @JKarlavige I get the following errors: […]
Any idea @leerob how to clone large repositories on Vercel and still be able to access git history? |
Description
Setting the VERCEL_DEEP_CLONE environment variable to true through Vercel's Project Settings page will fix the "docs last update" date being rendered incorrectly when published through Vercel.
It should be documented in the field description of showLastUpdateTime.
It should also be documented under Deploying to Vercel. (I understand the decision to keep the "deployment" section mostly as-is.)
An aside:
A broader approach/solution would involve outputting a warning in the console if the repository is shallow and the showLastUpdateTime field is enabled. The warning would include the relevant solution for GitHub Actions or Vercel, depending on which one is being used.