Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Added maximumWalkingDepth option for supporting large documents #360

Merged
merged 2 commits into from
Apr 14, 2024

Conversation

BrandonDR
Copy link
Contributor

@BrandonDR BrandonDR commented Apr 4, 2024

This PR implements a new option maximumWalkingDepth.
This option replaces the hard-coded 1000000 (1 million) max loop iterations. This loop counting is intended to avoid an infinite loop. I have retained the default behaviour but updated the InternalError message to 'infinite loop or massive dataset detected'. This should make this issue easier to debug.

I have added TS doc comments and updated the README to document this option.
I have also added a Tip: to suggest the use of the Infinity constant which will disable the infinite loop detection. I intend to use this in combination with a timeout on my node process instead of an arbitrary limit.

Motivation

When generating a document using a FOR loop with 2000 iterations and each loop inserting (INS) many values, I was able to exceed 1 million loops in the walkTemplate function. This caused a generic 'something went wrong with the document. Please review and try again' error

I appreciate your work on this library!

@BrandonDR
Copy link
Contributor Author

BrandonDR commented Apr 4, 2024

Build, test and code coverage seems to pass (expand to see output)
$ yarn install && yarn compile && yarn test
yarn install v1.22.22
[1/5] Validating package.json...
[2/5] Resolving packages...
success Already up-to-date.
$ yarn compile
yarn run v1.22.22
$ rimraf ./lib && tsc && yarn rollup
$ rollup -c

./src/browser.ts → ./lib/browser.js...
(!) Circular dependencies
node_modules/jszip/lib/utils.js -> node_modules/jszip/lib/base64.js -> node_modules/jszip/lib/utils.js
node_modules/jszip/lib/utils.js -> node_modules/jszip/lib/base64.js -> /home/brandon/code/docx-templates/node_modules/jszip/lib/utils.js?commonjs-proxy -> node_modules/jszip/lib/utils.js
(!) Use of eval is strongly discouraged
https://rollupjs.org/guide/en/#avoiding-eval
node_modules/vm-browserify/index.js
108: 
109: Script.prototype.runInThisContext = function () {
110:     return eval(this.code); // maybe...
                ^
111: };
created ./lib/browser.js in 1s

./lib/index.d.ts → ./lib/bundled.d.ts...
created ./lib/bundled.d.ts in 118ms
Done in 5.38s.
Done in 5.74s.
yarn run v1.22.22
$ rimraf ./lib && tsc && yarn rollup
$ rollup -c

./src/browser.ts → ./lib/browser.js...
(!) Circular dependencies
node_modules/jszip/lib/utils.js -> node_modules/jszip/lib/base64.js -> node_modules/jszip/lib/utils.js
node_modules/jszip/lib/utils.js -> node_modules/jszip/lib/base64.js -> /home/brandon/code/docx-templates/node_modules/jszip/lib/utils.js?commonjs-proxy -> node_modules/jszip/lib/utils.js
(!) Use of eval is strongly discouraged
https://rollupjs.org/guide/en/#avoiding-eval
node_modules/vm-browserify/index.js
108: 
109: Script.prototype.runInThisContext = function () {
110:     return eval(this.code); // maybe...
                ^
111: };
created ./lib/browser.js in 1.1s

./lib/index.d.ts → ./lib/bundled.d.ts...
created ./lib/bundled.d.ts in 101ms
Done in 5.67s.
yarn run v1.22.22
$ yarn lint && yarn testCovFull
$ eslint "src/**/*.{js,jsx,ts,tsx}"
$ yarn testCovPrepare && yarn testDev && yarn testCovReport
$ rm -rf ./coverage .nyc_output .nyc_tmp && mkdir .nyc_tmp
$ NODE_ENV=development jest --coverage && mv .nyc_output/coverage-final.json .nyc_tmp/coverage-dev.json && rm -rf .nyc_output
 PASS  src/__tests__/unit.test.ts
 PASS  src/__tests__/list_commands.test.ts
 PASS  src/__tests__/error_handling.test.ts (7.893 s)
 PASS  src/__tests__/images.test.ts (10.594 s)
 PASS  src/__tests__/templating.test.ts (11.422 s)
-----------------------|---------|----------|---------|---------|------------------------------------------------------------
File                   | % Stmts | % Branch | % Funcs | % Lines | Uncovered Line #s                                          
-----------------------|---------|----------|---------|---------|------------------------------------------------------------
All files              |   94.12 |    86.81 |   95.72 |   95.11 |                                                            
 browser.ts            |       0 |      100 |     100 |       0 | 1                                                          
 errors.ts             |   93.54 |    52.38 |   89.47 |      90 | 41-43,71                                                   
 index.ts              |     100 |      100 |       0 |     100 |                                                            
 jsSandbox.ts          |      90 |    88.88 |     100 |   91.17 | 45-47                                                      
 main.ts               |   93.93 |    90.27 |     100 |   94.11 | 57,241,387,430,433,459,508-526                             
 preprocessTemplate.ts |     100 |      100 |     100 |     100 |                                                            
 processTemplate.ts    |   93.06 |    86.66 |     100 |   95.06 | 88-99,255-259,431,488,500,616,642,717,744,961,974,976,1093 
 reportUtils.ts        |   94.44 |    80.95 |     100 |   96.82 | 36,42                                                      
 types.ts              |     100 |      100 |     100 |     100 |                                                            
 xml.ts                |   98.59 |       90 |   91.66 |   98.46 | 45                                                         
 zip.ts                |     100 |      100 |     100 |     100 |                                                            
-----------------------|---------|----------|---------|---------|------------------------------------------------------------

Test Suites: 5 passed, 5 total
Tests:       190 passed, 190 total
Snapshots:   164 passed, 164 total
Time:        13.355 s, estimated 15 s
Ran all test suites.
$ cp -r .nyc_tmp .nyc_output && nyc report --reporter=html --reporter=lcov --reporter=text
-----------------------|---------|----------|---------|---------|------------------------------------------------------------
File                   | % Stmts | % Branch | % Funcs | % Lines | Uncovered Line #s                                          
-----------------------|---------|----------|---------|---------|------------------------------------------------------------
All files              |   94.12 |    86.81 |   95.72 |   95.11 |                                                            
 browser.ts            |       0 |      100 |     100 |       0 | 1                                                          
 errors.ts             |   93.54 |    52.38 |   89.47 |      90 | 41-43,71                                                   
 index.ts              |     100 |      100 |       0 |     100 |                                                            
 jsSandbox.ts          |      90 |    88.88 |     100 |   91.17 | 45-47                                                      
 main.ts               |   93.93 |    90.27 |     100 |   94.11 | 57,241,387,430,433,459,508-526                             
 preprocessTemplate.ts |     100 |      100 |     100 |     100 |                                                            
 processTemplate.ts    |   93.06 |    86.66 |     100 |   95.06 | 88-99,255-259,431,488,500,616,642,717,744,961,974,976,1093 
 reportUtils.ts        |   94.44 |    80.95 |     100 |   96.82 | 36,42                                                      
 types.ts              |     100 |      100 |     100 |     100 |                                                            
 xml.ts                |   98.59 |       90 |   91.66 |   98.46 | 45                                                         
 zip.ts                |     100 |      100 |     100 |     100 |                                                            
-----------------------|---------|----------|---------|---------|------------------------------------------------------------
Done in 18.25s.

@BrandonDR
Copy link
Contributor Author

BrandonDR commented Apr 7, 2024

For context, this seems to related to the fix for #340, specifically this commit dseiler@8eeec1f

@jjhbw
Copy link
Collaborator

jjhbw commented Apr 8, 2024

Hi @BrandonDR , great idea to add this. The hardcoded limit seems a bit low, now I think of it. I can totally imagine legit use cases hitting that limit, so making it configurable makes sense to me. What is the limit value are you using in your code? Maybe we should increase the default value too, in addition to making it configurable.

@BrandonDR
Copy link
Contributor Author

BrandonDR commented Apr 9, 2024

Hi @jjhbw,
I am using the Infinity constant, which disables the exception but this would not be a good default if there could still be an infinite loop.

const buffer = await createReport({
    //...
    maximumWalkingDepth: Infinity,
    //...
});

I am opting for a 10 minute process timeout instead. As I am spawning a node process from my PHP queue worker and handling the exception case there.

Our use case has a bunch of varied templates that we are building all the time for a range of clients, so we need something to accommodate a wide range of document sizes and templates.
For the case we encountered this error there was a FOR loop with a 20 column table with ~2400 rows which managed to hit this limit. This would be an extreme outlier.

Hard to say, maybe 10 million might be reasonable headroom. There is a trade off of how quickly the library should give up.
Maybe a timeoutSeconds option with a setTimeout approach might more sense?

@jjhbw jjhbw merged commit 672e4bc into guigrpa:master Apr 14, 2024
3 checks passed
@jjhbw
Copy link
Collaborator

jjhbw commented Apr 14, 2024

Excellent, thanks a lot for contributing!

@dseiler
Copy link

dseiler commented Apr 20, 2024

@BrandonDR thanks for this update. Yes, it makes sense to have this limit configurable. I set is initially to 1 Million because with 1 Million it takes roughly 10 seconds to hit the limit. In my documents I usually never get close to this limit (even with quite complex docs with many loops etc.). But of course there could be use cases where you need more (however a docx with more than 2000 rows might become unhandy ;-) ). My biggest concern was the memory consumption. If it exceeds the 1 Mio loops (which is usually an indication that something went out of control), the memory consumption on our cloud infrastructure might get critical.

@dseiler
Copy link

dseiler commented Apr 20, 2024

@BrandonDR quick addition: if the users can create the templates themselves they could always crash the server with a statement like this:

{{! counts = [0];}}
{{ FOR count IN counts }}
{{! counts.push(-1);}}
{{ $count }}
{{ END-FOR count }}

Besides that there could be still some very rare situations in a template where the engine goes into an infinite loop. So in production I would always set this option to a decent value to prevent crashes.

@jjhbw
Copy link
Collaborator

jjhbw commented Apr 21, 2024

if the users can create the templates themselves they could always crash the server with a statement like this:

Just a pedantic side note: be aware of the security considerations when letting users upload templates that are executed on your servers. You are effectively executing user-provided javascript in that case. See https://github.com/guigrpa/docx-templates?tab=readme-ov-file#performance--security

@BrandonDR
Copy link
Contributor Author

BrandonDR commented Apr 21, 2024

@jjhbw @dseiler Good notes, I am running this from separate 'queue workers' so memory usage is not a huge concern in regards to crashing the whole application.
It is reasonably well isolated but I should isolate it further regarding security - it should be on a dedicated vm for untrusted user uploads with minimal permissions/config/env vars etc. I have thought of this prior. But of course there are deadlines to meet. Thanks for the reminder

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants