
Commit 247cbe6

Authored Mar 14, 2018
Merge pull request #14 from smithclay/cleanup-0.1.1
Use standard logging library, reuse session for performance
2 parents 0194b8d + 48b0778 commit 247cbe6

9 files changed: +92 -75 lines changed
 

‎README.md

+27 -23
@@ -1,6 +1,8 @@
 ## lambdium
 ### headless chrome + selenium webdriver in AWS Lambda

+**Lambdium allows you to run a Selenium Webdriver script written in Javascript inside of an AWS Lambda function bundled with [Headless Chromium](https://developers.google.com/web/updates/2017/04/headless-chrome).**
+
 *This project is now published on the [AWS Serverless Application Repository](https://serverlessrepo.aws.amazon.com), allowing you to install it in your AWS account with one click. Install in your AWS account [here](https://serverlessrepo.aws.amazon.com/#/applications/arn:aws:serverlessrepo:us-east-1:156280089524:applications~lambdium).* Quickstart instructions are in the [`README-SAR.md` file](https://github.com/smithclay/lambdium/blob/master/README-SAR.md).

 This uses the binaries from the [serverless-chrome](https://github.com/adieuadieu/serverless-chrome) project to prototype running headless chromium with `selenium-webdriver` in AWS Lambda. I've also bundled the chromedriver binary so the browser can be interacted with using the [Webdriver Protocol](https://www.w3.org/TR/webdriver/).
@@ -13,61 +15,63 @@ The function interacts with [headless Chromium](https://chromium.googlesource.co

 Since this Lambda function is written using node.js, you can run almost any script written for [selenium-webdriver](https://www.npmjs.com/package/selenium-webdriver). Example scripts can be found in the `examples` directory.

-#### Requirements
+#### Requirements and Setup

 * An AWS Account
 * The [AWS SAM Local](https://github.com/awslabs/aws-sam-local) tool for running functions locally with the [Serverless Application Model](https://github.com/awslabs/serverless-application-model) (see: `template.yaml`)
 * node.js + npm
 * `modclean` npm modules for reducing function size (optional)
 * Bash

-#### Fetching dependencies
+_Note:_ If you don't need to build, customize, or run this locally, you can deploy it directly from a [template on the AWS Serverless Application repository](https://serverlessrepo.aws.amazon.com/#/applications/arn:aws:serverlessrepo:us-east-1:156280089524:applications~lambdium) and skip all of the below steps.
+
+#### 1. Fetching dependencies

 The headless chromium binary is too large for Github, you need to fetch it using a script bundled in this repository. [Marco Lüthy](https://github.com/adieuadieu) has an excellent post on Medium about how he built chromium for for AWS Lambda [here](https://medium.com/@marco.luethy/running-headless-chrome-on-aws-lambda-fa82ad33a9eb).

 ```sh
 $ ./scripts/fetch-dependencies.sh
 ```

-#### Running locally with SAM Local
+##### 2. Cleaning up the `node_modules` directory to reduce function size

-SAM Local can run this function on your computer inside a Docker container that acts like AWS Lambda. To run the function with an example event trigger that uses selenium to use headless chromium to visit `google.com`, run this:
+It's a good idea to clean the `node_modules` directory before packaging to make the function size significantly smaller (making the function run faster!). You can do this using the `modclean` package:
+
+To install it:

 ```sh
-$ sam local invoke Lambdium -e event.json
+$ npm i -g modclean
 ```

-### Deploying
-
-#### Creating a bucket for the function deployment
-
-This will create a file called `packaged.yaml` you can use with Cloudformation to deploy the function.
-
-You need to have an S3 bucket configured on your AWS account to upload the packed function files. For example:
+Then, run:

 ```sh
-$ export LAMBDA_BUCKET_NAME=lambdium-upload-bucket
+$ modclean --patterns="default:*"
 ```

-##### Reducing function size for performance (and faster uploads!)
+Follow the prompts and choose 'Y' to remove extraneous files from `node_modules`.

-It's a good idea to clean the `node_modules` directory before packaging to make the function size significantly smaller (making the function run faster!). You can do this using the `modclean` package:
+#### 3. Running locally with SAM Local

-To install it:
+SAM Local can run this function on your computer inside a Docker container that acts like AWS Lambda. To run the function with an example event trigger that uses selenium to use headless chromium to visit `google.com`, run this:

 ```sh
-$ npm i -g modclean
+$ sam local invoke Lambdium -e event.json
 ```

-Then, run:
+### Deploying
+
+#### Creating a S3 bucket for the function deployment
+
+This will create a file called `packaged.yaml` you can use with Cloudformation to deploy the function.
+
+You need to have an S3 bucket configured on your AWS account to upload the packed function files. For example:

 ```sh
-$ modclean --patterns="default:*"
+$ export LAMBDA_BUCKET_NAME=lambdium-upload-bucket
 ```

-Follow the prompts and choose 'Y' to remove extraneous files from `node_modules`.
-
-##### Packaging the function for Cloudformation using SAM
+#### Packaging the function for Cloudformation using SAM

 ```sh
 $ sam package --template-file template.yaml --s3-bucket $LAMBDA_BUCKET_NAME --output-template-file packaged.yaml
@@ -83,7 +87,7 @@ This will create the function using Cloudformation after packaging it is complet

 If set, the optional `DEBUG_ENV` environment variable will log additional information to Cloudwatch.

-## Invoking the function
+### Running the function

 Post-deploy, you can have lambda run a Webdriver script. There's an example of a selenium-webdriver simple script in the `examples/` directory that the Lambda function can now run.
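Reviewer note: the handler in `index.js` reads the script from `event.Base64Script` (falling back to the `BASE64_SCRIPT` environment variable) and decodes it with `Buffer.from(..., 'base64')`. The following is a minimal sketch of building such an event payload. The helper names `buildEvent` and `decodeScript` are hypothetical and not part of the repository; only the `Base64Script` field name and the decoding step come from the diff.

```javascript
// Hypothetical helper (not in the repository): wraps a Webdriver script
// in the event shape that index.js reads via `event.Base64Script`.
function buildEvent(scriptText) {
  return { Base64Script: Buffer.from(scriptText, 'utf8').toString('base64') };
}

// Mirrors the decoding step shown in index.js.
function decodeScript(event) {
  return Buffer.from(event.Base64Script, 'base64').toString('utf8');
}

// Example script text; `$browser` is the global the sandbox exposes to scripts.
const script = "$browser.get('https://www.google.com');";
const event = buildEvent(script);

// The JSON below could be saved as an event file for `sam local invoke`.
console.log(JSON.stringify(event, null, 2));
```

Round-tripping through `decodeScript` should return the original script text unchanged, which is a quick way to check an event file before invoking the function.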

‎examples/visitgoogle.js

+1 -1
@@ -11,5 +11,5 @@ $browser.findElement($driver.By.name('btnK')).click();
 $browser.wait($driver.until.titleIs('Google'), 1000);
 $browser.getTitle().then(function(title) {
   console.log("title is: " + title);
+  console.log('Finished running script!');
 });
-console.log('Finished running script!');

‎index.js

+28 -18
@@ -4,26 +4,35 @@ const fs = require('fs');

 const chromium = require('./lib/chromium');
 const sandbox = require('./lib/sandbox');
-const { log } = require('./lib/helpers');
+const log = require('lambda-log');

-console.log('Loading function');
+log.info('Loading function');
+
+// Create new reusable session (spawns chromium and webdriver)
+if (!process.env.CLEAN_SESSIONS) {
+  $browser = chromium.createSession();
+}

 exports.handler = (event, context, callback) => {
   context.callbackWaitsForEmptyEventLoop = false;

-  if (process.env.CLEAR_TMP) {
-    log('attempting to clear /tmp directory')
-    log(child.execSync('rm -rf /tmp/core*').toString());
+  if (process.env.CLEAN_SESSIONS) {
+    log.info('attempting to clear /tmp directory')
+    log.info(child.execSync('rm -rf /tmp/core*').toString());
+  }
+
+  if (process.env.DEBUG_ENV || process.env.SAM_LOCAL) {
+    log.config.debug = true;
+    log.config.dev = true;
   }

-  if (process.env.DEBUG_ENV) {
-    //log(child.execSync('set').toString());
-    log(child.execSync('pwd').toString());
-    log(child.execSync('ls -lhtra .').toString());
-    log(child.execSync('ls -lhtra /tmp').toString());
+  if (process.env.LOG_DEBUG) {
+    log.debug(child.execSync('pwd').toString());
+    log.debug(child.execSync('ls -lhtra .').toString());
+    log.debug(child.execSync('ls -lhtra /tmp').toString());
   }

-  log('Received event:', JSON.stringify(event, null, 2));
+  log.info(`Received event: ${JSON.stringify(event, null, 2)}`);

   // Read input
   const inputParam = event.Base64Script || process.env.BASE64_SCRIPT;
@@ -32,17 +41,18 @@ exports.handler = (event, context, callback) => {
   }

   const inputBuffer = Buffer.from(inputParam, 'base64').toString('utf8');
-  if (process.env.DEBUG_ENV) {
-    log(`Executing "${inputBuffer}"`);
+  log.debug(`Executing script "${inputBuffer}"`);
+
+  // Creates a new session on each event (instead of reusing for performance benefits)
+  if (process.env.CLEAN_SESSIONS) {
+    $browser = chromium.createSession();
   }

-  // Start selenium webdriver session
-  $browser = chromium.createSession();
   sandbox.executeScript(inputBuffer, $browser, webdriver, function(err) {
-    if (process.env.DEBUG_ENV) {
-      log(child.execSync('ps aux').toString());
+    if (process.env.LOG_DEBUG) {
+      log.debug(child.execSync('ps aux').toString());
+      log.debug(child.execSync('cat /tmp/chromedriver.log').toString())
     }
-
     if (err) {
       callback(err, null);
     }
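Reviewer note: the change above creates one browser session at module load and reuses it across invocations unless `CLEAN_SESSIONS` is set, in which case each event gets a fresh session. A minimal sketch of that decision follows, under two assumptions that differ from the diff: the session is created lazily on first use rather than at load time, and `makeSessionProvider` is a hypothetical name. `createSession` stands in for `chromium.createSession` and `env` for `process.env`.

```javascript
// Sketch of the reuse-or-recreate decision (hypothetical helper, lazy variant).
function makeSessionProvider(createSession, env) {
  let cached = null;
  return function getSession() {
    if (env.CLEAN_SESSIONS) {
      // Fresh browser for every event: slower, but fully isolated.
      return createSession();
    }
    if (cached === null) {
      // First call in this container: create once, then reuse.
      cached = createSession();
    }
    return cached;
  };
}

// Usage with a stub factory that counts how many sessions were created.
let created = 0;
const stubFactory = () => ({ id: ++created });

const reuse = makeSessionProvider(stubFactory, {});
reuse();
reuse();
const reusedCount = created;

created = 0;
const fresh = makeSessionProvider(stubFactory, { CLEAN_SESSIONS: 'true' });
fresh();
fresh();
const freshCount = created;

console.log(reusedCount, freshCount);
```

Environment variables are always strings, so a check like `if (env.CLEAN_SESSIONS)` treats any non-empty value as enabled, including `"false"`. Note also that `template.yaml` in this commit still sets `CLEAR_TMP`, while the handler now checks `CLEAN_SESSIONS`.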

‎lib/chromium.js

+9 -17
@@ -1,5 +1,4 @@
 const child = require('child_process');
-const { log } = require('./helpers');
 const os = require('os');
 const path = require('path');
 const chrome = require('selenium-webdriver/chrome');
@@ -44,9 +43,15 @@ const defaultChromeFlags = [
 const HEADLESS_CHROME_PATH = 'bin/headless-chromium';
 const CHROMEDRIVER_PATH = '/var/task/bin/chromedriver';
 exports.createSession = function() {
-  const service = new chrome.ServiceBuilder(CHROMEDRIVER_PATH)
-    .enableVerboseLogging()
-    .build();
+  var service;
+  if (process.env.LOG_DEBUG || process.env.SAM_LOCAL) {
+    service = new chrome.ServiceBuilder(CHROMEDRIVER_PATH)
+      .loggingTo('/tmp/chromedriver.log')
+      .build();
+  } else {
+    service = new chrome.ServiceBuilder(CHROMEDRIVER_PATH)
+      .build();
+  }

   const options = new chrome.Options();

@@ -60,16 +65,3 @@ exports.createSession = function() {
   return chrome.Driver.createSession(options, service);
 }

-const spawnProcess = function(localPath, flags) {
-  const opts = {
-    cwd: os.tmpdir(),
-    shell: true,
-    detached: true,
-  };
-
-  const proc = child.spawn(path.join(process.env.LAMBDA_TASK_ROOT, localPath),
-    flags,
-    opts
-  );
-  return proc;
-};
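Reviewer note: the two branches in `createSession` differ only in the `.loggingTo(...)` call. One possible way to avoid the duplicated builder chain is sketched below. `buildService` and `makeStubBuilder` are hypothetical names; the stub only imitates the two `ServiceBuilder` methods used in the diff (`loggingTo` and `build`) so the sketch runs without `selenium-webdriver` installed.

```javascript
// Hypothetical helper: `builder` is expected to behave like the
// chrome.ServiceBuilder instance used in lib/chromium.js.
function buildService(builder, env) {
  if (env.LOG_DEBUG || env.SAM_LOCAL) {
    // Same log path as in the diff; only enabled when debugging.
    builder.loggingTo('/tmp/chromedriver.log');
  }
  return builder.build();
}

// Stub builder that records which methods were called (for illustration only).
function makeStubBuilder() {
  const calls = [];
  return {
    loggingTo(path) { calls.push(['loggingTo', path]); return this; },
    build() { calls.push(['build']); return { calls }; },
  };
}

const quiet = buildService(makeStubBuilder(), {});
const verbose = buildService(makeStubBuilder(), { LOG_DEBUG: '1' });
console.log(quiet.calls, verbose.calls);
```

In the real module the first argument would be `new chrome.ServiceBuilder(CHROMEDRIVER_PATH)` and the second `process.env`.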

‎lib/helpers.js

-6
This file was deleted.

‎lib/sandbox.js

+18 -6
@@ -1,4 +1,5 @@
 const vm = require('vm');
+const log = require('lambda-log');

 exports.executeScript = function(scriptText, browser, driver, cb) {
   // Create Sandbox VM
@@ -14,7 +15,7 @@ exports.executeScript = function(scriptText, browser, driver, cb) {
   try {
     script.runInContext(scriptContext);
   } catch (e) {
-    console.log('[script error]', e);
+    log.error(`[script error] ${e}`);
     return cb(e, null);
   }

@@ -27,8 +28,19 @@ exports.executeScript = function(scriptText, browser, driver, cb) {
   }
   });
   */
-
-  $browser.quit().then(function() {
-    cb(null);
-  });
-}
+  // https://github.com/GoogleChrome/puppeteer/issues/1825#issuecomment-372241101
+  // Reuse existing session, likely some edge cases around this...
+  if (process.env.CLEAN_SESSIONS) {
+    $browser.quit().then(function() {
+      cb(null);
+    });
+  } else {
+    browser.manage().deleteAllCookies().then(function() {
+      return $browser.get('about:blank').then(function() {
+        cb(null);
+      }).catch(function(err) {
+        cb(err);
+      });
+    });
+  }
+}
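Reviewer note: as far as the diff shows, the reuse branch mixes the `browser` parameter with the `$browser` global, and its `.catch` is attached only to the inner `get('about:blank')` promise, so a rejection from `deleteAllCookies()` would apparently never reach `cb`. A minimal sketch of a flattened chain that reports either failure is shown below. `resetSession` is a hypothetical name, and the stub browser only imitates the three WebDriver calls used in the diff.

```javascript
// Hypothetical helper: clears state on a reused session and reports the
// outcome through a Node-style callback.
function resetSession(browser, cb) {
  browser.manage().deleteAllCookies()
    .then(function() {
      return browser.get('about:blank');
    })
    .then(
      function() { cb(null); },
      // Receives a rejection from either deleteAllCookies() or get().
      function(err) { cb(err); }
    );
}

// Stub browser (illustration only) that records the pages it visits.
const visited = [];
const stubBrowser = {
  manage: function() {
    return { deleteAllCookies: function() { return Promise.resolve(); } };
  },
  get: function(url) {
    visited.push(url);
    return Promise.resolve();
  },
};

resetSession(stubBrowser, function(err) {
  console.log(err === null ? 'session reset' : 'reset failed: ' + err);
});
```

Passing the error handler as the second argument to `.then` (rather than a trailing `.catch`) keeps an exception thrown inside `cb(null)` from triggering a second call to `cb`.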

‎package-lock.json

+5
Generated file; diff not rendered by default.

‎package.json

+2 -1
@@ -1,6 +1,6 @@
 {
   "name": "lambdium",
-  "version": "0.1.0",
+  "version": "0.1.1",
   "description": "headless chromium in lambda prototype",
   "main": "index.js",
   "scripts": {
@@ -9,6 +9,7 @@
   "author": "Clay Smith",
   "license": "ISC",
   "dependencies": {
+    "lambda-log": "^1.3.0",
     "selenium-webdriver": "^3.6.0"
   }
 }

‎template.yaml

+2 -3
@@ -9,10 +9,9 @@ Resources:
       Runtime: nodejs6.10
       FunctionName: lambdium
       Description: headless chromium running selenium
-      MemorySize: 1024
-      Timeout: 10
+      MemorySize: 1156
+      Timeout: 20
       Environment:
         Variables:
-          DEBUG_ENV: "true"
           CLEAR_TMP: "true"
       CodeUri: .
