Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

feat: Add the MetricsCollector for client side metrics #1566

Merged
merged 194 commits into from
Feb 24, 2025
Merged
Changes from all commits
Commits
Show all changes
194 commits
Select commit Hold shift + click to select a range
abb292b
Add the metrics tracer factory class
danieljbruce Jan 10, 2025
051b488
Add system tests for the client side metrics
danieljbruce Jan 10, 2025
1c49f86
Add open telemetry packages
danieljbruce Jan 10, 2025
7a5be3b
Add a metrics tracer factory
danieljbruce Jan 10, 2025
5390411
metadata experimentation
danieljbruce Jan 13, 2025
a7f2fd4
Pass metadata, status along
danieljbruce Jan 13, 2025
9e3e5f5
Get mapped entries
danieljbruce Jan 13, 2025
9f93172
Start collecting a few metrics in the metrics trac
danieljbruce Jan 13, 2025
b32aea0
on attempt start
danieljbruce Jan 14, 2025
750c1e7
Adding more metrics
danieljbruce Jan 15, 2025
b885126
Add support for application blocking latencies
danieljbruce Jan 15, 2025
5fb300b
Add a TODO for date wrapper
danieljbruce Jan 15, 2025
feb36e7
Add first unit test for the metrics tracer
danieljbruce Jan 16, 2025
8465b3a
Move the code for the TestMeterProvider to
danieljbruce Jan 16, 2025
7a97aab
Move the Date provider to a second file
danieljbruce Jan 16, 2025
51d3dd3
Fix attempt latencies bug
danieljbruce Jan 16, 2025
ee8c272
Add assertion check against text file
danieljbruce Jan 17, 2025
b7413e8
More realistic seconds increment
danieljbruce Jan 17, 2025
7c8877b
Remove imports
danieljbruce Jan 17, 2025
503a2a9
Adjust event timings to be more realistic
danieljbruce Jan 17, 2025
938cb2c
Remove only
danieljbruce Jan 17, 2025
854e1d1
Add comments to the table class
danieljbruce Jan 17, 2025
db7d1b1
More comments in table
danieljbruce Jan 17, 2025
ee67037
Remove TODO
danieljbruce Jan 17, 2025
ea2fbe2
Move observability options into a separate file
danieljbruce Jan 17, 2025
c22eb5b
inline definitions for the tabular api surface
danieljbruce Jan 17, 2025
a658a39
Comment source code out for now
danieljbruce Jan 17, 2025
960b402
Add abstractions for classes that have a logger
danieljbruce Jan 17, 2025
23a7c14
Generate documentation for meter provider
danieljbruce Jan 17, 2025
0ac6d15
Generate documentation for the date provider
danieljbruce Jan 17, 2025
bad23b2
Generate logger documentation
danieljbruce Jan 17, 2025
49bd7ca
Observability options documentation
danieljbruce Jan 17, 2025
129e8fd
Add more documentation for various MTF methods
danieljbruce Jan 17, 2025
052c7bb
Comment out Metrics
danieljbruce Jan 17, 2025
ac27a95
Add a bunch of TODOs in front of the comments
danieljbruce Jan 17, 2025
18c942e
Delete client-side-metrics file
danieljbruce Jan 17, 2025
7a3aabc
Revert "Delete client-side-metrics file"
danieljbruce Jan 17, 2025
5906c29
Revert "Revert "Delete client-side-metrics file""
danieljbruce Jan 17, 2025
be731af
Add headers
danieljbruce Jan 17, 2025
c26640f
Remove TODOs
danieljbruce Jan 17, 2025
3011e50
Add AttemptInfo to distinguish from OperationInfo
danieljbruce Jan 17, 2025
945f237
Adjust dimensions to match documentation
danieljbruce Jan 17, 2025
b04c3c4
Update tests with dimension metrics
danieljbruce Jan 17, 2025
2417e80
Revert "Revert "Revert "Delete client-side-metrics file"""
danieljbruce Jan 17, 2025
df59d88
Do some measurements
danieljbruce Jan 20, 2025
8ad51f5
Revert "Do some measurements"
danieljbruce Jan 20, 2025
6868f5a
Revert "Revert "Revert "Revert "Delete client-side-metrics file""""
danieljbruce Jan 20, 2025
7cc36a2
Add header
danieljbruce Jan 20, 2025
62a4b8b
Remove the TODOs
danieljbruce Jan 20, 2025
7c4f414
Add line back
danieljbruce Jan 20, 2025
83f53ae
Add comment
danieljbruce Jan 20, 2025
610eec0
Add version
danieljbruce Jan 20, 2025
a2b5951
Add version to client side metrics
danieljbruce Jan 20, 2025
5f67cad
linter
danieljbruce Jan 20, 2025
8f20c78
Generate documentation for AttemptInfo interface
danieljbruce Jan 20, 2025
9b1ba9d
Logger documentation
danieljbruce Jan 20, 2025
88e96c3
Generate more documentation
danieljbruce Jan 20, 2025
ed39628
Generate documentation
danieljbruce Jan 20, 2025
76b1249
Make sure test reports correct duration, zone
danieljbruce Jan 20, 2025
8d60cb1
Generate documentation for the dimensions to strin
danieljbruce Jan 20, 2025
19fef92
Add version to the dimensions
danieljbruce Jan 20, 2025
1ecfb1c
Fix the client name. The version is going to chan
danieljbruce Jan 20, 2025
d8a3960
Update the expected output file.
danieljbruce Jan 20, 2025
1d6b645
Fox bug, get cluster
danieljbruce Jan 20, 2025
acb1d3a
Add fake cluster to tests
danieljbruce Jan 20, 2025
c30b057
Remove console log
danieljbruce Jan 20, 2025
9ef079b
Generate more documentation
danieljbruce Jan 20, 2025
d5a0368
Require a call to fetch the project when using MT
danieljbruce Jan 20, 2025
ae532d8
use same date provider for all metrics tracers
danieljbruce Jan 20, 2025
b2bced9
In the metrics traceer, don’t fetch the project
danieljbruce Jan 20, 2025
e1dd61c
Remove only
danieljbruce Jan 20, 2025
cd97f35
Merge branch 'main' into 359913994-first-PR-add-infrastructure
danieljbruce Jan 20, 2025
9ec98df
Add open telemetry api
danieljbruce Jan 20, 2025
2457dbe
Merge branch '359913994-first-PR-add-infrastructure' of https://githu…
danieljbruce Jan 20, 2025
5a1a3aa
Add TestExecuteQuery_EmptyResponse to failures
danieljbruce Jan 20, 2025
1bd2d2b
TestExecuteQuery_SingleSimpleRow known failures
danieljbruce Jan 20, 2025
c2be338
Fix syntax in known failures
danieljbruce Jan 20, 2025
cd0d774
Add two tests to the known failures
danieljbruce Jan 20, 2025
e7caf36
TestSampleRowKeys_Retry_WithRetryInfo to known fai
danieljbruce Jan 20, 2025
7fd86d2
Change word dimensions to attributes
danieljbruce Jan 20, 2025
db05ff3
Change more docs to use Attributes instead of dim
danieljbruce Jan 20, 2025
9cc4b15
attributes
danieljbruce Jan 20, 2025
0142329
Test should use attributes as string
danieljbruce Jan 20, 2025
15d6e4a
For Windows replace carriage return
danieljbruce Jan 21, 2025
865529e
Update documentation with types
danieljbruce Jan 21, 2025
28fbfd8
Add metrics collector
danieljbruce Jan 23, 2025
5995789
Metrics handler, GCPMetricsHandler and tests add
danieljbruce Jan 24, 2025
a62b124
Remove only
danieljbruce Jan 24, 2025
bfc5883
Merge branch 'main' of https://github.com/googleapis/nodejs-bigtable …
danieljbruce Jan 24, 2025
0996d3c
Add metrics handlers parameter to Doc
danieljbruce Jan 24, 2025
ef8e3fe
Don’t require retries to be passed into metrics
danieljbruce Jan 24, 2025
c68a76f
Remove testMeterProvider
danieljbruce Jan 24, 2025
47a24b1
Remove the attributesToString function
danieljbruce Jan 24, 2025
b2600f2
Eliminate unused class
danieljbruce Jan 24, 2025
d4d3f6c
Generate documentation for the IMetricsHandler
danieljbruce Jan 24, 2025
b8dff1c
Generate documentation for GCPMetricsHandler
danieljbruce Jan 24, 2025
d50384f
Restrict attributes interfaces and solve compile
danieljbruce Jan 24, 2025
b5fc1f2
🦉 Updates from OwlBot post-processor
gcf-owl-bot[bot] Jan 24, 2025
1e5dc82
use undefined instead of null
danieljbruce Jan 24, 2025
c2ffbc6
Introduce enums for allowable values
danieljbruce Jan 24, 2025
1200b3c
Merge branch '359913994-first-PR-add-infrastructure' of https://githu…
danieljbruce Jan 24, 2025
9320149
Add more headers
danieljbruce Jan 24, 2025
4023361
Remove only
danieljbruce Jan 24, 2025
ef91733
Use null to pass values around. Not undefined
danieljbruce Jan 24, 2025
52b570c
Modify test step
danieljbruce Jan 24, 2025
6a6774f
Add metrics
danieljbruce Jan 24, 2025
10b6d30
Don’t provide first response latency
danieljbruce Jan 24, 2025
33c17c6
Remove firstResponseLatency from operation metrics
danieljbruce Jan 24, 2025
fbf2314
Expose interface allowing undefined not null
danieljbruce Jan 24, 2025
39fe861
Better explanations for design decision inline
danieljbruce Jan 24, 2025
8f13100
Use attempt start time not operation start time
danieljbruce Jan 27, 2025
48e0e95
Adjust tests for first response latency
danieljbruce Jan 27, 2025
66c4ab1
Remove TODO
danieljbruce Jan 27, 2025
e7c5b5f
Use the MethodName enum instead of string
danieljbruce Jan 28, 2025
98be351
Don’t use enum for streaming operation
danieljbruce Jan 28, 2025
efdfcea
Remove copy/pasted comment
danieljbruce Jan 28, 2025
4a6a476
Rename to OperationMetricsCollector
danieljbruce Jan 28, 2025
edfcf8a
Rename the method to getOperationAttributes
danieljbruce Jan 28, 2025
bc4998f
🦉 Updates from OwlBot post-processor
gcf-owl-bot[bot] Jan 28, 2025
7e6374d
Merge branch '359913994-first-PR-add-infrastructure' of https://githu…
gcf-owl-bot[bot] Jan 28, 2025
10b72ec
🦉 Updates from OwlBot post-processor
gcf-owl-bot[bot] Jan 28, 2025
a72b51a
Merge branch '359913994-first-PR-add-infrastructure' of https://githu…
gcf-owl-bot[bot] Jan 28, 2025
47fd9d0
Add aggregate views to the GCP metrics handler
danieljbruce Jan 28, 2025
2a42162
Merge branch '359913994-first-PR-add-infrastructure' of https://githu…
danieljbruce Jan 28, 2025
9073f07
Adjust test based on enum changes
danieljbruce Jan 28, 2025
32d3983
Update the documentation to be more descriptive
danieljbruce Jan 28, 2025
9716c4a
Add the state machine to the metrics collector
danieljbruce Jan 29, 2025
d2b93ee
Use grpc code to report attempt/operation status
danieljbruce Jan 29, 2025
99f9577
Remove parameters from JS Documentation
danieljbruce Jan 29, 2025
c82e72d
Update interfaces and some metrics
danieljbruce Jan 30, 2025
759e829
Documentation for all the different interfaces
danieljbruce Jan 30, 2025
76b6f5a
use operation start time as the benchmark
danieljbruce Jan 30, 2025
1e840a4
Final operation status shouldn’t be included per a
danieljbruce Jan 30, 2025
7bf62e9
Move OnAttemptCompleteInfo
danieljbruce Jan 31, 2025
fca55b7
Provide AttemptOnlyAttributes in the only file
danieljbruce Jan 31, 2025
51afdce
Move over the OperationOnlyAttributes
danieljbruce Jan 31, 2025
57b1dc1
Adjust the guard so that it is earlier
danieljbruce Jan 31, 2025
0f850b7
Adjust the test output file
danieljbruce Jan 31, 2025
6c1e01b
Change streaming back to STREAMING/UNARY
danieljbruce Jan 31, 2025
2910408
🦉 Updates from OwlBot post-processor
gcf-owl-bot[bot] Jan 31, 2025
2781561
Change metrics handler interface to support each metric
danieljbruce Jan 31, 2025
0b4d93e
Revert "Change metrics handler interface to support each metric"
danieljbruce Jan 31, 2025
1b6681b
Supply the projectId later in the client side
danieljbruce Feb 5, 2025
b6f1302
Remove the GCPMetricsHandler file
danieljbruce Feb 5, 2025
1ae82ff
Change location of the client-side-metrics-attribu
danieljbruce Feb 6, 2025
3ee5604
Change common test utilities folder name
danieljbruce Feb 6, 2025
124ed30
Remove aliases for grpc status
danieljbruce Feb 6, 2025
ef36a6f
Should be MethodName type
danieljbruce Feb 6, 2025
6829224
Rename variable as it expands beyond latency
danieljbruce Feb 6, 2025
dd603f1
Remove private methods for building attributes
danieljbruce Feb 6, 2025
b493c0d
Replace the logger class with a simple object
danieljbruce Feb 6, 2025
2f19f31
Remove only
danieljbruce Feb 6, 2025
dfe7d57
Remove the logger classes
danieljbruce Feb 6, 2025
2d1684f
Revert "Remove the GCPMetricsHandler file"
danieljbruce Feb 6, 2025
38b6f84
Revert "Revert "Remove the GCPMetricsHandler file""
danieljbruce Feb 6, 2025
96d0da5
Add a fixture into the PR for exporter testing
danieljbruce Feb 7, 2025
0b28837
Add header
danieljbruce Feb 7, 2025
67bec83
Take out two unused interfaces
danieljbruce Feb 7, 2025
451c39f
Expected zones shouldn’t have a space
danieljbruce Feb 10, 2025
1a12e82
Use true/false because that is what the backend
danieljbruce Feb 10, 2025
064712f
Eliminate interfaces that aren’t needed anymore
danieljbruce Feb 11, 2025
197ea59
Add client_uuid
danieljbruce Feb 11, 2025
ffbb1d0
Read server time with regex
danieljbruce Feb 12, 2025
b4db210
Fix the duration parsing logic
danieljbruce Feb 12, 2025
21992eb
Mock fake date
danieljbruce Feb 12, 2025
598a57d
Mock dates differently. No date provider in src
danieljbruce Feb 12, 2025
ee27619
Remove date provider from metrics collector
danieljbruce Feb 12, 2025
fa0ff6f
Get rid of the test date provider
danieljbruce Feb 12, 2025
6d744b8
Get rid of the fake date code
danieljbruce Feb 12, 2025
707a9e9
Don’t pass project in when collecting metadata.
danieljbruce Feb 12, 2025
0cbdbc1
Roll receivedFirstResponse state into state variab
danieljbruce Feb 12, 2025
4ba6ba0
Create an object to contain metrics collector data
danieljbruce Feb 12, 2025
59f0358
Remove only
danieljbruce Feb 12, 2025
d15aee6
Replace the strings with constants
danieljbruce Feb 12, 2025
5da73e1
Check isNaN for the server duration
danieljbruce Feb 12, 2025
1aa6294
Use one argument for metrics handler not two
danieljbruce Feb 18, 2025
4af9d78
Pass streaming into metrics collector at creation
danieljbruce Feb 18, 2025
d352b39
Remove Datelike interface
danieljbruce Feb 18, 2025
038cf5c
Remove unused import
danieljbruce Feb 18, 2025
f2f9ecf
Eliminate two interfaces that aren’t used
danieljbruce Feb 18, 2025
e9c7564
serializing code
danieljbruce Feb 18, 2025
8c0406a
try other things out
danieljbruce Feb 18, 2025
801d935
More serializer experiments
danieljbruce Feb 19, 2025
6565298
Don’t use custom parsing code
danieljbruce Feb 19, 2025
8a62bd0
Update serializing code
danieljbruce Feb 19, 2025
dbb74f1
Correct the test stub for response params
danieljbruce Feb 19, 2025
6ea3b0c
eliminate the serializing file
danieljbruce Feb 19, 2025
1824f87
Remove the toString call
danieljbruce Feb 19, 2025
6add7fd
Remove only
danieljbruce Feb 19, 2025
cab137c
Change method name
danieljbruce Feb 19, 2025
3146854
map is array of strings
danieljbruce Feb 19, 2025
667a705
Change metadata type in stub
danieljbruce Feb 19, 2025
c2d27e6
Improve method name
danieljbruce Feb 20, 2025
12ffbe0
de-duplicate data
danieljbruce Feb 24, 2025
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions package.json
Original file line number Diff line number Diff line change
@@ -47,9 +47,14 @@
"precompile": "gts clean"
},
"dependencies": {
"@google-cloud/opentelemetry-cloud-monitoring-exporter": "^0.20.0",
"@google-cloud/opentelemetry-resource-util": "^2.4.0",
"@google-cloud/precise-date": "^4.0.0",
"@google-cloud/projectify": "^4.0.0",
"@google-cloud/promisify": "^4.0.0",
"@opentelemetry/api": "^1.9.0",
"@opentelemetry/resources": "^1.30.0",
"@opentelemetry/sdk-metrics": "^1.30.0",
"arrify": "^2.0.0",
"concat-stream": "^2.0.0",
"dot-prop": "^6.0.0",
34 changes: 34 additions & 0 deletions src/client-side-metrics/client-side-metrics-attributes.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,34 @@
// Copyright 2025 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// https://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

// The backend is expecting true/false and will fail if other values are provided.
// export in open telemetry is expecting string value attributes so we don't use boolean
// true/false.
export enum StreamingState {
STREAMING = 'true',
UNARY = 'false',
}

/**
* Represents the names of Bigtable methods. These are used as attributes for
* metrics, allowing for differentiation of performance by method.
*/
export enum MethodName {
READ_ROWS = 'Bigtable.ReadRows',
MUTATE_ROW = 'Bigtable.MutateRow',
CHECK_AND_MUTATE_ROW = 'Bigtable.CheckAndMutateRow',
READ_MODIFY_WRITE_ROW = 'Bigtable.ReadModifyWriteRow',
SAMPLE_ROW_KEYS = 'Bigtable.SampleRowKeys',
MUTATE_ROWS = 'Bigtable.MutateRows',
}
71 changes: 71 additions & 0 deletions src/client-side-metrics/metrics-handler.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,71 @@
// Copyright 2025 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// https://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

import {MethodName, StreamingState} from './client-side-metrics-attributes';
import {grpc} from 'google-gax';

/**
* The interfaces below use undefined instead of null to indicate a metric is
* not available yet. The benefit of this is that new metrics can be added
* without requiring users to change the methods in their metrics handler.
*/

type IMetricsCollectorData = {
instanceId: string;
table: string;
cluster?: string;
zone?: string;
appProfileId?: string;
methodName: MethodName;
clientUid: string;
};

interface StandardData {
projectId: string;
metricsCollectorData: IMetricsCollectorData;
clientName: string;
streamingOperation: StreamingState;
}

export interface OnOperationCompleteData extends StandardData {
firstResponseLatency?: number;
operationLatency: number;
retryCount?: number;
finalOperationStatus: grpc.status;
}

export interface OnAttemptCompleteData extends StandardData {
attemptLatency: number;
serverLatency?: number;
connectivityErrorCount: number;
attemptStatus: grpc.status;
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking a lot better!

One more comment though: there's a lot of duplicated fields between OnOperationCompleteData and OnAttemptCompleteData. You should consider deduplicating the data so that each field exists in just one place (some fields belong at the operation level, and some are related to the specific attempt, and it could make sense to keep them separate)

In Python, both the Attempt and its associated Operation are passed in for on_attempt_complete. And the Operation holds a list of associated attempts, which is used to calculate retry_count

But that may be less applicable here since you're building these objects on the fly

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had something similar before with StandardAttributes when the metrics and attributes were separate so I decided to de-duplicate the data with a new StandardData interface and then extend it.


/**
* An interface for handling client-side metrics related to Bigtable operations.
* Implementations of this interface can define how metrics are recorded and processed.
*/
export interface IMetricsHandler {
/**
* Called when an operation completes (successfully or unsuccessfully).
* @param {OnOperationCompleteData} data Metrics and attributes related to the completed operation.
*/
onOperationComplete?(data: OnOperationCompleteData): void;

/**
* Called when an attempt (e.g., an RPC attempt) completes.
* @param {OnAttemptCompleteData} data Metrics and attributes related to the completed attempt.
*/
onAttemptComplete?(data: OnAttemptCompleteData): void;
}
300 changes: 300 additions & 0 deletions src/client-side-metrics/operation-metrics-collector.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,300 @@
// Copyright 2025 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// https://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

import * as fs from 'fs';
import {IMetricsHandler} from './metrics-handler';
import {MethodName, StreamingState} from './client-side-metrics-attributes';
import {grpc} from 'google-gax';
import * as gax from 'google-gax';
const root = gax.protobuf.loadSync(
'./protos/google/bigtable/v2/response_params.proto'
);
const ResponseParams = root.lookupType('ResponseParams');

/**
* An interface representing a tabular API surface, such as a Bigtable table.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why does this exist here? Isn't there already a class for the TabularAPISurface?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An ITabularApiSurface parameter is provided when the metrics collector is created. We don't want to require this parameter to be a whole TabularApiSurface object because only a small subset of the properties are actually used and we don't want to restrict the OperationMetricsCollector further than we have to. In particular, this simplifies the test for the metrics collector where we create a simple FakeTable object which would otherwise have to be a whole TabularApiSurface object.

*/
export interface ITabularApiSurface {
instance: {
id: string;
};
id: string;
bigtable: {
appProfileId?: string;
clientUid: string;
};
}

const packageJSON = fs.readFileSync('package.json');
const version = JSON.parse(packageJSON.toString()).version;

// MetricsCollectorState is a list of states that the metrics collector can be in.
// Tracking the OperationMetricsCollector state is done so that the
// OperationMetricsCollector methods are not called in the wrong order. If the
// methods are called in the wrong order they will not execute and they will
// throw warnings.
//
// The following state transitions are allowed:
// OPERATION_NOT_STARTED -> OPERATION_STARTED_ATTEMPT_NOT_IN_PROGRESS
// OPERATION_STARTED_ATTEMPT_NOT_IN_PROGRESS -> OPERATION_STARTED_ATTEMPT_IN_PROGRESS
// OPERATION_STARTED_ATTEMPT_IN_PROGRESS -> OPERATION_STARTED_ATTEMPT_NOT_IN_PROGRESS
// OPERATION_STARTED_ATTEMPT_IN_PROGRESS -> OPERATION_COMPLETE
enum MetricsCollectorState {
OPERATION_NOT_STARTED,
OPERATION_STARTED_ATTEMPT_NOT_IN_PROGRESS,
OPERATION_STARTED_ATTEMPT_IN_PROGRESS_NO_ROWS_YET,
OPERATION_STARTED_ATTEMPT_IN_PROGRESS_SOME_ROWS_RECEIVED,
OPERATION_COMPLETE,
}

/**
* A class for tracing and recording client-side metrics related to Bigtable operations.
*/
export class OperationMetricsCollector {
private state: MetricsCollectorState;
private operationStartTime: Date | null;
private attemptStartTime: Date | null;
private zone: string | undefined;
private cluster: string | undefined;
private tabularApiSurface: ITabularApiSurface;
private methodName: MethodName;
private attemptCount = 0;
private metricsHandlers: IMetricsHandler[];
private firstResponseLatency: number | null;
private serverTimeRead: boolean;
private serverTime: number | null;
private connectivityErrorCount: number;
private streamingOperation: StreamingState;

/**
* @param {ITabularApiSurface} tabularApiSurface Information about the Bigtable table being accessed.
* @param {IMetricsHandler[]} metricsHandlers The metrics handlers used for recording metrics.
* @param {MethodName} methodName The name of the method being traced.
* @param {StreamingState} streamingOperation Whether or not the call is a streaming operation.
*/
constructor(
tabularApiSurface: ITabularApiSurface,
metricsHandlers: IMetricsHandler[],
methodName: MethodName,
streamingOperation: StreamingState
) {
this.state = MetricsCollectorState.OPERATION_NOT_STARTED;
this.zone = undefined;
this.cluster = undefined;
this.tabularApiSurface = tabularApiSurface;
this.methodName = methodName;
this.operationStartTime = null;
this.attemptStartTime = null;
this.metricsHandlers = metricsHandlers;
this.firstResponseLatency = null;
this.serverTimeRead = false;
this.serverTime = null;
this.connectivityErrorCount = 0;
this.streamingOperation = streamingOperation;
}

private getMetricsCollectorData() {
return {
instanceId: this.tabularApiSurface.instance.id,
table: this.tabularApiSurface.id,
cluster: this.cluster,
zone: this.zone,
appProfileId: this.tabularApiSurface.bigtable.appProfileId,
methodName: this.methodName,
clientUid: this.tabularApiSurface.bigtable.clientUid,
};
}

/**
* Called when the operation starts. Records the start time.
*/
onOperationStart() {
if (this.state === MetricsCollectorState.OPERATION_NOT_STARTED) {
this.operationStartTime = new Date();
this.firstResponseLatency = null;
this.state =
MetricsCollectorState.OPERATION_STARTED_ATTEMPT_NOT_IN_PROGRESS;
} else {
console.warn('Invalid state transition');
}
}

/**
* Called when an attempt (e.g., an RPC attempt) completes. Records attempt latencies.
* @param {string} projectId The id of the project.
* @param {grpc.status} attemptStatus The grpc status for the attempt.
*/
onAttemptComplete(projectId: string, attemptStatus: grpc.status) {
if (
this.state ===
MetricsCollectorState.OPERATION_STARTED_ATTEMPT_IN_PROGRESS_NO_ROWS_YET ||
this.state ===
MetricsCollectorState.OPERATION_STARTED_ATTEMPT_IN_PROGRESS_SOME_ROWS_RECEIVED
) {
this.state =
MetricsCollectorState.OPERATION_STARTED_ATTEMPT_NOT_IN_PROGRESS;
this.attemptCount++;
const endTime = new Date();
if (projectId && this.attemptStartTime) {
const totalTime = endTime.getTime() - this.attemptStartTime.getTime();
this.metricsHandlers.forEach(metricsHandler => {
if (metricsHandler.onAttemptComplete) {
metricsHandler.onAttemptComplete({
attemptLatency: totalTime,
serverLatency: this.serverTime ?? undefined,
connectivityErrorCount: this.connectivityErrorCount,
streamingOperation: this.streamingOperation,
attemptStatus,
clientName: `nodejs-bigtable/${version}`,
metricsCollectorData: this.getMetricsCollectorData(),
projectId,
});
}
});
}
} else {
console.warn('Invalid state transition attempted');
}
}

/**
* Called when a new attempt starts. Records the start time of the attempt.
*/
onAttemptStart() {
if (
this.state ===
MetricsCollectorState.OPERATION_STARTED_ATTEMPT_NOT_IN_PROGRESS
) {
this.state =
MetricsCollectorState.OPERATION_STARTED_ATTEMPT_IN_PROGRESS_NO_ROWS_YET;
this.attemptStartTime = new Date();
this.serverTime = null;
this.serverTimeRead = false;
this.connectivityErrorCount = 0;
} else {
console.warn('Invalid state transition attempted');
}
}

/**
* Called when the first response is received. Records first response latencies.
*/
onResponse(projectId: string) {
if (
this.state ===
MetricsCollectorState.OPERATION_STARTED_ATTEMPT_IN_PROGRESS_NO_ROWS_YET
) {
this.state =
MetricsCollectorState.OPERATION_STARTED_ATTEMPT_IN_PROGRESS_SOME_ROWS_RECEIVED;
const endTime = new Date();
if (projectId && this.operationStartTime) {
this.firstResponseLatency =
endTime.getTime() - this.operationStartTime.getTime();
}
}
}

/**
* Called when an operation completes (successfully or unsuccessfully).
* Records operation latencies, retry counts, and connectivity error counts.
* @param {string} projectId The id of the project.
* @param {grpc.status} finalOperationStatus Information about the completed operation.
*/
onOperationComplete(projectId: string, finalOperationStatus: grpc.status) {
if (
this.state ===
MetricsCollectorState.OPERATION_STARTED_ATTEMPT_NOT_IN_PROGRESS
) {
this.state = MetricsCollectorState.OPERATION_COMPLETE;
const endTime = new Date();
if (projectId && this.operationStartTime) {
const totalTime = endTime.getTime() - this.operationStartTime.getTime();
{
this.metricsHandlers.forEach(metricsHandler => {
if (metricsHandler.onOperationComplete) {
metricsHandler.onOperationComplete({
finalOperationStatus: finalOperationStatus,
streamingOperation: this.streamingOperation,
metricsCollectorData: this.getMetricsCollectorData(),
clientName: `nodejs-bigtable/${version}`,
projectId,
operationLatency: totalTime,
retryCount: this.attemptCount - 1,
firstResponseLatency: this.firstResponseLatency ?? undefined,
});
}
});
}
}
} else {
console.warn('Invalid state transition attempted');
}
}

/**
* Called when metadata is received. Extracts server timing information if available.
* @param {object} metadata The received metadata.
*/
onMetadataReceived(metadata: {
internalRepr: Map<string, string[]>;
options: {};
}) {
const mappedEntries = new Map(
Array.from(metadata.internalRepr.entries(), ([key, value]) => [
key,
value.toString(),
])
);
const SERVER_TIMING_REGEX = /.*gfet4t7;\s*dur=(\d+\.?\d*).*/;
const SERVER_TIMING_KEY = 'server-timing';
const durationValues = mappedEntries.get(SERVER_TIMING_KEY);
const matchedDuration = durationValues?.match(SERVER_TIMING_REGEX);
if (matchedDuration && matchedDuration[1]) {
if (!this.serverTimeRead) {
this.serverTimeRead = true;
this.serverTime = isNaN(parseInt(matchedDuration[1]))
? null
: parseInt(matchedDuration[1]);
}
} else {
this.connectivityErrorCount++;
}
}

/**
* Called when status information is received. Extracts zone and cluster information.
* @param {object} status The received status information.
*/
onStatusMetadataReceived(status: {
metadata: {internalRepr: Map<string, Uint8Array[]>; options: {}};
}) {
const INSTANCE_INFORMATION_KEY = 'x-goog-ext-425905942-bin';
const mappedValue = status.metadata.internalRepr.get(
INSTANCE_INFORMATION_KEY
) as Buffer[];
const decodedValue = ResponseParams.decode(
mappedValue[0],
mappedValue[0].length
);
if (decodedValue && (decodedValue as unknown as {zoneId: string}).zoneId) {
this.zone = (decodedValue as unknown as {zoneId: string}).zoneId;
}
if (
decodedValue &&
(decodedValue as unknown as {clusterId: string}).clusterId
) {
this.cluster = (decodedValue as unknown as {clusterId: string}).clusterId;
}
}
}
53 changes: 53 additions & 0 deletions test-common/test-metrics-handler.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,53 @@
// Copyright 2025 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// https://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

import {
IMetricsHandler,
OnAttemptCompleteData,
OnOperationCompleteData,
} from '../src/client-side-metrics/metrics-handler';

/**
* A test implementation of the IMetricsHandler interface. Used for testing purposes.
* It logs the metrics and attributes received by the onOperationComplete and onAttemptComplete methods.
*/
export class TestMetricsHandler implements IMetricsHandler {
private messages: {value: string};
requestsHandled: (OnOperationCompleteData | OnAttemptCompleteData)[] = [];

constructor(messages: {value: string}) {
this.messages = messages;
}
/**
* Logs the metrics and attributes received for an operation completion.
* @param {OnOperationCompleteData} data Metrics related to the completed operation.
*/
onOperationComplete(data: OnOperationCompleteData) {
this.requestsHandled.push(data);
data.clientName = 'nodejs-bigtable';
this.messages.value += 'Recording parameters for onOperationComplete:\n';
this.messages.value += `${JSON.stringify(data)}\n`;
}

/**
* Logs the metrics and attributes received for an attempt completion.
* @param {OnOperationCompleteData} data Metrics related to the completed attempt.
*/
onAttemptComplete(data: OnAttemptCompleteData) {
this.requestsHandled.push(data);
data.clientName = 'nodejs-bigtable';
this.messages.value += 'Recording parameters for onAttemptComplete:\n';
this.messages.value += `${JSON.stringify(data)}\n`;
}
}
195 changes: 195 additions & 0 deletions test/metrics-collector/metrics-collector.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,195 @@
// Copyright 2025 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// https://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

import {describe} from 'mocha';
import * as assert from 'assert';
import * as fs from 'fs';
import {TestMetricsHandler} from '../../test-common/test-metrics-handler';
import {OperationMetricsCollector} from '../../src/client-side-metrics/operation-metrics-collector';
import {
MethodName,
StreamingState,
} from '../../src/client-side-metrics/client-side-metrics-attributes';
import {grpc} from 'google-gax';
import {expectedRequestsHandled} from './metrics-handler-fixture';
import * as gax from 'google-gax';
const root = gax.protobuf.loadSync(
'./protos/google/bigtable/v2/response_params.proto'
);
const ResponseParams = root.lookupType('ResponseParams');

/**
* A fake implementation of the Bigtable client for testing purposes. Provides a
* metricsTracerFactory and a stubbed projectId method.
*/
class FakeBigtable {
clientUid = 'fake-uuid';
appProfileId?: string;
projectId = 'my-project';
}

/**
* A fake implementation of a Bigtable instance for testing purposes. Provides only an ID.
*/
class FakeInstance {
/**
* The ID of the fake instance.
*/
id = 'fakeInstanceId';
}

describe('Bigtable/MetricsCollector', () => {
const logger = {value: ''};
const originalDate = global.Date;

before(() => {
let mockTime = new Date('1970-01-01T00:00:01.000Z').getTime();

(global as any).Date = class extends originalDate {
constructor(...args: any[]) {
// Using a rest parameter
if (args.length === 0) {
super(mockTime);
logger.value += `getDate call returns ${mockTime.toString()} ms\n`;
mockTime += 1000;
}
}

static now(): number {
return mockTime;
}

static parse(dateString: string): number {
return originalDate.parse(dateString);
}

static UTC(
year: number,
month: number,
date?: number,
hours?: number,
minutes?: number,
seconds?: number,
ms?: number
): number {
return originalDate.UTC(year, month, date, hours, minutes, seconds, ms);
}
};
});

after(() => {
(global as any).Date = originalDate;
});

it('should record the right metrics with a typical method call', async () => {
const testHandler = new TestMetricsHandler(logger);
const metricsHandlers = [testHandler];
class FakeTable {
id = 'fakeTableId';
instance = new FakeInstance();
bigtable = new FakeBigtable();

async fakeMethod(): Promise<void> {
function createMetadata(duration: string) {
return {
internalRepr: new Map([
['server-timing', [`gfet4t7; dur=${duration}`]],
]),
options: {},
};
}
if (this.bigtable.projectId) {
const status = {
metadata: {
internalRepr: new Map([
[
'x-goog-ext-425905942-bin',
[
ResponseParams.encode({
zoneId: 'us-west1-c',
clusterId: 'fake-cluster3',
}).finish(),
],
],
]),
options: {},
},
};
const metricsCollector = new OperationMetricsCollector(
this,
metricsHandlers,
MethodName.READ_ROWS,
StreamingState.STREAMING
);
// In this method we simulate a series of events that might happen
// when a user calls one of the Table methods.
// Here is an example of what might happen in a method call:
logger.value += '1. The operation starts\n';
metricsCollector.onOperationStart();
logger.value += '2. The attempt starts.\n';
metricsCollector.onAttemptStart();
logger.value += '3. Client receives status information.\n';
metricsCollector.onStatusMetadataReceived(status);
logger.value += '4. Client receives metadata.\n';
metricsCollector.onMetadataReceived(createMetadata('101'));
logger.value += '5. Client receives first row.\n';
metricsCollector.onResponse(this.bigtable.projectId);
logger.value += '6. Client receives metadata.\n';
metricsCollector.onMetadataReceived(createMetadata('102'));
logger.value += '7. Client receives second row.\n';
metricsCollector.onResponse(this.bigtable.projectId);
logger.value += '8. A transient error occurs.\n';
metricsCollector.onAttemptComplete(
this.bigtable.projectId,
grpc.status.DEADLINE_EXCEEDED
);
logger.value += '9. After a timeout, the second attempt is made.\n';
metricsCollector.onAttemptStart();
logger.value += '10. Client receives status information.\n';
metricsCollector.onStatusMetadataReceived(status);
logger.value += '11. Client receives metadata.\n';
metricsCollector.onMetadataReceived(createMetadata('103'));
logger.value += '12. Client receives third row.\n';
metricsCollector.onResponse(this.bigtable.projectId);
logger.value += '13. Client receives metadata.\n';
metricsCollector.onMetadataReceived(createMetadata('104'));
logger.value += '14. Client receives fourth row.\n';
metricsCollector.onResponse(this.bigtable.projectId);
logger.value += '15. User reads row 1\n';
logger.value += '16. Stream ends, operation completes\n';
metricsCollector.onAttemptComplete(
this.bigtable.projectId,
grpc.status.OK
);
metricsCollector.onOperationComplete(
this.bigtable.projectId,
grpc.status.OK
);
}
}
}
const table = new FakeTable();
await table.fakeMethod();
const expectedOutput = fs.readFileSync(
'./test/metrics-collector/typical-method-call.txt',
'utf8'
);
// Ensure events occurred in the right order here:
assert.strictEqual(logger.value, expectedOutput.replace(/\r/g, ''));
assert.deepStrictEqual(
testHandler.requestsHandled,
expectedRequestsHandled
);
});
});
70 changes: 70 additions & 0 deletions test/metrics-collector/metrics-handler-fixture.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,70 @@
// Copyright 2025 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// https://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

export const expectedRequestsHandled = [
{
attemptLatency: 2000,
serverLatency: 101,
connectivityErrorCount: 0,
streamingOperation: 'true',
attemptStatus: 4,
clientName: 'nodejs-bigtable',
metricsCollectorData: {
appProfileId: undefined,
instanceId: 'fakeInstanceId',
table: 'fakeTableId',
cluster: 'fake-cluster3',
zone: 'us-west1-c',
methodName: 'Bigtable.ReadRows',
clientUid: 'fake-uuid',
},
projectId: 'my-project',
},
{
attemptLatency: 2000,
serverLatency: 103,
connectivityErrorCount: 0,
streamingOperation: 'true',
attemptStatus: 0,
clientName: 'nodejs-bigtable',
metricsCollectorData: {
appProfileId: undefined,
instanceId: 'fakeInstanceId',
table: 'fakeTableId',
cluster: 'fake-cluster3',
zone: 'us-west1-c',
methodName: 'Bigtable.ReadRows',
clientUid: 'fake-uuid',
},
projectId: 'my-project',
},
{
finalOperationStatus: 0,
streamingOperation: 'true',
metricsCollectorData: {
appProfileId: undefined,
instanceId: 'fakeInstanceId',
table: 'fakeTableId',
cluster: 'fake-cluster3',
zone: 'us-west1-c',
methodName: 'Bigtable.ReadRows',
clientUid: 'fake-uuid',
},
clientName: 'nodejs-bigtable',
projectId: 'my-project',
operationLatency: 7000,
retryCount: 1,
firstResponseLatency: 5000,
},
];
30 changes: 30 additions & 0 deletions test/metrics-collector/typical-method-call.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,30 @@
1. The operation starts
getDate call returns 1000 ms
2. The attempt starts.
getDate call returns 2000 ms
3. Client receives status information.
4. Client receives metadata.
5. Client receives first row.
getDate call returns 3000 ms
6. Client receives metadata.
7. Client receives second row.
8. A transient error occurs.
getDate call returns 4000 ms
Recording parameters for onAttemptComplete:
{"attemptLatency":2000,"serverLatency":101,"connectivityErrorCount":0,"streamingOperation":"true","attemptStatus":4,"clientName":"nodejs-bigtable","metricsCollectorData":{"instanceId":"fakeInstanceId","table":"fakeTableId","cluster":"fake-cluster3","zone":"us-west1-c","methodName":"Bigtable.ReadRows","clientUid":"fake-uuid"},"projectId":"my-project"}
9. After a timeout, the second attempt is made.
getDate call returns 5000 ms
10. Client receives status information.
11. Client receives metadata.
12. Client receives third row.
getDate call returns 6000 ms
13. Client receives metadata.
14. Client receives fourth row.
15. User reads row 1
16. Stream ends, operation completes
getDate call returns 7000 ms
Recording parameters for onAttemptComplete:
{"attemptLatency":2000,"serverLatency":103,"connectivityErrorCount":0,"streamingOperation":"true","attemptStatus":0,"clientName":"nodejs-bigtable","metricsCollectorData":{"instanceId":"fakeInstanceId","table":"fakeTableId","cluster":"fake-cluster3","zone":"us-west1-c","methodName":"Bigtable.ReadRows","clientUid":"fake-uuid"},"projectId":"my-project"}
getDate call returns 8000 ms
Recording parameters for onOperationComplete:
{"finalOperationStatus":0,"streamingOperation":"true","metricsCollectorData":{"instanceId":"fakeInstanceId","table":"fakeTableId","cluster":"fake-cluster3","zone":"us-west1-c","methodName":"Bigtable.ReadRows","clientUid":"fake-uuid"},"clientName":"nodejs-bigtable","projectId":"my-project","operationLatency":7000,"retryCount":1,"firstResponseLatency":5000}

Unchanged files with check annotations Beta

}
}
function isStringArray(array: any): array is string[] {

Check warning on line 516 in src/app-profile.ts

GitHub Actions / lint

Unexpected any. Specify a different type
return array.every((cluster: any) => {

Check warning on line 517 in src/app-profile.ts

GitHub Actions / lint

Unexpected any. Specify a different type
return typeof cluster === 'string';
});
}
function isClusterArray(array: any): array is Cluster[] {

Check warning on line 522 in src/app-profile.ts

GitHub Actions / lint

Unexpected any. Specify a different type
return array.every((cluster: any) => {

Check warning on line 523 in src/app-profile.ts

GitHub Actions / lint

Unexpected any. Specify a different type
return isCluster(cluster);
});
}
function isCluster(cluster: any): cluster is Cluster {

Check warning on line 528 in src/app-profile.ts

GitHub Actions / lint

Unexpected any. Specify a different type
return (
(cluster as Cluster).bigtable !== undefined &&
(cluster as Cluster).instance !== undefined &&
import {Mutation, ConvertFromBytesUserOptions, Bytes, Data} from './mutation';
import {Bigtable} from '.';
import {
Table,

Check warning on line 21 in src/row.ts

GitHub Actions / lint

'Table' is defined but never used
Entry,
MutateCallback,
MutateResponse,
// Handling retries in this client. Specify the retry options to
// make sure nothing is retried in retry-request.
noResponseRetries: 0,
shouldRetryFn: (_: any) => {

Check warning on line 363 in src/tabular-api-surface.ts

GitHub Actions / lint

'_' is defined but never used

Check warning on line 363 in src/tabular-api-surface.ts

GitHub Actions / lint

Unexpected any. Specify a different type
return false;
},
};
userStream.emit('error', error);
}
})
.on('data', _ => {

Check warning on line 553 in src/tabular-api-surface.ts

GitHub Actions / lint

'_' is defined but never used
// Reset error count after a successful read so the backoff
// time won't keep increasing when as stream had multiple errors
numConsecutiveErrors = 0;
// Handling retries in this client. Specify the retry options to
// make sure nothing is retried in retry-request.
noResponseRetries: 0,
shouldRetryFn: (_: any) => {

Check warning on line 812 in src/tabular-api-surface.ts

GitHub Actions / lint

'_' is defined but never used
return false;
},
};