
Reading an append blob to a string takes long and then returns HTTP 412 #51

Closed · petmat opened this issue Jan 4, 2019 · 2 comments
Labels: question (Further information is requested)

petmat commented Jan 4, 2019

Which service (blob, file, queue, table) does this issue concern?

blob

Which version of the SDK was used?

@azure/storage-blob@10.3.0

What's the Node.js/Browser version?

Node.js v10.14.1

What problem was encountered?

Following the example at https://github.com/Azure/azure-storage-js/blob/master/blob/samples/basic.sample.js, I was trying to read an append blob from my storage account into a string. The streamToString function took a really long time and then finally failed with an HTTP 412 error and an errorCode of undefined. I suspect the error has something to do with the fact that I am reading an append blob that is constantly getting more lines appended to it. It is a log file, and I would just like to read the current snapshot of it. I could not find any examples dealing with a scenario like mine. Any help would be appreciated!

The detailed error is below:

{ Error: Unexpected status code: 412
    at new RestError (C:\projects\xxx\RequestLogViewer\node_modules\@azure\ms-rest-js\dist\msRest.node.js:1397:28)
    at C:\projects\xxx\RequestLogViewer\node_modules\@azure\ms-rest-js\dist\msRest.node.js:1849:37
    at process._tickCallback (internal/process/next_tick.js:68:7)
  code: undefined,
  statusCode: 412,
  request:
   WebResource {
     streamResponseBody: true,
     url:
      'https://xxxstor.blob.core.windows.net/request-logs/2019-01-04.txt',
     method: 'GET',
     headers: HttpHeaders { _headersMap: [Object] },
     body: undefined,
     query: undefined,
     formData: undefined,
     withCredentials: false,
     abortSignal:
      a {
        _aborted: false,
        children: [],
        abortEventListeners: [Array],
        parent: undefined,
        key: undefined,
        value: undefined },
     timeout: 0,
     onUploadProgress: undefined,
     onDownloadProgress: undefined,
     operationSpec:
      { httpMethod: 'GET',
        path: '{containerName}/{blob}',
        urlParameters: [Array],
        queryParameters: [Array],
        headerParameters: [Array],
        responses: [Object],
        isXML: true,
        serializer: [Serializer] } },
  response:
   { body: undefined,
     headers: HttpHeaders { _headersMap: [Object] },
     status: 412 },
  body: undefined }

Steps to reproduce the issue?

Here is my code:

const {
  Aborter,
  BlobURL,
  ContainerURL,
  SharedKeyCredential,
  ServiceURL,
  StorageURL,
} = require('@azure/storage-blob');
const format = require('date-fns/format');

// Collect the stream's chunks and resolve with them joined into one string.
async function streamToString(readableStream) {
  return new Promise((resolve, reject) => {
    const chunks = [];
    readableStream.on('data', (data) => {
      chunks.push(data.toString());
    });
    readableStream.on('end', () => {
      resolve(chunks.join(''));
    });
    readableStream.on('error', reject);
  });
}

async function run() {
  const accountName = 'xxxstor';
  const accountKey = 'omitted';
  const credential = new SharedKeyCredential(accountName, accountKey);
  const pipeline = StorageURL.newPipeline(credential);
  const serviceURL = new ServiceURL(
    `https://${accountName}.blob.core.windows.net`,
    pipeline
  );
  const containerName = 'request-logs';
  const containerURL = ContainerURL.fromServiceURL(serviceURL, containerName);
  const blobName = `${format(new Date(), 'YYYY-MM-DD[.txt]')}`;
  const blobURL = BlobURL.fromContainerURL(containerURL, blobName);
  console.log('Downloading blob...');
  const response = await blobURL.download(Aborter.none, 0);
  console.log('Reading response to string...');
  const body = await streamToString(response.readableStreamBody);
  console.log(body.length);
}

run().catch((err) => {
  console.error(err);
});

Have you found a mitigation/solution?

no

XiaoningLiu (Member)

@petmat You are right about "trying to read an append blob that is constantly getting more lines in it."

blobURL.download() downloads a blob into a stream with an HTTP GET request. When the stream ends unexpectedly (for example, because the network connection broke), a retry resumes reading the stream from the break point with a new HTTP GET request.

The second HTTP request sends the conditional header If-Match with the blob's ETag from the first response, to make sure the blob has not changed between the two requests. If it has changed, the service returns a 412 "condition not met" error. This strict strategy avoids data-integrity issues, such as the blob being completely overwritten by someone else mid-download. However, it also prevents you from reading a constantly updated log file whenever a retry happens.
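To illustrate, the resume request is roughly equivalent to re-issuing a conditional, offset download like this (a simplified sketch of the behavior described above, not the SDK's actual internals; bytesAlreadyRead and firstResponse are hypothetical names):

// Simplified sketch: the resume GET starts at the broken offset and is
// pinned to the first response's ETag; a changed blob yields HTTP 412.
const resumed = await blobURL.download(Aborter.none, bytesAlreadyRead, undefined, {
  blobAccessConditions: {
    modifiedAccessConditions: { ifMatch: firstResponse.eTag },
  },
});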

While I don't think this is a bug, we do need to make this scenario work for you. There are two solutions; please give them a try:

1. Snapshot the append blob first, and read from the snapshot blob.
2. Set maxRetryRequests to 0 in the blobURL.download() options and download a small range (for example, 4 MB) at a time. With no conditional retry there will be no 412 error either, but the returned stream may be incomplete when there are network issues; downloading a small range helps avoid that. Check the stream length after each read (see the sketch below).
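Here is a minimal sketch of option 2, assuming a 4 MB chunk size and a streamToBuffer helper defined inline (the total length is pinned up front, so lines appended after that point are simply ignored):

const { Aborter } = require('@azure/storage-blob');

const FOUR_MB = 4 * 1024 * 1024;

// Collect a readable stream's chunks into a single Buffer.
async function streamToBuffer(readableStream) {
  return new Promise((resolve, reject) => {
    const chunks = [];
    readableStream.on('data', (data) => chunks.push(data));
    readableStream.on('end', () => resolve(Buffer.concat(chunks)));
    readableStream.on('error', reject);
  });
}

async function downloadInRanges(blobURL) {
  // Pin the length at the start; the append blob may keep growing.
  const props = await blobURL.getProperties(Aborter.none);
  const total = props.contentLength;
  const buffers = [];
  for (let offset = 0; offset < total; offset += FOUR_MB) {
    const count = Math.min(FOUR_MB, total - offset);
    const response = await blobURL.download(Aborter.none, offset, count, {
      maxRetryRequests: 0, // no conditional resume, hence no 412
    });
    const chunk = await streamToBuffer(response.readableStreamBody);
    if (chunk.length !== count) {
      // With retries disabled, a network hiccup can end the stream early.
      throw new Error(`Incomplete read at offset ${offset}`);
    }
    buffers.push(chunk);
  }
  return Buffer.concat(buffers).toString();
}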

XiaoningLiu self-assigned this Jan 7, 2019
XiaoningLiu added the question (Further information is requested) label Jan 7, 2019

petmat commented Jan 8, 2019

I went with solution number one and downloaded a snapshot of the blob with the following code:

// ...
const blobURL = BlobURL.fromContainerURL(containerURL, blobName);
console.log('Downloading blob...');
const snapshotResponse = await blobURL.createSnapshot(Aborter.none);
const snapshotURL = blobURL.withSnapshot(snapshotResponse.snapshot);
const response = await snapshotURL.download(Aborter.none, 0);
console.log('Reading response to string...', response.contentLength);
const body = await streamToString(response.readableStreamBody);
// ...

Downloading the blob did take some time, though (a couple of minutes), but then again the size of the blob is ~140 MB. I didn't explore solution number two because it seemed a bit contrived, but I'm happy with this 👍 Next I'm going to implement a strategy for deciding how often a new snapshot is created, and possibly for removing old ones (see the sketch below).
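For the cleanup part, removing a blob's old snapshots while keeping the base blob can be done with the deleteSnapshots option on delete (a minimal sketch; 'only' deletes the snapshots but leaves the base blob intact, whereas 'include' would delete both):

// Delete every snapshot of the blob but keep the base blob itself.
await blobURL.delete(Aborter.none, { deleteSnapshots: 'only' });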

Thank you!
