Node.js slicing a very large buffer running out of memory

Updated: 2023-09-26

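I have a very large base64-encoded string that needs to be read into a byte (Uint8) array. I then need to split that byte array into chunks of a specified size and base64-encode each chunk separately. The function below works, but every call to .slice or .toString increases memory on the heap because (I believe) it copies the buffer. For especially large base64-encoded strings, the application runs out of heap space. How can I split the data into chunks of the specified size and base64-encode them without running out of memory?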

const process = function (reallyLargeBase64EncodedString, splitSize) {
    var theBuffer = Buffer.from(reallyLargeBase64EncodedString, 'base64');
    //var tempBuffer = new Buffer(splitSize);
    for (var start = 0; start < theBuffer.length; start += splitSize) {
        //for(var z = 0; z < splitSize; z++){
        //    tempBuffer.writeUInt8( theBuffer[start+z],z);
        //}
        //var base64EncodedVal = tempBuffer.toString('base64');
        //var base64EncodedVal = theBuffer.buffer.toString('base64', start, start+splitSize);
        var base64EncodedVal = theBuffer.slice(start, start + splitSize).toString('base64');
        //do stuff with the base64 encoded value
    }
};

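I would recommend using node's streaming interface to deal with something that large. If your base64-encoded string is coming from a file or a network request, you can pipe it directly from the input into a base64 decoding stream such as base64-stream.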
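To make the idea of a base64 decoding stream concrete, here is a rough sketch of what such a stream has to do. It is not the base64-stream package: `decodeComplete` and `Base64DecodeSketch` are names made up for illustration. The one subtlety is that input chunks can end in the middle of a 4-character base64 group, so the leftover characters are held back until the next chunk arrives. The sketch assumes any `=` padding appears only at the very end of the input.

```javascript
const { Transform } = require('stream');

// Hypothetical helper (not part of base64-stream): decode only the complete
// 4-character base64 groups and hand back the characters that must wait.
function decodeComplete(pending, text) {
    // Drop line breaks and other whitespace that wrapped base64 often contains.
    const clean = (pending + text).replace(/\s+/g, '');
    const usable = clean.length - (clean.length % 4);
    return {
        bytes: Buffer.from(clean.slice(0, usable), 'base64'),
        rest: clean.slice(usable),
    };
}

// Hypothetical stand-in for a base64 decoding stream.
class Base64DecodeSketch extends Transform {
    constructor(options) {
        super(options);
        this.pending = ''; // base64 characters not yet decoded
    }

    _transform(chunk, encoding, cb) {
        const result = decodeComplete(this.pending, chunk.toString('ascii'));
        this.pending = result.rest;
        if (result.bytes.length > 0) {
            this.push(result.bytes);
        }
        cb();
    }

    _flush(cb) {
        // Decode a trailing partial group if the input length was not a multiple of 4.
        if (this.pending.length > 0) {
            this.push(Buffer.from(this.pending, 'base64'));
        }
        cb();
    }
}
```

Because only one input chunk plus a few leftover characters are held at a time, memory use should stay roughly constant regardless of how long the encoded input is.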

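In order to chunk your data and re-encode each chunk, you'll have to write your own transform stream (a stream that sits between input and output). That would look something like: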

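// NOTE: the following code has been tested in node 6.
// Since it relies on the new Buffer API, it must be run in 5.10+.
var Transform = require('stream').Transform;

class ChunkEncode extends Transform {
    constructor(options){
        super(options);
        this.splitSize = options.splitSize;
        this.buffer = Buffer.alloc(0);
    }

    _transform(chunk, encoding, cb){
        // chunk is a Buffer; append it to whatever is left over from earlier calls.
        this.buffer = Buffer.concat([this.buffer, chunk]);
        while (this.buffer.length >= this.splitSize){
            // slice() returns a view onto the same memory, not a copy.
            let piece = this.buffer.slice(0, this.splitSize);
            // Encode and write back to the stream.
            this.push(piece.toString('base64'));
            // throw in a newline for visibility.
            this.push('\n');
            // chop off `splitSize` from the start of our buffer.
            this.buffer = this.buffer.slice(this.splitSize);
        }
        // Signal that this chunk has been consumed; without this the stream stalls.
        cb();
    }

    _flush(cb){
        // Emit whatever is left (shorter than splitSize) when the input ends.
        if (this.buffer.length > 0){
            this.push(this.buffer.toString('base64'));
            this.push('\n');
        }
        cb();
    }
}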

Then you should be able to do this:

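var fs     = require('fs');
var base64 = require('base64-stream');

// base64.decode() is the API of older base64-stream releases;
// newer releases may export a Base64Decode class instead.
fs.createReadStream('./long-base64-string')
    .pipe(base64.decode())
    .pipe(new ChunkEncode({splitSize : 128}))
    .pipe(process.stdout);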

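This logs to standard out, but you could just as easily write to a file or a network stream. If you need to manipulate the data further, you could create a write stream that lets you do something with each chunk of data as it comes in.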
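A write stream for that last step might look roughly like the sketch below. `splitLines` and `ChunkSink` are hypothetical names, and the sketch assumes the upstream stream separates its base64 chunks with newlines, as `ChunkEncode` does. Since a stream may deliver several pushed pieces joined together, or a piece split in two, the sink re-splits on newlines rather than trusting chunk boundaries.

```javascript
const { Writable } = require('stream');

// Hypothetical helper: split accumulated text into complete lines plus a remainder.
function splitLines(pending, text) {
    const parts = (pending + text).split('\n');
    const rest = parts.pop(); // text after the last newline (may be '')
    return { lines: parts.filter((line) => line.length > 0), rest };
}

// Hypothetical sink: calls `onChunk` once per newline-terminated base64 chunk.
class ChunkSink extends Writable {
    constructor(onChunk, options) {
        super(options);
        this.onChunk = onChunk;
        this.pending = ''; // text received after the last newline
    }

    _write(data, encoding, cb) {
        const result = splitLines(this.pending, data.toString('ascii'));
        this.pending = result.rest;
        result.lines.forEach((line) => this.onChunk(line));
        cb();
    }

    _final(cb) {
        // Hand over a last chunk that was not followed by a newline.
        if (this.pending.length > 0) {
            this.onChunk(this.pending);
        }
        cb();
    }
}
```

You would then end the pipeline with something like `.pipe(new ChunkSink(function (b64) { /* do stuff with the base64 encoded value */ }))` in place of `.pipe(process.stdout)`. Note that `_final` requires a reasonably recent version of node (8 or later).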