Split a string by characters that are not inside a specific boundary

Keywords: character, split, string, boundary      Updated: 2023-09-26

I want to write a regular expression that splits a string on commas, except for commas that are inside ().

Examples:

"test,test,test".split(/.../) => var a = ["test", "test", "test"];
"test(123,345),test".split(/.../) => var a = ["test(123,345)", "test"];
"test(123,345),a(b,c)".split(/.../) => var a = ["test(123,345)", "a(b,c)"];
"test(cb(a,b),345),a(b(d,e,f),c),abc".split(/.../) => var a = ["test(cb(a,b),345)", "a(b(d,e,f),c)", "abc"];

I have the following regular expression, but it only works when there is no () after the first matched comma:

"test,test,test".split(/,(?!.*\))/) => OK
"test(cb(a,b),345),test,test".split(/,(?!.*\))/) => OK
"test,test(cb(a,b),345),test".split(/,(?!.*\))/) => FAIL

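Regular expressions are not a good fit for this kind of task. I think it is easier to write your own parser, which tracks the nesting level of the parentheses to decide whether a comma should split: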

function splitTokens(input) {
    var tokens = [];
    var currentToken = "";
    var nestingLevel = 0; // number of currently open "(" 
    for (var i = 0; i < input.length; i++) {
        var currentChar = input[i];
        if (currentChar === "," && nestingLevel === 0) {
            // A top-level comma ends the current token.
            tokens.push(currentToken);
            currentToken = "";
        } else {
            // Everything else, including nested commas, stays in the token.
            currentToken += currentChar;
            if (currentChar === "(") { nestingLevel++; }
            else if (currentChar === ")") { nestingLevel--; }
        }
    }
    // Push the final token, unless the input ended with a top-level comma.
    if (currentToken.length) {
        tokens.push(currentToken);
    }
    return tokens;
}

Note that I did not handle mismatched parentheses; you may need to add logic for those cases.
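
One possible way to add that logic is sketched below. It is only an illustration: the name splitTokensStrict and the choice to throw a SyntaxError are assumptions, not part of the answer above, and you might prefer to return an error value or to treat a stray ")" as ordinary text. The idea is the same nesting counter, with two extra checks: a ")" seen at level 0 has no matching "(", and a non-zero level at the end means a "(" was never closed.

```javascript
// Hypothetical strict variant of splitTokens: same nesting-level approach,
// but it rejects input whose parentheses do not balance.
function splitTokensStrict(input) {
    var tokens = [];
    var currentToken = "";
    var nestingLevel = 0;
    for (var i = 0; i < input.length; i++) {
        var currentChar = input[i];
        if (currentChar === "," && nestingLevel === 0) {
            // Top-level comma: finish the current token.
            tokens.push(currentToken);
            currentToken = "";
            continue;
        }
        if (currentChar === "(") {
            nestingLevel++;
        } else if (currentChar === ")") {
            if (nestingLevel === 0) {
                // A closing parenthesis with nothing open.
                throw new SyntaxError("Unexpected ')' at index " + i);
            }
            nestingLevel--;
        }
        currentToken += currentChar;
    }
    if (nestingLevel !== 0) {
        // At least one "(" was never closed.
        throw new SyntaxError("Missing " + nestingLevel + " closing ')'");
    }
    if (currentToken.length) {
        tokens.push(currentToken);
    }
    return tokens;
}

console.log(splitTokensStrict("test(cb(a,b),345),a(b(d,e,f),c),abc"));
// [ 'test(cb(a,b),345)', 'a(b(d,e,f),c)', 'abc' ]
```

With balanced input this behaves like the original function; input such as "a(b,c" or "a),b" throws instead of silently producing a wrong split.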