JS regex not returning all matched groups

Keywords: regular expression, return, JS      Updated: 2023-09-26

My string is as follows:

var data = "Validation failed: Attachments document 01april2015_-_Copy.csv has contents that are not what they are reported to be, Attachments document 01april2015.csv has contents that are not what they are reported to be"

My regex:

var regex = /Validation failed:(?:(?:,)* Attachments document ([^,]*) has contents that are not what they are reported to be)+/;
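Result:

data.match(regex)

["Validation failed: Attachments document 01april2015_-_Copy.csv has contents that are not what they are reported to be, Attachments document 01april2015.csv has contents that are not what they are reported to be", "01april2015.csv"]

data.match(regex).length == 2
true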

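Expected result:

data.match(regex)

["Validation failed: Attachments document 01april2015_-_Copy.csv has contents that are not what they are reported to be, Attachments document 01april2015.csv has contents that are not what they are reported to be", "01april2015_-_Copy.csv", "01april2015.csv"]

data.match(regex).length == 3
true

I cannot understand why the first file name (01april2015_-_Copy.csv) is not returned after matching.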

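JS has no Captures collection like the one in C#, so a capture group inside a repeated group keeps only the text from its last iteration. I therefore suggest shortening the regex, adding the g flag, and using exec so that no captured text is lost: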

var re = /Attachments document ([^,]*) has contents that are not what they are reported to be/g; 
var str = 'Validation failed: Attachments document 01april2015_-_Copy.csv has contents that are not what they are reported to be, Attachments document 01april2015.csv has contents that are not what they are reported to be';
var m;
var arr = [str];
while ((m = re.exec(str)) !== null) {
    if (m.index === re.lastIndex) {
        re.lastIndex++;
    }
    arr.push(m[1]);
}
console.log(arr);
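
As an alternative to the exec loop above, runtimes that support ES2020 provide String.prototype.matchAll, which returns one match object per occurrence. This is not part of the original answer; it is a minimal sketch, and the names input, pattern and fileNames are chosen only for illustration:

```javascript
// Assumption: an ES2020+ runtime where String.prototype.matchAll is available.
const input = 'Validation failed: Attachments document 01april2015_-_Copy.csv has contents that are not what they are reported to be, Attachments document 01april2015.csv has contents that are not what they are reported to be';
const pattern = /Attachments document ([^,]*) has contents that are not what they are reported to be/g;

// matchAll requires the g flag and yields one match object per occurrence;
// index 1 of each match object holds the first capture group.
const fileNames = Array.from(input.matchAll(pattern), (match) => match[1]);

console.log(fileNames); // ["01april2015_-_Copy.csv", "01april2015.csv"]
```

Unlike exec, matchAll works on a copy of the regex, so the lastIndex of pattern is left unchanged after the call.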

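Note that you can use the shortest possible pattern that matches the required substring to look for multiple matches. We cannot use String#match because:

If the regular expression includes the g flag, the method returns an Array containing all matched substrings rather than match objects. Captured groups are not returned.

If you want to obtain capture groups and the global flag is set, you need to use RegExp.exec() instead.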
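
The difference described in these quotes can be seen on a shortened string. The sketch below uses a hypothetical sample text and simplified patterns, not the ones from the question, to keep the output readable:

```javascript
// Hypothetical, shortened stand-in for the string in the question.
const text = 'document a.csv ok, document b.csv ok';

// With the g flag, match() returns only the full matched substrings.
const globalMatches = text.match(/document ([^,]*) ok/g);
console.log(globalMatches); // ["document a.csv ok", "document b.csv ok"]

// Without g, match() returns the first match plus its capture groups.
const firstMatch = text.match(/document ([^,]*) ok/);
console.log(firstMatch[1]); // "a.csv"

// A capture group inside a repeated group keeps only its last iteration,
// which mirrors why the question's pattern yields only the second file name.
const repeated = text.match(/(?:,? ?document ([^,]*) ok)+/);
console.log(repeated[1]); // "b.csv"
```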

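Also see the behavior of RegExp#exec with /g:

If your regular expression uses the "g" flag, you can use the exec() method multiple times to find successive matches in the same string.

If the match succeeds, the exec() method returns an array and updates properties of the regular expression object. The returned array has the matched text as the first item, and then one item for each capturing parenthesis that matched, containing the text that was captured.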
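
To make the quoted behavior concrete, the following sketch calls exec() three times on a hypothetical sample string and records lastIndex after each call. The pattern and names are illustrative assumptions, not taken from the question:

```javascript
// Hypothetical pattern with two capture groups and the g flag.
const pairPattern = /(\d+)-(\w+)/g;
const sample = '10-apple 20-pear';

// First call: matches "10-apple" and moves lastIndex past the match.
const first = pairPattern.exec(sample);
const afterFirst = pairPattern.lastIndex;
console.log(first[0], first[1], first[2], afterFirst); // 10-apple 10 apple 8

// Second call: the search resumes from lastIndex and finds "20-pear".
const second = pairPattern.exec(sample);
const afterSecond = pairPattern.lastIndex;
console.log(second[0], second.index, afterSecond); // 20-pear 9 16

// Third call: no match is left, so exec() returns null and resets lastIndex.
const third = pairPattern.exec(sample);
console.log(third, pairPattern.lastIndex); // null 0
```

Because the state lives in lastIndex on the regex object, reusing the same g-flagged regex on a different string without resetting lastIndex can skip matches.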