Regex to find a regular expression's flags and search pattern

Keywords: search, pattern, flags, find, regular expression, Regex. Last updated: 2023-09-26

Given a string like the following:

"/some regex/gi"

How can I get an array containing the search pattern ("some regex") and the flags ("gi")?

I tried using the match function:

> "/some regex/gi".match("/(.*)/([a-z]+)")
[ '/some regex/gi',
  'some regex',
  'gi',
  index: 0,
  input: '/some regex/gi' ]

However, this fails for regular expressions without flags (it returns null), and for other, more complex regular expressions.

Examples:

"without flags" // => ["without flags", "without flags", ...]
"/with flags/gi" // => ["/with flags/gi", "with flags", "gi", ...]
"/with\/slashes\//gi" // => ["/with\/slashes\//gi", "with\/slashes\/", "gi", ...]
"/with \/some\/(.*)regex/gi" // => ["/with \/some\/(.*)regex/gi", "with \/some\/(.*)regex", "gi", ...]

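The order within the array doesn't matter, but the search pattern and the flags should always be at the same positions.


What I actually want is to take a string and parse it. Once I have the search pattern and the flags, I want to pass them to new RegExp(searchPattern, flags), which gives me a real regular expression.

For my input, I want to accept strings both with and without slashes. No slashes (more precisely, no slash as the first character) means no flags.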

So for "/hi/gi" we would have re = new RegExp("hi", "gi"). See also the following examples:

"/singlehash" => new RegExp("/singlehash", undefined)
"single/hash" => new RegExp("single/hash", undefined)
"/hi/" => new RegExp("hi", undefined)
"hi" => new RegExp("hi", undefined)
"/hi\/slash/" => new RegExp("hi\/slash", undefined)
"/hi\/slash/gi" => new RegExp("hi\/slash", "gi")
"/^dummy.*[a-z]$/flags" => new RegExp("^dummy.*[a-z]$", "flags")

I created the following small script, which should not output any errors:

var obj = {
    "without flags": new RegExp("without flags"),
    "/something/gi": new RegExp("something", "gi"),
    "/with\/slashes\//gi": new RegExp("with\/slashes\/", "gi"),
    "/with \/some\/(.*)regex/gi": new RegExp("with \/some\/(.*)regex", "gi"),
    "/^dummy.*[a-z]$/gmi": new RegExp("^dummy.*[a-z]$", "gmi"),
    "raw input": new RegExp("raw input"),
    "/singlehash": new RegExp("/singlehash"),
    "single/hash": new RegExp("single/hash")
};
var re = /^((?:\/(.*)\/(.*)|.*))$/;
try {
    for (var s in obj) {
        var c = obj[s];
        var m = s.match(re);
        if (m === null) {
            console.error("null result for: " + s);
            continue;
        }
        console.log("> Input: " + s);
        console.log("  Pattern: " + m[1]);
        console.log("  Flags: " + m[2]);
        console.log("  Match array: ", m);
        var r = new RegExp(m[1], m[2]);
        if (r.toString() !== c.toString()) {
            console.error("Incorrect parsing for: " +  s + ". Expected " + c.toString() + " but got " + r.toString());
        } else {
            console.info("Correct parsing for: " + s);
        }
    }
} catch (e) {
    console.error("!!!! Failed to parse: " + s + "\n" + e.stack);
}

JSFIDDLE

This works even if the regex contains \/: /(\/?)(.+)\1([a-z]*)/i

With delimiters and flags:

var matches = "/some regex/gi".match(/(\/?)(.+)\1([a-z]*)/i);

Output:

["/some regex/gi", "/", "some regex", "gi"]

Without delimiters:

var matches = "without flags".match(/(\/?)(.+)\1([a-z]*)/i);

Output:

["without flags", "", "without flags", ""]

With all of our test cases:

var matches = "/some regex/gi".match(/(\/?)(.+)\1([a-z]*)/i);
==> ["/some regex/gi", "/", "some regex", "gi"]
var matches = "some regex".match(/(\/?)(.+)\1([a-z]*)/i);
==> ["some regex", "", "some regex", ""]
var matches = "/with\/slashes\//gi".match(/(\/?)(.+)\1([a-z]*)/i);
==> ["/with/slashes//gi", "/", "with/slashes/", "gi"]
var matches = "/with \/some\/(.*)regex/gi".match(/(\/?)(.+)\1([a-z]*)/i);
==> ["/with /some/(.*)regex/gi", "/", "with /some/(.*)regex", "gi"]
var matches = "/^dummy.*[a-z]$/gmi".match(/(\/?)(.+)\1([a-z]*)/i);
==> ["/^dummy.*[a-z]$/gmi", "/", "^dummy.*[a-z]$", "gmi"]
var matches = "/singlehash".match(/(\/?)(.+)\1([a-z]*)/i);
==> ["/singlehash", "", "/singlehash", ""]
var matches = "single/hash".match(/(\/?)(.+)\1([a-z]*)/i);
==> ["single/hash", "", "single/hash", ""]
var matches = "raw input".match(/(\/?)(.+)\1([a-z]*)/i);
==> ["raw input", "", "raw input", ""]

The regex is in matches[2] and the flags are in matches[3].
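To get from the match array to a real regular expression, the two groups can be handed to new RegExp. The following is only a minimal sketch: the function name parseRegexString is made up for illustration, and the ^ and $ anchors are an addition that the regex in this answer does not have.

```javascript
// Hypothetical helper, not part of the original answer.
function parseRegexString(input) {
    // Same back-reference regex as in the answer; the ^ and $ anchors are
    // an addition so that the whole input has to be consumed.
    var m = input.match(/^(\/?)(.+)\1([a-z]*)$/i);
    if (m === null) {
        return null; // nothing to parse, e.g. an empty string
    }
    // m[2] is the search pattern, m[3] the flags ("" when there are none).
    // new RegExp throws a SyntaxError if the flags are not valid.
    return new RegExp(m[2], m[3]);
}

console.log(parseRegexString("/some regex/gi").toString()); // "/some regex/gi"
console.log(parseRegexString("raw input").toString());      // "/raw input/"
```

Because \1 repeats whatever group 1 captured, a closing slash is only required when the input started with one. Otherwise group 1 is empty and the whole input becomes the pattern.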

Use backtracking. See this regex:

/^(?:\/(.*)\/([a-z]*)|(.*))$/

Here is an online code demo. It works now.
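With this regex, which capture groups are filled depends on which branch of the alternation matched. A small sketch of reading them (splitRegexString is a hypothetical name used only for illustration):

```javascript
var re = /^(?:\/(.*)\/([a-z]*)|(.*))$/;

// Hypothetical helper: reports the pattern and flags found by the regex above.
function splitRegexString(input) {
    var m = input.match(re);
    if (m === null) {
        return null; // "." does not match line breaks, so multi-line input fails
    }
    if (m[1] !== undefined) {
        // First branch matched: the input looked like /pattern/flags
        return { pattern: m[1], flags: m[2] };
    }
    // Second branch matched: the raw input is the pattern, with no flags
    return { pattern: m[3], flags: "" };
}

var a = splitRegexString("/hi/gi");      // pattern "hi", flags "gi"
var b = splitRegexString("/singlehash"); // pattern "/singlehash", flags ""
console.log(a.pattern, a.flags, b.pattern, b.flags);
```

The greedy (.*) first swallows the rest of the input and then backtracks to the last slash that is followed only by letters, which is why slashes inside the pattern are handled.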

Remove the quotes from the regex and use the regex delimiters /.../:

var obj = {
    "/without flags/": new RegExp("without flags"),
    "/something/gi": new RegExp("something", "gi"),
    "/with\/slashes\//gi": new RegExp("with\/slashes\/", "gi"),
    "/with \/some\/(.*)regex/gi": new RegExp("with \/some\/(.*)regex", "gi"),
    "/^dummy.*[a-z]$/gmi": new RegExp("^dummy.*[a-z]$", "gmi"),
    "/singlehash": new RegExp("/singlehash"),
    "single/hash": new RegExp("single/hash"),
    "raw input": new RegExp("raw input")
};
var re = /^(?:\/(.*?)\/([a-z]*)|(.+))$/i;
try {
    for (var s in obj) {
        var c = obj[s];
        var m = s.match(re);    
        if (m === null) { // no match at all, e.g. an empty string
            console.error("null result for: " + s, m);
            continue;
        }
        var regex = (m[1]==undefined)?m[3]:m[1];
        var r = (m[2]==undefined) ? new RegExp(regex) : new RegExp(regex, m[2]);
        if (r.toString() !== c.toString()) {
            console.error("Incorrect parsing for: " +  s + ". Expected " + c.toString() + " but got " + r.toString());
        } else {
            console.info("Correct parsing for: " + s);
        }
    }
} catch (e) {
    console.error("!!!! Failed to parse: " + s + "\n" + e.stack);
}

- JSFiddle demo

- RegEx demo

Why not just

/\/(.*)\/(.*)|(.*)/

In English:

Look for either
    a slash, followed by
    a (greedy) sequence of characters,
    a closing slash,
    and an optional sequence of flags
or
    any sequence of characters

See http://regex101.com/r/aY1oS8/2.
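Since the flags group in this regex is (.*), it accepts any characters, so building the RegExp can throw. One possible way to guard against that, as a sketch only (toRegExp is a hypothetical name, and returning null on failure is an assumption about what the caller wants):

```javascript
// Hypothetical wrapper around the regex from this answer.
function toRegExp(input) {
    // The last branch, (.*), can match the empty string, so for string
    // input this match never returns null.
    var m = input.match(/\/(.*)\/(.*)|(.*)/);
    var delimited = m[1] !== undefined;
    var pattern = delimited ? m[1] : m[3];
    var flags = delimited ? m[2] : "";
    try {
        return new RegExp(pattern, flags);
    } catch (e) {
        return null; // invalid flags or an invalid pattern
    }
}

console.log(toRegExp("/hi/gi").toString()); // "/hi/gi"
console.log(toRegExp("/hi/xyz"));           // null, because "xyz" are not valid flags
```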