Finding regular expression literals in a string of JavaScript code
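I am doing a crude kind of parsing of JavaScript code, in JavaScript. I'll spare the details of why I need to do this, but suffice it to say that I don't want to integrate a large chunk of library code: it is unnecessary for my purposes, and it is important that I keep this lightweight and relatively simple. So please don't suggest that I use JsLint or anything like that. If the answer needs more code than you can paste into an answer, it is probably more than I want.
My code currently does a good job of detecting quoted sections and comments, and then matching braces, brackets and parentheses (making sure not to be confused by quotes and comments, or by escapes within quotes, of course). That is all I need it to do, and it does it well, with one exception:
It can be confused by regular expression literals. So I'm hoping for some help with detecting regular expression literals in a string of JavaScript code, so that I can handle them appropriately.
Something like this: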
function getRegExpLiterals(stringOfJavascriptCode) {
    var output = [];
    // todo!
    return output;
}
var jsString = "var regexp1 = /abcd/g, regexp1 = /efg/;";
console.log(getRegExpLiterals(jsString));
// should print (zero-based indices):
// [{startIndex: 14, length: 7}, {startIndex: 33, length: 5}]
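es5-lexer is a JS lexer that uses a very accurate heuristic to distinguish regular expressions from division expressions in JS code. It also provides a token-level transformation that you can use to make sure the resulting program will be interpreted by a full JS parser in the same way as by the lexer.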
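To see why a heuristic is needed at all, here is a tiny illustration (the variable names are made up for this example). The same characters `/ b / g` are two division operators in one statement and a regular expression literal with a flag in the next; only the token before the first `/` tells them apart.

```javascript
var a = 12, b = 3, g = 2;

// After an identifier such as `a`, a slash can only be a division operator.
var quotient = a / b / g;        // (12 / 3) / 2, which is 2

// After `=`, an expression must start, so the slash opens a regexp literal.
var matched = /b/g.test("abc");  // true: "abc" contains "b"

console.log(quotient, matched);
```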
The bit that decides whether a / starts a regular expression is in guess_is_regexp.js, and the tests start at line 401 of scanner_test.js.
var REGEXP_PRECEDER_TOKEN_RE = new RegExp(
  "^(?:"  // Match the whole tokens below
    + "break"
    + "|case"
    + "|continue"
    + "|delete"
    + "|do"
    + "|else"
    + "|finally"
    + "|in"
    + "|instanceof"
    + "|return"
    + "|throw"
    + "|try"
    + "|typeof"
    + "|void"
    // Binary operators which cannot be followed by a division operator.
    + "|[+]"  // Match + but not ++.  += is handled below.
    + "|-"    // Match - but not --.  -= is handled below.
    + "|[.]"  // Match . but not a number with a trailing decimal.
    + "|[/]"  // Match /, but not a regexp.  /= is handled below.
    + "|,"    // Second binary operand cannot start a division.
    + "|[*]"  // Ditto binary operand.
  + ")$"
  // Or match a token that ends with one of the characters below to match
  // a variety of punctuation tokens.
  // Some of the single char tokens could go above, but putting them below
  // allows closure-compiler's regex optimizer to do a better job.
  // The right column explains why the terminal character to the left can only
  // precede a regexp.
  + "|["
    + "!"  // !           prefix operator operand cannot start with a division
    + "%"  // %           second binary operand cannot start with a division
    + "&"  // &, &&       ditto binary operand
    + "("  // (           expression cannot start with a division
    + ":"  // :           property value, labelled statement, and operand of ?:
           //             cannot start with a division
    + ";"  // ;           statement & for condition cannot start with division
    + "<"  // <, <<, <<   ditto binary operand
    // !=, !==, %=, &&=, &=, *=, +=, -=, /=, <<=, <=, =, ==, ===, >=, >>=, >>>=,
    // ^=, |=, ||=
    // All are binary operands (assignment ops or comparisons) whose right
    // operand cannot start with a division operator
    + "="
    + ">"  // >, >>, >>>  ditto binary operand
    + "?"  // ?           expression in ?: cannot start with a division operator
    + "["  // [           first array value & key expression cannot start with
           //             a division
    + "^"  // ^           ditto binary operand
    + "{"  // {           statement in block and object property key cannot start
           //             with a division
    + "|"  // |, ||       ditto binary operand
    + "}"  // }           PROBLEMATIC: could be an object literal divided or
           //             a block.  More likely to be start of a statement after
           //             a block which cannot start with a /.
    + "~"  // ~           ditto binary operand
  + "]$"
  // The exclusion of ++ and -- from the above is also problematic.
  // Both are prefix and postfix operators.
  // Given that there is rarely a good reason to increment a regular expression
  // and good reason to have a post-increment operator as the left operand of
  // a division (x++ / y) this pattern treats ++ and -- as division preceders.
);
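As a rough illustration of how that pattern could drive the getRegExpLiterals function from the question, here is a minimal sketch. It is not code from es5-lexer. PRECEDER_RE is a compressed copy of the pattern above, findRegExpEnd is a hypothetical helper invented for this example, and the tokenizer is deliberately crude: it assumes ES5 source and ignores template literals.

```javascript
// Compressed copy of the preceder pattern shown above (an assumption made for
// this sketch, not the es5-lexer source file itself).
var PRECEDER_RE = new RegExp(
  "^(?:break|case|continue|delete|do|else|finally|in|instanceof|return" +
  "|throw|try|typeof|void|[+]|-|[.]|[/]|,|[*])$" +
  "|[!%&(:;<=>?[^{|}~]$");

// Hypothetical helper: index of the slash closing the regexp that opens at
// slashIndex, or -1 if the line ends first. Skips escapes and [...] classes.
function findRegExpEnd(code, slashIndex) {
  var inClass = false;
  for (var j = slashIndex + 1; j < code.length; j++) {
    var ch = code.charAt(j);
    if (ch === "\n") return -1;
    if (ch === "\\") { j++; }                     // skip the escaped character
    else if (ch === "[") { inClass = true; }
    else if (ch === "]") { inClass = false; }
    else if (ch === "/" && !inClass) { return j; }
  }
  return -1;
}

function getRegExpLiterals(code) {
  var output = [];
  var lastToken = "";  // text of the last significant token ("" at the start)
  var i = 0, n = code.length;
  while (i < n) {
    var c = code.charAt(i), next = code.charAt(i + 1), start = i;
    if (/\s/.test(c)) { i++; continue; }          // whitespace is not a token
    if (c === "/" && next === "/") {              // line comment
      while (i < n && code.charAt(i) !== "\n") i++;
      continue;
    }
    if (c === "/" && next === "*") {              // block comment
      var end = code.indexOf("*/", i + 2);
      i = end < 0 ? n : end + 2;
      continue;
    }
    if (c === '"' || c === "'") {                 // string literal
      i++;
      while (i < n && code.charAt(i) !== c) i += code.charAt(i) === "\\" ? 2 : 1;
      i++;
    } else if (/[\w$]/.test(c)) {                 // identifier, keyword, number
      var wordChars = /\d/.test(c) ? /[\w$.]/ : /[\w$]/;
      while (i < n && wordChars.test(code.charAt(i))) i++;
    } else if (c === "/" && (lastToken === "" || PRECEDER_RE.test(lastToken))) {
      var close = findRegExpEnd(code, i);
      if (close < 0) {
        i++;                                      // unterminated: treat as operator
      } else {
        i = close + 1;
        while (i < n && /[A-Za-z]/.test(code.charAt(i))) i++;  // flags
        output.push({ startIndex: start, length: i - start });
      }
    } else if ((c === "+" || c === "-") && next === c) {
      i += 2;                                     // keep ++ and -- as one token
    } else {
      i++;                                        // any other punctuation char
    }
    lastToken = code.slice(start, Math.min(i, n));
  }
  return output;
}

var jsString = "var regexp1 = /abcd/g, regexp1 = /efg/;";
console.log(JSON.stringify(getRegExpLiterals(jsString)));
// prints: [{"startIndex":14,"length":7},{"startIndex":33,"length":5}]
```

The indices are zero-based. Because the decision looks only at the previous token, the sketch inherits the limitations noted in the comments above: a regexp directly after `)` (as in `if (x) /re/.test(y)`) is taken for a division, and a `/` after `}` is always taken for the start of a regexp.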