How to find a "<script>" tag in a string with JavaScript / regular expressions

Updated: 2023-09-26

I need to validate an incoming string for the text <script.

Example:
string a = "This is a simple <script> string";

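Now I need to write a regular expression that tells me whether this string contains a <script> tag.

I ended up writing something like this: <* ?script.* ?>

The challenge is that the input string may contain the script in any of the following ways: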

string a = "This is a simple <script> string";
string a = "This is a simple < script> string";
string a = "This is a simple <javascript></javascript> string";
string a = "This is a simple <script type=text/javascript> string";

So the regular expression should check for the opening < and then check for script.

Use:
/<script[\s\S]*?>[\s\S]*?<\/script>/gi
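A minimal sketch of how this pattern behaves on the strings from the question, reading the escapes as [\s\S] and \/. The helper name containsScriptBlock and the fifth sample string are hypothetical additions for illustration. The pattern requires a closing </script> tag, so it reports false for the four sample strings from the question, none of which has one.

```javascript
// Hypothetical helper: true when text holds a complete <script ...>...</script> block.
function containsScriptBlock(text) {
  // Built without the "g" flag on purpose: a global regex remembers lastIndex
  // between test() calls, which can make repeated checks give different answers.
  const re = /<script[\s\S]*?>[\s\S]*?<\/script>/i;
  return re.test(text);
}

const samples = [
  "This is a simple <script> string",
  "This is a simple < script> string",
  "This is a simple <javascript></javascript> string",
  "This is a simple <script type=text/javascript> string",
  "This is a simple <script>alert(1)</script> string", // added for illustration
];

for (const s of samples) {
  console.log(containsScriptBlock(s), JSON.stringify(s));
}
```

If the goal is only to detect an opening tag, as in the question, a pattern that does not require the closing tag may be a better fit.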

The accepted answer from @bodhizero, <[^>]*script, incorrectly returns true under the following conditions:

// Not a proper script tag.
const a = "This is a simple < script> string"; 
// Space added before "img", otherwise the entire tag fails to render here.
const a = "This is a simple < img src='//example.com/script.jpg'> string";
// Picks up "nonsense code" just because a '<' character happens to precede a 'script' string somewhere along the way.
const a = "This is a simple for(i=0;i<5;i++){alert('script')} string";
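These cases can be checked directly. In the rough sketch below, looseRe is the pattern from the accepted answer, while strictRe is an assumed stricter alternative (not part of the original answer) that requires script to follow < immediately and to end at a word boundary. The fourth test string is taken from the question as a positive case.

```javascript
const looseRe = /<[^>]*script/i;      // pattern from the accepted answer
const strictRe = /<script\b[^>]*>/i;  // assumed stricter alternative, for comparison

const cases = [
  "This is a simple < script> string",
  "This is a simple < img src='//example.com/script.jpg'> string",
  "This is a simple for(i=0;i<5;i++){alert('script')} string",
  "This is a simple <script type=text/javascript> string",
];

// Run both patterns over every case and collect the outcomes side by side.
const report = cases.map((text) => ({
  text,
  loose: looseRe.test(text),
  strict: strictRe.test(text),
}));

console.table(report);
```

The loose pattern reports a match for all four strings; the stricter one matches only the last.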

This is an excellent resource for building and testing regular expressions.

Try this:

/(<|%3C)script[\s\S]*?(>|%3E)[\s\S]*?(<|%3C)(\/|%2F)script[\s\S]*?(>|%3E)/gi
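The %3C, %3E and %2F alternatives are the percent-encoded forms of <, > and /, so this pattern is meant to catch URL-encoded input as well. A small sketch, assuming the escapes read as [\s\S] and \/; the variable names are hypothetical, and the g flag is dropped because the regex is reused with test().

```javascript
// Same pattern as above, without "g" so repeated test() calls stay stateless.
const encodedAwareRe =
  /(<|%3C)script[\s\S]*?(>|%3E)[\s\S]*?(<|%3C)(\/|%2F)script[\s\S]*?(>|%3E)/i;

const raw = "<script>alert(1)</script>";
// encodeURIComponent percent-encodes <, > and / (among other characters).
const encoded = encodeURIComponent(raw);

console.log(encoded);
console.log(encodedAwareRe.test(raw));            // literal tags
console.log(encodedAwareRe.test(encoded));        // percent-encoded tags
console.log(encodedAwareRe.test("no tags here")); // plain text
```

Input that is encoded twice, or that uses HTML entities such as &lt;, would not be caught by this pattern.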

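The regex-based solution I recommend is as follows:

// Combine the flags with | (bitwise OR); IgnoreCase & Singleline evaluates to RegexOptions.None.
Regex rMatch = new Regex(@"<script[^>]*>(.*?)</script[^>]*>", RegexOptions.IgnoreCase | RegexOptions.Singleline);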
myString = rMatch.Replace(myString, "");

This regular expression will correctly identify and remove the script tags in the following strings:

<script></script>
<script>something...</script>
something...<ScRiPt>something...</scripT>something...
something...<ScRiPt something...="something...">something...</scripT something...>something...

As a bonus, it will not match any of the following invalid script strings:

< script></script>
<javascript>something...</javascript>
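For readers working in JavaScript rather than C#, the same removal might look roughly like the sketch below. It assumes [\s\S]*? as a stand-in for .NET's Singleline mode, and stripScripts is a hypothetical name. Regex-based stripping is a best-effort approach, not a substitute for a real HTML sanitizer.

```javascript
// Hypothetical helper: removes every <script ...>...</script ...> block from the text.
function stripScripts(text) {
  // "g" replaces all occurrences, "i" ignores case; [\s\S] also matches line breaks.
  return text.replace(/<script[^>]*>[\s\S]*?<\/script[^>]*>/gi, "");
}

const inputs = [
  "<script></script>",
  "<script>something...</script>",
  "something...<ScRiPt>something...</scripT>something...",
  'something...<ScRiPt something...="something...">something...</scripT something...>something...',
  "< script></script>",                     // left unchanged
  "<javascript>something...</javascript>",  // left unchanged
];

for (const input of inputs) {
  console.log(JSON.stringify(stripScripts(input)));
}
```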

Use

const re = /<script\b[^>]*>[\s\S]*?<\/script\b[^>]*>/g

Use it like this:

const html = `
  ...
  
    <script type="text/javascript">
        alert('1');
    </script>
    <div>Test</div>
    <script type="text/javascript">
        alert('2');
    </script>
  ...
`
const re = /<script\b[^>]*>[\s\S]*?<\/script\b[^>]*>/g
const results = html.match(re)
console.log(results) // an array containing each script tag.

See this specific regex in action and learn more about it here:

https://regexr.com/5od96

Regexr is the most useful regular expression website! Hover over any part of the regex and it will tell you what that part does, and more.

A negated character class would come in handy here.

<[^>]*script

I think this one definitely works for me.

var regexp = /<script+.*>+.*<\/script>/g;
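Two properties of this last pattern are worth checking before relying on it: . does not match line breaks by default, and .* is greedy. The sketch below compares it with a lazy [\s\S]*? pattern of the kind used in the other answers; the variable names and sample strings are hypothetical.

```javascript
const greedyRe = /<script+.*>+.*<\/script>/g;
const lazyRe = /<script\b[^>]*>[\s\S]*?<\/script\b[^>]*>/g;

// Two script blocks on one line, and one block spread over several lines.
const oneLine = "<script>a()</script><p>x</p><script>b()</script>";
const multiLine = "<script>\n  a();\n</script>";

// Greedy .* swallows everything between the first <script and the last </script>.
console.log(oneLine.match(greedyRe));
// The lazy pattern returns each block separately.
console.log(oneLine.match(lazyRe));

// "." stops at line breaks, so the greedy pattern finds nothing here (null).
console.log(multiLine.match(greedyRe));
console.log(multiLine.match(lazyRe));
```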