JavaScript URL validation allowing relative and absolute URLs

Updated: 2023-09-26

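I'm trying to validate a field so that it accepts both relative and absolute URLs. I'm using the regular expression from this post, but it allows spaces in the URL.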

var urlRegex = new RegExp(/(\/?[\w-]+)(\/[\w-]+)*\/?|(((http|ftp|https):\/\/)?[\w-]+(\.[\w-]+)+([\w.,@?^=%&:\/~+#-]*[\w@?^=%&\/~+#-])?)/gi);

Examples:

// this should work
this/will/work.aspx?say=hello 
http://www.example.com/this/will/work.aspx?say=hello
// this shouldn't work but does
and/this will also work/even though it shouldn't
and/this-shouldn't/but it does/also

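The code below is what I originally used to validate absolute URLs, and it worked very well. If I remember correctly, I pulled it from the jQuery source. It would be perfect if it could be modified to accept relative URLs as well, but that is beyond my abilities.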

var urlRegex = new RegExp(/^(https?|ftp):\/\/(((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:)*@)?(((\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5]))|((([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.)+(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.?)(:\d*)?)(\/((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|@)+(\/(([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|@)*)*)?)?(\?((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|@)|[\uE000-\uF8FF]|\/|\?)*)?(\#((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|@)|\/|\?)*)?$/i);

I think you just need to anchor the pattern so that it has to match the whole string:

var urlRegex = /^(\/?[\w-]+)(\/[\w-]+)*\/?|(((http|ftp|https):\/\/)?[\w-]+(\.[\w-]+)+([\w.,@?^=%&:\/~+#-]*[\w@?^=%&\/~+#-])?)$/gi;

The leading ^ and the trailing $ mean the pattern must match the entire string, not just part of it.
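As a small illustration of what anchoring changes, the sketch below uses a deliberately simplified, hypothetical path pattern (not the full URL regex from the question) and compares it with and without ^ and $.

```javascript
// Hypothetical, simplified pattern: path segments made of word characters
// or hyphens, separated by slashes. Not the regex from the question.
const unanchored = /[\w-]+(\/[\w-]+)*/;
const anchored = /^[\w-]+(\/[\w-]+)*$/;

const input = "and/this will also work";

// Unanchored: test() succeeds because the substring "and/this" matches.
console.log(unanchored.test(input));      // true
console.log(input.match(unanchored)[0]);  // "and/this"

// Anchored: the space prevents a match over the whole string.
console.log(anchored.test(input));             // false
console.log(anchored.test("this/will/work"));  // true
```

Without anchors, test() only asks whether some substring matches, which is why the inputs containing spaces were accepted.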

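Edit: That said, the pattern has other problems. First, the HTML entity for the ampersand (&amp;) needs to be just "&". Inside a [] group, slashes don't need to be escaped, and we don't need the "g" suffix either. That leaves: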

var urlRegex = /^(?:(\/?[\w-]+)(\/[\w-]+)*\/?|(((http|ftp|https):\/\/)?[\w-]+(\.[\w-]+)*([\w.,@?^=%&:/~+#-]*[\w@?^=%&/~+#-])?))$/i;
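One practical reason to drop the "g" suffix when a regex is only used with test(): a global regex remembers where its last match ended in lastIndex, so repeated calls on the same object can alternate between true and false. The sketch below uses a hypothetical, simplified pattern; only the flags matter.

```javascript
// Hypothetical, simplified pattern; only the flags differ.
const withG = /^[\w-]+$/g;
const withoutG = /^[\w-]+$/;

// With "g", test() starts searching at lastIndex, which the first
// successful match leaves at the end of the string.
console.log(withG.test("hello")); // true
console.log(withG.lastIndex);     // 5
console.log(withG.test("hello")); // false, the search started at index 5
console.log(withG.lastIndex);     // 0, reset after the failed match

// Without "g", every call starts at index 0.
console.log(withoutG.test("hello")); // true
console.log(withoutG.test("hello")); // true
```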

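Edit again: oops, the whole pattern also needs to be wrapped in a group (the (?: ... ) around the alternation in the pattern above), so that ^ and $ apply to both alternatives rather than one each.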
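To see why the wrapping matters: alternation binds more loosely than anchors, so ^a|b$ is read as (^a)|(b$). The sketch below first isolates this with hypothetical toy patterns, then checks the ungrouped and grouped URL patterns against the examples from the question. The variable names are made up for illustration.

```javascript
// Toy patterns (hypothetical) to isolate the precedence issue.
const ungrouped = /^cat|dog$/;    // read as (^cat)|(dog$)
const grouped = /^(?:cat|dog)$/;  // anchors apply to both alternatives

console.log(ungrouped.test("catalog")); // true, starts with "cat"
console.log(ungrouped.test("hot dog")); // true, ends with "dog"
console.log(grouped.test("catalog"));   // false
console.log(grouped.test("dog"));       // true

// The anchored-but-ungrouped URL pattern (without "g") and the grouped one.
const ungroupedUrl = /^(\/?[\w-]+)(\/[\w-]+)*\/?|(((http|ftp|https):\/\/)?[\w-]+(\.[\w-]+)+([\w.,@?^=%&:\/~+#-]*[\w@?^=%&\/~+#-])?)$/i;
const groupedUrl = /^(?:(\/?[\w-]+)(\/[\w-]+)*\/?|(((http|ftp|https):\/\/)?[\w-]+(\.[\w-]+)*([\w.,@?^=%&:/~+#-]*[\w@?^=%&/~+#-])?))$/i;

const samples = [
  "this/will/work.aspx?say=hello",
  "http://www.example.com/this/will/work.aspx?say=hello",
  "and/this will also work/even though it shouldn't",
  "and/this-shouldn't/but it does/also",
];

for (const s of samples) {
  console.log(JSON.stringify(s), ungroupedUrl.test(s), groupedUrl.test(s));
}
```

With the ungrouped pattern, the first alternative only has to match at the start of the string, so the inputs containing spaces still pass; the grouped pattern rejects them and accepts the first two.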

I wrote an article on URI validation that includes code snippets for all the different URI components defined by RFC 3986:

Regular Expression URI Validation

You may find what you are looking for there. Note, however, that almost any string represents a valid URI, even the empty string!
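To get a feel for how permissive URI references are, here is a rough sketch that swaps in a different technique: the built-in URL constructor (available in browsers and Node.js) rather than the regexes from the article. The base URL and the resolves helper are assumptions made up for this example. Note that the constructor follows the WHATWG URL parsing rules, which are more lenient than RFC 3986 (for instance, it percent-encodes a space in the path instead of rejecting it), so treat this as an illustration and not as a strict validator.

```javascript
// Hypothetical base URL, used only to resolve relative references.
const base = "http://www.example.com/dir/page.html";

// Hypothetical helper: true if the reference parses against the base.
function resolves(reference) {
  try {
    new URL(reference, base);
    return true;
  } catch (e) {
    return false;
  }
}

console.log(resolves("this/will/work.aspx?say=hello")); // true
console.log(resolves(""));                              // true, even the empty string
console.log(new URL("", base).href);                    // "http://www.example.com/dir/page.html"

// A space in the path is percent-encoded rather than rejected.
console.log(new URL("and/this will also work", base).href);
// "http://www.example.com/dir/and/this%20will%20also%20work"

// A space in the host name is rejected.
console.log(resolves("http://exa mple.com/")); // false
```

If spaces must be rejected outright, as in the question, a whole-string regex check is still needed in addition to (or instead of) this kind of parsing.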