Extracting URL path using regexp

Keywords: url, path extraction, regular expression. Updated: 2023-09-26

Scenario

I want to extract the path string from document.location, without the leading slash. For example, if the URL is:

http://stackoverflow.com/questions/ask

I would get:

questions/ask

This should be simple:

/* group everything after the leading slash */
var re = /\/(.+)/gi;
var matches = document.location.pathname.match(re);
console.log(matches[0]);

But if I run this snippet in the Firebug console, I still get the leading slash. I have tested the regexp, and the regexp engine extracts the group correctly.

Question

How do I correctly get the group 1 string?

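If you only want the pathname without the leading slash, you don't actually need a regular expression. Since location.pathname always begins with /, you can simply take the substring starting at index 1: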

document.location.pathname.substr(1) // or .slice(1)
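A minimal sketch of the substring approach, written so it can run outside a browser. The helper name stripLeadingSlash and the sample pathnames are assumptions for illustration; in a page you would pass document.location.pathname instead.

```javascript
// Hypothetical helper (not from the original answer): drop the first
// character of a pathname, which is assumed to be the leading "/".
function stripLeadingSlash(pathname) {
  return pathname.slice(1);
}

// Sample pathnames standing in for document.location.pathname.
const samples = ["/questions/ask", "/", "/a/b/"];
for (const p of samples) {
  console.log(JSON.stringify(p), "->", JSON.stringify(stripLeadingSlash(p)));
}
```

Note that the root path "/" becomes an empty string, and a trailing slash is left untouched.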

Are you talking about a trailing slash or a leading slash? From your post it looks like a leading slash.

document.location.pathname.replace(/^\//, "")

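By the way, your regexp is correct, but you need to remove the gi flags and read matches[1] instead of matches[0]. With the g flag, match() returns only the full matches and drops the capture groups. Without it, matches[0] is the whole string matched by the regexp, while matches[1] is the part captured by the parenthesized group.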

var matches = document.location.pathname.match(/\/(.+)/);
console.log(matches); // ["/questions/ask", "questions/ask"]
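To make the effect of the g flag concrete, here is a small sketch that compares both forms on a fixed string. The pathname constant stands in for document.location.pathname, and firstGroup is a hypothetical helper that also guards against a failed match, which happens for the root path "/" because (.+) needs at least one character.

```javascript
// Stand-in for document.location.pathname (assumed value).
const pathname = "/questions/ask";

// With g: an array of full matches only; capture groups are not included.
const withG = pathname.match(/\/(.+)/g);
// Without g: [full match, group 1, ...].
const withoutG = pathname.match(/\/(.+)/);

console.log(withG.length, withG[0]); // one element, the full match
console.log(withoutG[1]);            // the captured group

// Hypothetical helper: match() returns null when nothing matches,
// so check the result before indexing into it.
function firstGroup(path) {
  const m = path.match(/\/(.+)/);
  return m ? m[1] : "";
}

console.log(JSON.stringify(firstGroup("/")));
```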

With a regular expression, you can do the following:

var m = 'http://stackoverflow.com/questions/ask'.match(/\/{2}[^\/]+(\/.+)/);
console.log(m[1]); // /questions/ask
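One thing worth checking with this pattern is what happens when the URL carries a query string or fragment. The sketch below compares the regexp with the standard URL constructor, which is not mentioned in the answers above and is shown here only as an alternative. The second sample URL, with ?tab=votes#top appended, is an assumption for illustration.

```javascript
// Same pattern as in the answer above, wrapped in a hypothetical helper.
function pathViaRegex(url) {
  const m = url.match(/\/{2}[^\/]+(\/.+)/);
  return m ? m[1] : null;
}

// Alternative: let the built-in URL parser split the string.
function pathViaUrl(url) {
  return new URL(url).pathname;
}

const plain = "http://stackoverflow.com/questions/ask";
const withQuery = "http://stackoverflow.com/questions/ask?tab=votes#top";

console.log(pathViaRegex(plain), pathViaUrl(plain));
console.log(pathViaRegex(withQuery), pathViaUrl(withQuery));
```

On the plain URL both give the same path. On the second URL the regexp capture also includes the query string and fragment, while pathname stops before them.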