Highlight portions of a phrase in body of text

Updated: 2023-09-26

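I'm attempting to find portions of a phrase in a body of text (using jQuery/JS), like the example below:

Phrase: In the beginning God created the heaven and the earth.

Text: Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. In the beginning Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

It may not be the entire phrase, but simply a few words from the phrase. Essentially I want to find sequences of words that match a portion of the original phrase.

I've done a lot of searching but have not come up with any ideas for this yet.

To further clarify: The user may input the phrase "In the beginning God created" and the text may ONLY say "God created". Nonetheless that "God created" should be highlighted because it matched part of the phrase the user entered.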

Try this: a short piece of code using a regular expression.

function hilight(search) {
  if (search == "") return false;
  var sbody = document.getElementById('sbody').innerHTML;
  // remove previously highlighted text
  sbody = sbody.replace(/<b class="hilight">([^<]+)<\/b>/gmi, '$1');
  // match the search text at word boundaries, case-insensitively
  // note: search is used as a regex pattern as-is; special characters are not escaped
  var re = new RegExp('\\b(' + search + ')\\b', 'gmi');
  // var re = '/\b(' + search + ')\b/gmi';
  var subst = '<b class="hilight">$1</b>';
  var result = sbody.replace(re, subst);
  document.getElementById('sbody').innerHTML = result;
}
<input type="text" name="search" id="search" onkeyup="return hilight(this.value);" />
<div id="sbody">
  I'm attempting to find portions of a phrase in a body of text Nonetheless
  that "God created" should be highlighted because it matched part of the phrase the user entered.
</div>

I would do it like this:

var minNumberOfWordsInSequence = 3;
var phrase = "In the beginning God created the heaven and the earth.";
var text = "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. In the beginning Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.";
var phrasesToCheck = [];
phrase.split(' ').forEach(function(word, idx, words) {
    if (words.length > (idx + minNumberOfWordsInSequence - 1)) {
        var segment = "";
        for (var c = 0; c < minNumberOfWordsInSequence; c++) {
            segment += words[idx + c] + " ";
        }
        phrasesToCheck.push(segment.trim());
    }
});
phrasesToCheck.forEach(function(phrase) {
    if (text.toLowerCase().indexOf(phrase.toLowerCase()) > -1) {
        console.log("Found '" + phrase.toLowerCase() + "' in the text.");
    }
});

Here's a JSFiddle to play with: http://jsfiddle.net/remus/mgv6mvwn/

You could condense this a bit, but I'd leave it this way for clarity.
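
The snippet above only logs which word sequences were found. As a rough sketch of how those hits could be turned into highlighting, the following collects the character ranges of every matching sequence, merges overlapping ranges, and wraps each merged range in a tag. The names findMatchRanges and wrapRanges are hypothetical and not part of the answer. It assumes plain text (no HTML markup) and that lowercasing does not change the string length.

```javascript
// Hypothetical helper: find [start, end) ranges in `text` for every run of
// `minWords` consecutive words from `phrase`, then merge overlapping ranges.
function findMatchRanges(text, phrase, minWords) {
  var size = Math.max(1, minWords); // guard against empty segments
  var words = phrase.split(/\s+/).filter(Boolean);
  var lcText = text.toLowerCase();
  var ranges = [];
  for (var i = 0; i + size <= words.length; i++) {
    var segment = words.slice(i, i + size).join(" ").toLowerCase();
    var at = lcText.indexOf(segment);
    while (at !== -1) {
      ranges.push([at, at + segment.length]);
      at = lcText.indexOf(segment, at + 1);
    }
  }
  ranges.sort(function (a, b) { return a[0] - b[0]; });
  var merged = [];
  ranges.forEach(function (r) {
    var last = merged[merged.length - 1];
    if (last && r[0] <= last[1]) {
      last[1] = Math.max(last[1], r[1]); // extend the previous range
    } else {
      merged.push([r[0], r[1]]);
    }
  });
  return merged;
}

// Hypothetical helper: wrap each range of `text` in the given tags.
function wrapRanges(text, ranges, open, close) {
  var out = "";
  var pos = 0;
  ranges.forEach(function (r) {
    out += text.slice(pos, r[0]) + open + text.slice(r[0], r[1]) + close;
    pos = r[1];
  });
  return out + text.slice(pos);
}

var sampleText = "At first. In the beginning God created light.";
var samplePhrase = "In the beginning God created the heaven and the earth.";
var ranges = findMatchRanges(sampleText, samplePhrase, 3);
console.log(wrapRanges(sampleText, ranges, "<b>", "</b>"));
// prints "At first. <b>In the beginning God created</b> light."
```

Because overlapping three-word hits are merged, a longer shared run is highlighted as one span rather than as several nested ones.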

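First, you need to break the search text into its constituent words in an ordered list. Ignoring the problem of multiple spaces or punctuation, this can be done simply with split:

var search_str = "In the beginning God created";

var list = search_str.split(" ");

Second, you need to use the list to generate the combinations of words to match.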

list[1], list[2], list[3] ...

list[1]+" "+list[2], list[2]+" "+list[3], list[3]+" "+list[4] ...

list[1]+" "+list[2]+" "+list[3], list[2]+" "+list[3]+" "+list[4], list[3]+" "+list[4]+" "+list[5] ...

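This is quite possible, but programming this kind of thing is messy. I won't attempt it here.

Later, you will need to look at RegExp to allow for long runs of whitespace and punctuation.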
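
For illustration, the combination step described above could be sketched as two nested loops over the word list. The function name wordSequences is hypothetical. It splits on runs of whitespace and returns the longest sequences first, so that a caller can stop at the first hit if only the longest match is wanted.

```javascript
// Hypothetical helper: return every contiguous run of words in `phrase`,
// longest runs first.
function wordSequences(phrase) {
  var words = phrase.trim().split(/\s+/).filter(Boolean);
  var sequences = [];
  for (var len = words.length; len >= 1; len--) {
    for (var start = 0; start + len <= words.length; start++) {
      sequences.push(words.slice(start, start + len).join(" "));
    }
  }
  return sequences;
}

console.log(wordSequences("In the beginning"));
// prints [ 'In the beginning', 'In the', 'the beginning', 'In', 'the', 'beginning' ]
```

A phrase of n words yields n(n+1)/2 sequences, which stays small for typical search phrases.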

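Here you can decide whether to search for the whole phrase or for every single word in the phrase, as you asked. The searched and highlighted text is then output in a div below the original text. To replace the original text instead, change this line:

document.getElementById('output').innerHTML = Text;

to:

document.getElementById('searchtext').innerHTML = Text;

See the options below the code for usage.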

DEMO

HTML:

    <div id="searchtext">

I'm attempting to find portions of a phrase in a body of text (using jQuery/JS), like the example below:
Phrase: In the beginning God created the heaven and the earth.
Text: Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. In the beginning Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
It may not be the entire phrase, but simply a few words from the phrase. Essentially I want to find sequences of words that match a portion of the original phrase.
I've done a lot of searching but have not come up with any ideas for this yet.
To further clarify: The user may input the phrase "In the beginning God created" and the text may ONLY say "God created". Nonetheless that "God created" should be highlighted because it matched part of the phrase the user entered.
</div>
<div id="output" style="margin-top:40px;"></div>

JS:

function HighlightText(bodyText, searchTerm) 
{
    // declared with var so they do not leak into the global scope
    var highlightStartTag = "<font style='font-weight:bold;'>";
    var highlightEndTag = "</font>";

  var newText = "";
  var i = -1;
  var lcSearchTerm = searchTerm.toLowerCase();
  var lcBodyText = bodyText.toLowerCase();
  while (bodyText.length > 0) {
    i = lcBodyText.indexOf(lcSearchTerm, i+1);
    if (i < 0) {
      newText += bodyText;
      bodyText = "";
    } else {
      if (bodyText.lastIndexOf(">", i) >= bodyText.lastIndexOf("<", i)) {
        if (lcBodyText.lastIndexOf("/script>", i) >= lcBodyText.lastIndexOf("<script", i)) {
          newText += bodyText.substring(0, i) + highlightStartTag + bodyText.substr(i, searchTerm.length) + highlightEndTag;
          bodyText = bodyText.substr(i + searchTerm.length);
          lcBodyText = bodyText.toLowerCase();
          i = -1;
        }
      }
    }
  }
  return newText;
}

function highlight(searchPhrase, treatAsPhrase,element)
{
  if (treatAsPhrase) {
    searchArray = [searchPhrase];
  } else {
    searchArray = searchPhrase.split(" ");
  }
  var Text = document.getElementById(element).innerHTML;
  for (var i = 0; i < searchArray.length; i++) {
    Text = HighlightText(Text, searchArray[i]);
  }
  document.getElementById('output').innerHTML = Text;
  return true;
}
highlight('Afterwards God created',0,'searchtext')

Options:

                                     \/ element that should be searched, see the DEMO
highlight('Afterwards God created',0,'searchtext')
                    ^searchPhrase  ^handle as phrase (1) or search for every single word (0)

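The approach of creating a regular expression by splitting the search text on whitespace might work.

The search phrase "In the beginning" could be converted to the regular expression:

/(\b(In|the|beginning)\b([^\w<>]|(?=<))+)+/gi

[^\w<>] consumes non-word characters, but not the HTML tag delimiters. |(?=<) is there so that it will also match (without consuming) the start of an opening or closing HTML tag.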

HTML:

<input id="search" type="text">
<span id="match" style="color: red;"></span>
<p id="text">Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea <b>commodo</b> consequat. In the <i>beginning</i> Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</p>

JavaScript:

var _search = document.getElementById('search'),
    _text = document.getElementById('text'),
    _match = document.getElementById('match'),
    body = _text.innerHTML;
_match.parentElement.removeChild(_match);
_match.removeAttribute('id');
function onSearchInput(event) {
    var query = _search.value.trim(),
        rxStr = query.replace(/\s+/g, '|'),
        newBody = '',
        lastIndex = 0,
        result,
        rx;
    if (!rxStr) {
        _text.innerHTML = body;
        return;
    }
    rx = new RegExp('(\\b(' + rxStr + ')\\b([^\\w<>]|(?=<))+)+', 'ig');
    result = rx.exec(body);
    if (!result) {
        return;
    }
    console.log('rx:', rx.source);
    while (result) {
        console.log('match:', result[0]);
        newBody += body.slice(lastIndex, result.index);
        _match.textContent = result[0];
        newBody += _match.outerHTML;
        lastIndex = result.index + result[0].length;
        result = rx.exec(body);
    }
    newBody += body.slice(lastIndex);
    _text.innerHTML = newBody;
}
_search.addEventListener('input', onSearchInput);

JSFIDDLE: http://jsfiddle.net/ta80a9h2/
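
One thing the code above does not handle is a query containing regular expression metacharacters such as + or (, which would change or break the generated pattern. A minimal sketch of escaping each word before joining them is shown here. The helpers escapeRegExp and buildPhraseRegExp are hypothetical names, not part of the answer, and the sample string is made up for the demonstration.

```javascript
// Hypothetical helper: escape characters that have a special meaning
// inside a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: build the same kind of pattern as above from a
// query, escaping each word first. Returns null for a blank query.
function buildPhraseRegExp(query) {
  var words = query.trim().split(/\s+/).filter(Boolean).map(escapeRegExp);
  if (words.length === 0) {
    return null;
  }
  return new RegExp("(\\b(" + words.join("|") + ")\\b([^\\w<>]|(?=<))+)+", "gi");
}

var sample = "consequat. In the <i>beginning</i> Duis aute";
var rx = buildPhraseRegExp("In the beginning");
console.log(sample.match(rx));
```

Note that \b only matches next to a word character, so a query word that ends in punctuation (for example "C++") still may not match as expected even after escaping.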

Try

var div = $("div") // element containing `text`
        , input = $("#input")
        , highlighted = $("<span class=word>").css("fontWeight", "bold");
    
    input.on({
      change: function (e) {
        var m = div.text().match(new RegExp(e.target.value, "i"));
        if (m !== null) {
            div.html(function (_, text) {
                return text.replace(highlighted[0].outerHTML, highlighted.text())
                       .replace(m[0], highlighted.text(m[0])[0].outerHTML)
            });
        } 
      }
      // clear `input`
      , focus:function(e) {
          e.target.value = ""
      }
    })
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>
<input id="input" type="text" />
<br />
<div>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. In the beginning Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
</div>

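I'm not sure what you're asking in the original post, but assuming you know the sequence you're looking for, you can use the indexOf() method, which returns the position of a given value in a string.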
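
As a small illustration of that suggestion, the sketch below looks up one known sequence case-insensitively and wraps it in a tag. The variable names and the shortened sample text are made up for the example, and it assumes plain text without HTML markup.

```javascript
var text = "Lorem ipsum. In the beginning Duis aute irure dolor.";
var sequence = "in the beginning";

// indexOf returns -1 when the value is not found
var position = text.toLowerCase().indexOf(sequence.toLowerCase());
var highlighted = text;
if (position !== -1) {
  var end = position + sequence.length;
  highlighted = text.slice(0, position) +
    "<b>" + text.slice(position, end) + "</b>" +
    text.slice(end);
}
console.log(position);    // prints 13
console.log(highlighted); // prints "Lorem ipsum. <b>In the beginning</b> Duis aute irure dolor."
```

This finds only the first occurrence of one exact sequence, so it would need to be combined with one of the approaches above to handle partial phrases.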