将超集数组叠加到子集数组上,其中顺序/序列很重要(在Javascript中)

Overlaying a superset array onto a subset array where order/sequence matters (in Javascript)

本文关键字:数组 Javascript 顺序 叠加 子集      更新时间:2023-09-26

我有一组形式为

的数组
A= [1,2,3,4,5,6]
B= [7,8,9]
C= [5,6]
D= [9]

我想在子集(子序列)上"覆盖"右侧(后缀)超集(严格地说,超序列),以便结果集看起来像这样:

A= [1,2,3,4,5,6] (unchanged, because not a subset of anything)
B= [7,8,9] (unchanged, because not a subset of anything)
C= [1,2,3,4,5,6] (C overlayed with A, because C is a subset of A)
D= [7,8,9] (D overlayed with B, because D is a subset of B)

我在node.js中做这个。我认为这部分是一个我没能理解的逻辑问题。我

真实世界的用例是合并路径名,以便规范化一个分类层次结构,该层次结构有许多项目,其中混合了完整和截断的路径,例如/Science/Biology和/Biology被规范化为/Science/Biology

非常感谢任何指示如何做到这一点。

首先用Haskell写这个,只是为了把算法记下来。

import Data.List (maximumBy, tails)
import Data.Map (Map, findWithDefault)
import qualified Data.Map.Strict as Map
import Data.Ord (comparing)
main :: IO()
main = putStrLn $ show $ normalize [[1..6], [7..9], [5..6], [9]]
normalize :: Ord a => [[a]] -> [[a]]
normalize xxs = map ('xs -> findWithDefault xs xs index) xxs
  where index = suffixIndex xxs
suffixIndex :: Ord a => [[a]] -> Map [a] [a]
suffixIndex xxs = Map.fromListWith (maxBy length) entries
  where entries = [ (suf, xs) | xs <- xxs, suf <- suffixes xs ]
        suffixes xs = drop 1 $ filter (not . null) $ tails xs
maxBy :: Ord b => (a -> b) -> a -> a -> a
maxBy f x y = maximumBy (comparing f) [x, y]

suffixIndex将每个后缀映射到包含该后缀的最长列表。因此,例如,[[1,2,3], [2,3]]的结果索引看起来像[2,3] -> [1,2,3], [3] -> [1,2,3]

一旦建立索引,每个列表通过应用映射(如果存在映射)被"规范化"。

现在在Javascript中

console.log(JSON.stringify(normalize([[1,2,3,4,5,6], [7,8,9], [5,6], [9]])));
function normalize(xxs) {
    var index = suffixIndex(xxs);
    return xxs.map(function (xs) {
        str = JSON.stringify(xs);
        return index.hasOwnProperty(str) ? index[str] : xs;
    });
}
function suffixIndex(xxs) {
    var index = {};
    xxs.forEach(function (xs) {
        suffixes(xs).forEach(function (suffix) {
            var str = JSON.stringify(suffix);
            index[str] = index.hasOwnProperty(str)
                ? maxBy(lengthOf, index[str], xs)
                : xs;
        });
    });
    return index;
}
function suffixes(xs) {
    var i, result = [];
    for (i = 1; i < xs.length; i++) result.push(xs.slice(i));
    return result;
}
function lengthOf(arr) { return arr.length; }
function maxBy(f, x, y) { return f(x) > f(y) ? x : y; }

可能不是最优雅的方法,但是比较字符串化的版本是可行的。假设您在数组中有A, B, CD arr:

function overlay (arr) {
  arr = arr.map(function(item) {
    // Stringify the item
    var itemStr = item.join(",");
    // Loop through each item in the array
    arr.forEach(function(compare) {
      // Stringify the item to compare
      var compareStr = compare.join(",");
      // If we're not comparing it to itself, and the rightmost part
      // of the comparison string == the item in question, set the
      // item to the value of "compare"
      if (compareStr != itemStr && 
          compare.join(",").substr(0 - itemStr.length) == itemStr) {
        item = compare;
      }
    });
    return item;
  });
}

您可以通过对数组中的所有项进行预字符串化的版本来进行优化。