Trim white spaces in both Object key and value recursively

Keywords: whitespace, keys and values, Object, trim, recursive      Updated: 2023-09-26

How do you recursively trim leading and trailing white space from both the keys and the values of a JavaScript Object?

I ran into this problem while trying to "clean" a user-supplied JSON string before passing it to other code for further processing.

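Say we have a user-supplied JSON string whose property keys and values are of type "string". The problem is that the keys and values are not as clean as we would like. For example: { " key_with_leading_n_trailing_spaces ": " my_value_with_leading_spaces" }.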

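Data like this (or should we call it dirty data?) can easily break an otherwise well-written JavaScript program: when your code tries to read a value from that JSON object, the key does not match, and the value does not match what you expect either. I searched around on Google and found a few tips, but no single cure for all of it.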

Given this JSON with lots of white space in its keys and values:

var badJson = {
  "  some-key   ": "    let it go    ",
  "  mypuppy     ": "    donrio   ",
  "   age  ": "   12.3",
  "  children      ": [
    { 
      "   color": " yellow",
      "name    ": "    alice"
    },    { 
      "   color": " silver        ",
      "name    ": "    bruce"
    },    { 
      "   color": " brown       ",
      "     name    ": "    francis"
    },    { 
      "   color": " red",
      "      name    ": "    york"
    },
  ],
  "     house": [
    {
      "   name": "    mylovelyhouse     ",
      " address      " : { "number" : 2343, "road    "  : "   boardway", "city      " : "   Lexiton   "}
    }
  ]
};

This is what I came up with (with the help of lodash.js):

//I made this function to "recursively" hunt down keys that may 
//contain leading and trailing white spaces
function trimKeys(targetObj) {
  _.forEach(targetObj, function(value, key) {
      if(_.isString(key)){
        var newKey = key.trim();
        if (newKey !== key) {
            targetObj[newKey] = value;
            delete targetObj[key];
        }
        if(_.isArray(targetObj[newKey]) || _.isObject(targetObj[newKey])){
            trimKeys(targetObj[newKey]);
        }
      }else{
        if(_.isArray(targetObj[key]) || _.isObject(targetObj[key])){
            trimKeys(targetObj[key]);
        }
      }
   });
}
// Stringify only to show the data in its bad state
var badJsonText = JSON.stringify(badJson);
console.log(badJsonText);
// The reviver trims every string value, so the values are fixed first
badJson = JSON.parse(badJsonText, function(key, value){
    if(typeof value === 'string'){
        return value.trim();
    }
    return value;
});
// Then trim the keys in place
trimKeys(badJson);
console.log(JSON.stringify(badJson));

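One caveat: I do this in two steps because I could not find a better one-shot solution. If there is a problem in my code, or a better way to do this, please share it.

Thanks!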

You can stringify it, do a string replace, and reparse it:

JSON.parse(JSON.stringify(badJson).replace(/"\s+|\s+"/g,'"'))

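You can clean up the property names and values by using Object.keys to get an array of the keys, then Array.prototype.reduce to iterate over the keys and build a new object with trimmed keys and values. The function needs to be recursive so that it also trims nested Objects and Arrays.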

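Note that it only deals with plain Arrays and Objects. If you want to deal with other types of object, the call to reduce needs to be more sophisticated in determining the type of object (e.g. a suitably clever version of new obj.constructor()).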

function trimObj(obj) {
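  // typeof null is 'object', so null is guarded explicitly before Object.keys is called
  if (obj === null || (!Array.isArray(obj) && typeof obj != 'object')) return obj;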
  return Object.keys(obj).reduce(function(acc, key) {
    acc[key.trim()] = typeof obj[key] == 'string'? obj[key].trim() : trimObj(obj[key]);
    return acc;
  }, Array.isArray(obj)? []:{});
}
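
As a rough illustration of the note about other object types, here is a minimal sketch that only rebuilds arrays and plain objects and returns everything else (Date, Map, class instances, null) untouched. The names isPlainObject and trimPlain are hypothetical helpers introduced for this example; they are not part of the answer above or of any library.

```javascript
// Hypothetical helper: true only for objects created by {} or Object.create(null)
function isPlainObject(value) {
  if (value === null || typeof value !== 'object') return false;
  var proto = Object.getPrototypeOf(value);
  return proto === Object.prototype || proto === null;
}

// Hypothetical variant of trimObj that leaves non-plain objects as they are
function trimPlain(value) {
  if (typeof value === 'string') return value.trim();
  if (Array.isArray(value)) return value.map(trimPlain);
  if (!isPlainObject(value)) return value; // numbers, null, Date, Map, ...
  return Object.keys(value).reduce(function (acc, key) {
    acc[key.trim()] = trimPlain(value[key]);
    return acc;
  }, {});
}

var created = new Date(0);
var cleaned = trimPlain({ ' a ': ' x ', when: created, list: [' y ', { ' k': null }] });
console.log(cleaned.a, cleaned.when === created, JSON.stringify(cleaned.list));
```

Passing such values through by reference is only one possible choice; cloning them would need type-specific handling.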

The best solution I have used is this one. See the documentation on the replacer function of JSON.stringify.

function trimObject(obj){
  var trimmed = JSON.stringify(obj, (key, value) => {
    if (typeof value === 'string') {
      return value.trim();
    }
    return value;
  });
  return JSON.parse(trimmed);
}
var obj = {"data": {"address": {"city": "\n \r     New York", "country": "      USA     \n\n\r"}}};
console.log(trimObject(obj));
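
Note that a replacer like the one above trims string values only; the keys keep their white space. If the input is already a JSON string, one possible way to handle both in a single pass is a reviver passed to JSON.parse. The sketch below is an assumption-laden illustration rather than part of the answer: parseTrimmed is a hypothetical name, and it relies on the reviver being called for nested values before the object that contains them.

```javascript
// Hypothetical one-pass helper: trims string values and object keys while parsing
function parseTrimmed(text) {
  return JSON.parse(text, function (key, value) {
    if (typeof value === 'string') return value.trim();
    if (value !== null && typeof value === 'object' && !Array.isArray(value)) {
      // Children were already revived, so only the keys need rebuilding here
      var out = {};
      Object.keys(value).forEach(function (k) {
        out[k.trim()] = value[k];
      });
      return out;
    }
    return value;
  });
}

var parsed = parseTrimmed('{"  a ": "  x ", " b": [{" c ": " y "}], "n": 1}');
console.log(JSON.stringify(parsed));
// {"a":"x","b":[{"c":"y"}],"n":1}
```

If two keys become identical after trimming (for example " a" and "a"), the later one silently overwrites the earlier one.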

epascarello's answer above, plus some unit tests (just so I could be sure):

function trimAllFieldsInObjectAndChildren(o: any) {
  return JSON.parse(JSON.stringify(o).replace(/"\s+|\s+"/g, '"'));
}
import * as _ from 'lodash';
assert.true(_.isEqual(trimAllFieldsInObjectAndChildren(' bob '), 'bob'));
assert.true(_.isEqual(trimAllFieldsInObjectAndChildren('2 '), '2'));
assert.true(_.isEqual(trimAllFieldsInObjectAndChildren(['2 ', ' bob ']), ['2', 'bob']));
assert.true(_.isEqual(trimAllFieldsInObjectAndChildren({'b ': ' bob '}), {'b': 'bob'}));
assert.true(_.isEqual(trimAllFieldsInObjectAndChildren({'b ': ' bob ', 'c': 5, d: true }), {'b': 'bob', 'c': 5, d: true}));
assert.true(_.isEqual(trimAllFieldsInObjectAndChildren({'b ': ' bob ', 'c': {' d': 'alica c c '}}), {'b': 'bob', 'c': {'d': 'alica c c'}}));
assert.true(_.isEqual(trimAllFieldsInObjectAndChildren({'a ': ' bob ', 'b': {'c ': {'d': 'e '}}}), {'a': 'bob', 'b': {'c': {'d': 'e'}}}));
assert.true(_.isEqual(trimAllFieldsInObjectAndChildren({'a ': ' bob ', 'b': [{'c ': {'d': 'e '}}, {' f ': ' g ' }]}), {'a': 'bob', 'b': [{'c': {'d': 'e'}}, {'f': 'g' }]}));

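I think a generic map function handles this well. It separates the deep object traversal and transformation from the particular action we wish to perform -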

const identity = x =>
  x
const map = (f = identity, x = null) =>
  Array.isArray(x)
    ? x.map(v => map(f, v))
: Object(x) === x
    ? Object.fromEntries(Object.entries(x).map(([ k, v ]) => [ map(f, k), map(f, v) ]))
    : f(x)
const dirty = 
` { "  a  ": "  one "
  , " b": [ null,  { "c ": 2, " d ": { "e": "  three" }}, 4 ]
  , "  f": { "  g" : [ "  five", 6] }
  , "h " : [[ [" seven  ", 8 ], null, { " i": " nine " } ]]
  , " keep  space  ": [ " betweeen   words.  only  trim  ends   " ]
  }
`
  
const result =
  map
   ( x => String(x) === x ? x.trim() : x // x.trim() only if x is a String
   , JSON.parse(dirty)
   )
   
console.log(JSON.stringify(result))
// {"a":"one","b":[null,{"c":2,"d":{"e":"three"}},4],"f":{"g":["five",6]},"h":[[["seven",8],null,{"i":"nine"}]],"keep  space":["betweeen   words.  only  trim  ends"]}

map can be reused to easily apply a different transformation -

const result =
  map
   ( x => String(x) === x ? x.trim().toUpperCase() : x
   , JSON.parse(dirty)
   )
console.log(JSON.stringify(result))
// {"A":"ONE","B":[null,{"C":2,"D":{"E":"THREE"}},4],"F":{"G":["FIVE",6]},"H":[[["SEVEN",8],null,{"I":"NINE"}]],"KEEP  SPACE":["BETWEEEN   WORDS.  ONLY  TRIM  ENDS"]}

Making map practical

Thanks to Scott's comment, we can make map a little more ergonomic. In this example, we write trim as a function -

const trim = (dirty = "") =>
  map
   ( k => k.trim().toUpperCase()          // transform keys
   , v => String(v) === v ? v.trim() : v  // transform values
   , JSON.parse(dirty)                    // init
   )

This means map must now accept two function arguments -

const map = (fk = identity, fv = identity, x = null) =>
  Array.isArray(x)
    ? x.map(v => map(fk, fv, v)) // recur into arrays
: Object(x) === x
    ? Object.fromEntries(
        Object.entries(x).map(([ k, v ]) =>
          [ fk(k)           // call fk on keys
          , map(fk, fv, v)  // recur into objects
          ] 
        )
      )
: fv(x) // call fv on values

Now we can see the key transformation working separately from the value transformation. String values get a simple .trim(), while keys get .trim().toUpperCase() -

console.log(JSON.stringify(trim(dirty)))
// {"A":"one","B":[null,{"C":2,"D":{"E":"three"}},4],"F":{"G":["five",6]},"H":[[["seven",8],null,{"I":"nine"}]],"KEEP  SPACES":["betweeen   words.  only  trim  ends"]}

Run the snippet below to verify the results in your own browser -

const identity = x =>
  x
const map = (fk = identity, fv = identity, x = null) =>
  Array.isArray(x)
    ? x.map(v => map(fk, fv, v))
: Object(x) === x
    ? Object.fromEntries(
        Object.entries(x).map(([ k, v ]) =>
          [ fk(k), map(fk, fv, v) ]
        )
      )
: fv(x)
const dirty = 
` { "  a  ": "  one "
  , " b": [ null,  { "c ": 2, " d ": { "e": "  three" }}, 4 ]
  , "  f": { "  g" : [ "  five", 6] }
  , "h " : [[ [" seven  ", 8 ], null, { " i": " nine " } ]]
  , " keep  spaces  ": [ " betweeen   words.  only  trim  ends   " ]
  }
`
const trim = (dirty = "") =>
  map
   ( k => k.trim().toUpperCase()
   , v => String(v) === v ? v.trim() : v
   , JSON.parse(dirty)
   )
   
console.log(JSON.stringify(trim(dirty)))
// {"A":"one","B":[null,{"C":2,"D":{"E":"three"}},4],"F":{"G":["five",6]},"H":[[["seven",8],null,{"I":"nine"}]],"KEEP  SPACES":["betweeen   words.  only  trim  ends"]}

Similar to epascarello's answer. This is what I did:

import java.util.regex.Matcher;
import java.util.regex.Pattern;
........
public String trimWhiteSpaceAroundBoundary(String inputJson) {
    String result;
    final String regex = "\"\\s+|\\s+\"";
    final Pattern pattern = Pattern.compile(regex);
    final Matcher matcher = pattern.matcher(inputJson.trim());
    // replacing the pattern twice to cover the edge case of extra white space around ','
    result = pattern.matcher(matcher.replaceAll("\"")).replaceAll("\"");
    return result;
}

Test cases

assertEquals("\"2\"", trimWhiteSpaceAroundBoundary("\" 2 \""));
assertEquals("2", trimWhiteSpaceAroundBoundary(" 2 "));
assertEquals("{   }", trimWhiteSpaceAroundBoundary("   {   }   "));
assertEquals("\"bob\"", trimWhiteSpaceAroundBoundary("\" bob \""));
assertEquals("[\"2\",\"bob\"]", trimWhiteSpaceAroundBoundary("[\"  2  \",  \"  bob  \"]"));
assertEquals("{\"b\":\"bob\",\"c c\": 5,\"d\": true }",
              trimWhiteSpaceAroundBoundary("{\"b \": \" bob \", \"c c\": 5, \"d\": true }"));

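I tried the JSON.stringify-and-replace solution above, but it does not work for a string that contains embedded double quotes, such as "this is \"my\" test": the regex also strips white space next to the escaped quotes inside the value. You can get around this by using stringify's replacer function and just trimming the values as they go in.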

JSON.parse(JSON.stringify(obj, (key, value) => (typeof value === 'string' ? value.trim() : value)))
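
To make the difference concrete, here is a small comparison of the two approaches on a value with embedded double quotes. The variable names input, viaRegex and viaReplacer are made up for this sketch.

```javascript
// A value with embedded double quotes and a trailing space
const input = { note: 'this is "my" test ' };

// Regex approach: also removes white space that follows an escaped quote inside the value
const viaRegex = JSON.parse(JSON.stringify(input).replace(/"\s+|\s+"/g, '"'));

// Replacer approach: trims only the ends of each string value
const viaReplacer = JSON.parse(
  JSON.stringify(input, (key, value) => (typeof value === 'string' ? value.trim() : value))
);

console.log(viaRegex.note);    // this is "my"test
console.log(viaReplacer.note); // this is "my" test
```

The replacer version leaves keys alone, so it is a fix for values only.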

@RobG, thank you for the solution. Adding one more condition keeps it from creating extra nested objects out of non-object values:

function trimObj(obj) {
  // Return primitives and null unchanged; only arrays and objects are walked
  if (obj === null || (!Array.isArray(obj) && typeof obj != 'object')) return obj;
  return Object.keys(obj).reduce(function(acc, key) {
    acc[key.trim()] = typeof obj[key] === 'string' ?
      obj[key].trim() : typeof obj[key] === 'object' ? trimObj(obj[key]) : obj[key];
    return acc;
  }, Array.isArray(obj)? []:{});
}