Designing a collection: how to denormalize

Keywords: denormalize, collection. Updated: 2023-12-25

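I have services that are offered at many locations with different prices. In traditional SQL I would have a price_location table containing service_id and location_id, and I would join and group whenever I wanted to find services with their highest and lowest price in some region (a region selects multiple locations).

Since services and locations are many-to-many, I came up with the following: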

service_location_price = [
  {
    serviceName: 's1',
    price: 10,
    location: 'location1'
  }, { // to keep it simple only serviceName is here, but
       // there will be multiple providers for the same
       // serviceName at the same location with different prices
    serviceName: 's1',
    price: 12,
    location: 'location1'
  }, {
    serviceName: 's1',
    price: 15,
    location: 'location2'
  }
];

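This is basically flat-file data that breaks second normal form (it has duplicate rows).

Aggregation and/or map-reduce should now work fine for finding the services in a region together with their lowest and highest price, or for showing the locations where a given service is available.

service and location each have their own collection; the service_location_price collection duplicates some of the service and location values for the sake of this query.

Some people are worried about the duplicated data and would like this implemented differently (Mongoose populate with match??).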

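I'm not sure what options I have here, so I'm hoping someone with more experience can give some input. Is there a better way to make searching

Services and locations won't be updated much, but the relationships between them may be changed, added or removed. Searching for services by region, however, will be very frequent.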

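populate is a big $in query that resolves references; it then swaps the references in the array for the corresponding documents. It's not so bad if the referenced field is indexed, but it is an extra query, and it is a pillar of bad schema design because it makes it easier to emulate a relational database when you aren't using one and should be approaching the problem differently. I think it should be removed from Mongoose, but sadly it's a bit late for that :(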
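To make the "extra query" concrete, here is a rough in-memory sketch of what populate amounts to: fetch the parent documents, run a second $in-style lookup on the referenced ids, then swap each reference for the matching document. The collection names, the `service` reference field and the helper `populateLike` are all hypothetical, plain arrays stand in for MongoDB, and this is not Mongoose's actual implementation.

```javascript
// Hypothetical data: each price document references a service by _id.
const services = [
  { _id: 1, name: 's1' },
  { _id: 2, name: 's2' },
];
const prices = [
  { service: 1, price: 10, location: 'location1' },
  { service: 2, price: 12, location: 'location1' },
  { service: 1, price: 15, location: 'location2' },
];

// Emulates the two steps: an $in-style lookup on the referenced ids,
// then replacing each reference with the document it points to.
function populateLike(docs, refField, refCollection) {
  const ids = new Set(docs.map((d) => d[refField]));
  // Stands in for the second query, roughly find({ _id: { $in: [...ids] } }).
  const byId = new Map(
    refCollection.filter((r) => ids.has(r._id)).map((r) => [r._id, r])
  );
  // Unresolved references become null.
  return docs.map((d) => ({ ...d, [refField]: byId.get(d[refField]) || null }));
}

const populated = populateLike(prices, 'service', services);
console.log(populated[0].service.name);
```

With the denormalized service_location_price documents, the service name is already in each document, so this second lookup is not needed for the region search.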

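I'm not sure how you are modeling regions. You said a region can be multiple locations, so I'll model a region as an array of location values.

Total number of services in a given region: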

db.service_location_price.distinct("serviceName", { "location" : { "$in" : region_array } })

This gives you an array of service names, so .length gives the number of services.
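As a sanity check of that query's logic, the same filter-then-deduplicate step can be run against the sample documents from the question in plain JavaScript. This is only an in-memory sketch: `distinctServiceNames` is a hypothetical helper, and `region_array` is assumed to hold the two sample locations.

```javascript
// Sample documents from the question.
const service_location_price = [
  { serviceName: 's1', price: 10, location: 'location1' },
  { serviceName: 's1', price: 12, location: 'location1' },
  { serviceName: 's1', price: 15, location: 'location2' },
];

// Assumption: a region is just an array of location values.
const region_array = ['location1', 'location2'];

// Mirrors distinct("serviceName", { location: { $in: region } }).
function distinctServiceNames(docs, region) {
  const inRegion = docs.filter((d) => region.includes(d.location));
  return [...new Set(inRegion.map((d) => d.serviceName))];
}

const names = distinctServiceNames(service_location_price, region_array);
console.log(names.length); // 1, since every sample document is for 's1'
```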

Lowest/highest price for a service in a region:

db.service_location_price.find({ "location" : { "$in" : region_array }, "serviceName" : "service1" }).sort({ "price" : 1 }).limit(1)
db.service_location_price.find({ "location" : { "$in" : region_array }, "serviceName" : "service1" }).sort({ "price" : -1 }).limit(1)
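If the lowest and highest price are needed for every service in the region at once, rather than one service per pair of queries, the aggregation mentioned in the question could look roughly like the pipeline below. The stages and operators ($match, $group, $min, $max) are standard; the output field names `minPrice` and `maxPrice`, the variable names, and the function `minMaxByService` are assumptions. The function only mirrors the pipeline on a plain array so the logic can be checked without a database.

```javascript
// Assumption: a region is an array of location values.
const region = ['location1', 'location2'];

// Pipeline that would be passed to db.service_location_price.aggregate(...).
const pipeline = [
  { $match: { location: { $in: region } } },
  {
    $group: {
      _id: '$serviceName',
      minPrice: { $min: '$price' },
      maxPrice: { $max: '$price' },
    },
  },
];

// Sample documents from the question.
const docs = [
  { serviceName: 's1', price: 10, location: 'location1' },
  { serviceName: 's1', price: 12, location: 'location1' },
  { serviceName: 's1', price: 15, location: 'location2' },
];

// In-memory mirror of the pipeline: filter by region, then group by service.
function minMaxByService(documents, locations) {
  const result = new Map();
  for (const d of documents) {
    if (!locations.includes(d.location)) continue; // $match
    const current = result.get(d.serviceName);
    if (!current) {
      result.set(d.serviceName, { _id: d.serviceName, minPrice: d.price, maxPrice: d.price });
    } else {
      current.minPrice = Math.min(current.minPrice, d.price); // $min
      current.maxPrice = Math.max(current.maxPrice, d.price); // $max
    }
  }
  return [...result.values()];
}

const summary = minMaxByService(docs, region);
console.log(summary); // [ { _id: 's1', minPrice: 10, maxPrice: 15 } ]
```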

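The sample documents have no information about service suppliers, so I don't know how to find the number of suppliers of a service in a region. Maybe you want to include a supplier field in the documents?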
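If such a field were added, counting suppliers would follow the same pattern as the service count: a distinct on the supplier field, filtered by region and service name. The sketch below assumes a hypothetical `supplier` field with made-up values, and again uses a plain array in place of the collection.

```javascript
// Sample documents extended with a hypothetical `supplier` field.
const pricesWithSupplier = [
  { serviceName: 's1', supplier: 'supplierA', price: 10, location: 'location1' },
  { serviceName: 's1', supplier: 'supplierB', price: 12, location: 'location1' },
  { serviceName: 's1', supplier: 'supplierA', price: 15, location: 'location2' },
];

// Mirrors: db.service_location_price.distinct("supplier",
//   { "location": { "$in": region }, "serviceName": name }).length
function countSuppliers(docs, region, name) {
  const suppliers = new Set();
  for (const d of docs) {
    if (region.includes(d.location) && d.serviceName === name) {
      suppliers.add(d.supplier);
    }
  }
  return suppliers.size;
}

console.log(countSuppliers(pricesWithSupplier, ['location1', 'location2'], 's1')); // 2
```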