Converting JSON format to CSV to upload data table in R to produce D3 bubble chart

Keywords: data table, generate, bubble, D3, format, JSON, convert, CSV. Updated: 2023-09-26

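I am trying to use the D3 bubble chart in R to make my own bubble chart, with bubbles colored by group.

I loaded the index.html and flare.json files from D3 into R, and the bubble chart renders when I run it. However, I do not want to edit the JSON by hand to create my own bubbles and groups (the excerpt below shows one group of bubbles containing three named subgroups).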

    {
     "name": "flare",
     "children": [
      {
       "name": "analytics",
       "children": [
        {
         "name": "cluster",
         "children": [
          {"name": "AgglomerativeCluster", "size": 3938},
          {"name": "CommunityStructure", "size": 3812},
          {"name": "HierarchicalCluster", "size": 6714},
          {"name": "MergeEdge", "size": 743}
         ]
        },
        {
         "name": "graph",
         "children": [
          {"name": "BetweennessCentrality", "size": 3534},
          {"name": "LinkDistance", "size": 5731},
          {"name": "MaxFlowMinCut", "size": 7840},
          {"name": "ShortestPaths", "size": 5914},
          {"name": "SpanningTree", "size": 3416}
         ]
        },
        {
         "name": "optimization",
         "children": [
          {"name": "AspectRatioBanker", "size": 7074}
         ]
        }
       ]

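Using the jsonlite package (which, from what I have read online, can handle more complex JSON structures), I converted the file to a data frame.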

 library(jsonlite)
 fromJSON("flare.json", simplifyDataFrame = FALSE)

This is the structure without requesting a data frame (sample).

$children[[10]]$children[[6]]$children[[10]]
$children[[10]]$children[[6]]$children[[10]]$name
[1] "OperatorSwitch"
$children[[10]]$children[[6]]$children[[10]]$size
[1] 2581

This is the structure with a data frame requested (sample).

 fromJSON("flare.json",simplifyDataFrame = TRUE)

However, it produces one long concatenated list of values, which I have been trying to untangle in an automated way.

Arrays, Colors, Dates, Displays, Filter, Geometry, heap, IEvaluable,  IPredicate, IValueProxy, math, Maths, Orientation, palette, Property, Shapes, Sort, Stats, Strings, 8258, 10001, 8217, 12555, 2324, 10993, NA, 335, 383, 874, NA, 17705, 1486, NA, 5559, 19118, 6887, 6557, 22026, FibonacciHeap, HeapNode, 9354, 1233, DenseMatrix, IMatrix, SparseMatrix, 3165, 2815, 3366, ColorPalette, Palette, ShapePalette, SizePalette, 6367, 1229, 2059, 2291
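
For context, the concatenated printout comes from how jsonlite simplifies arrays of objects: each level of `children` becomes a data frame, and deeper levels are stored as a list column of data frames. A minimal sketch for inspecting this, assuming a small inline JSON string (hypothetical data with the same shape as flare.json) rather than the real file:

```r
library(jsonlite)

# Hypothetical stand-in for flare.json, same {name, children, size} shape.
txt <- '{
  "name": "flare",
  "children": [
    {"name": "cluster", "children": [
      {"name": "MergeEdge", "size": 743},
      {"name": "CommunityStructure", "size": 3812}
    ]},
    {"name": "graph", "children": [
      {"name": "LinkDistance", "size": 5731}
    ]}
  ]
}'

# Default simplification: arrays of objects are turned into data frames.
simplified <- fromJSON(txt)

# str() with a depth limit is easier to read than printing the whole object.
str(simplified, max.level = 2)

# The nested level is a list column holding one data frame per group.
first_group <- simplified$children$children[[1]]
print(first_group)
```

Printing such an object collapses the list column into comma-separated values, which is why names and sizes appear run together.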

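Possible solutions…

FOR LOOPS (time constraint)

I have thought about writing several for loops to rebuild the nested JSON structure (loops are where I am stronger, but I have a deadline and this could take a while). I think someone who knows JSON better might be able to help.

CSV conversion (not working)

I also tried converting the flare.json file with a JSON-to-CSV converter to produce the CSV format I need, to test whether I could load content from a CSV straight into R, but this did not work (even with flare added).

What I really need

A way to convert flare.json to a data frame or table, so that I can load my own data with names, sizes and groups and convert it back to JSON to produce my own bubble chart.

Doing all of this in R would be great if possible. I do not think it is impossible, but I am happy to hear other suggestions.

I do not know what to do next. I usually work with matrices in R, so handling JSON lists and arrays is not my strength.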

This might give us some other things to think about. I will embed comments in the code. You can see a live example.

library(jsonlite)
library(dplyr)

flare_json <- rjson::fromJSON(  ## rjson just works better on these for me
    file = "http://bl.ocks.org/mbostock/raw/4063269/flare.json"
)
# let's have a look at the structure of flare.json
# listviewer htmlwidget might help us see what is happening
#   devtools::install_github("timelyportfolio/listviewer")
#   library(listviewer)
listviewer::jsonedit(
  paste0(
    readLines("http://bl.ocks.org/mbostock/raw/4063269/flare.json")
    ,collapse=""
  )
)
# the interesting thing about Mike Bostock's Bubble Chart example
#   though is that the example removes the nested hierarchy
#    with a JavaScript function called classes
#// Returns a flattened hierarchy containing all leaf nodes under the root.
#function classes(root) {
#  var classes = [];
#  
#  function recurse(name, node) {
#    if (node.children) node.children.forEach(function(child) { recurse(node.name, child); });
#    else classes.push({packageName: name, className: node.name, value: node.size});
#  }
#  
#  recurse(null, root);
#  return {children: classes};
#}
# let's try to recreate this in R
classes <- function(root){
  classes <- data.frame()
  haschild <- function(node){
    (!is.null(node) && "children" %in% names(node))
  }
  recurse <- function(name,node){
    if(haschild(node)){
      lapply(
        seq_along(node$children)
        ,function(n){
          recurse(node$name,node$children[[n]])
        }
      )
    } else {
      classes <<- bind_rows(
        classes,
        data.frame(
          "packageName"= name
          ,"className" = node[["name"]]
          ,"size" = node[["size"]]
          ,stringsAsFactors = FALSE
        )
      )
    }
  }
  recurse(root$name,root)
  return(classes)
}
# now with a R flavor our class replica should work
flare_df <- classes(flare_json)

# so the example uses a data.frame with columns
#   packageName, className, size
# and feeds that to bubble.nodes where bubble = d3.layout.pack
# fortunately Joe Cheng has already made a htmlwidget called bubbles
#   https://github.com/jcheng5/bubbles
# that will produce a d3.layout.pack bubble chart
library(scales)
library(bubbles)  # devtools::install_github("jcheng5/bubbles")
bubbles(
  flare_df$size
  ,flare_df$className
  ,color = col_factor(
    RColorBrewer::brewer.pal(9,"Set1")
    ,factor(flare_df$packageName)
  )(flare_df$packageName)
  ,height = 600
  ,width = 960
)
# it's not perfect with things such as text sizing
#    but it's a start
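
The flattening step can also be tried without a network connection. The following is a minimal sketch, not the code above: it assumes a small inline JSON string (hypothetical data with the same shape as flare.json), uses jsonlite instead of rjson, and collects rows with `rbind` rather than `bind_rows`. The helper name `flatten_leaves` is made up for this illustration.

```r
library(jsonlite)

# Hypothetical stand-in for flare.json, same {name, children, size} shape.
txt <- '{
  "name": "flare",
  "children": [
    {"name": "cluster", "children": [
      {"name": "MergeEdge", "size": 743},
      {"name": "CommunityStructure", "size": 3812}
    ]},
    {"name": "graph", "children": [
      {"name": "LinkDistance", "size": 5731}
    ]}
  ]
}'

# simplifyVector = FALSE keeps every node as a plain named list.
tree <- fromJSON(txt, simplifyVector = FALSE)

# Return one row per leaf; `parent` is the name of the enclosing node.
flatten_leaves <- function(node, parent = NA_character_) {
  kids <- node[["children"]]
  if (!is.null(kids)) {
    # Inner node: recurse into each child and stack the results.
    rows <- lapply(kids, flatten_leaves, parent = node[["name"]])
    return(do.call(rbind, rows))
  }
  # Leaf node: emit a single-row data frame.
  data.frame(
    packageName = parent,
    className = node[["name"]],
    size = node[["size"]],
    stringsAsFactors = FALSE
  )
}

leaves <- flatten_leaves(tree)
print(leaves)
```

The resulting `leaves` data frame has the same three columns as `flare_df` above, so it could be written out with `write.csv()` and edited as a table.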

If you still think you want a nested d3 JSON hierarchy, here is some code.

#  convert this to nested d3 json format
#    this is example data provided in a comment to this post
df <- data.frame(
  "overallgroup" = "Online"
  ,"primarygroup" = c(rep("Social Media",3),rep("Web",2))
  ,"datasource" = c("Facebook","Twitter","Youtube","Website","Secondary Website")
  ,"size" = c(10000,5000,200,10000,2500)
  ,stringsAsFactors = FALSE
)

# recommend using data.tree to ease our pain here
#devtools::install_github("gluc/data.tree")
library(data.tree)
# the much easier way
df$pathString <- apply(df[,1:3],MARGIN=1, function(x){paste0(x,collapse="/")})
root <- as.Node(df[,4:5])    
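# the harder manual way
#   (an alternative to the lines above: it assumes df still has only its
#    original four columns, i.e. pathString has not been added)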
root <- Node$new("root")
sapply(unique(df[,1]),root$AddChild)
apply(
  df[,1:ncol(df)]
  ,MARGIN = 1
  ,function(row){
    lapply(2:length(row),function(cellnum){
      cell <- row[cellnum]
      if( cellnum < ncol(df) ){ # assume last column is attribute
        parent <- Reduce(function(x,y){x$Climb(y)},as.character(row[1:(cellnum-1)]),root)
        if(is.null(parent$Climb(cell))){
          cellnode <- parent$AddChild( cell )
        }  
      } else{
        cellnode <- Reduce(function(x,y){x$Climb(y)},as.character(row[1:(cellnum-1)]),root)
        cellnode$Set( size = as.numeric(cell) )
      }
    })
  }
)

# now we should be able to supply root to networkD3
#   that expects a typical d3 nested JSON
#devtools::install_github("christophergandrud/networkD3")
library(networkD3)
treeNetwork( root$ToList(unname=TRUE) )
# or to get it in JSON
jsonlite::toJSON( root$ToList(unname=TRUE), auto_unbox=TRUE)

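Posting this only for further discussion. As @timelyportfolio said, there is a lot to consider. Here is one path (for now only from the "flare" JSON to a long data frame, until we learn more about what you want).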

library(jsonlite)
library(dplyr)
library(tidyr)
flare <- fromJSON("http://bl.ocks.org/mbostock/raw/4063269/flare.json",
                          simplifyVector=FALSE)
flare_df <- bind_rows(lapply(flare$children,
    function(x) {
      kids <- as.list(x)
      kids$stringsAsFactors=FALSE # prevents bind_rows warnings
      do.call("data.frame", kids)
    }
)) %>% gather(child_path, value, -name)
set.seed(1492) # results reproducibility
print(flare_df[sample(nrow(flare_df), 50),])
## Source: local data frame [50 x 3]
## 
##       name                         child_path value
## 1  display                   children.name.18    NA
## 2     util                   children.size.11  5559
## 3  display                    children.name.9    NA
## 4  display           children.children.size.9    NA
## 5  physics           children.children.name.4    NA
## 6    query             children.children.name   add
## 7  physics children.children.children.size.22    NA
## 8     data                   children.name.20    NA
## 9      vis          children.children.size.20 19382
## 10    flex          children.children.name.36    NA
## ..     ...                                ...   ...
# just showing the top-level nodes are present for an example
select(flare_df, name) %>% arrange(name) %>% distinct %>% print(n=1000)
## Source: local data frame [10 x 1]
## 
##         name
## 1  analytics
## 2    animate
## 3       data
## 4    display
## 5       flex
## 6    physics
## 7      query
## 8      scale
## 9       util
## 10       vis

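Unrolling the data frame back into the "flare" format is fairly straightforward, but this may not be a usable data frame format for what you want to do.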

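Thanks to @timelyportfolio for pointing me to this. You can convert between data.frame and JSON quite simply with the data.tree package (this needs the latest GitHub version). The trick is to paste together a path: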

#devtools::install_github("gluc/data.tree")
library(data.tree)
df <- data.frame(
  "overallgroup" = "Online"
  ,"primarygroup" = c(rep("Social Media",3),rep("Web",2))
  ,"datasource" = c("Facebook","Twitter","Youtube","Website","Secondary Website")
  ,"size" = c(10000,5000,200,10000,2500)
  ,stringsAsFactors = FALSE
)

df$pathString <- paste("root", df$overallgroup, df$primarygroup, df$datasource, sep="/")
root <- as.Node(df[,-c(1, 2, 3)])
# now we should be able to supply root to networkD3
#   that expects a typical d3 nested JSON
#devtools::install_github("christophergandrud/networkD3")
library(networkD3)
treeNetwork( root$ToList(unname=TRUE) )
# or to get it in JSON
jsonlite::toJSON( root$ToList(unname=TRUE), auto_unbox=TRUE)
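
If installing data.tree is not an option, the same idea can be sketched for a two-level hierarchy with only `split()` and jsonlite. This is an illustration under stated assumptions, not the data.tree approach: the column names `group`, `name` and `size` and the helper `to_nested` are hypothetical, and deeper hierarchies would need a recursive version.

```r
library(jsonlite)

# Hypothetical flat table: one row per bubble, with its group and size.
df <- data.frame(
  group = c("Social Media", "Social Media", "Web"),
  name  = c("Facebook", "Twitter", "Website"),
  size  = c(10000, 5000, 10000),
  stringsAsFactors = FALSE
)

# Build root -> group -> leaf as nested lists in the d3 {name, children} shape.
to_nested <- function(df, root_name = "root") {
  # Keep groups in order of first appearance rather than alphabetical order.
  groups <- split(df, factor(df$group, levels = unique(df$group)))
  list(
    name = root_name,
    children = unname(lapply(names(groups), function(g) {
      rows <- groups[[g]]
      list(
        name = g,
        children = lapply(seq_len(nrow(rows)), function(i) {
          list(name = rows$name[i], size = rows$size[i])
        })
      )
    }))
  )
}

nested <- to_nested(df, root_name = "Online")

# auto_unbox = TRUE writes length-one vectors as scalars, not arrays.
json <- toJSON(nested, auto_unbox = TRUE, pretty = TRUE)
cat(json, "\n")
```

The string in `json` can be saved with `writeLines(json, "flare.json")` and used in place of the original file, provided the chart code reads the `name`, `children` and `size` fields.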