How would one parse an outline where order is retained?

Updated: 2023-09-26

I'm trying to parse a document that looks like this...

<line>(a) main category</line>
<line>(1) sub line</line>
<line>(i) sub sub line</line>
<line>(b) other category </line>

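I don't know how to tell whether "i" is a Roman numeral or a letter. It seems like there should be a library or a pattern for this, but I can't find one.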

Can anyone think of a pattern? I'd like to use JS, but I'm fairly language-agnostic.

Loop through each line and check the current "header" (the marker in parentheses) against the one on the previous line.

Create a method that looks something like this (a rough sketch with only a few cases filled in):

function isSameType(last, current) {
    var isNumeric = /^\d+$/; // markers are strings, so test for digits
    if (isNumeric.test(last) && isNumeric.test(current)) {
        return true; // 1, 2, 3, 4 etc.
    }
    if (last == 'a' && current == 'b') { // Improve here ;p
        return true;
    }
    if (last == 'i' && current == 'ii') {
        return true;
    }
    if (last == 'h' && current == 'i') {
        return true; // Edge case: "i" after "h" is most likely the same type, but you can never know for sure
    }
    return false; // Not caught: go deeper!
}
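The hard-coded pairs above could be generalised by asking whether the current marker is the direct successor of the previous one in some numbering scheme. The following is only a sketch of that idea; the helper names (asNumber, asLetter, asRoman, sharedSchemes) are made up for illustration, and it assumes lowercase markers with the surrounding parentheses already stripped. It still cannot rule out that an "i" after an "h" starts a nested Roman list.

```javascript
// Hypothetical helpers: convert a marker to its position in each numbering
// scheme, or return null if the marker does not belong to that scheme.
function asNumber(marker) {
    return /^\d+$/.test(marker) ? parseInt(marker, 10) : null;
}

function asLetter(marker) {
    // Single lowercase letter: a = 1, b = 2, ...
    return /^[a-z]$/.test(marker) ? marker.charCodeAt(0) - 96 : null;
}

function asRoman(marker) {
    // Strict lowercase Roman numerals from i (1) to mmmcmxcix (3999).
    var pattern = /^(?=[ivxlcdm])m{0,3}(cm|cd|d?c{0,3})(xc|xl|l?x{0,3})(ix|iv|v?i{0,3})$/;
    if (!pattern.test(marker)) {
        return null;
    }
    var values = { i: 1, v: 5, x: 10, l: 50, c: 100, d: 500, m: 1000 };
    var total = 0;
    for (var k = 0; k < marker.length; k++) {
        var value = values[marker[k]];
        var next = values[marker[k + 1]] || 0;
        total += value < next ? -value : value; // subtractive pairs such as iv, ix
    }
    return total;
}

// Returns the names of the schemes in which `current` directly follows `last`.
function sharedSchemes(last, current) {
    var schemes = { numeric: asNumber, alpha: asLetter, roman: asRoman };
    return Object.keys(schemes).filter(function (name) {
        var a = schemes[name](last);
        var b = schemes[name](current);
        return a !== null && b !== null && b === a + 1;
    });
}

function isSameType(last, current) {
    return sharedSchemes(last, current).length > 0;
}
```

With this version, isSameType('iv', 'v') and isSameType('h', 'i') are both true, while isSameType('3', 'i') is false, which signals that a new level probably starts there.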

So with this approach you'll get something useful, but it isn't completely watertight...

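Edit: if this is all the information you have, you can stop searching, because there is no way to know whether an "i" that follows an "h" is actually one level deeper. It can't be done.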

Edit 2: as long as the format is A -> 1 -> I, it should work.

a.  LEVEL 0
b.  LEVEL 0
c.  LEVEL 0
1.  LEVEL 1
2.  LEVEL 1
i.  LEVEL 2
ii. LEVEL 2
3.  LEVEL 1
i.  LEVEL 2
e.  LEVEL 0 <- this might be an issue: say the letter is "v", you wouldn't know whether it is alphabetical (level 0) or Roman (level 2). Or maybe they went back to "a"; in that case it's probably level 3 and not level 0, because "a" was already used at level 0. A lot of rules!

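With the right set of rules you'll get quite far. But if they jump from level 2 (iv) to level 0 (v), you may run into trouble. So far, though, if you see "v" and the previous level was numeric (3), then it must be Roman.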
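One possible way to encode such rules is sketched below. It assumes the fixed A -> 1 -> I scheme (letter = level 0, number = level 1, Roman numeral = level 2) and lowercase markers without punctuation. The function names and the tie-break (prefer the deepest reading that fits, fall back to the shallowest reading when nothing fits) are assumptions made for this example, not something the format guarantees.

```javascript
// Sketch only: letter (level 0) -> number (level 1) -> Roman numeral (level 2).
var ROMAN = /^(?=[ivxlcdm])m{0,3}(cm|cd|d?c{0,3})(xc|xl|l?x{0,3})(ix|iv|v?i{0,3})$/;
var ROMAN_VALUES = { i: 1, v: 5, x: 10, l: 50, c: 100, d: 500, m: 1000 };

function romanValue(marker) {
    var total = 0;
    for (var k = 0; k < marker.length; k++) {
        var value = ROMAN_VALUES[marker[k]];
        var next = ROMAN_VALUES[marker[k + 1]] || 0;
        total += value < next ? -value : value; // subtractive pairs such as iv
    }
    return total;
}

// Every (level, value) reading that the marker text allows.
function readings(marker) {
    var result = [];
    if (/^[a-z]$/.test(marker)) result.push({ level: 0, value: marker.charCodeAt(0) - 96 });
    if (/^\d+$/.test(marker)) result.push({ level: 1, value: parseInt(marker, 10) });
    if (ROMAN.test(marker)) result.push({ level: 2, value: romanValue(marker) });
    return result;
}

function assignLevels(markers) {
    var last = [];  // last[level] = value of the most recent marker at that level
    var depth = -1; // level of the previous item, -1 before the first item
    return markers.map(function (marker) {
        var options = readings(marker);
        if (options.length === 0) throw new Error('Unrecognised marker: ' + marker);
        // A reading fits if it continues an open level or starts the next deeper one at 1.
        var fitting = options.filter(function (r) {
            if (r.level <= depth) return r.value === last[r.level] + 1;
            return r.level === depth + 1 && r.value === 1;
        });
        // Tie-break (assumption): deepest fitting reading, else the shallowest reading.
        var chosen = fitting.length ? fitting[fitting.length - 1] : options[0];
        last[chosen.level] = chosen.value;
        last.length = chosen.level + 1; // forget deeper levels that are now closed
        depth = chosen.level;
        return { marker: marker, level: chosen.level };
    });
}
```

Fed the markers from the listing above, assignLevels(['a', 'b', 'c', '1', '2', 'i', 'ii', '3', 'i', 'e']) yields the levels 0, 0, 0, 1, 1, 2, 2, 1, 2, 0. The skipped "d" is handled by the fallback, and the genuinely ambiguous cases (for example "v" after both "u" and "iv") are resolved by the tie-break rather than by any real knowledge of the document.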

Capturing the marker from a line

var letter = (line.match(/^\s*\((.{1,2})\)/) || ['', ''])[1]; // text between the parentheses, or '' if none
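For the <line>...</line> format shown in the question, a slightly fuller extractor might look like the sketch below. The function name parseLine and the shape of the returned object are invented for illustration, and the regular expression assumes one item per line with a marker made only of letters or only of digits.

```javascript
// Hypothetical helper: splits "<line>(a) main category</line>" into marker and text.
// Returns null when the line has no "(marker)" at its start.
function parseLine(line) {
    var match = line.match(/^\s*(?:<line>)?\s*\(([a-z]+|\d+)\)\s*(.*?)\s*(?:<\/line>)?\s*$/i);
    if (!match) {
        return null;
    }
    return { marker: match[1].toLowerCase(), text: match[2] };
}

var sample = [
    '<line>(a) main category</line>',
    '<line>(1) sub line</line>',
    '<line>(i) sub sub line</line>',
    '<line>(b) other category </line>'
];
var parsed = sample.map(parseLine);
```

Here parsed[0] is { marker: 'a', text: 'main category' }. Unlike the one-liner, this accepts markers longer than two characters such as "iii" or "12".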