What is a good general-purpose plain text data format like that used for BibTeX? [closed]

Context

I’m writing a few multiple choice practice questions and I’d like to store them in a simple plain text data format. I’ve previously used tab-delimited files, but they are awkward to edit in a text editor. I’d like to use a format a bit like BibTeX.

E.g.,

@Article{journals/aim/Sloman99,
  title =   "Review of Affective Computing",
  author =  "Aaron Sloman",
  journal = "AI Magazine",
  year =    "1999",
  number =  "1",
  volume =  "20",
  url = "http://dblp.uni-trier.de/db/journals/aim/aim20.html#Sloman99",
  pages =   "127--133",
}

Important properties seem to be:

  • Data is made up of records
  • Each record has multiple attribute-value pairs
  • Each attribute-value pair can be recorded on a new line, but can span multiple lines
  • Easy to manually enter textual data in a text editor
  • Readily available tools to convert into tabular data

For example, something like the following might work:

@
id: 1
question: 1 + 1
a: 1
b: 2
c: 3
d: 4
correct: b

@
id: 2
question: What is the capital city of the country renowned for koalas, 
          emus, and kangaroos?
a: Canberra
b: Melbourne
c: Sydney
d: Australia
correct: a

While I’m interested in the specific context of writing multiple choice questions, I’m also interested in the broader issue of representing data in this or a similar type of format.

Initial Thoughts

My initial thoughts included the following:

  • YAML
  • JSON
  • Delimited data with custom field and record delimiters that permit multi-line records
  • A custom file format with some form of custom parser

I’ve only had a quick look at YAML and JSON; my first impression is that they might be overkill. Custom delimiters might work, but they would probably require every field to be present, in a consistent order, in every record. Writing my own parser sounds a bit fiddly.
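To give a rough sense of how fiddly a custom parser would be, here is a minimal sketch in Python for the "@" format shown above. It assumes rules the question does not spell out: a line containing only "@" starts a record, a field is "key: value" starting in the first column, and an indented line continues the previous value. The names parse_records and to_csv are hypothetical.

```python
import csv
import io
import re

# Assumed field syntax: an identifier-like key, a colon, then the value.
FIELD = re.compile(r"^([A-Za-z_]\w*):\s*(.*)$")


def parse_records(text):
    """Parse '@'-separated records of 'key: value' lines into a list of dicts."""
    records = []
    current = None   # dict for the record currently being read
    last_key = None  # key that an indented continuation line extends
    for raw in text.splitlines():
        line = raw.rstrip()
        if not line:
            continue  # blank lines are only visual separators
        if line == "@":
            current = {}
            records.append(current)
            last_key = None
            continue
        if current is None:
            raise ValueError(f"content before the first '@': {line!r}")
        if line[0].isspace():
            # Indented line: append it to the previous field's value.
            if last_key is None:
                raise ValueError(f"continuation without a field: {line!r}")
            current[last_key] += " " + line.strip()
            continue
        match = FIELD.match(line)
        if match is None:
            raise ValueError(f"not a 'key: value' line: {line!r}")
        last_key, value = match.groups()
        current[last_key] = value
    return records


def to_csv(records):
    """Write records as CSV text; fields missing from a record become empty."""
    # Union of all keys, in first-seen order, so field order may vary per record.
    fieldnames = list(dict.fromkeys(key for rec in records for key in rec))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()
```

Under these assumptions, fields can appear in any order and be omitted from some records. A non-indented line that happens to contain a colon would be read as a new field, so long values need to be indented when wrapped.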

Answer

Why not use XML? There are many good parsers that directly translate XML files to data structures, even one for R ( http://cran.r-project.org/web/packages/XML/index.html ).

The format looks like this (example taken from http://www.w3schools.com/xml/default.asp ).

<?xml version="1.0"?>
<notes>
    <note>
        <to>Tove</to>
        <from>Jani</from>
        <heading>Reminder</heading>
        <body>Don't forget me this weekend!</body>
    </note>
    <note>
        <to>Janis</to>
        <from>Cardinal</from>
        <heading>Reminder</heading>
        <body>Don't forget me the next weekend!</body>
    </note>
</notes>

E.g., using the XML package in R:

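library(XML)                   # provides xmlTreeParse()
z <- xmlTreeParse("test.xml")  # parse the file into a tree of R objects
z$doc$children$notes           # the root <notes> node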

gives access to the complete notes body,

z$doc$children$notes[1]

is just the first node and so on…
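The same idea is not specific to R. As a rough sketch in Python, the standard library's xml.etree.ElementTree can turn records like the notes above into one dict per record, which is close to tabular data. The function name records_to_rows is hypothetical, and the sketch assumes every record is a flat list of child elements with text content only.

```python
import xml.etree.ElementTree as ET

# The sample document from the answer above.
XML_TEXT = """<?xml version="1.0"?>
<notes>
    <note>
        <to>Tove</to>
        <from>Jani</from>
        <heading>Reminder</heading>
        <body>Don't forget me this weekend!</body>
    </note>
    <note>
        <to>Janis</to>
        <from>Cardinal</from>
        <heading>Reminder</heading>
        <body>Don't forget me the next weekend!</body>
    </note>
</notes>
"""


def records_to_rows(xml_text, record_tag):
    """Return one dict per <record_tag> element, keyed by child tag name."""
    root = ET.fromstring(xml_text)
    rows = []
    for record in root.iter(record_tag):
        # Assumes flat records: each child holds only text, no nested elements.
        rows.append({child.tag: (child.text or "").strip() for child in record})
    return rows


rows = records_to_rows(XML_TEXT, "note")
for row in rows:
    print(row["to"], "|", row["from"], "|", row["heading"])
```

Each row is a plain dict, so it can be passed on to csv.DictWriter or a data frame library to get a table.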

Attribution
Source: Link, Question Author: Jeromy Anglim, Answer Author: thias
