2008-10-14

Converting Restricted XML to Good-Quality JSON

Here are some ideas for converting restricted forms of XML to good-quality JSON. The restrictions are as follows:
  • The XML can't contain mixed content (elements that have text as well as child elements or attributes).
  • The XML cannot depend on the order of child elements with distinct names (order dependence in children with the same name is okay).
  • There can't be any attributes with the same name as child elements.
  • There can't be any elements or attributes that differ only in their namespace names.
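Three of these restrictions can be checked mechanically against a sample document. What follows is a minimal sketch in Python using the standard xml.etree.ElementTree module. The names split_name and find_violations are made up for illustration, and the namespace check is applied per parent element, which is only one reading of the last restriction. The order-independence restriction is a property of how the XML is used, so no check on a single document can confirm it.

```python
from collections import Counter
import xml.etree.ElementTree as ET

def split_name(tag):
    """Split ElementTree's '{namespace}local' form into (namespace, local)."""
    if tag.startswith("{"):
        ns, local = tag[1:].split("}", 1)
        return ns, local
    return "", tag

def find_violations(root):
    """Return a list of messages, one per restriction violated (hypothetical helper)."""
    problems = []
    for elem in root.iter():
        name = split_name(elem.tag)[1]
        children = list(elem)

        # Mixed content: non-whitespace text alongside children or attributes.
        text = (elem.text or "") + "".join(c.tail or "" for c in children)
        if text.strip() and (children or elem.attrib):
            problems.append(f"<{name}>: mixed content")

        # Distinct (namespace, local) pairs, then count namespaces per local name.
        child_names = {split_name(c.tag) for c in children}
        attr_names = {split_name(a) for a in elem.attrib}
        child_locals = Counter(local for _, local in child_names)
        attr_locals = Counter(local for _, local in attr_names)

        # An attribute that shares its name with a child element.
        for clash in sorted(child_locals.keys() & attr_locals.keys()):
            problems.append(f"<{name}>: '{clash}' is both an attribute and a child element")

        # Names that differ only in namespace (checked within this parent only).
        for counts, kind in ((child_locals, "child elements"), (attr_locals, "attributes")):
            for local, n in sorted(counts.items()):
                if n > 1:
                    problems.append(f"<{name}>: {kind} named '{local}' differ only in namespace")
    return problems

bad = ET.fromstring(
    '<a xmlns:x="urn:x" xmlns:y="urn:y" b="1">text<b/><x:c/><y:c/></a>'
)
for message in find_violations(bad):
    print(message)
```

The sample document breaks three restrictions at once: <a> has text next to children, "b" is both an attribute and a child, and the two <c> children differ only in namespace.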
You also need to know the following things for each child element:
  • Whether it MUST appear at most once (a singleton element) or MAY appear more than once (a multiplex element).
  • Whether it contains only text (an element of simple type) or child elements and/or attributes (an element of complex type).
Now, to convert the XML to JSON, apply these rules recursively:
  • A singleton element of simple type, and likewise an attribute, is converted to a JSON simple value: a number or boolean if syntactically possible, otherwise a string.
  • A multiplex element of simple type is converted to a JSON array of simple values.
  • A singleton element of complex type is converted to a JSON object that maps the local names of child elements and attributes to their content. Namespace names are discarded.
  • A multiplex element of complex type is converted to a JSON array of JSON objects that map the local names of child elements and attributes to their content. Namespace names are discarded.
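The rules above can be sketched in Python with the standard xml.etree.ElementTree and json modules. This is an illustration under stated assumptions, not a finished converter: the function names (local_name, simple_value, convert) are invented here, and the singleton/multiplex and simple/complex knowledge is passed in as two sets of local names, which assumes a given name always has the same kind wherever it appears. A multiplex element that occurs zero times simply leaves its key out of the object.

```python
import json
import re
import xml.etree.ElementTree as ET

# JSON number syntax, so that strings like "007" or "1_000" stay strings.
_NUMBER = re.compile(r"-?(0|[1-9]\d*)(\.\d+)?([eE][+-]?\d+)?")

def local_name(tag):
    """Discard a '{namespace}' prefix, keeping only the local name."""
    return tag.rsplit("}", 1)[-1]

def simple_value(text):
    """A number or boolean if syntactically possible, otherwise a string."""
    text = (text or "").strip()
    if text in ("true", "false"):
        return text == "true"
    match = _NUMBER.fullmatch(text)
    if match:
        return float(text) if (match.group(2) or match.group(3)) else int(text)
    return text

def convert(elem, multiplex, complex_names):
    """Convert one element's content to a JSON-compatible Python value.

    multiplex and complex_names are sets of local names supplied by the
    caller (an assumption of this sketch, e.g. derived from a schema).
    """
    if local_name(elem.tag) not in complex_names:
        return simple_value(elem.text)
    # Complex type: attributes and children both become object members.
    obj = {local_name(k): simple_value(v) for k, v in elem.attrib.items()}
    for child in elem:
        name = local_name(child.tag)
        value = convert(child, multiplex, complex_names)
        if name in multiplex:
            obj.setdefault(name, []).append(value)  # always an array
        else:
            obj[name] = value
    return obj

doc = """
<order xmlns="urn:example" id="17" rush="true">
  <customer>Ada</customer>
  <item sku="A1"><qty>2</qty></item>
  <item sku="B2"><qty>1</qty></item>
  <tag>gift</tag>
</order>
"""
result = convert(
    ET.fromstring(doc),
    multiplex={"item", "tag"},
    complex_names={"order", "item"},
)
print(json.dumps(result, indent=2))
```

Note that "tag" becomes a one-element array even though it appears only once, because it is declared multiplex. That is the point of knowing the kind of each element in advance: the shape of the JSON does not change with the number of occurrences in a particular document.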
Comments are very welcome.