JSON vs YAML vs XML: Which Format Should You Use?
JSON, YAML, and XML are the three dominant text-based data serialisation formats in software development today. Each was designed with different priorities and dominates different domains. Choosing the right format for your use case affects readability, tooling compatibility, performance, and maintenance burden. This guide compares all three on syntax, strengths, weaknesses, size, and speed.
A Quick History of Each Format
XML (eXtensible Markup Language) was standardised in 1998 as a simplified subset of SGML. It was the dominant data exchange format of the early 2000s, used in SOAP web services, RSS feeds, Office document formats, and Android layouts. It was designed to be self-describing and extensible.
JSON (JavaScript Object Notation) was formalised by Douglas Crockford in the early 2000s as a lightweight alternative to XML. It maps directly to JavaScript data structures, making it trivial to use in web applications. The rise of AJAX and REST APIs made JSON the default for web APIs by the early 2010s.
YAML (YAML Ain't Markup Language) emerged around the same time as JSON and prioritises human readability. It became popular for configuration files in the DevOps era — Kubernetes manifests, GitHub Actions workflows, Docker Compose files, and Ansible playbooks are all written in YAML.
Syntax Comparison
Consider a simple data structure representing a person with a list of hobbies:
JSON:
{
  "name": "Alice",
  "age": 30,
  "hobbies": ["cycling", "reading"],
  "address": {
    "city": "London",
    "country": "UK"
  }
}
YAML:
name: Alice
age: 30
hobbies:
  - cycling
  - reading
address:
  city: London
  country: UK
XML:
<person>
  <name>Alice</name>
  <age>30</age>
  <hobbies>
    <hobby>cycling</hobby>
    <hobby>reading</hobby>
  </hobbies>
  <address>
    <city>London</city>
    <country>UK</country>
  </address>
</person>
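To make the difference concrete, here is a minimal Python sketch (standard library only) that loads the JSON and XML versions of the person record into the same dictionary. The function names are illustrative, not part of any library. YAML is left out because Python has no YAML parser in its standard library.

```python
import json
import xml.etree.ElementTree as ET

JSON_TEXT = """{
  "name": "Alice",
  "age": 30,
  "hobbies": ["cycling", "reading"],
  "address": {"city": "London", "country": "UK"}
}"""

XML_TEXT = """<person>
  <name>Alice</name>
  <age>30</age>
  <hobbies>
    <hobby>cycling</hobby>
    <hobby>reading</hobby>
  </hobbies>
  <address>
    <city>London</city>
    <country>UK</country>
  </address>
</person>"""


def person_from_json(text: str) -> dict:
    # JSON objects, arrays, strings and numbers map directly onto
    # dict, list, str and int, so no manual conversion is needed.
    return json.loads(text)


def person_from_xml(text: str) -> dict:
    # XML element content is always text, so numbers and lists
    # have to be rebuilt by hand.
    root = ET.fromstring(text)
    return {
        "name": root.findtext("name"),
        "age": int(root.findtext("age")),
        "hobbies": [hobby.text for hobby in root.findall("hobbies/hobby")],
        "address": {
            "city": root.findtext("address/city"),
            "country": root.findtext("address/country"),
        },
    }


if __name__ == "__main__":
    print(person_from_json(JSON_TEXT) == person_from_xml(XML_TEXT))
```

The JSON loader is a one-liner, while the XML loader has to decide which elements are numbers and which are lists. That extra decision-making is the practical cost of XML's document-oriented model.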
JSON: Lean and Machine-Readable
JSON's strengths are its simplicity and near-universal support. The spec is tiny, parsers are fast, and nearly every mainstream programming language ships JSON support built in or in its standard library. Its strict syntax (no comments, no trailing commas, always double-quoted keys) makes it predictable to parse but occasionally frustrating to write by hand.
JSON's limitations: no comments, no native date type (dates are usually sent as ISO 8601 strings), no literal multi-line strings (line breaks must be escaped as \n), and no anchors/aliases (no way to reuse values). It's ideal for machine-to-machine communication (APIs) but less suited to human-authored configuration.
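The strictness described above is easy to observe with Python's standard json module. This is a small sketch; check_json and encode_event are hypothetical helper names chosen for illustration.

```python
import json
from datetime import date


def check_json(text: str) -> str:
    """Return "ok" if text is valid JSON, otherwise "rejected"."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return "rejected"
    return "ok"


def encode_event(name: str, when: date) -> str:
    # json.dumps raises TypeError for date objects, so the date is
    # converted to an ISO 8601 string before encoding.
    return json.dumps({"name": name, "when": when.isoformat()})


if __name__ == "__main__":
    samples = {
        "valid": '{"name": "Alice"}',
        "trailing comma": '{"name": "Alice",}',
        "comment": '{"name": "Alice"} // owner',
        "single quotes": "{'name': 'Alice'}",
    }
    for label, text in samples.items():
        print(label, "->", check_json(text))
    print(encode_event("launch", date(2024, 1, 15)))
```

Only the first sample is accepted. The receiving side has to know that the "when" field holds a date, since nothing in the JSON itself says so.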
YAML: Human-Friendly Configuration
YAML's indentation-based syntax is arguably the most readable of the three for humans — no brackets, no quotes required for simple strings. It supports comments (#), multi-line strings, and anchors/aliases for reusing values.
YAML's Achilles heel is its complexity. The YAML 1.1 spec has numerous footguns: the Norway Problem (an unquoted NO parses as boolean false), other implicit booleans such as yes, on, and off, octal parsing (0777 is read as octal and becomes the integer 511), and indentation that must use spaces because tabs are not allowed. YAML 1.2 fixed many of the implicit-typing problems, but library support for 1.2 is inconsistent.
# YAML footguns (unquoted values that may not parse as strings):
country: NO        # becomes false under YAML 1.1
port: 0777         # becomes 511 (octal) under YAML 1.1
api_key: 123456e7  # becomes the float 1234560000000.0 in parsers that accept this exponent form (e.g. the YAML 1.2 core schema)
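The implicit typing behind the NO and 0777 surprises can be illustrated without a YAML library. The sketch below is a deliberately simplified stand-in for a YAML 1.1-style resolver, not a real parser: it covers only a subset of the boolean spellings plus octal and decimal integers, and the names resolve_plain_scalar and needs_quotes are hypothetical.

```python
import re

# A subset of the spellings that YAML 1.1 resolves to booleans.
TRUE_WORDS = {"yes", "Yes", "YES", "true", "True", "TRUE", "on", "On", "ON"}
FALSE_WORDS = {"no", "No", "NO", "false", "False", "FALSE", "off", "Off", "OFF"}

OCTAL = re.compile(r"[-+]?0[0-7]+")          # leading zero means octal in 1.1
DECIMAL = re.compile(r"[-+]?(0|[1-9][0-9]*)")


def resolve_plain_scalar(text: str):
    """Guess the type a YAML 1.1-style loader gives an unquoted scalar."""
    if text in TRUE_WORDS:
        return True
    if text in FALSE_WORDS:
        return False
    if OCTAL.fullmatch(text):
        return int(text, 8)
    if DECIMAL.fullmatch(text):
        return int(text)
    return text  # everything else stays a string in this sketch


def needs_quotes(value: str) -> bool:
    """True when an unquoted value would not come back as a string."""
    return not isinstance(resolve_plain_scalar(value), str)


if __name__ == "__main__":
    for raw in ["NO", "0777", "London", "30"]:
        print(repr(raw), "->", repr(resolve_plain_scalar(raw)))
```

A check like needs_quotes is one way to decide when a string value should be quoted before it is written to a YAML file. Real emitters perform a more complete version of the same test.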
XML: Enterprise and Document-Centric
XML excels at document-centric data where content and structure are mixed (like HTML), where attribute metadata is needed on elements, and where namespace separation between vocabularies matters. The XML ecosystem includes powerful standards: XSLT for transformation, XPath/XQuery for querying, XML Schema (XSD) for validation, and SOAP for web services.
XML's verbosity is its main weakness for data serialisation: the same data often takes around 1.5 to 4 times the bytes of equivalent JSON, depending on how tag-heavy the structure is. XML also demands care with namespace management and has a steeper learning curve than JSON or YAML.
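The features that set XML apart (attributes, mixed content, namespaces, and path queries) can be sketched with Python's standard xml.etree.ElementTree, which supports a limited subset of XPath. The namespace URIs and the document itself are made up for illustration.

```python
import xml.etree.ElementTree as ET

XML_TEXT = """<person xmlns="https://example.com/person"
        xmlns:geo="https://example.com/geo" id="p1">
  <name>Alice</name>
  <bio>Enjoys <em>cycling</em> and reading.</bio>
  <geo:address verified="true">
    <geo:city>London</geo:city>
  </geo:address>
</person>"""

# Prefixes used in queries; they need not match the prefixes in the document.
NS = {"p": "https://example.com/person", "g": "https://example.com/geo"}

root = ET.fromstring(XML_TEXT)

# Attribute metadata on an element.
person_id = root.get("id")

# Namespaced lookup: address and city come from a separate vocabulary.
city = root.findtext("g:address/g:city", namespaces=NS)

# XPath-style predicate on an attribute value.
verified = root.find("g:address[@verified='true']", NS) is not None

# Mixed content: text and child elements interleaved inside <bio>.
bio_text = "".join(root.find("p:bio", NS).itertext())

if __name__ == "__main__":
    print(person_id, city, verified, bio_text)
```

Neither JSON nor YAML has a direct equivalent of the bio element, where markup sits inside running text. That case is where XML tends to justify its verbosity.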
Side-by-Side Example
For the person example above, approximate sizes in the three formats:
- JSON: roughly 100 bytes minified, 130 to 150 bytes pretty-printed
- YAML: roughly 90 bytes, slightly smaller than pretty-printed JSON because simple values need no quotes or brackets
- XML: most verbose, roughly 170 bytes with no whitespace and about 210 bytes indented
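Sizes like these depend heavily on whitespace and formatting choices, so it is worth measuring your own data. A minimal sketch using only the Python standard library follows. YAML is omitted because it would need a third-party package, and to_xml is a hypothetical helper written for this one record shape.

```python
import json
import xml.etree.ElementTree as ET

PERSON = {
    "name": "Alice",
    "age": 30,
    "hobbies": ["cycling", "reading"],
    "address": {"city": "London", "country": "UK"},
}


def to_xml(person: dict) -> str:
    """Serialise the person record to XML with no extra whitespace."""
    root = ET.Element("person")
    ET.SubElement(root, "name").text = person["name"]
    ET.SubElement(root, "age").text = str(person["age"])
    hobbies = ET.SubElement(root, "hobbies")
    for hobby in person["hobbies"]:
        ET.SubElement(hobbies, "hobby").text = hobby
    address = ET.SubElement(root, "address")
    for key, value in person["address"].items():
        ET.SubElement(address, key).text = value
    return ET.tostring(root, encoding="unicode")


def byte_size(text: str) -> int:
    # Measure encoded bytes rather than characters.
    return len(text.encode("utf-8"))


sizes = {
    "json_compact": byte_size(json.dumps(PERSON, separators=(",", ":"))),
    "json_pretty": byte_size(json.dumps(PERSON, indent=2)),
    "xml_compact": byte_size(to_xml(PERSON)),
}

if __name__ == "__main__":
    for name, size in sizes.items():
        print(f"{name}: {size} bytes")
```

On payloads this small the differences rarely matter. They become noticeable with large arrays of records, where XML repeats every tag name twice per field.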
Performance and File Size
For high-throughput APIs, JSON parsing is generally faster than XML parsing and often much faster than YAML parsing, largely because JSON's grammar is so simple. YAML parsers tend to be slow on large files because the full YAML spec requires handling many special cases. XML carries extra overhead from closing tags and attribute syntax, which add bytes to serialise and to parse.
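Relative parser speed varies by library, runtime, and data shape, so a quick measurement on representative data is more reliable than any general rule. The sketch below times Python's standard JSON and XML parsers on equivalent synthetic documents. The record layout and function names are assumptions made for illustration, and YAML is omitted because it needs a third-party parser.

```python
import json
import timeit
import xml.etree.ElementTree as ET


def build_documents(n: int) -> tuple[str, str]:
    """Build equivalent JSON and XML documents holding n small records."""
    records = [{"id": i, "name": f"user{i}"} for i in range(n)]
    json_text = json.dumps(records)

    root = ET.Element("users")
    for record in records:
        user = ET.SubElement(root, "user")
        ET.SubElement(user, "id").text = str(record["id"])
        ET.SubElement(user, "name").text = record["name"]
    xml_text = ET.tostring(root, encoding="unicode")
    return json_text, xml_text


def time_parsers(n: int = 1000, repeats: int = 20) -> dict[str, float]:
    """Return total seconds spent parsing each document `repeats` times."""
    json_text, xml_text = build_documents(n)
    return {
        "json": timeit.timeit(lambda: json.loads(json_text), number=repeats),
        "xml": timeit.timeit(lambda: ET.fromstring(xml_text), number=repeats),
    }


if __name__ == "__main__":
    for name, seconds in time_parsers().items():
        print(f"{name}: {seconds:.4f} s")
```

Treat the output as a rough indication only. A proper benchmark would use real payloads, several runs, and the parsers actually deployed in production.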
For extremely performance-sensitive data exchange, consider binary formats like Protocol Buffers, MessagePack, or Apache Avro — all are more compact and faster to parse than any text format.
When to Choose Each Format
- JSON — REST APIs, configuration when tooling expects it (package.json, tsconfig.json), structured data in databases, localStorage, inter-service messaging.
- YAML — human-authored configuration files (CI/CD pipelines, Kubernetes manifests, Docker Compose, Ansible), anything where developers will edit the file regularly and readability matters more than strict parsing.
- XML — SOAP web services, RSS/Atom feeds, document formats (DOCX, SVG), Android layouts, any system with an existing XML schema you must comply with.
Convert Between Formats Online
Tools.Fun offers converters for all three: use the JSON Formatter to validate and pretty-print JSON, the JSON to YAML Converter to switch between the two most popular config formats, and the JSON to XML Converter when you need to integrate with XML-based systems.