CalcHive
Developer Guides · Mar 5, 2026 · 6 min read

JSON vs YAML vs CSV vs XML: Pick the Right Format

Each data format exists for a reason. Here's when to reach for each one, how to convert between them, and the gotchas that trip people up.

Every data format is a set of tradeoffs. JSON won the web API wars but is a pain for config files. YAML is great for config but will silently turn your version number into a float. CSV is dead simple until you hit a comma inside a field. XML is verbose but still runs half the enterprise world.

Here's when to reach for each one.

JSON: The Default

JSON (JavaScript Object Notation) is the lingua franca of web APIs. It supports objects, arrays, strings, numbers, booleans, and null. It's native to JavaScript and has mature parsers in virtually every mainstream language.

{
  "name": "deployment-config",
  "replicas": 3,
  "env": {
    "NODE_ENV": "production",
    "LOG_LEVEL": "warn"
  },
  "ports": [8080, 8443]
}

Strengths

  • Universal support. Every language, every platform, every API.
  • Clear, unambiguous syntax. No indentation-sensitivity surprises.
  • Great tooling -- formatters, validators, schema generators everywhere.
  • Native to browsers. JSON.parse() is fast.

Weaknesses

  • No comments. You literally cannot annotate your config files. This is the biggest complaint about JSON for configuration.
  • Verbose for deeply nested structures. Lots of braces and quotes.
  • No trailing commas. Adding a line to an array means modifying two lines in your diff.
  • No native date type. Dates are strings, and everyone argues about the format.
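
To make a few of these weaknesses concrete, here is a minimal sketch using Python's standard json module. The try_parse helper and the sample payloads are made up for illustration, and ISO 8601 is just one common convention for dates, not something JSON mandates.

```python
import json
from datetime import date

def try_parse(text):
    """Return (value, None) on success or (None, error message) on failure."""
    try:
        return json.loads(text), None
    except json.JSONDecodeError as exc:
        return None, exc.msg

# A trailing comma after the last array element is rejected.
_, trailing_err = try_parse('{"ports": [8080, 8443,]}')

# So is a JavaScript-style comment.
_, comment_err = try_parse('{"replicas": 3 // scale up later\n}')

# Dates have to travel as strings; here they are written as ISO 8601.
payload = json.dumps({"start_date": date(2024, 1, 15).isoformat()})
restored = date.fromisoformat(json.loads(payload)["start_date"])
```

Both parse attempts come back with an error message instead of a value, and the date only survives the round trip because both sides agree on the string format.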

Keep your JSON clean with the JSON formatter and catch syntax errors early with the JSON validator.

YAML: Config Files Done Right (Usually)

YAML is what happens when you take JSON and optimize it for humans writing config files. It supports comments, is less verbose, and reads more naturally. Kubernetes manifests, Docker Compose files, GitHub Actions workflows, CI/CD pipelines -- YAML is everywhere in DevOps.

# Deployment configuration
name: deployment-config
replicas: 3
env:
  NODE_ENV: production
  LOG_LEVEL: warn
ports:
  - 8080
  - 8443

The gotchas that will bite you

YAML has some infamous quirks that catch people off guard:

# The Norway problem
country: NO        # parsed as boolean false, not the string "NO"
country: "NO"      # this is the string "NO"

# Version numbers
version: 3.10      # parsed as float 3.1, not string "3.10"
version: "3.10"    # this keeps the trailing zero

# Yes/no are booleans
feature_flag: yes  # parsed as boolean true
feature_flag: "yes" # this is the string "yes"

# Indentation matters
parent:
  child: value     # 2-space indent = nested under parent
 child: value      # 1-space indent = YAML parse error

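These aren't edge cases -- they're the kind of bugs that show up in production Kubernetes deployments. The YAML 1.2 spec fixed the boolean issue (only true/false), but many parsers, PyYAML among them, still follow YAML 1.1. The version-number problem is not fixed by either spec: an unquoted 3.10 is a float in both, so quoting is the only reliable defense.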
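
If you generate YAML from code, one defensive option is to quote any string a YAML 1.1 parser might reinterpret. The sketch below is a deliberately simplified check, not a full YAML resolver: needs_quotes and yaml_scalar are hypothetical helper names, and the rules cover only the boolean, null, and plain-number cases from the examples above.

```python
import re

# Plain scalars that YAML 1.1 resolves to booleans or null.
# Comparing in lowercase over-quotes odd casings like "yEs", which is harmless.
_YAML11_KEYWORDS = {"y", "n", "yes", "no", "on", "off", "true", "false", "null", "~"}

# Strings that look like integers or simple decimal floats: "3", "3.10", "-1.5".
_NUMBER_LIKE = re.compile(r"^[-+]?(\d+\.?\d*|\.\d+)$")

def needs_quotes(value: str) -> bool:
    """Return True if a string should be quoted to stay a string in YAML 1.1."""
    if value == "" or value.lower() in _YAML11_KEYWORDS:
        return True
    return bool(_NUMBER_LIKE.match(value))

def yaml_scalar(value: str) -> str:
    """Render a string as a YAML scalar, double-quoting the ambiguous cases.

    Assumption: other characters that require quoting (colons, a leading '#',
    embedded quotes, and so on) are out of scope for this sketch.
    """
    return f'"{value}"' if needs_quotes(value) else value

country = yaml_scalar("NO")
version = yaml_scalar("3.10")
log_level = yaml_scalar("warn")
```

Here "NO" and "3.10" come out double-quoted while "warn" is left alone. A real emitter should rely on a YAML library's own quoting; this only shows the shape of the check.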

Validate your YAML before deploying with the YAML validator. Need to convert between formats? The JSON to YAML converter handles the translation.

CSV: Flat Data, Maximum Compatibility

CSV is the simplest format here. Rows of data, comma-separated values, one record per line. It's the universal import/export format for spreadsheets, databases, and data pipelines.

name,email,role,start_date
Jane Doe,jane@example.com,admin,2024-01-15
"Smith, Bob",bob@example.com,user,2024-03-01
Alice Johnson,alice@example.com,editor,2024-02-20

Where CSV wins

  • Tabular data exports from databases and analytics tools.
  • Data interchange with non-technical users (everyone can open it in Excel).
  • Streaming large datasets -- you can process line by line without loading everything into memory.
  • Log files and time-series data where each row is an event.
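
The streaming point is easy to sketch with Python's csv module, which reads one record at a time from any iterable of lines. The admin_emails function and the sample rows are illustrative assumptions; in practice you would pass an open file handle instead of the in-memory buffer.

```python
import csv
import io
from typing import Iterable, Iterator

def admin_emails(lines: Iterable[str]) -> Iterator[str]:
    """Yield the email of each admin row, reading one record at a time."""
    for row in csv.DictReader(lines):
        if row["role"] == "admin":
            yield row["email"]

# io.StringIO stands in for a large file; an open file handle works the same way.
sample = io.StringIO(
    "name,email,role,start_date\n"
    "Jane Doe,jane@example.com,admin,2024-01-15\n"
    "Alice Johnson,alice@example.com,editor,2024-02-20\n"
)
emails = list(admin_emails(sample))
```

Because the function is a generator, memory use stays flat no matter how many rows the file has; only the final list() call here collects the results.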

Where CSV falls apart

  • No nested data. You can't represent an object with child objects. You either flatten the structure or use a different format.
  • No types. Everything is a string. The consumer has to guess whether "42" is a number or a zip code.
  • Quoting is a mess. What happens when a field contains a comma? Or a newline? Or a quote? RFC 4180 defines the rules, but not every implementation follows them.
  • No standard encoding. Without a byte order mark, Excel on Windows typically assumes the system's legacy code page (often Windows-1252) rather than UTF-8. Your data pipeline reads the same file as UTF-8. Characters break.
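
The quoting and encoding problems are the reason to use a real CSV parser instead of splitting on commas. A small sketch with Python's csv module follows; the sample row and names are made up for illustration.

```python
import csv
import io

line = '"Smith, Bob",bob@example.com,user,2024-03-01'

# Naive splitting treats the comma inside the quotes as a delimiter.
naive = line.split(",")

# The csv module applies the quoting rules, so the quoted field stays whole.
parsed = next(csv.reader([line]))

# Writing applies the same rules in reverse: fields containing commas or
# quotes are quoted, and embedded quotes are doubled.
buffer = io.StringIO()
csv.writer(buffer).writerow(['Bob "Bobby" Smith', "a,b"])
written = buffer.getvalue()

# The "utf-8-sig" codec prepends a byte order mark, which Excel uses
# to detect UTF-8.
bom_bytes = "name\nZoë\n".encode("utf-8-sig")
```

The naive split produces five fields where the parser correctly produces four. When writing files meant for Excel, opening them with encoding="utf-8-sig" is one common workaround for the encoding mismatch.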

If you need to transform CSV data into something more structured, the CSV to JSON converter handles the common case of turning rows into an array of objects.
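
For a sense of what that common case involves, here is a minimal sketch using only Python's standard library. The csv_to_json name and the sample columns are assumptions for illustration, not a description of how any particular converter works.

```python
import csv
import io
import json

def csv_to_json(text: str) -> str:
    """Convert CSV text with a header row into a JSON array of objects."""
    rows = list(csv.DictReader(io.StringIO(text)))
    return json.dumps(rows, indent=2)

source = "name,role,logins\nJane Doe,admin,42\n"
records = json.loads(csv_to_json(source))
```

Note that logins comes out as the string "42", not the number 42. CSV carries no type information, so any conversion to numbers, booleans, or dates is a decision you have to make per column.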

XML: Verbose, Powerful, Still Everywhere

XML gets a bad reputation in the JSON era, but it solves problems JSON doesn't. Namespaces, schemas (XSD), mixed content (text with inline markup), and processing instructions are all built in.

<?xml version="1.0" encoding="UTF-8"?>
<deployment xmlns="http://example.com/deploy">
  <name>deployment-config</name>
  <replicas>3</replicas>
  <env>
    <var name="NODE_ENV">production</var>
    <var name="LOG_LEVEL">warn</var>
  </env>
  <ports>
    <port>8080</port>
    <port>8443</port>
  </ports>
</deployment>
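
Namespaces are the part that most often surprises people when reading a document like this from code. The sketch below uses Python's xml.etree.ElementTree on a trimmed copy of the example; the "d" prefix is an arbitrary local alias chosen for this snippet.

```python
import xml.etree.ElementTree as ET

DOC = """<deployment xmlns="http://example.com/deploy">
  <name>deployment-config</name>
  <env>
    <var name="NODE_ENV">production</var>
    <var name="LOG_LEVEL">warn</var>
  </env>
  <ports>
    <port>8080</port>
    <port>8443</port>
  </ports>
</deployment>"""

# Map a local prefix to the namespace URI declared in the document.
NS = {"d": "http://example.com/deploy"}

root = ET.fromstring(DOC)
name = root.findtext("d:name", namespaces=NS)
ports = [int(port.text) for port in root.findall("d:ports/d:port", NS)]
env = {var.get("name"): var.text for var in root.findall("d:env/d:var", NS)}

# A lookup that ignores the namespace finds nothing.
missing = root.find("name")
```

Every element inherits the default namespace, so a bare "name" lookup returns None even though the tag is visibly there. Attributes without a prefix are not namespaced, which is why var.get("name") works as written.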

Where XML still dominates

  • Enterprise integrations: SOAP APIs, XML-based EDI, healthcare (HL7 v3/CDA, plus FHIR's XML representation), financial services (FIXML, XBRL). These aren't going away.
  • Document formats: XHTML, SVG, RSS/Atom feeds, EPUB, Office Open XML (.docx, .xlsx).
  • Configuration with schemas: Maven's pom.xml, Android manifests, Spring config. XSD validation catches errors before deployment.
  • Mixed content: When you need text with inline markup (like HTML), XML is the natural fit. JSON can't do this cleanly.

Format messy XML with the XML formatter.

When to Convert Between Formats

Format conversion comes up in predictable situations:

  • API response to config file: You fetch JSON from an API and need it in YAML for a Kubernetes manifest. Convert with JSON to YAML.
  • Spreadsheet to API: Someone gives you a CSV export and your system ingests JSON. Convert with CSV to JSON.
  • Legacy integration: A partner sends XML but your microservice speaks JSON. Parse the XML, map the fields, serialize to JSON.
  • Human review: You have a huge JSON blob and need to scan it quickly. YAML's cleaner indentation can make it more readable for review.
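
The legacy integration case usually comes down to three steps: parse, map, serialize. Here is a minimal sketch with Python's standard library. The order document, its field names, and the order_xml_to_json function are all hypothetical; a real partner feed would define its own schema.

```python
import json
import xml.etree.ElementTree as ET

def order_xml_to_json(xml_text: str) -> str:
    """Parse a (hypothetical) partner order, map its fields, serialize to JSON."""
    root = ET.fromstring(xml_text)
    order = {
        "id": root.get("id"),
        "customer": root.findtext("customer"),
        "items": [
            # XML text is untyped, so numeric fields are converted explicitly.
            {"sku": item.get("sku"), "quantity": int(item.findtext("quantity"))}
            for item in root.findall("items/item")
        ],
    }
    return json.dumps(order)

partner_xml = """<order id="A-100">
  <customer>Jane Doe</customer>
  <items>
    <item sku="WIDGET-1"><quantity>2</quantity></item>
    <item sku="GADGET-7"><quantity>1</quantity></item>
  </items>
</order>"""

converted = json.loads(order_xml_to_json(partner_xml))
```

Mapping fields by hand like this is more work than a generic XML-to-JSON dump, but it lets you decide which values are attributes, which are lists, and which are numbers instead of leaving that to a converter's guess.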

Decision Cheat Sheet

Use Case                         | Format             | Why
Web API                          | JSON               | Universal support, native to JS
Config file (with comments)      | YAML               | Readable, supports comments
Tabular data / spreadsheet       | CSV                | Simple, universally importable
Enterprise / document markup     | XML                | Schemas, namespaces, mixed content
Config file (no comments needed) | JSON               | Simpler parsing, no indent issues
Data pipeline / ETL              | CSV or JSON Lines  | Streamable, line-by-line processing
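
JSON Lines, the other pipeline option in the table, is simply one JSON object per line, which keeps CSV's line-by-line streaming while allowing nested records. A minimal sketch, with write_jsonl, read_jsonl, and the sample events as illustrative names rather than any standard API:

```python
import io
import json
from typing import IO, Iterable, Iterator

def write_jsonl(records: Iterable[dict], out: IO[str]) -> None:
    """Write one compact JSON object per line."""
    for record in records:
        out.write(json.dumps(record) + "\n")

def read_jsonl(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed record per non-empty line."""
    for line in lines:
        if line.strip():
            yield json.loads(line)

events = [
    {"event": "login", "user": {"name": "Jane Doe", "role": "admin"}},
    {"event": "logout", "user": {"name": "Jane Doe", "role": "admin"}},
]

# io.StringIO stands in for a file on disk.
buffer = io.StringIO()
write_jsonl(events, buffer)
buffer.seek(0)
restored = list(read_jsonl(buffer))
```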

There's no universally "best" format. JSON is the safe default for APIs. YAML is the safe default for config. CSV is the safe default for tabular data. XML is the safe default when your enterprise partner says "we use XML." Pick the one that fits your use case and move on.
