TODO: merge
Dropbox/caravan/caravan/interchange-format
Comparison of data-serialization formats - Wikiwand
RFC 8949 - Comparison of Other Binary Formats to CBOR's Design Objectives
deserialization - Performant Entity Serialization: BSON vs MessagePack (vs JSON) - Stack Overflow
Comparing speed and size of to_csv(), np.save(), to_hdf(), to_pickle() | Towards Data Science
The Best Format to Save Pandas Data | by Ilia Zaitsev | Towards Data Science
devforfu/pandas-formats-benchmark: A little benchmark comparing Pandas data frames serialization formats
Graphtage Documentation
trailofbits/graphtage: A semantic diff utility and library for tree-like files such as JSON, JSON5, XML, HTML, YAML, and CSV.
IDL
Interface description language - Wikiwand
JSON schema
JSON Schema | The home of JSON Schema
Understanding JSON Schema — Understanding JSON Schema 7.0 documentation
Structuring a complex schema — Understanding JSON Schema 7.0 documentation defining types
Combining schemas — Understanding JSON Schema 7.0 documentation
Applying subschemas conditionally — Understanding JSON Schema 7.0 documentation
jsonschema - Json Schema file extension - Stack Overflow .json with MIME type application/schema+json
fastify/fluent-json-schema: A fluent API to generate JSON schemas ❗ important
sinclairzx81/typebox: JSON Schema Type Builder with Static Type Resolution for TypeScript ❗ important, shares definitions between JSON Schema and TypeScript
JSON Schema Tool playground, infer schema from JSON
JSON Schema Lint :: JSON Schema Validator
Fake your JSON-Schemas!
How to do inheritance? · Issue #348 · json-schema-org/json-schema-spec
Explain why inheritance isn't the right model · Issue #148 · json-schema-org/json-schema-org.github.io
jsonSchema attribute conditionally required - Stack Overflow
jsonschema - JSON schema: conditional dependency - Stack Overflow
Validators
JSON Schema Validation: A Vocabulary for Structural Validation of JSON
draft-bhutton-json-schema-validation-00 - JSON Schema Validation: A Vocabulary for Structural Validation of JSON
Python
jsonschema — jsonschema 3.2.0 documentation
keleshev/schema: Schema validation just got Pythonic
TypeScript
Ajv JSON schema validator
ajv-validator/ajv: The fastest JSON schema Validator. Supports JSON Schema draft-04/06/07/2019-09/2020-12 and JSON Type Definition (RFC8927)
cypress-io/schema-tools: Validate, sanitize and document JSON schemas
JavaScript Validators
samchon/typescript-json: Super-fast Runtime type checkers (validators) and JSON.stringify() function (TSON); zod is slow
Comparing schema validation libraries: Zod vs. Yup - LogRocket Blog Yup is pre-TypeScript
jquense/yup: Dead simple Object schema validation
colinhacks/zod: TypeScript-first schema validation with static type inference
Zod Tutorial | Total TypeScript
Learn "Zod" In 5 Minutes - DEV Community
mattkingshott/iodine: A micro JavaScript validation library.
Introduction - Superstruct
ianstormtaylor/superstruct: A simple and composable way to validate data in JavaScript (and TypeScript).
Vest - Declarative Validations validate like writing tests
flowstudio/datalize: Parameter, query, form data validation and filtering for NodeJS.
Node.js Form Validation Using Datalize | Toptal
philipnilsson/bueno: Composable validators for forms, API:s in TypeScript
Joi
joi.dev
sideway/joi: The most powerful data validation library for JS
v16.0.0 Release Notes · Issue #2037 · sideway/joi joi 16 is a rewrite
joi.dev - API Reference
RunKit + npm: joi
joi.dev - Schema Tester
tlivings/enjoi: Converts a JSON schema to a Joi schema.
joi/test at master · sideway/joi
What I’ve Learned Validating with Joi – ITNEXT
What I've Learned Validating with Joi — Futurice
Node API Schema Validation with Joi ― Scotch
Joi for Node: Exploring Javascript Object Schema Validation
Joi — awesome code validation for Node.js and Express - DEV Community
Handling Joi validation errors in Hapi 17 – Piotr Karpala – Medium ❗ important, returns validation errors to the client
Expressing complex logic in when() · Issue #1663 · sideway/joi use when() on schema
Customize error message
joi/API.md#list-of-errors at master · sideway/joi
Node.js + Joi how to display a custom error messages? - Stack Overflow
Joi validation error does not provide detailed information in response · Issue #3706 · hapijs/hapi in Hapi, failAction() is the best
Go Validators
JSON
RFC 8259 - The JavaScript Object Notation (JSON) Data Interchange Format
A beginner's guide to JSON, the data format for the internet - Stack Overflow Blog
RFC 8259 JSON
RFC 6901 JSON Pointer
RFC 6902 JSON Patch
JSON ABC - Sort JSON Alphabetically
JSON Sorter - Sort JSON keys online allows comments
Convert JSON to Swift, C#, TypeScript, Objective-C, Go, Java, C++ and more • quicktype
A first look at quicktype
JSON-LD - JSON for Linking Data
digitalbazaar/jsonld.js: A JSON-LD Processor and API implementation in JavaScript JS
Creating semantic sites with Web Components and JSON-LD - Chrome for Developers
msgspec
Faster, more memory-efficient Python JSON parsing with msgspec
ICRAR/ijson: Iterative JSON parser with Pythonic interfaces
Processing large JSON files in Python without running out of memory
JSON for Modern C++ - JSON for Modern C++
nlohmann/json: JSON for Modern C++
JSON serializers
lxsmnsyc/seroval: Stringify JS values JS
fastify/fast-json-stringify: 2x faster than JSON.stringify() JS
ijl/orjson: Fast, correct Python JSON library supporting dataclasses, datetimes, and numpy Python
goccy/go-json: Fast JSON encoder/decoder compatible with encoding/json for Go Go
JSON streaming
JSON streaming - Wikiwand
JSON Lines
ndjson/ndjson.js: Streaming line delimited json parser + serializer
# NDJSON to JSON
jq --slurp . a.ndjson > a.json
# JSON to NDJSON
jq -c '.[]' a.json > a.ndjson
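The two jq conversions above can also be sketched with only the Python standard library. This is a minimal sketch, not a drop-in replacement for jq: the function names are hypothetical, it assumes exactly one JSON value per non-empty line, and it assumes the top-level JSON value is an array.

```python
import json


def ndjson_to_json(ndjson_text: str) -> str:
    """Collect one-value-per-line NDJSON into a single JSON array."""
    # Assumption: each non-empty line holds exactly one complete JSON value.
    values = [json.loads(line) for line in ndjson_text.splitlines() if line.strip()]
    return json.dumps(values)


def json_to_ndjson(json_text: str) -> str:
    """Write each element of a top-level JSON array on its own compact line."""
    values = json.loads(json_text)
    if not isinstance(values, list):
        raise TypeError("expected a top-level JSON array")
    # Compact separators keep every value on a single line without extra spaces.
    return "".join(json.dumps(v, separators=(",", ":")) + "\n" for v in values)
```

For large files, reading and writing line by line (or an iterative parser such as ijson) avoids holding everything in memory.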
Binary Serialization
Binary Formats - JSON for Modern C++
BSON (Binary JSON) Serialization MongoDB, in-place update, designed for storage and lookup
JSON and BSON | MongoDB
BSON Types — MongoDB Manual
mongodb/js-bson: BSON Parser for node and browser
bson package - go.mongodb.org/mongo-driver/bson - Go Packages
CBOR — Concise Binary Object Representation | Overview Web Assembly, based on MsgPack, supports partial decode, designed for network communication
RFC 8949 - Concise Binary Object Representation (CBOR)
RFC 8610 - Concise Data Definition Language (CDDL): A Notational Convention to Express Concise Binary Object Representation (CBOR) and JSON Data Structures
Base58 Encoder / Decoder Online - AppDevTools
toravir/csd: CBOR Stream Decoder
TOML
tomllib — Parse TOML files — Python 3 documentation Python 3.11+, essentially tomli
hukkin/tomli: A lil' TOML parser
sdispater/tomlkit: Style-preserving TOML library for Python
Python and TOML: New Best Friends – Real Python
Taplo | A versatile TOML toolkit.
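A minimal sketch of the "Python 3.11+, essentially tomli" note: the import falls back to the tomli backport on older interpreters, assuming tomli is installed there. The TOML document and its keys are made up for illustration.

```python
try:
    import tomllib  # standard library on Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # backport for older Pythons (assumed installed)

# Hypothetical config used only for this example.
doc = """
[server]
host = "localhost"
port = 8080
"""

config = tomllib.loads(doc)  # parses a str; tomllib.load() takes a binary file
```

tomllib only reads TOML; for writing or style-preserving edits a separate library such as tomlkit is needed.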
YAML
PyYAML Documentation
YAML: The Missing Battery in Python – Real Python
CUE
CUE
Introduction | CUE
Documentation | CUE
CUE Playground
Cuetorials
CUE is an exciting configuration language — Bitfield Consulting
Configuring Kubernetes with CUE · garethr.dev
CUE: a data constraint language and shoo-in for Go. Marcel van Lohuizen, Google. - YouTube
GopherCon Europe 2020: Marcel van Lohuizen - Better APIs with Shareable Validation Logic - YouTube
Pkl
Pkl :: Pkl Docs
apple/pkl: A configuration as code language with rich validation and tooling.
Introducing Pkl, a programming language for configuration :: Pkl Docs
Pkl: Apple's New JSON/YAML Killer (I actually want to use this...) - YouTube
Protocol Buffers (protobuf)
Protocol Buffers Documentation
Protocol Buffers Version 3 Language Specification | Protocol Buffers Documentation
Protocol Buffers - Wikiwand
Protocol Buffers Crash Course - YouTube
Protobuf - How Google Changed Data Serialization FOREVER - YouTube
Don't Use REST APIs in your Backend, Use gRPC - YouTube
Protocol Buffers, Part 1 — Serialization Library for Microservices
Protocol Buffers, Part 2 — The Untold Parts Of Using “Any”
Buf | Home The only Protobuf developer platform
MessagePack (msgpack)
MessagePack: It's like JSON. but fast and small.
MessagePack supports partial decode, designed for network communication
msgpack/msgpack-python: MessagePack serializer implementation for Python msgpack.org[Python]
Go
MessagePack encoding for Go
msgpack package - github.com/vmihailenco/msgpack - Go Packages
vmihailenco/msgpack: msgpack.org[Go] MessagePack encoding for Golang
Node
keywords:messagepack - npm search
mattheworiordan/nodejs-encoding-benchmarks: Simple repo to benchmark performance of Node.js encoding libraries
msgpack/msgpack-javascript: @msgpack/msgpack - MessagePack for JavaScript/TypeScript/ECMA-262 / msgpack.org[JavaScript]
kawanet/msgpack-lite: Fast Pure JavaScript MessagePack Encoder and Decoder / msgpack.org[JavaScript]
Apache Arrow/Feather
PyArrow - Apache Arrow Python bindings — Apache Arrow v6.0.0
arrow package - github.com/apache/arrow/go/arrow - pkg.go.dev
Apache Arrow - v6.0.0
arrow/js at master · apache/arrow
Feather is now part of Apache Arrow
wesm/feather: Feather: fast, interoperable binary data frame storage for Python, R, and more powered by Apache Arrow
Feather: A Fast On-Disk Format for Data Frames for R and Python, powered by Apache Arrow - RStudio
Apache Thrift
Apache Thrift - Home
Reconciling GraphQL and Thrift at Airbnb - Airbnb Engineering & Data Science - Medium
Apache Parquet
Apache Parquet compatible with Pandas DataFrame
apache/parquet-format: Apache Parquet
Reading and Writing the Apache Parquet Format — Apache Arrow v6.0.1
xitongsys/parquet-go: pure golang library for reading/writing parquet file
Processing parquet files in Golang - DEV Community
Inspect Parquet from command line - Stack Overflow
parquet-tools · PyPI
parquet-cli · PyPI
pyarrow also loads Parquet
Development update: High speed Apache Parquet in Python with Apache Arrow - Wes McKinney
Apache ORC
Apache ORC • High-Performance Columnar Storage for Hadoop
HDF
The HDF5® Library & File Format - The HDF Group
HDFGroup Documentation
Learning HDF5
Using the HDF5 Command-line Tools
Introduction to HDF5 | Quincey Koziol, The HDF Group - YouTube
Parallel HDF5 | Quincey Koziol, The HDF Group - YouTube
A Brief Introduction to HDF5
HDF Group - HDF5 old portal
https://support.hdfgroup.org/HDF5/docNewFeatures/SWMR/Design-HDF5-FileLocking.pdf
Parallel I/O – Why, How, and Where to? - The HDF Group
Cyrille Rossant - Moving away from HDF5
Cyrille Rossant - Should you use HDF5?
On HDF5 and the future of data management
HDF5 for Python — h5py documentation
Save Pandas objects to HDF5 - DEV Community
gonum/hdf5: hdf5 is a wrapper for the HDF5 library
hdf5 package - gonum.org/v1/hdf5 - pkg.go.dev
CDF
Unidata | NetCDF
NetCDF Why and How: Creating Publication Quality NetCDF Datasets - YouTube
Tutorial - Introduction to the NetCDF format - YouTube
Visualising data in NetCDF format - YouTube
ASDF
ASDF Standard — ASDF Standard documentation
FlatBuffers
FlatBuffers: FlatBuffers very similar to Protobuf
What's the difference between Protocol Buffers and Flatbuffers? - Stack Overflow
JSON vs Protocol Buffers vs FlatBuffers | by Kartik Khare | codeburst
Protocol Buffers is indeed relatively similar to FlatBuffers, with the primary difference being that FlatBuffers does not need a parsing/unpacking step to a secondary representation before you can access data, often coupled with per-object memory allocation. The code is an order of magnitude bigger, too. Protocol Buffers has no optional text import/export.
FlatBuffers: Use in C#
FlatBuffers: Use in JavaScript
FlatBuffers: Use in Python
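The "no parsing/unpacking step" point in the comparison above can be illustrated roughly with the standard library. This is only a sketch of the idea: the fixed record layout and function names are hypothetical, and it is not the FlatBuffers wire format or API.

```python
import json
import struct

# Hypothetical fixed-size record: little-endian uint32 id + float64 score.
# NOT the FlatBuffers wire format, only an illustration of in-place access.
RECORD = struct.Struct("<Id")
SCORE_OFFSET = 4  # the score follows the 4-byte id


def score_in_place(buf, index):
    """Read one field directly from the buffer; no intermediate objects are built."""
    (score,) = struct.unpack_from("<d", buf, index * RECORD.size + SCORE_OFFSET)
    return score


def score_via_parse(json_text, index):
    """Contrast: decode the whole payload into Python objects, then index."""
    return json.loads(json_text)[index]["score"]


rows = [(1, 0.5), (2, 1.5), (3, 2.5)]
binary = memoryview(b"".join(RECORD.pack(i, s) for i, s in rows))
text = json.dumps([{"id": i, "score": s} for i, s in rows])
```

The in-place reader touches only the 8 bytes it needs, while the parse-first reader allocates an object for every record before returning one field.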
Cap'n Proto
Cap'n Proto: Introduction zero copy
Cap'n Proto: Cap'n Proto, FlatBuffers, and SBE
Simple Binary Encoding
Mechanical Sympathy: Simple Binary Encoding
real-logic.github.io/simple-binary-encoding
real-logic/simple-binary-encoding: Simple Binary Encoding (SBE) - High Performance Message Codec
BaseN encoding
multiformats/multibase: Self identifying base encodings
RFC 4648 - The Base16, Base32, and Base64 Data Encodings
draft-msporny-base58-03 base58btc
multibase/rfcs at master · multiformats/multibase
C++
What's the most mature JSON library for C++? Support for JSON Schema is a plus. - Quora
miloyip/nativejson-benchmark: C/C++ JSON parser/generator benchmark
cereal Docs - Main
USCiLab/cereal: A C++11 library for serialization
RapidJSON: Main Page
Tencent/rapidjson: A fast JSON parser/generator for C++ with both SAX/DOM style API
Martchus/reflective-rapidjson: Code generator for serializing/deserializing C++ objects to/from JSON using Clang and RapidJSON
JSON for Modern C++: JSON for Modern C++
nlohmann/json: JSON for Modern C++
pboettch/json-schema-validator: JSON schema validator for JSON for Modern C++
Jansson — C library for working with JSON data
Jansson Documentation — Jansson documentation
akheron/jansson: C library for encoding, decoding and manipulating JSON data
C#
Serializing JSON Data into Binary Form | DotNetCurry
Rust
Serde: Serialization framework for Rust (GitHub)
Rust devs push back as Serde project ships precompiled binaries
TimelyDataflow/abomonation: A mortifying serialization library for Rust works even with pointers
Go
fatih/gomodifytags: Go tool to modify struct field tags for JSON serialization
Several ways of serialization and deserialization of golang | Develop Paper
Ellerbach/Golang-Json-serialize-deserialize: Go (Golang) Json serialization and deserialization practices
smallnest/gosercomp: Golang Serializer Benchmark Comparison
gob package - encoding/gob - pkg.go.dev native codec for Go
glycerine/zebrapack: ZebraPack format is like gobs version 2: serialization in Go, but extremely fast and friendly to other languages. Use Go as your schema. Strong typing. Well documented (and msgpack2 compatible) format so other languages can be readily supported. See also https://github.com/glycerine/greenpack for a more recent alternative. Docs:
glycerine/greenpack: Cross-language serialization for Golang: greenpack adds versioning, stronger typing, and optional schema atop msgpack2. greenpack -msgpack2 produces classic msgpack2, and handles nils. Cousin to ZebraPack (https://github.com/glycerine/zebrapack), greenpack's advantage is fully self-describing data. Oh, and faster than protobufs.
microhq/go-bson: A copy of youtube/vitess/go/bson
Java
Java Object Serialization Specification: Contents
java.io A node implementation
jdeserialize
The Java serialization algorithm revealed | JavaWorld
5 things you didn't know about ... Java Object Serialization
Serialization and Deserialization in Java example using Serializable Interface | CodinGeek a transient field will not be serialized
Kotlinx
An Extensive Kotlinx Serializer Library For Serialization | Android | Kotlin
Python
pickle — Python object serialization — Python documentation pickle is Python-specific (not cross-language); compatibility across Python versions depends on the pickle protocol chosen
Pickle’s nine flaws | Ned Batchelder
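A small sketch of pinning the pickle protocol, which is one way to keep pickles readable by older Python 3 releases. The sample data is made up; protocol 4 is chosen here only as an example (it has been available since Python 3.4).

```python
import pickle

# Hypothetical payload used only for this example.
data = {"name": "example", "values": [1, 2, 3]}

# Pin the protocol explicitly instead of relying on the interpreter default,
# so readers on older Python 3 versions can still load the result.
blob = pickle.dumps(data, protocol=4)

# Only unpickle data you trust: loading a pickle can execute arbitrary code.
restored = pickle.loads(blob)
```

Pinning a protocol helps with Python-to-Python compatibility only; for exchange with other languages a neutral format (JSON, MessagePack, protobuf) is still required.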
Serialization and Deserialization of Python Objects: Part 1
Serialization and Deserialization of Python Objects: Part 2
Object serialization in Python ~ The Python Corner
marshmallow: simplified object serialization — marshmallow documentation
marshmallow-code/marshmallow: A lightweight library for converting complex objects to and from simple Python datatypes.
serpy: ridiculously fast object serialization — serpy documentation
JSON Serialization in Python using serpy – Twilio Cloud Communications Blog
Working With JSON Data in Python – Real Python
Reading and Writing JSON in Python - The Python Guru
Better Python Object Serialization · Homepage of Hynek Schlawack
Efficiently Store Pandas DataFrames
Python Validators
keleshev/schema: Schema validation just got Pythonic
Introduction to Schema: A Python Library to Validate your Data | by Khuyen Tran | Towards Data Science
Data-science/schema.ipynb at master · khuyentran1401/Data-science
Welcome to Cerberus — Cerberus is a lightweight and extensible data validation library for Python
Do Not Use If-Else For Validating Data Objects In Python Anymore | by Christopher Tao | May, 2022 | Towards Data Science