File Parsing

January 9, 2025
August 1, 2018

Python network packet dissection frameworks shootout: Scapy vs Construct vs Hachoir vs Kaitai Struct – Adventures in Python

wader/fq: jq for binary formats

Kaitai

Kaitai Struct: declarative binary format parsing language
kaitai-io/awesome-kaitai: A curated list of Kaitai Struct tools and resources

Kaitai Struct: documentation
Kaitai Struct User Guide
Kaitai Struct: KSY reference

kaitai-struct-compiler

bin/kaitai-struct-compiler -no-version-check

Root level keys:

Examples

File Format Gallery for Kaitai Struct

.zip file format spec for Kaitai Struct
.bmp file format spec for Kaitai Struct
Microsoft Windows icon file format spec for Kaitai Struct
quicktime_mov format spec for Kaitai Struct
TrueType Font File format spec for Kaitai Struct

Digital Imaging and Communications in Medicine (DICOM) file format spec for Kaitai Struct
kaitai-io/dicom.ksy: DICOM (Digital Imaging and Communications in Medicine) file format spec for Kaitai Struct

MS-XLSX spec (pdf)

Point Cloud

The PCD (Point Cloud Data) file format - Point Cloud Library (PCL)
PLY (file format) - Wikiwand
Wavefront .obj file - Wikiwand
Point Cloud XYZ (POINTCLOUDXYZ) Reader/Writer
STL (file format) - Wikiwand

glTF - Wikiwand
glTF Overview - The Khronos Group Inc

The Alliance for OpenUSD (AOUSD)
USD Home — Universal Scene Description documentation
USD Tutorials — Universal Scene Description documentation
Pixar Universal Scene Description USD | NVIDIA Developer
Working with USD Python Libraries | NVIDIA Developer
USD Python API Notes | NVIDIA Developer
Pixar USD Python API · kiryha/Houdini Wiki · GitHub

Apple, NVIDIA, and other companies form the Alliance for OpenUSD to promote an open-source standard for 3D content - Hong Kong unwire.hk
Did Pixar Just Change the Future of 3D Forever? - YouTube
The Most Valuable File Format You've Never Heard Of - YouTube

BIM

File formats for BIM - Designing Buildings

RVT: Autodesk Revit, .rfa, .rte
NWD: Autodesk Navisworks, .nwc, .nwf, .nwd
DWG: AutoCAD, .dxf, .dwg

OpenIFC: Industry Foundation Classes, .ifc, .ifcxml, .ifczip
BCF: BIM Collaboration Format, .bcfzip
COBie: Construction Operations Building information exchange, .xml

OpenIFC Model Repository

bimfag/intro-python-bim: Basic Course in Python for use with BIM

PDF

PDF - Wikiwand
PostScript - Wikiwand
c++ - PDF specifications for coders: Adobe or ISO? - Stack Overflow
PDF File Format - What is a PDF file?
PDF file format: Basic structure [updated 2020] - Infosec Resources
ISO 32000 (PDF) – PDF Association
PDF Specification Index – PDF Association
Glossary of PDF terms – PDF Association
PDF 32000-1:2008 (PDF 1.7, 2008)
PDF Reference, version 1.7 (PDF 1.7, 2006)
PDF Reference, Third Edition (PDF 1.4, 2001)

Best tool for inspecting PDF files? - Stack Overflow
Home | veraPDF

PDF Tools | Didier Stevens
Quickpost: About the Physical and Logical Structure of PDF Files | Didier Stevens

QPDF: A Content-Preserving PDF Transformation System

pdfminer/pdfminer.six: Community maintained fork of pdfminer - we fathom PDF

dzzie/pdfstreamdumper: Research tool for analyzing malicious PDF documents (Windows app). Run the installer first so that all third-party DLLs are installed correctly.


Python

7.1. struct — Interpret bytes as packed binary data — Python documentation

Working with Binary Data in Python | DevDungeon

4. int.from_bytes — Python documentation
4. int.to_bytes — Python documentation
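
A minimal sketch of how the standard-library calls linked above fit together. The 8-byte header layout (4-byte magic, 2-byte version, 2-byte record count, little-endian) is invented for illustration and does not correspond to any real file format.

```python
import struct

# Hypothetical 8-byte header, made up for this example:
# 4-byte magic, 2-byte version, 2-byte record count, all little-endian.
header = b"DEMO" + (1).to_bytes(2, "little") + (3).to_bytes(2, "little")

# struct.unpack decodes several fixed-width fields in one call.
# "<" = little-endian, "4s" = 4 raw bytes, "H" = unsigned 16-bit integer.
magic, version, count = struct.unpack("<4sHH", header)

# int.from_bytes decodes a single integer from a slice of any byte length.
count_again = int.from_bytes(header[6:8], byteorder="little")

print(magic, version, count, count_again)
```

struct suits fixed layouts with several fields; int.from_bytes is handy for one-off integers or unusual widths such as 3-byte values.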

Construct — Construct documentation

Welcome to Hachoir’s documentation! — Hachoir documentation

Features — bitstring documentation

Sepero/SearchBin: Search within binary files for a string, hex, or even another binary file

JavaScript

francisrstokes/construct-js: 🛠️A library for creating byte level data structures.

SharedArrayBuffer vs Int8Array