Building md-utils: Architecture, Parsing, and Working with AI Coding Agents
md-utils is a CLI and Swift library for structurally manipulating Markdown files — frontmatter CRUD, section extraction and reordering, heading adjustment, wikilink resolution, and more. This post is about how it was built: the architecture decisions, the parsing strategy, the dependency choices, the testing approach, and how AI coding agents fit into the development workflow.
Motivation
I manage a large Obsidian vault. Over time I kept running into the same friction: I needed to batch-update frontmatter fields, check for broken wikilinks, extract sections, reorder content. Obsidian is great as an editor, but it doesn’t expose these operations as scriptable commands. And the existing Markdown CLI tools — pandoc, remark, marked — are renderers. They convert Markdown to other formats. They don’t help you manipulate Markdown as Markdown.
What I wanted was a tool that understands Markdown as a structured document: YAML frontmatter as typed data, headings as a hierarchy that defines sections, wikilinks as resolvable references. And I wanted it as both a CLI for scripting and a library for programmatic use.
Architecture Decisions
Two Products, One Repository
md-utils ships two products from a single Swift package:
- MarkdownUtilities — a library with zero CLI dependencies
- md-utils — a CLI built on top of that library
This separation is intentional. The library knows nothing about ArgumentParser, terminal I/O, or file system traversal for batch processing. It operates on strings and data structures. The CLI handles argument parsing, file discovery, output formatting, and error reporting.
The practical benefit: anyone can add MarkdownUtilities as an SPM dependency and get frontmatter parsing, section extraction, wikilink resolution, and everything else without pulling in CLI infrastructure.
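A minimal Package.swift sketch of that setup (the version requirement and the consumer's target name here are illustrative assumptions; the package URL and product name come from the post):

```swift
// swift-tools-version:5.9
import PackageDescription

let package = Package(
    name: "MyTool",  // hypothetical consumer package
    dependencies: [
        .package(url: "https://github.com/DandyLyons/md-utils", from: "0.1.0"),
    ],
    targets: [
        .executableTarget(
            name: "MyTool",
            dependencies: [
                // Depend on the library product only — no CLI infrastructure.
                .product(name: "MarkdownUtilities", package: "md-utils"),
            ]
        ),
    ]
)
```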
MarkdownDocument: The Central Type
Everything flows through MarkdownDocument:
```swift
public struct MarkdownDocument: @unchecked Sendable {
    public var frontMatter: Yams.Node.Mapping
    public var body: String
}
```
Two fields. That’s it. The frontmatter is a Yams.Node.Mapping and the body is a String. All operations — frontmatter mutation, section extraction, heading adjustment, wikilink scanning — are extensions on this type.
Why Yams.Node.Mapping Instead of [String: Any]
This was a deliberate choice. When you parse YAML into [String: Any], you lose information:
- Key ordering — YAML mappings have meaningful order. A [String: Any] dictionary doesn't preserve insertion order. If someone carefully ordered their frontmatter keys (title, author, date, tags), you want to preserve that.
- YAML tags and structure — Yams.Node preserves the full YAML representation: scalar styles, tags, comments adjacent to nodes. Round-tripping through [String: Any] destroys this.
- Type fidelity — Any requires casting everywhere. Node.Mapping gives you Node values that you can pattern-match on (.scalar, .mapping, .sequence).
The tradeoff is that Node.Mapping is less ergonomic than a dictionary for simple lookups. But for a tool that reads, modifies, and writes back frontmatter, preserving structure matters more than convenience.
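The key-ordering point can be illustrated with a stdlib-only sketch (none of this is md-utils code):

```swift
// Stdlib-only sketch of the key-ordering point. Swift's Dictionary
// makes no iteration-order guarantee, while an ordered list of
// key-value pairs — the shape a YAML Node.Mapping preserves —
// keeps the authored order.
let orderedKeys = ["title", "author", "date", "tags"]

// An array of pairs round-trips keys in the original order.
let pairs: [(key: String, value: String)] =
    orderedKeys.map { ($0, "placeholder") }
let roundTripped = pairs.map { $0.key }
assert(roundTripped == orderedKeys)

// A [String: Any]-style dictionary only preserves membership:
// dict.keys can iterate in any order, so serializing from it can
// scramble carefully ordered frontmatter.
let dict = Dictionary(uniqueKeysWithValues: pairs.map { ($0.key, $0.value) })
assert(Set(dict.keys) == Set(orderedKeys))
```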
Feature-Focused Module Organization
The library is organized by feature, not by layer:
```
Sources/MarkdownUtilities/
├── MarkdownDocument.swift
├── FrontMatter/
├── TOC/
├── FormatConversion/
├── HeadingAdjustment/
├── SectionExtraction/
├── SectionReordering/
├── Wikilink/
├── FileMetadata/
└── Helpers/
```
Each directory contains the types, parsers, and MarkdownDocument extensions for that feature. FrontMatter/ has FrontMatterParser, YAMLConversion, and the mutation extensions. Wikilink/ has the parser, scanner, resolver, and document extensions. You can understand a feature by reading one directory.
Dependency Choices
Six direct dependencies, each chosen for a specific reason:
swift-parsing (Point-Free)
Used for FrontMatterParser and WikilinkParser. Point-Free’s swift-parsing library provides declarative, composable parser combinators. Compare this to the regex alternative:
```swift
// swift-parsing: declarative, composable, testable
private var frontMatterOnlyParser: some Parser<Substring, String> {
    Parse {
        "---\n"
        PrefixUpTo("---").map { String($0) }
        "---"
        Optionally { "\n" }
    }
    .map { (frontMatter, _) in frontMatter }
}
```
vs.
```swift
// Regex: fragile, hard to extend
let pattern = /^---\n([\s\S]*?)---\n?/
```
The parser combinator approach composes. WikilinkParser handles [[target]], [[target|display]], ![[embed]], [[page#heading]], [[page#^blockID]], and combinations — all built from small, testable pieces. The regex for the same grammar would be a maintenance nightmare.
MarkdownSyntax (wrapping swift-cmark)
MarkdownSyntax provides AST parsing — turning Markdown body text into a tree of headings, paragraphs, code blocks, links, etc. It wraps the C swift-cmark library (the reference CommonMark parser) with Swift-native types. Used for TOC generation, heading adjustment, and format conversion.
Yams
Yams is the standard YAML library in the Swift ecosystem. Mature, widely used (SwiftLint depends on it), and critically, it exposes the Node type that preserves YAML structure. The frontmatter pipeline is: raw string → Yams.compose() → Node.Mapping.
swift-argument-parser
Apple’s swift-argument-parser is the standard choice for Swift CLIs. It supports async commands (AsyncParsableCommand), which md-utils needs for operations that go through the async MarkdownSyntax parser.
PathKit
PathKit provides a clean Path type for file system operations. Used throughout the CLI for batch processing — directory traversal, extension filtering, path resolution.
jmespath.swift
jmespath.swift implements JMESPath, a query language for JSON. Powers the fm search command, which lets you find files whose frontmatter matches a query expression. This is a CLI-only dependency — the library doesn’t depend on it.
How Text is Parsed
md-utils has multiple parsing layers, each handling a different level of the document:
Layer 1: FrontMatter Separation
FrontMatterParser (a swift-parsing parser) splits the raw document into two strings: the YAML frontmatter content (between --- delimiters) and the body (everything after).
```
---            ← opening delimiter
title: Hello   ← raw frontmatter string
tags: [a, b]
---            ← closing delimiter
# Body         ← body string
Content here
```
If there’s no opening ---, the entire document is body and frontmatter is empty.
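That splitting behavior can be sketched in plain Swift — a simplified, stdlib-only stand-in for the real swift-parsing FrontMatterParser, covering only the common case:

```swift
import Foundation

// Simplified sketch of Layer 1 — not the actual FrontMatterParser.
// Splits a document into the raw YAML string and the body.
func splitFrontMatter(_ content: String) -> (frontMatter: String, body: String) {
    // No opening delimiter: the whole document is body.
    guard content.hasPrefix("---\n") else { return ("", content) }
    let afterOpen = content.dropFirst(4)
    // Find the closing delimiter at the start of a line.
    guard let close = afterOpen.range(of: "\n---") else { return ("", content) }
    let frontMatter = String(afterOpen[..<close.lowerBound])
    var body = String(afterOpen[close.upperBound...])
    if body.hasPrefix("\n") { body.removeFirst() }
    return (frontMatter, body)
}
```

The real parser also has to deal with cases this sketch ignores, such as an empty frontmatter block or stray delimiter-like text inside the body.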
Layer 2: YAML Parsing
YAMLConversion passes the raw frontmatter string to Yams.compose(), which returns a Yams.Node. We then validate it’s a .mapping (not a scalar or sequence — frontmatter should always be key-value pairs) and extract the Node.Mapping.
Layer 3: Wikilink Parsing
WikilinkParser is another swift-parsing combinator that handles the full Obsidian wikilink grammar:
- [[target]] — basic link
- [[target|display text]] — aliased link
- [[page#heading]] — heading anchor
- [[page#^blockID]] — block reference
- ![[embed]] — embedded content
- Escaped pipes (\|) in targets
WikilinkScanner uses this parser to find all wikilinks in a string. WikilinkResolver takes a vault root directory, builds a file index, and resolves each wikilink target against it — detecting broken links (no match) and ambiguous links (multiple matches).
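For a sense of what scanning involves, here is a deliberately simplified regex-based sketch — not the actual WikilinkParser — covering only the basic and aliased forms:

```swift
// Simplified sketch, not the actual WikilinkParser. One regex covers
// [[target]] and [[target|display]]; the real combinator grammar also
// handles embeds, anchors, and escaped pipes.
struct Wikilink: Equatable {
    let target: String
    let display: String?
}

func scanWikilinks(in text: String) -> [Wikilink] {
    // Group 1: target (no ] or |); group 2: optional display text.
    let pattern = #/\[\[([^\]|]+)(?:\|([^\]]+))?\]\]/#
    return text.matches(of: pattern).map { match in
        Wikilink(target: String(match.1),
                 display: match.2.map { String($0) })
    }
}
```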
Layer 4: Markdown AST
MarkdownSyntax (wrapping swift-cmark) parses the body text into a full AST. This powers TOC generation (extracting heading hierarchy), heading adjustment (modifying heading levels), and format conversion (walking the tree to produce plain text or CSV). AST parsing is async because MarkdownSyntax uses async APIs internally.
Testing with Swift Testing
md-utils has 718 tests, all using Apple’s native Swift Testing framework — not XCTest.
Why Swift Testing
Swift Testing is newer, more expressive, and better aligned with modern Swift. The key advantages for this project:
- #expect and #require macros — cleaner assertion syntax with better failure messages than XCTAssertEqual
- Backtick naming — test functions named with backticks read as documentation
- @Suite grouping — logical test organization without subclassing
- try #require() — safe optional unwrapping that fails the test with a clear message, replacing XCTUnwrap
Test Style
Tests use backtick identifiers for readable names:
```swift
@Suite("FrontMatterParser Tests")
struct FrontMatterParserTests {
    @Test
    func `Parse document with valid frontmatter`() async throws {
        let content = "---\ntitle: Hello\n---\n# Body"
        let doc = try MarkdownDocument(content: content)
        let title = try #require(doc.getValue(forKey: "title"))
        #expect(title == "Hello")
    }

    @Test
    func `Parse document with no frontmatter`() async throws {
        let content = "# Just a heading"
        let doc = try MarkdownDocument(content: content)
        #expect(doc.frontMatter.isEmpty)
        #expect(doc.body == content)
    }
}
```
Every test is marked async throws even if it doesn’t use async operations — this keeps the signature uniform and avoids refactoring when a test later needs to call an async API.
CLI Design Patterns
GlobalOptions via @OptionGroup
Every CLI command includes shared options through a single @OptionGroup:
```swift
struct MyCommand: AsyncParsableCommand {
    @OptionGroup var options: GlobalOptions
    // command-specific arguments...
}
```
GlobalOptions provides paths, recursive, includeHidden, extensions, and noSort. The resolvedPaths() method expands directories, applies filters, and returns the final list of files to process. This means every command gets batch processing for free — point it at a directory and it just works.
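A rough stdlib-only sketch of what such path resolution might look like — the function name and exact behavior here are assumptions, not the actual resolvedPaths() implementation:

```swift
import Foundation

// Sketch: expand directories into Markdown files, optionally
// recursing and skipping hidden entries. Explicit file paths
// pass through untouched.
func resolveMarkdownPaths(_ paths: [String],
                          recursive: Bool = false,
                          includeHidden: Bool = false) -> [String] {
    let fm = FileManager.default
    var results: [String] = []
    for path in paths {
        var isDir: ObjCBool = false
        guard fm.fileExists(atPath: path, isDirectory: &isDir) else { continue }
        guard isDir.boolValue else {
            results.append(path)
            continue
        }
        // Sort for deterministic output across file systems.
        let children = ((try? fm.contentsOfDirectory(atPath: path)) ?? []).sorted()
        for child in children {
            if !includeHidden && child.hasPrefix(".") { continue }
            let full = (path as NSString).appendingPathComponent(child)
            var childIsDir: ObjCBool = false
            _ = fm.fileExists(atPath: full, isDirectory: &childIsDir)
            if childIsDir.boolValue {
                if recursive {
                    results += resolveMarkdownPaths([full], recursive: true,
                                                    includeHidden: includeHidden)
                }
            } else if full.hasSuffix(".md") {
                results.append(full)
            }
        }
    }
    return results
}
```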
Consistent Batch Processing
All commands follow the same pattern: resolve paths, iterate, process each file. Commands that modify files have --in-place flags. Commands that output data handle single-file output (direct) and multi-file output (cat-style headers) automatically.
Stdin Support
Commands accept piped input, so you can compose md-utils with other tools:
```
md-utils extract --name "API" doc.md | md-utils convert to-text
```
Using AI Coding Agents
AI coding agents were one tool in the development workflow for md-utils, alongside the compiler, the test suite, and documentation. I used Claude Code as the primary agent for this project. Other agents in the same category — OpenAI’s Codex, Google’s Gemini Code Assist, OpenCode — are worth experimenting with. This is a rapidly evolving space and no single tool has a lock on it.
What Worked Well
Rapid feature scaffolding. When adding a new command — say fm array append — the agent could generate the AsyncParsableCommand struct, the argument declarations, the run() method, and a full test suite in one pass. The boilerplate-to-logic ratio in CLI commands is high, and agents handle boilerplate well.
Test generation. Given an implementation, the agent could produce comprehensive test cases covering happy paths, edge cases, and error conditions. The 718 tests in the project were largely agent-generated, then reviewed and adjusted.
Exploring unfamiliar APIs. I hadn’t used Point-Free’s swift-parsing library before. The agent could write parser combinators, explain how PrefixUpTo and Parse compose, and generate working parsers faster than I could have by reading docs alone.
Friction Points and How They Were Addressed
Force unwrap (!) habit. The agent consistently used ! to unwrap optionals — dictionary["key"]!, array.first!. In Swift, force unwrapping is a runtime crash waiting to happen. I initially corrected this case by case, but it kept recurring. The fix: I elevated the prohibition to a “Critical Rule” in the project instructions, with explicit forbidden patterns and required alternatives. This mostly solved it, though occasional violations still needed catching.
XCTest vs Swift Testing. The agent defaulted to XCTest patterns: XCTAssertEqual, XCTUnwrap, func testSomething(). Swift Testing uses completely different macros (#expect, try #require) and conventions (backtick naming, @Test attribute). I had to write explicit documentation listing every forbidden XCTest pattern alongside its Swift Testing replacement. Once this was in the project docs, compliance improved significantly.
Context window limits. As the codebase grew, the agent would lose track of project conventions established earlier in the session. A pattern that was corrected in one file would reappear in the next. This is a fundamental constraint of current LLMs — they work within a finite context window. The solution was better project documentation (see “The CLAUDE.md Progressive Disclosure Pattern” below).
Third-Party Skills and Customization
I installed third-party skills to improve the agent’s Swift knowledge:
- Antoine van der Lee’s Swift best practices — a skill that teaches Swift idioms and patterns
- Swift Testing expert — specialized knowledge for the Swift Testing framework
- Swift language reference — the complete Swift Programming Language book as a skill
I also created custom slash commands (like /update-claude-md) for common workflows, and configured a permission allowlist that whitelisted swift build, swift test, and swift run while keeping destructive operations gated behind confirmation prompts.
The CLAUDE.md Progressive Disclosure Pattern
This is a general-purpose technique for managing AI agent context that I developed while building md-utils. It’s applicable to any AI coding agent, not just Claude Code.
The problem: AI agents have finite context windows. If you front-load all your project documentation into the agent’s context, you waste tokens on information that isn’t relevant to the current task. If you provide too little, the agent makes incorrect assumptions.
The solution: progressive disclosure. Structure your project instructions as a small root file that links to detailed topic documents.
md-utils uses this structure:
```
CLAUDE.md                      ← 48 lines, loaded every session
docs/
├── architecture.md            ← project structure, core types, dependencies
├── testing-standards.md       ← Swift Testing conventions, patterns
├── swift-coding-standards.md  ← language rules, forbidden patterns
├── cli-patterns.md            ← command structure, argument parsing
├── development-workflow.md    ← feature process, commit checklist
├── common-use-cases.md        ← CLI examples and recipes
└── release-procedures.md      ← versioning and release process
```
The root CLAUDE.md is intentionally kept to 48 lines. It contains:
- Build and test commands
- The one critical rule (no force unwrapping)
- Links to the seven detail documents
The agent loads CLAUDE.md on every session. It only pulls in detail documents when the task requires them — writing tests triggers reading testing-standards.md, adding a CLI command triggers reading cli-patterns.md and architecture.md. This keeps the context window focused on what matters for the current task.
Why this works better than a single large file:
- Relevance filtering — the agent only loads context it needs
- Maintainability — updating testing conventions means editing one file, not searching through a monolithic document
- Scalability — as the project grows, you add new topic files without bloating the root
- Human readability — the docs are useful for human contributors too, not just agents
This pattern isn’t specific to Claude Code. Any AI coding agent that reads project files can benefit from this structure. The key insight is that the agent’s context window is a resource to be managed, just like memory or CPU. Progressive disclosure is the technique for managing it.
What’s Next
md-utils v0.1.0 is the first public release. The API is not yet stable, and there’s a clear roadmap ahead:
- Link validation — checking URLs and reference links, not just wikilinks
- More format conversions — HTML, RTF, and XML (the converter protocol infrastructure is already in place)
- Markdown flavor validation — CommonMark, GFM, and Obsidian compliance checking
- File metadata writing — the read side is done
- LLM agent skill — exposing md-utils as a tool that AI agents can call directly
On the tooling side, I plan to continue experimenting with different AI coding agents as the space evolves. The workflow patterns I’ve described — progressive disclosure, critical rules, explicit documentation of conventions — are transferable across agents. The specific agent matters less than the discipline of clearly specifying what you want.
The project is open source at github.com/DandyLyons/md-utils.
This was written by Daniel Lyons.
If you'd like to support him, please consider buying him a coffee so he can create more content like this.