Architecture Decisions

This page documents every significant architecture decision made during the development of the GoFHIR Validator. We follow the Architecture Decision Record (ADR) format: each decision captures the context (why the decision was needed), the decision itself (what was chosen), and the consequences (trade-offs and implications).

ADR Summary

ADR | Title | Status
ADR-001 | StructureDefinitions as Single Source of Truth | Accepted
ADR-002 | Pipeline Architecture with Phases | Accepted
ADR-003 | Small Interfaces (1-2 methods) | Accepted
ADR-004 | Functional Options for Configuration | Accepted
ADR-005 | Use gofhir/fhirpath for FHIRPath Evaluation | Accepted
ADR-006 | Use gofhir/fhir for Typed Structs | Accepted
ADR-007 | Automatic Package Loading by Version | Accepted
ADR-008 | OperationOutcome with Location Extensions | Accepted
ADR-009 | Incremental Development by Milestones | Accepted
ADR-010 | Centralized Message Catalog | Accepted
ADR-011 | Lightweight Registry Structs | Accepted
ADR-012 | BackboneElement Uses Root SD | Accepted
ADR-013 | Choice Types with Case-Insensitive Comparison | Accepted
ADR-014 | Primitive Types from SD Regex | Accepted
ADR-015 | FHIRPath Type Codes Handling | Accepted
ADR-016 | Three Independent Validation Phases | Accepted
ADR-017 | JSON Type Validation Before Regex | Accepted
ADR-018 | Recursive Complex Type Validation | Accepted
ADR-019 | StructureDefinition Caching Strategy | Proposed
ADR-020 | Validation Parallelism | Proposed
ADR-021 | Snapshot Generation | Proposed

Core Architecture (ADR-001 through ADR-004)

ADR-001: StructureDefinitions as Single Source of Truth

Context: FHIR defines hundreds of resource types, profiles, and extensions. Hardcoding validation rules for each one would be unmaintainable and version-specific.

Decision: All validation rules are derived at runtime from StructureDefinition snapshots. The validator reads ElementDefinition entries to determine cardinality, types, bindings, constraints, and every other rule. No resource-specific logic exists in the codebase.

Consequences: The validator supports any FHIR version or custom profile without code changes. The trade-off is that a valid StructureDefinition snapshot must be available for every type being validated.
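
As a rough illustration of rules coming from data rather than from resource-specific code, the sketch below checks cardinality against a trimmed-down element definition. The ElementDef struct and CheckCardinality function are hypothetical names invented for this example, not the validator's actual types.

```go
package main

import (
	"fmt"
	"strconv"
)

// ElementDef is a hypothetical, trimmed-down view of one snapshot ElementDefinition.
type ElementDef struct {
	Path string
	Min  int
	Max  string // "*" means unbounded; otherwise a decimal count
}

// CheckCardinality returns a diagnostic when count falls outside min..max,
// or an empty string when the count is acceptable.
func CheckCardinality(def ElementDef, count int) string {
	if count < def.Min {
		return fmt.Sprintf("%s: expected at least %d, found %d", def.Path, def.Min, count)
	}
	if def.Max == "*" {
		return ""
	}
	limit, err := strconv.Atoi(def.Max)
	if err != nil {
		return fmt.Sprintf("%s: unreadable max %q", def.Path, def.Max)
	}
	if count > limit {
		return fmt.Sprintf("%s: expected at most %d, found %d", def.Path, limit, count)
	}
	return ""
}

func main() {
	// The rule is carried by the definition; nothing here is Patient-specific.
	gender := ElementDef{Path: "Patient.gender", Min: 0, Max: "1"}
	fmt.Println(CheckCardinality(gender, 2)) // prints "Patient.gender: expected at most 1, found 2"
}
```

Swapping in a different profile only changes the ElementDef values that are loaded, which is the property this ADR relies on.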

ADR-002: Pipeline Architecture with Phases

Context: Validation involves many different categories of checks (structural, cardinality, terminology, FHIRPath constraints, etc.). Mixing them into a single pass would create tangled, hard-to-test code.

Decision: Validation is organized as a sequential pipeline of independent phases. Each phase is a separate package (pkg/structural, pkg/cardinality, etc.) with its own Validator type. The orchestrator in pkg/validator runs them in order and merges the results.

Consequences: Each phase can be developed, tested, and debugged in isolation. New phases can be added without modifying existing ones. The trade-off is that the resource tree is traversed multiple times (once per phase), but the clarity and maintainability benefits outweigh the performance cost.
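
A minimal sketch of the orchestration described above, assuming a one-method Phase interface. The Issue, Phase, PhaseFunc, and RunPipeline names are illustrative assumptions and may not match the real pkg/validator API.

```go
package main

import "fmt"

// Issue and Phase are hypothetical stand-ins for the project's real types.
type Issue struct {
	Phase   string
	Message string
}

type Phase interface {
	Validate(resource map[string]any) []Issue
}

// PhaseFunc adapts a plain function to the Phase interface.
type PhaseFunc func(resource map[string]any) []Issue

func (f PhaseFunc) Validate(resource map[string]any) []Issue { return f(resource) }

// RunPipeline runs every phase in order and merges their issues.
// Each phase traverses the resource on its own, so there is one pass per phase.
func RunPipeline(resource map[string]any, phases ...Phase) []Issue {
	var all []Issue
	for _, p := range phases {
		all = append(all, p.Validate(resource)...)
	}
	return all
}

func main() {
	structural := PhaseFunc(func(r map[string]any) []Issue {
		if _, ok := r["resourceType"]; !ok {
			return []Issue{{Phase: "structural", Message: "missing resourceType"}}
		}
		return nil
	})
	cardinality := PhaseFunc(func(r map[string]any) []Issue {
		if _, ok := r["status"]; !ok {
			return []Issue{{Phase: "cardinality", Message: "status: expected at least 1, found 0"}}
		}
		return nil
	})
	for _, is := range RunPipeline(map[string]any{"id": "x"}, structural, cardinality) {
		fmt.Printf("[%s] %s\n", is.Phase, is.Message)
	}
}
```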

ADR-003: Small Interfaces (1-2 methods)

Context: The validator needs extension points for terminology services, profile resolution, and logging. Large interfaces create tight coupling and are difficult to implement.

Decision: All interfaces in the project have at most two methods. Key examples: terminology.Provider (2 methods), registry.ProfileResolver (1 method). This follows Go’s standard library convention of small, composable interfaces.

Consequences: Interfaces are trivial to mock for testing and easy to wrap with middleware (caching, logging, metrics). Consumers only implement what they need. The trade-off is that complex operations may require combining multiple small interfaces rather than calling a single large one.
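
To show why a one-method interface is easy to wrap, here is a sketch of a caching decorator. Resolver is a hypothetical interface in the spirit of registry.ProfileResolver; the real method name and signature are not given on this page and may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// Resolver is a hypothetical one-method interface; the real signature may differ.
type Resolver interface {
	Resolve(url string) (string, error)
}

// ResolverFunc lets a plain function satisfy Resolver, which also makes mocking trivial.
type ResolverFunc func(url string) (string, error)

func (f ResolverFunc) Resolve(url string) (string, error) { return f(url) }

// cachingResolver wraps any Resolver and remembers successful lookups.
// It is not safe for concurrent use; a real wrapper would add a mutex.
type cachingResolver struct {
	next  Resolver
	cache map[string]string
}

// WithCache returns a Resolver that consults the cache before calling next.
func WithCache(next Resolver) Resolver {
	return &cachingResolver{next: next, cache: make(map[string]string)}
}

func (c *cachingResolver) Resolve(url string) (string, error) {
	if v, ok := c.cache[url]; ok {
		return v, nil
	}
	v, err := c.next.Resolve(url)
	if err != nil {
		return "", err
	}
	c.cache[url] = v
	return v, nil
}

func main() {
	calls := 0
	backend := ResolverFunc(func(url string) (string, error) {
		calls++
		if url == "" {
			return "", errors.New("empty url")
		}
		return "profile for " + url, nil
	})
	r := WithCache(backend)
	r.Resolve("http://example.org/StructureDefinition/demo")
	r.Resolve("http://example.org/StructureDefinition/demo")
	fmt.Println("backend calls:", calls) // prints "backend calls: 1"
}
```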

ADR-004: Functional Options for Configuration

Context: The validator needs configurable behavior (terminology provider, logger, disabled phases, etc.) but must remain simple to use with sensible defaults.

Decision: Use the functional options pattern (With* functions returning Option closures) for all construction-time configuration. validator.New() works with zero arguments, and options are added as needed.

Consequences: The API is clean, discoverable, and backward-compatible. Adding new configuration options never changes the New() function signature. The trade-off is a slightly higher initial learning curve for users unfamiliar with the pattern.
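
The pattern can be sketched as follows. The Validator fields and the WithMaxErrors and WithDisabledPhase options are invented for illustration; only the New() entry point and the With* naming convention come from the ADR.

```go
package main

import "fmt"

// Validator and every With* option below are hypothetical illustrations.
type Validator struct {
	maxErrors      int
	disabledPhases map[string]bool
}

// Option mutates a Validator during construction.
type Option func(*Validator)

// WithMaxErrors caps how many issues are collected.
func WithMaxErrors(n int) Option {
	return func(v *Validator) { v.maxErrors = n }
}

// WithDisabledPhase switches off one named phase.
func WithDisabledPhase(name string) Option {
	return func(v *Validator) { v.disabledPhases[name] = true }
}

// New applies defaults first, then each option in order, so New() alone is usable.
func New(opts ...Option) *Validator {
	v := &Validator{maxErrors: 100, disabledPhases: map[string]bool{}}
	for _, opt := range opts {
		opt(v)
	}
	return v
}

func main() {
	def := New()
	custom := New(WithMaxErrors(10), WithDisabledPhase("terminology"))
	fmt.Println(def.maxErrors, custom.maxErrors, custom.disabledPhases["terminology"]) // prints "100 10 true"
}
```

Because New is variadic, adding another With* function later does not break existing callers.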


Dependencies (ADR-005 through ADR-006)

ADR-005: Use gofhir/fhirpath for FHIRPath Evaluation

Context: FHIR constraints (ElementDefinition.constraint) are expressed as FHIRPath expressions. Evaluating them requires a FHIRPath engine capable of handling FHIR-specific functions like resolve() and memberOf().

Decision: Use the gofhir/fhirpath library as the FHIRPath evaluation engine. It supports the FHIR-specific function set and integrates with Go’s type system.

Consequences: Constraint evaluation is fully functional with support for resolve(), memberOf(), and context-aware evaluation. The dependency is well-scoped and can be replaced if needed since it is accessed through an internal interface.

ADR-006: Use gofhir/fhir for Typed Structs

Context: The validator needs to represent FHIR data types (StructureDefinition, ElementDefinition, OperationOutcome, etc.) in Go code. Hand-writing these structs is error-prone and hard to keep in sync with the specification.

Decision: Use the gofhir/fhir library for typed Go structs that mirror FHIR resource definitions. These structs are used for loading StructureDefinitions and producing OperationOutcome results.

Consequences: Type safety at compile time and automatic JSON serialization/deserialization. The dependency is limited to data types and does not affect the validation logic itself.


Infrastructure (ADR-007 through ADR-011)

ADR-007: Automatic Package Loading by Version

Context: Users need to validate resources against the correct FHIR version (R4, R4B, R5). Requiring manual configuration of base definitions creates friction.

Decision: The validator automatically loads the appropriate FHIR specification package based on the declared version. FHIR R4, R4B, and R5 base definitions are embedded in the binary via pkg/specs, eliminating external file dependencies.

Consequences: Zero-configuration validation works out of the box. The binary is self-contained with no need to download specification files. The trade-off is increased binary size due to embedded specifications.

ADR-008: OperationOutcome with Location Extensions

Context: FHIR’s OperationOutcome provides expression (FHIRPath) and location (XPath) fields, but neither captures the exact JSON position (line and column) of an error.

Decision: Extend OperationOutcome.issue with custom extensions that carry source location information (line number, column number) from the original JSON input.

Consequences: Users get precise error locations for debugging. The output remains a valid FHIR OperationOutcome since extensions are part of the FHIR extensibility model.
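
A sketch of what such an issue could look like when serialized. The extension URLs below are placeholders under example.org, since the project's actual URLs are not stated here, and the structs are simplified stand-ins rather than the gofhir/fhir types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Placeholder extension URLs; the project's real URLs are not given on this page.
const (
	lineURL   = "https://example.org/fhir/StructureDefinition/source-line"
	columnURL = "https://example.org/fhir/StructureDefinition/source-column"
)

type Extension struct {
	URL          string `json:"url"`
	ValueInteger int    `json:"valueInteger"`
}

// Issue is a simplified OperationOutcome.issue with only the fields used here.
type Issue struct {
	Extension   []Extension `json:"extension,omitempty"`
	Severity    string      `json:"severity"`
	Code        string      `json:"code"`
	Diagnostics string      `json:"diagnostics,omitempty"`
	Expression  []string    `json:"expression,omitempty"`
}

type OperationOutcome struct {
	ResourceType string  `json:"resourceType"`
	Issue        []Issue `json:"issue"`
}

// NewLocatedIssue attaches line and column as two integer-valued extensions.
func NewLocatedIssue(severity, code, diagnostics, expr string, line, col int) Issue {
	return Issue{
		Extension: []Extension{
			{URL: lineURL, ValueInteger: line},
			{URL: columnURL, ValueInteger: col},
		},
		Severity:    severity,
		Code:        code,
		Diagnostics: diagnostics,
		Expression:  []string{expr},
	}
}

func main() {
	oo := OperationOutcome{
		ResourceType: "OperationOutcome",
		Issue: []Issue{
			NewLocatedIssue("error", "invalid", "expected string, got number", "Patient.birthDate", 12, 18),
		},
	}
	out, err := json.MarshalIndent(oo, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```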

ADR-009: Incremental Development by Milestones

Context: Building a complete FHIR validator is a large undertaking. Attempting everything at once would delay delivery and make testing difficult.

Decision: Develop the validator incrementally across milestones M0 through M15, following the FHIR type hierarchy from infrastructure (M0) through structural validation (M1) to advanced features like slicing (M12) and performance optimization (M15).

Consequences: Each milestone delivers testable, working functionality. The approach enabled continuous integration testing and early feedback. See the Milestones page for the complete roadmap.

ADR-010: Centralized Message Catalog

Context: Validation produces diagnostic messages that should be consistent, translatable, and easy to maintain. Scattering message strings across phase implementations makes them hard to keep consistent.

Decision: All diagnostic messages are defined in a centralized message catalog in the pkg/issue package. Each message has an identifier and a template string. Phases reference messages by identifier rather than constructing strings inline.

Consequences: Messages are consistent across phases, easy to review, and could be translated in the future. The catalog also serves as documentation of all possible validation outputs.
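
A small sketch of the idea, assuming printf-style templates. The identifiers, templates, and the Format helper are invented for illustration and are not taken from pkg/issue.

```go
package main

import "fmt"

// ID identifies a catalog entry. Identifiers and templates here are hypothetical.
type ID string

const (
	CardinalityMin ID = "CARDINALITY_MIN"
	TypeMismatch   ID = "TYPE_MISMATCH"
)

// catalog is the single place where message wording lives.
var catalog = map[ID]string{
	CardinalityMin: "%s: expected at least %d, found %d",
	TypeMismatch:   "%s: expected %s, got %s",
}

// Format renders the template for id. Unknown ids yield a visible marker
// instead of a panic, so a missing entry is easy to spot in output.
func Format(id ID, args ...any) string {
	tmpl, ok := catalog[id]
	if !ok {
		return fmt.Sprintf("unknown message id %q", string(id))
	}
	return fmt.Sprintf(tmpl, args...)
}

func main() {
	// A phase references the message by identifier, never by inline string.
	fmt.Println(Format(TypeMismatch, "Patient.birthDate", "string", "number"))
	// prints "Patient.birthDate: expected string, got number"
}
```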

ADR-011: Lightweight Registry Structs

Context: The full FHIR StructureDefinition type is large and includes fields not needed during validation (e.g., narrative, publisher metadata). Loading thousands of full SDs into memory is wasteful.

Decision: The pkg/registry package uses lightweight internal structs that extract only the fields needed for validation (snapshot elements, type information, binding details). Full SDs are parsed during loading and then discarded.

Consequences: Significantly reduced memory usage when loading large Implementation Guides with hundreds of profiles. The trade-off is a transformation step during loading, but this happens once at startup.
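
One way to get a similar effect in Go is to decode into a struct that declares only the needed fields, since encoding/json ignores everything else. This is a variation for illustration: the ADR describes parsing the full SD and then extracting, and the slimSD and slimElement types below are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// slimElement and slimSD are hypothetical; they keep a few validation-relevant fields.
type slimElement struct {
	Path string `json:"path"`
	Min  int    `json:"min"`
	Max  string `json:"max"`
}

type slimSD struct {
	URL      string `json:"url"`
	Type     string `json:"type"`
	Snapshot struct {
		Element []slimElement `json:"element"`
	} `json:"snapshot"`
}

// LoadSlim decodes raw StructureDefinition JSON, silently dropping fields
// such as text and publisher that slimSD does not declare.
func LoadSlim(raw []byte) (*slimSD, error) {
	var sd slimSD
	if err := json.Unmarshal(raw, &sd); err != nil {
		return nil, err
	}
	return &sd, nil
}

func main() {
	raw := []byte(`{
		"resourceType": "StructureDefinition",
		"url": "http://example.org/StructureDefinition/demo",
		"type": "Patient",
		"publisher": "Example Org",
		"text": {"status": "generated", "div": "<div>...</div>"},
		"snapshot": {"element": [
			{"path": "Patient", "min": 0, "max": "*"},
			{"path": "Patient.name", "min": 0, "max": "*", "short": "A name"}
		]}
	}`)
	sd, err := LoadSlim(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(sd.Type, len(sd.Snapshot.Element)) // prints "Patient 2"
}
```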


Validation Logic (ADR-012 through ADR-018)

ADR-012: BackboneElement Uses Root SD

Context: BackboneElement types are defined inline within a resource’s StructureDefinition rather than as standalone types. The validator needs to locate the correct ElementDefinitions for these nested structures.

Decision: When validating BackboneElement children, the validator navigates the root resource’s StructureDefinition rather than looking up a separate SD. Element paths like Patient.contact.name are resolved within the Patient SD.

Consequences: BackboneElement validation works correctly without requiring separate SDs for inline types. The walker maintains the full element path context to enable this navigation.
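
The navigation can be pictured as a lookup by path prefix inside the root resource's element list. The ChildrenOf helper below is a hypothetical sketch that works on bare path strings rather than the registry's real element structs.

```go
package main

import (
	"fmt"
	"strings"
)

// ChildrenOf returns the snapshot paths that are direct children of parent,
// preserving snapshot order. paths stands in for the element list of a root SD.
func ChildrenOf(paths []string, parent string) []string {
	prefix := parent + "."
	var out []string
	for _, p := range paths {
		if !strings.HasPrefix(p, prefix) {
			continue
		}
		// Keep only direct children: nothing deeper than one extra segment.
		if rest := p[len(prefix):]; !strings.Contains(rest, ".") {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	patient := []string{
		"Patient", "Patient.name", "Patient.contact",
		"Patient.contact.relationship", "Patient.contact.name", "Patient.contact.telecom",
	}
	// The BackboneElement's children are found in the Patient SD itself.
	fmt.Println(ChildrenOf(patient, "Patient.contact"))
	// prints "[Patient.contact.relationship Patient.contact.name Patient.contact.telecom]"
}
```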

ADR-013: Choice Types with Case-Insensitive Comparison

Context: FHIR choice types (e.g., value[x]) can appear in JSON as valueString, valueQuantity, etc. The type suffix must be matched against allowed types from the ElementDefinition.

Decision: Use case-insensitive comparison when matching the type suffix in a choice element name against the declared ElementDefinition.type entries. This handles edge cases like valueUri vs valueURI.

Consequences: Robust matching of choice types that handles all capitalization variants defined in the FHIR specification.

ADR-014: Primitive Types from SD Regex

Context: FHIR primitive types (date, dateTime, uri, code, etc.) have format rules expressed as regular expressions in their StructureDefinitions.

Decision: Derive all primitive validation patterns from the regex extension in the primitive type’s StructureDefinition rather than hardcoding patterns. The regex is read from ElementDefinition.type and compiled once.

Consequences: Primitive validation is version-agnostic: the patterns come from whichever FHIR version's StructureDefinitions are loaded. Compiled regexes are cached to avoid repeated parsing.
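
The compile-once behavior could look roughly like this. PatternCache is a hypothetical type, the anchoring with ^(?:...)$ is an assumption about how whole-value matching is enforced, and the date pattern in main is a simplified illustration rather than the pattern published in any FHIR version.

```go
package main

import (
	"fmt"
	"regexp"
	"sync"
)

// PatternCache compiles each distinct pattern string at most once.
type PatternCache struct {
	mu       sync.Mutex
	compiled map[string]*regexp.Regexp
}

func NewPatternCache() *PatternCache {
	return &PatternCache{compiled: map[string]*regexp.Regexp{}}
}

// Get anchors the pattern so the whole value must match, then caches the result.
func (c *PatternCache) Get(pattern string) (*regexp.Regexp, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if re, ok := c.compiled[pattern]; ok {
		return re, nil
	}
	re, err := regexp.Compile("^(?:" + pattern + ")$")
	if err != nil {
		return nil, err
	}
	c.compiled[pattern] = re
	return re, nil
}

func main() {
	cache := NewPatternCache()
	// Simplified date-like pattern, for illustration only.
	datePattern := `[0-9]{4}(-(0[1-9]|1[0-2])(-(0[1-9]|[12][0-9]|3[01]))?)?`
	re, err := cache.Get(datePattern)
	if err != nil {
		panic(err)
	}
	fmt.Println(re.MatchString("2024-02-29"), re.MatchString("2024-13-01")) // prints "true false"
}
```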

ADR-015: FHIRPath Type Codes Handling

Context: FHIRPath expressions in constraints reference FHIR type codes (e.g., Patient, HumanName). The FHIRPath engine needs to resolve these to actual type information.

Decision: Type codes encountered during FHIRPath evaluation are resolved through the registry, mapping them to their corresponding StructureDefinitions. This enables the FHIRPath engine to navigate type hierarchies correctly.

Consequences: FHIRPath constraints that reference types work correctly. The registry serves as the single source of type information for both validation and FHIRPath evaluation.

ADR-016: Three Independent Validation Phases

Context: Early designs considered a single-pass validation approach. However, some validation categories have natural dependencies (e.g., structural validation should run before cardinality).

Decision: The full pipeline has 9 phases, which are logically grouped into three stages: structural validation (phases 1-3), semantic validation (phases 4-9), and cross-resource validation. The stages are implemented independently but run in order, so each stage builds on the confidence established by the previous one.

Consequences: Fatal structural errors prevent unnecessary semantic validation. The grouping provides a clear mental model for understanding the validation flow.

ADR-017: JSON Type Validation Before Regex

Context: Primitive validation involves two checks: the JSON value must be the correct type (string, number, boolean) and the value must match the format regex. Running regex on a wrong-typed value produces confusing errors.

Decision: Always validate the JSON type first. If the JSON type is wrong (e.g., a number where a string is expected), report that error and skip the regex check. This produces clearer, more actionable diagnostics.

Consequences: Users see the most fundamental error first. A birthDate with a numeric value reports “expected string, got number” rather than a regex mismatch.

ADR-018: Recursive Complex Type Validation

Context: FHIR complex types (HumanName, Address, CodeableConcept, etc.) contain nested elements that must also be validated. The validation must handle arbitrary nesting depth.

Decision: Complex type validation is handled recursively. When the walker encounters a complex type element, it resolves the type’s StructureDefinition and validates the children against it. This recursion naturally handles types nested to any depth.

Consequences: All complex types are validated uniformly regardless of nesting depth. The same code path handles both top-level elements and deeply nested structures within extensions or BackboneElements.
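
A much-simplified sketch of the recursion, assuming a registry that maps a type code to its child element names and their type codes. TypeDef, Registry, and Validate are hypothetical; the real walker works on StructureDefinition snapshots and checks far more than unknown elements.

```go
package main

import (
	"fmt"
	"sort"
)

// TypeDef maps child element names to their type codes (hypothetical simplification).
type TypeDef map[string]string

// Registry maps a complex type code to its definition.
type Registry map[string]TypeDef

// Validate walks value as an instance of typeName and recurses into every
// child whose type is itself a complex type known to the registry.
func Validate(reg Registry, typeName, path string, value any) []string {
	def, isComplex := reg[typeName]
	if !isComplex {
		return nil // primitives are out of scope for this sketch
	}
	obj, ok := value.(map[string]any)
	if !ok {
		return []string{path + ": expected object for " + typeName}
	}
	keys := make([]string, 0, len(obj))
	for k := range obj {
		keys = append(keys, k)
	}
	sort.Strings(keys) // deterministic issue order
	var issues []string
	for _, k := range keys {
		childPath := path + "." + k
		childType, known := def[k]
		if !known {
			issues = append(issues, childPath+": unknown element")
			continue
		}
		if items, isArray := obj[k].([]any); isArray {
			for i, item := range items {
				issues = append(issues, Validate(reg, childType, fmt.Sprintf("%s[%d]", childPath, i), item)...)
			}
			continue
		}
		issues = append(issues, Validate(reg, childType, childPath, obj[k])...)
	}
	return issues
}

func main() {
	reg := Registry{
		"HumanName": {"family": "string", "given": "string", "period": "Period"},
		"Period":    {"start": "dateTime", "end": "dateTime"},
	}
	name := map[string]any{
		"family": "Doe",
		"period": map[string]any{"start": "2020", "finish": "2021"},
	}
	fmt.Println(Validate(reg, "HumanName", "Patient.name[0]", name))
	// prints "[Patient.name[0].period.finish: unknown element]"
}
```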


Proposed (ADR-019 through ADR-021)

The following ADRs are in Proposed status. They document decisions under consideration that have not yet been implemented.

ADR-019: StructureDefinition Caching Strategy

Context: Loading and parsing StructureDefinitions is expensive. In batch validation scenarios, the same SDs are used repeatedly.

Decision (proposed): Implement a multi-level caching strategy with an in-memory LRU cache for parsed SDs and a file-system cache for downloaded packages. Cache invalidation would be based on package version.

Consequences: Expected to significantly improve performance for batch validation and long-running server integrations. The caching layer would sit between the registry and the loader.
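
Since this ADR is only proposed, the following is purely a sketch of what the in-memory level might look like, using container/list from the standard library. The LRU type is hypothetical, values are plain strings standing in for parsed SDs, and the "url|version" key format is an assumption made so that a new package version naturally misses the cache.

```go
package main

import (
	"container/list"
	"fmt"
)

type entry struct {
	key   string
	value string
}

// LRU is a fixed-capacity cache that evicts the least recently used entry.
// It is not safe for concurrent use; capacity is assumed to be at least 1.
type LRU struct {
	capacity int
	order    *list.List // front = most recently used
	items    map[string]*list.Element
}

func NewLRU(capacity int) *LRU {
	return &LRU{capacity: capacity, order: list.New(), items: map[string]*list.Element{}}
}

// Get returns the cached value and marks it as recently used.
func (c *LRU) Get(key string) (string, bool) {
	el, ok := c.items[key]
	if !ok {
		return "", false
	}
	c.order.MoveToFront(el)
	return el.Value.(*entry).value, true
}

// Put stores a value, evicting the oldest entry when capacity is exceeded.
func (c *LRU) Put(key, value string) {
	if el, ok := c.items[key]; ok {
		el.Value.(*entry).value = value
		c.order.MoveToFront(el)
		return
	}
	c.items[key] = c.order.PushFront(&entry{key: key, value: value})
	if c.order.Len() > c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).key)
	}
}

func main() {
	cache := NewLRU(2)
	cache.Put("Patient|4.0.1", "parsed Patient SD")
	cache.Put("Observation|4.0.1", "parsed Observation SD")
	cache.Get("Patient|4.0.1")                      // Patient is now most recent
	cache.Put("Encounter|4.0.1", "parsed Encounter SD") // evicts Observation
	_, ok := cache.Get("Observation|4.0.1")
	fmt.Println(ok) // prints "false"
}
```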

ADR-020: Validation Parallelism

Context: When validating a batch of resources, each resource is independent and could be validated concurrently.

Decision (proposed): Provide an optional parallel validation mode using a worker pool. The number of workers would be configurable via functional options. Each worker would have its own copy of mutable state while sharing read-only StructureDefinitions.

Consequences: Throughput for batch workloads on multi-core systems would be expected to scale close to linearly with the number of workers. The shared-nothing design for mutable state would avoid lock contention.
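
As this ADR is still a proposal, the sketch below only illustrates the general worker-pool shape with sync.WaitGroup and a job channel. ValidateBatch and ValidateFunc are hypothetical names, and resources are plain strings for brevity.

```go
package main

import (
	"fmt"
	"sync"
)

// ValidateFunc is a hypothetical per-resource validator. It must treat any
// shared definitions as read-only and keep its mutable state local.
type ValidateFunc func(resource string) []string

// ValidateBatch validates resources concurrently and returns results in input order.
func ValidateBatch(resources []string, workers int, validate ValidateFunc) [][]string {
	if workers < 1 {
		workers = 1
	}
	results := make([][]string, len(resources))
	jobs := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				// Each index is written by exactly one worker, so no lock is needed.
				results[i] = validate(resources[i])
			}
		}()
	}
	for i := range resources {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	check := func(resource string) []string {
		if resource == "" {
			return []string{"empty resource"}
		}
		return nil
	}
	out := ValidateBatch([]string{"{...}", "", "{...}"}, 2, check)
	fmt.Println(len(out[0]), len(out[1]), len(out[2])) // prints "0 1 0"
}
```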

ADR-021: Snapshot Generation

Context: Some StructureDefinitions only provide a differential, not a full snapshot. The validator requires a snapshot for validation.

Decision (proposed): Implement snapshot generation that merges a profile’s differential with its base definition’s snapshot. The generator would walk baseDefinition chains and apply differential overrides.

Consequences: The validator would be able to work with profiles that lack pre-computed snapshots. This is particularly important for user-authored profiles and Implementation Guides that only ship differentials.
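
Because this ADR is only proposed, the following is a deliberately narrow sketch of the merge idea: differential entries override cardinality on the base snapshot element with the same path. The Element type and ApplyDifferential function are hypothetical, and slicing, newly introduced elements, and walking baseDefinition chains are all left out.

```go
package main

import "fmt"

// Element is a hypothetical, minimal element definition.
// In a differential, a nil Min or empty Max means "not stated".
type Element struct {
	Path string
	Min  *int
	Max  string
}

func intPtr(n int) *int { return &n }

// ApplyDifferential copies the base snapshot and overrides Min/Max wherever
// the differential states a value for the same path.
func ApplyDifferential(base, diff []Element) []Element {
	overrides := make(map[string]Element, len(diff))
	for _, d := range diff {
		overrides[d.Path] = d
	}
	out := make([]Element, 0, len(base))
	for _, b := range base {
		merged := b
		if d, ok := overrides[b.Path]; ok {
			if d.Min != nil {
				merged.Min = d.Min
			}
			if d.Max != "" {
				merged.Max = d.Max
			}
		}
		out = append(out, merged)
	}
	return out
}

func main() {
	base := []Element{
		{Path: "Patient", Min: intPtr(0), Max: "*"},
		{Path: "Patient.identifier", Min: intPtr(0), Max: "*"},
	}
	// The profile only states what changes: identifier becomes mandatory.
	diff := []Element{{Path: "Patient.identifier", Min: intPtr(1)}}
	for _, e := range ApplyDifferential(base, diff) {
		fmt.Printf("%s %d..%s\n", e.Path, *e.Min, e.Max)
	}
}
```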
