Dossier Cloning & Lifecycle Management

DNXT Publisher Suite · Dossier Lifecycle

Stop rebuilding dossiers from scratch for every post-approval change, label update, or eCTD 4.0 migration. DNXT's clone engine performs full structural surgery in under 5 minutes — rewriting hyperlinks, converting to UUID-based addressing, regenerating TOC structures, and delivering a validated, submission-ready dossier before your first coffee.

[Diagram] Source dossier (eCTD 3.2.2: m1/, m2/, m5/, index.xml, util-stf.xml) → Clone Engine (indexing 4,812 nodes · rewriting hyperlinks · UUID generation · SHA-256 checksums · STF → node-ext · TOC transformation · validation sweep) → Target dossier (eCTD 4.0, ✓ Validated)

1-Click · Full Dossier Clone Initiation
100% · Hyperlink Integrity Post-Clone
3.x→4.0 · eCTD Version Conversion
<5 min · Full Dossier Processing vs. 3-Week Manual

Who This Is Built For

Three roles across pharma, biotech, and CROs who spend weeks on dossier duplication work that should take minutes. This feature was designed specifically around their bottlenecks.

📋
Director of Regulatory Operations
Mid-Size Pharma · NDA / MAA Submissions

Every post-approval change (PAC) or label update triggers a three-week sprint: a regulatory associate manually copies the approved eCTD, a second person hunts down every cross-module hyperlink and updates it by hand, and a third person reconciles the TOC before the team can even begin writing the actual change narrative. A broken hyperlink caught at the gateway costs another week. This person owns the SLA to FDA — and right now, their SLA is held hostage by copy-paste work.

  • PAC dossier ready for review in under 5 minutes, not 3 weeks
  • Eliminates the 2-day manual hyperlink audit before every submission
  • Maintains parallel submission variants (US, EU, Canada) from one source without divergence risk
  • Post-clone validation report replaces pre-submission QC checklist
🔬
VP Regulatory Affairs
Biotech · Pipeline Expansion & Line Extensions

A new indication for an approved compound should start from the existing approved dossier — but the existing dossier is in eCTD 3.2.2 and the agency now expects eCTD 4.0. The team has been told that converting is effectively a rebuild, which means the institutional knowledge embedded in five years of submission history gets recreated from scratch, introducing errors that didn't exist before. There's no budget to hire a second regulatory team, and the clinical timeline doesn't flex.

  • Migrates full eCTD 3.x submission history to 4.0 UUID addressing without rebuilding
  • Preserves five years of approved submission lifecycle in the new format
  • Line extension cloning takes the approved parent dossier as a structural starting point
  • Regional adaptation rules apply country-specific TOC requirements automatically
🏢
Head of Regulatory Publishing
CRO · Multi-Client Submission Services

A publishing team servicing six sponsor clients simultaneously cannot afford to dedicate a senior publisher to two weeks of dossier duplication per client PAC request. The current workflow — copy the directory tree, find-and-replace hyperlink paths, manually rebuild STF files, regenerate the TOC — is not just slow; it is the single largest source of errors and rework. When a client has an urgent Type II variation, the answer is currently "three weeks" when it should be "tomorrow."

  • One publisher handles six simultaneous PAC cloning jobs instead of one
  • STF-to-node-extension conversion happens automatically — no manual XML editing
  • Validated clone is handed to the sponsor for content edits, not for structural repair
  • Audit trail on every clone operation satisfies sponsor GxP documentation requirements

How It Works

Seven deterministic stages, each logged and auditable. Here's what actually happens inside the engine when you initiate a clone.

1
Source Analysis
Dossier Indexing & Graph Construction

The engine traverses the source dossier's directory tree and parses every XML manifest, index file, and leaf document to construct an internal document graph — a complete map of every file, its module location, its document type, and every reference it makes to other documents. This step identifies all hyperlink targets, STF node references, and TOC entries before any transformation begins, ensuring the clone operation has a complete picture of the dossier's internal dependency structure. Typical indexing of a 4,000-node eCTD dossier completes in under 90 seconds.
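
For illustration, a minimal sketch of what an indexing pass like this could look like, assuming a simplified dossier where cross-references appear as xlink:href attributes in XML manifests. The DocNode structure and two-pass approach are assumptions for the example, not DNXT's internals.

```python
# Illustrative sketch: building a document graph from an eCTD directory tree.
# Element and attribute handling is simplified; real eCTD manifests are richer.
from dataclasses import dataclass, field
from pathlib import Path
import xml.etree.ElementTree as ET

@dataclass
class DocNode:
    path: Path                                        # location of the leaf document
    module: str                                       # e.g. "m2", "m5"
    references: list = field(default_factory=list)    # outgoing links to other documents

def build_document_graph(root: Path) -> dict[str, DocNode]:
    graph: dict[str, DocNode] = {}
    # Pass 1: index every leaf document by a stable key (here, its relative path).
    for path in root.rglob("*"):
        if path.is_file():
            rel = path.relative_to(root)
            graph[str(rel)] = DocNode(path=path, module=rel.parts[0])
    # Pass 2: parse XML manifests and record cross-references (xlink:href attributes).
    for node in graph.values():
        if node.path.suffix.lower() == ".xml":
            for el in ET.parse(node.path).iter():
                href = el.get("{http://www.w3.org/1999/xlink}href")
                if href:
                    node.references.append(href)
    return graph
```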

2
Structure Clone
Directory Tree Replication & Scope Assignment

Using the document graph, the engine replicates the complete folder hierarchy and document set into a new submission context — applying the target application number, submission type, and sequence number to the structural scaffold. File references in the index manifest are remapped to the new root path, and any submission-context metadata (applicant, drug product, application identifier) is updated from the clone configuration supplied by the user. Documents that differ by submission type — for instance, Module 1 regional cover materials — are flagged for replacement rather than direct copy, so the publisher can immediately see what content requires human attention versus what is structurally equivalent.
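
A rough sketch of the replication step under simplified assumptions; the CloneConfig fields and the Module 1 flagging rule below are hypothetical stand-ins for the clone configuration described above, not the product's actual schema.

```python
# Illustrative sketch: replicating a dossier tree into a new submission context.
import shutil
from dataclasses import dataclass
from pathlib import Path

@dataclass
class CloneConfig:
    application_number: str   # target application identifier
    sequence_number: str      # e.g. "0000" for the first sequence of the clone
    submission_type: str      # original / supplement / variation / line extension

def clone_structure(source_root: Path, target_root: Path, cfg: CloneConfig) -> list[Path]:
    # Replicate the folder hierarchy and leaf documents into the new context.
    shutil.copytree(source_root, target_root)
    # Region- and submission-type-specific content (e.g. Module 1 cover materials)
    # is flagged for replacement rather than silently copied.
    m1 = target_root / "m1"
    flagged = sorted(m1.rglob("*.pdf")) if m1.is_dir() else []
    print(f"Application {cfg.application_number}, sequence {cfg.sequence_number} "
          f"({cfg.submission_type}): {len(flagged)} Module 1 documents flagged for review")
    return flagged
```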

3
Hyperlink Rewriting
Smart Cross-Reference Cleanup Across All Documents

Every hyperlink within PDF documents, XML files, and HTML bookmarks is resolved against the document graph and rewritten to point to the correct target in the new dossier. The engine distinguishes between three types of references: internal links (document A citing document B within the same dossier), external links (citations to published literature, which are preserved as-is), and anchor links (in-document bookmarks, which are recalculated based on the destination document's content). This is not a find-and-replace operation on file paths — the engine resolves link targets semantically, using document identifiers, so a link to "the primary clinical study report" finds the correct document in the new dossier even if its path has changed. Broken link detection runs simultaneously, flagging any reference that cannot be resolved so a publisher can act on it before the dossier reaches the gateway.
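
As a sketch of the idea of identity-based (semantic) resolution rather than path rewriting, assuming each document carries a stable identifier that survives the clone; the paths, IDs, and link table below are made up for the example.

```python
# Illustrative sketch: resolving links by document identity rather than by path string.

def rewrite_links(links, id_by_old_path, new_path_by_id):
    """links: iterable of (source_doc, old_target_path) pairs."""
    resolved, unresolved = [], []
    for source_doc, old_target in links:
        doc_id = id_by_old_path.get(old_target)                  # semantic identity of the target
        new_target = new_path_by_id.get(doc_id) if doc_id else None
        if new_target:
            resolved.append((source_doc, old_target, new_target))
        else:
            unresolved.append((source_doc, old_target))          # flagged for publisher action
    return resolved, unresolved

# Example: the clinical study report keeps its identity even though its path changed.
id_by_old_path = {"m5/53-clin-stud-rep/csr-001.pdf": "CSR-001"}
new_path_by_id = {"CSR-001": "m5/53-clin-stud-rep/535-rep-effic-safety-stud/csr-001.pdf"}
links = [("m2/27-clin-sum/summary.pdf", "m5/53-clin-stud-rep/csr-001.pdf")]
print(rewrite_links(links, id_by_old_path, new_path_by_id))
```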

4
eCTD 4.0 Conversion
UUID Address Generation & SHA-256 Checksum Assignment

For dossiers migrating from eCTD 3.x to eCTD 4.0, the engine converts the legacy node-based addressing scheme — where documents are identified by their position in the CTD folder hierarchy — to the UUID-based leaf-document addressing required by eCTD 4.0 under the ICH M8 specification, which is built on the HL7 Regulated Product Submission (RPS) standard. A globally unique identifier is generated for each leaf document, and a SHA-256 cryptographic checksum is computed from the document's binary content and embedded in the manifest. This checksum serves as the agency's integrity verification mechanism: if a document is altered after the manifest is generated, the checksum check will fail and the submission will be rejected. Because the checksum is computed from actual content rather than metadata, it also catches encoding errors or file corruption that might otherwise go undetected.
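
A minimal sketch of the addressing step, assuming one random UUID and one content-derived SHA-256 checksum per leaf document; the output field names are illustrative, not the eCTD 4.0 manifest schema.

```python
# Illustrative sketch: UUID + content checksum assignment for one leaf document.
import hashlib
import uuid
from pathlib import Path

def address_leaf(path: Path) -> dict:
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):   # stream large PDFs in 1 MB chunks
            digest.update(chunk)
    return {
        "leaf_id": str(uuid.uuid4()),      # RFC 4122 random UUID
        "checksum": digest.hexdigest(),    # integrity value the agency re-computes on receipt
        "checksum_type": "SHA-256",
        "href": path.name,
    }
```

Because the digest is computed from the file bytes rather than any metadata, any post-manifest edit, re-encode, or corruption changes the value and fails verification.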

5
STF Conversion
Study Tagging File Migration to Node Extension Format

Study Tagging Files (STFs) in eCTD 3.x use a flat referencing model tied to the node hierarchy. In eCTD 4.0, the equivalent mechanism is the node extension, which references leaf documents by UUID and supports richer study metadata attributes required by FDA's CDER and CBER review divisions. The clone engine parses each STF, maps its study references to the corresponding UUID-addressed leaf documents in the new dossier, and generates a compliant node extension XML file. Study metadata attributes — study ID, study type, therapeutic area codes — are preserved from the source and validated against the target submission's drug product and indication context. Where a study reference cannot be automatically resolved (for example, a study document that was removed from scope in the new submission), the engine generates a resolver log entry for publisher review.
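
A simplified sketch of that mapping logic, using placeholder element names rather than the actual ICH STF or node-extension schemas; it also shows how an unresolvable study reference would end up in a resolver log for publisher review.

```python
# Illustrative sketch: mapping STF study references onto UUID-addressed entries.
# Element and attribute names are simplified placeholders, not the real schemas.
import xml.etree.ElementTree as ET

def convert_stf(stf_path, uuid_by_old_href):
    root = ET.Element("node-extension")                        # placeholder element name
    unresolved = []                                            # resolver log entries
    for leaf in ET.parse(stf_path).getroot().iter("leaf"):     # placeholder: 3.x file references
        old_href = leaf.get("{http://www.w3.org/1999/xlink}href")
        doc_uuid = uuid_by_old_href.get(old_href)
        if doc_uuid is None:
            unresolved.append(old_href)                        # e.g. study removed from scope
            continue
        entry = ET.SubElement(root, "study-document", {"leaf-id": doc_uuid})
        # Preserve study metadata attributes carried on the source entry.
        for attr in ("study-id", "study-type"):
            if leaf.get(attr):
                entry.set(attr, leaf.get(attr))
    return ET.ElementTree(root), unresolved
```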

6
TOC Transformation
Table of Contents Adaptation to Submission Context & Regional Rules

The TOC is not simply copied — it is regenerated from the new dossier's document set, applying the regional TOC template appropriate to the target agency (FDA, EMA, PMDA, Health Canada, or other ICH regions). Regional rules govern which modules are required, which headings are mandatory, what document naming conventions apply at each TOC level, and how placeholder sections are represented when content is not included in the current submission. The engine applies these rules against the document graph to produce a TOC that is structurally correct for the target submission context before any human opens the dossier. For eCTD 4.0 targets, the TOC is expressed as an XML composition resource aligned to the ICH M8 implementation guide rather than the legacy HTML/XML format.
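
A toy sketch of rule-driven TOC regeneration, reusing the document-graph shape from the indexing sketch above; the regional rule entries are invented for the example and are not agency templates.

```python
# Illustrative sketch: regenerating a TOC from the document graph under a regional rule set.
REGIONAL_RULES = {
    "FDA": {"required_sections": ["m1", "m2", "m3", "m4", "m5"], "placeholder": "Not applicable"},
    "EMA": {"required_sections": ["m1", "m2", "m3", "m4", "m5"], "placeholder": "Not submitted"},
}

def build_toc(graph, region):
    rules = REGIONAL_RULES[region]
    toc = []
    for section in rules["required_sections"]:
        docs = sorted(key for key, node in graph.items() if node.module == section)
        if docs:
            toc.append((section, docs))
        else:
            # Mandatory heading with no content in this submission: placeholder entry.
            toc.append((section, [rules["placeholder"]]))
    return toc
```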

7
Post-Clone Validation
Automated Consistency Sweep & Submission-Ready Certification

Before the cloned dossier is released to the publishing team, the engine runs a full validation sweep using the same rule set applied at the submission gateway — covering FDA's eCTD Technical Conformance Guide, EMA's eCTD specification, PMDA regional requirements, and ICH M8 for eCTD 4.0 targets. The validation checks structural integrity (all referenced files exist at their declared paths), hyperlink resolution (every internal link resolves to a real document and anchor), checksum correctness (SHA-256 values match current file content), STF/node-extension validity, and TOC completeness against the regional rule set. The result is a machine-generated validation report that functions as a pre-submission audit trail — the team knows the dossier is structurally sound before a single content edit begins.
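
A condensed sketch of the kinds of checks such a sweep performs (existence, checksum, link resolution), assuming a manifest of href/checksum entries like the one produced in the addressing sketch; the finding codes are illustrative only.

```python
# Illustrative sketch: a post-clone validation sweep over a cloned dossier.
import hashlib
from pathlib import Path

def validate_dossier(root: Path, manifest: list[dict], links: list[tuple[str, str]]):
    findings = []
    declared = {entry["href"] for entry in manifest}
    for entry in manifest:
        path = root / entry["href"]
        # 1. Structural integrity: every declared file exists at its declared path.
        if not path.is_file():
            findings.append(("MISSING_FILE", entry["href"]))
            continue
        # 2. Checksum correctness: SHA-256 of current content matches the manifest.
        if hashlib.sha256(path.read_bytes()).hexdigest() != entry["checksum"]:
            findings.append(("CHECKSUM_MISMATCH", entry["href"]))
    # 3. Hyperlink resolution: every internal link points at a declared document.
    for source, target in links:
        if target not in declared:
            findings.append(("UNRESOLVED_LINK", f"{source} -> {target}"))
    return findings   # an empty list means the dossier passed this sweep
```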

Six Capabilities That Change the Math on Dossier Lifecycle

Each capability is a distinct engineering component of the clone engine, not a workflow checkbox. Here's what each one does and why it matters for regulatory compliance.

🗂️

Full Dossier Clone

The clone operation copies the complete directory structure, all leaf documents, all XML manifests, and all metadata from a source dossier into a new submission context — initiated with a single action. Unlike a file system copy, the clone engine re-contextualizes the dossier: it updates application identifiers, resets submission sequence numbers, and applies the target submission type (original, supplement, variation, line extension) as a structural parameter that drives every subsequent transformation. The result is a new dossier that shares the source's document content but is independently addressed and can be submitted separately without creating cross-contamination between submission sequences.

🔗

Smart Hyperlink Cleanup

Hyperlinks inside regulatory dossiers are notoriously fragile: a renamed folder, a moved document, or a change in submission root path can silently invalidate hundreds of cross-references. The clone engine resolves every link semantically against the document graph rather than rewriting string paths, which means links survive structural changes — such as moving a study report from one CTD section to another — that would break a path-based rewriter. The engine produces a link resolution report categorizing every hyperlink as Resolved, Preserved (external), or Unresolved (requires publisher action), giving the team a precise work order for the content editing phase rather than an open-ended hunt.
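
A minimal sketch of the three-way classification behind that report; the URL-prefix heuristic for spotting external citations is an assumption made for the example.

```python
# Illustrative sketch: categorizing each hyperlink for the link resolution report.
from enum import Enum

class LinkStatus(Enum):
    RESOLVED = "Resolved"        # internal link rewritten to its new target
    PRESERVED = "Preserved"      # external citation left untouched
    UNRESOLVED = "Unresolved"    # requires publisher action

def classify_link(target: str, known_doc_ids: set[str]) -> LinkStatus:
    if target.startswith(("http://", "https://", "doi:")):
        return LinkStatus.PRESERVED
    return LinkStatus.RESOLVED if target in known_doc_ids else LinkStatus.UNRESOLVED
```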

🔄

eCTD 4.0 Conversion

Converting an existing eCTD 3.x dossier to eCTD 4.0 has historically required rebuilding the submission from scratch because the two versions use incompatible document addressing schemes: node-based hierarchical identifiers in 3.x versus UUID-based leaf addressing in 4.0. The clone engine performs this conversion as an automated transformation step, generating RFC 4122-compliant UUIDs for each leaf document and computing SHA-256 checksums from actual file content. The conversion preserves the complete submission history — all prior sequences, all approved documents — in the 4.0 format, so teams migrating to eCTD 4.0 do not lose the institutional record embedded in years of approved submissions.

🌍

Regional Adaptation

Regulatory requirements for dossier structure differ by region: FDA's Module 1 requirements, EMA's cover page and application form specifications, PMDA's Japanese regional document requirements, and Health Canada's administrative requirements are all distinct. When cloning a dossier from one regional submission to another — for example, creating a European MAA from an approved US NDA — the engine applies the target region's structural rules, replaces region-specific Module 1 templates with the correct regional versions, and flags documents that require translation or regional reformatting. The regulatory team receives a structurally correct regional dossier with a clear, itemized list of content adaptations required — not a globally structured dossier that must be manually reconfigured.
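
A toy sketch of how target-region rules might drive Module 1 replacement flags during a cross-region clone; the rule table is invented for illustration and is not a regional template.

```python
# Illustrative sketch: flagging region-specific content when cloning across regions.
REGION_M1 = {
    "FDA": {"template": "us-regional.xml"},
    "EMA": {"template": "eu-regional.xml"},
}

def adapt_region(docs: list[str], target_region: str):
    rules = REGION_M1[target_region]
    actions = []
    for doc in docs:
        if doc.startswith("m1/"):
            # Regional Module 1 content is never carried across regions as-is.
            actions.append((doc, f"replace with {target_region} template {rules['template']}"))
        else:
            actions.append((doc, "copy; review for translation or regional reformatting"))
    return actions
```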

📑

TOC Transformation Engine

The Table of Contents in a regulatory dossier is not cosmetic — it is a structural declaration to the reviewing agency of what content is included, where it lives, and how it relates to the CTD hierarchy. The TOC transformation engine regenerates the TOC from the cloned document set rather than copying the source TOC, which prevents phantom entries (TOC items pointing to documents that weren't included in the clone) and missing entries (documents in the new dossier not reflected in the TOC). For eCTD 4.0 targets, the TOC is generated as an ICH M8-compliant XML composition resource with the correct namespace declarations, section coding, and leaf-document references required by current agency validators.

Post-Clone Validation

Every clone operation concludes with an automated validation sweep that runs the completed dossier through the same published eCTD validation criteria enforced by the FDA, EMA, and PMDA gateways — before any human reviews the output. Validation covers structural conformance, hyperlink integrity, checksum verification, STF/node-extension validity, and regional TOC completeness. The machine-generated validation report is timestamped and signed, creating an auditable record that the dossier was structurally verified at a specific point in the workflow. This replaces the manual pre-submission QC review that typically requires a senior publisher and 1-2 days — and eliminates the category of errors that only surfaces at the agency gateway after submission.

DNXT vs. The Alternative

Most regulatory publishing platforms treat dossier cloning as a file management problem. It isn't — it's a document graph transformation problem. Here's how DNXT compares to the tools teams are actually using today.

One-click full dossier clone
  DNXT Publisher Suite: ✓ Yes — single action, all modules
  Veeva Vault RIM: ⚠ Document copy via workflow; no structural surgery
  LORENZ docuBridge: ⚠ Package duplication; manual path update required
  Manual Process: ✗ 2-3 week manual effort

Semantic hyperlink rewriting
  DNXT Publisher Suite: ✓ Graph-resolved, not path string replacement
  Veeva Vault RIM: ✗ Links break on copy; manual fix required
  LORENZ docuBridge: ⚠ Path-based rewrite only; semantic resolution absent
  Manual Process: ✗ Manual find-and-replace across thousands of PDFs

eCTD 3.x → 4.0 automated conversion
  DNXT Publisher Suite: ✓ Full UUID generation, SHA-256 checksums, manifest rebuild
  Veeva Vault RIM: ✗ No automated conversion; requires rebuild in Vault
  LORENZ docuBridge: ⚠ Partial: structure converts but UUID addressing not automated
  Manual Process: ✗ Rebuild from scratch; months of effort

STF to node extension conversion
  DNXT Publisher Suite: ✓ Automated; study metadata preserved and validated
  Veeva Vault RIM: ✗ Not supported; STF files must be manually rebuilt
  LORENZ docuBridge: ⚠ Supported in newer versions; study metadata mapping is manual