Project Name

Unified Fragmented Government Documentation Into a Single Backstage TechDocs Portal on OCI

Industry
Government, Public Sector
Technology
Backstage, TechDocs, Dagger (Python SDK), OCI Object Storage, Git, GitLab, OCI Kubernetes Engine (OKE), OCI DevOps

Overview

The client is a large government agency based in North America, operating across multiple departments with hundreds of engineers, SREs, and IT staff managing a portfolio of OCI-hosted services including OCI Kubernetes Engine workloads, OCI Functions serverless components, and Oracle Database instances.

The agency runs a mixed technology estate built over many years, with documentation scattered across Confluence, SharePoint, individual wikis, email threads, and institutional knowledge held by long-tenured engineers.

With staff turnover increasing, the OCI footprint growing, and pressure mounting to improve incident response times and regulatory auditability, agency leadership mandated a unified documentation strategy. What the agency lacked was both the platform and the automation to make docs-as-code practical at scale across all of its departments.

Key Challenges

Five documentation systems, no single search surface, and institutional knowledge walking out the door with every engineer who left.

  • Critical Institutional Knowledge Fragmented Across Five Systems: Runbooks, architecture decisions, APIs, and operational procedures were split across Confluence, SharePoint, wikis, email threads, and local drives, with no single source of truth and frequent version conflicts.
  • New Engineer Onboarding Taking 4–6 Weeks to Productivity: Engineers spent weeks in knowledge-transfer sessions and manual discovery across systems, slowing ramp-up and compounding delays as the organization scaled.
  • Incident Response Delayed by Runbook Discovery Failures: SREs lost 15–30 minutes per incident locating correct runbooks across disconnected systems, increasing MTTR and impacting SLA adherence.
  • Documentation Perpetually Out of Date: Docs were outside source control and maintained inconsistently, causing drift from actual system behavior and reducing trust in documentation over time.
  • No Service Catalog or Ownership Registry: There was no central registry mapping services to owners, dependencies, or environments, forcing reliance on tribal knowledge during incidents.
  • Regulatory Auditability Gap for Documentation Governance: Compliance requirements for documented, owned, and current operational procedures could not be reliably met due to fragmented and unversioned documentation.
Our Solution

Ksolves, an AI-first DevOps consulting services company, deployed a docs-as-code platform anchored on Backstage TechDocs as the unified documentation portal and Dagger Python SDK as the automated build-and-publish pipeline. The principle was straightforward: all documentation lives in Markdown within GitLab, builds automatically via Dagger on every commit, publishes to OCI Object Storage, and is surfaced in Backstage as a searchable, service-linked resource. No documentation exists outside version control, and no runbook is owned by an individual rather than a team.
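
The build half of that flow can be sketched with the Dagger Python SDK. This is a minimal illustration under stated assumptions, not the pipeline the team actually shipped: the base image, the `/src` mount point, and the function names `mkdocs_build_command` and `build_docs` are hypothetical, and the repository is assumed to carry an `mkdocs.yml` at its root. The upload to OCI Object Storage is left out because the case study does not say how that step is performed.

```python
import sys


def mkdocs_build_command(strict: bool = True) -> list[str]:
    """Return the MkDocs invocation used inside the container.

    --strict makes MkDocs fail the build on warnings, so a broken
    page stops the pipeline instead of being published.
    """
    cmd = ["mkdocs", "build", "--site-dir", "site"]
    if strict:
        cmd.append("--strict")
    return cmd


async def build_docs(repo_dir: str = ".", out_dir: str = "./site") -> None:
    """Build the docs in a container and export the HTML to the host."""
    # Imported lazily so the helper above works without the SDK installed.
    import dagger
    from dagger import dag

    async with dagger.connection(dagger.Config(log_output=sys.stderr)):
        site = (
            dag.container()
            .from_("python:3.12-slim")  # assumed base image
            .with_exec(["pip", "install", "mkdocs-techdocs-core"])
            .with_directory("/src", dag.host().directory(repo_dir))
            .with_workdir("/src")
            .with_exec(mkdocs_build_command())
            .directory("/src/site")
        )
        # Copy the generated HTML out of the container onto the host.
        await site.export(out_dir)
```

Running `asyncio.run(build_docs())` from the repository root would execute the build, given the `dagger-io` package and a container runtime. Keeping the command in a plain function means the strictness policy can be checked without starting a container.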

  • Backstage TechDocs as the Unified Documentation Portal: All runbooks, API docs, and architecture decision records are centralized in Backstage TechDocs, fully indexed and linked to service catalog entries. Engineers can locate any runbook in under 30 seconds through a single search interface.
  • Dagger Python SDK Pipeline - MkDocs to OCI Object Storage: Documentation is built, validated, and published using Dagger on every Git commit. The pipeline generates MkDocs HTML and pushes it to OCI Object Storage, ensuring Backstage always reflects the latest code state and eliminating documentation drift.
  • Backstage Service Catalog With Full OCI Coverage: Every OCI service, including OKE workloads, Functions, and Oracle DB instances, is registered with ownership, dependencies, environment status, runbooks, and live health, replacing tribal knowledge with a single source of truth.
  • Docs-as-Code Migration From Legacy Systems: Confluence, SharePoint, and wiki content was migrated into Markdown within GitLab repositories, aligning documentation ownership with code and enabling version control, reviews, and traceability through merge requests.
  • OCI DevOps Integration for Governed Pipeline Runs: All documentation pipelines run through OCI DevOps, providing full auditability from Git commit to published documentation, ensuring every change is traceable for compliance and governance.
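
To make the catalog and ownership rules above concrete, here is a hedged sketch of a Backstage Component descriptor built as a Python dictionary, plus a check that the owner is a team and that TechDocs is enabled. The descriptor fields (`apiVersion`, `kind`, `spec.owner`, `spec.dependsOn`) and the `backstage.io/techdocs-ref` annotation come from Backstage's descriptor format; the function names, the service name `permits-api`, and the team name `platform-sre` are hypothetical, and the agency's real validation rules are not described in the source.

```python
def component_entity(name: str, owner_team: str, depends_on: tuple[str, ...] = ()) -> dict:
    """Build a Backstage Component descriptor with TechDocs enabled."""
    spec = {
        "type": "service",
        "lifecycle": "production",
        "owner": f"group:{owner_team}",
    }
    if depends_on:
        spec["dependsOn"] = list(depends_on)
    return {
        "apiVersion": "backstage.io/v1alpha1",
        "kind": "Component",
        "metadata": {
            "name": name,
            "annotations": {
                # Tells TechDocs that the docs sources sit next to this descriptor.
                "backstage.io/techdocs-ref": "dir:.",
            },
        },
        "spec": spec,
    }


def ownership_problems(entity: dict) -> list[str]:
    """Return policy violations: owner must be a team, docs must be linked."""
    problems = []
    owner = entity.get("spec", {}).get("owner", "")
    # A bare owner name is resolved as a Group; "user:..." names an individual.
    kind = owner.split(":", 1)[0].lower() if ":" in owner else "group"
    if not owner:
        problems.append("missing spec.owner")
    elif kind != "group":
        problems.append(f"owner {owner!r} is not a team")
    annotations = entity.get("metadata", {}).get("annotations", {})
    if "backstage.io/techdocs-ref" not in annotations:
        problems.append("missing backstage.io/techdocs-ref annotation")
    return problems
```

A check like this could run in a merge request pipeline so that a service cannot be registered with an individual as owner or without linked documentation.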

Technology Stack

Category | Technology
Platform | Backstage + TechDocs
CI/CD | Dagger (Python SDK)
Storage | OCI Object Storage
Source Control | Git / GitLab
Compute | OCI Kubernetes Engine (OKE)
Infrastructure | OCI DevOps
Impact

From five disconnected documentation systems and a 4–6 week onboarding window to a single searchable portal where any runbook is 30 seconds away and every document has an audit trail.

  • New Engineer Onboarding Reduced From 4–6 Weeks to Under 3 Days: All runbooks, API docs, and architecture decisions are discoverable in Backstage within 30 seconds, reducing onboarding from weeks of meetings and system hunting to a few days of self-directed exploration.
  • Incident Runbook Discovery Cut From 15–30 Minutes to Under 1 Minute: Backstage global search retrieves any runbook by service, symptom, or keyword in under 60 seconds, eliminating delays that previously increased MTTR during OCI incidents.
  • Documentation Drift Eliminated With Automated Pipeline: Dagger pipelines rebuild and republish documentation on every Git commit, so the docs in Backstage always match the current state of the code and stale or untrusted pages no longer accumulate.
  • Complete OCI Service Ownership Registry Established: Backstage service catalog centralizes ownership, dependencies, and runbooks for all OKE workloads, Functions, and Oracle DB instances, replacing tribal knowledge and ad-hoc incident coordination.
  • Regulatory Documentation Audit Trail Established for All Services: Every documentation update is traceable to a Git commit, merge request, and OCI DevOps run, delivering a complete compliance audit trail across all production services.
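
One way to picture the audit trail described in the last bullet is a small record that ties a published documentation build to its commit, merge request, and pipeline run. The sketch below is illustrative only: the record fields, the function names, and the idea of storing a content digest are assumptions, not details taken from the agency's OCI DevOps setup.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass(frozen=True)
class DocsAuditRecord:
    """Hypothetical audit entry for one published documentation build."""
    entity: str
    commit_sha: str
    merge_request: str
    pipeline_run: str
    site_digest: str

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)


def read_site(site_dir: Path) -> dict[str, bytes]:
    """Load every file under site_dir, keyed by its POSIX-style relative path."""
    return {
        p.relative_to(site_dir).as_posix(): p.read_bytes()
        for p in site_dir.rglob("*")
        if p.is_file()
    }


def site_digest(files: dict[str, bytes]) -> str:
    """SHA-256 over paths and contents in sorted order, so it is reproducible."""
    h = hashlib.sha256()
    for rel in sorted(files):
        name = rel.encode("utf-8")
        content = files[rel]
        # Length prefixes keep adjacent fields from running into each other.
        h.update(len(name).to_bytes(8, "big") + name)
        h.update(len(content).to_bytes(8, "big") + content)
    return h.hexdigest()


def make_record(files: dict[str, bytes], entity: str, commit_sha: str,
                merge_request: str, pipeline_run: str) -> DocsAuditRecord:
    """Bind a built site to the change and the run that produced it."""
    return DocsAuditRecord(entity, commit_sha, merge_request,
                           pipeline_run, site_digest(files))
```

Because the digest depends only on file paths and contents, an auditor could rebuild the site from the recorded commit and confirm that the published pages match.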
Solution Architecture
[Figure: solution architecture diagram]
Conclusion

A government agency where the critical runbook lives in Confluence, or SharePoint, or in the memory of an engineer who left last quarter is not managing a documentation problem; it is carrying an institutional risk that compounds with every hire, every incident, and every compliance audit. Ksolves resolved it with one architectural decision: documentation belongs in Git, alongside the code, owned by the team, and published automatically on every commit. Every runbook, API doc, and architecture decision is now searchable in Backstage in under 30 seconds, current to the last commit, and traceable to a specific pipeline run for audit. Onboarding dropped from 4–6 weeks to under 3 days. Runbook discovery dropped from 15–30 minutes to under a minute. The compliance audit trail now generates itself.

Is Your Government Agency Losing Critical Institutional Knowledge Every Time an Engineer Leaves?