Root Cause Analysis of Boeing Starliner Spacecraft Issue

In June 2024, Boeing's Starliner spacecraft carried NASA astronauts Butch Wilmore and Suni Williams to the International Space Station (ISS) on its Crew Flight Test. During the flight and the approach to the station, however, several technical anomalies emerged: engineers detected small helium leaks in the propulsion system, and multiple reaction-control thrusters failed to fire as expected. Because of these issues, NASA made the difficult but safety-first decision to bring the Starliner capsule back to Earth uncrewed. The spacecraft performed a controlled autonomous re-entry and landed safely in New Mexico.

As a result, Wilmore and Williams remained aboard the ISS far longer than planned. Their return was deferred, and they were rescheduled to come back aboard a SpaceX Crew Dragon spacecraft instead. Although they were not in immediate danger, the decision underscored how serious the technical problems were and raised concerns about the spacecraft's reliability and its viability for future crewed missions.

This incident shook confidence in the Boeing commercial crew program and highlighted persistent risks in crewed spaceflight development. It raised important questions about propulsion system integrity, quality assurance, and risk management. The helium leaks, the thruster failures, and the resulting delay in the astronauts' return show how small technical discrepancies can escalate into mission-level risks. Modern spacecraft involve tightly coupled systems where software timing, sensor drift, thermal effects, wiring degradation, or control logic inconsistencies can create failures that are difficult to trace.

This is exactly where ProSolvr transforms the investigation process.

Traditional RCA methods like fishbone diagrams and fault trees are powerful, but they become slow and complex when teams face thousands of signals, subsystems, and failure interactions. A Gen-AI-powered RCA platform like ProSolvr strengthens the investigation from the start. It guides engineers through structured fishbone analysis and highlights links between technical, process, and human factors that are easy to miss.

ProSolvr visually connects symptoms such as thruster anomalies, telemetry distortions, navigation interference, or thermal stresses to deeper causes such as timing mismatches, integration gaps, material degradation, supplier inconsistencies, or oversight in software development workflows. With clear, data-driven Corrective and Preventive Actions, ProSolvr helps aerospace teams improve reliability, reduce mission risks, and prevent similar failures in future spacecraft missions.

Boeing Starliner Spacecraft Issue

    • People
      • Human Procedural Errors
        • Missed cross-checks during preflight review
        • Incorrect parameter entry during simulation
      • Insufficient Systems Testing Expertise
        • Limited experience with integrated system validation
        • Inadequate training on spacecraft software systems
    • Process
      • Ineffective Risk Assessment Practices
        • Incomplete mitigation planning for known software issues
        • Underestimation of system-level interaction risks
      • Deficient Software Verification Process
        • Failure to detect mismatched timing sequences
        • Inadequate end-to-end mission simulation
    • Equipment
      • Hardware-software Integration Issues
        • Communication bus latency
        • Sensor input inconsistencies
      • Software System Defects
        • Unstable thruster control logic
        • Faulty mission timer module
    • Materials
      • Use of Substitute Materials
        • Non-standard supplier components
        • Alternative alloys with marginal tolerance differences
      • Aging Component Materials
        • Wear-related sensor drift
        • Thermal degradation in wiring insulation
    • Environment
      • Radio-frequency Interference (RFI)
        • Distortion in telemetry data
        • Interference affecting navigation signals
      • Orbital Thermal Environment
        • Heat-induced timing variances
        • Thermal expansion affecting sensor calibration
    • Management
      • Resource Allocation Constraints
        • Staffing shortages in critical engineering teams
        • Budget limitations reducing test coverage
      • Oversight in Software Development
        • Lack of organizational accountability for software defects
        • Insufficient peer review and audits
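
The diagram above is a three-level tree: category, cause, and sub-cause. As a minimal sketch, assuming a plain nested-dictionary layout (an illustrative choice, not ProSolvr's internal format), two of its branches can be written down in Python and flattened into causal paths. The names `FISHBONE` and `causal_paths` are hypothetical.

```python
from typing import Dict, List

# Category -> cause -> sub-causes, copied from two branches of the diagram.
# The nested-dict layout is an assumption made for illustration only.
Fishbone = Dict[str, Dict[str, List[str]]]

FISHBONE: Fishbone = {
    "Equipment": {
        "Hardware-software Integration Issues": [
            "Communication bus latency",
            "Sensor input inconsistencies",
        ],
        "Software System Defects": [
            "Unstable thruster control logic",
            "Faulty mission timer module",
        ],
    },
    "Process": {
        "Ineffective Risk Assessment Practices": [
            "Incomplete mitigation planning for known software issues",
            "Underestimation of system-level interaction risks",
        ],
        "Deficient Software Verification Process": [
            "Failure to detect mismatched timing sequences",
            "Inadequate end-to-end mission simulation",
        ],
    },
}


def causal_paths(diagram: Fishbone, problem: str) -> List[str]:
    """Flatten the tree into 'problem <- category <- cause <- sub-cause' strings."""
    paths = []
    for category, causes in diagram.items():
        for cause, sub_causes in causes.items():
            for sub_cause in sub_causes:
                paths.append(" <- ".join([problem, category, cause, sub_cause]))
    return paths


if __name__ == "__main__":
    for path in causal_paths(FISHBONE, "Boeing Starliner Spacecraft Issue"):
        print(path)
```

Each printed line is one candidate causal chain, which makes it easy to review the branches one by one or to export them to a spreadsheet.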

Suggested Actions Checklist

Here are some corrective actions, preventive actions and investigative actions that organizations may find useful:

    • People
      • Human Procedural Errors
        • Corrective Actions:
          • Reinforce adherence to standard operating procedures by conducting targeted refresher training for flight and simulation teams.
        • Preventive Actions:
          • Introduce mandatory dual-review checkpoints for all preflight and simulation parameter entries.
        • Investigative Actions:
          • Review historical preflight and simulation logs to identify patterns or steps frequently missed by operators.
      • Insufficient Systems Testing Expertise
        • Corrective Actions:
          • Assign experienced system validation specialists to mentor teams handling integrated software and hardware testing.
        • Preventive Actions:
          • Establish a structured certification program for spacecraft systems testing competencies.
        • Investigative Actions:
          • Audit past system validation activities to determine where expertise gaps contributed to missed defects.
    • Process
      • Ineffective Risk Assessment Practices
        • Corrective Actions:
          • Update the risk assessment framework to incorporate system-level interactions and known software vulnerabilities.
        • Preventive Actions:
          • Implement periodic cross-functional risk workshops to reassess evolving mission risks.
        • Investigative Actions:
          • Examine prior risk registers to determine which risks were underestimated or insufficiently mitigated.
      • Deficient Software Verification Process
        • Corrective Actions:
          • Enhance verification workflows by adding high-fidelity, end-to-end simulation stages.
        • Preventive Actions:
          • Standardize verification protocols to include timing-sequence checks across all mission phases.
        • Investigative Actions:
          • Review past verification failures to identify gaps in test cases or simulation fidelity.
    • Equipment
      • Hardware-software Integration Issues
        • Corrective Actions:
          • Recalibrate communication interfaces and update integration firmware to reduce latency and data mismatch.
        • Preventive Actions:
          • Apply integration testing gates after each major hardware or software update.
        • Investigative Actions:
          • Analyze bus-level communication logs to pinpoint recurring inconsistencies.
      • Software System Defects
        • Corrective Actions:
          • Patch defective modules and perform regression testing on thruster logic and mission timing algorithms.
        • Preventive Actions:
          • Introduce automated code-quality checks and continuous integration pipelines for all critical flight software.
        • Investigative Actions:
          • Conduct root cause analysis on defect occurrences to determine whether coding errors, requirements gaps, or logic flaws were responsible.
    • Materials
      • Use of Substitute Materials
        • Corrective Actions:
          • Replace substitute components with specification-compliant materials and re-qualify affected assemblies.
        • Preventive Actions:
          • Implement stricter supplier approval criteria and mandatory material compliance checks.
        • Investigative Actions:
          • Trace procurement records to verify when and why substitute materials were introduced.
      • Aging Component Materials
        • Corrective Actions:
          • Replace aged wiring, sensors, or insulation materials identified as nearing end-of-life.
        • Preventive Actions:
          • Establish a lifecycle-based replacement schedule for components susceptible to wear or thermal degradation.
        • Investigative Actions:
          • Perform failure-mode inspection on removed components to confirm patterns of drift or thermal damage.
    • Environment
      • Radio-frequency Interference (RFI)
        • Corrective Actions:
          • Install additional shielding or filtering units around sensitive telemetry and navigation circuits.
        • Preventive Actions:
          • Conduct routine electromagnetic interference mapping for all mission environments.
        • Investigative Actions:
          • Analyze signal disturbance events from past missions to confirm sources and frequencies of interference.
      • Orbital Thermal Environment
        • Corrective Actions:
          • Reinforce thermal protection or recalibrate sensors to improve stability under extreme temperature swings.
        • Preventive Actions:
          • Perform thermal-vacuum testing earlier and more frequently in the mission design cycle.
        • Investigative Actions:
          • Review temperature trend data across missions to identify triggers for thermal-driven timing or calibration errors.
    • Management
      • Resource Allocation Constraints
        • Corrective Actions:
          • Redirect resources to understaffed engineering teams and increase test coverage through temporary or contract hires.
        • Preventive Actions:
          • Develop annual resource forecasting models to anticipate staffing and testing needs.
        • Investigative Actions:
          • Review project schedules and budget allocations to identify how constraints impacted defect detection.
      • Oversight in Software Development
        • Corrective Actions:
          • Implement mandatory peer reviews and strengthen software governance controls.
        • Preventive Actions:
          • Establish an independent software audit board to periodically review development progress.
        • Investigative Actions:
          • Examine past defect reports to evaluate whether oversight failures contributed to undetected software flaws.
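
The checklist above pairs every cause with three kinds of action. As a rough sketch of how a team might track them, the Python below stores the actions for one cause as records and filters the ones still open. The `Action` record, its `done` flag, and the `open_actions` helper are hypothetical and shown only for illustration; they are not part of ProSolvr.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class ActionType(Enum):
    CORRECTIVE = "Corrective"
    PREVENTIVE = "Preventive"
    INVESTIGATIVE = "Investigative"


@dataclass
class Action:
    category: str
    cause: str
    kind: ActionType
    text: str
    done: bool = False  # hypothetical tracking flag, not in the checklist


# The three actions listed for one cause in the checklist.
CAUSE = "Deficient Software Verification Process"
CHECKLIST: List[Action] = [
    Action("Process", CAUSE, ActionType.CORRECTIVE,
           "Enhance verification workflows by adding high-fidelity, "
           "end-to-end simulation stages."),
    Action("Process", CAUSE, ActionType.PREVENTIVE,
           "Standardize verification protocols to include timing-sequence "
           "checks across all mission phases."),
    Action("Process", CAUSE, ActionType.INVESTIGATIVE,
           "Review past verification failures to identify gaps in test "
           "cases or simulation fidelity."),
]


def open_actions(checklist: List[Action], kind: ActionType) -> List[Action]:
    """Return the actions of the given kind that are not yet marked done."""
    return [a for a in checklist if a.kind is kind and not a.done]


if __name__ == "__main__":
    CHECKLIST[0].done = True  # pretend the corrective action was completed
    for kind in ActionType:
        remaining = open_actions(CHECKLIST, kind)
        print(f"{kind.value}: {len(remaining)} open")
```

Keeping the action type as an explicit field lets the same list answer both "what is still open for this cause?" and "which preventive actions remain across the program?".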
 

Who can learn from the Boeing Starliner Spacecraft Issue template?

  • Aerospace Engineers: They can deepen their understanding of propulsion system behavior, integration challenges, and software-hardware interactions that contributed to issues like thruster failures and helium leaks.
  • Quality Assurance and Verification Teams: The RCA highlights gaps in end-to-end testing, software verification, and system-level simulations, helping these teams strengthen validation protocols.
  • Flight Operations and Mission Control Staff: They can learn how procedural errors, missed cross-checks, and unexpected system responses affect mission planning, contingency management, and crew safety.
  • Program and Project Managers: Insights into resource shortages, budget constraints, and oversight gaps help managers improve planning, risk assessment, and decision-making for future missions.
  • Training and Development Teams: Causes such as insufficient systems testing expertise point to the need for enhanced training programs in spacecraft software, integrated systems, and mission-critical procedures.
  • Safety and Risk Management Teams: The RCA provides valuable lessons on identifying systemic vulnerabilities, understanding cascading failure modes, and designing preventive strategies for crewed spaceflight missions.

Why use this template?

Applications like ProSolvr, built on structured fishbone analysis, take investigations to the next level. ProSolvr helps aerospace teams break down complex failures into clear, visual causal pathways, ensuring that no contributing factor is overlooked. By enabling a disciplined and methodical evaluation process, ProSolvr empowers engineers to create targeted CAPA plans that strengthen system reliability, improve verification procedures, and support safer, more dependable crewed missions.

Use ProSolvr by smartQED to systematically identify and eliminate issues in spacecraft systems, reduce costly rework, and prevent resource loss in future missions.

Curated from community experience and public sources:

  • https://www.bbc.com/news/articles/ckgn1125ne3o
  • https://www.nasa.gov/news-release/nasa-decides-to-bring-starliner-spacecraft-back-to-earth-without-crew/