Why NASA Can't Predict Who Survives the Mission... And What That Means for Every High Performer on Earth

Field Note: 005

"The race to the Moon was about national pride. The return to the Moon is about something bigger… whether the human body can actually survive the ambition of the human mind." — Dr. Dave

Dr. David Heitmann DC, MS — Independent Industry Observer

Field intelligence gathered at SHOP 2026 Human Spaceflight Performance Summit

Published by Guild of the Wild, May 2026

Log 001

Executive Summary

NASA is solving the hardest human performance problem ever attempted: keeping four people alive, functional, cognitively sharp, and physically capable across a three-to-four-year mission to Mars with no ability to return them home if something goes wrong.

To do this, they need data infrastructure that doesn't exist. Monitoring systems that haven't been built. Predictive models that no one has trained. And they are explicitly, openly asking industry to come in and build them.

Most people don't know this conversation is happening.

This report is the translation layer.

What follows is a direct field assessment of four critical gaps identified across three days of sessions, roundtables, and direct conversations with NASA researchers, flight surgeons, and commercial space scientists at SHOP 2026. These are not speculative opportunities. They are documented institutional needs with named buyers, published requirements, and mission timelines that are compressing faster than the research can keep up.

The thesis is simple: the biology NASA is solving for space is the same biology failing on Earth — and the N=1 interpretive framework required to bridge that gap has not been fully operationalized by anyone yet.

Log 002

The Context: Why Now, Why Industry

NASA's human performance research has historically operated on 25-year iteration timelines. A hypothesis is formed, a study is designed, a crew flies, data is collected, findings are published, and the cycle begins again. Slow, rigorous, and built for an era when the mission calendar had margin.

That era is over.

Artemis III is now a 2027 LEO demonstration mission focused on commercial lander rendezvous and docking, with the first crewed lunar surface return pushed to later in the decade. Commercial Low Earth Orbit stations from Vast, Blue Origin, and Axiom are moving toward operational timelines measured in years, not decades. And Mars is no longer an abstract destination. It’s an active engineering problem with a human mission architecture that requires solving problems that have never been solved before.

The people currently working in NASA's human performance departments have made two or three iterations on their solutions across entire careers. They are not equipped, structurally or culturally, to compress a 25-year cycle into four. And they know it.

❝

The message from the floor at SHOP 2026 was direct and unambiguous: bring solid preliminary research, demonstrate that your technology addresses a documented gap in NASA-STD-3001, and there is a clear pathway to get your solution into a real mission environment. The door is open. The requirements are published. Most companies just haven't shown up yet.

What follows is a map of where to show up — and what to bring.

Log 003

Gap 01: The EVA Monitoring Void

The Problem

Operational EVA physiology monitoring is sparse relative to what the mission now demands. The ground team currently has access to heart rate, metabolic rate estimates derived from suit consumable data, suit-environment telemetry, and basic consumables tracking. For ISS EVAs conducted two or three times per six-month mission, this has been a workable baseline.

For Artemis surface operations, it is not sufficient. That profile calls for multiple high-tempo lunar surface EVAs over roughly a week, potentially up to 8 hours each, in a suit environment with no rapid movement capability and extreme thermal and terrain demands.

What is absent is the individualized, full-body performance state model that a serious endurance athlete can now approximate with consumer tools: real-time HRV trend, hydration status, cumulative fatigue trajectory across multiple EVA days, cognitive load estimation, and thermal strain modeling beyond derived core temperature estimates. Go/no-go decisions are not currently being driven by a live, individualized physiological state model that understands the crew member's baseline, their current load, their projected fatigue curve, and their cognitive reserve going into the next suited operation.

That is the gap. It is not a gap in intent: NASA's standards explicitly name suited metabolic rate, physiologic stress, and injury risk as targets for EVA monitoring. It is a gap in the architecture required to close the distance between what is currently measured and what a genuine individual performance state model requires.

Why It's Hard

The spacesuit is not a passive garment. It is a pressurized life support system with a fixed internal environment that makes traditional biometric sensing architectures difficult to implement. Sensor placement is constrained. Data transmission bandwidth is limited. Power budgets are fixed. And the environment inside the suit (humidity, temperature, pressure differentials) creates signal noise that makes standard consumer wearable algorithms unreliable without mission-specific validation.

The easier versions of this problem have been partially addressed. The hard version, a reliable, mission-grade, real-time individualized physiological state model that operates continuously across a full high-tempo surface mission, has not.

The Solution Category

Closing this gap requires three converging technology layers, and a fourth element that is only beginning to be named at the institutional level.

The first layer is integrated biosensing — physiological monitoring validated for the suit environment, maintaining signal integrity across a full EVA duration, and transmitting reliably to both the suit's onboard system and the ground team. Not consumer-grade sensors repackaged. Mission-validated hardware with the failure modes understood.
The second layer is AI-powered interpretation — a decision support system that converts raw biosensor streams into actionable operational outputs. Not data dumps. Interpreted signal: fatigue trajectory, cognitive load estimate, hydration status, projected time-to-operational-limit at the current workload level, thermal strain trend.
The third layer is real-time visualization — a display architecture that puts the relevant information in front of the crew member and flight surgeon simultaneously without adding cognitive load to an already saturated operator. Military special operations and high-performance aviation have developed analog frameworks for this. The space context has not yet fully borrowed from either.
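To make "projected time-to-operational-limit" concrete, here is a minimal sketch of one way an interpretation layer could turn a raw stream into an operational output. It assumes a hypothetical scalar strain index sampled at a fixed interval and a simple linear trend; the function name, the index, and the limit are illustrative assumptions, not anything NASA presented.

```python
from typing import Optional, Sequence


def minutes_to_limit(strain: Sequence[float], limit: float,
                     sample_minutes: float = 1.0) -> Optional[float]:
    """Linearly extrapolate a strain index to an operational limit.

    Returns minutes from the latest sample until the limit is reached,
    0.0 if the latest sample is already at or over the limit, and None
    if there is too little data or the trend is flat or falling.
    The strain index itself is a hypothetical, unitless quantity.
    """
    n = len(strain)
    if n < 2:
        return None
    if strain[-1] >= limit:
        return 0.0
    # Ordinary least-squares slope of strain against sample index.
    mean_x = (n - 1) / 2
    mean_y = sum(strain) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(strain))
    slope = sxy / sxx  # strain units per sample
    if slope <= 0:
        return None
    return (limit - strain[-1]) / slope * sample_minutes


# Made-up readings taken every 5 minutes, rising by 1.0 per sample.
print(minutes_to_limit([1.0, 2.0, 3.0, 4.0], limit=10.0, sample_minutes=5.0))  # 30.0
```

A mission-grade version would need a validated strain model and uncertainty bounds; the point here is only the shape of the output: a time horizon, not a data dump.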

The fourth element is where the field is beginning to arrive but has not yet operationalized: individualized pre-mission baseline characterization as the foundational data layer that makes real-time interpretation valid.

NASA's 2025 crew-state risk modeling work explicitly names the goal of building individualized, crew-specific physiology models from EVA analog training and flight biomedical data. That is the right direction. The unresolved opportunity is turning that stated goal into a full longitudinal baseline system… collected months before flight, stress-tested in mission-relevant analog environments, and used operationally during EVA as live decision support context.

An AI system attempting to assess fatigue or cognitive load in real time is only as good as its understanding of what baseline looks like for that specific individual. Population averages tell you nothing useful about when a specific person is approaching their limit. The interpretation model needs the individual's personal response curves, their breaking points, their recovery signatures, and their specific physiological language under load. All established before the mission demands them.
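A small sketch shows why the choice of baseline matters. The same HRV reading is scored once against a population sample and once against the individual's own pre-mission history. All numbers are made up for illustration, and a plain z-score is an assumption standing in for whatever model a real system would use.

```python
from statistics import mean, stdev


def z_score(value: float, reference: list[float]) -> float:
    """Standard score of `value` relative to a reference sample."""
    return (value - mean(reference)) / stdev(reference)


# Hypothetical morning HRV readings (ms). Values are illustrative only.
population_sample = [45, 60, 55, 70, 50, 65, 58]
personal_baseline = [82, 85, 80, 84, 83, 81, 86]  # this crew member, pre-mission

today = 68

vs_population = z_score(today, population_sample)
vs_self = z_score(today, personal_baseline)

# Against the population, 68 ms looks above average.
# Against this person's own history, it is a large drop.
print(f"vs population: {vs_population:+.2f}")
print(f"vs own baseline: {vs_self:+.2f}")
```

Under these assumed numbers the population view reports a healthy reading while the individual view reports a sharp departure from normal, which is the signal that matters before a suited operation.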

❝

That pre-mission longitudinal baseline architecture is the gap inside the gap. NASA is naming the need. The operational system to fulfill it does not yet exist.

Log 004

Gap 02: The Cognitive Performance Dashboard That Doesn't Exist

The Problem

I had a direct conversation at SHOP 2026 with the researcher leading NASA's cognitive performance program. She laid out exactly what her team needs: a validated, multi-day integration of sleep quality, HRV, continuous glucose monitoring data, and macronutrient intake that correlates reliably to cognitive performance outcomes. The goal is a real-time dashboard that tells the flight surgeon and the crew member not just how they feel, but how they are actually performing cognitively and where that trajectory is headed.

Her team knows what they want. They do not have the framework to build it.

The current state: single-day correlations between sleep and cognitive test scores. That is the ceiling of what has been validated. Three-day predictive modeling, the minimum window required to capture the cumulative stress dynamics that actually drive cognitive performance, has not been studied. The dashboard concept exists as an aspiration. The architecture to achieve it does not.

Why Single-Day Data Isn't Enough

This is the insight that most people in the performance monitoring space are still missing, and it is the reason that current wearable analytics, for all their sophistication, cannot close this gap.

Stress is cumulative. Cognitive capacity is not determined by how well you slept last night. It is determined by the trajectory of your sleep, recovery, fuel availability, and physiological load across the preceding days. A crew member who slept eight hours the night before a critical EVA but has been accumulating a sleep debt and glycemic variability across the previous four days is not recovered. They are masking depletion.

The current single-day correlation models cannot see this. They are looking at today when the signal lives in the trajectory.

The Artemis mission architecture makes this more acute, not less. A crew performing four EVAs in five days is not executing four independent events. They are executing one cumulative physiological experience across five days, and the cognitive and physical resources available on day four are a function of everything that happened on days one through three. A monitoring system that doesn't model cumulative load cannot optimize or protect their performance across that window.
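The masking effect described above can be sketched with a simple running ledger. This is a toy model under loud assumptions: a fixed nightly sleep need of 8 hours and a cap on how much debt one long night can repay. Neither parameter is a validated physiological constant, and the function name is hypothetical.

```python
def cumulative_sleep_debt(hours_slept: list[float], need: float = 8.0,
                          max_payback: float = 1.0) -> list[float]:
    """Running sleep debt (hours) after each night.

    Shortfalls accumulate in full. Surplus sleep repays at most
    `max_payback` hours of debt per night. Both parameters are
    illustrative assumptions, not validated values.
    """
    debt = 0.0
    history = []
    for hours in hours_slept:
        shortfall = need - hours
        if shortfall >= 0:
            debt += shortfall
        else:
            debt = max(0.0, debt - min(-shortfall, max_payback))
        history.append(debt)
    return history


# Four short nights, then a full eight hours before a critical EVA.
nights = [5.5, 6.0, 5.0, 6.5, 8.0]
print(cumulative_sleep_debt(nights))  # [2.5, 4.5, 7.5, 9.0, 9.0]
```

A single-day model sees only the final 8.0 and reports a rested crew member. The trajectory shows nine hours of debt still on the books.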

The Westworld Horizon

The vision that NASA's cognitive performance team described is best understood not through the language of current wearable technology but through the frame most people already carry from science fiction.

Imagine a display, accessible to the flight surgeon on the ground and to the crew member inside the habitat, that shows not a single number but a flowing, multi-dimensional picture of each crew member's current state and projected trajectory. Sleep architecture quality. HRV trends. Glucose stability. Cognitive performance index. Fatigue accumulation rate. Projected cognitive capacity at the time of the next EVA. All integrated with the known upcoming operational load.

❝

Not a Whoop score. Not a readiness ring. A living physiological model of each individual that updates continuously, interprets across time, and generates mission-relevant outputs and predictions.

A dashboard that can say: this crew member is on a trajectory toward significant cognitive degradation by EVA day three; this crew member's glucose variability over the past 48 hours suggests suboptimal fueling that will compound fatigue; this crew member's HRV decline rate exceeds their personal pre-mission baseline by a margin that historically precedes immune events.

The technology components to build this exist. The integration architecture, the longitudinal validation methodology, and the mission-specific algorithmic framework do not. That is the product gap. And it sits inside one of the most credentialed, mission-driven, funded institutions on Earth.

Log 005

Gap 03: The Deconditioning Prediction Problem

The Problem

After twenty-five years of continuous human presence aboard the International Space Station, NASA does not yet appear to have a publicly validated operational model that reliably predicts which individual crew member will decondition most under a specific mission profile.

This is not a data collection failure. Crew members are extensively monitored before, during, and after missions. The problem is that the interpretation framework applied to that data is population-level and describes no individual specifically. Mean VO2 max loss, mean strength decline, mean bone remodeling rates: these are statistical descriptions of a composite person who does not exist on any actual mission.

Some crew members return from six-month ISS missions with VO2 max values barely changed. Others lose significantly more than the mean. And critically: the exercise performed on orbit does not reliably predict the outcome on the ground. Twenty-five years of data collection has not closed this gap simply because more population data cannot solve an individual prediction problem. The upstream variable that determines how a specific person responds to a specific extreme environment cannot be found in a population average. It lives in that person's individual longitudinal profile.

For a six-month ISS mission, this variability is manageable. The crew comes home, many recover substantially during post-flight rehabilitation, and the medical team can adapt in real time. The Mars problem is that return and intensive rehabilitation are not available options. A three-to-four-year mission with no return option means that a crew member who deconditions significantly in the first six months cannot be sent home. The mission continues. The deconditioning compounds.

The N=1 Imperative

The commercial digital health industry has spent the last decade building increasingly sophisticated longitudinal data collection systems. What it has not built is the interpretive methodology that makes that data predictive at the individual level. The pipes exist. The N=1 framework, which treats each person as their own dataset, characterizes their individual response curves, and predicts their trajectory from their own history rather than a population baseline, does not.

This is precisely the gap that the deconditioning prediction problem exposes.

The solution is not more population data analyzed with population logic. It is individual baseline characterization, begun months before mission launch, that integrates multi-modal longitudinal data across sleep, HRV, CGM, strength markers, cognitive performance, and immune indicators, combined with structured stress-testing in analog environments that establishes each crew member's personal deconditioning response signature before the mission demands it.

❝

The pre-mission analog environment is not just a training ground. It is a controlled laboratory for individual physiological profiling.

Run a crew member through a mission-relevant analog (Antarctic station, submarine deployment, HERA isolation protocol) while collecting their full multi-modal data stack, and you generate something that no population study can produce: a personalized model of how this specific person deconditions under this specific category of stress, how fast they recover, and what the early warning signals look like before significant performance degradation occurs.

That individual response model, built before the mission and updated continuously during it, is the interpretive layer that turns data collection into a genuine prediction. NASA is collecting the data. The methodology to interpret it at the individual level is the gap that remains open.
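As a toy illustration of population logic versus N=1 logic, the sketch below fits a separate decline rate for each crew member from hypothetical analog-campaign data, then compares each person's own projection with the one a pooled average would give. The data, the linear trend, and the 12-week horizon are all assumptions made for illustration. Real deconditioning is not linear, and nothing here reflects actual NASA data or methods.

```python
from statistics import mean


def fit_slope(xs: list[float], ys: list[float]) -> float:
    """Ordinary least-squares slope of ys against xs."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


def project(start: float, slope: float, week: float) -> float:
    """Linear projection of a value `week` weeks out (toy model)."""
    return start + slope * week


# Hypothetical weekly VO2 max (ml/kg/min) from a five-week analog campaign.
weeks = [0, 1, 2, 3, 4]
analog_vo2 = {
    "crew_a": [50.0, 49.8, 49.6, 49.4, 49.2],  # slow decliner
    "crew_b": [50.0, 49.0, 48.0, 47.0, 46.0],  # fast decliner
}

individual_slopes = {name: fit_slope(weeks, vo2) for name, vo2 in analog_vo2.items()}
population_slope = mean(list(individual_slopes.values()))

for name, slope in individual_slopes.items():
    own = project(50.0, slope, 12)
    pooled = project(50.0, population_slope, 12)
    print(f"{name}: own model {own:.1f}, population model {pooled:.1f}")
```

With these made-up numbers the pooled model gives both crew members the same week-12 projection, one that describes neither of them: it is too pessimistic for one and too optimistic for the other.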

Log 006

Gap 04: Cumulative Cognitive Load — The Mission Planning Blind Spot

The Problem

NASA's behavioral health lab has documented something that mission planners have not yet fully incorporated into Artemis surface mission architecture: cognitive tasks degrade each other across time.

Research presented at SHOP 2026 indicated that high-difficulty geological sampling tasks measurably reduce working memory performance on subsequent cognitive tests conducted during the same EVA. The shared cognitive capacity pool is finite, and demanding tasks draw from it in ways that are not immediately visible but are entirely predictable.

On a single EVA, this is a task-sequencing optimization problem. Put the highest cognitive demand tasks early when capacity is freshest, manage the sequence carefully, build in recovery margins.
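The single-EVA version of that rule is simple enough to sketch. The task names and demand ratings below are hypothetical, and a real timeline would also have to respect traverse geography, consumables, and task dependencies, which this sketch ignores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    cognitive_demand: float  # hypothetical rating from 0 (low) to 1 (high)


def sequence_tasks(tasks: list[Task]) -> list[Task]:
    """Order tasks so the highest cognitive demand comes first.

    Ties keep their original relative order because sorted() is stable.
    """
    return sorted(tasks, key=lambda t: t.cognitive_demand, reverse=True)


# Illustrative task list for one EVA.
plan = [
    Task("deploy instrument package", 0.5),
    Task("geological sampling", 0.9),
    Task("equipment stow", 0.2),
]

for task in sequence_tasks(plan):
    print(task.name)
```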

Across multiple EVAs over a high-tempo surface week, it becomes something more serious: cumulative cognitive depletion that compounds across the mission timeline. The final EVA of a high-tempo surface operation is not being performed by a fresh crew. It is being performed by a crew whose cognitive resources have been depleted across previous days of six-to-eight hour suited operations, sleep disruption, caloric stress, and high operational demand. The cognitive capacity available on EVA four is not the same as the cognitive capacity available on EVA one, and current mission planning frameworks do not model that degradation curve.

The Missing Framework

What does not currently exist is a cumulative cognitive load management framework for multi-EVA surface missions: a model that predicts, across the full surface mission timeline, the trajectory of cognitive capacity for each crew member based on actual physiological data, and adjusts task sequencing, EVA duration, and recovery windows accordingly.
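A minimal sketch of what such a model might track: capacity at the start of each EVA, given a per-EVA depletion and a partial overnight recovery. The unitless 0-to-1 capacity scale, the load values, and the 50 percent recovery fraction are assumptions chosen only to show the shape of the curve. A real framework would estimate these per crew member from physiological data.

```python
def capacity_at_eva_start(loads: list[float], recovery_fraction: float = 0.5,
                          initial: float = 1.0) -> list[float]:
    """Toy cumulative-depletion model across consecutive EVAs.

    Each EVA subtracts its load from current capacity. Overnight, the
    crew member recovers `recovery_fraction` of the gap back to
    `initial`. Returns the capacity available at the start of each EVA.
    All quantities are hypothetical and unitless.
    """
    capacity = initial
    starts = []
    for load in loads:
        starts.append(capacity)
        capacity = max(0.0, capacity - load)             # depletion during the EVA
        capacity += (initial - capacity) * recovery_fraction  # partial recovery
    return starts


# Four equally demanding EVAs on consecutive days.
starts = capacity_at_eva_start([0.4, 0.4, 0.4, 0.4])
print([round(c, 2) for c in starts])  # [1.0, 0.8, 0.7, 0.65]
```

Even with identical workloads, the crew member in this toy model begins EVA four with roughly two thirds of the capacity they had on EVA one. Swapping the assumed constants for measured, individual recovery rates is the part that does not yet exist.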

This is not a research gap. The data to build this model exists in NASA's behavioral health archives. The missing piece is integration with real-time physiological monitoring, so that the model updates dynamically based on actual crew state rather than preflight planning assumptions.

The industries closest to solving an analog version of this problem are military special operations, where mission planners are beginning to incorporate physiological monitoring data into real-time operational decision-making, and high-stakes aviation, where fatigue risk management systems are now required for commercial flight crews. Neither analog maps perfectly to the EVA context. Both contain architectural lessons that the space human performance community has not yet fully borrowed.

Log 007

The Macro Picture: The Gap Nobody Has Named Yet

There is a narrative in digital health that goes like this:

The industry has spent the last decade building the infrastructure for personalized medicine: wearables, continuous monitors, and longitudinal data platforms. The assumption is that clinical applications will follow naturally as the data accumulates…

That narrative is missing something critical.

❝

The commercial digital health industry has built increasingly sophisticated ways to collect longitudinal data. What it has not built is the interpretive framework that makes longitudinal data meaningful at the individual level.

The N=1 interpretation layer, the methodology that treats each person as their own dataset, establishes their personal baseline, characterizes their individual response curves, and generates decisions from their trajectory rather than population averages, does not exist at scale yet.

This is not a technology gap. It is a conceptual gap. The field has been collecting individual data and interpreting it with population logic. The result is more precise delivery of the same imprecision that has always characterized population-level medicine.

NASA's spaceflight human performance problem makes this gap impossible to ignore. When you are managing four people on a Mars mission with no return option, population averages are not just unhelpful… they are operationally dangerous. The mean deconditioning rate tells you nothing about which of your four crew members will hit a critical performance threshold on day 180. The population sleep-cognition correlation tells you nothing about how this specific crew member's cognitive capacity responds to their specific pattern of sleep disruption and caloric stress.

The mission demands N=1. Neither NASA nor the commercial digital health industry has fully operationalized this architecture at mission scale.

This is the gap that connects every problem identified in this report. The EVA monitoring void cannot be closed without individualized baseline characterization. The cognitive performance dashboard cannot be validated without multi-day individual trajectory modeling. The deconditioning prediction problem cannot be solved without longitudinal N=1 response profiling established before the mission begins.

The companies that understand that the data collection problem is largely solved and the interpretation methodology problem is not will be the ones positioned to close these gaps. Not by building another wearable. By building the framework that tells you what the data from any wearable actually means for a specific person across time.

Most investment theses in digital health are built on total addressable market calculations, reimbursement pathway analysis, and regulatory timeline projections. Those frameworks apply imperfectly to the spaceflight human performance category. The mission to Mars is not a product launch. The researchers, flight surgeons, and commercial space scientists working on these problems are not operating on normal fiscal responsibility assumptions.

What that means for the structure of capital, partnerships, and go-to-market strategy for companies building in this category is the subject of the next report in this series.

What this report establishes is the problem set, and the conceptual reframe required to see it clearly. Four critical gaps, documented from the source, all connected by the same root cause: the absence of a methodology that treats the individual as the unit of analysis rather than the population.

The biology being solved for space is the same biology failing on Earth. The data collection infrastructure exists. The N=1 interpretation framework that makes it actionable does not.

That is the opportunity.

Log 008

A Note on What This Report Doesn't Include

The time in that room produced more intelligence than a single report can responsibly contain. What's published here reflects the gaps that are documentable from session content and public NASA research. A second layer specific to product opportunities, cross-industry analog applications, and the consulting entry points that only surface in direct researcher conversations isn't something that belongs in a public document.

If this report raised more questions than it answered, that's intentional. Reach out.

About the Author

Dr. David Heitmann DC, MS is an independent industry observer, digital health strategist, and applied frontier biology practitioner. He attended SHOP 2026 as an independent observer with 30 years of background in clinical performance medicine, digital health architecture, and individual longitudinal data methodology. His prior work includes multi-modal N=1 longitudinal analysis integrating CGM, HRV, sleep staging, and exercise data, as well as consulting with digital health companies on data architecture and strategy.

He is the founder of Guild of the Wild — frontier biology and adventure performance for professionals who do hard things.

For consulting and advisory inquiries:

Email: [email protected]

LinkedIn: Dr. David Heitmann DC, MS

Guild of the Wild: guildofthewild.com

This report reflects the independent observations and analysis of Dr. David Heitmann. It does not represent the official positions of NASA, SHOP, or any affiliated institution. All session content cited reflects publicly presented research by named investigators and institutions.

If this Field Note made you think differently about something — forward it to one person who needs to hear it. The right ideas find the right people when we pass them along.

And if something in here sparked a question, a story, or a thought you can't shake — hit reply. I read every one.
