Mantis Biotech is making ‘digital twins’ of humans to help solve medicine’s data availability problem
Large language models trained on vast datasets could speed genomics research, streamline clinical documentation, improve real-time diagnostics, support clinical decision-making, accelerate drug discovery, and even generate synthetic data to advance experiments.
But their promise to transform biomedical research often runs into a bottleneck: outside the structured data that healthcare relies on, these models struggle with edge cases such as rare diseases and unusual conditions, where reliable, representative data is scarce.
New York-based Mantis Biotech claims it’s developing a solution to this data availability gap. The company’s platform integrates disparate sources of data to make synthetic datasets that can be used to build so-called “digital twins” of the human body: physics-based, predictive models of anatomy, physiology, and behavior.
The company is pitching these digital twins for data aggregation and analysis: studying and testing new medical procedures, training surgical robots, and simulating and predicting medical issues or even patterns of behavior. For example, a sports team could predict the likelihood of a specific NFL player developing an Achilles tendon injury based on their recent performance, training load, diet, and how long they’ve been active, Mantis’ founder and CEO Georgia Witchel explained to TechCrunch in a recent interview.
To build these twins, Mantis’ platform first takes data from a variety of sources, such as textbooks, motion capture cameras, biometric sensors, training logs, and medical imaging. It then uses an LLM-based system to route, validate, and synthesize the various data streams, and runs the combined information through a physics engine to create high-fidelity renders of the dataset, which can then be used to train predictive models.
“We’re able to take all these disparate data sources and then turn them into predictive models for how people are going to perform. So anytime you want to predict how a human being is going to be performing, that is a really good use case for our technology,” Witchel said.
The physics engine layer is key here, Witchel told TechCrunch, because it helps the platform enhance the available information by grounding the generated synthetic data and realistically modeling the physics of anatomy.
“If I asked you to do hand-pose estimation for someone who is missing a finger, it would be really, really hard, because there are no publicly available datasets of labeled hand positions of someone who is missing a finger. We could generate that dataset really, really easily, because we just take our physics model and we say, remove finger X, regenerate model,” she said.
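The idea Witchel describes, generating labeled data from a parametric model with one part removed, can be illustrated with a toy sketch. This is not Mantis’ code or method: the planar hand model, the segment lengths, and the function names (`finger_keypoints`, `generate_sample`, `generate_dataset`) are all hypothetical, chosen only to show how removing a finger from a model and regenerating it yields labels for a case with no public dataset.

```python
import math
import random

FINGERS = ["thumb", "index", "middle", "ring", "pinky"]

# Hypothetical segment lengths in cm (illustrative values, not anatomical data).
SEGMENT_LENGTHS = {
    "thumb": [4.0, 3.0, 2.5],
    "index": [4.5, 2.5, 2.0],
    "middle": [5.0, 3.0, 2.0],
    "ring": [4.5, 2.8, 2.0],
    "pinky": [3.5, 2.0, 1.8],
}


def finger_keypoints(base, lengths, angles):
    """Planar forward kinematics: return one (x, y) keypoint per segment end."""
    x, y = base
    heading = math.pi / 2  # finger points along +y when every joint angle is zero
    points = []
    for length, angle in zip(lengths, angles):
        heading -= angle  # each joint's flexion rotates the rest of the chain
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        points.append((x, y))
    return points


def generate_sample(rng, missing=()):
    """Sample random joint angles and return labeled keypoints per finger."""
    sample = {}
    for i, name in enumerate(FINGERS):
        if name in missing:
            continue  # "remove finger X": the model simply has no such chain
        lengths = SEGMENT_LENGTHS[name]
        angles = [rng.uniform(0.0, math.pi / 2) for _ in lengths]
        sample[name] = finger_keypoints((i * 2.0, 0.0), lengths, angles)
    return sample


def generate_dataset(n, missing=(), seed=0):
    """Regenerate the model n times with the given fingers removed."""
    rng = random.Random(seed)
    return [generate_sample(rng, missing) for _ in range(n)]


if __name__ == "__main__":
    dataset = generate_dataset(1000, missing=("ring",))
    print(len(dataset), sorted(dataset[0]))
```

A real system would use a 3D, physics-grounded model rather than this flat kinematic chain, but the workflow is the same: change the model, regenerate, and the labels come for free.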
Source: TechCrunch