High-Consequence Research & Education

The Oden Institute is tackling the challenges of Predictive Data Science across high-consequence applications in science, engineering and medicine.

Predictive Data Science | Convergence of data science and Computational Science & Engineering

Big decisions need more than just big data

They need big models too.

Today's high-consequence decisions in engineering, science and medicine need methods based on more than just data analytics. These decisions must incorporate the predictive power, interpretability and domain knowledge of physics-based models. Enter Predictive Data Science.


The shortcomings of data science for high-consequence decisions

We need to stop building computer systems that merely get better and better at detecting statistical patterns in data sets....

How to Build Artificial Intelligence We Can Trust, New York Times

High-consequence decisions in science, engineering and medicine are almost always based on predictions that go beyond the available data. We often need to make predictions about a future state — about the future state of a patient's illness, about the states that an engineering system may find itself experiencing in operation, or about the future state of the Earth's climate in the decades to come.

...and start building computer systems that from the moment of their assembly innately grasp three basic concepts: time, space and causality.

How to Build Artificial Intelligence We Can Trust, New York Times

In these settings, there are multiple reasons that pure-data machine learning and statistical approaches will struggle to generalize with high confidence:

pure-data machine learning approaches struggle with multiphysics dynamics

The applications are characterized by complex multiscale multiphysics dynamics, so that small changes in parameters can lead to large changes in system behavior.

pure-data machine learning approaches struggle with high-dimensional parameters

The parameter space is very high dimensional. Many parameters of interest are fields (infinite dimensional). Without the constraints of physics, the solution space is so vast that driving decisions with data alone is doomed to failure.

pure-data machine learning approaches struggle with sparse data

Data are sparse and typically rely on physical sensing infrastructure, making them expensive to acquire. Data may be large in volume, but they provide only limited peeks into the underlying high-dimensional parameter space.

pure-data machine learning approaches struggle with uncertainty quantification

Uncertainty quantification of predictions must provide quantified confidence in the recommended decisions. This is especially challenging but especially important as we extrapolate beyond the data to issue predictions about future states.

Predictive Data Science — A new convergence

Predictive Data Science integrates physics-based models and data-based machine learning approaches.

What is Data Science?

Data science is a multi-disciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from structured and unstructured data. [Wikipedia]

What is Computational Science & Engineering?

Computational Science & Engineering (CSE) is an interdisciplinary field that uses mathematical modeling and advanced computing to solve complex problems. In CSE, we develop models and simulations to understand physical and natural systems. [Wikipedia]

Predictive Data Science addresses challenges faced by statistical approaches.

What is Predictive Data Science?

Predictive Data Science is the convergence of Data Science and Computational Science & Engineering. Predictive Data Science integrates physics-based models and data to tackle the challenges of high-consequence decisions across science, engineering and medicine.

Predictive Data Science respects physical constraints of the problem.

Respect physical constraints

Predictive Data Science embeds domain knowledge.

Embed domain knowledge

Predictive Data Science brings interpretability to results.

Bring interpretability to results

Predictive Data Science integrates noisy, heterogeneous and incomplete data.

Integrate heterogeneous, noisy & incomplete data

Predictive Data Science obtains predictions with quantified uncertainties.

Get predictions with quantified uncertainties

Predictive Data Science guides optimal data acquisition strategies.

Guide optimal data acquisition strategies

Watch a video explanation of Predictive Data Science at ICIAM 2019.

What is a physics-based model?

Without the concepts of time, space and causality, much of common sense is impossible.

How to Build Artificial Intelligence We Can Trust, New York Times

What is a physics-based

A physics-based model is a representation of the governing laws of nature that innately embeds the concepts of time, space and causality. A physics-based model incorporates governing principles and laws of nature. These laws of nature define how physical, chemical and biological processes evolve.

Physics-based model takes in laws and processes.

These laws of nature typically appear in physics-based models as systems of differential equations.

Differential equationns in physics-based models

How do you work with these equations?

You need numerical methods. These numerical methods discretize the governing differential equations, resulting in numerical models that are large-scale systems of equations. With high-performance computing, we solve these numerical models to determine the solutions.

Why are physics-based models so useful?

The numerical models are parameterized with many, many parameters, representing system properties such as geometry, material properties, initial conditions, boundary conditions, and more. Thus, the numerical model is a valuable tool for exploring "what-if" questions — What if we apply this treatment? What if we choose this shape for the aircraft wing? — all informed by predictions that respect the laws of nature.

The Unreasonable Effectiveness of physics-based predictions

Why are physics-based models predictive?

Because in solving the governing equations of the system, they constrain the predictions to lie on the solution manifold defined by the laws of nature.

Is it easy to combine data with physics-based models?

No — these models usually manifest as high-dimensional computational models that take hours or weeks to simulate.

But, this is where modern techniques of CSE come in — techniques that exploit structure, discover low-dimensional approximations, and leverage high-performance computing. Explore just a few such techniques below.

What's next: The future of Predictive Data Science


across science, engineering & medicine need Predictive Data Science.

need new scalable methods for learning from data through the lens of physics-based models.

need scalable uncertainty quantification that targets certified predictions in support of decision.

need students trained at the interfaces of computer science, mathematics, statistics, high performance computing, and applications across science, engineering and medicine.


conducts cutting-edge interdisciplinary research and education to tackle these challenges.

is addressing science, engineering & medicine grand challenges through new methods, scalable algorithms & high-performance computing.

is advancing the architecture–algorithm–application nexus to transform next-generation computational science.

is building the mathematical & statistical foundations for predictive science, data science & machine learning.

leads the world in interdisciplinary education at the interfaces of computer science, mathematics, statistics, high performance computing, and applications across science, engineering and medicine.

Learn more at www.oden.utexas.edu