​
Determining Partial Differential Equations from Data: PART I
​
The purpose:
​
We will discuss how to determine partial differential equations (PDEs) from data. The ideas are very similar to those for determining ODEs and SDEs from data, except that more variables are involved. We will discuss determining multiscale PDEs from data in Part II.
​
The goal is to determine the map F as a function of the data u:
​
	F : u ↦ F(u)
​
This map will determine the time evolution of the data:
​
	∂u/∂t = F(u)
​
Via the loss function:
​
	L(F) = Σ_j ‖ u(t_{j+1}) − u(t_j) − ∫_{t_j}^{t_{j+1}} F(u(s)) ds ‖²
​
Note: The last two terms inside the norm are just the integral form of the solution at time t_{j+1}, i.e. u(t_{j+1}) = u(t_j) + ∫_{t_j}^{t_{j+1}} F(u(s)) ds, so the residual vanishes for the exact F.
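As a concrete sketch of this loss, here is a minimal NumPy version in which the time integral is approximated by a single forward-Euler quadrature step. The function name `evolution_loss` and the quadrature choice are my own illustration, not a fixed part of the method:

```python
import numpy as np

def evolution_loss(U, dt, F):
    """Sum-of-squares residual of the integral form, with the time
    integral approximated by forward Euler: int F(u) ds ~ dt * F(u_j).

    U  : array of shape (N_t, N_x), snapshots u(t_j, .)
    dt : time step t_{j+1} - t_j
    F  : candidate right-hand-side map, F(u_j) -> array of shape (N_x,)
    """
    loss = 0.0
    for j in range(U.shape[0] - 1):
        # residual of u(t_{j+1}) = u(t_j) + int F(u) ds
        residual = U[j + 1] - U[j] - dt * F(U[j])
        loss += np.sum(residual ** 2)
    return loss
```

A higher-order quadrature (trapezoid, Runge-Kutta) can replace the single Euler step without changing the structure of the loss.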
​
Constructing an Ansatz:
​
The next question is: how do we construct the map F? This is up to the reader; however, I will present an ansatz that takes the form of a multivariable polynomial function of "operators".
​
Operators are just mathematical transformations of the data u. The choice of operators is also up to the reader. Here are some example operators:
​
​
	the identity:         u
	spatial derivatives:  ∂x u, ∂xx u, …
	pointwise powers:     u², u³, …
	products:             u · ∂x u, …
​
Once the reader chooses a set of initial operators that they believe might be involved in the dynamics of the data (and possibly compositions of them), we can start constructing the ansatz with an RNN:
Example:
​
To better understand the ansatz, we produce an example ansatz using one layer and two initial operators:
Remark: For vector equations, one needs to build an ansatz for each component.
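To make the one-layer, two-operator example concrete, here is a hypothetical NumPy sketch. The operator choices (identity and first spatial derivative) and the degree-2 polynomial combination are my assumptions for illustration, not the only possible ansatz:

```python
import numpy as np

def ansatz(u, dx, weights):
    """One-layer polynomial ansatz built from two initial operators.

    Hypothetical operator choices:
        O1(u) = u_x  (first derivative via central differences)
        O2(u) = u    (identity)
    The ansatz is a multivariable polynomial of the operator outputs
    up to total degree 2:
        F(u) = w0 + w1*O1 + w2*O2 + w3*O1^2 + w4*O1*O2 + w5*O2^2
    """
    O1 = np.gradient(u, dx)
    O2 = u
    terms = [np.ones_like(u), O1, O2, O1 ** 2, O1 * O2, O2 ** 2]
    return sum(w * t for w, t in zip(weights, terms))
```

For instance, the weight vector (0, 0, 0, 0, -1, 0) reproduces the Burgers-type term -u·u_x. For a vector equation, one such ansatz would be built per component, as in the remark above.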
​
​
Numerical Examples:
​
Example 1) Linear transport kinetic equation:
​
​
​
​
​
​
​
​
​
This equation is interesting because it depends on a parameter epsilon: depending on the magnitude of this parameter, the dynamics of the equation changes. To see this, we apply asymptotic analysis:
Numerical Examples: We will show some results using the forward-Euler scheme and a different class of time-stepping schemes called IMEX (implicit-explicit Runge-Kutta) schemes. What will be apparent from these results is that as epsilon becomes smaller, it becomes harder to determine the PDE. This is true in two regards:
1) Need more data, i.e. higher N_x and N_t, as epsilon becomes smaller.
2) Need to train for a much longer time.
​
We will address these issues later when we introduce multiscale learning of PDEs.
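The stability gap between the two classes of schemes can be seen on a toy stiff relaxation model u' = -u/epsilon, which stands in for the stiff term of the kinetic equation (my simplification for illustration, not the equation itself):

```python
def forward_euler_step(u, dt, eps):
    # fully explicit step for u' = -u / eps; stable only when dt < 2*eps
    return u + dt * (-u / eps)

def imex_step(u, dt, eps):
    # first-order IMEX: the stiff relaxation term is treated implicitly,
    #   u_{n+1} = u_n - (dt/eps) * u_{n+1}
    # solving for u_{n+1} gives an unconditionally stable update
    return u / (1.0 + dt / eps)
```

With eps = 1e-3 and dt = 0.1, the explicit update multiplies u by (1 - dt/eps) = -99 each step and blows up, while the IMEX update multiplies by 1/(1 + dt/eps) and decays, which is why IMEX time stepping is attractive when epsilon is small.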
RNN Structure:
​
The above formulas, from input data u to output F(u), can be represented diagrammatically. We produce an example RNN below:
Remark: I want to point out some interesting things:
-If epsilon is very close to zero, there are potentially two equations a machine learning algorithm could learn for the rho-equation, namely equation 2.4 or 2.7. It would be nice if we could learn 2.7 given data for rho. Why would we want this? The reason is that 2.7 is a lower-dimensional PDE (it depends on fewer variables), and thus can be of better use to other researchers for more efficient numerical simulations.
-The next-order O(epsilon) term in 2.7 is unknown; however, we can certainly use black-box neural networks to approximate the next-order dynamics.
-This example also gives one hope that PDE-discovery algorithms can lead to more efficient numerical schemes. I will later present another example related to quantum mechanics.
​
Numerical Implementation: Now we can discuss the numerical implementation. Note that I will also add sparse regularization of the weights.
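As a minimal sketch of sparse regularization, assume the ansatz is linear in its weights, so the fit reduces to a library regression with an L1 penalty, solved here by proximal gradient descent (ISTA). The names `Theta` and `sparse_fit`, and the ISTA choice, are my own; the author's implementation may differ:

```python
import numpy as np

def soft_threshold(w, tau):
    # proximal operator of the L1 norm: shrinks weights toward zero
    return np.sign(w) * np.maximum(np.abs(w) - tau, 0.0)

def sparse_fit(Theta, dudt, lam=1e-3, lr=None, n_iter=5000):
    """Fit dudt ~ Theta @ w with an L1 penalty on w (ISTA).

    Theta : (N, K) library matrix, columns are operator terms
    dudt  : (N,) time-derivative data
    lam   : sparsity strength; larger -> more weights driven to zero
    """
    if lr is None:
        # step size from the spectral norm of Theta
        lr = 1.0 / np.linalg.norm(Theta, 2) ** 2
    w = np.zeros(Theta.shape[1])
    for _ in range(n_iter):
        grad = Theta.T @ (Theta @ w - dudt)
        w = soft_threshold(w - lr * grad, lr * lam)
    return w
```

The L1 penalty drives the weights of inactive operator terms exactly to zero, so the recovered PDE stays interpretable rather than using every term in the library.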