Gaussian process regression (GPR) is a widely used method in machine learning for interpolation, classification, supervised learning, etc. The standard GPR is a data driven method that uses observations (e.g., measurements) of a state of interest to construct a Gaussian process (GP) that describe the state. In many scientific and engineering problems, existing knowledge are available in forms of physical law, simulations tools, etc. that reflect (partial) understanding of the system. We propose a framework that incorporate this knowledge in the GPR method to construct a nonstationary GP that intrinsically includes these physical constraints in the GPR results. We also provide an error estimate in preserving physical constraints in the form of a linear operator (e.g., Dirichlet boundary condition, divergence free flow field, conservation law). More importantly, we will show that this framework enables more accurate multi-fidelity data fusion, and it can be combined with different multi-fidelity strategies, e.g., multi-level Monte Carlo, reduced-basis method, to further enhance the efficiency.