Progressive Augmentation of Reynolds Stress Tensor Models for Secondary Flow Prediction PDF
Document Details
Aarhus University
2023
Mario Javier Rincón, Ali Amarloo, Martino Reclari, Xiang I.A. Yang, Mahdi Abkar
Tags
Related
- Vasculature: Arterial Blood Flow and Peripheral Resistance - PDF
- Cours Mécanique et Thermodynamique des Fluides, Fluides Visqueux et Turbulence PDF
- Mécanique et thermodynamique des fluides 2018 PDF
- Basics For Air Traffic Control - Wake Turbulence PDF
- PHY 115 Fluid Flow Lecture Notes PDF
- Underwater Visible Light Communication: Recent Advancements and Channel Modeling PDF
Summary
This paper details a progressive approach to enhance secondary flow prediction within computational fluid dynamics simulations using data-driven models and Bayesian optimization. The focus is on improving turbulence models, particularly in addressing secondary flows, using a linear eddy-viscosity model, and progressively adding algebraic Reynolds stress model capabilities without affecting existing performance in canonical cases. The research also explores generalisability testing with various test cases.
Full Transcript
International Journal of Heat and Fluid Flow 104 (2023) 109242 Contents lists available at ScienceDirect International Journal of Heat and Fluid Flow...
International Journal of Heat and Fluid Flow 104 (2023) 109242 Contents lists available at ScienceDirect International Journal of Heat and Fluid Flow journal homepage: www.elsevier.com/locate/ijhff Progressive augmentation of Reynolds stress tensor models for secondary flow prediction by computational fluid dynamics driven surrogate optimisation Mario Javier Rincón a,b ,1 , Ali Amarloo a ,1 , Martino Reclari b , Xiang I.A. Yang c , Mahdi Abkar a ,∗ a Department of Mechanical and Production Engineering, Aarhus University, 8200 Aarhus N, Denmark b Quality & Sustainability Department, Kamstrup A/S, 8660 Skanderborg, Denmark c Department of Mechanical Engineering, Pennsylvania State University, State College, PA, 16802, USA ARTICLE INFO ABSTRACT Keywords: Generalisability and the consistency of the a posteriori results are the most critical points of view regarding Turbulence modelling data-driven turbulence models. This study presents a progressive improvement of turbulence models using RANS simulation-driven Bayesian optimisation with Kriging surrogates where the optimisation of the models is Progressive augmentation achieved by a multi-objective approach based on duct flow quantities. We aim for the augmentation of Surrogate modelling secondary-flow prediction capability in the linear eddy-viscosity model 𝑘 − 𝜔 SST without violating its original Kriging Secondary flows performance on canonical cases e.g. channel flow. Progressively data-augmented explicit algebraic Reynolds stress models (PDA-EARSMs) for 𝑘 − 𝜔 SST are obtained enabling the prediction of secondary flows that the standard model fails to predict. The new models are tested on channel flow cases guaranteeing that they preserve the successful performance of the original 𝑘 − 𝜔 SST model. Subsequently, numerical verification is performed for various test cases. Regarding the generalisability of the new models, results of unseen test cases demonstrate a significant improvement in the prediction of secondary flows and streamwise velocity. These results highlight the potential of the progressive approach to enhance the performance of data-driven turbulence models for fluid flow simulation while preserving the robustness and stability of the solver. 1. Introduction studies aiming at improving the performance of RANS turbulence mod- els (Duraisamy et al., 2019). The majority of these studies have used Reynolds-averaged Navier–Stokes (RANS) equations are widely pre- available high-fidelity data of the RST values to train a model and ferred over high-fidelity methods like direct numerical simulation improve the predictions obtained from empirical models (Tracey et al., (DNS) and large-eddy simulation (LES) for the industrial applications 2013; Wang et al., 2017; Wu et al., 2018; Ling et al., 2016; Kaandorp of computational fluid dynamics (CFD) due to their robustness and and Dwight, 2020; McConkey et al., 2022). These studies have mainly computational speed. In RANS, the physics of turbulence is predicted focused on obtaining ways to correct the RST prediction (Cruz et al., by a Reynolds stress tensor (RST) model; hence, the results obtained are 2019; Weatheritt and Sandberg, 2016, 2017; Schmelzer et al., 2020; dependent on the performance of these model predictions. Despite the Amarloo et al., 2023), or to modify the available empirical models (Du- popularity of RANS simulations, the common empirical models have raisamy, 2021; Singh et al., 2017b; Holland et al., 2019). Taking into been found to have shortcomings (Slotnick et al., 2014), particularly in capturing Prandtl’s second kind of secondary flow (Nikitin et al., consideration a different approach, CFD-driven models (Zhao et al., 2021). This limitation is accentuated in the most commonly used RANS 2020; Saïdi et al., 2022) have shown promising results in finding turbulence models (e.g. two-equation models based on the Boussinesq reliable variants of RANS turbulence modelling. Hence, CFD-driven assumption like 𝑘−𝜀 and 𝑘−𝜔) due to their inability to predict complex models can guarantee consistency and robustness in a posteriori results turbulence anisotropy. (i.e. by a model-consistent training method Duraisamy, 2021), as op- Although the development of reliable RANS turbulence models has posed to other data-driven RANS models (i.e. by a a priori-training remained stagnant for decades (Xiao and Cinnella, 2019), the recent method Duraisamy, 2021) since they are optimised while performing advances in data-driven techniques have motivated a new wave of ∗ Corresponding author. E-mail address: [email protected] (M. Abkar). 1 M. J. Rincón and A. Amarloo contributed equally to this study. https://doi.org/10.1016/j.ijheatfluidflow.2023.109242 Received 1 June 2023; Received in revised form 30 October 2023; Accepted 2 November 2023 Available online 8 November 2023 0142-727X/© 2023 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). M.J. Rincón et al. International Journal of Heat and Fluid Flow 104 (2023) 109242 simulations to guarantee the improvement of the results obtained by The generalisability is the ability of data-driven models to perform new models. well both with unseen cases (out of the training cases range) and with In order to improve these models by a CFD-driven approach, it canonical cases like channel flow. The combination of a conventional is necessary to solve a complex optimisation problem. In this re- progressive approach with data-driven approaches has been proposed gard, the use of optimisation algorithms that aim to minimise or to address this issue (Bin et al., 2022). In the progressive approach, maximise one or more functions in a multi-dimensional parameter the starting point is a baseline model that is already performing ad- space is essential. Some of the commonly used optimisation algorithms equately for simple flows and adds more complexity step by step include slope followers (Barzilai and Borwein, 1988), simplex meth- without violating the model’s performance for flow cases that the model ods (Nelder and Mead, 1965), multi-objective evolutionary (Coello was calibrated against. In this study, we use the progressive approach to add the capability of secondary flow reconstruction to a linear et al., 2007), and particle swarm algorithms (Eberhart and Kennedy, eddy-viscosity model without violating its successful performance in a 1995), among others. However, these algorithms require the evaluation channel flow simulation. of an objective function, which can be computationally expensive if Since the CFD-driven technique ensures the consistency of the a CFD simulations are involved, especially in cases where a large number posteriori results and the progressive approach aids with the general- of test configurations are required. isability of the new models, in this study we combine the CFD-driven To address this issue, development has been made towards the use surrogate and Bayesian optimisation technique with the progressive of a relatively smaller set of simulations to create computationally approach to obtain a progressively data-augmented explicit algebraic efficient surrogate models, also known as response surfaces, that can Reynolds stress model (PDA-EARSM) to the 𝑘 − 𝜔 SST (Menter, 1994) then be fastly optimised (Forrester et al., 2008). Response surfaces are model for exclusively predicting secondary flows. We likewise inves- mathematical models that approximate the behaviour of the objective tigate the potential of a progressive approach in the development of function in the parameter space and can be used to predict the function generalisable CFD-driven RANS models. Since standard linear eddy- values at untested configurations. This approach has been applied to viscosity models have difficulty predicting secondary flows (Nikitin various engineering applications, such as optimising complex internal- et al., 2021), we use the Pope’s decomposition of RST (Pope, 1975) flow systems based on ultrasonic flow metering (Rincón et al., 2022, to add a non-linear term of the RST to the model, and we optimise 2023), improving the efficiency of gas cyclones (Singh et al., 2017a), the new models for the prediction of secondary flows induced by a optimising the aero-structural design of plane wings (Lam et al., 2009), duct flow with an aspect ratio (AR) of 1 (squared) at bulk Reynolds and improving the performance of ground vehicles (Urquhart et al., number of 3500 (depicted in Fig. 1). The new models’ boundary-layer 2020), among others. prediction is tested on the channel flow to ensure that they do not There are only a few studies that have investigated the use of affect the successful performance of the original model. Considering the optimisation algorithms and data-driven models for the improvement of generalisability, the new models are verified on duct flow cases with the RANS turbulence models. Zhao et al. (2020) combined CFD-driven different Reynolds numbers and aspect ratios. Furthermore, we test optimisation with gene expression programming to obtain a correction the new models against an extremely disadvantageous case regarding for the RST modelled by the 𝑘 − 𝜔 SST model. They showed that the accurate flow prediction by RANS models, we verify the models in a case where the secondary flow is induced by roughness patches in a CFD-driven model had an improved performance compared to the data- channel flow (Forooghi et al., 2020; Amarloo et al., 2022b) with a driven model trained on the same case. They concluded that CFD-driven nominally infinite Reynolds number. Finally, we test the performance models have a great potential for developing reliable improved RANS of the models against two novel data-driven EARSMs and the traditional models even though their new model is limitedly optimised for wake BSL-EARSM based on the constitutive relation of Wallin and Johansson mixing flow. In another study, Saïdi et al. (2022) used response surfaces (2000). to find the best linear combination of candidate functions to correct the RST values modelled by the 𝑘 − 𝜔 SST model for the cases with 2. Methodology flow separation. They also showed that a CFD-driven approach yields models that perform better than the data-driven models trained on the This section presents the structure of the progressive correction same set of data. In the study by Huang and Yang (2021), the authors model for RANS modelling and the optimisation technique for training resort to Kriging and obtained wall models for boundary layer flows the correction model. By using the Reynolds decomposition of velocity subjected to system rotation in an arbitrary direction. The model was and pressure, the Navier–Stokes equation for an incompressible steady shown to predict deviations in the mean flow from the equilibrium law flow can be written as (Pope, 2000) of the wall. In the study by Xiang et al. (2021), the authors employed 𝑢𝑖 = ⟨𝑢𝑖 ⟩ + 𝑢′𝑖 , 𝑝 = ⟨𝑝⟩ + 𝑝′ , (1) an evolutionary neural network and arrived at numerical strategies for the pressure Poisson equation with density discontinuities. Further- 1 ( ) 𝜕𝑖 ⟨𝑢𝑖 ⟩ = 0, 𝜕𝑗 (⟨𝑢𝑖 ⟩⟨𝑢𝑗 ⟩) = − 𝜕𝑖 ⟨𝑃 ⟩ + 𝜕𝑗 𝜈𝜕𝑗 ⟨𝑢𝑖 ⟩ − 𝐴𝑖𝑗 , (2) more, the CFD-driven methodology is extended to a multi-objective 𝜌 optimisation for coupled turbulence closure models by Waschkowski where 𝑖, 𝑗 = 1, 2, 3 are indicating the streamwise (𝑥), spanwise (𝑦), et al. (2022). However, the application of surrogate-based optimisation and vertical (𝑧) directions, respectively. 𝑢𝑖 and 𝑝 are velocity and (SBO) and Bayesian optimisation methods (Queipo et al., 2005) to pressure decomposed to a temporal mean term, indicated by ⟨⋅⟩, and improve complex mathematical tools such as RANS turbulence mod- fluctuations, indicated by ⋅′. The kinematic viscosity is denoted by 𝜈, elling has not been widely explored. In this regard, SBO and Bayesian and 𝜌 is the fluid density. 𝐴𝑖𝑗 = ⟨𝑢′𝑖 𝑢′𝑗 ⟩ − 13 ⟨𝑢′𝑘 𝑢′𝑘 ⟩𝛿𝑖𝑗 is the anisotropic optimisation methods show potential in evaluating problems based on part of RST, and pressure is modified with the isotropic part of the RST design and analysis of computer experiments (DACE) (Sacks et al., as ⟨𝑃 ⟩ = ⟨𝑝⟩ + 13 𝜌⟨𝑢′𝑖 𝑢′𝑖 ⟩. 1989) with a low-to-medium number of parameters to optimise. In this study, the 𝑘 − 𝜔 SST (Menter, 1994) model is used as a One of the most important topics in data-driven RANS modelling baseline model to be progressively corrected to predict secondary flows. is the generalisability of the new models for unseen cases (Sandberg In the standard 𝑘 − 𝜔 SST model, 𝐴𝑖𝑗 is modelled as and Zhao, 2022). It has been suggested that using a multi-case CFD- 𝐴𝐵𝐿 𝑖𝑗 = −2𝜈𝑡 𝑆𝑖𝑗 , (3) driven approach to consider different turbulent phenomena during 1 the optimisation process can help with the generalisability of the where 𝑆𝑖𝑗 = (𝜕 ⟨𝑢 ⟩ + 𝜕𝑗 ⟨𝑢𝑖 ⟩) 2 𝑖 𝑗 is the strain rate tensor, 𝜈𝑡 is the turbulent new model (Fang et al., 2023). However, even these new models are viscosity which is calculated by values of turbulent kinetic energy (TKE) still specific to either wall-free or wall-bounded flows; therefore, the (i.e. 𝑘), and the specific dissipation rate (i.e. 𝜔), which are modelled by generalisability problem requires further investigation. the original two-equation model (Menter et al., 2003). 2 M.J. Rincón et al. International Journal of Heat and Fluid Flow 104 (2023) 109242 Fig. 1. Flow characteristics of a duct flow with AR = 1 and hight and width of 2H. Note the formation of secondary motions and their symmetry. 2.1. Progressively data-augmented EARSM = {𝐼1 , 𝐼2 , 𝐼3 , 𝐼4 , 𝐼5 , 𝐼12 , 𝐼22 , 𝐼32 , 𝐼42 , 𝐼52 , We extend the structure of the linear eddy-viscosity model (Eq. (3)) 𝐼1 𝐼2 , 𝐼1 𝐼3 , 𝐼1 𝐼4 , 𝐼1 𝐼5 , 𝐼2 𝐼3 , 𝐼2 𝐼4 , (8) by following Pope’s decomposition (Pope, 1975) of the RST and con- 𝐼2 𝐼5 , 𝐼3 𝐼4 , 𝐼3 𝐼5 , 𝐼4 𝐼5 }, sidering the two first terms of the decomposition, where 𝜃𝑖 are constant coefficients to be determined by the CFD-driven (1) 𝑆𝑖𝑗 (2) 𝑆𝑖𝑘 𝛺𝑘𝑗 − 𝛺𝑖𝑘 𝑆𝑘𝑗 optimisation process. To achieve a more efficient sparse optimisation, 𝐴𝑖𝑗 = 2𝑘𝛼 + 2𝑘𝛼 , (4) 𝜔 𝜔2 the normalised candidate functions are normalised and defined as where 𝛺𝑖𝑗 = 12 (𝜕𝑖 ⟨𝑢𝑗 ⟩ − 𝜕𝑗 ⟨𝑢𝑖 ⟩) is the rotation rate tensor, and 𝛼 (𝑛) are − 𝜇𝑖 𝑖 = 𝑖 , (9) unknown functions of 5 invariants defined as, 𝜎𝑖 tr(𝑆𝑖𝑘 𝑆𝑘𝑗 ) tr(𝛺𝑖𝑘 𝛺𝑘𝑗 ) tr(𝑆𝑖𝑘 𝑆𝑘𝑚 𝑆𝑚𝑗 ) where 𝜇𝑖 , 𝜎𝑖 are the mean and the standard deviation of each candidate 𝐼1 = , 𝐼2 = , 𝐼3 = , 𝜔2 𝜔2 𝜔3 function 𝑖 , respectively. These statistics are calculated based on high- tr(𝛺𝑖𝑘 𝛺𝑘𝑚 𝑆𝑚𝑗 ) tr(𝛺𝑖𝑘 𝛺𝑘𝑚 𝑆𝑚𝑙 𝑆𝑙𝑗 ) fidelity data from the optimisation case. Therefore, Eq. (7) is rewritten 𝐼4 = , 𝐼5 =. (5) as, 𝜔3 𝜔4 A comparison between Eqs. (3) and (4) shows that the 𝑘 − 𝜔 SST ∑ 20 model is already providing the first term in Pope’s decomposition of the 𝛼 (2) = 𝐶0 + 𝐶𝑖 𝑖 , (10) RST (i.e., the turbulent viscosity); therefore following the progressive 𝑖=1 approach, only the second term is trained in this study. The reason where an optimisation technique determines the coefficients 𝐶𝑖 based behind this decision is to preserve the original performance of the 𝑘 − 𝜔 on the performance of the correction model for the reconstruction of SST model in the prediction of the turbulent viscosity (i.e., 𝜈𝑡 ) while the high-fidelity velocity field of an optimisation case. the second basis tensor’s coefficient is determined with the exclusive In this study, only 2D canonical flow cases are considered, since they purpose of secondary flow prediction. Hence, the new RST model can are computationally cost-effective. However, reducing the dimensional- be written as ity of the optimisation problem and using computational parallelisation ( ) 𝑆𝑖𝑘 𝛺𝑘𝑗 − 𝛺𝑖𝑘 𝑆𝑘𝑗 is key if a complex 3D case is used in the training process. 𝐴𝑖𝑗 = −2𝜈𝑡 𝑆𝑖𝑗 − 𝛼 (2) , (6) Since considering 21 optimisation variables is impractical, making 𝜔 the model unnecessarily complex and exposed to solution instabilities, where 𝜈𝑡 and 𝜔 are modelled by the standard 𝑘 − 𝜔 SST model, where two approaches are chosen: Eq. (6) is used for the production term by the Reynolds stress tensor instead of Eq. (3), and the unknown function of 𝛼 (2) is determined by 1. Selecting only the first 𝑚 leading candidate functions. the CFD-driven optimisation technique. It should be mentioned that the 2. Reducing the dimensionality of the problem using a statistical linear part of the new model is treated implicitly as turbulent viscosity, technique. and the non-linear part of the RST is added explicitly to the RANS equations. The assumption behind the progressive approach is that Both of these approaches are compared, and for the purpose of di- using only the second basis tensor does not affect the incompressible mensionality reduction, principal component analysis (PCA) is applied parallel shear flow, either in the momentum equation or via production on the 𝑖 functions to obtain the first 𝑚 principal components as, in the 𝑘-equation; therefore, the performance of 𝑘 − 𝜔 SST is preserved ∑ 20 in cases where secondary flow is not present. 𝜑𝑗 = 𝑎(𝑗) 𝑖 𝑖 , (11) Inspired by a sparse regression of candidate functions (SpaRTA) 𝑖=1 (Schmelzer et al., 2020) used for 𝛼 (𝑛) , we use a set of candidate where the coefficients 𝑎(𝑗) are obtained by performing PCA on the functions to describe 𝛼 (2) as 𝑖 high-fidelity data from the optimisation case. It should be mentioned ∑ 20 that PCA also determines which features among all the 20 features 𝛼 (2) = 𝜃0 + 𝜃𝑖 𝑖 , (7) have a higher importance in the variability of 𝛼 (2) by determining the 𝑖=1 3 M.J. Rincón et al. International Journal of Heat and Fluid Flow 104 (2023) 109242 Fig. 2. Optimisation strategy employed involving an initial sampling plan based on Latin hypercube sampling (LHS) which is solved by CFD. Subsequently, an initial surrogate is constructed using Kriging. Since it is likely that the initial sampling may not evaluate the surrogate in extrema, Bayesian strategies based on efficient global optimisation (EGO) and the evaluation of the expected improvement (𝐸 [𝐼(𝑥)]) function are then applied to further explore the surrogate and improve its quality. Finally, once the quality requirements have been met, a genetic algorithm is employed to search for non-dominated optima on the surrogate model. 𝑎(𝑗) 𝑖 coefficients for each of them. By considering this transformation addition of these secondary flow motions is strong enough to also equation, Eq. (10) is rewritten as, incur changes in the streamwise direction of the flow. In addition, once ∑ 𝑚 the corrections are made, the mean streamwise flow is simultaneously 𝛼 (2) = 𝐶0 + 𝐶𝑖 𝜑𝑖. (12) modified. Since the aim of this study is to improve the overall flow 𝑖=1 prediction, it is important to not disregard possible predictions that Based on the high-fidelity data of the optimisation case, in Section 3.1 could improve secondary flow but worsen streamwise velocity. Hence, it is shown that the first two principal components are enough to two objective functions that physically describe the streamwise flow represent a high percentage of the variability of the dataset (i.e. 𝑚 = 2). and the secondary flow prediction are defined and evaluated. Therefore, two different structures for 𝛼 (2) are considered: In this regard and to holistically evaluate the flow prediction, the normalised error of both the streamwise velocity and vorticity with 𝛼𝐼(2) = 𝐶0 + 𝐶1 1 + 𝐶2 2 , (13a) respect to high-fidelity data, are evaluated and taken as objectives to be minimised. To yield a fair and accurate single value representing (2) 𝛼𝐼𝐼 = 𝐶0 + 𝐶1 𝜑1 + 𝐶2 𝜑2 , (13b) these quantities, the volumetric average of each field is computed as which are described as model I and model II , respectively. Where a the objective functions, following ( ) unique set of three coefficients 𝐶0 , 𝐶1 , and 𝐶2 are determined by a ∫ ⟨𝑢1 ⟩PDA-EARSM − ⟨𝑢1 ⟩HF d𝑉 multi-objective optimisation technique for each of the models. 𝑗1 = 𝑉 ( ) , (14a) ∫𝑉 |⟨𝑢1 ⟩𝑘−𝜔 − ⟨𝑢1 ⟩HF | d𝑉 ( ) ∫ ⟨𝜔1 ⟩PDA-EARSM − ⟨𝜔1 ⟩HF d𝑉 2.2. Optimisation methods 𝑗2 = 𝑉 ( ) , (14b) ∫𝑉 |⟨𝜔1 ⟩𝑘−𝜔 − ⟨𝜔1 ⟩HF | d𝑉 To approach the optimisation problem, a surrogate based on Kriging where | ⋅ | is the absolute value operation, ⟨𝜔𝑖 ⟩ = 𝜀𝑖𝑙𝑘 𝜕𝑙 ⟨𝑢𝑘 ⟩ is the and Gaussian processes (GPs) is built. The surrogate is based on DACE, vorticity of the mean flow, where 𝜀𝑖𝑙𝑘 is the alternating unit tensor, HF using observations obtained from a sequence of CFD solutions and post- stands for high fidelity, and 𝑘 − 𝜔 refers to the solution of the standard processing of results. SBO has shown its advantages for these types 𝑘 − 𝜔 SST model. The definition of these functions yields a value of 1 of problems, minimising the number of required observations, and when their output matches the prediction of 𝑘 − 𝜔 SST and 0 when the allowing the use of goal-seeking algorithms such as Bayesian optimi- predictions match the high-fidelity data. Subsequently, to evaluate the sation (Jones, 2001). To assess the model’s performance, two main overall results of the method, a global fitness function is defined as objective functions are established that involve streamwise velocity and 1( ) streamwise vorticity, defining a multi-objective optimisation problem. 𝐽= 𝑗 + 𝑗2. (15) 2 1 An effective sampling plan is also outlined, and an infill criterion constrained by quality metrics is specified. Finally, the methodology 2.2.2. Sampling plan is implemented using the general-purpose software OpenFOAM (Weller Two main algorithms are used in order to sample observations and et al., 1998). A flow diagram of the complete optimisation methodology data in this study: Monte Carlo and Latin hypercube. Monte Carlo is depicted in Fig. 2. sampling is used when extracting a test subset to compute the surrogate quality. Since the chosen sampling plan for the initial observations is 2.2.1. Objective functions oriented to be space-filling, Latin Hypercube Sampling (LHS) (McKay The principal issue that is assessed in this study is the complete et al., 2000) with optimised design by the enhanced stochastic evolu- lack of prediction of secondary flows in RANS turbulence models based tionary (ESE) algorithm (Jin et al., 2005; Damblin et al., 2013) is used on the Boussinesq assumption. In the chosen optimisation case, the to obtain an initial sample of the surrogate with a minimum number 4 M.J. Rincón et al. International Journal of Heat and Fluid Flow 104 (2023) 109242 of observations, and simultaneously, representing the real variability In this study, thresholds of 0.2 for RMSE and 0.8 for 𝑟2 are es- of the parameters. This deterministic sampling technique is a type of tablished as indicators of good surrogate quality, based on previous stratified Monte Carlo that divides each dimension space representing a studies (Sobester et al., 2008; Hastie et al., 2009). variable into 𝑛0 sections, and only one point is placed in each partition. The initial sample by LHS with ESE follows 𝑛0 = 30, where 𝑛0 is the 2.2.5. Infill space exploration initial number of observations, and is the number of design variables. Efficient global optimisation (EGO) based on Bayesian optimisation strategies is used to improve the accuracy of surrogate models and 2.2.3. Surrogate construction decrease overall uncertainty (Jones, 2001; Jones et al., 1998). This is In the context of a sparse optimisation problem, surrogates are com- achieved by exploring the surrogate beyond the initial sampling. EGO monly used to approximate a function 𝑦 = 𝑓 (𝑥) based on known obser- is a well-known algorithm that employs both local and global searches vations. In this study, the Kriging method is chosen for generating the to find the optimal solution by means of the expected improvement surrogate based on CFD-driven observations. Kriging is one of the most (𝐸[𝐼(𝑥)]) function as a key metric to direct its search. The function common surrogate construction methods due to its well-known imple- calculates the potential improvement that can be obtained by evalu- mentation, ability to compute uncertainties and speed. Furthermore, ating a new observation point based on the current best solution and it has been proven useful in physical, uncertainty quantification, and the overall uncertainty of the surrogate model (Mockus et al., 1978). engineering applications (Rincón et al., 2023; Kawai and Shimoyama, The implementation of this function provides the necessary sparsity- 2014). Kriging interpolates the observations as a linear combination of promoting behaviour to explore the design space in regions where their a deterministic term and a stochastic process, which is represented by initial sampling was not sufficient. The expected improvement function ∑ 𝑘 is defined as 𝑓̂(𝑥) = ( ) ( ) 𝛽𝑖 𝑓𝑖 (𝑥) + 𝑍(𝑥), (16) ( ) 𝑓min − 𝜇(𝑥) 𝑓min − 𝜇(𝑥) 𝑖=1 𝐸[𝐼(𝑥)] = 𝑓min − 𝜇(𝑥) 𝛷 + 𝜎(𝑥)𝜙 , (20) 𝜎(𝑥) 𝜎(𝑥) where 𝑓̂(𝑥) is the surrogate prediction, 𝛽 is a linear deterministic model, where 𝑓min = min 𝑌 , and 𝛷(⋅) and 𝜙(⋅) are respectively the cumulative 𝑓 (𝑥) is the known function, and 𝑍(𝑥) is the realisation of a stochastic and probability density functions of (0, 1), following the distribution process with zero mean and spatial covariance function given by (𝜇(𝑥), 𝜎 2 (𝑥)). Using EGO to explore the surrogate beyond the initial [ ( ) ( )] ( ) sampling aids in decreasing the overall uncertainty and improving the cov 𝑍 𝑥(𝑖) , 𝑍 𝑥(𝑗) = 𝜎 2 𝑅 𝑥(𝑖) , 𝑥(𝑗). (17) precision of the surrogate model, especially in the unexplored and Here, the spatial correlation function 𝑅 determines how smooth the extreme regions of the design space. Consequently, the algorithm can Kriging model is, how easily the response surface can be differentiated, effectively balance local and global searches and locate the optimal and how much influence the nearby sampled points have on the model. solution more proficiently. Hence, the following sampling point is In this study, the spatial correlation is defined following the squared determined by exponential (Gaussian) function as [ 𝑥𝑛+1 = arg max (𝐸[𝐼(𝑥)]). (21) ∏𝑛𝑥 ( )2 ] 𝑥 exp −𝜃𝑙 𝑥𝑙(𝑖) − 𝑥(𝑗) , ∀𝜃𝑙 ∈ R+ , (18) 𝑙 In order to fully explore the design space, new data is collected for 𝑖=1 each objective function separately. This involves taking one sample where the correlation scalar 𝜃𝑙 is used to define the variance of a of 𝑛EGO = 𝑛0 𝑥 → ←← 𝑦 for each objective, resulting in a total of 180 Gaussian process at each point, with higher values indicating a stronger new sampled points (𝑛0 for each of the objectives). With the addition correlation between points. By maximising the maximum likelihood of the initial LHS, the total number of observations sums to 270 per estimation, optimal values for hyper-parameters as 𝜃𝑙 , mean, and model, yielding a cost-efficient computational problem to be analysed standard deviation can be found (Bouhlel et al., 2019a). by Kriging. However, since the design space explored is large and the use of Bayesian optimisation might sample observations in very 2.2.4. Quality metrics close proximity, the gradients between points may be high, yielding The quality of the surrogate refers to how accurately it approximates possible noise. Therefore, the surrogate is regularised following the the true function being modelled. In order to evaluate this quality, a methodology by Bouhlel et al. (2019b), where noise is considered to be number of random test observations 𝑛𝑡 = 𝑛0 𝑥 → ←← 𝑦, are taken. where Gaussian distributed and its homoscedastic variance is evaluated based 𝑛𝑡 is a function of the number of initial observations. Once the test on the performed observations to obtain a smooth response surface. observations have been generated, they are used to compare the surro- gate’s predictions to the true values of the function being modelled. The 2.2.6. Multi-objective solution root-mean-squared error (RMSE) and Pearson’s correlation coefficient The Kriging method offers a significant benefit since it allows the ap- squared (𝑟2 ), defined as √ plication of algorithms that can thoroughly explore the objective space. √ ∑𝑛 ( √ 𝑡 𝑦(𝑖) − 𝑦̂(𝑖) )2 Consequently, the objective of this study is to find the optimal solution √ 𝑖=1 RMSE = , (19a) for the surrogate model. To achieve this, the multi-objective evolu- 𝑛𝑡 tionary algorithm (MOEA) NSGA-II has been chosen. This algorithm ( )2 is known for its versatility, fast and efficient convergence, and ability cov(𝑦, 𝑦) ̂ 𝑟2 = √ , (19b) to handle non-penalty constraints. It also has a wide-range search for var(𝑦)var(𝑦) ̂ solutions, which implies that it can explore the objective space more are two common metrics used to evaluate the quality of surrogates. thoroughly and is less likely to get grounded in local optima (Deb et al., While the RMSE measures the difference between the surrogate’s pre- 2002). By using the MOEA NSGA-II algorithm, the improved solutions dictions and the true values, 𝑟2 measures the strength of its linear are ensured to be non-dominated, meaning that they are not worse than relationship. A high RMSE and a low 𝑟2 indicate that the surrogate is any other feasible solution in all objective functions. This also ensures performing poorly and needs to be refined, while a low RMSE and a that the solutions avoid local minima and are highly likely to discover high 𝑟2 indicate that the surrogate is performing accurately and can be the global optimum for the sampled response surface. This makes the used with confidence. Hence, these values serve as reference points to algorithm able to search a large portion of the objective space and determine whether the surrogate model is accurate enough to be used find the best possible solution for the surrogate model while avoiding in the optimisation process. suboptimal solutions. 5 M.J. Rincón et al. International Journal of Heat and Fluid Flow 104 (2023) 109242 Fig. 3. Contours of streamwise velocity and secondary flow streamlines for duct flow optimisation case at AR = 1 and Re𝑏 = 3500 for RANS 𝑘 − 𝜔 SST (left) and DNS (right) data. DNS data obtained from Pinelli et al. (2010). The dotted white lines denote the location of the velocity profiles evaluated throughout this study to perform a quantitative comparison. It should be noted the varying thickness of the secondary flow streamlines denotes their magnitude. Fig. 4. Barycentric representation of RSTs’ shape for 𝑘 − 𝜔 SST (left) and DNS (centre) for training duct flow case with AR = 1 at Re𝑏 = 3500. DNS data obtained from Pinelli et al. (2010). The plain-strain limit is represented by the dotted line, and the colour representation (right) follows the work by Emory and Iaccarino (2014). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.) 2.3. High-fidelity data 3.1. Surrogate-based optimisation Since the main purpose of this study is to obtain a PDA-EARSM 3.1.1. Optimisation case: 𝑘 − 𝜔 SST performance for capturing the secondary flow, the DNS data of a canonical case It is essential to qualitatively indicate the original performance of with this flow characteristic is chosen. A duct flow case of AR = 1 𝑘 − 𝜔 SST against high-fidelity data prior to any modifications. For and bulk Reynolds number of Re𝑏 = 3500 is used for the training both fidelity levels, the results for the square duct case show the process (depicted in Fig. 1). The DNS data is obtained from Pinelli et al. progressive deceleration of the streamwise flow as it approaches the (2010) curated by McConkey et al. (2021). Following the progressive walls of the duct, exhibiting higher gradients in the near-wall regions as approach, the trained PDA-EARSMs are tested on two cases of channel a consequence of the boundary layer and the fully developed turbulent flow with friction Reynolds number of Re𝜏 = 395 and Re𝜏 = 5200 for flow. Whereas this behaviour can be predicted by 𝑘 − 𝜔 SST, the which data is obtained from Moser et al. (1999) and Lee and Moser presence of a secondary flow predicted by DNS is completely neglected (2015), respectively. Considering the generalisability of the new mod- by the RANS model (Fig. 3), as expected (Nikitin et al., 2021). Two els, we likewise test them on four unseen cases containing secondary antisymmetric rotational regions are generated diagonally at the duct’s flows, including duct secondary flow cases with AR = 1 and higher vertex, preserving their symmetry at the 4 quadrants of the duct. This bulk Reynolds numbers of Re𝑏 = 5700 (Vinuesa et al., 2018) and secondary flow is strong enough to drive the streamwise flow towards Re𝑏 = 10320 (Huser and Biringen, 1993) , a duct secondary flow case the duct vertices, skewing the streamwise velocity field in the domain with AR = 3, and a lower bulk Reynolds number of Re𝑏 = 2600 (Vinuesa through an impinging-like flow behaviour. Since 𝑘 − 𝜔 SST does not et al., 2018), and a roughness-induced secondary flow with nominally predict these rotational motions, the streamwise velocity distribution is infinite Reynolds number more information about the geometry and not skewed and the spanwise and vertical components of the velocity properties of this case is available at Amarloo et al. (2022b,a). are zero. Fig. 4 compares the barycentric plots representing the shape of RSTs 3. Results and discussion for the optimisation case. The location of each point is calculated based on the eigenvalues of the normalised anisotropic part of the RST (more In this section, the results are presented in two subsections. In information about barycentric plots available at Hornshøj-Møller et al., the first one, the results of the optimisation process and training of 2021 and Emory and Iaccarino, 2014). While 𝑘−𝜔 SST yields all results the two best PDA-EARSMs are presented, whereas, in the second, within the plain-strain limit as expected from a linear eddy-viscosity the performance of the trained models is evaluated on the test cases RANS model, DNS shows a more complex stress anisotropy distribution with different geometries, Reynolds numbers, and boundary conditions. as a combination of states towards one-component turbulence fluc- Associated contours and velocity profiles accompany all results which tuations 𝑥̂ 1𝑐 (also known as rod-like or cigar-like turbulence) and an are similarly described and discussed, followed by their corresponding offset from the plain-strain limit with a tendency towards isotropic (or figures. spherical) turbulence 𝑥̂ 3𝑐. 6 M.J. Rincón et al. International Journal of Heat and Fluid Flow 104 (2023) 109242 Fig. 5. PCA results of the candidate functions calculated by DNS data of the optimisation case. 3.1.2. Optimisation case: PDA-EARSMs streamwise flow does not necessarily correlate with the secondary flow Since 𝑘 − 𝜔 SST is not able to predict the secondary flow of prediction in this case. the optimisation case, CFD-driven optimisation is applied following Regarding the surrogate quality (Fig. 7(a)), and Pareto front pre- the models described in Eqs. (13a) and (13b) focusing on predicting diction and numerical validation (Fig. 7(b)), the surrogate predicts the secondary flows and correct streamwise velocity. test set with great accuracy for the whole design space tested, which is Regarding model II, PCA is applied to the set of candidate functions likewise reflected by a highly accurate Pareto front prediction, where (listed in Eq. (8)) using the high-fidelity data of the optimisation case. all numerical tests simulated are able to predict the objective space Fig. 5 presents the PCA results and shows that only the first 2 principal within 1𝜎 uncertainty bounds for both objective functions. components (𝜑1 and 𝜑2 ) yield an explained variance ratio of 0.90 Model II. In contrast with model I, the surrogate visualisation for both (Fig. 5(a)), where 𝜑1 is the main principal component explaining the objective functions (Fig. 8) in model II shows a non-linear prediction variability. Fig. 5(b) presents the coefficients corresponding to the two for all variables. This behaviour is somewhat expected due to the added first principal components used for model II. complexity of the model with the PCA. The global minimum for 𝑗1 does As mentioned in Section 2, the two first principal components for not have a straightforward location. Instead, there are diverse regions model II are chosen. To compare model I in an analogous manner, where the global extrema can be found. Specifically, negative values of the two first leading candidate functions are likewise chosen where 𝐶0 , and near-zero values of 𝐶1 and 𝐶2 predict generally lower values the coefficients of these two models are determined by surrogate and of 𝑗1 and 𝑗2 (Figs. 8(a) and 8(b)). Similarly to model I, although the Bayesian optimisation. objective spaces for both functions follow similar tendencies, there are The surrogate model is generated using the RANS simulations re- visible differences in the predicted locations of the minima, reflecting sults to find the optimal values of the optimisation variables. Regarding once more that a more accurate prediction of the streamwise flow does the optimisation process and since cases with high values of the objec- not necessarily correlate with the secondary flow prediction. tive functions are not of interest, the MOEA optimisation is performed Regarding the surrogate quality (Fig. 9(a)), and Pareto front pre- by imposing the constraints 𝑗1 ≤ 0.5 and 𝑗2 ≤ 0.4, therefore, the diction and numerical validation 8(b), the surrogate predicts the test analysed non-dominated solutions only take place in the regions of set with high accuracy for the whole design space tested. These lower- highest interest and potential global minimisation. To yield a smooth confidence regions are not reflected in the non-dominated solutions non-dominated solution front, 5000 samples are required by the MOEA nor in their numerical validation. Since the expected improvement in this study. This showcases the cost advantages of surrogates, where function prioritises good confidence in the near-extrema region and 270 CFD observations are required, yielding a cost reduction of 18.4 the complexity of the model is higher, a certain level of uncertainty is times. expected in far-away regions of the global minimum. Nevertheless, the Regarding both models’ results, an optimal design space of surrogate shows a good level of confidence at the Pareto front (within [−2, −1, −1] ≤ 𝐶𝑖 ≤ [0, 1, 1] is chosen for model I, and an optimal design 2𝜎 in Fig. 9(b)). space of [−1.75, −0.25, −0.25] ≤ 𝐶𝑖 ≤ [−1.25, 0.25, 0.25] is chosen for model II after various progressive surrogate iterations. The iterations 3.2. Comparison of models re-define the design space limits by analysing the non-dominated values of the coefficients after optimisation. If these values reach the imposed The final results for each model are shown in Table 1, where the best limits, the objective space is reset, and the design space is adapted. cases per model are shown, followed by their 𝐶𝑖 values and surrogate Model I. Contour plots in Fig. 6(b) illustrate how the optimisation quality parameters. Both models significantly improved the standard variables affect the objective functions. On the one hand, the surrogate 𝑘 − 𝜔 SST by 61.3% and 60.2% for model I and II respectively. Differ- visualisation for 𝑗1 (Fig. 6(a)) shows a quasi-linear tendency for all vari- ences in performance for the objective values are considered negligible. ables. The global minimum for 𝑗1 is shown towards negative values for As seen in Figs. 7(a) and 9(a), the surrogate quality is slightly worse all 𝐶𝑖 without clear visualisation of extrema. On the other hand, there for model II. This is due to the higher complexity added by PCA and is a clear global minimum seen for 𝑗2 values, shown by a non-linear the inclusion of higher order terms in 𝛼2 , yielding more non-linear surrogate for 𝐶0 and 𝐶1 , where quasi-linear behaviour is predicted behaviour in the design space and, thus, a higher RMSE and lower 𝑟2 by the relationship between 𝐶1 and 𝐶2 (Fig. 6(b)). It is important for model II. to highlight that, although the lowermost values for the coefficients Regarding qualitative results for the optimisation case, both mod- predict an improvement for 𝑗1 , the tendency is not completely followed els are able to accurately predict the direction and symmetry of the by 𝑗2 prediction. This implies that a more accurate prediction of the secondary motions while improving the streamwise flow prediction. 7 M.J. Rincón et al. International Journal of Heat and Fluid Flow 104 (2023) 109242 Fig. 6. Surrogate contour visualisation for model I. It should be noted that each pair of variables is shown by holding the other variable at the mid-value of their range. Fig. 7. Surrogate quality against test set (Fig. 7(a)), and Pareto-front validation with 1𝜎 uncertainty bounds (Fig. 7(b)) for model I. Table 1 Objective function results, coefficients, and quality for best models and surrogates of each approach. Model 𝐽 𝑗1 𝑗2 𝐶0 𝐶1 𝐶2 RMSE1 RMSE2 𝑟21 𝑟22 I 0.387 0.455 0.319 −1.653 0.625 1 0.004 0.048 0.999 0.966 II 0.398 0.457 0.339 −1.613 0.074 0.015 0.032 0.079 0.895 0.880 However, the streamwise vorticity error is generally higher in the impact on the overall performance of the models. In summary, both near-wall regions (Fig. 10). As a consequence of the prediction of models are able to predict the secondary flow with high accuracy. a correct secondary flow direction, the streamwise flow prediction Concerning the quantitative analysis of the velocity profiles at likewise improves. Regions of slight overprediction of ⟨𝑢1 ⟩ are seen in 𝑦∕𝐻 = [0.25, 0.5, 0.75], an overall and significant improvement in the near-wall vicinity as well as the middle section of the computational the prediction of both models against standard 𝑘 − 𝜔 SST, can be volume, whereas regions of ⟨𝑢1 ⟩ underprediction are seen in the bulk seen. Although the prediction of both models displays minor differ- flow close to the channel vertex. Regarding the prediction of ⟨𝜔1 ⟩, anti- ences between each other, they both predict with high accuracy the symmetric predictions are seen with respect to the diagonal symmetry DNS data. Some light discrepancy can be seen in the magnitudes at line with a general over and underprediction in the near-wall region. near-wall regions, where the gradient of ⟨𝑢2 ⟩ is high. Nevertheless, As expected from the objective function results, qualitative differ- gradients of velocity are accurately predicted, only displaying a slight ences between models (Fig. 11) are not clearly seen. For the streamwise underprediction in the velocity magnitude in near-wall regions. velocity prediction, no significant differences can be seen in the con- The turbulence shape of the developed models is represented in tours of the error function, whereas for the vorticity prediction, slight Figs. 12 and 13, where the reliasability for both models is conserved. variations in the error can be seen between models without a significant A certain level of anisotropy is added to the models, yielding results in 8 M.J. Rincón et al. International Journal of Heat and Fluid Flow 104 (2023) 109242 Fig. 8. Surrogate contour visualisation for model II. It should be noted that each pair of variables is shown by holding the other variable at the mid-value of their range. Fig. 9. Surrogate quality against test set (Fig. 9(a)), and Pareto front validation with 2𝜎 uncertainty bounds (Fig. 9(b)) for model II. the vicinity of the plain-strain limit. Both models predict a very similar Qualitatively and quantitatively, both models improve the predic- turbulence shape while model II slightly yields results farther away tion and agreement with high-fidelity Reynolds stresses while not from the plain-strain limit. Nonetheless, results are distant from 𝑥̂ 1𝑐 and showing significant differences in the prediction of Reynolds stresses DNS anisotropy. Although these results are expected, the integration between models. of the whole Pope’s decomposition with its 10 tensors and a more thorough definition of could be able to further improve the RANS 3.3. Verification and generalisability on test cases anisotropy prediction. Fig. 14 presents the profiles of RST components obtained by new In this subsection, the best-found models are tested against diverse models and shows that new correction models improve the prediction canonical cases in which secondary flow is an important component as of RST components. It should be mentioned that the Reynolds stress well as cases where secondary flow is not present. This verification is tensor is calculated as performed in order to provide a performance overview of the models’ 2 generalisability, stability, and robustness to preserve canonical flow 𝑅𝑖𝑗 = 𝐴𝑖𝑗 + 𝑘𝛿. (22) 3 𝑖𝑗 features. Firstly, validation of the channel case and the boundary layer The results of 𝑅𝑖𝑗 show that the introduced correction does not is performed at different Reynolds numbers. Secondly, the models change the original 𝑅11 prediction by 𝑘 − 𝜔 SST while the 2 other are tested against diverse cases of ducts with different aspect ratios principal components of the tensor (𝑅22 and 𝑅33 ) are predicted in and Reynolds numbers. Lastly, the models are tested against a wall- better agreement with high-fidelity data. Predictions of the off-diagonal modelled nominally infinite Reynolds number case in which secondary components of 𝑅𝑖𝑗 are overall displaying an improvement in prediction, flow is roughness-induced. Similarly as in Section 3, all results are highlighting the non-zero prediction of 𝑅23 , which shows a certain level shown and discussed with their respective velocity contour plots with of mismatch with high-fidelity data in the near-wall regions. stream functions and qualitative flow analysis of the velocity profiles. 9 M.J. Rincón et al. International Journal of Heat and Fluid Flow 104 (2023) 109242 Fig. 10. Qualitative streamwise and secondary flow prediction of the best cases for each of the models. Left: streamwise flow contours with stream function lines depicting secondary flow prediction and direction. Centre: streamwise flow error compared to DNS data. Right: streamwise vorticity error compared to DNS data. It should be noted that the stream function line thickness denotes the secondary flow intensity. In the colourbar, 𝜙 indicates either ⟨𝑢1 ⟩ or ⟨𝜔1 ⟩. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.) Fig. 11. Profiles of velocity components for duct flow case with AR = 1 and Re𝑏 = 3500. Source: High-fidelity data obtained from Pinelli et al. (2010). Fig. 12. Barycentric map showing the physical reliasability and turbulence anisotropy of the developed models following the colour representation by Emory and Iaccarino (2014). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.) 3.3.1. Channel flow and law of the wall canonical channel flow cases are driven by a constant pressure gradient in order to match the friction Reynolds number from DNS. The addition of 𝑇𝑖𝑗(2) and further modifications in this study must Results shown in Fig. 15 depict the streamwise velocity as 𝑢+ = 𝑢∕𝑢𝜏 not destabilise 𝑘 − 𝜔 SST and yield unphysical results. Therefore, both and 𝑘+ = 𝑘∕𝑢2𝜏 in function of 𝑦+ = 𝑢𝜏 𝑦∕𝜈, where 𝑢𝜏 is the friction models are tested in a channel flow case at friction Reynolds numbers velocity. The results show a consistent prediction of the law of the 395 and 5200 to verify that the law of the wall is preserved. The wall with 𝑘 − 𝜔 SST without introducing any noticeable changes. Both 10 M.J. Rincón et al. International Journal of Heat and Fluid Flow 104 (2023) 109242 Fig. 13. Contours of RSTs’ shape for duct flow case with AR = 1 and Re𝑏 = 3500. The colours follow the reference barycentric map in Fig. 4. Source: DNS data from Pinelli et al. (2010). Fig. 14. Profiles of Reynolds stress components for duct flow case with AR = 1 and Re𝑏 = 3500. Source: High-fidelity data obtained from Pinelli et al. (2010). models yield the same results for both the streamwise velocity and the In order to verify the performance of the models in diverse cases turbulence kinetic energy, and no improvements or diminishments in with similar features, the models are likewise tested against a duct of the prediction of these variables are seen compared to standard 𝑘 − 𝜔 AR = 3 and a slightly lower Re𝑏 = 2600 compared to the optimisation SST. case. The qualitative results shown in Fig. 18 indicate a similar perfor- 3.3.2. Duct cases mance for both models: gradients, symmetry, and direction of sec- To showcase the generalisability of the models, diverse duct cases ondary flow are predicted accurately with slight inaccuracies regarding are tested. The first case is the flow through a duct of AR = 1 and a the secondary flow magnitude in the near-wall regions. The location of considerably higher Re𝑏 = 5700 compared to the optimisation case. the vortices centres is likewise predicted with high accuracy, agreeing A qualitative depiction of the results is shown in Fig. 16, where with high-fidelity data (Fig. 18). both models are able to predict the gradients of the secondary flow In terms of the quantitative analysis and the evaluation of the and improve the prediction of the streamwise velocity compared to velocity profiles, the results shown in Fig. 19 likewise follow a similar 𝑘 − 𝜔 SST. In agreement with the baseline results of the models, the trend compared to previous results in this study. The region in the magnitude of the secondary motion is weaker than the high-fidelity vicinity of the duct vertex shows some discrepancy in the velocity data although the motion’s antisymmetry and location of the vortices prediction. Nevertheless, the bulk flow away from the near wall is centres are predicted accurately. accurately predicted. Regarding the velocity profiles of the case (Fig. 17), a similar trend is observed where the gradients and magnitude of the profiles are 3.3.3. Roughness-induced secondary flow case predicted with high accuracy in both models, only displaying some Finally, the verification of the models is tested on a roughness- discrepancies in the wall-near regions, where the high-fidelity data induced secondary flow case. The case is based on the studies by yields slightly higher velocity gradients. Forooghi et al. (2020) and Amarloo et al. (2022b), and is characterised 11 M.J. Rincón et al. International Journal of Heat and Fluid Flow 104 (2023) 109242 Fig. 15. Mean streamwise velocity and turbulent kinetic energy profiles for channel flow: Re𝜏 = 395 (Fig. 15(a)), and Re𝜏 = 5200 (Fig. 15(b)). Source: High-fidelity data obtained from Moser et al. (1999). Fig. 16. Streamwise flow contours with stream function iso-lines depicting secondary flow prediction and direction: Duct flow case with AR = 1 and Re𝑏 = 5700. Source: High-fidelity data obtained from Vinuesa et al. (2018). Fig. 17. Profiles of velocity components for duct flow case with AR = 1 and Re𝑏 = 5700. Source: High-fidelity data obtained from Vinuesa et al. (2018). by displaying a nominally infinite Reynolds number. It should be performance in a more complex and challenging to predict case, mostly noted that due to the very high Reynolds number, the use of wall by two-equation RANS turbulence models. For a better analysis of roughness-induced secondary flow, disper- models is inevitable with current hardware, therefore, the atmospheric sive velocity components are defined as, wall models (Parente et al., 2011) have been used. The secondary flows generated in this case are recognised as Prandtl’s second kind ⟨𝑢′′ 𝑖 ⟩ = ⟨𝑢𝑖 ⟩ − ⟨𝑢̃ 𝑖 ⟩, (23) of secondary flow, similar to the square duct flow (Nikitin et al., ⟨𝑢′′ where 𝑖 ⟩ is the dispersive velocity components, and ⟨𝑢̃ 𝑖 ⟩ is the spatial 2021). The roughness-induced case is chosen to showcase the models’ spanwise-averaged mean velocity. 12 M.J. Rincón et al. International Journal of Heat and Fluid Flow 104 (2023) 109242 Fig. 18. Streamwise flow contours with stream function iso-lines depicting secondary flow prediction and direction: Duct flow case with AR = 3 and Re𝑏 = 2600. Source: High-fidelity data obtained from Vinuesa et al. (2018). Fig. 19. Profiles of velocity components for duct flow case with AR = 3 and Re𝑏 = 2600. Source: High-fidelity data obtained from Vinuesa et al. (2018). Following previous verification, the qualitative results of the stream- compared to the high-fidelity data. The predicted tendencies are cor- wise dispersive velocity (⟨𝑢′′ 1 ⟩) are shown in Fig. 20. On the one hand, rect, although there is a discrepancy in the gradients and magnitudes results show that standard 𝑘 − 𝜔 SST does not predict any secondary predictions relative to previous verification cases. motion that drives the flow to display a high-momentum path on top To provide a complete and holistic comparative study for this case, of the high-roughness patch and a low-momentum path at 𝑦∕𝐻 = 0, the spatial average of the streamwise, as well as vertical root-mean- as the high-fidelity data predicts. On the other hand, the enhanced squared (RMS) dispersive velocity (⟨𝑢̃ 1 ⟩ and RMS ⟨𝑢′′ ⟩, respectively), 3 models are both capable of predicting the secondary motion, although are calculated. These results are shown in Fig. 22, where the prediction at a lower intensity compared to high-fidelity results. The direction of of ⟨𝑢̃ 1 ⟩ shows a considerable improvement compared to standard 𝑘 − 𝜔 the rotation (clockwise) is correctly predicted although the vortex cen- SST. However, the bulk flow for the developed models is still over- tre does not visibly match the high-fidelity counterpart. Even though predicted as a consequence of the underpredicted secondary flow and the predicted secondary flow is not strong enough to move the low- its derived under-subtraction of the streamwise momentum. Regarding momentum path to the top of the low-roughness patch (𝑦∕𝐻 = 0), the RMS ⟨𝑢′′ ⟩, a clear improvement in the prediction can be seen the existence of a weak high-momentum path is visible on top of the 3 by the developed models, where 𝑘 − 𝜔 SST is unable to predict any high-roughness patch. Since the source of vorticity production lies at roughness-heterogeneity existing at the bottom wall (more information vertical dispersive velocity. The predictive profiles are underpredicted at Amarloo et al. (2022b)), the authors believe that further investiga- and slightly skewed towards 𝑧∕𝐻 = 0 compared to the high-fidelity tions about wall models regarding the secondary flow can help a better data, which verifies the mismatched location of the predicted secondary prediction of the secondary flow. flow vortices. Both developed models yield an almost identical pre- Quantitive results reflect the previous qualitative analysis. Fig. 21 diction of the flow, and a similar improvement is seen by models at [ ] shows the velocity profiles at 𝑦∕𝐻 = 𝜋∕8, 𝜋∕4, 3𝜋∕8 , where the their prediction of RMS ⟨𝑢′′ 3 ⟩. Overall, the applicability of the models improvement of the models can be seen. In contrast to other verification to this case shows their generalisability, where robust and improved cases in this study, both models’ predictions improve the performance predictions are obtained in more complex cases at very high Reynolds of standard 𝑘 − 𝜔 SST, however, the secondary flow intensity is weaker numbers. 13 M.J. Rincón et al. International Journal of Heat and Fluid Flow 104 (2023) 109242 Fig. 20. Streamwise dispersive velocity contours with stream function iso-lines depicting secondary flow prediction and direction: Roughness-induced secondary flow case with Re𝑏 = 2 × 108. Source: High-fidelity data obtained from Amarloo et al. (2022b). Fig. 21. Profiles of velocity components for roughness-induced secondary flow with Re𝑏 = 2 × 108. Source: High-fidelity data obtained from Amarloo et al. (2022b). Fig. 22. Roughness-induced secondary flow quantitative results of spanwise-averaged streamwise velocity (left), and root-mean-squared dispersive vertical velocity (right). Source: High-fidelity data obtained from Amarloo et al. (2022b). 14 M.J. Rincón et al. International Journal of Heat and Fluid Flow 104 (2023) 109242 Fig. 23. Model comparison of mean streamwise velocity and turbulent kinetic energy profiles for channel flow: Re𝜏 = 395 (Fig. 23(a)), and Re𝜏 = 5200 (Fig. 23(b)). Source: High-fidelity data obtained from Moser et al. (1999). 4. Comparison with other EARSMs To provide a complete and updated performance of the developed models, a comparison is made against the two novel machine-learned EARSMs developed by Saïdi et al. (2022) denoted as M1 (Model 1) and M2 (Model 2); and the classical (not machined-learned) EARSM BSL-EARSM based on 𝑘 − 𝜔 SST model and on the explicit constitutive relation by Wallin and Johansson (2000) and developed by Menter et al. (2012). The comparison is made on CF395 , CF5200 , and duct flow with Re𝑏 = 10320 and AR= 1. As shown in the comparative results of Fig. 23, Saidi et al. models mispredict the prediction of boundary layers due to its modification of 𝑇𝑖𝑗(1) in their models. These modifications improve the prediction of separated flows at the expense of modifying the successful prediction of the law of the wall by standard 𝑘 − 𝜔 SST. This undesirable effect, however, can be avoided by deactivating the 𝑇𝑖𝑗(1) correction in the regions corresponding to equilibrium boundary layers (e.g., a space- dependant aggregation in the study by Cherroud et al. (2023)). Since Fig. 24. Normalised vertical velocity component comparison between models for the work performed in this study only adds the contributions of 𝑇𝑖𝑗(2) , DF10320 with AR= 1 along line 𝑦 = 𝑧. the developed models are able to preserve the successful predictions of Source: High-fidelity data obtained from Huser and Biringen (1993). 𝑘 − 𝜔 SST for equilibrium boundary layers. Finally, a comparison showing the prediction of the ⟨𝑢2 ⟩ component along the 𝑦 = 𝑧 line of a square duct at Re𝑏 = 10320 is shown in Fig. 24. It can be seen that the PDA-EARSMs outperform the models by Saidi the 𝑘 − 𝜔 SST model in a successful reproduction of the velocity and et al. and yield similar performance compared to BSL-EARSM. TKE profiles. For the purpose of generalisability investigation, new models are 5. Conclusions tested against verification duct-flow cases with different Reynolds num- bers and different aspect ratios. Both enhanced models show a suc- This study employs CFD-driven surrogate and Bayesian optimisation cessful prediction of secondary flow and improvement of streamwise to enhance the 𝑘 − 𝜔 SST turbulence model focusing on predicting velocity prediction in the unseen cases. The third and final verification Prandtl’s second kind of secondary flow. A progressive approach is case is a roughness-induced secondary flow case at a nominally infinite adopted to enhance the linear eddy-viscosity model with the ability to Reynolds number, where the enhanced models are able to predict the predict secondary flows while maintaining its effective performance in secondary motion, although at a lower intensity compared to high- canonical flow cases. The enhancement is based on the introduction of fidelity data due to the limitations of wall modelling for this type of explicit algebraic Reynolds stress correction models with two levels of cases. complexity: introducing a linear combination of the first two candidate Overall, the results, summarised in Table 2, show that the developed functions and introducing a linear combination of the first two principal models perform substantially better than the standard 𝑘 − 𝜔 SST model components obtained from PCA on all candidate functions. These two in all verification cases. The enhanced models are able to predict models are optimised to reach the best prediction of both secondary secondary flow features, showing global improvements in the velocity flow and streamwise flow in the optimisation case of a duct flow with field prediction. Both models show similar performance in successfully an aspect ratio of 1 and Re𝑏 = 3500. predicting secondary flows and streamwise velocity. Therefore, more Considering the progressive approach, the enhanced models are invariants combined with PCA do not necessarily improve the per- tested against channel flow cases at different friction Reynolds num- formance of the models in the 2-dimensional cases considered in this bers, where the enhanced models preserve the original performance of study. Nevertheless, considering scaling up the complexity of the model 15 M.J. Rincón et al. International Journal of Heat and Fluid Flow 104 (2023) 109242 Table 2 References Objective function results for optimisation case and all verification cases tested. Results are presented by the case abbreviation and characteristics: Duct flow cases as DFAR, Re Amarloo, A., Cinnella, P., Iosifidis, A., Forooghi, P., Abkar, M., 2023. Data-driven and roughness-induced secondary flow as RIRe. It should be noted that DF1, 3500 is the Reynolds stress models based on the frozen treatment of Reynolds stress tensor optimisation case. 𝐽 = 1 indicates the 𝑘 − 𝜔 SST original prediction and 𝐽 = 0 indicates and Reynolds force vector. Phys. Fluids 35 (7), 075154. a match of the high-fidelity data. Amarloo, A., Forooghi, P., Abkar, M., 2022a. Frozen propagation of Reynolds force Case Model 𝐽 𝑗1 𝑗2 vector from high-fidelity data into Reynolds-averaged simulations of secondary I 0.387 0.455 0.319 flows. Phys. Fluids 34 (11), 115102. DF1, 3500 II 0.398 0.457 0.339 Amarloo, A., Forooghi, P., Abkar, M., 2022b. Secondary flows in statistically unstable I 0.405 0.482 0.328 turbulent boundary layers with spanwise heterogeneous roughness. Theor. Appl. DF1, 5700 Mech. Lett. 12 (2), 100317. II 0.392 0.461 0.323 Barzilai, J., Borwein, J.M., 1988. Two-point step size gradient methods. IMA J. Numer. I 0.458 0.551 0.365 DF3, 2600 Anal. 8 (1), 141–148. II 0.502 0.580 0.424 Bin, Y., Chen, L., Huang, G., Yang, X.I., 2022. Progressive, extrapolative machine I 0.693 0.663 0.723 learning for near-wall turbulence modeling. Phys. Rev. Fluids 7 (8), 084610. RI2×108 II 0.698 0.666 0.730 Bouhlel, M.A., Hwang, J.T., Bartoli, N., Lafage, R., Morlier, J., Martins, J.R.R.A., 2019a. I 0.486 0.538 0.434 A Python surrogate modeling framework with derivatives. Adv. Eng. Softw. 102662. Average II 0.498 0.541 0.454 Bouhlel, M.A., Hwang, J.T., Bartoli, N., Lafage, R., Morlier, J., Martins, J.R., 2019b. A python surrogate modeling framework with derivatives. Adv. Eng. Softw. 135, 102662. Cherroud, S., Merle, X., Cinnella, P., Gloerfelt, X., 2023. Space-dependent aggregation in 3-dimensional cases, the combination with PCA dimensionality re- of data-driven turbulence models. arXiv preprint arXiv:2306.16996. duction could further improve the predictions. It should be noted that Coello, C.A.C., Lamont, G.B., Van Veldhuizen, D.A., et al., 2007. Evolutionary the optimisation process and the nature of the PDA-EARSMs yield a cer- Algorithms for Solving Multi-Objective Problems, Vol. 5. Springer. Cruz, M.A., Thompson, R.L., Sampaio, L.E., Bacchi, R.D., 2019. The use of the Reynolds tain level of turbulence anisotropy while always maintaining solution force vector in a physics informed machine learning approach for predictive robustness and stability. This fact results in some discrepancies between turbulence modeling. Comput. & Fluids 192, 104258. the PDA-EARSMs and high-fidelity data in the predicted turbulence Damblin, G., Couplet, M., Iooss, B., 2013. Numerical studies of space-filling designs: shape. optimization of Latin Hypercube Samples and subprojection properties. J. Simul. 7 These findings suggest that the use of CFD-driven optimisation with (4), 276–289. surrogate modelling and Bayesian optimisation is a robust approach Deb, K., Pratap, A., Agarwal, S., Meyarivan, T., 2002. A fast and elitist multiobjective to enhance linear-eddy-viscosity turbulence models to predict more genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput. 6 (2), 182–197. Duraisamy, K., 2021. Perspectives on machine learning-augmented Reynolds-averaged complex physics. The PDA-EARSMs developed greatly improve the and large eddy simulation models of turbulence. Phys. Rev. Fluids 6 (5), 050504. prediction of turbulence anisotropy in wall-bounded flows, especially Duraisamy, K., Iaccarino, G., Xiao, H., 2019. Turbulence modeling in the age of data. in cases where the standard 𝑘 − 𝜔 SST presents difficulties in accurately Annu. Rev. Fluid Mech. 51, 357–377. predicting the secondary flow. The progressive nature of this devel- Eberhart, R., Kennedy, J., 1995. Particle swarm optimization. In: Proceedings of the opment as a first step focuses on the prediction of secondary flows, IEEE International Conference on Neural Networks, Vol. 4. Citeseer, pp. 1942–1948. allowing further improvement of the models by adding more predictive Emory, M., Iaccarino, G., 2014. Visualizing turbulence anisotropy in the spatial domain physics and verifying them on other test cases. with componentality contours. In: Center for Turbulence Research Annual Research Briefs. pp. 123–138. Fang, Y., Zhao, Y., Waschkowski, F., Ooi, A.S., Sandberg, R.D., 2023. Toward more gen- CRediT authorship contribution statement eral turbulence models via multicase computational-fluid-dynamics-driven training. AIAA J. 1–16. Mario Javier Rincón: Conceptualisation, Methodology, Software, Forooghi, P., Yang, X.I., Abkar, M., 2020. Roughness-induced secondary flows in stably Validation, Formal analysis, Writing – original draft. Ali Amarloo: stratified turbulent boundary layers. Phys. Fluids 32 (10), 105118. Conceptualisation, Methodology, Software, Validation, Formal analysis, Forrester, A., Sobester, A., Keane, A., 2008. Engineering Design Via Surrogate Writing – original draft. Martino Reclari: Formal analysis, Method- Modelling: A Practical Guide. John Wiley & Sons. ology, Writing – review & editing. Xiang I.A. Yang: Formal analy- Hastie, T., Tibshirani, R., Friedman, J.H., Friedman, J.H., 2009. The Elements of sis, Methodology, Writing – review & editing. Mahdi Abkar: Con- Statistical Learning: Data Mining, Inference, and Prediction, Vol. 2. Springer. Holland, J.R., Baeder, J.D., Duraisamy, K., 2019. Field inversion and machine learning ceptualisation, Methodology, Formal analysis, Project administration, with embedded neural networks: Physics-consistent neural network training. In: Resources, Supervision, Writing – review & editing. AIAA Aviation 2019 Forum. p. 3200. Hornshøj-Møller, S.D., Nielsen, P.D., Forooghi, P., Abkar, M., 2021. Quantifying Declaration of competing interest structural uncertainties in Reynolds-averaged Navier–Stokes simulations of wind turbine wakes. Renew. Energy 164, 1550–1558. The authors declare that they have no known competing finan- Huang, X.L., Yang, X.I., 2021. A bayesian approach to the mean flow in a channel with cial interests or personal relationships that could have appeared to small but arbitrarily directional system rotation. Phys. Fluids 33 (1), 015103. influence the work reported in this paper. Huser, A., Biringen, S., 1993. Direct numerical simulation of turbulent flow in a square